Notices tagged with yacy

LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Saturday, 26-Mar-2022 15:19:27 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}

Search engines with their own indexes: https://seirdy.one/2021/03/10/search-engines-with-own-indexes.html

#YaCy: slow, with mostly irrelevant results

Very true, but the second part, "mostly irrelevant results", is a matter of degree. #Google's results were once fantastic, but these days a big chunk of the first five pages of a search consists of irrelevant results, often SEO'd into high positions and pushing relevant results lower.

#Bing (and #Yahoo, #DDG, and others who use Bing's results) are also described with increasing accuracy by "mostly irrelevant results".

But yes, YaCy's results are even worse. I want to set up another YaCy instance, just to crawl sites of interest in various topics. I know there's a major software change needed to get the most improvement in search results, but having the right information in the global index is a necessary precursor to producing good results.
Saturday, 26-Mar-2022 15:19:27 UTC from nu.federati.net permalink
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Tuesday, 28-Dec-2021 01:51:18 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}
in reply to
- LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}
AFAIK, #Brave search uses its own crawler and index, as does #Gigablast. And let us not forget #YaCy. I intend to host a YaCy peer-to-peer search instance on its own VPS (well, shared with a #Searx instance, which will use it)

Tuesday, 28-Dec-2021 01:51:18 UTC from nu.federati.net permalink
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Friday, 03-Dec-2021 17:55:20 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}

#LowEndBox / #LEB sending out Black Friday / Cyber Monday e-mails a week late. I'd like to get something to host #Yacy + #Searx

Friday, 03-Dec-2021 17:55:20 UTC from nu.federati.net permalink
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Friday, 14-May-2021 15:38:57 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}
- Douglas A. Whitfield
@musicman I just heard of #MeiliSearch last night, so I did not do any deep diving. I know that #YaCy uses #Solr, but I do not know if it always did.

Friday, 14-May-2021 15:38:57 UTC from nu.federati.net permalink
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Friday, 14-May-2021 06:50:53 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}

https://opensource.com/article/20/2/yacy-search-engine-hacks [opensource com]

Ways to improve #YaCy a little bit.
Friday, 14-May-2021 06:50:53 UTC from nu.federati.net permalink
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Friday, 14-May-2021 05:10:14 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}

Not a general search engine, and no public instance provided, but MeiliSearch ( https://www.meilisearch.com/ ) says it is an open source, self-hostable search API.

I don’t think you’d use it as a general search engine, but more as something integrated into your site. So “Dog Buddy Magazine Online” might use MeiliSearch to let visitors seek out articles by breed, puppy or adult, and other factors relevant to the site’s readers.

License: MIT. See https://blog.meilisearch.com/oss-paradigm/
Paid plan: I haven’t seen it, but I just discovered the software and site a few minutes ago; maybe they’re charging for support.

Anyway, it is interesting. They seem to compete partly with ElasticSearch ( https://www.elastic.co/elasticsearch/ ) (and maybe a little bit with #YaCy’s stand-alone mode).
Friday, 14-May-2021 05:10:14 UTC from nu.federati.net permalink
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Thursday, 25-Mar-2021 17:09:25 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}
in reply to
- LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}
It looks like Sarchy (search provider based first on #YaCy, then on #Searx) is gone. That page goes to a domain parking page.

Thursday, 25-Mar-2021 17:09:25 UTC from nu.federati.net permalink
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Sunday, 29-Nov-2020 04:02:45 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}

Looking at #LEB for a possible place to put another #YaCy instance. Disposable.

Sunday, 29-Nov-2020 04:02:45 UTC from nu.federati.net permalink
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Friday, 30-Oct-2020 07:50:30 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}
in reply to
- LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}
https://searchlab.eu/t/self-hosted-s3-buckets-for-distributed-data-collection/480

A #YaCy dev has an idea: peers can use #S3 buckets to store and share their index data.

I see one flaw right away: every peer would need to run a second VPS (and possibly more) with #min.io to provide the storage backend ... or rent such storage from Amazon or other cloud vendors.
Friday, 30-Oct-2020 07:50:30 UTC from nu.federati.net permalink
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Friday, 30-Oct-2020 05:50:32 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}
in reply to
- LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}
But frankly, my interest in #YaCy is because of its peer to peer nature. I understand it wasn't good enough to run a search engine business with it. (As Sarchy found out.) Its results weren't great anyway.

Friday, 30-Oct-2020 05:50:32 UTC from nu.federati.net permalink
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Friday, 30-Oct-2020 02:54:37 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}

I used to host a #YaCy p2p search node ( https://www.yacy.net/ ), and I intend to do so again in the future, but now I'm seeing "YaCy Grid" (a non-peer-to-peer search) is being developed. https://searchlab.eu/t/the-story-of-yacy-grid/48/17
Friday, 30-Oct-2020 02:54:37 UTC from nu.federati.net permalink
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Thursday, 10-Sep-2020 19:47:59 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}

Maybe #BING is a recursive acronym: BING Is No Good.

It’s been years since I touched Java, but someone should really be trying to improve #YaCy’s results. I’m still hoping to host a node on their FreeWorld search network again, but I cannot do so right now.

Thursday, 10-Sep-2020 19:47:59 UTC from nu.federati.net permalink
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Tuesday, 21-Jul-2020 03:27:59 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}
- mangeurdenuage
@mangeurdenuage I think the thing #YaCy needs most is algorithm improvements. It already crawls and indexes a significant subset of the things in the major search engines' indexes, but it fails to select sufficiently relevant results out of that index.

Tuesday, 21-Jul-2020 03:27:59 UTC from nu.federati.net permalink
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Monday, 20-Jul-2020 18:35:22 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}
in reply to
- LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}
On search: I really want to host another #YaCy node, but YaCy's results are still not great. Even when my node was crawling & indexing sites in a particular field (at the time, FOSS SQL and NoSQL databases) weekly, searches for things in that field were so infested with unrelated results that finding the desired answer was unreliable.

YaCy integrates #Solr ... so its results should be improvable.

Monday, 20-Jul-2020 18:35:22 UTC from nu.federati.net permalink
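
To make the "results should be improvable" idea in the notice above concrete, here is a toy sketch of re-ranking a result list by how many topic terms each hit contains. It is not how YaCy or Solr actually rank; the `rerank` function, the result dictionaries with `title` and `snippet` keys, and the sample data are all made up for illustration.

```python
def rerank(results, topic_terms):
    """Order results by how many topic terms appear in title + snippet.

    Hypothetical helper: a crude stand-in for a real relevance model.
    """
    terms = {t.lower() for t in topic_terms}

    def score(result):
        text = (result["title"] + " " + result["snippet"]).lower()
        # Count distinct topic terms that occur as whole words.
        return len(terms & set(text.split()))

    # sorted() is stable, so results with equal scores keep their order.
    return sorted(results, key=score, reverse=True)


# Invented sample data, loosely themed on FOSS SQL databases.
sample_results = [
    {"title": "Cheap flights database deals", "snippet": "Book now"},
    {"title": "PostgreSQL replication guide",
     "snippet": "Streaming replication in PostgreSQL and MariaDB"},
    {"title": "Intro to MariaDB", "snippet": "A FOSS SQL database"},
]
topic = ["postgresql", "mariadb", "sql", "database"]

for hit in rerank(sample_results, topic):
    print(hit["title"])
```

A real fix would more likely live in the ranking configuration of the search backend, but even a filter like this shows how unrelated hits could be pushed down once the right pages are in the index.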
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Tuesday, 14-Jul-2020 19:58:52 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}
- Douglas A. Whitfield
@musicman I only indirectly encountered #Solr when I ran a #YaCy instance, so I have no opinion about Solr 6.

Tuesday, 14-Jul-2020 19:58:52 UTC from nu.federati.net permalink
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Tuesday, 25-Feb-2020 17:45:15 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}
- Douglas A. Whitfield
@musicman Good. I think the current implementation of the #YaCy search engine relies on #Solr.

Tuesday, 25-Feb-2020 17:45:15 UTC from nu.federati.net permalink
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Friday, 01-Nov-2019 01:27:31 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}
- i am warm and powerful
@xj9 I don't see the context, but I like #YaCy's peer-to-peer search, despite their search algorithm not being anywhere close to #Google in the quality of results (probably close to #Bing, though). I have hosted YaCy instances in the past, in part to help improve the peer network's results by crawling and indexing sites related to databases, Java, Tcl, Python, PHP (the things I was searching for most often at the time).

I intend to host YaCy again (perhaps feeding a #Searx instance, so its results would not be wholly dependent on the goodwill of big corporate search engines).

Friday, 01-Nov-2019 01:27:31 UTC from nu.federati.net permalink
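
For anyone wondering what "feeding" another tool from a YaCy peer might look like, here is a minimal sketch that only builds the query URL for a peer's HTTP search interface. It assumes the peer listens on `http://localhost:8090` and exposes a `yacysearch.json` endpoint taking `query`, `maximumRecords` and `resource` parameters; check these against your own instance. The `yacy_search_url` helper is hypothetical and makes no network request.

```python
from urllib.parse import urlencode


def yacy_search_url(base_url, query, max_records=10, resource="global"):
    """Build a search URL for a YaCy peer (endpoint and params are assumptions)."""
    params = {
        "query": query,
        "maximumRecords": max_records,
        # "global" is assumed to ask the peer network, "local" only this peer's index.
        "resource": resource,
    }
    return base_url.rstrip("/") + "/yacysearch.json?" + urlencode(params)


print(yacy_search_url("http://localhost:8090/", "solr ranking", 5, "local"))
```

A metasearch front end such as Searx would send a request like this and merge the response with results from its other engines.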
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Sunday, 20-Oct-2019 14:57:42 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}

Deleting my bookmark for https://sarchy.tech/ ... it was using #YaCy (where peers crawl and index), but now it uses #Searx (which depends on the goodwill of corpocentric search providers).

Admittedly, Searx is likely to give better results, but why use them when #DDG and #Startpage do the same things, only better?

(And yes, DuckDuckGo's results have gotten much worse lately. Still not as bad as using #Bing directly, but I'm leaning more to Startpage these days.)

Sunday, 20-Oct-2019 14:57:42 UTC from nu.federati.net permalink
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Monday, 29-Apr-2019 21:30:20 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}
in reply to
- i am warm and powerful
- Agnel Vishal
@agnelvishal @xj9 Thank you. I do appreciate that. I’m still planning on running my own #YaCy node again. If for no other reason, I like running a crawling and indexing peer, to expand and hopefully improve search results.

Monday, 29-Apr-2019 21:30:20 UTC from nu.federati.net permalink
LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864} (lnxw48a1)'s status on Tuesday, 23-Apr-2019 17:43:11 UTC LinuxWalt (@lnxw48a1) {3EB165E0-5BB1-45D2-9E7D-93B31821F864}
- Raspberry Pi and other SBC "Maker" board enthusiasts
- i am warm and powerful
@xj9 One thing I'd like to test is hosting a #YaCy instance on a !raspi-like device + external storage, though I think it might be unresponsive to searches during a periodic crawl & index.

Tuesday, 23-Apr-2019 17:43:11 UTC from nu.federati.net permalink

Arkwood Pond Social is a social network, courtesy of McCullaugh.com Network. It runs on GNU social, version 1.2.0-beta4, available under the GNU Affero General Public License.

All Arkwood Pond Social content and data are available under the Creative Commons Attribution 3.0 license.
