I haven't thought "I should try to build my *own* web spider, then maybe I could find things" since... well, since 1998.
:/
To get my bookmarks out of the browser & remove that dependency, I’ve resorted to manually collecting & curating links as I find them, with personal notes + tags reminding me why they’re of interest. They’re always 100% searchable & findable.
Given the inconsiderate, effectively DDoS-like behavior of AI scraper bots, adding to that melee with more robo-indexing may not produce a usable search index - https://mastodon.social/@dahukanna/113741237599333856
I'm thinking of something much more modest:
… extract links from within the post and links to the source post?
I think so, yes. Basically I want a database of every single link that's been posted to *my* feed. It would also contain any hashtags used with the link and the post ID, so I can go back and see the context.
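Roughly this, as a sketch (Python + SQLite, pulling my home timeline from Mastodon's REST API; the instance URL, token, and table layout are placeholder assumptions, and it ignores boosts and pagination):

```python
import re
import sqlite3
import requests

INSTANCE = "https://mastodon.example"  # placeholder instance
TOKEN = "..."                          # access token with read scope

db = sqlite3.connect("feedlinks.db")
db.executescript("""
CREATE TABLE IF NOT EXISTS links (
    url      TEXT NOT NULL,
    post_id  TEXT NOT NULL,   -- status ID, for going back to the context
    post_url TEXT,
    PRIMARY KEY (url, post_id)
);
CREATE TABLE IF NOT EXISTS link_tags (
    url     TEXT NOT NULL,
    post_id TEXT NOT NULL,
    tag     TEXT NOT NULL,
    PRIMARY KEY (url, post_id, tag)
);
""")

HREF = re.compile(r'href="([^"]+)"')

def ingest():
    """Fetch one page of the home timeline and record its links."""
    resp = requests.get(f"{INSTANCE}/api/v1/timelines/home",
                        headers={"Authorization": f"Bearer {TOKEN}"},
                        params={"limit": 40})
    resp.raise_for_status()
    for status in resp.json():
        tags = [t["name"].lower() for t in status["tags"]]
        # hrefs in the HTML body that are just hashtag/mention links
        noise = {t["url"] for t in status["tags"]}
        noise |= {m["url"] for m in status["mentions"]}
        for url in HREF.findall(status["content"]):
            if url in noise:
                continue
            db.execute("INSERT OR IGNORE INTO links VALUES (?, ?, ?)",
                       (url, status["id"], status["url"]))
            for tag in tags:
                db.execute("INSERT OR IGNORE INTO link_tags VALUES (?, ?, ?)",
                           (url, status["id"], tag))
    db.commit()

ingest()
```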
Next I'd strip out all of the "big sites" and focus more on the obscure.
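That pass could be as dumb as a hostname denylist (the list below is illustrative, not a real inventory of "big sites"):

```python
from urllib.parse import urlparse

BIG_SITES = {"youtube.com", "twitter.com", "nytimes.com",
             "en.wikipedia.org", "github.com"}

def is_obscure(url: str) -> bool:
    """Keep only links whose host isn't on the personal denylist."""
    host = (urlparse(url).hostname or "").removeprefix("www.")
    return host not in BIG_SITES
```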
Then if I'm curious about, say, #fossils, I would get links mentioned in that context.
And if #fossils is often used with the tag #crinoids, I could move laterally and find more links.
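Against the tables sketched above, both lookups are one query each (the tag names here are just the examples from this thread):

```python
# Links that have been posted alongside #fossils.
fossil_links = db.execute(
    "SELECT DISTINCT url FROM link_tags WHERE tag = ?", ("fossils",)
).fetchall()

# Tags that co-occur with #fossils on the same posts, most frequent
# first -- the lateral move toward things like #crinoids.
co_tags = db.execute("""
    SELECT b.tag, COUNT(DISTINCT b.post_id) AS n
    FROM link_tags a
    JOIN link_tags b ON a.post_id = b.post_id AND b.tag != a.tag
    WHERE a.tag = ?
    GROUP BY b.tag
    ORDER BY n DESC
""", ("fossils",)).fetchall()
```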
Importantly, this database would grow over time; it wouldn't be focused on "what's new" ... basically I have a high level of trust in the way people #onhere associate hashtags with links, and I think that'd be a great way to find things.
In fact I do it manually often enough, but it's time-consuming. I just want all of the links sometimes.