I’m pretty sure I’ve seen some huge, semi-complete datasets of Reddit somewhere. Maybe the-eye has something? That would ease the process, since the content has already been hoarded.
Also, bots built for that purpose are mirroring new posts from some subreddits to the equivalent Lemmy communities in near real time.
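For anyone curious, a bot like that can be surprisingly small. Here’s a minimal sketch in Python that polls Reddit’s public JSON listing and reposts links through Lemmy’s HTTP API. The instance URL, community id, and credentials are placeholders, and the Bearer-token auth assumes a Lemmy 0.19-style API (older versions expect an `auth` field in the request body instead); a real bot would also handle media, comments, and rate limits more carefully.

```python
import time
import requests

REDDIT_SUB = "datahoarder"                 # hypothetical source subreddit
LEMMY_INSTANCE = "https://lemmy.example"   # hypothetical target instance
LEMMY_COMMUNITY_ID = 123                   # hypothetical target community id
USER_AGENT = "subreddit-mirror-bot/0.1"

def login(username: str, password: str) -> str:
    """Log in to the Lemmy instance and return the auth JWT."""
    resp = requests.post(
        f"{LEMMY_INSTANCE}/api/v3/user/login",
        json={"username_or_email": username, "password": password},
    )
    resp.raise_for_status()
    return resp.json()["jwt"]

def fetch_new_posts(sub: str) -> list[dict]:
    """Fetch the newest posts of a subreddit via Reddit's public JSON listing."""
    resp = requests.get(
        f"https://www.reddit.com/r/{sub}/new.json?limit=25",
        headers={"User-Agent": USER_AGENT},
    )
    resp.raise_for_status()
    return [child["data"] for child in resp.json()["data"]["children"]]

def mirror(jwt: str, seen: set[str]) -> None:
    """Repost any not-yet-seen submissions as link posts on Lemmy."""
    for post in fetch_new_posts(REDDIT_SUB):
        if post["id"] in seen:
            continue  # already mirrored
        requests.post(
            f"{LEMMY_INSTANCE}/api/v3/post",
            headers={"Authorization": f"Bearer {jwt}"},
            json={
                "name": post["title"][:200],  # keep titles within Lemmy's length cap
                "community_id": LEMMY_COMMUNITY_ID,
                "url": "https://www.reddit.com" + post["permalink"],
            },
        ).raise_for_status()
        seen.add(post["id"])

if __name__ == "__main__":
    token = login("mirror_bot", "hunter2")  # hypothetical credentials
    seen_ids: set[str] = set()
    while True:
        mirror(token, seen_ids)
        time.sleep(300)  # poll every 5 minutes to stay within rate limits
```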
I assumed that the people over in the DataHoarder communities had archives running this whole time. There’s probably data out there somewhere; it’s just a matter of getting it into a usable, hosted state.
Probably because it’d take a ton of time and money (API rate limits, plus the new API pricing), and importing that much content could disrupt Lemmy too much.
Reddit is a massive site in terms of how much content it has accumulated over the years.
Probably this: https://the-eye.eu/redarcs/
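The redarcs there are, as far as I can tell, Pushshift-style dumps: zstandard-compressed newline-delimited JSON, one submission or comment per line. A rough reader sketch in Python, assuming that format (the filename is a placeholder, and it needs `pip install zstandard`):

```python
import io
import json
import zstandard  # pip install zstandard

def read_dump(path: str):
    """Stream objects from a Pushshift-style .zst dump, one JSON object per line."""
    with open(path, "rb") as fh:
        # Pushshift dumps are compressed with a large window,
        # so max_window_size has to be raised above the default.
        reader = zstandard.ZstdDecompressor(max_window_size=2**31).stream_reader(fh)
        for line in io.TextIOWrapper(reader, encoding="utf-8", errors="replace"):
            yield json.loads(line)

for post in read_dump("datahoarder_submissions.zst"):  # hypothetical filename
    print(post.get("title"), post.get("permalink"))
```

Streaming line by line like this matters because some of the per-subreddit dumps are far too large to decompress into memory at once.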