Would be interesting to have a Large Language Model fine-tuned on leftist sources

GodOfThunder@lemm.ee · 2 years ago

Would be interesting to have a Large Language Model fine-tuned on leftist sources

ඞmir@lemmy.ml · 2 years ago

IIRC Microsoft and Google are both working on a citation-based LLM

🔻Sleepless One🔻@lemmygrad.ml · 2 years ago

Is the first character in your display name an amogus?

Aria 🏳️‍⚧️🇧🇩@lemmygrad.ml · edit-2 2 years ago

i’ll admit, i have thought of making a commie llm (so like, comradegpt? lol) sometime ago

i wonder if i should do it at some point, especially if we’re told to learn about ai and neural networking in the uni

albigu@lemmygrad.ml · edit-2 2 years ago

I started crawling the Portuguese version of the MIA and there was definitely enough data there alone to fine tune one (about 30 GB but a lot of it was pdfs). Just be aware that it requires a lot of data preparation and each trial takes a while on civilian hardware, which is why I postponed it. If there’s interest here I’d be happy to collaborate.

Also seems like a fun way to read a lot of diverse texts.

Edit: but be aware that LLMs are inherently unreliable and we shouldn’t trust them blindly like libs trust chatGPT.

Aria 🏳️‍⚧️🇧🇩@lemmygrad.ml · 2 years ago

Just be aware that it requires a lot of data preparation and each trial takes a while on civilian hardware, which is why I postponed it.

How long is it? Just so you know, I’m a total newbie at LLM creation/AI/Neural Networking.

Edit: but be aware that LLMs are inherently unreliable and we shouldn’t trust them blindly like libs trust chatGPT.

So… if they’re inherently unreliable, why make them? Genuine question.