• 1 Post
  • 35 Comments
Joined 1 year ago
Cake day: June 12th, 2023



  • That’s kind of the point, and it’s how this differs from a human. A human is going to weight local/recent contextual information as much more relevant to the conversation because they’re actively learning and storing the information (our brains work on more of an associative memory basis than a temporal one). With our current models, however, this is simulated by decaying weights over the data stream. So when the contextually correct output conflicts with the “globally” correct output, global tends to win out, and more obviously so. Remember you can’t actually make changes to the model as a user without active learning. Thus the model will always eventually return to its original behaviour as long as you can fill up the memory.









  • The problem isn’t the memory capacity. Even though the LLM can store the information, it’s about prioritization/weighting. For example, if I tell ChatGPT not to include a word (say, “apple”) in its responses, then ask it some questions, then ask it what the popular fruit-based pies are, it will tend to pick the “better” answer of including apple pie over the rule I gave it a while ago about not using the word apple. We do want decaying weights on memory because most of the time old information isn’t as relevant, but it’s one of those things that needs optimization. Imo we’re going to get to the point where the optimal parameters for maximizing “usefulness” to the average user are quite different from what’s needed to pass someone intentionally testing the AI. Mostly bc we know from other AI (like Siri) that people don’t actually need that much context saved to find them helpful.
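To make the weighting argument concrete, here is a toy sketch in Python. It is not how ChatGPT or any real LLM is implemented. The exponential decay, the function names, and all the numbers are assumptions made up for illustration. It only shows the shape of the problem: a rule given early in the conversation starts out stronger than the model's "global" preference, loses weight every turn, and eventually gets outvoted.

```python
def rule_weight(initial_weight, decay, turns_since_rule):
    """Weight of a user-given rule after some number of turns.

    Toy assumption: the weight shrinks by a constant factor per turn.
    """
    return initial_weight * decay ** turns_since_rule


def first_turn_rule_loses(initial_weight, decay, global_weight, max_turns=1000):
    """Return the first turn at which the decayed rule weight drops below
    the fixed "global" preference, or None if that never happens."""
    for turn in range(max_turns):
        if rule_weight(initial_weight, decay, turn) < global_weight:
            return turn
    return None


# Hypothetical numbers: the "don't say apple" rule starts out five times
# stronger than the global preference for answering "apple pie".
RULE_WEIGHT = 5.0
GLOBAL_WEIGHT = 1.0
DECAY = 0.8

if __name__ == "__main__":
    turn = first_turn_rule_loses(RULE_WEIGHT, DECAY, GLOBAL_WEIGHT)
    print(f"Rule is outvoted from turn {turn}")
```

With these made-up numbers the rule holds for the first few turns and then loses. A decay of 1.0 (no forgetting) would keep the rule forever, which is the trade-off being described: old information usually should fade, just not always.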


  • You don’t get to complain about people being condescending to you when you are going around literally copying and pasting Wikipedia. Also, you’re not right: major progress in this field started in the 80s. Although the concepts were published earlier, they were basically ignored by researchers. You’re making it sound like the NNs we’re using now are the same as in the 60s, when in reality our architectures, and even just how we approach the problem, have changed significantly. It wasn’t until the 90s-00s that we started getting decent results that could even match older ML techniques like SVM or kNN.



  • The idea of NNs, or the basis itself, is not AI. If you had actually read D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning Internal Representations by Error Propagation.” Sep. 01, 1985, then you would understand this, bc that paper is about a machine learning technique, not AI. If you had done your research properly instead of just reading Wikipedia, then you would have also come across autoassociative memory, which is the precursor to autoencoders and generative autoencoders, which are the foundation of a lot of what we now think of as AI models. H. Abdi, “A Generalized Approach For Connectionist Auto-Associative Memories: Interpretation, Implication Illustration For Face Processing,” in J. Demongeot (Ed.) Artificial, University Press, 1988, pp. 151–164.





  • So I’m a researcher in this field and you’re not wrong, there is a load of hype. The area that’s been getting the most attention lately is specifically generative machine learning techniques. The techniques are not exactly new (some date back to the 80s/90s) and they aren’t actually that good at learning. By that I mean they need a lot of data and computation time to get good results, two things that have gotten easier to access recently. However, it isn’t always a requirement to have such a complex system. Even Eliza, a chatbot made back in 1966, gives responses surprisingly similar to those of some therapy chatbots today without using any machine learning. You should try it and see for yourself; I’ve seen people fooled by it and the code is really simple. Also, people think things like Kalman filters are “smart” but it’s just straightforward math, so I guess the conclusion is people have biased opinions.
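To give a feel for how little is needed, here is a minimal Eliza-style sketch in Python. It is not the original 1966 script. The rules, the reflection table, and the names `RULES`, `REFLECTIONS`, `reflect` and `respond` are all made up for illustration. The technique is just pattern matching plus pronoun swapping, with no learning anywhere.

```python
import re

# Hypothetical rules: (pattern, reply template). The captured text is
# echoed back to the user inside the template.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE),
     "Tell me more about your {0}."),
]

# Swap first and second person so the echoed fragment reads naturally.
REFLECTIONS = {
    "i": "you", "me": "you", "my": "your",
    "am": "are", "you": "me", "your": "my",
}

DEFAULT_REPLY = "Please go on."


def reflect(fragment):
    """Lowercase the fragment and swap pronouns word by word."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(text):
    """Return the reply for the first matching rule, else a stock phrase."""
    cleaned = text.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            return template.format(reflect(match.group(1)))
    return DEFAULT_REPLY


if __name__ == "__main__":
    for line in ["I need my coffee.", "I am tired of work", "Hello there"]:
        print(f"> {line}")
        print(respond(line))
```

Everything that looks like "understanding" here is the user's own words handed back with the pronouns flipped, which is a big part of why people get fooled by it.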