Have you ever wondered how ChatGPT truly works? Where does it get all its information? Simply put, large language models like ChatGPT are trained on two types of data: data that teaches them to reason and think, and data that provides facts and information.
But how can we integrate personalized information? How can we get tools like ChatGPT to deliver the specific answer we’re looking for? Catch up on the “Data Talks: Mastering Knowledge Processing and System Observability” event, featuring João Rocha e Melo, an AI division leader with expertise in data science and machine learning, as he explains large language models and their internal knowledge processing through Retrieval-Augmented Generation.
How Do LLMs Acquire Information?
There are three main ways to input information into a large language model: initial training, fine-tuning, and prompts.
Initial training is the foundational stage of creating and training the model, where everything begins!
Fine-tuning involves building on what the model has already learned. It adjusts the model’s internal parameters, allowing it to respond with new information.
A prompt is the text we provide to the model to receive an answer.
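To make that third route concrete, here is a minimal Python sketch of sending a prompt to a model. It assumes the OpenAI Python SDK and an API key in your environment; the model name is only an example, and any chat-style LLM API follows the same pattern.

```python
from openai import OpenAI

# The prompt is simply the text we send to the model at question time.
prompt = "In one sentence, what is Retrieval-Augmented Generation?"

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name; swap in whichever model you use
    messages=[{"role": "user", "content": prompt}],
)

print(response.choices[0].message.content)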
Pros and Cons of Initial Training and Fine-Tuning
Initial training and fine-tuning are essential stages that result in fixed parameters. This means that once the information is provided, the model is finalized and cannot be further updated.
Users benefit from better accuracy thanks to smaller prompts, which lets the model function more efficiently. However, one major drawback is that the model cannot discard or unlearn information, making data collection and processing less efficient. It also becomes difficult to measure the quality of the information provided. Finally, while both initial training and fine-tuning offer a low cost of inference, they are difficult to master and can be expensive.
Pros and Cons of Prompts
In today’s constantly changing digital world, there is a need for a system that we can not only chat with but that also helps us collect and process information. According to João, the solution lies with prompts:
“The future of the industry is through the prompt, which will change the status quo of asking questions.”
Unlike initial training and fine-tuning, prompts can bring real-time information together with domain knowledge. This also improves the quality of information through modularity: if a new model is released, the rest of the system can leverage it, allowing the two to evolve together.
However, the larger the prompt, the less accurate the model becomes, which works against the precision we expect from it and makes it difficult to measure the quality of the output.
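As a small illustration of that trade-off, the sketch below pastes a piece of domain knowledge into a prompt at question time. Both the context text and the question are made-up examples; the resulting string is what would be sent to the model, as in the earlier sketch.

```python
# Domain knowledge fetched at question time (hypothetical example text).
retrieved_context = (
    "Data Talks is an event series; the latest session covered "
    "knowledge processing in large language models."
)

question = "What did the latest Data Talks session cover?"

# The augmented prompt: instructions + fresh context + the user's question.
augmented_prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{retrieved_context}\n\n"
    f"Question: {question}"
)

print(augmented_prompt)  # this text is what gets sent to the model
```

Keeping that injected context short and relevant is exactly what the retrieval step described next is designed to do.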
What Is Retrieval-Augmented Generation?
The retrieval process works by taking documents, placing them in a database, and checking whether the most relevant information has been retrieved from that database. This process improves the generation of an LLM by providing it with more information over time. But how does it work?
“The retrieval process works when there is a match between your question and the documents that might contain the answer.”
The retrieval process is carried out using a specific technique called embeddings. Embedding is the process by which computers transform words into numbers, enabling them to capture the concept behind a word and determine whether two ideas are closely related or far apart. This technique links certain words to the question being asked, thereby retrieving the most relevant documents.
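As a rough sketch of that matching step, the snippet below retrieves the most relevant document for a question by comparing vectors with cosine similarity. Real systems use a learned embedding model and a vector database; here a toy bag-of-words `embed` function stands in for both, purely to show how closeness between the question and the documents drives retrieval.

```python
import math
from collections import Counter

documents = [
    "Fine-tuning adjusts the internal parameters of a trained model.",
    "Retrieval-Augmented Generation looks up relevant documents before answering.",
    "Embeddings turn words into numbers so related ideas end up close together.",
]

def embed(text: str) -> Counter:
    """Toy stand-in for a real embedding model: a bag-of-words vector."""
    return Counter(text.lower().replace(".", " ").replace("?", " ").split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    """How close two vectors are: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

question = "How are words turned into numbers?"
question_vec = embed(question)

# Pick the document whose vector sits closest to the question's vector.
best_doc = max(documents, key=lambda doc: cosine_similarity(question_vec, embed(doc)))
print(best_doc)  # -> the sentence about embeddings
```

Swapping the toy `embed` function for a real embedding model and storing the vectors in a database gives you the retrieval half of a RAG system; the generation half is the augmented prompt shown earlier.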
The importance of the RAG (Retrieval-Augmented Generation) system lies in its adaptation to a rapidly changing industry. Over the past year, these systems have grown, enabling companies and businesses to generate answers grounded in their own internal documents.
If you’re interested in learning more about artificial intelligence, its agents, and its systems, check out Ironhack’s Artificial Intelligence Bootcamp and expand your skills and knowledge in the tech industry!