Internal Knowledge Processing with Retrieval-Augmented Generation

Enhancing Decision-Making through Contextual Retrieval and AI-Driven Synthesis.

Tala Sammar

Events and Content Marketing Intern

Data Science & Machine Learning

Have you ever wondered how ChatGPT truly works? Where does it get all its information? Simply put, large language models like ChatGPT are trained on two types of data: data for reasoning and thinking and data for facts and information.

But how can we integrate personalized information? How can we get tools like ChatGPT to deliver the specific answer we’re looking for? Catch up on the “Data Talks: Mastering Knowledge Processing and System Observability” event, featuring João Rocha e Melo, an AI division leader with expertise in data science and machine learning. He explains large language models and how they process internal knowledge through Retrieval-Augmented Generation.

How Do LLMs Acquire Information?

There are three specific ways to input information into a large language model: initial training, fine-tuning, and prompts.

Initial training is the foundational stage of creating and training the model, where everything begins!
Fine-tuning involves building on what the model has already learned. It adjusts the model’s internal parameters, allowing it to respond with new information.
A prompt is the text we provide to the model to receive an answer.

Pros and Cons of Initial Training and Fine-Tuning

Initial training and fine-tuning are essential stages involving fixed parameters. This means that once the information is provided, the model is finalized and cannot be updated further without another round of training.

Users can benefit from better accuracy due to smaller prompts, which enables the model to function more efficiently. However, one major drawback is that the model cannot discard or unlearn information, making data collection and processing less efficient. It also becomes difficult to measure the quality of the information provided. Finally, while both initial training and fine-tuning have a low cost of influence, they are difficult to master and can be expensive.

Pros and Cons of Prompts

In today’s constantly changing digital world, there is a need for a system that not only lets us chat with it but also helps us collect and process information. According to João, the solution lies with prompts:

“Future to the industry is through the prompt, which will change the status quo of asking questions.”

Unlike initial training and fine-tuning, prompts can adapt real-time information to domain knowledge. This improves the quality of information through modularity, meaning that if a new model is released, other models can leverage it, allowing them to evolve together.

However, the larger the prompt, the less accurate the model's answer becomes. This works against the precision we expect from the model and makes it difficult to measure the quality of the output.

What is Retrieval-Augmented Generation?

The retrieval process works by taking documents, placing them in a database, and retrieving from that database the information most relevant to the question being asked. This improves an LLM's generation by providing it with more information over time. But how does it work?
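To make the flow above concrete, here is a minimal sketch of the retrieve-then-prompt idea. It is not the speaker's implementation: the document list, the word-overlap `score` function, and the prompt wording are all assumptions chosen for illustration. A production system would typically use a vector database and learned embeddings instead of counting shared words, and would send the resulting prompt to an LLM.

```python
def score(question: str, document: str) -> int:
    """Stand-in relevance score (assumption): number of shared lowercase words."""
    return len(set(question.lower().split()) & set(document.lower().split()))


def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents that score highest against the question."""
    ranked = sorted(documents, key=lambda d: score(question, d), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_docs: list[str]) -> str:
    """Augment the question with the retrieved documents before sending it to an LLM."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical internal documents standing in for the "database".
documents = [
    "Employees get 25 vacation days per year.",
    "Expense reports are due by the 5th of each month.",
    "The office is closed on public holidays.",
]

question = "How many vacation days do employees get per year?"
top_docs = retrieve(question, documents, k=1)
prompt = build_prompt(question, top_docs)
print(prompt)
```

The model itself is never retrained here. Only the prompt changes, so updating the knowledge base is a matter of adding or removing documents.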

“Retrieval process works when there is a match between your question and the documents that might contain the answer.”

The retrieval process is carried out using a specific technique called embeddings. Embedding is the process by which computers transform words into numbers, enabling them to interpret the concept of a word and determine whether two ideas are closely related or far apart. This technique matches the question being asked against the stored documents, thereby retrieving the most relevant ones.
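The idea of turning text into numbers and comparing them can be sketched in a few lines. This example is a simplification under stated assumptions: the `embed` function below just counts words, which captures word overlap rather than meaning. Real embedding models are learned and produce dense vectors in which related concepts sit close together even when they share no words. The comparison step, cosine similarity, is a common choice for measuring how close two vectors are.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding' (assumption): a word-count vector, not a learned model."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors: 1.0 = same direction, 0.0 = unrelated."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical question and documents.
question = embed("how many vacation days do employees get")
doc_hr = embed("employees get 25 vacation days per year")
doc_it = embed("reset your laptop password in the IT portal")

print(cosine_similarity(question, doc_hr))  # higher: several shared words
print(cosine_similarity(question, doc_it))  # lower: no shared words
```

The document whose vector is closest to the question's vector is the one a retrieval step would hand to the model.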

The importance of the RAG (Retrieval-Augmented Generation) system lies in its adaptation to a rapidly changing industry. Over the past year, these systems have grown, enabling companies and businesses to generate answers augmented with information retrieved from their internal documents.

If you’re interested in learning more about artificial intelligence with its agents and systems, check out Ironhack’s Artificial Intelligence Bootcamp and expand your skills and knowledge in the tech industry!
