I would strongly recommend against fine-tuning over a set of documents, as this makes for a very lossy information retrieval system. LLMs are not well suited to information retrieval the way databases and search engines are.
The application of fine-tuning that we are seeing have a lot of success is making completion models like LLaMA or the original GPT-3 prompt-able. In essence, prompt-tuning or instruction-tuning. That is, giving the model the ability to respond to a user prompt in a chat interface (user prompt in, LLM output back).
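To make "prompt-able" a bit more concrete, here is a minimal sketch of how demonstration data might be prepared for instruction-tuning. The template markers, field names, and the `to_training_example` helper are all made up for illustration; real projects each use their own format. The idea is only that each demonstration becomes one training string, with the response part marked so a trainer could focus on it.

```python
# Hypothetical demonstration data: prompts paired with the responses we want.
demonstrations = [
    {"prompt": "Name a large carnivorous dinosaur.", "response": "Tyrannosaurus rex."},
    {"prompt": "What is the capital of Egypt?", "response": "Cairo."},
]

# Assumed chat-style template; not the format of any particular model.
TEMPLATE = "### User:\n{prompt}\n### Assistant:\n"


def to_training_example(demo):
    """Render one demonstration into a single training string.

    target_start marks where the response begins, so a trainer could
    (optionally) compute the loss only on the response part.
    """
    context = TEMPLATE.format(prompt=demo["prompt"])
    return {"text": context + demo["response"], "target_start": len(context)}


examples = [to_training_example(d) for d in demonstrations]
print(examples[0]["text"])
```

A completion model fine-tuned on many strings like these learns to continue the "Assistant" section with an answer, which is what makes it behave like a chat model.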
Vector databases, for now, are a great way to store mappings from document embeddings to the documents themselves, so that relevant documents can be retrieved.
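As a rough illustration of that mapping, here is a toy in-memory "vector store" in Python. Everything in it is an assumption for the sake of the sketch: `toy_embed` is a bag-of-words stand-in for a real embedding model, and `TinyVectorStore` is a hypothetical class, not the API of any actual vector database.

```python
import math
from collections import Counter


def toy_embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """Hypothetical store: keeps (embedding, document) pairs side by side."""

    def __init__(self):
        self.rows = []

    def add(self, document):
        self.rows.append((toy_embed(document), document))

    def search(self, query, k=1):
        """Return the k documents whose embeddings are closest to the query."""
        q = toy_embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [doc for _, doc in ranked[:k]]


store = TinyVectorStore()
store.add("the pyramids of ancient egypt were built as tombs")
store.add("neapolitan pizza uses a soft thin dough")
store.add("dinosaurs went extinct millions of years ago")

top = store.search("who built the pyramids in egypt", k=1)
print(top)
```

The retrieved text comes back verbatim from the store, which is the point: nothing is lost the way it would be if the documents were baked into model weights. A real system would swap in a learned embedding model and an approximate nearest-neighbor index instead of this brute-force scan.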
I would highly recommend skimming this RLHF paper for how demonstration data was used to make a model prompt-able [1]. Keep in mind RLHF is another concept altogether, and we might be seeing a revolution where it becomes optional (thanks to LIMA)!
Fine-tuned Models: Imagine you have a super-smart robot that can talk about anything. But you want it to be really good at talking about, say, dinosaurs. So, you teach it more about dinosaurs specifically. That's what fine-tuning is – you're teaching the robot (or model) to be really good at a specific topic.
Vector Databases and Embeddings with LLM: This might be a little tricky, but let's think of it this way. Imagine you have a huge library of books and you want to find information on a specific topic, say, ancient Egypt. Now, instead of reading every book, you have a magical index that can tell you which books talk about ancient Egypt. This index is created by magically converting each book into a "summary dot" (that's the embedding). When you ask about ancient Egypt, your question is also converted into a "summary dot". Then, the magical index finds the books (or "summary dots") that are most similar to your question. That's how the vector database and embeddings work.
So, if you want your super-smart robot to be really good at one specific topic, you use fine-tuning. But if you want it to quickly find information from a huge library of knowledge, you use vector databases and embeddings. Sometimes, you might even use both for different parts of the same task!
Embeddings = Input
Fine-tuning is like a chef modifying a general pizza recipe to perfect a specific pizza, such as Neapolitan. This customization optimizes the result. In AI, fine-tuning adjusts a pre-existing model to perform better on a specific task.
Embeddings are like categorizing ingredients based on properties. They represent inputs so that similar inputs have similar representations. For instance, an AI model gives 'dog' and 'puppy' similar embeddings because they have similar meanings. Like ingredients in a pizza, embeddings help the model understand and interpret the inputs. So, fine-tuning is about improving the model's performance, while embeddings help the model comprehend its inputs.
It turns out you can search a vector space of embeddings to find similar embeddings. If I turned my post above into 2 embeddings, one per paragraph, and you searched for "golden retriever", then even though neither paragraph has that exact phrase, the model should know that "golden retriever" is most similar to the second paragraph, which compares puppy to dog.
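Here is a small sketch of that search in Python. The three-dimensional vectors are numbers I made up by hand so the example runs on its own; a real embedding model would produce learned vectors with hundreds or thousands of dimensions, and the values would look nothing like these.

```python
import math

# Made-up embeddings; the axes loosely mean (food, animal, training).
# These are illustrative assumptions, not the output of any real model.
paragraphs = {
    "paragraph 1 (chef perfecting a pizza recipe)": [0.9, 0.0, 0.4],
    "paragraph 2 ('dog' and 'puppy' are similar)": [0.1, 0.9, 0.2],
}

# Assumed embedding of the query "golden retriever" from the same imaginary model.
query_vec = [0.0, 1.0, 0.1]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


scores = {name: cosine_similarity(query_vec, vec) for name, vec in paragraphs.items()}
best = max(scores, key=scores.get)
print(best)
```

With these hand-picked numbers the second paragraph scores far higher than the first, even though the words "golden retriever" appear in neither. The matching happens on position in the vector space, not on shared words.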