@moffkalast 12d
> I can use something like GPT-4 to label data and then use that as a training set for my own LLM, right?

Yes, almost all of the improved Llama models are tuned exactly that way: trained on examples of questions and answers from, say, GPT-4 (a sketch of the pattern follows below). If OpenAI stole copyrighted works to train their models, it's morally fair game to do the same to them regardless of their TOS. It's not like they can prove it anyway.

Plus there's the other point: they also say that everything generated by their models is public domain, so which one is it, eh?
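A minimal sketch of the pattern described above, assuming the `openai` v1 Python client and an API key in the environment; the seed questions and output filename are hypothetical placeholders:

```python
# Sketch: collect question/answer pairs from a teacher model (e.g. GPT-4)
# and write them out as JSONL for fine-tuning a smaller model.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

seed_questions = [  # stand-in for your own prompt set
    "Explain the difference between a process and a thread.",
    "What does the CAP theorem say about distributed databases?",
]

with open("distilled_train.jsonl", "w") as f:
    for q in seed_questions:
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": q}],
            temperature=0.7,
        )
        answer = resp.choices[0].message.content
        # One instruction-tuning example per line, in a common chat format.
        f.write(json.dumps({
            "messages": [
                {"role": "user", "content": q},
                {"role": "assistant", "content": answer},
            ]
        }) + "\n")
```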

@wodenokoto 13d
It is my understanding that this is how “alignment” works.

That is, OpenAI paid people to chat with their LLM to fine-tune it, and then other LLMs use ChatGPT to generate training data to align their models.
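To make the second half of that concrete, here is a sketch of how such generated transcripts are typically consumed as supervised fine-tuning data; `gpt2` stands in for whatever base model you are tuning, and the prompt/answer strings are made up:

```python
# Sketch: one supervised fine-tuning step on a single (prompt, answer) pair.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "User: What is backpropagation?\nAssistant:"
answer = " It is the algorithm that computes gradients of the loss."

prompt_ids = tok(prompt, return_tensors="pt").input_ids
answer_ids = tok(answer, return_tensors="pt").input_ids
input_ids = torch.cat([prompt_ids, answer_ids], dim=1)

# Only the assistant's tokens contribute to the loss; the prompt is masked
# out with -100 so the model learns to imitate the teacher's answers.
labels = input_ids.clone()
labels[:, : prompt_ids.shape[1]] = -100

loss = model(input_ids=input_ids, labels=labels).loss
loss.backward()  # one gradient step of an ordinary training loop
```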

@fallingmeat 13d
That's against their ToS, though, if you use your new LLM commercially.
@montenegrohugo 13d
Yup, totally. This is a form of knowledge distillation. OpenAI, or other foundation-model providers, can't really do anything about it.
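For reference, "knowledge distillation" in the classic sense (Hinton et al., 2015) matches the teacher's output distribution rather than its sampled text; API-only teachers don't expose logits, which is why the train-on-sampled-answers variant in this thread is common. A sketch of the classic loss, with random tensors standing in for real logits:

```python
# Sketch: classic logit-matching knowledge distillation.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Soften both distributions with temperature T, then match them with KL.
    s = F.log_softmax(student_logits / T, dim=-1)
    t = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * (T * T)

student_logits = torch.randn(4, 32000, requires_grad=True)  # (batch, vocab)
teacher_logits = torch.randn(4, 32000)
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
```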
@chaxor 12d
Yes, and in fact that's the best method available if you want good performance. I would suggest using a local open-source model to do this, however, to cut down on costs and make it far simpler to deal with than the unwieldy OpenAI systems (see the sketch after the link).

https://arxiv.org/pdf/2305.02301.pdf ("Distilling Step-by-Step!")
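A sketch of that local-labeler setup; `google/flan-t5-base` is just one small, freely downloadable example, and the sentiment task and texts are made-up placeholders:

```python
# Sketch: use a local open-source model as the labeler instead of a paid API.
from transformers import pipeline

labeler = pipeline("text2text-generation", model="google/flan-t5-base")

texts = [
    "The battery died after two days. Never buying this again.",
    "Setup took five minutes and it has worked flawlessly since.",
]

for text in texts:
    prompt = f"Classify the sentiment as positive or negative: {text}"
    label = labeler(prompt, max_new_tokens=5)[0]["generated_text"].strip()
    print(label, "|", text)  # collect these pairs as your training set
```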

@snickmy 13d
Indeed, fine-tuning works like that, with either synthetic data (as you are proposing) or human review. You can read more here: https://huggingface.co/blog/rlhf
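As a concrete anchor for the RLHF post linked above: the reward models there are trained on human preference pairs with a pairwise (Bradley-Terry) loss. A toy sketch, with made-up scalar scores standing in for a real reward model's outputs:

```python
# Sketch: the pairwise preference loss at the heart of RLHF reward modeling.
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen, reward_rejected):
    # Push the preferred response's score above the rejected one:
    # -log sigmoid(r_chosen - r_rejected).
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy scores for a batch of 3 human-ranked response pairs.
reward_chosen = torch.tensor([1.2, 0.4, 2.0], requires_grad=True)
reward_rejected = torch.tensor([0.3, 0.9, -0.5])
loss = preference_loss(reward_chosen, reward_rejected)
loss.backward()
```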
@notpublic 12d
Not an AI expert, but from a talk I recently heard: if there is a mismatch in training data between the "teacher" LLM and the "student" LLM, you risk teaching the student to hallucinate or to ignore information.
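One cheap (and admittedly naive) guard against that mismatch is to drop synthetic pairs whose answers aren't grounded in the source passage; the word-overlap heuristic below is a placeholder for a proper NLI or entailment check:

```python
# Sketch: filter synthetic QA pairs whose answers drift from the passage.
import re

def words(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())

def grounded(passage: str, answer: str, threshold: float = 0.6) -> bool:
    # Keep an answer only if most of its content words appear in the passage.
    passage_words = set(words(passage))
    answer_words = [w for w in words(answer) if len(w) > 3]
    if not answer_words:
        return False
    overlap = sum(w in passage_words for w in answer_words) / len(answer_words)
    return overlap >= threshold

passage = "The Eiffel Tower, completed in 1889, is 330 metres tall."
print(grounded(passage, "The tower was completed in 1889."))      # True
print(grounded(passage, "It was designed by da Vinci in 1505."))  # False
```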
@foobarbecue 12d
Is "ca" "can" or "can't"?