@fnordpiglet
12d
It does make me wonder what the converged fixed point of this technique is. If I fine-tune with GPT4 to make model A, which then performs better than GPT4, and then fine-tune model B with A, at what point does artifacting or diminishing returns set in?
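One way to picture the question: treat the chain GPT4 → A → B → … as a sequence of benchmark scores where each student improves on its teacher by a shrinking margin. This is a toy sketch, not a real training loop; the numbers (`base_score`, `gain`, `decay`) are made up purely for illustration.

```python
def distill_chain(base_score: float, gain: float, decay: float, eps: float = 1e-3):
    """Simulate iterated fine-tuning: each generation gains less than the last.

    Returns the list of scores, stopping once the per-generation gain
    falls below eps -- the point where diminishing returns "set in".
    """
    scores = [base_score]
    while gain >= eps:
        scores.append(scores[-1] + gain)
        gain *= decay  # each student improves on its teacher by a smaller margin
    return scores

scores = distill_chain(base_score=0.70, gain=0.05, decay=0.5)
```

Under these toy assumptions the chain converges geometrically toward `base_score + gain / (1 - decay)` (here 0.80) and never reaches it, which is one candidate shape for the fixed point; artifacting would show up instead as the effective `decay` going negative, i.e. later generations getting worse.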