Will ChatGPT Do X? Build Your Own Mini-ChatGPT and Then Decide for Yourself
Implement your own large language model trained with reinforcement learning from human feedback (RLHF), then discuss what you might expect to see when such an agent is scaled up, drawing on the literature about what emerges with scale.