SELF-INSTRUCT: Aligning Language Models with Self-Generated Instructions
The paper introduces SELF-INSTRUCT, a framework for improving the instruction-following ability of pretrained language models. Such models generalize impressively to new tasks, but they depend heavily on human-written instruction data, which is often limited in quantity, diversity, and creativity. SELF-INSTRUCT addresses this limitation by using the language model itself to generate instructions along with corresponding inputs and outputs, filtering out invalid or near-duplicate samples, and fine-tuning the original model on the result. Applied to vanilla GPT-3, SELF-INSTRUCT yields a 33% absolute improvement on the SUPER-NATURALINSTRUCTIONS benchmark, comparable to the performance of InstructGPT001, which was trained with private user data and human annotations. Human evaluation further shows that GPT-3 tuned with SELF-INSTRUCT outperforms models tuned on existing public instruction datasets by a large margin, leaving only a 5% absolute gap behind InstructGPT001. Overall, SELF-INSTRUCT offers an almost annotation-free approach to aligning pretrained language models with instructions and releases a large synthetic dataset to support further research on instruction tuning.
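The filtering step keeps the generated task pool diverse by discarding new instructions that overlap too strongly with existing ones; the paper measures overlap with ROUGE-L and drops candidates whose similarity to any pooled instruction exceeds a threshold (0.7 in the paper). The sketch below is a minimal, self-contained illustration of that idea, with ROUGE-L computed from a longest-common-subsequence F1 over whitespace tokens; the function names and the simple tokenization are illustrative assumptions, not the paper's implementation.

```python
def lcs_len(a, b):
    # Length of the longest common subsequence of two token lists
    # (standard dynamic-programming formulation).
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    return dp[m][n]


def rouge_l(candidate, reference):
    # ROUGE-L F1: harmonic mean of LCS-based precision and recall.
    a = candidate.lower().split()
    b = reference.lower().split()
    lcs = lcs_len(a, b)
    if lcs == 0:
        return 0.0
    precision = lcs / len(a)
    recall = lcs / len(b)
    return 2 * precision * recall / (precision + recall)


def filter_new_instructions(pool, candidates, threshold=0.7):
    # Keep only candidates that are sufficiently different from every
    # instruction already in the pool (and from earlier kept candidates).
    kept = []
    for cand in candidates:
        if all(rouge_l(cand, existing) < threshold for existing in pool + kept):
            kept.append(cand)
    return kept


pool = ["Write a poem about the sea."]
new = ["Write a poem about the ocean.",   # near-duplicate, filtered out
       "Translate the sentence into French."]
print(filter_new_instructions(pool, new))
```

Filtering against both the pool and the already-kept candidates prevents a batch of mutually similar generations from slipping through together.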