GPT: Generative Pre-Training
Two key points to GPT's success are (I) training decoder-only Transformer language models that can accurately predict the next word and (II) scaling up the size of the language models.
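A minimal sketch of point (I), next-word prediction with a decoder-only Transformer, assuming PyTorch; the toy sizes and model setup below are illustrative, not OpenAI's actual configuration.

# Sketch: next-token prediction with a causal (decoder-only) Transformer in PyTorch.
# All sizes are toy values for illustration.
import torch
import torch.nn as nn

vocab_size, d_model, n_heads, n_layers, seq_len = 1000, 128, 4, 2, 16

embed = nn.Embedding(vocab_size, d_model)
block = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
decoder = nn.TransformerEncoder(block, n_layers)        # used decoder-only: no cross-attention
lm_head = nn.Linear(d_model, vocab_size)

tokens = torch.randint(0, vocab_size, (1, seq_len))     # toy input ids
causal_mask = nn.Transformer.generate_square_subsequent_mask(seq_len)

hidden = decoder(embed(tokens), mask=causal_mask)       # each position sees only earlier tokens
logits = lm_head(hidden)

# Shift by one so that position t predicts token t+1 (the "next word").
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),
    tokens[:, 1:].reshape(-1),
)
loss.backward()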

ICT
1. training on code data
Codex: a GPT model fine-tuned on a large corpus of GitHub code
2. alignment with human preference
reinforcement learning from human feedback (RLHF) algorithm
Note that the term "instruction tuning" has seldom been used in OpenAI's papers and documentation; instead, it is referred to as supervised fine-tuning on human demonstrations (i.e., the first step of the RLHF algorithm).
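One ingredient of RLHF is the reward model, trained on human preference pairs with a pairwise (Bradley-Terry style) loss: the human-preferred response should receive a higher score. Below is a minimal sketch assuming PyTorch; ToyRewardModel is a hypothetical stand-in for a real scalar-reward model, not OpenAI's implementation.

# Sketch: pairwise preference loss for reward-model training (PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyRewardModel(nn.Module):
    """Hypothetical stand-in: embeds token ids and pools them to one scalar per sequence."""
    def __init__(self, vocab_size=1000, d_model=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.score = nn.Linear(d_model, 1)

    def forward(self, ids):                       # ids: (batch, seq_len)
        return self.score(self.embed(ids).mean(dim=1)).squeeze(-1)

def preference_loss(reward_model, chosen_ids, rejected_ids):
    """-log sigmoid(r_chosen - r_rejected): pushes preferred responses above rejected ones."""
    return -F.logsigmoid(reward_model(chosen_ids) - reward_model(rejected_ids)).mean()

rm = ToyRewardModel()
chosen = torch.randint(0, 1000, (4, 16))          # toy "preferred" responses
rejected = torch.randint(0, 1000, (4, 16))        # toy "rejected" responses
loss = preference_loss(rm, chosen, rejected)
loss.backward()

The trained reward model then provides the reward signal for the policy-optimization step of RLHF.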
ChatGPT (based on GPT-3.5 and GPT-4) and GPT-4 (multimodal)

Stanford Alpaca is the first open instruction-following model fine-tuned based on LLaMA (7B).
Alpaca LoRA (a reproduction of Stanford Alpaca using LoRA; a minimal LoRA sketch follows below)
model, data, library
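Since Alpaca LoRA relies on LoRA, here is a minimal sketch of the LoRA idea itself, assuming PyTorch: the pretrained weight is frozen and only a low-rank update B·A is trained. The layer sizes and hyperparameters are illustrative.

# Sketch: a LoRA-adapted linear layer (PyTorch); sizes are illustrative.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad = False           # pretrained weight stays frozen
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        # y = x W^T + scaling * x A^T B^T  (only A and B receive gradients)
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(128, 128)
y = layer(torch.randn(2, 128))

Initializing lora_B to zeros makes the adapted layer start out identical to the frozen base layer, so fine-tuning begins from the pretrained behavior.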

General Text Data: webpages, books, and conversational text
Specialized Text Data: multilingual text, scientific text, and code
Existing work has found that duplicate data in a corpus reduces data diversity, which may cause the training process to become unstable and thus hurt model performance.
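A simple way to mitigate this at the document level is exact deduplication by hashing normalized text; this is only a minimal sketch, and real pipelines typically also apply fuzzy deduplication (e.g., MinHash/LSH) to catch near-duplicates.

# Sketch: exact document-level deduplication by hashing whitespace- and case-normalized text.
import hashlib

def dedupe(documents):
    """Keep the first occurrence of each document, comparing normalized text."""
    seen, unique = set(), []
    for doc in documents:
        key = hashlib.sha256(" ".join(doc.lower().split()).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique

corpus = ["Hello world.", "hello   world.", "Another page."]
print(dedupe(corpus))   # the near-identical duplicate is dropped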