1. BLOOM (2022)
1) open source
2) 176B parameters
3) decoder-only Transformer
4) trained on the ROOTS corpus
2. OPT (2022)
1) open source
2) a suite of decoder-only pre-trained transformers
3) 125M ~ 175B parameters
4) OPT-175B is comparable to GPT-3, while requiring only 1/7th the carbon footprint
3. Mistral 7B (2023)
1) 7B parameters
2) outperforms Llama 1 and Llama 2 on benchmarks
3) transformer based
4) grouped-query attention (GQA)
5) sliding window attention (SWA); see the sketch after this list
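A minimal PyTorch sketch of GQA + SWA combined, assuming toy sizes (the head counts and the window of 4 here are made up for illustration; Mistral 7B itself uses 32 query heads, 8 KV heads, and a 4096-token window):

```python
import torch
import torch.nn.functional as F

# Toy sizes for illustration only.
batch, seq_len, head_dim = 1, 16, 64
n_q_heads, n_kv_heads, window = 8, 2, 4
group = n_q_heads // n_kv_heads  # query heads per KV head

q = torch.randn(batch, n_q_heads, seq_len, head_dim)
k = torch.randn(batch, n_kv_heads, seq_len, head_dim)  # fewer KV heads -> smaller KV cache
v = torch.randn(batch, n_kv_heads, seq_len, head_dim)

# GQA: each group of query heads shares one K/V head, so repeat K/V across the group.
k = k.repeat_interleave(group, dim=1)
v = v.repeat_interleave(group, dim=1)

# SWA: a causal mask that also limits attention to the last `window` positions.
i = torch.arange(seq_len).unsqueeze(1)
j = torch.arange(seq_len).unsqueeze(0)
mask = (j <= i) & (j > i - window)  # True where attention is allowed

out = F.scaled_dot_product_attention(q, k, v, attn_mask=mask)
print(out.shape)  # torch.Size([1, 8, 16, 64])
```

Cutting KV heads from 8 to 2 shrinks the KV cache 4x, which is the main inference-speed win of GQA; SWA bounds per-token attention cost by the window size instead of the full sequence length.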
4. DeciLM 6B (2023)
1) open source
2) 5.7B parameters
3) decoder-only transformer
4) grouped-query attention (GQA)
5) outperforms Llama 2-7B
5. Baichuan 2 (2023)
1) large-scale multilingual language models
2) 7B, 13B
6. GLM (2023)
1) open source
2) bilingual (English, Chinese)
3) 130B
4) outperforms GPT-3
7. Rhea-72b (2024)
1) open source
2) 72B
3) Self-Generated Dataset creation method (SGD): if a generated sentence differs from the gold-answer sentence, it is added to the dataset (see the sketch after this list)
4) ranked #1 on the Hugging Face Open LLM Leaderboard at the time of writing
5) transformer based
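A minimal sketch of the SGD filtering idea from 3), assuming a Hugging Face-style `generate()` API; the function name, exact-match comparison, and data format are hypothetical illustrations, not Rhea's actual pipeline:

```python
# Keep only the (prompt, gold) pairs where the model's own generation
# differs from the gold sentence, so the next round trains on its mistakes.
def build_sgd_dataset(model, tokenizer, pairs, max_new_tokens=128):
    dataset = []
    for prompt, gold in pairs:
        inputs = tokenizer(prompt, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
        generated = tokenizer.decode(
            output_ids[0][inputs["input_ids"].shape[1]:],  # drop the echoed prompt
            skip_special_tokens=True,
        )
        if generated.strip() != gold.strip():
            # Generation differs from the gold answer -> add to the dataset.
            dataset.append({"prompt": prompt, "answer": gold})
    return dataset
```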