Recent LLMs

1. Bloom (2022)

1) Open source

2) 176B parameters

3) decoder-only Transformer

4) trained on the ROOTS corpus

2. OPT (2022)

1) Open source

2) a suite of decoder-only pre-trained transformers

3) 125M to 175B parameters

4) OPT-175B is comparable to GPT-3, while requiring only 1/7th the carbon footprint to develop

3. Mistral 7B (2023)

1) 7B parameters

2) Reported to outperform Llama 1 and Llama 2 (including the larger Llama 2 13B)

3) transformer based

4) grouped-query attention (GQA)

5) sliding window attention (SWA)
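
Sliding window attention restricts each token to attending over a fixed number of recent positions instead of the full prefix. The sketch below builds such a causal mask in plain Python. The function name `sliding_window_causal_mask` and the boundary convention (the window counts the current token) are assumptions made for illustration; real implementations differ in the exact boundary and build the mask as a tensor.

```python
def sliding_window_causal_mask(seq_len, window):
    """Return mask[i][j] == True when query position i may attend to key position j.

    Assumed convention: position i sees itself and the (window - 1) positions before it.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        [j <= i and i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


mask = sliding_window_causal_mask(seq_len=5, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
# x....
# xx...
# xxx..
# .xxx.
# ..xxx
```

With a fixed window, the attention cost per token stays constant as the sequence grows, while stacked layers still let information travel further back than a single window.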

4. DeciLM 6B (2023)

1) Open source

2) 5.7B parameters

3) decoder-only transformer

4) grouped-query attention (GQA)

5) Reported to outperform Llama 2-7B
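
Grouped-query attention, listed above for both Mistral 7B and DeciLM 6B, lets several query heads share one key/value head, which shrinks the key/value cache. The following is a minimal pure-Python sketch of the idea, not either model's implementation: the function name `grouped_query_attention`, the nested-list layout, and the toy head counts (4 query heads, 2 key/value heads) are assumptions chosen for readability.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def grouped_query_attention(q, k, v):
    """Causal attention where groups of query heads share a K/V head.

    q: [n_q_heads][seq_len][d]; k and v: [n_kv_heads][seq_len][d].
    """
    n_q_heads, n_kv_heads = len(q), len(k)
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    d = len(q[0][0])
    out = []
    for h in range(n_q_heads):
        kv = h // group_size  # query heads in the same group read the same K/V head
        head_out = []
        for i, q_i in enumerate(q[h]):
            # Causal: position i only scores positions 0..i.
            scores = [dot(q_i, k[kv][j]) / math.sqrt(d) for j in range(i + 1)]
            weights = softmax(scores)
            head_out.append(
                [sum(w * v[kv][j][c] for j, w in enumerate(weights)) for c in range(d)]
            )
        out.append(head_out)
    return out


# Toy input: 4 query heads, 2 K/V heads, sequence length 2, head dimension 2.
q = [[[1.0, 0.0], [0.0, 1.0]] for _ in range(4)]
k = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [1.0, 1.0]]]
v = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
out = grouped_query_attention(q, k, v)
```

Here query heads 0 and 1 read K/V head 0, and query heads 2 and 3 read K/V head 1, so only two K/V heads need to be cached instead of four. Setting the number of K/V heads equal to the number of query heads recovers standard multi-head attention, and a single K/V head gives multi-query attention.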

5. Baichuan 2 (2023)

1) large-scale multilingual

2) 7B and 13B parameters

6. GLM (2023)

1) Open source

2) bilingual (English, Chinese)

3) 130B parameters

4) Reported to outperform GPT-3

7. Rhea-72b

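1) Open source

2) 72B parameters

3) Self-Generated Dataset Creation Method (SGD): when a sentence generated by the model differs from the reference answer, it is added to the dataset

4) Ranked 1st on the Hugging Face LLM leaderboard at the time of writing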

5) transformer based
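
The SGD rule in item 3 can be sketched as a simple filter: generate an answer for each prompt, compare it with the reference answer, and keep the cases that differ. Everything named here is hypothetical and only illustrates that one rule: `collect_mismatches`, the record layout, exact string comparison as the mismatch test, and the stand-in `toy_generate` function are assumptions, and the note does not describe how the collected data is used afterwards.

```python
def collect_mismatches(examples, generate):
    """Keep the examples where the model's output differs from the reference answer.

    examples: iterable of (prompt, reference_answer) pairs.
    generate: callable mapping a prompt string to a generated string (hypothetical).
    """
    dataset = []
    for prompt, reference in examples:
        generated = generate(prompt)
        # Assumed mismatch test: exact string comparison after trimming whitespace.
        if generated.strip() != reference.strip():
            dataset.append(
                {"prompt": prompt, "reference": reference, "generated": generated}
            )
    return dataset


# Stand-in for a real model: a lookup table with one wrong answer.
canned = {"2 + 2 = ?": "4", "Capital of France?": "Lyon"}


def toy_generate(prompt):
    return canned.get(prompt, "")


examples = [("2 + 2 = ?", "4"), ("Capital of France?", "Paris")]
mismatches = collect_mismatches(examples, toy_generate)
print(mismatches)
# [{'prompt': 'Capital of France?', 'reference': 'Paris', 'generated': 'Lyon'}]
```

Only the prompt the stand-in model got wrong ends up in the collected dataset, which concentrates later training on the model's current failure cases.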
