Notice: [November 23, 2023] Large Language Models (in 2023) - Hyung Won Chung P…
Abstract
There is one unique aspect of large language models (LLMs): larger models exhibit abilities that were not present in smaller models. These emergent abilities
have far-reaching consequences for how we should work in the field of AI. I will share some of my observations on the implications of scaling and emergent
abilities. After that, I will introduce the multiple stages involved in the current generation of LLM training: pre-training and post-training (including
instruction fine-tuning and RLHF).
Bio
Hyung Won is a research scientist on the ChatGPT team at OpenAI. He has worked on various aspects of large language models: pre-training, instruction fine-tuning,
reinforcement learning from human feedback, reasoning, multilinguality, parallelism strategies, etc. Before OpenAI, he spent 3.5 years at Google Brain.
His notable work includes the scaling Flan papers (Flan-T5, Flan-PaLM) and T5X, the training framework used to train the PaLM language model. He has
participated in open-source projects such as Flan-T5, Switch Transformer, and UL2. Before Google, he received a PhD from MIT, where he worked on renewable
energy and clean water systems.
Related Links
- https://hwchung27.github.io