Pegasus Introduction
by mervyn
This post draws on 김한길's YouTube videos and other technical blogs.
What is Pegasus?
Pegasus is pre-trained jointly on two self-supervised objectives: Masked Language Modeling (MLM) and a novel summarization-specific pretraining objective called Gap Sentence Generation (GSG).
In contrast to BART, Pegasus' pretraining task is intentionally similar to summarization: important sentences are masked and then generated together as one output sequence from the remaining sentences, similar to an extractive summary.
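For readers who want to try the model, here is a minimal inference sketch using the Hugging Face `transformers` API; the `google/pegasus-xsum` checkpoint and the toy document are illustrative choices, not something prescribed by this post.

```python
# A minimal sketch, assuming the Hugging Face `transformers` library
# is installed; `google/pegasus-xsum` is an illustrative checkpoint.
from transformers import PegasusTokenizer, PegasusForConditionalGeneration

model_name = "google/pegasus-xsum"
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name)

document = (
    "PEGASUS is pre-trained by masking whole sentences from a document "
    "and generating them as one output sequence. This objective was "
    "designed to resemble abstractive summarization."
)

# Tokenize the source document and generate a summary with beam search.
inputs = tokenizer(document, truncation=True, return_tensors="pt")
summary_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```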
Gap Sentence Generation
A self-supervised objective designed specifically for summarization.
- Mask whole sentences from a document (the gap sentences)
- Train the model to generate the gap sentences from the rest of the document
- Choose important sentences to mask: score each sentence by ROUGE-1 F1 against the remaining sentences and select the top scorers (see the sketch after this list)
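A toy sketch of the selection step under the paper's "independent" (Ind) strategy: score each sentence by ROUGE-1 F1 against the rest of the document and mask the top scorers. The unigram-based `rouge1_f1` helper and the `<mask_1>` token string here are simplifications for illustration, not the paper's exact implementation.

```python
# Toy GSG-style sentence selection, for illustration only.
from collections import Counter

MASK_TOKEN = "<mask_1>"  # placeholder mask string, assumed for this sketch


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between two whitespace-tokenized strings."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def select_gap_sentences(sentences: list[str], num_gaps: int) -> list[int]:
    """Score each sentence against the rest of the document and return
    the indices of the `num_gaps` top-scoring (most 'important') ones."""
    scores = []
    for i, sent in enumerate(sentences):
        rest = " ".join(s for j, s in enumerate(sentences) if j != i)
        scores.append((rouge1_f1(sent, rest), i))
    return sorted(i for _, i in sorted(scores, reverse=True)[:num_gaps])


sentences = [
    "Pegasus masks important sentences during pretraining.",
    "The weather was pleasant that day.",
    "The masked sentences are generated from the remaining document.",
]
gaps = select_gap_sentences(sentences, num_gaps=1)

# Build the pretraining pair: masked input and gap-sentence target.
masked_input = " ".join(
    MASK_TOKEN if i in gaps else s for i, s in enumerate(sentences)
)
target = " ".join(sentences[i] for i in gaps)
print("input: ", masked_input)
print("target:", target)
```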