Generation(2)
GPT Summary
1. GPT (Generative Pre-Training)
• goal: learn a universal representation
• generative pre-training (unlabeled text) + discriminative fine-tuning (labeled text)
1.1. Unsupervised pre-training
1.2. Supervised fine-tuning
2. GPT-2
• differences from BERT (a toy BPE sketch follows this entry):
  - Direction: GPT-2 is uni-directional (auto-regressive, masks future tokens); BERT is bi-directional
  - Tokenizer: GPT-2 uses BPE (Byte-Pair Encoding); BERT uses a WordPiece tokenizer
  - Fine-..
2020.10.07
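The comparison above names BPE as GPT-2's tokenizer. Below is a minimal sketch of the BPE merge-learning loop on a toy whitespace-split corpus; the function name learn_bpe_merges and the sample words are illustrative (not from the original post), and GPT-2 applies the same idea at the byte level.

```python
from collections import Counter

def learn_bpe_merges(corpus, num_merges):
    """Toy Byte-Pair Encoding: repeatedly merge the most frequent
    adjacent symbol pair across the word vocabulary."""
    vocab = Counter(tuple(word) for word in corpus)  # each word as a tuple of symbols
    merges = []
    for _ in range(num_merges):
        # count adjacent symbol pairs, weighted by word frequency
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = pairs.most_common(1)[0][0]
        merges.append(best)
        # apply the chosen merge to every word in the vocabulary
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges

if __name__ == "__main__":
    corpus = ["low", "low", "lower", "newest", "newest", "newest", "widest"]
    print(learn_bpe_merges(corpus, 5))  # prints the learned merge pairs, most frequent first
```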
BLEU and BLEURT: evaluation for text generation (Summary)
BLEU and BLEURT
1. BLEU (Bilingual Evaluation Understudy, 2002)
“The closer a machine translation is to a professional human translation, the better it is.”
The score lies between 0 and 1.
- BP: brevity penalty
- p_n: modified n-gram precision
- w_n: positive weights (baseline: 1/N)
- N: maximum n-gram length (baseline: N=4)
1.1. Modified n-gram precision (a worked sketch follows this entry)
- unigram precision: 7/7 (the number of candidate tokens that also appear in the reference) ..
2020.10.07
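From the definitions in the excerpt, BLEU combines the modified n-gram precisions with the brevity penalty as BLEU = BP * exp(sum_n w_n * log p_n), with BP = 1 if the candidate is longer than the closest reference and exp(1 - r/c) otherwise. A minimal Python sketch under those definitions, assuming whitespace-tokenized inputs; the helper names modified_precision and bleu are illustrative, and the sample sentences are the classic "the the the ..." clipping example that the 7/7 figure above refers to.

```python
import math
from collections import Counter

def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram count is clipped by
    its maximum count in any single reference."""
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    max_ref = Counter()
    for ref in references:
        ref_counts = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        for ng, c in ref_counts.items():
            max_ref[ng] = max(max_ref[ng], c)
    clipped = sum(min(c, max_ref[ng]) for ng, c in cand.items())
    total = sum(cand.values())
    return clipped / total if total else 0.0

def bleu(candidate, references, N=4):
    """BLEU = BP * exp(sum_n w_n * log p_n) with uniform weights w_n = 1/N."""
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]  # closest reference length
    bp = 1.0 if c > r else math.exp(1 - r / c)  # brevity penalty
    p = [modified_precision(candidate, references, n) for n in range(1, N + 1)]
    if min(p) == 0:
        return 0.0
    return bp * math.exp(sum(math.log(pn) / N for pn in p))

if __name__ == "__main__":
    refs = ["the cat is on the mat".split(), "there is a cat on the mat".split()]
    over = "the the the the the the the".split()
    print(modified_precision(over, refs, 1))  # 2/7 after clipping (plain unigram precision would be 7/7)
    print(bleu("the cat is on the mat".split(), refs))  # 1.0: candidate matches a reference exactly
```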