However, discrete prompting (Shin et al., 2020; Gao et al., 2020) can lead to suboptimal performance in many cases compared to fine-tuning.

Problem statement: prompting can be implemented with methods such as masking, but such discrete prompting often yields suboptimal results compared to fine-tuning.

Prompt tuning is the idea of tuning only the continuous prompts.

Only the continuous prompts are updated during training. While prompt tuning improves over prompting on many tasks, it still underperforms fine-tuning when the model size is not large, specifically less than 10 billion parameters (Lester et al., 2021).

Prompt tuning is the idea of tuning only the continuous prompts; only these are updated during training. It works well on many tasks, but performance is still poor when the model is not large.

Our main contribution in this paper is a novel empirical finding that properly optimized prompt tuning can be comparable to fine-tuning universally across various model scales and NLU tasks.

The authors' main contribution is the empirical finding that properly optimized prompt tuning can be comparable to fine-tuning across NLU tasks and various model scales.

1. Preliminaries


NLU Tasks. In this work, we categorize NLU challenges into two families: simple classification tasks and hard sequence labeling tasks.

NLU (natural language understanding) problems can be divided into two categories: simple classification tasks and hard sequence labeling tasks.

Prompt Tuning. Trainable continuous prompts are used as a substitution for natural language prompts for NLU, with the parameters of the pretrained language model frozen.

Prompt Tuning: trainable continuous prompts are trained while the pretrained model stays frozen. This is illustrated well in Fig. 2.
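The core mechanic above — prepend trainable prompt embeddings to the input while the pretrained model's weights stay frozen — can be sketched in a few lines of PyTorch. This is a minimal illustration, not the paper's implementation; `TinyEncoder` is a hypothetical stand-in for a real pretrained language model.

```python
import torch
import torch.nn as nn

# Hypothetical tiny "pretrained" encoder standing in for a real LM (illustrative only).
class TinyEncoder(nn.Module):
    def __init__(self, hidden=16):
        super().__init__()
        self.layer = nn.Linear(hidden, hidden)

    def forward(self, x):
        return self.layer(x)

hidden, prompt_len, seq_len = 16, 4, 8
encoder = TinyEncoder(hidden)

# Freeze all pretrained parameters: only the prompt will be trained.
for p in encoder.parameters():
    p.requires_grad = False

# Continuous prompt: trainable embeddings prepended to the input sequence.
prompt = nn.Parameter(torch.randn(prompt_len, hidden))

x = torch.randn(seq_len, hidden)         # frozen input embeddings for one example
inputs = torch.cat([prompt, x], dim=0)   # shape: [prompt_len + seq_len, hidden]
out = encoder(inputs)

loss = out.sum()                         # dummy loss for illustration
loss.backward()

assert prompt.grad is not None           # the prompt receives gradients...
assert encoder.layer.weight.grad is None # ...while the frozen LM gets none
```

An optimizer for training would then be built over `[prompt]` only, which is why prompt tuning touches so few parameters compared to fine-tuning.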

2. P-Tuning v2

Lack of Universality

Lack of universality across scales.

However, for medium-sized models (from 100M to 1B) that are widely used, prompt tuning performs much worse than fine-tuning.

Medium-sized models are widely used, but on them prompt tuning performs much worse than fine-tuning.

Lack of universality across tasks.

Sequence tagging predicts a sequence of labels for each input token, which can be harder and incompatible with verbalizers.