P-tuning is a method introduced to improve the performance of GPTs on natural language understanding (NLU) tasks by learning trainable continuous prompt embeddings, achieving results superior to or on par with similar-sized BERT models. The approach substantially improves results on the LAMA knowledge-probing benchmark and also boosts BERT performance in both few-shot and fully supervised settings, while reducing the need for manual prompt engineering. These results suggest that language models capture more world knowledge and task-specific knowledge than previously believed. The method also speaks to two practical difficulties: giant models transfer poorly across downstream tasks, and full fine-tuning becomes ineffective at trillion-parameter scale. Handcrafted prompts, by contrast, depend on large validation sets and are prone to overfitting. Overall, P-tuning demonstrates that GPTs can perform competitively with BERTs on NLU and suggests that pre-trained language models hold a deeper grasp of their stored knowledge than handcrafted prompts reveal.
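The core mechanism can be sketched in a few lines: instead of inserting handcrafted prompt words, P-tuning prepends a small matrix of trainable continuous vectors ("virtual tokens") to the frozen token embeddings before they enter the model, and only those vectors are optimized. The sizes and names below are illustrative assumptions, not values from the paper, and the frozen embedding table stands in for a real pretrained model.

```python
import numpy as np

# Hypothetical sizes chosen for illustration (not from the paper).
vocab_size, embed_dim = 100, 16
num_prompt_tokens = 4  # length of the trainable continuous prompt
rng = np.random.default_rng(0)

# Frozen pretrained token embeddings (stand-in for the language model's table).
token_embeddings = rng.normal(size=(vocab_size, embed_dim))

# Trainable continuous prompt embeddings: in P-tuning, these are the
# only parameters updated by gradient descent; the model stays frozen.
prompt_embeddings = rng.normal(size=(num_prompt_tokens, embed_dim))

def build_input(token_ids):
    """Prepend the continuous prompt to the embedded input tokens."""
    embedded = token_embeddings[token_ids]  # (seq_len, embed_dim)
    return np.concatenate([prompt_embeddings, embedded], axis=0)

# A 3-token input becomes a (4 + 3)-position sequence for the model.
seq = build_input([5, 17, 42])
print(seq.shape)  # (7, 16)
```

Because the prompt vectors live in continuous embedding space rather than the discrete vocabulary, they can be tuned directly with gradients, which is what removes the need for manual prompt search over words.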