Author
Kira Hunter
Word count
1596
Language
English
Hacker News points
None

Summary

BERT and GPT-3 are two influential modern tools in natural language processing (NLP) and natural language generation (NLG), leading the charge toward artificial general intelligence (AGI). BERT, introduced by Google in 2018, is a neural-network-based NLP pre-training technique that enables anyone to train their own state-of-the-art question answering system. It performs well on classification tasks such as sentiment analysis and question answering, and has been used by companies such as Google and Facebook.

GPT-3, on the other hand, is a general language model trained on largely unlabeled text from the internet; its largest version has about 175 billion parameters, roughly ten times more than the next-largest notable NLP model at the time. It learns these parameters from its training data and applies that "knowledge" to downstream tasks such as language inference, paraphrasing, and sentiment analysis. GPT-3 has shown promising performance in zero-shot, one-shot, and few-shot settings across many tasks, and it powers applications ranging from building websites to assisting with written content and generating machine learning code.

Architecturally, the two models differ in which half of the Transformer they use: BERT is built from the encoder stack, while GPT-3 is built from the decoder stack. Both have proven competitive with, or better than, previous state-of-the-art models on certain tasks. However, GPT-3's sheer size and black-box nature can be restrictive for smaller operations, while BERT requires additional task-specific fine-tuning, which consumes substantial resources. With new models such as BLOOM emerging, the future of NLP and NLG looks bright, busy, and full of possibilities.
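To make the BERT side of the summary concrete, here is a minimal sketch of the two workflows mentioned above, sentiment analysis and extractive question answering, using the Hugging Face transformers library. The checkpoint names are published community/Google models chosen for illustration; they are not necessarily the models any company uses in production.

```python
# Minimal sketch: BERT-family encoders applied to classification and QA.
# Assumes `pip install transformers torch`; model names are illustrative.
from transformers import pipeline

# Sentiment analysis with an encoder fine-tuned on SST-2.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(classifier("BERT makes question answering surprisingly accessible."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]

# Extractive question answering with BERT fine-tuned on SQuAD.
qa = pipeline(
    "question-answering",
    model="bert-large-uncased-whole-word-masking-finetuned-squad",
)
answer = qa(
    question="Who introduced BERT?",
    context="BERT is a pre-training technique introduced by Google in 2018.",
)
print(answer["answer"])  # e.g. "Google"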
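The few-shot behavior attributed to GPT-3 above can be sketched with OpenAI's legacy Completions API (the openai Python package before version 1.0). The model name and prompt here are illustrative; note that GPT-3's weights are only reachable through the API, which is part of the black-box limitation the summary raises.

```python
# Sketch: few-shot sentiment classification with GPT-3, no fine-tuning.
# Assumes `pip install "openai<1.0"` and an OPENAI_API_KEY in the environment.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

# Few-shot: a handful of labeled examples live in the prompt itself.
prompt = """Classify the sentiment of each review as Positive or Negative.

Review: The plot dragged and the acting was wooden.
Sentiment: Negative

Review: A funny, heartfelt film with a great cast.
Sentiment: Positive

Review: I walked out halfway through.
Sentiment:"""

response = openai.Completion.create(
    model="text-davinci-002",  # a GPT-3-era model, chosen for illustration
    prompt=prompt,
    max_tokens=3,
    temperature=0,
)
print(response.choices[0].text.strip())  # expected: "Negative"
```

Dropping the two labeled examples from the prompt turns the same call into a zero-shot query, which is exactly the spectrum of settings the summary refers to.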
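Finally, the encoder/decoder distinction can be seen in one contrast, sketched here with small openly available stand-ins (bert-base-uncased, and GPT-2 in place of GPT-3, whose weights are not public): an encoder predicts a masked token using context from both directions, while a decoder generates text left to right.

```python
# Sketch: encoder (bidirectional fill-in) vs. decoder (autoregressive) behavior.
# Assumes `pip install transformers torch`; GPT-2 stands in for GPT-3.
from transformers import pipeline

# BERT (encoder): bidirectional masked-token prediction.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
print(fill_mask("Natural language processing is a [MASK] field.")[0]["token_str"])

# GPT-style (decoder): next-token generation, left to right.
generator = pipeline("text-generation", model="gpt2")
print(generator("Natural language processing is", max_new_tokens=10)[0]["generated_text"])
```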