Timeline by Antoine Louis on A Brief History of Natural Language Processing
Illustration of neural network by DeepMind design and Novoto Studio
(A) Large Language Model (LLM)
It all started with the Large Language Model (LLM), a type of pre-trained neural network designed to understand and generate natural language in a way that resembles how humans use it. ChatGPT builds on the GPT-3 family of models, the largest of which has 175 billion parameters, and this scale helps it generate text that is remarkably similar to human writing. These models are engineered to process a large corpus of text data and learn the patterns and structures of natural language. When fed a large dataset of text from sources such as Wikipedia and Reddit, the model can analyze and learn the relationships between the words and phrases in that text. As the model continues to learn and refine its understanding of natural language, it becomes increasingly adept at generating high-quality text outputs.
Training objectives that predict a word in a sentence, whether next-word prediction or masked language modelling, are crucial in shaping a high-accuracy LLM. Both techniques were commonly implemented with Long Short-Term Memory (LSTM) networks, which have feedback connections and can therefore process entire sequences of data rather than only single data points such as images. However, an LSTM reads a sequence one step at a time, a drawback that limits how much it can gain from large datasets.
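To make the two training objectives concrete, here is a minimal sketch of how training examples could be built from a single sentence. The function names, the `[MASK]` placeholder, and the masking probability are assumptions chosen for illustration; they are not taken from any particular model's training pipeline.

```python
import random

def next_word_examples(tokens):
    """Pair each prefix of the sentence with the word that follows it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def masked_example(tokens, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """Hide a random subset of words and record which words were hidden.

    mask_prob, mask_token and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    masked, targets = [], {}
    for position, token in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(mask_token)
            targets[position] = token  # the word the model must recover
        else:
            masked.append(token)
    return masked, targets

sentence = "the cat sat on the mat".split()

# Next-word prediction: the model sees only the words to the left.
pairs = next_word_examples(sentence)
for context, target in pairs:
    print(" ".join(context), "->", target)

# Masked language modelling: the model sees words on both sides of each gap.
masked, targets = masked_example(sentence, mask_prob=0.3)
print(masked, targets)
```

In next-word prediction the model only ever sees the left-hand context, whereas in masked language modelling it can use the words on both sides of each hidden position.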
To address this, a team at Google Brain introduced transformers in 2017, which significantly improved the ability of LLMs to incorporate meaning, as well as their capacity to handle much larger datasets. Transformers differ from LSTMs in that they can process all input data at the same time. Thanks to a self-attention mechanism, the model can assign varying importance to different parts of the input in relation to any position in the sequence.
A simple yet comprehensive animation by Raimi Karim illustrating the self-attention mechanism
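The idea of weighting every part of the input against every position can be sketched in a few lines. This is a simplified illustration, not a full transformer layer: it assumes the token vectors themselves serve as queries, keys, and values, whereas a real transformer applies learned projection matrices and several attention heads. The function names and the toy input are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention without learned projections.

    Assumption: each token vector acts as its own query, key and value.
    """
    dim = len(vectors[0])
    outputs, weights = [], []
    for query in vectors:
        # Score this position against every position in the sequence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        row = softmax(scores)
        # The output is a weighted mix of all token vectors.
        mixed = [
            sum(row[j] * vectors[j][c] for j in range(len(vectors)))
            for c in range(dim)
        ]
        weights.append(row)
        outputs.append(mixed)
    return outputs, weights

# Three toy token vectors (made-up numbers, for illustration only).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` shows how much one position attends to every other position. Because no row depends on the result of another, all positions can be computed at the same time, which is the contrast with the step-by-step processing of an LSTM.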
Source: OpenAI