Large Language Model

What is a "Large Language Model"?

A large language model (LLM) is an AI model for natural language processing (NLP) that is large enough to serve as a general-purpose model, both for interpreting text written by humans and for generating human-like text. A well-known example is the GPT family of models that powers ChatGPT.

Traditionally, AI models were trained for specific tasks, such as identifying whether the sentiment of a text is positive (e.g. happy) or negative (e.g. angry). Such models had relatively few parameters and were trained on small datasets. By contrast, large language models have a very large number of parameters and are trained on immense datasets. This allows them to learn patterns that occur only rarely in language, because even a rare pattern appears many times in a sufficiently large dataset.
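
To make the idea of a small, task-specific model concrete, here is a minimal sketch of a sentiment classifier written in Python. It is a toy Naive Bayes model with add-one smoothing, chosen only for illustration; the training sentences, the function names `train` and `classify`, and the two labels are all assumptions made for this example and do not come from any particular library or product.

```python
import math
from collections import Counter

# Hypothetical toy training data. A real sentiment model would be
# trained on thousands of labeled texts, not four.
TRAIN = [
    ("i love this great movie", "positive"),
    ("what a happy wonderful day", "positive"),
    ("i hate this awful movie", "negative"),
    ("what a terrible angry day", "negative"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest Naive Bayes log-score."""
    word_counts, label_counts, vocab = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Start from the log prior of the label.
        score = math.log(label_counts[label] / total_examples)
        words_in_label = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # words never seen in training carry no signal
            # Add-one smoothing keeps the probability above zero.
            numerator = word_counts[label][word] + 1
            score += math.log(numerator / (words_in_label + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(classify("a wonderful happy movie", model))
print(classify("an awful terrible day", model))
```

The whole "model" here is a handful of word counts, and it can do exactly one thing: choose between two labels. It also ignores any word it never saw during training, which is the kind of gap that a much larger model trained on much more data is meant to close.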

Quotes

An LLM is a model that is so large that it achieves general-purpose language understanding and generation.

https://pamelafox.github.io/my-py-talks/gpt-python/#/4 (accessed 2025-02-21)

See also: "Using LLMs in Python" (on YouTube) (accessed 2025-02-21).

Large-scale, higher-order N-gram language models (e.g., N=5) have proven very effective in many applications, such as automatic speech recognition and machine translation. At Facebook, for example, this is used to automatically generate captions for videos uploaded to pages, and detecting pages with potentially low quality place names (eg. “Home sweet home,” “Apt #00, Fake lane, Foo City”).

Language models trained with large datasets have better accuracy compared with ones trained with smaller datasets. The possibility of covering ample instances of infrequent words (or N-grams) increases with a larger dataset. For training with larger dataset, distributed computing frameworks (e.g. MapReduce) are generally used for better scalability and parallelizing model training.

https://engineering.fb.com/2017/02/07/core-infra/using-apache-spark-for-large-scale-language-model-training/ (accessed 2025-02-21)
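
The N-gram idea in the quote above can be sketched in a few lines of Python. This is not the code from the cited article, which describes distributed training at scale; it is a single-process illustration that uses N=2 (bigrams) instead of N=5 to keep the example short. The helper names `ngrams`, `count_ngrams`, and `probability`, and the two small corpora, are assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token tuples in a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(sentences, n):
    """Count every n-gram across a list of whitespace-tokenized sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(ngrams(sentence.split(), n))
    return counts


def probability(word, context, sentences):
    """Unsmoothed maximum-likelihood estimate of P(word | context)."""
    context = tuple(context)
    counts = count_ngrams(sentences, len(context) + 1)
    # How often the context was followed by any word at all.
    context_total = sum(c for gram, c in counts.items() if gram[:-1] == context)
    if context_total == 0:
        return 0.0
    return counts[context + (word,)] / context_total


# Hypothetical corpora: the larger one simply contains more sentences.
small = ["the cat sat on the mat", "the dog sat on the rug"]
large = small + ["the cat slept on the sofa", "the bird sat on the fence"]

print(probability("slept", ("cat",), small))
print(probability("slept", ("cat",), large))
```

With the small corpus, the bigram "cat slept" was never observed, so the unsmoothed model assigns it a probability of zero. Once the corpus grows to include it, the estimate becomes non-zero. This is the coverage effect the quote describes: a larger dataset is more likely to contain enough instances of infrequent N-grams.
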
Written by Noel Santos.

About the Author

I'm a self-taught Brazilian programmer with an IT degree from a FATEC. In a world of increasingly complex and essential computers, I decided to use my technical expertise in hardware, desktop applications, and web technologies to create an informative resource that makes PCs easier to understand.
