How to Get Started with Generative Models and Large Language Models (LLMs)

Almost everyone has heard of ChatGPT or Gemini by now, and many people use them in their daily lives. But what exactly are these tools? At a high level, each is built on an artificial intelligence (AI) model trained on massive amounts of text data. This training allows the model to generate new, human-like text based on the context it is given.

Digging a bit deeper, the "GPT" in ChatGPT stands for Generative Pretrained Transformer, and "Chat" refers to its conversational use. Let’s break that down:

  • Generative: This refers to AI models that focus on generating new data (like text, images, video, or even audio), rather than classifying data. For example, while a discriminative model might tell you whether a sentence is grammatically correct, a generative model can write an entirely new sentence for you.

  • Pretrained Transformer: This means the model is built on the Transformer architecture (a neural network design introduced in 2017) and has been pretrained on a massive corpus of text before being fine-tuned for specific tasks like chatting, summarization, or translation.
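
To make the "generative" idea above concrete, here is a minimal sketch in Python. It is not how ChatGPT works internally: it swaps the Transformer for a toy bigram model that only counts which word follows which. The corpus and the function names (train_bigram, generate) are made up for illustration. What it shares with a real LLM is the overall pattern: learn from text first, then produce new text one word at a time based on the preceding context.

```python
import random
from collections import defaultdict


def train_bigram(corpus):
    """'Pretraining' in miniature: record which word follows which."""
    followers = defaultdict(list)
    for sentence in corpus:
        # <s> and </s> are hypothetical start/end markers for this sketch.
        words = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            followers[prev].append(nxt)
    return followers


def generate(followers, rng, max_words=20):
    """Sample a sentence word by word, each conditioned on the previous word."""
    word, out = "<s>", []
    for _ in range(max_words):
        word = rng.choice(followers[word])
        if word == "</s>":
            break
        out.append(word)
    return " ".join(out)


# A tiny, made-up training corpus (an assumption for illustration only).
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat chased the dog",
]

model = train_bigram(corpus)
sentence = generate(model, random.Random(0))
print(sentence)
```

The generated sentence may not appear anywhere in the corpus, yet every word pair in it was seen during training. A discriminative model, by contrast, would take a sentence as input and return a label for it rather than produce a new one.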

So, to truly understand how ChatGPT and similar tools work, you'll need to get familiar with two key areas:

  1. Generative Deep Learning

  2. Pretrained Transformers, which are the backbone of today’s Large Language Models (LLMs), including the models behind ChatGPT.

This article is a beginner-friendly guide to the core ideas behind Generative Deep Learning and Large Language Models. It assumes you are already familiar with the basics of Deep Learning.
