IA & LLM

Introduction to LLMs and SLMs Explained Simply (with Diagrams and Real Code)

Introduction to LLMs and SLMs: The Essentials in One Article — Real Code, Diagrams, and Concrete Steps, Excerpts from a 44-Lesson Course.

REHOUMA Haythem

12 Jun 2026 • 10 min read

A no-nonsense guide: Introduction LLMs SLMs dissected with diagrams, concrete examples and tested commands. Everything comes from a structured 11-chapter course — here are the best parts.

tl;dr

Introduction and Installation
How an LLM Works
Transformer Architecture
Overview of LLMs in 2026
SLMs Small Language Models

~$ cat ./parcours.md # Introduction LLMs SLMs — 10 chapters

Introduction and Installation

→ Course presentation and brief history of LLMs→ Install Ollama and run your first model+ 1 more lessons

How an LLM Works

→ Tokens: how the model sees text→ Context window (context window)+ 2 more lessons

Transformer Architecture

→ The "Attention is All You Need" paper explained→ Encoder vs Decoder vs Encoder-Decoder+ 2 more lessons

Overview of LLMs in 2026

→ Proprietary LLMs: OpenAI, Anthropic, Google→ Open-weights models: Llama, Mistral, Qwen, Gemma+ 2 more lessons

SLMs Small Language Models

→ SLM vs LLM: definition and threshold→ Overview: Phi-3, Gemma, TinyLlama, Qwen-small+ 2 more lessons

Local Inference With Ollama

→ Essential Ollama commands→ Quantization: Q4, Q5, Q8 explained+ 2 more lessons

Hugging Face Transformers

→ Installation and first pipeline→ AutoModel and AutoTokenizer+ 2 more lessons

Choosing the Right Model

→ Criteria: cost, latency, privacy, quality→ LLM decision matrix cloud / open / SLM+ 1 more lessons

🏁

Final project (+ 2 chapters along the way)

→ You leave with a concrete and demonstrable project

Install Ollama and run your first model

NOTEObjective — Install Ollama on Windows, macOS or Linux, download your first model and have a complete conversation with an LLM running entirely locally on your machine, with no Internet connection after the initial download.

Learning objectives

TIPAt the end of this module

Install Ollama on your operating system
Verify that the service is running correctly
Download a model from the Ollama library
Start a conversation with a local model
Understand where models are stored on your machine
Know the basic Ollama commands

Why Ollama?

Ollama is an open-source tool that radically simplifies using LLMs locally. Where you previously had to manually manage quantization, GPU bindings and Python dependencies, Ollama gives you a single binary and a command as simple as ollama run llama3. It has become the reference in 2026 for running an LLM on your laptop.

Simplicity

A single command to download and run a model. No manual GPU configuration.

Multi-platform

Windows, macOS (Apple Silicon) and Linux. Automatically optimized for CPU and GPU.

Built-in REST API

Ollama exposes a local API on http://localhost:11434 for integration into your apps.

Step-by-step installation

Go to https://ollama.com/download and choose your system. Installation takes less than two minutes.

Windows

TIPTip: type /bye to exit the conversation and /? to see all available commands in Ollama's interactive mode.

Where are the models stored?

Models can be large. Knowing where they live avoids unpleasant storage surprises.

OS	Default location
Windows	`C:\Users\<votre-nom>\.ollama\models`
macOS	`~/.ollama/models`
Linux	`/usr/share/ollama/.ollama/models`

To change this location (to an external drive for example), set the OLLAMA_MODELS environment variable before starting the service.

Ollama API and Python integration

NOTEObjective — Discover Ollama's local API and call it from Python, to integrate a local LLM into your own scripts and applications.

Learning objectives

TIPAt the end of this module

Understand that Ollama exposes a local HTTP API
Call the API with curl and from Python
Use the official Python library
Pass a system prompt and options
Integrate a local LLM into an application

Ollama is also a server

In addition to the command line, Ollama runs in the background as a local HTTP server, accessible at http://localhost:11434. Everything the CLI does, you can do via HTTP request, from any language.

The bridge to code

From chatting in the terminal to the Python API, you now know how to integrate a local LLM into any program.

Installation and first pipeline

NOTEObjective — Install the Transformers library from Hugging Face and perform your first inference in Python with the simplest abstraction: the pipeline.

Learning objectives

TIPAt the end of this module

Install Transformers and its dependencies
Understand what a pipeline is
Run sentiment analysis in 3 lines
Know the available tasks
Load a specific model into a pipeline

Hugging Face: the GitHub of models

Hugging Face provides the Transformers library, which has become the standard for using open-source models in Python. It gives access to hundreds of thousands of models through a uniform interface.

Ready-to-use tasks

Task (string)	What it does
`sentiment-analysis`	Determines whether a text is positive or negative.
`text-generation`	Completes or generates text.
`summarization`	Summarizes a long text.
`translation`	Translates from one language to another.
`question-answering`	Answers a question from a provided context.
`zero-shot-classification`	Classifies a text into categories you define.

Choosing a specific model

For French or a specific need, explicitly specify the model (its Hugging Face identifier).

WARNINGWarning: Default generation pipelines use small, older models (such as GPT-2). Do not judge the quality of modern LLMs by them: these are teaching tools, not production models.

go-further

This article covers the most useful excerpts — the full Introduction LLMs SLMs course (11 chapters, 44 lessons, corrected exercises and final project) takes you all the way.

./access-the-full-course free course: Prompt Engineering

FAQ

How long does it take to learn Introduction LLMs SLMs?

With a structured progression (11 chapters, 44 short and practical lessons), you reach an operational level in a few weeks at 30 to 60 minutes per day. The key is to practice each concept immediately.

Are there any prerequisites?

No prerequisites: the course starts from zero; every concept is introduced before being used.

Where to start concretely?

Reproduce the commands in this article, then follow the full Introduction LLMs SLMs course: it chains the 44 lessons in order, with exercises and a final project.

./further-reading

→ Effective AI Prompts: the 9 key steps to go from zero to operational → Get started with Advanced Prompt Engineering: your first concrete step today → Fine Tuning LLMs explained simply (with diagrams and real code)

📬 Want to receive this type of guide every week? Subscribe for free — real code, zero fluff.

Install Ollama and run your first model

Learning objectives

Why Ollama?

Simplicity

Multi-platform

Built-in REST API

Step-by-step installation

Windows

Where are the models stored?

Ollama API and Python integration

Learning objectives

Ollama is also a server

The bridge to code

Installation and first pipeline

Learning objectives

Hugging Face: the GitHub of models

Ready-to-use tasks

Choosing a specific model

FAQ

Stay up to date