Dive into ML Infrastructure with Kubernetes: Your First Concrete Step Today

ML Infrastructure Kubernetes: The Essentials in One Article — Real Code, Diagrams and Concrete Steps, Excerpts from a 41-Lesson Course.

The best way to learn ML Infrastructure Kubernetes is by doing. This article gives you a head start with practical excerpts from a 41-lesson course — enough to get your first result today.

tl;dr
  • Install the Kubernetes environment
  • Discover Kubernetes
  • Essential Kubernetes Objects
  • YAML Files and Configuration
  • Deploy an ML API with Flask
~$ cat ./parcours.md # ML Infrastructure Kubernetes — 11 chapters
01 Install the Kubernetes environment
  → Download Docker Desktop and kubectl
  → Install Minikube and create a cluster
  + 1 more lesson
02 Discover Kubernetes
  → What is Kubernetes and why use it?
  → Kubernetes Architecture (Master and Workers)
  + 1 more lesson
03 Essential Kubernetes Objects
  → Pods, ReplicaSets and Deployments
  → Services and Networking
  + 1 more lesson
04 YAML Files and Configuration
  → Anatomy of a Kubernetes YAML file
  → ConfigMaps and Secrets
  + 1 more lesson
05 Deploy an ML API with Flask
  → Flask Recap and creating an ML API
  → Dockerfile and Kubernetes Deployment
  + 1 more lesson
06 Deploy an ML API with FastAPI
  → FastAPI Recap and creating a prediction API
  → FastAPI Deployment and HPA
  + 1 more lesson
07 Storage and Persistence
  → What are Kubernetes Volumes?
  → PersistentVolumes and PersistentVolumeClaims
  + 1 more lesson
08 Helm and Package Management
  → Why Helm? The Kubernetes package manager
  → Install Charts and customize
  + 2 more lessons
🏁 Final project (+ 3 chapters along the way)
  → You leave with a concrete and demonstrable project

Final Project – Complete Step-by-Step Guide

Guide • 5 parts • Backend • Frontend • Helm • Monitoring • CI/CD

NOTE (Overview): This guide walks you through the final project step by step. Follow each part in order to build a complete ML platform on Kubernetes.

Part 1: Backend API (FastAPI + ML Model)

1.1 Initialize the project

bash
mkdir -p ml-prediction-platform/{backend/{app,train,tests},frontend,helm,k8s/{security,monitoring},.github/workflows,docs}
cd ml-prediction-platform
git init
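
The brace expansion in the mkdir command above is a bash feature, so it may not work as written in other shells (PowerShell, for example). As a minimal cross-platform sketch, the same tree can be created from Python. The helper name `create_layout` is hypothetical and not part of the course code.

```python
from pathlib import Path

# Directories produced by the brace expansion in the mkdir command.
PROJECT_DIRS = [
    "backend/app",
    "backend/train",
    "backend/tests",
    "frontend",
    "helm",
    "k8s/security",
    "k8s/monitoring",
    ".github/workflows",
    "docs",
]

def create_layout(root: str = "ml-prediction-platform") -> list[Path]:
    """Create the project tree under `root` and return the created paths."""
    created = []
    for rel in PROJECT_DIRS:
        path = Path(root) / rel
        # parents=True builds intermediate folders; exist_ok makes reruns safe.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `create_layout()` from the parent folder gives the same result as the mkdir line; `git init` still has to be run separately.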

1.2 Train the model

python
# backend/train/train_model.py
import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Accuracy: {accuracy:.4f}")

model_info = {
    "model": model,
    "feature_names": list(iris.feature_names),
    "target_names": list(iris.target_names),
    "accuracy": accuracy,
    "version": "1.0.0"
}
joblib.dump(model_info, "model.pkl")

bash
cd backend/train
pip install scikit-learn joblib numpy
python train_model.py

1.3 Create the Pydantic schemas

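python
# backend/app/schemas.py
from typing import List

from pydantic import BaseModel, ConfigDict, Field

class PredictionRequest(BaseModel):
    # Exactly four features: sepal length/width and petal length/width.
    features: List[float] = Field(..., min_length=4, max_length=4)
    # Pydantic v2 style; replaces the deprecated inner `class Config`.
    model_config = ConfigDict(
        json_schema_extra={"example": {"features": [5.1, 3.5, 1.4, 0.2]}}
    )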

class PredictionResponse(BaseModel):
    prediction: str
    prediction_id: int
    confidence: float
    probabilities: dict
    model_version: str

class HealthResponse(BaseModel):
    status: str
    model_loaded: bool
    version: str

1.4 Create the model loading module

python
# backend/app/model.py
import os
import joblib
import numpy as np
import logging

logger = logging.getLogger(__name__)

class MLModel:
    def __init__(self):
        self.model = None
        self.feature_names = None
        self.target_names = None
        self.version = None
        self.loaded = False

    def load(self, path: str = None):
        path = path or os.getenv("MODEL_PATH", "/models/model.pkl")
        info = joblib.load(path)
        self.model = info["model"]
        self.feature_names = info["feature_names"]
        self.target_names = info["target_names"]
        self.version = info["version"]
        self.loaded = True
        logger.info(f"Model v{self.version} loaded")

    def predict(self, features: list) -> dict:
        X = np.array(features).reshape(1, -1)
        pred = self.model.predict(X)[0]
        proba = self.model.predict_proba(X)[0]
        return {
            "prediction": self.target_names[pred],
            "prediction_id": int(pred),
            "confidence": float(max(proba)),
            "probabilities": {
                n: float(p) for n, p in zip(self.target_names, proba)
            }
        }

ml_model = MLModel()
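
To see the shape of what `predict` returns without training a model, the response assembly can be reproduced with plain Python lists. This is a sketch only: `build_response` is a hypothetical helper, and the probabilities are made-up inputs rather than real model output.

```python
def build_response(target_names: list[str], proba: list[float]) -> dict:
    """Mirror the dict built by MLModel.predict from class probabilities."""
    # Index of the most probable class. For the iris classifier this is
    # assumed to match model.predict(X)[0].
    pred = max(range(len(proba)), key=lambda i: proba[i])
    return {
        "prediction": target_names[pred],
        "prediction_id": pred,
        "confidence": max(proba),
        "probabilities": dict(zip(target_names, proba)),
    }

# Illustrative values only, not real model output.
example = build_response(
    ["setosa", "versicolor", "virginica"], [0.9, 0.08, 0.02]
)
```

The `confidence` field is simply the largest class probability, so it says how dominant the winning class is, not how accurate the model is overall.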

1.5 Create the FastAPI application

python
# backend/app/main.py
import os, time, logging
from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from prometheus_client import Counter, Histogram, generate_latest
from starlette.responses import Response
from .model import ml_model
from .schemas import PredictionRequest, PredictionResponse, HealthResponse

logging.basicConfig(level=os.getenv("LOG_LEVEL", "INFO"))
app = FastAPI(title="ML Prediction API", version="1.0.0")
app.add_middleware(CORSMiddleware, allow_origins=["*"], allow_methods=["*"], allow_headers=["*"])

PREDICTIONS = Counter("predictions_total", "Total predictions", ["status"])
LATENCY = Histogram("prediction_latency_seconds", "Prediction latency")

@app.on_event("startup")
async def startup():
    ml_model.load()

@app.get("/health", response_model=HealthResponse)
async def health():
    return HealthResponse(
        status="healthy" if ml_model.loaded else "unhealthy",
        model_loaded=ml_model.loaded,
        version=ml_model.version or "unknown"
    )

@app.post("/predict", response_model=PredictionResponse)
async def predict(req: PredictionRequest):
    start = time.time()
    try:
        result = ml_model.predict(req.features)
        PREDICTIONS.labels(status="success").inc()
        LATENCY.observe(time.time() - start)
        return PredictionResponse(model_version=ml_model.version, **result)
    except Exception as e:
        PREDICTIONS.labels(status="error").inc()
        raise HTTPException(status_code=500, detail=str(e))

@app.get("/metrics")
async def metrics():
    return Response(content=generate_latest(), media_type="text/plain")
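
Once the API is running, a call to `/predict` can be sketched with the standard library alone. The base URL `http://localhost:8000` is an assumption (it depends on how you start the server), and `build_predict_request` is a hypothetical helper. The request is only built here, not sent.

```python
import json
import urllib.request

def build_predict_request(
    features: list[float],
    base_url: str = "http://localhost:8000",  # assumed local address
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /predict endpoint."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_predict_request([5.1, 3.5, 1.4, 0.2])

# To actually call a running server:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp))
```

Sending fewer or more than four features should be rejected by the `PredictionRequest` schema before the model is ever called.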

ConfigMaps and Secrets

NOTE (Objective): Learn how to externalize configuration and sensitive data for your ML applications using Kubernetes ConfigMaps and Secrets.

Learning objectives

TIP (At the end of this module): You will be able to master these essential skills.

1. Why externalize configuration?

TIP (Analogy): Imagine a chef who keeps recipes (code) separate from ingredients (configuration). Depending on the meal (environment), different ingredients are used with the same recipe. This is exactly the role of ConfigMaps.

In ML, your API needs parameters that vary by environment:

  • Development
  • Staging
  • Production

With ConfigMaps, you change configuration without rebuilding the Docker image.

2. ConfigMaps: storing configuration

2.1 Create a ConfigMap from literals

bash
# Create a ConfigMap with key-value pairs
kubectl create configmap ml-config \
  --from-literal=MODEL_NAME=iris_classifier \
  --from-literal=MODEL_VERSION=v2 \
  --from-literal=LOG_LEVEL=INFO \
  --from-literal=MAX_BATCH_SIZE=32

2.2 Create a ConfigMap from a file

First create a configuration file:

properties
# config.properties
model.name=iris_classifier
model.version=v2
model.threshold=0.85
api.port=5000
api.workers=4
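log.level=INFO

bash
# Create the ConfigMap from the file
# (the whole file is stored under a single key named config.properties)
kubectl create configmap ml-config --from-file=config.properties

# Or create it from an entire directory (one key per file).
# If ml-config already exists, this command fails, so delete it first:
# kubectl delete configmap ml-config
kubectl create configmap ml-config --from-file=./config/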
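
Note that `--from-file` stores the whole file under one key (`config.properties`), so the application still has to parse its content. Below is a minimal sketch of such a parser, assuming only simple `key=value` lines and `#` comments. The function name `parse_properties` is hypothetical.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"Not a key=value line: {raw!r}")
        settings[key.strip()] = value.strip()
    return settings

sample = """\
# config.properties
model.name=iris_classifier
model.threshold=0.85
api.port=5000
"""
config = parse_properties(sample)
threshold = float(config["model.threshold"])  # values arrive as strings
```

If you would rather have one ConfigMap key per line, `kubectl create configmap` also offers `--from-env-file`.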

2.3 Declarative ConfigMap in YAML

yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ml-config
  labels:
    app: ml-api
data:
  MODEL_NAME: "iris_classifier"
  MODEL_VERSION: "v2"
  LOG_LEVEL: "INFO"
  MAX_BATCH_SIZE: "32"
  FEATURE_COLUMNS: "sepal_length,sepal_width,petal_length,petal_width"
  config.yaml: |
    model:
      name: iris_classifier
      version: v2
      threshold: 0.85
    api:
      port: 5000
      workers: 4

NOTE (Multi-line key): The | symbol (a YAML literal block scalar) lets you embed an entire file as the value of a key. This is very useful for complete configuration files.
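
Every value under `data` must be a string, which is why `MAX_BATCH_SIZE` is quoted as "32" in the manifest above. As a sketch, the same kind of manifest can be generated from Python and written as JSON, which `kubectl apply -f` also accepts. `configmap_manifest` is a hypothetical helper, not a Kubernetes API.

```python
import json

def configmap_manifest(name: str, data: dict) -> dict:
    """Build a ConfigMap manifest, converting every value to a string."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "labels": {"app": "ml-api"}},
        # ConfigMap data values must be strings, so numbers are converted.
        "data": {key: str(value) for key, value in data.items()},
    }

manifest = configmap_manifest(
    "ml-config",
    {"MODEL_NAME": "iris_classifier", "LOG_LEVEL": "INFO", "MAX_BATCH_SIZE": 32},
)
manifest_json = json.dumps(manifest, indent=2)
```

Generating manifests this way is mainly useful when the values come from another source, such as a training script that records the model version.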

3. Using ConfigMaps in Pods

3.1 As environment variables

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ml-api
  template:
    metadata:
      labels:
        app: ml-api
    spec:
      containers:
        - name: ml-api
          image: monregistry/ml-api:v1
          ports:
            - containerPort: 5000
          envFrom:
            - configMapRef:
                name: ml-config
          env:
            - name: SPECIFIC_KEY
              valueFrom:
                configMapKeyRef:
                  name: ml-config
                  key: MODEL_NAME

Method | Usage | Description
envFrom | All keys | Injects all keys from the ConfigMap as environment variables
valueFrom | Specific key | Injects a single key from the ConfigMap into a named variable
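
Inside the container, the injected keys are ordinary environment variables. A minimal sketch of reading them with defaults and type conversion follows. The `Settings` class, the `load_settings` function and the default values are illustrative assumptions, not part of the course code.

```python
import os
from dataclasses import dataclass
from typing import Mapping

@dataclass(frozen=True)
class Settings:
    model_name: str
    model_version: str
    log_level: str
    max_batch_size: int

def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read ConfigMap-provided variables, falling back to defaults."""
    return Settings(
        model_name=env.get("MODEL_NAME", "iris_classifier"),
        model_version=env.get("MODEL_VERSION", "v1"),
        log_level=env.get("LOG_LEVEL", "INFO"),
        # Environment variables are always strings, so convert explicitly.
        max_batch_size=int(env.get("MAX_BATCH_SIZE", "32")),
    )

# Passing a plain dict makes the function easy to test outside a cluster.
settings = load_settings({"MODEL_VERSION": "v2", "MAX_BATCH_SIZE": "64"})
```

Environment variables are read when the container starts, so a Pod has to be restarted to pick up a changed ConfigMap value injected this way.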

3.2 As a mounted volume

yaml
apiVersion: v1
kind: Pod
metadata:
  name: ml-pod-config
spec:
  containers:
    - name: ml-api
      image: monregistry/ml-api:v1
      volumeMounts:
        - name: config-volume
          mountPath: /app/config
          readOnly: true
  volumes:
    - name: config-volume
      configMap:
        name: ml-config
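
With a volume mount, each key of the ConfigMap appears as a file under `mountPath` (here `/app/config`), so the `config.yaml` key becomes `/app/config/config.yaml`. Below is a sketch of reading those files back into a dictionary. `read_mounted_config` is a hypothetical helper.

```python
from pathlib import Path

def read_mounted_config(mount_path: str = "/app/config") -> dict[str, str]:
    """Return {key: content} for every file in a mounted ConfigMap directory."""
    config = {}
    for entry in sorted(Path(mount_path).iterdir()):
        # Kubernetes adds hidden entries such as ..data; skip them.
        if entry.name.startswith(".") or not entry.is_file():
            continue
        config[entry.name] = entry.read_text(encoding="utf-8")
    return config
```

Because the function takes the directory as a parameter, it can be pointed at a local folder during development and at `/app/config` inside the Pod.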

Anatomy of a Kubernetes YAML file

NOTE (Objective): Understand YAML syntax and the structure of Kubernetes manifests to describe your ML infrastructure resources declaratively.

Learning objectives

TIP (At the end of this module): You will be able to master these essential skills.

1. Introduction to YAML

YAML stands for “YAML Ain’t Markup Language”. It is a human-readable data serialization format widely used for configuration.

TIP (Analogy): Think of YAML as a recipe. Each recipe describes ingredients (key-value pairs), steps (lists) and sections (nested maps). Kubernetes reads this recipe to “cook” your infrastructure.

1.1 Basic YAML rules

Rule | Description | Example
Indentation | Only spaces (never tabs), usually 2 spaces | key: value (indented by 2 spaces)
Key-value | Separated by : followed by a space | name: my-pod
Lists | Prefixed by a dash - | - item1
Comments | Start with # | # This is a comment
Strings | Quotes optional unless special characters | name: "my:pod"
Booleans | true / false | enabled: true

1.2 Key-value pairs

The simplest structure in YAML — a key associated with a value:

yaml
name: flask-ml-api
version: "1.0"
replicas: 3
debug: false

1.3 Lists (sequences)

Lists use a dash - for each item:

yaml
frameworks:
  - scikit-learn
  - tensorflow
  - pytorch
  - fastapi

1.4 Nested maps (dictionaries)

Maps let you create hierarchical structures:

yaml
server:
  host: 0.0.0.0
  port: 5000
  options:
    debug: true
    workers: 4

WARNING (Caution): Indentation is critical in YAML. A single misplaced space can break the entire file. Use spaces only (2 per level is the convention) and never tabs.
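
A YAML parser turns nested maps into nested dictionaries and sequences into lists. The structure below is written by hand to show what the `server` example corresponds to in Python. It is not produced by a parser, and the JSON form is built only for comparison.

```python
import json

# Hand-written Python equivalent of the nested `server` map above.
config = {
    "server": {
        "host": "0.0.0.0",
        "port": 5000,
        "options": {"debug": True, "workers": 4},
    }
}

# Each level of YAML indentation corresponds to one level of nesting.
workers = config["server"]["options"]["workers"]
as_json = json.dumps(config, indent=2)
```

Reading an indentation error as "this key ended up in the wrong dictionary" often makes a broken manifest easier to debug.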

1.5 YAML data types

Strings

yaml
simple: hello
quotes: "world"
multi: |
  line 1
  line 2

Numbers

yaml
integer: 42
float: 3.14
scientific: 1e+6
octal: 0o14

Special

yaml
enabled: true
disabled: false
empty: null
date: 2026-03-05

This article covers the most useful excerpts — the complete ML Infrastructure Kubernetes course (12 chapters, 41 lessons, corrected exercises and final project) takes you all the way.


FAQ

How long does it take to learn ML Infrastructure Kubernetes?
With a structured progression (12 chapters, 41 short and practical lessons), you reach an operational level in a few weeks at 30–60 minutes per day. The key is to practice each concept immediately.
Are there any prerequisites?
It helps to be comfortable with the fundamentals of the domain: this content goes in depth with real-world cases.
Where to start concretely?
Reproduce the commands in this article, then follow the full ML Infrastructure Kubernetes course: it sequences the 41 lessons in order, with exercises and a final project.
