Logo Dark Logo luc.run Clusters Practice Ecosystem Tips & Tricks AI & LLMs Apps Portfolio About TechWhale@YouTube Homelab
CTRL K
    CTRL K
      • Clusters
        • Local clusters
          • Kubeadm
          • K3s
          • K0s
          • Microk8s
          • Kind
          • K3d
          • Minikube
          • VMs in seconds with Multipass
        • Exoscale SKS
          • Exoscale account
          • Using Terraform
          • Using Pulumi
          • Using the exo CLI
      • Practice
        • Introduction to Docker
          • Containers
          • Images
            • Container Layer
            • Creation
            • Entrypoint Cmd
            • Cache
            • Multistage Build
          • Registry
          • Compose
            • Voting App
            • Elastic
        • Introduction to k8s
          • Concepts
          • Local cluster
          • VotingApp
          • Resources
            • Pod
              • Learn
              • Practice
            • Service
              • Learn
              • Practice
            • Deployment
              • Learn
              • Practice
            • Configuration
              • Learn
              • Practice
            • Storage
              • Learn
              • Practice
            • Job
              • Learn
              • Practice
            • Ingress
              • Learn
              • Practice
          • Packaging
            • Learn
            • Practice
        • Preparing the CKAD
          • Pod
            • Whoami
            • Wordpress
            • Probes
            • Scheduling
          • Service
            • Cluster IP
            • Node Port
            • Load Balancer
          • Deployment
            • Creation
            • Rollout Rollback
            • Hpa
          • Namespace
            • Usage
            • Resource Quota
            • Limit Range
            • Network Policy
          • ConfigMap
          • Secret
          • DaemonSet
          • Job / CronJob
          • Ingress
            • Installation
            • Expose the Voting App
            • Tls
          • RBAC
            • Roles
            • Service Account
          • Storage
            • Longhorn
            • Statefulset
          • Helm
            • Installation
            • Tick
        • Preparing the CKA
          • Certifications
          • Creation
          • Workload
            • Exercises
              • Autoscaling
              • Broken Probe
              • Command
              • Config
              • Daemonset
              • Deploy Svc
              • Deployment Rollout Rollback
              • Deployment Strategy
              • Environment
              • Handling Logs
              • Job Cronjob
              • Multi Containers
              • Pdb
              • Pod Deletion
              • Probes
              • Quota
              • Resources
              • Secret
          • Scheduling
            • Exercises
              • Labels
              • Node Affinity
              • Pods Eviction
              • Priorityclass
              • Taint
          • Network
            • Exercises
              • Endpoints
              • Services
              • Ingress
              • Gateway
          • Storage
            • Exercises
              • Ephemeral
              • Static Pv
              • Storage Class
          • Security
            • Exercises
              • Roles
              • Service Account
              • Rbac
              • Network Policy
          • Configuration
            • Exercises
              • Crds
              • Helm
              • Kustomize
          • Troubleshooting
            • Exercises
              • Logs
              • Events
              • Worker Kubelet
              • Kube DNS
              • Certificates
              • Apiserver
              • Crictl
              • Application
          • Operations
            • Exercises
              • Audit
              • Cert
              • Drain
              • Etcd
              • Metrics
              • Node
              • Upgrade
          • Challenges
            • Challenge 1
            • Challenge 2
            • Challenge 3
            • Challenge 4
            • Challenge 5
            • Challenge 6
            • Challenge 7
        • Udemy courses
          • Docker
            • Plateforme
              • Articles
                • Client Serveur
                • Daemon Configuration
                • Installation
              • Multipass
              • Vagrant
            • Containers
            • Images
              • Container Layer
              • Creation
                • Container
                • Dockerfile
              • Entrypoint Cmd
              • Multistage Build
              • Cache
            • Compose
              • Elastic
              • Votingapp
              • Wordpress
            • Registry
              • Dockerhub
              • Opensource
            • Storage
              • Volumes
              • Sshfs
            • Swarm
              • Creation
              • Service Creation
              • Service Update Rollback
              • Config Secret
              • Tick
              • Articles
                • Raft Logs
                • Restore
                • Tolerance Pannes
            • Network
            • Security
            • Logs
              • Drivers
              • Elastic
              • Sumologic
          • Kubernetes
            • Installation
              • K3d
              • K3s
              • Kind
              • Kubeadm Multipass
              • Kubeadm
              • Kubectl
              • Micro K8s
              • Minikube
              • Multipass
            • Pod
              • Probes
              • Scheduling
              • Whoami
              • Wordpress
            • Services
              • Cluster IP
              • Node Port
              • Load Balancer
            • Deployment
              • Ghost
              • Hpa
              • Rollback
            • Namespace
              • Introduction
              • Resource Quota
              • Limit Range
              • Network Policy
            • Voting App
            • Config Map
              • Utilisation
              • Update
            • Elastic
            • Secret
            • Rbac
              • Service Account
              • Roles
              • Certificat X509
            • Daemon Set
              • Creation
            • Job
              • Backup
            • Ingress
              • Nginx
              • Votingapp
              • Tls
            • Operators
              • Kopf
              • Prometheus
            • Stateful
              • Longhorn
              • Statefulset
            • Helm
              • Client
              • Tick
            • Service Mesh
              • Linkerd
      • Ecosystem
        • Argo CD
        • Flux
        • vCluster
        • cert-manager
        • ExternalDNS
        • Kyverno
        • kube-bench
        • falco
        • Trivy
        • kubescape
        • Kube Score
        • Cosign
        • Kubectl krew plugin
        • Building container images
        • Hubtool
        • NATS
        • The k0s Family
      • Tips & Tricks
      • AI & LLMs
        • Understand the foundations
        • Run locally first
        • Bring it to Kubernetes
        • How LLMs Work
        • LLM Internals
        • Tools Overview
        • Hugging Face
        • Reading a Model Name
        • Running Ollama on your Mac
        • Ollama on Kubernetes
        • Why vLLM?
        • vLLM on Kubernetes
      • Apps
        • VotingApp
        • Webhooks.app
        • Fakely.app
        • genx
      • Portfolio
        • Events
          • DevOpsDays Zurich 2026
          • KubeCon Amsterdam 2026
      • About
      • How LLMs Work
      • LLM Internals
      • Tools Overview
      • Hugging Face
      • Reading a Model Name
      • Running Ollama on your Mac
      • Ollama on Kubernetes
      • Why vLLM?
      • vLLM on Kubernetes

      On this page

      • Understand the foundations
      • Run locally first
      • Bring it to Kubernetes

      AI & LLMs

      This section is the result of my exploration of the AI domain. Some articles started as a question I asked myself, and are the result of many back-and-forth conversations with an AI, rephrasing, refining, and digging into the details. Others are hands-on tutorials built from experiments.

      Understand the foundations

      How LLMs Work
      A plain-language overview: tokens, attention, MLP blocks, sampling, and what it all means for the tools you use
      LLM Internals
      The deep dive: tensor inventory, Q/K/V mechanics, the full forward pass, KV cache, and prefill vs decode
      Tools overview
      A quick map of the AI tooling: Ollama, vLLM, LiteLLM, llm-d, and where each one fits
      Hugging Face
      Where open-source models live: how to find a model, read a model card, and download what you need
      Reading a Model Name
      What qwen2.5:7b-instruct-q4_K_M actually means: family, parameter count, variant, and quantization explained

      Run locally first

      The fastest way to get started with LLMs is to run them on your own machine. No API keys, no subscription, full control over the model.

      Ollama on macOS
      Run open-source LLMs locally using Ollama and add a web UI with Open WebUI

      Bring it to Kubernetes

      Ollama on Kubernetes
      Deploy Ollama with persistent model storage and a web UI, using an initContainer to pull models automatically
      Why vLLM?
      PagedAttention, continuous batching, and why a GPU is essential for production inference
      vLLM on Kubernetes
      Install the NVIDIA GPU Operator and serve your first model via the OpenAI-compatible API