Build Your Own Local Copilot in VS Code with Continue + Ollama (CodeLlama‑7B)

TL;DR — In under 30 minutes you can have a fully‑offline, ChatGPT‑style coding assistant that reads your entire codebase and never leaks a byte to the cloud.




1 · Why bother?

  • Privacy first — No proprietary code leaves your laptop.

  • Low latency — No network round‑trip, so tokens start streaming almost immediately.

  • Free — No API keys, no per‑token billing.

  • Hackable — Swap models, add custom prompts, or fine‑tune later.


2 · What we’ll build

A local stack where:

  1. Ollama runs the codellama:7b model at http://localhost:11434.

  2. Continue in VS Code chats with that model.

  3. Continue indexes all your repositories so you can ask questions like:

    “Where do we create the AWS SQS client?”


3 · Prerequisites

| Item | Minimum |
| --- | --- |
| OS | Windows 11 22H2+ |
| RAM | 16 GB (32 GB recommended) |
| GPU (optional) | NVIDIA card with ≥ 8 GB VRAM & latest CUDA driver |
| WSL 2 | Auto‑installed by Ollama if missing |
| VS Code | 1.88 or newer |
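
If you plan on the GPU path, nvidia-smi is the quickest way to confirm the driver is present and how much VRAM is free (the table above asks for ≥ 8 GB):

nvidia-smi   # prints driver/CUDA version plus per-GPU memory usage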

4 · Install Ollama

  1. Download the Windows installer from https://ollama.ai/download/windows.

  2. Run the installer and accept the defaults.

  3. Verify the CLI:

    ollama --version   # e.g. v0.1.32

**Heads‑up** — The installer registers Ollama as a Windows service and listens on port 11434.
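
With the service up, one quick request confirms it is answering. A minimal check; use curl.exe in PowerShell so you don't hit the built‑in Invoke‑WebRequest alias:

curl.exe http://localhost:11434           # should reply "Ollama is running"
curl.exe http://localhost:11434/api/tags  # JSON list of pulled models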


5 · Pull CodeLlama‑7B

ollama pull codellama:7b    # 4‑bit Q4_0 fits in 8 GB RAM

You’ll see a progress bar run to 100 %; the downloaded model lives in %USERPROFILE%\.ollama\models.
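
Before wiring up the editor, a one‑liner in the terminal makes sure the model actually loads (the prompt is just an example):

ollama list                                   # codellama:7b should appear here
ollama run codellama:7b "Write a Python function that reverses a string."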




6 · Install Continue for VS Code

  1. Open VS Code → Extensions.

  2. Search “Continue” by Continue.dev and click Install.

  3. Reload VS Code; a sunflower icon shows up in the Activity Bar.
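
If you script your machine setup, the same install works from the command line; the extension ID below is taken from the Marketplace listing:

code --install-extension Continue.continue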


7 · Point Continue at Ollama

  1. Click the sunflower icon → gear ⚙ → Settings.

  2. Choose Model Provider: Custom.

  3. Fill in the fields:

     | Field | Value |
     | --- | --- |
     | Provider | ollama |
     | Base URL | http://localhost:11434 |
     | Model | codellama:7b |

  4. Press Save.

(This writes the same keys to your settings.json.)
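
Newer Continue builds store model settings in a standalone config file (%USERPROFILE%\.continue\config.json) instead; if the gear opens a JSON file, the equivalent entry is a sketch along these lines (the title field is an arbitrary label):

{
  "models": [
    {
      "title": "CodeLlama 7B (local)",
      "provider": "ollama",
      "model": "codellama:7b",
      "apiBase": "http://localhost:11434"
    }
  ]
}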



8 · Index your repositories (no training required)

Add one setting to your User or Workspace settings:

"continue.directoryContext": [
  { "path": "D:/code", "depth": 4, "maxFiles": 25000 }
]

Continue walks the tree once, stores embeddings in %APPDATA%\continue\index, and keeps them fresh in the background.
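
To confirm that first pass actually ran, peek at the index folder from PowerShell:

Get-ChildItem "$env:APPDATA\continue\index"   # embedding files appear after the initial walk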


9 · Smoke‑test the setup

  1. Press Ctrl + Shift + L to open chat.

  2. Type:

    @Codebase Which file defines the KafkaConsumer settings?

  3. The reply should include a Context block listing real file paths.

  4. Optional: enable the Continue Console (Settings → Enable Console) to watch retrieved files live.


10 · Tame hallucinations (optional but recommended)

Add this to settings.json:

"continue.systemMessage": "You are a senior full‑stack engineer. Answer concisely and reference only provided context. If unsure, say 'IDK'.",
"continue.modelOptions": {
  "temperature": 0.2,
  "top_p": 0.9,
  "repeat_penalty": 1.1
}
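
You can see the effect of those sampling options outside the editor by sending them straight to Ollama's REST API, which accepts the same option names; a minimal sketch (curl.exe again):

curl.exe http://localhost:11434/api/generate -d '{
  "model": "codellama:7b",
  "prompt": "In one sentence, what does repeat_penalty do?",
  "stream": false,
  "options": { "temperature": 0.2, "top_p": 0.9, "repeat_penalty": 1.1 }
}'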

11 · Daily cheat‑sheet

| Task | How |
| --- | --- |
| Explain selected code | Select → right‑click → Continue: Explain |
| Generate tests | Same menu → Generate Tests |
| Inline autocomplete | Toggle Tab Autocomplete in the Continue footer |
| Force retrieval | Prefix the prompt with @Codebase, @File, or @Folder |
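
All three retrieval prefixes follow the same prefix‑plus‑question pattern; the file path below is just an illustration:

    @File src/config/kafka.ts Where is the consumer group id set?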

12 · Troubleshooting

| Symptom | Fix |
| --- | --- |
| connect ECONNREFUSED 11434 | Tray → Start Ollama Service, or run Start‑Service Ollama |
| Model OOM on load | Pull a quantised tag (single colon), e.g. codellama:7b-code-q4_0 |
| No Context block | Check the directoryContext path and depth, then rebuild the index |
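
If the symptom isn't in the table, two PowerShell checks narrow most problems down:

Test-NetConnection localhost -Port 11434   # is anything listening on Ollama's port?
ollama list                                # is codellama:7b actually on disk?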

13 · Next steps

  • Swap in StarCoder2‑15B or DeepSeek‑Coder‑33B for higher quality (pull sketch below).

  • Drop temperature to 0 and let it draft unit tests you tweak.

  • Graduate to Tabby LoRA if you need inline completions that mimic your exact style.
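
For the first bullet, swapping models is one pull plus a one‑line config change; the tags below match the Ollama library at the time of writing, so double‑check them there:

ollama pull starcoder2:15b        # or: ollama pull deepseek-coder:33b
# then point Continue's Model field at the new tag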
