Build Your Own Local Copilot in VS Code with Continue + Ollama (CodeLlama‑7B)
TL;DR — In under 30 minutes you can have a fully‑offline, ChatGPT‑style coding assistant that reads your entire codebase and never leaks a byte to the cloud.

1 · Why bother?
Privacy first — No proprietary code leaves your laptop.
Low latency — Replies stream in milliseconds, not seconds.
Free — No API keys, no per‑token billing.
Hackable — Swap models, add custom prompts, or fine‑tune later.
2 · What we’ll build
A local stack where:
Ollama runs the codellama:7b model at http://localhost:11434.
Continue in VS Code chats with that model.
Continue indexes all your repositories so you can ask questions like:
“Where do we create the AWS SQS client?”
3 · Prerequisites
| Item | Minimum |
|---|---|
| OS | Windows 11 22H2+ |
| RAM | 16 GB (32 GB recommended) |
| GPU (optional) | NVIDIA card with ≥ 8 GB VRAM & latest CUDA driver |
| WSL 2 | Auto‑installed by Ollama if missing |
| VS Code | 1.88 or newer |
4 · Install Ollama
Download the Windows installer from https://ollama.ai/download/windows.
Run the MSI and accept defaults.
Verify the CLI:
ollama --version # e.g. v0.1.32
**Heads‑up** — The installer registers Ollama as a Windows service and listens on port 11434.
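Before moving on, you can confirm the service is actually listening. A minimal sketch (the host and port below are Ollama's defaults):

```python
import socket

def ollama_is_up(host: str = "localhost", port: int = 11434,
                 timeout: float = 1.0) -> bool:
    """Return True if something is accepting connections on the Ollama port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused or timed out: the service isn't reachable.
        return False

print(ollama_is_up())  # True once the Windows service is running
```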
5 · Pull CodeLlama‑7B
ollama pull codellama:7b   # 4‑bit Q4_0 fits in 8 GB RAM

You’ll see a 100 % progress bar; the model lives in %USERPROFILE%\.ollama\models.
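To double‑check the pull landed, you can ask Ollama's REST API directly — the `/api/tags` endpoint lists every locally installed model. A sketch, assuming the default port:

```python
import json
import urllib.request

def installed_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Return the names of all models the local Ollama daemon knows about."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]

def has_model(names: list[str], wanted: str = "codellama:7b") -> bool:
    """True if the wanted tag is among the pulled models."""
    return wanted in names

# has_model(installed_models())  # should be True after the pull completes
```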
6 · Install Continue for VS Code
Open VS Code → Extensions.
Search “Continue” by Continue.dev and click Install.
Reload VS Code; a sunflower icon shows up in the Activity Bar.
7 · Point Continue at Ollama
Click the sunflower icon → gear ⚙ → Settings.
Choose Model Provider: Custom.
Fill the fields:
| Field | Value |
|---|---|
| Provider | ollama |
| Base URL | http://localhost:11434 |
| Model | codellama:7b |

Press Save.
(This writes the same keys to your settings.json.)
8 · Index your repositories (no training required)
Add one line to User or Workspace settings:
"continue.directoryContext": [
{ "path": "D:/code", "depth": 4, "maxFiles": 25000 }
]

Continue walks the tree once, stores embeddings in %APPDATA%\continue\index, and keeps them fresh in the background.
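If you're unsure whether the `maxFiles: 25000` cap will cover your tree, a quick helper (hypothetical, not part of Continue) can count files down to the configured depth before you kick off an index build:

```python
import os

def count_files(root: str, max_depth: int = 4) -> int:
    """Count files at most max_depth directory levels below root."""
    root = os.path.abspath(root)
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        depth = dirpath[len(root):].count(os.sep)
        if depth >= max_depth:
            dirnames[:] = []  # prune: don't descend past max_depth
        total += len(filenames)
    return total

# count_files("D:/code", max_depth=4)  # compare against your maxFiles cap
```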
9 · Smoke‑test the setup
Press Ctrl + Shift + L to open chat.
Type:
@Codebase Which file defines the KafkaConsumer settings?

The reply should include a Context block listing real file paths.
Optional: enable the Continue Console (Settings → Enable Console) to watch retrieved files live.
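If the chat returns nothing, it helps to bypass Continue and hit Ollama's `/api/generate` endpoint directly, which isolates whether the model or the extension is the problem. A minimal non‑streaming sketch (endpoint and request fields are from Ollama's REST API):

```python
import json
import urllib.request

def build_payload(prompt: str, model: str = "codellama:7b") -> dict:
    """Request body for a one-shot, non-streaming completion."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str, base_url: str = "http://localhost:11434") -> str:
    """POST to /api/generate and return the model's text."""
    data = json.dumps(build_payload(prompt)).encode()
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# generate("Write a one-line docstring for a Kafka consumer factory.")
```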
10 · Tame hallucinations (optional but recommended)
Add this to settings.json:
"continue.systemMessage": "You are a senior full‑stack engineer. Answer concisely and reference only provided context. If unsure, say 'IDK'.",
"continue.modelOptions": {
"temperature": 0.2,
"top_p": 0.9,
"repeat_penalty": 1.1
}

11 · Daily cheat‑sheet
| Task | How |
|---|---|
| Explain selected code | Select → right‑click → Continue: Explain |
| Generate tests | Same menu → Generate Tests |
| Inline autocomplete | Toggle Tab Autocomplete in Continue footer |
| Force retrieval | Prefix prompt with @Codebase, @File, or @Folder |
12 · Troubleshooting
| Symptom | Fix |
|---|---|
| connect ECONNREFUSED 11434 | Tray → Start Ollama Service or Start‑Service Ollama |
| Model OOM on pull | Use a quantised tag, e.g. codellama:7b-code-q4_0 |
| No Context block | Check directoryContext path / depth; rebuild index |
13 · Next steps
Swap in StarCoder2‑15B or DeepSeek‑Coder‑33B for higher quality.
Drop temperature to 0 and let it draft unit tests you tweak.
Graduate to Tabby LoRA if you need inline completions that mimic your exact style.
