companionintelligence · mrhegemon · Jun 12, 2026 · Jun 11, 2026
diff --git a/.dockerignore b/.dockerignore
@@ -2,6 +2,7 @@ node_modules
 dist
 .git
 .github
+docs
 tests
 playwright-report
 test-results

diff --git a/Dockerfile b/Dockerfile
@@ -19,6 +19,7 @@ ENV NODE_ENV=production PORT=3000
 COPY --from=build /app/node_modules ./node_modules
 COPY --from=build /app/dist ./dist
 COPY --from=build /app/index.html ./index.html
+COPY --from=build /app/assets ./assets
 COPY --from=build /app/package.json ./package.json
 EXPOSE 3000
 CMD ["node", "dist/server.js"]
diff --git a/EXAMPLE.md b/EXAMPLE.md
@@ -1,100 +1,92 @@
-# Example: Running the Local LLM Benchmark
+# Example: running Local-Bench
 
-This example demonstrates how to use the Local-Bench tool.
+A quick walkthrough of benchmarking local models and viewing the results.
 
 ## Prerequisites
 
-1. Install Ollama: https://ollama.ai/
-2. Download some models:
+1. Install [Ollama](https://ollama.ai/) and start it: `ollama serve`
+2. Pull a few models you can run on your hardware:
+
 ```bash
-ollama pull llama2
-ollama pull mistral
-ollama pull phi
+ollama pull gemma3:4b
+ollama pull qwen3:8b
+ollama pull llama3.1:8b
 ```
 
-## Step 1: Run the Benchmark
+## Step 1: Run the benchmark
 
 ```bash
-# Benchmark default models
-npm run benchmark
+npm install
+npm run build
+
+# Benchmark specific installed models...
+node dist/benchmark.js gemma3:4b qwen3:8b llama3.1:8b
 
-# Or benchmark specific models
-node benchmark.js llama2 mistral phi codellama
+# ...or the whole curated catalog
+npm run benchmark
 ```
 
 Expected output:
+
 ```
 === Local LLM Benchmark Tool ===
 Ollama API URL: http://localhost:11434
 
-Models to benchmark: llama2, mistral, phi, codellama
+Models to benchmark: gemma3:4b, qwen3:8b, llama3.1:8b
 ✓ Connected to Ollama API
 
-Benchmarking llama2...
-  ✓ Completed in 2.65s
-  ✓ Generated 120 tokens
-  ✓ Speed: 45.23 tokens/second
+Benchmarking gemma3:4b...
+  ✓ Completed in 5.25s
+  ✓ Generated 412 tokens
+  ✓ Speed: 78.42 tokens/second
 
-Benchmarking mistral...
-  ✓ Completed in 2.40s
-  ✓ Generated 125 tokens
-  ✓ Speed: 52.18 tokens/second
+Benchmarking qwen3:8b...
+  ✓ Completed in 9.52s
+  ✓ Generated 498 tokens
+  ✓ Speed: 52.31 tokens/second
 
 Results saved to benchmark_results.csv
 
 === Benchmark Summary ===
 
 Ranking (by tokens/second):
-  1. mistral: 52.18 tokens/s
-  2. llama2: 45.23 tokens/s
+  1. gemma3:4b: 78.42 tokens/s
+  2. qwen3:8b: 52.31 tokens/s
+  3. llama3.1:8b: 49.87 tokens/s
 
-Done! Open index.html in a browser to view the results.
+Done! Open the dashboard to view the results.
 ```
 
-## Step 2: View Results in Web Interface
+## Step 2: View results in the dashboard
 
 ```bash
-# Start the web server
 npm start
-
-# Open in browser
-# Navigate to http://localhost:3000
+# open http://localhost:3000
 ```
 
-You should see:
-- Statistics cards showing total models, successful tests, average speed, and fastest model
-- A bar chart comparing model performance
-- A detailed table with all benchmark results
+You'll see:
 
-## Step 3: Re-run and Refresh
+- Summary cards (catalog size, installed models, top intelligence, fastest measured)
+- The **Model intelligence** catalog ranked by the Artificial Analysis Intelligence Index
+- System specifications captured during the run
+- A throughput bar chart and a detailed results table (with each model's `IQ` score)
 
-After running new benchmarks:
-1. Click the "Refresh Results" button in the web interface
-2. The page will reload with updated data from the CSV file
+## Step 3: Re-run and refresh
 
-## Custom Configuration
+Run more benchmarks (CLI or the **Run benchmark** button in the UI), then click **Refresh** in the dashboard to reload the latest data.
+
+## Custom configuration
 
-### Custom Ollama URL
 ```bash
+# Point at a non-default Ollama
 OLLAMA_API_URL=http://192.168.1.100:11434 npm run benchmark
-```
 
-### Custom Port for Web Server
-```bash
+# Custom dashboard port
 PORT=8080 npm start
 ```
 
 ## Troubleshooting
 
-### Error: Cannot connect to Ollama API
-- Make sure Ollama is running: `ollama serve`
-- Check the API endpoint: `curl http://localhost:11434/api/tags`
-
-### Error: Model not found
-- List available models: `ollama list`
-- Pull the missing model: `ollama pull <model-name>`
-
-### Benchmark times out
-- The default timeout is 2 minutes
-- Some larger models may take longer
-- Consider testing with smaller prompts or fewer models
+- **Cannot connect to Ollama API** — make sure `ollama serve` is running; check `curl http://localhost:11434/api/tags`.
+- **Model not found** — `ollama list` to see what's installed, then `ollama pull <model-name>`.
+- **Benchmark times out** — the per-model timeout is 2 minutes; try smaller models or fewer at once.
diff --git a/LLM_TESTS.md b/LLM_TESTS.md
@@ -1,44 +1,54 @@
-| Name | Size | Context | Input |
-| --- | --- | --- | --- |
-| gemma3:270m | 292MB | 32K | Text |
-| qwen3:0.6b | 523MB | 40K | Text |
-| gemma3:1b | 815MB | 32K | Text |
-| deepseek-r1:1.5b | 1.1GB | 128K | Text |
-| llama3.2:1b | 1.3GB | 128K | Text |
-| qwen3:1.7b | 1.4GB | 40K | Text |
-| qwen3-vl:2b | 1.9GB | 256K | Text, Image |
-| llama3.2:3b latest | 2.0GB | 128K | Text |
-| qwen3:4b | 2.5GB | 256K | Text |
-| gemma3:4b latest | 3.3GB | 128K | Text, Image |
-| qwen3-vl:4b | 3.3GB | 256K | Text, Image |
-| deepseek-r1:7b | 4.7GB | 128K | Text |
-| llama3.1:8b latest | 4.9GB | 128K | Text |
-| deepseek-r1:8b latest | 5.2GB | 128K | Text |
-| qwen3:8b latest | 5.2GB | 40K | Text |
-| qwen3-vl:8b latest | 6.1GB | 256K | Text, Image |
-| gemma3:12b | 8.1GB | 128K | Text, Image |
-| deepseek-r1:14b | 9.0GB | 128K | Text |
-| qwen3:14b | 9.3GB | 40K | Text |
-| gpt-oss:20b | 14GB | 128K | Text |
-| gemma3:27b | 17GB | 128K | Text, Image |
-| qwen3-coder:latest | 19GB | 256K | Text |
-| qwen3-coder:30b latest | 19GB | 256K | Text |
-| qwen3:30b | 19GB | 256K | Text |
-| deepseek-r1:32b | 20GB | 128K | Text |
-| qwen3:32b | 20GB | 40K | Text |
-| qwen3-vl:30b | 20GB | 256K | Text, Image |
-| qwen3-vl:32b | 21GB | 256K | |
-| deepseek-r1:70b | 43GB | 128K | Text |
-| llama3.1:70b | 43GB | 128K | Text |
-| gpt-oss:120b | 65GB | 128K | Text |
-| llama4:16x17b latest | 67GB | 10M | Text, Image |
-| GLM-4.6:TQ1_0 | 84GB | 198K | Text |
-| qwen3:235b | 142GB | 256K | Text |
-| qwen3-vl:235b | 143GB | 256K | |
-| GLM-4.6:Q4_K_M | 216GB | 198K | Text |
-| llama3.1:405b | 243GB | 128K | Text |
-| llama4:128x17b | 245GB | 1M | Text, Image |
-| qwen3-coder:480b | 290GB | 256K | Text |
-| deepseek-v3.1:671b latest | 404GB | 160K | Text |
-| deepseek-r1:671b | 404GB | 160K | Text |
-| minmax m2 | 968GB | 200K | Text |
+# Curated Companion model catalog
+
+The models Local-Bench benchmarks and ranks by default. The **Intelligence** column
+is the [Artificial Analysis Intelligence Index](https://artificialanalysis.ai/)
+(higher = more capable; snapshot `2026-06`). `—` means the model is not individually
+rated by the index (vision-only or very small models).
+
+This table is generated from `SUPPORTED_OLLAMA_MODELS` in [`src/benchmark.ts`](src/benchmark.ts) —
+edit the `intelligenceIndex` values there to override the scores.
+
+| Model | Size | Context | Inputs | Intelligence |
+| --- | --- | --- | --- | --- |
+| gemma3:270m | 292MB | 32K | Text | — |
+| qwen3:0.6b | 523MB | 40K | Text | — |
+| gemma3:1b | 815MB | 32K | Text | — |
+| deepseek-r1:1.5b | 1.1GB | 128K | Text | — |
+| llama3.2:1b | 1.3GB | 128K | Text | — |
+| qwen3:1.7b | 1.4GB | 40K | Text | 3 |
+| qwen3-vl:2b | 1.9GB | 256K | Text, Image | — |
+| llama3.2:3b | 2.0GB | 128K | Text | 4 |
+| qwen3:4b | 2.5GB | 256K | Text | 6 |
+| gemma3:4b | 3.3GB | 128K | Text, Image | 4 |
+| qwen3-vl:4b | 3.3GB | 256K | Text, Image | — |
+| deepseek-r1:7b | 4.7GB | 128K | Text | 8 |
+| llama3.1:8b | 4.9GB | 128K | Text | 8 |
+| deepseek-r1:8b | 5.2GB | 128K | Text | 9 |
+| qwen3:8b | 5.2GB | 40K | Text | 9 |
+| qwen3-vl:8b | 6.1GB | 256K | Text, Image | — |
+| gemma3:12b | 8.1GB | 128K | Text, Image | 7 |
+| deepseek-r1:14b | 9.0GB | 128K | Text | 13 |
+| qwen3:14b | 9.3GB | 40K | Text | 11 |
+| gpt-oss:20b | 14GB | 128K | Text | 24 |
+| gemma3:27b | 17GB | 128K | Text, Image | 10 |
+| qwen3-coder:latest | 19GB | 256K | Text | 20 |
+| qwen3-coder:30b | 19GB | 256K | Text | 20 |
+| qwen3:30b | 19GB | 256K | Text | 15 |
+| deepseek-r1:32b | 20GB | 128K | Text | 18 |
+| qwen3:32b | 20GB | 40K | Text | 15 |
+| qwen3-vl:30b | 20GB | 256K | Text, Image | — |
+| qwen3-vl:32b | 21GB | 256K | Text, Image | — |
+| deepseek-r1:70b | 43GB | 128K | Text | 20 |
+| llama3.1:70b | 43GB | 128K | Text | 16 |
+| gpt-oss:120b | 65GB | 128K | Text | 33 |
+| llama4:16x17b | 67GB | 10M | Text, Image | 13 |
+| GLM-4.6:TQ1_0 | 84GB | 198K | Text | 30 |
+| qwen3:235b | 142GB | 256K | Text | 45 |
+| qwen3-vl:235b | 143GB | 256K | Text, Image | — |
+| GLM-4.6:Q4_K_M | 216GB | 198K | Text | 30 |
+| llama3.1:405b | 243GB | 128K | Text | 17 |
+| llama4:128x17b | 245GB | 1M | Text, Image | 18 |
+| qwen3-coder:480b | 290GB | 256K | Text | 24 |
+| deepseek-v3.1:671b | 404GB | 160K | Text | 28 |
+| deepseek-r1:671b | 404GB | 160K | Text | 27 |
+| minmax m2 | 968GB | 200K | Text | 44 |
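
Review note: the new `LLM_TESTS.md` says the table is generated from `SUPPORTED_OLLAMA_MODELS` in `src/benchmark.ts`, but the generator itself is not part of this diff. Below is a minimal sketch of how such a table could be rendered. The `CatalogModel` interface, its field names other than `intelligenceIndex`, the `SAMPLE_MODELS` array, and the `renderCatalogTable` helper are all hypothetical and may not match the real code; the three sample rows are copied from the table above.

```typescript
// Hypothetical entry shape; the real entries in src/benchmark.ts may differ.
interface CatalogModel {
  name: string;
  size: string;
  context: string;
  inputs: string[];
  intelligenceIndex?: number; // undefined = not individually rated by the index
}

// Three rows copied from the catalog table, used only as sample input.
const SAMPLE_MODELS: CatalogModel[] = [
  { name: "gemma3:270m", size: "292MB", context: "32K", inputs: ["Text"] },
  { name: "gemma3:4b", size: "3.3GB", context: "128K", inputs: ["Text", "Image"], intelligenceIndex: 4 },
  { name: "gpt-oss:20b", size: "14GB", context: "128K", inputs: ["Text"], intelligenceIndex: 24 },
];

// Render the catalog as a Markdown table with the same columns as LLM_TESTS.md.
function renderCatalogTable(models: CatalogModel[]): string {
  const header = [
    "| Model | Size | Context | Inputs | Intelligence |",
    "| --- | --- | --- | --- | --- |",
  ];
  const rows = models.map((m) => {
    // "\u2014" is the dash the table uses for unrated models.
    const score = m.intelligenceIndex === undefined ? "\u2014" : String(m.intelligenceIndex);
    return `| ${m.name} | ${m.size} | ${m.context} | ${m.inputs.join(", ")} | ${score} |`;
  });
  return [...header, ...rows].join("\n");
}

console.log(renderCatalogTable(SAMPLE_MODELS));
```

Keeping the score optional rather than using a sentinel such as `0` lets unrated models render as a dash without being mistaken for a real rating. If the table is in fact maintained by hand, the "generated from" wording in the doc may be worth adjusting.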