Experimental

Edge Inference Benchmark Harness

Reproducible benchmark suite comparing on-device LLM inference across consumer hardware configurations.

About this project

Choosing hardware for on-device LLM inference was guesswork: vendor benchmarks were inconsistent and rarely matched real workloads.

Solution

Open benchmark harness with a small workload library, hardware fingerprinting, and standardised reporting.
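
As a rough illustration of how fingerprinting and standardised reporting could fit together, here is a minimal Python sketch. It is not the harness's actual code: the function names, the report fields, and the generate callable (assumed to take a prompt and return the number of tokens produced) are all hypothetical, and the fingerprint uses only coarse details from Python's standard platform module.

```python
import hashlib
import json
import platform
import statistics
import time


def hardware_fingerprint() -> dict:
    """Collect coarse host details plus a short ID derived from them."""
    info = {
        "machine": platform.machine(),
        "system": platform.system(),
        "processor": platform.processor(),
        "python": platform.python_version(),
    }
    # Hash the sorted JSON so the same host details always yield the same ID.
    digest = hashlib.sha256(json.dumps(info, sort_keys=True).encode()).hexdigest()
    return {**info, "id": digest[:12]}


def run_workload(name, generate, prompt, repeats=3):
    """Time a hypothetical generate(prompt) -> token_count callable.

    Returns one report record with the median tokens/second over `repeats` runs.
    """
    rates = []
    for _ in range(repeats):
        start = time.perf_counter()
        n_tokens = generate(prompt)
        # Guard against a zero interval on very coarse clocks.
        elapsed = max(time.perf_counter() - start, 1e-9)
        rates.append(n_tokens / elapsed)
    return {
        "workload": name,
        "repeats": repeats,
        "tokens_per_second_median": statistics.median(rates),
        "hardware": hardware_fingerprint(),
    }
```

Attaching the fingerprint to every record means results from different machines can be compared or grouped without relying on hand-written labels. A real harness would likely record more detail (accelerator, memory, runtime and model versions), which this sketch leaves out.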

Technology

  • Rust
  • Python
  • llama.cpp
  • ONNX Runtime

Impact

Surfaced 2× gaps between vendor-claimed and measured throughput on representative workloads. Used internally for hardware procurement decisions.
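
For readers wondering how such a gap might be expressed, one simple option is the ratio of claimed to measured throughput. The sketch below assumes that definition and that both figures are in tokens per second; the function name is hypothetical, and the numbers in the usage line are made up for illustration, not results from this project.

```python
def throughput_gap(claimed_tps: float, measured_tps: float) -> float:
    """Return claimed / measured throughput; 2.0 means the claim is twice the measurement."""
    if claimed_tps <= 0 or measured_tps <= 0:
        raise ValueError("throughput values must be positive")
    return claimed_tps / measured_tps


# Illustrative numbers only: a claimed 40 tokens/s against a measured 20 tokens/s.
gap = throughput_gap(40.0, 20.0)  # 2.0
```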