Experimental
Edge Inference Benchmark Harness
Reproducible benchmark suite comparing on-device LLM inference across consumer hardware configurations.
About this project
Choosing hardware for on-device LLM inference was guesswork: vendor benchmarks were inconsistent and rarely matched real workloads.
Solution
Open benchmark harness with a small workload library, hardware fingerprinting, and standardised reporting.
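As a rough illustration of how hardware fingerprinting and standardised reporting could fit together, here is a minimal Python sketch using only the standard library. It is not the project's actual code: the function names (`collect_host_info`, `fingerprint`, `make_report`), the record fields, and the workload name `chat-short` are all assumptions made for this example.

```python
import hashlib
import json
import os
import platform


def collect_host_info() -> dict:
    """Gather coarse hardware/OS facts available from the standard library."""
    return {
        "machine": platform.machine(),
        "system": platform.system(),
        "processor": platform.processor(),
        "cpu_count": os.cpu_count(),
    }


def fingerprint(info: dict) -> str:
    """Hash a canonical JSON encoding so identical configs get identical IDs."""
    canonical = json.dumps(info, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def make_report(workload: str, tokens: int, seconds: float, info: dict) -> dict:
    """Build one result record (hypothetical schema, assumed for this sketch)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return {
        "workload": workload,
        "hardware_id": fingerprint(info),
        "tokens_per_second": tokens / seconds,
        "hardware": info,
    }


if __name__ == "__main__":
    # "chat-short" is a made-up workload name; 256 tokens in 4.0 s is dummy input.
    report = make_report("chat-short", tokens=256, seconds=4.0, info=collect_host_info())
    print(json.dumps(report, indent=2))
```

Sorting keys before hashing means two runs on the same configuration produce the same `hardware_id` regardless of the order in which fields were collected, which keeps results from different machines comparable.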
Technology
- Rust
- Python
- llama.cpp
- ONNX Runtime
Impact
Surfaced 2× performance gaps between vendor-claimed and measured throughput on representative workloads. Used internally for hardware procurement decisions.
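One plausible reading of the gap figure is the ratio of claimed to measured throughput. The Python sketch below assumes that definition; the function name `throughput_gap` and the numbers are illustrative only and are not results from the project.

```python
def throughput_gap(claimed_tps: float, measured_tps: float) -> float:
    """Return how many times higher the claimed throughput is than the measured one."""
    if claimed_tps <= 0 or measured_tps <= 0:
        raise ValueError("throughputs must be positive")
    return claimed_tps / measured_tps


if __name__ == "__main__":
    # Illustrative values only: a claim of 40 tokens/s against 20 tokens/s measured.
    gap = throughput_gap(claimed_tps=40.0, measured_tps=20.0)
    print(f"{gap:.1f}x")  # prints "2.0x"
```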