Edge Inference Benchmark Harness — Matthias Sammer

About this project

On-device LLM choices were guesswork — vendor benchmarks were inconsistent and rarely matched real workloads.

Solution

Open benchmark harness with a small workload library, hardware fingerprinting, and standardised reporting.
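
A minimal sketch of how the three pieces could fit together, not the harness's actual code. Every name here (`hardware_fingerprint`, `RunReport`, `run_workload`, `fake_generate`, the workload name) is hypothetical, and the generator is a stand-in where a real run would call into a backend such as llama.cpp or ONNX Runtime.

```python
import hashlib
import json
import os
import platform
import time
from dataclasses import asdict, dataclass


def hardware_fingerprint() -> dict:
    """Collect coarse host facts plus a short ID derived from them (assumed scheme)."""
    facts = {
        "machine": platform.machine(),
        "system": platform.system(),
        "processor": platform.processor(),
        "cpu_count": os.cpu_count(),
    }
    # Hash the sorted facts so identical hosts always map to the same ID.
    digest = hashlib.sha256(json.dumps(facts, sort_keys=True).encode()).hexdigest()
    return {**facts, "id": digest[:12]}


@dataclass
class RunReport:
    """One standardised result row (hypothetical schema)."""
    workload: str
    tokens: int
    seconds: float
    tokens_per_second: float
    hardware: dict


def run_workload(name: str, generate, prompt: str) -> RunReport:
    """Time a single call to `generate`, which must return the token count produced."""
    start = time.perf_counter()
    tokens = generate(prompt)
    seconds = time.perf_counter() - start
    rate = tokens / seconds if seconds > 0 else 0.0
    return RunReport(name, tokens, seconds, rate, hardware_fingerprint())


def fake_generate(prompt: str) -> int:
    # Stand-in for a real inference backend: sleeps briefly, then
    # reports one "token" per whitespace-separated word in the prompt.
    time.sleep(0.01)
    return len(prompt.split())


report = run_workload("summarise-short", fake_generate, "a b c d e")
print(json.dumps(asdict(report), indent=2))
```

Attaching the fingerprint to every report row is what lets results from different machines be compared without relying on hand-entered hardware labels.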

Technology

Rust
Python
llama.cpp
ONNX Runtime

Impact

Surfaced 2× performance gaps between vendor-claimed and measured throughput on representative workloads. Used internally for hardware procurement decisions.
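
One plausible reading of a 2× gap is the ratio of claimed to measured throughput; the source does not define the metric, so treat this as an assumption. The function name and the numbers below are hypothetical and chosen only to illustrate the arithmetic.

```python
def throughput_gap(claimed_tps: float, measured_tps: float) -> float:
    """Return how many times higher the claimed throughput is than the measured one."""
    if claimed_tps <= 0 or measured_tps <= 0:
        raise ValueError("throughput values must be positive")
    return claimed_tps / measured_tps


# Hypothetical figures, not results from the project.
gap = throughput_gap(claimed_tps=40.0, measured_tps=20.0)
print(f"{gap:.1f}x")  # prints "2.0x"
```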