Odin v0 Performance Benchmarks

These benchmarks compare Odin v0, an NVIDIA Jetson Orin Nano Super paired with the Axelera Metis M.2 D-IMC accelerator, against a standalone NVIDIA Jetson Orin Nano Super across three satellite-relevant workloads.

High-Level Results

| Benchmark | Metric | Odin v0 | NVIDIA Jetson | Delta |
| --- | --- | --- | --- | --- |
| Pose Estimation | Inference latency | 0.82 ms | 6.4 ms | 7.8× faster |
| Pose Estimation | Timing jitter (σ) | 0.40 ms | 17.44 ms | 43.6× tighter |
| Pose Estimation | System power | 10.3 W | 14.8 W | 30.4% lower |
| Cloud Detection | Inference power | 10.1 W | 14.4 W | 29.9% lower |
| Cloud Detection | Scene processing | 29.88 s | 60.34 s | 2.0× faster |
| GEMM (N=1024) | Efficiency | 3,390 GOPS/W | 852 GOPS/W | 3.98× more efficient |
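
The Delta column follows from the two measured columns by simple ratios. The sketch below recomputes a few rows from the table above; the helper names `ratio` and `percent_lower` are illustrative and not part of any benchmark tooling.

```python
def ratio(larger: float, smaller: float) -> float:
    """Return how many times `larger` exceeds `smaller`."""
    return larger / smaller


def percent_lower(baseline: float, candidate: float) -> float:
    """Return the reduction of `candidate` relative to `baseline`, in percent."""
    return 100.0 * (baseline - candidate) / baseline


# Values copied from the results table (standalone Jetson vs. Odin v0).
print(f"Pose latency:    {ratio(6.4, 0.82):.1f}x faster")
print(f"Pose jitter:     {ratio(17.44, 0.40):.1f}x tighter")
print(f"Pose power:      {percent_lower(14.8, 10.3):.1f}% lower")
print(f"Cloud power:     {percent_lower(14.4, 10.1):.1f}% lower")
print(f"GEMM efficiency: {ratio(3390, 852):.2f}x more efficient")
```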

Test Environment

| Component | Value |
| --- | --- |
| Platform | NVIDIA Jetson Orin Nano Super 8GB |
| Accelerator | Axelera Metis M.2 (PCIe 3.0 x4) |
| OS | Ubuntu 22.04 (JetPack 6.2.1) |
| Baseline runtime | TensorRT 10.3 / CUDA 12.6 |
| Accelerator SDK | Voyager SDK v1.5 |
| Precision | Odin v0: INT8 · Jetson: FP16 or INT8 (per test) |
| Power monitoring | INA260 external current sensor |

Methodology

Models are exported to ONNX opset 17, then compiled separately for each backend: trtexec for TensorRT and the Voyager Compiler for the D-IMC accelerator. Clocks are pinned via nvpmodel and jetson_clocks before each run. Each configuration includes a warmup phase followed by a timed measurement loop.
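
A warmup phase followed by a timed loop can be sketched as below. This is a minimal illustration, not the harness used for these results: `run_inference` is a hypothetical callable standing in for one blocking inference call on either backend, the warmup and iteration counts are placeholders, and jitter is taken here as the population standard deviation of per-iteration latency, which is an assumption about how σ is defined.

```python
import statistics
import time
from typing import Callable, Dict, List


def benchmark(run_inference: Callable[[], None],
              warmup: int = 50,
              iterations: int = 1000) -> Dict[str, float]:
    """Time a blocking inference callable after a warmup phase."""
    # Warmup: let caches, clocks, and lazy initialization settle (not timed).
    for _ in range(warmup):
        run_inference()

    # Timed loop: record wall-clock latency of each call in milliseconds.
    # The callable must return only once the device has finished its work.
    samples_ms: List[float] = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_inference()
        samples_ms.append((time.perf_counter() - start) * 1e3)

    return {
        "mean_ms": statistics.fmean(samples_ms),
        "jitter_ms": statistics.pstdev(samples_ms),  # assumed definition of sigma
        "max_ms": max(samples_ms),
    }


# Usage with a CPU-only stand-in workload (placeholder, not a real model).
stats = benchmark(lambda: sum(range(10_000)), warmup=5, iterations=100)
print(stats)
```

Reporting the maximum alongside the mean and σ is a design choice of this sketch: for a real-time pipeline, worst-case latency often matters as much as the average.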

Efficiency (GOPS/W) is computed from net active power, that is, device power during inference minus the device's idle baseline. This keeps the comparison focused on compute efficiency, independent of platform idle draw.
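
As a rough sketch of that calculation: divide achieved throughput by the difference between inference power and idle power. The function names and the numbers in the usage line are made up for illustration and are not measurements from this benchmark.

```python
def net_active_power_w(inference_w: float, idle_w: float) -> float:
    """Power attributable to inference: draw under load minus idle baseline."""
    net = inference_w - idle_w
    if net <= 0:
        raise ValueError("inference power must exceed the idle baseline")
    return net


def gops_per_watt(total_ops: float, elapsed_s: float,
                  inference_w: float, idle_w: float) -> float:
    """Efficiency in GOPS/W based on net active power."""
    gops = total_ops / elapsed_s / 1e9  # achieved throughput in giga-ops/s
    return gops / net_active_power_w(inference_w, idle_w)


# Illustrative numbers only: 2e12 ops in 1 s, 6 W under load, 4 W idle.
print(gops_per_watt(2e12, 1.0, 6.0, 4.0))  # 2000 GOPS / 2 W
```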


Workloads

1. GEMM Raw Throughput

N×N INT8 matrix multiplication (N = 64–1024). Establishes raw compute efficiency and identifies the operating point at which D-IMC spatial execution outperforms GPU temporal execution.
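
For a sense of scale, the operation count of one N×N multiply can be estimated as below. Two points are assumptions: the source does not state how operations are counted, so the common 2·N³ convention (one multiply plus one add per inner-product term) is used, and the sweep is taken as powers of two because only the range 64 to 1024 is given.

```python
def gemm_ops(n: int) -> int:
    """Operations in one N x N by N x N multiply, counting multiply and add."""
    return 2 * n ** 3


# Assumed sweep points; the source only gives the range N = 64 to 1024.
sizes = [64, 128, 256, 512, 1024]

for n in sizes:
    print(f"N={n:4d}: {gemm_ops(n) / 1e9:.4f} GOP per multiply")
```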

2. Spacecraft Pose Estimation

Pose-ResNet50 (4.1 GFLOPs, 224×224 input) for 6-DoF spacecraft pose from monocular video frames. Characterizes inference latency, jitter, and system power for a real-time GNC pipeline.

3. Sentinel-2 Cloud Detection

DTACSNet-CD (U-Net/MobileNetV2, 0.62 GFLOPs) over full 10,980 × 10,980 px Sentinel-2 scenes (2,809 tiles at 224×224). Characterizes tiled-inference throughput and power under sustained load.
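
The tile count depends on the stride between tiles, which the description does not state. The sketch below is a hedged illustration: non-overlapping 224 px tiles would give 2,500 tiles per scene, so the quoted 2,809 (53 × 53) suggests overlapping tiles. A 208 px stride is one value that reproduces it, but that stride is an assumption, not a documented setting.

```python
import math


def tiles_per_side(scene_px: int, tile_px: int, stride_px: int) -> int:
    """Tiles needed along one axis so the last tile reaches the scene edge."""
    if scene_px <= tile_px:
        return 1
    return math.ceil((scene_px - tile_px) / stride_px) + 1


def tile_count(scene_px: int, tile_px: int, stride_px: int) -> int:
    """Total tiles for a square scene covered by square tiles."""
    return tiles_per_side(scene_px, tile_px, stride_px) ** 2


print(tile_count(10_980, 224, 224))  # no overlap
print(tile_count(10_980, 224, 208))  # assumed 16 px overlap
```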