Client notebook for the nostos.benchmarks.pipeline_audit framework.
Question: why are g–g diagrams 1000× more diffuse on raw commercial hardware (Teltonika FMC880) than on a smartphone or BeagleBone whose firmware pre-compensates gravity? Where in the pipeline does the gap open up?
Empirical companion to the eSoleau INPI filing DSO2026011691 (P-PIPELINE,
31 March 2026). See P020.md for the full paper.
from pathlib import Path
import sys
NB_DIR = Path.cwd()
NOSTOS_ROOT = NB_DIR.parent.parent.parent.parent.parent / 'nostos'
sys.path.insert(0, str(NOSTOS_ROOT / 'src'))
from nostos.benchmarks import load_aegis, load_clermont, load_greensboro
from nostos.benchmarks.pipeline_audit import (
    run_pipeline_audit, format_audit_table, DEFAULT_STAGES,
)
from IPython.display import Markdown, display
print('Framework OK')
Framework OK
1. Datasets: enriched version of 14 April
AEGIS is loaded in full mode: 33 trips + gyroscope + OBD (speed via PID 0x0D, a ground truth independent of GPS).
UAH and PVS are planned but not available locally (see scripts/download_public_datasets.sh).
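For context on the OBD ground truth mentioned above: in OBD-II mode 01, PID 0x0D reports vehicle speed as a single byte in km/h. The sketch below decodes such a response and converts it to m/s. The function name and the hex-string input format are illustrative assumptions; how load_aegis actually ingests OBD data is not shown in this notebook.

```python
def decode_obd_speed_mps(response_hex: str) -> float:
    """Decode a mode 01 PID 0x0D response such as '41 0D 3C' into m/s.

    Hypothetical helper for illustration, not part of nostos.
    """
    data = bytes.fromhex(response_hex)  # fromhex ignores the spaces
    # 0x41 = reply to mode 01, 0x0D = vehicle speed PID
    if len(data) < 3 or data[0] != 0x41 or data[1] != 0x0D:
        raise ValueError(f"not a mode 01 PID 0x0D response: {response_hex!r}")
    speed_kmh = data[2]       # one data byte, 0..255 km/h
    return speed_kmh / 3.6    # km/h -> m/s


speed = decode_obd_speed_mps('41 0D 3C')  # 0x3C = 60 km/h
```

With a 1 km/h resolution, the decoded speed is coarse but does not depend on GPS reception, which is what makes it usable as an independent reference.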
datasets = {
    'AEGIS_full': load_aegis(top_n_trips=None, with_gyro=True, with_obd=True),
    'Clermont': load_clermont(),
    'Greensboro': load_greensboro(post_daxos_only=True),
}
for name, df in datasets.items():
    # A sample counts as "moving" when speed exceeds 3 m/s.
    moving_pct = (df['speed_mps'] > 3).mean() * 100
    print(f' {name:12s} : {len(df):>9,d} samples, moving {moving_pct:.1f}%')
 AEGIS_full   : 1,063,350 samples, moving 82.5%
 Clermont     :    10,884 samples, moving 54.8%
 Greensboro   :     3,176 samples, moving 89.4%
2. Cross-stage audit
Successive stages: D0 → D1.a (IMU Calibrator Rodrigues) → D1.b (GPS Cleaner) → D1.c (SQS Scorer). Diffusion metrics are computed at each stage.
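To make the D1.a step more concrete, here is a minimal sketch of Rodrigues' rotation formula used to rotate a measured gravity vector onto the vertical axis. It assumes the calibrator estimates gravity from at-rest samples and levels the sensor frame accordingly; the actual nostos implementation is not shown in this notebook and may differ. All function names are hypothetical.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rodrigues_rotate(v, k, theta):
    """Rotate v about the unit axis k by theta (right-hand rule).

    v_rot = v cos(theta) + (k x v) sin(theta) + k (k . v)(1 - cos(theta))
    """
    c, s = math.cos(theta), math.sin(theta)
    kxv = _cross(k, v)
    kdv = _dot(k, v)
    return tuple(v[i] * c + kxv[i] * s + k[i] * kdv * (1 - c) for i in range(3))


def leveling_rotation(g_meas):
    """Axis and angle that rotate the measured gravity onto +z."""
    norm = math.sqrt(_dot(g_meas, g_meas))
    g_hat = tuple(x / norm for x in g_meas)
    z = (0.0, 0.0, 1.0)
    axis = _cross(g_hat, z)
    s = math.sqrt(_dot(axis, axis))  # sine of the tilt angle
    c = _dot(g_hat, z)               # cosine of the tilt angle
    if s < 1e-12:                    # already aligned, or exactly upside down
        return (1.0, 0.0, 0.0), (0.0 if c > 0 else math.pi)
    return tuple(x / s for x in axis), math.atan2(s, c)


# Example: sensor at rest, tilted 10 degrees about its x axis.
tilt = math.radians(10.0)
g_meas = (0.0, 9.81 * math.sin(tilt), 9.81 * math.cos(tilt))
axis, angle = leveling_rotation(g_meas)
g_leveled = rodrigues_rotate(g_meas, axis, angle)
```

In this example the uncorrected reading carries 9.81·sin(10°) ≈ 1.70 m/s² (about 0.17 g) on a horizontal axis. After the rotation that component vanishes and the full 9.81 m/s² sits on z, which is the kind of gravity leakage a raw device would otherwise inject into the g–g plane.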
df_audit = run_pipeline_audit(datasets)
df_audit[['dataset', 'stage', 'n_samples', 'n_moving', 'hull_area_g2',
          'std_long_g', 'std_lat_g', 'g_norm_mean_mps2']]
validate_d0: lat hors range: [4633.3852, 4731.5476]
validate_d0: lon hors range: [1420.0108, 1540.3036]
Burst sampling: 475 frames @ 24 Hz, effective 3 Hz (gap 124531 ms)
Burst sampling: 60 frames @ 1 Hz, effective 1 Hz (gap 22690 ms)
validate_d0: timestamps non monotones: 11 inversions
Burst sampling: 6 frames @ 0 Hz, effective 0 Hz (gap 26000 ms)
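As an illustration of the metric columns selected above, the following sketch computes a convex-hull area and per-axis standard deviations for a g–g point cloud. It assumes that hull_area_g2 is the area of the convex hull of the (longitudinal, lateral) accelerations expressed in g, that ax is longitudinal and ay lateral, and that std_long_g and std_lat_g are plain standard deviations. The framework's exact definitions (for instance, any filtering to moving samples) are not shown here. The helpers are hypothetical and use only the standard library.

```python
import statistics

G0 = 9.80665  # standard gravity, m/s^2


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def turn(o, a, b):
        # z component of (a - o) x (b - o); > 0 means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and turn(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and turn(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for a simple polygon."""
    n = len(vertices)
    if n < 3:
        return 0.0
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def diffusion_metrics(ax_mps2, ay_mps2):
    """Hull area (g^2) and per-axis spread (g) of a g-g cloud."""
    long_g = [a / G0 for a in ax_mps2]
    lat_g = [a / G0 for a in ay_mps2]
    hull = convex_hull(list(zip(long_g, lat_g)))
    return {
        'hull_area_g2': polygon_area(hull),
        'std_long_g': statistics.pstdev(long_g),
        'std_lat_g': statistics.pstdev(lat_g),
    }


# Toy cloud: four corners at +/-1 g plus one sample at the origin.
metrics = diffusion_metrics(
    [G0, G0, -G0, -G0, 0.0],
    [G0, -G0, G0, -G0, 0.0],
)
```

A hull area is driven entirely by the most extreme samples, so a handful of outliers can inflate it by orders of magnitude, whereas the standard deviations respond to the bulk of the cloud. Reading both together helps separate outlier-driven diffusion from a genuinely wider distribution.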
3. Pivot tables per metric
display(Markdown('### Hull area (g²): area of the *g–g* cloud'))
display(Markdown(format_audit_table(df_audit, 'hull_area_g2')))
display(Markdown('### Std longitudinal (g)'))
display(Markdown(format_audit_table(df_audit, 'std_long_g')))
display(Markdown('### |g| mean (m/s²): IMU scale bias'))
display(Markdown(format_audit_table(df_audit, 'g_norm_mean_mps2')))
4. g–g heatmap mosaic (3 datasets × 4 stages)
from gg_diagram import gg_heatmap
from PIL import Image, ImageDraw, ImageFont
tmp_dir = NB_DIR / 'tmp_panels'
tmp_dir.mkdir(exist_ok=True)
# Render one heatmap per (dataset, stage); stages are applied cumulatively.
panel_paths = {}
for name, df in datasets.items():
    df_cur = df
    for stage_name, stage_func in DEFAULT_STAGES:
        df_cur = stage_func(df_cur)
        fig = gg_heatmap(df_cur, ax_col='ax_mps2', ay_col='ay_mps2',
                         speed_col='speed_mps', hz=10.0)
        pth = tmp_dir / f'{name}_{stage_name}.png'
        fig.write_image(str(pth), scale=1)
        panel_paths[(name, stage_name)] = pth
# Assemble the panels into a grid: one row per dataset, one column per stage.
stage_names = [s[0] for s in DEFAULT_STAGES]
dataset_names = list(datasets.keys())
sample = Image.open(panel_paths[(dataset_names[0], stage_names[0])])
cw, ch = sample.size  # every panel is assumed to share this size
HEADER, LABEL_W = 60, 140
W = LABEL_W + cw * len(stage_names)
H = HEADER + ch * len(dataset_names)
mosaic = Image.new('RGB', (W, H), 'white')
draw = ImageDraw.Draw(mosaic)
try:
    f = ImageFont.truetype('/System/Library/Fonts/Helvetica.ttc', 16)
except OSError:
    # The font path is macOS-specific; fall back to PIL's built-in font.
    f = ImageFont.load_default()
# Column headers (stage names).
for j, s in enumerate(stage_names):
    draw.text((LABEL_W + j * cw + cw // 2 - 40, 20), s, fill='black', font=f)
# Row labels (dataset names) and panels.
for i, d in enumerate(dataset_names):
    draw.text((20, HEADER + i * ch + ch // 2 - 8), d, fill='black', font=f)
    for j, s in enumerate(stage_names):
        img = Image.open(panel_paths[(d, s)])
        mosaic.paste(img, (LABEL_W + j * cw, HEADER + i * ch))
fig_path = NB_DIR.parent / 'figures' / 'diffusion_audit_mosaic.png'
fig_path.parent.mkdir(exist_ok=True)
mosaic.save(str(fig_path))
print(f'sauvé: {fig_path.relative_to(NB_DIR.parent)}')
from IPython.display import Image as IPImage
IPImage(filename=str(fig_path))
validate_d0: lat hors range: [4633.3852, 4731.5476]
validate_d0: lon hors range: [1420.0108, 1540.3036]
Burst sampling: 475 frames @ 24 Hz, effective 3 Hz (gap 124531 ms)
Burst sampling: 60 frames @ 1 Hz, effective 1 Hz (gap 22690 ms)
validate_d0: timestamps non monotones: 11 inversions
Burst sampling: 6 frames @ 0 Hz, effective 0 Hz (gap 26000 ms)
sauvé: figures/diffusion_audit_mosaic.png
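The warnings above report non-monotonic timestamps ("timestamps non monotones") and burst sampling. As a hedged sketch of how such diagnostics could be derived from a list of millisecond timestamps: the helper names, the 1000 ms gap threshold, and the definition of the effective rate (intervals divided by total span, gaps included) are all assumptions, not the framework's actual logic.

```python
def count_inversions(ts_ms):
    """Number of adjacent pairs where time goes backwards."""
    return sum(1 for a, b in zip(ts_ms, ts_ms[1:]) if b < a)


def burst_stats(ts_ms, gap_threshold_ms=1000):
    """Compare the in-burst sampling rate with the effective rate.

    Intervals longer than gap_threshold_ms are treated as gaps between
    bursts (hypothetical threshold chosen for illustration).
    """
    deltas = [b - a for a, b in zip(ts_ms, ts_ms[1:])]
    in_burst = [d for d in deltas if 0 < d <= gap_threshold_ms]
    gaps = [d for d in deltas if d > gap_threshold_ms]
    span_ms = ts_ms[-1] - ts_ms[0] if ts_ms else 0
    burst_hz = 1000.0 * len(in_burst) / sum(in_burst) if in_burst else 0.0
    effective_hz = 1000.0 * len(deltas) / span_ms if span_ms > 0 else 0.0
    return {
        'burst_hz': burst_hz,
        'effective_hz': effective_hz,
        'max_gap_ms': max(gaps, default=0),
    }


# Two bursts of four frames at 10 Hz, separated by a 5 s gap.
ts = [0, 100, 200, 300, 5300, 5400, 5500, 5600]
stats = burst_stats(ts)
inversions = count_inversions([0, 10, 5, 20, 15])
```

The gap between the two rates matters for the heatmaps: a nominal sampling rate passed to a filter or a differentiator is only valid inside a burst, so long gaps should be handled as segment boundaries rather than interpolated across.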

5. Persistence
out_csv = NB_DIR.parent / 'experiments' / 'diffusion_audit_results.csv'
df_audit.to_csv(out_csv, index=False)
print(f'sauvé: {out_csv.relative_to(NB_DIR.parent)}')
sauvé: experiments/diffusion_audit_results.csv