Module
| Field | Value |
|---|---|
| Stage | GPSCleanerStage |
| File | nostos/src/nostos/stages/d1_gps_cleaner.py |
| Version | 0.4 |
| Layer | D0 → D1 |
| Input | Raw D0 (1-100 Hz, multi-rate, with or without speed/gyro) |
| Output | Normalized D1 (speed_mps, dist_m, gps_blackout populated) |
Features
v0.1: Base
- Savitzky-Golay smoothing on lat/lon
v0.2: Advanced cleaning
- GPS jump detection (outlier jumps via implied speed)
- GPS blackout marking (NaN runs longer than a threshold)
- Cold-start drift correction (exponentially decaying bias)
- HDOP-adaptive smoothing (window size varies with GPS quality)
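As an illustration of the blackout-marking idea, here is a minimal sketch in plain Python. The function name `mark_gps_blackouts` and the `min_run` parameter are hypothetical, and the threshold is expressed as a sample count, which is an assumption; the actual stage may use a duration in seconds or a different default.

```python
import math

def mark_gps_blackouts(lat, lon, min_run=3):
    """Flag samples that belong to a run of missing GPS fixes.

    A sample is flagged when it sits inside a run of consecutive NaN
    positions strictly longer than `min_run` samples (assumed threshold).
    """
    missing = [math.isnan(a) or math.isnan(b) for a, b in zip(lat, lon)]
    blackout = [False] * len(missing)
    i = 0
    while i < len(missing):
        if not missing[i]:
            i += 1
            continue
        # Find the end of the current NaN run (j is exclusive).
        j = i
        while j < len(missing) and missing[j]:
            j += 1
        if j - i > min_run:
            for k in range(i, j):
                blackout[k] = True
        i = j
    return blackout
```

Short dropouts stay unflagged so that they can still be filled by interpolation, while longer runs are marked for downstream stages.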
v0.3: Multi-rate
- Speed reconstruction from GPS when speed_mps is missing
- Haversine computation between valid, non-consecutive GPS points (multi-rate)
- Forward-fill + smoothing of the reconstructed speed
v0.4: Adaptive
- Automatic detection of IMU and GPS frequency
- D0 input validation (columns, ranges, monotonicity)
- Smart forward-fill of lat/lon (linear interpolation, max 30 s gap)
- Hybrid distance (speed * dt preferred, haversine as fallback)
- Smoothing parameters adapted to the detected frequency
- Enriched quality report in the artifacts
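The gap-limited filling of lat/lon listed under v0.4 could look roughly like the sketch below. It assumes timestamps in seconds and a single coordinate series per call; `interpolate_gaps` and `max_gap_s` are illustrative names, not the stage's actual API.

```python
import math

def interpolate_gaps(t, values, max_gap_s=30.0):
    """Linearly interpolate NaN runs bounded by two valid samples.

    Gaps whose bounding samples are more than `max_gap_s` seconds apart
    are left as NaN. Leading and trailing NaNs are also left untouched.
    """
    out = list(values)
    valid = [i for i, v in enumerate(values) if not math.isnan(v)]
    for a, b in zip(valid, valid[1:]):
        if b - a < 2:
            continue  # adjacent valid samples, nothing to fill
        span = t[b] - t[a]
        if span <= 0 or span > max_gap_s:
            continue  # non-monotonic time or gap too long
        for k in range(a + 1, b):
            w = (t[k] - t[a]) / span
            out[k] = values[a] + w * (values[b] - values[a])
    return out
```

Calling it once for lat and once for lon keeps the two coordinates on the same time base.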
Key algorithms
Frequency detection
- IMU frequency = 1 / median(dt) over all rows
- GPS frequency = 1 / median(dt) between rows where lat/lon change
- Multi-rate = IMU_Hz > GPS_Hz × 1.5
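The three rules above can be sketched as follows. This is a minimal illustration, not the stage's code: `detect_rates` is a hypothetical name, timestamps are assumed to be in seconds, lat/lon are assumed to be already forward-filled (no NaN), and the fallback used when fewer than two position changes are found is an assumption.

```python
from statistics import median

def detect_rates(t, lat, lon, multirate_factor=1.5):
    """Return (imu_hz, gps_hz, is_multirate) from row timestamps and positions."""
    # IMU rate: median spacing over all rows.
    dts = [b - a for a, b in zip(t, t[1:])]
    imu_hz = 1.0 / median(dts)

    # GPS rate: median spacing between rows where the position changes.
    change_t = [
        t[i] for i in range(1, len(t))
        if (lat[i], lon[i]) != (lat[i - 1], lon[i - 1])
    ]
    gps_dts = [b - a for a, b in zip(change_t, change_t[1:])]
    # Assumed fallback: treat GPS as running at the IMU rate.
    gps_hz = 1.0 / median(gps_dts) if gps_dts else imu_hz

    return imu_hz, gps_hz, imu_hz > gps_hz * multirate_factor
```

The median keeps a few irregular timestamps from skewing the estimate, although it reports only the nominal rate when sampling is bursty.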
Speed reconstruction (v0.3+)
- Identify the indices of valid GPS fixes (lat/lon not NaN)
- For each consecutive pair of valid GPS fixes: distance = haversine(lat1, lon1, lat2, lon2), speed = distance / dt
- Clip to 200 km/h (55.6 m/s)
- Forward-fill + moving-average smoothing (5 points)
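A minimal sketch of these four steps in plain Python follows. The names `haversine_m` and `reconstruct_speed` are hypothetical, a spherical Earth of radius 6,371 km is assumed, and the 5-point moving average is implemented as a trailing window, which is an assumption since the window alignment is not specified.

```python
import math

EARTH_RADIUS_M = 6_371_000.0   # mean Earth radius (spherical assumption)
MAX_SPEED_MPS = 55.6           # 200 km/h clip from the description

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def reconstruct_speed(t, lat, lon, window=5):
    """Speed in m/s from valid GPS fixes, clipped, forward-filled, smoothed."""
    n = len(t)
    speed = [math.nan] * n
    # Step 1: indices of valid fixes.
    valid = [i for i in range(n) if not (math.isnan(lat[i]) or math.isnan(lon[i]))]
    # Steps 2 and 3: distance / dt between consecutive valid fixes, then clip.
    for a, b in zip(valid, valid[1:]):
        dt = t[b] - t[a]
        if dt <= 0:
            continue
        v = haversine_m(lat[a], lon[a], lat[b], lon[b]) / dt
        speed[b] = min(v, MAX_SPEED_MPS)
    # Step 4a: forward-fill rows between fixes.
    last = math.nan
    for i in range(n):
        if math.isnan(speed[i]):
            speed[i] = last
        else:
            last = speed[i]
    # Step 4b: trailing moving average, ignoring leading NaNs.
    smoothed = []
    for i in range(n):
        chunk = [v for v in speed[max(0, i - window + 1): i + 1] if not math.isnan(v)]
        smoothed.append(sum(chunk) / len(chunk) if chunk else math.nan)
    return smoothed
```

Rows before the second valid fix stay NaN because no speed can be derived from a single position.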
Hybrid distance (v0.4)
- If speed_mps is available (> 50% non-NaN): dist = |speed| × dt
- Otherwise: haversine between consecutive GPS points
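The selection rule above might be sketched like this. `hybrid_distance` and `min_valid_ratio` are illustrative names; setting the step distance to 0.0 when an individual sample is NaN is an assumption, since the per-sample behaviour is not described.

```python
import math

def _haversine_m(lat1, lon1, lat2, lon2, radius_m=6_371_000.0):
    """Great-circle distance in metres (spherical Earth assumption)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(math.radians(lon2 - lon1) / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))

def hybrid_distance(t, speed_mps, lat, lon, min_valid_ratio=0.5):
    """Return (per-row distance in metres, source used)."""
    n = len(t)
    dist = [0.0] * n
    n_valid = sum(1 for v in speed_mps if not math.isnan(v))
    # Prefer speed * dt when more than half of the speed samples are usable.
    use_speed = n > 0 and n_valid / n > min_valid_ratio
    for i in range(1, n):
        if use_speed:
            v = speed_mps[i]
            # Assumption: a NaN speed sample contributes no distance.
            dist[i] = 0.0 if math.isnan(v) else abs(v) * (t[i] - t[i - 1])
        else:
            pts = (lat[i - 1], lon[i - 1], lat[i], lon[i])
            # Assumption: a missing fix contributes no distance.
            dist[i] = 0.0 if any(math.isnan(p) for p in pts) else _haversine_m(*pts)
    return dist, "speed" if use_speed else "haversine"
```

Returning the chosen source alongside the distances makes it easy to record the decision in a quality report.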
Unit tests (12 tests)
- Distance consistency (dist_m vs speed*dt, deviation < 20%)
- Multi-rate speed reconstruction
- IMU rotation detection (15° pitch)
- Hard braking detection
- Hz detection at 10/24/100 Hz (3 tests)
- D0 validation (missing columns, OK case)
- GPS forward-fill
- Full GPS Cleaner v0.4 run on multi-rate data
Validated datasets
| Dataset | Country | IMU Hz | GPS Hz | Speed source | Result |
|---|---|---|---|---|---|
| UAH | Spain | 10 | 1 | Direct GPS | OK |
| AEGIS | Austria | 24 | 1 | Reconstructed | OK (75.3 km) |
| PVS | Brazil | 100 | 1 | Forward-fill | OK |
| Accident | India | 1 | 1 | Direct GPS | OK |
| RS3 | France | 10 | 1 | Reconstructed | OK |
| field device | France | 50 (burst 50× per cycle) | 1 | GPS-derived | ✓ multi-rate OK, 2205 outliers |
Field Validation — experimental telematics prototype (urban environment, France, 2025)
Hardware disclosure: the field dataset described in this section comes from an experimental telematics prototype device deployed by Fluidy in a delivery vehicle, not from a commercial Teltonika FMC880. The Teltonika FMC880 was also installed in the vehicle but did not produce usable GPS data during this campaign. A validation on the Teltonika FMC880 is planned for the next deployment campaign.
Dataset: 14 trips, 351,356 points, experimental telematics prototype device, light commercial delivery vehicle.
Multi-rate characteristics:
- IMU: 50 Hz nominal, burst pattern of 50 frames per cycle (1 s capture at 50 Hz, then ~1 s gap, 25 Hz effective)
- GPS: 1 Hz (polyline-encoded)
- Effective multi-rate ratio: 25:1 (50 Hz nominal but 25 Hz effective due to duty cycle)
Key findings for D1 GPS Cleaner:
- Burst pattern detection: The new `detect_burst_pattern()` (v0.5) correctly identifies the bimodal dt distribution: 50 frames at 50 Hz native followed by a ~1020 ms gap, yielding a 25 Hz effective rate. The cleaner now reports: `Burst sampling: 50 frames @ 50 Hz, effective 25 Hz (gap 1020 ms)`. Field-validated on 612 bursts in a single 19-min trip from the prototype device.
- Outlier detection: 2,205 points flagged as outliers (>55.6 m/s implied speed). These come from GPS polyline interpolation artifacts, not actual sensor errors. This is a new failure mode not seen in research datasets.
- Blackout detection: 56 ticks marked as GPS blackout. Consistent with the 1 Hz GPS rate and inter-trip gaps.
- Gravity-compensated accelerometer: The field device firmware subtracts gravity before transmission. The cleaner’s QSD (Quasi-Static Detection) still works because it uses variance thresholds, not absolute gravity magnitude. However, the `speed_reconstruction` from IMU double-integration is affected (initial condition shifted by ~0.45 m/s² bias).
- Contribution (burst sampling): No published GPS/IMU cleaning pipeline documents this burst pattern. The 50-frame burst creates a distinctive bimodal dt signature: 1 s of 50 Hz data followed by ~1 s of silence. Naive median-based frequency detection reports 50 Hz (correct nominal but misleading on data density). The new `detect_burst_pattern()` function (v0.5) returns a richer descriptor: `{nominal_hz: 50, effective_hz: 25, burst_size: 50, gap_ms: 1020}`. Downstream stages (smoothing windows, SQS scoring) should use `effective_hz`, not `nominal_hz`.
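To make the bimodal dt signature concrete, the sketch below derives a descriptor with the same four keys from raw timestamps. It is not the implementation of `detect_burst_pattern()`; the function name `sketch_burst_descriptor`, the `gap_factor` threshold, and the way the effective rate is computed (burst size divided by one full cycle) are all assumptions made for illustration.

```python
from statistics import median, median_low

def sketch_burst_descriptor(t, gap_factor=5.0):
    """Describe burst sampling from timestamps in seconds (illustrative only)."""
    dts = [b - a for a, b in zip(t, t[1:])]
    base_dt = median(dts)              # dominant in-burst spacing
    nominal_hz = 1.0 / base_dt

    # Split the stream into runs separated by unusually long gaps.
    gaps, runs, count = [], [], 1
    for dt in dts:
        if dt > gap_factor * base_dt:  # assumed gap criterion
            gaps.append(dt)
            runs.append(count)
            count = 1
        else:
            count += 1
    runs.append(count)

    if not gaps:
        # Continuous sampling: effective rate equals nominal rate.
        return {"nominal_hz": nominal_hz, "effective_hz": nominal_hz,
                "burst_size": None, "gap_ms": 0.0}

    burst_size = median_low(runs)
    gap_s = median(gaps)
    # One cycle = in-burst span plus the gap to the next burst.
    cycle_s = (burst_size - 1) * base_dt + gap_s
    return {"nominal_hz": nominal_hz, "effective_hz": burst_size / cycle_s,
            "burst_size": burst_size, "gap_ms": gap_s * 1000.0}
```

With 50 frames spaced 20 ms apart and a 1020 ms gap, one cycle lasts 0.98 s + 1.02 s = 2 s, so 50 frames per cycle gives the 25 Hz effective rate quoted above.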
New row for datasets table:
| Dataset | Country | IMU Hz | GPS Hz | Speed | Pattern | Result |
|---|---|---|---|---|---|---|
| field device | France | 50 (burst 50f/cycle, eff. 25 Hz) | 1 | GPS-derived | Burst 25:1 | ✓ burst-aware (v0.5) |