pipeline --steps
> 4-stage PySpark data engineering pipeline — ETL → Analytics → Visualization → BI
143 CAN-bus signals · 10 Hz
Segment · Feature-engineer
+ CSV aggregates
Optimal speed: 70 km/h
Loads 441K rows of raw CAN-bus telemetry from 20 CSV files. Parses pipe-separated columns, keeps only rows recorded in the driving state, segments trips, and engineers Wh/km, speed-bin, and driving-mode features using Spark window functions.
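To make the segmentation and Wh/km steps concrete, here is a minimal plain-Python sketch of the logic, written without Spark so it runs anywhere. It is not the pipeline's actual code: the column names (`ts`, `state`, `speed_kmh`, `power_kw`), the `DRIVE` state value, the 60-second gap threshold, and the sample rows are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample in a pipe-separated layout; column names are assumptions.
RAW = """ts|state|speed_kmh|power_kw
0.0|DRIVE|36.0|7.2
1.0|DRIVE|36.0|7.2
2.0|PARK|0.0|0.0
100.0|DRIVE|72.0|14.4
101.0|DRIVE|72.0|14.4
102.0|DRIVE|72.0|14.4
"""

GAP_S = 60.0  # assumed threshold: a longer pause between driving rows starts a new trip


def segment_trips(rows, gap_s=GAP_S):
    """Keep driving rows and assign a trip id that increments after each long gap."""
    samples, trip_id, prev_ts = [], 0, None
    for row in rows:
        if row["state"] != "DRIVE":
            continue  # filter out non-driving rows
        ts = float(row["ts"])
        if prev_ts is not None and ts - prev_ts > gap_s:
            trip_id += 1
        samples.append({
            "trip_id": trip_id,
            "ts": ts,
            "speed_kmh": float(row["speed_kmh"]),
            "power_kw": float(row["power_kw"]),
        })
        prev_ts = ts
    return samples


def wh_per_km(samples):
    """Integrate energy and distance over consecutive samples of one trip."""
    energy_wh = dist_km = 0.0
    for a, b in zip(samples, samples[1:]):
        dt_h = (b["ts"] - a["ts"]) / 3600.0
        energy_wh += a["power_kw"] * 1000.0 * dt_h
        dist_km += a["speed_kmh"] * dt_h
    return energy_wh / dist_km if dist_km else None


rows = list(csv.DictReader(io.StringIO(RAW), delimiter="|"))
by_trip = {}
for s in segment_trips(rows):
    by_trip.setdefault(s["trip_id"], []).append(s)
result = {tid: wh_per_km(samples) for tid, samples in by_trip.items()}
print(result)
```

In Spark the same idea is commonly expressed with `lag` over a window ordered by timestamp to flag gaps, followed by a running sum of that flag to produce the trip id.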
Generates 6 dark-theme Matplotlib/Seaborn charts: energy vs speed curve, autopilot comparison panels, combined cost overlay, SOC analysis, traffic impact, and a full dashboard PNG.
Transforms aggregated results into a star schema for Power BI: 4 fact tables and 3 dimension tables. Generates DAX measures and a full POWERBI_INSTRUCTIONS.txt with relationship diagrams.
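As a rough illustration of the star-schema split, the sketch below turns flat per-speed aggregates into one dimension table and one fact table joined by a surrogate key. The table names (`dim_speed_bin`, `fact_energy`), the column names, and the key scheme are hypothetical; the page does not list the actual tables. The Wh/km values are the ones shown in the results table on this page.

```python
import csv
import io

# Wh/km per speed bin, as listed in the results table on this page.
AGGREGATES = [
    {"speed_kmh": 40, "wh_per_km": 356.0},
    {"speed_kmh": 50, "wh_per_km": 360.8},
    {"speed_kmh": 60, "wh_per_km": 369.1},
    {"speed_kmh": 70, "wh_per_km": 202.2},
    {"speed_kmh": 80, "wh_per_km": 295.5},
]


def build_star(aggregates):
    """Split flat aggregates into one dimension table and one fact table."""
    speeds = sorted({row["speed_kmh"] for row in aggregates})
    # Surrogate keys start at 1; the dimension holds the descriptive attributes.
    dim = [
        {"speed_bin_key": key, "speed_kmh": speed, "label": f"{speed} km/h"}
        for key, speed in enumerate(speeds, start=1)
    ]
    key_of = {d["speed_kmh"]: d["speed_bin_key"] for d in dim}
    # The fact table stores only the foreign key and the measure.
    fact = [
        {"speed_bin_key": key_of[row["speed_kmh"]], "wh_per_km": row["wh_per_km"]}
        for row in aggregates
    ]
    return dim, fact


def to_csv(rows):
    """Serialize a list of dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


dim_speed_bin, fact_energy = build_star(AGGREGATES)
print(to_csv(dim_speed_bin))
print(to_csv(fact_energy))
```

In Power BI, the two files would then be related one-to-many on `speed_bin_key`, from the dimension to the fact table.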
Computes the combined cost function C = 0.5×(energy_cost) + 0.5×(time_cost) across speed bins. Includes a Spark 4.0 CTE fix and a pure Pandas fallback for environments without Spark.
Results & Analysis
Key findings from 522 validated Wh/km samples across 11 test drives
| Speed | Wh/km | Energy Cost | Time Cost | Combined |
|---|---|---|---|---|
| 70 km/h | 202.2 | 0.202 | 0.143 | 0.173 ✓ |
| 80 km/h | 295.5 | 0.296 | 0.125 | 0.210 |
| 60 km/h | 369.1 | 0.369 | 0.167 | 0.268 |
| 50 km/h | 360.8 | 0.361 | 0.200 | 0.280 |
| 40 km/h | 356.0 | 0.356 | 0.250 | 0.303 |
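The Combined column can be reproduced from the Wh/km values with the equal-weight cost function C = 0.5×(energy_cost) + 0.5×(time_cost). The scalings used below (energy cost as Wh/km divided by 1000, time cost as 10 divided by speed in km/h) are inferred from the table values and are an assumption, not a confirmed description of the pipeline's code. The function names are illustrative.

```python
# Wh/km per speed bin, copied from the results table.
WH_PER_KM = {40: 356.0, 50: 360.8, 60: 369.1, 70: 202.2, 80: 295.5}

W_ENERGY = 0.5  # weight on energy cost
W_TIME = 0.5    # weight on time cost


def energy_cost(wh_per_km):
    # Assumed scaling inferred from the table: Wh/km divided by 1000.
    return wh_per_km / 1000.0


def time_cost(speed_kmh):
    # Assumed scaling inferred from the table: 10 divided by speed.
    return 10.0 / speed_kmh


def combined_cost(speed_kmh, wh_per_km):
    """Equal-weight sum of the two cost terms for one speed bin."""
    return W_ENERGY * energy_cost(wh_per_km) + W_TIME * time_cost(speed_kmh)


costs = {speed: combined_cost(speed, wh) for speed, wh in WH_PER_KM.items()}
ranking = sorted(costs, key=costs.get)  # cheapest speed bin first
optimal_speed = ranking[0]

for speed in ranking:
    print(f"{speed} km/h  combined={costs[speed]:.3f}")
print("optimal:", optimal_speed, "km/h")
```

Under these assumptions the sketch matches the table to three decimals and ranks the bins in the same order, with 70 km/h as the minimum.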
visualizations --charts
> 6 dark-theme charts generated by the pipeline
Wh/km vs speed curve. Optimal zone (±15 km/h) marked. Physics annotations explain rolling resistance at low speed and aerodynamic drag at high speed.
4-panel grouped bar chart: energy consumption, average speed, jerk, and speed variability for Autopilot / ACC-Only / Manual modes.
Normalized energy + time cost overlay. Global minimum at 70 km/h clearly visible.
SOC% vs distance + linear regression for range estimation. Energy consumption histogram with mean/median lines.
Wh/km and avg speed by traffic density: No Traffic → Light → Moderate → Heavy.
Complete 5-panel summary dashboard combining all key findings in a single view.
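The range estimate behind the SOC chart can be sketched as an ordinary least-squares line through SOC% versus distance, extended to 0% SOC. This is a minimal illustration and not necessarily how the pipeline computes it. The data points are made up for the example and are not measurements from the test drives, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimated_range_km(slope, soc_now):
    """Distance until the fitted line reaches 0% SOC, starting from soc_now."""
    if slope >= 0:
        raise ValueError("SOC must decrease with distance for a range estimate")
    # The line loses |slope| percentage points per km, so soc_now lasts this far.
    return soc_now / -slope


# Made-up illustrative points (distance in km, SOC in %), not measured data.
distance_km = [0.0, 10.0, 20.0, 30.0, 40.0]
soc_pct = [80.0, 77.5, 75.0, 72.5, 70.0]

slope, intercept = fit_line(distance_km, soc_pct)
remaining = estimated_range_km(slope, soc_now=70.0)
print(f"slope={slope:.3f} %/km, intercept={intercept:.1f} %")
print(f"estimated remaining range: {remaining:.0f} km")
```

A straight line is the simplest model here. Real SOC curves can bend with temperature, speed, and battery state, so the extrapolated range is only a first-order estimate.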
tech --stack
> Production-grade data engineering tools from ingestion to insight