From Simulation to Reality: Building Predictive Maintenance for Connected Vehicles

Connected vehicles generate enormous amounts of data, but accessing real-world fleets for testing is costly, time-consuming, and often impractical in the early stages. At Sigma Software, we tackled this challenge by combining cloud-native architectures with simulated telemetry — sourced from both game environments, such as Euro Truck Simulator and Forza Horizon, and a custom hardware stand that mimics key vehicle systems.

This approach enabled us to rapidly prototype predictive maintenance solutions, validate data pipelines, and deliver engaging, real-time demonstrations for stakeholders — all without requiring live fleet access. In this article, we share our practical insights, architectural decisions, and lessons learned, offering guidance for engineers, cloud architects, and innovation leaders exploring Software-Defined Vehicles (SDVs), connected fleets, and scalable predictive analytics.

Background and Motivation

Developing cloud-based solutions for connected vehicles is both thrilling and challenging. Modern cars generate massive streams of data from brakes, engines, tyres, GPS, and infotainment systems. Making sense of this data in real time requires scalable pipelines, reliable architectures, and flexible integration.

Early-stage projects face several realities:

  • Data is fast, diverse, and massive. Telemetry from brakes, engines, tyres, GPS, and infotainment systems pours in nonstop, which demands scalable pipelines and robust architectures capable of handling both the volume and the velocity.
  • Enterprise integration is essential. Predictive insights are only valuable when they inform business tools, such as ERPs, fleet dashboards, or third-party systems. That often means navigating API differences, message formats, and middleware.
  • The tech never stands still. Sensors, cloud platforms, and ML tools are constantly evolving, so we needed an architecture that could evolve with them without full rewrites.
  • Standards are catching up. With so many vendors in play, aligning around open standards like COVESA’s VSS or Eclipse SDV’s blueprint helps us stay interoperable and avoid vendor lock-in.

To navigate these challenges, we chose to simulate vehicle telemetry using both a physical hardware stand and game environments. This strategy allowed us to experiment quickly, validate cloud-first architectures, and demonstrate real-time predictive maintenance — all before having access to live fleets. By doing so, we built a platform that is scalable, modular, and ready for production-grade deployments.

System Architecture and Implementation

The Fleet Predictive Maintenance R&D project is built on a modular, cloud-native architecture designed to handle telemetry from both real and simulated connected vehicles. Inspired by the Eclipse SDV Fleet Management Blueprint and Microsoft’s Software-Defined Vehicle (SDV) reference model, our setup emphasises scalability, interoperability, and ease of experimentation.

Cloud and Data Architecture

At the heart of our solution is the Azure ecosystem, chosen for its integration capabilities and enterprise-grade scalability. Key services include:

  • Azure Event Hubs for high-throughput ingestion of telemetry data from simulators and hardware
  • Azure Databricks for data transformation, ML training, and inference using the Medallion (Bronze-Silver-Gold) architecture
  • Azure Data Lake Storage to persist raw and processed telemetry
  • Databricks AutoML to train predictive models, such as brake pad wear estimation
  • Power BI (transitioning to Databricks Apps) for real-time dashboards and stakeholder visibility
  • Azure Event Grid for routing incoming messages and decoupling producers from consumers

We adopted the Medallion architecture pattern to structure the data pipeline. The Bronze layer captures raw input (e.g. telemetry from Raspberry Pi or driving simulators), while the Silver layer cleans and enriches it. The Gold layer holds curated datasets optimised for ML and visualisation. AutoML automates model training, selection, and hyperparameter tuning, with the resulting model deployed via Databricks Managed MLflow.
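
To make the layered flow more concrete, here is a minimal PySpark sketch of how telemetry might move from Bronze to Silver to Gold in Databricks. The table names, columns, schema, and aggregation window are illustrative placeholders rather than the project's actual notebooks.

```python
# Minimal sketch of a Bronze -> Silver -> Gold flow in Databricks (PySpark + Delta Lake).
# Table, column, and checkpoint names are illustrative placeholders; `spark` is the
# session provided by the Databricks notebook environment.
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

payload_schema = StructType([
    StructField("vehicle_id", StringType()),
    StructField("speed_kmh", DoubleType()),
    StructField("brake_pressure", DoubleType()),
    StructField("timestamp", StringType()),
])

# Bronze: raw events as they arrived from Event Hubs (JSON body per record)
bronze = spark.readStream.table("telemetry_bronze")

# Silver: parse, type, and clean the raw payload
silver = (
    bronze
    .withColumn("payload", F.from_json(F.col("body").cast("string"), payload_schema))
    .select("payload.*")
    .withColumn("event_time", F.to_timestamp("timestamp"))
    .filter(F.col("speed_kmh").isNotNull())
)

(silver.writeStream
    .option("checkpointLocation", "/mnt/checkpoints/telemetry_silver")
    .toTable("telemetry_silver"))

# Gold: curated aggregates per vehicle, ready for ML training and dashboards
# (shown here as a batch step for brevity)
gold = (
    spark.read.table("telemetry_silver")
    .groupBy("vehicle_id", F.window("event_time", "5 minutes"))
    .agg(F.avg("brake_pressure").alias("avg_brake_pressure"),
         F.avg("speed_kmh").alias("avg_speed_kmh"))
)
gold.write.mode("overwrite").saveAsTable("telemetry_gold")
```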

Diagram: End-to-end cloud architecture for processing and analysing connected vehicle telemetry using Azure. The pipeline covers real-time ingestion, storage, ML prediction, and live visualisation.

Diagram: Medallion architecture used in Azure Databricks to organise telemetry data into Bronze (raw), Silver (cleansed), and Gold (curated) layers. This layered structure enables reliable analytics and model training.

Implementation Highlights

Environments and Source Control

We use a blue-green deployment strategy with dedicated environments (and corresponding Git branches): one for active development and one for demo-ready, stable versions. Each environment maintains its own configurations – Event Hubs, Power BI dashboards, and Event Grid resources – while sharing a unified Databricks workspace. Notebooks and infrastructure-as-code (Bicep templates) are organised in per-environment folders for isolation and maintainability.

Telemetry Simulation and Hardware Integration

To bridge the gap between concept and real-world data, we simulate vehicle telemetry in two ways:

  • Game-based simulation (Euro Truck Simulator 2 and Forza Horizon): Telemetry is extracted using shared memory or UDP interfaces, passed through Redis, and forwarded to Event Hubs via a standardised telemetry publisher. This data is aligned to the COVESA Vehicle Signal Specification (VSS); a minimal sketch of the publisher follows this list.
  • Custom hardware stand: A quarter-car physical setup equipped with a Raspberry Pi, electric motor, steering system, and 3D-printed brake pads generates real-time sensor data, including steering angle, speed, braking pressure, and shock detection. The Raspberry Pi transmits data via MQTT to Azure IoT Hub, which then forwards it into Event Hubs.
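
To make the game-based flow concrete, the following is a minimal sketch of what such a telemetry publisher can look like, assuming Redis and the azure-eventhub Python SDK. The Redis key, VSS paths, and connection string are illustrative placeholders rather than our actual implementation.

```python
# Minimal sketch of a telemetry publisher: read simulator values from Redis,
# map them to VSS-style signal paths, and forward them to Azure Event Hubs.
# Key names, VSS paths, and the connection string are illustrative placeholders.
import json
import time

import redis
from azure.eventhub import EventHubProducerClient, EventData

# Mapping from simulator keys (as written to Redis) to assumed VSS signal paths
VSS_MAP = {
    "speed": "Vehicle.Speed",
    "brake_pressure": "Vehicle.Chassis.Brake.PedalPosition",
    "engine_rpm": "Vehicle.Powertrain.CombustionEngine.Speed",
}

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
producer = EventHubProducerClient.from_connection_string(
    conn_str="<EVENT_HUBS_CONNECTION_STRING>", eventhub_name="telemetry"
)

while True:
    raw = r.hgetall("telemetry:latest")            # simulator writes latest values here
    signals = {VSS_MAP[k]: float(v) for k, v in raw.items() if k in VSS_MAP}
    if signals:
        batch = producer.create_batch()
        batch.add(EventData(json.dumps({"timestamp": time.time(), "signals": signals})))
        producer.send_batch(batch)
    time.sleep(1)                                  # publish roughly once per second
```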

The hardware stand uses Eclipse KUKSA for internal signal management. It supports integration with gaming input via an Arduino Nano, making it a versatile platform for both data generation and live demos.
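
On the hardware side, the Raspberry Pi's publishing loop can be as simple as the sketch below, assuming the azure-iot-device Python SDK (which communicates with IoT Hub over MQTT). The sensor read function and the device connection string are placeholders for illustration.

```python
# Minimal sketch of the Raspberry Pi publishing sensor readings to Azure IoT Hub,
# which forwards them on to Event Hubs. Sensor values and the device connection
# string are placeholders; the real stand reads its signals via Eclipse KUKSA.
import json
import time

from azure.iot.device import IoTHubDeviceClient, Message

client = IoTHubDeviceClient.create_from_connection_string("<DEVICE_CONNECTION_STRING>")
client.connect()

def read_sensors() -> dict:
    # Placeholder for the actual sensor reads (steering angle, speed, braking, shock)
    return {"steering_angle": 12.5, "speed": 38.0, "brake_pressure": 0.6, "shock": False}

try:
    while True:
        msg = Message(json.dumps(read_sensors()))
        msg.content_type = "application/json"
        msg.content_encoding = "utf-8"
        client.send_message(msg)
        time.sleep(1)
finally:
    client.shutdown()
```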

Photos: Hardware stand simulating core automotive functions like steering, braking, and engine operation – used to generate realistic telemetry data for cloud processing and demos

Diagrams: Standardised telemetry data flow from Euro Truck Simulator 2 and Forza Horizon: from in-game data extraction to Redis, through the Telemetry Publisher, and into Azure Event Hubs.

Real-Time Prediction and Visualisation

Once telemetry data reaches the Gold layer, it’s processed by a Prediction Worker, a Databricks job that applies the trained ML model to generate insights such as remaining brake pad lifespan. Results are sent to a Power BI Live Dataset or Databricks App dashboard for immediate visibility. A parallel handler also streams raw telemetry into visual dashboards, enabling a unified view of both raw and predicted signals.
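
As an illustration, a Prediction Worker of this kind can be expressed as a short Databricks job along the following lines, assuming a model registered in MLflow. The model name, stage, feature columns, and table names are illustrative assumptions, not the project's actual artefacts.

```python
# Minimal sketch of a Prediction Worker: load the registered model from MLflow and
# score the latest Gold-layer features. Model, table, and column names are
# illustrative placeholders; `spark` is the Databricks notebook session.
import mlflow

# Load the model produced by AutoML and registered in MLflow (assumed name/stage)
model = mlflow.pyfunc.load_model("models:/brake_pad_wear/Production")

# Latest curated features from the Gold layer
features = (
    spark.read.table("telemetry_gold")
    .select("vehicle_id", "avg_brake_pressure", "avg_speed_kmh")
    .toPandas()
)

# Predict remaining brake pad lifespan and persist the results for the dashboards
features["predicted_pad_life_km"] = model.predict(
    features[["avg_brake_pressure", "avg_speed_kmh"]]
)
spark.createDataFrame(features).write.mode("append").saveAsTable("brake_wear_predictions")
```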

Dashboard: A live dashboard displaying predicted brake pad wear over time. Real-time insights enable fleet operators to schedule proactive maintenance and minimise vehicle downtime.

Challenges, Lessons Learned, and Best Practices

Key Technical Hurdles and Mitigation Strategies

Throughout the Fleet Predictive Maintenance R&D, we encountered a range of technical challenges across both the cloud and the physical simulation setup. Solving them took a mix of creative thinking and practical engineering to keep the system stable, scalable, and demo-ready.

  1. Simulating Real-Time Brake Pad Wear
    At exhibitions, we needed something tangible to demonstrate brake wear within minutes – real brake pads obviously wouldn’t suffice.
    Solution: We used 3D-printed pads designed to wear down visibly during short demo sessions. It brought both realism and a wow factor to our stand.
  2. Network Instability During Live Demos
    Exhibition networks are famously unreliable. Even a brief disconnection caused telemetry gaps, resulting in jittery and inconsistent dashboards.
    Solution: We added local buffering and retry logic to the simulator PC, while smoothing out missing values on the Power BI side to mask minor gaps (see the buffering sketch after this list).
  3. Cloud Costs and Resource Management
    Running everything 24/7 during development and demos threatened to blow up our cloud bill.
    Solution: We leveraged Databricks’ auto-shutdown settings and scaled down cluster sizes for demos, ensuring the environment remained responsive yet cost-efficient.
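
To illustrate the buffering approach from point 2, here is a minimal sketch of a sender that queues telemetry locally and retries with backoff when the network drops. The transport call and retry parameters are assumptions, not the exact demo code.

```python
# Minimal sketch of local buffering with retries for flaky exhibition networks.
# `send_to_cloud` stands in for whatever transport is used (Event Hubs, IoT Hub);
# the queue size and backoff values are illustrative.
import time
from collections import deque

BUFFER = deque(maxlen=10_000)   # drop the oldest samples if an outage lasts too long

def send_to_cloud(message: dict) -> None:
    """Placeholder for the real publisher call; raises on network errors."""
    raise NotImplementedError

def publish(message: dict, max_retries: int = 3) -> None:
    BUFFER.append(message)
    while BUFFER:
        pending = BUFFER[0]
        for attempt in range(max_retries):
            try:
                send_to_cloud(pending)
                BUFFER.popleft()              # only discard after a successful send
                break
            except Exception:
                time.sleep(2 ** attempt)      # exponential backoff: 1s, 2s, 4s
        else:
            return                            # still offline: keep the buffer, try later
```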

Insights Gained from the R&D

The R&D phase was not just about building a solution – it also helped us develop reusable patterns and validate architectural decisions.

Simulation Should Be Practical – and Engaging

From the spinning motor to visibly eroding brake pads, our stand wasn’t just functional – it drew people in. A good demo can serve as both a conversation starter and a proof point.

Cloud and Edge Are One Pipeline, Not Two

It’s easy to think of the hardware and cloud as separate systems. But issues like dropped packets reminded us to treat everything from sensor to dashboard as one continuous flow – and to build resilience into both ends.

Visual Impact Drives Stakeholder Buy-In

We found that nothing beats showing data in motion. Watching the system respond in real time, with physical components and live dashboards, was far more persuasive than static charts or technical slides.

Start Simple, Then Scale

Beginning with simulator data (Euro Truck Simulator, Forza) gave us speed and flexibility. The hardware stand bridged the gap toward realism – without waiting on OEM access or legal agreements.

Best Practices for Cloud-Driven Automotive Solutions

From architecture to execution, several key lessons emerged that can help guide similar automotive cloud projects.

  • Design with evolution in mind: Simulated data is just the start. Build pipelines and models that can later accept real-world feeds without a complete overhaul.
  • Expect things to break: Plan for buffering, retries, and graceful degradation from the start – especially in environments like expos or field trials.
  • Make the data visible: Use physical cues and intuitive dashboards to tell the story of your telemetry. If your audience can see the value, they’ll understand it faster.
  • Stick to scalable cloud-native components: Azure Event Hubs, Functions, Databricks – these choices allowed us to scale smoothly as complexity grew.
  • Think business-first, not just tech-first: Our telemetry was structured with use cases in mind – predictive maintenance, fleet analytics, even insurance scoring – not just for technical completeness.

Future Directions and Scalability

Transitioning from Simulation to Real-World Vehicle Data

With a solid foundation in place, our next step is bridging simulation with reality. We’ll begin integrating anonymised datasets from commercial telematics providers. These offer real-world structure and variability without legal overhead, giving us a critical proving ground before ingesting live vehicle data.

This phased approach ensures we:

  • Stress-test our ingestion and analytics pipelines using authentic signal patterns,
  • Fine-tune models with diverse, domain-specific inputs,
  • Stay agile by postponing OEM-grade security and compliance hurdles until we’re fully production-ready.

Once validated, the platform will begin handling real-time telemetry from fleet partners with minimal architectural changes.

Roadmap: What’s Next

Several near-term improvements are underway to enhance the platform and prepare it for production-grade deployments:

  • Open Collaboration: We plan to release our work as an Eclipse SDV blueprint, ensuring it is modular, documented, and aligned with SDV standards. This includes simulation sources (Euro Truck Simulator, Forza Horizon, hardware stand), along with cloud deployment templates and reusable modules.
  • Hardware Showcase: Our hardware stand – a physical demo unit featuring sensors and moving parts – will soon be ready for exhibition. It’s not just a data source but also a conversation starter at events.
  • New Signals: We’re expanding our telemetry coverage to include signals such as battery health, DPF regeneration, driver behaviour metrics, and fuel system efficiency.
  • Deeper Analytics: A dedicated ML/AI engineer will join the team to improve feature engineering, refine model selection, and boost explainability.
  • Edge Intelligence: Preprocessing at the edge will reduce cloud load and latency, which is particularly important as we scale to real-world vehicles.
  • Visualisations: As Power BI streaming is deprecated, we’re shifting to Databricks Apps for interactive dashboards – natively integrated, more customisable, and built for scale.
  • API Integration: Work is underway to expose predictions via APIs for integration into customer ERP, fleet management, and maintenance planning systems; a minimal sketch of such an endpoint follows below.
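
As a flavour of what that integration surface could look like, below is a minimal FastAPI sketch exposing a brake-wear prediction endpoint. The framework choice, route, payload shape, and lookup logic are illustrative assumptions rather than a committed design.

```python
# Minimal sketch of an API surface for exposing predictions to ERP / fleet systems.
# Framework choice (FastAPI), route, and payload shape are illustrative assumptions.
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="Predictive Maintenance API (sketch)")

class BrakeWearPrediction(BaseModel):
    vehicle_id: str
    predicted_pad_life_km: float
    generated_at: str

# Placeholder store; in practice this would query the predictions table in Databricks
PREDICTIONS: dict[str, BrakeWearPrediction] = {}

@app.get("/vehicles/{vehicle_id}/brake-wear", response_model=BrakeWearPrediction)
def get_brake_wear(vehicle_id: str) -> BrakeWearPrediction:
    prediction = PREDICTIONS.get(vehicle_id)
    if prediction is None:
        raise HTTPException(status_code=404, detail="No prediction for this vehicle")
    return prediction
```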

Conclusion

The Fleet Predictive Maintenance R&D project aimed to explore how cloud-native technologies can power connected vehicle analytics – and it succeeded. By creatively combining game-based telemetry, a custom hardware simulation stand, and Microsoft’s SDV reference architecture, we built an end-to-end solution capable of ingesting, processing, and analysing real-time vehicle data – even without a live fleet.

Our architecture, grounded in modular and scalable Azure components, has proven effective for simulating predictive maintenance use cases, such as forecasting brake pad wear. We leveraged AutoML for model training, Databricks for processing, and Power BI (soon to be replaced by Databricks Apps) for real-time insights, ensuring the entire pipeline is both demonstrable and extensible.

Key Takeaways

  • Simulated data works. With Euro Truck Simulator, Forza Horizon, and a physical hardware stand, we created realistic and varied telemetry streams for R&D and demonstrations.
  • Cloud-native design scales. Microsoft’s SDV blueprint provided a strong foundation, and tools like Event Hubs, Databricks, and AutoML enabled us to move quickly from prototype to production readiness.
  • Business value drives architecture. We focused on predictive maintenance and fleet analytics – use cases that resonate with insurers, logistics providers, and smart city platforms.
  • The journey isn’t over. We’re now preparing to incorporate anonymised real-world datasets, extend our ML capabilities, and contribute our work to the Eclipse SDV community as an open-source blueprint.

The automotive industry is entering a new phase, where innovation doesn’t have to start in the vehicle. By decoupling from in-vehicle software and OTA complexities, we’ve created a flexible platform that supports a wide range of value-added services, from predictive maintenance to risk modelling and beyond.

In closing, our journey shows that innovation in connected mobility doesn’t have to wait for access to real vehicles. With the right architecture, simulations, and vision, it’s possible to build meaningful, scalable automotive solutions from the cloud outward.
