Last changes: January 13, 2016
In this advanced analytics scenario, we show how car dealers, insurers, and automobile manufacturers can use Cortana Analytics, including Power BI, to gain real-time and predictive insights into vehicle health and driving behavior patterns.
The solution can be applied to the following business use cases:
- Usage-based insurance
- Vehicle diagnostics
- Engine emission control
- Engine performance remapping
- Roadside assistance calls
- Fleet management
Since December 1, 2015, the solution, called the Vehicle Telemetry Analytics template, has been available in the Cortana Analytics Gallery. Here is a quick promotional video:
The following video and the text below give some details on the solution architecture, which includes the following technologies: Event Hub, Azure Stream Analytics, Azure Machine Learning, Azure Data Factory, HDInsight, Azure Storage, Azure SQL DW, and Power BI.
Let’s look at the data flow and the solution components.
The Event Hub ingests large volumes of events from the vehicles into Azure for real-time and batch analytics.
The Stream Analytics job ingests data in real time into long-term storage for batch analytics and prepares data for real-time predictive insights.
Below is a description of the three queries that run in the Stream Analytics job and their purposes. (All three queries are enriched with detailed data on each vehicle from Blob Storage.)
Query #1 performs a join with reference data from Azure Blob Storage and accumulates the resulting data in a different Blob Storage container for rich batch analytics.
Query #2 publishes the data as-is to the output Event Hub so that it can be consumed by the RealtimeDashboard app, which invokes the machine learning request/response endpoint for real-time anomaly detection and pushes the results to the Power BI live dashboard.
Query #3 aggregates the data within a 3-second tumbling window and publishes the results to an Azure SQL instance that is provisioned as part of the deployment.
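To make the reference-data join and the tumbling window more concrete, here is a minimal Python sketch of the same logic outside of Stream Analytics. It is not the template's actual query: the field names (`vin`, `ts`, `speed`, `model`), the reference table, and the choice of average speed as the aggregate are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical reference data keyed by VIN; it stands in for the
# vehicle details that the template reads from Blob Storage.
REFERENCE = {
    "VIN001": {"model": "Sedan", "year": 2014},
    "VIN002": {"model": "SUV", "year": 2015},
}

WINDOW_SECONDS = 3  # tumbling window size mentioned in the text


def enrich(event, reference=REFERENCE):
    """Inner-join one telemetry event with the reference data on its VIN."""
    ref = reference.get(event["vin"])
    if ref is None:
        return None  # no matching reference row, so the event is dropped
    return {**event, **ref}


def tumbling_window_avg_speed(events, window=WINDOW_SECONDS):
    """Average speed per (window start, model) over non-overlapping windows."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [speed total, event count]
    for event in events:
        enriched = enrich(event)
        if enriched is None:
            continue
        # Each timestamp belongs to exactly one window: [start, start + window).
        window_start = (enriched["ts"] // window) * window
        key = (window_start, enriched["model"])
        sums[key][0] += enriched["speed"]
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical events; "ts" is seconds since the start of the stream.
events = [
    {"vin": "VIN001", "ts": 0, "speed": 50.0},
    {"vin": "VIN001", "ts": 2, "speed": 70.0},
    {"vin": "VIN002", "ts": 1, "speed": 40.0},
    {"vin": "VIN001", "ts": 3, "speed": 80.0},
    {"vin": "VIN999", "ts": 4, "speed": 10.0},  # unknown VIN, dropped by the join
]
result = tumbling_window_avg_speed(events)
```

Because the windows do not overlap, every event contributes to exactly one aggregate row, which is what distinguishes a tumbling window from a sliding or hopping one.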
Data Factory is used for:
- Orchestration, monitoring and management of the batch analytics pipeline
- Transformation of the data in an on-demand HDInsight cluster for rich insights on Driving Behavior Pattern and Vehicle Health Trending
- Data movement across the various data stores
All data in the source datasets is processed using Hive queries, in which we describe the data structures based on the CSV files. Additionally, we define new tables and calculate aggregations using INSERT statements.
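The batch step can be pictured as "describe the CSV layout, then aggregate into a new table". The following Python sketch mirrors that idea on a few in-memory rows. It is only an analogue of the Hive processing, not the template's queries, and the column names (`vin`, `model`, `city`, `speed`, `fuel`) are assumptions.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV layout; the real columns in the template may differ.
RAW_CSV = """vin,model,city,speed,fuel
VIN001,Sedan,Seattle,62,0.50
VIN001,Sedan,Seattle,58,0.75
VIN002,SUV,Bellevue,45,0.25
"""


def load_rows(text):
    """Describe the structure of the CSV data: parse rows and type the numeric columns."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {**row, "speed": float(row["speed"]), "fuel": float(row["fuel"])}
        for row in reader
    ]


def aggregate_by_model(rows):
    """Build a new per-model 'table' of aggregates, similar in spirit to INSERT ... SELECT ... GROUP BY."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["model"]].append(row)
    return {
        model: {
            "avg_speed": sum(r["speed"] for r in group) / len(group),
            "avg_fuel": sum(r["fuel"] for r in group) / len(group),
            "count": len(group),
        }
        for model, group in groups.items()
    }


rows = load_rows(RAW_CSV)
summary = aggregate_by_model(rows)
```

In the actual solution this work runs as Hive queries on the on-demand HDInsight cluster, so it scales to the full telemetry history rather than a handful of rows.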
In this solution, we are targeting the following batch insights:
- Aggressive driving behavior (Identifies trends across models, locations, driving conditions, and time of year to gain insights into aggressive driving patterns, which Contoso Motors can use for marketing campaigns, new personalized features, and usage-based insurance.)
- Fuel-efficient driving behavior (Identifies trends across models, locations, driving conditions, and time of year to gain insights into fuel-efficient driving patterns, which Contoso Motors can use for marketing campaigns, new features, and proactive reporting to drivers on cost-effective and environmentally friendly driving habits.)
- Recall models (Identifies models requiring recalls through anomaly detection trends and their correlation with driving habits.)
An Azure Machine Learning anomaly detection model is used in this demo to detect safety issues that may warrant a vehicle recall and to identify vehicles requiring maintenance. The model is published in an existing subscription, and its web service endpoint is used in both request/response and batch mode to operationalize it in the real-time and batch processing paths.
Aggregated data from Blob Storage is moved to Azure SQL Data Warehouse for historical storage.
Power BI dashboards combine historical data from Azure SQL Data Warehouse with real-time data from Azure Stream Analytics and the Event Hub.
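The difference between the two scoring modes can be illustrated with a small stand-in. The text does not say which algorithm the Azure Machine Learning model uses, so the sketch below substitutes a simple z-score rule. The function names, the sample readings, and the threshold of 3 standard deviations are all hypothetical, and no Azure endpoint is called.

```python
from statistics import mean, pstdev


def fit_baseline(history):
    """Summarize normal behavior as (mean, population standard deviation)."""
    return mean(history), pstdev(history)


def is_anomaly(value, baseline, threshold=3.0):
    """Request/response style: score a single reading against the baseline."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu  # flat history: any deviation counts as anomalous
    return abs(value - mu) / sigma > threshold


def score_batch(values, baseline, threshold=3.0):
    """Batch style: score many readings in one call."""
    return [is_anomaly(v, baseline, threshold) for v in values]


# Hypothetical engine temperature readings used as the "normal" history.
history = [90.0, 91.0, 89.0, 90.0, 92.0, 88.0]
baseline = fit_baseline(history)

single = is_anomaly(140.0, baseline)                 # one event, as in the real-time path
batch = score_batch([90.5, 140.0, 89.0], baseline)   # many rows, as in the batch path
```

The same scoring logic serves both paths. Only the calling pattern changes: one reading per call for the live dashboard, many readings per call for the batch pipeline.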
Special thanks to the authors of the demo scenario: Anand Subbaraj, Sanjay Soni, Christoph Schuler, Santosh Waghmare, Shashank Khedikar, and Sam Istephan.