In industrial IoT, running machine learning (ML) models on edge devices is necessary for many use cases, such as predictive maintenance, quality improvement, real-time monitoring, process optimization, and security. The energy industry, for instance, invests heavily in ML to automate power delivery, monitor consumption, optimize efficiency, and extend the lifetime of their equipment.
Wind energy is one of the most popular renewable energy sources. According to the Global Wind Energy Council, 22,893 wind turbines were installed globally in 2019, produced by 33 suppliers and accounting for over 63 GW of wind power capacity. At such scale, energy companies need an efficient platform to manage and maintain their wind turbine fleets and the ML models running on the devices. A commercial wind turbine costs around $3–4 million. If a turbine is out of service, it costs $800–1,600 per day and results in a total loss of 7.5 megawatts, which is enough energy to power approximately 2,500 homes.
A wind turbine is a complex piece of engineering and consists of many sensors that can be used by a monitoring mechanism to capture data such as vibration, temperature, wind speed, and air humidity. You could train an ML model with this data, deploy it to an edge device connected to the turbine’s sensors, and predict anomalies in real time at the edge. It would reduce the operational cost of your fleet of turbines. But imagine the effort to maintain this solution on a fleet of thousands or millions of devices. How do you operate, secure, deploy, run, and monitor ML models on a fleet of devices at the edge?
Amazon SageMaker Edge Manager can help you to answer this question. The service allows you to optimize, secure, monitor, and maintain ML models on fleets of smart cameras, robots, personal computers, industrial equipment, mobile devices, and more. With Edge Manager, you can manage the lifecycle of each ML model on each device in your device fleets for up to thousands or millions of devices. The service provides a software agent that runs on edge devices and a management interface on the AWS Management Console.
In this post, we show how to use Edge Manager to create a robust end-to-end solution that manages the lifecycle of ML models deployed to a wind turbine fleet. But instead of using real wind turbines, you learn how to build your own fleet of mini 3D printed wind turbines. This is a DIY open-source, open-hardware project created to demonstrate how to build an ML-at-the-edge solution with Amazon SageMaker. You can use it as a platform to learn, experiment, and get inspired.
The next sections cover the following topics:
The wind turbine farm created for this project has five mini 3D printed wind turbines connected to five distinct Jetson Nanos via USB. The Jetson Nanos are connected to the internet through Ethernet cables plugged into a cable modem. A fan, positioned in front of the farm, produces the wind to simulate outdoor conditions. The following image shows how the wind farm is organized.
The mini wind turbine of this project is a mechanical device integrated with a microcontroller (Arduino) and some sensors. It was modeled using FreeCAD, an open-source tool for designing industrial parts. These parts were then 3D printed using PETG (plastic filament type) and assembled with the electronics components. Its base is static, which means that the turbine doesn’t align with the wind direction by itself. This restriction was important to simplify the project.
Each turbine has one voltage generator (small motor) and seven different sensors:
An Arduino Mini Pro is responsible for interfacing with these sensors and collecting data from them. This data is streamed through the serial pins (TX, RX). An FTDI device that converts this serial signal to USB is the bridge between the Arduino and the Jetson Nano. A Python application that runs on the Jetson Nano receives the raw data from the sensors through this bridge.
A micro servo was modified and transformed into a voltage generator. Its internal gearbox increases the generator (motor) speed by five times to produce a (low) voltage between 0–3.3 V. This generator is also connected to the Arduino through an analog input pin, and the voltage reading is sent along with the sensors’ readings.
The frequency at which the data is collected depends on the sensor. All the signals from BME650 are collected every 150 milliseconds, the rotation encoder every 1 second, and the voltage generator and the vibration sensor every 50 milliseconds.
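The sampling intervals above can be written down as a small lookup table. The following Python sketch is only an illustration of the timing: the dictionary keys and function names are hypothetical, and the actual collection logic runs on the Arduino and may be implemented differently.

```python
# Sampling intervals in milliseconds, as stated in the text.
# The keys are illustrative names, not identifiers from the project.
SAMPLING_INTERVAL_MS = {
    "bme_sensor": 150,   # all signals from the BME sensor
    "rotation": 1000,    # rotation encoder
    "voltage": 50,       # voltage generator
    "vibration": 50,     # vibration sensor
}


def sensors_due(elapsed_ms):
    """Return the sensors whose interval evenly divides the elapsed time."""
    return sorted(
        name
        for name, interval in SAMPLING_INTERVAL_MS.items()
        if elapsed_ms % interval == 0
    )


def samples_per_second(name):
    """Return how many readings per second a sensor produces."""
    return 1000 / SAMPLING_INTERVAL_MS[name]
```

Under these intervals, the voltage and vibration readings arrive 20 times per second, which is why the preprocessing later in this post works with samples that are roughly 50 milliseconds apart.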
If you want to know more about these technical details and learn how to build your own mini wind turbine, see the GitHub repository.
Each Jetson Nano has a built-in GPU with 128-core NVIDIA Maxwell™ and a Quad-core ARM® A57 CPU running at 1.43 GHz. This hardware is enough to run a Python application that collects and formats the data from the sensors of the turbine and then calls the Edge Manager agent API to get the predictions. This application compares the prediction with a threshold to check for anomalies in the data. The model is invoked in real time.
When SageMaker Neo compiles the ML model for Jetson Nano, a runtime (DLR) optimized for this target device is included in the deployment package. This runtime detects automatically that it’s running on a Jetson Nano and loads the model directly into the device’s GPU for maximum performance.
The Edge Manager agent is also distributed as a Linux (arm64) application that can be run as a background process (daemon) on your Jetson Nano. It uses the runtime that SageMaker Neo includes in the compilation package to interface with the optimized model and expose it as a well-defined API. This API is integrated with the local application through a low-latency protocol (gRPC over a Unix socket).
Now that you know some details about the physical hardware used to develop the wind turbine farm, it’s time to see which AWS services support the solution on the cloud side. A minimal, standalone setup to get a model deployed and running on the Edge Manager agent requires only SageMaker. However, other services were used in this project to add two important features: a mechanism for over-the-air (OTA) deployment and a dashboard for monitoring the anomalies in near-real time.
In summary, the components required for this project are:
The following diagram illustrates this architecture.
Now that we have all the components of our wind turbine farm, it’s time to understand the steps we need to take to integrate all these moving parts, deploy a model to our edge devices, and keep an application running and predicting anomalies in real time.
The following diagram shows all the steps involved in the process.
The solution consists of the following steps:
The Edge Manager agent uses certificates provided by AWS IoT Core to authenticate and call other AWS services. You therefore need to create an IoT thing first and then an edge device fleet. Before that, however, you need to prepare some basic resources to support your solution.
Before getting started, configure the AWS Command Line Interface (AWS CLI) on your workstation (if necessary) and then create the following resources:
Each time you call CaptureData in the agent API, it uploads the tensors (input and predictions) into this bucket.
Next, you create your IAM role.
Use the following code (provide the name for the S3 bucket, your AWS account, and Region):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::<
You’re now ready to create your IoT thing, which you later map to your Edge Manager device.
You now create your policy, which controls the permissions of the temporary credentials of the edge device.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "iot:Connect"
      ],
      "Resource": "arn:aws:iot:<
Lastly, you attach the policy to the certificate.
Now your IoT thing is ready to be linked to an edge device. Repeat these steps (except for creating the policy) for each additional device in your device fleet. For a production environment with hundreds or thousands of devices, you should apply a different approach, using automated scripts and parameter files to provision all the IoT things.
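As a rough idea of what such an automated script could look like, the following Python sketch repeats the manual steps (create the thing, create an active certificate, attach the existing policy, attach the certificate to the thing) for a list of device names. It assumes an AWS IoT client such as the one returned by boto3.client("iot"); the function names, thing names, and policy name are hypothetical, and a production script would also need error handling and secure storage for the private keys.

```python
def provision_thing(iot_client, thing_name, policy_name):
    """Create one IoT thing with an active certificate and an existing policy.

    iot_client is expected to behave like boto3.client("iot").
    """
    iot_client.create_thing(thingName=thing_name)

    # Create the key pair and certificate, already activated.
    cert = iot_client.create_keys_and_certificate(setAsActive=True)
    cert_arn = cert["certificateArn"]

    # The policy is created once and shared by all devices.
    iot_client.attach_policy(policyName=policy_name, target=cert_arn)
    iot_client.attach_thing_principal(thingName=thing_name, principal=cert_arn)

    return {
        "thing_name": thing_name,
        "certificate_arn": cert_arn,
        "certificate_pem": cert["certificatePem"],
        "private_key": cert["keyPair"]["PrivateKey"],
    }


def provision_fleet(iot_client, thing_names, policy_name):
    """Provision every thing in thing_names and return their credentials."""
    return [provision_thing(iot_client, name, policy_name) for name in thing_names]
```

With boto3 installed and credentials configured, you could call provision_fleet(boto3.client("iot"), names, policy_name) with the device names read from a parameter file, and then copy each returned certificate and private key to the corresponding device.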
To create your edge fleet, complete the following steps:
Now you need to add a new device to the fleet.
Repeat the registering process for all your other devices. Now you can SSH to your Jetson Nano and complete the configuration of your device.
Before you start configuring your Jetson Nano, you need to install JetPack 4.4.1 in your Nano. This is the version you use to build, run, and test this demo.
The model preparation process for your target device is very sensitive to the versions of the libraries installed on your device. For instance, because the target device is a Jetson Nano, Neo optimizes the model and runtime for a given version of TensorRT and CUDA. The runtime (libdlr.so) is physically linked to the versions you specify in the compilation job. This means that if you compile your model using Neo for JetPack 4.4.1, it doesn’t work with JetPack 3.x, and vice versa.
echo "export TVM_TENSORRT_MAX_WORKSPACE_SIZE=2147483647" >> ~/.bashrc
echo "export SM_EDGE_AGENT_HOME=/home/${USER}/agent" >> ~/.bashrc
# Also export the variables for the current session
export TVM_TENSORRT_MAX_WORKSPACE_SIZE=2147483647
export SM_EDGE_AGENT_HOME=/home/${USER}/agent
sudo apt install -y protobuf-compiler python3-serial
sudo apt install -y python3-pip python3-joblib python3-boto3 libssl-dev
sudo apt install -y curl
sudo pip3 install grpcio-tools grpcio PyWavelets paho-mqtt
mkdir -p ~/agent/certificates/iot
mkdir -p ~/agent/certificates/root
tar -xzvf <
You should see the following files in this directory:
aws s3 cp s3://sagemaker-edge-release-store-us-west-2-linux-armv8/Certificates/<
Next, you create the Edge Manager agent configuration file.
{ "sagemaker_edge_core_device_uuid": "<
Provide the information for the following resources:
aws iot describe-endpoint --endpoint-type iot:CredentialProvider
Now you’re ready to run the Edge Manager agent in your Jetson Nano.
cd ~/agent
rm -f /tmp/edge_agent
./bin/sagemaker_edge_agent_binary -c sagemaker_edge_config.json -a /tmp/edge_agent &
The following screenshot shows your output.
The agent is now running. After a few minutes, you can see the heartbeat of the device, reported on the console. To see it on the SageMaker console, under Edge Inference, choose Edge Devices and choose your device.
Now it’s time to set up the application that runs on the edge device. This application is responsible for the following:
To install the application, first get the custom AWS IoT endpoint. On the AWS IoT Core console, choose Settings. Copy the endpoint and use it in the following code:
cd ~/
git clone https://github.com/aws-samples/amazon-sagemaker-edge-manager-demo wind_turbine
cd wind_turbine/04_EdgeApplication
## by the AWS IoT Endpoint host you just copied and save the file
chmod +x run.py
./run.py &
The application outputs something like the following screenshot.
Optional: run this application with the parameter --test-mode if you just want to run a test with no wind turbine connected to the edge device.
If everything went fine, the application keeps waiting for a new model. It’s time to train a new model and deploy it to the Jetson Nano.
This post demonstrates how to detect anomalies in the components of a wind turbine. There are many ways of doing this with the data collected by its sensors. To keep this example as simple as possible, you prepare a model that analyzes vibration, wind speed, rotation (per second), and the produced voltage to determine whether an anomaly exists or not. For that purpose, we train an autoencoder using PyTorch on SageMaker and prepare it for deployment on your Jetson Nano.
This model architecture has two advantages: it’s unsupervised, so you don’t need to label your data, and you can train it with data collected from wind turbines that are working perfectly. Therefore, your model is trained to detect what you consider normal behavior of your wind turbines. When a defect appears in any part of the turbine, a drift occurs in the sensor data, which the model interprets as abnormal behavior (an anomaly).
The following screenshot is a sample of the raw data captured by the turbine sensors.
The data has the following features:
Based on our goals, the selected features are: qx,qx,qy,qz (angular acceleration), wind_speed_rps, rps, and voltage. The following image is a sample of the feature qx. The data produced by the accelerometer is too noisy, so we need to clean it first.
The angular velocity (quaternion) is first converted to Euler angles (roll, pitch, yaw). Then we denoise all the features with wavelets (using PyWavelets) and normalize them. The following screenshot shows the signals after these transformations.
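To make the first and last of these transformations concrete, here is a minimal Python sketch of a quaternion-to-Euler conversion and a normalization step. It makes several assumptions that the post does not confirm: the quaternion is a unit quaternion passed in (qw, qx, qy, qz) order, the angles follow the common ZYX (yaw, pitch, roll) convention, and normalization is min-max scaling. The function names are hypothetical, and the wavelet denoising step is left out because it depends on PyWavelets and on parameters chosen in the notebooks.

```python
import math


def quaternion_to_euler(qw, qx, qy, qz):
    """Convert a unit quaternion to (roll, pitch, yaw) in radians."""
    roll = math.atan2(2.0 * (qw * qx + qy * qz), 1.0 - 2.0 * (qx * qx + qy * qy))

    # Clamp to [-1, 1] so rounding errors never push asin out of its domain.
    sin_pitch = max(-1.0, min(1.0, 2.0 * (qw * qy - qz * qx)))
    pitch = math.asin(sin_pitch)

    yaw = math.atan2(2.0 * (qw * qz + qx * qy), 1.0 - 2.0 * (qy * qy + qz * qz))
    return roll, pitch, yaw


def min_max_normalize(values):
    """Scale a list of numbers to the range [0, 1]."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant signal carries no variation; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]
```

In this sketch each feature column would be normalized independently, so that signals with different units (angles, rotations per second, volts) end up on a comparable scale before they reach the model.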
Finally, we apply a sliding window to this resulting dataset (six features) to capture the temporal relationship between neighboring readings and create the input tensor of our ML model. The average interval between two sequential samples is approximately 50 milliseconds. Each time window (of our sliding window) is then converted into a tensor, using the following structure:
Interval, time step, and step are hyperparameters that you can adjust during training. The final result is a stream of data, encoded as a multidimensional tensor (representing a few seconds in the past). The trained autoencoder tries to recreate the input tensor as the output (prediction). By measuring the mean absolute error (MAE) between the input and output and comparing it with a predefined threshold, you can identify potential anomalies.
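The windowing and the anomaly check can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the code from the notebooks: the function names, window size, step, and threshold are hypothetical, and the reconstructed window would come from the trained autoencoder instead of being supplied by hand.

```python
def sliding_windows(samples, window_size, step):
    """Split a sequence of feature rows into overlapping windows.

    Each sample is a list of feature values; each window holds
    window_size consecutive samples, and windows start every step samples.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    return [
        samples[start:start + window_size]
        for start in range(0, len(samples) - window_size + 1, step)
    ]


def mean_absolute_error(original, reconstructed):
    """Compute the MAE between two windows of identical shape."""
    flat_original = [v for row in original for v in row]
    flat_reconstructed = [v for row in reconstructed for v in row]
    if not flat_original or len(flat_original) != len(flat_reconstructed):
        raise ValueError("windows must be non-empty and have the same shape")
    total = sum(abs(a - b) for a, b in zip(flat_original, flat_reconstructed))
    return total / len(flat_original)


def is_anomaly(original, reconstructed, threshold):
    """Flag a window whose reconstruction error exceeds the threshold."""
    return mean_absolute_error(original, reconstructed) > threshold
```

A model trained only on healthy turbines reconstructs normal windows with a small error, so the threshold is typically chosen from the error distribution observed on normal data.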
One important aspect of this approach is that it extracts the linear and non-linear correlations between the features, to better understand the impact of one feature on another, such as wind speed on the rotation or produced voltage.
Now it’s time to run this experiment.
The repository contains a folder named 03_Notebooks with two Jupyter notebooks.
The final dataset has only six features: roll, pitch, and yaw (converted from a quaternion to Euler angles), wind_speed_rps, rps (rotations per second), and voltage (produced by the generator).
The application gets the package, unpacks it, loads the model in the Edge Manager agent, and unblocks the application run.
Both notebooks are very detailed, so follow the steps carefully, after which you’ll have an anomaly detection model to deploy in your Jetson Nano.
One of the most important steps of the whole process is the model optimization step in the second notebook. When you compile a model with SageMaker Neo, it not only optimizes the model to improve the prediction performance on the target device, it also converts the original model into an intermediate representation. After this conversion, you no longer need the original framework (PyTorch, TensorFlow, MXNet). This representation is then interpreted by a light runtime (DLR), which is packaged with the model by Neo. Both the runtime and the optimized model are libraries, compiled as native programs for a specific operating system and architecture. In the case of Jetson Nano, the OS is a Linux distro and the architecture is 64-bit ARMv8. The runtime in this case uses TensorRT for maximum performance on the Jetson’s GPU.
When you launch a compilation job on Neo, you need to specify some parameters related to the setup of your target device, for instance:
The Jetson Nano’s GPU is an NVIDIA Maxwell, architecture version 53, so the parameter gpu-code is the same for all compilation jobs. However, trt-ver and cuda-ver depend on the versions of TensorRT and CUDA installed on your Nano. When you were preparing your edge device, you set up your Jetson Nano with JetPack 4.4.1. This makes sure that the model you optimize using Neo is compatible with your Jetson Nano.
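To show where these parameters go, the following Python sketch assembles a request for the SageMaker create_compilation_job API. Treat it as an outline under assumptions, not the code from the notebook: the helper name, job name, role ARN, S3 locations, input name, and input shape are placeholders, the gpu-code value is written as sm_53 on the assumption that this is how Neo expects architecture version 53, and the TensorRT and CUDA versions in the usage note should be checked against what is installed on your device.

```python
import json


def build_compilation_job_request(job_name, role_arn, model_s3_uri,
                                  output_s3_uri, input_shape, trt_ver, cuda_ver):
    """Build the keyword arguments for a Neo compilation job targeting Jetson Nano."""
    compiler_options = {
        "gpu-code": "sm_53",   # Maxwell GPU of the Jetson Nano
        "trt-ver": trt_ver,    # TensorRT version installed on the device
        "cuda-ver": cuda_ver,  # CUDA version installed on the device
    }
    return {
        "CompilationJobName": job_name,
        "RoleArn": role_arn,
        "InputConfig": {
            "S3Uri": model_s3_uri,
            # For PyTorch models, the input name and shape are given as JSON.
            "DataInputConfig": json.dumps({"input0": input_shape}),
            "Framework": "PYTORCH",
        },
        "OutputConfig": {
            "S3OutputLocation": output_s3_uri,
            "TargetPlatform": {"Os": "LINUX", "Arch": "ARM64", "Accelerator": "NVIDIA"},
            # The API expects the compiler options as a JSON string.
            "CompilerOptions": json.dumps(compiler_options),
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 900},
    }
```

With boto3 installed, the returned dictionary could be passed as boto3.client("sagemaker").create_compilation_job(**request). For JetPack 4.4.1, the values are likely trt_ver="7.1.3" and cuda_ver="10.2", but confirm them on your Nano before compiling, because a mismatch produces a model that the runtime can't load.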
The dashboard setup is out of scope for this post. For more information, see Analyze device-generated data with AWS IoT and Amazon Elasticsearch Service.
Now that you have your model deployed and running on your Jetson Nano, it’s time to look at the behavior of your wind turbines through a dashboard. The application you deployed to the Jetson Nano collects some logs and sends them to two different places:
You can get this data and ingest it into Amazon ES or another database. Then you can use your preferred reporting tool to prepare dashboards.
The following visualization shows three different but correlated things for each one of the five turbines: the rotation speed (in RPS), the produced voltage, and the detected anomalies for voltage, rotation, and vibration.
Some noise was injected in the raw data from the turbines to simulate failures.
The following visualization shows an aggregation of the turbines’ speed and produced voltage anomalies over time.
Securely and reliably maintaining the lifecycle of an ML model deployed across a fleet of devices isn’t an easy task. However, with Edge Manager, you can reduce the implementation effort and operational cost of such a solution. Also, with a demo like the mini wind turbine farm, you can experiment, optimize, and automate your ML pipeline with the services and expertise provided by AWS.
To build a solution for your own needs, get the code and artifacts used in this project from the GitHub repo. If you want more practice using Edge Manager, check out the end-to-end workshop for Edge Manager on Studio.
Samir Araújo is an AI/ML Solutions Architect at AWS. He helps customers create AI/ML solutions that solve their business challenges using AWS. He has worked on several AI/ML projects related to computer vision, natural language processing, forecasting, ML at the edge, and more. He likes playing with hardware and automation projects in his free time, and he has a particular interest in robotics.