Amazon Forecast is a fully managed service that uses machine learning (ML) to generate highly accurate forecasts, without requiring any prior ML experience. Forecast is applicable in a wide variety of use cases, including estimating supply and demand for inventory management, travel demand forecasting, workforce planning, and computing cloud infrastructure usage.
You can use Forecast to seamlessly conduct what-if analyses up to 80% faster to analyze and quantify the potential impact of business levers on your demand forecasts. A what-if analysis helps you investigate and explain how different scenarios might affect the baseline forecast created by Forecast. With Forecast, there are no servers to provision or ML models to build manually. Additionally, you only pay for what you use, and there is no minimum fee or upfront commitment. To use Forecast, you only need to provide historical data for what you want to forecast, and, optionally, any additional data that you believe may impact your forecasts.
Water utility providers have several forecasting use cases, the primary one being predicting water consumption in an area or building so that supply meets demand. Providers also need to forecast the additional consumption that results when apartments are added to a building or houses are added to an area. Accurate predictions are critical to avoiding service interruptions for customers.
This post explores using Forecast to address this use case by using historical time series data.
Water is a natural resource that is critical to industry, agriculture, households, and daily life. Accurate water consumption forecasting helps an agency run its day-to-day operations efficiently. It is particularly challenging because demand is dynamic and is affected by seasonal weather changes. Accurate predictions help providers avoid service interruptions and deliver a stable service at low prices. Improved forecasting also enables you to plan ahead and structure more cost-effective future contracts. The following are the two most common use cases:
Forecasting can be challenging because you first need accurate models to forecast demand and then a quick and simple way to reproduce the forecast across a range of scenarios.
This post focuses on a solution to perform water consumption forecasting and a what-if analysis. This post doesn’t consider weather data for model training. However, you can add weather data, given its correlation to water consumption.
Before getting started, we set up our resources. For this post, we use the us-east-1 Region.
This post demonstrates two use cases related to water demand forecasting: forecasting water demand based on past water consumption, and conducting a what-if analysis for increased demand.
Forecast can accept three types of datasets: target time series (TTS), related time series (RTS), and item metadata (IM). Target time series data defines the historical demand for the resources you’re predicting. The target time series dataset is mandatory. A related time series dataset includes time-series data that isn’t included in a target time series dataset and might improve the accuracy of your predictor.
In our example, the target time series dataset contains item_id and timestamp dimensions, and the complementary related time series dataset includes no_of_consumer. An important note with this dataset: the TTS ends on 2023-01-01, and the RTS ends on 2023-01-15. When performing what-if scenarios, it’s important to manipulate RTS variables beyond your known time horizon in TTS.
To conduct a what-if analysis, we need to import two CSV files representing the target time series data and the related time series data. Our example target time series file contains item_id, timestamp, and demand, and our related time series file contains item_id, timestamp, and no_of_consumer.
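As a rough illustration of the two files, the following sketch builds a small target time series and related time series in memory and checks how far the related series extends past the target series. The daily frequency, the numeric values, and the header row are assumptions made for this example and are not taken from the post's dataset; only the column names and end dates come from the text above.

```python
import csv
import io
from datetime import date, timedelta

# Column names follow the files described above; all values are made up.
TTS_COLUMNS = ["item_id", "timestamp", "demand"]
RTS_COLUMNS = ["item_id", "timestamp", "no_of_consumer"]


def daily_rows(item_id, start, end, value):
    """Yield one [item_id, ISO date, value] row per day, start to end inclusive."""
    day = start
    while day <= end:
        yield [item_id, day.isoformat(), value]
        day += timedelta(days=1)


def to_csv(columns, rows):
    """Serialize rows to CSV text (header row included for readability)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue()


def last_timestamp(csv_text):
    """Return the latest date found in the timestamp column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return max(date.fromisoformat(row["timestamp"]) for row in reader)


# TTS ends on 2023-01-01 and RTS ends on 2023-01-15, as in the example dataset.
tts_csv = to_csv(TTS_COLUMNS, daily_rows("A_10001", date(2022, 12, 25), date(2023, 1, 1), 120.5))
rts_csv = to_csv(RTS_COLUMNS, daily_rows("A_10001", date(2022, 12, 25), date(2023, 1, 15), 40))

# Number of days for which RTS values are known beyond the end of the TTS.
extra_days = (last_timestamp(rts_csv) - last_timestamp(tts_csv)).days
print(extra_days)
```

A check like this can help confirm, before importing, that the related time series covers the period you want to forecast.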
To import your data, complete the following steps:
You’re redirected to the dashboard that you can use to track progress.
Next, we train a predictor.
Forecast uses AutoPredictor to create predictors. For more information, refer to Training Predictors.
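If you prefer the API to the console, the following is a minimal sketch of the parameters you might pass to the boto3 `create_auto_predictor` call. The predictor name, the placeholder ARN, the 14-day horizon, the daily frequency, and the chosen quantiles are all assumptions for illustration; the post itself performs this step on the console.

```python
def auto_predictor_request(dataset_group_arn, horizon_days):
    """Build keyword arguments for the Forecast CreateAutoPredictor call."""
    return {
        "PredictorName": "water_consumption_predictor",  # hypothetical name
        "ForecastHorizon": horizon_days,                 # number of future time steps
        "ForecastFrequency": "D",                        # assumes daily data
        "ForecastTypes": ["0.1", "0.5", "0.9"],          # P10, P50, and P90 quantiles
        "DataConfig": {"DatasetGroupArn": dataset_group_arn},
    }


predictor_request = auto_predictor_request(
    "arn:aws:forecast:us-east-1:111122223333:dataset-group/EXAMPLE",  # placeholder ARN
    horizon_days=14,
)

# With AWS credentials configured, the request could be sent like this:
# import boto3
# forecast = boto3.client("forecast", region_name="us-east-1")
# response = forecast.create_auto_predictor(**predictor_request)
print(predictor_request["ForecastTypes"])
```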
After our predictor is trained (this can take approximately 3.5 hours), we create a forecast. You will know that your predictor is trained when you see the View predictors button on your dashboard.
You can query a forecast using the Query forecast option. By default, the complete range of the forecast is returned. You can request a specific date range within the complete forecast. When you query a forecast, you must specify filtering criteria. A filter is a key-value pair. The key is one of the schema attribute names (including forecast dimensions) from one of the datasets used to create the forecast. The value is a valid value for the specified key. You can specify multiple key-value pairs. The returned forecast will only contain items that satisfy all the criteria.
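The same query can be expressed through the API. The sketch below builds the parameters for the boto3 `forecastquery` client's `query_forecast` call and includes a small local helper that mirrors the rule that a returned item must satisfy every filter. The ARN and the date range are placeholders, and `matches_all` is a hypothetical helper for illustration, not part of the service.

```python
from datetime import date


def query_forecast_request(forecast_arn, item_id, start, end):
    """Build keyword arguments for the forecastquery QueryForecast call."""
    return {
        "ForecastArn": forecast_arn,
        "StartDate": f"{start.isoformat()}T00:00:00",
        "EndDate": f"{end.isoformat()}T00:00:00",
        "Filters": {"item_id": item_id},  # key-value pairs; all must match
    }


def matches_all(record, filters):
    """Local illustration: a record qualifies only if it satisfies every filter."""
    return all(record.get(key) == value for key, value in filters.items())


query_request = query_forecast_request(
    "arn:aws:forecast:us-east-1:111122223333:forecast/EXAMPLE",  # placeholder ARN
    "A_10001",
    date(2023, 1, 2),
    date(2023, 1, 8),
)

# With AWS credentials configured, the request could be sent like this:
# import boto3
# client = boto3.client("forecastquery", region_name="us-east-1")
# response = client.query_forecast(**query_request)
print(matches_all({"item_id": "A_10001"}, query_request["Filters"]))
```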
The following screenshot shows the forecast water consumption for an apartment (item ID A_10001) using the forecast model.
At this point, we have created our baseline forecast and can now conduct a what-if analysis. Let’s imagine a scenario where an existing apartment building adds an extension, and the number of households and people increases by 20%. You now need an analysis that forecasts the increased demand so that you can plan the supply to meet it.
There are three stages to conducting a what-if analysis: setting up the analysis, creating the what-if forecast by defining what is changed in the scenario, and comparing the results.
Next, we create a what-if forecast to define the scenario we want to analyze.
For our example, we create a scenario where we increase no_of_consumer, a feature in the related time series dataset, by 20% for item ID A_10001. You need this analysis to forecast the increased demand and plan the water supply to meet it. This analysis also helps you make a cost-effective contract based on the water demand forecast.
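Expressed through the API, this scenario could look roughly like the following sketch, which builds a transformation for the boto3 `create_what_if_forecast` call that multiplies no_of_consumer by 1.2 for item A_10001. The what-if forecast name and the ARN are placeholders, and `apply_locally` is a hypothetical helper that only previews the effect on sample rows; the service applies the transformation itself.

```python
def consumer_increase_transformation(item_id, factor):
    """Describe 'multiply no_of_consumer by factor where item_id equals item_id'."""
    return {
        "Action": {
            "AttributeName": "no_of_consumer",
            "Operation": "MULTIPLY",
            "Value": factor,
        },
        "TimeSeriesConditions": [
            {"AttributeName": "item_id", "AttributeValue": item_id, "Condition": "EQUALS"}
        ],
    }


def apply_locally(rows, transformation):
    """Preview the transformation on in-memory rows (handles EQUALS conditions only)."""
    action = transformation["Action"]
    conditions = transformation["TimeSeriesConditions"]
    result = []
    for row in rows:
        new_row = dict(row)
        if all(str(row[c["AttributeName"]]) == c["AttributeValue"] for c in conditions):
            new_row[action["AttributeName"]] = row[action["AttributeName"]] * action["Value"]
        result.append(new_row)
    return result


transformation = consumer_increase_transformation("A_10001", 1.2)
what_if_request = {
    "WhatIfForecastName": "consumers_up_20_percent",  # hypothetical name
    "WhatIfAnalysisArn": "arn:aws:forecast:us-east-1:111122223333:what-if-analysis/EXAMPLE",
    "TimeSeriesTransformations": [transformation],
}

# Made-up related time series rows, used only to preview the change.
sample_rows = [
    {"item_id": "A_10001", "timestamp": "2023-01-02", "no_of_consumer": 40},
    {"item_id": "A_10002", "timestamp": "2023-01-02", "no_of_consumer": 25},
]
preview = apply_locally(sample_rows, transformation)

# With AWS credentials configured, the request could be sent like this:
# import boto3
# boto3.client("forecast", region_name="us-east-1").create_what_if_forecast(**what_if_request)
print(preview)
```

Only the rows for A_10001 change in the preview; the other item keeps its original consumer count.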
We can now compare the what-if forecast for our scenario, a 20% increase in consumers, with the baseline demand.
The following graph shows the resulting demand for our scenario. The red line shows the forecast of future water consumption for a 20% increase in population. The P90 forecast type indicates that the true value is expected to be lower than the predicted value 90% of the time. You can use this demand forecast to effectively manage water supply for increased demand and avoid any service interruptions.
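To make the P90 interpretation concrete, the following sketch computes a 90th percentile from a small set of made-up consumption outcomes using the nearest-rank method and checks what share of outcomes fall at or below it. The numbers are invented, and the nearest-rank method is chosen for simplicity; this is not how Forecast computes its quantiles internally.

```python
import math


def nearest_rank_quantile(values, percent):
    """Return the smallest value with at least `percent`% of values at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


# Ten hypothetical daily consumption outcomes for one item.
outcomes = [98, 101, 103, 105, 107, 110, 112, 115, 119, 130]

p90 = nearest_rank_quantile(outcomes, 90)
share_at_or_below = sum(value <= p90 for value in outcomes) / len(outcomes)
print(p90, share_at_or_below)
```

Planning supply against a P90 value rather than a P50 value trades some overprovisioning for a lower chance of falling short of actual demand.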
To export your data to CSV, complete the following steps:
You can export multiple scenarios at once in a combined file.
The export file will contain the timestamp, item_id, and forecasts for each quantile for all scenarios selected (including the base scenario).
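Once the export is downloaded, a short script can compare a scenario with the baseline. The sketch below assumes a hypothetical layout with a scenario column and p10, p50, and p90 columns; the actual column names in your export may differ, and the values shown are invented for illustration.

```python
import csv
import io

# Hypothetical export layout and made-up values; adjust names to match your file.
EXPORT_SAMPLE = """item_id,timestamp,scenario,p10,p50,p90
A_10001,2023-01-02,baseline,95.0,110.0,125.0
A_10001,2023-01-02,consumers_up_20_percent,112.0,130.0,150.0
"""


def p90_by_scenario(csv_text, item_id, timestamp):
    """Map each scenario name to its P90 forecast for one item and timestamp."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["scenario"]: float(row["p90"])
        for row in reader
        if row["item_id"] == item_id and row["timestamp"] == timestamp
    }


p90_values = p90_by_scenario(EXPORT_SAMPLE, "A_10001", "2023-01-02")
baseline = p90_values["baseline"]
scenario = p90_values["consumers_up_20_percent"]

# Percentage change of the scenario's P90 forecast relative to the baseline.
uplift_percent = 100 * (scenario - baseline) / baseline
print(uplift_percent)
```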
To avoid incurring future charges, remove the resources created by this solution:
In this post, we showed you how to use Forecast and its underlying system architecture to predict water demand using water consumption data. A what-if scenario analysis is a critical tool to help navigate through the uncertainties of business. It provides foresight and a mechanism to stress-test ideas, leaving businesses more resilient, better prepared, and in control of their future. Other utility providers, such as electricity or gas providers, can use Forecast to build solutions and meet utility demand in a cost-effective way.
The steps in this post demonstrated how to build the solution on the AWS Management Console. To directly use Forecast APIs for building the solution, follow the notebook in our GitHub repo.
We encourage you to learn more by visiting the Amazon Forecast Developer Guide and try out the end-to-end solution enabled by these services with a dataset relevant to your business KPIs.
Dhiraj Thakur is a Solutions Architect with Amazon Web Services. He works with AWS customers and partners to provide guidance on enterprise cloud adoption, migration, and strategy. He is passionate about technology and enjoys building and experimenting in the analytics and AI/ML space.