Snowflake universe, part #6 - Forecasting2

Forecasting with built-in ML module

One of the key benefits of the Snowflake system is its ability to leverage multiple machine learning (ML) models for a wide range of data analytics applications, improving both insight and accuracy.

In the Snowflake AI & ML Studio, Forecasting, Anomaly Detection, Classification, and LLM models can optionally be developed and applied to the data to understand it better.

The model can be created and trained manually, but there is a "model wizard" that guides you through the steps, and the automatically generated worksheet can be used directly or customized.

In a few simple steps, the value and date/time columns are selected, and additional related columns that have a proven or suspected effect on the values can be added to the model. Finally, the forecasting period (in the unit of the timeline) and the prediction interval width should be defined.
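The wizard ultimately generates a worksheet built around Snowflake's forecasting SQL. A minimal sketch of what such a statement can look like; the sales_history table and the sale_date/amount column names here are hypothetical, not the post's actual data:

```
CREATE OR REPLACE SNOWFLAKE.ML.FORECAST sales_model(
    INPUT_DATA => TABLE(sales_history),  -- training data (hypothetical table)
    TIMESTAMP_COLNAME => 'sale_date',    -- the date/time column
    TARGET_COLNAME => 'amount'           -- the value column to forecast
);
```

Additional related (exogenous) columns are simply extra columns of the input table, which the model picks up during training.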
 
Model creation generally takes at least a minute, including the creation of the train/test dataset split. Keep in mind that this is a resource-consuming step, so prepare the data and the parameters carefully to avoid useless or wrong model runs.

The created model is then called with conditions to create a table of predicted values for the given number of time units (here 14 months) after the last known date/time, here 2018-01-01:
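A hedged sketch of such a call, assuming a hypothetical sales_model; the result set is materialized into a table via RESULT_SCAN:

```
CALL sales_model!FORECAST(
    FORECASTING_PERIODS => 14,                      -- 14 time units ahead
    CONFIG_OBJECT => {'prediction_interval': 0.95}  -- interval width
);
CREATE OR REPLACE TABLE sales_forecast AS
    SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));
```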
The instance(s) of the created models are listed using the following SQL query:
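For forecast models, the listing can be done with a statement along these lines:

```
SHOW SNOWFLAKE.ML.FORECAST;
```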
For visualization purposes, the Altair Python module can be used immediately, without any further installation or settings management, in the Snowflake (Anaconda) Python environment. Although I am more confident in Matplotlib, and Altair's syntax is similar but different in some details, the plot was easy to create.
With a simple trick I created the following line plot, which contains the
  • original (historical) data,
  • lower_bound, the estimated lowest values,
  • upper_bound, the predicted maximum values,
  • predicted values (between the two limits above).
The trick let me visually compare the real (historical) data with the predicted values, and I found a good match.
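The post defers the exact trick to part 7; as a rough illustration only, one common way to get all four lines into a single Altair chart is to reshape the historical and forecast rows into one long-format table with a series label per row. The function and the column layout below are assumptions, not the post's actual code:

```python
def to_long_format(history, forecast):
    """Combine rows for a layered line chart.

    history:  list of (timestamp, value) pairs
    forecast: list of (timestamp, predicted, lower_bound, upper_bound)
    Returns:  list of (timestamp, series_name, value) rows, one per line point.
    """
    rows = [(ts, "historical", v) for ts, v in history]
    for ts, predicted, lo, hi in forecast:
        rows.append((ts, "forecast", predicted))
        rows.append((ts, "lower_bound", lo))
        rows.append((ts, "upper_bound", hi))
    return rows

# Tiny example with made-up numbers around the last known date:
history = [("2017-11-01", 100.0), ("2017-12-01", 110.0)]
forecast = [("2018-01-01", 120.0, 105.0, 135.0)]
long_rows = to_long_format(history, forecast)
```

Feeding such a long-format table to Altair and encoding the series name as color then draws the historical and predicted lines in one plot, which makes the visual comparison straightforward.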
The details of the ML model training in a SnowPark Notebook, and the simple visualization trick, can be found in the 7th part of the Snowflake posts.
