SnowPark Python - Notebook
Further posts in Snowflake topic
SnowFlake universe, part#1
SnowFlake, part#3 Initializing the system
SnowFlake, part#4 using 3rd party Python
SnowFlake, part#5 ML - forecasting
Snowflake, part#6 ML Forecast model (2)
"The Snowpark API provides an intuitive library for querying and processing data at scale in Snowflake."
I found it really helpful that 3rd party Python packages can easily be added to a SnowPark Notebook or Worksheet, so you can, for example, integrate a slider for interactive SQL/Python queries (see the SnowFlake docs or my slider test).
I chose the so-called Tasty Bytes learning dataset to verify the usability of the website, for testing both SQL and Python in SnowPark.
Notebook in SQL mode
As a simple first step, the dataset was fully loaded to get to know its content.
To do this, the data was first copied into a database on SnowFlake from a given site (as defined in the tutorial).
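A minimal sketch of that copy step, driven from a Python cell instead of SQL. The table name, stage name and source URL are placeholders of mine, not the values from the tutorial, and the target table is assumed to exist already. The only Snowpark calls relied on are `Session.sql()` and `collect()`; the helper functions themselves are hypothetical.

```python
from typing import List


def build_load_statements(table: str, stage: str, source_url: str) -> List[str]:
    """Build the SQL that stages external CSV files and copies them into a table.

    All three arguments are placeholders; the target table must already exist.
    """
    return [
        # Point a named stage at the external location holding the CSV files.
        f"CREATE OR REPLACE STAGE {stage} URL = '{source_url}' "
        "FILE_FORMAT = (TYPE = CSV)",
        # Load every staged file into the target table.
        f"COPY INTO {table} FROM @{stage}",
    ]


def run_statements(session, statements: List[str]) -> None:
    """Execute each statement through a Snowpark session."""
    for statement in statements:
        # Session.sql() is lazy; collect() triggers the execution.
        session.sql(statement).collect()


# In a SnowFlake Notebook the session would typically come from:
#   from snowflake.snowpark.context import get_active_session
#   session = get_active_session()
#   run_statements(session, build_load_statements(
#       "menu", "menu_stage", "s3://<bucket>/<path>/"))
```

Keeping the statement builder separate from the execution makes it easy to print and check the SQL before anything is run against the account.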
Notebook in Python mode
Notebooks were originally designed for Python-based data wrangling experiments. I have found that SnowPark covers the most commonly requested Python functionality.
The outcome is a (Pandas) DataFrame that can easily be processed in further steps and displayed in the modes mentioned above.
SnowPark also offers function(ality) hints for Python code, which make the Notebook much easier to use.
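As a rough illustration of getting such a DataFrame, the sketch below assumes the data is first referenced as a Snowpark DataFrame and then converted with `to_pandas()`. The helper `preview_table` and the table name in the usage comment are hypothetical; `Session.table()`, `DataFrame.limit()` and `DataFrame.to_pandas()` are the Snowpark calls it relies on.

```python
def preview_table(session, table_name: str, rows: int = 10):
    """Return the first `rows` rows of a table as a pandas DataFrame."""
    # session.table() only builds a lazy reference; no data is fetched yet.
    snowpark_df = session.table(table_name)
    # limit() keeps the download small; to_pandas() runs the query
    # and materialises the result locally.
    return snowpark_df.limit(rows).to_pandas()


# In a SnowFlake Notebook the session would typically come from:
#   from snowflake.snowpark.context import get_active_session
#   session = get_active_session()
#   df = preview_table(session, "menu", rows=5)
#   df.describe()
```

Limiting the row count before converting is a deliberate choice here: `to_pandas()` pulls the whole result into memory, which is rarely what you want on a first look at a large table.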
Representation in general
Regardless of whether Python or SQL is used to collect and/or filter the data, SnowPark has robust and reliable ways to present the final datasets. One advantage was already noted: if the Chart option is selected, the X and Y axis values are recognised automatically and the corresponding values plotted. These defaults can be overridden or modified to match our needs or expectations for the data representation.
I chose another free dataset: Finance & Economics of Cybersyn (you may read about the initial steps and data selection in part#3)
Another advantage of the SnowPark system is that when query results are shown as a table, you get immediate insight into the overall details of the data on the right-hand side. By clicking on a column header we get an instantaneous breakdown of that column's content, and we can also use that bar chart to filter the displayed data by setting the left and right edges of the shown range to match our interest.
It is simple to add another dataset to a chart (bar or line colours are defined automatically).
Here are two snapshots of the time settings in case the video does not help, with the background changed to white for better visibility of the line chart menu. Data (number of credit card issues of a particular bank) summed up daily:
and after changing to a monthly basis:
Part#3 initializing SnowFlake environment
Part#4 using Python 3rd party modules
Part#5 Dashboards
Part#6 AI & ML using SnowFlake ...