Analysis 3rd attempt using MS Power BI on the DataCamp - OnlineRetail dataset
Additional Power BI #1 - Frequency of purchased quantities in countries
Additional Power BI #2 - Clustering with Power BI

Dataset reference: The original dataset (xlsx file) was downloaded from the Dataset owner's site.
The complete dataset and the BI solution are zipped together. The package is 26 MB and requires an installed copy of Microsoft Power BI (ver 2.130+; made with ver 2.131.1203.0).
Microsoft (MS) Power BI loads data easily from Excel (or from multiple files or online databases) and recognizes data types automatically, so I only had to concentrate on visualizing the required information. I created the simplest figures with minimal style adjustments, as making more elaborate or eye-catching visuals would have taken nearly twice as long.
Before the questions and answers, here is some general information on the data and the dataset:
It is important to know the size of the dataset and the number of countries and Users (website user Customers); see the numbers on the right:
In the order of the questions, let's go through the answers.
1) The most returned items
The details of the Returns - general dashboard
Not surprisingly, the largest amounts of returned goods are related to UK customers. This is a direct consequence of the fact that most purchases originate from there as well (data not shown).
In the bottom Treemap, UK-related transactions are excluded to show how the other countries contribute to the amount of returned goods (the tile graph could be optimized in size).
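For readers who want to check the dashboard numbers outside Power BI, the same aggregation can be sketched in plain Python. This is only an illustration, not the way the report was built: the rows are made up, the column names Description and Country and the value "United Kingdom" are my assumptions about the dataset, and returned_totals is a hypothetical helper.

```python
from collections import defaultdict

# Made-up sample rows: (Description, Country, Quantity).
# As in the dataset, returns are recorded with a negative Quantity.
rows = [
    ("MUG", "United Kingdom", -5),
    ("MUG", "United Kingdom", 12),
    ("LAMP", "Germany", -3),
    ("MUG", "France", -2),
    ("LAMP", "United Kingdom", -1),
]


def returned_totals(rows, key_index, exclude_country=None):
    """Sum returned units (as positive numbers) grouped by one column.

    key_index selects the grouping column: 0 = Description, 1 = Country.
    Rows from exclude_country are skipped, like the UK filter on the Treemap.
    """
    totals = defaultdict(int)
    for row in rows:
        quantity = row[2]
        if quantity >= 0 or row[1] == exclude_country:
            continue  # keep returns only, and drop the excluded country
        totals[row[key_index]] += -quantity
    # Largest amount of returned goods first
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


by_item = returned_totals(rows, key_index=0)
by_country_no_uk = returned_totals(rows, key_index=1, exclude_country="United Kingdom")
print(by_item)
print(by_country_no_uk)
```

The first list corresponds to the "most returned items" ranking, the second to the Treemap without the UK.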
2) Profits earned in the UK, on different timescales
A simple request, easy to answer using the built-in [Month], [MonthNo] or [Quarter] parts of the DateTime variable. I used 'Online Retail'[InvoiceDate][MonthNo] to define a new column in the data Table.
For weeks, the week number can be determined by the formula
WEEKNUM('Online Retail'[InvoiceDate])
in a new column.
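To make clear what kind of week number this is, here is a small Python sketch of the numbering I understand WEEKNUM to use by default: weeks start on Sunday and the week containing 1 January is week 1. Both points are my assumption about the default return type, so please check them against the DAX documentation. The function name weeknum_sunday_start is hypothetical.

```python
from datetime import date


def weeknum_sunday_start(d):
    """Week number where 1 January is in week 1 and weeks start on Sunday.

    This is assumed to match the default behaviour of WEEKNUM; it is not
    the ISO 8601 week number.
    """
    jan1 = date(d.year, 1, 1)
    # date.weekday() is 0 = Monday ... 6 = Sunday; shift to 0 = Sunday ... 6 = Saturday
    jan1_offset = (jan1.weekday() + 1) % 7
    day_of_year = d.timetuple().tm_yday
    return (day_of_year - 1 + jan1_offset) // 7 + 1


print(weeknum_sunday_start(date(2011, 1, 1)))
print(weeknum_sunday_start(date(2011, 1, 2)))
```

Note that with this numbering the first and last weeks of a year can be shorter than seven days, which matters when weekly sums are compared.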
The profit is taken as the net of purchases and returns (refunds), that is, a 'simple' sum of Quantity*Unit price over all transactions in the given time period. Remember that returned quantities are recorded as negative integers, so refunds are subtracted automatically.
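The calculation described above can be sketched in a few lines of Python. The rows are made up for illustration, the Country column and the "United Kingdom" value are assumptions about the dataset, and profit_by_period is a hypothetical helper, not something taken from the Power BI solution.

```python
from collections import defaultdict
from datetime import datetime

# Made-up sample rows: (InvoiceDate, Country, Quantity, UnitPrice).
# The second row is a return, so its Quantity is negative.
rows = [
    (datetime(2011, 1, 5), "United Kingdom", 10, 2.50),
    (datetime(2011, 1, 20), "United Kingdom", -2, 2.50),
    (datetime(2011, 2, 3), "United Kingdom", 4, 1.25),
    (datetime(2011, 2, 9), "France", 7, 3.00),
]


def profit_by_period(rows, country, period_key):
    """Sum Quantity * UnitPrice per period for one country.

    period_key maps an invoice date to a period label, e.g. (year, month).
    """
    totals = defaultdict(float)
    for invoice_date, row_country, quantity, unit_price in rows:
        if row_country == country:
            totals[period_key(invoice_date)] += quantity * unit_price
    return dict(totals)


monthly = profit_by_period(rows, "United Kingdom", lambda d: (d.year, d.month))
quarterly = profit_by_period(
    rows, "United Kingdom", lambda d: (d.year, (d.month - 1) // 3 + 1)
)
print(monthly)
print(quarterly)
```

Switching the timescale only means switching the period_key function, which is what the different date parts do in the Power BI visuals.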
3) Is non-UK purchased amount significantly larger than UK quantities?
First, note that Power BI does not have a built-in (statistical) t-test, so for now I provide only simple graphs and the related data on which a t-test would be based. Expert data analyser
Note that a larger mean value does not by itself prove that non-UK customers buy goods in significantly larger amounts. The plots may suggest so to a professional Data Scientist, but formal confirmation requires a t-test (not shown here).
In addition to the supplementary information mentioned above, many other facts and conclusions can be extracted from the dataset, but this post is about answering the three questions defined by DataCamp using Power BI.
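As a rough idea of what such a test could look like outside Power BI, here is a minimal Python sketch of Welch's t-test (the unequal-variance variant, which is my choice here and not part of the original analysis). It uses only the standard library, so the p-value comes from a normal approximation that is reasonable only for large samples such as this dataset. The sample numbers are made up and welch_t_test is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t_test(sample_a, sample_b):
    """Return (t, degrees of freedom, one-sided p) for mean(a) > mean(b).

    The p-value uses a normal approximation to the t distribution, which
    is only appropriate when both samples are large.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    p_one_sided = 1.0 - NormalDist().cdf(t)
    return t, df, p_one_sided


# Made-up purchased quantities per transaction, for illustration only
non_uk = [12, 24, 6, 48, 10, 36]
uk = [2, 6, 1, 12, 4, 3]

t, df, p = welch_t_test(non_uk, uk)
print(t, df, p)
```

With samples as small as the made-up ones above, the normal approximation is too optimistic; an exact p-value would need the t distribution, for example from a statistics package.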