The Junior Data Scientist Academy
This online course provided (and still provides) a comprehensive introduction to the theory and practice of a data scientist's tools. It covered data analysis and predictive modeling, blending practical tools with foundational concepts. Over several weeks, we progressed from gathering data and automating data workflows to using SQL for data organization and segmentation.
Business-focused topics like revenue segmentation, key metrics, and visualization techniques were explored, followed by more advanced analytical methods including funnel analysis and cohort studies. The final weeks introduced Python's Pandas library for data manipulation, regression techniques for prediction, and beginner-friendly machine learning (ML) concepts such as classification. By the end, participants had gained hands-on skills to analyze data and derive actionable insights.
Almost no prior knowledge of Linux (Bash), Python, or data science was required, so beginners could join. Anyone who already had some practice in those areas, as I did, could find solutions to the tricky tasks more easily. There was an entry task before anyone became eligible to join the course: create an online Linux server (Linode), configure it, then grant the tutor access to verify the result. This might have been a steep learning curve for real beginners.
- WEEK #0
  - Linux installation on a workstation and on a Linode (web) server
  - Basic Bash commands (Linux)
  - Set up a Python environment (3.x)
  - Basic Python with practice
  - Python module imports, functions
- WEEK #1
  - Welcome
  - Get the data
  - ETL with Bash
  - ETL with Python
  - Automate
- WEEK #2
  - Put your data into SQL (Python + SQL)
  - Automate the SQL load
  - Verify and analyze (SQL): basic SQL queries
  - Analytical and data-modifying SQL queries
  - Complex analytics: JOINs
  - Database/table modifications
- WEEK #3
  - Segments
  - Segmentation (revenue)
  - Business metrics
  - Visualize (Matplotlib, Power BI)
- WEEK #4
  - Funnels
  - Cohort Analysis
  - Cohort Analysis 2
- WEEK #5
  - Pandas Intro
  - Prediction with Regression
  - Simple Machine Learning: Classification
  - Further Classification
- WEEK #6
  - Presentations, one-on-one meeting(s), summary, takeaways, ...
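To give a flavor of the Week #4 material, here is a minimal sketch of a funnel count in plain Python. It is not taken from the course: the step names, the event log, and the `funnel_counts` helper are all hypothetical, and the course itself may well have done this in SQL or Pandas instead.

```python
from collections import defaultdict

# Hypothetical funnel steps and event log (user_id, step); not course data.
FUNNEL_STEPS = ["visit", "signup", "purchase"]
events = [
    (1, "visit"), (1, "signup"), (1, "purchase"),
    (2, "visit"), (2, "signup"),
    (3, "visit"),
    (4, "visit"), (4, "signup"), (4, "purchase"),
]


def funnel_counts(events, steps):
    """Count distinct users who reached each step and every step before it."""
    steps_by_user = defaultdict(set)
    for user_id, step in events:
        steps_by_user[user_id].add(step)

    counts = []
    for i in range(len(steps)):
        required = set(steps[: i + 1])
        # A user counts for step i only if all steps up to i were seen.
        counts.append(sum(1 for seen in steps_by_user.values() if required <= seen))
    return counts


counts = funnel_counts(events, FUNNEL_STEPS)
for step, count in zip(FUNNEL_STEPS, counts):
    print(f"{step}: {count} users ({count / counts[0]:.0%} of the first step)")
```

The idea is simply to see how many users drop off between consecutive steps; the percentages are relative to the first step.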
During the SQL part of the course we practiced all four main kinds of SQL statements:
- Database Modification (DM): commands like INSERT, UPDATE, DELETE, and ALTER.
- Data Analysis (DA): queries used to analyze and retrieve data, such as SELECT with analytical functions (SUM, AVG, GROUP BY, etc.).
- Database Creation (DC): commands that create new database objects, like CREATE TABLE, CREATE VIEW, and CREATE INDEX.
- Database Control (DCn): commands that manage database access and constraints, like GRANT, REVOKE, and SET.
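As a rough illustration of the first three kinds, here is a small sketch using Python's built-in `sqlite3` module. The `purchases` table, its columns, and the sample rows are made up for this example and are not the course's data. SQLite has no GRANT or REVOKE, so the control category is left out here.

```python
import sqlite3

# In-memory SQLite database; table, columns, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Database Creation: define a table and an index.
cur.execute("CREATE TABLE purchases (user_id INTEGER, country TEXT, amount REAL)")
cur.execute("CREATE INDEX idx_purchases_country ON purchases (country)")

# Database Modification: ALTER the table, then INSERT, UPDATE, and DELETE rows.
cur.execute("ALTER TABLE purchases ADD COLUMN currency TEXT")
cur.executemany(
    "INSERT INTO purchases (user_id, country, amount) VALUES (?, ?, ?)",
    [(1, "HU", 10.0), (2, "HU", 30.0), (3, "DE", 20.0), (4, "DE", 0.0)],
)
cur.execute("UPDATE purchases SET amount = 25.0 WHERE user_id = 3")
cur.execute("DELETE FROM purchases WHERE amount = 0")
conn.commit()

# Data Analysis: aggregate with SUM, AVG, and GROUP BY.
cur.execute(
    "SELECT country, SUM(amount), AVG(amount) FROM purchases "
    "GROUP BY country ORDER BY country"
)
revenue_by_country = cur.fetchall()
print(revenue_by_country)

conn.close()
```

On a server database such as PostgreSQL or MySQL, the control statements (GRANT, REVOKE) would be run by an administrator account on top of this.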