This course is designed by Raazesh Sainudiin, an Associate Professor in Mathematics with Specialisation in Data Science at the Department of Mathematics, Uppsala University and Consulting Principal Data Scientist at Combient AB in Stockholm.

# COURSE

The course will provide *an **in**troduction to Data Science* simultaneously from a computational, mathematical and statistical perspective.

# Style

The approach will use formal mathematical communication of concepts with concomitant development of computer programs.

Lectures will build on: Sets, Maps, Functions, Modular Arithmetic, Axiomatic Probability, Conditional probability, constructive understanding of random variables and structures including graphs, Statistics, Likelihood Principle, Decisions (parametric and non-parametric) including hypothesis testing and estimation.

Everyone should get something from this course. It is meant to help you prototype in the most active open-source, mathematically consistent ecosystem based around Python.

# Course Materials

- **Download** the SageMath `.ipynb` notebooks as well as the `images` and `data` directories in the following `.zip` file:
- **Unzip** it into the directory you launched the SageMath Jupyter notebook server from (see SageMath instructions below).
- You should be able to see all the Jupyter `.ipynb` notebooks by navigating from your Jupyter notebook server.
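The unzip step can also be done from Python. The following is a minimal sketch using only the standard library; the archive name `course-materials.zip` and the helper `unzip_course_materials` are hypothetical placeholders, since the actual file name depends on what you downloaded.

```python
import zipfile
from pathlib import Path


def unzip_course_materials(zip_path, target_dir="."):
    """Extract the archive into target_dir and return the notebook paths found.

    Both arguments are placeholders: pass the archive you actually downloaded
    and the directory you launch the Jupyter notebook server from.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    # Collect every .ipynb file below the target directory, at any depth.
    return sorted(p.relative_to(target).as_posix() for p in target.rglob("*.ipynb"))


if __name__ == "__main__":
    archive_name = Path("course-materials.zip")  # hypothetical file name
    if archive_name.exists():
        for notebook in unzip_course_materials(archive_name):
            print(notebook)
    else:
        print(f"{archive_name} not found; adjust the path to your download.")
```

If the printed list is empty, the archive was probably extracted somewhere other than the directory the notebook server was launched from.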

**Individual SageMath Jupyter .ipynb Notebooks**

*Add or replace with new or updated notebooks, as they become available*, to cover target topics including correlation, linear regression, and parametric and non-parametric hypothesis testing, etc.

Roughly one notebook will be covered per lab/lecture (we have a total of 12 lectures; some of which can go faster).

We will break regularly for exercises, given as assignments, to maximise retention and build immediate brain-muscle memory on your own operating system.

- 00. Introduction
- 01. BASH Crash
- 02. Numbers, Strings, Booleans and Sets
- 03. Map, Function, Collection and Probability
- 04. Conditional Probability, Random Variable, Loop and Conditional
- 05. Random Variables, Expectations, Data, Statistics, Arrays and Tuples, Iterators and Generators
- 06. Statistics from Data: Fetching New Zealand Earthquakes, Counting votes in 2018 Swedish National Election, Locating Biergartens in Germany & Live Play with `data/`
- 07. Modular Arithmetic, Linear Congruential Generators, and Pseudo-Random Numbers
- 08. Pseudo-Random Numbers, Simulating from Some Discrete and Continuous Random Variables
- 09. Estimation, Likelihood, Maximum Likelihood Estimators and Regressions
- 10. Convergence of Limits of Random Variables, Confidence Set Estimation and Testing
- 11. Non-parametric Estimation and Testing
- 12. Linear Regression
- Now, we are ready for scalable data engineering sciences in Python and Scala via Apache Spark
  - ML
  - DL
  - geospatial
  - NLP
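As a small preview of topic 07 in the list above, a linear congruential generator can be sketched in a few lines of plain Python. This is only an illustration under assumed toy parameters (`a=5`, `c=1`, `m=16`, chosen here and not taken from the course notebooks); the function names `lcg` and `first_n` are likewise hypothetical.

```python
def lcg(seed, a, c, m):
    """Yield the sequence x_{n+1} = (a * x_n + c) mod m, starting after the seed."""
    x = seed % m
    while True:
        x = (a * x + c) % m
        yield x


def first_n(seed, a, c, m, n):
    """Return the first n values produced by the generator."""
    gen = lcg(seed, a, c, m)
    return [next(gen) for _ in range(n)]


if __name__ == "__main__":
    # Toy parameters chosen only for illustration; they are not from the course.
    values = first_n(seed=0, a=5, c=1, m=16, n=16)
    print(values)
    # Dividing by m maps the integers into [0, 1), giving pseudo-random numbers.
    print([v / 16 for v in values[:4]])
```

With these parameters the generator visits every residue modulo 16 exactly once before repeating, which is the kind of behaviour that modular arithmetic lets one reason about.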

This is the GitHub directory containing the SageMath contents for the course:

# Assignments for Exercising between Lectures

Assignments will be attempted regularly in between lab-lectures:

- first try on your own,
- start cooperating with your course-mates to learn/co-learn/teach,
- the solutions will be displayed only after a good amount of time is devoted to solving the assigned exercises.

# Assignments

# Software

We will use SageMath locally during face-to-face interactions.

**Supplementary Book:**

## Computer labs

It is strongly recommended that you install SageMath on your laptop or desktop computer for convenience by following the instructions in the next section.

You can also use SageMath in the cloud for free via cocalc.com for **co**llaborative **cal**culation in the **c**loud. We might do this while the download is happening.

### 1. Local: Prepare your laptop on your own

It may be more convenient to install SageMath on your laptop or desktop.

- Follow the download and installation instructions for your Operating System from the following URL:
- To test that you have installed correctly do the following:
  - On a Mac OS X or Unix/Linux system, say you installed sage in a directory inside your home directory called `~/all/software/sage/`, then you can see if the following command launches a Jupyter notebook server successfully: `$ ~/all/software/sage/SageMath/sage -n jupyter`
  - Those with Windows should follow the instructions in the following URL and test that the Jupyter notebook server launches successfully:

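Before running the launch command, it can help to confirm that the `sage` script is where you expect it. The following is a minimal sketch, assuming the layout from the example path above (`<install_dir>/SageMath/sage`); the helper name `sage_launch_command` is hypothetical, and the script only prints the command rather than starting the server.

```python
import os
from pathlib import Path


def sage_launch_command(install_dir):
    """Return the launch command as a list of strings, or None if sage is not found.

    install_dir is wherever you unpacked SageMath; the executable is assumed
    to sit at <install_dir>/SageMath/sage, as in the example path above.
    """
    executable = Path(install_dir).expanduser() / "SageMath" / "sage"
    if executable.is_file() and os.access(executable, os.X_OK):
        return [str(executable), "-n", "jupyter"]
    return None


if __name__ == "__main__":
    command = sage_launch_command("~/all/software/sage")
    if command is None:
        print("No executable sage script found; check the installation directory.")
    else:
        # Printed rather than executed, so you can paste it into a terminal.
        print(" ".join(command))
```

If the script reports that nothing was found, adjust the directory argument to match where you actually installed SageMath.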

### 2. By browser for collaborative calculation in the cloud

As a backup in the cloud:

- Get a free account in cocalc.com, for collaborative calculation in the cloud.
- Upload the course materials as a `.zip` file to your account.

# Course on GitHub

The source GitHub directory for the course contents is:

# YouTube Video Clips for SageMath Setup

- Step 1: How to download Applied Statistics Course Content for the first time:
- Step 2: How to Download SageMath and set it up on your system (done for a similar course):
  - Mac OS X: for the *proper* way, watch https://youtu.be/i8AweQ2GyS0
  - Linux (you know better?)
  - Windows: follow instructions above (may not be supported for Windows 7!)
