Brief Overview of a 360-in-525 Minutes Course Set
For more details see Overview of a 360-in-525 Minutes Course Set in Data Sciences, Spring 2018
360-in-525-4: Mathematical, Statistical and Computational Foundations for Data Scientists
Three full-day workshops (3 hp) on May 11, 18 and 25, 2018.
Prerequisites: current proficiency in high-school level mathematics (pre-calculus, geometry and algebra) and some programming experience beyond Excel.
Target Audience: any MSc or PhD student at UU who wants to understand the mathematical and statistical foundations in the data scientist’s computational toolbox.
The approach will use formal mathematical communication of concepts, starting from sets and logic, with concomitant development of computer programming skills to algorithmically construct and implement the concepts.
Topics will include: Sets, Maps, Functions, Modular Arithmetic, Axiomatic Probability, Conditional Probability, Pseudo-random constructive understanding of random variables and structures including graphs, Statistics, Likelihood Principle, Bayes Rule, Decisions (parametric and non-parametric) including tests and estimators, Markov chains and their pseudo-random constructions, etc.
We will use SageMath locally and collaborate in CoCalc during the lab/lectures.
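To give a flavour of what a "pseudo-random constructive understanding of random variables" can look like, here is a minimal sketch in plain Python. It is not taken from the course notebooks: the function names are hypothetical, and the multiplier, increment and modulus are the widely quoted Numerical Recipes constants, chosen here purely as an assumption. It builds a linear congruential generator from modular arithmetic, scales its output to [0, 1), and thresholds that to obtain Bernoulli(p) samples.

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Yield an endless stream of integers x_{n+1} = (a*x_n + c) mod m.

    The default a, c, m are illustrative constants (an assumption),
    not parameters prescribed by the course.
    """
    x = seed % m
    while True:
        x = (a * x + c) % m
        yield x


def uniform_samples(seed, n, m=2**32):
    """Return n pseudo-random floats in [0, 1) by dividing LCG output by m."""
    gen = lcg(seed, m=m)
    return [next(gen) / m for _ in range(n)]


def bernoulli_samples(seed, n, p):
    """Map each uniform u to 1 if u < p, else 0, giving Bernoulli(p) draws."""
    return [1 if u < p else 0 for u in uniform_samples(seed, n)]


if __name__ == "__main__":
    flips = bernoulli_samples(seed=2018, n=1000, p=0.5)
    # Fraction of ones among 1000 simulated fair coin flips.
    print(sum(flips) / len(flips))
```

The same seed always reproduces the same sequence, which is what makes the construction "pseudo-random": everything is a deterministic function built from sets, maps and modular arithmetic.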
Background and Context:
This is a mathematically more careful (at an advanced undergraduate level) version of UC Berkeley’s most popular freshman course, http://data8.org/, with the formula:
- computational thinking + inferential thinking = data science
as talked about at the end here.
Prepare your laptop:
SOFTWARE: We will be using the SageMath/Python ecosystem for the next three Fridays. Follow the download and installation instructions for your operating system from the following URL:
To test that you have installed it correctly, do the following:
- On a Mac OS X or Unix/Linux system, say you installed sage in a directory inside your home directory called ~/all/software/sage/. Then you can check whether the following command launches a Jupyter notebook server successfully:
$ ~/all/software/sage/SageMath/sage -n jupyter
- Those with Windows should follow the instructions in the following URL and test that the jupyter notebook server launches successfully:
Course Content
Download the zip file of SageMath ipynb notebooks from:
After downloading the zip file, unzip it inside the directory you launched the sage jupyter notebook server from. You should be able to see all the jupyter .ipynb notebooks by navigating from your jupyter notebook server.
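If you prefer to unzip and check the contents from Python rather than a file manager, the step above can be sketched as follows. This is only an illustration: the function name `extract_notebooks` and the archive filename in the usage comment are hypothetical placeholders, so substitute the name of the zip file you actually downloaded.

```python
import zipfile
from pathlib import Path


def extract_notebooks(zip_path, target_dir="."):
    """Extract zip_path into target_dir and list the .ipynb files found there.

    Returns the notebook paths relative to target_dir, sorted by name.
    """
    target = Path(target_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return sorted(p.relative_to(target).as_posix() for p in target.rglob("*.ipynb"))


# Usage (hypothetical filename; run from the directory where you launched sage):
#   for name in extract_notebooks("course-notebooks.zip"):
#       print(name)
```

Every notebook printed this way should also be visible when you browse from the jupyter notebook server started in the same directory.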
Individual SageMath Jupyter .ipynb Notebooks
Use the above archived .zip file directly!
- 00. Introduction
- 01. BASH crash
- 02. Numbers, Strings, Booleans and Sets
- 03. Map, Function, Collection, and Probability
- 04. Conditional Probability, Random Variables, Loops and Conditionals
- 05. Random Variables, Expectations, Data, Statistics, Arrays and Tuples, Iterators and Generators
- 06. Statistics and List Comprehensions with New Zealand Earthquakes
- 07. Modular Arithmetic, Linear Congruential Generators, and Pseudo-Random Numbers
- 08. Pseudo-Random Numbers, Simulating from Some Discrete and Continuous Random Variables
- 09. Estimation, Likelihood, Maximum Likelihood Estimators and Symbolic Expressions
- 10. Markov Chains
- 11. Limits, Convergence, and Estimation
- 12. Non-parametric Estimation and Testing
- 1515-1630 hours, May 25 2018: Kollokvium: Local asymptotic equivalence of pure quantum states ensembles and quantum gaussian white noise
YouTube Archive of lab/lectures
- 360-in-525-04 (Day-1/3-LabLec-1/4) Minutes Course Set: Scalable Data Science from Atlantis
- 360-in-525-04 (Day-1/3-LabLec-2/4) Minutes Course Set: Scalable Data Science from Atlantis
- 360-in-525-04 (Day-1/3-LabLec-3/4) Minutes Course Set: Scalable Data Science from Atlantis
- 360-in-525-04 (Day-1/3-LabLec-4/4) Minutes Course Set: Scalable Data Science from Atlantis
- 360-in-525-04 (Day-2/3-LabLec-1/5) Minutes Course Set: Scalable Data Science from Atlantis
- 360-in-525-04 (Day-2/3-LabLec-2/5) Minutes Course Set: Scalable Data Science from Atlantis
- 360-in-525-04 (Day-2/3-LabLec-3/5) Minutes Course Set: Scalable Data Science from Atlantis
- 360-in-525-04 (Day-2/3-LabLec-4/5) Minutes Course Set: Scalable Data Science from Atlantis
- 360-in-525-04 (Day-2/3-LabLec-5/5) Minutes Course Set: Scalable Data Science from Atlantis
- 360-in-525-04 (Day-3/3-LabLec-1/3) Minutes Course Set: Scalable Data Science from Atlantis
- 360-in-525-04 (Day-3/3-LabLec-2/3) Minutes Course Set: Scalable Data Science from Atlantis
- 360-in-525-04 (Day-3/3-LabLec-3/3) Minutes Course Set: Scalable Data Science from Atlantis