    • SDS-2.x Course
      • Basics
      • Setup Instructions
    • Contents-databricks
      • course content as .dbc link
      • Introduction
      • Why Spark?
      • Login to databricks
      • Scala Crash Course
      • RDDs
      • RDDs HOMEWORK
      • Word Count - SOU
      • Russian Word Count
      • SparkSQL Basics
      • SparkSQL HW-a
      • SparkSQL HW-b
      • SparkSQL HW-c
      • SparkSQL HW-d
      • SparkSQL HW-e
      • SparkSQL HW-f
      • ETL Diamonds Data
      • ETL Power Plant
      • Wiki Click streams
      • Spark SQL Windows and Activity Detection by Random Forest
      • Graph Frames Intro
      • Ontime Flight Performance
      • Spark Streaming Intro
      • Extended Twitter Utils
      • Tweet Transmission Trees
      • Tweet Collector
      • Tweet Track, Follow
      • Tweet Hashtag Counter
      • GDELT dataset
      • Old Bailey Online - ETL of XML
      • Latent Dirichlet Allocation of Cornell Movie Dialogs
      • MLeap Model Export Demo
      • Market Basket Analysis via FP Growth
      • Animal Names Streaming Files
      • Normal Mixture Streaming Files
      • Structured Streaming Prog Guide
      • Graph Mixture Streaming Files
      • Structured Streaming of JSONs
      • T-Digest Normal Mixture Streaming Files
      • Sketching with T-Digest
      • Streaming with T-Digest
      • Tuning Utilities
      • Tuning Transformations and Actions
      • Tuning Utilities in Action
      • Tuning for Caching
      • Tuning for Partitioning
      • Introduction to Data Science: A Computational, Mathematical and Statistical Approach
      • Simulation Intro
      • Machine Learning Intro
      • K-Means 1MSongs Intro
      • 1MSongs - 1 ETL
      • 1MSongs - 2 Explore
      • 1MSongs - 3 Model
      • Decision Trees for Digits
      • Linear Algebra Intro
      • Linear Regression Intro
      • DLA - Distributed Linear Algebra
      • DLA - Data Types Prog Guide
      • DLA - Local Vector
      • DLA - Labeled Point
      • DLA - Local Matrix
      • DLA - Distributed Matrix
      • DLA - Row Matrix
      • DLA - Indexed Row Matrix
      • DLA - Coordinate Matrix
      • DLA - Block Matrix
      • Power Plant - Model Tune Evaluate
      • Tweet Language Classifier
      • Power Plant - Model Tune Evaluate Deploy
      • Activity Detection - Random Forest
      • 20 Newsgroups - Latent Dirichlet Allocation
      • Cornell Movie Dialogs - Latent Dirichlet Allocation
      • Movie Recommendation - Alternating Least Squares
      • Geospatial Analytics in Magellan
      • Open Street Map Ingestion in Magellan
      • NY Taxi trips in Magellan
      • Querying Beijing Taxi Trajectories in Magellan
      • Map-matching and Visualizing Uber Trajectories
      • Intro to NLP
      • Getting Started with SparkNLP
      • Annotations with Pretrained Pipelines
      • Named Entity Recognition with Pretrained Models
      • Train Lemmatizer Model in Italian
      • Training French POS Tagger
      • Evaluate French POS Model
      • Intro to Deep Learning
      • Outline for DL
      • Neural Networks
      • Deep Feed-Forward NNs with Keras
      • Hello Tensorflow
      • Batch Tensorflow with Matrices
      • Convolutional Neural Nets
      • MNIST: Multi-Layer-Perceptron
      • MNIST: Convolutional Neural Net
      • CIFAR-10: CNNs
      • Recurrent Neural Nets and LSTMs
      • LSTM solution
      • LSTM spoke Zarathustra
      • Generative Networks
      • Reinforcement Learning
      • DL Operations
      • Intro to Privacy-preserving Machine Learning
      • Privacy and Pseudonymization
      • Differential Privacy

    As an importable databricks notebook

    Updated: August 29, 2025

    © 2025 Raazesh Sainudiin.