Brief Overview of a 360-in-525 Minutes Course Set

For more details, see Overview of a 360-in-525 Minutes Course Set in Data Sciences, Spring 2018.

360-in-525-3: Geospatial Analytics and Big Data

This is a two-day workshop (2 hp) on May 3-4 2018. Prerequisites: 360-in-525-1 or Introduction to Data Science. The first day will be taught by domain experts from Uppsala University’s Department of Social and Economic Geography, who will introduce the basic problems and datasets of the field through hands-on lab tutorials in non-distributed geospatial analytics. The second day covers distributed geospatial analytics over real datasets that can be scaled to petabytes (the syllabus is jointly designed with experts in London’s big data industry). Topics include efficient distributed spatial joins, ingestion and representations of OpenStreetMap data that are conducive to Pregel-style distributed vertex programs, and SparkSQL and Spark Machine Learning pipelines with spatiotemporal GPS trajectories of multiple individuals.
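To give a flavour of the first topic, here is a minimal single-machine sketch of a grid-partitioned point-in-polygon join, written in plain Python rather than in Spark. It is not the course's implementation: the function names, the cell size and the toy data are all hypothetical. The idea it illustrates is that both sides of the join are keyed by grid cell, so that only points and polygons sharing a cell are ever compared; in a distributed setting the cell key would typically serve as the partitioning key.

```python
from collections import defaultdict
from math import floor


def cell_of(x, y, size):
    """Grid cell (column, row) containing the point (x, y)."""
    return (floor(x / size), floor(y / size))


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def grid_spatial_join(points, polygons, size=1.0):
    """Return sorted (point_id, polygon_id) pairs with the point inside the polygon."""
    # Index every polygon under each grid cell its bounding box overlaps.
    index = defaultdict(list)
    for poly_id, poly in polygons.items():
        xs = [vx for vx, _ in poly]
        ys = [vy for _, vy in poly]
        cx0, cy0 = cell_of(min(xs), min(ys), size)
        cx1, cy1 = cell_of(max(xs), max(ys), size)
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                index[(cx, cy)].append(poly_id)

    # Each point is only tested against polygons registered in its own cell.
    matches = []
    for point_id, (x, y) in points.items():
        for poly_id in index.get(cell_of(x, y, size), []):
            if point_in_polygon(x, y, polygons[poly_id]):
                matches.append((point_id, poly_id))
    return sorted(matches)


# Toy data (hypothetical): one square, one triangle, three points.
polygons = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(3, 0), (5, 0), (4, 2)],
}
points = {"p1": (1, 1), "p2": (4, 0.5), "p3": (2.5, 2.5)}

print(grid_spatial_join(points, polygons))
```

The bounding-box index is deliberately coarse: it may register a polygon in cells it does not actually touch, and the exact ray-casting test then filters those candidates out.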

MSR Cross-domain Data Fusion Image

Modules 360-in-525-1, 2 and 3 should prepare you for Microsoft Research’s urban computing and cross-domain data fusion.

MSR Urban Computing Image

Course Content

360-in-525-3: Geospatial Analytics and Big Data on May 3 2018

YouTube Archive of lab-lectures:

SCHEDULE:

360-in-525-3: Geospatial Analytics and Big Data on May 4 2018

YouTube Archive of lab-lectures:

SCHEDULE

  • 0830-1000: Scalable Geospatial Analytics, An Introduction
    • Cross-Domain Data Fusion and Knowledge Extraction (~20 minutes Lecture)
    • Markov Random Forests, Activity Detection, Intro to Spark’s GraphX/GraphFrames
  • Fika break 30 minutes - sponsored by Combient AB
  • 1030-1200: Introduction to Pregel Distributed Vertex Programs in Spark’s GraphX and GraphFrames Libraries
  • Lunch
  • 1330-1500: Scalable Geospatial Computing with Magellan: Uber GPS trajectories in San Francisco
  • Fika break 30 minutes - sponsored by Combient AB
  • 1530-1700: Scalable Geospatial Constraint Satisfaction Problems, Distributed Map-Matching and Lumped Markov Chain GraphX Representations of OpenStreetMap
    • A nonparametric formulation of trajectories as Markov chains over lumped state-space representations of OpenStreetMap, as a generic framework of computational/inferential thinking for geospatial data scientists
    • Listen to the latest geospatial data engineering podcast by Ram Sriharsha on Software Engineering Daily.
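
As a rough illustration of the Markov chain formulation in the last session, the sketch below estimates empirical transition probabilities from trajectories whose fine-grained states have been lumped into coarser ones. It is a toy under stated assumptions, not the course's GraphX code: the way identifiers `w1` to `w4`, the `lump_of` mapping, and both function names are hypothetical, and the trajectories are assumed to be already map-matched to road segments. Collapsing consecutive repeats of a lumped state is a modelling choice made here for simplicity; it discards self-transitions.

```python
from collections import Counter, defaultdict


def lump(trajectory, lump_of):
    """Map fine states to lumped states, collapsing consecutive repeats."""
    lumped = []
    for fine_state in trajectory:
        state = lump_of[fine_state]
        if not lumped or lumped[-1] != state:
            lumped.append(state)
    return lumped


def transition_probabilities(trajectories):
    """Empirical (count-based) transition probabilities between states."""
    counts = defaultdict(Counter)
    for states in trajectories:
        # Count every observed step from one state to the next.
        for src, dst in zip(states, states[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


# Hypothetical map-matched trajectories over four road segments,
# lumped into three coarser states A, B and C.
lump_of = {"w1": "A", "w2": "A", "w3": "B", "w4": "C"}
trajectories = [
    ["w1", "w2", "w3", "w4"],
    ["w2", "w3", "w1"],
    ["w1", "w3", "w4"],
]

P = transition_probabilities([lump(t, lump_of) for t in trajectories])
for src, row in sorted(P.items()):
    print(src, row)
```

Because the estimate is just normalized counts, no parametric form is imposed on the transitions; states that never appear as a source (here `C`) simply have no row.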

All Databricks notebooks

Import all Databricks notebooks for this module as a .dbc file from:
