ScaDaMaLe Course site and book

Edited by Suparerk Angkawattanawit and Raazesh Sainudiin.

Peer-reviewed by project authors according to these instructions.


A total of 22 PhD student groups did projects of their choosing in Scalable Data Science and Distributed Machine Learning, a mandatory course of the WASP Graduate School AI-track, in 2020-2021. See ScaDaMaLe Course Pathways for the prerequisite modules, 000_1 through 000_9, that together cover all 23 projects, including a voluntary one by Master's thesis students.

Best Group Project: The group project MixUp and Generalization by Olof Zetterqvist, Jimmy Aronsson and Fredrik Hellström of Chalmers University won the Best Group-Project Prize on the basis of peer review. The prize was kindly donated by the Databricks University Alliance under Rob Reed.

Table of Contents

  1. The Two Cultures by Daniel Ahlsén, Martin Andersson, Niklas Gunnarsson and Jonathan Styrud.
  2. Exploring the GQA Scene Graph Dataset Structure and Properties by Adam Dahlgren, Pavlo Melnyk and Emanuel Sanchez Aimar.
  3. Signed Triads in Social Media by Guangyi Zhang.
  4. Distributed Linear Algebra by Måns Williamson and Jonatan Vallin.
  5. Wikipedia analysis using Latent Dirichlet Allocation (LDA) by Axel Berg, Johan Grönqvist and Jens Gulin.
  6. Unsupervised clustering of particle physics data with distributed training by Karl Bengtsson Bernander, Colin Desmarais, Daniel Gedon and Olga Sunneborn Gudnadottir.
  7. Motif Finding by Adam Lindhe, Petter Restadh and Francesca Tombari.
  8. Distributed Ensemble by Amanda Olmin, Amirhossein Ahmadian and Jakob Lindqvist.
  9. Topic Modeling with SARS-CoV-2 Genome by Hugo Werner and Gizem Çaylak.
  10. Twitter Streaming Using Geolocation and Emoji Based Sentiment Analysis by Georg Bökman and Rasmus Kjær Høier.
  11. Anomaly Detection with Iterative Quantile Estimation and T-digest by Alexander Karlsson, Alvin Jin and George Osipov.
  12. Analysis and Prediction of COVID-19 Data by Chi Zhang, Shuangshuang Chen and Magnus Tarle.
  13. Genomics Analysis with Glow and Spark by Karin Stacke and Milda Pocoviciute.
  14. Distributed Combinatorial Bandits by Niklas Åkerblom, Jonas Nordlöf and Emilio Jorge.
  15. Reinforcement Learning for Intraday Trading by Fabian Sinzinger, Karl Bäckström and Rita Laezza.
  16. Intrusion Detection by MohamedReza Faridghasemnia, Javad Forough, Quantao Yang and Arman Rahbar.
  17. Density Estimation via Voronoi Diagrams in High Dimensions by Robert Gieselmann and Vladislav Polianskii.
  18. Recommender System by Ines De Miranda De Matos Lourenço, Yassir Jedra and Filippo Vannella.
  19. Fundamental Matrix by Linn Öström, Patrik Persson, Johan Oxenstierna and Alexander Dürr.
  20. MixUp and Generalization by Olof Zetterqvist, Jimmy Aronsson and Fredrik Hellström.
  21. Graph Spectral Analysis by Ciwan Ceylan and Hanna Hultin.
  22. SWAP With DDP by Christos Matsoukas, Emir Konuk, Johan Fredin Haslum and Miquel Marti.
  23. Distributed Deep Learning by William Anzén and Christian von Koch.