Group Projects: ScaDaMaLe WASP Instance 2024-2025
Edited by Raazesh Sainudiin.
Peer-reviewed by project authors according to these instructions and made available here.
Introduction
A total of 32 PhD students in 9 groups did projects of their choosing in Scalable Data Science and Distributed Machine Learning, a mandatory as well as elective course of The WASP Graduate School in 2024-2025. See ScaDaMaLe Course Pathways for details.
The Best Student Group Project on the basis of peer-review was Federated Named Entity Recognition: NERF by Jennifer Andersson, Lovisa Hagstrom, Katriin Kukk and Thibault Marette.
Student Group Projects
- Federated Named Entity Recognition: NERF by Jennifer Andersson, Lovisa Hagstrom, Katriin Kukk and Thibault Marette. Presentation Video.
- Federated Learning for Music Recommendation System by Lukas Eveborn, Christian Gustavsson, Olle Hansson and Yangyang Wen. Presentation Video.
- ray-sam: Segment Anything Model using Ray.io by Songtao Cheng, Jingyu Guo, Nils Mechtel and Thanadol Sutantiwanichkul. Presentation Video.
- Plant Map: Federated learning for segmentation, detection, and classification of weed species in aerial images taken from farm fields by Derya Akbaba, Sofia Andersson, Xavante Erickson, Markus Fritzche and Sara Karimi. Presentation Video.
- GCP-llm-CPT: Continued Pre-Training of Large Language Models on Google Cloud Platform by Borggren Lukas. Presentation Video.
- Descent RSC: Decentralized Training for Road Surfaces Classification by Palatip Jopanya, Sheng Liu and Zesen Wang. Presentation Video.
- Matrees: Scalable missingness-avoiding decision trees by Anton Matsson and Newton Mwai. Presentation Video.
- NeedleDDD: A Distributed and Decentralized Machine Learning Framework by Eric Olsson, Qi Shao, Hantang Zhang and Huaifeng Zhang. Presentation Video.
- Time series anomaly detection using an ensemble of Autoencoders by Kasper Bagmark, Erik Jansson, Peng Kuang, Michele Di Sabato and Selma Tabakovic. Presentation Video.
Acknowledgements
These projects were partially supported by the Wallenberg AI, Autonomous Systems and Software Program, funded by the Knut and Alice Wallenberg Foundation, and were carried out to fulfil the requirements to pass the WASP Graduate School course Scalable Data Science and Distributed Machine Learning - ScaDaMaLe-WASP-UU-2024. Computing infrastructure for learning was supported by Databricks Inc.'s Community Edition. The course was industrially sponsored by Jim Dowling of Logical Clocks AB, Stockholm, Sweden, Reza Zadeh of Matroid Inc., Palo Alto, California, USA, and Andreas Hellander & Salman Toor of Scaleout Systems AB, Uppsala, Sweden.