Research Overview

Research at LaMaStEx is at the interdisciplinary interface of computing, mathematics and statistics. We use computer arithmetic and combinatorial data-structures through custom-built mathematical and statistical models to rigorously solve numerical optimization and simulation problems that arise in statistical decision-making from real-world data.

Available Student Projects

Current student research projects involve a mixture of skills from mathematics, computer science, statistics, data engineering and data science. Finding a suitable project of mutual interest, given the skills you have or can acquire, therefore requires some discussion.

Here is a list of student project proposals of current interest. It is meant as a guide for a student or a small group of students who are interested in being guided by Raazesh Sainudiin for their individual or group project in a course of study or research at Uppsala University. These projects may help students decide whether they would like to do a Bachelor’s thesis or Master’s thesis (exjobb), or pursue other research and development possibilities with Raazesh Sainudiin at the Department of Mathematics, Uppsala University.

The exact nature and pathway towards a project will depend on your current knowledge and skills and your willingness to acquire them. A subset of the following skills is needed for each project. We can make a self-study pathway specifically for you and the project.

  1. Software Skills: docker, docker-compose, git, sbt, maven, cmake/gmake/make, etc.
  2. Programming Languages: Python, Scala, Haskell, C, C++, R, Rust, JavaScript, etc.
  3. Mathematical and Statistical Background: Probability, Graphs, Algebra, Analysis, Optimization, ML models, etc.
  4. Computing Skills: system administration, distributed computing, high-performance computing, etc.

GDELT Project

The GDELT project has many possibilities depending on your interests and abilities.

  • Applied Digital Humanities: Learn basic SQL and GQL to be able to work in a delta lake house with GDELT data with the aim of providing useful insights to researchers in digital humanities. See the following libraries to appreciate some of the possibilities:
    • First familiarize yourself with spark-gdelt and see some example analytics.
    • GDELT PSL project: Learn PSL and apply it to a GDELT data extract.
    • GDELT GQL project: Learn Graph Query Language and apply it to GDELT data to provide insights for digital humanities researchers (may want to work closely with such researchers).
    • GDELT Interact project: Develop an interactive UX for an analyst who wants to explore the details of SQL and GQL queries. This project requires experience in full-stack development and is suited to a team of 3 to 4 students.
    • Other projects based on the GDELT data.

Financial Streams Project

Using multiple time series of financial stock market data, build further on the trend-calculus streams derived from them. The development can be along predictive fronts using appropriate ML models, or it can involve estimators obtained by extending these examples. The general idea is to identify recurrent multivariate signals of interest in the trends of historical financial data.

Meme Evolution in Twitterverse Project

There are many possibilities here using Twitter experiments with Project MEP. The projects can focus

Scalable Density Estimators Project

This project is suitable for a student of mathematics with skills in functional programming over a distributed computing architecture (or one who can self-learn such skills). To appreciate the starting point of this advanced project, read this paper and try out this Scala Spark library, with a view towards implementing the algorithms in this paper using the Scala Spark library.

Notebook-Format-Agnostic Data Engineering Science Project

This is for a small group of 2-4 engineering students who collectively have the skills, including Haskell, to use pinot and docker-compose or Kubernetes to provision Apache Spark clusters in its modern ecosystem with Zeppelin and Jupyter notebook servers. The objective of this project is to allow for notebook-format-agnostic data science and analytics. Experience with cloud computing across on-premise and public clouds is a prerequisite.

Other Projects

There are several other project possibilities. However, these depend on current research interests and on coming up with a reasonable plan. Please go through Selected Publications by Field to propose other project ideas spanning population genetics, computer-aided proofs, mobility science, etc.