Research Overview
Research at LaMaStEx is at the interdisciplinary interface of computing, mathematics and statistics. We use computer arithmetic and combinatorial data-structures through custom-built mathematical and statistical models to rigorously solve numerical optimization and simulation problems that arise in statistical decision-making from real-world data.
- Curriculum Vitae for a field-specific picture
- The full list of peer-reviewed publications
- Research students supervised at LaMaStEx
- Mathematical Statistical Software
Available Student Projects
Current student research projects involve a mixture of skills from mathematics, computer science, statistics, data engineering and data science. Therefore, finding a suitable project of mutual interest, given the skillset you can acquire, requires some discussion.
Here is a list of student project proposals of current interest. It is meant as a guide for a student or a small group of students who are interested in being guided by Raazesh Sainudiin for their individual or group project in a course of study or research at Uppsala University. These projects may help students decide if they would like to do a Bachelor’s thesis or Master’s thesis (exjobb), or pursue other research and development possibilities with Raazesh Sainudiin at the Department of Mathematics, Uppsala University.
The exact nature and pathway towards a project will depend on your current knowledge and skills and your willingness to acquire them. A subset of the following skills is needed for each project. We can make a self-study pathway specifically for you and the project.
- Software Skills: docker, docker-compose, git, sbt, maven, c/g/make, etc.
- Programming Languages: Python, Scala, Haskell, C, C++, R, Rust, JavaScript, etc.
- Mathematical and Statistical Background: Probability, Graphs, Algebra, Analysis, Optimization, ML models, etc.
- Computing Skills: system administration, distributed computing, high-performance computing, etc.
GDELT Project
The GDELT project has many possibilities depending on your interests and abilities.
- Applied Digital Humanities: Learn basic SQL and GQL to be able to work in a delta lakehouse with GDELT data, with the aim of providing useful insights to researchers in digital humanities. See the following libraries to appreciate some of the possibilities:
- First familiarize yourself with spark-gdelt and see some example analytics.
- GDELT PSL project: Learn PSL and apply it to a GDELT data extract.
- GDELT GQL project: Learn Graph Query Language and apply it to GDELT data to provide insights for digital humanities researchers (may want to work closely with such researchers).
- GDELT Interact project: Develop an interactive UX for an analyst interested in interacting with details of SQL and GQL queries. This project requires experience in full-stack development and is meant for a team of 3 to 4 students.
- Other projects based on the GDELT data.
Financial Streams Project
Using multiple time series of financial stock market data, build further on the trend-calculus streams computed from them. The development can be along predictive fronts using appropriate ML models, or it can involve estimators obtained by extending these examples. The general idea would be to identify recurrent multivariate signals of interest in the trends of historical financial data.
Meme Evolution in Twitterverse Project
There are many possibilities here using Twitter experiments with Project MEP. A project can
- focus purely on the data engineering side involving terraform.io by extending the infrastructure as code work started here, or
- focus on the analytic and mathematical modeling side by comparing the polarised state of the Swedish political Twitterverse from the last Swedish election to a new collection that can be started this semester, or
- refine and extend the interactive visual investigations developed in twitterVisualizations, or
- extend distributed algorithms to characterize the evolution of ideological networks in the Twitterverse (see this statistical application for example).
Scalable Density Estimators Project
This project is suitable for a student of mathematics with skills in functional programming over a distributed computing architecture (or one who can self-learn such skills). To appreciate the starting point of this advanced project, read this paper and try out this Scala Spark library, with a view towards implementing the algorithms in this paper using the Scala Spark library.
Notebook-Format-Agnostic Data Engineering Science Project
This is for a small group of 2-4 engineering students who collectively have skills spanning Haskell, to be able to use pinot, and docker-compose or Kubernetes, to provision Apache Spark clusters in their modern ecosystem with Zeppelin and Jupyter notebook servers. The objective of this project is to allow for notebook-format-agnostic data science and analytics. Cloud computing across on-premise and public clouds is a prerequisite.
Other Projects
There are several other project possibilities. However, these depend on current research interests and on coming up with a reasonable plan. Please go through Selected Publications by Field to propose other project ideas spanning population genetics, computer-aided proofs, mobility science, etc.