360-in-525-2019-1: Introduction to Privacy-Aware Decisions

This is a full-day workshop (1 hp) on Privacy-Aware Decisions, held on Tuesday, April 23, 2019. The workshop includes introductory tutorials and research talks/seminars, and culminates in an open discussion led by domain experts in privacy, security and geospatial analytics.

To attend the workshop, you must sign up at https://simpleeventsignup.com/event/146419.

Who is this workshop for? It is for PhD students at Uppsala University and for researchers and developers from various sectors of the data industry who want to up-skill or re-skill in privacy-aware and GDPR-compliant data science processes, including machine learning. Students from other universities in Sweden may also take it for possibly transferable credits. Attendance and completion of assignments can lead to an industrially-endorsed certificate in Privacy-Aware Decisions from the Department of Mathematics, Uppsala University. See Course Details of the Workshop below for more information.

Support: This workshop is partly supported by the Centre for Interdisciplinary Mathematics jointly with the Department of Mathematics and Combient Competence Centre for Data Engineering Sciences via Combient-MIX.

Schedule of the Workshop

Abstracts of Tutorials and Seminars

  1. Tutorial on Privacy-preserving Data Mining: Data mining via structured querying and statistical machine learning or AI algorithms may have enormous benefits for science, business and society. At the same time, the large amount of data collected and analyzed may reveal a lot about our private lives, from our habits or personality traits to personal information that we may prefer to keep private. To motivate privacy-enhancing techniques, we will present examples of attacks on privacy that were possible due to naïve anonymization of medical records, AOL internet search queries, Netflix movie ratings and NYC taxi geolocation data. We will review some well-known methods for privacy protection (such as k-anonymity or differential privacy) to prevent attribute and identity disclosure. Then, we will discuss the trade-offs between the risk of disclosure and the utility loss in different contexts such as databases, online social networks and geo-located data.

  2. GDPR-compliant Learning in Apache Spark: Increasing concern about privacy in data handling has led the EU to legislate that data holders must provide proper protection for individuals in data sets, using a combination of best practices and anonymisation techniques. This talk will look at Apache Spark implementations of existing methods for scalable anonymisation using pseudo-anonymisation and k-anonymity, detailing a GDPR-compliant framework for protecting an individual’s data. Further, the current applications and uses of differential privacy techniques will be discussed, where privacy is protected by obfuscating each individual’s relevance to a data set, in particular in the context of ML and Deep Learning applications. In parallel, we will discuss the statistical consequences of differentially private mechanisms, quantifying the information lost and the privacy protection gained from privatisation of the data and estimators.

  3. Statistic Preserving Sanitizers for Markov Models of Co-trajectories: With today’s easy access to location-aware devices, such as mobile phones, huge amounts of location-based data are generated. Analysis of this data could contribute a lot of information, for example to city planning or driving directions. However, location data contains a lot of private information that we may not want to reveal. The goal is to have tools for analysis that also allow us to keep sensitive parts of the data private. In relation to this, we introduce the concept of sanitizers and statistic-preserving sanitizers. We give an example in which the analysis is done using a Markov chain model for the data and a sanitizer called SwapMob.

  4. Simple Complex Phenomenon of Urban Parking: Parking is a core phenomenon of the modern urban transportation system. For travelers, parking search time and parking price are the major factors that define the choice of transportation mode and, in the longer run, the decision on car ownership. For city transportation planners and managers, parking prices and constraints are the easiest tools for affecting travelers’ behavior and, eventually, urban traffic. I investigate several analytical and simulation models of urban parking dynamics and propose easy and practical algorithms for estimating parking search time and parking prices from standard high-resolution urban GIS data and, if necessary, simple field studies. Based on the model study, I propose parking policies that improve the state of urban transportation and discuss the ability of society to implement these policies.

  5. Privacy-preserving federated machine learning: Federated privacy-preserving machine learning allows parties that are not able to share data with each other for privacy reasons, for competitive reasons, or for practical reasons (such as the size of the data) to form machine learning alliances with the objective of training a joint global model without moving or disclosing any local private data. As such, federated learning algorithms hold great promise to become fundamental components of privacy-aware artificial intelligence. In this talk we will give an overview of approaches to federated learning, and discuss the research challenges involved in building a production-grade platform and runtime to support such algorithms.

  6. Space and place - the grammar of geography: A talk about geography, core features in spatial analysis, geographical scale, opportunities and ethics. We will talk about definitions of neighbourhoods, the implications of spatial aggregation, and MAUP (Modifiable Areal Unit Problem). The interaction between distance, cost, and time decay - fuzzy definition of neighbourhood. Neighbourhood and scale affect measurements such as segregation/integration. Economic sorting of landscapes - crash intro to urban economics and geographical theories. Intro to time geography - space-time and activity space. Spatial mismatch hypothesis - Kain’s negative feedback loop.

  7. Freedom in the Digital World: The digital world is a new world, with as many opportunities for freedom as for alienation. In this talk, I will go over those challenges, with many concrete examples. In particular, I will emphasize the fundamental tension between the softness of software and the super rigid walls of the digital world. I’ll conclude with a conjecture: “there must be software design principles that encourage freedom in the digital world”.

Course Details of the Workshop

Candidate thesis students or Master’s students at Uppsala University must contact their study counsellor to see whether it is possible to take the course as part of their candidate thesis preparation or as a “Selected Topics Course” in their department, respectively.

If you are working in industry and want to re-skill in privacy-aware data science processes, then you can join the workshop (free of cost) and receive a certificate from the Department of Mathematics upon successful completion. You need to choose the appropriate option when signing up for the event.

If you are a PhD student at Uppsala University (or another Swedish university) and are interested in obtaining 1 hp of PhD course credits (which may be transferable for students at other universities in Sweden) and/or an industrially-endorsed certificate of completion, then you need to choose the appropriate option when signing up for the event.

Physical presence at all the planned events of the day is mandatory for the course credits and/or certificate of completion.

What is expected in a 1 hp Course at Uppsala University?

Learning concepts in depth and familiarizing oneself with new syntax requires completing the watching/reading assignments and the homework exercises. At Uppsala University, 1 hp corresponds to about 25 hours of work, so there is plenty of time, outside the 6-8 hours of face-to-face lab/lecture meetings, for deepening one’s understanding.

Workshop Coorganisers and Coordinator

This workshop is coorganised by:

and coordinated by Raazesh Sainudiin, Associate Professor of Mathematics with Specialisation in Data Science at the Department of Mathematics, Uppsala University in Uppsala and Consulting Principal Data Scientist at Combient AB in Stockholm (coordinator’s CV and contact), with the generous support of invited speakers and discussion panel members.

Dinner for Speakers and Discussion Panelists

We will be going for dinner at Aaltos, Italian Grill and Garden, located at Sysslomansgatan 14, 75311 Uppsala, starting at 18:30. It is about 30 minutes on foot along the river or about 20 minutes by bus.