17 Feb
12:00 - 13:00

UM Data Science Research Seminar

The UM Data Science Research Seminar Series consists of monthly sessions organised by the Institute of Data Science, on behalf of the UM Data Science Community, in collaboration with various departments within UM. The aim is to bring together data scientists from Maastricht University to discuss breakthroughs and research topics related to Data Science.

This session is organised in collaboration with the Institute of Data Science.

Schedule

Presentation 1

Time: 12:00 - 12:30

Speaker: Michel Dumontier

Title: "Evaluating FAIRness"

Abstract
The FAIR (Findable, Accessible, Interoperable, Reusable) Guiding Principles light a path towards improving the discovery and reuse of digital objects (data, documents, software, web services, etc.) by machines. Machine reusability is a crucial strategic component in building robust digital infrastructure that strengthens scholarship and opens new pathways for innovation on a truly global scale.

However, as the FAIR principles do not specify any particular implementation, it falls to communities to devise, standardise and implement technical specifications to improve the ‘FAIRness’ of digital assets. In this talk, I will focus on the history and state of the art in FAIRness assessment, including manual, semi-automated and fully automated approaches, and on how communities can use these to incrementally and realistically improve the FAIRness of their resources.

Presentation 2

Time: 12:30 - 13:00

Speaker: Chang Sun

Title: "Generating Privacy-Preserving Synthetic Tabular Data using Conditional GANs"

Abstract
Large amounts of personal data are highly valuable to the research and scientific community. However, these data are often inaccessible or require a lengthy request process because of privacy concerns and legal restrictions. Synthetic data has been studied and proposed as a promising alternative. However, generating realistic and privacy-preserving synthetic data remains challenging.

In this talk, Chang Sun will give an overview of synthetic data generation and present her work on generating realistic and privacy-preserving synthetic tabular data using conditional GAN (Generative Adversarial Network) models combined with differential privacy techniques. Her talk will cover the remaining challenges in the field, such as generating data with mixed variable types (categorical or continuous), imbalanced class variables, and mode collapse, along with her proposed approaches. She will also discuss adding differential privacy to synthetic data generators to protect individuals’ privacy. Finally, she will discuss the legal and ethical aspects of generating and using synthetic data, and future directions in this field.
