Hi! I’m a data scientist / neuroscientist / cat lover interested in the complex dynamics of the world we live in and how we can use data and science to solve humanity’s big and small challenges. One idea at a time.

Fun fact: I started university at 15 and got my PhD at 25. While I wouldn’t recommend it, this has allowed me to explore many of my interests throughout my life and to develop a diverse set of skills 🙃 You can find out more about my professional background on LinkedIn.

Posts

This is a selection of hobby projects I have worked on in the past:

Graph Convolutional Networks for Fraud Detection of Bitcoin Transactions

Detecting fraudulent transactions is essential to keeping financial systems trustworthy. Traditionally, fraud detection is done by analyzing and vetting carefully engineered features of individual transactions or of the individual entities involved (companies, accounts, individuals). Here I illustrate an end-to-end approach to node classification with graph neural networks to identify suspicious transactions. I compare my results on the Elliptic dataset with the available literature and propose further ideas to explore in the future.

Dec 15, 2019
COVID-19 Germany local incidence and ICU occupancy (in German)

Some time ago, at the beginning of the COVID-19 pandemic, I decided to create a Twitter bot to automatically gather the latest data for Germany and share it on Twitter. I also created and deployed a dashboard app on Heroku (links below). I was motivated by the lack, at the time, of easily accessible incidence and ICU occupancy data at the local and city level. Back then, only data aggregated by _Bundesland_ and at the national level were available through the official Robert Koch Institute (RKI) website. Since then, both the RKI and DIVI (for ICU data) have improved the granularity of their dashboard data. I still maintain both the Twitter bot and the dashboard, as they run with little overhead. Link to the dashboard

Jan 1, 2021
Fitbit activity and sleep data: a time-series analysis with Generalized Additive Models

This is a time-series analysis of one year of activity and sleep data from a Fitbit user. I use these data to forecast the user's activity and sleep for an additional year with Generalized Additive Models.

Apr 1, 2018
Personalized Medicine Kaggle Competition

This was my approach to the Personalized Medicine: Redefining Cancer Treatment Kaggle competition. The goal of the competition was to create a machine learning algorithm that can classify the genetic variations present in cancer cells.

Oct 7, 2017
Exploratory analysis of Medicare drug cost data 2011-2015

Health care systems worldwide are under pressure due to the high costs associated with disease. In this post, I analyze Medicare data from the USA. Furthermore, I use an open drug-disease database to cluster the costs by disease. I identify the most expensive diseases (mostly chronic diseases such as diabetes) and the most expensive medicines.

Feb 6, 2017
Visualizing parallel event series in Python

In this post, I will use Python to visualize two different series of events, plotting them on top of each other to gain insights from time series data.

Feb 6, 2017
Simulating the revenue of a product with Monte-Carlo random walks

I take a look at how we can model the future revenue of a product by making certain assumptions and running a Monte Carlo simulation.

Oct 15, 2016
