What we saw (and what we showed) at KDD 2019

Jose Antonio Rodriguez Serrano and Axel Brando

d&a blog

One of our first and most successful applications of machine learning to a retail financial tool in the BBVA app is the feature that shows customers a forecast of their recurring expenses and income for the coming month. Knowing on which day a car insurance charge or a recurring transfer will arrive, and for what amount, is key to managing your accounts and monitoring your financial health.

A team from BBVA AI Factory, of which BBVA Data & Analytics is part, is now investigating how to enrich this forecast with more information for customers, including forecasts of unusual movements. Deep Neural Networks are a strong approach to forecasting problems, but mainstream applications of this technology focus on producing point estimates, which is a clear limitation in scenarios where knowing the uncertainty of a prediction is crucial. In a short article presented at the Workshop on “Anomaly Detection in Finance”, part of the KDD 2019 conference held last August in Anchorage (USA), our colleagues José Rodríguez and Axel Brando, in collaboration with the University of Barcelona, propose the use of Deep Neural Networks that yield distributions rather than point estimates, allowing us to detect unusual transactions more efficiently.
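To make the difference between a point estimate and a distribution concrete, here is a minimal sketch. It is not the model from the article: it simply assumes that a network returns a mean and a standard deviation for each forecast (a Gaussian assumption of ours) and flags an observed amount as unusual when it falls too many predicted standard deviations away from the predicted mean. The function names, the threshold and the example amounts are all hypothetical.

```python
import math


def gaussian_nll(observed, mean, std):
    """Negative log-likelihood of `observed` under a Gaussian N(mean, std^2).

    This is the kind of loss commonly used to train a network that outputs
    a mean and a standard deviation instead of a single value.
    """
    var = std ** 2
    return 0.5 * math.log(2 * math.pi * var) + (observed - mean) ** 2 / (2 * var)


def is_unusual(observed, mean, std, z_threshold=3.0):
    """Flag an observation lying more than `z_threshold` predicted
    standard deviations away from the predicted mean."""
    return abs(observed - mean) / std > z_threshold


# Hypothetical forecasts: (label, predicted mean, predicted std, observed amount).
# Both have the same point estimate (50.0) and the same observed amount (58.0);
# only the predicted uncertainty differs.
forecasts = [
    ("car insurance", 50.0, 1.0, 58.0),   # very regular expense
    ("groceries", 50.0, 20.0, 58.0),      # highly variable expense
]

for label, mean, std, observed in forecasts:
    flag = is_unusual(observed, mean, std)
    nll = gaussian_nll(observed, mean, std)
    print(f"{label}: unusual={flag}, nll={nll:.2f}")
```

With a point estimate alone, both movements deviate by the same amount from the forecast. With a predicted standard deviation, only the deviation on the very regular expense is flagged, which is the intuition behind using distributions for detecting unusual transactions.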

This work is still in an exploratory phase and builds on a previous model for Uncertainty Estimates in Deep Regression Networks (published at ECML/PKDD 2018). This line of work continues with a paper very recently accepted at NeurIPS.

Beyond the presentation of this short article, this year’s “Knowledge Discovery and Data Mining” conference (KDD 2019) was packed with new applications based on data science. Here are some highlights of what we saw, admittedly with a bias towards applications to real-world systems and products, and towards applications in the banking and financial industries.

A conference on… Data Science

Many data scientists feel that most of the top ML/AI conferences rated A* do not speak the language of “Data Science”. KDD is an exception: it explicitly describes itself as targeting the “Data Science” community. Its differentiators are an applied data science track, a new track of invited Data Science speakers, and hands-on tutorials. The technical bar is still very high, but the problems to solve come from the real world. All articles of the conference are available here. Some highlights of the main conference were:

Workshop on Anomaly Detection in Finance

In addition to the main conference, the workshop on Anomaly Detection in Finance had representation from Data Science teams in banks and other financial institutions. For example, Rabobank, in collaboration with University of Eindhoven, presented a visualization prototype to help analysts spot fraudulent transactions (using SHAP feature importances and MDS projections), while Deutsche Bundesbank, PwC and the University of St. Gallen proposed a method to detect fraud in corporate operations by detecting anomalies in ERP transactions (using adversarial autoencoders).

Capital One co-organized the workshop and presented a method to detect changepoints in time series, and the MIT-IBM Research Centre presented an algorithm to detect money laundering in financial graphs (related article). They have also released a public graph dataset of Bitcoin transactions, the Elliptic Dataset, and mentioned that they were interested in piloting this algorithm with banks. Uber was also present, with articles on fake ridership detection and fake account detection. After all, Uber processes financial transactions.