Ten years ago, the term “Data Science” registered only 7% of its current level in Google Trends. It was almost non-existent in the news, and only timidly gaining ground in the corporate narrative. One has to go back to 2010 to find a first comprehensive definition of the nascent discipline of data science in the media. The Economist …
A few Recommendations for a Data Scientist who wants to get started in Recommender Systems
Juan Arévalo
As a Data Scientist, you are expected to be able to build all sorts of data products. These may involve simple yet highly valuable business trends extracted through data querying and cleansing and, sometimes, more sophisticated Machine Learning algorithms for prediction, classification, or even recommendation. However, the cold start in a specific topic may be tough for Data Scientists, especially for …
The most important developments in data science of 2018
Jairo Mejía
The year 2018 has been one of the most important years in terms of breakthroughs in Machine Learning technologies. It has also been important vis à vis the debate on how to move forward beyond pure optimization, into a more advanced discipline of Data Science and real applied Artificial Intelligence. Ranging from a realization of the challenges of the industrialization …
The Best Online Courses for Data Scientists
Jairo Mejía and Joan Llop
The Data Scientist profile is one of the most in-demand profiles in the labor market. At the same time, the Data Science toolset is becoming more diverse and the skills required are broader. Luckily, for those trying to take their first steps into Data Science or to master new techniques, there are many excellent online courses. After publishing a list of recommended …
Financial Text Classification: an Analysis of different Methods for Word Embedding
Pau Batlle
In this post, I would like to revisit the focus of the work I undertook during the last two summers as an intern at BBVA Data & Analytics. This is a technical summary of the lessons and insights obtained from working with word embeddings for the categorization of short sentences describing financial transactions. Text classification is at the center of many …
Bayesian Deep Learning meets Google Cloud for a better forecasting engine at BBVA
Jairo Mejía
BBVA Data & Analytics has just published a white paper in partnership with Google Cloud that showcases an end-to-end solution for deploying a Deep Learning model for time series forecasting to production. The model incorporates the uncertainty of its predictions, which, we believe, will have a powerful impact on improving the customer experience of products such as BBVA’s expected expense tracker …
How Data-Driven Initiatives can Save Young Lives
Jairo Mejía and Joan Llop
In one of the largest cities on the planet, a lot of things happen every day. Mexico City is one of the largest megalopolises in the world, and adequate sensorization could make it an ideal laboratory for the use of data for the good of its citizens. Furthermore, the city could encourage public participation in data-driven initiatives. One such …
Self-Service Performance Tuning for Hive
Angel Puerto
Hive is a very powerful data warehouse framework based on Apache Hadoop. The two together provide stable storage and processing capabilities for big data analysis. In this article, we will analyze how to monitor metrics and how to tune and optimize the workflow in this environment with Dr. Elephant. Hive is designed to enable easy data summarization, ad-hoc queries, and big data analysis. …
Building Open Source Software in a Large Corporation
Santiago Basaldúa
The world runs on data. However, without the dynamic, accessible and adaptable nature of OSS (Open Source Software), the pace of exploitation of data-rich fields would be painfully slow. Imagine a world of Data Science without Linux, Python, Anaconda or TensorFlow, just to cite some relevant examples of Open Source Software. During the last few years, the trend of using …
Improving Predictions in Deep Learning by Modelling Uncertainty
Axel Brando
At BBVA we have been working for some time to leverage our clients' transactional data and Deep Learning models to offer a personalized and meaningful digital banking experience. Our ability to foresee recurrent income and expenses in an account is unique in the sector. This kind of forecasting helps customers plan budgets, act upon a financial event, or avoid overdrafts. All …