I'm a Data Scientist with experience as a Data Engineer and Software Engineer. My interest is in finding and using data in a way that actually helps organizations make better decisions. My proudest achievements are:
- 🔧 Built a startup's data architecture from scratch using Modern Data Stack tools (BigQuery, DBT, Airflow, and Airbyte). The architecture and the rationale for the chosen tools are documented here.
- 📂 Created a set of data modeling layers following DBT best practices, which kept our company's data organized and documented. My Software Engineering background motivated me to bring software development practices into the data stack: CI/CD, unit testing, integration testing, and light but robust data governance that lets every collaborator make changes quickly while adhering to established conventions. See my article on DBT pre-commit here.
- 📊 Collaborated closely with every area of the business to identify data needs and deliver impactful solutions. Completed projects include patient profiling, churn forecasting, item stock forecasting (time series), A/B testing, hypothesis testing, and key metric definitions.
- 📚 Developed several side projects (with more in progress) fueled by my interest in specific topics or tools I wanted to learn. See the list of projects below:

| Project | Description | Stack | Repository |
|---|---|---|---|
| LinkedIn Company Enricher | Attempts to infer the quality of an organization's culture from employee role rotation, i.e., how long do employees stay at the company, and is this an indicator of how good the company is to work for? | Python, BigQuery, Google Cloud Storage, DBT, Google Data Studio | https://github.com/ignaciovi/job-hunt-company-analyser |
| Song Genre Prediction | Attempts to predict the genre of a song from Spotify's audio analysis features | Python | https://github.com/ignaciovi/song-genre-prediction |
| Tweet Geolocation Spain | Proposes a framework for estimating a Twitter user's location based solely on the text of their tweets (using Spanish tweets) | Python | https://github.com/ignaciovi/tweet-geolocation-spain |
| Wikipedia Newsletter | Built to learn AWS. Luigi, running on EC2, retrieved Wikipedia events and stored them in S3 and RDS | Python, Luigi, AWS (EC2, S3) | https://github.com/ignaciovi/wikipedia-newsletter |
| Songbinator | Takes a list of artists and creates a Spotify playlist based on similar ones. Built to learn React + Flask and IBM Cloud | React, Python, Flask, IBM Cloud | https://github.com/ignaciovi/songbinator |

- Guitar player and singer
- Certified advanced scuba diver