I'm Liam, an NYC-based Data Scientist focused on NLP and computational semantics. Outside of work, I have a passion for music, particularly jazz, and a love of painting. I enjoy combining my analytical skills with my creative side, which lets me approach problem-solving from a fresh perspective.
- Fake News Detector
- Web app built with Flask and deployed on Heroku, hosting a fine-tuned BERT model. The model was trained to > 99.9% accuracy on this Kaggle kernel using this Kaggle dataset. The project repository can be found here.
- NYC Community District Needs
- Tableau visualization and SQL queries exploring the annual requests made by New York City community districts.
- Text Augmentation Toolkit (TATK)
- Created a Python package of lightweight yet powerful text augmentation techniques, such as back translation and word-vector-based synonym replacement.
- CNN For Sentence Classification
- Implemented a convolutional neural network for sentence classification in PyTorch and wrapped it in a scikit-learn estimator, so that scikit-learn's built-in tools for cross-validation and hyperparameter tuning can be used with it.
- Recursive Neural Network
- Master's thesis: implemented the lesser-known recursive neural network in PyTorch to evaluate how well it represents syntactic structure in word and phrase vectors.
Data Analytics: NumPy, Pandas, SciPy
Machine and Deep Learning Frameworks: scikit-learn, TensorFlow, Keras, PyTorch, Hugging Face
Natural Language Processing: spaCy, NLTK, BERT
Development: Python, Flask, Django, Git, Heroku
Data Viz: Tableau, Power BI, Matplotlib, Seaborn
Cloud Services: AWS, Kaggle Kernels
March 2022 - June 2023
- Collaborated with cross-functional teams at Toyota, including Customer Data Science, Guest Experience and Retention, and Marketing Data Science, to develop data pipelines, models, and analyses driving key business decisions.
- Led the refinement and deployment of sentiment analysis models for freeform survey response text, resulting in a 16% increase in accuracy in the production environment.
- Created interactive dashboards for customer survey data, integrating text analytics solutions such as keyword extraction, topic modeling, and sentiment analysis, enabling stakeholders to gain actionable insights at a glance.
- Developed a lead conversion prediction model that achieved 94% accuracy, leading to improved marketing targeting and increased conversion rates.
May 2018 - April 2020
- Spearheaded data science efforts for the Trends product, driving automatic tagging and categorization of customer feedback across diverse channels, including email, chat, social, reviews, and surveys.
- Designed and implemented the machine learning backend, which processed over 20 million messages, using Flask, Google Cloud Platform, and AWS for efficient and scalable data processing.
- Developed and maintained inferential models for text classification using neural networks, logistic regression, and ensemble methods for 80+ companies across 15 industries, including Gap, Abercrombie, and Everlane.
- Mentored junior Data Scientists and supervised 2 interns, fostering a collaborative and innovative environment.