This portfolio contains my skill set, a list of courses I have taken, and the data analysis projects I have worked on during my study and exploration of the Data Science field. All projects are written in Python using Jupyter Notebooks. Feel free to contact me to discuss collaboration or new opportunities, or if you have any questions: [email protected]
- Programming Languages:
  - Python: pandas, NumPy, scikit-learn (sklearn), Dask, XGBoost, LightGBM, nltk, sqlalchemy
  - R (basics): BSDA, lawstat
- Data Visualization: matplotlib, seaborn, plotly, Bokeh, ggplot
- Machine Learning: Regression, Classification, Clustering, NLP, Time Series Analysis
- Big Data: Hadoop, Impala, Hive, Hue, Kafka, Redash
- Databases: MySQL, SQLite, PostgreSQL, ClickHouse
- BI: Redash, Google Data Studio, PowerBI
- Technologies: Django, Flask, HTML, CSS, Bootstrap
- Tools: Git, Jupyter, Atom, PyCharm, AWS, Digital Ocean, Slack, Jira, Confluence, Trello
- OS: Linux (Ubuntu), Windows, macOS
- Project Management: CRISP-DM
- Business Analytics. Specialization (5 courses) by Wharton School, University of Pennsylvania (in progress)
- Machine Learning and Data Analysis. Specialization (6 courses) by MIPT & Yandex (certificate)
- ML Course Open by OpenDataScience. Rated 26th of 1800+ participants (rating)
- Data Analyst with Python. Career Track by DataCamp (certificate)
- A/B-testing. Intense Course by aic.academy
- Executive Data Science. Specialization (5 courses) by Johns Hopkins University (certificate)
- Structuring Machine Learning Projects. Course by deeplearning.ai (certificate)
- TweetLikeTrump (in progress, repository) - a Flask-based web application that measures the similarity between your written text and Donald Trump's. Just for fun and NLP experiments.
  #Python #NLP #Flask #ProjectManagement
- LibDS is a curated list of Data Science books. For now it is just a repository, but I am developing a "library-like" website for it using the Django framework.
  #Python #Django #ProjectManagement
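
As a rough illustration of the text-similarity idea behind TweetLikeTrump, here is a minimal sketch of cosine similarity between two texts. It is not the application's actual code: it uses plain word counts instead of TF-IDF weights, relies only on the standard library, and the names `tokenize` and `cosine_similarity` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))

    # Dot product over the words the two texts share.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[word] * counts_b[word] for word in shared)

    # Euclidean norms of the two count vectors.
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # An empty text has no direction, so treat its similarity as zero.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

A score close to 1 means the two texts use nearly the same words in similar proportions; a score of 0 means they share no words at all.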
- Catch me if you can: Intruder Detection (nbviewer). Capstone project and in-class competition for the Machine Learning & Data Analysis Specialization by MIPT & Yandex on Coursera. The goal is to determine, from the sequence of web pages visited during a session, whether the session belongs to a specific user (Alice) or to somebody else.
  #Python #Classification #LogisticRegression #TF-IDF #ROC-AUC
- The forecasting of the average wage in Russia (nbviewer). A time series analysis task.
  #Python #TimeSeries #DataVisualization
- Mobile Games AB-testing with Cookie Cats (nbviewer). As players progress through the levels of the game, they occasionally encounter gates that force them to wait or make an in-app purchase to continue. The goal of this project is to decide where to place the gate, based on its impact on player retention.
  #Python #DataAnalysis #AB-testing
- TweetLikeTrump (in progress, nbviewer). NLP experiments for the TweetLikeTrump pet project.
  #Python #NLP #TF-IDF #CosineSimilarity #OneClassSVM #word2vec
- Risk and Returns: The Sharpe Ratio (nbviewer). Using pandas to calculate and compare the profitability and risk of different investments with the Sharpe Ratio.
  #Python #pandas #DataAnalysis
- Exploring the Bitcoin cryptocurrency market (nbviewer). Exploring the market capitalization of Bitcoin and other cryptocurrencies.
  #Python #DataAnalysis #DataVisualization
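
For readers unfamiliar with the metric used in the Sharpe Ratio notebook listed above, here is a minimal sketch of the calculation. The notebook itself uses pandas; this version is written in plain Python only to stay dependency-free. The function name `sharpe_ratio`, its parameters, and the default of 252 trading periods per year are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def sharpe_ratio(returns, benchmark_returns, periods_per_year=252):
    """Annualized Sharpe ratio of an investment relative to a benchmark.

    Both arguments are equal-length sequences of per-period returns,
    for example daily returns expressed as fractions (0.01 == 1%).
    """
    if len(returns) != len(benchmark_returns):
        raise ValueError("return series must have the same length")

    # Excess return: what the investment earned beyond the benchmark.
    excess = [r - b for r, b in zip(returns, benchmark_returns)]

    # Average excess return per unit of its (sample) volatility,
    # scaled from per-period to yearly terms.
    return mean(excess) / stdev(excess) * sqrt(periods_per_year)
```

A higher value means more excess return per unit of risk, which is what makes the ratio useful for comparing investments with different volatility.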
- 3200tweets is a simple Python script for grabbing tweets using the Twitter API and the Tweepy package.
  #Python #DataCollection #WebScraping #TwitterAPI #Tweepy
- Kaggle Competitions Contributor - my profile and the list of competitions.
- Dask: when Pandas fails is an introductory article about the Dask library (in Russian).
  #Article #Python #Dask