Hi there,
I’m currently learning Python with a focus on data science: statistical analysis and machine learning.
My completed and current projects are listed in the tables below, with links to the corresponding datasets and scripts. I hope they will be useful to other beginners because they cover most of the common problems and solutions, especially business-related ones (clustering, forecasting, modelling, etc.).
My LinkedIn profile is available here.
Analytics and ML
Project | Description | Year |
---|---|---|
Music service analysis | Analysis of a music streaming service's user preferences by city and genre. Key points: basic Python with Pandas, functions, loops, data structures, slicing. | 2021 |
Research of personal loan borrowers | Detailed data pre-processing (including lemmatization via pymystem3) and identification of dependencies between personal installment loan repayment and the borrower's social characteristics (family status, children, income). Key points: data pre-processing, data types, duplicates, gaps, grouping, descriptive statistics, lemmatization. | 2021 |
Real estate market analysis | Modeling of real estate pricing in St. Petersburg depending on location, area, floor and other factors. Key points: matplotlib, plot, box plot, scatter plot, query. | 2021 |
Analysis of mobile tariffs | Comparison of 2 mobile tariffs: which one generates more revenue depending on customers' behavior and region. Key points: scipy, hypothesis testing, t-test, p-value. | 2021 |
Game market analysis | Analysis of game sales from 1980 to 2016, identifying the best-suited genres and platforms for the NA, EU and JP markets. Key points: matplotlib, seaborn. | 2021 |
Best suited mobile tariff | Machine learning model that predicts the best-suited mobile tariff for a customer. Key points: data split (train, valid, test), classification models, decision tree, random forest, logistic regression, accuracy score. | 2022 |
Modeling of customers' attrition | Supervised ML models (decision tree, random forest and logistic regression) of consumer business attrition, using target class weighting, upsampling and downsampling to reach a satisfactory F1 score. Key points: class imbalance, class weighting, upsampling, downsampling, ROC curve, ROC-AUC, F1 score. | 2022 |
Research of oil deposits | Machine learning model for choosing the best region for drilling, based on its oil deposits and business profit. Key points: regression models, linear regression, bootstrap, R2. | 2022 |
Gold flotation ML model | Prediction of gold recovery effectiveness based on the technological process. Key points: GridSearchCV, RandomizedSearchCV, KNNImputer, pipeline. | 2022 |
Protecting customer data | Data encoding with matrix algebra tools. Key points: matrix, linear regression. | 2022 |
Car cost prediction | Comparison of 3 models (RandomForestRegressor, LightGBM and CatBoost) in terms of the best RMSE and training/validation time, using ordinal encoding for qualitative attributes. Key points: gradient boosting, lightgbm, CatBoostRegressor, OrdinalEncoder. | 2022 |
Taxi prediction | Prediction of the number of taxi orders for the next hour, based on short-term trends in the time series, day of week and hourly seasonality. Models are evaluated by RMSE. Key points: TimeSeriesSplit, seasonal_decompose, resample, rolling_mean. | 2022 |
Classification of comments | NLP classification model with preliminary text lemmatization and vectorization. The model has to classify a user's comment as toxic or non-toxic based on sentiment analysis. Key points: WordNetLemmatizer, TfidfVectorizer, SGDClassifier. | 2022 |
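
For other beginners, here is a minimal sketch of the upsampling idea mentioned in the customer attrition project. It is not the code from that project: the `upsample` function, its `repeat` and `seed` parameters, and the toy data are all made up for illustration, and it uses only the standard library so that it runs without extra installs.

```python
import random


def upsample(features, target, repeat, seed=12345):
    """Repeat positive-class rows `repeat` times, then shuffle.

    Hypothetical helper for illustration only; assumes a binary
    target where 1 is the rare (minority) class.
    """
    positives = [(x, y) for x, y in zip(features, target) if y == 1]
    negatives = [(x, y) for x, y in zip(features, target) if y == 0]

    # Duplicate the minority rows so both classes have similar weight.
    combined = negatives + positives * repeat

    # Shuffle so the duplicated rows are not grouped at the end.
    random.Random(seed).shuffle(combined)

    new_features = [x for x, _ in combined]
    new_target = [y for _, y in combined]
    return new_features, new_target


# Toy data: 4 negative rows and 1 positive row.
features = [[0.1], [0.4], [0.35], [0.8], [0.9]]
target = [0, 0, 0, 0, 1]

up_features, up_target = upsample(features, target, repeat=4)
print(sum(up_target), len(up_target))  # 4 8
```

Upsampling like this is normally applied to the training set only, so that the validation and test sets keep the original class ratio.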
Kaggle pet projects
Project | Description | Year |
---|---|---|
Titanic survival | A classic assignment from a Kaggle competition. | 2023 |
Stock price prediction model | A simple machine learning stock price prediction model (D1). | in progress |
Other Python projects
Project | Description | Year |
---|---|---|
Todo bot | An online Telegram bot that helps create simple tasks and review the user's schedule. | 2021 |
Pet pic bot | A Telegram bot that sends a random picture of a cat or dog to the user in a chat message. | 2023 |
API bot | A Telegram bot that checks the status of homework at Yandex.Practicum (a Russian e-learning company) via its API every 10 minutes and reports the status to the user in a Telegram message. | 2023 |
Blog | A blog with full functionality (registration, password reset, search, publishing/editing/deleting posts, publishing/editing/deleting comments, image posting, etc.) | 2023 |
Spanish bot | A Telegram bot (built with the python-telegram-bot library) that helps users learn Spanish words. | 2024 |