Portfolio with Data Science projects | Machine Learning | Python

luislauriano/Data_Science

author GPLv3 license contributions welcome

Luis Vinicius

Linkedin

Projects:

Here you can find the notebooks of my projects in the area of Data Science and Machine Learning.

  • Machine learning project to predict possible outcomes of 2022 World Cup matches: A project undertaken out of curiosity and to study machine learning, with the aim of developing a model capable of predicting the possible outcomes of the 2022 World Cup matches, all the way through to the champion.

  • League of Legends and Data Science – Predicting match results: An end-to-end Machine Learning project that goes from collecting match data to building a model that predicts the chances of the team playing on the blue side of the map winning. The steps include pre-processing and data analysis, dimensionality reduction and variable selection, and the construction of two models: a final model built with XGBClassifier, and a logistic regression model based on the results obtained from AutoML with PyCaret.

  • Spotify & Python and Data Science – Data Analysis of Artist NexoAnexo Albums: The objective of the project was to analyze the Spotify albums of the artist NexoAnexo, going through the main steps of a data analysis: data collection, pre-processing, exploration and visualization. After the analysis was completed, an application/dashboard was built with Python and made available on the web through Heroku. A further objective was to use the conclusions drawn from the song data to identify factors that contribute to an album or song being more successful, and how these could be used in future releases.

    Repository/Application source code

    Project Application/Dashboard

    After I publicized the project and the application developed with Streamlit on LinkedIn, Ted Ricks, from Product Marketing at Streamlit, found my application and, in his words, said "Really enjoyed your app- wanted to let you know it was included in this week's Weekly Roundup on our community forum Streamlit". The application was therefore included in the [weekly summary 29/11/2020](https://discuss.streamlit.io/t/weekly-roundup-agraph components-streambackmachines-text-generation-tutorials-and-more/7640) from the Streamlit community, in the apps topic of the week.

  • What they didn't tell you about the coronavirus: An analysis of covid-19 data: During the month of March, in China, the number of recovered cases was already greater than the number of confirmed cases. However, countries such as the United States and South Korea still had more deaths than recovered cases, and for countries like Canada and Brazil the outbreak was still very new. In this brief analysis of covid-19 data from 01/22 to 03/09, I was able to identify and flag the rising number of deaths in countries like the United States, even before the peak of the virus.

  • Predictive model for the occurrence of diabetes: Based on the dataset of the National Institute of Diabetes and Digestive and Kidney Diseases, a simple model was built to predict whether or not a pregnant patient has diabetes, using certain diagnostic measures included in the dataset.

  • Manipulating and processing data to generate indicators for the company: The objective of the project was to process and manipulate data from XLSX and CSV files from the company Bemol to generate a single XLSX dataset, which serves as a report and indicator for the company.

  • Airbnb data analysis for the city of Rio de Janeiro: The city of Recife is one of the cities that most attract tourists during the carnival period in Brazil. However, in 2020 the city of Rio de Janeiro was among the three most sought-after cities for the carnival period, with 2 million tourists expected to enjoy the carnival marathon and a corresponding growth in the hotel sector. In addition, when we travel, we always think about which would be the best hotel, the best location and the best value for money. With that in mind, an exploratory analysis was made of data from one of the largest lodging companies today, Airbnb, using the dataset provided by the company itself.

  • Machine Learning for breast cancer detection: In this project, a simple Machine Learning model was built to detect the presence of breast cancer.

  • Exploratory Data Analysis with Streamlit: This application was built with Streamlit, a Python framework for creating applications/dashboards. The application makes an initial exploratory analysis of the data through statistical methods and data visualization. I also took the opportunity to insert some statistical explanations in the application.
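
As a rough illustration of the kind of initial statistical summary mentioned in the Exploratory Data Analysis project above, here is a minimal sketch. It is not the code from the notebooks or the application: it uses only the Python standard library instead of Streamlit, and the function name `summarize_numeric_columns`, the column names and the sample values are all hypothetical, made up for this example.

```python
import csv
import io
import statistics


def summarize_numeric_columns(csv_text):
    """Return basic descriptive statistics for every fully numeric column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {}
    if not rows:
        return summary
    for column in rows[0]:
        try:
            values = [float(row[column]) for row in rows]
        except (TypeError, ValueError):
            continue  # skip columns holding text or missing values
        summary[column] = {
            "count": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
            "min": min(values),
            "max": max(values),
        }
    return summary


# Hypothetical sample data, invented for this example only.
SAMPLE_CSV = """name,price,nights
A,100,2
B,150,3
C,200,7
"""

if __name__ == "__main__":
    for column, stats in summarize_numeric_columns(SAMPLE_CSV).items():
        print(column, stats)
```

In a dashboard, a summary like this would typically be shown next to charts of the same columns; the text column `name` is skipped here because it cannot be converted to numbers.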


Made with 💖 by Luis Vinicius