Diploma in Statistical Techniques and Data Mining

This repository stores the notebooks developed during the diploma. Topics per module:

Module 1 - Database Design: I learned to design and create databases using MySQL, DDL and DML.
Module 2 - Statistical Models: I learned about probability, distributions, random numbers, random variables and their real-life uses, using Python as the programming language.
Module 3 - Regression and Time Series: This was my first course on time series, where I learned about models such as ARIMA. I also learned linear and logistic regression through examples, using Python as the programming language.
Module 4 - Data Mining: This module is my favourite. First I learned about data warehouses and how they are used for data analysis. I also used models and techniques such as decision trees, PCA, neural networks, linear regression and K-means, among others, for tasks like classification and regression. In this module I used XGBoost to build a binary classification model.
Module 5 - Stochastic Simulation: I used the ARENA simulation software to simulate a process, generating random numbers from different distributions to produce the times in the model.
Module 6 - Analysis of Variance, Factorial and Correspondence: I worked with models such as PCA, agglomerative clustering, hierarchical clustering, discriminant analysis, K-means and factor analysis, using R as the programming language.
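
As a rough illustration of the idea behind Modules 2 and 5 (drawing random times from different distributions), here is a minimal Python sketch using only the standard library. It is not taken from the notebooks or from ARENA. The function name `sample_times` and the distribution parameters (a mean of 5, bounds of 2 and 8, a mode of 4) are assumptions chosen for the example.

```python
import random
import statistics


def sample_times(n, seed=0):
    """Draw n hypothetical process times from three distributions."""
    rng = random.Random(seed)  # seeded generator so runs are repeatable
    return {
        # expovariate takes the rate (1 / mean); the assumed mean is 5.0
        "exponential": [rng.expovariate(1 / 5.0) for _ in range(n)],
        # uniform between the assumed bounds 2.0 and 8.0
        "uniform": [rng.uniform(2.0, 8.0) for _ in range(n)],
        # triangular(low, high, mode) with an assumed mode of 4.0
        "triangular": [rng.triangular(2.0, 8.0, 4.0) for _ in range(n)],
    }


if __name__ == "__main__":
    times = sample_times(10_000)
    for name, values in times.items():
        print(f"{name:12s} mean={statistics.mean(values):.2f}")
```

Comparing the sample means against the theoretical ones (5 for the exponential, 5 for the uniform, about 4.67 for the triangular) is a quick check that the generators behave as expected.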

I liked this diploma because I learned a lot of theory and many models for analyzing data, how to apply those models to different problems, and which tools can be used to fit them. For me the best tool is Python: there is a lot of information about the language (documentation, forums and video tutorials), and it is very easy to learn. I came to understand how to interpret the models and how to move data from a database into the selected model. Viewing the data through graphs generated with tools such as Power BI, Python and R makes it possible to see what is happening in the data.

Something very important is the ETL process: without it, it is very difficult to fit the models and get good results.
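
To make the ETL idea concrete, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. It does not come from the notebooks. The `customers` table, its columns, the sample rows and the helper names (`build_demo_db`, `extract`, `transform`, `load`) are all hypothetical, and SQLite stands in for whatever database is actually used.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (age INTEGER, income REAL, churned INTEGER)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(34, 52000.0, 0), (51, None, 1), (27, 31000.0, 1)],
    )
    conn.commit()
    return conn


def extract(conn):
    """Extract: read the raw rows in insertion order."""
    return conn.execute(
        "SELECT age, income, churned FROM customers ORDER BY rowid"
    ).fetchall()


def transform(rows):
    """Transform: drop rows with missing values and split features from labels."""
    clean = [row for row in rows if None not in row]
    features = [[float(age), float(income)] for age, income, _ in clean]
    labels = [int(label) for _, _, label in clean]
    return features, labels


def load(conn, features, labels):
    """Load: store the cleaned rows in a separate table for modelling."""
    conn.execute(
        "CREATE TABLE customers_clean (age REAL, income REAL, churned INTEGER)"
    )
    conn.executemany(
        "INSERT INTO customers_clean VALUES (?, ?, ?)",
        [(f[0], f[1], y) for f, y in zip(features, labels)],
    )
    conn.commit()


if __name__ == "__main__":
    conn = build_demo_db()
    X, y = transform(extract(conn))
    load(conn, X, y)
    print(X, y)
```

The `features` and `labels` lists are in the shape most model-fitting tools expect, which is why a broken or skipped cleaning step tends to show up later as a poor fit.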


gblasd/StatisticalTechniquesAndDataMining

Folders and files

Latest commit

History

Repository files navigation

Diplomature Statistic Techniques and Data Mining

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages