-
Elastic-Search Public
The goal of this project is to become familiar with the process of installing and starting Elasticsearch. Elasticsearch is an information retrieval system that can index documents as we saw in clas…
Updated Oct 15, 2020
-
White-Box-Testing Public
In this project, I have applied three white-box techniques to the ATM system.
-
Black-Box-Testing Public
In this project, I have applied three foundational black-box techniques to the ATM system.
-
Kibana Public
The goal of this project is to gain familiarity with Kibana and Elasticsearch.
Updated Oct 15, 2020
-
MongoDB Public
The goal of this project is to get experience with MongoDB, one of the most widely used tools for managing and querying big, unstructured data.
JavaScript Updated Oct 15, 2020
-
The goal of this project is to gain experience with the MapReduce programming model, text processing, and index development. You can use either Java or Python. Python is strongly recommended. I hav…
Updated Oct 15, 2020
-
Intro-to-Datascience Public
This project applies data analytics techniques such as loss functions, outcome prediction from a provided dataset, visualization, simple linear models, model selection, confidenc…
visualization bootstrap evaluation prediction model-selection confidence-intervals inferential-statistics
Jupyter Notebook Updated Oct 14, 2020
-
Nested-Spheres Public
Simulation is an incredibly useful tool in data science. We can use simulation to evaluate how algorithms perform against ground truth, and how algorithms compare to one another. In this project, I…
Jupyter Notebook Updated Oct 14, 2020
-
Clustering Public
In this project, I have worked with some age data (measured in years) and height data (measured in fractional feet; for instance, 5 feet 6 inches would be 5.5, since there are 12 inches in a foot). In the…
Jupyter Notebook Updated Oct 14, 2020
-
In this notebook, we're going to explore a few different ways of setting up an image classification model. The images and more details are available here: https://tiny-imagenet.herokuapp…
Jupyter Notebook Updated Oct 14, 2020
-
The Dataset: This dataset consists of 3921 e-mails to a single account, some of which are spam. These data represent incoming emails for the first three months of 2012 for an email account. The tab…
Jupyter Notebook Updated Oct 14, 2020
-
In this project, I have worked with some data on possums. It is a relatively small data set, but it's a good size to try with ordinary least squares (OLS) and least absolute deviation (LAD), and to…
-
The Dataset: This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to diagnostically predict whether or not a patient…
pipeline cross-validation classification logistic-regression lasso-regression gridsearchcv diabetes-prediction
Jupyter Notebook Updated Oct 14, 2020
-
This project is part 2 of the project "A Data Scientist for a Professional Football Club". In this project, managers want to test some hypotheses relating a player's overall rating and some of thei…
-
In this project, you work as a Data Scientist for a professional football club. The owner of the team is very interested in seeing how the use of data can help improve the team's performance, and p…
-
Exploring the concept of confidence intervals and bootstrapping.
-
Maximum-Likelihood Public
The Poisson distribution (https://en.wikipedia.org/wiki/Poisson_distribution) is a discrete probability distribution often used to describe count-based data, like how many snowflakes fall in a day. I…
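As a rough illustration of how maximum likelihood applies to Poisson count data, the sketch below computes the log-likelihood and the closed-form estimate of the rate (the sample mean). It is not taken from the repository: the function names and the sample counts are hypothetical, chosen only to show the idea.

```python
import math


def poisson_log_likelihood(lam, counts):
    """Log-likelihood of i.i.d. Poisson(lam) observations (lam must be > 0)."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    # log P(k | lam) = k*log(lam) - lam - log(k!), and lgamma(k + 1) = log(k!)
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in counts)


def poisson_mle(counts):
    """Closed-form maximum likelihood estimate of the rate: the sample mean."""
    return sum(counts) / len(counts)


# Hypothetical daily counts, made up for illustration only.
counts = [3, 0, 2, 5, 1, 4, 2, 3]
lam_hat = poisson_mle(counts)
print(f"estimated rate: {lam_hat}")
print(f"log-likelihood at estimate: {poisson_log_likelihood(lam_hat, counts):.3f}")
```

Evaluating `poisson_log_likelihood` at values on either side of `lam_hat` gives a quick check that the sample mean does maximize the likelihood for these counts.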