Data Science Portfolio

This is a repository of the projects I have worked on or am currently working on. It is updated regularly. The projects are written in either R (R Markdown) or Python (Jupyter Notebook). The goal is to use data science and statistical modelling techniques to find something interesting in the data. A typical project consists of finding and cleaning data, analysis, visualization, and a conclusion.

Projects:

Bitcoin Analysis

Plotted the Bitcoin price against S&P 500 prices and performed a Granger causality test.
Fitted an ARIMA model to Bitcoin prices to forecast Bitcoin's range of movement.
Keywords(R, Time Series, Causality)

Exchange Rate Analysis During Election

In this project, I tried to predict the winners of the US (2016) and UK (2017) elections as the voting results of each region became available.
Polling data serves as the prior information, and the model is updated as each region's results come in.
Monte Carlo simulation is used to simulate the winner of the election.
The results are compared with exchange rate fluctuations to see how well the financial markets kept up with the outcome.
Keywords(Python, Linear Regression, Monte Carlo Simulation, Twitter API)
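
The update-and-simulate loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the notebook's actual code: the region names, weights, and poll-based win probabilities are made up, and `simulate_win_probability` is a hypothetical helper. Declared regions are fixed to their known result, while undeclared regions are drawn independently from their polling prior (a real model would likely need to account for correlation between regions).

```python
import random

def simulate_win_probability(regions, declared, n_sims=10_000, seed=0):
    """Estimate the probability that candidate A wins a strict majority.

    regions  : dict name -> (weight, prior probability that A wins the region)
    declared : dict name -> True if A won the region, False if A lost it
    """
    rng = random.Random(seed)
    total = sum(weight for weight, _ in regions.values())
    wins = 0
    for _ in range(n_sims):
        score = 0
        for name, (weight, p_win) in regions.items():
            if name in declared:
                a_wins = declared[name]        # result known: no randomness
            else:
                a_wins = rng.random() < p_win  # draw from the polling prior
            if a_wins:
                score += weight
        if score * 2 > total:                  # strict majority of the weight
            wins += 1
    return wins / n_sims

# Hypothetical toy data: (weight, poll-based probability that A wins).
regions = {
    "North": (10, 0.60),
    "South": (8, 0.40),
    "East": (6, 0.55),
    "West": (5, 0.50),
}

before = simulate_win_probability(regions, declared={})
after = simulate_win_probability(regions, declared={"North": True, "South": True})
```

Calling the function again each time a region declares gives a win probability that evolves over election night, which is the kind of series that can then be lined up against exchange rate movements.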

Power-law or Log-normal? Baby Name and Twitter Analysis

Fitted power-law and log-normal distributions to US baby names since 1960 and compared the fits.
Used bootstrapping to estimate the distribution of the power-law parameters.
Crawled Twitter to find 20,000 random users and fitted a power-law distribution to the users' friend and follower counts.
Keywords(R, Power-law, Bootstrapping, Log-normal)
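
The project itself is written in R, but the fit-then-bootstrap idea can be sketched in a few lines of Python. The sketch below is illustrative only: it uses the standard maximum-likelihood estimator for a continuous power law with a known lower cutoff `xmin` (name counts are discrete, so this is an approximation), synthetic data in place of the baby-name counts, and hypothetical helper names `fit_power_law_alpha` and `bootstrap_alpha`.

```python
import math
import random

def fit_power_law_alpha(data, xmin):
    """MLE of alpha for a continuous power law p(x) ~ x^(-alpha), x >= xmin."""
    tail = [x for x in data if x >= xmin]
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)

def bootstrap_alpha(data, xmin, n_boot=200, seed=0):
    """Refit alpha on resamples drawn with replacement; return sorted estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(data, k=len(data))
        estimates.append(fit_power_law_alpha(resample, xmin))
    return sorted(estimates)

# Synthetic data with a known exponent, generated by inverse-transform sampling.
rng = random.Random(42)
true_alpha, xmin = 2.5, 1.0
data = [xmin * (1.0 - rng.random()) ** (-1.0 / (true_alpha - 1.0))
        for _ in range(5000)]

alpha_hat = fit_power_law_alpha(data, xmin)
estimates = bootstrap_alpha(data, xmin)

# Approximate 95% percentile interval from the bootstrap distribution.
lo = estimates[int(0.025 * len(estimates))]
hi = estimates[int(0.975 * len(estimates)) - 1]
```

The spread of `estimates` indicates how uncertain the fitted exponent is. Comparing against a log-normal fit would additionally require fitting that distribution and comparing the likelihoods, which is not shown here.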

Comparing Ridge and Lasso Regularization with Cross Validation

Fitted a polynomial linear regression of wine quality on the wine's chemical properties.
Used ridge and lasso regularization to tackle overfitting and compared the results.
Used cross-validation to select the optimal regularization strength.
Keywords(Python, Linear Regression, Ridge and Lasso Regularization, Cross Validation)
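
To make the cross-validation step concrete, here is a dependency-free sketch of choosing a ridge penalty by k-fold cross-validation. It is not the notebook's code: the data are synthetic rather than the wine dataset, only ridge is shown (lasso has no closed-form solution and is omitted), and all function names and the candidate grid `lambdas` are assumptions made for the example.

```python
import random

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - rest) / M[i][i]
    return w

def poly_features(x, degree):
    """Polynomial expansion [1, x, x^2, ...]; index 0 is the constant term."""
    return [x ** k for k in range(degree + 1)]

def ridge_fit(X, y, lam):
    """Closed-form ridge: (X^T X + lam * I) w = X^T y, intercept unpenalised."""
    d = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(d)] for i in range(d)]
    for i in range(1, d):  # skip index 0, the constant column
        A[i][i] += lam
    b = [sum(row[i] * t for row, t in zip(X, y)) for i in range(d)]
    return solve(A, b)

def mse(X, y, w):
    """Mean squared prediction error of weights w on (X, y)."""
    preds = [sum(a * c for a, c in zip(row, w)) for row in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)

def cv_error(X, y, lam, k=5):
    """Mean validation MSE over k folds (fold i holds out every k-th sample)."""
    errors = []
    for fold in range(k):
        train = [i for i in range(len(y)) if i % k != fold]
        valid = [i for i in range(len(y)) if i % k == fold]
        w = ridge_fit([X[i] for i in train], [y[i] for i in train], lam)
        errors.append(mse([X[i] for i in valid], [y[i] for i in valid], w))
    return sum(errors) / k

# Synthetic stand-in: a noisy straight line fitted with a degree-8 polynomial,
# which overfits unless the coefficients are shrunk.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(30)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 0.3) for x in xs]
X = [poly_features(x, degree=8) for x in xs]

lambdas = [1e-4, 1e-3, 1e-2, 1e-1, 1.0, 10.0]
scores = {lam: cv_error(X, ys, lam) for lam in lambdas}
best_lam = min(scores, key=scores.get)
```

`best_lam` is the penalty with the lowest average validation error. The same loop works for lasso once `ridge_fit` is swapped for an iterative solver such as coordinate descent.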

Twitter Sentiment Daily and Weekly Fluctuations

Parsed a few GB of tweets to select all English-language tweets from the UK.
Used the 'qdap' package to analyze the emotion of the tweets.
Plotted the emotions over the day and over the week and analysed the interesting patterns that emerged.
Keywords(R, Twitter API, Time Series, Sentiment Analysis)
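
The daily and weekly aggregation behind those plots can be sketched as below. The project itself uses R and 'qdap' for scoring; this Python sketch assumes each tweet has already been reduced to a timestamp and a single numeric sentiment score, and the sample records and the helper `mean_by` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def mean_by(records, key):
    """Average the scores of (timestamp, score) records grouped by key(timestamp)."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [sum of scores, count]
    for timestamp, score in records:
        bucket = totals[key(timestamp)]
        bucket[0] += score
        bucket[1] += 1
    return {group: s / n for group, (s, n) in sorted(totals.items())}

# Hypothetical (timestamp, sentiment score) pairs.
tweets = [
    (datetime(2017, 3, 6, 8, 15), -0.2),   # Monday morning
    (datetime(2017, 3, 6, 8, 45), -0.4),
    (datetime(2017, 3, 6, 20, 5), 0.3),    # Monday evening
    (datetime(2017, 3, 11, 20, 30), 0.5),  # Saturday evening
]

by_hour = mean_by(tweets, key=lambda t: t.hour)          # keys are hours 0-23
by_weekday = mean_by(tweets, key=lambda t: t.weekday())  # 0 = Monday, 6 = Sunday
```

Plotting `by_hour` and `by_weekday` gives the within-day and within-week sentiment profiles.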
