Major airline data analysis: a comparison with LCC data
Purpose of this project: compare the key differences between low-cost carriers (LCCs) and major Korean airlines
SNS sources for analysis: YouTube, Twitter, Naver Blog
Key brands: Jeju Air, Asiana Airlines, Korean Air
SNS type: YouTube, Naver Blog, Twitter
Language: Python
Essential modules: sys, urllib, time, pandas, re, bs4 (BeautifulSoup), selenium
Final code update date: 2019.05.07
Crawling period: YouTube 2017.05.01 ~ 2019.05.03, Naver Blog 2018.04.27 ~ 2019.05.04, Twitter 2018.04.03 ~ 2019.05.04
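| | YouTube | Naver Blog | Twitter |
|---|---|---|---|
| Language | Python3 | Python3 | Python3 |
| Module | sys, urllib, time, pandas, re, bs4, selenium | sys, urllib, time, pandas, re, bs4, selenium | sys, urllib, time, pandas, re, bs4, selenium |
| Final update | 2019.05.07 | 2019.05.07 | 2019.05.07 |
| Crawling target | Comments | Blog preview text | Posts |
| Crawling period | 2017.05.01 ~ 2019.05.03 | 2018.04.27 ~ 2019.05.04 | 2018.04.03 ~ 2019.05.04 |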
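
The crawling periods in the table can be expressed as a small lookup that checks whether a crawled item falls inside the window for its source. This is only a sketch: the names `CRAWL_PERIODS` and `in_crawl_period` are hypothetical and do not come from the project code, and it assumes dates are stored as `YYYY.MM.DD` strings as written in the table.

```python
from datetime import date, datetime

# Crawl windows copied from the table; the dictionary name and keys are illustrative.
CRAWL_PERIODS = {
    "youtube": (date(2017, 5, 1), date(2019, 5, 3)),
    "naver_blog": (date(2018, 4, 27), date(2019, 5, 4)),
    "twitter": (date(2018, 4, 3), date(2019, 5, 4)),
}


def in_crawl_period(source, posted):
    """Return True if `posted` ('YYYY.MM.DD') lies inside the window for `source`.

    Both the start and the end date are treated as inclusive (an assumption).
    """
    start, end = CRAWL_PERIODS[source]
    day = datetime.strptime(posted, "%Y.%m.%d").date()
    return start <= day <= end
```

For example, `in_crawl_period("naver_blog", "2018.04.26")` is false because the Naver Blog window starts one day later.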
- Filtering code: filters spam words out of the crawled data
- Or: gets data that contains word A or word B
- And: gets data that contains both word A and word B (used to select SNS comments that mention two or more brands)
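
The three filters described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the function names, the sample comments, and the choice of case-insensitive substring matching are all assumptions made for illustration.

```python
def contains(text, word):
    """Case-insensitive substring test (an assumed matching rule)."""
    return word.lower() in text.lower()


def drop_spam(texts, spam_words):
    """Filtering: keep only texts that contain none of the spam words."""
    return [t for t in texts if not any(contains(t, w) for w in spam_words)]


def match_any(texts, words):
    """Or: keep texts that contain at least one of the words."""
    return [t for t in texts if any(contains(t, w) for w in words)]


def match_all(texts, words):
    """And: keep texts that contain every one of the words."""
    return [t for t in texts if all(contains(t, w) for w in words)]


# Made-up sample comments, for illustration only.
comments = [
    "Korean Air lounge was nice",
    "Jeju Air is cheaper than Asiana Airlines",
    "Click here for free tickets!!!",
]

clean = drop_spam(comments, ["free tickets"])
either = match_any(clean, ["Korean Air", "Jeju Air"])
both = match_all(clean, ["Jeju Air", "Asiana Airlines"])
```

Here `both` keeps only the comment that mentions two brands at once, which is the case the And filter is meant to isolate. With pandas, the same idea could be applied to a column of crawled text instead of a plain list.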