The source code provided here lets you reproduce the experiments on Big Data econometrics for nowcasting and early estimates presented in the Eurostat Handbook on Rapid Estimates (please cite this source code or the reference's doi: 10.2785/488740). Further details are available in the associated working papers (see the publications by Kapetanios et al. listed below).
Description
The source code is organised into four folders:
- extract/: Methods for extracting features from various Big Data sources and turning them into time series usable for econometric modelling. The scripts convert unstructured datasets into structured time series for different types of Big Data sources, e.g. Google searches (Google Trends, Google Correlate), social network activity (Twitter), mobile phone data, IoT sensors, etc.
- filter/: Filtering techniques for high-frequency data. It contains signal extraction/decomposition techniques for removing seasonal patterns, very high-frequency periodicities and deterministic phenomena that are not relevant for nowcasting exercises. In particular, it covers outlier detection.
- model/: Implementations of econometric modelling techniques identified as relevant for Big Data. Particular attention is paid to Bayesian ones, e.g. the possibility of using Bayesian panel VAR models, quantile regression models and expectile regression models for dealing with Big Data.
- nowcast/: Modelling strategies for nowcasting/early estimates that take into account various Big Data characteristics. The scripts run empirical tests of the possible timeliness gains from using Google Trends, other easily accessible Big Data, and macroeconomic and financial variables. They also investigate the accuracy gains from improving the timeliness of the selected variables at the beginning, middle and end of the reference period, together with the associated accuracy loss.
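To make the first two steps concrete, here is a minimal sketch of what "extract" and "filter" mean in practice: raw timestamped events are aggregated into a regular daily series, and points that deviate strongly from the rest are flagged. It is an illustration only and not the code shipped in extract/ or filter/. The function names `daily_counts` and `flag_outliers` are hypothetical, Python is used purely for illustration, and the median/MAD rule is a common robust outlier check chosen here as an assumption rather than the method used in the repository.

```python
from collections import Counter
from datetime import datetime, timedelta
from statistics import median


def daily_counts(timestamps):
    """Aggregate ISO 8601 event timestamps into a gap-free daily count series.

    Hypothetical helper: returns a list of (date, count) pairs covering every
    day between the first and last event, with 0 for days without events.
    """
    counts = Counter(datetime.fromisoformat(ts).date() for ts in timestamps)
    if not counts:
        return []
    day, last = min(counts), max(counts)
    series = []
    while day <= last:
        series.append((day, counts.get(day, 0)))
        day += timedelta(days=1)
    return series


def flag_outliers(values, threshold=3.5):
    """Flag values whose robust z-score exceeds `threshold`.

    Assumption: the robust z-score 0.6745 * (x - median) / MAD stands in for
    whatever outlier detection the filter/ scripts actually implement.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median: nothing can be flagged reliably.
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]
```

For example, `daily_counts(["2017-03-01T08:15:00", "2017-03-01T09:40:00", "2017-03-03T11:05:00"])` yields counts of 2, 0 and 1 for the three consecutive days, and `flag_outliers([10, 12, 11, 13, 12, 95, 11, 10])` flags only the value 95. The median/MAD rule is used in this sketch because, unlike a mean/standard-deviation rule, it is not distorted by the very outliers it is meant to detect.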
The results presented in the publications listed below can be reproduced. For that purpose, the necessary raw data (as well as the output data) are made available in the data/ folder. A further (narrative) description of the various functions/scripts is provided in the document located in the docs/ folder, including an evaluation of the nowcasting/flash estimation techniques based on a large set of indicators.
About
authors | Papailias F., Kapetanios G., Marcellino M. and Petrova K.
version | 1.0
status | since 2017 – closed
license | EUPL (cite the source code or the reference above!)
References
- Marcellino M.G., Papailias F., Mazzi G.L., Kapetanios G. and Buono D. (2018): Big Data econometrics: Now casting and early estimates, no. 2018-82, BAFFI CAREFIN Centre Research, ssrn:3206554.
- Kapetanios G., Marcellino M. and Papailias F. (2017): Guidance and recommendations on the use of Big data for macroeconomic nowcasting in Handbook on Rapid Estimates, Chapter 17, Publications Office of the European Union, doi:10.2785/4887400.
- Kapetanios G., Marcellino M. and Papailias F. (2017): Filtering techniques for big data and big data based uncertainty indexes, Publications Office of the European Union, doi:10.2785/880943.
- Kapetanios G., Marcellino M. and Papailias F. (2017): Big data conversion techniques including their main features and characteristics, Publications Office of the European Union, doi:10.2785/461700.
- Baldacci E., Buono D., Kapetanios G., Krische S., Marcellino M., Mazzi G.L. and Papailias F. (2016): Big Data and macroeconomic nowcasting: From data access to modelling, Publications Office of the European Union, doi:10.2785/360587.
- Mazzi G.L., Moauro F. and Ruggeri Cannata R. (2016): Advances in econometric tools to complement official statistics in the field of Principal European Economic Indicators, Publications Office of the European Union, doi:10.2785/397407.