Code associated with the paper "Estimation and Inference on Heterogeneous Treatment Effects in High-Dimensional Dynamic Panels" by Semenova, Goldman, Chernozhukov, and Taddy (2017).
The files AggData*.csv for * in {Drinks, DairyPart1, DairyPart2, NonEdible1, NonEdible2, Snacks} contain anonymized grocery sales data from a food distributor. The grocery items are sold at 8 different sites, via 2 different channels (Collection, Delivery), in the years 2012-2017. Using catalog descriptions, we organize the products into a tree (see example below).
To preserve the distributor's anonymity, we replaced the names of the nodes at Level 2 and below (for Drinks, Level 3 and below) with numbers. We also added lags of sales and prices. The resulting data set takes the form:
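As an illustration of the lagging step only, here is a minimal sketch in base R. The column names (`item`, `week`, `sales`, `price`), the helper `lag_k`, and the toy values are all hypothetical; they are not the layout of the AggData files and not the code used to build them.

```r
# Toy panel; column names and values are hypothetical, not taken from AggData*.csv.
toy <- data.frame(
  item  = rep(c("A", "B"), each = 4),
  week  = rep(1:4, times = 2),
  sales = c(10, 12, 9, 11, 5, 6, 7, 4),
  price = c(2.0, 1.8, 2.1, 2.0, 3.0, 2.9, 2.7, 3.1)
)

# Hypothetical helper: shift a vector by k positions, padding the start with NA.
lag_k <- function(x, k = 1) c(rep(NA, k), head(x, -k))

# Sort within item by time, then lag each series separately per item
# so that one item's history never leaks into another's.
toy <- toy[order(toy$item, toy$week), ]
toy$sales_lag1 <- ave(toy$sales, toy$item, FUN = lag_k)
toy$price_lag1 <- ave(toy$price, toy$item, FUN = lag_k)

print(toy)
```

The first observation of each item has no history, so its lagged values are `NA`.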
To replicate the results:
- Download the GitHub repo
- Open RStudio (or R) and run
install.packages(c("tictoc","cowplot","xtable","parallel","expm", "foreach", "gamlr", "glmnet","ggplot2", "dplyr","plyr","reshape2","tidyr","iterators","assertthat","tidyverse","rmutil"))
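If some of these packages are already present, a variant along the following lines installs only the ones that are missing. This is a sketch, not part of the repository; `missing_pkgs` is a hypothetical helper name. `parallel` is left out of the list because it ships with base R.

```r
# Packages named in the install step above ("parallel" is bundled with R itself).
pkgs <- c("tictoc", "cowplot", "xtable", "expm", "foreach", "gamlr", "glmnet",
          "ggplot2", "dplyr", "plyr", "reshape2", "tidyr", "iterators",
          "assertthat", "tidyverse", "rmutil")

# Hypothetical helper: return the subset of p that is not installed yet.
missing_pkgs <- function(p) setdiff(p, rownames(installed.packages()))

to_install <- missing_pkgs(pkgs)

# Install only in an interactive session; drop the interactive() check
# if you want the same behavior from Rscript.
if (interactive() && length(to_install) > 0) install.packages(to_install)
```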
Figure : Own price elasticities for categories as estimated by Orthogonal Least Squares (Double Machine Learning)
- Open /orthoml-master/src/Figure3.R and set directoryname to the location of the downloaded files. From the shell or in R, run
The code produces the estimate and 95% confidence interval for the average price elasticity for each category in {Drinks, Dairy, NonEdible, Snacks}. A plot example for Drinks is given below.
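To make the idea of an orthogonal (partialled-out) estimate with a 95% confidence interval concrete, here is a minimal sketch on simulated data. It is not the code in Figure3.R: the data, the variable names, and the true elasticity of -1.5 are all made up, and plain `lm` stands in for the machine learning first stage.

```r
set.seed(1)
n <- 2000
x <- matrix(rnorm(n * 5), n, 5)          # controls (stand-ins for lagged prices and sales)
log_p <- 0.5 * x[, 1] + rnorm(n)         # log price depends on the controls
theta_true <- -1.5                       # simulated elasticity, not a result from the paper
log_q <- theta_true * log_p + x[, 1] - x[, 2] + rnorm(n)

d <- data.frame(log_q, log_p, x)
folds <- sample(rep(1:2, length.out = n))  # two-fold cross-fitting
res_p <- res_q <- numeric(n)

for (k in 1:2) {
  train <- d[folds != k, ]
  test  <- d[folds == k, ]
  # First stage: predict price and sales from controls only (lm is an assumption here).
  fit_p <- lm(log_p ~ . - log_q, data = train)
  fit_q <- lm(log_q ~ . - log_p, data = train)
  res_p[folds == k] <- test$log_p - predict(fit_p, newdata = test)
  res_q[folds == k] <- test$log_q - predict(fit_q, newdata = test)
}

# Second stage: least squares of the sales residual on the price residual.
theta_hat <- sum(res_p * res_q) / sum(res_p^2)
eps <- res_q - theta_hat * res_p
se  <- sqrt(mean(res_p^2 * eps^2) / mean(res_p^2)^2 / n)  # heteroskedasticity-robust
ci  <- theta_hat + c(-1, 1) * qnorm(0.975) * se

print(c(estimate = theta_hat, lower = ci[1], upper = ci[2]))
```

Residualizing both price and sales on the controls before the final regression is what makes the second-stage estimate insensitive to small first-stage errors.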
Figure : Own price elasticities by the months of a calendar year as estimated by Orthogonal Least Squares (Double Machine Learning)
- Open /orthoml-master/src/Figure3.R and set directoryname to the location of the downloaded files. From the shell or in R,
The code produces the estimate and 95% confidence interval for the average price elasticity by calendar month for each category in {Dairy, NonEdible, Snacks, Sodas, Water}. A plot example is given below.
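One way to picture month-specific elasticities is to interact the residualized price with a month factor in the second stage. The sketch below does this on simulated residuals; the variable names and the monthly elasticities are assumptions for illustration, and `confint` here uses classical rather than robust standard errors.

```r
set.seed(2)
n <- 6000
month <- factor(sample(month.abb, n, replace = TRUE), levels = month.abb)
res_p <- rnorm(n)                          # stands in for residualized log price
theta_m <- seq(-2, -1, length.out = 12)    # simulated monthly elasticities (made up)
res_q <- theta_m[as.integer(month)] * res_p + rnorm(n)

# One slope per calendar month, no intercept and no common slope.
fit <- lm(res_q ~ 0 + month:res_p)
est <- coef(fit)
ci  <- confint(fit, level = 0.95)

print(round(cbind(estimate = est, ci), 2))
```

Each row of the printed table is one month's estimated elasticity with its interval, which is the kind of quantity a by-month plot would display.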
Figure : Distribution of Own price elasticities as estimated by Orthogonal Lasso, Double Orthogonal Ridge, and Orthogonal Least Squares
- Open /orthoml-master/src/Figure3.R and set directoryname to the location of the downloaded files. From the shell or in R,
The code produces a histogram of estimates for the average price elasticity for categories, aggregated at Level 2, Level 3, and Level 4, grouped by color at Level 1. A plot example is given below.
We see that the Lasso estimates are the most concentrated (shrunk toward a homogeneous specification), the Orthogonal Least Squares estimates are the most dispersed (and least precise), and Double Orthogonal Ridge is in the middle.
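The mechanism behind this pattern can be sketched with the textbook closed forms for an orthonormal design: Lasso soft-thresholds deviations from the common value, while Ridge scales them down proportionally. The numbers below are simulated and the penalty levels are arbitrary, so the relative ordering of Lasso and Ridge in this toy example depends on those choices and is not evidence about the data.

```r
set.seed(3)
# Simulated noisy category-level elasticities (not estimates from the data).
ols <- -1.5 + rnorm(50, sd = 0.6)
center <- mean(ols)               # homogeneous (common-elasticity) benchmark
dev <- ols - center               # category-specific deviations

lambda_lasso <- 0.4               # arbitrary penalty levels, for illustration only
lambda_ridge <- 1.0

# Lasso-style soft-thresholding: small deviations become exactly zero.
lasso <- center + sign(dev) * pmax(abs(dev) - lambda_lasso, 0)
# Ridge-style proportional shrinkage: every deviation is scaled down.
ridge <- center + dev / (1 + lambda_ridge)

# Dispersion of each set of estimates.
print(round(c(ols = sd(ols), ridge = sd(ridge), lasso = sd(lasso)), 3))
```

Both shrinkage rules reduce dispersion relative to unpenalized least squares; only the Lasso-style rule collapses some categories exactly onto the common value.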
"Double/Debiased Machine Learning for Treatment and Causal Parameters", Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, James Robins, 2017.
"Estimation and Inference about Heterogeneous Treatment Effects in High-Dimensional Dynamic Panels", Vira Semenova, Matt Goldman, Victor Chernozhukov, Matt Taddy, 2017.
"Pricing Engine: Estimating Causal Impacts in Real World Business Settings", Matt Goldman, Brian Quistorff, 2018.