Event study in Stata 

First of all, thank you for this wonderful resource!

I am confused by the Stata event study code and think it might not be entirely correct. For reference, here it is:

```
use "https://raw.githubusercontent.com/LOST-STATS/LOST-STATS.github.io/master/Model_Estimation/Data/Event_Study_DiD/bacon_example.dta", clear

* create the lag/lead for treated states
* fill in control obs with 0
* This allows for the interaction between `treat` and `time_to_treat` to occur for each state.
* Otherwise, there may be some NAs and the estimations will be off.
g time_to_treat = year - _nfd
replace time_to_treat = 0 if missing(_nfd)
* this will determine the difference
* btw controls and treated states
g treat = !missing(_nfd)

* Stata won't allow factors with negative values, so let's shift
* time-to-treat to start at 0, keeping track of where the true -1 is
summ time_to_treat
g shifted_ttt = time_to_treat - r(min)
summ shifted_ttt if time_to_treat == -1
local true_neg1 = r(mean)

* Regress on our interaction terms with FEs for group and year,
* clustering at the group (state) level
* use ib# to specify our reference group
reghdfe asmrs ib`true_neg1'.shifted_ttt pcinc asmrh cases, a(stfips year) vce(cluster stfips)
```

My problem stems from this line:

```
replace time_to_treat = 0 if missing(_nfd)
```

This means that observations from never-treated states are given `time_to_treat = 0`, the same value a treated state has in its treatment year. Tabulating `time_to_treat` gives the following:

```

time_to_treat	Freq.	Percent	Cum.
-21	1	0.06	0.06
-20	2	0.12	0.19
-19	2	0.12	0.31
-18	2	0.12	0.43
-17	2	0.12	0.56
-16	3	0.19	0.74
-15	3	0.19	0.93
-14	3	0.19	1.11
-13	6	0.37	1.48
-12	7	0.43	1.92
-11	9	0.56	2.47
-10	12	0.74	3.22
-9	22	1.36	4.58
-8	25	1.55	6.12
-7	32	1.98	8.10
-6	34	2.10	10.20
-5	36	2.23	12.43
-4	36	2.23	14.66
-3	36	2.23	16.88
-2	36	2.23	19.11
-1	36	2.23	21.34
0	465	28.76	50.09
1	36	2.23	52.32
2	36	2.23	54.55
3	36	2.23	56.77
4	36	2.23	59.00
5	36	2.23	61.22
6	36	2.23	63.45
7	36	2.23	65.68
8	36	2.23	67.90
9	36	2.23	70.13
10	36	2.23	72.36
11	36	2.23	74.58
12	35	2.16	76.75
13	34	2.10	78.85
14	34	2.10	80.95
15	34	2.10	83.06
16	34	2.10	85.16
17	33	2.04	87.20
18	33	2.04	89.24
19	33	2.04	91.28
20	30	1.86	93.14
21	29	1.79	94.93
22	27	1.67	96.60
23	24	1.48	98.08
24	14	0.87	98.95
25	11	0.68	99.63
26	4	0.25	99.88
27	2	0.12	100.00
			
Total	1,617	100.00
```
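To see how the 465 zeros split between the two groups, a cross-tabulation against `treat` should help. This is only a quick sketch that rebuilds the two variables exactly as in the code above. Judging by the neighbouring rows of the table, I would guess that only around 36 of the zeros are treated states in their treatment year, with the rest coming from never-treated states, but I have not confirmed the exact split.

```stata
use "https://raw.githubusercontent.com/LOST-STATS/LOST-STATS.github.io/master/Model_Estimation/Data/Event_Study_DiD/bacon_example.dta", clear

* Rebuild the variables exactly as in the code under discussion
g time_to_treat = year - _nfd
replace time_to_treat = 0 if missing(_nfd)
g treat = !missing(_nfd)

* Columns: never-treated (treat == 0) vs. treated (treat == 1)
tab time_to_treat treat if time_to_treat == 0

* Number of never-treated observations that were coded as event time 0
count if time_to_treat == 0 & treat == 0
```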

It's possible that, because `time_to_treat` does not vary across years within control units, the state (`stfips`) fixed effects "take care" of this. But I can't intuitively reason about what's really happening, given that `0` stands both for never-treated observations and for treated observations in event year `0`.
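One way to make the fixed-effects argument concrete, as a sketch in my own notation (not taken from the page): let $T_i$ stand for `_nfd` in state $i$, and let $D^{0}_{it}$ be the indicator for `time_to_treat == 0` under the current coding. Then

$$
D^{0}_{it} \;=\; \underbrace{\mathbf{1}\{\text{treat}_i = 1,\ \text{year}_t - T_i = 0\}}_{\text{treated state in event year } 0} \;+\; \underbrace{\mathbf{1}\{\text{treat}_i = 0\}}_{\text{never-treated state}} .
$$

The second term does not depend on $t$, so it is a sum of state dummies:

$$
\mathbf{1}\{\text{treat}_i = 0\} \;=\; \sum_{j\,:\,\text{treat}_j = 0} \mathbf{1}\{i = j\} .
$$

If that is right, partialling out the `stfips` fixed effects removes the second term entirely, and the remaining variation in $D^{0}_{it}$ comes only from treated states. By the Frisch-Waugh-Lovell theorem, the event-study coefficients would then be the same as under a coding that puts never-treated states in their own category. This assumes every never-treated observation receives the same constant value, which is what the `replace` line does.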

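I would recommend setting `time_to_treat` for never-treated states to a value outside the observed range, such as `100` or the maximum plus `100`, to avoid this confusion. The specific value doesn't matter, since `time_to_treat` only enters the regression as a set of indicator variables anyway.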
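A minimal sketch of what I have in mind is below. The names `ttt_alt`, `shifted_alt`, `ttt_min`, `sentinel`, and `alt_neg1` are my own placeholders, not from the original page. Everything else follows the existing specification. I would expect `reghdfe` to report the sentinel level as omitted, since it is constant within never-treated states and therefore collinear with the `stfips` fixed effects.

```stata
use "https://raw.githubusercontent.com/LOST-STATS/LOST-STATS.github.io/master/Model_Estimation/Data/Event_Study_DiD/bacon_example.dta", clear

* Event time for treated states; missing for never-treated states
g ttt_alt = year - _nfd

* Record the observed range, then pick a sentinel well outside it
* (hypothetical choice: maximum observed event time plus 100)
summ ttt_alt
local ttt_min = r(min)
local sentinel = r(max) + 100
replace ttt_alt = `sentinel' if missing(_nfd)

* Shift so the factor variable has no negative levels;
* the true -1 then sits at -1 minus the observed minimum
g shifted_alt = ttt_alt - `ttt_min'
local alt_neg1 = -1 - `ttt_min'

* Same specification as before, using the sentinel coding
reghdfe asmrs ib`alt_neg1'.shifted_alt pcinc asmrh cases, a(stfips year) vce(cluster stfips)
```

With this coding, a tabulation of `ttt_alt` would show never-treated observations in their own row rather than mixed into event time `0`.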

Event study in Stata #145
