Add weights option to event_study #920
…st stops complaining
for more information, see https://pre-commit.ci
I also added a fix for the saturated event study not being able to call summarize because of an out-of-date check in the summarize code.
Awesome, thanks so much! Will review this first thing tomorrow morning!
pre-commit.ci autofix
if weights is not None and use_weights:
    post = (
        df[df[treatment] == 1]
        .groupby([cohort, period])[weights]
Wouldn't this be frequency weights? I.e. we sum up weights over all treated by cohort and period? 🤔
Hm, this is also the implementation in fixest. I didn't test but would guess the std error if we called this frequency weights wouldn't match.
Based on the fixest docs: if use_weights = True, then the aggregation uses the weights, and the weights are treated as frequency weights:
#' @param use_weights Logical, default is TRUE. If the estimation was weighted,
#' whether the aggregation should take into account the weights. Basically if the
#' weights reflected frequency it should be TRUE.
So my understanding would be: use_weights = FALSE -> analytical weights, use_weights = TRUE -> frequency weights?
Co-authored-by: Alexander Fischer <alexander-fischer1801@t-online.de>
Thanks for the comments. Let me make sure the weighted test covers the saturated twfe function and update the PR.
The "use_weights" issue is down to your taste, I think. I added it to match fixest, but I also don't see the purpose of having it as a separate argument if the user has already supplied weights.
I think the question of freq versus analytic comes back to the discussion we were having about guessing which one fixest uses. At least, passing the weights as analytic to feols matched the std. errors from fixest.
pre-commit.ci autofix
Now I think everything looks good - the only thing I'd like to clarify before merging:
Thanks, Alex. I checked to verify that the tests would fail if we were using "fweights" instead of "aweights". I had expected this, since you had mentioned that fixest isn't explicit about the type of weights it uses, but it should be aweights.

As far as I can see, the sunab code only additionally weights in the way I already implemented in the code. I agree that both the text and the usage make it fairly clear that the weights here are used as "fweights", though I don't think this is ever seen by the underlying estimation routine that fits the model.

My best current guess is that the weights argument is used in both ways in fixest. It is used as aweights when running the core estimation routine. Then, for aggregating from the cohort-period estimates to whatever the user specifies, it is in practice treated as a frequency weight.

Finally, I realized that the "aggregate" function is not really user facing in fixest. It should just be called behind the scenes when the user specifies the "agg" argument when summarizing the result from the estimation (https://lrberge.github.io/fixest/reference/sunab.html). In the code base I didn't find a case where use_weights = False, so I am guessing they always use these weights if they were supplied when setting up the initial model.

If my understanding of all this is correct, then I don't think there is a way to match fixest's output and be consistent in pyfixest's use of the weights. If you want to match the output, then I guess the right thing is to remove the "use_weights" option and force the aggregation to use the weights as sampling weights. If you don't necessarily want to match the fixest output, the whole PR might need a re-think.
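To make the aggregation step discussed in this comment concrete, here is a minimal sketch in plain Python. It is not the pyfixest or fixest implementation: the helper names `cell_weights` and `aggregate_by_relative_period`, the toy rows, and the coefficient values are all hypothetical. It sums the weights over treated observations within each cohort-period cell (so the weights act like frequencies) and uses those totals to average the cohort-period estimates by relative period.

```python
from collections import defaultdict


def cell_weights(rows, use_weights=True):
    """Sum weights over treated rows by (cohort, period).

    With use_weights=False every treated row counts as 1, i.e. plain cell sizes.
    """
    totals = defaultdict(float)
    for row in rows:
        if row["treatment"] == 1:
            key = (row["cohort"], row["period"])
            totals[key] += row["weight"] if use_weights else 1.0
    return dict(totals)


def aggregate_by_relative_period(coefs, totals):
    """Weighted average of cohort-period coefficients within each relative period."""
    num = defaultdict(float)
    den = defaultdict(float)
    for (cohort, period), beta in coefs.items():
        rel = period - cohort  # periods since treatment started
        w = totals[(cohort, period)]
        num[rel] += w * beta
        den[rel] += w
    return {rel: num[rel] / den[rel] for rel in num}


# Hypothetical toy data: two cohorts, both observed at relative period 0.
rows = [
    {"cohort": 2, "period": 2, "treatment": 1, "weight": 3.0},
    {"cohort": 2, "period": 2, "treatment": 1, "weight": 1.0},
    {"cohort": 3, "period": 3, "treatment": 1, "weight": 1.0},
    {"cohort": 3, "period": 2, "treatment": 0, "weight": 5.0},  # untreated, ignored
]
# Hypothetical cohort-period estimates from a saturated model.
coefs = {(2, 2): 1.0, (3, 3): 2.0}

weighted = aggregate_by_relative_period(coefs, cell_weights(rows, use_weights=True))
unweighted = aggregate_by_relative_period(coefs, cell_weights(rows, use_weights=False))
```

In this toy case the two settings give different aggregated effects at relative period 0, which is why the choice between them matters for matching fixest's output.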
Sorry for the delayed response, my parents were visiting over the weekend =)
This is also my understanding, and I wonder whether this leads to errors, because the SEs under frequency weights and analytical weights are different? I.e. when computing the vcov matrix, fixest uses "analytical" errors? There's also a slight chance that analytical and frequency errors have the exact same form and I still misunderstand something. Maybe the best next step here would be to open a PR in the
Also linking to the Stata documentation that might be helpful / I need to take a closer look at later: link
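As a rough illustration of the question raised here, the sketch below compares iid standard errors for an intercept-only weighted regression under two conventions. It assumes Stata-style definitions (frequency weights stand for replicated rows; analytic weights are rescaled to sum to the number of rows). Whether fixest follows either convention is exactly what this thread leaves open, and the function name `weighted_mean_se` and the data are hypothetical.

```python
import math


def weighted_mean_se(y, w, kind):
    """Intercept-only weighted least squares with iid standard errors.

    kind="fweight": row i stands for w[i] identical observations.
    kind="aweight": weights are rescaled to sum to the number of rows.
    (Stata-style conventions, assumed here for illustration only.)
    """
    n_rows = len(y)
    if kind == "aweight":
        scale = n_rows / sum(w)
        w = [wi * scale for wi in w]
    n_eff = sum(w)  # number of rows for aweights, total count for fweights
    beta = sum(wi * yi for wi, yi in zip(w, y)) / n_eff
    rss = sum(wi * (yi - beta) ** 2 for wi, yi in zip(w, y))
    sigma2 = rss / (n_eff - 1)  # one estimated parameter (the intercept)
    return beta, math.sqrt(sigma2 / n_eff)


y = [1.0, 2.0, 4.0]
w = [2, 1, 3]

beta_f, se_f = weighted_mean_se(y, w, "fweight")
beta_a, se_a = weighted_mean_se(y, w, "aweight")

# Expanding the data by the frequency weights reproduces the fweight result.
expanded = [yi for yi, wi in zip(y, w) for _ in range(wi)]
beta_e, se_e = weighted_mean_se(expanded, [1] * len(expanded), "fweight")
```

Under these assumptions the point estimates agree while the standard errors do not, because the two conventions imply different effective sample sizes. A test that compares standard errors against fixest would therefore be able to tell the two apart.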
Thanks, I'll also try to read through and get more clarity before pushing an update. I should have time again this weekend.
Sorry for disappearing on this one, Alex. Let me take a look this weekend.
Hi, no worries at all! Thanks for the update!
This PR exposes the (analytic) weights option in the did/event_study function and in all the subclasses of DiD it currently calls (twfe, did2s, and saturated_twfe). The related issue is #919.
Changes in this PR: