|
4 | 4 | "cell_type": "markdown",
|
5 | 5 | "metadata": {},
|
6 | 6 | "source": [
|
7 |
| - "# Python: Pepeated Cross-Sectional Data with Multiple Time Periods\n", |
| 7 | + "# Python: Repeated Cross-Sectional Data with Multiple Time Periods\n", |
8 | 8 | "\n",
|
9 | 9 | "In this example, we provide a detailed guide on Difference-in-Differences with multiple time periods using the [DoubleML-package](https://docs.doubleml.org/stable/index.html). The implementation is based on [Callaway and Sant'Anna (2021)](https://doi.org/10.1016/j.jeconom.2020.12.001).\n",
|
10 | 10 | "\n",
|
|
37 | 37 | "source": [
|
38 | 38 | "## Data\n",
|
39 | 39 | "\n",
|
40 |
| - "We will rely on the `make_did_CS2021` DGP, which is inspired by [Callaway and Sant'Anna(2021)](https://doi.org/10.1016/j.jeconom.2020.12.001) (Appendix SC) and [Sant'Anna and Zhao (2020)](https://doi.org/10.1016/j.jeconom.2020.06.003).\n", |
| 40 | + "We will rely on the `make_did_cs_CS2021` DGP, which is inspired by [Callaway and Sant'Anna(2021)](https://doi.org/10.1016/j.jeconom.2020.12.001) (Appendix SC) and [Sant'Anna and Zhao (2020)](https://doi.org/10.1016/j.jeconom.2020.06.003).\n", |
41 | 41 | "\n",
|
42 |
| - "We will observe `n_obs` units over `n_periods`. Remark that the dataframe includes observations of the potential outcomes `y0` and `y1`, such that we can use oracle estimates as comparisons. " |
| 42 | + "We will observe approximately `n_obs` units over `n_periods` time periods. The parameter `lambda_t` determines the probability of observing unit `i` in time period `t`. Here `lambda_t` is set to 0.5 for all time periods, so each unit has a 50% chance of being observed in each time period.\n", |
| 43 | + "\n", |
| 44 | + "Note that the dataframe includes the potential outcomes `y0` and `y1`, so that oracle estimates can be used for comparison." |
43 | 45 | ]
|
44 | 46 | },
|
45 | 47 | {
|
|
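To make the role of `lambda_t` concrete, the following is a minimal sketch of the observation mechanism described in the Data cell, using only the standard library. It is not the `make_did_cs_CS2021` implementation: the function name `simulate_observed`, the seed, and the assumption that each unit-period pair is observed independently with probability `lambda_t` are illustrative.

```python
import random


def simulate_observed(n_obs, n_periods, lambda_t=0.5, seed=42):
    """Return the (unit, period) pairs that end up in the sample.

    Assumption (illustrative, not the DoubleML DGP code): each unit is
    observed independently in each period with probability lambda_t.
    """
    rng = random.Random(seed)
    return [
        (i, t)
        for i in range(n_obs)
        for t in range(n_periods)
        if rng.random() < lambda_t
    ]


observed = simulate_observed(n_obs=1000, n_periods=4)
share_observed = len(observed) / (1000 * 4)
print(f"share of unit-period pairs observed: {share_observed:.3f}")
```

With `lambda_t=0.5`, roughly half of all unit-period pairs are kept, so a given unit typically appears in some periods but not in others, which is what distinguishes repeated cross-sections from a balanced panel.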
389 | 391 | "The choice `gt_combinations=\"standard\"` estimates all possible combinations of $ATT(g,t_\\text{eval})$ via $\\widehat{ATT}(\\mathrm{g},t_\\text{pre},t_\\text{eval})$,\n",
|
390 | 392 | "where the standard choice is $t_\\text{pre} = \\min(\\mathrm{g}, t_\\text{eval}) - 1$ (without anticipation).\n",
|
391 | 393 | "\n",
|
392 |
| - "Remark that this includes pre-tests effects if $\\mathrm{g} > t_{eval}$, e.g. $\\widehat{ATT}(g=\\text{2025-04}, t_{\\text{pre}}=\\text{2025-01}, t_{\\text{eval}}=\\text{2025-02})$ which estimates the pre-trend from January to February even if the actual treatment occured in April." |
| 394 | + "Remark that this includes pre-test effects if $\\mathrm{g} > t_\\text{eval}$, e.g. $\\widehat{ATT}(g=3, t_{\\text{pre}}=0, t_{\\text{eval}}=1)$, which estimates the pre-trend from time period $0$ to $1$ even if the actual treatment occurred in time period $3$." |
393 | 395 | ]
|
394 | 396 | },
|
395 | 397 | {
|
|
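The rule $t_\text{pre} = \min(\mathrm{g}, t_\text{eval}) - 1$ can be sketched in plain Python. This is a hedged illustration, not the DoubleML source: the helper `standard_gt_combinations` is hypothetical, and it assumes integer time periods, no anticipation, and that combinations whose base period would fall before the first observed period are skipped.

```python
def standard_gt_combinations(groups, periods):
    """Enumerate (g, t_pre, t_eval) with t_pre = min(g, t_eval) - 1.

    Assumptions: integer periods, no anticipation, and combinations whose
    base period lies before the first observed period are dropped.
    """
    first_period = min(periods)
    combos = []
    for g in groups:
        for t_eval in periods:
            t_pre = min(g, t_eval) - 1
            if t_pre >= first_period:
                combos.append((g, t_pre, t_eval))
    return combos


combos = standard_gt_combinations(groups=[2, 3], periods=[0, 1, 2, 3])
print(combos)
```

For group `g=3`, the list contains `(3, 0, 1)`, the pre-test combination mentioned in the text: the base period moves along with the evaluation period until treatment starts, and is then fixed at `g - 1`.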
630 | 632 | "metadata": {},
|
631 | 633 | "outputs": [],
|
632 | 634 | "source": [
|
633 |
| - "df[\"e\"] = pd.to_datetime(df[\"t\"]).values.astype(\"datetime64[M]\") - \\\n", |
634 |
| - " pd.to_datetime(df[\"d\"]).values.astype(\"datetime64[M]\")\n", |
635 |
| - "df.groupby(\"e\")[\"ite\"].mean()[1:]" |
| 635 | + "df_treated = df[df[\"d\"] != np.inf].copy()\n", |
| 636 | + "df_treated[\"e\"] = df_treated[\"t\"] - df_treated[\"d\"]\n", |
| 637 | + "df_treated.groupby(\"e\")[\"ite\"].mean().iloc[1:]" |
636 | 638 | ]
|
637 | 639 | },
|
638 | 640 | {
|
|
899 | 901 | "cell_type": "markdown",
|
900 | 902 | "metadata": {},
|
901 | 903 | "source": [
|
902 |
| - "### Selected Combinations\n", |
| 904 | + "### Universal Base Period\n", |
903 | 905 | "\n",
|
904 |
| - "Instead it is also possible to just submit a list of tuples containing $(\\mathrm{g}, t_\\text{pre}, t_\\text{eval})$ combinations. E.g. only two combinations" |
| 906 | + "The option `gt_combinations=\"universal\"` sets $t_\\text{pre} = \\mathrm{g} - \\delta - 1$, corresponding to a universal/constant comparison or base period.\n", |
| 907 | + "\n", |
| 908 | + "Remark that this implies $t_\\text{pre} > t_\\text{eval}$ for all pre-treatment periods (accounting for anticipation). Therefore these effects do not have the same straightforward interpretation as ATTs." |
905 | 909 | ]
|
906 | 910 | },
|
907 | 911 | {
|
|
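As a hedged illustration of the universal base period $t_\text{pre} = \mathrm{g} - \delta - 1$, the sketch below enumerates the resulting combinations in plain Python. The helper `universal_gt_combinations` is hypothetical and not part of DoubleML; it assumes integer periods and, as a simplification, leaves out the base period itself as an evaluation period.

```python
def universal_gt_combinations(groups, periods, anticipation=0):
    """Enumerate (g, t_pre, t_eval) with t_pre = g - anticipation - 1.

    Assumptions: integer periods; groups whose base period lies before the
    first observed period are dropped; the base period itself is not used
    as an evaluation period.
    """
    first_period = min(periods)
    combos = []
    for g in groups:
        t_pre = g - anticipation - 1
        if t_pre < first_period:
            continue
        for t_eval in periods:
            if t_eval != t_pre:
                combos.append((g, t_pre, t_eval))
    return combos


combos = universal_gt_combinations(groups=[3], periods=[0, 1, 2, 3])
print(combos)
```

Here the base period is fixed at `2` for every evaluation period, so the pre-treatment combinations `(3, 2, 0)` and `(3, 2, 1)` compare an earlier period against a later base period.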
910 | 914 | "metadata": {},
|
911 | 915 | "outputs": [],
|
912 | 916 | "source": [
|
913 |
| - "gt_dict = {\n", |
914 |
| - " \"gt_combinations\": [\n", |
915 |
| - " (4.0, 1, 2),\n", |
916 |
| - " (4.0, 1, 3),\n", |
917 |
| - " ]\n", |
918 |
| - "}\n", |
919 |
| - "\n", |
920 |
| - "dml_obj_all = DoubleMLDIDMulti(dml_data, **(default_args| gt_dict))\n", |
921 |
| - "dml_obj_all.fit()\n", |
922 |
| - "dml_obj_all.bootstrap(n_rep_boot=5000)\n", |
923 |
| - "dml_obj_all.plot_effects()" |
| 917 | + "dml_obj_universal = DoubleMLDIDMulti(dml_data, **(default_args | {\"gt_combinations\": \"universal\"}))\n", |
| 918 | + "dml_obj_universal.fit()\n", |
| 919 | + "dml_obj_universal.bootstrap(n_rep_boot=5000)\n", |
| 920 | + "dml_obj_universal.plot_effects()" |
924 | 921 | ]
|
925 | 922 | },
|
926 | 923 | {
|
927 | 924 | "cell_type": "markdown",
|
928 | 925 | "metadata": {},
|
929 | 926 | "source": [
|
930 |
| - "### Universal Base Period\n", |
931 |
| - "\n", |
932 |
| - "The option `gt_combinations=\"universal\"` set $t_\\text{pre} = \\mathrm{g} - \\delta - 1$, corresponding to a universal/constant comparison or base period.\n", |
| 927 | + "### Selected Combinations\n", |
933 | 928 | "\n",
|
934 |
| - "Remark that this implies $t_\\text{pre} > t_\\text{eval}$ for all pre-treatment periods (accounting for anticipation). Therefore these effects do not have the same straightforward interpretation as ATT's." |
| 929 | + "Alternatively, it is possible to pass a list of tuples containing $(\\mathrm{g}, t_\\text{pre}, t_\\text{eval})$ combinations, e.g. only two combinations:" |
935 | 930 | ]
|
936 | 931 | },
|
937 | 932 | {
|
|
940 | 935 | "metadata": {},
|
941 | 936 | "outputs": [],
|
942 | 937 | "source": [
|
943 |
| - "dml_obj_universal = DoubleMLDIDMulti(dml_data, **(default_args| {\"gt_combinations\": \"universal\"}))\n", |
944 |
| - "dml_obj_universal.fit()\n", |
945 |
| - "dml_obj_universal.bootstrap(n_rep_boot=5000)\n", |
946 |
| - "dml_obj_universal.plot_effects()" |
| 938 | + "gt_dict = {\n", |
| 939 | + " \"gt_combinations\": [\n", |
| 940 | + " (4.0, 1, 2),\n", |
| 941 | + " (4.0, 1, 3),\n", |
| 942 | + " ]\n", |
| 943 | + "}\n", |
| 944 | + "\n", |
| 945 | + "dml_obj_all = DoubleMLDIDMulti(dml_data, **(default_args | gt_dict))\n", |
| 946 | + "dml_obj_all.fit()\n", |
| 947 | + "dml_obj_all.bootstrap(n_rep_boot=5000)\n", |
| 948 | + "dml_obj_all.plot_effects()" |
947 | 949 | ]
|
948 | 950 | }
|
949 | 951 | ],
|
950 | 952 | "metadata": {
|
951 | 953 | "kernelspec": {
|
952 |
| - "display_name": "Python 3", |
| 954 | + "display_name": "dml_dev", |
953 | 955 | "language": "python",
|
954 | 956 | "name": "python3"
|
955 | 957 | },
|
|
963 | 965 | "name": "python",
|
964 | 966 | "nbconvert_exporter": "python",
|
965 | 967 | "pygments_lexer": "ipython3",
|
966 |
| - "version": "3.12.10" |
| 968 | + "version": "3.12.8" |
967 | 969 | }
|
968 | 970 | },
|
969 | 971 | "nbformat": 4,
|
|