Reproducing the COVID-19 results in your paper #9

PeterJackNaylor · 2021-02-05T17:13:35Z

Hi,

Thanks for the paper and for the open source code.
I have been trying to reproduce your results, particularly those on the COVID-19 dataset (it is the smallest dataset, so the fastest to run without a GPU).
Could you share the hyper-parameter settings used for this dataset, or at least those you think have a meaningful impact on training? Unfortunately, I cannot run a grid search because I have very little computation power.

In addition, when we set the horizon to 28 on the covid-19 dataset, we get an error when trying to reproduce.
It occurs at line 71 in Forecast_dataloader:
range(self.window_size, self.df_length - self.horizon + 1)
From the paper and the GitHub repo, I gathered that the validation set uses a window size of 28, a dataset length of 50 (50 days), and a horizon of 28. With these values it seems expected that the code fails, since the expression above becomes range(28, 23), which is empty.
Are there any steps I could take to make the code work?

Thank you in advance,
Peter


hangzhao-microsoft · 2021-03-13T09:16:48Z

Hi Peter,
Thanks for reaching out.
The error occurs because a horizon of 28 combined with a window size of 28 requires the validation set to contain at least 56 (28 + 28) time stamps, but your validation set has only 50. You can reduce the horizon or the window size to fit your data.
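As a rough illustration of the constraint, the sketch below counts how many samples the range expression quoted above produces for a given split length. The helper names num_samples and max_horizon are hypothetical and not part of the repository; only the range expression is taken from the issue.

```python
def num_samples(df_length, window_size, horizon):
    """Count the samples yielded by
    range(window_size, df_length - horizon + 1), as quoted in the issue."""
    return len(range(window_size, df_length - horizon + 1))


def max_horizon(df_length, window_size):
    """Largest horizon that still leaves at least one sample
    (hypothetical helper, derived from the range expression above)."""
    return df_length - window_size


print(num_samples(50, 28, 28))  # 0 -> range(28, 23) is empty
print(num_samples(56, 28, 28))  # 1 -> exactly one sample fits
print(max_horizon(50, 28))      # 22
```

With a validation set of 50 days and a window of 28, any horizon up to 22 leaves at least one sample under this expression.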

Best regards,
Hang

michaeldemos · 2021-04-12T00:58:11Z

Hi,

I would like to reproduce the following results.

As mentioned above, when we set the horizon to 28 on the covid-19 dataset, we get an error when trying to reproduce.

Can you please share:

- all hyperparameters used
- the lengths of the training, validation and test sets
- the dataset used (the paper mentions 25 countries over 110 days, from 1/22/2020 to 5/10/2020)

Thank you

moreOver0 · 2021-04-13T06:44:48Z

@michaeldemos

There are some differences between the code used to produce the COVID-19 results and the master branch of the GitHub repo.

For the COVID-19 case, we forecast one step at a time and obtained the longer horizons by rolling forecasting. So training was based on samples of length 28 + 1 = 29.
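A minimal sketch of what rolling forecasting with a one-step model could look like is shown here. It is not the code used for the paper: rolling_forecast and last_value_model are hypothetical names, and the placeholder model simply repeats the last observation so that the example runs on its own.

```python
def rolling_forecast(one_step_model, history, horizon, window_size=28):
    """Forecast `horizon` steps by repeatedly predicting one step ahead
    and feeding each prediction back into the input window.

    history: list of time steps, each a list with one value per series.
    """
    window = list(history[-window_size:])
    predictions = []
    for _ in range(horizon):
        next_step = one_step_model(window)  # one value per series
        predictions.append(next_step)
        # Slide the window: drop the oldest step, append the prediction.
        window = window[1:] + [next_step]
    return predictions


def last_value_model(window):
    """Placeholder one-step model (assumption): repeat the last observation."""
    return list(window[-1])


# 28 days of toy data for 3 series.
history = [[float(day), float(day) * 2, float(day) * 3] for day in range(28)]
forecast = rolling_forecast(last_value_model, history, horizon=7)
print(len(forecast), len(forecast[0]))  # 7 3
```

In a real setup, one_step_model would be the trained network called on a 28-step window.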
As for your question about how to split the data, we first split the 110 days into (110 - (28 + horizon) + 1) samples, each of length (28 + horizon). The historical window of the first testing sample always starts on 02/23/2020 and ends on 03/21/2020. So whatever the horizon is, the model always forecasts the time range 03/22/2020 ~ 05/10/2020, which is 50 days.
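The arithmetic above can be checked with a short script. This is only a sanity check of the numbers and dates stated in this thread; the variable and function names are illustrative and do not come from the repository.

```python
from datetime import date, timedelta

TOTAL_DAYS = 110
WINDOW = 28

data_start = date(2020, 1, 22)
data_end = date(2020, 5, 10)
# The full range 1/22/2020 to 5/10/2020 spans 110 days inclusive.
total_span = (data_end - data_start).days + 1


def n_samples(horizon):
    """Number of sliding samples of length (WINDOW + horizon) in 110 days."""
    return TOTAL_DAYS - (WINDOW + horizon) + 1


# Historical window of the first testing sample, as stated above.
first_window_start = date(2020, 2, 23)
first_window_end = date(2020, 3, 21)
window_days = (first_window_end - first_window_start).days + 1

# The forecast range starts the day after the first window ends.
forecast_start = first_window_end + timedelta(days=1)
forecast_days = (data_end - forecast_start).days + 1

print(total_span)      # 110
print(n_samples(1))    # 82 samples of length 29 (one-step training)
print(window_days)     # 28
print(forecast_start)  # 2020-03-22
print(forecast_days)   # 50
```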

The current master does not handle the split for the COVID-19 case automatically, so I'm afraid some code modification is necessary.

catcatwang · 2021-12-14T07:22:57Z

Hello, I have collected the COVID-19 dataset, which contains 25 countries and 110 timestamps, but when I run the code I get the following error: ValueError: need at least one array to concatenate. How can I solve it? Looking forward to your answer.
