Reproducing the COVID-19 results in your paper #9

Open
PeterJackNaylor opened this issue Feb 5, 2021 · 4 comments

Comments

@PeterJackNaylor

Hi,

Thanks for the paper and for the open source code.
I have been trying to reproduce your results, particularly those on the COVID-19 dataset (the smallest dataset, so the fastest to run without a GPU).
Could you share the hyper-parameter settings used for this dataset, or at least those you think have a meaningful impact on training? Unfortunately, I cannot run a grid search because I have very little computing power.

In addition, when we set the horizon to 28 on the COVID-19 dataset, we get an error when trying to reproduce the results.
It occurs at line 71 in Forecast_dataloader:
range(self.window_size, self.df_length - self.horizon + 1)
From the paper and the GitHub repo, I gathered that the validation set uses a window size of 28, a dataset length of 50 (50 days), and a horizon of 28. With these values the failure seems expected, since we get range(28, 23), which is empty.
Are there any steps I could take to make the code work?

Thank you in advance,
Peter

@hangzhao-microsoft
Contributor

Hi Peter,
Thanks for reaching out.
The error occurs because a horizon of 28 and a window size of 28 require the validation set to contain at least 56 timestamps (28 + 28), but yours has only 50. You can reduce the horizon or the window size to fit your data.
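
The constraint can be checked with a minimal sketch that mirrors the range expression quoted from Forecast_dataloader. The function name `sample_indices` is hypothetical and not part of the repo; only the range expression itself comes from this thread.

```python
def sample_indices(df_length: int, window_size: int, horizon: int) -> range:
    """Hypothetical helper mirroring the range expression quoted in this thread."""
    return range(window_size, df_length - horizon + 1)


# Validation set of 50 days with window 28 and horizon 28: range(28, 23) is empty.
print(len(sample_indices(50, 28, 28)))  # 0

# The range is non-empty only when df_length >= window_size + horizon.
print(len(sample_indices(56, 28, 28)))  # 1

# Reducing the horizon to 22 also yields one sample from 50 days.
print(len(sample_indices(50, 28, 22)))  # 1
```

With 50 timestamps, any window and horizon whose sum is at most 50 gives a non-empty range.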

Best regards,
Hang

@michaeldemos

Hi,

I would like to reproduce the following results.

[image not preserved]

As mentioned above, when we set the horizon to 28 on the COVID-19 dataset, we get an error when trying to reproduce the results.

Could you please share:

  1. all hyperparameters used
  2. the lengths of the training, validation, and test sets
  3. the dataset used (the paper mentions 25 countries over 110 days, from 1/22/2020 to 5/10/2020)

Thank you

@moreOver0
Contributor

@michaeldemos

The code used to produce the COVID-19 results differs somewhat from the master branch of the GitHub repo.

For the COVID-19 case, we forecast one step at a time and obtained the longer horizons by rolling forecasting. So training was based on samples of length 28 + 1 = 29.
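
Rolling forecasting as described here can be sketched as follows. This is only an illustration under assumptions: `rolling_forecast` and `last_value_model` are hypothetical names, and the stand-in model simply repeats the last observation; it is not the code used for the paper.

```python
import numpy as np


def rolling_forecast(history, one_step_model, horizon):
    """Apply a one-step model repeatedly, feeding each prediction back in.

    history: array of shape (window, n_series), e.g. (28, 25).
    one_step_model: callable mapping a (window, n_series) array to (n_series,).
    """
    window = np.asarray(history, dtype=float).copy()
    predictions = []
    for _ in range(horizon):
        next_step = np.asarray(one_step_model(window), dtype=float)
        predictions.append(next_step)
        # Drop the oldest step and append the prediction as the newest one.
        window = np.vstack([window[1:], next_step])
    return np.stack(predictions)  # shape (horizon, n_series)


def last_value_model(window):
    """Hypothetical stand-in for a trained one-step model."""
    return window[-1]


history = np.arange(28 * 3, dtype=float).reshape(28, 3)
forecast = rolling_forecast(history, last_value_model, horizon=7)
print(forecast.shape)  # (7, 3)
```

The window length stays fixed at 28 throughout, so the same one-step model serves any horizon.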
As for your question about how to split the data: we first split the 110 days into (110 - (28 + horizon) + 1) * (28 + horizon), that is, (110 - (28 + horizon) + 1) samples of length (28 + horizon) each. The historical window of the first test sample always starts on 02/23/2020 and ends on 03/21/2020, so whatever the horizon is, the model always forecasts the range 03/22/2020 to 05/10/2020, which is 50 days.
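
The arithmetic of this split can be checked with a short sketch. The names `num_samples` and `span_days` are hypothetical; the dates and sizes are the ones given in this thread.

```python
from datetime import date

TOTAL_DAYS = 110  # 1/22/2020 to 5/10/2020, as stated in the thread
WINDOW = 28       # historical window length


def num_samples(horizon: int) -> int:
    """Number of sliding samples of length WINDOW + horizon in TOTAL_DAYS."""
    return TOTAL_DAYS - (WINDOW + horizon) + 1


def span_days(first: date, last: date) -> int:
    """Inclusive number of days between two dates."""
    return (last - first).days + 1


print(span_days(date(2020, 1, 22), date(2020, 5, 10)))  # 110 (full dataset)
print(span_days(date(2020, 2, 23), date(2020, 3, 21)))  # 28 (first test window)
print(span_days(date(2020, 3, 22), date(2020, 5, 10)))  # 50 (forecast range)
print(num_samples(1))  # 82 samples of length 29
```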

The current master does not handle the COVID-19 split automatically, so I'm afraid some code modification is necessary.

@catcatwang

Hello, I have collected the COVID-19 dataset, which contains 25 countries and 110 timestamps, but when I run the code I get the error "ValueError: need at least one array to concatenate". How can I solve it? Looking forward to your answer.
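
For reference, NumPy raises this message when `np.concatenate` receives an empty sequence. The sketch below only demonstrates that behaviour; it would arise if a split produced no samples (as with the empty range discussed earlier in this thread), but whether that is the cause here is an assumption, not something confirmed in the thread. `stack_batches` is a hypothetical helper, not part of the repo.

```python
import numpy as np


def stack_batches(batches):
    """Hypothetical helper: concatenate batches, with a clearer error when empty."""
    if len(batches) == 0:
        raise ValueError(
            "no samples were produced; check that each split has at least "
            "window_size + horizon timestamps"
        )
    return np.concatenate(batches, axis=0)


# Reproduce the message from the report with an empty list of arrays.
try:
    np.concatenate([])
except ValueError as err:
    print(err)  # need at least one array to concatenate
```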
