Request for data update #BUG ALSO #1658

Open
Imbernoulli opened this issue Sep 25, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@Imbernoulli

🌟 Request for data update

The convenient data retrieval method (e.g. python -m qlib.run.get_data qlib_data --target_dir ~/.qlib/qlib_data/cn_data --region cn) only provides data up to 2020-9-24. Fetching the data from Yahoo manually takes a very long time (more than 8 hours here for the download alone) and also raises errors.

When normalization begins, the following exception is raised:

[58410:MainThread](2023-09-25 06:27:36,588) ERROR - qlib.workflow - [utils.py:41] - An exception has been raised[TypeError: can't compare offset-naive and offset-aware datetimes].
  File "scripts/data_collector/yahoo/collector.py", line 1207, in <module>
    fire.Fire(Run)
  File "/Users/bernoulli_hermes/opt/anaconda3/envs/qlibenv/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/Users/bernoulli_hermes/opt/anaconda3/envs/qlibenv/lib/python3.8/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/Users/bernoulli_hermes/opt/anaconda3/envs/qlibenv/lib/python3.8/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "scripts/data_collector/yahoo/collector.py", line 1182, in update_data_to_bin
    self.normalize_data_1d_extend(qlib_data_1d_dir)
  File "scripts/data_collector/yahoo/collector.py", line 1072, in normalize_data_1d_extend
    yc.normalize()
  File "/Users/bernoulli_hermes/opt/anaconda3/envs/qlibenv/lib/python3.8/site-packages/qlib/scripts/data_collector/base.py", line 319, in normalize
    for _ in worker.map(self._executor, file_list):
  File "/Users/bernoulli_hermes/opt/anaconda3/envs/qlibenv/lib/python3.8/concurrent/futures/process.py", line 484, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/Users/bernoulli_hermes/opt/anaconda3/envs/qlibenv/lib/python3.8/concurrent/futures/_base.py", line 619, in result_iterator
    yield fs.pop().result()
  File "/Users/bernoulli_hermes/opt/anaconda3/envs/qlibenv/lib/python3.8/concurrent/futures/_base.py", line 444, in result
    return self.__get_result()
  File "/Users/bernoulli_hermes/opt/anaconda3/envs/qlibenv/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
TypeError: can't compare offset-naive and offset-aware datetimes
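For readers unfamiliar with this TypeError, the following is a minimal sketch of how it arises in plain Python, independent of qlib. The timestamps and the helper names (`try_compare`, `drop_timezone`) are hypothetical and only illustrate the general mechanism; this is not the code in collector.py, and where exactly the mixed values come from in the collector is not established by this thread.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical values: one datetime carries a UTC offset, the other does not.
aware = datetime(2023, 9, 22, tzinfo=timezone(timedelta(hours=8)))
naive = datetime(2023, 9, 25)


def try_compare(a, b):
    """Return the error message if a < b cannot be evaluated, else None."""
    try:
        a < b
    except TypeError as exc:
        return str(exc)
    return None


def drop_timezone(dt):
    """Discard the offset while keeping the wall-clock time unchanged."""
    return dt.replace(tzinfo=None)


# Comparing mixed values fails with the same message as in the traceback.
print(try_compare(aware, naive))

# Once both sides are naive, the comparison is well defined.
print(drop_timezone(aware) < naive)
```

Dropping the offset is only one possible choice; converting both sides to the same timezone would also make them comparable, and which is appropriate depends on how the dates are meant to be interpreted.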

@Imbernoulli Imbernoulli added the enhancement New feature or request label Sep 25, 2023
@Imbernoulli Imbernoulli changed the title Request for data update Request for data update #BUG ALSO Sep 25, 2023
@SunsetWolf
Collaborator

The data obtained with qlib.run.get_data is prepared by the qlib team; the last date it covers is 2020-09-25.
Which command were you running when you got the error above?
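To check how far a downloaded dataset actually extends, one option is to read the last entry of its trading calendar. The sketch below assumes the data directory contains a calendars/day.txt file with one date per line; that layout and the helper name `last_calendar_date` are assumptions made for illustration, not something stated in this thread.

```python
from pathlib import Path


def last_calendar_date(qlib_data_dir, freq="day"):
    """Return the last non-empty line of <qlib_data_dir>/calendars/<freq>.txt.

    Assumes one date per line; returns None if the file has no entries.
    """
    calendar_file = Path(qlib_data_dir).expanduser() / "calendars" / f"{freq}.txt"
    dates = [line.strip() for line in calendar_file.read_text().splitlines()]
    dates = [d for d in dates if d]
    return dates[-1] if dates else None


if __name__ == "__main__":
    data_dir = Path("~/.qlib/qlib_data/cn_data").expanduser()
    if (data_dir / "calendars" / "day.txt").exists():
        print(last_calendar_date(data_dir))
    else:
        print("calendar file not found under", data_dir)
```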

@Imbernoulli
Author

I used python scripts/data_collector/yahoo/collector.py update_data_to_bin --qlib_data_1d_dir <user data dir> --trading_date <start date> --end_date <end date> to manually fetch the most recent data.

@SunsetWolf
Collaborator

I recently changed the update_data_to_bin functionality in PR1641. That PR has now been merged, so you can pull the latest code and try again.

@Imbernoulli
Author

Thank you, I will try later.

@ElonJustin7

ElonJustin7 commented Nov 24, 2023

I recently changed the update_data_to_bin functionality in PR1641. That PR has now been merged, so you can pull the latest code and try again.

Excuse me, when I used the latest code, a new error occurred: AttributeError: 'Index' object has no attribute 'tz_localize'. It seems to be related to a new line of code added in collector.py, as shown in the attached screenshot. That line was supposed to address the earlier TypeError: can't compare offset-naive and offset-aware datetimes. Do you know what I should do?
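As background on this AttributeError: in pandas, tz_localize is defined on DatetimeIndex but not on a generic Index, so the call fails when the index still holds strings or other non-datetime values. The sketch below reproduces that with hypothetical data and shows one possible workaround, converting with pd.to_datetime before removing the timezone. The helper name `to_naive_datetime_index` is made up for illustration, and whether this matches the line added to collector.py cannot be confirmed from this thread.

```python
import pandas as pd

# Hypothetical index of date strings, e.g. as read from a CSV without date parsing.
raw_index = pd.Index(["2023-11-22", "2023-11-23"])

# A plain Index does not provide tz_localize; only DatetimeIndex does.
print(hasattr(raw_index, "tz_localize"))


def to_naive_datetime_index(index):
    """Convert to a DatetimeIndex, then drop any timezone, keeping local wall time."""
    dt_index = pd.DatetimeIndex(pd.to_datetime(index))
    # tz_localize(None) removes the timezone from an aware index and
    # leaves an already naive index unchanged.
    return dt_index.tz_localize(None)


print(to_naive_datetime_index(raw_index))
```

Converting first means the same call works whether the incoming dates are strings, naive datetimes, or datetimes with a single fixed offset.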
