fix concat then filter #397

binlins · 2021-04-21T10:21:15Z

Description

Motivation and Context

How Has This Been Tested?

Pass the tests by running: pytest qlib/tests/test_all_pipeline.py from the parent directory of qlib.
If you are adding a new feature, test on your own test scripts.

Screenshots of Test Results (if appropriate):

Pipeline test:
Your own tests:

Types of changes

Fix bugs
Add new feature
Update documentation

ghost · 2021-04-21T10:21:27Z

All CLA requirements met.

you-n-g · 2021-04-21T13:43:02Z

qlib/data/dataset/__init__.py

+        kwargs['col_set'] = ['filter']
+        data_filter = super()._prepare_seg(slc=slc, **kwargs)
+        if kwargs.get('data_key') == DataHandlerLP.DK_L: 
+            col_filter = data_filter['filter']['keep_train']


Please don't hardcode them
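
One possible way to address this, sketched under assumptions: keep the column-set name and the per-data-key filter column in a single mapping instead of repeating string literals at each call site. The names `FILTER_COL_SET`, `FILTER_COL_BY_KEY`, and `pick_filter_column` are hypothetical and not part of qlib; `DK_L` and `DK_I` below are stand-ins for `DataHandlerLP.DK_L` and `DataHandlerLP.DK_I`, with assumed values.

```python
import pandas as pd

# Stand-ins for DataHandlerLP.DK_L / DataHandlerLP.DK_I (assumed values).
DK_L = "learn"
DK_I = "infer"

# Hypothetical configuration: defined once, overridable by callers.
FILTER_COL_SET = "filter"
FILTER_COL_BY_KEY = {DK_L: "keep_train", DK_I: "keep_test"}


def pick_filter_column(data_filter, data_key,
                       col_by_key=FILTER_COL_BY_KEY, col_set=FILTER_COL_SET):
    """Return the boolean filter column for data_key, or None if no filter applies."""
    col = col_by_key.get(data_key)
    if col is None:
        return None
    return data_filter[col_set][col]


# Small frame with two-level columns, shaped like the one indexed in the diff.
frame = pd.DataFrame(
    {
        ("filter", "keep_train"): [True, False, True],
        ("filter", "keep_test"): [False, True, True],
    }
)
train_mask = pick_filter_column(frame, DK_L)
```

With a mapping like this, supporting another data key or renaming a filter column would only touch the configuration, not the branching logic.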

you-n-g · 2021-04-22T02:02:55Z

qlib/data/dataset/__init__.py

+        if kwargs.get('data_key') == DataHandlerLP.DK_L: 
+            col_filter = data_filter['filter']['keep_train']
+        elif kwargs.get('data_key') == DataHandlerLP.DK_I:
+            col_filter = data_filter['filter']['keep_test']


discussion....

you-n-g · 2021-04-22T02:03:29Z

qlib/data/dataset/__init__.py

@@ -470,6 +489,7 @@ def _prepare_seg(self, slc: slice, **kwargs) -> TSDataSampler:

        # TSDatasetH will retrieve more data for complete
        data = super()._prepare_seg(slice(pad_start, end), **kwargs)
+        col_filter = _prepare_col_filter(slice(pad_start, end), **kwargs)


you-n-g · 2021-04-22T10:11:52Z

qlib/data/dataset/__init__.py

+                    if col_filter[idx2]: 
+                        idx_map[idx] = (i, j)
+                        idx += 1
+                    idx2 += 1
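
The two counters in the snippet above can be illustrated with a minimal standalone sketch. It assumes that `col_filter` holds one boolean per non-missing cell, in the same row-major order the loop visits them; the function name `build_filtered_idx_map` and the NumPy array input are hypothetical simplifications, not the actual `build_index` code.

```python
import numpy as np


def build_filtered_idx_map(idx_df, col_filter):
    """Map a running sample index to (row, col), skipping filtered-out cells.

    idx_df: 2-D float array; NaN marks a missing (time, instrument) cell.
    col_filter: boolean sequence with one entry per non-missing cell,
        in row-major visiting order (an assumption of this sketch).
    """
    idx_map = {}
    idx = 0   # position among kept samples
    idx2 = 0  # position among all non-missing samples
    for i in range(idx_df.shape[0]):
        for j in range(idx_df.shape[1]):
            if np.isnan(idx_df[i, j]):
                continue  # missing cells consume neither counter
            if col_filter[idx2]:
                idx_map[idx] = (i, j)
                idx += 1
            idx2 += 1
    return idx_map


cells = np.array([[0.0, np.nan], [1.0, 2.0]])
mapping = build_filtered_idx_map(cells, [True, False, True])
```

Here `idx2` advances for every non-missing cell so it stays aligned with `col_filter`, while `idx` advances only for kept cells so the resulting keys are contiguous.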


you-n-g · 2021-04-22T10:22:03Z

qlib/data/dataset/__init__.py

@@ -279,8 +279,12 @@ def __init__(self, data: pd.DataFrame, start, end, step_len: int, fillna_type: s

        # the data type will be changed
        # The index of usable data is between start_idx and end_idx
-        self.start_idx, self.end_idx = self.data.index.slice_locs(start=pd.Timestamp(start), end=pd.Timestamp(end))
-        self.idx_df, self.idx_map = self.build_index(self.data)
+        if col_filter is None:


Note the padding here.

you-n-g · 2021-04-22T10:33:49Z

qlib/data/dataset/__init__.py

@@ -470,6 +489,7 @@ def _prepare_seg(self, slc: slice, **kwargs) -> TSDataSampler:

        # TSDatasetH will retrieve more data for complete
        data = super()._prepare_seg(slice(pad_start, end), **kwargs)
+        col_filter = _prepare_col_filter(slice(pad_start, end), **kwargs)


you-n-g · 2021-05-23T07:49:46Z

Closed because the same feature was merged in #290.

blin added 2 commits April 21, 2021 10:17

fix concat then filter

a635286

fix concat then filter

122a2a9

you-n-g reviewed Apr 22, 2021

you-n-g reviewed May 23, 2021

you-n-g closed this May 23, 2021
