Various fixes to work with large datasets in better way #1019

Merged
merged 4 commits into from
Jan 24, 2023

Conversation

nicl-nno
Collaborator

@nicl-nno nicl-nno commented Jan 11, 2023

Several small fixes are applied:

  • Stop if even one fold fails
  • Preset bug fixed
  • More stable work with short timeouts
  • Various minor changes
  • Sequential mode without Joblib parallelization
  • Tuner timeout processing
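
As an illustration of the sequential-mode item, here is a minimal sketch of a dispatcher that skips the parallel backend entirely when only one job is requested. The names `evaluate_all`, `evaluate`, and `n_jobs` are hypothetical and are not taken from the FEDOT code base; a standard-library thread pool stands in for Joblib so that the example stays self-contained.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def evaluate_all(items: Iterable[T], evaluate: Callable[[T], R], n_jobs: int = 1) -> List[R]:
    """Evaluate every item, using a plain loop when n_jobs == 1.

    Hypothetical helper: the thread pool below is only a stand-in
    for a Joblib-based parallel backend.
    """
    if n_jobs == 1:
        # Sequential mode: no worker pool is created at all, which avoids
        # the startup and serialization overhead of a parallel backend.
        return [evaluate(item) for item in items]
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # pool.map preserves the input order of the results.
        return list(pool.map(evaluate, items))
```

With large datasets, the sequential branch can be preferable because nothing has to be copied to worker processes.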

@nicl-nno added the "in progress" label Jan 11, 2023
@codecov

codecov bot commented Jan 11, 2023

Codecov Report

Merging #1019 (3fff7d4) into master (03ae732) will decrease coverage by 0.13%.
The diff coverage is 72.22%.

❗ Current head 3fff7d4 differs from pull request most recent head 4c6ebb4. Consider uploading reports for the commit 4c6ebb4 to get more accurate results

@@            Coverage Diff             @@
##           master    #1019      +/-   ##
==========================================
- Coverage   87.88%   87.75%   -0.13%     
==========================================
  Files         206      206              
  Lines       13805    13849      +44     
==========================================
+ Hits        12132    12153      +21     
- Misses       1673     1696      +23     
Impacted Files Coverage Δ
fedot/core/pipelines/tuning/tuner_builder.py 97.18% <ø> (ø)
.../core/optimisers/opt_history_objects/individual.py 81.41% <20.00%> (-2.85%) ⬇️
fedot/core/optimisers/gp_comp/evaluation.py 94.01% <33.33%> (-3.48%) ⬇️
fedot/core/composer/composer_builder.py 90.74% <50.00%> (-0.77%) ⬇️
.../core/optimisers/archive/individuals_containers.py 90.00% <50.00%> (-1.18%) ⬇️
fedot/core/composer/gp_composer/gp_composer.py 86.95% <55.55%> (-7.92%) ⬇️
fedot/core/pipelines/tuning/unified.py 90.00% <72.41%> (-10.00%) ⬇️
fedot/api/main.py 81.12% <75.00%> (-0.23%) ⬇️
fedot/api/api_utils/api_composer.py 97.79% <100.00%> (+0.03%) ⬆️
...t/api/api_utils/assumptions/assumptions_handler.py 86.00% <100.00%> (+0.58%) ⬆️
... and 15 more

fedot/api/api_utils/api_composer.py (outdated, resolved)
@@ -171,7 +171,10 @@ def fit(self,
        self.data_processor.accept_and_apply_recommendations(self.train_data, recommendations)
        self.params.accept_and_apply_recommendations(self.train_data, recommendations)
        self._init_remote_if_necessary()
        self.params.update_available_operations_by_preset(self.train_data)

        if self.params.api_params['preset'] != 'auto':
Collaborator

Why did this condition need to be added?

Collaborator Author

Because currently, if the available_operations list is set at the start, it no longer changes in the auto variant. I couldn't find a more elegant solution; it seems a larger refactoring is needed.

Collaborator

As it stands, available_operations is handled both here and in this piece of code. Is that how it should be?

fedot/api/api_utils/assumptions/assumptions_handler.py (outdated, resolved)

trials = Trials()

remaining_time = self.max_seconds - global_tuner_timer.minutes_from_start * 60
Collaborator

Why can't we count in seconds instead of rounding to minutes?

Collaborator Author

Changed.
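
For illustration, the remaining tuning time can be tracked directly in seconds, so that no precision is lost in a minutes-to-seconds conversion. The sketch below is hypothetical: `TunerTimer`, `seconds_from_start`, and `remaining_seconds` are assumed names, not the actual FEDOT timer API, and the final code in this PR may differ.

```python
import time
from typing import Callable


class TunerTimer:
    """Hypothetical timer that reports elapsed time in seconds."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        # A monotonic clock is not affected by system clock adjustments.
        self._clock = clock
        self._start = clock()

    @property
    def seconds_from_start(self) -> float:
        return self._clock() - self._start


def remaining_seconds(max_seconds: float, timer: TunerTimer) -> float:
    """Return the time budget left, clamped at zero."""
    return max(0.0, max_seconds - timer.seconds_from_start)
```

Clamping at zero keeps the value safe to pass on as a timeout even when the budget is already exhausted, which matters most for short timeouts.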

@@ -183,6 +190,9 @@ def fit(self,
        self.current_pipeline, self.best_models, self.history = \
            self.api_composer.obtain_model(**self.params.api_params)

        if self.current_pipeline is None:
            raise ValueError('No any models were found')
Collaborator

Grammar: "No models were found" is enough; "any" is somewhat redundant here.

Collaborator Author

Changed.

Comment on lines +49 to +50
if not population:
    return
Collaborator

Can None arrive here?

Collaborator Author

An empty list [] can arrive if no individual in the population was evaluated in time.
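
A minimal sketch of why the `if not population` guard matters: it covers both an empty list and `None` with a single check, so the container is left unchanged when nothing was evaluated. `SimpleArchive` is a hypothetical stand-in, not the FEDOT archive class, and fitness values are assumed here to be plain floats where lower is better.

```python
from typing import Optional, Sequence


class SimpleArchive:
    """Hypothetical container that remembers the best (lowest) fitness seen."""

    def __init__(self) -> None:
        self.best: Optional[float] = None

    def update(self, population: Optional[Sequence[float]]) -> None:
        # Covers both None and an empty list, e.g. when no individual
        # finished evaluation before the timeout.
        if not population:
            return
        candidate = min(population)
        if self.best is None or candidate < self.best:
            self.best = candidate
```

Without the guard, `min([])` would raise a `ValueError` and abort the run instead of simply skipping the update.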

fedot/core/optimisers/gp_comp/evaluation.py (resolved)
fedot/core/pipelines/tuning/unified.py (outdated, resolved)
@@ -183,6 +190,9 @@ def fit(self,
        self.current_pipeline, self.best_models, self.history = \
            self.api_composer.obtain_model(**self.params.api_params)

        if self.current_pipeline is None:
            raise ValueError('No any models were found')
Collaborator

Here it can simply be 'No models were found'.

Collaborator Author

Changed.

@nicl-nno
Collaborator Author

Updated to master.

@nicl-nno
Collaborator Author

As it stands, available_operations is handled both here and in this piece of code. Is that how it should be?

If they are not set explicitly, then it seems so.

@YamLyubov YamLyubov mentioned this pull request Jan 24, 2023
@nicl-nno nicl-nno merged commit 3deb3e3 into master Jan 24, 2023
@nicl-nno nicl-nno deleted the logger-imp branch January 27, 2023 21:25
Labels
in progress task in progress
4 participants