merge codebase-hydra-restructure into main #90
Merged
cameronraysmith changed the title from "merge codebase hydra restructure into codebase" to "merge codebase-hydra-restructure into codebase" on Mar 5, 2023
This was referenced Mar 5, 2023
Closed
cameronraysmith changed the title from "merge codebase-hydra-restructure into codebase" to "merge codebase-hydra-restructure into main" on Mar 5, 2023
cameronraysmith force-pushed the codebase-hydra-restructure branch from 6897a69 to bbe8da4 on March 7, 2023 16:24
cameronraysmith requested review from mateibejan1, ssenan, LucasSilvaFerreira and cmvcordova on March 7, 2023 16:27
cameronraysmith added a commit that referenced this pull request on Mar 7, 2023
* wip: dataloader first draf
* Fixing train, val, and test path
* Added initial project structure
  Added a bunch of directories with (mostly) empty/dummy .py files for now, so that everyone can see what the project will be structured like. On top of the present directories, there will also be a datasets and a logs directory, the latter being dynamically created at traintime or validation time.
* rename file, remove one-hot encode
* Revert "wip: dataloader first draft"
* Updating component loading section
* sequence dataloader baseline model
* fixing a couple typos
* Delete src/metrics directory
  Deleting metrics directory as it was decided we'll have only one file with all metrics.
* Added refactored DDPM and UNet from notebook V2
  Refactored Lucas's DDPM, UNet and units and added them as PL modules.
* Update diffusion.py
  Added "instantiate_from_config" import.
* Update ddpm.py
  Added nucleotides as a parameter with a default of 4 to the sample method.
* wip: separate train/val/test subclasses
* Delete codebase/src/data directory
* Updated PL dataloader
* placeholder test file
* Update unet_lucas.py
  Added default function import.
* Added matching dummy test files
* complete: initial dataloader
* Added config template
  Designed config template mainly for PL-related parameters. Keeping multiprocessing arguments for multi-GPU for the first test, which we'll change to multi-node. Diffusion and UNet parameters can easily vary.
* Delete dummy_config.yaml
* delete test_diffusion
* fix: fixed function naming convention
* feat: Add initial CI proposal
* feat: Add a simple pyproject config file
* wip: train.py + configs
* config folder structure update
* fix datapath param of datasets
* add additional sequence encoding schemes + separate transforms
* add tests for sequence dataloader
* add additional asserts for data batches
* check sequence lengths in datasets
* add more tests for invalid data
* style: run black
* feat: Refactor schedules and remove time_difference
* feat: Add type hints to schedule utility functions
* feat: Refactor noise schedule fn
* feat: refactor q_sample fn
* feat: add type hints to q_sample
* feat: drop bit_scale
* feat: run black and switch to torch.log
* feat: drop t_index
* feat: refactor p_sample fn
* feat: refactor p_sample_loop fn
* feat: refactor sample fn
* feat: refactor training_step fn
* feat(ci): Add `codebase` branch to CI
  Based on discussion with @mateibejan1, running the tests on the `codebase` branch is also essential. It's the branch which is under heavy development and we should ensure all tests pass before we merge into `codebase` as well.
* reqs: add `pandas` to requirements.txt
* reqs: add `torch` to requirements.txt
* reqs: bump torch to `1.11.0` for compatibility
* fix(ci): run pytest as a module
* reqs: add torchvision to `0.12.0`
* reqs: add `pytorch-lightning`
* fix: failing CI tests for dataloader across platforms
* fix: failing CI tests for dataloader - wrap transforms
* fix: failing CI tests for dataloader - no multiprocessing for transforms
* Add Lucas' conditioned UNet
* Update EMA with Lucas' version
* Added mean_flat util from P2 paper
* Added P2 weighting skeleton. Need to figure out how to use P2 weighting on DNA data.
* misc: create a PR template
  Fixes #51
* misc: add doc strings and type hints to the PR template
  cc: @mateibejan1
* Add files via upload
* Add files via upload
* Add files via upload
  Updated DDPM with the Noah's refactored notebook version. Preemptively added p2_weighting, need to figure out if/how it works on bit sequences.
* Add files via upload
* Add files via upload
* style: run black
* feat: add type hints to `utils/misc.py`
* feat: add type hints to utils/metrics
* feat: add type hints to utils/schedules
* feat: add type hints to unet_bitdiffusion
* feat: add type hints to unet_lucas
* feat: add type hints to ddim
* feat: add type hints to seq dataloader
* feat: add type hints to unet_lucas_cond
* Delete ddim.py
  Deprecated.
* Delete unet_bitdiffusion.py
  Deprecated.
* Update unet_conditional.yaml
  Changed default number of timesteps from 1000 to 200.
* Update unet_conditional.yaml
  Moved unet_config params inside the diffusion models params, so it mirrors the hierarchical relationship between the diffusion class and the unet class.
* Update misc.py
  Minor dict property name changes.
* Update diffusion.py
* Update diffusion.py
* Update default.yaml
* Update unet_lucas.py
* initial test lucas unet
* add test vq
* ddm
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* merge codebase-hydra-restructure into main (#90)
* WIP new folder structure
* ema parameter fix
* Base dataloader instantiation with full hydraconfig succesful, missing full params
* Update sequence_dataloader.py
* Remove outputs folder, update .gitignore
* Update network.py
* Update sequence_datamodule.py
* Update sequence_datamodule.py

---------

Co-authored-by: cmvcordova <cmvcordova@github.com>
Co-authored-by: cmvcordova <cmvcordova@pm.me>
Co-authored-by: Matei Bejan <24592776+mateibejan1@users.noreply.github.com>

---------

Co-authored-by: ssenan <simonsenan@gmail.com>
Co-authored-by: Matei Bejan <24592776+mateibejan1@users.noreply.github.com>
Co-authored-by: Bendidi Ihab <ihabnobendidi@gmail.com>
Co-authored-by: Saurav Maheshkar <sauravvmaheshkar@gmail.com>
Co-authored-by: Jan Sobotka <jsobotka1188@gmail.com>
Co-authored-by: ceziegler <cheyenneeziegler@gmail.com>
Co-authored-by: jamesthesnake <james.ryan.hennessy@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: cmvcordova <cmvcordova@github.com>
Co-authored-by: cmvcordova <cmvcordova@pm.me>
To be followed by #92.