This repository has been archived by the owner on Mar 29, 2022. It is now read-only.

SystemDS 0.2.0 (March 24, 2020)

@corepointer corepointer released this 24 Mar 00:30
· 7 commits to master since this release
07dd56a

Release Notes

SystemDS 0.2.0 is the second release under the new name after forking from SystemML.
This release includes many small fixes made to accommodate some of the major
features that extend the functionality of this system.

Changes in this release include

  • Initial work on federated operations on matrices and tensors, where special instructions push down as much computation as possible to remote workers.
  • Tensor operations have been extended.
  • Python bindings bridge the gap to make SystemDS available to a greater audience that already has experience or existing code in Python. Initial functionality is there for matrix operations, federated tensors and lineage traces.
  • Lineage support has gained caching and reuse functionality. Lineage can also be traced on Spark now.
  • Several methods for data cleaning have been implemented. These include a first version of multiple imputation, implemented as multivariate imputation by chained equations (MICE), and support for outlier detection using standard deviation and inter-quartile range. Additional methods and builtin functions are detectSchema, typeof, hidden Markov models for missing value imputation, and functional dependency discovery.
  • A slice finder helps in model debugging.
  • Cloud deployment scripts for AWS and scripts to set up and start federated operations.
  • More algorithms/methods/builtins (shared marriage, data augmentation, feature hashing, l2svm, msvm, multiLogReg, set intersection, cross-validation, naïve Bayes, is.na/nan/inf, eval fcall, list-entry-removal, GNMF, PNMF).
  • Performance improvements (parallel sort, GPU cum agg, append cbind)
  • New rewrites (agg remove empty, lineage, nary plus element-wise operations, eliminate rmEmpty, tsmm/mm over lists of folds)
  • New data reader/writer for JSON frames and support for SQL as a data source
  • Miscellaneous improvements: compressed matrices ported from SystemML, more documentation, better testing, run/release scripts, bug fixes
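
To illustrate the two outlier-detection rules mentioned in the data cleaning item above, here is a minimal sketch in plain Python. It does not use or reproduce the SystemDS builtins; the function names `outliers_by_sd` and `outliers_by_iqr`, the threshold parameter `k`, and the sample data are assumptions made for this example only.

```python
import statistics


def outliers_by_sd(values, k=3.0):
    """Flag values more than k standard deviations from the mean.

    Hypothetical helper for illustration; not the SystemDS builtin.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return [v for v in values if abs(v - mean) > k * sd]


def outliers_by_iqr(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper for illustration; not the SystemDS builtin.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Made-up sample with one obviously extreme value.
data = [10.0, 11.0, 12.0, 11.5, 10.5, 11.0, 50.0]

sd_outliers_k2 = outliers_by_sd(data, k=2.0)
sd_outliers_k3 = outliers_by_sd(data, k=3.0)
iqr_outliers = outliers_by_iqr(data)
```

On this small sample the inter-quartile rule flags 50.0, while the standard deviation rule only flags it with the looser threshold `k=2.0`: the extreme value inflates the standard deviation itself, so nothing exceeds three standard deviations. This is one reason the two rules are often offered side by side.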

Acknowledgements

Thanks to Enrique Barba Roque, Sebastian Baunsgaard, Matthias Boehm, Mark Dokter, Lukas Erlbacher, Kevin Innerebner, Florijan Klezin, Valentin Leutgeb, Arnab Phani, Benjamin Rath, Svetlana Sagadeeva, Afan Secic, Shafaq Siddiqi, Thomas Wedenig, Sebastian Wrede for their support in the creation of the release of SystemDS 0.2.0.