Data related notes

License: MIT · PRs Welcome

A continually expanding collection of data-related notes. Please contribute and get in touch! See MDmisc notes for other programming and genomics-related notes.

Table of contents

Datasets

Datasets in R

  • library(help = "datasets") or data() - shows built-in R datasets

  • A list of over 1,000 datasets available in R packages, curated by @VincentAB.

  • curran/data - A collection of public data sets, primarily in text format

  • Tidy Tuesday - A weekly social data project in R with curated datasets

  • dsbox - Data Science in the Box datasets

  • dslabs - Data Science Labs - Datasets and functions that can be used for data analysis practice, homework and projects in data science courses and workshops. 26 datasets are available for case studies in data visualization, statistical inference, modeling, linear regression, data wrangling and machine learning. Made by Rafael Irizarry and Amy Gill.
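As a minimal sketch of how the commands above can be used, the snippet below lists the built-in datasets, loads one of them, and then loads a dataset from an add-on package. The choice of `mtcars` and of the `murders` dataset from dslabs is only illustrative, and the last step assumes that dslabs is installed.

```r
# List the datasets that ship with base R's 'datasets' package.
# data() returns an object whose $results element is a character matrix
# with the columns Package, LibPath, Item and Title.
builtin <- data(package = "datasets")$results
head(builtin[, c("Item", "Title")])

# Built-in datasets can be loaded by name; mtcars is used here as an example.
data("mtcars")
str(mtcars)

# Datasets from an add-on package are loaded the same way.
# Assumption: dslabs is installed; the block is skipped otherwise.
if (requireNamespace("dslabs", quietly = TRUE)) {
  data("murders", package = "dslabs")
  head(murders)
}
```

Calling `data(package = "dslabs")` in the same way should list every dataset that package provides.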

Genomics

Machine learning

Imaging

COVID-19

  • SARS-CoV-2 database of 21 uniformly processed COVID-19 scRNA-seq datasets (over 3.2 million cells). Table 1 lists COVID-19 data obtained with various technologies. GitHub with processing scripts.
    Paper: Tian, Yuan, Lindsay N. Carpp, Helen E. R. Miller, Michael Zager, Evan W. Newell, and Raphael Gottardo. “Single-Cell Immunology of SARS-CoV-2 Infection.” Nature Biotechnology, December 20, 2021. https://doi.org/10.1038/s41587-021-01131-y.

Text

  • Gitenberg is a collaborative, open source community curating and publishing highly usable and attractive ebooks in the public domain. Our books are free to use by anyone for any purpose. They contain detailed metadata and are accessible in a wide variety of formats. https://gitenberg.org/

Misc