The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
-
Updated
Sep 1, 2024
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
ClickHouse® is a real-time analytics DBMS
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
An open source cybersecurity protocol for syncing decentralized graph data.
QuestDB is an open source time-series database for fast ingest and SQL queries
The Data Engineering Cookbook
PredictionIO, a machine learning server for developers and ML engineers.
CMAK is a tool for managing Apache Kafka clusters
A distributed, fast open-source graph database featuring horizontal scalability and high availability
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Open-Source Web UI for Apache Kafka Management
The most widely used Python to C compiler
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.
Add a description, image, and links to the big-data topic page so that developers can more easily learn about it.
To associate your repository with the big-data topic, visit your repo's landing page and select "manage topics."