paulmw/impala-demo
Build:

    mvn package

Environmental assumptions:

1. Java
2. MapReduce
3. Hive
4. Impala

Run:

The demo can be run with:

    bin/script.sh

The data generator can be run with:

    hadoop jar impala-demo-0.1-SNAPSHOT.jar com.cloudera.tools.rmat.RMat <options> output-directory-in-hdfs

Options:

The number of nodes (accounts) in the graph:

    -Drmat.nodes=100000

The number of edges (transactions) in the graph:

    -Drmat.edges=400000

The number of mappers to parallelise over:

    -Drmat.mappers=4

Whether or not to generate random transactions:

    -Drmat.random=true

Non-random means a fixed seed of 0 is used.

What probability distribution to use:

    -Drmat.distribution=0.7,0.15,0.10,0.05

This gives a vaguely Zipfian distribution in the number of transactions.
An even distribution can be generated by using:

    -Drmat.distribution=0.5,0.5,0.5,0.5
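To illustrate what the four distribution values control, here is a minimal sketch of textbook R-MAT edge sampling. It is not the repository's code: the class name RMatSketch, the method sampleEdge and the levels parameter are hypothetical, and it assumes the four values are quadrant probabilities that sum to 1. How the actual generator treats values that do not sum to 1 (such as 0.5,0.5,0.5,0.5) is not shown in this README.

```java
import java.util.Random;

public class RMatSketch {

    /**
     * Samples one edge by recursive quadrant selection.
     * At each level the adjacency matrix is split into four quadrants,
     * one is chosen with probabilities p[0..3], and one bit is appended
     * to the source and destination ids. Ids fall in [0, 2^levels).
     */
    static long[] sampleEdge(int levels, double[] p, Random rng) {
        long src = 0;
        long dst = 0;
        for (int i = 0; i < levels; i++) {
            double r = rng.nextDouble();
            int quadrant;
            if (r < p[0]) {
                quadrant = 0;                 // top-left
            } else if (r < p[0] + p[1]) {
                quadrant = 1;                 // top-right
            } else if (r < p[0] + p[1] + p[2]) {
                quadrant = 2;                 // bottom-left
            } else {
                quadrant = 3;                 // bottom-right
            }
            src = (src << 1) | (quadrant >> 1);  // high bit selects the row half
            dst = (dst << 1) | (quadrant & 1);   // low bit selects the column half
        }
        return new long[] {src, dst};
    }

    public static void main(String[] args) {
        int levels = 17;                          // assumption: 2^17 ids covers 100000 nodes
        double[] p = {0.7, 0.15, 0.10, 0.05};     // values from the README example
        Random rng = new Random(0);               // fixed seed 0, as in the non-random mode
        for (int i = 0; i < 5; i++) {
            long[] e = sampleEdge(levels, p, rng);
            System.out.println(e[0] + "\t" + e[1]);
        }
    }
}
```

Because the first quadrant is favoured at every level, low-numbered ids receive far more edges than high-numbered ones, which is where the skewed transaction counts come from. With 17 levels the sketch can emit ids above 100000; a real generator would need to reject or remap those.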
About
Impala Demo