Trajectory Segmentation and Similarity Estimation Using Spark

This is a demo application for trajectory segmentation and similarity estimation. It is based on Spark and the Neo4j graph database.

Algorithms

Greedy-Split segmentation of trajectories into Minimum Bounding Rectangles (MBRs).
Quad-tree repartitioning strategy to distribute MBRs.
In-memory or external NoSQL R-tree indexing.
Support for multiple query metrics.
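
To make the segmentation step concrete, here is a minimal sketch of one common greedy formulation: start with one MBR per pair of consecutive points, then repeatedly merge the adjacent pair whose merge adds the least area until at most k MBRs remain. It is plain Python rather than the application's Spark code, and the names (greedy_split, mbr_of, merge) are hypothetical; the implementation in this repository may differ.

```python
from typing import List, Tuple

Point = Tuple[float, float]
MBR = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def mbr_of(points: List[Point]) -> MBR:
    """Smallest axis-aligned rectangle containing all given points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def area(box: MBR) -> float:
    return (box[2] - box[0]) * (box[3] - box[1])


def merge(a: MBR, b: MBR) -> MBR:
    """Smallest rectangle containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def greedy_split(points: List[Point], k: int) -> List[MBR]:
    """Cover a trajectory with at most k MBRs by greedy merging (illustrative only)."""
    if k < 1 or len(points) < 2:
        raise ValueError("need k >= 1 and at least two points")
    # Start with one MBR per pair of consecutive points.
    boxes = [mbr_of(points[i:i + 2]) for i in range(len(points) - 1)]
    while len(boxes) > k:
        # Pick the adjacent pair whose merge adds the least area.
        best = min(
            range(len(boxes) - 1),
            key=lambda i: area(merge(boxes[i], boxes[i + 1]))
            - area(boxes[i])
            - area(boxes[i + 1]),
        )
        boxes[best:best + 2] = [merge(boxes[best], boxes[best + 1])]
    return boxes
```

Calling greedy_split(points, 20) would correspond roughly to a segmentation number of 20, assuming that setting means the number of MBRs per trajectory. Each merge only ever joins neighbours, so the resulting MBRs stay in trajectory order.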

Highlights

Uses Spark for parallelism.
Uses the JTS Topology Suite library to calculate the similarity between two trajectories.
Supports Neo4j Spatial for persisting trajectory segmentations.
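
This README does not say which measure is used to compare two trajectories. As one illustration of a trajectory similarity measure, the sketch below computes the discrete Hausdorff distance between two point sequences. It is plain Python, not JTS and not this project's code, and the choice of metric and the function names are assumptions made for the example.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def directed_hausdorff(a: List[Point], b: List[Point]) -> float:
    """Largest distance from any point of a to its nearest point in b."""
    return max(
        min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in b)
        for p in a
    )


def hausdorff(a: List[Point], b: List[Point]) -> float:
    """Symmetric discrete Hausdorff distance; smaller means more similar."""
    if not a or not b:
        raise ValueError("both trajectories must contain at least one point")
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))
```

This brute-force version compares every pair of points, which is exactly the cost that MBR segmentation and R-tree indexing are meant to reduce on large trajectory sets.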

Building

The simplest way to build the application is with Maven. Clone the Git repository and run

mvn install

This will download all dependencies, compile the code, run the tests, and install the artifact in your local Maven repository. The application is created in the target directory and can be copied to your local server or uploaded to AWS for execution.

Deployment to AWS

We recommend using Amazon EMR as the Spark environment.

Software version: emr-5.5.1

P.S. Please make sure you have enough disk space: 300 GB or more for each node.

Submit job:

spark-submit --deploy-mode cluster \
  --class com.hqkang.SparkApp.core.Import s3://path_of_application/application.jar \
  -i s3://path_to_geolife_trajectories/ -o s3://path_to_output_dir/ -s 20 -p 5000 -z 10

-i Input path
-o Output path
-z MBR margin
-p Initial number of partitions used when loading data
-s Trajectory segmentation number
