Predictive maintenance using Spark

It is educational project with synthetically generated data.

The main idea is to use current measurements along with several previous to generate feature vector.

Issues

1 It can detect failure state several times per device. Some logic should be added if needed.

2 Since data is synthetic no algrorithm selection was done. Basic logistic regression was used and count of iterations was determined by also generated test dataset

3 There is a shuffling phase in prediction app. It can be removed by using, for example, http://spark.apache.org/docs/latest/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers

4 Streaming integration tests are ugly and fragile (since shared variable is used). Some machinery with DSL should be added

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
data		data
project		project
src		src
.gitignore		.gitignore
README.md		README.md
build.sbt		build.sbt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Predictive maintenance using Spark

Issues

About

Releases

Packages

Languages

kirillsablin/predictive-maintenance-spark

Folders and files

Latest commit

History

Repository files navigation

Predictive maintenance using Spark

Issues

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages