msinev/pmc

pmc

Poor man's cluster

There are BIG clusters for BIG data: Ceph, Mesos, Spark, Hadoop, Flink... you name it. Complex tools and advanced setup, high salaries and great responsibility, advanced, unbeatable performance. Blah... Blah... Blah...

But what if you are a simple man with a big file collection... and you have several computers...

Case one - very stupid

Assume that you have a simple shared folder with your favorite movies, and you need to re-encode all of them with the latest and greatest, slowest-ever h.2xx codec. You have a few computers at home... or just nearby. You want to run the codec on all of them to speed things up, but how do you split the files?
Here is an idea.

Run a file-processing script on each machine where the file storage is mounted:

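#!/bin/bash
# Ask the server for file names until it answers "." (no more files).
fileprocess=""
until [[ $fileprocess == "." ]]
  do
    date
    if [[ -n $fileprocess ]]; then
      echo "Processing $fileprocess"
      # Quote the path: the mount point or the file name may contain spaces.
      ff-decoder-recoder "/path/to where/files/mounted/$fileprocess"
    else
      echo "Wait to start"
      sleep 1
    fi
    # An empty reply means the server has nothing queued yet.
    fileprocess=$(curl -s "http://192.168.1.199:3000/get")
  done
date
echo "Complete!"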
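The loop above relies on a three-way reply convention: an empty body means "wait", "." means "no more files", and anything else is a file name. As a minimal sketch, the same worker could be written in Go. The names `classify` and `fetch` and the two command-line arguments are hypothetical, and `ff-decoder-recoder` is the same placeholder command used in the script.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
	"os/exec"
	"path/filepath"
	"time"
)

type action int

const (
	wait action = iota // empty reply: nothing queued yet
	work               // reply is a file name
	done               // reply is ".": the queue is exhausted
)

// classify maps a raw server reply to the action the worker should take.
func classify(reply string) action {
	switch reply {
	case "":
		return wait
	case ".":
		return done
	default:
		return work
	}
}

// fetch performs one GET request and returns the response body as a string.
func fetch(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	if len(os.Args) < 3 {
		fmt.Println("usage: worker <get-url> <mounted-dir>") // hypothetical CLI
		return
	}
	url, dir := os.Args[1], os.Args[2]
	for {
		reply, err := fetch(url)
		if err != nil {
			log.Println("fetch failed, retrying:", err)
			time.Sleep(time.Second)
			continue
		}
		switch classify(reply) {
		case done:
			log.Println("Complete!")
			return
		case wait:
			log.Println("Wait to start")
			time.Sleep(time.Second)
		case work:
			path := filepath.Join(dir, reply)
			log.Println("Processing", path)
			// Placeholder command taken from the shell script above.
			if err := exec.Command("ff-decoder-recoder", path).Run(); err != nil {
				log.Println("processing failed:", err)
			}
		}
	}
}
```

Passing the path as a separate argument to `exec.Command` avoids shell quoting problems with spaces in file names.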

With the simple server below, each request to the URL (a REST-style endpoint) returns one file name at a time:

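import (
	"log"
	"net/http"
	"os"
	"time"
)

// WorkSource is the package-level queue of file names read by getHandler.
// It is declared here so that the fragment compiles on its own.
var WorkSource = make(chan string)

// pollFiles runs in the background, pushing the names of the files in dirname
// into out and closing it once the whole directory has been listed.
func pollFiles(dirname string, out chan<- string) {
	defer close(out)

	files, err := os.ReadDir(dirname)
	if err != nil {
		log.Fatal(err)
	}

	for _, file := range files {
		if !file.IsDir() {
			log.Println("Queueing... " + file.Name())
			out <- file.Name()
		}
	}
	log.Println("Complete list of " + dirname + "... ")
}

// getHandler serves file names from WorkSource, one per request.
// "." means the list is exhausted; an empty body means nothing arrived within a second.
func getHandler(w http.ResponseWriter, r *http.Request) {
	select {
	case inData, ok := <-WorkSource:
		if ok {
			w.Write([]byte(inData))
		} else {
			w.Write([]byte("."))
		}
	case <-time.After(1 * time.Second):
		w.Write([]byte(""))
	}
}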


The full source is in pkg/servepath. Take it and use it wisely.
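For orientation, here is one way the two pieces could be wired into a complete program. This is a sketch under assumptions, not necessarily how pkg/servepath does it: the helper names `nextReply` and `listFiles` and the directory argument are hypothetical, while port 3000 and the `/get` path come from the curl URL in the worker script.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
	"time"
)

// nextReply applies the reply rules of the /get endpoint: a file name while
// work remains, "." once the queue is closed and drained, and an empty
// string if nothing arrives within wait.
func nextReply(queue <-chan string, wait time.Duration) string {
	select {
	case name, ok := <-queue:
		if ok {
			return name
		}
		return "."
	case <-time.After(wait):
		return ""
	}
}

// listFiles sends the names of all non-directory entries of dir into queue,
// then closes it.
func listFiles(dir string, queue chan<- string) {
	defer close(queue)
	entries, err := os.ReadDir(dir)
	if err != nil {
		log.Fatal(err)
	}
	for _, e := range entries {
		if !e.IsDir() {
			queue <- e.Name()
		}
	}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: servepath <directory>") // hypothetical CLI
		return
	}
	queue := make(chan string)
	go listFiles(os.Args[1], queue)

	http.HandleFunc("/get", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, nextReply(queue, time.Second))
	})
	// Port 3000 matches the URL used by the worker script.
	log.Fatal(http.ListenAndServe(":3000", nil))
}
```

Because the channel is unbuffered, each file name is handed to exactly one request, which is what keeps two workers from getting the same file.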

Case two - better than nothing

The simple case one works fine as long as everything works fine. Crashes and outages are not handled, and a shared folder is required. Let's try to fix that. Extra features are needed to report whether a data processing task has been started, finished, aborted, or has stalled, and to show where a task stalled. An extra bonus is the ability to pull and push data over HTTP.
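One possible shape for the task bookkeeping is sketched below. It is only an illustration of the requirements listed above, not the project's design: the `Tracker` type, its methods, and the rule "a started task with no report for longer than `maxGap` counts as stalled" are all assumptions.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type State string

const (
	Started  State = "started"
	Finished State = "finished"
	Aborted  State = "aborted"
	Stalled  State = "stalled" // derived, never reported by a worker
)

type task struct {
	state    State
	worker   string
	lastSeen time.Time
}

// Tracker records the last reported state of each file (hypothetical type).
type Tracker struct {
	mu     sync.Mutex
	tasks  map[string]task
	maxGap time.Duration
}

func NewTracker(maxGap time.Duration) *Tracker {
	return &Tracker{tasks: make(map[string]task), maxGap: maxGap}
}

// Report stores the state a worker reported for file at time now.
// Repeating a Started report acts as a heartbeat.
func (t *Tracker) Report(file, worker string, s State, now time.Time) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.tasks[file] = task{state: s, worker: worker, lastSeen: now}
}

// Status returns the state of file and the worker that last reported on it.
// A started task with no report within maxGap is returned as Stalled.
func (t *Tracker) Status(file string, now time.Time) (State, string, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	tk, ok := t.tasks[file]
	if !ok {
		return "", "", false
	}
	if tk.state == Started && now.Sub(tk.lastSeen) > t.maxGap {
		return Stalled, tk.worker, true
	}
	return tk.state, tk.worker, true
}

// demoStart is a fixed timestamp so the demo below is deterministic.
var demoStart = time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)

func main() {
	tr := NewTracker(time.Minute)
	tr.Report("a.mkv", "host-1", Started, demoStart)
	tr.Report("b.mkv", "host-2", Started, demoStart)
	tr.Report("b.mkv", "host-2", Finished, demoStart.Add(30*time.Second))

	later := demoStart.Add(5 * time.Minute)
	for _, f := range []string{"a.mkv", "b.mkv"} {
		state, worker, _ := tr.Status(f, later)
		fmt.Printf("%s: %s on %s\n", f, state, worker)
	}
	// Prints:
	// a.mkv: stalled on host-1
	// b.mkv: finished on host-2
}
```

Keeping the worker name next to the state is what lets the server show where a task stalled, so the file can be handed to another machine.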
