[Draft] Add bayesian training #2430
base: master
@seemanne: There is no 'kind' label on this PR. You need a 'kind' label to generate the release automatically.
Details: I am a bot created to help the crowdsecurity developers manage community feedback and contributions. You can check out my manifest file to understand my behavior and what I can do. If you want to use this for your project, you can check out the BirthdayResearch/oss-governance-bot repository.
@seemanne: There are no area labels on this PR. You can add as many areas as you see fit.
/kind feature
Codecov Report
Additional details and impacted files

Coverage diff, master vs. #2430:

Metric      master    #2430     +/-
Coverage    56.94%    37.79%    -19.16%
Files       195       191       -4
Lines       26675     26302     -373
Hits        15191     9940      -5251
Misses      9901      15018     +5117
Partials    1583      1344      -239
pkg/bayesiantrain/trainer.go (outdated)

    for _, v := range s.ParsedIpEvents {
        go evaluateProgramOnBucket(&v, compiled, inputChan)
    }

    go controllerRoutine(inputChan, outputChan, s.total)
I’m not aware of the larger context this fits into. It looks like you’re adding a new function in scenarios/parsers. That said, I think using goroutines here won't improve performance and only makes the code more complex. The function launched as a goroutine, evaluateProgramOnBucket, does no I/O work, so it doesn't make sense to run it in goroutines. Something like
var result evalHypothesisResult
for _, v := range s.ParsedIpEvents {
r := evaluateProgramOnBucket(&v)
// update result
}
would have equivalent, if not better, performance while eliminating the overhead of the controller routine etc.
OK, maybe I should explain. This is the main training loop for the Bayesian buckets. The construction works as follows:
- The logs are loaded into the LogEventStorage, into a map keyed by IP, and all the events for a given IP are added to its fakeBucket.
- The user can then test different hypothesis exprs with TestHypothesis to see whether any of them would make good conditions for the bucket.
- To speed up the hypothesis testing, the idea was to run it in parallel threads, one per IP (as it's basically counting some stuff per IP).
- The goal of the channel/goroutine design in TestHypothesis is to enable this parallelism by spawning an independent routine for each IP (fake bucket) and then collecting all the results using the controller.
Does this make more sense now?
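For readers following along, here is a minimal sketch of the fan-out/fan-in shape described in the list above. It is not the code from this PR: fakeBucketSketch, bucketTally, evaluateOnBucket and testHypothesisSketch are hypothetical names, and a plain Go predicate stands in for the compiled expr program. Each bucket pointer is passed to its goroutine explicitly, because before Go 1.22 taking &v of a range variable yields the same address on every iteration.

```go
package main

import "sync"

// fakeBucketSketch is a hypothetical stand-in for the per-IP event container.
type fakeBucketSketch struct {
	IP     string
	Events []string
}

// bucketTally is a hypothetical per-IP result: how many events matched.
type bucketTally struct {
	IP      string
	Matches int
}

// evaluateOnBucket counts the events that satisfy pred and sends one tally.
func evaluateOnBucket(b *fakeBucketSketch, pred func(string) bool, out chan<- bucketTally) {
	n := 0
	for _, ev := range b.Events {
		if pred(ev) {
			n++
		}
	}
	out <- bucketTally{IP: b.IP, Matches: n}
}

// testHypothesisSketch starts one goroutine per bucket (fan-out) and
// collects every tally on a single channel (fan-in).
func testHypothesisSketch(buckets map[string]*fakeBucketSketch, pred func(string) bool) map[string]int {
	results := make(chan bucketTally)
	var wg sync.WaitGroup

	for _, b := range buckets {
		wg.Add(1)
		// Pass b as an argument so each goroutine owns its own pointer.
		go func(b *fakeBucketSketch) {
			defer wg.Done()
			evaluateOnBucket(b, pred, results)
		}(b)
	}

	// Close the channel once every worker has sent its tally,
	// so the collecting loop below terminates.
	go func() {
		wg.Wait()
		close(results)
	}()

	totals := make(map[string]int, len(buckets))
	for r := range results {
		totals[r.IP] = r.Matches
	}
	return totals
}
```

In this sketch the collecting loop plays the role of the controller; closing the channel after wg.Wait() replaces counting expected results by hand.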
Understood, the goroutines would definitely increase throughput. If the training is a CPU-intensive task, then goroutines make sense.
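As a possible variation for CPU-bound work, the number of goroutines can be capped at the number of logical CPUs instead of starting one per IP. The sketch below is only an illustration under that assumption and is not part of this PR; countMatches and evaluateBounded are hypothetical names, and a plain predicate again stands in for the compiled expr program.

```go
package main

import (
	"runtime"
	"sync"
)

// countMatches is a hypothetical CPU-bound job: count matching events.
func countMatches(events []string, pred func(string) bool) int {
	n := 0
	for _, ev := range events {
		if pred(ev) {
			n++
		}
	}
	return n
}

// evaluateBounded processes all buckets with a fixed number of workers.
// If workers <= 0, it falls back to runtime.NumCPU().
func evaluateBounded(buckets [][]string, pred func(string) bool, workers int) int {
	if workers <= 0 {
		workers = runtime.NumCPU()
	}

	jobs := make(chan []string)
	partial := make(chan int)
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sum := 0
			// Each worker drains jobs until the channel is closed.
			for events := range jobs {
				sum += countMatches(events, pred)
			}
			partial <- sum
		}()
	}

	// Feed the jobs, then signal that no more work is coming.
	go func() {
		for _, b := range buckets {
			jobs <- b
		}
		close(jobs)
	}()

	// Close partial once every worker has reported its subtotal.
	go func() {
		wg.Wait()
		close(partial)
	}()

	total := 0
	for s := range partial {
		total += s
	}
	return total
}
```

Whether this beats one goroutine per IP depends on the number of IPs and the cost of each evaluation, so it would need benchmarking on real log data.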
Nice, thank you
2fa5c07 to ba7a5a3