Remove auglog #25

Merged
merged 14 commits into from
Jun 2, 2017

Tuxprogrammer
Contributor

This should fix Issue #15.

I had to change how the DataParser filters out comment lines, because the Bro logs end with a one-line footer that we weren't aware of.

Console Output:

spencer$ $SPARK_HOME/bin/spark-submit csb.jar seed -b dataset_01/conn.log
[TIME] Log to graph started...
Vertices #: 37462, Edges #: 1954371
[TIME] Log to graph completed in 5.373518713 s
[TIME] Save seed graph started...
[TIME] Save seed graph completed in 2.117491267 s
[TIME] Gen seed distributions started...
[TIME] Gen seed distributions completed in 26.574239216 s
[TIME] Save seed distributions started...
[TIME] Save seed distributions completed in 11.530209146 s

@Tuxprogrammer
Contributor Author

Looking at the project structure, we might want to remove the alert file from the dataset zip.

- val logAug = new log_Augment()
- logAug.run(config.alertLog, config.connLog, config.augLog)
+ //val logAug = new log_Augment()
+ //logAug.run(config.alertLog, config.connLog, config.augLog)
Contributor

You can safely delete these lines.

- // Drop the 8-lines header and filter lines that contains only IPv4 addresses
- val augLog = augLogFile.mapPartitionsWithIndex { (idx, lines) => if (idx == 0) lines.drop(8) else lines }
+ // Drop the 8-lines header and 1-line footer and filter lines that contains only IPv4 addresses
+ val theLog = logFile.filter(isNotComment)
+ .filter(isInet4).filter(isAllowedProto)
Contributor

Good catch! Since the check is simple, I would replace the isNotComment function with an anonymous function and chain all the filters on one line:

val theLog = logFile.filter(line => line(0) != '#').filter(isInet4).filter(isAllowedProto)
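One caveat with `line(0)`: it throws on an empty line. A minimal sketch of a more defensive predicate follows, assuming blank lines can occur in the input. The helper name `isDataLine` and the sample lines are made up for illustration and are not part of the project; on an RDD the same predicate would simply be passed to `filter`.

```scala
object BroLogFilter {
  // Hypothetical helper: true only for lines that carry data,
  // i.e. lines that are neither blank nor '#'-prefixed header/footer lines.
  def isDataLine(line: String): Boolean =
    line.nonEmpty && !line.startsWith("#")

  def main(args: Array[String]): Unit = {
    // Illustrative sample only, not taken from a real conn.log.
    val lines = Seq(
      "#fields\tts\tuid",
      "1496440000.0\tCabc123",
      "",
      "#close\t2017-06-02-21-58-00"
    )
    // Header, blank, and footer lines are dropped; only the data row is printed.
    lines.filter(isDataLine).foreach(println)
  }
}
```

Because the check no longer depends on line position, it covers both the header and the footer without counting lines.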

- respBytes = pieces(10).toLong,
+ duration = if (pieces(8) != "-") pieces(8).toDouble else 0.0,
+ origBytes = if (pieces(9) != "-") pieces(9).toLong else 0L,
+ respBytes = if (pieces(10) != "-") pieces(10).toLong else 0L,
Contributor

I see that we totally forgot the error handling for all conversions. I think it would be better to handle any int/long/double conversion inside a separate try/catch block, e.g.:

duration = try { pieces(8).toDouble } catch { case _: NumberFormatException => 0.0 }

It is OK to use 0 as a fallback value for numeric fields.
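To avoid repeating the try/catch on every field, the fallback could live in a pair of small helpers. This is only a sketch: `SafeNumbers`, `toDoubleOr`, and `toLongOr` are hypothetical names, and `scala.util.Try` is swapped in for the explicit try/catch, which means it also swallows non-fatal exceptions other than `NumberFormatException`.

```scala
import scala.util.Try

object SafeNumbers {
  // Parse a Double, falling back to `default` when the field is "-",
  // empty, or otherwise not a valid number.
  def toDoubleOr(s: String, default: Double = 0.0): Double =
    Try(s.toDouble).getOrElse(default)

  // Same idea for Long-valued fields such as byte and packet counts.
  def toLongOr(s: String, default: Long = 0L): Long =
    Try(s.toLong).getOrElse(default)
}
```

With these in place a field assignment would read, for example, `duration = SafeNumbers.toDoubleOr(pieces(8))`.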

@@ -88,8 +87,7 @@ object EdgeData {
  origPkts.toLong,
  origIpBytes.toLong,
  respPkts.toLong,
- respIpBytes.toLong,
- desc
+ respIpBytes.toLong
  )
  // TODO: check why we need the following, i.e. why might "desc" be empty?
  case Array(ts, origPort, respPort, proto, duration, origBytes, respBytes, connState, origPkts, origIpBytes,
Contributor

Since we no longer have the desc field, this second case block can be safely deleted. Also, I would again handle every int/long/double conversion inside its own try/catch block, as in DataParser::logToGraph().
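Roughly what I have in mind is a single `case Array(...)` plus a catch-all, with safe conversions for every numeric field. The sketch below uses a reduced, hypothetical `Edge` record with four fields instead of the real EdgeData layout, and `Try` in place of explicit try/catch blocks.

```scala
import scala.util.Try

object EdgeParseSketch {
  // Hypothetical, reduced stand-in for the project's edge record.
  final case class Edge(proto: String, duration: Double, origBytes: Long, respBytes: Long)

  // Fall back to 0 when a numeric field cannot be parsed (e.g. "-").
  private def toLongOrZero(s: String): Long = Try(s.toLong).getOrElse(0L)
  private def toDoubleOrZero(s: String): Double = Try(s.toDouble).getOrElse(0.0)

  // One case for the expected field count; anything else is rejected.
  def parse(fields: Array[String]): Option[Edge] = fields match {
    case Array(proto, duration, origBytes, respBytes) =>
      Some(Edge(proto, toDoubleOrZero(duration), toLongOrZero(origBytes), toLongOrZero(respBytes)))
    case _ => None
  }
}
```

Returning `Option` makes a malformed record explicit to the caller instead of relying on a second, slightly different case block.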

@@ -103,9 +92,9 @@ class OptionParser(override val programName: String, programVersion: String, con

opt[String]('l', "log")
Contributor

@scordio scordio Jun 2, 2017

We actually don't need the log option anymore in the synth scope since DataDistributions is now made from a graph.

.text(s"Path of the Snort connection log [default: ${config.alertLog}].")
.validate(path => if ( new File(path).isFile ) success else failure(s"$path is not a regular file") )
.action((x, c) => c.copy(alertLog = x)),

opt[String]('b', "bro-log")
Contributor

Since we discarded the snort and the augLog options, this option can be changed to:

 opt[String]('l', "log")

as it would be easier to use.
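The file check used in `.validate` can also be pulled out and exercised on its own. A minimal sketch, assuming the option parser accepts an `Either[String, Unit]` result in the way the `success`/`failure` calls above suggest; `LogPathCheck` and `validateLogPath` are hypothetical names.

```scala
import java.io.File

object LogPathCheck {
  // Right(()) when the path points to a regular file,
  // otherwise Left with the same message used in the option definition.
  def validateLogPath(path: String): Either[String, Unit] =
    if (new File(path).isFile) Right(())
    else Left(s"$path is not a regular file")
}
```

The renamed option could then reuse it for whichever log path remains after the snort and augLog options are gone.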

@scordio scordio merged commit 5445062 into master Jun 2, 2017
@scordio scordio deleted the remove-auglog branch June 2, 2017 21:58