ParserBot: erroneous raw line recovery in error handling #1850
Labels
bug
component: core
help wanted
The logic in intelmq/intelmq/lib/bot.py (lines 1005 to 1031 in 7ba8b62) does not work with all `recover_line_*` methods. Some methods use the parameter `line`, others use `self.current_line`. The overall logic is fine, but there is a major bug:

- `process` collects all failures (`self.__failed` is appended with `line`) in the first loop (`for line in self.parse(report)`).
- In the second loop (`for exc, line in self.__failed`), `recover_line` is called with `line`.
- If `recover_line` accesses `self.current_line`, the data is wrong, as `self.current_line` is then the last line of the report, not the actual one.

Unfortunately, simply fixing some `recover_line_*` functions is not enough; the process in `self.process` needs to be thought through and eventually adapted as well.

- `self.current_line` should be deleted after parsing ends to prevent this error in the future.
- The behaviour of `self.recover_line` should be harmonized, making it applicable for use in `self.parse_line` and in `self.process`.
- `self.process` should be investigated.

In the future we also need better tests, but that's a bigger task and I'm afraid we can't manage that in the short term. And unfortunately the issue is important, as it leads to bogus (wrong/duplicated) data in the dumps and therefore loss of data.
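
To make the failure mode concrete, here is a minimal sketch of the two-loop pattern described above. It is not the actual intelmq code: `ToyParserBot`, `recover_line_from_state`, `recover_line_from_arg`, the `clear_state` flag and the digits-only parsing rule are all hypothetical names and simplifications chosen for illustration. It shows that a recover method reading `self.current_line` returns the last line of the report for every failure, and how removing the attribute after parsing would turn the silent error into a loud one.

```python
class ToyParserBot:
    """Simplified stand-in for a parser bot; NOT the real intelmq ParserBot."""

    def parse(self, report):
        # Yield the report line by line and remember the line being handled.
        for line in report.splitlines():
            self.current_line = line
            yield line

    def parse_line(self, line):
        # Toy rule (assumption): only lines consisting of digits are valid.
        if not line.isdigit():
            raise ValueError(f"cannot parse {line!r}")
        return int(line)

    def recover_line_from_state(self, line):
        # Problematic style: ignores the argument and reads shared state.
        return self.current_line

    def recover_line_from_arg(self, line):
        # Robust style: relies only on the line handed in by the caller.
        return line

    def process(self, report, recover_line, clear_state=False):
        events, failed = [], []
        # First loop: parse every line and collect the failures.
        for line in self.parse(report):
            try:
                events.append(self.parse_line(line))
            except ValueError as exc:
                failed.append((exc, line))
        if clear_state:
            # Safeguard proposed in the issue: drop the stale attribute so
            # that later access fails loudly instead of returning wrong data.
            vars(self).pop("current_line", None)
        # Second loop: recover the raw lines for the dump.
        dumped = [recover_line(line) for exc, line in failed]
        return events, dumped


report = "1\nbad\n2\nworse\n3"
bot = ToyParserBot()

_, buggy = bot.process(report, bot.recover_line_from_state)
_, fixed = bot.process(report, bot.recover_line_from_arg)
print(buggy)  # ['3', '3']: the last line of the report, duplicated
print(fixed)  # ['bad', 'worse']: the lines that actually failed

try:
    bot.process(report, bot.recover_line_from_state, clear_state=True)
except AttributeError as exc:
    print("stale access detected:", exc)
```

In this toy version, passing the failed line explicitly is enough to get correct dumps, and clearing the attribute only serves as a guard against recover methods that still read shared state. Whether the same split works for the real `recover_line_*` methods would need to be checked against each of them.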