Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIXED JENKINS-39535] - Optimize get log method #2607

Merged
merged 4 commits into from
Nov 6, 2016

Conversation

Jimilian
Copy link
Contributor

@Jimilian Jimilian commented Nov 1, 2016

Current implementation of getLog(int maxLines) reads entire log file.
In this PR I'm addressing this issue. Instead of reading all lines from start, I'm readying only maxLines from end of log.

@Jimilian Jimilian force-pushed the optimize_getLog_method branch 2 times, most recently from 9c11d13 to e15852a Compare November 1, 2016 13:36
It should speed up and reduce memory consumption for some plugins (i.e.
Email-ext Plugin).
Also now this method could be used to get last lines of build output in efficient manner.
@oleg-nenashev oleg-nenashev added the needs-more-reviews Complex change, which would benefit from more eyes label Nov 2, 2016
Copy link
Member

@oleg-nenashev oleg-nenashev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the proposal! It should really help to improve the performance, especially for tests and progressive logs.

import java.io.Reader;
import java.io.StringWriter;

import java.io.*;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not do it in the production code. We have so many internal classes with "common" names and interfaces. It may cause implicit ambiguous behavior

}

int lines = 0;
final List<Byte> bytes = new ArrayList<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initialize by something reasonably small? E.g. 256. Just a performance hack

Copy link
Contributor Author

@Jimilian Jimilian Nov 2, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bytes.clear() doesn't resize array, it means that capacity of array would grow until reaching max line length. And after that capacity would be constant. So, I don't see profit in this hack.

// operations.
if (lineCount > maxLines)
logLines.remove(0);
final List<String> lastLines = new ArrayList<>(maxLines);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 if somebody is dare enough to pass MAX_INT in his code, it will consume too much memory. I would limit it by something reasonably small (e.g. 1024) if you really need an ArrayList

if (lineCount > maxLines)
logLines.set(0, "[...truncated " + (lineCount - (maxLines - 1)) + " lines...]");
if (lines == maxLines) {
lastLines.set(0, "[...truncated lines...]");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐜 no info about number of truncated lines left. I understand it's not possible in the current implementation, but maybe "[....truncated N bytes..]." could work as a replacement


return ConsoleNote.removeNotes(logLines);
private String convertBytesToString(List<Byte> bytes) {
Collections.reverse(bytes);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC there is a method, which creates new collection from ArrayList or LinkedList without updating the original one. Should be better from the performance PoV. Or you can just use Reverese iterator since you have an Array list

Copy link
Contributor Author

@Jimilian Jimilian Nov 2, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you talking about converting bytes to String? or reverse()?
reverse doesn't create new collection, doesn't allocate it, etc. Just pure swap. I can't imagine anything faster.
Anyway collections and allocations in this method should be quite small.

private String convertBytesToString(List<Byte> bytes) {
Collections.reverse(bytes);
Byte[] byteArray = bytes.toArray(new Byte[bytes.size()]);
return new String(ArrayUtils.toPrimitive(byteArray), Charset.forName("UTF-8"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 Original code was using getCharset()

@Jimilian
Copy link
Contributor Author

Jimilian commented Nov 2, 2016

@oleg-nenashev Looks better now?

@@ -189,7 +189,7 @@ public void getLogReturnsAnRightOrder() throws Exception {
for (int i = 1; i < 10; i++) {
assertEquals("dummy" + (10+i), logLines.get(i));
}
assertEquals("[...truncated lines...]", logLines.get(0));
assertEquals("[...truncated 68 B...]", logLines.get(0));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not perfect, but this is a standard method

@oleg-nenashev
Copy link
Member

LGTM 👍

@daniel-beck
Copy link
Member

Does this affect the gzipped log file use case?

@Jimilian
Copy link
Contributor Author

Jimilian commented Nov 3, 2016

@daniel-beck good catch! It would not work for gzipped files. So, I can check file type and switch to old implementation if file is gzipped. Because AFAIK it's impossible to read gzipped file from end.
If you see another option, I would appreciate advice.

@KostyaSha
Copy link
Member

Does this affect the gzipped log file use case?

Is there test case for it?

@Jimilian
Copy link
Contributor Author

Jimilian commented Nov 3, 2016

@KostyaSha No, I found only one test case for this method in core - that method returns empty list instead of null. After that I wrote two tests to 'document' current behaviour before starting to re-write this method.

@KostyaSha
Copy link
Member

So where gzipped logs at all exists?

@KostyaSha
Copy link
Member

@daniel-beck how this old implementation may use gzipped log files? Method directly opens file and reads lines, why it should affect gzipped logs?

@daniel-beck
Copy link
Member

@KostyaSha Not sure, hence me asking about it. If there's no impact (e.g. because this code is never called on a finalized log file), that's great. But a number of other methods of Run support gzipped log files.

@oleg-nenashev
Copy link
Member

I do not believe it impacts gzipped logs since the change operates within existing public method. GZIP build logging should have a completely different implementation/override. One of IMPLs is here: https://wiki.jenkins-ci.org/display/JENKINS/Compress+Build+Log+Plugin.

If the method is broken for this case, it is unlikely a regression

@KostyaSha
Copy link
Member

One of IMPLs is here: https://wiki.jenkins-ci.org/display/JENKINS/Compress+Build+Log+Plugin.

Maintainer(s): Daniel Beck

@daniel-beck
Copy link
Member

@KostyaSha @oleg-nenashev I have no idea what you are referring to. The core feature handling compressed build logs – the concern here – predates the sample-code sized plugin that automatically compresses them by half a decade: d300ff8

@Jimilian
Copy link
Contributor Author

Jimilian commented Nov 3, 2016

@daniel-beck But previous implementation of getLog didn't use GZIPInputStream. It means that it didn't support gzipped logs too.

@oleg-nenashev
Copy link
Member

@daniel-beck @KostyaSha WDYT?

@oleg-nenashev oleg-nenashev added the ready-for-merge The PR is ready to go, and it will be merged soon if there is no negative feedback label Nov 5, 2016
@daniel-beck
Copy link
Member

As @Jimilian points out, the previous version had no gzip support either. So that part is okay.

Beyond that, no opinion, as I haven't had a chance to look into this further.

@oleg-nenashev oleg-nenashev changed the title Optimize get log method [FIXED JENKINS-39535] - Optimize get log method Nov 6, 2016
@oleg-nenashev
Copy link
Member

Merging since we have 2 +1s. I doubt it's a subject for backporting, but I've created JENKINS-39535 just in case

@oleg-nenashev oleg-nenashev merged commit 2e8c3be into jenkinsci:master Nov 6, 2016
@Jimilian Jimilian deleted the optimize_getLog_method branch November 6, 2016 11:18
oleg-nenashev added a commit that referenced this pull request Nov 6, 2016
@daniel-beck
Copy link
Member

@jglick
Copy link
Member

jglick commented Nov 7, 2016

@daniel-beck hard to see how; Stage View uses Pipeline-specific log storage, not methods on Run.

@jglick
Copy link
Member

jglick commented Sep 28, 2018

improve the performance, especially for tests and progressive log

Only for tests (JenkinsRule). Run.getLog(int) is unused in production code AFAIK.

@Jimilian
Copy link
Contributor Author

@jglick it was used by some plugins. In our case it was plugin that sends mail notifications in case of failed build.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-more-reviews Complex change, which would benefit from more eyes ready-for-merge The PR is ready to go, and it will be merged soon if there is no negative feedback
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants