[FIXED JENKINS-39535] - Optimize get log method #2607

Merged

oleg-nenashev merged 4 commits into jenkinsci:master from Jimilian:optimize_getLog_method

Nov 6, 2016

Contributor

Jimilian commented Nov 1, 2016

The current implementation of getLog(int maxLines) reads the entire log file.
This PR addresses that: instead of reading every line from the start, it reads only the last maxLines lines from the end of the log.
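For illustration only (this is not the code merged in this PR), reading from the end could be sketched as below. The class name `TailReader` and the method `lastLines` are hypothetical. The sketch assumes a charset in which the byte 0x0A always means a newline (UTF-8 or a single-byte charset, not UTF-16), and it seeks byte by byte for clarity where real code would read in blocks.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Jenkins core.
class TailReader {

    /** Returns up to maxLines trailing lines of the file, oldest first. */
    static List<String> lastLines(File file, int maxLines, Charset charset) throws IOException {
        List<String> result = new ArrayList<>();
        if (maxLines <= 0) {
            return result;
        }
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            long pos = raf.length();
            if (pos == 0) {
                return result;
            }
            // A single trailing newline terminates the last line; it does not start an empty one.
            raf.seek(pos - 1);
            if (raf.read() == '\n') {
                pos--;
            }
            // Bytes of the line currently being scanned, collected in reverse order.
            ByteArrayOutputStream reversed = new ByteArrayOutputStream();
            while (pos > 0 && result.size() < maxLines) {
                raf.seek(--pos);
                int b = raf.read();
                if (b == '\n') {
                    result.add(decodeReversed(reversed, charset));
                    reversed.reset();
                } else {
                    reversed.write(b);
                }
            }
            // Reached the start of the file: what was collected is the first line.
            if (pos == 0 && result.size() < maxLines) {
                result.add(decodeReversed(reversed, charset));
            }
        }
        Collections.reverse(result); // lines were collected newest first
        return result;
    }

    private static String decodeReversed(ByteArrayOutputStream reversed, Charset charset) {
        byte[] bytes = reversed.toByteArray();
        // bytes[0] is the last byte of the line; drop it if it is the CR of a CRLF pair.
        int start = (bytes.length > 0 && bytes[0] == '\r') ? 1 : 0;
        byte[] forward = new byte[bytes.length - start];
        for (int i = 0; i < forward.length; i++) {
            forward[i] = bytes[bytes.length - 1 - i];
        }
        return new String(forward, charset);
    }
}
```

The work done is proportional to the size of the returned tail rather than to the size of the whole log, which is the point of the change described above.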

Jimilian added 2 commits

November 1, 2016 14:30


          Merge pull request #1 from jenkinsci/master

357a042

Update from upstream


          Add some tests to current behaviour of getLog method

52286cc

Jimilian force-pushed the optimize_getLog_method branch 2 times, most recently from 9c11d13 to e15852a Compare

November 1, 2016 13:36


          getLog(maxLines) reads only last maxLines lines now

fa6ef08

It should speed up some plugins and reduce their memory consumption (e.g.
Email-ext Plugin).
Also, this method can now be used to get the last lines of build output efficiently.

Jimilian force-pushed the optimize_getLog_method branch from e15852a to fa6ef08 Compare

November 1, 2016 17:03

oleg-nenashev added the needs-more-reviews label

oleg-nenashev requested changes


Member

oleg-nenashev left a comment

Thanks for the proposal! It should really help to improve the performance, especially for tests and progressive logs.

core/src/main/java/hudson/model/Run.java

-              import java.io.Reader;
-              import java.io.StringWriter;
+              import java.io.*;

Member

oleg-nenashev Nov 2, 2016

Please do not do this in production code. We have many internal classes and interfaces with "common" names, so a wildcard import may cause implicit, ambiguous behavior.

core/src/main/java/hudson/model/Run.java

+                      }
+                      int lines = 0;
+                      final List<Byte> bytes = new ArrayList<>();

Member

oleg-nenashev Nov 2, 2016

Initialize it with something reasonably small, e.g. 256? Just a performance hack.

Contributor Author

Jimilian Nov 2, 2016 (edited)

bytes.clear() doesn't resize the backing array, so its capacity grows until it reaches the maximum line length and then stays constant. So I don't see any benefit in this hack.

core/src/main/java/hudson/model/Run.java

-                              // operations.
-                              if (lineCount > maxLines)
-                                  logLines.remove(0);
+                      final List<String> lastLines = new ArrayList<>(maxLines);

Member

oleg-nenashev Nov 2, 2016

🐛 If somebody is daring enough to pass Integer.MAX_VALUE in their code, this will consume too much memory. I would cap it at something reasonably small (e.g. 1024) if you really need an ArrayList.
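A minimal sketch of the cap suggested here, assuming a hypothetical helper `newLineBuffer` and using the limit of 1024 from the comment. The list can still grow past the cap on demand; only the up-front allocation is bounded.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the review suggestion; not the merged code.
class BoundedCapacity {

    // Upper bound for the initial allocation, taken from the review comment.
    static final int MAX_INITIAL_CAPACITY = 1024;

    /** Creates an empty list whose initial capacity never exceeds MAX_INITIAL_CAPACITY. */
    static List<String> newLineBuffer(int maxLines) {
        // Clamp to [0, MAX_INITIAL_CAPACITY]; a negative capacity would make ArrayList throw.
        int capacity = Math.max(0, Math.min(maxLines, MAX_INITIAL_CAPACITY));
        return new ArrayList<>(capacity);
    }
}
```

With this, a caller passing `Integer.MAX_VALUE` gets a list pre-sized for 1024 entries instead of a request for a backing array of roughly two billion references.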

core/src/main/java/hudson/model/Run.java

-                      if (lineCount > maxLines)
-                          logLines.set(0, "[...truncated " + (lineCount - (maxLines - 1)) + " lines...]");
+                      if (lines == maxLines) {
+                          lastLines.set(0, "[...truncated lines...]");

Member

oleg-nenashev Nov 2, 2016

🐜 No information about the number of truncated lines is left. I understand that isn't possible in the current implementation, but maybe "[...truncated N bytes...]" could work as a replacement.

core/src/main/java/hudson/model/Run.java

-                      return ConsoleNote.removeNotes(logLines);
+                  private String convertBytesToString(List<Byte> bytes) {
+                      Collections.reverse(bytes);

Member

oleg-nenashev Nov 2, 2016

IIRC there is a method that creates a new collection from an ArrayList or LinkedList without modifying the original one. That should be better from a performance PoV. Or you can just use a reverse iterator, since you have an ArrayList.

Contributor Author

Jimilian Nov 2, 2016 (edited)

Are you talking about converting bytes to String, or about reverse()?
reverse() doesn't create or allocate a new collection; it just swaps elements in place. I can't imagine anything faster.
Anyway, the collections and allocations in this method should be quite small.

core/src/main/java/hudson/model/Run.java

+                  private String convertBytesToString(List<Byte> bytes) {
+                      Collections.reverse(bytes);
+                      Byte[] byteArray = bytes.toArray(new Byte[bytes.size()]);
+                      return new String(ArrayUtils.toPrimitive(byteArray), Charset.forName("UTF-8"));

Member

oleg-nenashev Nov 2, 2016

🐛 Original code was using getCharset()
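To illustrate why a hardcoded UTF-8 decoder is a bug when the log was written in another charset, here is a small sketch. The class `CharsetMismatch` and the method `reencode` are hypothetical; the example text and ISO-8859-1 are assumptions chosen for the demonstration, not details from the PR.

```java
import java.nio.charset.Charset;

// Hypothetical demonstration class; not part of Jenkins core.
class CharsetMismatch {

    /** Encodes text with one charset and decodes the bytes with another. */
    static String reencode(String text, Charset writtenWith, Charset readWith) {
        byte[] raw = text.getBytes(writtenWith);
        // If readWith differs from writtenWith, non-ASCII characters may not survive.
        return new String(raw, readWith);
    }
}
```

For example, `reencode("caf\u00e9", StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8)` does not return the original string, because the single byte 0xE9 is not a valid UTF-8 sequence on its own. Decoding with the same charset that wrote the log, which is what using getCharset() achieves, avoids this.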


          Fix issues from code review

a11058e

Contributor Author

Jimilian commented Nov 2, 2016

@oleg-nenashev Looks better now?

KostyaSha approved these changes


oleg-nenashev reviewed


core/src/test/java/hudson/model/RunTest.java

@@ -189,7 +189,7 @@ public void getLogReturnsAnRightOrder() throws Exception {
                       for (int i = 1; i < 10; i++) {
                           assertEquals("dummy" + (10+i), logLines.get(i));
                       }
-                      assertEquals("[...truncated lines...]", logLines.get(0));
+                      assertEquals("[...truncated 68 B...]", logLines.get(0));

Member

oleg-nenashev Nov 3, 2016

Not perfect, but this is a standard method

Member

oleg-nenashev commented Nov 3, 2016

LGTM 👍

oleg-nenashev approved these changes


Member

daniel-beck commented Nov 3, 2016

Does this affect the gzipped log file use case?

Contributor Author

Jimilian commented Nov 3, 2016

@daniel-beck good catch! It would not work for gzipped files. I can check the file type and switch to the old implementation if the file is gzipped, because AFAIK it's impossible to read a gzipped file from the end.
If you see another option, I would appreciate advice.
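A rough sketch of the fallback proposed here, under stated assumptions: the class `GzipAwareTail` and its methods are hypothetical, and detecting gzip by its two magic bytes (0x1f, 0x8b) is one possible way to "check file type", not necessarily how Jenkins core recognises compressed logs. A gzipped log has to be decompressed from the start, so the sketch reads it sequentially and keeps only a sliding window of the last maxLines lines.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of Jenkins core.
class GzipAwareTail {

    /** Returns true if the file starts with the gzip magic bytes 0x1f 0x8b. */
    static boolean isGzipped(File file) throws IOException {
        try (InputStream in = new FileInputStream(file)) {
            int b1 = in.read();
            int b2 = in.read();
            return b1 == 0x1f && b2 == 0x8b;
        }
    }

    /** Sequential fallback: reads the whole (possibly gzipped) file, keeping only the last maxLines lines. */
    static List<String> lastLinesSequential(File file, int maxLines, Charset charset) throws IOException {
        if (maxLines <= 0) {
            return new ArrayList<>();
        }
        boolean gz = isGzipped(file);
        Deque<String> window = new ArrayDeque<>();
        try (InputStream raw = new FileInputStream(file);
             InputStream in = gz ? new GZIPInputStream(raw) : raw;
             BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (window.size() == maxLines) {
                    window.removeFirst(); // O(1) on ArrayDeque
                }
                window.addLast(line);
            }
        }
        return new ArrayList<>(window);
    }
}
```

A caller could use `isGzipped` to choose between this sequential path and a read-from-the-end path for plain files.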

Member

KostyaSha commented Nov 3, 2016

> Does this affect the gzipped log file use case?

Is there a test case for it?

Contributor Author

Jimilian commented Nov 3, 2016

@KostyaSha No, I found only one test case for this method in core: it checks that the method returns an empty list instead of null. After that I wrote two tests to 'document' the current behaviour before starting to rewrite this method.

Member

KostyaSha commented Nov 3, 2016

So where do gzipped logs exist at all?

Member

KostyaSha commented Nov 3, 2016

@daniel-beck how could the old implementation use gzipped log files? The method directly opens the file and reads lines, so why would this change affect gzipped logs?

Member

daniel-beck commented Nov 3, 2016

@KostyaSha Not sure, hence me asking about it. If there's no impact (e.g. because this code is never called on a finalized log file), that's great. But a number of other methods of Run support gzipped log files.

Member

oleg-nenashev commented Nov 3, 2016

I do not believe it impacts gzipped logs, since the change operates within an existing public method. GZIP build logging should have a completely different implementation/override. One of IMPLs is here: https://wiki.jenkins-ci.org/display/JENKINS/Compress+Build+Log+Plugin.

If the method is broken for this case, it is unlikely to be a regression.

Member

KostyaSha commented Nov 3, 2016

> One of IMPLs is here: https://wiki.jenkins-ci.org/display/JENKINS/Compress+Build+Log+Plugin.

Maintainer(s): Daniel Beck

Member

daniel-beck commented Nov 3, 2016

@KostyaSha @oleg-nenashev I have no idea what you are referring to. The core feature handling compressed build logs – the concern here – predates the sample-code sized plugin that automatically compresses them by half a decade: d300ff8

Contributor Author

Jimilian commented Nov 3, 2016

@daniel-beck But the previous implementation of getLog didn't use GZIPInputStream, which means it didn't support gzipped logs either.

Member

oleg-nenashev commented Nov 5, 2016

@daniel-beck @KostyaSha WDYT?

oleg-nenashev added the ready-for-merge label

Member

daniel-beck commented Nov 5, 2016

As @Jimilian points out, the previous version had no gzip support either. So that part is okay.

Beyond that, no opinion, as I haven't had a chance to look into this further.

oleg-nenashev changed the title ~~Optimize get log method~~ [FIXED JENKINS-39535] - Optimize get log method

Member

oleg-nenashev commented Nov 6, 2016

Merging since we have 2 +1s. I doubt it's a subject for backporting, but I've created JENKINS-39535 just in case

oleg-nenashev merged commit 2e8c3be into jenkinsci:master

Jimilian deleted the optimize_getLog_method branch

November 6, 2016 11:18

oleg-nenashev added a commit that referenced this pull request


          Changelog: Noting #2607, #2609, #2610, #2611 and #2608

bf59cf6

Member

daniel-beck commented Nov 7, 2016

Did this cause https://issues.jenkins-ci.org/browse/JENKINS-39555 ?

Member

jglick commented Nov 7, 2016

@daniel-beck hard to see how; Stage View uses Pipeline-specific log storage, not methods on Run.

Member

jglick commented Sep 28, 2018

> improve the performance, especially for tests and progressive log

Only for tests (JenkinsRule). Run.getLog(int) is unused in production code AFAIK.

jglick mentioned this pull request

[JEP-210] Log handling rewrite jenkinsci/workflow-job-plugin#27

Merged

11 tasks

Contributor Author

Jimilian commented Oct 24, 2018

@jglick it was used by some plugins. In our case it was a plugin that sends mail notifications when a build fails.


Labels

needs-more-reviews ready-for-merge