Fix data rereads #1529

cbowman0 · 2016-06-01T16:47:55Z

Cherry-picked the commits to fix issue #1521 against the master branch.

My peers were experiencing this exact problem yesterday. We are testing this fix now.

JeanFred · 2016-06-01T17:43:56Z

webapp/graphite/render/functions.py

+    windowPoints = previewSeconds / data.step
+    deviation = TimeSeries(data.name, data.start + previewSeconds, data.end, data.step, data[windowPoints:])
+    deviation.pathExpression = data.pathExpression
+


This makes Travis lint check fail:

graphite-web/webapp/graphite/render/functions.py:2331:1: W293 blank line contains whitespace

codecov-io · 2016-06-01T18:13:54Z

Current coverage is 49.40%

Merging #1529 into master will increase coverage by 3.53%

@@             master      #1529   diff @@
==========================================
  Files            52         52          
  Lines          5790       5779    -11   
  Methods           0          0          
  Messages          0          0          
  Branches       1116       1111     -5   
==========================================
+ Hits           2656       2855   +199   
+ Misses         2927       2713   -214   
- Partials        207        211     +4

Powered by Codecov. Last updated by c44feea...b4fe3b2

obfuscurity · 2016-06-09T17:01:07Z

@cbowman0 How well has this been working for you in production (against master)?

cbowman0 · 2016-06-09T17:01:09Z

I will attempt to get test coverage on these once I work out how to simulate the request object. If someone has any ideas, please let me know.

cbowman0 · 2016-06-09T17:18:55Z

@obfuscurity I'll let you know early next week. I deployed the changes to the primary cluster today.

cbowman0 · 2016-06-17T21:21:32Z

No complaints from my users.

I discovered that movingAverage() on data that was already retrieved by a non-summary function like perSecond() was going back to retrieve earlier data, which it needs to do to get the first points in the graph correct. However, in retrieving this data it was not using the original wildcard but polling each datapoint individually. On queries with over a thousand datapoints in a clustered environment this was incredibly resource intensive and slow.

This change passes the original tokens to each function as an element 'args' in the resourceContext. So if a function needs to retrieve the original data with a time offset it can do so, and retrieve the data in the same way as it was originally retrieved, with wildcards intact. This is efficient.

Four functions used _fetchWithBootstrap, so I updated all of them to use the new mechanism. There are a bunch of other functions that also re-pull data but do not use _fetchWithBootstrap. I believe they could also benefit from this, but I have not addressed them.

I verified the output against the original function results and did find some discrepancies, but I believe these are bugs in the old code, so the results should now be better as well as faster. These discrepancies were:

* In movingMedian, _fetchWithBootstrap sometimes inserted an incorrect Null before the time window, changing the result for the first few points.
* The old version of the holtWinters functions had a discontinuity around a 7 day time selection: 167 hours gave significantly different results from 169 hours. This discontinuity is now gone, but the results for shorter time windows are now significantly changed. I'm not a statistician, but I suspect these functions are intended for long time windows and the discontinuity was a bug brought on by mixing two whisper ranges together.

Conflicts:
.gitignore
webapp/graphite/render/evaluator.py
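The idea in the commit message above can be illustrated with a small sketch. Everything here is a hypothetical stand-in rather than graphite-web code: `STORE`, `fetch`, `FETCH_CALLS`, `moving_average`, and the shape of the `context` dict are assumptions made for illustration, and the averaging window is assumed to cover the points preceding each output point. The sketch only shows the general pattern: re-run the original wildcard expression once with the start time shifted back by the window length, then emit only the points inside the requested range.

```python
import fnmatch

STEP = 60  # seconds per datapoint (assumed for this sketch)

# Toy in-memory store: metric name -> value as a function of the timestamp.
STORE = {
    "web.a.requests": lambda t: t // STEP,
    "web.b.requests": lambda t: 2 * (t // STEP),
}

FETCH_CALLS = []  # records every pattern sent to the toy store


def fetch(pattern, start, end):
    """Return {name: [values]} for all metrics matching pattern in [start, end)."""
    FETCH_CALLS.append(pattern)
    return {
        name: [fn(t) for t in range(start, end, STEP)]
        for name, fn in STORE.items()
        if fnmatch.fnmatch(name, pattern)
    }


def moving_average(context, window_points):
    """Moving average that bootstraps by re-running the original expression.

    context["args"] holds the original (wildcard) expression, so the extra
    leading data is fetched in one call instead of one call per series.
    """
    preview_seconds = window_points * STEP
    # Shift the start back so the first requested point has a full window.
    data = fetch(context["args"], context["start"] - preview_seconds, context["end"])
    result = {}
    for name, values in data.items():
        averaged = []
        # Skip the preview points: output begins at the requested start time.
        for i in range(window_points, len(values)):
            window = values[i - window_points:i]
            averaged.append(sum(window) / float(len(window)))
        result[name] = averaged
    return result
```

Calling `moving_average({"args": "web.*.requests", "start": 600, "end": 900}, 3)` in this sketch issues a single wildcard fetch for both series, which is the property the change is after.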

Conflicts: webapp/graphite/render/functions.py

Conflicts: .gitignore

* Remove extraneous whitespace on a blank line in graphite/render/functions.py
* Add tests for movingMedian()
* Add tests for movingAverage()
* First cut at test coverage of holtWinters* functions.

cbowman0 · 2016-06-22T15:35:54Z

Any objections to merging this?

obfuscurity · 2016-06-23T13:40:41Z

@cbowman0 I haven't had time to review this yet. Sorry, been hands-full with $newjob and Monitorama. Will try to look closer while I'm traveling tomorrow.

obfuscurity · 2016-07-01T17:18:25Z

@cbowman0 Sorry for the delay. Been HAM on Monitorama lately. This looks sane and we already merged a version of this in 0.9.x, so there's no reason not to get this in master and let people hammer away at it. Thanks for this fix.

JeanFred reviewed Jun 1, 2016

obfuscurity mentioned this pull request Jun 9, 2016

replace _fetchWithBootstrap #1523

Merged

arielnh56 added 4 commits June 20, 2016 16:17

revert gitignore

87ff9ec

Conflicts: .gitignore

removed vi dross

a0d1afb

cbowman0 force-pushed the fixDataRereads branch from d952505 to c0ae415 Compare June 21, 2016 20:05

Add test coverage

b4fe3b2

cbowman0 force-pushed the fixDataRereads branch from c0ae415 to b4fe3b2 Compare June 21, 2016 20:52

obfuscurity merged commit 69f8814 into graphite-project:master Jul 1, 2016

iain-buclaw-sociomantic mentioned this pull request Jul 6, 2016

Sync graphite-api with upstream graphite-web brutasse/graphite-api#173

Merged

5 tasks

cbowman0 mentioned this pull request Jul 7, 2016

Linear complexity for movingAverage function #1568

Merged

cbowman0 deleted the fixDataRereads branch August 18, 2016 17:41

iain-buclaw-sociomantic mentioned this pull request Apr 3, 2017

movingAverage shows no data under ceres file format if time window is "empty" at the start graphite-project/ceres#18

Open
