Add blog post for my final project #538
base: 2025sp
Conversation
Nice work here, @InnovativeInventor! This was a clear explanation of the approach and the goal, and the outcome seems more or less expected. You'll see one big question in my inline comments: did you find any bugs? 😃 Either way, it might be nice to state that outcome in the post somewhere.
for $|\epsilon| \le u$. Different tools may use varying overapproximations, but the
principle is the same: to tractably verify, you (typically) need to
overapproximate.
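As a concrete instance (assuming IEEE double precision with round-to-nearest, where $u = 2^{-53}$): for $x = 1$ and $y = 2^{-60}$, the floating-point sum rounds to exactly $1$, so $1 = (1 + 2^{-60})(1 + \epsilon)$ gives $\epsilon \approx -2^{-60}$, comfortably within the $|\epsilon| \le u$ bound.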
Can you briefly mention the relationship between this standard model and actual floating-point reality? Presumably, the relationship is: actual floating point evaluation has error that is bounded by \epsilon, but the actual amount of error might differ between different operations and different inputs.
Thanks for the feedback. I added the following text to clarify the relationship between these two:
> To be precise, for every operation, the following overapproximation (of
> the IEEE floating-point spec) holds for some unit round-off value $u$:
>
> $(op_{float} \ x \ y) = (op_{real} \ x \ y) * (1 + \epsilon)$
>
> for $|\epsilon| \le u$. Different tools may use varying overapproximations, but the
> principle is the same: to tractably verify, you (typically) need to
> overapproximate. The IEEE floating-point spec, by contrast, is a fully
> executable and deterministic spec.
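For a concrete check of this model, here is a minimal sketch (not part of the post; it assumes IEEE double precision with round-to-nearest and uses Python's `fractions` for the exact value):

```python
# Minimal sketch: empirically check the standard model
#   op_float(x, y) == op_real(x, y) * (1 + eps)   with |eps| <= u = 2^-53
# for double-precision addition and multiplication.
import random
from fractions import Fraction

U = 2.0 ** -53  # unit round-off for IEEE binary64, round-to-nearest

def check(op_float, op_real, trials=100_000):
    worst = 0.0
    for _ in range(trials):
        x, y = random.uniform(-1e6, 1e6), random.uniform(-1e6, 1e6)
        exact = op_real(Fraction(x), Fraction(y))  # exact rational result
        if exact == 0:
            continue
        eps = float((Fraction(op_float(x, y)) - exact) / exact)
        worst = max(worst, abs(eps))
    assert worst <= U, f"standard model violated: {worst} > {U}"
    return worst

print(check(lambda a, b: a + b, lambda a, b: a + b))  # addition
print(check(lambda a, b: a * b, lambda a, b: a * b))  # multiplication
```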
Does this help?
It helps some, but would it be correct to say that actual IEEE FP semantics have error that is bounded by the "standard error"? Kind of like my comment above, I want to know the relationship between these two things.
other words, the testing game is to find input $x$ and $\epsilon$-trace
$\epsilon_1$, $\epsilon_2$, $\epsilon_3$ ... added at run-time such that we can violate
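To make the new term concrete, here is a minimal sketch (illustrative, not the post's implementation) of evaluating a tiny straight-line program under the approximate semantics, where an $\epsilon$-trace supplies one $\epsilon_i$ per floating-point operation:

```python
# Illustrative sketch: evaluate x*x + x under the approximate semantics,
# where each floating-point operation i gets its own perturbation eps_i
# with |eps_i| <= u, supplied by the eps-trace.
U = 2.0 ** -53

def approx_eval(x, eps_trace):
    e = iter(eps_trace)
    t1 = (x * x) * (1 + next(e))   # op 1: multiplication, perturbed by eps_1
    t2 = (t1 + x) * (1 + next(e))  # op 2: addition, perturbed by eps_2
    return t2

def unperturbed(x):                # same program with every eps_i = 0
    return x * x + x

trace = [U, -U]                    # one eps per operation, not necessarily equal
print(approx_eval(0.3, trace) - unperturbed(0.3))
```

The testing game then searches over both the input $x$ and the individual $\epsilon_i$ in the trace.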
What is an "$\epsilon$-trace"? I think I get it, but it would be nice to allocate one additional sentence to define what this new term means. Especially because the rest of this paragraph seems to imply that you'll always add the same \epsilon to every operation…
For this program, the $\epsilon$-trace we computed happens to be the
**worst-case**, which is awesome. This technique generalizes in a pretty
straightforward manner to a backwards static analysis. [^2]
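For intuition (not necessarily the post's analysis): to first order, the perturbed result is $\hat{f}(x) \approx f(x) + \sum_i c_i \epsilon_i$, where $c_i$ is the sensitivity of the output to the $i$-th operation's rounding; choosing $\epsilon_i = u \cdot \operatorname{sign}(c_i)$ maximizes that first-order term, and a backwards pass can compute the $c_i$ much like reverse-mode differentiation. Whether the resulting trace is globally worst-case depends on the higher-order terms.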
Does that static analysis always produce the worst-case epsilon-trace? I'm assuming not… if not, can you explain a little bit more about when it is conservative (in the sense that there may exist an even worse/larger epsilon-trace)? Seeing one example of such a program would be instructive!
- under an approximate semantics that adds a trace of $\epsilon$s at run-time.

To evaluate, I selected a few benchmarks from the
[FPBench](https://fpbench.org/benchmarks.html) I had already written a parser
[FPBench](https://fpbench.org/benchmarks.html) I had already written a parser
[FPBench](https://fpbench.org/benchmarks.html). I had already written a parser |
Below is a log-scale violin plot showing the distribution of absolute error
abstractly witnessed by my prototype tool. [^4] (_Aside: I think more people should use
violin plots._)
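For reference, a matplotlib sketch of this kind of plot (the benchmark names and error arrays below are placeholders, not the post's data):

```python
# Placeholder sketch: log-scale violin plot of per-sample absolute error,
# one violin per benchmark. Replace `errors` with the tool's real output.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
errors = {
    "benchmark A": rng.lognormal(mean=-30, sigma=1.0, size=500),
    "benchmark B": rng.lognormal(mean=-28, sigma=1.2, size=500),
}

fig, ax = plt.subplots()
ax.violinplot(list(errors.values()), showmedians=True)
ax.set_xticks(range(1, len(errors) + 1))
ax.set_xticklabels(list(errors.keys()))
ax.set_yscale("log")
ax.set_ylabel("absolute error")
plt.show()
```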
Can you say a little bit more about where this distribution came from? It sounded like, from the above description, your analysis would produce a single \epsilon-trace, and therefore there would be a single error value for every benchmark you try. What are you randomly sampling? The input values? If so, from what ranges, and with what distributions?
how to find a good input $x$. Currently, the testing tool uniformly samples from
the input space. One technique I wrote in my initial proposal (but did not have
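For context, the uniform-sampling loop might look something like this sketch; it only illustrates the input-sampling side (the post's tool witnesses error abstractly via $\epsilon$-traces rather than by measuring concrete rounding error), and the benchmark body and input range are written from memory of FPBench's rigidBody1, so treat them as placeholders:

```python
# Illustrative sketch of uniform input sampling: draw inputs from a box,
# evaluate in doubles and in exact rational arithmetic, record absolute error.
import random
from fractions import Fraction

def rigid_body1_float(x1, x2, x3):
    return -x1 * x2 - 2 * x2 * x3 - x1 - x3

def rigid_body1_exact(x1, x2, x3):
    x1, x2, x3 = Fraction(x1), Fraction(x2), Fraction(x3)
    return -x1 * x2 - 2 * x2 * x3 - x1 - x3

def sample_errors(n=10_000, lo=-15.0, hi=15.0):
    errs = []
    for _ in range(n):
        xs = [random.uniform(lo, hi) for _ in range(3)]
        errs.append(abs(Fraction(rigid_body1_float(*xs)) - rigid_body1_exact(*xs)))
    return float(min(errs)), float(max(errs))

print(sample_errors())
```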
Ah, maybe this answers my question above. But it seems worth hoisting earlier…
| Name | Min. error sampled | Max. error sampled | FPTaylor Guarantee | Daisy Guarantee |
|:----:|:------------------:|:------------------:|:------------------:|:---------------:|
| rigidBody1 | 0.0 | 3.409494e-13 | 2.948752e-13 | 2.948752e-13 |
For this case (and a few others), should I be alarmed that the max error is greater than the guarantees from the tools? Does this mean you found a bug??
Hi, @InnovativeInventor! Looks like you've added a few commits here—let me know when you think this is ready and I'll take another look (and publish).
Just one last reminder about the above, @InnovativeInventor—let me know when it's time to take another look.
Closes #510.