Add blog post for my final project #538


Open: wants to merge 10 commits into base `2025sp`

Conversation

@InnovativeInventor (Contributor) commented May 13, 2025

Closes #510.

@sampsyo (Owner) left a comment

Nice work here, @InnovativeInventor! This was a clear explanation of the approach and the goal, and the outcome seems more or less expected. You'll see one big question in my inline comments: did you find any bugs? 😃 Either way, it might be nice to state that outcome in the post somewhere.

Comment on lines 76 to 78
for $|\epsilon| \le u$. Different tools may use varying overapproximations, but the
principle is the same: to tractably verify you (typically) need to
overapproximate.
@sampsyo (Owner):

Can you briefly mention the relationship between this standard model and actual floating-point reality? Presumably, the relationship is: actual floating point evaluation has error that is bounded by \epsilon, but the actual amount of error might differ between different operations and different inputs.

@InnovativeInventor (Contributor, Author):

Thanks for the feedback. I added the following text to clarify the relationship between these two:

+To be precise, for every operation, the following overapproximation (of
+the IEEE floating-point spec) holds for some unit round-off value $u$:

 > $(op_{float} \ x \ y) = (op_{real} \ x \ y) * (1 + \epsilon)$

 for $|\epsilon| \le u$. Different tools may use varying overapproximations, but
 the principle is the same: to tractably verify you (typically) need to
+overapproximate. The IEEE floating-point spec, by contrast, is a fully
+executable and deterministic spec.

Does this help?
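For a concrete sense of what the standard model asserts, here is a small illustrative Python sketch (not the post's actual tooling) that checks the $(1 + \epsilon)$ bound empirically for double-precision multiplication, where the unit round-off for round-to-nearest is $u = 2^{-53}$:

```python
from fractions import Fraction

U = 2.0 ** -53  # unit round-off for IEEE 754 double precision (round-to-nearest)

def relative_error(x: float, y: float) -> float:
    """Relative error of the floating-point product vs. the exact product."""
    exact = Fraction(x) * Fraction(y)   # exact rational product of the inputs
    approx = Fraction(x * y)            # correctly rounded double product
    return abs(float((approx - exact) / exact))

# The standard model says op_float = op_real * (1 + eps) with |eps| <= u,
# so each correctly rounded multiplication has relative error at most u.
for x, y in [(0.1, 0.3), (1.0 / 3.0, 7.0), (1e10, 3.14159)]:
    assert relative_error(x, y) <= U
```

The assertion holds because a correctly rounded operation on normal (non-overflowing, non-underflowing) doubles introduces at most half an ulp of error, which is exactly what the $|\epsilon| \le u$ overapproximation captures.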

@sampsyo (Owner):

It helps some, but would it be correct to say that actual IEEE FP semantics have error that is bounded by the "standard error"? Kind of like my comment above, I want to know the relationship between these two things.

Comment on lines 84 to 85
other words, the testing game is to find input $x$ and $\epsilon$-trace
$\epsilon_1$, $\epsilon_2$, $\epsilon_3$ ... added at run-time such that we can violate
@sampsyo (Owner):

What is an "$\epsilon$-trace"? I think I get it, but it would be nice to allocate one additional sentence to define what this new term means. Especially because the rest of this paragraph seems to imply that you'll always add the same \epsilon to every operation…
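One way to make the term concrete (a sketch under assumed semantics, not the post's actual implementation) is a small interpreter that consumes one $\epsilon$ per operation, multiplying each operation's real result by $(1 + \epsilon_i)$:

```python
def eval_with_trace(ops, x, eps_trace):
    """Evaluate a straight-line program under the approximate semantics:
    the i-th operation's real result is multiplied by (1 + eps_trace[i])."""
    acc = x
    for (op, arg), eps in zip(ops, eps_trace):
        if op == "add":
            acc = (acc + arg) * (1 + eps)
        elif op == "mul":
            acc = (acc * arg) * (1 + eps)
        else:
            raise ValueError(f"unknown op: {op}")
    return acc

# Hypothetical program ((x + 1) * 3), with a *distinct* epsilon per step.
program = [("add", 1.0), ("mul", 3.0)]
real_result = eval_with_trace(program, 2.0, [0.0, 0.0])       # no rounding: 9.0
perturbed = eval_with_trace(program, 2.0, [1e-3, -1e-3])      # one trace choice
```

Under this reading, an "$\epsilon$-trace" is just the sequence $\epsilon_1, \epsilon_2, \ldots$ fed to the operations in execution order, and the testing game searches over both the input and that sequence.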

Comment on lines +137 to +139
For this program, the $\epsilon$-trace we computed happens to be the
**worst-case**, which is awesome. This technique generalizes in a pretty
straightforward manner to a backwards static analysis. [^2]
@sampsyo (Owner):

Does that static analysis always produce the worst-case epsilon-trace? I'm assuming not… if not, can you explain a little bit more about when it is conservative (in the sense that there may exist an even worse/larger epsilon-trace)? Seeing one example of such a program would be instructive!

- under an approximate semantics that adds a trace of $\epsilon$s at run-time.

To evaluate, I selected a few benchmarks from the
[FPBench](https://fpbench.org/benchmarks.html) I had already written a parser
@sampsyo (Owner):

Suggested change:
-[FPBench](https://fpbench.org/benchmarks.html) I had already written a parser
+[FPBench](https://fpbench.org/benchmarks.html). I had already written a parser

Comment on lines 155 to 157
Below is a log-scale violin plot showing the distribution of absolute error
abstractly witnessed by my prototype tool. [^4] (_Aside: I think more people should use
violin plots._)
@sampsyo (Owner):

Can you say a little bit more about where this distribution came from? It sounded like, from the above description, your analysis would produce a single \epsilon-trace, and therefore there would be a single error value for every benchmark you try. What are you randomly sampling? The input values? If so, from what ranges, and with what distributions?

Comment on lines +181 to +182
how to find a good input $x$. Currently, the testing tool uniformly samples from
the input space. One technique I wrote in my initial proposal (but did not have
@sampsyo (Owner):

Ah, maybe this answers my question above. But it seems worth hoisting earlier…


| Name | Min. error sampled | Max. error sampled | FPTaylor Guarantee | Daisy Guarantee |
|:----:|:-----------------:|:-----------------:|:-:|:-:|
| rigidBody1 | 0.0 | 3.409494e-13 | 2.948752e-13 | 2.948752e-13 |
@sampsyo (Owner):

For this case (and a few others), should I be alarmed that the max error is greater than the guarantees from the tools? Does this mean you found a bug??
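The comparison the comment is making can be written out directly. This is an illustrative check only (variable names are hypothetical, and the max sampled error is taken as the scientific-notation value `3.409494e-13`):

```python
# Illustrative check for the rigidBody1 row: a sound guarantee must be an
# upper bound on every error the tester can actually witness.
sampled_max = 3.409494e-13      # max error found by sampling (from the table)
fptaylor_bound = 2.948752e-13   # FPTaylor's claimed guarantee
daisy_bound = 2.948752e-13      # Daisy's claimed guarantee

# If this is True, either the tools' bounds are unsound for this benchmark
# or the sampled "error" is measured under a different (looser) semantics.
exceeds = sampled_max > max(fptaylor_bound, daisy_bound)
```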

@sampsyo sampsyo added the 2025sp label May 16, 2025
@sampsyo (Owner) commented May 28, 2025

Hi, @InnovativeInventor! Looks like you've added a few commits here—let me know when you think this is ready and I'll take another look (and publish).

@sampsyo (Owner) commented Jun 5, 2025

Just one last reminder about the above, @InnovativeInventor—let me know when it's time to take another look.

Successfully merging this pull request may close these issues.

Project Proposal: Tractable Validation of Floating-Point Verification Tools