Add blog post for my final project #538


Open: wants to merge 10 commits into base `2025sp`

Conversation

@InnovativeInventor (Contributor) commented May 13, 2025

Closes #510.

@sampsyo (Owner) left a comment

Nice work here, @InnovativeInventor! This was a clear explanation of the approach and the goal, and the outcome seems more or less expected. You'll see one big question in my inline comments: did you find any bugs? 😃 Either way, it might be nice to state that outcome in the post somewhere.

Comment on lines 76 to 78
for $|\epsilon| \le u$. Different tools may use varying overapproximations, but the
principle is the same: to tractably verify you (typically) need to
overapproximate.
@sampsyo (Owner):

Can you briefly mention the relationship between this standard model and actual floating-point reality? Presumably, the relationship is: actual floating point evaluation has error that is bounded by \epsilon, but the actual amount of error might differ between different operations and different inputs.

@InnovativeInventor (Contributor, Author):

Thanks for the feedback. I added the following text to clarify the relationship between these two:

+To be precise, for every operation, the following overapproximation (of
+the IEEE floating-point spec) holds for some unit round-off value $u$:

 > $(op_{float} \ x \ y) = (op_{real} \ x \ y) * (1 + \epsilon)$

 for $|\epsilon| \le u$. Different tools may use varying overapproximations, but
 the principle is the same: to tractably verify you (typically) need to
+overapproximate. The IEEE floating-point spec, by contrast, is a fully
+executable and deterministic spec.

Does this help?
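For a concrete sense of what the standard model asserts, here is a small illustrative Python sketch (not the post's actual tooling) that checks the $(1 + \epsilon)$ bound empirically for double-precision multiplication, where the unit round-off for round-to-nearest is $u = 2^{-53}$:

```python
from fractions import Fraction

U = 2.0 ** -53  # unit round-off for IEEE 754 double precision (round-to-nearest)

def relative_error(x: float, y: float) -> float:
    """Relative error of the floating-point product vs. the exact product."""
    exact = Fraction(x) * Fraction(y)   # exact rational product of the inputs
    approx = Fraction(x * y)            # correctly rounded double product
    return abs(float((approx - exact) / exact))

# The standard model says op_float = op_real * (1 + eps) with |eps| <= u,
# so each correctly rounded multiplication has relative error at most u.
for x, y in [(0.1, 0.3), (1.0 / 3.0, 7.0), (1e10, 3.14159)]:
    assert relative_error(x, y) <= U
```

The assertion holds because a correctly rounded operation on normal (non-overflowing, non-underflowing) doubles introduces at most half an ulp of error, which is exactly what the $|\epsilon| \le u$ overapproximation captures.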

@sampsyo (Owner):

It helps some, but would it be correct to say that actual IEEE FP semantics have error that is bounded by the "standard error"? Kind of like my comment above, I want to know the relationship between these two things.

Comment on lines 84 to 85
other words, the testing game is to find input $x$ and $\epsilon$-trace
$\epsilon_1$, $\epsilon_2$, $\epsilon_3$ ... added at run-time such that we can violate
@sampsyo (Owner):

What is an "$\epsilon$-trace"? I think I get it, but it would be nice to allocate one additional sentence to define what this new term means. Especially because the rest of this paragraph seems to imply that you'll always add the same \epsilon to every operation…
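One way to make the term concrete (a sketch under assumed semantics, not the post's actual implementation) is a small interpreter that consumes one $\epsilon$ per operation, multiplying each operation's real result by $(1 + \epsilon_i)$:

```python
def eval_with_trace(ops, x, eps_trace):
    """Evaluate a straight-line program under the approximate semantics:
    the i-th operation's real result is multiplied by (1 + eps_trace[i])."""
    acc = x
    for (op, arg), eps in zip(ops, eps_trace):
        if op == "add":
            acc = (acc + arg) * (1 + eps)
        elif op == "mul":
            acc = (acc * arg) * (1 + eps)
        else:
            raise ValueError(f"unknown op: {op}")
    return acc

# Hypothetical program ((x + 1) * 3), with a *distinct* epsilon per step.
program = [("add", 1.0), ("mul", 3.0)]
real_result = eval_with_trace(program, 2.0, [0.0, 0.0])       # no rounding: 9.0
perturbed = eval_with_trace(program, 2.0, [1e-3, -1e-3])      # one trace choice
```

Under this reading, an "$\epsilon$-trace" is just the sequence $\epsilon_1, \epsilon_2, \ldots$ fed to the operations in execution order, and the testing game searches over both the input and that sequence.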

Comment on lines +137 to +139
For this program, the $\epsilon$-trace we computed happens to be the
**worst-case**, which is awesome. This technique generalizes in a pretty
straightforward manner to a backwards static analysis. [^2]
@sampsyo (Owner):

Does that static analysis always produce the worst-case epsilon-trace? I'm assuming not… if not, can you explain a little bit more about when it is conservative (in the sense that there may exist an even worse/larger epsilon-trace)? Seeing one example of such a program would be instructive!

- under an approximate semantics that adds a trace of $\epsilon$s at run-time.

To evaluate, I selected a few benchmarks from the
[FPBench](https://fpbench.org/benchmarks.html) I had already written a parser
@sampsyo (Owner):

Suggested change:
-[FPBench](https://fpbench.org/benchmarks.html) I had already written a parser
+[FPBench](https://fpbench.org/benchmarks.html). I had already written a parser

Comment on lines 155 to 157
Below is a log-scale violin plot showing the distribution of absolute error
abstractly witnessed by my prototype tool. [^4] (_Aside: I think more people should use
violin plots._)
@sampsyo (Owner):

Can you say a little bit more about where this distribution came from? It sounded like, from the above description, your analysis would produce a single \epsilon-trace, and therefore there would be a single error value for every benchmark you try. What are you randomly sampling? The input values? If so, from what ranges, and with what distributions?

Comment on lines +181 to +182
how to find a good input $x$. Currently, the testing tool uniformly samples from
the input space. One technique I wrote in my initial proposal (but did not have
@sampsyo (Owner):

Ah, maybe this answers my question above. But it seems worth hoisting earlier…


| Name | Min. error sampled | Max. error sampled | FPTaylor Guarantee | Daisy Guarantee |
|:----:|:-----------------:|:-----------------:|:-:|:-:|
| rigidBody1 | 0.0 | 3.409494e-13 | 2.948752e-13 | 2.948752e-13 |
@sampsyo (Owner):

For this case (and a few others), should I be alarmed that the max error is greater than the guarantees from the tools? Does this mean you found a bug??
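The comparison the comment is making can be written out directly. This is an illustrative check only (variable names are hypothetical, and the max sampled error is taken as the scientific-notation value `3.409494e-13`):

```python
# Illustrative check for the rigidBody1 row: a sound guarantee must be an
# upper bound on every error the tester can actually witness.
sampled_max = 3.409494e-13      # max error found by sampling (from the table)
fptaylor_bound = 2.948752e-13   # FPTaylor's claimed guarantee
daisy_bound = 2.948752e-13      # Daisy's claimed guarantee

# If this is True, either the tools' bounds are unsound for this benchmark
# or the sampled "error" is measured under a different (looser) semantics.
exceeds = sampled_max > max(fptaylor_bound, daisy_bound)
```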

@sampsyo sampsyo added the 2025sp label May 16, 2025
@sampsyo (Owner) commented May 28, 2025

Hi, @InnovativeInventor! Looks like you've added a few commits here—let me know when you think this is ready and I'll take another look (and publish).

@sampsyo (Owner) commented Jun 5, 2025

Just one last reminder about the above, @InnovativeInventor—let me know when it's time to take another look.

Successfully merging this pull request may close these issues.

Project Proposal: Tractable Validation of Floating-Point Verification Tools