Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect dumping of parsed numbers with exponents, but without decimal places #230

Closed
nlohmann opened this issue Apr 5, 2016 · 20 comments

Comments

@nlohmann
Copy link
Owner

nlohmann commented Apr 5, 2016

The new floating point representation (#201) has a bug when it comes to negative exponents. Number 2342e-2 is correctly duped as 23.42, but when it is parsed from a string, the output is 2e01.

Example code:

json j2a = 2342e-2;
json j2b = json::parse("2342e-2");

CHECK(j2a.dump() == "23.42");
CHECK(j2b.dump() == "2342e-2");

Output:

FAILED:
  CHECK( j2b.dump() == "2342e-2" )
with expansion:
  "2e01" == "2342e-2"

@twelsby Any idea about this?

@nlohmann nlohmann added this to the Release 2.0.0 milestone Apr 5, 2016
@nlohmann
Copy link
Owner Author

There is also an issue with other parsed numbers with exponents, but without decimal separator:

2342e2 and 23420e1 both serialize to 2e05.

@nlohmann nlohmann changed the title Incorrect dumping of parsed numbers with negative exponents Incorrect dumping of parsed numbers with exponents, but without decimal places Apr 10, 2016
@nlohmann nlohmann added the state: help needed the issue needs help to proceed label Apr 10, 2016
@nlohmann
Copy link
Owner Author

❗ This bug is currently a blocker for the 2.0.0 release, because the serialization of parsed numbers is not correct.

@nlohmann
Copy link
Owner Author

@twelsby @gregmarr - Any idea how to fix this?

@camerondruyor
Copy link

Caveat: I'm new to this project (like 1 hour), so please forgive my ignorance.

I drilled down into dump() to try to isolate the problem to either the parser or the formatting. For this problem case, the underlying data is:

m_type.bits.exp_cap    = 0 
m_type.bits.exp_plus   = 0 
m_type.bits.has_exp    = 1 
m_type.bits.parsed     = 1 
m_type.bits.precision  = 0 
m_type.bits.type       = 7 
m_value                = 23.420000

It seems to me, that the precision should be 4, not 0. Am I understanding this correctly?

@camerondruyor
Copy link

Never mind, after looking through the lexer, I realized precision isn't sig figs, but number-of-figs-after-radix in this context. In which case, 2342e-2 is being lexed and parsed correctly, so I'll look back into the dump formatting again.

@camerondruyor
Copy link

Since we're using snprintf with "%.*e" to generate the string, wouldn't we want 2342e-2 to dump as 2.342e01? (since it is going to print one leading digit for scientific notation)

If I detect this particular case, and pack the buffer with precision=3, I can get the test to pass.

My spidey sense tells me that we might need a little more logic in the lexer at this point:

            // If no radix point was found then precision would now be set to
            // the number of digits, which is wrong - clear it.
            result.m_type.bits.precision = precision & found_radix_point;

This logic works fine for test cases like 10e3. They actually only have 1 significant figure anyway, so the precision can be set to 0.

@camerondruyor
Copy link

Ok, I have a working fix.

It passes all the old tests, and now we get 2.342e01 instead of 2e01.

I need to clean up my patch before submitting it, but it's getting late (1:30am here), so I'll do that tomorrow.

@nlohmann
Copy link
Owner Author

Thanks for looking at the issue @KAMTRON! I am looking forward to the fix. Good night! :)

@camerondruyor
Copy link

camerondruyor commented Apr 29, 2016

I have a dumb question @nlohmann .

How do I

make re2c

in order to generate json.hpp? I don't see that target in the makefile anywhere. I have built re2c and run it on json.hpp.re2c, but that gives me the lexer with all the gotos.

I feel like I'm missing something really simple.

@nlohmann
Copy link
Owner Author

@KAMTRON,

running make re2c should execute

re2c --bit-vectors --nested-ifs --no-debug-info src/json,hpp.re2c | gsed '1d' > src/json.hpp

The Makefile has src/json.hpp.re2c as prerequisite, so maybe it is not building because that file was not changed.

If it still does not work, please let me know.

@camerondruyor
Copy link

@nlohmann Thanks. I never did get it to work with make, but I could get by running the command manually.

@falemagn
Copy link

Hey guys, any news about the patch?

@nlohmann
Copy link
Owner Author

I'll check tonight.

@nlohmann
Copy link
Owner Author

I had a look at #240. I still see the following issue: Some numbers (2342e-2, 234200e-4, 10E3) are still not serialized to the same string they have been parsed from. Though the actual numerical value is correct in the test cases, the whole point of #201 was to preserve the number representation of parsed numbers. With the current mismatch, the library's behavior is hard to predict as it only preserves parsed numbers sometimes and I haven't found a rule to distinguish when it works and when it does not.

In the current state, I am unhappy with this behavior and - as a last resort - I would tend to reverse #201 and not to take any effort into preserving the format of parsed numbers as this is neither demanded by the JSON standard nor can it be realized with less code than a simple transformation of strings to numbers.

I would be happy to discuss this issue, but I do not want to merge #240 in the current state.

@nlohmann nlohmann added the state: please discuss please discuss the issue or vote for your favorite option label May 11, 2016
@nlohmann
Copy link
Owner Author

@twelsby It would be great if you could have a look at this issue.

@camerondruyor
Copy link

Sorry I haven't gotten back to this thread in a while, it's been a busy few weeks.

Anyhow, to bring @twelsby and anyone else up to speed, here's the core of the issue:

Say we have the following metadata saved about an input decimal number:

m_type.bits.exp_cap    = 0 
m_type.bits.exp_plus   = 0 
m_type.bits.has_exp    = 1 
m_type.bits.parsed     = 1 
m_type.bits.precision  = 0 
m_type.bits.type       = 7 
m_value                = 23.420000

There are multiple ways this could be represented as a string, so we are inherently sunk. I put a fix into the parser to correctly set the precision to 4, so we would at least get the right value (before it was truncated), but there isn't enough information to reconstruct the original string (my fix just defaults to scientific notation).

We can

  1. Reverse Fixed Issue #187 - Change parse to record floating point representation #201 (and not make round-trip promises)
  2. Add more info to m_type.bits (possibly a lot of work, plus reduced performance)

I would tend to lean towards option 1, but I'm new to the project and I don't want to rock the boat :-)

@camerondruyor
Copy link

Also, a big thanks to @nlohmann for helping me get up to speed on the project!

@nlohmann
Copy link
Owner Author

@KAMTRON Thanks for coming back on this. I currently also tend to reverse #201. As much as I like the idea of having nice roundtrip behavior, the costs for this become more and more apparent.

Since nothing is lost due to versioning, I could move the issue to the long-term whishlist for the library to have it looked at after the 2.0.0 release.

@nlohmann
Copy link
Owner Author

I opened a feature branch to remove the roundtripping support for floating-point numbers. The code compiles and passes the adjusted test suite, but I have not checked with #201 whether I can remove more code. I tend to pull this into the develop branch and prepare the 2.0.0 release soon.

@nlohmann
Copy link
Owner Author

Merged branch https://github.com/nlohmann/json/tree/feature/undo-number-roundtrip. Number roundtripping support has been removed. The feature may be added again in the future, but the current implementation had too many subtle flaws.

@nlohmann nlohmann removed state: help needed the issue needs help to proceed state: please discuss please discuss the issue or vote for your favorite option labels Jun 23, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants