Integers are not always printed as integers #811

Closed
hanche opened this issue Mar 28, 2019 · 10 comments

@hanche
Contributor

hanche commented Mar 28, 2019

I think it would be preferable if “proper” integers would be represented as such and not in scientific notation. Assuming that the internal representation of numbers is an IEEE double, that means any integer whose absolute value is less than 2^53 (9007199254740992), since these are exactly representable as doubles. In particular, I find the following behaviour less than optimal:

⬥ echo 12345678 | from-json
▶ 1.2345678e+07

It gets in the way of serving those numbers to other programs that might expect integers, not floating point numbers.

Larger numbers, though they too may represent integers, should still be printed in scientific format to indicate loss of accuracy.
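The 2^53 bound can be checked with a short Go sketch. The helper name exactlyRepresentable is made up for this illustration; it only tests whether an integer survives a round trip through float64.

```go
package main

import "fmt"

// exactlyRepresentable is a hypothetical helper: it reports whether n
// survives a round trip through float64 unchanged.
func exactlyRepresentable(n int64) bool {
	return int64(float64(n)) == n
}

func main() {
	const limit = int64(1) << 53 // 9007199254740992

	fmt.Println(exactlyRepresentable(12345678))  // true
	fmt.Println(exactlyRepresentable(limit - 1)) // true
	fmt.Println(exactlyRepresentable(limit + 1)) // false: rounds to 2^53
}
```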

@xiaq
Member

xiaq commented Apr 6, 2019

TBH I'm not sure why it is printing this way; I thought I was using %g, which should take care of that...

@xiaq changed the title from "Print “proper” integers as integers" to "Integers are not always printed as integers" on Jan 12, 2020
@xiaq
Member

xiaq commented Jan 12, 2020

OK, Go's %g prints floating point numbers using scientific notation for "large exponents". The documentation doesn't say what the threshold for "large" is, but it seems to be 7:

~> float64 123456
▶ (float64 123456)
~> float64 1234567
▶ (float64 1.234567e+06)

I am now digging through the strconv package for a better choice.

@xiaq
Member

xiaq commented Jan 12, 2020

I researched how Firefox and the Lua CLI print numbers. Both JavaScript and Lua only have floating point numbers.

It is not really a matter of being a whole number; they actually use heuristics similar to Go's %g, just with larger thresholds. For example, 1e100 is technically an integer, yet it is printed as 1e+100 in both environments.

In Firefox, the threshold is 21 for positive exponents, and -7 for negative exponents:

>> 1e20
100000000000000000000
>> 1e21
1e+21
>> 1e-6
0.000001
>> 1e-7
1e-7

In the Lua CLI, the thresholds are 14 and -5:

> 1e13
10000000000000.0
> 1e14
1e+14
> 1e-4
0.0001
> 1e-5
1e-05

With Go's %g, the thresholds are 6 (the previous comment said 7, which was incorrect) and -5:

~> float64 1e5
▶ (float64 100000)
~> float64 1e6
▶ (float64 1e+06)
~> float64 1e-4
▶ (float64 0.0001)
~> float64 1e-5
▶ (float64 1e-05)

The threshold may be actually based on the binary exponent, not the decimal exponent.
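The Go behaviour can be reproduced outside the shell with a minimal sketch. It assumes the shell output above comes from fmt's %g or the equivalent strconv.FormatFloat call with the 'g' format and precision -1; formatG is a name introduced only for this example.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatG is a hypothetical wrapper: it formats f with the 'g' format and
// the shortest precision that round-trips (precision -1).
func formatG(f float64) string {
	return strconv.FormatFloat(f, 'g', -1, 64)
}

func main() {
	for _, f := range []float64{1e5, 1e6, 1e-4, 1e-5, 12345678} {
		fmt.Println(formatG(f))
	}
	// Prints 100000, 1e+06, 0.0001, 1e-05 and 1.2345678e+07, one per line.
}
```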

@xiaq
Member

xiaq commented Jan 12, 2020

It is not possible to change the threshold, other than copying ftoa.go and changing it.

But we can implement the threshold by not using %g. It is a bit hacky, but doable (assuming that the positive threshold is p and the negative threshold is n):

  1. Use %f to print the number.
  2. Parse the result; if the number of digits before the decimal point is larger than p, or the number of digits after the decimal point is larger than n, reprint the number with %e instead.

@xiaq
Member

xiaq commented Jan 12, 2020

Corrections for step 2:

  • If the number has no decimal point, apply the positive threshold to the total number of digits (as if there is an invisible decimal point at the end).
  • The negative threshold applies to the number of 0's after the decimal point, not the number of digits.
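The two steps, with the corrections folded in, could be sketched roughly as follows. formatWithThresholds is a hypothetical helper, not code from the project. Two details are assumptions of this sketch rather than something stated above: an exponent of n or below is taken to mean at least -n-1 zeros right after the decimal point, and the negative rule is applied only when the integer part is 0.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// formatWithThresholds is a hypothetical helper (not taken from the project).
// p is the positive threshold and n the negative threshold.
func formatWithThresholds(f float64, p, n int) string {
	// Step 1: print in %f-style notation, using the shortest digits that
	// round-trip.
	s := strconv.FormatFloat(f, 'f', -1, 64)

	// Step 2: split into integer and fractional digits, ignoring the sign.
	intPart, fracPart := strings.TrimPrefix(s, "-"), ""
	if i := strings.IndexByte(intPart, '.'); i >= 0 {
		intPart, fracPart = intPart[:i], intPart[i+1:]
	}

	// With no decimal point, intPart holds every digit, so the positive
	// threshold applies to the total digit count.
	tooLarge := len(intPart) > p

	// Count the 0's right after the decimal point. Assumption: an exponent
	// of n or below corresponds to at least -n-1 such zeros, and the rule
	// only applies when the integer part is 0.
	zeros := len(fracPart) - len(strings.TrimLeft(fracPart, "0"))
	tooSmall := intPart == "0" && fracPart != "" && zeros >= -n-1

	if tooLarge || tooSmall {
		return strconv.FormatFloat(f, 'e', -1, 64)
	}
	return s
}

func main() {
	// Lua's thresholds (14 and -5) are used here purely as an example.
	for _, f := range []float64{12345678, 1e13, 1e14, 1e-4, 1e-5} {
		fmt.Println(formatWithThresholds(f, 14, -5))
	}
	// Prints 12345678, 10000000000000, 1e+14, 0.0001 and 1e-05, one per line.
}
```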

@xiaq
Member

xiaq commented Jan 12, 2020

In the interest of choosing a good threshold, I surveyed more languages. These are the complete results for all the languages I surveyed:

Language                p     n
Go                      6    -5
Chez Scheme            10    -4
Lua                    14    -5
Racket                 14    -5
Ruby                   15    -5
Perl                   15    -5
Raku                   15    -5
Python                 16    -5
JavaScript (Firefox)   21    -7

I will go with 14 and -5.

@xiaq closed this as completed in 9ad6eba on Jan 12, 2020
@xiaq
Member

xiaq commented Jan 12, 2020

Hmm, the heuristics should be a bit more sophisticated than that: the positive threshold should apply to the number of trailing zeros, not the exponent.

@xiaq reopened this on Jan 12, 2020
@xiaq
Member

xiaq commented Jan 12, 2020

Hmm, at least Racket's and Chez Scheme's algorithms are both more sophisticated than just using the number of trailing zeros:

> 12345678901230000000.0
1.234567890123e+19

@xiaq
Member

xiaq commented Jan 12, 2020

Here is another go at reverse-engineering Racket's algorithm. I think the part with the positive threshold is more sophisticated:

  • It only applies to whole numbers (applying it to a number with a fractional part does not make the output shorter).

  • It only applies when the number ends in 0.
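As a rough Go sketch of the two conditions above, combined with the thresholds 14 and -5 chosen earlier: formatRacketLike is a hypothetical name, and this is a guess at the described heuristic, not Racket's actual algorithm and not necessarily what the closing commit implements.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// formatRacketLike is a hypothetical name for this sketch of the heuristic.
func formatRacketLike(f float64) string {
	s := strconv.FormatFloat(f, 'f', -1, 64)
	digits := strings.TrimPrefix(s, "-")
	whole := !strings.Contains(digits, ".")

	// Positive rule: whole numbers only, more than 14 digits, ending in 0.
	if whole && len(digits) > 14 && strings.HasSuffix(digits, "0") {
		return strconv.FormatFloat(f, 'e', -1, 64)
	}
	// Negative rule (threshold -5): at least four 0's right after "0.".
	if strings.HasPrefix(digits, "0.0000") {
		return strconv.FormatFloat(f, 'e', -1, 64)
	}
	return s
}

func main() {
	fmt.Println(formatRacketLike(12345678901230000000.0)) // 1.234567890123e+19
	fmt.Println(formatRacketLike(1234567890123456))       // 1234567890123456
	fmt.Println(formatRacketLike(1e13))                   // 10000000000000
	fmt.Println(formatRacketLike(1e-5))                   // 1e-05
}
```

A 16-digit whole number that does not end in 0 stays in plain notation here, which is where this differs from a check on the digit count alone.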

@xiaq closed this as completed in 0d58fea on Jan 12, 2020