Integers are not always printed as integers #811

hanche · 2019-03-28T19:16:12Z

I think it would be preferable if “proper” integers would be represented as such and not in scientific notation. Assuming that the internal representation of numbers is an IEEE double, that means any integer whose absolute value is less than 2^53 (9007199254740992), since these are exactly representable as doubles. In particular, I find the following behaviour less than optimal:

⬥ echo 12345678 | from-json
▶ 1.2345678e+07

It gets in the way of serving those numbers to other programs that might expect integers, not floating point numbers.

Larger numbers, though they too may represent integers, should still be printed in scientific format to indicate loss of accuracy.
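To make the request concrete, here is a minimal Go sketch of the proposed behaviour. It assumes numbers are stored as float64; `formatNumber` and `maxExactInt` are hypothetical names used for illustration, not Elvish's actual code.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// maxExactInt is 2^53. Every integer with a smaller absolute value is
// exactly representable as an IEEE 754 double.
const maxExactInt = 1 << 53

// formatNumber is a hypothetical helper. It prints whole numbers below 2^53
// in plain decimal and leaves everything else to the shortest 'g' format.
func formatNumber(f float64) string {
	if f == math.Trunc(f) && math.Abs(f) < maxExactInt {
		return strconv.FormatInt(int64(f), 10)
	}
	return strconv.FormatFloat(f, 'g', -1, 64)
}

func main() {
	fmt.Println(formatNumber(12345678)) // 12345678
	fmt.Println(formatNumber(0.5))      // 0.5
	fmt.Println(formatNumber(1e100))    // 1e+100
}
```

Infinities and NaN fail the range check and fall through to the 'g' branch, so the integer conversion is only attempted on values that fit in an int64.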

xiaq · 2019-04-06T12:14:52Z

TBH I'm not sure why it is printing this way. I thought I was using %g, which should take care of that...

xiaq · 2020-01-12T13:36:20Z

OK, Go's %g prints floating point numbers using scientific notation for "large exponents". The documentation doesn't say what the threshold for "large" is, but it seems to be 7:

~> float64 123456
▶ (float64 123456)
~> float64 1234567
▶ (float64 1.234567e+06)

I am now foraging the strconv package for a more ideal choice.
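For comparison, the same two numbers can be run through strconv directly. This is only a sketch of one candidate: the 'f' format with precision -1 never switches to scientific notation. `formatG` and `formatPlain` are hypothetical wrapper names.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatG uses the shortest-precision 'g' format, which is what %g produces.
func formatG(f float64) string {
	return strconv.FormatFloat(f, 'g', -1, 64)
}

// formatPlain uses the shortest-precision 'f' format, which always prints
// plain decimal digits regardless of magnitude.
func formatPlain(f float64) string {
	return strconv.FormatFloat(f, 'f', -1, 64)
}

func main() {
	for _, f := range []float64{123456, 1234567} {
		fmt.Printf("g: %s  f: %s\n", formatG(f), formatPlain(f))
	}
	// g: 123456  f: 123456
	// g: 1.234567e+06  f: 1234567
}
```

The drawback of using 'f' alone is that very large or very small numbers expand into long strings of digits, so some threshold is still needed.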

xiaq · 2020-01-12T14:26:11Z

I researched how Firefox and the Lua CLI print numbers. Both JavaScript and Lua only have floating point numbers.

It is not really a matter of being a whole number; they actually use heuristics similar to Go's %g, just with larger thresholds. For example, 1e100 is technically an integer and it is printed as 1e+100 in both environments.

In Firefox, the threshold is 21 for positive exponents, and -7 for negative exponents:

>> 1e20
100000000000000000000
>> 1e21
1e+21
>> 1e-6
0.000001
>> 1e-7
1e-7

In the Lua CLI, the thresholds are 14 and -5:

> 1e13
10000000000000.0
> 1e14
1e+14
> 1e-4
0.0001
> 1e-5
1e-05

In Go's %g, the thresholds are 6 (an earlier comment said 7, which was incorrect) and -5:

~> float64 1e5
▶ (float64 100000)
~> float64 1e6
▶ (float64 1e+06)
~> float64 1e-4
▶ (float64 0.0001)
~> float64 1e-5
▶ (float64 1e-05)
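The same thresholds can be found empirically by probing powers of ten until the output switches to scientific notation. This is a rough sketch; `usesExponent` and `thresholds` are hypothetical names, and the probe only looks at exact powers of ten.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// usesExponent reports whether the shortest 'g' formatting of 10^k is in
// scientific notation.
func usesExponent(k int) bool {
	f, err := strconv.ParseFloat("1e"+strconv.Itoa(k), 64)
	if err != nil {
		panic(err)
	}
	return strings.Contains(strconv.FormatFloat(f, 'g', -1, 64), "e")
}

// thresholds walks outwards from 10^0 and 10^-1 and returns the first
// positive and negative exponents that are printed in scientific notation.
func thresholds() (p, n int) {
	for p = 0; !usesExponent(p); p++ {
	}
	for n = -1; !usesExponent(n); n-- {
	}
	return p, n
}

func main() {
	p, n := thresholds()
	fmt.Println(p, n) // 6 -5
}
```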

The threshold may actually be based on the binary exponent, not the decimal exponent.

xiaq · 2020-01-12T14:29:53Z

Indeed: https://golang.org/src/strconv/ftoa.go#L209

xiaq · 2020-01-12T15:03:13Z

It is not possible to change the threshold, other than copying ftoa.go and changing it.

But we can implement the threshold by not using %g. It is a bit hacky, but doable (assuming that the positive threshold is p and the negative threshold is n):

1. Use %f to print the number.
2. Parse the result; if the number of digits before the decimal point is larger than p, or the number of digits after the decimal point is larger than n, reprint the number with %e instead.

xiaq · 2020-01-12T15:06:02Z

Corrections for step 2:

If the number has no decimal point, apply the positive threshold to the total number of digits (as if there is an invisible decimal point at the end).
The negative threshold applies to the number of 0's after the decimal point, not the number of digits.
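The two steps, with the corrections applied, can be sketched roughly as follows. `formatWithThresholds` is a hypothetical name, and the exact comparison on the negative side is an assumption: the sketch switches to %e when the decimal exponent is n or lower, which means at least -n-1 zeros directly after the decimal point.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// formatWithThresholds is a hypothetical helper. It prints f with 'f' and
// reprints it with 'e' when the plain form crosses either threshold.
// p is the positive threshold and n the negative one, e.g. 14 and -5.
func formatWithThresholds(f float64, p, n int) string {
	s := strconv.FormatFloat(f, 'f', -1, 64)
	digits := strings.TrimPrefix(s, "-")

	// Without a decimal point the whole string is the integer part.
	intPart, fracPart := digits, ""
	if i := strings.IndexByte(digits, '.'); i >= 0 {
		intPart, fracPart = digits[:i], digits[i+1:]
	}

	// Count the zeros directly after the decimal point.
	zeros := len(fracPart) - len(strings.TrimLeft(fracPart, "0"))

	tooLarge := len(intPart) > p
	// Assumption: leading zeros only matter when the integer part is 0.
	tooSmall := intPart == "0" && fracPart != "" && zeros >= -n-1
	if tooLarge || tooSmall {
		return strconv.FormatFloat(f, 'e', -1, 64)
	}
	return s
}

func main() {
	for _, f := range []float64{12345678, 1e13, 1e14, 1e-4, 1e-5} {
		fmt.Println(formatWithThresholds(f, 14, -5))
	}
	// 12345678
	// 10000000000000
	// 1e+14
	// 0.0001
	// 1e-05
}
```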

xiaq · 2020-01-12T15:18:10Z

For the interest of choosing a good threshold, I surveyed more languages. This is the complete result of all languages I surveyed:

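| Language | p (positive threshold) | n (negative threshold) |
| --- | --- | --- |
| Go | 6 | -5 |
| Chez Scheme | 10 | -4 |
| Lua | 14 | -5 |
| Racket | 14 | -5 |
| Ruby | 15 | -5 |
| Perl | 15 | -5 |
| Raku | 15 | -5 |
| Python | 16 | -5 |
| JavaScript (Firefox) | 21 | -7 |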

I will go with 14 and -5.

xiaq · 2020-01-12T15:44:57Z

Hmm, the heuristics should be a bit more sophisticated than that: the positive threshold should apply to the number of trailing zeros, not the exponent.

xiaq · 2020-01-12T15:51:08Z

Hmm, at least Racket and Chez Scheme's algorithms are both more sophisticated than just using the number of trailing zeroes:

> 12345678901230000000.0
1.234567890123e+19

xiaq · 2020-01-12T16:03:36Z

Here is another go at reverse-engineering Racket's algorithm. I think the part with the positive threshold is more sophisticated:

It only applies to whole numbers (applying it to a number with a fractional part does not make the output shorter).
It only applies when the number ends in 0.
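A rough Go sketch of these two conditions for the positive threshold, using the value 14 chosen earlier in the thread. `formatRacketLike` is a hypothetical name; this is a guess at the observed behaviour, not Racket's actual algorithm, and it leaves the negative threshold out.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// formatRacketLike is a hypothetical helper. It switches to scientific
// notation only for whole numbers that have more than 14 digits and end in 0.
func formatRacketLike(f float64) string {
	s := strconv.FormatFloat(f, 'f', -1, 64)
	whole := !strings.Contains(s, ".")
	digits := strings.TrimPrefix(s, "-")
	if whole && len(digits) > 14 && strings.HasSuffix(digits, "0") {
		return strconv.FormatFloat(f, 'e', -1, 64)
	}
	return s
}

func main() {
	fmt.Println(formatRacketLike(1e13))              // 10000000000000
	fmt.Println(formatRacketLike(1e14))              // 1e+14
	fmt.Println(formatRacketLike(1.234567890123e19)) // 1.234567890123e+19
	fmt.Println(formatRacketLike(9007199254740992))  // 9007199254740992
}
```

The last value is 2^53: it has 16 digits but does not end in 0, so it stays in plain decimal, which a digit-count threshold alone would not allow.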

xiaq added comp:lang bug labels Apr 6, 2019

xiaq added P:Number Type and removed c:language labels Dec 28, 2019

xiaq changed the title ~~Print “proper” integers as integers~~ Integers are not always printed as integers Jan 12, 2020

xiaq closed this as completed in 9ad6eba Jan 12, 2020

xiaq reopened this Jan 12, 2020

xiaq closed this as completed in 0d58fea Jan 12, 2020
