Add arbitrary precision decimal floating point #71
Conversation
On Mon, Nov 29, 2021 at 07:09:04AM -0800, LdBeth wrote:
Resolves proposal #69
Ready for review.
You can view, comment on, or merge this pull request online at:
#71
Let me say first that I am not a fan of decimal float. But since
you already wrote it and it is certainly more useful than some
domains that we already have, there is no obstacle in principle
to including it.
Now some specific remarks. First, trivialities: in Makefile.in
(and in general) I normally try to minimize the size of diffs,
which in the case of file lists in Makefile.in means adding entries
in the relevant place (mostly in alphabetical order), splitting
a line if it overflows. So no larger reformattings. Second
triviality: our files should have a newline at the end (your diff
produced a file with no newline at the end).
For inclusion of new code we normally also want tests (in the
src/input subdirectory there are several test files which
may give you an idea how this is done).
You probably want DecimalFloat to be recognized by the Spad
compiler. For this you will need changes to
src/interp/compiler.boot, probably in compSel1 and constant_coerce.
Now, concerning transcendental functions, I feel that your
approach of copying and modifying code is somewhat problematic.
Namely, we do improve such code from time to time, and having a
duplicate means that the version in decimal will get stale (have
no improvements) or we will have to do the work twice, separately
for each version. Unless you have a strong reason to do
otherwise, the simplest approach is to convert to Float,
use the Float function, and convert back. Conversion to Float
should have cost comparable to the cost of a multiplication in
decimal float, so performance-wise this should be faster
than having the function work in decimal format (since operations
in Float should be faster).
…--
Waldek Hebisch
On Wed, Dec 01, 2021 at 09:01:38AM -0800, LdBeth wrote:
Thank you for the review comments, @hebisch
> For this you will need changes to
src/interp/compiler.boot, probably in compSel1 and constant_coerce.
These are mainly hacks to produce double float literals for the Common Lisp compiler.
Besides, I haven't figured out a way to add the support there without breaking the rest of the code.
I mean extra lines like:
domain = $Float and op = "float" and m = ["DecimalFloat"] =>
argl is [mant, exp, 10] =>
[["Sel", ["DecimalFloat"], "float"], mant, exp], m, e]
Basically, this code needs to replace
[["Sel", ["Float"], "float"], mant, exp]
by
[["Sel", ["DecimalFloat"], "float"], mant, exp], m, e]
The Spad compiler uses a somewhat different representation of code
than the interpreter...
> Unless you have strong reason to do
otherwise the simplest approach is to convert to float,
use float function and convert back.
Converting to `Float` is easy; however, converting back without significant precision loss is not
straightforward and requires complex computations.
Hmm, both require computation and both may lose some precision (0.5-1 lsb).
You already have the main part: float(m, exp, 2)$DecimalFloat will do the work.
Say, `1.2`, is `177088743107611695514*2^- 67` which gives `1.1999993663_703848278`
when fed to the `float(m, e, p)$DecimalFloat` function.
Hmm, with 40 digits precision I get:
(4) -> 1.2 - float(177088743107611695514, -67, 2)
(4) - 0.2710505431_213761085 E -20
Type: Float
(5) -> 1.1999993663_703848278 - float(177088743107611695514, -67, 2)
(5) - 0.6336296151_7220271050_5431213761_085 E -6
Type: Float
This is a massive error, you have only about 20 _bits_ of accuracy.
In such a case I would look for bugs in the decimal arithmetic routines.
Maybe you mixed up decimal with binary precision somewhere?
Although `convert10` inside `Float`
can do the job correctly, there are at least 2 problems:
1. elision of trailing zeros. The General Decimal Arithmetic requires using full precision when the result is inexact.
2. different rounding mode. `Float` uses binary round-to-even, while `DecimalFloat` uses round-half-up. `exp 1.3` gives `3.6692966676_192442204` in binary float, and in decimal it can give `3.6692966676_192442205`
the General Decimal Arithmetic has these special requirements on `log10`, `power` and `exp` so I have
adapted the algorithm a little.
Hmm, I do not see in your code anything that will handle exact rounding.
For your purpose rounding in Float is irrelevant: 'exp' in Float
just produces a rational approximation to the correct value of 'exp'
which is supposed to have relative error not exceeding 2^-bits. For
exact rounding you need to check in the rounding code if the available
accuracy is enough to perform rounding according to the decimal
rules. If not, increase accuracy and repeat. So you need
something like:
roundExactlyIfCan(x : Float, eps : Float) : Union(DecimalFloat, "failed")
....
with appropriate implementation and loop like:
fbits := initial_bits(digits)
obits := bits()$Float
repeat
bits(fbits)$Float
eps := float(1, -fbits, 2)$Float
xf := float(m, e, 10)$Float
ef := exp(xf)$Float
edu := roundExactlyIfCan(ef, eps)
edu case DecimalFloat =>
bits(obits)$Float
return edu::DecimalFloat
fbits := 3*fbits quo 2
roundExactlyIfCan is somewhat tricky to implement correctly and
efficiently. Naively, roundExactlyIfCan can just treat its input
as an exact rational and compute the best approximation by a decimal
fraction. If the distance to the neighbouring decimal fractions is
larger than 2*eps then you know that you got correct rounding.
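The naive strategy just described (treat the input as an exact rational, find the best decimal approximation, and accept it only when the available accuracy can decide the rounding direction) can be sketched in Python with stdlib `fractions`. The function name and signature are illustrative, not FriCAS API, and round-half-up is assumed as the decimal rule:

```python
from fractions import Fraction

def round_exactly_if_can(x, eps, digits):
    """Half-up rounding of an exact rational x to `digits` significant
    decimal digits, returned as (mantissa, exponent) with
    x ~= mantissa * 10**exponent; returns None when x is within eps
    of a tie, i.e. the accuracy cannot decide the rounding direction."""
    if x == 0:
        return Fraction(0), 0
    m, e = abs(Fraction(x)), 0
    # scale the mantissa into [10^(digits-1), 10^digits)
    while m >= 10 ** digits:
        m /= 10
        e += 1
    while m < 10 ** (digits - 1):
        m *= 10
        e -= 1
    n = int(m)        # integer part of the scaled mantissa
    frac = m - n      # exact fractional part, in [0, 1)
    # reject when a perturbation of size eps (scaled like m) could
    # flip the tie-breaking decision
    if abs(frac - Fraction(1, 2)) <= eps * m / abs(x):
        return None
    mant = n + 1 if frac > Fraction(1, 2) else n
    return (mant if x > 0 else -mant), e
```

As the message notes, treating FriCAS floats as exact rationals like this can be prohibitively expensive for bignum exponents; this sketch only illustrates the acceptance test, not an efficient implementation.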
However, FriCAS exponents can be bignums and the implied computation
with exact fractions may be prohibitively expensive. Instead,
roundExactlyIfCan should compute an approximation to
ed := (log(2)*e)/log(10) and use the integer part of this as the
exponent of the decimal approximation and exp of the fractional part
to correct the mantissa. Of course this messing with logs
introduces errors of its own, so roundExactlyIfCan must use
sufficient accuracy so that the errors from this approximation
are smaller than eps, and then use a total estimate of the error to
decide if the approximation can be correctly rounded. Note that
in FriCAS when you use 20 digits for the mantissa you may have
1000 digits in the exponent. Then the log computation must deliver
hundreds of good digits in order to avoid complete loss of
accuracy. To get an idea of what I mean you can compare
2.0^(2^911-1)
and
exp(log(2.0^(2^911-1)))
Mathematically they should be equal, but at 20 digits accuracy
you get completely different numbers. Note that exact rounding
at intermediate stages does not help: exact rounding means that
the second expression will consistently give you one result (the
same for all implementations of exact rounding), but wildly
different from the first expression.
Note that roundExactlyIfCan has some similarity to what direct
implementations of log and exp need to do for exact rounding.
The tradeoff here is that by implementing roundExactlyIfCan you
can have only one implementation, supporting multiple uses.
In exp and log you can take advantage of specific properties
of exp and log. However, after arguments are reduced to the standard
range this advantage essentially vanishes. I see no reason
to copy the main body of log and exp.
BTW: the main part of exp1 is purely integer computation, completely
independent of the base. So, if you really feel that you need it
you can just export the integer calculation from Float and call
the exported version.
BTW2: I plan to implement more functions using a similar
principle as used in exp1 (that is, computing rational
approximations purely in integer arithmetic). Then
all this rational approximation code will probably go
into a separate package.
…--
Waldek Hebisch
On Wed, Dec 01, 2021 at 04:47:26PM -0800, LdBeth wrote:
Yes, please do so. btw right now I find `log(x)$Float` gives bad approximation when `x` is close enough to 1
```
(16) -> log 1.0000001
(16) 0.9999999500_0002764028 E -7
Type: Float
```
Which is `0.999999950000003333333` with higher precision.
Hmm,
(1) -> x1 := 1.0000001
(1) 1.0000001
Type: Float
(3) -> l1 := log(x1)
(3) 0.9999999500_0002764028 E -7
Type: Float
(4) -> digits(50)
(4) 20
Type: PositiveInteger
(5) -> l1 - log(x1)
(5) 0.4318995886_1415005798_73530061 E -28
Type: Float
(7) -> x1 - 1.0000001
(7) 0.2430694534_5387021006_899885833 E -20
Type: Float
AFAICS accuracy is as expected. Note the last result: using the
literal form at a different precision you are computing the log of
a different number.
…--
Waldek Hebisch
On Wed, Dec 01, 2021 at 04:47:26PM -0800, LdBeth wrote:
Indeed, this is due to a bug: I didn't give enough precision in division when the length
of the right operand is large.
Note that ATM dvide is effectively doing its own rounding towards
-infinity, which means that it does not agree with the rounding
rule that you gave. Also, if you want to perform correct
rounding you need to either have a single rounding at the end
or _very_ carefully prove that what you do has an equivalent
effect. In particular this affects multiplication and
division by integers; it looks like currently they may
round wrongly (I do not know if you want them to always round correctly).
Note: I do not promise correct rounding in Float, so for operations
where you want correct rounding Float is usually a bad example.
With the correction been made `float(177088743107611695514, -67, 2)$DecimalFloat`
gives `1.2000000000_000000000`.
There are two other problems. Exponentiation used in float may
lead to nontrivial loss of accuracy for large exponents. If
binary powering is in use, then powering implies a number
of multiplications of the order of length(e), where length(e) is
the length of the exponent in bits. Each may cause an error of the
order of the least significant bit, so you need something like
2*log2(length(e)) extra bits of accuracy. A more efficient version
would work with the log of the exponent, where you need to increase
accuracy to length(e).
This is already a problem in the Float version; apparently nobody
tried it for large exponents (normally constants have exponents in
a reasonably small range where the problem is not visible).
Another problem: if you want correct rounding in float, then
you need the dance of computing an approximation, checking if it can
be rounded without doubt, and repeating at higher precision when
needed. As I wrote, there is no promise of correct rounding in
Float, so I do not consider this a problem for the float version.
I do not know if you want to promise correct rounding for float;
if you do, then it probably makes sense to have an internal version
with no effort for correct rounding (it makes little sense to
spend a lot of effort on correct rounding of things that you
will throw out before the user can see them).
…--
Waldek Hebisch
On Wed, Dec 01, 2021 at 10:30:21PM -0800, LdBeth wrote:
> Note that ATM dvide is effectively doing its own rounding towards
> -infinity, which means that it does not agree with the rounding
> rule that you gave.
My code for `dvide` is different from `Float`, the basic idea is:
1. integer division, if it can do exact division then return exact result.
2. it is guaranteed that `LENGTH y.mantissa - LENGTH r` is positive after step 1.
3. get the fraction r / y with exceeded precision. remove trailing zeros.
You have
mw := (r * POW10(ew)) quo y.mantissa
in this step. quo is discarding the remainder, which is equivalent to
rounding.
4. adds up integer part with the fraction part
That performs a _second_ rounding. In general two roundings give a
different result than a single rounding. Statistically, most of the
time the result from two roundings will be the same as from a single
rounding. So it is hard to check your routines by testing (testing
will catch crude errors, but normally misses the more subtle ones).
For the basic four operations there is a simple, obviously correct
algorithm: do the operation treating your numbers as exact rationals
and then find the best approximation by a decimal fraction of the
requested precision (where "best" is decided by the rounding rule).
To have faster code, implementations normally take various shortcuts.
But for each shortcut you need a proof that it gives the
same result as the naive version. Even addition requires a proof.
For example, with a round-towards-+infinity rule, adding
anything positive would give a strictly bigger result, so the
current addition code would be wrong. In other words,
correctness of the current addition code depends on the fact that
the rounding rule is from the "round to nearest" family. The point
of the above is that close to a decision boundary a tiny change
can lead to different rounding. Unless you already did so,
I suggest Googling for "table maker's dilemma"; there are
references which explain the related issues more deeply.
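The double-rounding hazard described above is easy to see with Python's stdlib decimal module (which implements the General Decimal Arithmetic spec): rounding 1.49 half-up in two stages gives a different answer than a single rounding. This is only an illustration of the general pitfall, not FriCAS code:

```python
from decimal import Decimal, ROUND_HALF_UP

x = Decimal("1.49")

# single rounding, straight to integer precision: 1.49 -> 1
one_step = x.quantize(Decimal("1"), rounding=ROUND_HALF_UP)

# two roundings: 1.49 -> 1.5 -> 2
two_step = x.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP) \
            .quantize(Decimal("1"), rounding=ROUND_HALF_UP)

print(one_step, two_step)  # 1 2
```

Statistically most inputs round the same way either route, which is exactly why testing rarely catches this class of bug.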
…--
Waldek Hebisch
On Thu, 02 Dec 2021 23:10:08 +0800, hebisch wrote:
in this step. quo is discarding the remainder, which is equivalent to
rounding.
I suppose `(r * POW10(ew)) quo y.mantissa` only does truncation in
decimal, and with excess precision it won't affect the analogue of
round-half-up, since only one decimal digit after the precision is
used. No?
It is actually quite close to working out the result as a fixed point
number and giving the separated exponent part.
On Thu, 02 Dec 2021 23:10:08 +0800, hebisch wrote:
Unless you already did so,
I suggest Googling for "table maker's dilemma"; there are
references which explain the related issues more deeply.
I see. That means it is reasonable to have transcendental functions
with up to 1 ulp error for arbitrary precision float. And it does in
turn affect decimal float as well.
I guess that gives me the idea that it is better to use a different
class of algorithms for decimal floating point.
Best,
---
LDB
On Fri, Dec 03, 2021 at 12:17:54AM -0800, LdBeth wrote:
On Thu, 02 Dec 2021 23:10:08 +0800,
hebisch wrote:
>
> in this step. quo is discarding the remainder, which is equivalent to
> rounding.
I suppose `(r * POW10(ew)) quo y.mantissa` only does truncation in
decimal, and with excess precision it won't affect the analogue of
round-half-up, since only one decimal digit after the precision is
used. No?
Unless you prove otherwise, "since only one decimal digit after the
precision is used" is likely to be wrong: if you are close to a tie,
digits very far away can flip the result in a different direction.
Note that you have signed numbers and quo uses a funny rule to
determine the sign of the remainder, so even if the code works for one
combination of signs it does not mean that it will work for all.
BTW: you said that ties round upward but did not say if this also
applies to negative numbers.
…--
Waldek Hebisch
On Fri, 03 Dec 2021 21:33:46 +0800, hebisch wrote:
Unless you prove otherwise, "since only one decimal digit after the
precision is used" is likely to be wrong: if you are close to a tie,
digits very far away can flip the result in a different direction.
Note that you have signed numbers and quo uses a funny rule to
determine the sign of the remainder, so even if the code works for one
combination of signs it does not mean that it will work for all.
BTW: you said that ties round upward but did not say if this also
applies to negative numbers.
First I would explain the rounding. Since the Decimal Arithmetic spec
specifies that the sign bit is separated from the mantissa,
`round(x) IS sign(x) * round(abs(x))` is presumed.
And in addition to that, to stay compatible to some extent with the
original `Float` and `DoubleFloat`, `IF x = 0 THEN sign(x) = 1`.
So floating zero is always positive.
About the divide function:
From the Lisp code and the Common Lisp spec, we can know that:
[q, r] := divide(x, y)$Integer
q * y + r = x IS ENSURED
AND
q IS x/y rounded towards zero
AND
q = x quo y
(Guarding that r ~= 0) and rewriting (for y ~= 0), we can get:
q + r / y = x / y
so at least we know the sign of the remainder won't cause trouble,
because `q` must have the same sign as `r / y`. Due to how the sign
is manipulated in typical floating point, (- a) + (- b) MUST BE
IDENTICAL TO - (a + b), and given the rounding rule, there should be
no rounding problem caused by sign.
Hence the only question left is whether we can deduce
`q' + r' quo y = x' quo y`, when q', r', x' are q, r, x multiplied
by the same power of 10. And in fact it is true.
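The `divide(x, y)$Integer` contract cited above (quotient rounded toward zero, with `q * y + r = x`, as Common Lisp's TRUNCATE guarantees) can be checked in Python. Note that Python's `//` floors rather than truncates, so a truncating version must be built explicitly; this is an illustrative check, not FriCAS code:

```python
def divide(x: int, y: int) -> tuple[int, int]:
    """Truncating division in the style of Common Lisp's TRUNCATE:
    q is x/y rounded toward zero, and q*y + r == x."""
    q = abs(x) // abs(y)
    if (x < 0) != (y < 0):
        q = -q
    return q, x - q * y

for x, y in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    q, r = divide(x, y)
    assert q * y + r == x                    # exactness of the pair
    assert r == 0 or (r < 0) == (x < 0)      # remainder takes the sign of x
```

For example `divide(-7, 2)` gives `(-3, -1)`, whereas Python's `divmod(-7, 2)` floors to `(-4, 1)`; the truncating rule is the one that makes `q` share the sign of `r / y` as argued above.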
…---
LDB
On Fri, Dec 03, 2021 at 06:42:10AM -0800, LdBeth wrote:
On Fri, 03 Dec 2021 21:33:46 +0800,
hebisch wrote:
>
> Unless you prove otherwise, "since only one decimal digit after the
> precision is used" is likely to be wrong: if you are close to a tie,
> digits very far away can flip the result in a different direction.
> Note that you have signed numbers and quo uses a funny rule to
> determine the sign of the remainder, so even if the code works for one
> combination of signs it does not mean that it will work for all.
>
> BTW: you said that ties round upward but did not say if this also
> applies to negative numbers.
First I would explain the rounding. Since the Decimal Arithmetic spec
specifies that the sign bit is separated from the mantissa,
`round(x) IS sign(x) * round(abs(x))` is presumed.
And in addition to that, to stay compatible to some extent with the
original `Float` and `DoubleFloat`, `IF x = 0 THEN sign(x) = 1`.
So floating zero is always positive.
About the divide function:
From the Lisp code and the Common Lisp spec, we can know that:
[q, r] := divide(x, y)$Integer
q * y + r = x IS ENSURED
AND
q IS x/y rounded towards zero
AND
q = x quo y
(Guarding that r ~= 0) and rewriting (for y ~= 0), we can get:
q + r / y = x / y
so at least we know the sign of the remainder won't cause trouble,
because `q` must have the same sign as `r / y`. Due to how the sign
is manipulated in typical floating point, (- a) + (- b) MUST BE
IDENTICAL TO - (a + b), and given the rounding rule, there should be
no rounding problem caused by sign.
OK.
Hence the only question left is whether we can deduce
`q' + r' quo y = x' quo y`, when q', r', x' are q, r, x multiplied
by the same power of 10. And in fact it is true.
IIUC you basically say that
(q, r) := divide(x.mantissa, y.mantissa)
ew :I := LENGTH y.mantissa - LENGTH x.mantissa + digits() + 1
mw := (r * POW10(ew)) quo y.mantissa
q*POW10(ew) + mw
is doing the same thing as
ew :I := LENGTH y.mantissa - LENGTH x.mantissa + digits() + 1
mw := (x.mantissa*POW10(ew)) quo y.mantissa
In that case, why bother with splitting the quotient into two stages?
The second version is simpler and likely faster.
BTW: It seems that you should do
ew :I := LENGTH y.mantissa - LENGTH x.mantissa + digits() + 1
ew := max(ew, 0)
mw := (x.mantissa*POW10(ew)) quo y.mantissa
Otherwise there will be an error due to a negative power of 10.
Another thing is the removal of trailing digits. AFAICS it should be
done in round and maybe in display routines, but nowhere else.
With an earlier version of your code I get:
(1) -> digits(4)$DecimalFloat
(1) 20
Type: PositiveInteger
(2) -> x := ***@***.***
(2) 1001
Type: DecimalFloat
(3) -> digits(3)$DecimalFloat
(3) 4
Type: PositiveInteger
(4) -> x/1
(4) 1001
Type: DecimalFloat
(5) -> ***@***.***
(5) 1.00 E 3
Type: DecimalFloat
(6) -> ***@***.***)
(6) 1001
Type: DecimalFloat
ATM it seems that the result of division has an extra unwanted digit
of accuracy.
Note: the above was intended to test removal of trailing zeros
(which you say should be done) but it looks like a separate issue.
The point is that before round you have a possible trailing guard
digit which round may remove. But AFAICS round does not remove
zeros which may be exposed when the guard digit is removed by round.
BTW: I am not sure what the spec exactly says about removal of
trailing zeros. Surely, hardware decimal float cannot
magically shorten its representation and the trailing zeros must
be there. So, logically the only place which really needs
to deal with trailing zeros is the printing routine. Or maybe
mantissa and exponent (if they are considered part of the
spec).
…--
Waldek Hebisch
On Sat, 04 Dec 2021 08:08:10 +0800, hebisch wrote:
ATM it seems that result of division had extra unwanted digit
of accuracy.
Err, I didn't expect shrinking of precision to happen.
The cause is
```
if r = 0 then
[q, x.exponent - y.exponent]
```
The result lacks rounding. Doing a `LENGTH` in DecimalFloat is more
expensive than `length` for binary, and since when `r = 0` the `q`
would be exact, I chose not to round the result.
That is a mistake indeed, I shall fix it.
BTW: It seems that you should do
ew :I := LENGTH y.mantissa - LENGTH x.mantissa + digits() + 1
ew := max(ew, 0)
mw := (x.mantissa*POW10(ew)) quo y.mantissa
Otherwise there will be an error due to a negative power of 10.
it is
```
ew :I := LENGTH y.mantissa - LENGTH r + digits() + 1
```
So `x.mantissa * POW10(ew)` would not make sense if `ew` were negative.
`dvide$Float` uses `shift2`, which would be fine.
Doing `(q, r) := divide(x.mantissa, y.mantissa)` ensures
`LENGTH r <= LENGTH y.mantissa`, so `ew` is always positive.
In that case, why bother with splitting the quotient into two stages?
The second version is simpler and likely faster.
Another thing is removal of trailing digits. AFAICS it should be
done in round and maybe display routines, but nowhere else.
It is due to requirements from the General Decimal Arithmetic.
Setting precision to 9 digits,
divide(1, 3) ==> 0.333333333
divide(5, 2) ==> 2.5
divide(1, 10) ==> 0.1
divide(12, 12) ==> 1
divide(2.400, 2.0) ==> 1.20
And,
| There is a one-to-one mapping between abstract representations and the
| result of this conversion. That is, every abstract representation has
| a unique to-scientific-string representation.
Which means that, for example, float(1000, -3)$DF should be
displayed as `1.000`, different from float(1, 3)$DF, which is
displayed as `1E3`.
For the reasoning behind that, see the FAQ on Decimal Arithmetic:
Why are the encodings unnormalized?
http://speleotrove.com/decimal/decifaq5.html#unnlay
Why is the arithmetic unnormalized?
http://speleotrove.com/decimal/decifaq4.html#unnari
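For reference, Python's stdlib decimal module implements the same General Decimal Arithmetic spec, so the divide examples and the unnormalized display behaviour above can be reproduced directly (trailing zeros preserved on exact division, and distinct abstract representations printing differently):

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 9  # "setting precision to 9 digits"
    results = [
        Decimal(1) / Decimal(3),            # 0.333333333
        Decimal(5) / Decimal(2),            # 2.5
        Decimal(1) / Decimal(10),           # 0.1
        Decimal(12) / Decimal(12),          # 1
        Decimal("2.400") / Decimal("2.0"),  # 1.20
    ]
print([str(r) for r in results])

# unnormalized coefficients survive into to-scientific-string:
# the analogues of float(1000, -3)$DF and float(1, 3)$DF differ
print(Decimal((0, (1, 0, 0, 0), -3)))  # 1.000
print(Decimal((0, (1,), 3)))           # 1E+3
```

The last two lines construct `1000E-3` and `1E3` from (sign, coefficient digits, exponent) tuples, matching the one-to-one mapping between abstract representations and their string forms that the spec requires.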
…---
LDB
On Fri, Dec 03, 2021 at 05:50:05PM -0800, LdBeth wrote:
On Sat, 04 Dec 2021 08:08:10 +0800,
hebisch wrote:
>
> ATM it seems that result of division had extra unwanted digit
> of accuracy.
Err, I didn't expect shrinking of precision to happen.
That is a normal thing when precision is settable. You calculate
critical things at higher precision and then use the result in lower
precision calculations...
The cause is
```
if r = 0 then
[q, x.exponent - y.exponent]
```
The result lacks rounding. Doing a `LENGTH` in DecimalFloat is more
expensive than `length` for binary, and since when `r = 0` the `q`
would be exact, I chose not to round the result.
That is a mistake indeed, I shall fix it.
Yes, computing decimal length exactly has its costs. But if you
depend on length for correctness you (sometimes) need to pay the
cost. Length is critical for correct display, and the WIDTH function
in src/interp/i-output.boot tries to be correct and reasonably
fast. To tell the truth, it looks like length10 can currently
give a wrong result for very large numbers. Namely, for correct
operation you need a lower bound on log10(abs(x)). The magic constant
4004/13301 is an estimate of log(2)/log(10) from _above_. So,
once length(x) is big enough you will overestimate and may get too
large a result.
BTW1: Why don't you replace x by abs(x) at the beginning of length10?
IIUC you want to ignore the sign when determining length.
BTW2: Code in WIDTH makes a significant effort to avoid computing
powers of 10: it uses pre-computed small powers and computes a power
of 10 only if the number is quite close to a power of 10. Also, instead
of a quotient it is enough to compare with a power of 10, which is
cheaper. Also note that the 10000 bound on exactly computed length
is a pragmatic compromise of the display subsystem. The method with
magic constants as it is now would work up to about lengths of order
10e9. Tweaking constants one can get up to 10e12. Replacing
doubles with Float should allow handling any practically reachable
length.
> BTW: It seems that you should do
>
> ew :I := LENGTH y.mantissa - LENGTH x.mantissa + digits() + 1
> ew := max(ew, 0)
> mw := (x.mantissa*POW10(ew)) quo y.mantissa
>
> Otherwise there will be an error due to a negative power of 10.
it is
```
ew :I := LENGTH y.mantissa - LENGTH r + digits() + 1
```
I looked at the diff at the link you gave.
So, `x.mantissa * POW10(ew)` would not make sense if `ew` is negative.
The `dvide$Float` uses `shift2` which would be fine.
Doing a `(q, r) := divide(x.mantissa, y.mantissa)` can ensure
`LENGTH r <= LENGTH y.mantissa`, so `ew` is always positive.
> In that case, why bother with splitting the quotient into two stages?
> The second version is simpler and likely faster.
Now there are even more reasons to ask this.
>
> Another thing is removal of trailing digits. AFAICS it should be
> done in round and maybe display routines, but nowhere else.
It is due to requirements from the General Decimal Arithmetic.
Setting precision to 9 digits,
divide(1, 3) ==> 0.333333333
divide(5, 2) ==> 2.5
divide(1, 10) ==> 0.1
divide(12, 12) ==> 1
divide(2.400, 2.0) ==> 1.20
Those are examples. What is the rule? Looking at the first four
examples one could suspect the rule "remove trailing zeros from
exact results", but this is clearly not true in the last example.
Do I understand correctly that
divide(10^9*divide(1, 3), 333333333)
should give 1 (as opposed to 1.000_000_00)?
What about
divide(2.400, 2)
divide(2400, 2)
divide(12e5, 2e5)
divide(1200, 200)
What about the case when x is of higher precision with an "exact"
representation, the same for y, the quotient is not exact,
but after the second quo you get several trailing zeros?
That is something like
divide(12300000001, 123)
where 12300000001 is produced at higher precision.
What about addition and multiplication?
Anyway, AFAICS you have zeros-removal code in at least two places;
it makes sense to have a single routine for this. Also, it
looks fishy to round _after_ zeros-removal. Namely, if
you actually round, then zeros-removal was not needed.
And,
| There is a one-to-one mapping between abstract representations and the
| result of this conversion. That is, every abstract representation has
| a unique to-scientific-string representation.
Which means that, for example, float(1000, -3)$DF should be
displayed as `1.000`, different from float(1, 3)$DF, which is
displayed as `1E3`.
That is almost clear, but does not explain what should be the result
of a given operation. The "almost" part deals with the fact that in
the usual scientific representation 3.33e2 is considered equal to
0.333e3 and in turn equal to 333.
For the reasoning behind that, see the FAQ on Decimal Arithmetic:
Why are the encodings unnormalized?
http://speleotrove.com/decimal/decifaq5.html#unnlay
Why is the arithmetic unnormalized?
http://speleotrove.com/decimal/decifaq4.html#unnari
The links above (and a lot of material on that site) are just
propaganda. For implementation we need clear rules (which
I was not able to find after a quick look at the site).
…--
Waldek Hebisch
On Sat, 04 Dec 2021 14:09:15 +0800, hebisch wrote:
BTW1: Why don't you replace x by abs(x) at the beginning of length10?
IIUC you want to ignore the sign when determining length.
length(-x) for positive x will be 1 smaller than length(x); since
right now I'm using quo in the length10 computation to correct the
estimate, it won't affect the result. However, if length10 were
changed accordingly to improve efficiency then ... it would need
such a change.
> divide(2.400, 2.0) ==> 1.20
Those are examples. What is the rule? Looking at the first four
examples one could suspect the rule "remove trailing zeros from
exact results", but this is clearly not true in the last example.
That last example isn't exceptional; it is because
2.400 is 2400E-3 and 2.0 is 20E-1, and divide(2400, 20) is exact.
The result is (2400/20)E(-3-(-1)) = 120E-2, i.e. 1.20.
There is indeed an algorithm described in the spec
http://speleotrove.com/decimal/daops.html#refdivide
However I chose not to follow it directly, because it involves a
chain of subtractions and shifts in base 10, far less efficient than
the one in Float. Actually, many implementations of that spec use
text strings to represent floating point numbers, and shifting in
base 10 would be easy that way, but it isn't very convenient to
work with strings in SPAD.
So instead of calculating the division digit by digit, I chose to get
the remainder, calculate a "decimal expansion" from it, then add it
to the quotient.
I am aware now that with the current trailing-zero removal approach,
if the decimal expansion has a zero in it, and a precision setting
happens to make it the last digit, the result would be
different from the definition in the spec. Thus I'm planning to
correct it by adding a test for exact division/sqrt before performing
trailing zero removal.
Do I understand correctly that
divide(10^9*divide(1, 3), 333333333)
should give 1 (as opposed to 1.000_000_00)?
Yes.
What about the case when x is of higher precision with an "exact"
representation, the same for y, the quotient is not exact,
but after the second quo you get several trailing zeros?
That is something like
divide(12300000001, 123)
where 12300000001 is produced at higher precision.
What about addition and multiplication?
The spec does not cover particular issues with multiple-precision
implementations. Assuming we follow the definition directly,
divide(12300000001, 123) should be 100000000E0.
The spec for addition and multiplication is close enough to a
typical floating point implementation:
http://speleotrove.com/decimal/daops.html#refaddsub
http://speleotrove.com/decimal/daops.html#refmult
…---
LDB
On Sat, Dec 04, 2021 at 01:05:20AM -0800, LdBeth wrote:
On Sat, 04 Dec 2021 14:09:15 +0800,
hebisch wrote:
>
> BTW1: Why don't you replace x by abs(x) at the beginning of length10?
> IIUC you want to ignore the sign when determining length.
length(-x) for positive x will be 1 smaller than length(x); since
right now I'm using quo in the length10 computation to correct the
estimate, it won't affect the result. However, if length10 were
changed accordingly to improve efficiency then ... it would need
such a change.
Well, the current version _may_ work. My point was that using abs(x) is
obviously correct and there seems to be no advantage in not using it.
> > divide(2.400, 2.0) ==> 1.20
>
> Those are examples. What is the rule? Looking at first four
> examples one can suspect rule "remove trailing zeros from
> exact results", but this is clearly not true in last example.
That last example isn't exceptional; it is because
2.400 is 2400E-3 and 2.0 is 20E-1, and divide(2400, 20) is exact.
The result is (2400/20)E(-3-(-1)) = 120E-2, i.e. 1.20.
There is indeed an algorithm described in the spec
http://speleotrove.com/decimal/daops.html#refdivide
However I chose not to follow it directly, because it involves a
chain of subtractions and shifts in base 10, far less efficient than
the one in Float. Actually, many implementations of that spec use
text strings to represent floating point numbers, and shifting in
base 10 would be easy that way, but it isn't very convenient to
work with strings in SPAD.
AFAICS the following (untested) implements the spec:
shorten(q : I, eq : I, adjust : I) : % ==
while adjust > 0 repeat
(q1, r1) := divide(q, 10)
r1 ~= 0 => break
q := q1
adjust := adjust - 1
round([q, eq - adjust])
dvide(x : %, y : %) : % ==
mx := x.mantissa
my := y.mantissa
ex := x.exponent
ey := y.exponent
adjust := max(LENGTH(my) - LENGTH(mx) + digits() + 1, 0)
(q, r) := divide(mx*POW10(adjust), my)
-- if inexact, then just round
r ~= 0 => round([q, ex - ey - adjust])
shorten(q, ex - ey, adjust)
This "chain of subtractions and shifts in base 10" is really long
division. adjust above normally is larger than their adjust
when they stop long division. If division is exact then the
loop in shorten makes sure that adjust is the same as when
there is an early exit from long division.
Remarks:
- just calling round is OK only because of the round-half-up rule;
otherwise round would also need the remainder.
- one could optimize by replacing LENGTH(my) by an upper estimate
and LENGTH(mx) by a lower estimate (such estimates are much
cheaper than exact length).
- in shorten most of the time there is no need for round. But
one would need to compute the length of q (or an estimate of the
length) to know.
- as you noted, shorten could be optimized (say using binary
search).
- AFAICS the same shorten could be used in square root.
BTW: For sqrt IIUC they require round-half-even even if
another rounding mode is in use for other operations.
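The untested sketch above can be transcribed into Python for experimentation. Names mirror the sketch; `length10` stands in for `LENGTH`, plain `(mantissa, exponent)` tuples stand in for the DecimalFloat record, and the final rounding of the inexact branch is elided (marked in a comment), so this only exercises the adjust/shorten bookkeeping:

```python
def length10(n: int) -> int:
    """Number of decimal digits of |n| (LENGTH in the sketch)."""
    return len(str(abs(n)))

def shorten(q: int, eq: int, adjust: int) -> tuple[int, int]:
    """Strip trailing zeros introduced by over-scaling, moving the
    exponent back toward its ideal value (the spec's exact case)."""
    while adjust > 0:
        q1, r1 = divmod(q, 10)
        if r1 != 0:
            break
        q = q1
        adjust -= 1
    return q, eq - adjust

def dvide(mx: int, ex: int, my: int, ey: int, digits: int) -> tuple[int, int]:
    """Spec-style decimal division on (mantissa, exponent) pairs."""
    adjust = max(length10(my) - length10(mx) + digits + 1, 0)
    q, r = divmod(mx * 10 ** adjust, my)
    if r != 0:
        # inexact: a final rounding step to `digits` digits would go here
        return q, ex - ey - adjust
    return shorten(q, ex - ey, adjust)
```

For example `dvide(2400, -3, 20, -1, 9)` (i.e. 2.400 / 2.0 at 9 digits) scales by 10^8, divides exactly, and shorten strips the over-scaling back down to `(120, -2)`, the spec's `1.20`.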
…--
Waldek Hebisch
On Tue, Dec 21, 2021 at 09:55:47PM -0800, LdBeth wrote:
Sorry for the delayed reply, I have been busy with end-of-term stuff for a while.
I hope you will be able to finish this.
Yes, however the spec does not mention what to do if the precision
of operand is higher than current precision. If we insist on correct rounding
in that case, I guess I have to do a correct implementation of `round-half-even`.
I think we should not make exceptions due to precision, that is,
do what the spec says...
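For reference, the difference between the two tie-breaking rules under discussion is easy to see with Python's stdlib decimal module (whose spec-conformant `ROUND_HALF_UP` rounds ties away from zero, so it covers the negative-number question raised earlier). This is an illustration of the rules, not of the PR's code:

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

ties = ["0.5", "1.5", "2.5", "-2.5"]
half_even = [str(Decimal(s).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))
             for s in ties]
half_up = [str(Decimal(s).quantize(Decimal("1"), rounding=ROUND_HALF_UP))
           for s in ties]
print(half_even)  # ['0', '2', '2', '-2']
print(half_up)    # ['1', '2', '3', '-3']
```

The two rules only disagree exactly on ties, which is why an implementation that is correct under one family of rules can silently round wrongly under the other.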
…--
Waldek Hebisch
Commits:
- Improve conformance to decarith; add testcase
- And fix Makefile.in update.
- fix bug in division
- DecimalFloat support in SPAD
- fix obvious problems in divide and sqrt
- fix sqrt
- new exp wip still need exp1
- add cached exp1
- stage changes