Skip to content

Commit

Permalink
Further improve accuracy of math.hypot() (pythonGH-22013)
Browse files Browse the repository at this point in the history
  • Loading branch information
rhettinger authored Aug 30, 2020
1 parent 75c80b0 commit 92c3816
Showing 1 changed file with 8 additions and 3 deletions.
11 changes: 8 additions & 3 deletions Modules/mathmodule.c
Original file line number Diff line number Diff line change
Expand Up @@ -2455,6 +2455,9 @@ Given that csum >= 1.0, we have:
lo**2 <= 2**-54 < 2**-53 == 1/2*ulp(1.0) <= ulp(csum)/2
Since lo**2 is less than 1/2 ulp(csum), we have csum+lo*lo == csum.
To minimize loss of information during the accumulation of fractional
values, the lo**2 term has a separate accumulator.
The square root differential correction is needed because a
correctly rounded square root of a correctly rounded sum of
squares can still be off by as much as one ulp.
Expand All @@ -2475,15 +2478,16 @@ Borges' ALGORITHM 4 and 5.
1. Veltkamp-Dekker splitting: http://csclub.uwaterloo.ca/~pbarfuss/dekker1971.pdf
2. Compensated summation: http://www.ti3.tu-harburg.de/paper/rump/Ru08b.pdf
3. Square root diffential correction: https://arxiv.org/pdf/1904.09481.pdf
3. Square root differential correction: https://arxiv.org/pdf/1904.09481.pdf
4. https://www.wolframalpha.com/input/?i=Maclaurin+series+sqrt%28h**2+%2B+x%29+at+x%3D0
*/

static inline double
vector_norm(Py_ssize_t n, double *vec, double max, int found_nan)
{
const double T27 = 134217729.0; /* ldexp(1.0, 27)+1.0) */
double x, csum = 1.0, oldcsum, frac = 0.0, scale;
double x, csum = 1.0, oldcsum, frac = 0.0, frac_lo = 0.0, scale;
double t, hi, lo, h;
int max_e;
Py_ssize_t i;
Expand Down Expand Up @@ -2528,8 +2532,9 @@ vector_norm(Py_ssize_t n, double *vec, double max, int found_nan)
frac += (oldcsum - csum) + x;

assert(csum + lo * lo == csum);
frac += lo * lo;
frac_lo += lo * lo;
}
frac += frac_lo;
h = sqrt(csum - 1.0 + frac);

x = h;
Expand Down

0 comments on commit 92c3816

Please sign in to comment.