TY - JOUR

T1 - Error estimates for the summation of real numbers with application to floating-point summation

AU - Lange, Marko

AU - Rump, Siegfried M.

PY - 2017/9/1

Y1 - 2017/9/1

N2 - Standard Wilkinson-type error estimates of floating-point algorithms involve a factor γ_k := ku/(1 - ku) for u denoting the relative rounding error unit of a floating-point number system. Recently, it was shown that, for many standard algorithms such as matrix multiplication, LU- or Cholesky decomposition, γ_k can be replaced by ku, and the restriction on k can be removed. However, the arguments make heavy use of specific properties of both the underlying set of floating-point numbers and the corresponding arithmetic. In this paper, we derive error estimates for the summation of real numbers where each sum is afflicted with some perturbation. Recent results on floating-point summation follow as a corollary, in particular error estimates for rounding to nearest and for directed rounding. Our new estimates are sharp and unveil the necessary properties of floating-point schemes to allow for a priori estimates of summation with a factor omitting higher order terms.

AB - Standard Wilkinson-type error estimates of floating-point algorithms involve a factor γ_k := ku/(1 - ku) for u denoting the relative rounding error unit of a floating-point number system. Recently, it was shown that, for many standard algorithms such as matrix multiplication, LU- or Cholesky decomposition, γ_k can be replaced by ku, and the restriction on k can be removed. However, the arguments make heavy use of specific properties of both the underlying set of floating-point numbers and the corresponding arithmetic. In this paper, we derive error estimates for the summation of real numbers where each sum is afflicted with some perturbation. Recent results on floating-point summation follow as a corollary, in particular error estimates for rounding to nearest and for directed rounding. Our new estimates are sharp and unveil the necessary properties of floating-point schemes to allow for a priori estimates of summation with a factor omitting higher order terms.

KW - Error analysis

KW - Floating-point

KW - Real numbers

KW - Summation

KW - Wilkinson-type error estimates

UR - http://www.scopus.com/inward/record.url?scp=85018980233&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85018980233&partnerID=8YFLogxK

U2 - 10.1007/s10543-017-0658-9

DO - 10.1007/s10543-017-0658-9

M3 - Article

AN - SCOPUS:85018980233

SN - 0006-3835

VL - 57

SP - 927

EP - 941

JO - BIT

JF - BIT

IS - 3

ER -