TY - JOUR
T1 - Error estimates for the summation of real numbers with application to floating-point summation
AU - Lange, Marko
AU - Rump, Siegfried M.
PY - 2017/9/1
Y1 - 2017/9/1
N2 - Standard Wilkinson-type error estimates of floating-point algorithms involve a factor γ_k := ku/(1 - ku) for u denoting the relative rounding error unit of a floating-point number system. Recently, it was shown that, for many standard algorithms such as matrix multiplication, LU- or Cholesky decomposition, γ_k can be replaced by ku, and the restriction on k can be removed. However, the arguments make heavy use of specific properties of both the underlying set of floating-point numbers and the corresponding arithmetic. In this paper, we derive error estimates for the summation of real numbers where each sum is afflicted with some perturbation. Recent results on floating-point summation follow as a corollary, in particular error estimates for rounding to nearest and for directed rounding. Our new estimates are sharp and unveil the necessary properties of floating-point schemes to allow for a priori estimates of summation with a factor omitting higher order terms.
AB - Standard Wilkinson-type error estimates of floating-point algorithms involve a factor γ_k := ku/(1 - ku) for u denoting the relative rounding error unit of a floating-point number system. Recently, it was shown that, for many standard algorithms such as matrix multiplication, LU- or Cholesky decomposition, γ_k can be replaced by ku, and the restriction on k can be removed. However, the arguments make heavy use of specific properties of both the underlying set of floating-point numbers and the corresponding arithmetic. In this paper, we derive error estimates for the summation of real numbers where each sum is afflicted with some perturbation. Recent results on floating-point summation follow as a corollary, in particular error estimates for rounding to nearest and for directed rounding. Our new estimates are sharp and unveil the necessary properties of floating-point schemes to allow for a priori estimates of summation with a factor omitting higher order terms.
KW - Error analysis
KW - Floating-point
KW - Real numbers
KW - Summation
KW - Wilkinson-type error estimates
UR - http://www.scopus.com/inward/record.url?scp=85018980233&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85018980233&partnerID=8YFLogxK
U2 - 10.1007/s10543-017-0658-9
DO - 10.1007/s10543-017-0658-9
M3 - Article
AN - SCOPUS:85018980233
SN - 0006-3835
VL - 57
SP - 927
EP - 941
JO - BIT
JF - BIT
IS - 3
ER -