Abstract
Standard Wilkinson-type error estimates of floating-point algorithms involve a factor $\gamma_k := ku/(1 - ku)$, where $u$ denotes the relative rounding error unit of a floating-point number system. Recently, it was shown that, for many standard algorithms such as matrix multiplication, LU or Cholesky decomposition, $\gamma_k$ can be replaced by $ku$, and the restriction on $k$ can be removed. However, the arguments make heavy use of specific properties of both the underlying set of floating-point numbers and the corresponding arithmetic. In this paper, we derive error estimates for the summation of real numbers where each sum is afflicted with some perturbation. Recent results on floating-point summation follow as a corollary, in particular error estimates for rounding to nearest and for directed rounding. Our new estimates are sharp and unveil the necessary properties of floating-point schemes to allow for a priori estimates of summation with a factor omitting higher order terms.
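The difference between the two factors can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes IEEE 754 binary64 arithmetic with rounding to nearest (so u = 2^-53), plain left-to-right recursive summation, and the commonly cited bound |computed - exact| <= (n - 1) u Σ|x_i| as the "linear" estimate. The function names `gamma`, `recursive_sum`, and `check_bounds` are hypothetical and chosen for illustration only.

```python
from fractions import Fraction
import random

# Assumption: IEEE 754 binary64 with round-to-nearest, so u = 2**-53.
U = Fraction(1, 2**53)


def gamma(k):
    """Classical factor gamma_k = k*u / (1 - k*u); only defined for k*u < 1."""
    ku = k * U
    if ku >= 1:
        raise ValueError("gamma_k requires k*u < 1")
    return ku / (1 - ku)


def recursive_sum(xs):
    """Left-to-right floating-point summation, one rounding per addition.

    An explicit loop is used because the built-in sum() may apply
    compensated summation to floats in recent Python versions.
    """
    s = 0.0
    for x in xs:
        s += x
    return s


def check_bounds(xs):
    """Return (actual error, linear bound, classical bound) as exact rationals."""
    n = len(xs)
    exact = sum(Fraction(x) for x in xs)          # exact rational sum of the inputs
    err = abs(Fraction(recursive_sum(xs)) - exact)
    abs_sum = sum(abs(Fraction(x)) for x in xs)
    linear = (n - 1) * U * abs_sum                # factor (n-1)*u, no higher-order terms
    classical = gamma(n - 1) * abs_sum            # factor gamma_{n-1}
    return err, linear, classical


if __name__ == "__main__":
    random.seed(0)
    data = [random.uniform(-1.0, 1.0) for _ in range(1000)]
    err, linear, classical = check_bounds(data)
    print(float(err), float(linear), float(classical))
```

Exact rational arithmetic via `fractions.Fraction` is used so that the measured error and both bounds are themselves free of rounding. For moderate n the two bounds are nearly identical; the practical gain of the linear factor is that it needs no restriction such as ku < 1.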
Original language | English |
---|---|
Pages (from-to) | 927-941 |
Number of pages | 15 |
Journal | BIT Numerical Mathematics |
Volume | 57 |
Issue number | 3 |
Publication status | Published - 2017 Sep 1 |
Keywords
- Error analysis
- Floating-point
- Real numbers
- Summation
- Wilkinson-type error estimates
ASJC Scopus subject areas
- Software
- Computer Networks and Communications
- Computational Mathematics
- Applied Mathematics