Biconjugate Gradient Method


The conjugate gradient method is not suitable for nonsymmetric systems because the residual vectors cannot be made orthogonal with short recurrences, as proved in Voevodin (1983) and Faber and Manteuffel (1984). The generalized minimal residual method retains orthogonality of the residuals by using long recurrences, at the cost of a larger storage demand. The biconjugate gradient method (BCG) takes another approach, replacing the orthogonal sequence of residuals by two mutually orthogonal sequences, at the price of no longer providing a minimization.

The update relations for residuals in the conjugate gradient method are augmented in the biconjugate gradient method by relations that are similar but based on $A^T$ instead of $A$. Thus we update two sequences of residuals

$$r^{(i)} = r^{(i-1)} - \alpha_i A p^{(i)} \tag{1}$$
$$\tilde{r}^{(i)} = \tilde{r}^{(i-1)} - \alpha_i A^T \tilde{p}^{(i)} \tag{2}$$

and two sequences of search directions

$$p^{(i)} = r^{(i-1)} + \beta_{i-1} p^{(i-1)} \tag{3}$$
$$\tilde{p}^{(i)} = \tilde{r}^{(i-1)} + \beta_{i-1} \tilde{p}^{(i-1)}. \tag{4}$$

The choices

$$\alpha_i = \frac{\tilde{r}^{(i-1)T} r^{(i-1)}}{\tilde{p}^{(i)T} A p^{(i)}} \tag{5}$$
$$\beta_i = \frac{\tilde{r}^{(i)T} r^{(i)}}{\tilde{r}^{(i-1)T} r^{(i-1)}} \tag{6}$$

ensure the orthogonality relations

$$\tilde{r}^{(i)T} r^{(j)} = \tilde{p}^{(i)T} A p^{(j)} = 0 \tag{7}$$

if $i \ne j$.
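The recurrences (1) to (6) can be sketched in a few lines of Python. This is a minimal, unpreconditioned illustration rather than a reference implementation. It makes several assumptions that are not stated above: the starting guess is zero, the shadow residual starts equal to the ordinary residual, the iterate is updated as x^(i) = x^(i-1) + alpha_i p^(i) (the update consistent with (1)), and breakdown is detected only by exact zeros. The helper names `dot`, `matvec`, `rmatvec`, and `bicg` are hypothetical, and plain lists are used to keep the sketch free of dependencies.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    """Compute A x for a dense matrix stored as a list of rows."""
    return [dot(row, x) for row in A]

def rmatvec(A, y):
    """Compute A^T y without forming the transpose explicitly."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]

def bicg(A, b, tol=1e-10, max_iter=100):
    """Unpreconditioned BCG sketch; returns (x, iterations used)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # r^(0) = b - A x^(0) with x^(0) = 0 (assumed)
    r_t = list(r)                    # shadow residual r~^(0) = r^(0) (assumed choice)
    p, p_t = [0.0] * n, [0.0] * n
    rho_prev = 1.0
    b_norm = dot(b, b) ** 0.5 or 1.0
    if dot(r, r) ** 0.5 <= tol * b_norm:
        return x, 0
    for i in range(1, max_iter + 1):
        rho = dot(r_t, r)                                    # numerator of (5)
        if rho == 0.0:                                       # practical codes test "near zero"
            raise ArithmeticError("breakdown: r~^T r = 0")
        beta = 0.0 if i == 1 else rho / rho_prev             # (6)
        p = [ri + beta * pi for ri, pi in zip(r, p)]         # (3)
        p_t = [ri + beta * pi for ri, pi in zip(r_t, p_t)]   # (4)
        q, q_t = matvec(A, p), rmatvec(A, p_t)
        denom = dot(p_t, q)                                  # denominator of (5)
        if denom == 0.0:
            raise ArithmeticError("breakdown: p~^T A p = 0")
        alpha = rho / denom                                  # (5)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]        # (1)
        r_t = [ri - alpha * qi for ri, qi in zip(r_t, q_t)]  # (2)
        rho_prev = rho
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, i
    return x, max_iter

# Small nonsymmetric test system whose exact solution is [1, 2, 3].
A = [[4.0, 1.0, 0.0], [2.0, 5.0, 1.0], [0.0, 1.0, 3.0]]
b = [6.0, 15.0, 11.0]
x, iters = bicg(A, b)
```

If A is symmetric and the shadow residual starts equal to the residual, relations (2) and (4) duplicate (1) and (3), so the two sequences coincide and the sketch does the work of the conjugate gradient method twice over.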

Few theoretical results are known about the convergence of the biconjugate gradient method. For symmetric positive definite systems, the method delivers the same results as the conjugate gradient method, but at twice the cost per iteration. For nonsymmetric matrices, it has been shown that in phases of the process where there is significant reduction of the norm of the residual, the method is more or less comparable to the full generalized minimal residual method in terms of numbers of iterations (Freund and Nachtigal 1991). In practice, this is often confirmed, but it is also observed that the convergence behavior may be quite irregular, and the method may even break down. The breakdown situation due to the possible event that

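$$z^{(i-1)T} \tilde{r}^{(i-1)} \approx 0, \tag{8}$$

where $z^{(i-1)}$ is the preconditioned residual (equal to $r^{(i-1)}$ when no preconditioner is used, so that the numerator of (5) vanishes), can be circumvented by so-called look-ahead strategies (Parlett et al. 1985). The other breakdown situation,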

$$\tilde{p}^{(i)T} q^{(i)} \approx 0, \tag{9}$$

where $q^{(i)} = A p^{(i)}$ (so that the denominator of (5) vanishes), occurs when the LU decomposition implicit in the method fails (cf. conjugate gradient method), and can be repaired by using another decomposition. This is done, for example, in some versions of the quasi-minimal residual method.
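Breakdown situation (9) can be seen on a 2 x 2 example. The sketch below evaluates the numerator and denominator of (5) for the first step, writing q = A p for the matrix-vector product. It assumes a zero starting guess and a shadow residual equal to the residual. The matrix, the right-hand side, and the helper name `first_step` are all hypothetical, chosen only for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    """Compute A x for a dense matrix stored as a list of rows."""
    return [dot(row, x) for row in A]

def first_step(A, b, r_shadow):
    """Return (rho, denom) for the first BCG step starting from x^(0) = 0."""
    r = list(b)                      # r^(0) = b
    p, p_t = r, list(r_shadow)       # p^(1) = r^(0), p~^(1) = r~^(0)
    rho = dot(r_shadow, r)           # numerator of (5)
    denom = dot(p_t, matvec(A, p))   # p~^T q with q = A p, denominator of (5)
    return rho, denom

A = [[0.0, 1.0], [-1.0, 0.0]]        # nonsingular, skew-symmetric
b = [1.0, 0.0]
rho, denom = first_step(A, b, r_shadow=b)
# rho is 1.0 but denom is 0.0, so alpha_1 in (5) is undefined
```

The matrix here is nonsingular and the system has a unique solution, so the failure comes from the recurrence and not from the problem itself.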

Sometimes, breakdown or near breakdown situations can be satisfactorily avoided by a restart at the iteration step immediately before the (near) breakdown step. Another possibility is to switch to a more robust (but possibly more expensive) method such as the generalized minimal residual method.

BCG requires computing a matrix-vector product $A p^{(k)}$ and a transpose product $A^T \tilde{p}^{(k)}$. In some applications, the latter product may be impossible to perform, for instance if the matrix is not formed explicitly and the regular product is available only in operator form, such as a function call.
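When the matrix exists only as a routine, a second routine for the transpose product has to be written by hand. The sketch below uses a hypothetical difference operator as the example: `apply_A` and `apply_AT` are illustrative names, and the operator itself is an assumption made for the demonstration. An adjoint test, comparing (A x)^T y with x^T (A^T y), is one common way to check that the two routines agree.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def apply_A(x):
    """Matrix-free product: (A x)_i = x_{i+1} - x_i, with x_n taken as 0."""
    n = len(x)
    return [(x[i + 1] if i + 1 < n else 0.0) - x[i] for i in range(n)]

def apply_AT(y):
    """Hand-derived transpose: (A^T y)_j = y_{j-1} - y_j, with y_{-1} taken as 0."""
    n = len(y)
    return [(y[j - 1] if j > 0 else 0.0) - y[j] for j in range(n)]

# Adjoint test on arbitrary vectors: both inner products must agree.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, -1.0, 0.0, 5.0]
lhs = dot(apply_A(x), y)    # (A x)^T y
rhs = dot(x, apply_AT(y))   # x^T (A^T y)
```

If only `apply_A` were available and its internals could not be inspected, no counterpart to `apply_AT` could be written, which is the situation in which BCG cannot be applied.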

In a parallel environment, the two matrix-vector products can theoretically be performed simultaneously; however, in a distributed-memory environment, there will be extra communication costs associated with one of the two matrix-vector products, depending upon the storage scheme for A. A duplicate copy of the matrix will alleviate this problem, at the cost of doubling the storage requirements for the matrix.

Care must also be exercised in choosing the preconditioner, since similar problems arise during the two solves involving the preconditioning matrix.
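The two preconditioner solves can be made explicit. As a sketch, and assuming a preconditioning matrix written $M$ (a symbol not used above), the preconditioned form of the recurrences in the style of the template in Barrett et al. (1994) replaces the residuals in (3) to (6) by preconditioned residuals $z$ and $\tilde{z}$:

```latex
\begin{align*}
M z^{(i-1)} &= r^{(i-1)}, &
M^T \tilde{z}^{(i-1)} &= \tilde{r}^{(i-1)}, \\
\rho_{i-1} &= z^{(i-1)T} \tilde{r}^{(i-1)}, &
\beta_{i-1} &= \rho_{i-1} / \rho_{i-2}, \\
p^{(i)} &= z^{(i-1)} + \beta_{i-1} p^{(i-1)}, &
\tilde{p}^{(i)} &= \tilde{z}^{(i-1)} + \beta_{i-1} \tilde{p}^{(i-1)}, \\
\alpha_i &= \rho_{i-1} / \bigl(\tilde{p}^{(i)T} A p^{(i)}\bigr).
\end{align*}
```

The residual updates (1) and (2) are unchanged, and the first search directions are simply $p^{(1)} = z^{(0)}$ and $\tilde{p}^{(1)} = \tilde{z}^{(0)}$. One solve involves $M$ and the other $M^T$, which is why the storage and communication concerns raised for $A$ and $A^T$ apply to the preconditioner as well.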

It is difficult to make a fair comparison between the generalized minimal residual method (GMRES) and BCG. GMRES really minimizes a residual, but at the cost of increasing work to keep all residuals orthogonal and increasing demands for memory space. BCG does not minimize a residual, but its accuracy is often comparable to that of GMRES, at the cost of twice the number of matrix-vector products per iteration step. However, the generation of the basis vectors is relatively cheap and the memory requirements are modest. Several variants of BCG have been proposed (e.g., the conjugate gradient squared method and the biconjugate gradient stabilized method) that increase the effectiveness of this class of methods in certain circumstances.


See also

Biconjugate Gradient Stabilized Method, Conjugate Gradient Method on the Normal Equations, Chebyshev Iteration, Conjugate Gradient Method, Conjugate Gradient Squared Method, Flexible Generalized Minimal Residual Method, Generalized Minimal Residual Method, Linear System of Equations, Minimal Residual Method, Nonstationary Iterative Method, Preconditioner, Quasi-Minimal Residual Method, Stationary Iterative Method, Symmetric LQ Method, Transpose-Free Quasi-Minimal Residual Method

This entry contributed by Noel Black and Shirley Moore, adapted from Barrett et al. (1994)

References

Barrett, R.; Berry, M.; Chan, T. F.; Demmel, J.; Donato, J.; Dongarra, J.; Eijkhout, V.; Pozo, R.; Romine, C.; and van der Vorst, H. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, 2nd ed. Philadelphia, PA: SIAM, 1994. http://www.netlib.org/linalg/html_templates/Templates.html.

Faber, V. and Manteuffel, T. "Necessary and Sufficient Conditions for the Existence of a Conjugate Gradient Method." SIAM J. Numer. Anal. 21, 315-339, 1984.

Freund, R. and Nachtigal, N. "QMR: A Quasi-Minimal Residual Method for Non-Hermitian Linear Systems." Numer. Math. 60, 315-339, 1991.

Parlett, B. N.; Taylor, D. R.; and Liu, Z. A. "A Look-Ahead Lanczos Algorithm for Unsymmetric Matrices." Math. Comput. 44, 105-124, 1985.

Voevodin, V. "The Problem of Non-Self-Adjoint Generalization of the Conjugate Gradient Method is Closed." U.S.S.R. Comput. Maths. and Math. Phys. 23, 143-144, 1983.

Cite this as:

Black, Noel and Moore, Shirley. "Biconjugate Gradient Method." From MathWorld--A Wolfram Web Resource, created by Eric W. Weisstein. https://mathworld.wolfram.com/BiconjugateGradientMethod.html
