Many thanks to the netlib maintainers (netlib_maintainers@netlib.org), on whose FAQ list this ScaLAPACK FAQ is patterned.
1.1) What is ScaLAPACK?

The ScaLAPACK (or Scalable LAPACK) library includes a subset of LAPACK routines redesigned for distributed memory MIMD parallel computers. It is currently written in a Single-Program-Multiple-Data style using explicit message passing for interprocessor communication. It assumes matrices are laid out in a two-dimensional block cyclic decomposition.
Like LAPACK, the ScaLAPACK routines are based on block-partitioned algorithms in order to minimize the frequency of data movement between different levels of the memory hierarchy. (For such machines, the memory hierarchy includes the off-processor memory of other processors, in addition to the hierarchy of registers, cache, and local memory on each processor.) The fundamental building blocks of the ScaLAPACK library are distributed memory versions (PBLAS) of the Level 1, 2 and 3 BLAS, and a set of Basic Linear Algebra Communication Subprograms (BLACS) for communication tasks that arise frequently in parallel linear algebra computations. In the ScaLAPACK routines, all interprocessor communication occurs within the PBLAS and the BLACS. One of the design goals of ScaLAPACK was to have the ScaLAPACK routines resemble their LAPACK equivalents as much as possible.
For detailed information on ScaLAPACK, please refer to the ScaLAPACK Users' Guide.
1.2) How do I reference ScaLAPACK in a scientific publication?
We ask that you cite the ScaLAPACK Users' Guide.
@BOOK{slug,
    AUTHOR = {Blackford, L. S. and Choi, J. and Cleary, A. and
              D'Azevedo, E. and Demmel, J. and Dhillon, I. and
              Dongarra, J. and Hammarling, S. and Henry, G. and
              Petitet, A. and Stanley, K. and Walker, D. and
              Whaley, R. C.},
    TITLE = {{ScaLAPACK} Users' Guide},
    PUBLISHER = {Society for Industrial and Applied Mathematics},
    YEAR = {1997},
    ADDRESS = {Philadelphia, PA},
    ISBN = {0-89871-397-8 (paperback)}
}
1.3) Are there vendor-specific versions of ScaLAPACK?
Yes.
ScaLAPACK has been incorporated into several commercial packages, including the Sun Scalable Scientific Subroutine Library (Sun S3L), NAG Parallel Library, IBM Parallel ESSL, and Cray LIBSCI, and is being integrated into the VNI IMSL Numerical Library, as well as software libraries for Fujitsu, Hewlett-Packard/Convex, Hitachi, NEC, and SGI.
The publicly available version of ScaLAPACK (on netlib) is designed to be portable and efficient across a wide range of computers. It is not hand-tuned for a specific computer architecture.
The vendor-specific versions of ScaLAPACK have been optimized for a specific architecture. Therefore, for best performance, we recommend using a vendor-optimized version of ScaLAPACK if it is available.
However, as new ScaLAPACK routines are introduced with each release, the vendor-specific versions of ScaLAPACK may only contain a subset of the existing routines.
If you suspect an error in a vendor-specific ScaLAPACK routine, we recommend downloading the ScaLAPACK Test Suite from netlib.
1.5) Are there legal restrictions on the use of ScaLAPACK software?
ScaLAPACK (like LINPACK, EISPACK, LAPACK, etc.) is a freely-available software package. It is available from netlib via anonymous ftp and the World Wide Web. It can be, and is being, included in commercial packages (e.g., Sun's S3L, IBM's Parallel ESSL, NAG Numerical PVM and MPI Library). We only ask that proper credit be given to the authors.
Like all software, it is copyrighted. It is not trademarked, but we do ask the following:
If you modify the source for these routines, we ask that you change the name of the routine and comment the changes made to the original.
We will gladly answer any questions regarding the software. If a modification is done, however, it is the responsibility of the person who modified the routine to provide support.
1.6) What is two-dimensional block cyclic data distribution?
In the two-dimensional block cyclic decomposition, a matrix is first partitioned into MB x NB blocks, and these blocks are then dealt out cyclically, in both the row and column dimensions, across a two-dimensional grid of processes. A full description of the two-dimensional block cyclic decomposition is given in the ScaLAPACK Users' Guide.
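As an illustration (this sketch is not part of the ScaLAPACK distribution), the C program below computes, for a one-dimensional block cyclic distribution, which process owns a given global index and the corresponding local index; applying the same mapping independently to rows and columns yields the two-dimensional decomposition. It mirrors the ScaLAPACK tool routines INDXG2P and INDXG2L, but uses 0-based indexing.

    /* Block cyclic index mapping, 0-based (cf. ScaLAPACK's INDXG2P/INDXG2L). */
    #include <stdio.h>

    /* Coordinate of the process owning global index ig, for block size nb,
       nprocs processes, and a distribution starting on process isrc. */
    static int indxg2p(int ig, int nb, int isrc, int nprocs) {
        return (isrc + ig / nb) % nprocs;
    }

    /* Local index of global index ig on its owning process. */
    static int indxg2l(int ig, int nb, int nprocs) {
        return (ig / (nb * nprocs)) * nb + ig % nb;
    }

    int main(void) {
        int nb = 2, nprocs = 3;   /* blocks of 2 dealt over 3 processes */
        for (int ig = 0; ig < 10; ig++)
            printf("global %d -> process %d, local %d\n",
                   ig, indxg2p(ig, nb, 0, nprocs), indxg2l(ig, nb, nprocs));
        return 0;
    }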
1.7) Where can I find more information about ScaLAPACK?
A variety of working notes related to the ScaLAPACK library were published as LAPACK Working Notes and are available in PostScript and PDF format at:
1.8) What and where are the PBLAS?
The Parallel Basic Linear Algebra Subprograms (PBLAS) are distributed memory versions of the Level 1, 2 and 3 BLAS. A Quick Reference Guide to the PBLAS is available. The software is available as part of the ScaLAPACK distribution tar file (scalapack.tgz).
There is also a new prototype version of the PBLAS (version 2.0), which is alignment-restriction free and uses logical algorithmic blocking techniques. For details, please refer to the file scalapack/prototype/readme.pblas on netlib.
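To give the flavor of the interface, the sketch below shows a Level 3 PBLAS call from C. It assumes the usual Fortran calling convention (trailing underscore, arguments passed by reference) and that the distributed matrices and their descriptors have already been set up, e.g. with DESCINIT; it is an illustration, not a complete program.

    /* A PBLAS call mirrors its BLAS counterpart: global indices (ia, ja)
       and an array descriptor replace the leading-dimension argument.
       a, b, c point to the local pieces of the distributed matrices. */
    void pdgemm_(char *transa, char *transb, int *m, int *n, int *k,
                 double *alpha, double *a, int *ia, int *ja, int *desca,
                 double *b, int *ib, int *jb, int *descb, double *beta,
                 double *c, int *ic, int *jc, int *descc);

    /* C := A * B over the full distributed matrices. */
    void multiply(int m, int n, int k, double *a, int *desca,
                  double *b, int *descb, double *c, int *descc)
    {
        int ione = 1;
        double done = 1.0, dzero = 0.0;
        pdgemm_("N", "N", &m, &n, &k, &done, a, &ione, &ione, desca,
                b, &ione, &ione, descb, &dzero, c, &ione, &ione, descc);
    }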
1.9) Are example ScaLAPACK programs available?
Yes, example ScaLAPACK programs are available. Refer to
A detailed description of how to run a ScaLAPACK example program is discussed in Chapter 2 of the ScaLAPACK Users' Guide.
1.10) How do I run an example program?
A detailed description of how to run a ScaLAPACK example program is discussed in Chapter 2 of the ScaLAPACK Users' Guide.
1.11) How do I install ScaLAPACK?
A comprehensive Installation Guide for ScaLAPACK is provided. In short, you only need to modify one file, SLmake.inc, to specify your compiler, compiler flags, and the locations of your MPI, BLACS, and BLAS libraries; then type make lib to build the ScaLAPACK library, and make exe to build the testing/timing executables. Example SLmake.inc files for various architectures are supplied in the SCALAPACK/INSTALL subdirectory of the distribution.
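For illustration only, a stripped-down SLmake.inc might contain definitions along these lines; the variable names and paths below are placeholders, so consult the example files in SCALAPACK/INSTALL for the exact set used by your release.

    # Hypothetical SLmake.inc excerpt -- placeholders, not a real configuration.
    F77       = mpif77         # Fortran77 compiler
    CC        = mpicc          # C compiler
    F77FLAGS  = -O3            # Fortran compiler flags
    CCFLAGS   = -O3            # C compiler flags
    BLACSLIB  = $(HOME)/BLACS/LIB/blacs_MPI-LINUX-0.a   # BLACS library
    BLASLIB   = -lblas         # BLAS library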
The installation assumes that a low-level message-passing layer (such as MPI, PVM, or a native message-passing library), a BLACS library (MPIBLACS, PVMBLACS, etc.), and a BLAS library are already available. If any of these required components is missing, you must build it before proceeding with the ScaLAPACK installation.
If a vendor-optimized BLAS library is not available, ATLAS can be used to automatically generate an optimized BLAS library for your architecture. Only as a last resort should you use the Fortran77 reference implementation of the BLAS from the BLAS webpage.
1.12) How do I install ScaLAPACK using MPIch-G and Globus?
A detailed explanation of how to run a ScaLAPACK program using MPIch-G and Globus can be found at: http://www.cs.utk.edu/~petitet/grads/.
See Question 1.11 for general installation instructions.
1.13) How do I achieve high performance using ScaLAPACK?
ScaLAPACK performance relies on an efficient low-level message-passing layer and a high-speed interconnection network for communication, and on an optimized BLAS library for local computation.
For a detailed description of performance-related issues, please refer to Chapter 5 of the ScaLAPACK Users' Guide.
1.14) Are prebuilt ScaLAPACK libraries available?
Yes, prebuilt ScaLAPACK libraries are available for a variety of architectures. Refer to
1.15) How do I find a particular routine?
Indexes of individual ScaLAPACK driver and computational routines are available. These indexes contain brief descriptions of each routine.
ScaLAPACK routines are available in four data types: single precision real, double precision real, single precision complex, and double precision complex. For example, the general linear system solver is provided as PSGESV, PDGESV, PCGESV, and PZGESV, one for each type. At the present time, the nonsymmetric eigenproblem is only available in single and double precision real.
1.16) I can't get a program to work. What should I do?
Technical questions should be directed to the authors at scalapack@cs.utk.edu.
Please tell us the type of machine on which the tests were run, the compiler and compiler options that were used, details of the BLACS and BLAS libraries that were used, and, if appropriate, a copy of the input file.
Be prepared to answer the following questions:
1.17) How can I unpack scalapack.tgz?
gunzip scalapack.tgz
tar xvf scalapack.tar
The compression program gzip (and gunzip) is Gnu software. If it is not already available on your machine, you can download it via anonymous ftp:
ncftp prep.ai.mit.edu
cd pub/gnu/
get gzip-1.2.4.tar
See Question 1.11 for installation instructions.
1.18) What technical support for ScaLAPACK is available?
Technical questions and comments should be directed to the authors at scalapack@cs.utk.edu.
See Question 1.16.
1.19) How do I submit a bug report?
Technical questions should be directed to the authors at scalapack@cs.utk.edu.
Be prepared to answer the questions outlined in Question 1.16. Those are the first questions that we will ask!
1.20) How do I gather a distributed vector back to one processor?
There are several ways to accomplish this task; one common approach, using the redistribution routine PDGEMR2D, is sketched below.
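The idea is to view the vector as an m x 1 distributed matrix and copy it, with the redistribution routine PDGEMR2D from the REDIST package, onto a second context that contains only one process. The C sketch below illustrates this; it assumes the distributed vector a, its descriptor desca, and the BLACS context ictxt already exist, and it hand-declares the prototypes.

    /* Gather a distributed m x 1 vector onto process (0,0) via Cpdgemr2d. */
    #include <stdlib.h>

    void Cblacs_get(int ictxt, int what, int *val);
    void Cblacs_gridinit(int *ictxt, char *order, int nprow, int npcol);
    void Cblacs_gridinfo(int ictxt, int *nprow, int *npcol, int *myrow, int *mycol);
    void Cpdgemr2d(int m, int n, double *a, int ia, int ja, int *desca,
                   double *b, int ib, int jb, int *descb, int ictxt);

    double *gather_vector(int m, double *a, int *desca, int ictxt)
    {
        int nprow, npcol, myrow, mycol, root_ctxt, descb[9], i;
        double *b = NULL;

        Cblacs_gridinfo(ictxt, &nprow, &npcol, &myrow, &mycol);

        /* A second, 1 x 1 grid holding only process (0,0). */
        Cblacs_get(-1, 0, &root_ctxt);
        Cblacs_gridinit(&root_ctxt, "Row", 1, 1);

        /* Descriptor for the gathered copy: one m x 1 block living on
           the 1 x 1 grid; processes outside it pass -1 as the context. */
        for (i = 0; i < 9; i++) descb[i] = 0;
        descb[0] = 1;                  /* DTYPE_: dense matrix */
        descb[1] = -1;                 /* CTXT_: -1 off the grid */
        descb[2] = m;  descb[3] = 1;   /* global size m x 1 */
        descb[4] = m;  descb[5] = 1;   /* block size m x 1 */
        descb[8] = 1;                  /* LLD_ */
        if (myrow == 0 && mycol == 0) {
            descb[1] = root_ctxt;
            descb[8] = m;
            b = malloc(m * sizeof(double));  /* receives the whole vector */
        }

        /* Every process of ictxt must participate in the call. */
        Cpdgemr2d(m, 1, a, 1, 1, desca, b, 1, 1, descb, ictxt);
        return b;   /* non-NULL only on process (0,0) */
    }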
2.1) What and where are the BLACS?
The BLACS (Basic Linear Algebra Communication Subprograms) project is an ongoing investigation whose purpose is to create a linear algebra oriented message passing interface that may be implemented efficiently and uniformly across a large range of distributed memory platforms.
The length of time required to implement efficient distributed memory algorithms makes it impractical to rewrite programs for every new parallel machine. The BLACS exist in order to make linear algebra applications both easier to program and more portable. It is for this reason that the BLACS are used as the communication layer of ScaLAPACK.
For further information on the BLACS, please refer to the blacs directory on netlib, as well as the BLACS Homepage.
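As a taste of the interface, here is a minimal, hedged BLACS example using the C interface; the prototypes are declared by hand (a header is not always shipped), and it assumes at least 4 processes so that the 2 x 2 grid can be formed.

    /* Minimal BLACS program: set up a 2 x 2 process grid, report each
       process's coordinates, then shut the grid down. */
    #include <stdio.h>

    void Cblacs_pinfo(int *mypnum, int *nprocs);
    void Cblacs_get(int ictxt, int what, int *val);
    void Cblacs_gridinit(int *ictxt, char *order, int nprow, int npcol);
    void Cblacs_gridinfo(int ictxt, int *nprow, int *npcol, int *myrow, int *mycol);
    void Cblacs_gridexit(int ictxt);
    void Cblacs_exit(int notdone);

    int main(void) {
        int mypnum, nprocs, ictxt, nprow = 2, npcol = 2, myrow, mycol;
        Cblacs_pinfo(&mypnum, &nprocs);   /* my id and the process count */
        Cblacs_get(-1, 0, &ictxt);        /* default system context */
        Cblacs_gridinit(&ictxt, "Row", nprow, npcol);  /* 2 x 2 grid */
        Cblacs_gridinfo(ictxt, &nprow, &npcol, &myrow, &mycol);
        printf("process %d of %d is at grid position (%d,%d)\n",
               mypnum, nprocs, myrow, mycol);
        Cblacs_gridexit(ictxt);
        Cblacs_exit(0);
        return 0;
    }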
2.2) Is there a Quick Reference Guide to the BLACS available?
Yes, there is a PostScript version of the Quick Reference Guide to the BLACS available.
2.3) How do I install the BLACS?
First, you must choose which underlying message-passing layer the BLACS will use (MPI, PVM, NX, MPL, etc.). Once this decision has been made, download the corresponding gzipped tar file.
An Installation Guide for the BLACS is provided, as well as a comprehensive BLACS Test Suite. In short, you only need to modify one file, Bmake.inc, to specify your compiler, compiler flags, and the location of your MPI library; then type, for example, make mpi to build the MPI BLACS library. Example Bmake.inc files for various architectures are supplied in the BLACS/BMAKES subdirectory of the distribution. There are also scripts in BLACS/INSTALL that can be run to help determine some of the settings in the Bmake.inc file.
It is highly recommended that you run the BLACS Tester to ensure that your installation is correct, and that no bugs have been detected in the low-level message-passing layer. If you suspect an error, please consult the errata file in the blacs directory on netlib.
2.4) Are prebuilt BLACS libraries available?
Yes, prebuilt BLACS libraries are available for a variety of architectures and message-passing interfaces. Refer to
2.5) Are example BLACS programs available?
Yes, example BLACS programs are available. Refer to
3.1) What and where are the BLAS?
The BLAS (Basic Linear Algebra Subprograms) are high quality "building block" routines for performing basic vector and matrix operations. Level 1 BLAS do vector-vector operations, Level 2 BLAS do matrix-vector operations, and Level 3 BLAS do matrix-matrix operations. Because the BLAS are efficient, portable, and widely available, they're commonly used in the development of high quality linear algebra software, LINPACK and LAPACK for example.
A Fortran77 reference implementation of the BLAS is located in the blas directory of Netlib.
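To make the three levels concrete, the hedged C sketch below calls one routine from each level; it assumes the common Fortran convention of a trailing underscore and pass-by-reference arguments, and must be linked against a BLAS library (e.g. -lblas).

    /* One call per BLAS level: DDOT (Level 1), DGEMV (2), DGEMM (3). */
    #include <stdio.h>

    double ddot_(int *n, double *dx, int *incx, double *dy, int *incy);
    void dgemv_(char *trans, int *m, int *n, double *alpha, double *a, int *lda,
                double *x, int *incx, double *beta, double *y, int *incy);
    void dgemm_(char *transa, char *transb, int *m, int *n, int *k,
                double *alpha, double *a, int *lda, double *b, int *ldb,
                double *beta, double *c, int *ldc);

    int main(void) {
        int n = 2, one = 1;
        double alpha = 1.0, beta = 0.0;
        double a[4] = {1, 2, 3, 4};          /* 2 x 2, column major */
        double x[2] = {1, 1}, y[2], c[4];

        printf("x.x = %g\n", ddot_(&n, x, &one, x, &one));              /* vector-vector */
        dgemv_("N", &n, &n, &alpha, a, &n, x, &one, &beta, y, &one);    /* matrix-vector */
        dgemm_("N", "N", &n, &n, &n, &alpha, a, &n, a, &n, &beta, c, &n); /* matrix-matrix */
        printf("y = (%g, %g), c[0] = %g\n", y[0], y[1], c[0]);
        return 0;
    }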
3.2) Publications/references for the BLAS?
3.3) Is there a Quick Reference Guide to the BLAS available?
Yes, there is a PostScript version of the Quick Reference Guide to the BLAS available.
3.4) Are optimized BLAS libraries available?
YES! Machine-specific optimized BLAS libraries are available for a variety of computer architectures. These optimized BLAS libraries are provided by the computer vendor or by an independent software vendor (ISV). For further details, please contact your local vendor representative.
Alternatively, you can download ATLAS to automatically generate an optimized BLAS library for your architecture.
If all else fails, the user can download a Fortran77 reference implementation of the BLAS from netlib. However, keep in mind that this is a reference implementation and is not optimized.
ATLAS is an approach for the automatic generation and optimization of numerical software for processors with deep memory hierarchies and pipelined functional units. The production of such software for machines ranging from desktop workstations to embedded processors can be a tedious and time consuming task. ATLAS has been designed to automate much of this process. We concentrate our efforts on the widely used linear algebra kernels called the Basic Linear Algebra Subroutines (BLAS).
For further information, refer to the ATLAS webpage.
3.6) Where can I find vendor supplied BLAS?
BLAS Vendor List
Last updated: November 13, 1998
3.7) Where can I find the Intel BLAS for Linux?
The Intel BLAS for Linux are now available. Refer to the following URL: Intel BLAS for Linux.
3.8) Where can I find Java BLAS?
Yes, Java BLAS are available. Refer to the following URLs: Java LAPACK and JavaNumerics. The JavaNumerics webpage provides a focal point for information on numerical computing in Java.
3.9) Are prebuilt Fortran77 reference implementation BLAS libraries available?
Yes. However, it is assumed that you have a machine-specific optimized BLAS library already available on the architecture to which you are installing LAPACK. If this is not the case, you can download a prebuilt Fortran77 reference implementation BLAS library or compile the Fortran77 reference implementation source code of the BLAS from netlib.
Although a model implementation of the BLAS is available from netlib in the blas directory, it is not expected to perform as well as a specially tuned implementation on most high-performance computers; on some machines it may give much worse performance. It does, however, allow users to run LAPACK software on machines that do not offer any other implementation of the BLAS.