Commit 6e9fc361 authored by Victor Yu

Updated docs to include recent changes

* ELPA 2020.05.001
* ELPA no longer compiles with NAG Fortran (GNU extensions)
* C compiler becomes mandatory
* MPI-3 becomes mandatory
parent 805fff12
......@@ -2,6 +2,16 @@
## Not released
### ELSI interface
* C compiler and MPI-3 have become mandatory to build ELSI.
### ELPA
* Updated redistributed ELPA source code to version 2020.05.001, which supports
single precision calculations, autotuning of runtime parameters, and (NVIDIA)
GPU acceleration.
* The updated ELPA code cannot be compiled with the NAG Fortran compiler, due
to the use of GNU extensions in ELPA.
### PEXSI
* AAA method has become the default pole expansion method in PEXSI.
* Increased default number of poles from 20 to 30.
......@@ -10,6 +20,12 @@
### SLEPc-SIPs
* Interface compatible with PETSc 3.13 and SLEPc 3.13.
### Known issues
* ELPA AVX kernels cannot be built with the PGI compiler suite due to incomplete
support of AVX intrinsics in PGI.
* Depending on the choice of k-points, the complex PEXSI solver may randomly
fail at the inertia counting stage.
## v2.5.0 (February 2020)
### ELSI interface
......@@ -36,12 +52,6 @@
* Redistributed source code of BSEPACK 0.1.
* Added parallel BSE eigensolvers PDBSEIG and PZBSEIG.
### Known issues
* ELPA AVX kernels cannot be built with the PGI compiler suite due to incomplete
support of AVX intrinsics in PGI.
* Depending on the choice of k-points, the complex PEXSI solver may randomly
fail at the inertia counting stage.
## v2.4.1 (November 2019)
### ELSI interface
......
......@@ -7,7 +7,7 @@ SET(elsi_URL "http://elsi-interchange.org")
SET(elsi_EMAIL "elsi-team@duke.edu")
SET(elsi_LICENSE "BSD 3")
SET(elsi_DESCRIPTION "Electronic Structure Infrastructure")
SET(elsi_DATESTAMP "20200419")
SET(elsi_DATESTAMP "20200422")
### CMake modules ###
LIST(APPEND CMAKE_MODULE_PATH ${PROJECT_SOURCE_DIR}/cmake)
......@@ -51,7 +51,6 @@ OPTION(ENABLE_SIPS "Enable SLEPc-SIPs" OFF)
OPTION(ENABLE_MAGMA "Enable MAGMA" OFF)
OPTION(ENABLE_BSEPACK "Enable BSEPACK" OFF)
OPTION(USE_MPI_MODULE "Use MPI module instead of mpif.h in Fortran code" OFF)
OPTION(USE_MPI_IALLGATHER "Use non-blocking collective MPI functions" ON)
OPTION(USE_EXTERNAL_ELPA "Use external ELPA" OFF)
OPTION(USE_EXTERNAL_OMM "Use external libOMM" OFF)
OPTION(USE_EXTERNAL_PEXSI "Use external PEXSI" OFF)
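With `USE_MPI_IALLGATHER` removed and MPI-3 now mandatory, a project that builds ELSI may want to stop at configure time when the MPI library is too old. The following is a minimal sketch under stated assumptions, not something this commit adds: it needs CMake 3.10 or newer (later than the 3.0.2 minimum that ELSI itself asks for) because it relies on the `MPI_Fortran_VERSION` variable set by CMake's FindMPI module, and the project name `mpi3_check` is hypothetical.

```cmake
# Hypothetical standalone check; not part of the ELSI build system.
CMAKE_MINIMUM_REQUIRED(VERSION 3.10)
PROJECT(mpi3_check LANGUAGES Fortran)

# FindMPI (CMake >= 3.10) reports the MPI standard version per language.
FIND_PACKAGE(MPI REQUIRED COMPONENTS Fortran)

IF(NOT MPI_Fortran_VERSION)
  # Version detection can fail with some MPI wrappers; warn instead of aborting.
  MESSAGE(WARNING "Could not determine the MPI version; ELSI requires MPI-3")
ELSEIF(MPI_Fortran_VERSION VERSION_LESS "3.0")
  MESSAGE(FATAL_ERROR "MPI-3 is required, but MPI ${MPI_Fortran_VERSION} was found")
ENDIF()
```

Failing early in this way gives a clearer message than the compile or link error that would otherwise appear when NTPoly calls non-blocking collectives such as `MPI_Iallgatherv`.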
......
......@@ -7,7 +7,7 @@ The installation of ELSI makes use of the CMake software. Minimum requirements:
* CMake (3.0.2 or newer)
* Fortran compiler (Fortran 2003 compliant)
* C compiler (C99 compliant)
* MPI (MPI-3 recommended)
* MPI (MPI-3)
* BLAS, LAPACK, ScaLAPACK (with PBLAS and BLACS)
Enabling the PEXSI solver (highly recommended) requires:
......
......@@ -5,7 +5,7 @@
ELSI provides and enhances open-source software packages which solve or
circumvent eigenvalue problems in self-consistent field calculations based on
the Kohn-Sham density-functional theory. For more information, please visit the
[ELSI interchange](http://elsi-interchange.org) website.
[ELSI interchange](https://elsi-interchange.org) website.
## Installation
......@@ -15,12 +15,12 @@ The standard installation of ELSI requires:
* Fortran compiler (Fortran 2003)
* C compiler (C99)
* C++ compiler (C++11)
* MPI
* MPI (MPI-3)
* BLAS, LAPACK, ScaLAPACK
Installation with recent versions of Cray, GNU, IBM, Intel, NAG, and PGI
compilers has been tested. For a complete description of the installation
process, please refer to [`./INSTALL.md`](./INSTALL.md).
Installation with recent versions of Cray, GNU, IBM, Intel, and PGI compilers
has been tested. For a complete description of the installation process, please
refer to [`./INSTALL.md`](./INSTALL.md).
## More
......
No preview for this file type
......@@ -80,7 +80,7 @@ ELSI is a National Science Foundation Software Infrastructure for Sustained Inno
\chapter{Installation of ELSI}
\section{Prerequisites}
\label{sec:prereq}
The ELSI package contains the ELSI interface software as well as redistributed source code for the solver libraries ELPA (version 2016.11.001), libOMM (version 1.0.0), PEXSI (version 1.2.0), and NTPoly (version 2.4.0). The installation of ELSI makes use of the \href{http://cmake.org}{CMake} software. Minimum requirements include:
The ELSI package contains the ELSI interface software as well as redistributed source code for the solver libraries ELPA (version 2020.05.001), libOMM (version 1.0.0), PEXSI (version 1.2.0), and NTPoly (version 2.4.0). The installation of ELSI makes use of the \href{http://cmake.org}{CMake} software. Minimum requirements include:
\begin{Verbatim}[commandchars=\\\{\}]
\tcb{CMake} [minimum version 3.0; newer version recommended]
\tcb{Fortran compiler} [Fortran 2003 compliant]
......@@ -123,6 +123,8 @@ We recommend preparing configuration settings in a toolchain file that can be re
# Modify contents in \tcr{red} if necessary
set(CMAKE_Fortran_COMPILER \tcr{"mpiifort"} CACHE STRING "MPI Fortran compiler")
set(CMAKE_Fortran_FLAGS \tcr{"-O3 -ip -fp-model precise"} CACHE STRING "Fortran flags")
set(CMAKE_C_COMPILER \tcr{"mpiicc"} CACHE STRING "MPI C compiler")
set(CMAKE_C_FLAGS \tcr{"-O3 -ip -fp-model precise -std=c99"} CACHE STRING "C flags")
set(LIB_PATHS \tcr{"$ENV{MKLROOT}/lib/intel64"} CACHE STRING "External library paths")
set(LIBS \tcr{"mkl_scalapack_lp64 mkl_blacs_intelmpi_lp64 mkl_intel_lp64 mkl_sequential mkl_core"}
CACHE STRING "External libraries")
......@@ -246,8 +248,6 @@ The options accepted by the ELSI CMake build system are listed here in alphabeti
\hline
\texttt{USE\_EXTERNAL\_PEXSI} & boolean & OFF & Use external PEXSI (if PEXSI enabled)\\
\hline
\texttt{USE\_MPI\_IALLGATHER} & boolean & ON & Use non-blocking collective MPI functions\\
\hline
\texttt{USE\_MPI\_MODULE} & boolean & OFF & Use MPI module instead of ``\texttt{mpif.h}'' in Fortran code\\
\hline
\end{longtable}
......@@ -264,8 +264,6 @@ The options accepted by the ELSI CMake build system are listed here in alphabeti
(5) External libraries: ELSI redistributes source code of ELPA, libOMM, NTPoly, PEXSI, SuperLU\_DIST, and PT-SCOTCH libraries, which are built by default together with the ELSI interface. Experienced users are encouraged to link the ELSI interface against external, better optimized solver libraries. See Sec.~\ref{subsec:config_solvers} for more information.
(6) \texttt{USE\_MPI\_IALLGATHER}: NTPoly makes use of non-blocking collective MPI functions such as \texttt{MPI\_Iallgatherv} to overlap its computation and communication. If these MPI functions are not available in the user's MPI version, set \texttt{USE\_MPI\_IALLGATHER} to ``OFF''. Using this flag may lead to reduced performance. Upgrade MPI if possible.
\section{Importing ELSI into Third-Party Code Projects}
\label{sec:import}
\subsection{Linking against ELSI: CMake}
......@@ -893,8 +891,6 @@ In all the subroutines listed below, the first argument (input and output) is an
\hline
\texttt{mu\_mp\_order} & integer & 0 & Order of the Methfessel-Paxton broadening scheme. No effect if Methfessel-Paxton is not used.\\
\hline
\texttt{write\_unit} & integer & 6 & Deprecated. Use \api{elsi\_set\_output\_unit} instead.\\
\hline
\texttt{sing\_check} & integer & 0 & Deprecated. Use \api{elsi\_set\_illcond\_check} instead.\\
\hline
\texttt{sing\_tol} & real double & $10^{-5}$ & Deprecated. Use \api{elsi\_set\_illcond\_tol} instead.\\
......@@ -929,21 +925,17 @@ In all the subroutines listed below, the first argument (input and output) is an
\hline
\texttt{elpa\_n\_single} & integer & 0 & Number of SCF steps using single precision ELPA to solve standard eigenproblems. See remark 1.\\
\hline
\texttt{elpa\_gpu} & integer & 0 & If not 0, try to enable GPU-acceleration in ELPA. See remark 2.\\
\hline
\texttt{elpa\_autotune} & integer & 1 & If not 0, try to enable auto-tuning of runtime parameters in ELPA. See remark 3.\\
\texttt{elpa\_gpu} & integer & 0 & If not 0, enable GPU-acceleration in ELPA. See remark 2.\\
\hline
\texttt{elpa\_gpu\_kernels} & integer & 0 & Deprecated. No effect.\\
\texttt{elpa\_autotune} & integer & 1 & If not 0, enable auto-tuning of runtime parameters in ELPA. Not compatible with \texttt{illcond\_check}.\\
\hline
\end{tabular}
\textbf{Remarks}
(1) \texttt{elpa\_n\_single}: If single precision arithmetic is available in an externally compiled ELPA library, it may be enabled by setting \texttt{elpa\_n\_single} to a positive integer, then the standard eigenproblems in the first \texttt{elpa\_n\_single} SCF steps are solved with single precision. The transformations between generalized eigenproblem and the standard form are always performed with double precision. Although this keyword accelerates the solution of standard eigenproblems, the overall SCF convergence may be slower, depending on the physical system and the SCF settings used in the electronic structure code. This keyword is ignored if single precision calculations are not available, which is the case if the internal version of ELPA is used, or if an external ELPA has not been compiled with single precision support.
(1) \texttt{elpa\_n\_single}: If single precision arithmetic is available in an externally compiled ELPA library, it may be enabled by setting \texttt{elpa\_n\_single} to a positive integer, then the standard eigenproblems in the first \texttt{elpa\_n\_single} SCF steps are solved with single precision. The transformations between generalized eigenproblem and the standard form are always performed with double precision. Although this keyword accelerates the solution of standard eigenproblems, the overall SCF convergence may be slower, depending on the physical system and the SCF settings used in the electronic structure code.
(2) \texttt{elpa\_gpu}: If GPU-acceleration is available in an externally compiled ELPA library, it may be enabled by setting \texttt{elpa\_gpu} to a nonzero integer. This keyword is ignored if GPU-acceleration is not available, which is the case if the internal version of ELPA is used, or if an external ELPA has not been compiled with GPU support.
(3) \texttt{elpa\_autotune}: If auto-tuning of runtime parameters is available in an externally compiled ELPA library, it may be enabled by setting \texttt{elpa\_autotune} to a nonzero integer. This keyword is ignored if auto-tuning is not available, which is the case if the internal version of ELPA is used.
(2) \texttt{elpa\_gpu}: If ELPA is compiled with GPU support, GPU acceleration may be enabled by setting \texttt{elpa\_gpu} to a nonzero integer. This keyword is ignored if no GPU support is available.
\subsection{Customizing the libOMM Solver}
\label{subsec:setter_omm}
......@@ -987,9 +979,9 @@ In all the subroutines listed below, the first argument (input and output) is an
\hline
\multicolumn{1}{|l|}{\textbf{Argument}} & \multicolumn{1}{l|}{\textbf{Data Type}} & \multicolumn{1}{l|}{\textbf{Default}} & \multicolumn{1}{l|}{\textbf{Explanation}}\\
\hline
\texttt{pexsi\_method} & integer & 2 & 1: Contour integral~\cite{pexsi_lin_2013}. 2: Minimax rational approximation~\cite{pole_moussa_2016} (recommended). See remark 1.\\
\texttt{pexsi\_method} & integer & 3 & 1: Contour integral~\cite{pexsi_lin_2013}. 2: Minimax rational approximation~\cite{pole_moussa_2016}. 3: Adaptive Antoulas-Anderson (AAA)~\cite{aaa_nakatsukasa_2018}. See remark 1.\\
\hline
\texttt{pexsi\_n\_pole} & integer & 20 & Number of poles used by PEXSI. See remark 1.\\
\texttt{pexsi\_n\_pole} & integer & 30 & Number of poles used by PEXSI. See remark 1.\\
\hline
\texttt{pexsi\_n\_mu} & integer & 2 & Number of mu points used by PEXSI. See remark 2.\\
\hline
......@@ -1013,9 +1005,9 @@ In all the subroutines listed below, the first argument (input and output) is an
\textbf{Remarks}
(1) When using the pole expansion method based on contour integral, allowed numbers for \texttt{pexsi\_n\_pole} are: 10, 20, 30, ..., 120. 60 to 100 poles are typically needed to get an accuracy that is comparable with the result obtained from diagonalization. The electronic entropy is available with this method, and may be accessed via \api{elsi\_get\_entropy}.
(1) When using the pole expansion method based on contour integral, allowed numbers for \texttt{pexsi\_n\_pole} are: 10, 20, 30, ..., 110, 120. 60 to 100 poles are typically needed to get an accuracy that is comparable with the result obtained from diagonalization. When using the minimax rational approximation or the Adaptive Antoulas-Anderson method, allowed numbers for \texttt{pexsi\_n\_pole} are: 10, 15, 20, ..., 35, 40. 20 to 30 poles are typically needed to get an accuracy that is comparable with the result obtained from diagonalization. PEXSI outputs an error message when it detects an unsupported choice of number of poles.
When using the pole expansion method based on minimax rational approximation, allowed numbers for \texttt{pexsi\_n\_pole} are: 10, 15, 20, ..., 40. 20 to 30 poles are typically needed to get an accuracy that is comparable with the result obtained from diagonalization. PEXSI outputs an error message when it detects an unsupported choice of number of poles. The electronic entropy cannot be computed with this method.
The electronic entropy can only be computed with the contour integral method and the Adaptive Antoulas-Anderson method. It may be accessed via \api{elsi\_get\_entropy}.
(2) PEXSI determines the chemical potential by performing Fermi operator expansion at several chemical potential values (referred to as ``points'') in an SCF step, then interpolating the results at all points to the final answer. The \texttt{pexsi\_n\_mu} parameter controls the number of chemical potential ``points'' to be evaluated. Two points followed by a simple linear interpolation often yield reasonable results.
......@@ -1096,10 +1088,6 @@ end do
\hline
\texttt{sips\_n\_slice} & integer & 1 & Number of slices. See remark 3.\\
\hline
\texttt{sips\_lower} & real double & -2.0 & Deprecated. Use \api{elsi\_set\_sips\_ev\_min} instead.\\
\hline
\texttt{sips\_upper} & real double & 2.0 & Deprecated. Use \api{elsi\_set\_sips\_ev\_max} instead.\\
\hline
\end{tabular}
\textbf{Remarks}
......@@ -1226,7 +1214,7 @@ In all the subroutines listed below, the first argument (input and output) is an
(3) In general, the energy-weighted density matrix is only needed in a late stage of an SCF cycle to evaluate forces. It is, therefore, not calculated when any of the density matrix solver interfaces is called. When the energy-weighted density matrix is actually needed, it can be requested by calling \api{elsi\_get\_edm\_\{real$\vert$complex\}\{\_sparse\}}. These subroutines have the requirement that the corresponding \api{elsi\_dm} subroutine must have been invoked. For instance, \api{elsi\_get\_edm\_real\_sparse} only makes sense if \api{elsi\_dm\_real\_sparse} has been successfully executed.
(4) When using \api{elsi\_dm\_\{real$\vert$complex\}\{\_sparse\}} with an eigensolver, ELSI internally computes and stores the eigenvalues, eigenvectors, and occupation numbers, which are the ingredients to assemble the density matrix. These quantities may be retrieved by calling \api{elsi\_get\_eval}, \api{elsi\_get\_evec\_\{real$\vert$complex\}}, and \api{elsi\_get\_occ}. The dimension of \texttt{eval} and \texttt{occ} should be equal to the value of \texttt{n\_states} set in \api{elsi\_init}. Even with \api{elsi\_dm\_\{real$\vert$complex\}\_sparse}, the eigenvectors are returned in a dense format (``\texttt{BLACS\_DENSE}''), as they are in general not sparse. The size of \texttt{evec\_\{real$\vert$complex\}} should always correspond to a global array of size \texttt{n\_basis} by \texttt{n\_basis}, regardless of the value of \texttt{n\_states}.
(4) When using \api{elsi\_dm\_\{real$\vert$complex\}\{\_sparse\}} with an eigensolver, ELSI internally computes and stores the eigenvalues, eigenvectors, and occupation numbers. These quantities may be retrieved by calling \api{elsi\_get\_eval}, \api{elsi\_get\_evec\_\{real$\vert$complex\}}, and \api{elsi\_get\_occ}. The dimension of \texttt{eval} and \texttt{occ} should be equal to the value of \texttt{n\_states} set in \api{elsi\_init}. Even with \api{elsi\_dm\_\{real$\vert$complex\}\_sparse}, the eigenvectors are returned in a dense format (``\texttt{BLACS\_DENSE}''), as they are in general not sparse. The size of \texttt{evec\_\{real$\vert$complex\}} should always correspond to a global array of size \texttt{n\_basis} by \texttt{n\_basis}, regardless of the value of \texttt{n\_states}.
\subsection{Getting Results from the PEXSI Solver}
\label{subsec:getter_pexsi}
......@@ -1971,6 +1959,6 @@ OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
\end{Verbatim}
\end{tcolorbox}
The source code of ELPA 2016.11.001 (LGPL3), libOMM 1.0.0 (BSD2), NTPoly 2.4.0 (MIT), PEXSI 1.2.0 (BSD3), PT-SCOTCH 6.0.0 (CeCILL-C), SuperLU\_DIST 6.2.0 (BSD3), and BSEPACK 0.1 (BSD3) is redistributed through this version of ELSI. The individual license of each library can be found in the corresponding subfolder.
The source code of ELPA 2020.05.001 (LGPL3), libOMM 1.0.0 (BSD2), NTPoly 2.4.0 (MIT), PEXSI 1.2.0 (BSD3), PT-SCOTCH 6.0.0 (CeCILL-C), SuperLU\_DIST 6.2.0 (BSD3), and BSEPACK 0.1 (BSD3) is redistributed through this version of ELSI. The individual license of each library can be found in the corresponding subfolder.
\end{document}
......@@ -18,6 +18,7 @@ LIST(APPEND ntpoly_src
src/MatrixMapsModule.f90
src/MatrixMarketModule.f90
src/MatrixMemoryPoolModule.f90
src/MatrixReduceModule.f90
src/PermutationModule.f90
src/PMatrixMemoryPoolModule.f90
src/PolynomialSolversModule.f90
......@@ -42,12 +43,6 @@ ELSE()
LIST(APPEND ntpoly_src src/NTMPIFH.f90)
ENDIF()
IF(USE_MPI_IALLGATHER)
LIST(APPEND ntpoly_src src/MatrixReduceModule.f90)
ELSE()
LIST(APPEND ntpoly_src src/MatrixReduceModuleNoIallgather.f90)
ENDIF()
ADD_LIBRARY(NTPoly ${ntpoly_src})
TARGET_LINK_LIBRARIES(NTPoly PRIVATE ${LIBS})
......
This diff is collapsed.
### Generic Intel ###
SET(CMAKE_Fortran_COMPILER "mpiifort" CACHE STRING "MPI Fortran compiler")
SET(CMAKE_Fortran_FLAGS "-O3 -ip -fp-model precise" CACHE STRING "Fortran flags")
SET(ENABLE_TESTS ON CACHE BOOL "Enable Fortran tests")
SET(LIB_PATHS "$ENV{MKLROOT}/lib/intel64" CACHE STRING "External library paths")
SET(LIBS "mkl_scalapack_lp64 mkl_blacs_intelmpi_lp64 mkl_intel_lp64 mkl_sequential mkl_core" CACHE STRING "External libraries")
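Because this commit makes a C compiler mandatory, a toolchain file of this kind would presumably also need C settings. The sketch below combines the generic Intel toolchain above with the two C lines from the manual example changed in the same commit (`mpiicc` and its flags). Whether the toolchain files shipped with ELSI already contain these lines is not visible on this page, so treat it as an illustration rather than the project's actual file.

```cmake
### Generic Intel, with C compiler settings (illustrative sketch) ###

SET(CMAKE_Fortran_COMPILER "mpiifort" CACHE STRING "MPI Fortran compiler")
SET(CMAKE_Fortran_FLAGS "-O3 -ip -fp-model precise" CACHE STRING "Fortran flags")

# C compiler and flags, copied from the manual's toolchain example in this commit
SET(CMAKE_C_COMPILER "mpiicc" CACHE STRING "MPI C compiler")
SET(CMAKE_C_FLAGS "-O3 -ip -fp-model precise -std=c99" CACHE STRING "C flags")

SET(ENABLE_TESTS ON CACHE BOOL "Enable Fortran tests")

# MKL supplies BLAS, LAPACK, and ScaLAPACK (with BLACS)
SET(LIB_PATHS "$ENV{MKLROOT}/lib/intel64" CACHE STRING "External library paths")
SET(LIBS "mkl_scalapack_lp64 mkl_blacs_intelmpi_lp64 mkl_intel_lp64 mkl_sequential mkl_core" CACHE STRING "External libraries")
```

Such a file is typically passed to CMake at configure time through `-DCMAKE_TOOLCHAIN_FILE=<path to this file>`.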