Talk: Dr. Renate Dohmen, 11 May 1999

Maria Cherry <maria@par.univie.ac.at>
Tue, 4 May 1999 12:20:28 +0200 (MET DST)



                            UNIVERSITY OF VIENNA
           INSTITUTE FOR SOFTWARE TECHNOLOGY AND PARALLEL SYSTEMS
                                jointly with
                                    VCPC
               EUROPEAN CENTRE FOR PARALLEL COMPUTING AT VIENNA

          FWF Project Special Research Programme F011 "AURORA"


          INVITATION TO A TALK IN THE AURORA COLLOQUIUM SERIES
                         
            
                                               
     Parallelization of the FP-LAPW code WIEN97 for message-passing systems

			  
		              Renate Dohmen
    Computing Centre Garching of the Max-Planck-Gesellschaft, Germany
                             
                             Jakob Pichlmeier
                     Silicon Graphics GmbH, Germany

		   
                  
                 TIME: Tuesday, 11 May 1999, 10:15 a.m. sharp
        VENUE: Institute for Software Technology and Parallel Systems
                   1090 Wien, Liechtensteinstrasse 22,
                          Seminar room, mezzanine



Abstract:


The code WIEN97, which is based on the full-potential linearized
augmented plane wave (FP-LAPW) method and serves to calculate the
electronic structure and magnetic properties of crystals and surfaces,
has been parallelized for message-passing systems and successfully
implemented on the Cray T3E. The numerical algorithm underlying the
code is dominated by the setup and solution of a generalized eigenvalue
problem with dense symmetric matrices of order 3000 to 10000. As in the
sequential code, the solution of the generalized eigenvalue problem and
several other linear algebra tasks could be handled by numerical
library routines, in particular from ScaLAPACK and from Cray's
scientific parallel library Scilib. The program parts that set up and
manipulate the symmetric, and thus in principle triangular, matrices
required special care to achieve good load balancing. The performance
of the parallel version on the T3E is quite satisfactory: a total
performance of about 17 GFlop/s can be obtained with 256 T3E
processors.
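
Why triangular matrices need care for load balancing can be seen in a
small sketch. It is an illustration only and not taken from WIEN97: the
abstract does not say which data distribution the code uses. The
function names (triangle_work, block_owner, cyclic_owner, imbalance)
and the choice of 16 processes are assumptions made for this example.
The sketch counts how many entries of the lower triangle each process
owns when whole columns are assigned either in contiguous blocks or
cyclically.

```python
def triangle_work(n, nprocs, owner):
    """Count lower-triangle entries (diagonal included) owned per process.

    owner(j) returns the process that holds column j (hypothetical helper).
    """
    work = [0] * nprocs
    for j in range(n):
        work[owner(j)] += n - j  # column j holds rows j .. n-1
    return work


def block_owner(n, nprocs):
    """Contiguous blocks of columns: process 0 gets the first block, etc."""
    width = -(-n // nprocs)  # ceiling division
    return lambda j: j // width


def cyclic_owner(nprocs):
    """Columns dealt out round-robin: column j goes to process j mod nprocs."""
    return lambda j: j % nprocs


def imbalance(work):
    """Ratio of the busiest process to the average; 1.0 is perfect balance."""
    return max(work) / (sum(work) / len(work))


n, nprocs = 3000, 16  # matrix order from the abstract; process count assumed
block = triangle_work(n, nprocs, block_owner(n, nprocs))
cyclic = triangle_work(n, nprocs, cyclic_owner(nprocs))

print(f"block  distribution imbalance: {imbalance(block):.3f}")
print(f"cyclic distribution imbalance: {imbalance(cyclic):.3f}")
```

With contiguous blocks, the process holding the first columns owns
close to twice the average number of entries, while the cyclic
assignment stays within about one percent of the average. This
one-dimensional column-cyclic layout is a simplification of the
two-dimensional block-cyclic layouts used by ScaLAPACK.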

The talk will describe the main parallelization strategies, discuss
characteristics of the implementation on the T3E, and present
performance measurements of the resulting parallel code that
demonstrate its good performance and scaling behaviour.