parallelization-howto
====== Parallelizing a serial problem (static load balancing) ======
An efficient parallel algorithm requires that all steps of the solution procedure are parallelized. In a typical problem this involves the assembly of the characteristic problem, its solution, and postprocessing.

===== Parallel assembly =====

Provided that the partitioning of the problem is given, each processor can assemble its local contribution to the global characteristic equation.
This part is largely identical to the serial assembly process, which allows the assembly code to be shared between the serial and parallel parts of the algorithm. After finishing the assembly of local contributions, the contributions for shared nodes have to be exchanged between neighboring partitions. For this purpose, so-called send and receive maps are set up for each remote partition. The send map contains the shared node numbers for which an exchange, in terms of sending local contributions to a particular remote partition, is required, while the receive map contains the shared node numbers for which an exchange, in terms of receiving contributions from a particular remote partition, is expected. Each partition thus packs its local contributions according to the send maps, exchanges them with the remote partitions, and adds the received contributions according to the receive maps.
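The role of the send and receive maps can be sketched independently of OOFEM by simulating two partitions in a single process. The names and data layout below are purely illustrative, not part of the OOFEM API; the point is that packing selects shared entries by the send map and unpacking //adds// the received values at the positions given by the receive map:

<code cpp>
#include <cstddef>
#include <vector>

// Each partition owns a local contribution vector indexed by local equation
// number. sendMap lists the local indices whose values must be sent to the
// remote partition; recvMap lists the local indices that receive remote
// contributions.
struct Partition {
    std::vector<double> local;   // local contribution vector
    std::vector<int> sendMap;    // local indices of shared equations to send
    std::vector<int> recvMap;    // local indices of shared equations to receive
};

// Pack the shared entries of p.local selected by the send map into a buffer.
std::vector<double> packShared(const Partition &p) {
    std::vector<double> buf;
    for (int idx : p.sendMap) {
        buf.push_back(p.local[idx]);
    }
    return buf;
}

// Unpack a received buffer: remote contributions are *added* to local ones,
// in the order given by the receive map.
void unpackShared(Partition &p, const std::vector<double> &buf) {
    for (std::size_t k = 0; k < buf.size(); ++k) {
        p.local[p.recvMap[k]] += buf[k];
    }
}
</code>

After both partitions pack, exchange, and unpack, every shared equation holds the sum of the local and remote contributions, which is exactly the invariant the real exchange must establish.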
The user is required to provide the following methods:
  - A **packing method** for wrapping the local contributions (for example, load vector contributions) according to the send map into a communication buffer. The packed contributions are then sent to the corresponding (remote) partitions, where they are received and unwrapped by an **unpacking method** according to the receive maps.
  - As packing and unpacking are problem specific, these methods are supplied by the user and passed to the communicator as method pointers (see the example below).

When assembling sparse matrices in parallel, no specific action is needed, provided that a suitable sparse matrix representation is used (SMT_PetscMtrx, for example), as the exchange of matrix contributions is handled by the sparse matrix representation itself.
===== Parallel solution and postprocessing =====

===== Example =====
First, we have to create instances of the //CommunicatorBuff// and //ProblemCommunicator// classes. This is typically done when initializing the engineering model:
<code cpp>
IRResultType
MyEngngModel :: initializeFrom(InputRecord *ir)
{
    // problem specific part here

    // initialize communicator if in parallel mode
#ifdef __PARALLEL_MODE
    // first create send and receive communication buffers; here we use fixed-length
    // static buffers (CBT_static) (the size needs to be determined)
    commBuff = new CommunicatorBuff( this->giveNumberOfProcesses(), CBT_static );
    // and create the communicator; the trailing arguments (rank, number of
    // processes, communication mode) are sketched here and may differ
    // between OOFEM versions
    communicator = new ProblemCommunicator( this, commBuff, this->giveRank(),
                                            this->giveNumberOfProcesses(),
                                            ProblemCommMode__NODE_CUT );
#endif

    return IRRT_OK;
}
</code>

The following method illustrates how the assembly of a vector (here the load vector) is changed to support parallel assembly. It uses the communicator created above to pack, exchange, and unpack the contributions of shared nodes (the exact signatures of the assembly helpers depend on the OOFEM version):

<code cpp>
void
MyEngngModel :: assembleLoadVector(FloatArray &loadVector,
                                   Domain *sourceDomain, TimeStep *tStep)
{
    EModelDefaultEquationNumbering en;

    // resize load vector first to accommodate all local entries
    loadVector.resize( this->giveNumberOfEquations(en) );
    // prepare for assembly
    loadVector.zero();

    // assemble element part of load vector
    this->assembleVectorFromElements( loadVector, tStep, ExternalForcesVector,
                                      VM_Total, en, sourceDomain );
    // assemble nodal part
    this->assembleVectorFromDofManagers( loadVector, tStep, ExternalForcesVector,
                                         VM_Total, en, sourceDomain );

#ifdef __PARALLEL_MODE
    // parallel section (compiled only when parallel support is configured)
    // pack all data; pass a pointer to the engngModel, the local vector, and the
    // packing method -- this calls the pack method once for each remote partition
    communicator->packAllData( this, & loadVector, & MyEngngModel :: packLoad );

    // send the packed data to remote partitions and receive their data
    communicator->initExchange( LoadExchangeTag ); // user-chosen message tag

    // unpack all received data: pointer to engngModel, local vector, unpacking method
    communicator->unpackAllData( this, & loadVector, & MyEngngModel :: unpackLoad );
    // finish the exchange
    communicator->finishExchange();
#endif
}
</code>

The following code illustrates the implementation of the pack and unpack methods. Each method takes two parameters: the vector holding the source/destination values and a //ProcessCommunicator// instance, which identifies the corresponding remote partition and provides the communication buffers.

<code cpp>
int
MyEngngModel :: packLoad(FloatArray *src, ProcessCommunicator &processComm)
{
    /** This method wraps the shared node entries of the given vector into the
        communication buffer. The buffer is an attribute of the given
        ProcessCommunicator, which also uniquely determines the remote
        partition receiving the data. */
    int result = 1;
    int i, size;
    int j, ndofs, eqNum;
    Domain *domain = this->giveDomain(1);
    const IntArray *toSendMap = processComm.giveToSendMap();
    ProcessCommunicatorBuff *pcbuff = processComm.giveProcessCommunicatorBuff();
    DofManager *dman;
    Dof *jdof;

    size = toSendMap->giveSize();
    for ( i = 1; i <= size; i++ ) { // loop over send map entries (shared nodes)
        dman = domain->giveDofManager( toSendMap->at(i) );
        ndofs = dman->giveNumberOfDofs();
        for ( j = 1; j <= ndofs; j++ ) { // loop over shared node DOFs
            jdof = dman->giveDof(j);
            if ( jdof->isPrimaryDof() && ( eqNum = jdof->giveEquationNumber() ) ) {
                // if the DOF is primary (not linked) and no BC is applied
                // (an equation number is assigned), pack its corresponding
                // value in the given vector into the buffer
                result &= pcbuff->packDouble( src->at(eqNum) );
            }
        }
    }

    return result;
}
</code>

<code cpp>
int
MyEngngModel :: unpackLoad(FloatArray *dest, ProcessCommunicator &processComm)
{
    /** This method unpacks the received shared node entries into the given
        vector. The receive buffer is an attribute of the given
        ProcessCommunicator, which also uniquely determines the remote
        partition sending the data. */
    int result = 1;
    int i, size;
    int j, ndofs, eqNum;
    Domain *domain = this->giveDomain(1);
    dofManagerParallelMode dofmanmode;
    const IntArray *toRecvMap = processComm.giveToRecvMap();
    ProcessCommunicatorBuff *pcbuff = processComm.giveProcessCommunicatorBuff();
    DofManager *dman;
    Dof *jdof;
    double value;

    size = toRecvMap->giveSize();
    for ( i = 1; i <= size; i++ ) { // loop over receive map entries (shared nodes)
        dman = domain->giveDofManager( toRecvMap->at(i) );
        ndofs = dman->giveNumberOfDofs();
        dofmanmode = dman->giveParallelMode();
        for ( j = 1; j <= ndofs; j++ ) { // loop over shared node DOFs
            jdof = dman->giveDof(j);
            if ( jdof->isPrimaryDof() && ( eqNum = jdof->giveEquationNumber() ) ) {
                // if the DOF is primary (not linked) and no BC is applied
                // (an equation number is assigned), update the corresponding
                // vector value from the buffer
                result &= pcbuff->unpackDouble(value);
                if ( dofmanmode == DofManager_shared ) {
                    // the remote contribution is added to the local one
                    dest->at(eqNum) += value;
                } else {
                    _error3("unpackLoad: unsupported parallel mode of dofman %d (mode %d)",
                            toRecvMap->at(i), dofmanmode);
                }
            }
        }
    }

    return result;
}
</code>

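The sequential pack/unpack protocol used by the //ProcessCommunicatorBuff// above can be mimicked with a minimal cursor-based buffer. This is an illustrative stand-in, not the OOFEM class; the method names are chosen to mirror the calls in the listings, and the ''int'' return convention matches the ''result'' flags above:

<code cpp>
#include <cstddef>
#include <vector>

// Minimal stand-in for a communication buffer with sequential pack/unpack.
// Values are unpacked in exactly the order they were packed, which is why
// the send and receive maps on the two sides must list the shared nodes
// in a mutually consistent order.
class MiniBuff {
    std::vector<double> data;
    std::size_t readPos = 0;

public:
    // returns 1 on success, matching the int result convention above
    int packDouble(double v) {
        data.push_back(v);
        return 1;
    }

    // returns 0 when there is nothing left to unpack
    int unpackDouble(double &v) {
        if (readPos >= data.size()) {
            return 0;
        }
        v = data[readPos++];
        return 1;
    }
};
</code>

Because unpacking is strictly sequential, skipping a DOF on one side (for example, one with a boundary condition) while packing it on the other would silently shift every subsequent value; the symmetric ''isPrimaryDof''/equation-number test in both methods above prevents exactly this.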
Provided that a suitable sparse matrix representation is used (SMT_PetscMtrx, for example), the solution step itself needs no modification; the sparse matrix and the numerical method perform the parallel solution internally. The complete solution procedure may then look as follows (a sketch; the individual calls depend on the OOFEM version):
<code cpp>
void
MyEngngModel :: solveYourselfAt(TimeStep *tStep)
{
    EModelDefaultEquationNumbering en;

    // create components of characteristic equation
    if ( !stiffnessMatrix ) {
        stiffnessMatrix = CreateUsrDefSparseMtrx(sparseMtrxType); // e.g. SMT_PetscMtrx
    }
    loadVector.resize( this->giveNumberOfEquations(en) );
    displacementVector.resize( this->giveNumberOfEquations(en) );

    // initialize profile of stiffness matrix
    stiffnessMatrix->buildInternalStructure(this, 1, en);

    // assemble stiffness matrix and load vector
    this->assemble( stiffnessMatrix, tStep, StiffnessMatrix,
                    en, this->giveDomain(1) );
    this->assembleLoadVector( loadVector, this->giveDomain(1), tStep );

    // get numerical method to solve the problem
    // (assumed here to be a sparse linear system solver)
    NumericalMethod *nMethod = this->giveNumericalMethod(tStep);

    // solve the problem (yes, this solves the linear system in parallel!)
    nMethod->solve(stiffnessMatrix, & loadVector, & displacementVector);

    // postprocess results: update nodes and elements, compute strains, stresses, etc.
    this->updateYourself(tStep);
    // and we are done!
}
</code>
====== Further reading ======
parallelization-howto.1269095672.txt.gz · Last modified: 2010/03/20 15:34 by bp