Topic: Parmetis

Dear Borek,

I tried to partition a mesh with Parmetis into four parts. The result is very suspicious because of the large number of shared nodes. I attached the input and the output. Could you please check whether it is correct?

The other thing is when I try to partition a mesh with mixed element types (e.g., 3d hex elements and 1d truss elements), the partition seems wrong. I am not sure whether Parmetis can work in this case. Thank you very much.

Happy New Year!
Xuejian

Post's attachments

singlewall-parmetis2 106.69 kb, 6 downloads since 2011-12-26 


Re: Parmetis

Dear Xuejian,

this is a question better answered by the METIS developers. I suggest carefully checking the graph you set up before calling METIS. I have to say that the partitioning METIS creates is sometimes not "perfect", but in general we are quite happy with it, and it has never produced anything really awful.
METIS can work with mixed meshes; in fact, it operates on graphs rather than on meshes, and if you set up the graph correctly, it will work.
I am not sure whether you create the graph on your own. If I remember correctly, the METIS distribution includes a program that can convert a mesh into a graph. I am not sure about its capabilities, since we set up graphs directly, but you can try it.
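
To illustrate what "setting up the graph" can look like for a mixed mesh, here is a minimal Python sketch. It is not the converter shipped with METIS and not OOFEM's own code; the function name `nodal_graph_csr` and the choice to connect every pair of nodes that share an element are assumptions made for illustration. It builds the adjacency in the compressed (xadj, adjncy) layout that graph partitioners commonly expect, and it treats a hex element and a truss element the same way.

```python
def nodal_graph_csr(num_nodes, elements):
    """Build a nodal graph in CSR form (xadj, adjncy) from element connectivity.

    elements: list of 0-based node-index tuples; element types may be mixed.
    Assumption: two nodes are adjacent if they appear together in any element.
    """
    neighbours = [set() for _ in range(num_nodes)]
    for conn in elements:
        for a in conn:
            for b in conn:
                if a != b:
                    neighbours[a].add(b)

    xadj, adjncy = [0], []
    for node in range(num_nodes):
        adjncy.extend(sorted(neighbours[node]))
        xadj.append(len(adjncy))  # end offset of this node's neighbour list
    return xadj, adjncy


# One 8-node hex (nodes 0..7) plus one 2-node truss (nodes 7 and 8).
elements = [(0, 1, 2, 3, 4, 5, 6, 7), (7, 8)]
xadj, adjncy = nodal_graph_csr(9, elements)
print(adjncy[xadj[8]:xadj[9]])  # neighbours of node 8
```

Because the graph is built purely from connectivity, the truss node 8 ends up linked to node 7 of the hex, so a partitioner sees one connected graph regardless of element type.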

Re: Parmetis

Dear Borek,

Thank you very much. In the end, I used ParMETIS and post-processed the results into the partitioned OOFEM input files. It seems ParMETIS cannot partition meshes with mixed element types, but that does not bother me much. However, I got the following error when I ran poofem. Could you help explain what is wrong? Thank you.

Xuejian

____________________________________________________
           OOFEM - Finite Element Solver           
        Copyright (C) 1994-2010 Borek Patzak       
____________________________________________________
____________________________________________________
           OOFEM - Finite Element Solver           
        Copyright (C) 1994-2010 Borek Patzak       
____________________________________________________
____________________________________________________
           OOFEM - Finite Element Solver           
        Copyright (C) 1994-2010 Borek Patzak       
____________________________________________________
____________________________________________________
           OOFEM - Finite Element Solver           
        Copyright (C) 1994-2010 Borek Patzak       
____________________________________________________
Total number of solution steps 31
Total number of solution steps 31
Instanciating domain                1
Instanciating domain                1
Total number of solution steps 31
Instanciating domain                1
Instanciated nodes & sides        990
Total number of solution steps 31
Instanciating domain                1
Instanciated nodes & sides       1041
Instanciated nodes & sides       1051
Instanciated nodes & sides        992
Instanciated elements            1149
Instanciated cross sections         5
Instanciated materials              2
Instanciated BCs                    3
Instanciated ICs                    0
Instanciated load-time fncts        2
Consistency check                  ok
Renumbering ... Instanciated elements            1126
Instanciated cross sections         5
Instanciated materials              2
Instanciated BCs                    3
Instanciated ICs                    0
Instanciated load-time fncts        2
Consistency check                  ok
Renumbering ... Instanciated elements            1176
Instanciated cross sections         5
Instanciated materials              2
Instanciated BCs                    3
Instanciated ICs                    0
Instanciated load-time fncts        2
Consistency check                  ok
Renumbering ... done in 0.01s
Nominal profile 133834 (old) 74459 (new)
done in 0.01s
Nominal profile 136788 (old) 78037 (new)
done in 0.01s
Nominal profile 134716 (old) 76516 (new)
Instanciated elements            1125
Instanciated cross sections         5
Instanciated materials              2
Instanciated BCs                    3
Instanciated ICs                    0
Instanciated load-time fncts        2
Consistency check                  ok
Renumbering ... done in 0.01s
Nominal profile 141907 (old) 71506 (new)
_______________________________________________________
Error: (/home/01900/liuxueji/oofem-2.1/src/oofemlib/petscordering.C:399)
[3] PetscNatural2GlobalOrdering :: init: invalid shared dofman received, globnum 2755

_______________________________________________________
stack trace:
  ./poofem [0x80640b]
  ./poofem [0x7cb5b6]
  ./poofem [0x72934e]
  ./poofem [0x7261b7]
  ./poofem [0x5780f8]
  ./poofem [0x7258f7]
  ./poofem [0x5780a1]
  ./poofem [0x44d83d]
  /lib64/libc.so.6 : __libc_start_main()+0xf4
  ./poofem [0x44cff9]
Total 1 error(s) and 0 warning(s) reported
oofem exit code 1
rank 3 in job 13  login1.longhorn_55655   caused collective abort of all ranks
  exit status of rank 3: killed by signal 9

Re: Parmetis

Without digging too deep into this issue, my guess would be that a node that is shared on one partition isn't on the other.

Re: Parmetis

That's strange. I checked a few nodes at random and they seem fine. Does the error tell which node has the problem? Thank you very much.

Xuejian

Re: Parmetis

[3] PetscNatural2GlobalOrdering :: init: invalid shared dofman received, globnum 2755

From this, process 3 is the one that is missing the node that some other process is claiming to be there.
So, check all the other input files for node number 2755 and check to see if any of them claims it is shared with process 3.
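
A check like this can be automated. The following Python sketch is only an illustration under stated assumptions: it assumes each node record starts with "node <global number>", that shared nodes carry "shared partitions <count> <ranks...>", and that ranks are numbered from 0. The helper names `shared_nodes` and `find_mismatches` are made up here and are not part of OOFEM.

```python
def shared_nodes(text):
    """Map node number -> set of partner ranks for one partition file's text."""
    result = {}
    for line in text.splitlines():
        tokens = line.lower().split()
        if len(tokens) < 2 or tokens[0] != "node":
            continue
        if "shared" not in tokens or "partitions" not in tokens:
            continue
        i = tokens.index("partitions")
        count = int(tokens[i + 1])
        result[int(tokens[1])] = {int(t) for t in tokens[i + 2:i + 2 + count]}
    return result


def find_mismatches(partition_texts):
    """partition_texts[rank] holds the text of that rank's input file.

    Returns (claiming_rank, node, partner_rank) triples where the partner
    does not list the node as shared with the claiming rank.
    """
    tables = [shared_nodes(t) for t in partition_texts]
    bad = []
    for rank, table in enumerate(tables):
        for node, partners in sorted(table.items()):
            for p in sorted(partners):
                if p >= len(tables) or rank not in tables[p].get(node, set()):
                    bad.append((rank, node, p))
    return bad


# Made-up two-rank example: rank 0 claims node 2755 is shared with rank 1,
# but rank 1's file does not mention it.
files = [
    "node 2755 coords 3 0. 0. 0. shared partitions 1 1",
    "node 2754 coords 3 1. 0. 0.",
]
print(find_mismatches(files))  # [(0, 2755, 1)]
```

In practice you would read the real partition files into `partition_texts` in rank order; an empty result means every shared-node claim is mirrored on the partner side.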

Re: Parmetis

Hi Mikael,

It seems node 2755 is fine to me. Could you please check the files? Thank you very much.

Xuejian

Post's attachments

singlewall2.tar.gz 160.84 kb, 3 downloads since 2012-02-03 


Re: Parmetis

OK, so it wasn't the problem I thought it was.
Please run it with log info printed (i.e., use the oofem run flag "-l 3").
But before that, activate all the conditional code in PetscNatural2GlobalOrdering::init
(there are a lot of "#if 0" blocks; change them to "#if 1").
This will probably generate a lot of output for such a big problem, but it will give you some more useful details.
In particular, this line:

OOFEM_LOG_INFO("[%d]PetscN2G:: init: Sending localShared node %d[%d] to proc %d\n",

Re: Parmetis

Hi Mikael,

Thank you. I found a problem in my post-processing and have fixed it. The analysis now runs, but it looks as if there are not enough boundary conditions, and the displacement is almost infinite. Could you please help me look into it again?

Xuejian

Re: Parmetis

You must use PETSc for the parallel runs.
Your input files specify "lstype 1 smtype 4"
You need to use "lstype 3 smtype 7"
(In fact, petsc's solvers are generally faster than anything else, so even for sequential runs, you should always use petsc)

To use an appropriate solver with petsc, you need to supply additional flags when you run. I suggest

-ksp_type preonly -pc_type cholesky -pc_factor_mat_solver_package spooles

Re: Parmetis

Thank you very much, Mikael!

Xuejian

Re: Parmetis

Hi Mikael,

I noticed that all the strain and stress values for the cubic elements in the vtu/vtk files are zero after the parallel run, but the output is normal for the sequential analysis. Could you please check this for me? Another thing I am unsure about: for shared nodes that carry nodal loads, should the loads be specified in every partition file that contains the node, or just once in any one of the files? Thank you.

Xuejian

Re: Parmetis

I will first need to know a few more details:

1. What sort of analysis are you doing (i.e., which engineering model)?
2. Is it just the stresses and strains that are wrong? What about the displacements (i.e., do you even get the right answer from the parallel problem)?
I can't see how the stresses or strains could be incorrect if the displacements are correct.

As for nodal loads, add them on every partition (total value on all partitions, don't split it up).

Re: Parmetis

The displacement seems correct, but IST_StressTensor and IST_StrainTensor are all zero. The output of these two variables seems correct when I run the analysis on only one processor. The input files are the ones you checked for me before. Thanks,

Xuejian

Re: Parmetis

When the IST_* values are written to the output files, the "giveIPValue" functions are called on each element. For IST_StressTensor, the call is redirected to the cross section, which in turn passes it on to the corresponding material. In short, the code that finally computes the internal state should be
StructuralMaterial::giveIPValue
for which the equilibrated value is used.

Could you add some debug messages to check that this function is even called? structuralmaterial.C, line 2558, is where it should be set.
If that code is called and the answer is all zeros, then I would guess that the material status isn't updated.

Re: Parmetis

If you produce a minimal example (say, 2 elements in parallel) and upload it, I could try it out; that would probably be faster.

Re: Parmetis

Dear Mikael,

I will try to do that. In the meantime, I am wondering whether it is caused by the microplane_m4 material model I am using. Could you please check it? I have another question, about ParaView: it does not seem to support linear elements. Is there any way to view the stress/strain of the truss/beam elements?

Thanks a lot,
Xuejian

Re: Parmetis

I'm not very familiar with the specifics of most of the elements or material routines in OOFEM, so if you want me to run some analysis, please add a suitable input file.
Creating one with suitable material parameters, elements, etc. is quite a lot of work.

As for trusses and beams, I can't say right off, and I don't have time to check it right now.

Re: Parmetis

Hi Mikael,

Could you see what the problem is when I try to add the runtime flags to petsc? Thank you.

Xuejian

TACC: Setting memory limits for job 2424853 to unlimited KB
TACC: Dumping job script:
--------------------------------------------------------------------------------
#!/bin/bash

#$ -V
#$ -cwd
#$ -N wall
#$ -j y
#$ -o wall1p.o$JOB_ID
#$ -e wall1p.e$JOB_ID
#$ -pe 1way 16
#$ -q long
#$ -P hpc
#$ -l h_rt=24:00:00
ibrun -n 1 -o 0 ./poofem -f singlewall2_np1.in -ksp_type preonly -pc_type cholesky -log_summary
wait
--------------------------------------------------------------------------------
TACC: Done.
TACC: Starting up job 2424853
TACC: Setting up parallel environment for MVAPICH ssh-based mpirun.
TACC: Setup complete. Running job script.
TACC: starting parallel tasks...
____________________________________________________
           OOFEM - Finite Element Solver           
        Copyright (C) 1994-2010 Borek Patzak       
____________________________________________________
Total number of solution steps 101
Instanciating domain                1
Instanciated nodes & sides       3822
Instanciated elements            4576
Instanciated cross sections         5
Instanciated materials              2
Instanciated BCs                    3
Instanciated ICs                    0
Instanciated load-time fncts        2
Consistency check                  ok
Renumbering ... done in 0.30s
Nominal profile 1166346 (old) 519429 (new)
Assembling load
Assembling tangent stiffness matrix
NonLinearStatic info: user time consumed by assembly: 1.07s
Solving [step number     1.0]
Time       Iteration       ForceError      DisplError
__________________________________________________________
[0]PETSC ERROR: --------------------- Error Message ------------------------------------
[0]PETSC ERROR: No support for this operation for this object type!
[0]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc CHOLESKY!
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 0, Thu Sep  8 12:06:53 CDT 2011
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[0]PETSC ERROR: See docs/index.html for manual pages.
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: /share/home/01900/liuxueji/oofem-2.1/targets/poofem-release/bin/poofem on a barcelona named i134-311.ranger.tacc.utexas.edu by liuxueji Wed Mar  7 12:38:19 2012
[0]PETSC ERROR: Libraries linked from /opt/apps/pgi7_2/mvapich1_1_0_1/petsc/3.2/barcelona-cxx/lib
[0]PETSC ERROR: Configure run at Thu Oct 13 04:45:23 2011
[0]PETSC ERROR: Configure options --with-x=0 -with-pic --with-external-packages-dir=/var/tmp/petsc-3.2-root//opt/apps/pgi7_2/mvapich1_1_0_1/petsc/3.2/externalpackages --with-mpi-compilers=1 --with-mpi-dir=/opt/apps/pgi7_2/mvapich/1.0.1 --with-clanguage=C++ --with-scalar-type=real --with-dynamic-loading=0 --with-shared-libraries=0 --with-spai=1 --download-spai=/share/home/0000/build/rpms/SOURCES/petsc-externalpackages/spai_3.0-mar-06.tar.gz --with-hdf5=1 --with-hdf5-dir=/opt/apps/pgi7_2/mvapich1_1_0_1/phdf5/1.8.2 --with-hypre=1 --download-hypre=/share/home/0000/build/rpms/SOURCES/petsc-externalpackages/hypre-2.6.0b.tar.gz --with-plapack=1 --download-plapack=/share/home/0000/build/rpms/SOURCES/petsc-externalpackages/PLAPACKR32-hg.tar.gz --with-ml=1 --download-ml=/share/home/0000/build/rpms/SOURCES/petsc-externalpackages/ml-6.2.tar.gz --with-mumps=1 --download-mumps=/share/home/0000/build/rpms/SOURCES/petsc-externalpackages/MUMPS_4.9.2.tar.gz --with-scalapack=1 --download-scalapack=/share/home/0000/build/rpms/SOURCES/petsc-externalpackages/scalapack.tgz --with-blacs=1 --download-blacs=/share/home/0000/build/rpms/SOURCES/petsc-externalpackages/blacs-dev.tar.gz --with-spooles=1 --download-spooles=/share/home/0000/build/rpms/SOURCES/petsc-externalpackages/spooles-2.2-dec-2008.tar.gz --with-superlu=1 --download-superlu=/share/home/0000/build/rpms/SOURCES/petsc-externalpackages/SuperLU_4.1-December_20_2010.tar.gz --with-superlu_dist=1 --download-superlu_dist=/share/home/0000/build/rpms/SOURCES/petsc-externalpackages/SuperLU_DIST_2.5-December_21_2010.tar.gz --with-parmetis=1 --download-parmetis=/share/home/0000/build/rpms/SOURCES/petsc-externalpackages/ParMetis-dev-p3.tar.gz --with-debugging=no --with-blas-lib="[/opt/apps/intel/mkl/10.0.1.014/lib/em64t/libmkl_em64t.a,libmkl.a,libguide.a,libpthread.a]" --with-lapack-lib="[/opt/apps/intel/mkl/10.0.1.014/lib/em64t/libmkl_em64t.a,libmkl.a,libguide.a,libpthread.a]" --COPTFLAGS="-fast -tp barcelona-64" --CXXOPTFLAGS="-fast -tp 
barcelona-64" --FOPTFLAGS="-fast -tp barcelona-64"
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: MatGetFactor() line 3945 in src/mat/interface/matrix.c
[0]PETSC ERROR: PCSetUp_Cholesky() line 125 in src/ksp/pc/impls/factor/cholesky/cholesky.c
[0]PETSC ERROR: PCSetUp() line 819 in src/ksp/pc/interface/precon.c
[0]PETSC ERROR: KSPSetUp() line 260 in src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: KSPSolve() line 379 in src/ksp/ksp/interface/itfunc.c
_______________________________________________________
Error: (/share/home/01900/liuxueji/oofem-2.1/src/oofemlib/petscsolver.C:185)
PetscSolver::petsc_solve - Error when solving: 56

_______________________________________________________
No backtrace available
Total 1 error(s) and 0 warning(s) reported
oofem exit code 1
Exit code -5 signaled from i134-311.ranger.tacc.utexas.edu
MPI process terminated unexpectedly
Killing remote processes...DONE
TACC: MPI job exited with code: 1
TACC: Shutting down parallel environment.
TACC: Shutdown complete. Exiting.
TACC: Cleaning up after job: 2424853
TACC: Done.

Re: Parmetis

The error is quite clear.
"Matrix format mpiaij does not have a built-in PETSc CHOLESKY!"
since you didn't use the flags I recommended.

-pc_factor_mat_solver_package spooles

Re: Parmetis

Hi Mikael,

I did. Maybe it is because I installed spooles by myself and it has some problems? When I install oofem with spooles, it seems fine. Thanks,

Xuejian

Re: Parmetis

Well your previous post reads

ibrun -n 1 -o 0 ./poofem -f singlewall2_np1.in -ksp_type preonly -pc_type cholesky -log_summary

No pc_factor_mat_solver_package specified there.

Also, I don't mean you should compile/use SPOOLES with OOFEM, but with PETSc.

Re: Parmetis

Dear Mikael,

I hope everything is going well. I just want to confirm: are the IST_* values calculated at the integration points or at the vertices of the elements? Thank you.

Xuejian

Re: Parmetis

Dear Xuejian,

The internal state variables are computed at the integration points. This is what you see in the standard output file.

If you use the VTKXML export, you can automatically get them smoothed to nodal variables. If you export cell variables in VTKXML, we currently have no other choice but to compute the average, weighted by the Gauss weights.
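
As a rough illustration of that cell average, the Python sketch below computes a weight-averaged value per component from integration-point values. It is not OOFEM's export code; the function name `cell_average` is hypothetical, and whether the actual weights include the Jacobian is not stated here, so the weights are simply taken as given.

```python
def cell_average(ip_values, ip_weights):
    """Weighted average of integration-point values for one element.

    ip_values:  one list of components per integration point
    ip_weights: the matching integration (Gauss) weights, taken as given
    """
    total_weight = sum(ip_weights)
    n_components = len(ip_values[0])
    return [
        sum(w * value[k] for value, w in zip(ip_values, ip_weights)) / total_weight
        for k in range(n_components)
    ]


# Two integration points with weights 1 and 3, two stress components each.
stress_at_ips = [[2.0, 0.0], [6.0, 4.0]]
average = cell_average(stress_at_ips, [1.0, 3.0])
print(average)  # [5.0, 3.0]
```

The point with the larger weight dominates the result, which is why a cell value can differ noticeably from any single integration-point value in the standard output file.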

Re: Parmetis

Mikael,

Thank you very much for your clarification.

Xuejian