Recently our team (Dr Matthew Borg, Dr Stephen Longshaw and myself) began migrating the MDFoam code from OpenFOAM v1.7 to OpenFOAM v2.1. Our Lagrangian particle-tracking algorithm differs from the original OpenFOAM source code, and we have also developed significant functionality for multi-scale flow engineering simulations that is not present in the original OpenFOAM code. Moreover, we plan to run large-scale MD simulations on Archer, the UK national supercomputer; this blog post presents recent performance results on Archer.
Large Scale Simulation on Archer
Beginning October 2014, I started working on optimising the OpenFOAM code to enable large-scale simulations on Archer; this project is supported by the Archer embedded CSE (eCSE) programme. The plan is to scale our simulations on Archer so that our researchers can eventually obtain realistic results in a short time. At present, parallelism both within and across nodes in OpenFOAM is handled by MPI, so I plan to replace the existing pure-MPI parallelism with mixed-mode MPI/OpenMP parallelism.
Each Archer node contains two 12-core CPUs, giving 24 cores per node, which can be hyper-threaded to 48 hardware threads in all. Under pure MPI, parallelism within a node means message passing between cores, so one way to obtain better performance on Archer is to reduce intra-node MPI communication by multi-threading within each node and keeping MPI for communication across nodes.
Figure 1 presents a performance graph for two kinds of simulation run on Archer: one containing nitrogen molecules and the other water molecules. The problem sizes differ as well: nitrogen is simulated with 33,696 atoms and water with 256,000 atoms. The two sizes are shown to illustrate how scaling behaviour depends on both problem size and core count.
Because the nitrogen simulation has a comparatively modest number of atoms, its performance, unlike that of water, does not increase significantly on 48 cores. Typically some cores sit idle, performing little or no work; in the water case the larger atom count gives every core enough work, resulting in better performance. Simply using a larger number of cores therefore does not in itself benefit performance; the problem size must also be large enough.
Figure 1: Performance of MDFoam on Archer
The performance chart presented is for pure MPI-based parallelism; performance results from the latest mixed-mode work will be added soon. Lastly, if your research is similar to the work our group does and you are interested in using our code, please feel free to contact Dr Matthew Borg.