Dr Stephen M. Longshaw
Senior Computational Scientist, Daresbury Laboratory
Stephen obtained his PhD in Computer Science from The University of Manchester in 2011 with a thesis that looked at geological modelling of fault line evolution. Since then he has worked as a consultant for a number of UK based businesses and was a Research Associate within the school of Mechanical, Aerospace & Civil Engineering at the University of Manchester. During this posting he studied the application of GPU based Smoothed Particle Hydrodynamics (SPH) to modelling fluid simulation problems. Stephen is now a member of STFC at their Daresbury labs, working as a Research Fellow (Computational Scientist), with the goal of coupling macroscopic and microscopic solvers and developing distributable software to that end.
Dr Stephen M. Longshaw's Posts
OpenFOAM's I/O Problem (and solution)
Much of the MNF group's research output has been based around our solvers (mdFoam+ and dsmcFoam+), which are written in the OpenFOAM software framework. OpenFOAM is well known and well acknowledged as a very flexible and stable environment in which to develop new solvers. However, it has a bit of a reputation for scaling badly on big supercomputers, leaving people to assume it should only be used when your problem can be tackled by a stand-alone workstation or using only a few nodes on your favourite big HPC system. This blog post will talk about the new collated file format introduced in OpenFOAM 5.0 and how it might be the beginning of the end for this mentality.
The question is, where has this perception come from and, more importantly, is it right? If you search for the issue of OpenFOAM scalability on HPC then you will find numerous articles and topics. What is interesting, though, is a) how few are looking at massive scalability (most consider running on a few CPUs) and b) how few recent articles there are.
The question therefore is whether OpenFOAM actually does perform badly on HPC systems or whether this is an out-of-date perception. This is a hard one to answer fully as OpenFOAM has been around for a good few decades and has a number of different solvers to consider. In theory, each should parallelise as well as the others as they are all built on top of the same basic libraries. Of course, some algorithms work better in parallel than others and some of the solvers may not have received the same attention as others. Generally speaking, though, the methods used in OpenFOAM are sound: it employs typical static domain-decomposed non-blocking MPI in most of its solvers and allows well-known decomposition libraries such as Scotch to be used to minimise communication overhead. Undoubtedly this could all be optimised better if it were to receive lots of attention from the HPC community, but are there any other problems blocking this?
The MNF group runs many of its simulations on the UK's national HPC service Archer, a Cray XC30 machine run by the EPCC. At the moment they provide access to OpenFOAM 4 on their system. Arguably OpenFOAM has a bad reputation for use on this system, but the same problem is repeated on many systems, especially those that use a Lustre parallel file system, and it comes down to the way that OpenFOAM creates and deals with its files.
For every MPI process created, a new folder is also created, along with a set of files. In cases where lots of output is created during a simulation, this can easily mean there are thousands of files per processor created on disk. Archer imposes a hard per-user limit on the number of files that can be created in its storage and also on the number that can be open at any one time. Parallel runs using OpenFOAM quickly exceed these limits, and would have a major impact on the parallel file system for other users if the limits weren't there. As a result, OpenFOAM has developed a bad reputation. It is worth noting that this approach is an entirely valid, if outdated, way of dealing with I/O when using MPI.
The good news is that, as of OpenFOAM 5.0, this has been changed and there is now a new way of writing files to disk known as the collated file format. The idea is simple: rather than each MPI process creating its own folder, there is now just one set of files written by the master process, and all other processes transfer their data back to it via MPI. If you get hold of the latest development version via the OpenFOAM-dev repository then this has been developed further, so you can mark individual MPI processes as "master node" writers. This spreads the load and reduces communication overhead, as processes then only need to talk to each other within the same node. Therefore, if you were running on 48 nodes of Archer then you would have 1152 MPI processes with 24 on each node, so you would have 48 sets of files instead of 1152. This is really quite significant: if you assume there are 1000 files per set by the end of a simulation then you have 48,000 files rather than 1,152,000!
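To make the arithmetic above concrete, here is a minimal Python sketch that counts files on disk for the three layouts described. The function name and the layout labels are my own for illustration and are not OpenFOAM settings; the node count, ranks per node and files per set are the figures quoted above.

```python
def file_count(nodes, ranks_per_node, files_per_set, layout):
    """Estimate the number of files on disk for a given output layout.

    layout is a hypothetical label, not an OpenFOAM keyword:
      "per_rank"      - every MPI process writes its own set of files
      "per_node"      - one writer process per node
      "single_master" - one master process writes everything
    """
    if layout == "per_rank":
        sets = nodes * ranks_per_node
    elif layout == "per_node":
        sets = nodes
    elif layout == "single_master":
        sets = 1
    else:
        raise ValueError("unknown layout: " + layout)
    return sets * files_per_set


if __name__ == "__main__":
    # 48 Archer nodes, 24 MPI processes per node, 1000 files per set
    for layout in ("per_rank", "per_node", "single_master"):
        print(layout, file_count(48, 24, 1000, layout))
```

The per-rank and per-node cases reproduce the 1,152,000 and 48,000 figures quoted above, while the single-master case is the plain OpenFOAM 5.0 behaviour with one set of files in total.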
We have done some basic testing and have found using the new file format to be about 50% faster on Archer using the flow past a motorbike tutorial case with simpleFoam and 48 nodes.
Of course, the really exciting thing about this development is that the HPC community can now really get stuck in to the challenge of properly benchmarking OpenFOAM over many more MPI ranks than have previously been attempted, as cases now scale. This will hopefully lead to rapid development of the underlying MPI approach and only serve to increase the performance of OpenFOAM across all of its solvers, including the MNF group codes!
Conferences, conferences & more conferences!
Well, it's that time of year again, no not Christmas, conference time!
Recently members from the MNF group have been at a number of large conferences, with Prof. David Emerson attending both SuperComputing 2017 and then, with other members of the group, the APS conference in Denver in America.
I recently found myself at the UK's version of SuperComputing, the STFC-run Computing Insight UK. Although a smaller event than SuperComputing, this year still saw around 400 people come together in Manchester in the UK to see the latest computing technologies, discuss how to join up the UK's e-Infrastructure (i.e. how can we all get better access to the nation's HPC resources) and, the reason I was there, attend a day-long session on emerging computing technology, which I ran! This was an exciting event for me as we didn't just have speakers; we also ran a 3 hour practical workshop on hands-on Quantum Computing in collaboration with IBM Research. This went down fantastically and we hope to run something similar in the future.
The next exciting event is the annual MNF Christmas conference and workshop on the 18th and 19th of December! This is being held over 2 days in Cheshire, with the first day being devoted to engaging with our industrial partners in a steering and impact committee day and the second for the group to come together and update each other on what we have all been doing! Events like this are essential with research groups as large as this one. We are spread over a number of institutions and not all working together, so this event is a really great opportunity.
In the meantime, here are a few photos from the EMiT@CIUK 2017 workshop showing me looking awkward in front of a camera (watch the STFC media feeds for the full interview if you want something to laugh at) and Dr Stefan Filipp from IBM Research Zurich teaching us all about the state of quantum computing, how we can learn it now and what it can be applied to in the future. Fascinating stuff, especially for the future of molecular modelling!
Finally, if you want to have a play with quantum computing yourself, I encourage you to go to the IBM Quantum Experience website, where you can run on an actual quantum machine hosted in the IBM Yorktown research facility. More importantly, though, it offers a great set of tutorials to help you find out the important basics such as "what is a qubit?", "how do I teleport data between them?", "who or what is a Hadamard gate?" and many others! Have a look here: https://quantumexperience.ng.bluemix.net/qx
Practical Multi-Scale Code Coupling?
This month I thought I would use this blog space to dump some thoughts and musings on the practical aspects of multi-scale coupling that have come to light the more I talk with people from various scientific disciplines looking to achieve this in some way or other!
Clearly, a group like this one has a fundamental interest in problems at different scales. Arguably, this means a fundamental interest in code coupling as it is unlikely a single software framework or computational method will capture physics at very different length- and time-scales. The same can be said for many other areas of science, not just engineering. So we know that multi-scale, coupled simulation is important, what about when we actually try to do it?
I have talked on this blog in the past about some coupling software approaches, one in particular, the Multiscale Universal Interface (https://doi.org/10.1016/j.jcp.2015.05.004) or MUI. This uses MPI to transfer data between solvers in order to enable code coupling and provides an extensible framework in which to build spatial or temporal interpolation schemes to allow data sampling between dissimilar methods. Other frameworks exist (e.g. OpenPALM, MUSCLE-2, CPL) that provide similar functionality. The interesting point though is what multi-scale coupling actually means.
When people think of coupling existing solvers, they tend to initially imagine some sort of domain-decomposed solution: two solvers operate on their domains independently (note: these could be fully overlapped or adjacent with an overlapping region) and then each domain transfers data to the other. Those starting with the raw mathematics often look to remove separation between methods where they can in order to simplify the problem (as a good mathematician should!). There are numerous examples in the literature of the classifications these couplings fall into: tight, loose, monolithic etc.
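As a rough illustration of what a spatial interpolation scheme between dissimilar methods can look like, here is a minimal Python sketch of a Gaussian-weighted sampler: one solver provides values at scattered points and the other asks for a value at a location of its own choosing. This is not MUI's API; the function name, arguments and the choice of a Gaussian kernel are assumptions made purely for illustration.

```python
import math


def gaussian_sample(points, values, query, sigma):
    """Gaussian-weighted average of point values around a query location.

    points: list of coordinate tuples provided by the "pushing" solver
    values: one scalar per point
    query:  coordinate tuple requested by the "fetching" solver
    sigma:  kernel width, in the same units as the coordinates
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for point, value in zip(points, values):
        dist_sq = sum((a - b) ** 2 for a, b in zip(point, query))
        weight = math.exp(-dist_sq / (2.0 * sigma ** 2))
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no points contribute to the sample")
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Two particle-like samples either side of a mesh-like query location
    pts = [(0.0,), (1.0,)]
    vals = [1.0, 3.0]
    print(gaussian_sample(pts, vals, (0.5,), sigma=1.0))
```

The point of the sketch is only that the receiving solver never needs to know how the sending solver discretises its domain; it just asks for a value at a location.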
The interesting thing for me though is that when we talk about multi-scale coupling, often the first option is the most likely as when one approaches the task of coupling Molecular Dynamics to complex 3-D Computational Fluid Dynamics, does one want to create a full CFD and MD solver? No, one does not! Clearly there are no hard and fast rules, but more often than not, complex multi-scale problems seem to fit this pattern.
So we know that we are likely coupling two existing solvers together, we know that we can use coupling software like MUI to glue them together, and so we know that we have at least two separate domains to deal with. This is where a problem comes in when dealing with practical engineering applications, and it is the question that needs answering (and won't be in this blog because it's an open question):
"If we have two separate domains at length scales so different it is considered multi-scale, how can we reasonably couple them in any physical sense? Indeed, should we even be realistically trying to achieve this?"
There is a caveat to this statement: the situation where one end of the coupled length scale spectrum only aims to provide (or receive) a single answer to the other. For example if we wish to define the rheology of a liquid in a macroscopic CFD simulation using MD, then this could be simplified to saying that we wish to define a parameter for whatever viscosity formulation the CFD uses, using MD. Clearly there is a huge complexity in doing this but it is tractable as the MD simulation is not trying to recreate a physical part of the CFD domain, a typical MD "periodic box" scenario may well be enough to derive a single macroscopic value.
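To show what "a single answer" might mean in practice, here is a minimal Python sketch that reduces a set of shear-rate and shear-stress pairs to one viscosity value by a least-squares fit through the origin. It assumes a Newtonian relation (stress = viscosity x shear rate), which is my simplification rather than anything stated above, and the numbers in the usage example are made up for illustration; they are not MD results.

```python
def fit_newtonian_viscosity(shear_rates, shear_stresses):
    """Least-squares fit of stress = mu * rate through the origin.

    Returns the single parameter mu that a macroscopic solver could
    consume. Inputs are assumed to be in consistent units.
    """
    if len(shear_rates) != len(shear_stresses):
        raise ValueError("inputs must have the same length")
    numerator = sum(r * s for r, s in zip(shear_rates, shear_stresses))
    denominator = sum(r * r for r in shear_rates)
    if denominator == 0.0:
        raise ValueError("at least one non-zero shear rate is required")
    return numerator / denominator


if __name__ == "__main__":
    # Illustrative values only (arbitrary units), not simulation output
    rates = [1.0, 2.0, 3.0, 4.0]
    stresses = [2.0, 4.0, 6.0, 8.0]
    print(fit_newtonian_viscosity(rates, stresses))
```

However the small-scale data is actually produced, the coupling in this scenario collapses to handing one number across, which is why it is tractable.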
I'm not going to provide an answer to the question here, as I think it's an open one, but I do want to highlight what I mean. This group has done some really cool work over the past 5 or so years on coupled simulation of difficult micro and nano scale problems where typical CFD solvers would simply fail because Navier-Stokes doesn't capture the physics correctly, such as flow of water through a carbon nanotube. In these we have augmented the computational domain of an intentionally simple CFD solver with either MD or direct simulation Monte Carlo (DSMC) sub-domains.
There are plenty of references available through the publications section of this site, but this has meant that simulations of flow through high-aspect-ratio nano- or micro-channels, which would normally take weeks of MD simulation, have been tackled in hours or days. However, this is a very specific case and, arguably, isn't multi-scale, as both domains are of the same length-scale.
In a nutshell, what do we do when we want to simulate a portion of a truly macroscopic domain (i.e. of the order of metres, or even centimetres or millimetres) using a method appropriate to the nano-scale, when we also want to not have died of old age by the end of the calculations and want to re-use existing solvers? Answers on a postcard please!
Dr Stephen Longshaw invited to speak at PARENG 2017
Stephen will be giving a talk at the international conference on parallel, distributed, grid and cloud computing (or PARENG) at the University of Pécs (Hungary) in late March 2017.
This is the fifth edition of the conference series where he will be talking about some of the HPC aspects of code coupling at scale for multi-scale and multi-physics applications, as well as looking towards the concept of the digital product and how code coupling will play its part.
Micro & Nano Flows December 2016 Workshop
I thought I would just put up a quick post about the recent December workshop hosted and run by the Micro & Nano Flows group. This was held in Ripon in North Yorkshire (chosen as a uniquely central location for all of the members of the MNF group) on the 12th and 13th of December.
This meeting had multiple purposes and took place over 2 (very) full days! The first day was an EPSRC Creativity@Home day, which was designed to bring together all of the members of the MNF group and to encourage team working/building/thinking. It's easy to dismiss this sort of thing as laughable or useless but in reality companies spend huge amounts because it actually works! The MNF group didn't have a corporate budget but nonetheless a genuinely useful day was arranged by the team from Warwick University.
We started with a team-based exercise looking at how to create a successful bid for academic funding. This was done in small teams of 3 or 4 and spanned three hour-long sessions, punctuated with appropriate lectures from experienced members of the group. The first hour was spent coming up with good ideas, the second fleshing them out and the third preparing a 5 minute presentation. The general standard was fantastic with a real range of projects proposed. To make things interesting, two winners were selected and those will go on to write a full proposal in the New Year to bid for actual funding! For those early in their research career the event provided genuine insight into what makes a good proposal and, more importantly, how a little flair when presenting can make all the difference.
Later in the day we all visited a local events company based at a farm! I was sceptical... but in fact it was a fantastic event. Split into 4 teams, we each did 6 tasks that required varying levels of team work but were still fun. Everything from racing two tractors along a course to herding some sheep around a field (without a dog!). It was genuinely fun and got everybody talking and working together.
The second day was a more traditional conference, named "Multiscale Fluid Dynamics: Simulation, Experiments and Applications", which saw 21 presentations given over 3 sessions. These were mostly from members of the MNF group about their current work but there were also a few invited talks from relevant outside groups. This is something that the MNF group traditionally does at the end of each year and is always a great way to get an insight into the work done across the group. We were also lucky to be joined by a number of great keynote speakers including Prof. Yonghao Zhang from the University of Strathclyde, who talked about "Modelling gas transport in shales", and also the first of our visiting scientists, Prof. Joël De Coninck from the Université de Mons, who gave a really great talk on "Heat transfer and wettability".
As is traditional, we finished up with a great Christmas meal, for some it was their first experience of such a thing in the UK, hopefully it was a good one!
Events like this are imperative to keeping everybody within the MNF group talking and not locked away in their office and this was probably the most successful that I have attended in my short time as part of the group. Bring on 2018!