VECMA will participate in ISC High Performance, one of the most prestigious supercomputing conferences, for the second consecutive year. This year, ISC will address current developments critical to high performance computing, machine learning and data analytics, as well as the future advances that will shape these technologies, through the following topics: System Architecture, Applications/Algorithms, Emerging Technologies, Parallel Programming Models & Performance Modelling, and Machine Learning Day.
VECMA will be presenting two posters, on large-scale computations with QCG-PilotJob and on workflow automation with FabSim3. You can browse the posters via the two links below:
Large-scale computations with QCG-PilotJob
Growing demands of computational scenarios on the one hand, and the growing popularity of large-scale HPC on the other, require systems that combine good efficiency, great flexibility and ease of use. Verification, Validation and Uncertainty Quantification of complex multiscale applications, the main topic of VECMA, requires extremely large computing power. The calculations needed to analyse the use cases being developed within the project are expected to consume the computing power of not only currently available peta-scale resources but also emerging exa-scale ones. This places high demands on the software that supports such computations.
QCG-PilotJob is a lightweight Python implementation designed to enable easy and highly efficient execution of user tasks in the so-called pilot job flavour on HPC machines. A QCG-PilotJob instance started within a regular queuing system allocation can be seen as a separate, second-level, private queuing system. That is, once QCG-PilotJob is started, the user has full control over the tasks submitted to it. There are two basic ways of interacting with QCG-PilotJob. The static way lets the user prepare a configuration of tasks in advance and submit it when QCG-PilotJob starts. The dynamic way lets the user interact with an already running instance of QCG-PilotJob. Importantly, QCG-PilotJob does not need any external services, so a user can run it in practically any environment, whether on a local machine for tests or on a SLURM system for production runs.
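To make the idea of a private, second-level queue more concrete, here is a minimal conceptual sketch in plain Python. It is not the QCG-PilotJob API: the class `ToyPilot`, its methods and the `square` task are hypothetical names invented for this illustration, and worker threads merely stand in for the cores of an allocation. It only shows the two interaction styles described above, tasks prepared in advance and tasks added to a running instance.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyPilot:
    """Toy second-level queue. NOT the QCG-PilotJob API, only an illustration."""

    def __init__(self, cores, initial_tasks=()):
        # Assumption: worker threads stand in for the cores of the outer allocation.
        self._pool = ThreadPoolExecutor(max_workers=cores)
        self._futures = {}
        # "Static" style: tasks prepared in advance are queued at startup.
        for name, func, args in initial_tasks:
            self.submit(name, func, *args)

    def submit(self, name, func, *args):
        # "Dynamic" style: tasks may also be added to a running instance.
        if name in self._futures:
            raise ValueError(f"duplicate task name: {name}")
        self._futures[name] = self._pool.submit(func, *args)

    def wait_all(self):
        # Block until every queued task has finished; return results by name.
        return {name: fut.result() for name, fut in self._futures.items()}

    def finish(self):
        # Release the workers, analogous to ending the allocation.
        self._pool.shutdown(wait=True)


def square(x):
    # Hypothetical stand-in for a real user task.
    return x * x


# Static: a batch of tasks defined before the pilot starts.
prepared = [(f"sq{i}", square, (i,)) for i in range(4)]
pilot = ToyPilot(cores=2, initial_tasks=prepared)

# Dynamic: one more task submitted to the already running instance.
pilot.submit("sq10", square, 10)

results = pilot.wait_all()
pilot.finish()
print(results)
```

With two workers and five tasks, at most two tasks run at a time while the rest wait in the private queue, which is the behaviour the pilot job approach relies on inside a single allocation.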
Workflow automation with FabSim3
FabSim3 is designed to support the automation of large-scale simulation workflows, from data preparation to results analysis, in a way that reduces the burden on application developers. The tool is generic and is aimed at developers from different research disciplines who have at least basic programming experience. To enable users to rapidly prototype and evolve their domain-specific workflows, FabSim3 supports the development of application-specific plugins. Once developed, these plugins can be shared with the wider community, eliminating the need to duplicate machine configurations, workflow definitions or deployment instructions.