# Enabling particle applications for exascale computing platforms

@article{Mniszewski2021EnablingPA, title={Enabling particle applications for exascale computing platforms}, author={Susan M. Mniszewski and J. F. Belak and Jean-Luc Fattebert and Christian F. A. Negre and Stuart R. Slattery and Adetokunbo Adedoyin and Robert Francis Bird and Choongseok Chang and Guangye Chen and St{\'e}phane Ethier and Shane Fogerty and Salman Habib and Christoph Junghans and Damien Lebrun-Grandi{\'e} and Jamaludin Mohd-Yusof and Stan Gerald Moore and Daniel Osei-Kuffuor and Steven J. Plimpton and Adrian Pope and Samuel Temple Reeve and L. F. Ricketson and Aaron Scheinberg and Amil Y Sharma and Michael E. Wall}, journal={ArXiv}, year={2021}, volume={abs/2109.09056} }

The Exascale Computing Project (ECP) is invested in co-design to assure that key applications are ready for exascale computing. Within ECP, the Co-design Center for Particle Applications (CoPA) is addressing challenges faced by particle-based applications across four “sub-motifs”: short-range particle–particle interactions (e.g., those which often dominate molecular dynamics (MD) and smoothed particle hydrodynamics (SPH) methods), long-range particle–particle interactions (e.g., electrostatic…

#### One Citation

Machine learning accelerated particle-in-cell plasma simulations

- Physics
- 2021

Particle-In-Cell (PIC) methods are frequently used for kinetic, high-fidelity simulations of plasmas. Implicit formulations of PIC algorithms feature strong conservation properties, up to numerical…

#### References

Impacts of Multi-GPU MPI Collective Communications on Large FFT Computation

- Computer Science
- 2019 IEEE/ACM Workshop on Exascale MPI (ExaMPI)
- 2019

This paper analyzes the limitations of collective MPI communication for the computation of fast Fourier transforms (FFTs), and proposes a new FFT library, named HEFFTE (Highly Efficient FFTs for Exascale), which supports heterogeneous architectures and yields considerable speedups compared with CPU libraries, while maintaining good weak as well as strong scalability.

Warp-X: A new exascale computing platform for beam–plasma simulations

- Physics, Computer Science
- Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment
- 2018

The components of the new WarpX software are presented, including the new Particle-In-Cell Scalable Application Resource (PICSAR) and the redesigned adaptive mesh refinement library AMReX, which are combined with redesigned elements of the Warp code.

An efficient mixed-precision, hybrid CPU-GPU implementation of a nonlinearly implicit one-dimensional particle-in-cell algorithm

- Computer Science, Physics
- J. Comput. Phys.
- 2012

This work presents a very efficient, mixed-precision hybrid CPU-GPU implementation of the 1D implicit PIC algorithm that exploits a fundamental feature of the method, the segregation of particle-orbit computations from the field solver, while remaining fully self-consistent.

Implementing a neural network interatomic model with performance portability for emerging exascale architectures

- Physics, Computer Science
- Computer Physics Communications
- 2022

This work re-implements a neural network interatomic model in CabanaMD, an MD proxy application built on libraries developed for performance portability, and shows significantly improved on-node scaling in this complex kernel compared to a current LAMMPS implementation.

Modeling Dilute Solutions Using First-Principles Molecular Dynamics: Computing more than a Million Atoms with over a Million Cores

- Computer Science
- SC16: International Conference for High Performance Computing, Networking, Storage and Analysis
- 2016

Using a robust new algorithm, this work has developed an O(N) complexity solver for electronic structure problems with fully controllable numerical error, allowing for very accurate FPMD simulations of more than a million atoms on over a million cores.

Efficient parallel linear scaling construction of the density matrix for Born-Oppenheimer molecular dynamics.

- Computer Science, Medicine
- Journal of chemical theory and computation
- 2015

We present an algorithm for the calculation of the density matrix that for insulators scales linearly with system size and parallelizes efficiently on multicore, shared memory platforms with small…

Performance Optimizations of Recursive Electronic Structure Solvers targeting Multi-Core Architectures (LA-UR-20-26665)

- Computer Science
- ArXiv
- 2021

The scientific application of interest here is the Basic Math Library (BML), which provides a single interface for linear algebra operations frequently used in the Quantum Molecular Dynamics (QMD) community; several optimization strategies are introduced into these micro-kernels.

An energy- and charge-conserving, implicit, electrostatic particle-in-cell algorithm

- Physics, Computer Science
- J. Comput. Phys.
- 2011

A main development in this study is the nonlinear elimination of the new-time particle variables (positions and velocities), termed particle enslavement, which results in a nonlinear formulation with memory requirements comparable to those of a fluid computation and affords substantial freedom in regards to the particle orbit integrator.

The Universe at extreme scale: Multi-petaflop sky simulation on the BG/Q

- Computer Science, Physics
- 2012 International Conference for High Performance Computing, Networking, Storage and Analysis
- 2012

HACC simulations at these scales will for the first time enable tracking individual galaxies over the entire volume of a cosmological survey.

Long-Time Dynamics through Parallel Trajectory Splicing.

- Computer Science, Medicine
- Journal of chemical theory and computation
- 2016

A novel simulation technique, Parallel Trajectory Splicing (ParSplice), addresses the atomistic evolution of materials over long time scales through the timewise parallelization of long trajectories, and is demonstrated through the study of topology changes in Ag42Cu13 core-shell nanoparticles.