The Role of Software in HPC – Lessons Learnt in the US Exascale Computing Project
Hartwig Anzt (University of Tennessee, USA)
Abstract: The US Exascale Computing Project (ECP) aims to deliver a capable exascale computing ecosystem that provides breakthrough modeling and simulation solutions to address the most critical challenges in scientific discovery, energy assurance, economic competitiveness, and national security. The software stack, which has to bridge between the applications and the silicon of the leadership supercomputers, plays a central role. In this talk, we discuss the role of software and software developers in ECP, and the concepts that enabled sustainability beyond the project's completion.
Bio: Hartwig Anzt is the Director of the Innovative Computing Lab (ICL) and Professor in the Electrical Engineering and Computer Science Department of the University of Tennessee. He also holds a Senior Research Scientist position at the Steinbuch Centre for Computing at the Karlsruhe Institute of Technology, where he previously held a Junior Professorship in the Faculty of Computer Science. Hartwig Anzt holds a PhD in applied mathematics and specializes in iterative methods and preconditioning techniques for next-generation hardware architectures. He also has a long track record of high-quality software development. He is the author of the MAGMA-sparse open-source software package and managing lead of the Ginkgo math software library. Hartwig Anzt is the PI of Software Technology (ST) projects within the US Exascale Computing Project (ECP), including a coordinated effort aimed at integrating low-precision functionality into high-accuracy simulation codes. He is also a PI in the EuroHPC project MICROCARD.
How Challenging is it to Build an Ecosystem for the Edge-Cloud-HPC Continuum?
Gabriel Antoniu (Inria, France)
Abstract: Modern use cases such as autonomous vehicles, digital twins, smart buildings and precision agriculture, greatly increase the complexity of application workflows. They typically combine physics-based simulations, analysis of large data volumes and machine learning and require a hybrid execution infrastructure: edge devices create streams of input data, which are processed by data analytics and machine learning applications in the Cloud, and simulations on large, specialised HPC systems provide insights into and prediction of future system state. All of these steps pose different requirements for the best suited execution platforms, and they need to be connected in an efficient and secure way. This assembly is called the Computing Continuum (CC). It raises challenges at multiple levels: at the application level, innovative algorithms are needed to bridge simulations, machine learning and data-driven analytics; at the middleware level, adequate tools must enable efficient deployment, scheduling and orchestration of the workflow components across the whole distributed infrastructure; and, finally, a capable resource management system must allocate a suitable set of components of the infrastructure to run the application workflow, preferably in a dynamic and adaptive way, taking into account the specific capabilities of each component of the underlying heterogeneous infrastructure. This talk discusses these challenges and introduces TCI – the Transcontinuum Initiative – a European multidisciplinary collaborative action aiming to identify the related gaps for both hardware and software infrastructures to build CC use cases, with the ultimate goal of accelerating scientific discovery, improving timeliness, quality and sustainability of engineering artefacts, and supporting decisions in complex and potentially urgent situations. The talk will also include examples of concrete projects aiming to address the above challenges.
Bio: Gabriel Antoniu is a Senior Research Scientist at Inria, Rennes, where he leads the KerData research team. His recent research interests include scalable storage, I/O and in situ visualization, and data processing architectures favoring the convergence of HPC, Big Data analytics and AI. He received his Ph.D. degree in Computer Science in 2001 from ENS Lyon and his habilitation for research supervision in 2009 from ENS Cachan/Brittany. He currently serves as Vice Executive Director of JLESC – the Joint Inria-Illinois-ANL-BSC-JSC-RIKEN/AICS Laboratory for Extreme-Scale Computing – on behalf of Inria. He has served as a PI for several international projects in these areas in partnership with Microsoft Research, Argonne National Lab, the University of Illinois at Urbana-Champaign, Universidad Politécnica de Madrid, Barcelona Supercomputing Center, and companies including IBM, ATOS and Total. He served as Program Chair for the IEEE Cluster conference in 2014 and 2017 and regularly serves as a PC member of major conferences in the areas of HPC, cloud computing and Big Data analytics (SC, HPDC, CCGRID, Cluster, Big Data, etc.). He has advised 20+ PhD theses and has co-authored 150+ international publications in the aforementioned areas.
What Happens When Distributed Systems Meet the Messy Reality of People?
Mariana Marasoiu (University of Cambridge, UK)
Abstract: A user always sits at the edge of a distributed system, whether they’re searching through their email, collaborating on a design document in real time, or getting ready to analyse some data on a cluster. The behaviour of these systems determines whether people find them usable, but their experience can’t be captured by technical performance models or metrics alone. This is even more the case when things go wrong.
In this talk I will draw on my experience working with and building several such systems, both research and production, and will trace how real-world technical trade-offs play out in daily user experience, and vice versa. Deciding between design option A and B is not simply a matter of following the theory; instead, we need to actively consider the people involved and their context. UX research then becomes a tool that can help us think about how to make these kinds of decisions.
Bio: Mariana Marasoiu is an Affiliated Lecturer at the University of Cambridge in the Department of Computer Science and Technology. She is a senior software engineer and user experience researcher at Lark Social Impact (LSI), whilst finishing her PhD on end-user programming of data visualisation. Mariana is interested in building tools that help people work better, using novel interaction design to make different kinds of computation available. For the past three years she has helped build Katikati, a communication platform from LSI that enables organisations to build valuable relationships with the people who matter to them. She is also General Chair of the Psychology of Programming Interest Group and its conference, and has served as a committee member on several editions of the Programming Experience (PX) Workshop, as well as the LIVE and Salon workshops.
Policy-Based Security Management towards 6G Networks and the Continuum
Antonio Skarmeta (Universidad de Murcia, Spain)
Abstract: Future networks will be characterized by the heterogeneity of devices, technologies and infrastructure, which, together with the rise of the edge computing paradigm, is transforming the way services are offered, focusing on the isolation and individuality of service requirements in the form of Network Slices. But the heterogeneous network of devices and technologies that 5G and Beyond represents still makes manageability a challenge, especially in terms of security. Combined with the lack of a common ground of understanding on security agreements and the massive adoption of network slices as services, this makes security enforcement an especially challenging task.
New services require efficient and effective management of computing and network resources, which means dealing with huge amounts of data at different levels of the future next-generation (NG) infrastructure. In this context there is a need for configuration, architecture and coordination of processing nodes at different levels, which has an impact on the security of the 5G-and-beyond infrastructure as well as on the new data processing. In this talk, several areas of intelligent management, AI and security procedures will be presented to support heterogeneous processing infrastructures in 5G and beyond-5G networks, based on the results of several EU projects.
Bio: Dr Antonio Skarmeta received the M.S. degree in Computer Science from the University of Granada and the B.S. (Hons.) and Ph.D. degrees in Computer Science from the University of Murcia, Spain. Since 2009 he has been a Full Professor in the same department and university. Antonio F. Skarmeta has worked on national and international research projects in networking, security, IoT and 5G, and their application to monitoring and the natural sciences. He coordinates several H2020 projects. From 2014 to 2020 he was the Spanish National Representative for the MSCA within H2020. He has published over 200 international papers and is a member of several program committees. He has also participated in several standardization fora such as IETF, ISO and ETSI. He currently coordinates the 6G-SNS project RIGOUROUS and the security project CERTIFY, and participates in other H2020 and Horizon Europe projects such as ERATOSTHENES and FLUIDOS.