The edge of extreme scale computing for global evolutions

Building data ecosystems can aid policy-makers and businesses in mastering their global challenges.
A European initiative, coordinated from Chalmers, plans a "one-stop hub" for high-performance computing to make Big Data usable. And reusable.
Building on experience from the Centre of Excellence for Global Systems Science (CoeGSS), a European Commission-funded project that has been running since 2015, a new team has been assembled to push the edge of what extreme scale computing can do for global challenges.

Professor Patrik Jansson, coordinator of the EDGE initiative, presents the three main example areas.
Pandemics: The spread of contagious diseases over the globe. Simulations of the spread can help decide whether to close down, for example, schools or airports in order to reduce or slow it.
Global Mobility: The greening and evolution of the global mobility system is an important question for the future of mankind. The individual choice of means of transportation, and the way we influence each other, has a massive impact.
Global Finance: Many banks and companies are connected in advanced networks, meaning that if certain partners fail, the entire system is at risk of breaking down. Larger economic actors need to understand systemic risks and try to counteract them.

The new centre, Exascale, Data, and Global Evolutions (EDGE), will be prepared to carry on when CoeGSS comes to an end in September 2018. An application has been submitted to the EC, in which the 19 partners aim to establish a one-stop hub in Europe for organisations faced with global challenges that require extreme computing power to make sense of Big Data. Seven of the partners come with experience from CoeGSS and have joined with 12 new ones, including several high performance computing centres. Professor Patrik Jansson at the division of Functional Programming at Chalmers is coordinating the initiative.
– We have identified three main example areas: pandemics, global finance, and global mobility systems. There are some similarities and possible synergies between them, and one of the common factors is big, complicated networks.
 
Network science and the ability to simulate large synthetic populations of agents are therefore essential to tackling the big challenges in this domain. (A synthetic population contains simplified representations of real people and the networks of connections between them: both physical proximity networks and digital “friendship” networks.)
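
To make the idea of a synthetic population concrete, here is a minimal sketch in Python. It is not code from CoeGSS or EDGE. The function names, the random contact network, and the simple susceptible/infected/recovered (SIR) update rule are all illustrative assumptions, and real synthetic populations are far larger and are calibrated against census and mobility data.

```python
import random

def make_population(n_agents, avg_contacts, seed=0):
    """Toy synthetic population: agents 0..n_agents-1 joined by random contacts.

    Assumes n_agents >= 2 and avg_contacts < n_agents, so that enough
    distinct pairs exist for the loop to terminate.
    """
    rng = random.Random(seed)
    contacts = {agent: set() for agent in range(n_agents)}
    edges_left = n_agents * avg_contacts // 2
    while edges_left > 0:
        a, b = rng.sample(range(n_agents), 2)  # two distinct agents
        if b not in contacts[a]:
            contacts[a].add(b)
            contacts[b].add(a)
            edges_left -= 1
    return contacts

def simulate_sir(contacts, p_infect, p_recover, steps, seed=0):
    """Spread an infection over the contact network, one synchronous step at a time.

    Returns a list with one {"S": ..., "I": ..., "R": ...} count per step.
    """
    rng = random.Random(seed)
    state = {agent: "S" for agent in contacts}
    state[0] = "I"  # hypothetical choice: agent 0 is the first case
    history = []
    for _ in range(steps):
        new_state = dict(state)
        for agent in sorted(contacts):
            if state[agent] != "I":
                continue
            # Each infected agent may infect its susceptible contacts...
            for other in sorted(contacts[agent]):
                if state[other] == "S" and rng.random() < p_infect:
                    new_state[other] = "I"
            # ...and may recover at the end of the step.
            if rng.random() < p_recover:
                new_state[agent] = "R"
        state = new_state
        values = list(state.values())
        history.append({label: values.count(label) for label in "SIR"})
    return history

population = make_population(n_agents=200, avg_contacts=4)
history = simulate_sir(population, p_infect=0.1, p_recover=0.2, steps=30)
```

Printing `history[-1]` shows how many agents are susceptible, infected, and recovered after the last step. In a sketch like this, an intervention such as closing schools would correspond to removing some of the contact links before running the simulation and comparing the two outcomes.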

Domain specific languages to bridge the gap
Correctness of algorithms and validation of data will be important parts of the work, and domain specific languages (DSLs) will be used to bridge the gap between the domain experts and the implementors.
– Each scientific domain has its own jargon, and when we codify that as a DSL we not only make many scientific models executable, but the new language also works as a "tool for thinking", and aids communication between experts, programmers, and computers. HPC enables us to use parallel computing in order to get answers quickly. But there’s no use in computing an answer quickly if it’s wrong. Before we spend thousands of core hours and megawatt hours of energy on a computing problem, we need to know that the result can be used for something, says Patrik Jansson.
 
Building data ecosystems
Big Data is often unstructured and messy, and requires preparation in several stages to be useful. Large-scale computations are needed to take the raw data as input and produce clean, structured data as output. If the data is properly processed and tagged with correct metadata, it can be reused in other projects, in a kind of ecosystem of data. One example is high-resolution images of the Earth, collected by the European Space Agency in an Earth observation program.
– Most of this raw data is just pixels, lots of pixels. But turned into a data ecosystem it becomes useful in many different ways. It can be used to examine desert growth, or deforestation, or urban development, or even sea-level rise. With this comprehensible data at hand, policy-makers and businesses will be better equipped to master their global challenges, says Patrik Jansson.

Published: Wed 02 May 2018. Modified: Tue 08 May 2018