Building on experience from the Centre of Excellence for Global Systems Science (CoeGSS), a European Commission-funded project that has been running since 2015, a new team has been assembled to push further the edge of what extreme-scale computing can do for global challenges.
The new centre, Exascale, Data, and Global Evolutions (EDGE), will be
prepared to carry on when CoeGSS comes to an end in September 2018. An
application has been submitted to the EC, in which the 19 partners aim
to establish a one-stop hub in Europe for organisations faced with
global challenges that require extreme computing power to make sense of Big Data. Seven of the partners come with experience from CoeGSS and
have joined with 12 new ones, including several high performance
Professor Patrik Jansson at the division of Functional
Programming at Chalmers is coordinating the initiative.
– We have identified three main example areas: pandemics, global
finance, and global mobility systems. There are some similarities and
possible synergies between them, and one of the common factors is big,
Therefore, network science and the ability to simulate large synthetic populations of agents are essential in tackling the big challenges in this domain. (A synthetic population contains simplified representations of real people and the networks of connections between them – both physical proximity networks and digital “friendship” networks.)
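To make the idea of a synthetic population concrete, here is a minimal sketch in Python. It is an illustration only, not the project's actual model: the `Agent` fields, the random wiring, and the sizes are all assumptions chosen for brevity. Each agent is a simplified person, and the two kinds of connection mentioned above are kept as separate network layers.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A simplified stand-in for one person (illustrative fields only)."""
    agent_id: int
    age: int
    proximity: set = field(default_factory=set)  # ids of physically close agents
    friends: set = field(default_factory=set)    # ids of digital "friends"


def connect(population, layer, a, b):
    """Add an undirected link between agents a and b in the named layer."""
    getattr(population[a], layer).add(b)
    getattr(population[b], layer).add(a)


def make_population(size, links_per_layer, seed=0):
    """Build a toy population with random links in both network layers."""
    rng = random.Random(seed)
    population = {i: Agent(i, rng.randint(0, 90)) for i in range(size)}
    for layer in ("proximity", "friends"):
        for _ in range(links_per_layer):
            a, b = rng.sample(range(size), 2)  # two distinct agents, so no self-links
            connect(population, layer, a, b)
    return population


population = make_population(size=100, links_per_layer=300)
```

A realistic population would be calibrated against census and mobility data rather than wired at random, and would be far too large for a single machine, which is where extreme-scale computing comes in.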
Domain-specific languages to bridge the gap
Correctness of algorithms and validation of data will be important parts
of the work, and domain-specific languages (DSLs) will be used to
bridge the gap between the domain experts and the implementors.
– Each scientific domain has its own jargon, and when we codify that as a DSL we not only make many scientific models executable, but the new language also works as a “tool for thinking” and aids communication between experts, programmers, and computers. HPC enables us to use parallel computing in order to get answers quickly. But there’s no use in computing an answer quickly if it’s
wrong. Before we spend thousands of core hours and megawatt hours of
energy on a computing problem, we need to know that the result can be
used for something, says Patrik Jansson.
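As a rough illustration of what codifying domain jargon can look like, the sketch below embeds a very small rule language in Python. Everything in it is an assumption made for the example: the `Transition` type, the `step` interpreter, the compartment names, and the rates are hypothetical and are not taken from the project. The model is a textbook-style susceptible/infected/recovered epidemic, chosen because pandemics is one of the example areas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """One rule in domain vocabulary: people move from `source` to `target`."""
    source: str
    target: str
    rate: float          # fraction of `source` that moves per step
    driven_by: str = ""  # optional compartment whose share scales the rate


def step(state, rules):
    """Apply every rule once to a dict of compartment sizes."""
    total = sum(state.values())
    new = dict(state)
    for rule in rules:
        pressure = state[rule.driven_by] / total if rule.driven_by else 1.0
        moved = rule.rate * pressure * state[rule.source]
        new[rule.source] -= moved
        new[rule.target] += moved
    return new


# The model reads close to how a domain expert might state it (made-up rates).
sir = [
    Transition("susceptible", "infected", rate=0.3, driven_by="infected"),
    Transition("infected", "recovered", rate=0.1),
]

state = {"susceptible": 990.0, "infected": 10.0, "recovered": 0.0}
for _ in range(50):
    state = step(state, sir)
```

Because the rules are plain data, simple properties can be checked cheaply before any large run. Here, for instance, every rule only moves people between compartments, so the total population should stay constant from step to step.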
Building data ecosystems
Big Data is often unstructured and messy, and requires
preparation in several stages to be useful. Large-scale computations
are needed to take the raw data as input and to produce other clean,
structured data as output. If the data is properly processed and tagged
with correct metadata, it can be reused in other projects, in a kind of ecosystem of data. One example is high-resolution images of the Earth, collected by the European Space Agency in an Earth observation
– Most of this raw data is just pixels, lots of pixels. But turned into
a data ecosystem it becomes useful in many different ways. It can be
used to examine desert growth, or deforestation, or urban development,
or even sea-level rise. With this comprehensible data at hand,
policy-makers and businesses will be better equipped to master their
global challenges, says Patrik Jansson.
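The preparation step described in this section, raw data in and clean, tagged data out, can be sketched in a few lines of Python. This is a toy under stated assumptions: the `Tile` record, the cleaning rule, the region names, and the pixel values are all invented for illustration and do not describe any real satellite data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """A cleaned image tile plus the metadata that makes it findable."""
    pixels: tuple          # cleaned pixel values in the range 0..255
    region: str
    year: int
    theme_tags: frozenset


def clean(raw_pixels):
    """Drop missing readings (None) and clamp the rest to 0..255."""
    return tuple(min(255, max(0, p)) for p in raw_pixels if p is not None)


def ingest(raw_pixels, region, year, tags):
    """Turn one raw reading into a cleaned, tagged catalogue entry."""
    return Tile(clean(raw_pixels), region, year, frozenset(tags))


def find(catalogue, tag):
    """Return every tile carrying the given theme tag."""
    return [tile for tile in catalogue if tag in tile.theme_tags]


# Made-up readings: the same catalogue can serve several different questions.
catalogue = [
    ingest([12, None, 300, 47], "Sahel", 2017, {"desert", "vegetation"}),
    ingest([-5, 80, 91], "Amazonas", 2017, {"forest", "vegetation"}),
]
desert_tiles = find(catalogue, "desert")
```

The point of the metadata is reuse: a project on desert growth and a project on deforestation can query the same cleaned catalogue instead of each reprocessing the raw pixels.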