Research, collaboration and education

Research area:
Big Data in life science
 
Modern life science research involves the generation and analysis of Big Data. With advances in next-generation sequencing technologies, we are gaining access to more and more genome sequences, e.g. from individuals, cancer genomes, and metagenomes of gut microbiomes. Analysis of these data is essential for making health care more sustainable and for enabling better prediction and prevention of disease development.
 
The picture shows how genomic mutations and diseases are connected. The graph was generated from data on 10,000 people.
 
 


Research area:
Efficient Distributed, Parallel/Multicore and Stream Processing
for Large-Scale Infrastructures

Fig. by V. Gulisano, M. Papatriantafilou

The large and varying volumes of data generated to provide autonomous functionality in large-scale cyber-physical systems demand near real-time processing, often as close to the sensing devices as possible, in order to (i) facilitate faster responses, scalability, locality and privacy, and (ii) match the presence of machine-to-machine communication. The data stream processing paradigm suits such data-intensive workloads because it supports efficient and continuous analysis of data streams.

The Distributed Computing and Systems (DCS) research group (Networks and Systems Division, Dept. of Computer Science and Engineering) is a pioneer in research that combines data streaming, which matches the distributed processing paradigm well, with parallel/multicore processing, drawing on more than two decades of expertise in the team. The results cover both scale-up and scale-out properties, and the applied aspects of this research span both higher-capacity multicore server systems and single-board computing devices, thus extending over the various types of cloud infrastructure that cyber-physical systems (e.g. communication, electricity, vehicular networks) require. Recent work by the group on “Deterministic Real-Time Analytics of Geospatial Data Streams through ScaleGate Objects” has been nominated for the Grand Challenge Award of the 9th ACM (SIGMOD-SIGSOFT) Int’l Conference on Distributed Event-Based Systems (DEBS’15).

Contact:
Marina Papatriantafilou (ptrianta@chalmers.se),
Vincenzo Gulisano (vinmas@chalmers.se),
Philippas Tsigas (tsigas@chalmers.se)

Research area:
Information and Data-processing in Focus for Energy Efficiency and Robustness
in Adaptive Electricity Networks (aka Smart Grids)

Fig. by G. Georgiadis

In this type of cyber-physical system, the main idea is to allow two-way communication of both power and data between devices, enabling more adaptive and effective ways to use energy, especially when it comes from renewable and distributed sources or when it needs to be consumed or stored adaptively, for example by electric vehicles.

The Distributed Computing and Systems (DCS) research group is among the leading teams in research on these topics. Based on variable volumes of data (e.g. from Advanced Metering devices), and with appropriate processing, interpretation and use, the goals are to enable functionality related to resource planning, adaptiveness and robustness in electricity networks, with emphasis also on security and privacy.

Contact:
Marina Papatriantafilou (ptrianta@chalmers.se),
Magnus Almgren (magnus.almgren@chalmers.se),
Vincenzo Gulisano,
Olaf Landsiedel


Research area:
Data-Driven and Distributed algorithms for Autonomous and Cooperative Vehicular Environments

Gulliver project

Data and communication are inherent parts of the infrastructure for autonomous and cooperative vehicular environments. The Distributed Computing and Systems (DCS) group’s research involves the design, development and demonstration (the latter through the Gulliver project) of methods for cyber-physical vehicular systems. The target applications base their decisions on sensory and infrastructure information and must deal with the uncertainty inherent in such environments, where variable-volume data is generated by single sensors (e.g. high-rate lidar sensors) or aggregated over parts of the distributed infrastructure for wide-area monitoring.

Contact:
Marina Papatriantafilou (ptrianta@chalmers.se),
Elad Schiller (elad.schiller@chalmers.se),
Olaf Landsiedel (olafl@chalmers.se),
Philippas Tsigas


New research center:
Big Data in Global Systems Science 
 
A new research center has recently been initiated, starting October 2015, with a focus on using high performance ICT technologies to address global challenges such as health risks, green growth and urbanization. The EC Center of Excellence in High Performance Computing for Global Systems Science (COEGSS) has been funded under Horizon 2020.
 
The core approach is based on synthetic information resources: fine-grained, statistically accurate representations of populations that can be used together with large-scale agent-based simulations and analyses to investigate the complex dynamics of socially coupled global systems in transport, urbanization and health.

A first-cut synthetic population model for Sweden has already been developed. It will be further refined and integrated with synthetic population resources for the US, Europe and India, working towards a truly global synthetic population model. The project is coordinated by the Stuttgart HPC Center and includes ten partners from academia and industry in Europe.

COEGSS website: http://coegss-project.eu/
 
Contact:
Devdatt Dubhashi (dubhashi@chalmers.se),
Chris Barrett (affiliated faculty, Virginia Tech, cbarrett@vbi.vt.edu),
Madhav Marathe (affiliated faculty, Virginia Tech, mmarathe@vbi.vt.edu)
 


Proposal:
Urban Futures with Synthetic Populations

 
This is a proposal submitted to Vinnova under the recent call for “Innovations for a sustainable society: environment and transport 2015”, and it is closely aligned with the goals of the COEGSS center described above. The project is coordinated by SP and, in addition to Chalmers, involves public authorities such as Goteborgs Kommun and Trafikverket, and companies such as Changemaker AB and Resourcepoint AB.
 
Contact:
Ragne Emardson (ragne.emerdson@sp.se),
Devdatt Dubhashi (dubhashi@chalmers.se)
 
Research project:
DATABIN: Data Driven Secure Business Intelligence
 
The goal of the project is to develop scalable system architectures, algorithms, development methods and working demonstrators for temporal analysis of the large data sets harvested from open sources (web, social media, etc.) as well as corporate databases (customer data, business intelligence data) to enable new forms of collaborative innovation. In addition to handling very large data sets, these analysis services must have mechanisms for ensuring personal integrity for individuals as well as security for customers. The project involves close collaboration with companies such as Recorded Future, Findwise, Seal Software, Volvo Cars and AstraZeneca. The project has just passed its halfway point and received a very favorable evaluation. Results from the project have been published in top-ranking academic venues such as NIPS, ICML and KDD on the machine learning and data mining side and POPL on the programming languages side. Demonstrators have been successfully integrated and deployed in products of the partner companies.
The project is supported by SSF under its programme for “Information Intensive Systems”, with a budget of 25 M SEK for 2012-17.

Contact:
Devdatt Dubhashi (dubhashi@chalmers.se),
Dave Sands (dave@chalmers.se)

 

Training Network:
DIVA - Data Intensive Visualization and Analysis
The DIVA Project is an Initial Training Network (ITN) funded by the EU within the 7th Framework Programme. It brings together 6 full partner institutions and 8 associated partners from 6 different EU countries. The main goal of the network is to train the next generation of researchers in the fields of 3D data presentation and understanding, with a primary focus on data intensive application environments.
At Chalmers, the research group at t2i Lab, Department of Applied IT, is hosting the project.

DIVA website:  http://diva-itn.ifi.uzh.ch/
Contact: Morten Fjeld (fjeld@chalmers.se)



OECD Report:
OECD Synthesis Report on Data Driven Innovation
  
The overarching theme of the OECD Global Forum on the Knowledge Economy (GFKE) 2014 was data-driven innovation for a resilient society: how creating economic value from large data sets is at the leading edge of business innovation, with companies that base their decisions on data and analytics outperforming other firms in productivity growth.

The OECD Synthesis report on “Data Driven Innovation for Growth and Well Being” was published in October 2015. Devdatt Dubhashi was an expert consultant during the preparation of this report, and also featured in panels on “Big Data” at GFKE 2014 and at the Science Technology and Society (STS) Forum 2014, a gathering of Nobel laureates, CEOs and scientists from all over the world.

OECD Report: Data Driven Innovation: Big Data for Growth and Well-Being
 
Contact: Devdatt Dubhashi (dubhashi@chalmers.se)

 


Research program:
MEDAS: Mesoscopic Data Science (submitted to FET Open)

MEDAS is a foundational research program targeting novel approaches to data analysis and representation and to data-driven modeling of contagion processes, focusing on representations and structures at intermediate scales (“mesoscales”). Many rich data sets describing the behavior and interactions of individuals or socio-economic entities are now available. They have driven important progress in network science and in the development of predictive models of contagion-like phenomena (transmission of infectious diseases, social contagion, financial distress cascades) that confront modern society with huge challenges. The route from data to knowledge and predictions is, however, often difficult, and many challenges remain despite this progress.
 
MEDAS puts forward a new route from data to knowledge by working at intermediate scales (“mesoscales”). This implies developing new methods to extract mesostructures from data, new parsimonious representations of complex data, hierarchies of effective data-driven models of contagion processes, and mapping techniques to bridge the models at different scales.
 
Contact: Devdatt Dubhashi (dubhashi@chalmers.se)
 

Research project:
Culturomics: Towards a Knowledge Based Approach
 
This project is a collaboration between three research groups in language technology and computer science: Sprakbanken at Gothenburg University, the LAB research group at Chalmers and the Language Technology group at Lund University. The main aim is to advance the state of the art in language technology resources and methods for semantic processing of Swedish text, in order to provide researchers and others with more sophisticated tools for working with the information contained in large volumes of digitized text. The LAB group has focused on deep learning methods for semantic representations of text, which automated systems can use to induce word senses and summarise text.
 
The project is supported by a framework grant from the Swedish Research Council (2012-2016) with a total budget of 17 M SEK.
 
Contacts:
Devdatt Dubhashi (dubhashi@chalmers.se),
Lars Borin (lars.borin@svenska.gu.se)

 


Multidisciplinary research team:
NIAS-Lorenz Center Theme Group on Phylogenetics for Linguistics, Sept-Oct 2015

How do languages evolve and what can we infer about the historical sequence of language evolution? This question can now be addressed in the digital age thanks to large volumes of linguistic data sets and massive increases in computational power.
The aim of this theme group is to bring together a multidisciplinary team of linguists, evolutionary biologists, mathematicians and computer scientists to address this question.
 
Contact: Devdatt Dubhashi (dubhashi@chalmers.se)
 

Exchange visit:
Big Data Analytics
 
This is a Vinnova-Marie Curie Academy Outgoing grant for a senior researcher to spend 50% of their time at a company. The company is Persistent (www.persistent.com), a world leader in database and Big Data technologies. One of the overarching goals is to use Persistent's expertise to strengthen and develop Big Data capabilities at Chalmers and to initiate collaborative projects with our industry partners.
 
Contact: Devdatt Dubhashi (dubhashi@chalmers.se)
 

Education:
Big Data Analytics Course
in the national Swedish e-Science Education graduate school (SeSE)
 
This course (http://www.cse.chalmers.se/research/lab/courses.php?coid=10) was conducted by the LAB research group at Chalmers together with Persistent, supported by SeSE and the Chalmers e-Science Center. The course consisted of an introduction to the general area of Big Data analytics together with an intensive hands-on introduction to Hadoop technologies, including MapReduce and Pig, with a special focus on Apache Spark. Visualization was illustrated using the Spotfire suite. The demos and exercises were carried out on the Amazon EC2 platform. Application domains ranged from life science to nuclear fusion and turbulent flows.
 
Contact:
Devdatt Dubhashi (dubhashi@chalmers.se),
Mukund Deshpande (mukund_deshpande@persistent.com)
 
 
Education:
Chalmers Machine Learning Summer School
 
The first Chalmers Machine Learning Summer School was held April 14-16, 2015 and included tutorials on Bayesian inference, deep learning, Gaussian processes, Markov decision processes, Monte Carlo methods and reinforcement learning. The applications included computational biology, computer vision, energy and smart grids, medicine and robotics.
 
Contact:
Christos Dimitrakakis (chrdimi@chalmers.se),
Devdatt Dubhashi (dubhashi@chalmers.se)
 

Research project:
Computational Biology
 
A collaborative project between the LAB research group and the Nielsen lab for Systems and Synthetic Biology at Chalmers involves integrating the metabolic and gene regulatory systems. A novel approach, which introduces factor graphs to this setting for the first time, is used to improve predictions of how perturbations in the gene regulatory system affect reactions in the metabolic system. A related project involves new methods for random sampling from the original and perturbed metabolic systems, together with flux balance analysis.
 
Another recently initiated collaborative project, with Mathematical Statistics, involves the development of new computational and statistical methods for the analysis of large, high-dimensional data sets, applied in particular to data from The Cancer Genome Atlas (TCGA).
 
Image: Computational Biology
 
Contact:
Devdatt Dubhashi (dubhashi@chalmers.se),
Intawat Nookaev (intawat@chalmers.se),
Rebecka Jornsten (jornsten@chalmers.se)
 

 

 


Research projects:
ICTBioMed: Quantitative Medical Imaging and Drug Repurposing
 
ICTBioMed is an international consortium spanning the US, Europe and India that brings together diverse expertise to tackle major biomedical challenges. One project involves developing new methods for the analysis and integration of imaging data, and a second involves ICT methods for drug repurposing. Both projects involve close collaboration with the National Cancer Institute (NCI) in the US.
 
Contact:
Devdatt Dubhashi (dubhashi@chalmers.se),
Rolf Heckeman (rolf.heckeman@medtechwest.se)
 

Research project:
Market Mechanisms for Multiple Minds

This project, in collaboration with Harvard University and funded by Västra Götaland's MoRE program, will develop novel market-based methodologies for distributed AI architectures. The project focuses on problems such as large-scale automatic experiment design or crowdsourced intelligent transportation systems, where humans and AI must be appropriately incentivized to perform work and computations. An immediate application is experiment design for drug development, in collaboration with AstraZeneca.

Contact:
Christos Dimitrakakis (chrdimi@chalmers.se),
David Parkes (Harvard)


Research project:
Swiss Sense Synergy

This project, in collaboration with the University of Geneva, the University of Bern and the University of Applied Sciences of Southern Switzerland, and funded by the Swiss National Science Foundation, will develop novel models and algorithms for incentive-compatible, privacy-preserving crowdsourced mobile platforms. Applications include not only crowdsourcing staples such as recommendation systems, but also an experimentation platform for social science research.

Contact:
Christos Dimitrakakis (chrdimi@chalmers.se),
Aikaterini Mitrokotsa (aikmitr@chalmers.se)


Research project:
Structural Equation Modelling for Particle Accelerators

This project, in collaboration with CERN, will develop methods for automatically inferring simulation models for the LHC. Such models are important for predicting the most efficient LHC configurations for different experiments, as well as failures. The challenge is to devise a data-driven procedure that chooses the simulation model with limited human intervention, and to employ it within the LHC control room to aid human decision making when configuring the LHC for different experiments.

Contact:
Christos Dimitrakakis (chrdimi@chalmers.se)


Research project:
Rational verification and outsourced computation

This project, in collaboration with the Tokyo Institute of Technology, will develop incentive-compatible algorithms for verifiable outsourced computation. These are particularly important for distributed algorithms that are based on micro-transactions, and for resource-constrained devices that rely on cloud computing services to perform their work.

Contact:
Christos Dimitrakakis (chrdimi@chalmers.se),
Keisuke Tanaka (Tokyo Tech)

Published: Mon 22 Jun 2015. Modified: Wed 28 Oct 2015