
Seminar: Data Fusion (of Everything) by Prof. Blaz Zupan

10th February 2016

2:00 pm - 3:30 pm

L.T. 1.4, Kilburn Building, The University of Manchester

Abstract

Have you ever been overwhelmed by data, not only by their volume but also by their sheer multitude? In many fields, including the life sciences, data abound. One of the grand challenges of machine learning is to infer a predictive model by jointly considering all the available data sets. In bioinformatics, this would mean integrating data sets as diverse as, say, gene expression, interactions, functional annotations, phenotype information, various ontologies, disease markers, structural properties of chemicals, Facebook and Twitter (ok, I perhaps went too far with the last two items). That is, the learning algorithm would need to consider all available information, even if it is only circumstantially related to the problem at hand. At the University of Ljubljana we have developed a computational approach that uses collective matrix tri-factorization and can consider such diverse data sets. Tri-factorization infers a joint latent data model. The model can be used for various data mining tasks, such as class prediction and ranking. Our experiments show that a broad integration of heterogeneous data sets can substantially increase predictive accuracy. In the talk, I will present the intuition behind data fusion by matrix tri-factorization and show its application in several recent studies, including the discovery of new bacterial response genes in the social amoeba Dictyostelium.

Biography

Blaz Zupan is a professor at the Faculty of Computer and Information Science, University of Ljubljana, head of the Laboratory for Bioinformatics (http://biolab.si), and a visiting professor at Baylor College of Medicine in Houston, USA. His main research interests are data mining, machine learning, data fusion, and interactive data visualization. His lab develops Orange (http://orange.biolab.si), a popular open-source data mining suite. Using visual programming, users of Orange can combine basic analytical components and build powerful workflows for data analytics.