Marco Colombo

Research interests


Since October 2009 I have been based in the Centre for Population Health Sciences, now part of the Usher Institute of Population Health Sciences and Informatics, as a member of Paul McKeigue's group.

I initially worked on the genetic epidemiology software admixmap with Paul McKeigue: I extended it to work correctly on the X chromosome, and we used this extension to better understand the genetic determinants of sarcoidosis in African-American populations (Genes and Immunity 2011). Afterwards, I implemented a computationally efficient factorization that allows pedigree data to be exploited in hours rather than weeks (Genetic Epidemiology 2013).

I am involved in the MATURA project, a consortium working on rheumatoid arthritis that aims to target the available treatments at those who would benefit most, thereby improving the selection of the best treatment for individual patients. The first results concerned genome-wide association studies of response to methotrexate (The Pharmacogenomics Journal 2018a) and to tumour necrosis factor inhibitor therapy (The Pharmacogenomics Journal 2018b), as well as prediction of response from genome-wide SNP data (Genetic Epidemiology 2018); other work is ongoing.

For a separate study on early rheumatoid arthritis (PROMISERA), in work led by Athina Spiliopoulou, we developed a novel approach to imputation of ultralow coverage sequence data (Genetics 2017). It is implemented in GeneImp, which relies on the existence of a very large reference panel to avoid modelling recombination explicitly. This leads to much faster imputation of this type of data with minimal loss of quality.

I am one of the core developers of GENOSCORES, a platform that provides a framework for calculating genotypic predictors of binary and quantitative phenotypes from publicly available summary results of genome-wide association studies of multiple phenotypes, -omic measurements and gene expression. It was used to explore pleiotropy in the genetic determinants of male pattern baldness (Nature Communications 2017).
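To give a rough idea of what a genotypic predictor computed from summary results looks like, here is a minimal Python sketch of the simplest possible form: an unadjusted weighted sum of effect-allele dosages. This is not the GENOSCORES implementation or its interface. The function name, the SNP identifiers and the weights are all hypothetical, and the sketch ignores correlation between SNPs entirely.

```python
def genotypic_score(dosages, weights):
    """Unadjusted weighted sum of effect-allele dosages for one individual.

    dosages: dict mapping SNP id -> effect-allele dosage (between 0 and 2)
    weights: dict mapping SNP id -> per-allele effect size taken from
             GWAS summary results
    SNPs that are not present in both inputs are skipped.
    """
    shared = dosages.keys() & weights.keys()
    return sum(weights[snp] * dosages[snp] for snp in shared)


# Hypothetical toy inputs, for illustration only.
weights = {"rs1": 0.5, "rs2": -0.25, "rs3": 0.125}
dosages = {"rs1": 2, "rs2": 1, "rs4": 0}

score = genotypic_score(dosages, weights)
```

Only rs1 and rs2 appear in both inputs, so they are the only SNPs that contribute to the score here.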

Biomarker discovery and prediction of diabetes complications

Between March 2012 and December 2015 I worked on the SUMMIT project, a European research consortium dedicated to diabetes complications, focusing in particular on non-genetic biomarkers, data mining and in-silico modelling. In this context, we developed and applied machine learning algorithms for biomarker screening and prediction from high-dimensional data. The complications we principally studied were cardiovascular disease (Diabetologia 2015, Atherosclerosis 2018) and rapid progression of diabetic chronic kidney disease (Kidney International 2015, Diabetologia 2018). During the project we also tackled the problem of in-silico identification of unknown metabolites (Journal of Chromatography B 2017).

Since November 2014 I have been working with Helen Colhoun on data from the SDRNT1BIO cohort, a large cohort of patients with type 1 diabetes linked to electronic health data and genotypes, in which proteins, metabolites, tryptic peptides and glycans were measured in a subset of samples. The task is to identify biomarkers related to diabetes control and progression of chronic kidney disease, and to understand their genetic determinants. One of the studies examined the relationship of N-glycans with progression of renal disease (Diabetes Care 2018).

Large-scale optimization and structure exploitation

My research on the theory and implementation of Interior Point Methods for linear and quadratic programming concentrated particularly on the study of search directions and warm-start approaches.

During my PhD, under the supervision of Jacek Gondzio, I improved the implementation of corrector directions in the HOPDM interior point solver (COAP 2008), before moving on to work with the structure-exploiting parallel solver OOPS. I implemented an SMPS interface for OOPS, HOPDM and CPLEX (extended to LP_SOLVE and GLPK), which allows a stochastic programming problem to be solved by formulating the corresponding deterministic equivalent problem. This implementation also allows a problem instance to be solved with a warm-start, by first solving a problem of reduced dimensions (Mathematical Programming 2011).
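As an illustration of what formulating a deterministic equivalent involves, here is a minimal Python sketch for a two-stage problem with a finite set of scenarios. It is not the SMPS interface described above. It assumes equality constraints A x = b in the first stage and T_s x + W_s y_s = h_s for each scenario s, and the function name and data layout are hypothetical.

```python
def deterministic_equivalent(c, A, b, scenarios):
    """Assemble the deterministic equivalent of a two-stage stochastic LP.

    First stage:  minimise c'x  subject to  A x = b
    Scenario s:   recourse cost q_s'y_s with probability p_s,
                  subject to  T_s x + W_s y_s = h_s
    scenarios is a list of tuples (p, q, T, W, h).
    Returns (objective, constraint_rows, right_hand_side) as plain lists.
    """
    n1 = len(c)
    n2s = [len(q) for (_, q, _, _, _) in scenarios]
    n_recourse = sum(n2s)

    # Objective: first-stage costs, then probability-weighted recourse costs.
    objective = list(c)
    for p, q, _, _, _ in scenarios:
        objective += [p * qi for qi in q]

    rows, rhs = [], []
    # First-stage rows only involve x.
    for a_row, bi in zip(A, b):
        rows.append(list(a_row) + [0.0] * n_recourse)
        rhs.append(bi)

    # Each scenario contributes a block [T_s, 0, ..., W_s, ..., 0].
    offset = n1
    for (_, _, T, W, h), n2 in zip(scenarios, n2s):
        for t_row, w_row, hi in zip(T, W, h):
            full = list(t_row) + [0.0] * n_recourse
            full[offset:offset + n2] = list(w_row)
            rows.append(full)
            rhs.append(hi)
        offset += n2
    return objective, rows, rhs


# Hypothetical toy instance: two first-stage variables, two scenarios.
c = [1.0, 2.0]
A = [[1.0, 1.0]]
b = [1.0]
scenarios = [
    (0.5, [4.0], [[1.0, 0.0]], [[1.0]], [2.0]),
    (0.5, [6.0], [[0.0, 1.0]], [[1.0]], [3.0]),
]
objective, rows, rhs = deterministic_equivalent(c, A, b, scenarios)
```

The resulting constraint matrix has the familiar block-angular shape, with the first-stage columns linking otherwise independent scenario blocks; this is the kind of structure that structure-exploiting solvers are designed to take advantage of.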

Between 2007 and 2009 I was employed on an EPSRC-funded project with Andreas Grothey, in which we continued the investigation of warm-start strategies for interior point methods in the context of stochastic programming. This led us to consider a multi-step approach, in which the number of intermediate problems can be more than one (ERGO 09-007), and a decomposition-like strategy, in which we generate and warm-start the subproblems rooted at the second-stage nodes (COAP 2013). I also designed the stochastic programming extension for the structure-conveying modelling language SML (Mathematical Programming Computation 2009).

2003-2018 © marco