My research on the theory and implementation of interior point methods for linear and quadratic programming concentrated on search directions and warm-start approaches.
During my PhD, under the supervision of Jacek Gondzio, I improved the implementation of corrector directions in the HOPDM interior point solver (COAP 2008), before moving on to work with the structure-exploiting parallel solver OOPS. I implemented an SMPS interface for OOPS, HOPDM and CPLEX (later extended to LP_SOLVE and GLPK), which allows a stochastic programming problem to be solved by formulating the corresponding deterministic equivalent problem. This implementation can also warm-start a problem instance by first solving a problem of reduced dimensions (Mathematical Programming 2011).
I was employed on an EPSRC-funded project with Andreas Grothey, in which we continued the investigation of warm-start strategies for interior point methods in the context of stochastic programming.
This led us to consider a multi-step approach, in which there can be more than one intermediate problem (ERGO 09-007), and a decomposition-like strategy, in which we generate and warm-start the subproblems rooted at the second-stage nodes (COAP 2013). I also designed the stochastic programming extension for the structure-conveying modelling language SML (Mathematical Programming Computation 2009).
Bioinformatics and machine learning
In October 2009 I moved to the Centre for Population Health Sciences, where I worked on the genetic epidemiology software admixmap with Paul McKeigue. I extended it to work correctly on the X chromosome, and we used this to better understand the genetic determinants of sarcoidosis in African-American populations (Genes and Immunity 2011). Afterwards, I implemented a computationally efficient factorization that allows pedigree data to be exploited in hours rather than weeks (Genetic Epidemiology 2013).
Between March 2012 and December 2015 I worked on the SUMMIT project, a European research consortium dedicated to diabetes complications. I was a member of Work Package 2 (non-genetic biomarkers) and Work Package 5 (data mining and in-silico modelling). In this context, we applied and developed machine learning algorithms for biomarker screening and prediction from high-dimensional data. The complications we principally looked at were cardiovascular disease (Diabetologia 2015) and rapid progression of diabetic chronic kidney disease (Kidney International 2015).
Since November 2014 I have been working on data from the SDRNT1BIO cohort, a large cohort of patients with type 1 diabetes linked to electronic health data and genotypes, in which proteins, metabolites, tryptic peptides and glycans were measured in a subset of samples. The task is to identify biomarkers related to diabetes control and progression of chronic kidney disease, and to understand their genetic determinants.
For a separate study on rheumatoid arthritis, we developed a novel approach to imputation of ultra-low coverage sequence data. This is implemented in Geneimp, which relies on the existence of a very large reference panel to avoid modelling recombination explicitly, leading to much faster imputation with minimal loss of quality.
The GENOSCORES platform provides a framework for calculating genotypic predictors of binary and quantitative phenotypes from publicly available summary results of genome-wide association studies of multiple phenotypes, -omic measurements and gene expression.