School of Mathematics

Ruth King

Becky Nisbet has written the following article, featuring Ruth King, as part of our series of Academic Interviews!

A conversation with Prof. Ruth King, the Thomas Bayes’ Professor of Statistics here in the School of Mathematics.

 

Can you tell me what you’re currently researching and working on?

I’m primarily an applied statistician: I work on interesting problems in datasets that require new techniques in order to analyse them.

One project I’m working on is related to bird species, and the particular data I’m looking at is a population of around 30,000 guillemots. We’ve been working with biologists directly to model the data and we’ve just published a paper on estimating survival rates, which is important for conservation.

As we have nearly 30,000 individuals, the data analysis becomes very computationally intensive and existing techniques can’t always be scaled up to these numbers; they “keel over”, and fitting a single model to the data can take more than a week. We’re trying to develop more efficient computational techniques so that we can analyse the data in a much shorter time and sensibly fit the models that we want.

Another area I’m working on is estimating the sizes of ‘hidden populations’, such as injecting drug users or modern-day slaves. These groups are very difficult to observe, so we rely on smaller sample populations observed, for example, via police or GP records, and then combine the observed data from these different sources to obtain a total estimate.
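As an illustration of how overlapping lists can yield a total, here is a minimal sketch of the classical two-list (Lincoln-Petersen) estimate. It is added for illustration only and is not necessarily the method used in Prof. King's work, which may involve more sources and richer models. The function name and all counts are hypothetical.

```python
def two_list_estimate(n_list1, n_list2, n_both):
    """Classical two-list (Lincoln-Petersen) estimate of a total population.

    n_list1: number of people appearing on the first list (hypothetical)
    n_list2: number of people appearing on the second list (hypothetical)
    n_both:  number of people appearing on both lists
    """
    if n_both <= 0:
        raise ValueError("The two lists must overlap to give an estimate.")
    # If the lists are independent, the share of list 2 that is also on
    # list 1 estimates the share of the whole population on list 1.
    return n_list1 * n_list2 / n_both


# Made-up counts, purely for illustration.
estimate = two_list_estimate(n_list1=200, n_list2=150, n_both=30)
print(estimate)  # 1000.0
```

The estimate rests on strong assumptions, notably that appearing on one list does not change the chance of appearing on the other.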

 

Can you explain ‘Bayesian modelling’ to someone who doesn’t have a background in statistics?

Modelling is taking reality, simplifying it and trying to extract what the most important drivers are to understand the system of interest. For Bayesian modelling, even before looking at the dataset, we typically already have preconceived ideas about what we’re trying to estimate, or we may have previous data that provides us with some understanding. For example, you may think you know nothing about guillemots, but actually you know that the survival rate will not be something like 10% because they’d be extinct. Once you’ve collected your data, you analyse it as part of the Bayesian philosophy: you update your preconceived ideas, typically called your “prior” beliefs, with the information contained in the data, to form your updated, or posterior, beliefs about the system being modelled.
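To make the step from prior to posterior concrete, here is a minimal sketch using the standard Beta-binomial update for a survival rate. It is added as an illustration and is not taken from Prof. King's models. The prior parameters, the counts, and the function names are all hypothetical.

```python
def update_beta(prior_a, prior_b, survived, died):
    """Combine a Beta(prior_a, prior_b) prior on a survival rate with
    observed counts of survivors and deaths; returns the posterior's
    Beta parameters."""
    return prior_a + survived, prior_b + died


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical prior: survival is believed to be high, centred on 0.8.
prior_a, prior_b = 8.0, 2.0

# Made-up data: 45 of 50 marked birds survived the year.
post_a, post_b = update_beta(prior_a, prior_b, survived=45, died=5)

print(beta_mean(prior_a, prior_b))          # 0.8
print(round(beta_mean(post_a, post_b), 3))  # 0.883
```

The posterior mean sits between the prior mean (0.8) and the observed proportion (0.9), and moves closer to the data as more birds are observed.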

 

What is it that excites you most about statistics?

I have always loved statistics. A lot of the statistics done at school is descriptive – pie charts and bar graphs! At university it’s much, much more than this… in statistics what you’re trying to do is essentially solve a puzzle! I’m a big fan of Agatha Christie and here I think there are many parallels with statistics! A murder victim and the associated evidence is simply a dataset. You’re trying to extract information from that dataset: who did it, how did they do it, and why did they do it? Statistics extracts the often hidden information from the data. The statistical evidence for the answers to the questions needs to “stand up in court”, so rigorous methods are required to enable you to support your conclusions or accusations!

 

Statistics is an area of maths that is so often misinterpreted by non-specialists. Can you think of one thing that you wish more people knew about statistics?

We all know there’s “lies, damn lies, and statistics” – which itself is an interesting statement often used negatively towards statistics, but it is possibly a misconstrued statement as it could mean that statistics can clear up these things! There’s been a lot of interest over the last year about everyone becoming armchair statisticians with all the data relating to COVID. One thing I’ll say that I want people to realise is that statistics is not about tabulating numbers, it’s about extracting information from the data. And this information can be very well hidden at times.

 

What’s next for your work? Do you have any exciting projects coming up?

Yes! One of my projects involves capture-recapture, motivated not by the data but by the underlying maths and models.

Capture-recapture does what it “says on the tin”, and is most easily seen with animals. It’s about observing a given species to understand more about the population. You go out and capture all the animals you see, and you uniquely mark these animals so you can identify each individual. You do this on a series of occasions and create a huge matrix of binary data, where each row corresponds to an individual and each column to a capture occasion. A 0 entry in the matrix means an individual is not observed at the given capture occasion; a 1 means that it is observed. It is interesting how much information can be extracted from just this binary matrix!
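The binary matrix described above can be sketched in a few lines. This is an illustration only: the sighting records, individual labels, and function name are hypothetical, and a real study would have far more rows and columns.

```python
def capture_histories(sightings, n_occasions):
    """Build a 0/1 capture-history matrix: one row per individual,
    one column per capture occasion."""
    individuals = sorted({ind for ind, _ in sightings})
    row_of = {ind: i for i, ind in enumerate(individuals)}
    matrix = [[0] * n_occasions for _ in individuals]
    for ind, occasion in sightings:
        matrix[row_of[ind]][occasion] = 1  # 1 = observed on this occasion
    return individuals, matrix


# Made-up records: (individual's mark, index of the capture occasion).
sightings = [("A", 0), ("A", 2), ("B", 1), ("C", 0), ("C", 1), ("C", 2)]

individuals, matrix = capture_histories(sightings, n_occasions=3)
for ind, row in zip(individuals, matrix):
    print(ind, row)
# A [1, 0, 1]
# B [0, 1, 0]
# C [1, 1, 1]
```

Note that an individual never seen on any occasion has no row at all, which is part of what makes estimating the total population a statistical problem.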

However, modern-day capture-recapture isn’t as simple as going out and seeing animals. You now get camera traps with motion sensors set up in a spatial array, so when you see an animal you know exactly where and when you saw it. This technique started about 15 years ago but there’s one thing that the associated statistical tools have nearly all failed to take into account: they implicitly assume animals may teleport! What I’ve recently managed to show is that you can write the statistical models in a different way, using an exactly equivalent model that allows you to incorporate memory and stop animals teleporting. Next on my to-do list is to develop this idea further and investigate the impact of these new models on real data problems.

 

Finally, is there any advice you’d give to any current maths undergraduates?

Do something that you enjoy doing! There’s nothing worse than doing something you don’t like, so focus on what you really love. Also think about a bit of breadth; when you’re a student, it’s the time you can really try something new. Once you get past the student-stage you may have too many other pressures on. I know you have a lot of pressures on you as a student but it only gets worse!