Information on fit data for Wood, Li, Shaddick and Augustin (2016) "Generalized additive models for gigadata: modelling the UK black smoke network daily data" Journal of the American Statistical Association.

For licensing reasons we are not able to distribute the prediction data (see below).

Fit Data
--------

The file black_smoke.RData contains an R (cran.r-project.org) data file that can be read into R with commands like...

  setwd("where/ever/you/put/the/data") ## edit this!
  load("black_smoke.RData")

The variables are as follows. See the paper and supplementary material for more information.

* bs is log(black smoke + 1). Black smoke is measured in micrograms per cubic metre.
* id is a station identification factor.
* day is day of whole dataset.
* year is year.
* yday is day of year (Jan 1st is 1).
* wday is a numeric code for day of week.
* Tmin0/Tmax0 are minimum and maximum temperature in Celsius for the day in question.
* Tmean1/Tmean2 are mean temperature on the preceding 2 days (Celsius).
* rainfall is monthly rainfall in mm.
* x and y are easting and northing on a kilometre grid.
* h is elevation in metres.
* type is station type.
* AR_start is a binary indicator of when an AR1 autocorrelation sequence should start.
* type1 is a simplified classification of station types.
* month is month of year (1=January).

License information
-------------------

The black smoke monitoring data themselves are free from restrictions, but the covariate data are provided under a variety of licences, detailed here.

* Elevation (OS Open Data): https://www.ordnancesurvey.co.uk/business-and-government/licensing/using-creating-data-with-os-products/os-opendata.html - this gives wide freedom to redistribute.
* Rainfall (UK Met office UKCP09): http://www.nationalarchives.gov.uk/doc/non-commercial-government-licence/non-commercial-government-licence.htm - this also gives wide freedom to redistribute.
* Daily temperature (UK Met office UKCP09): http://www.metoffice.gov.uk/climatechange/science/monitoring/ukcp09/UKCIP08_license_agreement_130709.pdf - this licence is more restrictive. We can re-distribute the `derived' data used for fitting, but the prediction data are essentially `primary' data under the terms of the licence, so we have not been able to provide them. However, any user can download them free of charge for non-commercial purposes.
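
Example: working with the fit data variables
--------------------------------------------

The following is a minimal sketch of how the bs, year and yday variables described under "Fit Data" might be used once the data are loaded. This README does not state the name of the object stored in black_smoke.RData, so the data frame name dat is hypothetical, and the three rows of values are invented purely so that the snippet runs without the real file.

```r
## Toy stand-in for the loaded data (made-up values, NOT real measurements).
## 'dat' is a hypothetical name; the real object name is not given in this README.
dat <- data.frame(
  bs   = log(c(12, 0, 35) + 1),  # bs is log(black smoke + 1)
  year = c(1961, 1961, 1964),
  yday = c(1, 32, 366)           # day of year, Jan 1st is 1
)

## Undo the log(x + 1) transform to recover black smoke
## in micrograms per cubic metre.
dat$black_smoke <- expm1(dat$bs)

## Build a calendar date from year and day of year.
## yday = 1 is 1 January, hence the offset of yday - 1 days.
dat$date <- as.Date(dat$yday - 1, origin = paste0(dat$year, "-01-01"))

print(dat)
```

With the real file, load() returns (invisibly) the names of the objects it creates, so print(load("black_smoke.RData")) shows which name to use in place of dat.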