Environmental

Predicting E. coli risk from continuous sensor data

Chris Prosser

ML Team Lead

Last year we began a partnership with the North Devon Biosphere and launched a dashboard exploring continuous sensor data in the region. Since then, several data sources have been added, including sample data collected by citizen scientists and Environment Agency (EA) bathing water quality data.

The samples are analysed for E. coli (bacterial pollution) as well as other water quality indicators. Volunteers regularly collect water samples at sites, and these are analysed using culture-based methods (growing colonies on a plate), in this case with a portable testing unit rather than by sending them to a lab. The bacterial load is reported as a count of colony-forming units (cfu) per 100 ml. High E. coli levels indicate faecal contamination, which can pose a health risk to people using the water.

We are interested in whether the E. coli level can be predicted from the continuous sensor data we have for rainfall, water quality and water depth within the catchment. Sampling to measure E. coli is infrequent because it relies on volunteers' time and on lab resources or expensive equipment. This leaves gaps between measurements, and even once a sample has been taken there is a lag before the result is known. A model that uses continuous sensor data to learn from the samples already collected could address both timing issues: filling in between samples and giving an indication of current risk without waiting for lab results.
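As a rough illustration of the idea, and not the model used in this project, the sketch below pairs each E. coli sample with the rainfall recorded in the preceding 24 hours and fits a straight line to the log of the count. Everything in it is an assumption made for the example: the function names (`rainfall_before`, `fit_line`, `estimate_cfu`), the 24-hour window, the choice of rainfall as the only feature, and the data, which is made up.

```python
from datetime import datetime, timedelta
from math import log10


def rainfall_before(readings, when, hours=24):
    """Total rainfall (mm) in the `hours` leading up to `when`."""
    start = when - timedelta(hours=hours)
    return sum(mm for ts, mm in readings if start <= ts < when)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def estimate_cfu(readings, when, a, b):
    """Estimated cfu per 100 ml at `when`, from recent rainfall only."""
    return 10 ** (a + b * rainfall_before(readings, when))


# Made-up data for illustration only: hourly rainfall (mm) over three days
# and three samples (cfu per 100 ml) taken at the end of each day.
t0 = datetime(2024, 1, 1)
hourly_mm = [0] * 24 + [1] * 24 + [2] * 24
rain = [(t0 + timedelta(hours=h), mm) for h, mm in enumerate(hourly_mm)]
samples = [
    (t0 + timedelta(hours=24), 100),
    (t0 + timedelta(hours=48), 1000),
    (t0 + timedelta(hours=72), 10000),
]

# Build one (feature, target) pair per sample, then fit on log10(cfu).
xs = [rainfall_before(rain, when) for when, _ in samples]
ys = [log10(cfu) for _, cfu in samples]
a, b = fit_line(xs, ys)

# Estimate the level at a time when no sample was taken.
between = t0 + timedelta(hours=60)
print(f"Estimated E. coli at {between}: {estimate_cfu(rain, between, a, b):.0f} cfu/100 ml")
```

Fitting on the log of the count is a common choice for bacterial counts because they can span several orders of magnitude. A realistic model would need more samples, more features (for example water quality and water depth) and proper validation.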

Continue reading the full article here.