When planning a research project, it is challenging to form a clear picture at proposal stage of the extent of modelling and analysis required, and of the data that this work will need. Because data collection takes considerable time and resources, there is a danger that when key analysis is due (some time into the project), the available data does not meet the requirements of that analysis.

The challenge

There can be a considerable gap between planning data collection (pre-proposal) and actually using that data for analysis within a working project. Once the data is received, multiple limitations may be discovered that impair its use for a specific analysis, e.g.:

  • insufficient sample size,
  • temporal or spatial resolution does not match that required,
  • not all parameters required are measured, or
  • consent agreed with data subjects unexpectedly restricts use of the data.

The solution

To mitigate some of the problems identified above, a project team needs to:

  1. Carry out data collection design at project proposal stage, mapping project objectives to the data they require at a sufficient level of detail.
  2. Understand what the project objectives can involve, and investigate the use of modelling processes to help fill in incomplete or insufficient datasets.

ReFLEX approach

ReFLEX was able to identify required datasets at an early stage and, thanks to the contacts and previous work of consortium partners, to leverage existing datasets, reducing the risk associated with any new dataset that had yet to be collated. Within ReFLEX, a Data Working Group was convened to design and implement a data governance strategy and to maintain a live inventory of available datasets (including, where appropriate, third-party datasets). This let researchers easily identify whether the required data was available and whether its specifics would be appropriate for the planned analysis.
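As an illustration only, the idea of checking a planned analysis against a data inventory can be sketched in a few lines of Python. This is not the ReFLEX inventory or its tooling: the class names, field names, `find_gaps` function and all example values are hypothetical, chosen to mirror the four limitations listed under "The challenge".

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """One row of a hypothetical data inventory (fields are assumptions)."""
    name: str
    sample_size: int                 # e.g. number of households
    resolution_minutes: int          # interval between readings
    parameters: set[str] = field(default_factory=set)
    permitted_uses: set[str] = field(default_factory=set)  # uses covered by consent


@dataclass
class AnalysisRequirement:
    """What a planned analysis needs from a dataset (fields are assumptions)."""
    min_sample_size: int
    max_resolution_minutes: int
    parameters: set[str]
    intended_use: str


def find_gaps(entry: DatasetEntry, req: AnalysisRequirement) -> list[str]:
    """Return the reasons a dataset falls short of a requirement."""
    gaps = []
    if entry.sample_size < req.min_sample_size:
        gaps.append("insufficient sample size")
    if entry.resolution_minutes > req.max_resolution_minutes:
        gaps.append("temporal resolution too coarse")
    missing = req.parameters - entry.parameters
    if missing:
        gaps.append("missing parameters: " + ", ".join(sorted(missing)))
    if req.intended_use not in entry.permitted_uses:
        gaps.append("consent does not cover intended use")
    return gaps


# Made-up example values, not ReFLEX data.
smart_meter = DatasetEntry(
    name="household_smart_meter",
    sample_size=120,
    resolution_minutes=30,
    parameters={"electricity_import_kwh"},
    permitted_uses={"service_improvement"},
)
demand_model = AnalysisRequirement(
    min_sample_size=200,
    max_resolution_minutes=30,
    parameters={"electricity_import_kwh", "indoor_temperature_c"},
    intended_use="service_improvement",
)

gaps = find_gaps(smart_meter, demand_model)
for gap in gaps:
    print(gap)
```

Running a check like this at proposal stage, rather than when the analysis is due, would surface each gap while there is still time to extend data collection or plan a modelling workaround.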

This also helped to identify data that was not available, and therefore to apply modelling approaches that could alleviate the problem (e.g. synthesizing data, or extrapolating existing data to represent larger samples).

In combination, this provided a good empirical baseline to describe the energy systems being investigated.

What could have been done differently?

Larger samples of data (e.g. number of households) could have been identified at an earlier stage, though some rollout of metering was impacted by the Covid pandemic. Also, data consent arrangements with households were intentionally kept quite restrictive, such that data could only be used within ReFLEX for activities that may improve services being offered to customers. To enhance data legacy, the consent forms could have offered an additional option for households to consent to their data being used beyond the needs of ReFLEX services alone. However, this would have needed to approach the concept of “informed” consent sensitively, ensuring data subjects were given full disclosure of the intended uses of their data.

Neither of these factors altered the primary objectives of the project, but addressing them may have allowed for more flexible timelines and for secondary tasks to be completed more effectively, as well as giving the data more impact post-ReFLEX.
