Standard

Robust and automatic data cleansing method for short-term load forecasting of distribution feeders. / Huyghues-Beaufond, Nathalie; Tindemans, Simon; Falugi, Paola; Sun, Mingyang; Strbac, Goran.

In: Applied Energy, Vol. 261, 2020, p. 1-17.

Research output: Contribution to journal › Article › Scientific › peer-review

Author

Huyghues-Beaufond, Nathalie; Tindemans, Simon; Falugi, Paola; Sun, Mingyang; Strbac, Goran. / Robust and automatic data cleansing method for short-term load forecasting of distribution feeders. In: Applied Energy. 2020; Vol. 261. pp. 1-17.

BibTeX

@article{18fad1de13204a26b412f5c17d5ebe18,
title = "Robust and automatic data cleansing method for short-term load forecasting of distribution feeders",
abstract = "Distribution networks are undergoing fundamental changes at medium voltage level. To support growing planning and control decision-making, the need for large numbers of short-term load forecasts has emerged. Data-driven modelling of medium voltage feeders can be affected by (1) data quality issues, namely, large gross errors and missing observations (2) the presence of structural breaks in the data due to occasional network reconfiguration and load transfers. The present work investigates and reports on the effects of advanced data cleansing techniques on forecast accuracy. A hybrid framework to detect and remove outliers in large datasets is proposed; this automatic procedure combines the Tukey labelling rule and the binary segmentation algorithm to cleanse data more efficiently, it is fast and easy to implement. Various approaches for missing value imputation are investigated, including unconditional mean, Hot Deck via k-nearest neighbour and Kalman smoothing. A combination of the automatic detection/removal of outliers and the imputation methods mentioned above are implemented to cleanse time series of 342 medium-voltage feeders. A nested rolling-origin-validation technique is used to evaluate the feed-forward deep neural network models. The proposed data cleansing framework efficiently removes outliers from the data, and the accuracy of forecasts is improved. It is found that Hot Deck (k-NN) imputation performs best in balancing the bias-variance trade-off for short-term forecasting.",
author = "Nathalie Huyghues-Beaufond and Simon Tindemans and Paola Falugi and Mingyang Sun and Goran Strbac",
year = "2020",
doi = "10.1016/j.apenergy.2019.114405",
language = "English",
volume = "261",
pages = "1--17",
journal = "Applied Energy",
issn = "0306-2619",
publisher = "Elsevier",
}

RIS

TY - JOUR

T1 - Robust and automatic data cleansing method for short-term load forecasting of distribution feeders

AU - Huyghues-Beaufond, Nathalie

AU - Tindemans, Simon

AU - Falugi, Paola

AU - Sun, Mingyang

AU - Strbac, Goran

PY - 2020

Y1 - 2020

N2 - Distribution networks are undergoing fundamental changes at medium voltage level. To support growing planning and control decision-making, the need for large numbers of short-term load forecasts has emerged. Data-driven modelling of medium voltage feeders can be affected by (1) data quality issues, namely, large gross errors and missing observations (2) the presence of structural breaks in the data due to occasional network reconfiguration and load transfers. The present work investigates and reports on the effects of advanced data cleansing techniques on forecast accuracy. A hybrid framework to detect and remove outliers in large datasets is proposed; this automatic procedure combines the Tukey labelling rule and the binary segmentation algorithm to cleanse data more efficiently, it is fast and easy to implement. Various approaches for missing value imputation are investigated, including unconditional mean, Hot Deck via k-nearest neighbour and Kalman smoothing. A combination of the automatic detection/removal of outliers and the imputation methods mentioned above are implemented to cleanse time series of 342 medium-voltage feeders. A nested rolling-origin-validation technique is used to evaluate the feed-forward deep neural network models. The proposed data cleansing framework efficiently removes outliers from the data, and the accuracy of forecasts is improved. It is found that Hot Deck (k-NN) imputation performs best in balancing the bias-variance trade-off for short-term forecasting.

AB - Distribution networks are undergoing fundamental changes at medium voltage level. To support growing planning and control decision-making, the need for large numbers of short-term load forecasts has emerged. Data-driven modelling of medium voltage feeders can be affected by (1) data quality issues, namely, large gross errors and missing observations (2) the presence of structural breaks in the data due to occasional network reconfiguration and load transfers. The present work investigates and reports on the effects of advanced data cleansing techniques on forecast accuracy. A hybrid framework to detect and remove outliers in large datasets is proposed; this automatic procedure combines the Tukey labelling rule and the binary segmentation algorithm to cleanse data more efficiently, it is fast and easy to implement. Various approaches for missing value imputation are investigated, including unconditional mean, Hot Deck via k-nearest neighbour and Kalman smoothing. A combination of the automatic detection/removal of outliers and the imputation methods mentioned above are implemented to cleanse time series of 342 medium-voltage feeders. A nested rolling-origin-validation technique is used to evaluate the feed-forward deep neural network models. The proposed data cleansing framework efficiently removes outliers from the data, and the accuracy of forecasts is improved. It is found that Hot Deck (k-NN) imputation performs best in balancing the bias-variance trade-off for short-term forecasting.

U2 - 10.1016/j.apenergy.2019.114405

DO - 10.1016/j.apenergy.2019.114405

M3 - Article

VL - 261

SP - 1

EP - 17

JO - Applied Energy

T2 - Applied Energy

JF - Applied Energy

SN - 0306-2619

ER -

ID: 67636324