PO029: Streamlined deployment of value generating machine learning on live operational data
Philip Bradstock, Technical Lead, Bitbloom Ltd
Interest in machine learning (ML) methods has grown significantly in recent years, which has led to increased efforts to develop and apply ML models to wind energy data. In particular, time series data from operating wind farms (SCADA data) is a prime candidate for ML modelling because it is widely available and rich in information. Consequently, many high-performing ML models have been developed and demonstrated on various industry problems, such as failure prognostics and loading estimates. Despite this promising progress, very few such models are currently deployed on live operational data where they provide value to owners and operators.

The aim of this project was to identify the remaining hurdles within the industry that prevent such models from being adopted, and then to develop a system that enables easy training, development and deployment of such models on operational data. The project analysed numerous applications of ML and evaluated their value-to-effort ratio from the perspective of an operational analyst, according to the following criteria:

* The complexity of training a good model on operational data;
* The complexity of additional stages in a pipeline needed for deployment on live operational data;
* The immediate value generated for commercial operation.

From this evaluation, the prediction of “available power” (i.e. the power the turbine would produce given its current environment, in the absence of curtailment, downtime and under-performance) was identified as the leading application problem, because it has multiple use cases and because it is relatively easy to develop models and pipelines that improve upon current industry best practice. Accurate evaluation of available power is important in analyses of direct value to wind farm stakeholders:

* Evaluating lost yield during downtime and curtailment;
* Post-construction long term energy yields;
* Calculating power residuals in order to identify operation under yaw misalignment.
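
The first use case above, lost yield during downtime and curtailment, can be illustrated with a minimal sketch. It is not the project's implementation: the function name `lost_energy_mwh`, the 10-minute averaging interval and the boolean `affected` flag are all assumptions made for illustration, and the available power values are taken as already predicted by some model.

```python
import numpy as np


def lost_energy_mwh(available_kw, actual_kw, affected, interval_minutes=10.0):
    """Sum the energy shortfall (MWh) over intervals flagged as affected.

    available_kw : predicted available power per interval, in kW
    actual_kw    : measured power per interval, in kW
    affected     : True where the turbine was down or curtailed (assumed flag)
    interval_minutes : averaging period of the SCADA data (assumed 10 minutes)
    """
    available_kw = np.asarray(available_kw, dtype=float)
    actual_kw = np.asarray(actual_kw, dtype=float)
    affected = np.asarray(affected, dtype=bool)

    # Shortfall is available minus actual power; negative values are clipped
    # to zero so that over-production does not offset genuine losses.
    shortfall_kw = np.clip(available_kw - actual_kw, 0.0, None)

    # Convert mean power over each interval (kW) into energy (MWh).
    hours = interval_minutes / 60.0
    return float(np.sum(shortfall_kw[affected]) * hours / 1000.0)
```

Under these assumptions, any error in the available power prediction feeds directly into the lost yield figure, which is why the accuracy of the prediction matters for this use case.
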
To bring the identified ML models into a production environment, a software system was developed to streamline the training and evaluation of ML models on SCADA data, making them more accessible and easier to deploy onto live analysis and dissemination platforms. The bias and variance of the trained models were compared with those of the industry-standard approach of using measured nacelle anemometer power curves with air density adjustment.

Results showed that ML models twice as accurate as the industry-standard approach are readily achievable. Furthermore, the feasibility of deploying the models into pipelines that include established automated steps, such as data cleaning, was clearly demonstrated. Finally, the differences in the outcomes of the aforementioned use cases were investigated, along with the reduced uncertainty and increased value that the models bring.

This project addressed the existing hurdles in moving from high-performing ML models in isolated environments towards established value-generating pipelines on live SCADA data. The application problem of predicting available power is one that is already incorporated into live analysis pipelines, but also one where ML models can easily bring improvement. This project demonstrated the steps needed to bring real value and aims to encourage wider adoption of ML models on SCADA data.
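
The comparison against the industry-standard baseline can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method used in the project: it assumes the air density adjustment is the common wind speed normalisation for pitch-regulated turbines (wind speed scaled by the cube root of the density ratio, reference density 1.225 kg/m³), a power curve built by averaging in 0.5 m/s bins, and bias and variance computed on the prediction error. All function names are hypothetical.

```python
import numpy as np

RHO_REF = 1.225  # kg/m^3, reference air density (assumed)


def density_adjusted_speed(wind_speed, air_density, rho_ref=RHO_REF):
    """Normalise wind speed to the reference air density (cube-root scaling)."""
    wind_speed = np.asarray(wind_speed, dtype=float)
    air_density = np.asarray(air_density, dtype=float)
    return wind_speed * (air_density / rho_ref) ** (1.0 / 3.0)


def fit_binned_power_curve(wind_speed, power, bin_width=0.5):
    """Average wind speed and power within each wind speed bin."""
    wind_speed = np.asarray(wind_speed, dtype=float)
    power = np.asarray(power, dtype=float)
    bin_index = np.floor(wind_speed / bin_width).astype(int)
    bins = np.unique(bin_index)  # sorted, so bin centres come out increasing
    centres = np.array([wind_speed[bin_index == b].mean() for b in bins])
    mean_power = np.array([power[bin_index == b].mean() for b in bins])
    return centres, mean_power


def predict_power(curve, wind_speed):
    """Linearly interpolate the binned power curve at the given wind speeds."""
    centres, mean_power = curve
    return np.interp(wind_speed, centres, mean_power)


def bias_and_variance(predicted, measured):
    """Return the mean and the variance of the prediction error."""
    error = np.asarray(predicted, dtype=float) - np.asarray(measured, dtype=float)
    return float(error.mean()), float(error.var())
```

An ML model's predictions on held-out data could be passed through the same `bias_and_variance` function, so that both approaches are scored on identical intervals.
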