Artificial intelligence systems show promise in analyzing and evaluating power system data. There is currently a strong push to use artificial intelligence (AI) and machine learning (ML) to reduce the time it takes to perform maintenance on transformers and to predict where and when the next transformer will fail.
Major companies in various industries are promoting the wonders of AI and ML: managing replacement plans for an aging or aged fleet, reducing maintenance while extending asset life, and improving operational efficiency, all while capturing available expertise so it is not lost. These are lofty goals, and claims are already being made about the benefits of AI applications in the real world. The problem is that AI is not perfect. It still has a role, however, in the analysis of well-described problems with sufficient data to cover all possible situations that may be identified.
- Utilities are almost always faced with incomplete and possibly ambiguous data.
- Data analysis does not take place in a vacuum; utilities have a history and knowledge base to call on to check results.
- In weather forecasting, AI is used to reduce human error.
- Banks use AI in identity verification processes.
- Numerous institutions use AI to support help-line requests, sometimes in the form of chatbots.
- Siri, Cortana and Google Assistant are based on AI.
- AI systems can classify well-organized data, such as X-rays.
On the downside, there are some issues with AI and ML:
- AI can be good at interpolation within a data set, but it might not be good at extrapolation to new data.
- “Giraffing,” the generic name for identifying the presence of objects where those objects do not exist, can introduce bias into analysis based on unrepresentative data sets.
- A black-box approach can leave the reason for a decision unclear and opaque.
Unsupervised ML takes a similar approach to supervised ML, except that the ML tool groups the cases based on clusters in the many dimensions of the data provided. An expert then classifies the resulting clusters and tests them against new cases. As an example, consider an ML tool developed to recognize sheep and goats in pictures. In a supervised ML approach, an expert would classify each picture, and the tool would try to find data differences between the pictures that reflect the classification. It might not be clear why the tool does what it does, so the ML could be considered a black box. Once the tool is trained, an expert would show it more pictures to classify to see how well it does. If only pictures from the training data are shown, it would likely do very well. However, if more complex pictures or pictures of another animal are shown, the ML tool might fail.
In unsupervised ML, the tool clusters the data and the expert classifies the clusters afterward. In both supervised and unsupervised ML, the tool performs very well when the test cases are like the training cases but not so well when the supplied cases differ from the training cases. What happens if there are multiple animals in a picture? Or, if there is a llama, how would that get classified? The effect called “giraffing,” where an ML tool trained to identify giraffes in supplied pictures then identifies giraffes in pictures where no giraffe is present, is a result of ML training in which giraffes are overrepresented in the training cases and cases with no giraffes are underrepresented.
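The llama question can be made concrete with a small sketch. The Python example below is illustrative only: the two numeric features, their values, and the `max_distance` threshold are invented assumptions, and a simple nearest-centroid rule stands in for whatever ML tool is actually used. It shows that a classifier trained only on sheep and goats must answer "sheep" or "goat" for any input, and that one possible mitigation is to reject inputs that sit far from everything seen in training.

```python
import math

# Hypothetical two-number feature vectors per picture; values are invented.
TRAINING = {
    "sheep": [(1.0, 1.2), (1.2, 0.9), (0.8, 1.0)],
    "goat": [(3.0, 3.1), (3.2, 2.8), (2.9, 3.0)],
}


def centroid(points):
    """Average the feature vectors of one class."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


CENTROIDS = {label: centroid(pts) for label, pts in TRAINING.items()}


def classify(x):
    """Return the nearest class; this rule always answers with a known class."""
    return min(CENTROIDS, key=lambda label: math.dist(x, CENTROIDS[label]))


def classify_with_reject(x, max_distance=1.0):
    """Same rule, but report 'unknown' when x is far from every training class."""
    label = classify(x)
    if math.dist(x, CENTROIDS[label]) > max_distance:
        return "unknown"
    return label


llama = (9.0, 8.5)  # far outside anything in the training data

print(classify((1.0, 1.2)))         # a training picture
print(classify(llama))              # forced into one of the two known classes
print(classify_with_reject(llama))  # flagged for expert review instead
```

The reject threshold does not make the tool smarter; it only routes unfamiliar cases to an expert rather than returning a confident wrong label.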
EMI Spectra ML Classification
Practicalities At Duke Energy
- Data can be limited and of poor quality.
- Failed-asset data has not been documented and maintained.
- There has been no investment in cleaning and verifying available data.
- Data has not been normalized, either across multiple sources or within a single source.
- There are unique characteristics of data related to the manufacturing process for sister units (that is, they are handmade).
In addition, data scientists must contend with these realities:
- Answers are assumed to be in the available data, without necessarily referencing transformer experts.
- Many ML techniques assume a Gaussian data distribution, but most failure modes do not produce Gaussian data.
- Major companies like Dow Chemical, Audi and Intel have been open about predictive models for major plant assets not being effective.
- IT staff and data scientists do not usually understand failure modes and may not take them into account in their modeling.
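The Gaussian point in the list above can be illustrated with a short sketch. The example below is not drawn from any utility's data: it assumes, purely for illustration, that asset lifetimes follow a Weibull distribution (a common choice in reliability work) with invented scale and shape values, then fits a Gaussian to the same samples using Python's standard library. The fitted Gaussian assigns a sizable probability to negative lifetimes, which cannot occur, and its mean sits well above the median of the skewed data.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Assumption: lifetimes in years from a Weibull distribution with shape < 1,
# meaning a skewed population dominated by early failures. Values are invented.
SCALE_YEARS = 20.0
SHAPE = 0.8
lifetimes = [random.weibullvariate(SCALE_YEARS, SHAPE) for _ in range(5000)]

mean_life = statistics.mean(lifetimes)
median_life = statistics.median(lifetimes)

# Fit a Gaussian the naive way: match the sample mean and standard deviation.
gaussian_fit = statistics.NormalDist(mean_life, statistics.stdev(lifetimes))

# Probability the Gaussian model gives to a lifetime below zero years.
negative_mass = gaussian_fit.cdf(0.0)

print(f"sample mean   : {mean_life:.1f} years")
print(f"sample median : {median_life:.1f} years")
print(f"Gaussian P(lifetime < 0): {negative_mass:.2f}")
print(f"shortest observed lifetime: {min(lifetimes):.3f} years")
```

A model that implicitly relies on the Gaussian fit would misjudge how many units fail early, which is one reason to involve someone who understands the failure modes before choosing a model.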
Transformer states are changed under these rules:
- A state may be changed automatically to “monitor” or “service” based on raw data.
- The state may be changed to “risk identified” based on engineering analytics and ML classification.
- No transformer state can be automatically changed to “stable” or “replace,” as those states require expert intervention. After reviewing the data, the expert determines whether a transformer is stable or should be marked for replacement, with comments recorded.
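The state rules above can be sketched as a small guard function. This is a minimal illustration, not the utility's actual system: the `Transformer` class, the `change_state` function, the source names `raw_data`, `analytics` and `expert`, and the choice of `PermissionError` are all hypothetical. The sketch also assumes an expert only sets “stable” or “replace,” since the text does not say whether an expert may set the other states.

```python
from dataclasses import dataclass, field

# Which source may set which state, following the three rules in the text.
# The source names are hypothetical labels chosen for this sketch.
ALLOWED = {
    "raw_data": {"monitor", "service"},
    "analytics": {"risk identified"},  # engineering analytics and ML classification
    "expert": {"stable", "replace"},
}


@dataclass
class Transformer:
    name: str
    state: str
    history: list = field(default_factory=list)  # (old, new, source, comment)


def change_state(unit, new_state, source, comment=""):
    """Apply a state change only if the source is allowed to make it."""
    if new_state not in ALLOWED.get(source, set()):
        raise PermissionError(f"{source!r} may not set state {new_state!r}")
    if source == "expert" and not comment.strip():
        # Expert decisions are recorded with comments.
        raise ValueError("expert state changes require a comment")
    unit.history.append((unit.state, new_state, source, comment))
    unit.state = new_state


unit = Transformer("T1", "monitor")
change_state(unit, "risk identified", "analytics")

try:
    change_state(unit, "stable", "analytics")  # automation cannot clear a unit
except PermissionError as err:
    print("rejected:", err)

change_state(unit, "stable", "expert", comment="Data reviewed; no action needed.")
print(unit.state, unit.history)
```

Keeping the permission table separate from the transition code makes the rule that only an expert can declare a unit stable or mark it for replacement easy to audit.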