CN117454165A - Training method of network model, motor temperature prediction method and related equipment

Info

Publication number: CN117454165A
Application number: CN202311312361.5A
Authority: CN
Inventors: 及非凡; 谢彪文; 刘灿; 黄成龙
Original assignee: Zhejiang Zero Run Technology Co Ltd; Zhejiang Lingsheng Power Technology Co Ltd
Current assignee: Zhejiang Zero Run Technology Co Ltd; Zhejiang Lingsheng Power Technology Co Ltd
Priority date: 2023-10-09
Filing date: 2023-10-09
Publication date: 2024-01-26

Abstract

The application discloses a training method of a network model, a motor temperature prediction method and related equipment, wherein the method comprises the following steps: acquiring data characteristics of target motor data; selecting model training data from data features of target motor data according to the power state of the motor; and inputting the model training data into the network model for training until the network model converges, so as to obtain the target network model meeting the requirements. According to the scheme, the model training data is selected from the data characteristics of the acquired target motor data according to the power state of the motor to perform model training, the network model training data with pertinence can be obtained, and further accuracy of predicting the motor temperature by the network model is improved.

Description

Training method of network model, motor temperature prediction method and related equipment

Technical Field

The application relates to the technical field of permanent magnet synchronous motors, in particular to a training method of a network model, a motor temperature prediction method and related equipment.

Background

Through accurately predicting and monitoring the temperature of the permanent magnet synchronous motor, workers can find potential faults early, the energy consumption of the motor is optimized, the service life of the motor is prolonged, and the safety and stability of a motor system are improved. Currently, a worker usually predicts the temperature of a motor by adopting a theoretical formula such as a temperature formula method or an empirical formula such as a parameter identification method and a thermal network method.

However, predicting the motor temperature under different working conditions and working states with theoretical and empirical formulas requires parameter adjustment for each formula, which increases the complexity and uncertainty of temperature prediction; predicting the motor temperature with the parameter identification method requires collecting a large amount of actual working data to determine the parameters of the thermal model, which is time-consuming and costly; and predicting the motor temperature by establishing a thermal network requires a large number of experiments or simulation analyses, and large errors may occur when the motor temperature changes rapidly or the heat transfer process is unstable.

Disclosure of Invention

In order to solve the problems, the application provides a training method of a network model, a motor temperature prediction method and related equipment.

The first aspect of the application provides a training method of a network model, which comprises the following steps: acquiring data characteristics of target motor data; selecting model training data from the data characteristics of the target motor data according to the power state of the motor; and inputting the model training data into the network model for training until the network model converges, so as to obtain a target network model meeting the requirements.

In one embodiment, the step of selecting model training data from the data features of the target motor data according to the power state of the motor includes: filtering motor data with the power state in the target motor data being in a closed state to obtain filtered target motor data; and taking the data characteristics of the target motor data with the rotating speed being greater than or equal to a preset rotating speed threshold value and the torque being greater than or equal to a preset torque threshold value in the filtered target motor data as the model training data.

In some embodiments, the step of acquiring the data characteristic of the target motor data includes: acquiring initial motor data; preprocessing the initial motor data to obtain preprocessed motor data; and extracting the characteristics of the preprocessed motor data to obtain the data characteristics of the target motor data.

In some embodiments, the step of extracting features of the preprocessed motor data to obtain data features of the target motor data includes: performing characteristic correlation analysis on the preprocessed motor data to obtain analyzed motor data; and extracting the data characteristics of the analyzed motor data to obtain the data characteristics of the target motor data.

In an embodiment, the step of preprocessing the initial motor data to obtain preprocessed motor data includes: filling the missing value of the initial motor data to obtain filled motor data; removing motor data with the data value larger than a preset data threshold value in the filled motor data to obtain motor data with abnormal values removed; and carrying out exponential weighted average filtering treatment on the motor data with the abnormal values removed to obtain the preprocessed motor data.

In some embodiments, the network model comprises a bidirectional long short-term memory (BiLSTM) sub-network, and the step of inputting the model training data into the network model for training until the network model converges to obtain a target network model meeting the requirements comprises: convolving the model training data to obtain a first data feature comprising a time sequence; performing time sequence feature extraction on the first data feature based on the bidirectional long short-term memory sub-network to obtain a second data feature comprising the time sequence; and performing skip (data jump) and full connection processing on the second data feature until the network model converges, to obtain the target network model meeting the requirements.

A second aspect of the present application provides a method for predicting a motor temperature based on a network model, the method comprising: acquiring actual motor data; inputting the actual motor data into the target network model to obtain the motor temperature predicted by the target network model; wherein the target network model is any one of the target network models described above.

A third aspect of the present application provides a network model-based motor temperature prediction apparatus, the apparatus comprising: the acquisition module is used for acquiring the data characteristics of the target motor data; the data selection module is used for selecting model training data from the data characteristics of the target motor data according to the power state of the target motor; and the training module is used for inputting the model training data into the network model for training until the network model converges to obtain a target network model meeting the requirements.

A fourth aspect of the present application provides an electronic device, including a memory and a processor, where the processor is configured to execute program instructions stored in the memory, to implement a method for training a network model according to any one of the foregoing.

A fifth aspect of the present application provides a computer readable storage medium having stored thereon program instructions which when executed by a processor implement a method of training a network model as described in any of the preceding claims.

According to the scheme, the model training data is selected from the data characteristics of the obtained target motor data according to the power state of the motor, and the network model is trained by using the model training data until the target network model meeting the requirements is obtained. Therefore, model training data are selected from the data characteristics of the acquired target motor data according to the power state of the motor to perform model training, and the network model training data with pertinence can be obtained, so that the accuracy of predicting the motor temperature by the network model is improved.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and, together with the description, serve to explain the technical aspects of the application.

FIG. 1 is a flow chart of an exemplary embodiment of a method of training a network model as shown herein;

FIG. 2 is a flow chart of an exemplary embodiment of step S120 in the training method of the network model shown in FIG. 1;

FIG. 3 is a flow chart of an exemplary embodiment of step S110 in the training method of the network model shown in FIG. 1;

FIG. 4 is a flow chart of an exemplary embodiment of step S330 in the training method of the network model shown in FIG. 3;

FIG. 5 is a schematic diagram of the effect of Pearson correlation coefficient analysis on the preprocessed motor data in the training method of the network model shown in the application;

FIG. 6 is a flow chart of an exemplary embodiment of step S320 in the training method of the network model shown in FIG. 3;

FIG. 7 is a flow chart of an exemplary embodiment of a method of training a network model as shown herein;

FIG. 8 is a schematic diagram of a model structure of a training method of the network model shown in the present application;

FIG. 9 is a flow chart of an exemplary embodiment of a network model-based motor temperature prediction method shown herein;

FIG. 10 is a schematic diagram of an embodiment of a training apparatus of the network model of the present application;

FIG. 11 is a schematic diagram of an embodiment of an electronic device of the present application;

FIG. 12 is a schematic diagram of an embodiment of a computer storage medium of the present application.

Detailed Description

The following describes the embodiments of the present application in detail with reference to the drawings.

The term "and/or" herein merely describes an association relationship between associated objects, meaning that three relationships may exist; e.g., A and/or B may represent: A exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the preceding and following associated objects are in an "or" relationship. Further, "a plurality" herein means two or more. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality; for example, "including at least one of A, B and C" may mean including any one or more elements selected from the group consisting of A, B and C.

It should also be noted that the terms "first" or "second" and the like used in this specification are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature.

Referring to fig. 1, fig. 1 is a flowchart illustrating an exemplary embodiment of a training method for a network model according to the present application. Specifically, the method may include the steps of:

Step S110: acquiring the data characteristics of the target motor data.

The target motor data may refer to various data related to the target motor for describing, recording and analyzing information of the target motor operation state, performance, temperature, etc. Target motor data includes, but is not limited to, motor operating parameters, motor operating time series, temperature data, or environmental condition data. Wherein the motor operating parameters may include current, voltage, power, speed or torque, etc., which are used to describe the operating state and performance of the motor; the motor operation time sequence can comprise current, voltage, rotating speed or temperature and the like, and is used for analyzing dynamic change and trend of the motor; the temperature data may include motor surface temperature, internal temperature, historical motor temperature data, or the like; the environmental condition data may include ambient temperature, humidity, air pressure, etc.

For example, when predicting the temperature of the permanent magnet synchronous motor, the network model may acquire motor data for temperature prediction from a sensor, a control system or other devices, and select suitable target motor data features for model training using tsfresh (Time Series Feature extraction based on scalable hypothesis tests), thereby obtaining the network model for predicting the temperature of the permanent magnet synchronous motor.

tsfresh is a tool dedicated to extracting features from time series data. It uses statistical hypothesis tests to judge whether each feature is significantly relevant to the target variable, so it can automatically perform feature selection on the motor data of the target motor, eliminate features that are meaningless or only weakly relevant to the target variable, and reduce the cost of processing redundant features. tsfresh supports parallel computation, which accelerates feature extraction on the motor data; it also supports incremental computation, so that features can be extracted incrementally for new time series data on the basis of the existing feature set, avoiding the overhead of recalculating all features and improving computing efficiency. The data features of the target motor data may include: statistical, frequency domain or waveform features of the motor current; statistical, frequency domain or waveform features of the motor voltage; motor power statistical features such as active power, reactive power or apparent power; motor power factor features; motor speed statistical features such as average speed, maximum speed or speed change rate; motor torque statistical features such as average torque, maximum torque or torque change rate; motor temperature statistical features such as average temperature, maximum temperature or temperature change rate; statistical features of external environment data such as ambient temperature, humidity or air pressure; or time sequence features.
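A few of the statistical features named above (average, maximum and change rate) can be sketched by hand. This is a minimal illustration in plain Python, not tsfresh itself; the function name `window_features`, the sampling interval `dt` and the sample values are assumptions made for the example.

```python
from statistics import mean

def window_features(values, dt=1.0):
    """Return simple statistical features for one window of samples."""
    # Rate of change between consecutive samples, per unit time.
    rates = [(b - a) / dt for a, b in zip(values, values[1:])]
    return {
        "mean": mean(values),
        "max": max(values),
        "max_abs_rate": max(abs(r) for r in rates) if rates else 0.0,
    }

# Hypothetical motor speed samples in rpm, one per second.
speed_rpm = [1000.0, 1200.0, 1500.0, 1400.0]
features = window_features(speed_rpm)
```

The same function could be applied to current, voltage, torque or temperature windows; a library such as tsfresh computes a far larger feature set automatically.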

Step S120: selecting model training data from the data characteristics of the target motor data according to the power state of the motor.

The power state of the motor may refer to a power supply condition of a power source connected to the motor, and may include a normal power supply state, a power-off state, or a power failure state. The motor is normally powered on in a normal power supply state, the power supply is stable, and the motor can normally operate; when the motor is in a power-off state, the motor is not connected with a power supply, no current is input, and the motor cannot run; in the power failure state, the power source has faults such as overload, short circuit, open circuit and the like, and the motor may not work normally or be damaged.

For example, when predicting the temperature of the permanent magnet synchronous motor, the network model may mark motor data according to the power state of the motor, and distinguish motor data in a normal power supply state, a power-off state and a power failure state.

Step S130: inputting the model training data into the network model for training until the network model converges, so as to obtain the target network model meeting the requirements.

The network model may refer to a mathematical model used in machine learning and deep learning for fitting and predicting data. It may be composed of various layers and activation functions, is trained by adjusting weight and bias parameters, and predicts an output from given input data or performs other tasks such as classification, regression or sequence generation. Common network models may include recurrent neural networks (Recurrent Neural Network, RNN), long short-term memory networks (Long Short-Term Memory, LSTM), convolutional neural networks (Convolutional Neural Network, CNN), support vector regression (Support Vector Regression, SVR), or multi-layer perceptrons (Multilayer Perceptron, MLP). The RNN may be a neural network with recurrent connections that can efficiently process time series data. The CNN may be a neural network model originally dedicated to processing image data that can also be applied to modeling one-dimensional time series data; through convolution and pooling layers, the CNN may capture local or global features in the time series data. The SVR can be a regression model based on the support vector machine; it can be used for nonlinear temperature prediction problems by mapping data into a high-dimensional space with a kernel function and performing regression analysis there. The LSTM can be a special type of RNN dedicated to modeling and predicting time series data; it has a memory unit and a gating mechanism, can better handle long-term dependence, and is suitable for the problem of temperature prediction of a permanent magnet synchronous motor. The MLP may be a classical feed-forward neural network model comprising one or more hidden layers for learning and predicting nonlinear relations, which can be applied to the prediction of permanent magnet synchronous motor temperature through appropriate network design and training.

The target network model satisfying the requirements may refer to a network model in which an average difference between a predicted value and a true value of a motor temperature is within a preset range, wherein the average difference between the predicted value and the true value may be calculated by an evaluation method such as a mean square error (Mean Squared Error, MSE) or a mean absolute error (Mean Absolute Error, MAE).
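The two evaluation measures named above can be written out directly. The sketch below is a plain-Python illustration; the helper `meets_requirement`, the threshold `max_mae` and the temperature values are hypothetical and only show how a "within a preset range" check might look.

```python
def mse(y_true, y_pred):
    """Mean squared error between measured and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def meets_requirement(y_true, y_pred, max_mae):
    """Assumed acceptance rule: the MAE must not exceed a preset limit."""
    return mae(y_true, y_pred) <= max_mae

# Hypothetical measured and predicted motor temperatures in degrees Celsius.
measured = [60.0, 62.0, 65.0, 70.0]
predicted = [61.0, 62.0, 63.0, 71.0]
ok = meets_requirement(measured, predicted, max_mae=2.0)
```

MSE penalizes large individual errors more strongly than MAE, so the choice between them depends on how costly occasional large temperature errors are.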

Illustratively, the present solution may use a deep learning framework, such as TensorFlow or PyTorch, to construct a long short-term memory network and temporal convolutional network (Long Short-Term Memory Network and Temporal Convolutional Network, LSTNet) model and input the training data into the LSTNet model for training. Through a back-propagation algorithm and an optimizer (e.g., stochastic gradient descent) that minimizes a loss function, the LSTNet model updates its weights and biases so that it gradually adapts to the training data and converges. LSTNet is a deep learning model that combines LSTM with CNN and is used for predicting time series data. Compared with traditional methods, LSTNet can capture long-term dependencies and nonlinear patterns in the time series data when predicting the temperature of the permanent magnet synchronous motor, so that the temporal dynamics of the motor temperature are better captured. Because LSTNet incorporates the convolution operation of the CNN, features of the time series data can be extracted on different time scales, so that LSTNet can obtain information of different granularities from the original data, including short-term fluctuations and long-term trends. The convolution operation can also be computed in parallel, so the model has higher computational efficiency in training and inference, which shortens both the training time and the response time of real-time prediction. In addition, LSTNet can process time series data with different distributions, noise and missing values and therefore has strong data adaptability; by analyzing information such as the activations, weights and feature maps of the LSTM and CNN networks, it is also possible to better understand how the model extracts key features from the input data and makes predictions.
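The idea of updating weights and biases by minimizing a loss until convergence can be shown on a deliberately tiny stand-in. The sketch below is not LSTNet and uses full-batch gradient descent rather than stochastic gradient descent: it fits a one-feature linear model with an MSE loss and stops when the loss no longer changes by more than a tolerance. The function name, learning rate, tolerance and data are all assumptions for illustration.

```python
def train_linear(xs, ys, lr=0.05, tol=1e-10, max_epochs=10_000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    prev_loss = float("inf")
    loss = prev_loss
    for _ in range(max_epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        loss = sum(e * e for e in errors) / n
        # Convergence check: stop once the loss has stopped changing.
        if abs(prev_loss - loss) < tol:
            break
        prev_loss = loss
        # Gradients of the MSE loss with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss

# Hypothetical data generated from y = 2x + 1.
w, b, final_loss = train_linear([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

In a deep learning framework the gradients would come from back-propagation through the network instead of the two closed-form expressions used here, but the loop structure (forward pass, loss, gradient step, convergence check) is the same.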

Referring to fig. 2, fig. 2 is a flowchart of an exemplary embodiment of step S120 in the training method of the network model shown in fig. 1. Based on the above embodiment, step S120 of selecting model training data from the data features of the target motor data according to the power state of the motor further includes:

Step S210: filtering out motor data in the target motor data whose power state is the off state to obtain filtered target motor data.

For example, the network model may tag motor data with binary labels, where 1 represents a normal power supply state and 0 represents a power-off or fault state; by retaining only the data labeled 1, data samples recorded at power-off or fault are removed from the data set. After traversing the target motor data and filtering out the data whose power state is the off state, the network model can check adjacent driving segments one by one to see whether they meet the merging condition; if they do, the network model can merge them into one segment and update information such as the start time, average speed and driving distance. Whether two adjacent driving segments can be merged may be judged from the time interval or the distance between them: if the time interval between the two adjacent driving segments is smaller than a set threshold, or the distance between them is smaller than a set threshold, the two adjacent driving segments can be merged into one driving segment.
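The label-based filtering and the merging of adjacent driving segments can be sketched as follows. This is a minimal plain-Python illustration; the record layout (`"t"`, `"power"`), the representation of a segment as a `(start, end)` time pair and the gap threshold `max_gap` are assumptions, and only the time-interval criterion is shown.

```python
def filter_powered(samples):
    """Keep only samples whose power label is 1 (normal power supply)."""
    return [s for s in samples if s["power"] == 1]

def merge_segments(segments, max_gap):
    """Merge adjacent (start, end) segments whose time gap is below max_gap."""
    merged = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] < max_gap:
            # Gap to the previous segment is small: extend that segment.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Hypothetical samples: label 0 marks power-off or fault.
samples = [{"t": 0, "power": 1}, {"t": 1, "power": 0}, {"t": 2, "power": 1}]
kept = filter_powered(samples)

# Hypothetical driving segments in seconds; gaps are 2 s and 30 s.
segments = merge_segments([(0, 10), (12, 20), (50, 60)], max_gap=5)
```

After a merge, derived quantities such as average speed and driving distance would need to be recomputed for the combined segment.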

Step S220: taking the data characteristics of the target motor data with the rotating speed being greater than or equal to a preset rotating speed threshold value and the torque being greater than or equal to a preset torque threshold value in the filtered target motor data as model training data.

The motor speed may refer to the number of revolutions of the motor per unit time, typically expressed in revolutions per minute (RPM). The motor speed depends on the power supply frequency, the number of pole pairs and the load conditions. For alternating current motors, the standard power supply frequency is 50 Hz or 60 Hz and motors with 2 to 8 poles are typically used; for direct current motors, the rotational speed depends on the voltage, the design of the armature windings and the load characteristics.
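As a worked illustration of how supply frequency and pole pairs set the speed of a synchronous machine, the standard synchronous-speed relation can be written as below. The symbols are introduced here for illustration: n_s is the synchronous speed in revolutions per minute, f is the supply frequency in Hz, and p is the number of pole pairs.

```latex
n_s = \frac{60 f}{p}
\qquad\text{e.g.}\qquad
f = 50\ \mathrm{Hz},\; p = 2 \;\Rightarrow\; n_s = \frac{60 \times 50}{2} = 1500\ \mathrm{r/min}
```

In an inverter-fed drive the frequency f is the inverter output frequency rather than the grid frequency, so the speed varies continuously with the commanded frequency.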

The motor torque may refer to the torque produced by the motor, typically expressed in newton meters (N·m). It is the torque applied to the load as the motor rotates, and it determines the power and performance of the motor. The torque of the motor is proportional to the current and is also affected by factors such as motor design, load characteristics and supply voltage.

For example, when predicting the temperature of the permanent magnet synchronous motor, the network model may define a preset rotation speed threshold and a torque threshold in advance, traverse the filtered target motor data through a loop structure or an iterator, etc., check whether the rotation speed and the torque of the motor meet the preset rotation speed threshold and the torque threshold, and if the detected target motor data has a rotation speed greater than or equal to the preset rotation speed threshold and a torque greater than or equal to the preset torque threshold, extract the data feature of the sample and store the data feature as training data.

According to the scheme, the target motor data with the rotating speed being greater than or equal to the preset rotating speed threshold value and the torque being greater than or equal to the preset torque threshold value in the filtered target motor data are used as model training data, so that noise and interference of abnormal data on a model can be reduced, and the robustness and accuracy of the model are improved. And the filtered data set contains motor data meeting the preset threshold standard, so that the behavior and performance of the motor in the working state of the preset rotating speed and torque threshold standard can be better reflected, and the accuracy of motor temperature prediction is improved.
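The threshold check of step S220 can be sketched as a simple filter. The record layout and the threshold values below are hypothetical; the application does not state concrete thresholds.

```python
def select_training_samples(samples, min_speed, min_torque):
    """Keep samples whose speed and torque both reach their thresholds."""
    return [
        s for s in samples
        if s["speed"] >= min_speed and s["torque"] >= min_torque
    ]

# Hypothetical filtered motor data: speed in rpm, torque in N·m.
samples = [
    {"speed": 500.0, "torque": 5.0},
    {"speed": 3000.0, "torque": 40.0},
    {"speed": 3000.0, "torque": 2.0},
    {"speed": 1000.0, "torque": 10.0},
]
selected = select_training_samples(samples, min_speed=1000.0, min_torque=10.0)
```

Both conditions must hold at once, and samples exactly at a threshold are kept because the comparison is "greater than or equal to".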

Referring to fig. 3, fig. 3 is a flowchart illustrating an exemplary embodiment of step S110 in the training method of the network model shown in fig. 1. On the basis of the above embodiment, the step S110 of acquiring the data characteristics of the target motor data further includes:

Step S310: acquiring initial motor data.

For example, motor related sensors (e.g., speed sensor, torque sensor, etc.) may be used to collect initial motor data when predicting the temperature of the permanent magnet synchronous motor, wherein the sensors may be directly connected to the motor system, and the initial motor data may be obtained by reading the output data of the sensors. For installed motor systems, initial motor data may also be obtained by means of meters or recording devices that record the operating data of the motor, which typically record information about the motor's run time, speed, temperature, power consumption, etc., and may be obtained by periodically downloading or exporting the recorded data. In addition, if the motor system is already integrated with a data storage system (e.g., database), the initial motor data may also be obtained by querying the database, and query statements may be written to retrieve the desired motor data based on the database structure and stored data format.

Step S320: preprocessing the initial motor data to obtain preprocessed motor data.

For example, when the temperature of the permanent magnet synchronous motor is predicted, the network model can preprocess the motor data through operations such as missing value processing, abnormal value detection and processing or smoothing processing, so that invalid or unreliable data points are removed, and the quality and reliability of the motor data are improved.

When the missing value processing is performed, the missing value can be filled by using an interpolation method or the missing value can be filled by deducing according to other related variables; when abnormal value detection and processing are carried out, abnormal values can be detected and processed through a statistical analysis or machine learning method so as to avoid the influence on subsequent analysis and modeling; in performing the smoothing process, filters or smoothing techniques may be used to reduce noise in the data, making the data easier to understand and analyze.
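The outlier removal and smoothing steps can be sketched as follows. This is a minimal plain-Python illustration that assumes, as in the summary of the application, that outliers are values above a preset data threshold and that smoothing is an exponential weighted average; the threshold, the smoothing factor `alpha` and the sample values are hypothetical.

```python
def remove_above(values, threshold):
    """Drop values larger than the preset data threshold."""
    return [v for v in values if v <= threshold]

def ewma(values, alpha):
    """Exponential weighted average: s[0] = v[0], s[t] = alpha*v[t] + (1-alpha)*s[t-1]."""
    smoothed = []
    for v in values:
        if not smoothed:
            smoothed.append(v)  # first sample initialises the filter
        else:
            smoothed.append(alpha * v + (1 - alpha) * smoothed[-1])
    return smoothed

# Hypothetical temperature readings with one obvious spike.
raw = [60.0, 62.0, 500.0, 64.0]
clean = remove_above(raw, threshold=200.0)
smooth = ewma(clean, alpha=0.5)
```

A smaller `alpha` smooths more strongly but makes the filtered signal lag further behind rapid temperature changes.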

Step S330: extracting the characteristics of the preprocessed motor data to obtain the data characteristics of the target motor data.

Illustratively, the network model may perform feature extraction on the preprocessed motor data through tsfresh: tsfresh may be installed with the Python package installer (Package Installer for Python, pip), the preprocessed motor data may be arranged into the format that tsfresh requires, and the extract_features function of the tsfresh library may be used for feature extraction.

The extract_features function is one of the core functions of the tsfresh library, and can be used for calculating a large number of features on a time sequence data set so as to be used for tasks of machine learning, data analysis and the like.

According to the scheme, the network model can preprocess the initial motor data by acquiring the initial motor data to obtain the preprocessed motor data, and further, the preprocessed motor data is subjected to feature extraction to obtain the data features of the target motor data. The preprocessed motor data can remove abnormal values and noise, so that the motor data is more accurate and reliable, the stability of a training model is improved, and model training errors caused by data quality problems are reduced. Meanwhile, the network model can reduce the dimension of data, retain important information and reduce redundancy by selecting and extracting proper motor data characteristics, thereby being beneficial to simplifying the complexity of model training and improving the generalization capability and prediction accuracy of the model.

Referring to fig. 4, fig. 4 is a flowchart illustrating an exemplary embodiment of step S330 in the training method of the network model shown in fig. 3. Based on the above embodiment, step S330 of performing feature extraction on the preprocessed motor data to obtain the data features of the target motor data further includes:

Step S410: performing characteristic correlation analysis on the preprocessed motor data to obtain analyzed motor data.

Feature correlation analysis can be a common method for evaluating the relationship between different features, and is helpful for understanding the interaction between features and finding out key features with higher correlation with target variables. Common feature correlation analysis methods may include correlation coefficient methods, heat map methods, scatter plot methods, feature importance methods, or visualization techniques.

For example, when the temperature of the permanent magnet synchronous motor is predicted, feature correlation analysis can be performed on the preprocessed motor data through the Pearson correlation coefficient, which can be calculated with a corresponding function or tool in a statistical software package, such as the numerical computation library (NumPy), the data analysis library (Pandas) or the scientific computation library (Scientific Python, SciPy) in Python. The predicted motor temperature can be used as the reference sequence Y and the other features as feature sequences, and the Pearson correlation coefficient of each feature sequence and Y can be calculated as follows:

$$ r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^{2}}\;\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^{2}}} $$

Wherein $x_i$ and $y_i$ respectively represent the values of the feature and of the predicted motor temperature Y, $\bar{x}$ is the mean value of the feature sequence, $\bar{y}$ is the mean value of Y, $n$ is the number of samples, and $r$ is the Pearson correlation coefficient. The value range of the Pearson correlation coefficient is -1 to 1: when the correlation coefficient is close to 1, the variables are positively correlated; when it is close to -1, the variables are negatively correlated; and when it is close to 0, the variables have no linear correlation.
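The Pearson correlation coefficient between one feature sequence and the temperature sequence can be computed directly from its definition. The sketch below uses plain Python rather than NumPy, Pandas or SciPy; the sample values are hypothetical and chosen to be perfectly linearly related.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical motor speed (rpm) and motor temperature (degrees Celsius).
speed = [1000.0, 2000.0, 3000.0, 4000.0]
temp = [50.0, 60.0, 70.0, 80.0]
r = pearson(speed, temp)
```

The function divides by zero if either sequence is constant, so constant features would need to be dropped before the analysis.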

Step S420: extracting the data characteristics of the analyzed motor data to obtain the data characteristics of the target motor data.

As shown in fig. 5, fig. 5 is a schematic diagram of the effect of Pearson correlation coefficient analysis on the preprocessed motor data in the training method of the network model shown in the present application. After Pearson correlation coefficient analysis, the correlation coefficient of Rev was 0.66, that of T_in was 0.62, that of Torque was -0.039, that of Ud was -0.24, that of Uq was 0.64, that of Id was 0.04, and that of Iq was -0.02. According to the analysis result, Torque, Id, Ud and Iq are only weakly correlated with the motor temperature, so the network model can consider deleting these features and extracting the data features of the remaining motor data. Here, Rev is the rear motor speed, T_in is the water inlet temperature, Torque is the torque, Ud is the given D-axis voltage of the rear motor, Uq is the given Q-axis voltage of the rear motor, Id is the given D-axis current of the rear motor, and Iq is the given Q-axis current of the rear motor.

According to the scheme, the network model can perform feature correlation analysis on the preprocessed motor data and extract the data features of the analyzed motor data. Deleting the features that have low correlation with the target motor data after the analysis reduces the redundancy of the data set, simplifies the model, lowers the dimension of the feature space, and improves the efficiency and calculation speed of the model. It also lets the network model concentrate on the features that are more strongly correlated with the motor data, which improves the generalization capability of the model and therefore its prediction capability.

Referring to fig. 6, fig. 6 is a flowchart illustrating an exemplary embodiment of step S320 in the training method of the network model shown in fig. 3. Based on the above embodiment, step S320, in which the initial motor data is preprocessed to obtain the preprocessed motor data, further includes:

Step S510: Perform missing value filling on the initial motor data to obtain the filled motor data.

The missing value filling process may be a process of finding the missing data in the initial data and filling it by spline interpolation, mean filling, median filling, or the like. Spline interpolation can be a curve-fitting-based method that fills in missing values by interpolating a smooth curve between known data points; mean filling may refer to filling in missing values with the mean of the feature to which the missing value belongs; median filling may refer to filling in missing values with the median of that feature.

The network model can fill the missing values of the initial motor data by linear interpolation, which replaces each missing value with an estimate computed from the data points adjacent to it on both sides. The formula for filling the missing values of the initial motor data by linear interpolation is as follows:

$$y = y_0 + \frac{(x - x_0)(y_1 - y_0)}{x_1 - x_0}$$

wherein y is the missing value, x is the position of the missing value, and $(x_0, y_0)$ and $(x_1, y_1)$ are the positions and values of the data points before and after the missing value, respectively.
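As a minimal sketch of this filling step, missing samples marked as NaN can be replaced with `np.interp`, which evaluates the linear interpolation between the nearest known neighbours. The function name `fill_missing_linear` and the sample values are assumptions for illustration; samples are assumed to be equally spaced, so the array index is used as the position x.

```python
import numpy as np

def fill_missing_linear(values):
    """Replace NaN entries by linear interpolation between known neighbours."""
    y = np.asarray(values, dtype=float).copy()
    x = np.arange(len(y))          # sample positions (assumed equally spaced)
    missing = np.isnan(y)
    # np.interp evaluates the piecewise-linear function through the known points.
    y[missing] = np.interp(x[missing], x[~missing], y[~missing])
    return y

filled = fill_missing_linear([20.0, np.nan, 24.0, np.nan, np.nan, 30.0])
```

Note that `np.interp` holds the nearest known value for gaps at the very start or end of the sequence, where no neighbour exists on one side.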

Step S520: Remove the motor data whose data value is greater than a preset data threshold from the filled motor data to obtain the motor data with abnormal values removed.

When processing the filled motor data, the network model can determine a suitable data threshold as the criterion for abnormal values, and can judge whether the motor data is abnormal by traversing each data value in the filled motor data set and comparing it with the preset data threshold. If a data value is greater than the preset data threshold, it is identified as an abnormal value; the data identified as abnormal values are deleted from the motor data set or marked, giving the motor data with abnormal values removed.

Illustratively, the network model can remove abnormal values from the motor data using a machine-learning-based isolation forest algorithm or a normal distribution method. When the data follow a normal distribution, about 99.7% of the data lie within 3 standard deviations of the mean, so the network model can define the normal range as μ ± 3σ and detect abnormal values by the standard deviation method: the mean and standard deviation of the motor data are calculated, and data outside this range are regarded as abnormal values and deleted. Wherein the mean, generally denoted μ, can be used to measure the central position of the data set; the standard deviation, generally denoted σ, can be used to measure the dispersion or fluctuation of the data set, and the larger the standard deviation, the greater the dispersion of the data; the isolation forest algorithm (Isolation Forest) can be an anomaly detection algorithm based on ensemble learning, which can effectively identify abnormal values in data and has the advantages of efficiency, scalability and accuracy; the normal distribution method may be a statistical method based on the normal distribution: since most data points are concentrated near the mean and abnormal values lie far from it, the network model can determine whether a data point is an abnormal value from its distance to the mean, measured in standard deviations.
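The μ ± 3σ rule can be sketched as a boolean mask over the data. The function name `remove_outliers_3sigma`, the parameter `k`, and the toy readings are assumptions for illustration only.

```python
import numpy as np

def remove_outliers_3sigma(values, k=3.0):
    """Keep only values within k standard deviations of the mean."""
    v = np.asarray(values, dtype=float)
    mu = v.mean()      # central position of the data set
    sigma = v.std()    # dispersion of the data set (population standard deviation)
    mask = np.abs(v - mu) <= k * sigma
    return v[mask]

# Hypothetical readings: twenty normal samples and one spurious spike.
readings = [50.0] * 20 + [500.0]
cleaned = remove_outliers_3sigma(readings)
```

One design caveat: a single extreme value also inflates μ and σ, so with very few samples a spike may not exceed 3σ; the rule is more reliable on longer records.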

Step S530: Perform exponential weighted average filtering on the motor data with abnormal values removed to obtain the preprocessed motor data.

The filtering process can be a process of performing noise reduction, abnormal value elimination or smoothing on motor data through an algorithm or a filtering method, and more accurate and reliable data information can be extracted through the filtering process. The network model can use Fourier filtering or exponential weighted average filtering to carry out data filtering, wherein the Fourier filtering realizes the noise reduction effect by filtering specific frequency components in a frequency domain, and frequency domain analysis and filtering operation can be completed in a shorter time by utilizing fast Fourier transformation, so that the method is suitable for application with higher real-time requirements, and the processing speed can be improved; exponentially Weighted Moving Average (EWMA) filtering achieves a smoothing effect by applying a weighted average to the previous data points, with EWMA filtering giving higher weight to the nearest data point and progressively lower weight to the older observations than simple moving average filtering.

For example, when EWMA filtering is used to process the motor data with abnormal values removed, the smoothness of the data may be controlled by a weighting factor (also referred to as a smoothing factor), which generally takes a value between 0 and 1. The larger the value, the greater the weight given to the current observed value and the weaker the smoothing; the smaller the value, the smoother the filtered result. After the weighting factor is determined, the first observed value of the motor data sequence is used as the initial smoothed value, and each remaining observed value, starting from the second, is processed according to the formula of the exponential weighted average filtering. The formula for the exponentially weighted average filtering is as follows:

$$y_t = \alpha x_t + (1 - \alpha) y_{t-1}$$

Wherein $y_t$ represents the exponentially weighted moving average at time t, $x_t$ represents the observed value at time t, $y_{t-1}$ represents the exponentially weighted moving average at time t-1, and α is the smoothing factor, a parameter in the interval [0, 1]; α in this scheme is 0.7.
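The EWMA recursion can be sketched directly from the formula. The function name `ewma_filter` and the sample input are assumptions; the default α = 0.7 is the value stated for this scheme, and the first observation is used as the initial smoothed value as described above.

```python
def ewma_filter(values, alpha=0.7):
    """Exponentially weighted moving average: y_t = alpha * x_t + (1 - alpha) * y_{t-1}."""
    if not values:
        return []
    smoothed = [float(values[0])]  # the first observation is the initial smoothed value
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

smoothed = ewma_filter([1.0, 2.0, 3.0])
```

For the input above, the second value is 0.7 × 2 + 0.3 × 1 and the third is 0.7 × 3 + 0.3 × 1.7, so the output tracks the input closely because α is large.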

According to the scheme, the network model fills the missing values of the initial motor data, removes the motor data whose data value is greater than the preset data threshold from the filled motor data, and performs exponential weighted average filtering on the motor data with abnormal values removed to obtain the preprocessed motor data. This improves the integrity, accuracy and quality of the data, and enhances the effectiveness and reliability of the network model's motor temperature analysis.

Referring to fig. 7 and 8, fig. 7 is a flow chart illustrating an exemplary embodiment of a training method for a network model shown in the present application; fig. 8 is a schematic diagram of a model structure of a training method of the network model shown in the present application. Specifically, the method may include the steps of:

Step S610: Perform convolution processing on the model training data to obtain a first data feature comprising a time sequence.

The convolution process, which may be used for filtering, feature extraction, image enhancement, etc., is a common technique of performing a convolution operation on an input signal and a filter (also called a convolution kernel or kernel function) to generate an output signal, and extracting specific features or information in the input signal from the output signal by the filter, where the convolution process may be implemented by a discrete convolution or a Convolutional Neural Network (CNN), etc.

Illustratively, when predicting the temperature of a permanent magnet synchronous motor using the LSTNet model, a convolution layer can extract short-term features in the time series and capture local dependencies between variables. The convolution layer typically contains a plurality of filters, each with a width w and a height n. Each filter may convolve the input matrix and generate an output feature vector $h_k$, wherein the formula of the convolution operation is as follows:

$$h_k = \mathrm{RELU}(w_k * X + b_k)$$

wherein $h_k$ may be the output feature vector, $*$ may represent the convolution operation, $w_k$ and $b_k$ may represent the weights and bias of the k-th filter, respectively, X may be the input matrix, and RELU may represent the rectified linear unit function.
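A minimal NumPy sketch of such a convolution layer is given below. It assumes that each filter spans all n variables and slides only along the time axis, that no padding is applied, and that "convolution" means the sliding dot product used by deep learning frameworks; the function name `conv_layer` and the array shapes are choices made for this example.

```python
import numpy as np

def conv_layer(X, W, b):
    """Apply K filters of width w and height n to a (T, n) input matrix.

    X: (T, n) input, W: (K, w, n) filter weights, b: (K,) biases.
    Returns H of shape (T - w + 1, K) with
    H[t, k] = ReLU(sum(W[k] * X[t:t+w]) + b[k]).
    """
    T, n = X.shape
    K, w, _ = W.shape
    H = np.empty((T - w + 1, K))
    for t in range(T - w + 1):
        window = X[t:t + w]  # (w, n) slice of consecutive time steps
        # Contract each filter with the window to get one value per filter.
        H[t] = np.tensordot(W, window, axes=([1, 2], [0, 1])) + b
    return np.maximum(H, 0.0)  # rectified linear unit

X = np.arange(8.0).reshape(4, 2)   # 4 time steps, 2 variables
W = np.ones((1, 2, 2))             # one filter of width 2
H = conv_layer(X, W, np.array([-10.0]))
```

Each column of `H` corresponds to one output feature vector $h_k$.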

Step S620: Perform time sequence feature extraction on the first data features comprising the time sequence based on the bidirectional long short-term memory network to obtain second data features comprising the time sequence.

The bidirectional long short-term memory network (BiLSTM) is a commonly used recurrent neural network (RNN) architecture for processing sequence data. Compared with the traditional unidirectional LSTM, the BiLSTM can model sequence data in the forward and backward directions at the same time, so that the information in a sequence can be captured more comprehensively. The BiLSTM comprises two LSTM layers that run in the forward and reverse directions, respectively: the forward layer processes the input sequence in time order, the reverse layer processes it in reverse time order, and the outputs of the two layers are concatenated or fused to form the overall BiLSTM output.

For example, when the LSTNet model is used to predict the temperature of the permanent magnet synchronous motor, the LSTNet model may build a BiLSTM model comprising LSTM layers in two directions (forward and reverse), arrange the first data features comprising the time sequence obtained by the convolution processing into a format suitable for input to the BiLSTM, and divide them into an input sequence and a corresponding target sequence. The forward LSTM layer processes the input sequence and the reverse LSTM layer processes the reversed input sequence, wherein the input sequence may be a sequence of feature vectors arranged in time order, and the target sequence may be the feature vector of the next time step corresponding to the input sequence. After the BiLSTM model is trained, it may be used to extract the second data features and obtain the feature representations output by the model, where the feature representations may be the time sequence features learned by the LSTNet model in the LSTM layers, including long-term dependencies, periodic patterns, and the like. In the BiLSTM, the hidden state of the recurrent unit at each time step t is calculated as follows:

$$r_t = \sigma(x_t w_{xr} + h_{t-1} w_{hr} + b_r)$$

$$u_t = \sigma(x_t w_{xu} + h_{t-1} w_{hu} + b_{lu})$$

$$c_t = \mathrm{RELU}(x_t w_{xc} + r_t \odot (h_{t-1} w_{hc}) + b_c)$$

$$h_t = (1 - u_t) \odot h_{t-1} + u_t \odot c_t$$

wherein $r_t$ may be expressed as a reset gate, used to control the degree to which the hidden state $h_{t-1}$ at the previous time is reset at the current time; $u_t$ may be expressed as an update gate, used to control the degree of fusion between the hidden state $h_{t-1}$ at the previous time and the new candidate hidden state $c_t$ at the current time; $c_t$ may refer to the new candidate hidden state, representing a candidate for the new hidden state at the current time; $h_t$ may refer to the hidden state at the current time, that is, the final hidden state obtained from the update gate and the new candidate hidden state; $x_t$ may refer to the input vector or feature at the current time; $w_{xr}$, $w_{xu}$, $w_{xc}$ may refer to the weights between the input vector $x_t$ and the reset gate, the update gate and the new candidate hidden state, respectively; $w_{hr}$, $w_{hu}$, $w_{hc}$ may refer to the weights between the hidden state $h_{t-1}$ at the previous time and the reset gate, the update gate and the new candidate hidden state, respectively; $b_r$, $b_{lu}$, $b_c$ may refer to bias terms; σ may represent the sigmoid function; ⊙ may represent element-wise multiplication; RELU may represent the rectified linear unit function. The reset gate controls how much historical information in the time sequence data is retained; the update gate controls the update ratio between the candidate hidden state and the hidden state at the previous time; the candidate hidden state combines the information of the current input, the reset gate and the hidden state at the previous time; and the final hidden state at the current time is obtained as a linear combination, weighted by the update gate, of the previous hidden state and the new candidate hidden state.
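One step of the gated update described by the four formulas above can be sketched in NumPy as follows. This is only an illustration of the arithmetic: the function names, the dictionary of parameters, and the row-vector convention (inputs multiplied on the left of the weight matrices) are assumptions, and the dictionary keys simply mirror the symbols used in the text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def recurrent_step(x_t, h_prev, p):
    """Compute h_t from the current input x_t and the previous hidden state h_prev."""
    r_t = sigmoid(x_t @ p["w_xr"] + h_prev @ p["w_hr"] + p["b_r"])    # reset gate
    u_t = sigmoid(x_t @ p["w_xu"] + h_prev @ p["w_hu"] + p["b_lu"])   # update gate
    # Candidate hidden state: the reset gate scales the contribution of h_prev.
    c_t = np.maximum(x_t @ p["w_xc"] + r_t * (h_prev @ p["w_hc"]) + p["b_c"], 0.0)
    # Final hidden state: blend of previous state and candidate, weighted by u_t.
    return (1.0 - u_t) * h_prev + u_t * c_t

d_in, d_h = 2, 3  # hypothetical input and hidden sizes
params = {
    "w_xr": np.zeros((d_in, d_h)), "w_hr": np.zeros((d_h, d_h)), "b_r": np.zeros(d_h),
    "w_xu": np.zeros((d_in, d_h)), "w_hu": np.zeros((d_h, d_h)), "b_lu": np.zeros(d_h),
    "w_xc": np.zeros((d_in, d_h)), "w_hc": np.zeros((d_h, d_h)), "b_c": np.zeros(d_h),
}
h_t = recurrent_step(np.ones(d_in), np.array([2.0, 4.0, 6.0]), params)
```

With all weights and biases set to zero, both gates equal 0.5 and the candidate state is zero, so the new hidden state is half the previous one, which makes the blending role of the update gate easy to see.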

Step S630: Perform data jump and full connection processing on the second data features comprising the time sequence until the network model converges, to obtain a target network model meeting the requirements.

The data jump (skip connection) may refer to a technique of introducing additional connection in the neural network and retaining the output of the intermediate layer, which is used for improving the gradient propagation and model learning ability, solving the problems of gradient disappearance or gradient explosion, etc., and improving the representation ability and training effect of the network. In data hopping, the output of part of the middle layer of the network is skipped and connected directly to the subsequent or output layer, forming a hopping connection that bypasses some layers.

The full connection process may refer to a process of connecting each neuron with all neurons of a previous layer in a neural network model. In the fully connected layer, each neuron receives all inputs from the previous layer and performs weighted summation on the inputs, then nonlinear conversion is performed through an activation function, specifically, in the fully connected layer, each input characteristic is multiplied by a corresponding weight, and then the result is added with a bias term to obtain the activation value of the neuron.

For example, when predicting the temperature of the permanent magnet synchronous motor using the LSTNet model, the second data features comprising the time sequence may be arranged into a suitable input format, and a network comprising a plurality of fully connected layers, skip connections or other types of layers may be constructed. The hidden state of the recurrent unit at each time step t is calculated as follows:

$$r_t = \sigma(x_t w_{xr} + h_{t-p} w_{hr} + b_r)$$

$$u_t = \sigma(x_t w_{xu} + h_{t-p} w_{ht} + b_u)$$

$$c_t = \mathrm{RELU}(x_t w_{xc} + r_t \odot (h_{t-p} w_r) + b_c)$$

$$h_t = (1 - u_t) \odot h_{t-p} + u_t \odot c_t$$

Wherein $r_t$ may be referred to as a reset gate, used to control, given the input $x_t$ at the current time, the degree to which the historical hidden state $h_{t-p}$ is reset at the current time; $u_t$ may be referred to as an update gate, used to control the degree of fusion between the historical hidden state $h_{t-p}$ and the new candidate hidden state $c_t$ at the current time; $c_t$ may refer to the new candidate hidden state, representing a candidate for the new hidden state at the current time; $h_t$ may refer to the hidden state at the current time, that is, the final hidden state obtained from the update gate and the new candidate hidden state; $x_t$ may refer to the input vector or feature at the current time; $w_{xr}$, $w_{xu}$, $w_{xc}$ may refer to the weights between the input vector $x_t$ and the reset gate, the update gate and the new candidate hidden state, respectively; $w_{hr}$, $w_{ht}$, $w_r$ may refer to the weights between the historical hidden state $h_{t-p}$ and the reset gate, the update gate and the new candidate hidden state, respectively; $b_r$, $b_u$, $b_c$ may refer to bias terms; σ may refer to the sigmoid function, which maps input values to the range [0, 1]; p may represent the time step offset that specifies the position of the historical hidden state $h_{t-p}$. By introducing a historical hidden state, the model can better capture the long-term dependencies of the time series and take into account the information of previous time steps.
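The effect of the offset p can be illustrated by running this recurrence over a whole sequence, where each step reads the hidden state from p steps earlier rather than from the immediately preceding step. In the sketch below the function name, the parameter dictionary and the choice of a zero state before the start of the sequence are assumptions; the dictionary keys mirror the symbols used in the text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def skip_recurrent_states(X, params, p, hidden_size):
    """Hidden states h_t computed from h_{t-p}; states before t = 0 are taken as zero."""
    T = X.shape[0]
    H = np.zeros((T, hidden_size))
    for t in range(T):
        h_skip = H[t - p] if t >= p else np.zeros(hidden_size)  # historical state h_{t-p}
        r = sigmoid(X[t] @ params["w_xr"] + h_skip @ params["w_hr"] + params["b_r"])
        u = sigmoid(X[t] @ params["w_xu"] + h_skip @ params["w_ht"] + params["b_u"])
        c = np.maximum(X[t] @ params["w_xc"] + r * (h_skip @ params["w_r"]) + params["b_c"], 0.0)
        H[t] = (1.0 - u) * h_skip + u * c
    return H

# Hypothetical sizes: 4 time steps, 1 input variable, 1 hidden unit, offset p = 2.
zeros_in, zeros_h = np.zeros((1, 1)), np.zeros((1, 1))
params = {
    "w_xr": zeros_in, "w_hr": zeros_h, "b_r": np.zeros(1),
    "w_xu": zeros_in, "w_ht": zeros_h, "b_u": np.zeros(1),
    "w_xc": zeros_in, "w_r": zeros_h, "b_c": np.ones(1),
}
H = skip_recurrent_states(np.zeros((4, 1)), params, p=2, hidden_size=1)
```

In this toy setting the states at steps 2 and 3 build on the states at steps 0 and 1 respectively, so the sequence splits into two interleaved chains, which is how a skip of p steps can follow a pattern that repeats every p steps.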

The outputs of the recurrent layer and the recurrent-skip layer are combined through the fully connected layer as the output of the model, and the calculation formula is as follows:

$$h_t^D = W^R h_t^R + \sum_{i=0}^{p-1} W_i^S h_{t-i}^S + b$$

wherein $h_t^D$ may represent the hidden state of the target state; $W^R$ may represent the weight matrix that converts the hidden state $h_t^R$ of the reference state into the hidden state $h_t^D$ of the target state; $h_t^R$ may represent the hidden state of the reference state; $W_i^S$ may represent the weight matrix that converts the auxiliary hidden state $h_{t-i}^S$ at the past time i into the hidden state of the target state; $h_{t-i}^S$ may represent the auxiliary hidden state at the past time i; b may represent a bias term; p may represent the number of historical time steps of the auxiliary hidden state.
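A hedged sketch of this combination step follows. It assumes the fully connected layer forms a weighted sum of the reference hidden state and the p most recent auxiliary hidden states plus a bias, as in the published LSTNet architecture; the function name, argument names and shapes are illustrative assumptions.

```python
import numpy as np

def combine_outputs(h_R, h_S_hist, W_R, W_S, b):
    """Dense combination of the reference state and p auxiliary states.

    h_R: (d_r,) reference hidden state at time t.
    h_S_hist: (p, d_s) auxiliary hidden states, h_S_hist[i] is the state at time t - i.
    W_R: (d_out, d_r), W_S: (p, d_out, d_s), b: (d_out,).
    """
    out = W_R @ h_R + b
    for i in range(len(h_S_hist)):
        out = out + W_S[i] @ h_S_hist[i]  # contribution of the auxiliary state i steps back
    return out

h_D = combine_outputs(
    h_R=np.array([1.0, 2.0]),
    h_S_hist=np.array([[1.0, 0.0], [0.0, 2.0]]),
    W_R=np.ones((1, 2)),
    W_S=np.ones((2, 1, 2)),
    b=np.array([0.5]),
)
```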

Because the convolution and recurrent layers are non-linear, and real motor temperature data can change continuously in a non-periodic manner, the prediction accuracy of the model can be greatly reduced. To address this, the LSTNet model can decompose the final prediction into a linear part and a non-linear part containing repeating patterns, and adopt a classical autoregressive model as the linear component to extract local information in the time sequence, wherein the autoregressive model is shown in the following formula:

$$h_{t,i}^L = \sum_{k=0}^{q^{ar}-1} W_k^{ar} y_{t-k,i} + b^{ar}$$

wherein $h_{t,i}^L$ may represent the final hidden state at position i of time step t; $W_k^{ar}$ may represent the weight that converts the output $y_{t-k,i}$ at position i at the past time k into the final hidden state $h_{t,i}^L$; $y_{t-k,i}$ may represent the output at position i at the past time k; $b^{ar}$ may represent a bias term; $q^{ar}$ may represent the number of historical time steps of the past outputs.
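The autoregressive component amounts to a weighted sum over a short window of past values, computed independently for each position i. The sketch below assumes one weight per lag that is shared across positions, as in the published LSTNet architecture; the function name and the toy history are assumptions for illustration.

```python
import numpy as np

def autoregressive_component(Y_hist, W_ar, b_ar):
    """Linear autoregressive output for every position i.

    Y_hist: (q, m) past values, Y_hist[k] holds the values at time t - k.
    W_ar: (q,) one weight per lag (assumed shared across positions).
    b_ar: scalar bias.
    Returns a vector of length m.
    """
    return W_ar @ Y_hist + b_ar

# Hypothetical history of 3 past steps for 2 positions.
Y_hist = np.array([[60.0, 10.0], [58.0, 12.0], [56.0, 14.0]])
h_L = autoregressive_component(Y_hist, np.array([0.5, 0.3, 0.2]), 0.0)
```

Because this part is linear, its output scales directly with the scale of the input, which is the property the non-linear layers lack.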

The output of the fully connected layer of the neural network and the output of the autoregressive model are superimposed as the final prediction result of the model, as follows:

$$\hat{Y}_t = h_t^D + h_t^L$$

The network model can learn nonlinear relationships and complex patterns, while the autoregressive model can capture the historical dependencies of the time series data, so superimposing their outputs captures the characteristics and regularities of the data more comprehensively. The output of the fully connected layer generally contains static features of the current input, and the output of the autoregressive model contains dynamic features of the historical moments; superimposing the two combines the static and dynamic features and takes both the current input and the historical context into account, so that the model can predict more comprehensively and accurately. Wherein $\hat{Y}_t$ may be expressed as the predicted value of the t-th time step, that is, the output of the model; $h_t^D$ may be expressed as the dynamic component of the t-th time step, used to represent the dynamic, varying part of the input data; $h_t^L$ may be expressed as the long-term component of the t-th time step, used to represent the long-term trend or periodicity in the input data.

In the training process of the LSTNet model, the mean square error can be used as the loss function of the model, and the calculation formula is as follows:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$$

The mean square error (MSE) may be used to measure the average difference between the predicted and real values: the square of the difference between the predicted and real values is calculated for each sample, the squares are summed over all samples, and the sum is divided by the number of samples n to obtain the average. The smaller this average, the smaller the difference between the predicted and real values and the better the model performance. Wherein n may be expressed as the number of samples in the data set; $y_i$ may be expressed as the true or observed value of the i-th sample; $\hat{y}_i$ may be expressed as the predicted value of the i-th sample, which is typically predicted by the model from the input data.
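The loss can be sketched in one line of NumPy. The function name and the toy temperature values are assumptions for illustration.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Hypothetical measured and predicted temperatures for three samples.
loss = mean_squared_error([60.0, 62.0, 65.0], [61.0, 62.0, 63.0])
```

Here the squared errors are 1, 0 and 4, so the loss is 5/3.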

According to the scheme, the network model obtains the first data features by convolution processing and then extracts the second data features from them through the bidirectional long short-term memory network, so that local or global features in the time sequence data can be captured, the dimension of the data is reduced, and the important features related to motor temperature prediction are extracted. By introducing a bidirectional LSTM layer into the model, the network model can extract more comprehensive and richer time sequence features. Skipping some layers or connections in the network structure accelerates the training and convergence of the model and helps to avoid gradient vanishing or gradient explosion, and the further processing and integration performed by the fully connected layers enhances the expression capability and decision capability of the model.

Referring to fig. 9, fig. 9 is a flowchart illustrating an exemplary embodiment of a network model-based motor temperature prediction method according to the present application. Specifically, the method may include the steps of:

Step S710: Acquire actual motor data.

For example, when predicting the temperature of the permanent magnet synchronous motor based on a network model, the network model may collect actual motor data using motor-related sensors (e.g., speed sensor, torque sensor, etc.), etc., where the sensors may be directly connected to the motor system, and the actual motor data may be obtained by reading the sensor output data. For installed motor systems, actual motor data may also be obtained by means of meters or recording devices that record the operating data of the motor, which typically record information about the running time, speed, temperature, power consumption, etc. of the motor, and by periodically downloading or exporting the recorded data. In addition, if the motor system is already integrated with a data storage system (e.g., database), the network model may also retrieve the actual motor data by querying the database, and based on the database structure and stored data format, may write a query statement to retrieve the desired actual motor data.

Step S720: Input the actual motor data into the target network model to obtain the motor temperature predicted by the target network model.

Illustratively, when the format of the actual motor data matches the input format of the target network model, the corresponding functions and methods of the programming language and deep learning framework used by the target network model may be called to pass the actual motor data to the target network model, so that the target network model generates a prediction of the motor temperature from the input actual motor data.

According to the scheme, the actual motor data are input into the target model, and the model can learn the modes and trends of the motor data and predict according to the modes and trends. Compared with a simple statistical method, the target model can better capture the complexity and nonlinear relation in the time series data, thereby improving the prediction accuracy. Moreover, the target model can effectively capture long-term dependency in the time series data, and feature extraction is performed on different time scales, so that the target model can simultaneously consider time information of different dimensions, and therefore modes and trends in the time series data can be better described.

It should be further noted that the training method of the network model may be executed by a training apparatus for the network model. For example, the training apparatus for the network model may be a terminal device, a server or another processing device, where the terminal device may be a user equipment (UE), a computer, a mobile device, a user terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like.

FIG. 10 is a schematic diagram of an embodiment of a training apparatus for the network model of the present application. As shown in fig. 10, the exemplary training apparatus 800 for the network model includes an acquisition module 810, a data selection module 820 and a training module 830. Specifically, the acquisition module 810 is configured to obtain the data features of the target motor data; the data selection module 820 is configured to select model training data from the data features of the target motor data according to the power state of the target motor; the training module 830 is configured to input the model training data into the network model for training until the network model converges, so as to obtain a target network model that meets the requirements.

The functions of each module may refer to a training method embodiment of the network model, which is not described herein.

Referring to fig. 11, fig. 11 is a schematic structural diagram of an embodiment of an electronic device of the present application. The electronic device 900 comprises a memory 901 and a processor 902, the processor 902 being arranged to execute program instructions stored in the memory 901 for implementing the steps in a training method embodiment of the network model. In one particular implementation scenario, electronic device 900 may include, but is not limited to: the microcomputer and the server, and the electronic device 900 may also include mobile devices such as a notebook computer and a tablet computer, which are not limited herein.

In particular, the processor 902 is configured to control itself and the memory 901 to implement the steps of any of the above-described embodiments of the training method of the network model. The processor 902 may also be referred to as a CPU (Central Processing Unit). The processor 902 may be an integrated circuit chip having signal processing capabilities. The processor 902 may also be a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. In addition, the processor 902 may be jointly implemented by integrated circuit chips.

Referring to fig. 12, fig. 12 is a schematic structural diagram of an embodiment of a computer storage medium of the present application. The computer storage medium 903 stores program instructions 904 that can be executed by the processor, the program instructions 904 for implementing the steps in the training method embodiments of any of the network models described above.

The foregoing description of the various embodiments is intended to highlight the differences between them; for their identical or similar parts, the embodiments may be referred to one another, and these parts are not repeated herein for the sake of brevity.

In the several embodiments provided in the present application, it should be understood that the disclosed methods and apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical, or other forms.

In addition, the functional units in the embodiments of the present application may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in the form of hardware or in the form of software functional units. The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.

Claims

1. A method for training a network model, the method comprising:

acquiring data characteristics of target motor data;

selecting model training data from the data characteristics of the target motor data according to the power state of the motor;

and inputting the model training data into the network model for training until the network model converges, so as to obtain a target network model meeting the requirements.

2. The method of training a network model according to claim 1, wherein the step of selecting model training data from the data features of the target motor data according to the power state of the motor comprises:

filtering motor data with the power state in the target motor data being in a closed state to obtain filtered target motor data;

and taking the data characteristics of the target motor data with the rotating speed being greater than or equal to a preset rotating speed threshold value and the torque being greater than or equal to a preset torque threshold value in the filtered target motor data as the model training data.

3. The method of claim 1, wherein the step of obtaining the data characteristics of the target motor data comprises:

Acquiring initial motor data;

preprocessing the initial motor data to obtain preprocessed motor data;

and extracting the characteristics of the preprocessed motor data to obtain the data characteristics of the target motor data.

4. The method for training a network model according to claim 3, wherein the step of extracting the characteristics of the preprocessed motor data to obtain the data characteristics of the target motor data comprises:

performing characteristic correlation analysis on the preprocessed motor data to obtain analyzed motor data;

and extracting the data characteristics of the analyzed motor data to obtain the data characteristics of the target motor data.
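
Claim 4 does not name a particular correlation measure. As one possible reading, the following Python sketch ranks candidate signals by the absolute Pearson correlation with the measured temperature and keeps those that reach a minimum value. The choice of Pearson correlation, the cut-off parameter min_abs_corr, and the function names are assumptions made for this sketch.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        return 0.0
    return cov / (dev_x * dev_y)


def select_features(columns, target, min_abs_corr):
    """Keep the names of columns whose |correlation| with the target is large enough."""
    return [
        name
        for name, values in columns.items()
        if abs(pearson(values, target)) >= min_abs_corr
    ]
```

A constant signal carries no information about the temperature, so the sketch assigns it a correlation of zero instead of dividing by zero.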

5. The method for training a network model according to claim 3, wherein the step of preprocessing the initial motor data to obtain preprocessed motor data comprises:

filling missing values in the initial motor data to obtain filled motor data;

removing, from the filled motor data, motor data whose data value is larger than a preset data threshold to obtain motor data with abnormal values removed;

and performing exponentially weighted average filtering on the motor data with the abnormal values removed to obtain the preprocessed motor data.
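
The three preprocessing steps of claim 5 can be sketched for a single signal as follows. The claim does not state how missing values are filled or which smoothing factor is used; the forward fill, the default alpha of 0.5, and all function names are assumptions made for this sketch.

```python
def fill_missing(values):
    """Forward-fill None entries; leading gaps take the first observed value (assumed rule)."""
    first = next((v for v in values if v is not None), None)
    if first is None:
        return []
    filled, last = [], first
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def remove_outliers(values, threshold):
    """Drop samples whose value is larger than the preset data threshold."""
    return [v for v in values if v <= threshold]


def ewma(values, alpha):
    """Exponentially weighted average: s[t] = alpha * x[t] + (1 - alpha) * s[t-1]."""
    smoothed = []
    for v in values:
        previous = smoothed[-1] if smoothed else v
        smoothed.append(alpha * v + (1 - alpha) * previous)
    return smoothed


def preprocess(values, threshold, alpha=0.5):
    """Apply the steps in the claimed order: fill, remove abnormal values, filter."""
    return ewma(remove_outliers(fill_missing(values), threshold), alpha)
```

For example, preprocess([None, 2.0, None, 100.0, 4.0], threshold=10.0) fills the gaps with 2.0, drops the 100.0 sample, and smooths the remainder to [2.0, 2.0, 2.0, 3.0].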

6. The method for training a network model according to claim 1, wherein the network model comprises a bidirectional long short-term memory (BiLSTM) sub-network, and the step of inputting the model training data into the network model for training until the network model converges, so as to obtain a target network model meeting the requirements, comprises:

carrying out convolution processing on the model training data to obtain first data features comprising a time sequence;

performing time sequence feature extraction on the first data features comprising the time sequence based on the bidirectional long short-term memory sub-network to obtain second data features comprising the time sequence;

and performing data jump and full connection processing on the second data features comprising the time sequence until the network model converges to obtain a target network model meeting the requirements.

7. A method for predicting motor temperature based on a network model, the method comprising:

acquiring actual motor data;

inputting the actual motor data into the target network model to obtain the motor temperature predicted by the target network model;

wherein the target network model is obtained based on the training method of any one of claims 1 to 6.

8. A network model-based motor temperature prediction apparatus, the apparatus comprising:

the acquisition module is used for acquiring the data characteristics of the target motor data;

the data selection module is used for selecting model training data from the data characteristics of the target motor data according to the power state of the target motor;

and the training module is used for inputting the model training data into the network model for training until the network model converges to obtain a target network model meeting the requirements.

9. An electronic device comprising a memory and a processor for executing program instructions stored in the memory to implement the method of any one of claims 1 to 7.

10. A computer-readable storage medium having stored thereon program instructions which, when executed by a processor, implement the method of any one of claims 1 to 7.