CN114723583A - Unstructured electric power big data analysis method based on deep learning

Info

Publication number
CN114723583A
Authority
CN
China
Prior art keywords
image
data
power supply
convolution
matrix
Legal status
Pending
Application number
CN202210301556.9A
Other languages
Chinese (zh)
Inventor
梁志远
刘鹏
张硕
常迪
邓嶔
郑薇
米兆祥
崔晓萌
王薇
Current Assignee
Tianjin Sanyuan Electric Information Technology Co ltd
Original Assignee
Tianjin Sanyuan Electric Information Technology Co ltd
Application filed by Tianjin Sanyuan Electric Information Technology Co ltd
Priority to CN202210301556.9A
Publication of CN114723583A


Classifications

    • G06Q 50/06 Information and communication technology specially adapted for energy or water supply
    • G06F 16/55 Information retrieval of still image data: clustering; classification
    • G06F 18/22 Pattern recognition: matching criteria, e.g. proximity measures
    • G06F 18/23213 Pattern recognition: non-hierarchical clustering with a fixed number of clusters, e.g. K-means clustering
    • G06F 40/151 Handling natural language data: text processing; use of codes for handling textual entities; transformation
    • G06F 40/242 Natural language analysis: lexical tools; dictionaries
    • G06F 40/289 Natural language analysis: phrasal analysis, e.g. finite state techniques or chunking
    • G06N 3/045 Neural networks: combinations of networks
    • G06N 3/08 Neural networks: learning methods
    • G06Q 10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention provides an unstructured electric power big data analysis method based on deep learning. The method integrates unstructured data such as video, images and documents and provides a multi-modal joint deep learning algorithm for unstructured electric power big data. It intelligently recognizes unstructured data such as video, images and documents, improves the unstructured data processing capability of the big data platform, delivers stronger feature learning and analysis/prediction capabilities for unstructured power data during big data processing, and enables demonstration applications in typical power-industry scenarios.

Description

Unstructured electric power big data analysis method based on deep learning
Technical Field
The invention relates to the technical field of data analysis, in particular to an unstructured electric power big data analysis method based on deep learning.
Background
With the rapid development of information technology, ever more data is generated, stored and used around the world, and at an ever-increasing rate. As an industry vital to the national economy and people's livelihood, electric power production and enterprise management are merging with information technology at unprecedented breadth and depth, and data has become both a new opportunity and a new challenge for driving the development of the power industry and related industries.
Unstructured data is data with an irregular or incomplete structure and no predefined data model, which is inconvenient to represent in the two-dimensional logical tables of a database. It includes office documents in all formats, plain text, pictures, XML, HTML, reports of various types, images and audio/video information. Roughly 80% of the data in an enterprise is unstructured, and this data grows exponentially at about 60% per year; it is reported that on average only 1%-5% of data is structured. Today, this explosive growth of unexploited data consumes the capacity of the complex and expensive primary storage in enterprises.
Against this background, how to efficiently use information-technology means such as unstructured data analysis to provide flexible power consumption for urban power supply networks, comprehensively process and analyze massive data, analyze users' electricity consumption behavior across different time periods, categories and fine granularities, and realize efficient and rapid management of electric power big data is an urgent need of urban development at the present stage.
Disclosure of Invention
The object of the present invention is to solve at least one of the technical drawbacks mentioned.
Therefore, an object of the present invention is to provide an unstructured electric power big data analysis method based on deep learning, so as to solve the problems mentioned in the background art and overcome the disadvantages in the prior art.
In order to achieve the above object, an embodiment of one aspect of the present invention provides an unstructured electric power big data analysis method based on deep learning, characterized in that video data preprocessing, image data preprocessing and text data preprocessing are adopted to analyze and process electric power big data.
The video data preprocessing analyzes and processes the video data; the image data preprocessing analyzes and processes the image data; and the text data preprocessing uses natural language processing technology to preprocess the electric power big data.
The image preprocessing comprises image graying, geometric transformation, image enhancement and decoding. The graying uses the maximum value method, taking the maximum of the three component intensities of a color pixel as the gray value of the grayscale image. The geometric transformation processes the acquired image with geometric transformation methods and a gray interpolation algorithm to correct systematic errors of the image acquisition system and random errors of the instrument position. The image enhancement uses a spatial domain method comprising point operations and neighborhood denoising operations. The image decoding, performed after the image enhancement operation, decodes the image with the TensorFlow framework and converts it into the image's original pixel matrix.
Preferably, the method further comprises power supply area division, which specifically comprises:
Step H1: count the cells within the power supply area;
Step H2: randomly initialize the power supply areas by randomly assigning each cell to one of K clusters C_1, C_2, ..., C_K, where K is the preset number of power supply areas;
Step H3: compute the cluster center of each power supply area with the following formula:

μ_C = (1 / |C|) · Σ_{i∈C} x_i

where |C| is the number of cells contained in power supply area C and x_i = (x_i^(1), ..., x_i^(t)) records the overall average daily electricity consumption of the i-th cell over t days;
Step H4: compute the similarity between each cell and the cluster center of each power supply area;
Step H5: based on the similarity results of step H4, assign each cell to the power supply area whose cluster center it is most similar to;
Step H6: determine whether the power supply area assignment of the cells has converged; if so, output the resulting division of cells into power supply areas, otherwise return to step H3.
In any of the above schemes, preferably, the method further includes constructing a power prediction model, specifically:
Step E1: for power supply area C_P, assume that the electricity consumption of user U_i is linearly correlated over D+1 consecutive days, so that the consumption on a given day can be predicted by a linear combination of the consumption x_i on the preceding D days, as in the following equation:

y_i = Σ_{d=1}^{D} w_p^(d) · x_i^(d) + b_i

where w_p is the linear combination parameter of area C_P and b_i is an error variable, a varying random error term for user U_i. Writing y_i, x_i and b_i as the stacked vector/matrix forms of the observed consumption, the preceding-day consumption and the error terms over all predicted days, and w_p as the parameter vector, the linear combination model of user U_i's electricity consumption is:

y_i = x_i · w_p + b_i

Step E2: train a separate model for each power supply area. Combine the electricity consumption matrices x_i of all users in power supply area C_P into a matrix X_p, the consumption values to be predicted into a vector Y_p, and all error terms b_i into a vector B_p; then all the linear models combine into:

Y_p = X_p · w_p + B_p

and the parameter w_p is estimated from the matrix X_p and the vector Y_p of C_P.

Step E3: solve for C_P with the least squares method. Assuming the error B_p has finite variance and zero mean, i.e. E[B_p] = 0, the least squares solution of w_p is:

w_p = (X_p^T · X_p)^(-1) · X_p^T · Y_p

and the error of the prediction model is the residual:

B_p = Y_p - X_p · w_p

Step E4: model each power supply area C_P with its own linear regression model and solve for w_p as above, obtaining user electricity consumption prediction models for the different power supply areas.
In any of the above schemes, preferably, the method further includes multi-task joint learning: an iterative joint learning algorithm that, in each iteration, shares user data and simultaneously optimizes the prediction models of different power supply areas, so as to improve model performance across all areas. Specifically:
Step F1: construct a reference electricity consumption prediction model for each power supply area. On power supply area C_P, build a linear prediction model from the matrix X_p and the vector Y_p, and solve for the parameter w_p and the error B_p with the least squares algorithm as the reference model of area C_P; denote by S the similarity matrix of the overall electricity consumption behavior between areas.
Step F2: fuse data across areas according to the overall electricity consumption behavior similarity matrix S. For every other power supply area C_q, according to the overall electricity demand similarity S_pq between C_P and C_q (the entry in row p, column q of S), randomly sample user data from C_q with probability S_pq and fuse it with the user data of C_P, obtaining X_{p∪q} and Y_{p∪q}.
Step F3: using the least squares algorithm, solve from X_{p∪q} and Y_{p∪q} for the model parameter W_{p∪q} that minimizes the joint learning loss function, together with the prediction error B_{p∪q}.
Step F4: check whether the models of all areas have been updated: if so, go to step F5; otherwise, return to step F2.
Step F5: check whether the least squares algorithm has converged: if it has, i.e. no model in any area is updated further, output the result; otherwise, return to step F2.
In any of the above schemes, preferably, the video data preprocessing includes shot segmentation and key frame extraction. Shot segmentation uses a histogram-based method: the gray scale, brightness or color of the pixels of adjacent frames is divided into N levels, and the number of pixels at each level is counted for a histogram comparison. Key frame extraction classifies the images in the image library with a K-means clustering algorithm.
In any of the above aspects, preferably, the text data preprocessing includes:
Step Q1: segment the text data into words with a Chinese word segmentation tool;
Step Q2: remove stop words from the segmented data using a stop-word dictionary;
Step Q3: convert the stop-word-filtered data into a structured data form that a computer can recognize and analyze, using the Word2Vec toolkit.
In any of the above schemes, preferably, the method further includes video feature extraction, specifically:
Step K1: acquire the original image and extract its color features with a color histogram;
Step K2: smooth-filter the original image, compute pixel gradients, select points with a large gradient change rate as image edge points, and extract the edge features of the original image;
Step K3: extract the texture features of the original image with Gabor filtering;
Step K4: divide the image's pixel regions into blocks and call a matching algorithm to estimate a motion vector for each block, so as to express the motion features of the video.
In any of the above schemes, preferably, the method further includes image analysis based on a convolutional neural network, specifically:
Step D1: acquire the original pixel matrix of the image and represent it as a three-dimensional matrix, where the length and width of the matrix give the image size and the depth gives the color channels; a black-and-white picture has depth 1, and an image in RGB color mode has depth 3.
Step D2: extract multiple features of the image with multiple convolution kernels; convolve the input image with trainable filters and then add a bias, obtaining the convolutional layer once feature extraction is complete.
Step D3: take the maximum or average over adjacent regions of the convolutional layer, add the corresponding weight and bias, pass the result through an activation function, and output the feature map of the sampling layer.
Step D4: extract local features of the image by connecting each convolution kernel to local pixels of the previous layer's feature map, then convolve the kernels with the entire previous feature map to obtain the convolution result over the image's global features.
Step D5: add the bias parameter to the convolution result and compute the feature map of the convolutional layer through the activation function:

x_out = f( Σ_{n=1}^{N} Σ_{m=1}^{M} w_{n,m} · u_{n,m} + b )

where f is the Sigmoid activation function, b the bias, w_{n,m} the weight of the convolution kernel at position (n, m), N and M the length and width of the convolution kernel, and u the feature map output by the previous layer.
Step D6: add a pooling layer between the convolutional layers.
In any of the above schemes, preferably, the method further includes serialization-based text data analysis, used for analyzing text and similar data, which constructs a serialized text convolution operation according to the one-dimensional sequence structure inside the data, specifically:
Step P1: build the text input layer by arranging the word vectors of the words in a sentence, in order, into a matrix.
Step P2: build the text convolution layers, where the size of each text convolution kernel is filter_size × embedding_size. Here filter_size is the number of words the text convolution kernel spans in the vertical direction (so the word-order relationship between adjacent words is taken into account), and embedding_size is the dimension of the word vectors.
Step P3: build the text pooling layer and extract the maximum value of each column vector obtained from the text convolution.
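The following is a minimal text-CNN sketch of steps P1-P3 in TensorFlow/Keras. The vocabulary size, sentence length, embedding_size, filter sizes and the number of output classes are illustrative assumptions, not values taken from the patent.

```python
import tensorflow as tf

# Text-CNN sketch for steps P1-P3: word vectors are stacked into a sentence matrix,
# convolved with kernels of size filter_size x embedding_size, and max-pooled column-wise.
vocab_size, seq_len, embedding_size = 5000, 64, 128        # illustrative values
filter_sizes, num_filters = (3, 4, 5), 100

inputs = tf.keras.Input(shape=(seq_len,), dtype="int32")
# P1: text input layer - arrange the word vectors of a sentence into a matrix
x = tf.keras.layers.Embedding(vocab_size, embedding_size)(inputs)
x = tf.keras.layers.Reshape((seq_len, embedding_size, 1))(x)

pooled = []
for fs in filter_sizes:
    # P2: text convolution kernel spanning fs words over the full embedding width
    conv = tf.keras.layers.Conv2D(num_filters, (fs, embedding_size), activation="relu")(x)
    # P3: text pooling layer - keep the maximum of each convolution column vector
    pooled.append(tf.keras.layers.GlobalMaxPooling2D()(conv))

features = tf.keras.layers.Concatenate()(pooled)
outputs = tf.keras.layers.Dense(2, activation="softmax")(features)  # e.g. 2 text classes
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```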
In any of the above schemes, preferably, the method further comprises a heterogeneous stacked denoising autoencoder network, multi-modal node semantic modeling, heterogeneous feature fusion based on meta-path information propagation, node neighbor relation construction based on a heterogeneous information network, and deep heterogeneous metric learning based on a heterogeneous information network.
The heterogeneous stacked denoising autoencoder network learns an encoding of the input data and measures the loss by comparing the original input with the reconstructed output.
The multi-modal node semantic modeling converts node contents of different modalities into the same feature space and then performs unified semantic modeling of all node types.
The heterogeneous feature fusion based on meta-path information propagation further fuses and improves the feature representation of nodes through sharing, fusion and mutual learning of model parameters.
The node neighbor relation construction based on the heterogeneous information network uses the structural semantics embodied by the heterogeneous network to construct clustering relations and establish neighbor relations.
The deep heterogeneous metric learning based on the heterogeneous information network constructs a deep heterogeneous metric learning method based on data-pair constraints.
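A minimal sketch of the denoising-autoencoder idea described above (learn an encoding of corrupted input and measure the loss against the clean original) is shown below in Keras. The layer sizes and noise level are illustrative assumptions; the heterogeneous, multi-modal handling of the patent is not reproduced here.

```python
import tensorflow as tf

# Plain stacked denoising autoencoder sketch: corrupt the input, encode it through
# stacked layers, decode it, and compare the reconstruction with the clean original.
input_dim = 256                                              # assumed feature size
inputs = tf.keras.Input(shape=(input_dim,))
noisy = tf.keras.layers.GaussianNoise(0.1)(inputs)           # corrupt the input
h = tf.keras.layers.Dense(128, activation="relu")(noisy)     # stacked encoder
code = tf.keras.layers.Dense(64, activation="relu")(h)
h = tf.keras.layers.Dense(128, activation="relu")(code)      # decoder
reconstructed = tf.keras.layers.Dense(input_dim)(h)

autoencoder = tf.keras.Model(inputs, reconstructed)
# Reconstruction loss compares the original input with the decoded output.
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(X, X, epochs=10)   # X: node feature matrix, shape (n_nodes, input_dim)
```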
Compared with the prior art, the invention has the advantages and beneficial effects that:
1. The unstructured electric power big data analysis method based on deep learning comprehensively processes and analyzes massive data, improves the big-data-platform-based intelligent recognition of voice, images, video and the like, provides an accurate data analysis basis for subsequent information processing and analysis, and saves resource costs.
2. The unstructured electric power big data analysis method based on deep learning analyzes massive data in real time with high efficiency, uses big data to improve the accuracy of statistical estimation, solves large-scale optimization problems scientifically, and becomes an effective tool for big data mining and learning.
3. The unstructured electric power big data analysis method based on deep learning, on the one hand, jointly optimizes feature learning and the classification model for better performance and, on the other hand, greatly reduces manual intervention, giving it better applicability to ever-changing practical problems.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a structural diagram of an unstructured electric power big data analysis method based on deep learning according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative and intended to explain the present invention and should not be construed as limiting the present invention.
As shown in fig. 1, the unstructured electric power big data analysis method based on deep learning adopts video data preprocessing, image data preprocessing and text data preprocessing to analyze and process electric power big data.
The video data preprocessing analyzes and processes the video data.
The image data preprocessing analyzes the image data.
The text data preprocessing uses natural language processing technology to preprocess the electric power big data; the processing of text data includes word segmentation and vectorization of words.
The image preprocessing comprises graying, geometric transformation, image enhancement and decoding of the image. The graying uses the maximum value method, taking the maximum of the three component intensities of a color pixel as the gray value of the grayscale image.
The geometric transformation processes the acquired image with geometric transformation methods and a gray interpolation algorithm, in order to correct systematic errors of the image acquisition system and random errors of the instrument position.
The image enhancement uses a spatial domain method comprising point operations and neighborhood denoising operations.
The image decoding, performed after the image enhancement operation, decodes the image with the TensorFlow framework and converts it into the image's original pixel matrix.
In the image analysis of the unstructured electric power big data analysis method, image quality directly influences the design of the recognition algorithm and the precision of its results. The image preprocessing of the embodiment of the invention therefore aims to eliminate irrelevant information in the image, recover useful real information, enhance the detectability of relevant information and simplify the data as much as possible, thereby improving the reliability of feature extraction, image segmentation, matching and recognition.
The image graying performs grayscale conversion based on the RGB color model. The color of each pixel of a color image defined in RGB space is determined by the three components R, G and B. The number of bits each component occupies in memory determines the image depth, i.e. the number of bits per pixel. For a common 24-bit color RGB image, the three components occupy 1 byte each, so each component takes a value from 0 to 255 and a pixel can represent roughly 16 million colors (256 × 256 × 256). For such a color image, the corresponding grayscale image has a depth of only 8 bits (the three RGB components can be considered equal), which also means the computation needed to process the grayscale image is much smaller. It should be noted, however, that although some color levels are lost, the grayscale description remains consistent with the color description in the distribution of overall and local chromaticity and intensity levels across the image.
Graying an RGB image generally means weighting and combining the three RGB components to obtain a final gray value. Commonly used graying methods in image processing are: 1. the component method; 2. the maximum value method; 3. the average method; 4. the weighted average method. The invention uses the maximum value method, taking the maximum of the three component intensities of the color image as the gray value of the grayscale image.
The specific calculation is: Gray = max(R, G, B).
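A minimal numpy sketch of the maximum-value graying just described, assuming the image is an H x W x 3 RGB array:

```python
import numpy as np

def to_gray_max(image: np.ndarray) -> np.ndarray:
    """Maximum-value graying: for each pixel take max(R, G, B) as the gray value."""
    return image.max(axis=2).astype(np.uint8)
```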
The geometric transformation processing applies geometric transformations such as translation, transposition, mirroring, rotation and scaling to the acquired image, in order to correct systematic errors of the image acquisition system and random errors of the instrument position, such as the imaging angle, perspective relationship and even defects of the lens itself. A gray interpolation algorithm is also needed, because pixels of the output image may map onto non-integer coordinates of the input image under the computed transformation. Commonly used methods are nearest-neighbor interpolation, bilinear interpolation and bicubic interpolation.
Image enhancement aims to improve the visual effect of an image: for the given application, it purposefully emphasizes overall or local characteristics of the image, makes an originally unclear image clear or emphasizes certain features of interest, enlarges the differences between the features of different objects in the image, suppresses uninteresting features, improves the image quality, enriches the information content, and strengthens image interpretation and recognition, so as to meet the needs of particular analyses. Image enhancement algorithms fall into two broad categories: spatial domain methods and frequency domain methods.
The image enhancement processing of the invention mainly uses spatial domain methods. A spatial domain method is a direct image enhancement algorithm, divided into point operation algorithms and neighborhood denoising algorithms. Point operation algorithms include gray-level correction, gray-level transformation (also called contrast stretching) and histogram modification. Neighborhood enhancement algorithms are of two kinds, image smoothing and sharpening; common smoothing algorithms include mean filtering, median filtering and spatial filtering, and common sharpening algorithms include the gradient operator method, the second-derivative operator method, high-pass filtering, mask matching and the like.
After the image enhancement operation, image decoding is carried out to convert the image into its original pixel matrix. The decoding of images is mainly done using the TensorFlow framework, which provides encoding/decoding functions for jpeg- and png-format images.
TensorFlow is a second-generation machine learning system developed by Google on the basis of DistBelief; it can feed complex data structures into artificial neural networks for analysis and processing. TensorFlow can be used in many machine learning and deep learning fields such as speech recognition and image recognition; it improves on the earlier deep learning infrastructure DistBelief in many respects and can run on devices ranging from a single smartphone to thousands of data center servers. TensorFlow is fully open source and available to anyone, and it supports CNN, RNN and LSTM algorithms, currently the most popular deep neural network models for image, speech and NLP tasks.
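A short sketch of decoding a jpeg image into its raw pixel matrix with TensorFlow, as described above; the file path is a placeholder.

```python
import tensorflow as tf

raw = tf.io.read_file("inspection_photo.jpg")               # hypothetical path
pixels = tf.io.decode_jpeg(raw, channels=3)                 # uint8 tensor, shape (H, W, 3)
pixels = tf.image.convert_image_dtype(pixels, tf.float32)   # optional: scale to [0, 1]
print(pixels.shape)
```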
Text data preprocessing processes unstructured data such as power system text with natural language processing technology: it performs Chinese word segmentation, lexical analysis, syntactic analysis, semantic analysis, vectorization and similar work on the underlying documents and parses them into vector representations a computer can understand, thereby supporting the construction of powerful data analysis and intelligent systems over unstructured data.
The unstructured electric power big data analysis method based on deep learning has high analysis and processing efficiency, can be widely applied in sectors such as government, energy and public services, has broad applicability, makes applications more intelligent and management more refined, provides a good basis for big data analysis and decision making, and reduces resource waste.
Further, the method includes power supply area division, which is specifically:
Step H1: count the cells within the power supply area.
Step H2: randomly initialize the power supply areas by randomly assigning each cell to one of K clusters C_1, C_2, ..., C_K, where C_k denotes a cluster label and K is the preset number of power supply areas.
Step H3: compute the cluster center of each power supply area with the following formula:

μ_C = (1 / |C|) · Σ_{i∈C} x_i

where |C| is the number of cells contained in power supply area C and x_i = (x_i^(1), ..., x_i^(t)) records the overall average daily electricity consumption of the i-th cell over t days.
Step H4: treating the overall representation of each power supply area as a special cell, compute the similarity between each cell and the cluster center of each power supply area.
The similarity algorithm measures how similar two objects are and underlies tasks such as information retrieval, recommender systems and data mining. Similarity is computed here with the Euclidean distance. Assume objects X and Y both have N-dimensional features, i.e. X = {x_1, x_2, ..., x_n} and Y = {y_1, y_2, ..., y_n}; the specific formula is:

d(X, Y) = sqrt( Σ_{k=1}^{n} (x_k - y_k)² ) = sqrt( dot(X, X) - 2·dot(X, Y) + dot(Y, Y) )

where dot(X, Y) represents the inner product of the vectors.
Step H5: based on the similarity results of step H4, assign each cell to the power supply area whose cluster center it is most similar to.
Step H6: determine whether the power supply area assignment of the cells has converged, i.e. whether any cell's cluster label was changed in the previous round of adjustment. If no cluster label changed, the assignment has converged and the division of cells into power supply areas is output; otherwise return to step H3. With this method, tailored to the characteristics of power supply areas, the k-means-based power supply area division algorithm partitions efficiently, and its result is convenient for subsequent analysis and processing.
Specifically, the method further comprises constructing a power consumption prediction model. In this construction, an autoregressive prediction model is built separately for each power supply area so as to fit the electricity consumption demands of users in the different areas. The steps are as follows:
Step E1: for power supply area C_P, assume that the electricity consumption of user U_i is linearly correlated over D+1 consecutive days, so that the consumption on a given day can be predicted by a linear combination of the consumption x_i on the preceding D days, as in the following equation:

y_i = Σ_{d=1}^{D} w_p^(d) · x_i^(d) + b_i

where w_p is the linear combination parameter of area C_P and b_i is an error variable, a varying random error term for user U_i. Writing y_i, x_i and b_i as the stacked vector/matrix forms of the observed consumption, the preceding-day consumption and the error terms over all predicted days, and w_p as the parameter vector, the linear combination model of user U_i's electricity consumption is:

y_i = x_i · w_p + b_i
step E2, training a separate model for each power supply area, and connecting the power supply areas CPElectricity consumption matrix x of all users iniAre combined into a matrix XpCombining the predicted power consumption into vector YpAll error terms biCombined into vector BpThen, all linear models are combined as:
Yp=Xpwp+Bp
according to CPMatrix X in (1)pSum vector YpEstimate the parameter wpA value; i.e. according to CPPower consumption X of all users inpAnd the marked power consumption YpEstimate the parameter wp
Step E3: solve for C_P with the least squares method. Assuming the error B_p has finite variance and zero mean, i.e. E[B_p] = 0, the least squares solution of w_p is:

w_p = (X_p^T · X_p)^(-1) · X_p^T · Y_p

and the error of the prediction model is the residual:

B_p = Y_p - X_p · w_p

Step E4: model each power supply area C_P with its own linear regression model and solve for w_p as above, obtaining user electricity consumption prediction models for the different power supply areas. The constructed prediction model provides an accurate model basis for subsequent analysis, fits the electricity consumption demands of users in different areas, is convenient and practical, and is efficient to build.
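A minimal numpy sketch of steps E1-E4 is given below, under the assumption that each training sample consists of D consecutive days of a user's consumption and the target is the following day; the function names and the bias-free formulation are illustrative.

```python
import numpy as np

def build_area_model(consumption: np.ndarray, D: int) -> np.ndarray:
    """Fit w_p for one power supply area C_P (steps E1-E3).

    consumption : (n_users, n_days) daily consumption of the area's users.
    Returns the least squares weights w_p of length D.
    """
    X_rows, y_rows = [], []
    for series in consumption:                       # E1/E2: stack per-user samples
        for d in range(D, len(series)):
            X_rows.append(series[d - D:d])           # previous D days
            y_rows.append(series[d])                 # day to predict
    X_p, Y_p = np.asarray(X_rows), np.asarray(y_rows)
    # E3: least squares solution w_p = (X_p^T X_p)^(-1) X_p^T Y_p
    w_p, *_ = np.linalg.lstsq(X_p, Y_p, rcond=None)
    return w_p

def predict_next_day(history_D_days: np.ndarray, w_p: np.ndarray) -> float:
    """Predict the next day's consumption from the last D days (step E4)."""
    return float(history_D_days @ w_p)
```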
Further, the method comprises multi-task joint learning: an iterative joint learning algorithm that, in each iteration, shares user data and simultaneously optimizes the prediction models of different power supply areas, so as to improve model performance across all areas. Specifically:
Step F1: construct a reference electricity consumption prediction model for each power supply area. On power supply area C_P, build a linear prediction model from the matrix X_p and the vector Y_p, and solve for the parameter w_p and the error B_p with the least squares algorithm as the reference model of area C_P; denote by S the similarity matrix of the overall electricity consumption behavior between areas.
Step F2: fuse data across areas according to the overall electricity consumption behavior similarity matrix S. For every other power supply area C_q, according to the overall electricity demand similarity S_pq between C_P and C_q (the entry in row p, column q of S), randomly sample user data from C_q with probability S_pq and fuse it with the user data of C_P, obtaining X_{p∪q} and Y_{p∪q}.
Step F3: using the least squares algorithm, solve from X_{p∪q} and Y_{p∪q} for the model parameter W_{p∪q} that minimizes the joint learning loss function, together with the prediction error B_{p∪q}. The joint learning loss function aggregates the prediction errors E[B_{p∪q}] over the P divided areas, where P is the number of areas; the model parameter W_{p∪q} with the smallest loss function value is selected as the prediction model of C_P.
Step F4: check whether the models of all areas have been updated: if so, go to step F5; otherwise, return to step F2.
Step F5: check whether the least squares algorithm has converged: if it has, i.e. no model in any area is updated further, output the result; otherwise, return to step F2.
In the process of dividing the power supply areas, besides the division result itself, a similarity matrix S of the overall electricity demand between the power supply areas is obtained. For this matrix S, a multi-task joint learning model with an iterative joint learning algorithm is adopted: in each iteration, the prediction models of different power supply areas are optimized simultaneously through sharing of user data, improving model performance across all areas.
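The sketch below shows the iterative joint learning loop of steps F1-F5 in numpy. The patent's exact joint loss function is not reproduced; as an assumption, each round simply refits every area on its own data fused with user rows sampled from other areas with probability S[p, q], following the data fusion of step F2.

```python
import numpy as np

def joint_learning(area_data, S, max_rounds=20, seed=0):
    """Iterative multi-task joint learning sketch (steps F1-F5).

    area_data : list of (X_p, Y_p) pairs, one per power supply area.
    S         : (P, P) similarity matrix of overall electricity demand between areas.
    """
    rng = np.random.default_rng(seed)
    P = len(area_data)
    # F1: reference least squares model per area
    models = [np.linalg.lstsq(X, Y, rcond=None)[0] for X, Y in area_data]
    for _ in range(max_rounds):
        updated = False
        for p in range(P):
            X_p, Y_p = area_data[p]
            X_parts, Y_parts = [X_p], [Y_p]
            for q in range(P):
                if q == p:
                    continue
                X_q, Y_q = area_data[q]
                # F2: sample rows of C_q with probability S[p, q] and fuse with C_p
                mask = rng.random(len(X_q)) < S[p, q]
                X_parts.append(X_q[mask]); Y_parts.append(Y_q[mask])
            X_fused, Y_fused = np.vstack(X_parts), np.concatenate(Y_parts)
            # F3: least squares on the fused data
            w_new = np.linalg.lstsq(X_fused, Y_fused, rcond=None)[0]
            if not np.allclose(w_new, models[p], atol=1e-6):
                models[p], updated = w_new, True
        # F4/F5: stop when no area's model changed in a full round
        if not updated:
            break
    return models
```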
Further, the video data preprocessing comprises shot segmentation and key frame extraction. Shot segmentation uses a histogram-based method: the gray scale, brightness or color of the pixels of adjacent frames is divided into N levels, and the number of pixels at each level is counted for a histogram comparison. Key frame extraction classifies the images in the image library with a K-means clustering algorithm.
Shot segmentation of the video data is mainly implemented with the histogram-based method, because data such as images captured by an unmanned aerial vehicle are often affected by various external factors. The histogram-based algorithm is the most common shot segmentation method; it is simple to apply and works well on most videos. It divides the gray scale, brightness or color of the pixels of adjacent frames into N equal levels, counts the number of pixels at each level, and compares the histograms. Because only the overall distribution of gray levels or colors is counted, the method tolerates motion within a shot and slow camera movement well.
A key frame is the most important and representative image (or images) in a shot. Depending on the complexity of the shot content, one or more key frames may be extracted from a shot. Key frames are selected to contain the main information of the shot while remaining simple to process.
The clustering-based key frame extraction of the embodiment of the invention first classifies the images in the image library with a K-means clustering algorithm; key frame extraction based on K-means clustering greatly reduces the amount of computation. The method is computationally efficient and effectively captures visual content where the video shots change markedly: few key frames are extracted for low-activity shots, and more for the others.
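The sketch below illustrates histogram-based shot boundary detection and K-means key-frame selection with OpenCV and scikit-learn. The number of levels N, the boundary threshold and the number of key frames per shot are illustrative assumptions.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def frame_hist(frame, levels=32):
    """Gray-level histogram with N=levels bins, normalised to a distribution."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    hist = cv2.calcHist([gray], [0], None, [levels], [0, 256]).ravel()
    return hist / hist.sum()

def detect_shot_boundaries(frames, threshold=0.4):
    """Mark a shot boundary where the histogram difference of adjacent frames is large."""
    boundaries = []
    for i in range(1, len(frames)):
        diff = np.abs(frame_hist(frames[i]) - frame_hist(frames[i - 1])).sum()
        if diff > threshold:                 # threshold is an illustrative value
            boundaries.append(i)
    return boundaries

def extract_key_frames(shot_frames, n_key=2):
    """Cluster a shot's frames with K-means and keep the frame closest to each center."""
    feats = np.stack([frame_hist(f) for f in shot_frames])
    km = KMeans(n_clusters=min(n_key, len(shot_frames)), n_init=10).fit(feats)
    keys = []
    for c in km.cluster_centers_:
        keys.append(int(np.argmin(np.linalg.norm(feats - c, axis=1))))
    return sorted(set(keys))
```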
Further, the text data preprocessing comprises:
Step Q1: segment the text data into words with a Chinese word segmentation tool;
Step Q2: remove stop words from the segmented data using a stop-word dictionary;
Step Q3: convert the stop-word-filtered data into a structured data form that a computer can recognize and analyze, using the Word2Vec toolkit.
Word segmentation: natural language processing technology is used to preprocess the electric power big data, and word segmentation is performed first; word segmentation is the most basic problem in natural language processing. The project mainly uses an open-source Chinese word segmentation tool for this step.
The jieba word segmentation tool supports three segmentation modes: the accurate mode tries to cut sentences most precisely and is suitable for text analysis; the full mode scans every word in a sentence that can form a word, which is very fast but cannot resolve ambiguity; and the search engine mode further splits long words on top of the accurate mode, improving recall, and is suitable for search engine segmentation. It also supports traditional-character segmentation and user-defined dictionaries, making it powerful.
The algorithms involved in jieba word segmentation comprise: (1) efficient word-graph scanning based on a Trie tree structure, generating a directed acyclic graph (DAG) of all possible word formations of the Chinese characters in a sentence; (2) dynamic programming to find the maximum-probability path, i.e. the best segmentation combination based on word frequency; (3) for unknown words, an HMM model of Chinese character word-forming capability, solved with the Viterbi algorithm.
Stop words: on top of word segmentation, stop-word processing is performed, because the analyzed text contains many uninformative words and symbols such as 'yes', 'and' and 'but'. To save storage space and improve search efficiency, an existing stop-word dictionary is mainly used for the stop-word operation; meaningful terms are screened out from the perspective of how a human would process the task, and nonstandard, non-uniform text data is converted into an accurate and normalized sequence of equipment operations.
Generating word vectors: on the basis of the word segmentation and stop-word steps, the processed data is converted into a structured form a computer can identify and analyze. The stop-word-filtered data is converted into numerical information in vector form; word vectors reflect the similarity and difference between words, allow relationships between words to be mined, and turn unstructured data such as text into structured data a computer can recognize.
Word2Vec is an open-source toolkit for obtaining word vectors, released by Google in 2013. It comprises the Continuous Bag-of-Words model (CBOW) and the Skip-gram model: the Skip-gram model predicts surrounding words from a central word, while the CBOW model predicts the central word from its surrounding words, each surrounding word contributing equally regardless of order and distance. Both models consist of an input layer, a mapping layer that maps words into a continuous vector space, and an output layer. Training uses stochastic gradient descent and hierarchical softmax to reduce computation. The invention mainly trains the skip-gram model of Word2Vec to generate the corresponding word vectors; once the word vectors are obtained, each sentence can be converted into matrix form.
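A minimal sketch of steps Q1-Q3 with jieba and gensim follows. The stop-word file and sample sentences are placeholders, and the parameter names assume gensim 4.x (vector_size instead of the older size).

```python
import jieba
from gensim.models import Word2Vec

stopwords = set(open("stopwords.txt", encoding="utf-8").read().split())   # hypothetical file

def preprocess(document: str):
    words = jieba.lcut(document)                                  # Q1: Chinese word segmentation
    return [w for w in words if w.strip() and w not in stopwords] # Q2: drop stop words

corpus = [preprocess(doc) for doc in ["变压器油温异常告警", "线路巡检发现绝缘子破损"]]  # sample texts
# Q3: train the skip-gram model (sg=1) and look up word vectors
model = Word2Vec(sentences=corpus, vector_size=100, window=5, min_count=1, sg=1)
vector = model.wv[corpus[0][0]]                                   # vector of the first token
```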
Specifically, the method further comprises video feature extraction, including:
Step K1: acquire the original image and extract its color features with a color histogram.
Color features are extracted with a color histogram. The color histogram is the most common way to extract and count the color features of a video image; it is a global statistical method describing the proportions of different color components among all the pixels of a complete picture. The specific steps are:
(1) Select the color space model; the RGB color space model is mainly used.
The most common uses of the RGB (red, green, blue) color model are display systems: color cathode ray tubes and color raster graphics displays use the R, G, B values to drive the R, G, B electron guns, exciting the three-color phosphors on the screen to emit light of different brightness and generating the various colors by additive mixing. A scanner likewise measures the R, G, B components of the light reflected or transmitted by the original and uses them to express the original's color.
(2) Count the color component information of the image pixels for each component.
(3) Compute the proportion of each color component in the image's global color information to obtain the color histogram of the chosen color model, with the formula:

H(k) = N_k / N

where N is the total number of pixels in the image and N_k is the count for the k-th color component.
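A minimal sketch of the color histogram H(k) = N_k / N using OpenCV, with an assumed number of bins per channel:

```python
import cv2
import numpy as np

def rgb_color_histogram(image_bgr: np.ndarray, bins: int = 16) -> np.ndarray:
    """Normalised per-channel color histogram of an H x W x 3 image (OpenCV BGR order)."""
    n_pixels = image_bgr.shape[0] * image_bgr.shape[1]
    hists = []
    for channel in range(3):
        h = cv2.calcHist([image_bgr], [channel], None, [bins], [0, 256]).ravel()
        hists.append(h / n_pixels)                     # proportion of pixels per bin
    return np.concatenate(hists)                       # length 3 * bins feature vector
```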
Step K2: smooth-filter the original image, compute pixel gradients, select points with a large gradient change rate as image edge points, and extract the edge features of the original image.
Edge features, also called contour features, are an important form of feature for describing the content of video frames. Edge feature extraction can effectively filter out the image's texture and fine boundary details while fully preserving the overall edge contour information of objects.
The extraction of image edge features resembles the human visual system: when a person observes an external object with the eyes, the edges where the color changes are found by noticing the color consistency of the object, and different objects are thereby effectively distinguished.
The edge feature extraction of the invention identifies image edges by exploiting the higher rate of change of color/gray level at the edges of image objects. The main steps are, in order: smoothing filtering, pixel gradient computation, and selection of points with a high gradient change rate as image edge points, completing the extraction of the image's edge features.
The image edge detection mainly uses Gaussian filtering. In all filtering methods, the key consideration is how to balance denoising against edge detection accuracy; practical engineering experience has shown that a Gaussian kernel provides a good compromise.
The Gaussian filtering is implemented as a discretized sliding-window convolution, realized with a Gaussian kernel, i.e. a Gaussian template of odd size. Commonly used Gaussian kernel templates are 3x3 and 5x5, and the kernel is computed from the two-dimensional Gaussian function:

G(x, y) = (1 / (2πσ²)) · exp( -(x² + y²) / (2σ²) )

where x² + y² represents the squared distance from the pixel to the central pixel and σ is the standard deviation. If σ is chosen too large, the filtering is too strong and blurs the image edges, which hinders the subsequent edge detection; if too small, the filtering effect is poor. When computing the Gaussian template parameters, normalization is required.
The Canny operator uses 2x2 first-difference convolution operators in the x and y directions; the resulting first-order partial derivatives, gradient magnitude and gradient direction are:

P[i,j] = (f[i,j+1] - f[i,j] + f[i+1,j+1] - f[i+1,j]) / 2
Q[i,j] = (f[i,j] - f[i+1,j] + f[i,j+1] - f[i+1,j+1]) / 2
M[i,j] = sqrt( P[i,j]² + Q[i,j]² )
θ[i,j] = arctan( Q[i,j] / P[i,j] )

where P is the first-order partial derivative matrix in the x direction, Q the first-order partial derivative matrix in the y direction, M the gradient magnitude and θ the gradient direction.
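The sketch below combines Gaussian smoothing with the 2x2 finite differences given above and thresholds the gradient magnitude to pick edge points; the kernel size, sigma and threshold are illustrative. OpenCV's cv2.Canny is shown as the off-the-shelf alternative.

```python
import cv2
import numpy as np

def edge_points(gray: np.ndarray, sigma: float = 1.0, mag_thresh: float = 20.0):
    """Gaussian smoothing, 2x2 finite-difference gradients, and thresholding on magnitude."""
    f = cv2.GaussianBlur(gray, (5, 5), sigma).astype(np.float64)
    # P, Q: first-order partial derivatives per the 2x2 differences given above
    P = (f[:-1, 1:] - f[:-1, :-1] + f[1:, 1:] - f[1:, :-1]) / 2.0
    Q = (f[:-1, :-1] - f[1:, :-1] + f[:-1, 1:] - f[1:, 1:]) / 2.0
    M = np.sqrt(P ** 2 + Q ** 2)                       # gradient magnitude
    theta = np.arctan2(Q, P)                           # gradient direction
    return M > mag_thresh, theta                       # edge-point mask and directions

# In practice the same pipeline is available directly as:
# edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 1.0), 50, 150)
```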
Step K3: extract the texture features of the original image with Gabor filtering.
Texture, an important visual feature of an image, is a kind of local structural feature. Texture features do not depend on the color, gray level or brightness of individual pixels; instead they represent how the pixel information (color, gray level or brightness) changes in a specific region around a pixel of the image. For these reasons, the invention selects the Gabor filtering method to extract image texture features.
Gabor proposed in 1946 that, to extract local information from the Fourier transform, a time-localized window function be introduced (dividing the signal into many small time intervals and analyzing each interval with the Fourier transform). The Gabor transform is therefore also called the windowed Fourier transform (short-time Fourier transform).
In the spatial domain, a two-dimensional Gabor filter is the product of a sinusoidal plane wave and a Gaussian kernel function; the former is the tuning function and the latter the window function:

g(x, y; λ, θ, φ, σ, γ) = exp( -(x′² + γ²·y′²) / (2σ²) ) · exp( i·(2π·x′/λ + φ) )

which can be split into real and imaginary forms:

g_real = exp( -(x′² + γ²·y′²) / (2σ²) ) · cos( 2π·x′/λ + φ )
g_imag = exp( -(x′² + γ²·y′²) / (2σ²) ) · sin( 2π·x′/λ + φ )

where

x′ = x·cosθ + y·sinθ,  y′ = -x·sinθ + y·cosθ

Here λ is the wavelength, which directly affects the scale of the filter and is usually 2 or more; θ is the orientation of the filter; φ is the phase offset of the tuning function, taking values from -180 to 180 degrees; σ is the standard deviation of the Gaussian envelope and determines the shape of the filter; and γ is the spatial aspect ratio, giving a circular support when equal to 1 and usually taken as 0.5.
The Gabor filter's excellent joint time-domain/frequency-domain resolution effectively imitates the human visual perception system. The basic principle is: different textures have different center frequencies and bandwidths in the frequency domain; the filters separate the textures according to the center frequency and bandwidth they occupy, each Gabor filter passing only the texture details that match its own frequency while suppressing interference from other texture frequencies. Finally, the filtering results of the different Gabor filters are analyzed to obtain the texture features of the image.
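A small Gabor filter bank sketch with OpenCV follows: the image is filtered at several orientations and simple statistics of each response are kept as texture features. The kernel size and the parameter values (lambda, sigma, gamma, psi) are illustrative assumptions.

```python
import cv2
import numpy as np

def gabor_texture_features(gray: np.ndarray) -> np.ndarray:
    """Texture features from a bank of Gabor filters at 4 orientations."""
    feats = []
    for theta in np.arange(0, np.pi, np.pi / 4):       # 4 orientations
        kernel = cv2.getGaborKernel(ksize=(21, 21), sigma=4.0, theta=theta,
                                    lambd=10.0, gamma=0.5, psi=0)
        response = cv2.filter2D(gray, cv2.CV_32F, kernel)
        feats.extend([response.mean(), response.std()])  # per-filter statistics
    return np.asarray(feats)
```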
And K4, blocking the pixel areas of the image, and calling a matching algorithm to estimate a motion vector of each pixel area so as to express the motion characteristics of the video.
The features of a video can be divided into static features and dynamic features. The static features are mainly image features of key frames, and the feature extraction method for key frames is the same as that for general static images. The static characteristics mainly comprise color characteristics, texture characteristics, shape characteristics and the like; dynamic features are characteristic of video data and include global motion (camera motion such as camera operation of panning, zooming, tracking, etc.) and local motion (motion of objects within a lens, motion trajectory, relative velocity, change in position between objects, etc.). Dynamic features are important features of video data. Since it is difficult to account for motion variations of a video sequence using only image features representing frames.
The motion characteristic is a characteristic form unique to video data, is different from the characteristic of image static property, is a dynamic characteristic, can effectively reflect the gradual change process of the video on the time process, and reflects the time domain characteristic of the video. The motion characteristics of the video are analyzed, so that the change rule of the relative position of the target object at different time points in the video or the movement of the lens position can be effectively mastered, the video content can be comprehensively described, and the method has a good effect on analyzing and understanding the content and the target of the video.
The motion features of a video can be subdivided into local motion features and global motion features. Local motion features mainly represent the change in the relative position of a particular target in the video at different time points and describe the motion trajectory of the target object. Global motion features mainly describe the regular motion of the whole frame caused by changes in the position of the camera lens, such as translation, zooming and rotation.
For the motion features, the image is divided into pixel blocks and a matching algorithm is called to estimate the motion vector of each block, thereby representing the motion features of the video, as illustrated in the sketch below.
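The following is a minimal block-matching sketch for estimating per-block motion vectors between two consecutive grayscale frames; the block size, search radius and the use of the sum of absolute differences as the matching cost are assumptions made for illustration.

import numpy as np

def block_matching(prev_frame, next_frame, block=16, search=8):
    # For every block of prev_frame, exhaustively search a window in next_frame
    # and keep the displacement with the smallest SAD (sum of absolute differences).
    h, w = prev_frame.shape
    vectors = np.zeros((h // block, w // block, 2), dtype=int)
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            ref = prev_frame[by:by + block, bx:bx + block].astype(float)
            best, best_dy, best_dx = np.inf, 0, 0
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y0, x0 = by + dy, bx + dx
                    if y0 < 0 or x0 < 0 or y0 + block > h or x0 + block > w:
                        continue
                    cand = next_frame[y0:y0 + block, x0:x0 + block].astype(float)
                    sad = np.abs(ref - cand).sum()
                    if sad < best:
                        best, best_dy, best_dx = sad, dy, dx
            vectors[by // block, bx // block] = (best_dy, best_dx)
    return vectors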
Optionally, the method further includes an image analysis based on a convolutional neural network, for analyzing the image, including:
Step D1, acquiring the original pixel matrix of the image and representing it as a three-dimensional matrix, wherein the length and width of the three-dimensional matrix represent the size of the image and the depth represents the color channels of the image: the depth of a black-and-white picture is 1, while in RGB color mode the depth of the image is 3.
The input layer is the input of the whole neural network; it represents the original pixel matrix of the image obtained in the image data preprocessing as a three-dimensional matrix, wherein the length and width of the three-dimensional matrix represent the size of the image and the depth represents the color channels of the image: the depth of a black-and-white picture is 1, and in RGB color mode the depth of the image is 3.
Step D2, extracting various features of the image using a plurality of convolution kernels: the input image is convolved with a trainable filter and a bias is then added, and the convolutional layer is obtained after the feature extraction is finished;
Step D3, performing a maximum or average operation on adjacent areas in the convolutional layer, adding the corresponding weights and biases, obtaining the output through an activation function, and outputting the feature map of the sampling layer;
Step D4, extracting the local features of the image by connecting each convolution kernel with the local pixel points of the previous layer's feature map, and then convolving the convolution kernels with the whole previous-layer feature map to obtain the convolution result of the global features of the image;
Step D5, adding the bias parameters to the convolution result and calculating the feature map of the convolutional layer through the activation function, with the specific operation as follows:
y(i, j) = f( Σ_{n=1}^{N} Σ_{m=1}^{M} w_{n,m} · u(i+n-1, j+m-1) + b )

wherein f represents the Sigmoid activation function, b represents the bias, w_{n,m} represents the weight at position (n, m) of the convolution kernel, N represents the length of the convolution kernel, M represents the width of the convolution kernel, and u represents the feature map output by the previous layer;
the convolutional layer mainly adopts a plurality of convolution kernels to extract various characteristics of images. The first convolution process is specific: the feature extraction process is to convolve the input image with a filter that can be trained and then add a bias. After the feature extraction is completed, the convolutional layer is obtained. The maximum or average operation is then performed on adjacent regions within the convolutional layer. Corresponding weights and deviations need to be added in the process, and output is obtained through activating functions. This results in a profile of the sampling layer. In the subsequent convolution process, the input to the convolution layer is the output of the sampled layer in the previous convolution process.
The local features of the image are extracted by connecting each convolution kernel with the local pixel points of the previous layer's feature map; the global features of the image are then obtained by convolving the convolution kernels with the whole previous-layer feature map, the bias parameters of the layer are added to the convolution result, and the feature map of the convolutional layer is calculated through the activation function, with the specific operation as follows:

y(i, j) = f( Σ_{n=1}^{N} Σ_{m=1}^{M} w_{n,m} · u(i+n-1, j+m-1) + b )

wherein f represents the Sigmoid activation function, b represents the bias, w_{n,m} represents the weight at position (n, m) of the convolution kernel, N represents the length of the convolution kernel, M represents the width of the convolution kernel, and u represents the feature map output by the previous layer. The convolution kernel performs the convolution operation with the input image and can be represented by a two-dimensional N × M matrix, where N and M are equal and odd, generally 3, 5 or 7. Because an odd-sized matrix, unlike an even-sized one, has a center point, the sliding convolution over the image can be aligned on the center of the kernel, which makes the operation more sensitive to edges and lines and more effective at extracting features such as contours and textures.
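For illustration only, the feature-map formula above (weighted sum of the previous layer's feature map under an N × M kernel, plus a bias, passed through the Sigmoid function) can be sketched in NumPy as follows; the kernel values and bias are arbitrary.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv_feature_map(u, w, b):
    # u: previous-layer feature map; w: N x M convolution kernel; b: bias.
    N, M = w.shape
    H, W = u.shape
    out = np.zeros((H - N + 1, W - M + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = sigmoid(np.sum(w * u[i:i + N, j:j + M]) + b)
    return out

# Example: a 3 x 3 kernel sliding over a 5 x 5 feature map gives a 3 x 3 output.
u = np.arange(25, dtype=float).reshape(5, 5) / 25.0
w = np.full((3, 3), 0.1)
print(conv_feature_map(u, w, b=0.0).shape)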
The number of convolution kernels determines the number of feature maps output by the convolutional layer; since the weight parameters in each kernel differ, different kernels extract different features. Increasing the number of kernels within a certain range can improve the recognition effect, but too many kernels greatly increase the number of training parameters and hence the difficulty of training the network. In practice the number of kernels is chosen according to the task and experience, generally following the principle that later convolutional layers have more kernels.
Step D6, add a pooling layer between the convolutional layers.
A pooling layer is often added between convolutional layers. The pooling layer neural network does not change the depth of the three-dimensional matrix, but it can reduce the size of the matrix. The pooling operation may be considered as converting a picture with a higher resolution to a picture with a lower resolution.
The forward propagation of the pooling layer is also accomplished by moving a filter-like structure. The computation in the pooling-layer filter is not a weighted sum of nodes but a simpler maximum or average operation. As with the convolutional-layer filter, the pooling-layer filter requires manual settings such as the filter size, whether to use all-zero padding and the stride of the filter movement, and these settings have the same meaning as for convolutional layers. The filters in the convolutional and pooling layers move in a similar way; the only difference is that the filter of a convolutional layer spans the entire depth, whereas the filter of a pooling layer affects nodes at only one depth.
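Since the method already uses the TensorFlow framework for image decoding, a convolution-plus-pooling stack of the kind described above could be assembled roughly as in the following Keras sketch; the input size, the number of kernels per layer and the number of output classes are illustrative assumptions rather than values from the filing.

import tensorflow as tf

def build_cnn(height=64, width=64, channels=3, num_classes=10):
    # Input layer: the original pixel matrix as a 3-D tensor (length x width x depth);
    # convolutional layers with several kernels; pooling layers inserted between them.
    inputs = tf.keras.Input(shape=(height, width, channels))
    x = tf.keras.layers.Conv2D(16, 3, activation='sigmoid', padding='same')(inputs)
    x = tf.keras.layers.MaxPooling2D(2)(x)
    x = tf.keras.layers.Conv2D(32, 3, activation='sigmoid', padding='same')(x)
    x = tf.keras.layers.MaxPooling2D(2)(x)
    x = tf.keras.layers.Flatten()(x)
    outputs = tf.keras.layers.Dense(num_classes, activation='softmax')(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model

model = build_cnn()
model.summary()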
Embodiments of the present invention use an unstructured data analysis model based on a convolutional neural network (CNN) for unstructured data such as images. The data is internally divided into a number of overlapping sub-regions and feature transformations are applied iteratively, so that the relative position structure of the data is exploited and the model's sensitivity to shifts and similar changes is reduced. For data with a two-dimensional topological structure such as images, each feature map corresponds to one type of feature; even if a particular feature is shifted or rotated in space, it can still be extracted efficiently by the convolution operation, so the representation is invariant to spatial change.
The convolutional neural network (CNN) is a popular deep network. Unlike traditional network structures, it contains special convolutional layers and downsampling layers: the convolutional layer is connected to the previous layer by local connections and weight sharing, which greatly reduces the number of parameters, while the downsampling layer greatly reduces the input dimensionality, lowering network complexity, giving the network better robustness and effectively preventing overfitting.
In general, the basic structure of a CNN includes two kinds of layers. One is the feature extraction layer: the input of each neuron is connected to a local receptive field of the previous layer and extracts the local features; once a local feature has been extracted, its positional relation to the other features is also determined. The other is the feature mapping layer: each computing layer of the network consists of multiple feature maps, each feature map is a plane, and all neurons on the plane share equal weights. The feature mapping structure uses the sigmoid function, whose influence-function kernel is small, as the activation function of the convolutional network, so that the feature maps have shift invariance. In addition, since the neurons on one mapping plane share weights, the number of free parameters of the network is reduced. Each convolutional layer in the convolutional neural network is followed by a computation layer for local averaging and secondary feature extraction, which reduces the feature resolution.
Optionally, the method further includes a text data analysis based on serialization, configured to analyze text and other types of data, and construct a serialized text convolution operation according to a one-dimensional sequence characteristic inside the data, including:
Step P1, establishing a text input layer and sequentially arranging the word vectors corresponding to the words in the sentence into a matrix.
The input layer is a matrix in which word vectors corresponding to words in a sentence are arranged in sequence (from top to bottom), and if the sentence has n words and the vector dimension is k, the matrix is n × k.
All word vectors are taken directly from the results of unsupervised learning, i.e. Google's Word2Vec tool, and are kept fixed.
The type of this matrix is static (static) and the word vector is fixed. For the vector of the unknown word, it can be filled with 0 or a random small positive number.
Step P2, establishing the text convolution layers, wherein the size of each text convolution kernel is filter_size multiplied by embedding_size.
filter_size represents the number of words covered by the text convolution kernel in the longitudinal direction, i.e. adjacent words are considered to have a word-order relationship, and embedding_size is the dimension of the word vectors.
After each text convolution kernel is applied, one column vector is obtained, representing the feature that this kernel extracts from the sentence; as many features can be extracted as there are text convolution kernels.
Step P3, establishing a text pooling layer and extracting the maximum value of each column vector obtained by the text convolution.
The pooling operation extracts the maximum value of each column vector resulting from the convolution; after pooling, an m-dimensional row vector is therefore obtained by concatenating the maxima of all text convolution kernels, where m represents the number of text convolution kernels.
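A rough NumPy sketch of steps P1 to P3 follows: an n × k word-vector matrix is convolved with filter_size × embedding_size kernels, each kernel yields one column vector, and max pooling keeps the largest value per kernel; the sentence length, embedding size and filter size used here are illustrative.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def text_cnn_features(sentence_matrix, kernels, bias=0.0):
    # sentence_matrix: n x k matrix of word vectors (n words, k = embedding_size).
    # kernels: list of filter_size x k kernels; each produces one column vector,
    # and max pooling keeps its largest value, giving an m-dimensional row vector.
    n, k = sentence_matrix.shape
    feats = []
    for w in kernels:
        fs = w.shape[0]  # filter_size: number of adjacent words covered
        col = np.array([sigmoid(np.sum(w * sentence_matrix[i:i + fs, :]) + bias)
                        for i in range(n - fs + 1)])
        feats.append(col.max())
    return np.array(feats)

# Example: a 7-word sentence with 8-dimensional word vectors and 3 kernels of height 2.
rng = np.random.default_rng(0)
sentence = rng.normal(size=(7, 8))
kernels = [rng.normal(size=(2, 8)) for _ in range(3)]
print(text_cnn_features(sentence, kernels))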
Semantic features of text are high-level feature expressions of text that represent more essential knowledge in the text. For text and other types of data, one-dimensional sequence characteristics inside the data are considered, and serialized text convolution operation is constructed, so that the model has stronger sequence change invariance.
The unstructured data involved in power services is diverse, including image data and text data as well as sensor data generated by various smart devices. These data reflect important information about the same event or service from different perspectives at the same time. Traditional deep learning can generally only process data of a single modality, which limits the model's ability to integrate multi-source information for big data analysis and prediction.
Further, the method comprises a heterogeneous stacked denoising automatic coding network, multi-mode-based node semantic modeling, heterogeneous feature fusion based on meta-path information propagation, node neighbor relation construction based on a heterogeneous information network, and deep scale learning based on the heterogeneous information network;
the heterogeneous stacked denoising automatic coding network is used for learning the coding mode of input data and measuring loss by comparing original input with reconstructed output;
the multi-mode-based node semantic modeling is used for converting node contents in different modes into the same feature space, and then performing unified semantic modeling on all types of nodes;
the heterogeneous feature fusion based on meta-path information propagation enables mutual learning through the sharing and fusion of model parameters, further fusing and improving the feature representation of the nodes;
the node neighbor relation construction based on the heterogeneous information network uses the structural semantics embodied by the heterogeneous network to construct clustering relations and establish neighbor relations;
the deep heterogeneous scale learning based on the heterogeneous information network is used for constructing a deep heterogeneous scale learning method based on data pair constraint.
The invention also comprises a heterogeneous stacked denoising automatic coding network based on multi-modal information sharing, which extends the traditional single-modality model into a multi-level heterogeneous deep learning framework, so that the data content of the various types of nodes and the implicit high-level semantic information in their associations can be mined more fully, with better accuracy and scalability.
Different types of node content have different internal structures: text is a one-dimensional sequence of words, an image is a two-dimensional spatial structure, and video is a three-dimensional structure that includes a time dimension. Different depth model structures, such as CNN, RNN and LSTM, are suitable for data of different structures. The invention provides effective multi-mode-based node semantic modeling: the semantic features of nodes in different modalities are learned simultaneously in one depth model, deep architectures of various structures are integrated, parameter sharing between different types of nodes in the network is realized, and the network parameters of all nodes are jointly optimized through fusion of the network layers of neighbor relations. The heterogeneous stacked denoising automatic coding network learns the coding mode of the input data, and loss is measured by comparing the original input with the reconstructed output.
An information network can be represented by a directed graph G = (V, E) together with a node type mapping function ψ: V → A and a relation type mapping function φ: E → R, wherein each node v ∈ V belongs to a node type ψ(v) ∈ A and each edge e ∈ E belongs to a relation type φ(e) ∈ R. When the number of node types is greater than 1, i.e. |A| > 1, the network is a heterogeneous information network; otherwise it is called a homogeneous information network. In an information network, object types and relation types are clearly distinguished, and the relations existing between different types of objects can be clearly described by the network model.
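As a small illustrative sketch of this definition (the node names, types and relations below are invented examples, and the dictionary representation is merely one possible encoding), an information network and its heterogeneity test can be written as:

# Node type mapping psi: V -> A and relation type mapping phi: E -> R.
node_type = {
    'user_17': 'user',
    'ticket_3': 'work_order',
    'img_9': 'image',
}
edge_type = {
    ('user_17', 'ticket_3'): 'submits',
    ('ticket_3', 'img_9'): 'contains',
}

def is_heterogeneous(node_type, edge_type):
    # Heterogeneous information network if more than one node type (|A| > 1)
    # or more than one relation type (|R| > 1) is present.
    return len(set(node_type.values())) > 1 or len(set(edge_type.values())) > 1

print(is_heterogeneous(node_type, edge_type))  # True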
(1) Heterogeneous stacked denoising automatic coding network
The heterogeneous stacked denoising automatic coding network learns the coding mode of input data, and loss is measured by comparing original input with reconstructed output.
Input: in order to model content information and heterogeneous relations, reconstruction-based heterogeneous probabilistic joint semantic modeling is proposed. An asymmetric weighted adjacency matrix is established based on the heterogeneous relations between nodes, together with the feature matrix obtained by the linear transformation in step one; then, by reconstructing the feature matrix and the adjacency matrix E, the semantic topic distribution of the nodes is obtained.
In order to incorporate more network structure information, on the basis of simultaneously modeling all nodes and relations in the network, optimization can be further performed based on meta-paths containing higher layer structure information.
All possible meta-path types existing in the network are counted, and a weighted relation matrix B_n is established for each meta-path. Then, for any two nodes v_i, v_j, the meta-path-based association relationship may be expressed as:

link(v_i, v_j) = {B_{1,ij}, ..., B_{n,ij}}

wherein B_{n,ij} denotes the weight of v_i, v_j on the nth meta-path.
Then, a depth model fusing the node content M is established for each meta-path relation matrix, and the feature representation of the nodes in each sub-network is learned based on the corresponding meta-path network structure. Finally, this is fused with the feature learning result on the global network, comprehensively improving the quality of node feature learning while taking structural information from multiple aspects into account.
(2) Node semantic modeling based on multiple modes
When multiple types of content nodes exist in the heterogeneous information network, the nodes cannot be modeled simultaneously because they have different modal feature representations. Therefore, the contents of nodes in different modalities are converted into the same feature space, and unified semantic modeling is then performed on all types of nodes.
The method comprises the following steps:
Step one: assume that the feature dimensions of the text nodes and image nodes are r_1 and r_2, respectively. To obtain a representation of both kinds of nodes in a unified r_3-dimensional shared feature space, linear transformation matrices Λ and Π are used to map a text node x and an image node z into the r_3-dimensional space. To reduce the information loss of text and images during the mapping, the representations of the nodes in the shared feature space are required to preserve, as far as possible, the adjacency relations of the original network. In the r_3-dimensional shared feature space, the similarities between content nodes of the same type and of different types can then be expressed in terms of the mapped representations; these similarities are determined jointly by Λ and Π.
Step two:
In order to measure the similarity of nodes in the original network, meta-path information between nodes is used to establish a decision function d(x_i, x_j) that converts the neighbor relation between nodes into a real number, so that d(x_i, x_j) represents the degree of proximity of nodes v_i and v_j in the original network. Finally, after nodes v_i and v_j are mapped into the r_3-dimensional shared feature space, the final reconstruction loss function and network structure loss function are constructed using ideas such as linear discriminant analysis and spectral clustering. Based on this model, the mapped heterogeneous content nodes can be modeled simultaneously in a probabilistic generative model, optimizing both the probability-reconstruction-based loss function and the nearest-neighbor-relation-based loss function of the mapping process.
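A hedged NumPy sketch of the mapping in step one follows: text features of dimension r1 and image features of dimension r2 are projected by the linear transformation matrices Λ and Π into a shared r3-dimensional space, and similarity is taken there as an inner product. The dimensions, the random initialization and the inner-product similarity are assumptions; the filing's exact loss functions are not reproduced.

import numpy as np

r1, r2, r3 = 300, 512, 128  # assumed text, image and shared feature dimensions
rng = np.random.default_rng(1)
Lam = rng.normal(scale=0.01, size=(r1, r3))  # linear transformation for text nodes
Pi = rng.normal(scale=0.01, size=(r2, r3))   # linear transformation for image nodes

def to_shared(text_x=None, image_z=None):
    # Map a text node x (r1-dim) or an image node z (r2-dim) into the shared space.
    if text_x is not None:
        return text_x @ Lam
    return image_z @ Pi

def similarity(a, b):
    # Similarity of two mapped nodes in the shared feature space (inner product).
    return float(a @ b)

x = rng.normal(size=r1)  # a text node
z = rng.normal(size=r2)  # an image node
print(similarity(to_shared(text_x=x), to_shared(image_z=z)))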
(3) Heterogeneous feature fusion based on meta-path information propagation
In the heterogeneous stacked denoising automatic coding network framework, heterogeneous content nodes learn from each other by sharing and fusing model parameters, while nodes that contain no content act as auxiliary information and are generated passively during the feature representation learning of the content nodes. In order to make fuller use of the structural relations between nodes, a meta-path-based feature fusion algorithm further fuses and improves the feature representation of the nodes.
The heterogeneous stacked denoising automatic coding network model outputs a feature representation matrix for all nodes, where r represents the dimension of the feature space. On this basis, a random-walk feature fusion algorithm is proposed that uses the meta-path adjacency matrices B_n between the nodes, wherein p(B_n, j, i) represents the feature propagation probability from class-j nodes to class-i nodes derived from the meta-path adjacency matrix B_n, and the fusion scale parameter α is used to control the feature fusion process in each iteration.
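The exact fusion formula is not reproduced here, so the following is only a generic sketch of random-walk-style feature propagation over meta-path adjacency matrices controlled by a fusion scale parameter α; the row normalization and the update rule are assumptions.

import numpy as np

def fuse_features(U, metapath_mats, alpha=0.3, iters=5):
    # U: n x r node feature matrix output by the autoencoder network.
    # metapath_mats: list of n x n weighted meta-path adjacency matrices B_n.
    U = U.astype(float).copy()
    for _ in range(iters):
        propagated = np.zeros_like(U)
        for B in metapath_mats:
            B = np.asarray(B, dtype=float)
            row_sums = B.sum(axis=1, keepdims=True)
            P = np.divide(B, row_sums, out=np.zeros_like(B), where=row_sums > 0)
            propagated += P @ U
        propagated /= max(len(metapath_mats), 1)
        # Mix each node's own features with features propagated from its
        # meta-path neighbours, weighted by the fusion scale parameter alpha.
        U = (1.0 - alpha) * U + alpha * propagated
    return U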
(4) Node neighbor relation construction based on heterogeneous information network
First, a heterogeneous structured clustering algorithm (H-SCAN) is proposed. For the structural information of a heterogeneous information network, the algorithm computes the structural similarity of each pair of nodes: the network structure similarity σ(v, w) of nodes v and w is calculated from the commonality of their reachable neighbors, i.e.

σ(v, w) = |Γ(v) ∩ Γ(w)| / sqrt(|Γ(v)| · |Γ(w)|)

where Γ(·) denotes the reachable neighbor set of a node. On this basis, all nodes are first initialized as unclustered; then all core nodes are traversed, their directly structure-reachable nodes are found and merged into a cluster, and a cluster label is assigned. Finally, all remaining unclustered nodes are traversed and divided into hub points or outliers according to the number of adjacent clusters: those adjacent to more clusters are hub nodes, and those adjacent to fewer are outliers.
The core idea of the algorithm is that the greater the neighbor commonality of two nodes, the higher their structural similarity. In computing the structural similarity, the invention further draws on the idea of RankClus and takes the information of heterogeneous nodes and the semantics of meta-path information into account, so that the structural similarity can also be computed for heterogeneous nodes.
The algorithm can effectively use the structural semantics embodied by the heterogeneous network to construct clustering relations and neighbor relations; it can also distinguish the different roles and importance of nodes such as hub points, core points and outliers, preparing for the subsequent construction of cost-sensitive loss functions.
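For illustration, a SCAN-style structural similarity based on neighbor commonality can be computed as below; treating σ(v, w) as the standard SCAN formula and using closed neighborhoods are assumptions, and the small adjacency matrix is an invented example.

import numpy as np

def structural_similarity(adj, v, w):
    # sigma(v, w) = |Gamma(v) & Gamma(w)| / sqrt(|Gamma(v)| * |Gamma(w)|),
    # where Gamma(x) is the closed neighborhood of node x (x plus its neighbors).
    gv = set(np.flatnonzero(adj[v])) | {v}
    gw = set(np.flatnonzero(adj[w])) | {w}
    return len(gv & gw) / np.sqrt(len(gv) * len(gw))

adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]])
print(structural_similarity(adj, 0, 1))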
(5) Deep scale learning model based on heterogeneous information network
First, a deep heterogeneous scale (metric) learning method based on data-pair constraints is constructed from the clustering result of the previous step. The data pairs may be manually labeled (same class as a positive example, different classes as a negative example) or may come from the clustering algorithm (same cluster as positive, different clusters as negative). Two deep neural networks sharing the same structure and parameters are established. Let <X1, X2> be a data pair; its two elements are fed into the two heterogeneous deep neural network models respectively and, by virtue of the shared parameters and architecture, are mapped into the same subspace. The final loss function requires the samples contained in a positive pair to be sufficiently similar and the samples contained in a negative pair to be sufficiently far apart. In this way the constraint loss on data pairs ensures the consistency between the semantic similarity of the original heterogeneous network structure and the similarity in the feature space.
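A hedged sketch of the data-pair constraint follows: both elements of a pair are embedded with shared parameters, and a contrastive-style loss pulls positive pairs together while pushing negative pairs beyond a margin. The single shared linear-plus-tanh mapping and the margin value are assumptions, not the filing's exact networks or loss.

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(64, 32))  # parameters shared by the two towers

def embed(x):
    # Both elements of a data pair are mapped with the same parameters W.
    return np.tanh(x @ W)

def pair_loss(x1, x2, is_positive, margin=1.0):
    d = np.linalg.norm(embed(x1) - embed(x2))
    if is_positive:                   # same class / same cluster
        return d ** 2
    return max(0.0, margin - d) ** 2  # different class / different cluster

x1, x2 = rng.normal(size=64), rng.normal(size=64)
print(pair_loss(x1, x2, is_positive=True), pair_loss(x1, x2, is_positive=False))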
The unstructured electric power big data analysis method based on deep learning is aimed at practical application and fine-grained management. It processes and analyzes electric power big data efficiently and with high precision, laying a good data foundation for subsequent data information processing, analysis and display. It has the advantages of flexibility, scalability, security and concurrent processing at low cost, saving resource costs and greatly improving the unstructured data processing capability of the data platform.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
It will be understood by those skilled in the art that the present invention includes any combination of the summary and detailed description of the invention described above and those illustrated in the accompanying drawings, which is not intended to be limited to the details and which, for the sake of brevity of this description, does not describe every aspect which may be formed by such combination. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made in the above embodiments by those of ordinary skill in the art without departing from the principle and spirit of the present invention. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (10)

1. The unstructured electric power big data analysis method based on deep learning is characterized in that electric power big data are analyzed and processed by adopting video data preprocessing, image data preprocessing and text data preprocessing;
the video data preprocessing is used for analyzing and processing the video data;
the image data preprocessing is used for analyzing and processing the image data;
the text data preprocessing adopts a natural language processing technology to preprocess the big electric power data;
the image preprocessing comprises image graying processing, geometric transformation processing, image enhancement processing and decoding processing;
the graying processing includes: adopting a maximum-value method, taking the maximum of the three component brightnesses in the color image as the gray value of the grayscale image;
the geometric transformation process includes: processing the acquired image by adopting a geometric transformation method and a gray interpolation algorithm, and correcting errors of an image acquisition system and random errors of instrument positions;
the image enhancement processing includes: enhancing an image by adopting a spatial domain method, wherein the spatial domain method comprises point operation and neighborhood denoising operation;
the image decoding process includes: after the image enhancement operation, performing image decoding, decoding the image with the TensorFlow framework and converting it into the original pixel matrix of the image.
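For illustration only, the TensorFlow decoding and maximum-value graying of claim 1 could be sketched as follows; the file name is hypothetical and applying the graying to the decoded matrix is an assumed ordering.

import numpy as np
import tensorflow as tf

def decode_to_pixel_matrix(path):
    # Decode an image file into its original pixel matrix with TensorFlow.
    raw = tf.io.read_file(path)
    return tf.io.decode_image(raw, channels=3).numpy()

def gray_by_maximum(rgb):
    # Maximum-value method: the gray value is the maximum of the R, G, B components.
    return rgb.max(axis=2).astype(np.uint8)

pixels = decode_to_pixel_matrix('example_meter_photo.jpg')  # hypothetical file
gray = gray_by_maximum(pixels)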
2. The unstructured electric power big data analysis method based on deep learning of claim 1, further comprising power supply area division, wherein the power supply area division specifically includes:
step H1, counting each cell in the power supply area;
step H2, randomly initializing the power supply areas: each cell is randomly assigned to one of K clusters C_1, C_2, ..., C_K, wherein K represents the set number of power supply areas;
step H3, the cluster center of each power supply area is calculated using the following formula:

μ_C = (1 / |C|) · Σ_{i∈C} x̄_i

where |C| represents the number of cells included in the power supply area C and x̄_i represents the overall average daily electricity consumption record of the ith cell over t days;
step H4, calculating the cluster center similarity between the cell and the power supply area;
step H5, according to the similarity calculated in step H4, assigning each cell to the power supply area whose cluster center is most similar to it;
step H6, determining whether the power supply area assignment of each cell has converged; if so, outputting the power supply area division result of each cell, otherwise returning to step H3.
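A minimal NumPy sketch of steps H1 to H6 follows (random initialization, mean cluster centers, similarity-based reassignment, convergence check); using cosine similarity as the cluster-center similarity of step H4 is an assumption.

import numpy as np

def divide_power_supply_areas(cell_loads, K, iters=100, seed=0):
    # cell_loads: n x t matrix, one average daily consumption record per cell (step H1).
    rng = np.random.default_rng(seed)
    assign = rng.integers(0, K, size=len(cell_loads))           # step H2
    for _ in range(iters):
        centers = np.vstack([
            cell_loads[assign == k].mean(axis=0) if np.any(assign == k)
            else cell_loads[rng.integers(len(cell_loads))]
            for k in range(K)])                                  # step H3
        sims = (cell_loads @ centers.T) / (
            np.linalg.norm(cell_loads, axis=1, keepdims=True)
            * np.linalg.norm(centers, axis=1) + 1e-12)           # step H4
        new_assign = sims.argmax(axis=1)                         # step H5
        if np.array_equal(new_assign, assign):                   # step H6: converged
            break
        assign = new_assign
    return assign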
3. The unstructured electric big data analysis method based on deep learning as claimed in claim 1, further comprising electric prediction model construction, specifically:
step E1, setting a power supply area C_P and assuming that the electricity usage of a user U_i is linearly related over D+1 consecutive days, so that the usage y_i is predicted by a linear combination of the different elements of the usage vector x_i, wherein w_p is the linear combination parameter of the area C_P and b_i is the error variable, a random error term that varies for each user U_i; writing y_i, x_i, b_i and w_p in matrix form, the linear combination model of the power consumption of user U_i is:

y_i = x_i·w_p + b_i
step E2, training a separate model for each power supply area: the electricity consumption vectors x_i of all users in power supply area C_P are combined into a matrix X_p, the power consumption values to be predicted are combined into a vector Y_p, and all error terms b_i are combined into a vector B_p; all the linear models are then combined as:

Y_p = X_p·w_p + B_p

and the parameter w_p is estimated from the matrix X_p and the vector Y_p of C_P;
step E3, applying the least squares method to C_P: assuming the error B_p has finite variance and zero mean, i.e. E[B_p] = 0, the least squares solution of w_p is:

ŵ_p = (X_p^T · X_p)^(-1) · X_p^T · Y_p

and the error of the prediction model is:

B̂_p = Y_p - X_p·ŵ_p;
step E4, modeling each power supply area C_P with its own linear regression model and solving it using the least squares solution ŵ_p above, so as to obtain a user power consumption prediction model for each power supply area.
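A NumPy sketch of the per-area least-squares fit of steps E1 to E4; the synthetic data below are invented purely for illustration.

import numpy as np

def fit_area_model(X_p, Y_p):
    # Least squares solution w_p = (X_p^T X_p)^(-1) X_p^T Y_p for one power supply area.
    w_p, *_ = np.linalg.lstsq(X_p, Y_p, rcond=None)
    residual = Y_p - X_p @ w_p  # error of the prediction model
    return w_p, residual

rng = np.random.default_rng(0)
D, users = 7, 40                   # 7 previous days, 40 users in the area (assumed)
X_p = rng.normal(size=(users, D))  # each row: one user's usage over D days
true_w = rng.normal(size=D)
Y_p = X_p @ true_w + 0.1 * rng.normal(size=users)
w_p, B_p = fit_area_model(X_p, Y_p)
print(np.round(w_p - true_w, 2))   # estimated parameters are close to the true ones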
4. The method for analyzing unstructured electric power big data based on deep learning as claimed in claim 3, further comprising multi-task joint learning, wherein an iterative joint learning algorithm is adopted and, in each iteration, the prediction models of different power supply areas are optimized simultaneously through the sharing of user data, so as to improve model performance across the different power supply areas, specifically:
step F1: constructing a reference power consumption prediction model for each power supply area: in power supply area C_P, a linear prediction model is constructed from the matrix X_p and the vector Y_p, the parameter w_p and error B_p are solved by the least squares algorithm as the reference model of area C_P, and the overall regional electricity consumption behavior similarity matrix S is set;
step F2: performing data fusion between areas according to the overall electricity consumption behavior similarity matrix S: for every other power supply area C_q, according to the overall electricity demand similarity S_pq between C_P and C_q, i.e. the entry in the pth row and qth column of S, user data are randomly drawn from C_q with probability S_pq and fused with the user data in C_P to obtain X_{p∪q} and Y_{p∪q};
step F3: using the least squares algorithm, solving for the model parameter W_{p∪q} and prediction parameter B_{p∪q} that minimize the joint learning loss function, according to X_{p∪q} and Y_{p∪q};
Step F4: judging whether the models on all the areas are updated or not: if the models in all the regions are updated, performing step F5, otherwise, returning to step F2;
step F5: judging whether the least square algorithm converges: if convergence is reached, i.e., the models in all the regions are not updated, the result is output, otherwise, the process returns to step F2.
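A hedged sketch of one joint-learning iteration covering steps F2 and F3: user rows from a similar area C_q are drawn with probability S_pq, pooled with the users of C_P, and the linear model is refit by least squares; the Bernoulli sampling of individual users is an assumed reading of the probabilistic fusion.

import numpy as np

def joint_learning_step(X_p, Y_p, X_q, Y_q, s_pq, seed=0):
    # Step F2: randomly borrow users from area C_q with probability s_pq.
    rng = np.random.default_rng(seed)
    mask = rng.random(len(X_q)) < s_pq
    X_fused = np.vstack([X_p, X_q[mask]])
    Y_fused = np.concatenate([Y_p, Y_q[mask]])
    # Step F3: refit the model on the fused data by least squares.
    w, *_ = np.linalg.lstsq(X_fused, Y_fused, rcond=None)
    return w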
5. The method as claimed in claim 1, wherein the video data preprocessing includes shot segmentation and key frame extraction, and the shot segmentation includes: dividing the gray scale, brightness or color of each pixel of adjacent frames into N levels by a histogram-based method, and counting the number of pixels for each level to make a histogram comparison; the key frame extraction comprises: and classifying the images in the image library by adopting a K-means clustering algorithm.
6. The unstructured electric big data analysis method based on deep learning of claim 1, wherein the text data preprocessing comprises:
step Q1, performing word segmentation operation on the text data by adopting a Chinese word segmentation tool;
step Q2, removing stop words from the segmented data using a stop-word dictionary;
step Q3, converting the data after stop-word removal into a structured data form that the computer can recognize and analyze, using the Word2Vec toolkit.
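An illustrative sketch of steps Q1 to Q3 using jieba for Chinese word segmentation and gensim's Word2Vec for vectorization; both library choices, the tiny corpus and the three-entry stop-word set are assumptions made only for this example.

import jieba
from gensim.models import Word2Vec

stop_words = {'的', '了', '和'}  # placeholder stop-word dictionary

def preprocess(texts):
    # Q1: segment each text; Q2: drop stop words and whitespace tokens.
    return [[w for w in jieba.cut(t) if w.strip() and w not in stop_words]
            for t in texts]

corpus = preprocess(['变压器温度异常告警', '线路巡检记录正常'])  # example work-order texts
# Q3: train word vectors so that the text becomes structured numeric data.
model = Word2Vec(sentences=corpus, vector_size=50, window=2, min_count=1)
vector = model.wv[corpus[0][0]]  # 50-dimensional vector of the first word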
7. The unstructured electric big data analysis method based on deep learning as claimed in claim 1, further comprising video feature extraction, specifically:
step K1, acquiring the original image and extracting the color features in the original image using a color histogram;
step K2, performing smoothing filtering on the original image, calculating the pixel gradients, selecting the points with larger gradient change rates as image edge points, and extracting the edge features of the original image;
step K3, extracting the texture features of the original image by Gabor filtering;
step K4, dividing the image into pixel blocks and calling a matching algorithm to estimate a motion vector for each block, so as to express the motion features of the video.
8. The unstructured electric big data analysis method based on deep learning of claim 1, further comprising image analysis based on convolutional neural network, for image analysis, specifically:
step D1, acquiring an original pixel matrix of the image and representing the original pixel matrix as a three-dimensional matrix, wherein the length and the width of the three-dimensional matrix represent the size of the image, the depth represents the color channel of the image, the depth of the black-and-white picture is 1, and the depth of the image is 3 in an RGB color mode;
step D2, extracting various features of the image using a plurality of convolution kernels: the input image is convolved with a trainable filter and a bias is then added, and the convolutional layer is obtained after the feature extraction is finished;
step D3, performing a maximum or average operation on adjacent areas in the convolutional layer, adding the corresponding weights and biases, obtaining the output through an activation function, and outputting the feature map of the sampling layer;
step D4, extracting the local features of the image by connecting each convolution kernel with the local pixel points of the previous layer's feature map, and then convolving the convolution kernels with the whole previous-layer feature map to obtain the convolution result of the global features of the image;
step D5, adding the bias parameters to the convolution result and calculating the feature map of the convolutional layer through the activation function, with the specific operation as follows:
y(i, j) = f( Σ_{n=1}^{N} Σ_{m=1}^{M} w_{n,m} · u(i+n-1, j+m-1) + b )

wherein f represents the Sigmoid activation function, b represents the bias, w_{n,m} represents the weight at position (n, m) of the convolution kernel, N represents the length of the convolution kernel, M represents the width of the convolution kernel, and u represents the feature map output by the previous layer;
step D6, add a pooling layer between the convolutional layers.
9. The method for analyzing the unstructured electric big data based on the deep learning as set forth in claim 1, further comprising a text data analysis based on serialization for analyzing texts and other types of data, and constructing a text convolution operation based on one-dimensional sequence characteristics inside the data, specifically:
step P1, establishing a text input layer, and sequentially arranging word vectors corresponding to words in the sentence into a matrix;
step P2, establishing the text convolution layers, wherein the size of each text convolution kernel is filter_size multiplied by embedding_size;
filter_size represents the number of words covered by the text convolution kernel in the longitudinal direction, and embedding_size is the dimension of the word vectors;
and step P3, establishing a text pooling layer, and extracting the maximum value of the column vectors obtained by text convolution.
10. The unstructured electric big data analysis method based on deep learning of claim 1, further comprising heterogeneous stacked denoising automatic coding network, node semantic modeling based on multi-mode, heterogeneous feature fusion based on meta-path information propagation, node neighbor relation construction based on heterogeneous information network and deep scale learning based on heterogeneous information network;
the heterogeneous stacked denoising automatic coding network is used for learning the coding mode of input data and measuring loss by comparing original input with reconstructed output;
the multi-mode-based node semantic modeling is used for converting node contents in different modes into the same feature space and then carrying out uniform semantic modeling on all types of nodes;
the heterogeneous feature fusion based on meta-path information propagation enables mutual learning through the sharing and fusion of model parameters, further fusing and improving the feature representation of the nodes;
the heterogeneous information network-based node neighbor relation construction utilizes structural semantics embodied by a heterogeneous network to construct a clustering relation and establish a neighbor relation;
the deep heterogeneous scale learning based on the heterogeneous information network is used for constructing a deep heterogeneous scale learning method based on data pair constraint.
CN202210301556.9A 2022-03-24 2022-03-24 Unstructured electric power big data analysis method based on deep learning Pending CN114723583A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210301556.9A CN114723583A (en) 2022-03-24 2022-03-24 Unstructured electric power big data analysis method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210301556.9A CN114723583A (en) 2022-03-24 2022-03-24 Unstructured electric power big data analysis method based on deep learning

Publications (1)

Publication Number Publication Date
CN114723583A true CN114723583A (en) 2022-07-08

Family

ID=82240405

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210301556.9A Pending CN114723583A (en) 2022-03-24 2022-03-24 Unstructured electric power big data analysis method based on deep learning

Country Status (1)

Country Link
CN (1) CN114723583A (en)


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115049169A (en) * 2022-08-16 2022-09-13 国网湖北省电力有限公司信息通信公司 Regional power consumption prediction method, system and medium based on combination of frequency domain and spatial domain
CN115049169B (en) * 2022-08-16 2022-10-28 国网湖北省电力有限公司信息通信公司 Regional power consumption prediction method, system and medium based on combination of frequency domain and spatial domain
CN116432064A (en) * 2023-03-06 2023-07-14 北京车讯互联网股份有限公司 Data preprocessing system and method
CN116432064B (en) * 2023-03-06 2023-10-27 北京车讯互联网股份有限公司 Data preprocessing system and method
CN116842459A (en) * 2023-09-01 2023-10-03 国网信息通信产业集团有限公司 Electric energy metering fault diagnosis method and diagnosis terminal based on small sample learning
CN116842459B (en) * 2023-09-01 2023-11-21 国网信息通信产业集团有限公司 Electric energy metering fault diagnosis method and diagnosis terminal based on small sample learning
CN117555454A (en) * 2024-01-10 2024-02-13 深圳市极客智能科技有限公司 Data analysis method and system for realizing terminal AI display screen based on multiple modes
CN117555454B (en) * 2024-01-10 2024-04-16 深圳市极客智能科技有限公司 Data analysis method and system for realizing terminal AI display screen based on multiple modes


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination