CN114723583A - Unstructured electric power big data analysis method based on deep learning

Info

Publication number
CN114723583A
Authority
CN
China
Prior art keywords
image
data
power supply
convolution
matrix
Legal status
Pending
Application number
CN202210301556.9A
Other languages
Chinese (zh)
Inventor
梁志远
刘鹏
张硕
常迪
邓嶔
郑薇
米兆祥
崔晓萌
王薇
Current Assignee
Tianjin Sanyuan Electric Information Technology Co ltd
Original Assignee
Tianjin Sanyuan Electric Information Technology Co ltd
Application filed by Tianjin Sanyuan Electric Information Technology Co ltd
Priority to CN202210301556.9A
Publication of CN114723583A


Classifications

    • G06Q 50/06 Information and communication technology specially adapted for energy or water supply
    • G06F 16/55 Information retrieval of still image data: clustering; classification
    • G06F 18/22 Pattern recognition: matching criteria, e.g. proximity measures
    • G06F 18/23213 Pattern recognition: non-hierarchical clustering with a fixed number of clusters, e.g. K-means clustering
    • G06F 40/151 Handling natural language data: text processing; use of codes for handling textual entities; transformation
    • G06F 40/242 Natural language analysis: lexical tools; dictionaries
    • G06F 40/289 Natural language analysis: phrasal analysis, e.g. finite state techniques or chunking
    • G06N 3/045 Neural networks: combinations of networks
    • G06N 3/08 Neural networks: learning methods
    • G06Q 10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention provides an unstructured electric power big data analysis method based on deep learning. The method integrates unstructured data such as video, images and documents and provides a multi-modal joint deep learning algorithm for unstructured electric power big data. It intelligently recognizes unstructured data such as video, images and documents, improves the unstructured data processing capability of the big data platform, delivers stronger feature learning and analysis/prediction capabilities for unstructured power data during big data processing, and enables demonstration applications in typical power-industry scenarios.

Description

Unstructured electric power big data analysis method based on deep learning
Technical Field
The invention relates to the technical field of data analysis, in particular to an unstructured electric power big data analysis method based on deep learning.
Background
With the rapid development of information technology, ever more data is generated, stored and used around the world, and at an ever-increasing rate. As an industry vital to the national economy and people's livelihood, electric power production and enterprise management are merging with information technology at unprecedented breadth and depth, and data has become both a new opportunity and a new challenge for driving the development of the power industry and related industries.
Unstructured data is data with an irregular or incomplete structure and no predefined data model, which is inconvenient to represent in the two-dimensional logical tables of a database. It includes office documents in all formats, plain text, pictures, XML, HTML, reports of various types, images and audio/video information. Roughly 80% of the data in an enterprise is unstructured, and this data grows exponentially at about 60% per year; it is reported that on average only 1%-5% of data is structured. Today, this explosive growth of unexploited data consumes the capacity of the complex and expensive primary storage in enterprises.
Against this background, how to efficiently use information-technology means such as unstructured data analysis to provide flexible power consumption for urban power supply networks, comprehensively process and analyze massive data, analyze users' electricity consumption behavior across different time periods, categories and fine granularities, and realize efficient and rapid management of electric power big data is an urgent need of urban development at the present stage.
Disclosure of Invention
The object of the present invention is to solve at least one of the technical drawbacks mentioned.
Therefore, an object of the present invention is to provide an unstructured electric power big data analysis method based on deep learning, so as to solve the problems mentioned in the background art and overcome the disadvantages in the prior art.
In order to achieve the above object, an embodiment of one aspect of the present invention provides an unstructured electric power big data analysis method based on deep learning, characterized in that video data preprocessing, image data preprocessing and text data preprocessing are adopted to analyze and process electric power big data.
The video data preprocessing analyzes and processes the video data; the image data preprocessing analyzes and processes the image data; and the text data preprocessing uses natural language processing technology to preprocess the electric power big data.
The image preprocessing comprises image graying, geometric transformation, image enhancement and decoding. The graying uses the maximum value method, taking the maximum of the three component intensities of a color pixel as the gray value of the grayscale image. The geometric transformation processes the acquired image with geometric transformation methods and a gray interpolation algorithm to correct systematic errors of the image acquisition system and random errors of the instrument position. The image enhancement uses a spatial domain method comprising point operations and neighborhood denoising operations. The image decoding, performed after the image enhancement operation, decodes the image with the TensorFlow framework and converts it into the image's original pixel matrix.
Preferably, the method further comprises power supply area division, which specifically comprises:
Step H1: count the cells within the power supply area;
Step H2: randomly initialize the power supply areas by randomly assigning each cell to one of K clusters C_1, C_2, ..., C_K, where K is the preset number of power supply areas;
Step H3: compute the cluster center of each power supply area with the following formula:

μ_C = (1 / |C|) · Σ_{i∈C} x_i

where |C| is the number of cells contained in power supply area C and x_i = (x_i^(1), ..., x_i^(t)) records the overall average daily electricity consumption of the i-th cell over t days;
Step H4: compute the similarity between each cell and the cluster center of each power supply area;
Step H5: based on the similarity results of step H4, assign each cell to the power supply area whose cluster center it is most similar to;
Step H6: determine whether the power supply area assignment of the cells has converged; if so, output the resulting division of cells into power supply areas, otherwise return to step H3.
In any of the above schemes, preferably, the method further includes constructing a power prediction model, specifically:
Step E1: for power supply area C_P, assume that the electricity consumption of user U_i is linearly correlated over D+1 consecutive days, so that the consumption on a given day can be predicted by a linear combination of the consumption x_i on the preceding D days, as in the following equation:

y_i = Σ_{d=1}^{D} w_p^(d) · x_i^(d) + b_i

where w_p is the linear combination parameter of area C_P and b_i is an error variable, a varying random error term for user U_i. Writing y_i, x_i and b_i as the stacked vector/matrix forms of the observed consumption, the preceding-day consumption and the error terms over all predicted days, and w_p as the parameter vector, the linear combination model of user U_i's electricity consumption is:

y_i = x_i · w_p + b_i

Step E2: train a separate model for each power supply area. Combine the electricity consumption matrices x_i of all users in power supply area C_P into a matrix X_p, the consumption values to be predicted into a vector Y_p, and all error terms b_i into a vector B_p; then all the linear models combine into:

Y_p = X_p · w_p + B_p

and the parameter w_p is estimated from the matrix X_p and the vector Y_p of C_P.

Step E3: solve for C_P with the least squares method. Assuming the error B_p has finite variance and zero mean, i.e. E[B_p] = 0, the least squares solution of w_p is:

w_p = (X_p^T · X_p)^(-1) · X_p^T · Y_p

and the error of the prediction model is the residual:

B_p = Y_p - X_p · w_p

Step E4: model each power supply area C_P with its own linear regression model and solve for w_p as above, obtaining user electricity consumption prediction models for the different power supply areas.
In any of the above schemes, preferably, the method further includes multi-task joint learning: an iterative joint learning algorithm that, in each iteration, shares user data and simultaneously optimizes the prediction models of different power supply areas, so as to improve model performance across all areas. Specifically:
Step F1: construct a reference electricity consumption prediction model for each power supply area. On power supply area C_P, build a linear prediction model from the matrix X_p and the vector Y_p, and solve for the parameter w_p and the error B_p with the least squares algorithm as the reference model of area C_P; denote by S the similarity matrix of the overall electricity consumption behavior between areas.
Step F2: fuse data across areas according to the overall electricity consumption behavior similarity matrix S. For every other power supply area C_q, according to the overall electricity demand similarity S_pq between C_P and C_q (the entry in row p, column q of S), randomly sample user data from C_q with probability S_pq and fuse it with the user data of C_P, obtaining X_{p∪q} and Y_{p∪q}.
Step F3: using the least squares algorithm, solve from X_{p∪q} and Y_{p∪q} for the model parameter W_{p∪q} that minimizes the joint learning loss function, together with the prediction error B_{p∪q}.
Step F4: check whether the models of all areas have been updated: if so, go to step F5; otherwise, return to step F2.
Step F5: check whether the least squares algorithm has converged: if it has, i.e. no model in any area is updated further, output the result; otherwise, return to step F2.
In any of the above schemes, preferably, the video data preprocessing includes shot segmentation and key frame extraction. Shot segmentation uses a histogram-based method: the gray scale, brightness or color of the pixels of adjacent frames is divided into N levels, and the number of pixels at each level is counted for a histogram comparison. Key frame extraction classifies the images in the image library with a K-means clustering algorithm.
In any of the above aspects, preferably, the text data preprocessing includes:
Step Q1: segment the text data into words with a Chinese word segmentation tool;
Step Q2: remove stop words from the segmented data using a stop-word dictionary;
Step Q3: convert the stop-word-filtered data into a structured data form that a computer can recognize and analyze, using the Word2Vec toolkit.
In any of the above schemes, preferably, the method further includes video feature extraction, specifically:
Step K1: acquire the original image and extract its color features with a color histogram;
Step K2: smooth-filter the original image, compute pixel gradients, select points with a large gradient change rate as image edge points, and extract the edge features of the original image;
Step K3: extract the texture features of the original image with Gabor filtering;
Step K4: divide the image's pixel regions into blocks and call a matching algorithm to estimate a motion vector for each block, so as to express the motion features of the video.
In any of the above schemes, preferably, the method further includes image analysis based on a convolutional neural network, specifically:
Step D1: acquire the original pixel matrix of the image and represent it as a three-dimensional matrix, where the length and width of the matrix give the image size and the depth gives the color channels; a black-and-white picture has depth 1, and an image in RGB color mode has depth 3.
Step D2: extract multiple features of the image with multiple convolution kernels; convolve the input image with trainable filters and then add a bias, obtaining the convolutional layer once feature extraction is complete.
Step D3: take the maximum or average over adjacent regions of the convolutional layer, add the corresponding weight and bias, pass the result through an activation function, and output the feature map of the sampling layer.
Step D4: extract local features of the image by connecting each convolution kernel to local pixels of the previous layer's feature map, then convolve the kernels with the entire previous feature map to obtain the convolution result over the image's global features.
Step D5: add the bias parameter to the convolution result and compute the feature map of the convolutional layer through the activation function:

x_out = f( Σ_{n=1}^{N} Σ_{m=1}^{M} w_{n,m} · u_{n,m} + b )

where f is the Sigmoid activation function, b the bias, w_{n,m} the weight of the convolution kernel at position (n, m), N and M the length and width of the convolution kernel, and u the feature map output by the previous layer.
Step D6: add a pooling layer between the convolutional layers.
In any of the above schemes, preferably, the method further includes serialization-based text data analysis, used for analyzing text and similar data, which constructs a serialized text convolution operation according to the one-dimensional sequence structure inside the data, specifically:
Step P1: build the text input layer by arranging the word vectors of the words in a sentence, in order, into a matrix.
Step P2: build the text convolution layers, where the size of each text convolution kernel is filter_size × embedding_size. Here filter_size is the number of words the text convolution kernel spans in the vertical direction (so the word-order relationship between adjacent words is taken into account), and embedding_size is the dimension of the word vectors.
Step P3: build the text pooling layer and extract the maximum value of each column vector obtained from the text convolution.
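The following is a minimal text-CNN sketch of steps P1-P3 in TensorFlow/Keras. The vocabulary size, sentence length, embedding_size, filter sizes and the number of output classes are illustrative assumptions, not values taken from the patent.

```python
import tensorflow as tf

# Text-CNN sketch for steps P1-P3: word vectors are stacked into a sentence matrix,
# convolved with kernels of size filter_size x embedding_size, and max-pooled column-wise.
vocab_size, seq_len, embedding_size = 5000, 64, 128        # illustrative values
filter_sizes, num_filters = (3, 4, 5), 100

inputs = tf.keras.Input(shape=(seq_len,), dtype="int32")
# P1: text input layer - arrange the word vectors of a sentence into a matrix
x = tf.keras.layers.Embedding(vocab_size, embedding_size)(inputs)
x = tf.keras.layers.Reshape((seq_len, embedding_size, 1))(x)

pooled = []
for fs in filter_sizes:
    # P2: text convolution kernel spanning fs words over the full embedding width
    conv = tf.keras.layers.Conv2D(num_filters, (fs, embedding_size), activation="relu")(x)
    # P3: text pooling layer - keep the maximum of each convolution column vector
    pooled.append(tf.keras.layers.GlobalMaxPooling2D()(conv))

features = tf.keras.layers.Concatenate()(pooled)
outputs = tf.keras.layers.Dense(2, activation="softmax")(features)  # e.g. 2 text classes
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```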
In any of the above schemes, preferably, the method further comprises a heterogeneous stacked denoising autoencoder network, multi-modal node semantic modeling, heterogeneous feature fusion based on meta-path information propagation, node neighbor relation construction based on a heterogeneous information network, and deep heterogeneous metric learning based on a heterogeneous information network.
The heterogeneous stacked denoising autoencoder network learns an encoding of the input data and measures the loss by comparing the original input with the reconstructed output.
The multi-modal node semantic modeling converts node contents of different modalities into the same feature space and then performs unified semantic modeling of all node types.
The heterogeneous feature fusion based on meta-path information propagation further fuses and improves the feature representation of nodes through sharing, fusion and mutual learning of model parameters.
The node neighbor relation construction based on the heterogeneous information network uses the structural semantics embodied by the heterogeneous network to construct clustering relations and establish neighbor relations.
The deep heterogeneous metric learning based on the heterogeneous information network constructs a deep heterogeneous metric learning method based on data-pair constraints.
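A minimal sketch of the denoising-autoencoder idea described above (learn an encoding of corrupted input and measure the loss against the clean original) is shown below in Keras. The layer sizes and noise level are illustrative assumptions; the heterogeneous, multi-modal handling of the patent is not reproduced here.

```python
import tensorflow as tf

# Plain stacked denoising autoencoder sketch: corrupt the input, encode it through
# stacked layers, decode it, and compare the reconstruction with the clean original.
input_dim = 256                                              # assumed feature size
inputs = tf.keras.Input(shape=(input_dim,))
noisy = tf.keras.layers.GaussianNoise(0.1)(inputs)           # corrupt the input
h = tf.keras.layers.Dense(128, activation="relu")(noisy)     # stacked encoder
code = tf.keras.layers.Dense(64, activation="relu")(h)
h = tf.keras.layers.Dense(128, activation="relu")(code)      # decoder
reconstructed = tf.keras.layers.Dense(input_dim)(h)

autoencoder = tf.keras.Model(inputs, reconstructed)
# Reconstruction loss compares the original input with the decoded output.
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(X, X, epochs=10)   # X: node feature matrix, shape (n_nodes, input_dim)
```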
Compared with the prior art, the invention has the advantages and beneficial effects that:
1. The unstructured electric power big data analysis method based on deep learning comprehensively processes and analyzes massive data, improves the big-data-platform-based intelligent recognition of voice, images, video and the like, provides an accurate data analysis basis for subsequent information processing and analysis, and saves resource costs.
2. The unstructured electric power big data analysis method based on deep learning analyzes massive data in real time with high efficiency, uses big data to improve the accuracy of statistical estimation, solves large-scale optimization problems scientifically, and becomes an effective tool for big data mining and learning.
3. The unstructured electric power big data analysis method based on deep learning, on the one hand, jointly optimizes feature learning and the classification model for better performance and, on the other hand, greatly reduces manual intervention, giving it better applicability to ever-changing practical problems.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a structural diagram of an unstructured electric power big data analysis method based on deep learning according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative and intended to explain the present invention and should not be construed as limiting the present invention.
As shown in fig. 1, the unstructured electric power big data analysis method based on deep learning adopts video data preprocessing, image data preprocessing and text data preprocessing to analyze and process electric power big data.
The video data preprocessing analyzes and processes the video data.
The image data preprocessing analyzes the image data.
The text data preprocessing uses natural language processing technology to preprocess the electric power big data; the processing of text data includes word segmentation and vectorization of words.
The image preprocessing comprises graying, geometric transformation, image enhancement and decoding of the image. The graying uses the maximum value method, taking the maximum of the three component intensities of a color pixel as the gray value of the grayscale image.
The geometric transformation processes the acquired image with geometric transformation methods and a gray interpolation algorithm, in order to correct systematic errors of the image acquisition system and random errors of the instrument position.
The image enhancement uses a spatial domain method comprising point operations and neighborhood denoising operations.
The image decoding, performed after the image enhancement operation, decodes the image with the TensorFlow framework and converts it into the image's original pixel matrix.
In the image analysis of the unstructured electric power big data analysis method, image quality directly influences the design of the recognition algorithm and the precision of its results. The image preprocessing of the embodiment of the invention therefore aims to eliminate irrelevant information in the image, recover useful real information, enhance the detectability of relevant information and simplify the data as much as possible, thereby improving the reliability of feature extraction, image segmentation, matching and recognition.
The image graying performs grayscale conversion based on the RGB color model. The color of each pixel of a color image defined in RGB space is determined by the three components R, G and B. The number of bits each component occupies in memory determines the image depth, i.e. the number of bits per pixel. For a common 24-bit color RGB image, the three components occupy 1 byte each, so each component takes a value from 0 to 255 and a pixel can represent roughly 16 million colors (256 × 256 × 256). For such a color image, the corresponding grayscale image has a depth of only 8 bits (the three RGB components can be considered equal), which also means the computation needed to process the grayscale image is much smaller. It should be noted, however, that although some color levels are lost, the grayscale description remains consistent with the color description in the distribution of overall and local chromaticity and intensity levels across the image.
Graying an RGB image generally means weighting and combining the three RGB components to obtain a final gray value. Commonly used graying methods in image processing are: 1. the component method; 2. the maximum value method; 3. the average method; 4. the weighted average method. The invention uses the maximum value method, taking the maximum of the three component intensities of the color image as the gray value of the grayscale image.
The specific calculation is: Gray = max(R, G, B).
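A minimal numpy sketch of the maximum-value graying just described, assuming the image is an H x W x 3 RGB array:

```python
import numpy as np

def to_gray_max(image: np.ndarray) -> np.ndarray:
    """Maximum-value graying: for each pixel take max(R, G, B) as the gray value."""
    return image.max(axis=2).astype(np.uint8)
```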
The geometric transformation processing applies geometric transformations such as translation, transposition, mirroring, rotation and scaling to the acquired image, in order to correct systematic errors of the image acquisition system and random errors of the instrument position, such as the imaging angle, perspective relationship and even defects of the lens itself. A gray interpolation algorithm is also needed, because pixels of the output image may map onto non-integer coordinates of the input image under the computed transformation. Commonly used methods are nearest-neighbor interpolation, bilinear interpolation and bicubic interpolation.
Image enhancement aims to improve the visual effect of an image: for the given application, it purposefully emphasizes overall or local characteristics of the image, makes an originally unclear image clear or emphasizes certain features of interest, enlarges the differences between the features of different objects in the image, suppresses uninteresting features, improves the image quality, enriches the information content, and strengthens image interpretation and recognition, so as to meet the needs of particular analyses. Image enhancement algorithms fall into two broad categories: spatial domain methods and frequency domain methods.
The image enhancement processing of the invention mainly uses spatial domain methods. A spatial domain method is a direct image enhancement algorithm, divided into point operation algorithms and neighborhood denoising algorithms. Point operation algorithms include gray-level correction, gray-level transformation (also called contrast stretching) and histogram modification. Neighborhood enhancement algorithms are of two kinds, image smoothing and sharpening; common smoothing algorithms include mean filtering, median filtering and spatial filtering, and common sharpening algorithms include the gradient operator method, the second-derivative operator method, high-pass filtering, mask matching and the like.
After the image enhancement operation, image decoding is carried out to convert the image into its original pixel matrix. The decoding of images is mainly done using the TensorFlow framework, which provides encoding/decoding functions for jpeg- and png-format images.
TensorFlow is a second-generation machine learning system developed by Google on the basis of DistBelief; it can feed complex data structures into artificial neural networks for analysis and processing. TensorFlow can be used in many machine learning and deep learning fields such as speech recognition and image recognition; it improves on the earlier deep learning infrastructure DistBelief in many respects and can run on devices ranging from a single smartphone to thousands of data center servers. TensorFlow is fully open source and available to anyone, and it supports CNN, RNN and LSTM algorithms, currently the most popular deep neural network models for image, speech and NLP tasks.
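A short sketch of decoding a jpeg image into its raw pixel matrix with TensorFlow, as described above; the file path is a placeholder.

```python
import tensorflow as tf

raw = tf.io.read_file("inspection_photo.jpg")               # hypothetical path
pixels = tf.io.decode_jpeg(raw, channels=3)                 # uint8 tensor, shape (H, W, 3)
pixels = tf.image.convert_image_dtype(pixels, tf.float32)   # optional: scale to [0, 1]
print(pixels.shape)
```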
Text data preprocessing processes unstructured data such as power system text with natural language processing technology: it performs Chinese word segmentation, lexical analysis, syntactic analysis, semantic analysis, vectorization and similar work on the underlying documents and parses them into vector representations a computer can understand, thereby supporting the construction of powerful data analysis and intelligent systems over unstructured data.
The unstructured electric power big data analysis method based on deep learning has high analysis and processing efficiency, can be widely applied in sectors such as government, energy and public services, has broad applicability, makes applications more intelligent and management more refined, provides a good basis for big data analysis and decision making, and reduces resource waste.
Further, the method includes power supply area division, which is specifically:
Step H1: count the cells within the power supply area.
Step H2: randomly initialize the power supply areas by randomly assigning each cell to one of K clusters C_1, C_2, ..., C_K, where C_k denotes a cluster label and K is the preset number of power supply areas.
Step H3: compute the cluster center of each power supply area with the following formula:

μ_C = (1 / |C|) · Σ_{i∈C} x_i

where |C| is the number of cells contained in power supply area C and x_i = (x_i^(1), ..., x_i^(t)) records the overall average daily electricity consumption of the i-th cell over t days.
Step H4: treating the overall representation of each power supply area as a special cell, compute the similarity between each cell and the cluster center of each power supply area.
The similarity algorithm measures how similar two objects are and underlies tasks such as information retrieval, recommender systems and data mining. Similarity is computed here with the Euclidean distance. Assume objects X and Y both have N-dimensional features, i.e. X = {x_1, x_2, ..., x_n} and Y = {y_1, y_2, ..., y_n}; the specific formula is:

d(X, Y) = sqrt( Σ_{k=1}^{n} (x_k - y_k)² ) = sqrt( dot(X, X) - 2·dot(X, Y) + dot(Y, Y) )

where dot(X, Y) represents the inner product of the vectors.
Step H5: based on the similarity results of step H4, assign each cell to the power supply area whose cluster center it is most similar to.
Step H6: determine whether the power supply area assignment of the cells has converged, i.e. whether any cell's cluster label was changed in the previous round of adjustment. If no cluster label changed, the assignment has converged and the division of cells into power supply areas is output; otherwise return to step H3. With this method, tailored to the characteristics of power supply areas, the k-means-based power supply area division algorithm partitions efficiently, and its result is convenient for subsequent analysis and processing.
Specifically, the method further comprises constructing a power consumption prediction model. In this construction, an autoregressive prediction model is built separately for each power supply area so as to fit the electricity consumption demands of users in the different areas. The steps are as follows:
Step E1: for power supply area C_P, assume that the electricity consumption of user U_i is linearly correlated over D+1 consecutive days, so that the consumption on a given day can be predicted by a linear combination of the consumption x_i on the preceding D days, as in the following equation:

y_i = Σ_{d=1}^{D} w_p^(d) · x_i^(d) + b_i

where w_p is the linear combination parameter of area C_P and b_i is an error variable, a varying random error term for user U_i. Writing y_i, x_i and b_i as the stacked vector/matrix forms of the observed consumption, the preceding-day consumption and the error terms over all predicted days, and w_p as the parameter vector, the linear combination model of user U_i's electricity consumption is:

y_i = x_i · w_p + b_i
step E2, training a separate model for each power supply area, and connecting the power supply areas CPElectricity consumption matrix x of all users iniAre combined into a matrix XpCombining the predicted power consumption into vector YpAll error terms biCombined into vector BpThen, all linear models are combined as:
Yp=Xpwp+Bp
according to CPMatrix X in (1)pSum vector YpEstimate the parameter wpA value; i.e. according to CPPower consumption X of all users inpAnd the marked power consumption YpEstimate the parameter wp
Step E3: solve for C_P with the least squares method. Assuming the error B_p has finite variance and zero mean, i.e. E[B_p] = 0, the least squares solution of w_p is:

w_p = (X_p^T · X_p)^(-1) · X_p^T · Y_p

and the error of the prediction model is the residual:

B_p = Y_p - X_p · w_p

Step E4: model each power supply area C_P with its own linear regression model and solve for w_p as above, obtaining user electricity consumption prediction models for the different power supply areas. The constructed prediction model provides an accurate model basis for subsequent analysis, fits the electricity consumption demands of users in different areas, is convenient and practical, and is efficient to build.
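A minimal numpy sketch of steps E1-E4 is given below, under the assumption that each training sample consists of D consecutive days of a user's consumption and the target is the following day; the function names and the bias-free formulation are illustrative.

```python
import numpy as np

def build_area_model(consumption: np.ndarray, D: int) -> np.ndarray:
    """Fit w_p for one power supply area C_P (steps E1-E3).

    consumption : (n_users, n_days) daily consumption of the area's users.
    Returns the least squares weights w_p of length D.
    """
    X_rows, y_rows = [], []
    for series in consumption:                       # E1/E2: stack per-user samples
        for d in range(D, len(series)):
            X_rows.append(series[d - D:d])           # previous D days
            y_rows.append(series[d])                 # day to predict
    X_p, Y_p = np.asarray(X_rows), np.asarray(y_rows)
    # E3: least squares solution w_p = (X_p^T X_p)^(-1) X_p^T Y_p
    w_p, *_ = np.linalg.lstsq(X_p, Y_p, rcond=None)
    return w_p

def predict_next_day(history_D_days: np.ndarray, w_p: np.ndarray) -> float:
    """Predict the next day's consumption from the last D days (step E4)."""
    return float(history_D_days @ w_p)
```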
Further, the method comprises multi-task joint learning: an iterative joint learning algorithm that, in each iteration, shares user data and simultaneously optimizes the prediction models of different power supply areas, so as to improve model performance across all areas. Specifically:
Step F1: construct a reference electricity consumption prediction model for each power supply area. On power supply area C_P, build a linear prediction model from the matrix X_p and the vector Y_p, and solve for the parameter w_p and the error B_p with the least squares algorithm as the reference model of area C_P; denote by S the similarity matrix of the overall electricity consumption behavior between areas.
Step F2: fuse data across areas according to the overall electricity consumption behavior similarity matrix S. For every other power supply area C_q, according to the overall electricity demand similarity S_pq between C_P and C_q (the entry in row p, column q of S), randomly sample user data from C_q with probability S_pq and fuse it with the user data of C_P, obtaining X_{p∪q} and Y_{p∪q}.
Step F3: using the least squares algorithm, solve from X_{p∪q} and Y_{p∪q} for the model parameter W_{p∪q} that minimizes the joint learning loss function, together with the prediction error B_{p∪q}. The joint learning loss function aggregates the prediction errors E[B_{p∪q}] over the P divided areas, where P is the number of areas; the model parameter W_{p∪q} with the smallest loss function value is selected as the prediction model of C_P.
Step F4: check whether the models of all areas have been updated: if so, go to step F5; otherwise, return to step F2.
Step F5: check whether the least squares algorithm has converged: if it has, i.e. no model in any area is updated further, output the result; otherwise, return to step F2.
In the process of dividing the power supply areas, besides the division result itself, a similarity matrix S of the overall electricity demand between the power supply areas is obtained. For this matrix S, a multi-task joint learning model with an iterative joint learning algorithm is adopted: in each iteration, the prediction models of different power supply areas are optimized simultaneously through sharing of user data, improving model performance across all areas.
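The sketch below shows the iterative joint learning loop of steps F1-F5 in numpy. The patent's exact joint loss function is not reproduced; as an assumption, each round simply refits every area on its own data fused with user rows sampled from other areas with probability S[p, q], following the data fusion of step F2.

```python
import numpy as np

def joint_learning(area_data, S, max_rounds=20, seed=0):
    """Iterative multi-task joint learning sketch (steps F1-F5).

    area_data : list of (X_p, Y_p) pairs, one per power supply area.
    S         : (P, P) similarity matrix of overall electricity demand between areas.
    """
    rng = np.random.default_rng(seed)
    P = len(area_data)
    # F1: reference least squares model per area
    models = [np.linalg.lstsq(X, Y, rcond=None)[0] for X, Y in area_data]
    for _ in range(max_rounds):
        updated = False
        for p in range(P):
            X_p, Y_p = area_data[p]
            X_parts, Y_parts = [X_p], [Y_p]
            for q in range(P):
                if q == p:
                    continue
                X_q, Y_q = area_data[q]
                # F2: sample rows of C_q with probability S[p, q] and fuse with C_p
                mask = rng.random(len(X_q)) < S[p, q]
                X_parts.append(X_q[mask]); Y_parts.append(Y_q[mask])
            X_fused, Y_fused = np.vstack(X_parts), np.concatenate(Y_parts)
            # F3: least squares on the fused data
            w_new = np.linalg.lstsq(X_fused, Y_fused, rcond=None)[0]
            if not np.allclose(w_new, models[p], atol=1e-6):
                models[p], updated = w_new, True
        # F4/F5: stop when no area's model changed in a full round
        if not updated:
            break
    return models
```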
Further, the video data preprocessing comprises shot segmentation and key frame extraction. Shot segmentation uses a histogram-based method: the gray scale, brightness or color of the pixels of adjacent frames is divided into N levels, and the number of pixels at each level is counted for a histogram comparison. Key frame extraction classifies the images in the image library with a K-means clustering algorithm.
Shot segmentation of the video data is mainly implemented with the histogram-based method, because data such as images captured by an unmanned aerial vehicle are often affected by various external factors. The histogram-based algorithm is the most common shot segmentation method; it is simple to apply and works well on most videos. It divides the gray scale, brightness or color of the pixels of adjacent frames into N equal levels, counts the number of pixels at each level, and compares the histograms. Because only the overall distribution of gray levels or colors is counted, the method tolerates motion within a shot and slow camera movement well.
A key frame is the most important and representative image (or images) in a shot. Depending on the complexity of the shot content, one or more key frames may be extracted from a shot. Key frames are selected to contain the main information of the shot while remaining simple to process.
The clustering-based key frame extraction of the embodiment of the invention first classifies the images in the image library with a K-means clustering algorithm; key frame extraction based on K-means clustering greatly reduces the amount of computation. The method is computationally efficient and effectively captures visual content where the video shots change markedly: few key frames are extracted for low-activity shots, and more for the others.
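The sketch below illustrates histogram-based shot boundary detection and K-means key-frame selection with OpenCV and scikit-learn. The number of levels N, the boundary threshold and the number of key frames per shot are illustrative assumptions.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def frame_hist(frame, levels=32):
    """Gray-level histogram with N=levels bins, normalised to a distribution."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    hist = cv2.calcHist([gray], [0], None, [levels], [0, 256]).ravel()
    return hist / hist.sum()

def detect_shot_boundaries(frames, threshold=0.4):
    """Mark a shot boundary where the histogram difference of adjacent frames is large."""
    boundaries = []
    for i in range(1, len(frames)):
        diff = np.abs(frame_hist(frames[i]) - frame_hist(frames[i - 1])).sum()
        if diff > threshold:                 # threshold is an illustrative value
            boundaries.append(i)
    return boundaries

def extract_key_frames(shot_frames, n_key=2):
    """Cluster a shot's frames with K-means and keep the frame closest to each center."""
    feats = np.stack([frame_hist(f) for f in shot_frames])
    km = KMeans(n_clusters=min(n_key, len(shot_frames)), n_init=10).fit(feats)
    keys = []
    for c in km.cluster_centers_:
        keys.append(int(np.argmin(np.linalg.norm(feats - c, axis=1))))
    return sorted(set(keys))
```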
Further, the text data preprocessing comprises:
Step Q1: segment the text data into words with a Chinese word segmentation tool;
Step Q2: remove stop words from the segmented data using a stop-word dictionary;
Step Q3: convert the stop-word-filtered data into a structured data form that a computer can recognize and analyze, using the Word2Vec toolkit.
Word segmentation: natural language processing technology is used to preprocess the electric power big data, and word segmentation is performed first; word segmentation is the most basic problem in natural language processing. The project mainly uses an open-source Chinese word segmentation tool for this step.
The jieba word segmentation tool supports three segmentation modes: the accurate mode tries to cut sentences most precisely and is suitable for text analysis; the full mode scans every word in a sentence that can form a word, which is very fast but cannot resolve ambiguity; and the search engine mode further splits long words on top of the accurate mode, improving recall, and is suitable for search engine segmentation. It also supports traditional-character segmentation and user-defined dictionaries, making it powerful.
The algorithms involved in jieba word segmentation comprise: (1) efficient word-graph scanning based on a Trie tree structure, generating a directed acyclic graph (DAG) of all possible word formations of the Chinese characters in a sentence; (2) dynamic programming to find the maximum-probability path, i.e. the best segmentation combination based on word frequency; (3) for unknown words, an HMM model of Chinese character word-forming capability, solved with the Viterbi algorithm.
Stop words: on top of word segmentation, stop-word processing is performed, because the analyzed text contains many uninformative words and symbols such as 'yes', 'and' and 'but'. To save storage space and improve search efficiency, an existing stop-word dictionary is mainly used for the stop-word operation; meaningful terms are screened out from the perspective of how a human would process the task, and nonstandard, non-uniform text data is converted into an accurate and normalized sequence of equipment operations.
Generating word vectors: on the basis of the word segmentation and stop-word steps, the processed data is converted into a structured form a computer can identify and analyze. The stop-word-filtered data is converted into numerical information in vector form; word vectors reflect the similarity and difference between words, allow relationships between words to be mined, and turn unstructured data such as text into structured data a computer can recognize.
Word2Vec is an open-source toolkit for obtaining word vectors, released by Google in 2013. It comprises the Continuous Bag-of-Words model (CBOW) and the Skip-gram model: the Skip-gram model predicts surrounding words from a central word, while the CBOW model predicts the central word from its surrounding words, each surrounding word contributing equally regardless of order and distance. Both models consist of an input layer, a mapping layer that maps words into a continuous vector space, and an output layer. Training uses stochastic gradient descent and hierarchical softmax to reduce computation. The invention mainly trains the skip-gram model of Word2Vec to generate the corresponding word vectors; once the word vectors are obtained, each sentence can be converted into matrix form.
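A minimal sketch of steps Q1-Q3 with jieba and gensim follows. The stop-word file and sample sentences are placeholders, and the parameter names assume gensim 4.x (vector_size instead of the older size).

```python
import jieba
from gensim.models import Word2Vec

stopwords = set(open("stopwords.txt", encoding="utf-8").read().split())   # hypothetical file

def preprocess(document: str):
    words = jieba.lcut(document)                                  # Q1: Chinese word segmentation
    return [w for w in words if w.strip() and w not in stopwords] # Q2: drop stop words

corpus = [preprocess(doc) for doc in ["变压器油温异常告警", "线路巡检发现绝缘子破损"]]  # sample texts
# Q3: train the skip-gram model (sg=1) and look up word vectors
model = Word2Vec(sentences=corpus, vector_size=100, window=5, min_count=1, sg=1)
vector = model.wv[corpus[0][0]]                                   # vector of the first token
```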
Specifically, the method further comprises video feature extraction, including:
Step K1: acquire the original image and extract its color features with a color histogram.
Color features are extracted with a color histogram. The color histogram is the most common way to extract and count the color features of a video image; it is a global statistical method describing the proportions of different color components among all the pixels of a complete picture. The specific steps are:
(1) Select the color space model; the RGB color space model is mainly used.
The most common uses of the RGB (red, green, blue) color model are display systems: color cathode ray tubes and color raster graphics displays use the R, G, B values to drive the R, G, B electron guns, exciting the three-color phosphors on the screen to emit light of different brightness and generating the various colors by additive mixing. A scanner likewise measures the R, G, B components of the light reflected or transmitted by the original and uses them to express the original's color.
(2) Count the color component information of the image pixels for each component.
(3) Compute the proportion of each color component in the image's global color information to obtain the color histogram of the chosen color model, with the formula:

H(k) = N_k / N

where N is the total number of pixels in the image and N_k is the count for the k-th color component.
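A minimal sketch of the color histogram H(k) = N_k / N using OpenCV, with an assumed number of bins per channel:

```python
import cv2
import numpy as np

def rgb_color_histogram(image_bgr: np.ndarray, bins: int = 16) -> np.ndarray:
    """Normalised per-channel color histogram of an H x W x 3 image (OpenCV BGR order)."""
    n_pixels = image_bgr.shape[0] * image_bgr.shape[1]
    hists = []
    for channel in range(3):
        h = cv2.calcHist([image_bgr], [channel], None, [bins], [0, 256]).ravel()
        hists.append(h / n_pixels)                     # proportion of pixels per bin
    return np.concatenate(hists)                       # length 3 * bins feature vector
```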
Step K2: smooth-filter the original image, compute pixel gradients, select points with a large gradient change rate as image edge points, and extract the edge features of the original image.
Edge features, also called contour features, are an important form of feature for describing the content of video frames. Edge feature extraction can effectively filter out the image's texture and fine boundary details while fully preserving the overall edge contour information of objects.
The extraction of image edge features resembles the human visual system: when a person observes an external object with the eyes, the edges where the color changes are found by noticing the color consistency of the object, and different objects are thereby effectively distinguished.
The edge feature extraction of the invention identifies image edges by exploiting the higher rate of change of color/gray level at the edges of image objects. The main steps are, in order: smoothing filtering, pixel gradient computation, and selection of points with a high gradient change rate as image edge points, completing the extraction of the image's edge features.
The image edge detection mainly uses Gaussian filtering. In all filtering methods, the key consideration is how to balance denoising against edge detection accuracy; practical engineering experience has shown that a Gaussian kernel provides a good compromise.
The Gaussian filtering is implemented as a discretized sliding-window convolution, realized with a Gaussian kernel, i.e. a Gaussian template of odd size. Commonly used Gaussian kernel templates are 3x3 and 5x5, and the kernel is computed from the two-dimensional Gaussian function:

G(x, y) = (1 / (2πσ²)) · exp( -(x² + y²) / (2σ²) )

where x² + y² represents the squared distance from the pixel to the central pixel and σ is the standard deviation. If σ is chosen too large, the filtering is too strong and blurs the image edges, which hinders the subsequent edge detection; if too small, the filtering effect is poor. When computing the Gaussian template parameters, normalization is required.
The Canny operator uses 2x2 first-difference convolution operators in the x and y directions; the resulting first-order partial derivatives, gradient magnitude and gradient direction are:

P[i,j] = (f[i,j+1] - f[i,j] + f[i+1,j+1] - f[i+1,j]) / 2
Q[i,j] = (f[i,j] - f[i+1,j] + f[i,j+1] - f[i+1,j+1]) / 2
M[i,j] = sqrt( P[i,j]² + Q[i,j]² )
θ[i,j] = arctan( Q[i,j] / P[i,j] )

where P is the first-order partial derivative matrix in the x direction, Q the first-order partial derivative matrix in the y direction, M the gradient magnitude and θ the gradient direction.
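The sketch below combines Gaussian smoothing with the 2x2 finite differences given above and thresholds the gradient magnitude to pick edge points; the kernel size, sigma and threshold are illustrative. OpenCV's cv2.Canny is shown as the off-the-shelf alternative.

```python
import cv2
import numpy as np

def edge_points(gray: np.ndarray, sigma: float = 1.0, mag_thresh: float = 20.0):
    """Gaussian smoothing, 2x2 finite-difference gradients, and thresholding on magnitude."""
    f = cv2.GaussianBlur(gray, (5, 5), sigma).astype(np.float64)
    # P, Q: first-order partial derivatives per the 2x2 differences given above
    P = (f[:-1, 1:] - f[:-1, :-1] + f[1:, 1:] - f[1:, :-1]) / 2.0
    Q = (f[:-1, :-1] - f[1:, :-1] + f[:-1, 1:] - f[1:, 1:]) / 2.0
    M = np.sqrt(P ** 2 + Q ** 2)                       # gradient magnitude
    theta = np.arctan2(Q, P)                           # gradient direction
    return M > mag_thresh, theta                       # edge-point mask and directions

# In practice the same pipeline is available directly as:
# edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 1.0), 50, 150)
```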
Step K3: extract the texture features of the original image with Gabor filtering.
Texture, an important visual feature of an image, is a kind of local structural feature. Texture features do not depend on the color, gray level or brightness of individual pixels; instead they represent how the pixel information (color, gray level or brightness) changes in a specific region around a pixel of the image. For these reasons, the invention selects the Gabor filtering method to extract image texture features.
Gabor proposed in 1946 that, to extract local information from the Fourier transform, a time-localized window function be introduced (dividing the signal into many small time intervals and analyzing each interval with the Fourier transform). The Gabor transform is therefore also called the windowed Fourier transform (short-time Fourier transform).
In the spatial domain, a two-dimensional Gabor filter is the product of a sinusoidal plane wave and a Gaussian kernel function; the former is the tuning function and the latter the window function:

g(x, y; λ, θ, φ, σ, γ) = exp( -(x′² + γ²·y′²) / (2σ²) ) · exp( i·(2π·x′/λ + φ) )

which can be split into real and imaginary forms:

g_real = exp( -(x′² + γ²·y′²) / (2σ²) ) · cos( 2π·x′/λ + φ )
g_imag = exp( -(x′² + γ²·y′²) / (2σ²) ) · sin( 2π·x′/λ + φ )

where

x′ = x·cosθ + y·sinθ,  y′ = -x·sinθ + y·cosθ

Here λ is the wavelength, which directly affects the scale of the filter and is usually 2 or more; θ is the orientation of the filter; φ is the phase offset of the tuning function, taking values from -180 to 180 degrees; σ is the standard deviation of the Gaussian envelope and determines the shape of the filter; and γ is the spatial aspect ratio, giving a circular support when equal to 1 and usually taken as 0.5.
The Gabor filter's excellent joint time-domain/frequency-domain resolution effectively imitates the human visual perception system. The basic principle is: different textures have different center frequencies and bandwidths in the frequency domain; the filters separate the textures according to the center frequency and bandwidth they occupy, each Gabor filter passing only the texture details that match its own frequency while suppressing interference from other texture frequencies. Finally, the filtering results of the different Gabor filters are analyzed to obtain the texture features of the image.
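A small Gabor filter bank sketch with OpenCV follows: the image is filtered at several orientations and simple statistics of each response are kept as texture features. The kernel size and the parameter values (lambda, sigma, gamma, psi) are illustrative assumptions.

```python
import cv2
import numpy as np

def gabor_texture_features(gray: np.ndarray) -> np.ndarray:
    """Texture features from a bank of Gabor filters at 4 orientations."""
    feats = []
    for theta in np.arange(0, np.pi, np.pi / 4):       # 4 orientations
        kernel = cv2.getGaborKernel(ksize=(21, 21), sigma=4.0, theta=theta,
                                    lambd=10.0, gamma=0.5, psi=0)
        response = cv2.filter2D(gray, cv2.CV_32F, kernel)
        feats.extend([response.mean(), response.std()])  # per-filter statistics
    return np.asarray(feats)
```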
And K4, blocking the pixel areas of the image, and calling a matching algorithm to estimate a motion vector of each pixel area so as to express the motion characteristics of the video.
The features of a video can be divided into static features and dynamic features. The static features are mainly image features of key frames, and the feature extraction method for key frames is the same as that for general static images. The static characteristics mainly comprise color characteristics, texture characteristics, shape characteristics and the like; dynamic features are characteristic of video data and include global motion (camera motion such as camera operation of panning, zooming, tracking, etc.) and local motion (motion of objects within a lens, motion trajectory, relative velocity, change in position between objects, etc.). Dynamic features are important features of video data. Since it is difficult to account for motion variations of a video sequence using only image features representing frames.
The motion characteristic is a characteristic form unique to video data, is different from the characteristic of image static property, is a dynamic characteristic, can effectively reflect the gradual change process of the video on the time process, and reflects the time domain characteristic of the video. The motion characteristics of the video are analyzed, so that the change rule of the relative position of the target object at different time points in the video or the movement of the lens position can be effectively mastered, the video content can be comprehensively described, and the method has a good effect on analyzing and understanding the content and the target of the video.
The motion features of a video can be subdivided into local motion features and global motion features. Local motion features mainly represent the change in the relative position of a particular target in the video at different time points and describe the motion trajectory of the target object. Global motion features mainly describe the regular motion of the whole frame caused by changes in the position of the camera lens, such as translation, zooming and rotation.
For the motion features, the image is divided into pixel blocks and a matching algorithm is called to estimate the motion vector of each block, thereby representing the motion features of the video, as illustrated in the sketch below.
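The following is a minimal block-matching sketch for estimating per-block motion vectors between two consecutive grayscale frames; the block size, search radius and the use of the sum of absolute differences as the matching cost are assumptions made for illustration.

import numpy as np

def block_matching(prev_frame, next_frame, block=16, search=8):
    # For every block of prev_frame, exhaustively search a window in next_frame
    # and keep the displacement with the smallest SAD (sum of absolute differences).
    h, w = prev_frame.shape
    vectors = np.zeros((h // block, w // block, 2), dtype=int)
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            ref = prev_frame[by:by + block, bx:bx + block].astype(float)
            best, best_dy, best_dx = np.inf, 0, 0
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y0, x0 = by + dy, bx + dx
                    if y0 < 0 or x0 < 0 or y0 + block > h or x0 + block > w:
                        continue
                    cand = next_frame[y0:y0 + block, x0:x0 + block].astype(float)
                    sad = np.abs(ref - cand).sum()
                    if sad < best:
                        best, best_dy, best_dx = sad, dy, dx
            vectors[by // block, bx // block] = (best_dy, best_dx)
    return vectors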
Optionally, the method further includes an image analysis based on a convolutional neural network, for analyzing the image, including:
Step D1, acquiring the original pixel matrix of the image and representing it as a three-dimensional matrix, wherein the length and width of the three-dimensional matrix represent the size of the image and the depth represents the color channels of the image: the depth of a black-and-white picture is 1, while in RGB color mode the depth of the image is 3.
The input layer is the input of the whole neural network; it represents the original pixel matrix of the image obtained in the image data preprocessing as a three-dimensional matrix, wherein the length and width of the three-dimensional matrix represent the size of the image and the depth represents the color channels of the image: the depth of a black-and-white picture is 1, and in RGB color mode the depth of the image is 3.
Step D2, extracting various features of the image using a plurality of convolution kernels: the input image is convolved with a trainable filter and a bias is then added, and the convolutional layer is obtained after the feature extraction is finished;
Step D3, performing a maximum or average operation on adjacent areas in the convolutional layer, adding the corresponding weights and biases, obtaining the output through an activation function, and outputting the feature map of the sampling layer;
Step D4, extracting the local features of the image by connecting each convolution kernel with the local pixel points of the previous layer's feature map, and then convolving the convolution kernels with the whole previous-layer feature map to obtain the convolution result of the global features of the image;
Step D5, adding the bias parameters to the convolution result and calculating the feature map of the convolutional layer through the activation function, with the specific operation as follows:
y(i, j) = f( Σ_{n=1}^{N} Σ_{m=1}^{M} w_{n,m} · u(i+n-1, j+m-1) + b )

wherein f represents the Sigmoid activation function, b represents the bias, w_{n,m} represents the weight at position (n, m) of the convolution kernel, N represents the length of the convolution kernel, M represents the width of the convolution kernel, and u represents the feature map output by the previous layer;
the convolutional layer mainly adopts a plurality of convolution kernels to extract various characteristics of images. The first convolution process is specific: the feature extraction process is to convolve the input image with a filter that can be trained and then add a bias. After the feature extraction is completed, the convolutional layer is obtained. The maximum or average operation is then performed on adjacent regions within the convolutional layer. Corresponding weights and deviations need to be added in the process, and output is obtained through activating functions. This results in a profile of the sampling layer. In the subsequent convolution process, the input to the convolution layer is the output of the sampled layer in the previous convolution process.
The local features of the image are extracted by connecting each convolution kernel with the local pixel points of the previous layer's feature map; the global features of the image are then obtained by convolving the convolution kernels with the whole previous-layer feature map, the bias parameters of the layer are added to the convolution result, and the feature map of the convolutional layer is calculated through the activation function, with the specific operation as follows:

y(i, j) = f( Σ_{n=1}^{N} Σ_{m=1}^{M} w_{n,m} · u(i+n-1, j+m-1) + b )

wherein f represents the Sigmoid activation function, b represents the bias, w_{n,m} represents the weight at position (n, m) of the convolution kernel, N represents the length of the convolution kernel, M represents the width of the convolution kernel, and u represents the feature map output by the previous layer. The convolution kernel performs the convolution operation with the input image and can be represented by a two-dimensional N × M matrix, where N and M are equal and odd, generally 3, 5 or 7. Because an odd-sized matrix, unlike an even-sized one, has a center point, the sliding convolution over the image can be aligned on the center of the kernel, which makes the operation more sensitive to edges and lines and more effective at extracting features such as contours and textures.
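For illustration only, the feature-map formula above (weighted sum of the previous layer's feature map under an N × M kernel, plus a bias, passed through the Sigmoid function) can be sketched in NumPy as follows; the kernel values and bias are arbitrary.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv_feature_map(u, w, b):
    # u: previous-layer feature map; w: N x M convolution kernel; b: bias.
    N, M = w.shape
    H, W = u.shape
    out = np.zeros((H - N + 1, W - M + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = sigmoid(np.sum(w * u[i:i + N, j:j + M]) + b)
    return out

# Example: a 3 x 3 kernel sliding over a 5 x 5 feature map gives a 3 x 3 output.
u = np.arange(25, dtype=float).reshape(5, 5) / 25.0
w = np.full((3, 3), 0.1)
print(conv_feature_map(u, w, b=0.0).shape)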
The number of convolution kernels determines the number of feature maps output by the convolutional layer; since the weight parameters in each kernel differ, different kernels extract different features. Increasing the number of kernels within a certain range can improve the recognition effect, but too many kernels greatly increase the number of training parameters and hence the difficulty of training the network. In practice the number of kernels is chosen according to the task and experience, generally following the principle that later convolutional layers have more kernels.
Step D6, add a pooling layer between the convolutional layers.
A pooling layer is often added between convolutional layers. The pooling layer neural network does not change the depth of the three-dimensional matrix, but it can reduce the size of the matrix. The pooling operation may be considered as converting a picture with a higher resolution to a picture with a lower resolution.
The forward propagation of the pooling layer is also accomplished by moving a filter-like structure. The computation in the pooling-layer filter is not a weighted sum of nodes but a simpler maximum or average operation. As with the convolutional-layer filter, the pooling-layer filter requires manual settings such as the filter size, whether to use all-zero padding and the stride of the filter movement, and these settings have the same meaning as for convolutional layers. The filters in the convolutional and pooling layers move in a similar way; the only difference is that the filter of a convolutional layer spans the entire depth, whereas the filter of a pooling layer affects nodes at only one depth.
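Since the method already uses the TensorFlow framework for image decoding, a convolution-plus-pooling stack of the kind described above could be assembled roughly as in the following Keras sketch; the input size, the number of kernels per layer and the number of output classes are illustrative assumptions rather than values from the filing.

import tensorflow as tf

def build_cnn(height=64, width=64, channels=3, num_classes=10):
    # Input layer: the original pixel matrix as a 3-D tensor (length x width x depth);
    # convolutional layers with several kernels; pooling layers inserted between them.
    inputs = tf.keras.Input(shape=(height, width, channels))
    x = tf.keras.layers.Conv2D(16, 3, activation='sigmoid', padding='same')(inputs)
    x = tf.keras.layers.MaxPooling2D(2)(x)
    x = tf.keras.layers.Conv2D(32, 3, activation='sigmoid', padding='same')(x)
    x = tf.keras.layers.MaxPooling2D(2)(x)
    x = tf.keras.layers.Flatten()(x)
    outputs = tf.keras.layers.Dense(num_classes, activation='softmax')(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model

model = build_cnn()
model.summary()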
Embodiments of the present invention use an unstructured data analysis model based on a convolutional neural network (CNN) for unstructured data such as images. The data is internally divided into a number of overlapping sub-regions and feature transformations are applied iteratively, so that the relative position structure of the data is exploited and the model's sensitivity to shifts and similar changes is reduced. For data with a two-dimensional topological structure such as images, each feature map corresponds to one type of feature; even if a particular feature is shifted or rotated in space, it can still be extracted efficiently by the convolution operation, so the representation is invariant to spatial change.
The convolutional neural network (CNN) is a popular deep network. Unlike traditional network structures, it contains special convolutional layers and downsampling layers: the convolutional layer is connected to the previous layer by local connections and weight sharing, which greatly reduces the number of parameters, while the downsampling layer greatly reduces the input dimensionality, lowering network complexity, giving the network better robustness and effectively preventing overfitting.
In general, the basic structure of a CNN includes two kinds of layers. One is the feature extraction layer: the input of each neuron is connected to a local receptive field of the previous layer and extracts the local features; once a local feature has been extracted, its positional relation to the other features is also determined. The other is the feature mapping layer: each computing layer of the network consists of multiple feature maps, each feature map is a plane, and all neurons on the plane share equal weights. The feature mapping structure uses the sigmoid function, whose influence-function kernel is small, as the activation function of the convolutional network, so that the feature maps have shift invariance. In addition, since the neurons on one mapping plane share weights, the number of free parameters of the network is reduced. Each convolutional layer in the convolutional neural network is followed by a computation layer for local averaging and secondary feature extraction, which reduces the feature resolution.
Optionally, the method further includes a text data analysis based on serialization, configured to analyze text and other types of data, and construct a serialized text convolution operation according to a one-dimensional sequence characteristic inside the data, including:
Step P1, establishing a text input layer and sequentially arranging the word vectors corresponding to the words in the sentence into a matrix.
The input layer is a matrix in which word vectors corresponding to words in a sentence are arranged in sequence (from top to bottom), and if the sentence has n words and the vector dimension is k, the matrix is n × k.
All word vectors are taken directly from the results of unsupervised learning, i.e. Google's Word2Vec tool, and are kept fixed.
The type of this matrix is static (static) and the word vector is fixed. For the vector of the unknown word, it can be filled with 0 or a random small positive number.
Step P2, establishing the text convolution layers, wherein the size of each text convolution kernel is filter_size multiplied by embedding_size.
filter_size represents the number of words covered by the text convolution kernel in the longitudinal direction, i.e. adjacent words are considered to have a word-order relationship, and embedding_size is the dimension of the word vectors.
After each text convolution kernel is applied, one column vector is obtained, representing the feature that this kernel extracts from the sentence; as many features can be extracted as there are text convolution kernels.
Step P3, establishing a text pooling layer and extracting the maximum value of each column vector obtained by the text convolution.
The pooling operation extracts the maximum value of each column vector resulting from the convolution; after pooling, an m-dimensional row vector is therefore obtained by concatenating the maxima of all text convolution kernels, where m represents the number of text convolution kernels.
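A rough NumPy sketch of steps P1 to P3 follows: an n × k word-vector matrix is convolved with filter_size × embedding_size kernels, each kernel yields one column vector, and max pooling keeps the largest value per kernel; the sentence length, embedding size and filter size used here are illustrative.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def text_cnn_features(sentence_matrix, kernels, bias=0.0):
    # sentence_matrix: n x k matrix of word vectors (n words, k = embedding_size).
    # kernels: list of filter_size x k kernels; each produces one column vector,
    # and max pooling keeps its largest value, giving an m-dimensional row vector.
    n, k = sentence_matrix.shape
    feats = []
    for w in kernels:
        fs = w.shape[0]  # filter_size: number of adjacent words covered
        col = np.array([sigmoid(np.sum(w * sentence_matrix[i:i + fs, :]) + bias)
                        for i in range(n - fs + 1)])
        feats.append(col.max())
    return np.array(feats)

# Example: a 7-word sentence with 8-dimensional word vectors and 3 kernels of height 2.
rng = np.random.default_rng(0)
sentence = rng.normal(size=(7, 8))
kernels = [rng.normal(size=(2, 8)) for _ in range(3)]
print(text_cnn_features(sentence, kernels))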
Semantic features of text are high-level feature expressions of text that represent more essential knowledge in the text. For text and other types of data, one-dimensional sequence characteristics inside the data are considered, and serialized text convolution operation is constructed, so that the model has stronger sequence change invariance.
The unstructured data involved in power services is diverse, including image data and text data as well as sensor data generated by various smart devices. These data reflect important information about the same event or service from different perspectives at the same time. Traditional deep learning can generally only process data of a single modality, which limits the model's ability to integrate multi-source information for big data analysis and prediction.
Further, the method comprises a heterogeneous stacked denoising automatic coding network, multi-mode-based node semantic modeling, heterogeneous feature fusion based on meta-path information propagation, node neighbor relation construction based on a heterogeneous information network, and deep scale learning based on the heterogeneous information network;
the heterogeneous stacked denoising automatic coding network is used for learning the coding mode of input data and measuring loss by comparing original input with reconstructed output;
the multi-mode-based node semantic modeling is used for converting node contents in different modes into the same feature space, and then performing unified semantic modeling on all types of nodes;
the heterogeneous feature fusion based on meta-path information propagation enables mutual learning through the sharing and fusion of model parameters, further fusing and improving the feature representation of the nodes;
the node neighbor relation construction based on the heterogeneous information network uses the structural semantics embodied by the heterogeneous network to construct clustering relations and establish neighbor relations;
the deep heterogeneous scale learning based on the heterogeneous information network is used for constructing a deep heterogeneous scale learning method based on data pair constraint.
The invention also comprises a heterogeneous stacked denoising automatic coding network based on multi-modal information sharing, which extends the traditional single-modality model into a multi-level heterogeneous deep learning framework, so that the data content of the various types of nodes and the implicit high-level semantic information in their associations can be mined more fully, with better accuracy and scalability.
Different types of node content have different internal structures: text is a one-dimensional sequence of words, an image is a two-dimensional spatial structure, and video is a three-dimensional structure that includes a time dimension. Different depth model structures, such as CNN, RNN and LSTM, are suitable for data of different structures. The invention provides effective multi-mode-based node semantic modeling: the semantic features of nodes in different modalities are learned simultaneously in one depth model, deep architectures of various structures are integrated, parameter sharing between different types of nodes in the network is realized, and the network parameters of all nodes are jointly optimized through fusion of the network layers of neighbor relations. The heterogeneous stacked denoising automatic coding network learns the coding mode of the input data, and loss is measured by comparing the original input with the reconstructed output.
An information network can be represented by a directed graph G = (V, E) together with a node type mapping function ψ: V → A and a relation type mapping function φ: E → R, wherein each node v ∈ V belongs to a node type ψ(v) ∈ A and each edge e ∈ E belongs to a relation type φ(e) ∈ R. When the number of node types is greater than 1, i.e. |A| > 1, the network is a heterogeneous information network; otherwise it is called a homogeneous information network. In an information network, object types and relation types are clearly distinguished, and the relations existing between different types of objects can be clearly described by the network model.
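As a small illustrative sketch of this definition (the node names, types and relations below are invented examples, and the dictionary representation is merely one possible encoding), an information network and its heterogeneity test can be written as:

# Node type mapping psi: V -> A and relation type mapping phi: E -> R.
node_type = {
    'user_17': 'user',
    'ticket_3': 'work_order',
    'img_9': 'image',
}
edge_type = {
    ('user_17', 'ticket_3'): 'submits',
    ('ticket_3', 'img_9'): 'contains',
}

def is_heterogeneous(node_type, edge_type):
    # Heterogeneous information network if more than one node type (|A| > 1)
    # or more than one relation type (|R| > 1) is present.
    return len(set(node_type.values())) > 1 or len(set(edge_type.values())) > 1

print(is_heterogeneous(node_type, edge_type))  # True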
(1) Heterogeneous stacked denoising automatic coding network
The heterogeneous stacked denoising automatic coding network learns the coding mode of input data, and loss is measured by comparing original input with reconstructed output.
Input: in order to model content information and heterogeneous relations, reconstruction-based heterogeneous probabilistic joint semantic modeling is proposed. An asymmetric weighted adjacency matrix is established based on the heterogeneous relations between nodes, together with the feature matrix obtained by the linear transformation in step one; then, by reconstructing the feature matrix and the adjacency matrix E, the semantic topic distribution of the nodes is obtained.
In order to incorporate more network structure information, on the basis of simultaneously modeling all nodes and relations in the network, optimization can be further performed based on meta-paths containing higher layer structure information.
All possible meta-path types existing in the network are counted, and a weighted relation matrix B_n is established for each meta-path. Then, for any two nodes v_i, v_j, the meta-path-based association relationship may be expressed as:

link(v_i, v_j) = {B_{1,ij}, ..., B_{n,ij}}

wherein B_{n,ij} denotes the weight of v_i, v_j on the nth meta-path.
Then, a depth model fusing the node content M is established for each meta-path relation matrix, and the feature representation of the nodes in each sub-network is learned based on the corresponding meta-path network structure. Finally, this is fused with the feature learning result on the global network, comprehensively improving the quality of node feature learning while taking structural information from multiple aspects into account.
(2) Node semantic modeling based on multiple modes
When multiple types of content nodes exist in the heterogeneous information network, the nodes cannot be modeled simultaneously because they have different modal feature representations. Therefore, the contents of nodes in different modalities are converted into the same feature space, and unified semantic modeling is then performed on all types of nodes.
The method comprises the following steps:
Step one: assume that the feature dimensions of the text nodes and image nodes are r_1 and r_2, respectively. To obtain a representation of both kinds of nodes in a unified r_3-dimensional shared feature space, linear transformation matrices Λ and Π are used to map a text node x and an image node z into the r_3-dimensional space. To reduce the information loss of text and images during the mapping, the representations of the nodes in the shared feature space are required to preserve, as far as possible, the adjacency relations of the original network. In the r_3-dimensional shared feature space, the similarities between content nodes of the same type and of different types can then be expressed in terms of the mapped representations; these similarities are determined jointly by Λ and Π.
Step two:
In order to measure the similarity of nodes in the original network, meta-path information between nodes is used to establish a decision function d(x_i, x_j) that converts the neighbor relation between nodes into a real number, so that d(x_i, x_j) represents the degree of proximity of nodes v_i and v_j in the original network. Finally, after nodes v_i and v_j are mapped into the r_3-dimensional shared feature space, the final reconstruction loss function and network structure loss function are constructed using ideas such as linear discriminant analysis and spectral clustering. Based on this model, the mapped heterogeneous content nodes can be modeled simultaneously in a probabilistic generative model, optimizing both the probability-reconstruction-based loss function and the nearest-neighbor-relation-based loss function of the mapping process.
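A hedged NumPy sketch of the mapping in step one follows: text features of dimension r1 and image features of dimension r2 are projected by the linear transformation matrices Λ and Π into a shared r3-dimensional space, and similarity is taken there as an inner product. The dimensions, the random initialization and the inner-product similarity are assumptions; the filing's exact loss functions are not reproduced.

import numpy as np

r1, r2, r3 = 300, 512, 128  # assumed text, image and shared feature dimensions
rng = np.random.default_rng(1)
Lam = rng.normal(scale=0.01, size=(r1, r3))  # linear transformation for text nodes
Pi = rng.normal(scale=0.01, size=(r2, r3))   # linear transformation for image nodes

def to_shared(text_x=None, image_z=None):
    # Map a text node x (r1-dim) or an image node z (r2-dim) into the shared space.
    if text_x is not None:
        return text_x @ Lam
    return image_z @ Pi

def similarity(a, b):
    # Similarity of two mapped nodes in the shared feature space (inner product).
    return float(a @ b)

x = rng.normal(size=r1)  # a text node
z = rng.normal(size=r2)  # an image node
print(similarity(to_shared(text_x=x), to_shared(image_z=z)))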
(3) Heterogeneous feature fusion based on meta-path information propagation
In the heterogeneous stacked denoising automatic coding network framework, heterogeneous content nodes learn from each other by sharing and fusing model parameters, while nodes that contain no content act as auxiliary information and are generated passively during the feature representation learning of the content nodes. In order to make fuller use of the structural relations between nodes, a meta-path-based feature fusion algorithm further fuses and improves the feature representation of the nodes.
The heterogeneous stacked denoising automatic coding network model outputs a feature representation matrix for all nodes, where r represents the dimension of the feature space. On this basis, a random-walk feature fusion algorithm is proposed that uses the meta-path adjacency matrices B_n between the nodes, wherein p(B_n, j, i) represents the feature propagation probability from class-j nodes to class-i nodes derived from the meta-path adjacency matrix B_n, and the fusion scale parameter α is used to control the feature fusion process in each iteration.
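The exact fusion formula is not reproduced here, so the following is only a generic sketch of random-walk-style feature propagation over meta-path adjacency matrices controlled by a fusion scale parameter α; the row normalization and the update rule are assumptions.

import numpy as np

def fuse_features(U, metapath_mats, alpha=0.3, iters=5):
    # U: n x r node feature matrix output by the autoencoder network.
    # metapath_mats: list of n x n weighted meta-path adjacency matrices B_n.
    U = U.astype(float).copy()
    for _ in range(iters):
        propagated = np.zeros_like(U)
        for B in metapath_mats:
            B = np.asarray(B, dtype=float)
            row_sums = B.sum(axis=1, keepdims=True)
            P = np.divide(B, row_sums, out=np.zeros_like(B), where=row_sums > 0)
            propagated += P @ U
        propagated /= max(len(metapath_mats), 1)
        # Mix each node's own features with features propagated from its
        # meta-path neighbours, weighted by the fusion scale parameter alpha.
        U = (1.0 - alpha) * U + alpha * propagated
    return U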
(4) Node neighbor relation construction based on heterogeneous information network
First, a heterogeneous structured clustering algorithm (H-SCAN) is proposed. For the structural information of a heterogeneous information network, the algorithm computes the structural similarity of each pair of nodes: the network structure similarity σ(v, w) of nodes v and w is calculated from the commonality of their reachable neighbors, i.e.

σ(v, w) = |Γ(v) ∩ Γ(w)| / sqrt(|Γ(v)| · |Γ(w)|)

where Γ(·) denotes the reachable neighbor set of a node. On this basis, all nodes are first initialized as unclustered; then all core nodes are traversed, their directly structure-reachable nodes are found and merged into a cluster, and a cluster label is assigned. Finally, all remaining unclustered nodes are traversed and divided into hub points or outliers according to the number of adjacent clusters: those adjacent to more clusters are hub nodes, and those adjacent to fewer are outliers.
The core idea of the algorithm is that the greater the neighbor commonality of two nodes, the higher their structural similarity. In computing the structural similarity, the invention further draws on the idea of RankClus and takes the information of heterogeneous nodes and the semantics of meta-path information into account, so that the structural similarity can also be computed for heterogeneous nodes.
The algorithm can effectively use the structural semantics embodied by the heterogeneous network to construct clustering relations and neighbor relations; it can also distinguish the different roles and importance of nodes such as hub points, core points and outliers, preparing for the subsequent construction of cost-sensitive loss functions.
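For illustration, a SCAN-style structural similarity based on neighbor commonality can be computed as below; treating σ(v, w) as the standard SCAN formula and using closed neighborhoods are assumptions, and the small adjacency matrix is an invented example.

import numpy as np

def structural_similarity(adj, v, w):
    # sigma(v, w) = |Gamma(v) & Gamma(w)| / sqrt(|Gamma(v)| * |Gamma(w)|),
    # where Gamma(x) is the closed neighborhood of node x (x plus its neighbors).
    gv = set(np.flatnonzero(adj[v])) | {v}
    gw = set(np.flatnonzero(adj[w])) | {w}
    return len(gv & gw) / np.sqrt(len(gv) * len(gw))

adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]])
print(structural_similarity(adj, 0, 1))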
(5) Deep scale learning model based on heterogeneous information network
First, a deep heterogeneous scale (metric) learning method based on data-pair constraints is constructed from the clustering result of the previous step. The data pairs may be manually labeled (same class as a positive example, different classes as a negative example) or may come from the clustering algorithm (same cluster as positive, different clusters as negative). Two deep neural networks sharing the same structure and parameters are established. Let <X1, X2> be a data pair; its two elements are fed into the two heterogeneous deep neural network models respectively and, by virtue of the shared parameters and architecture, are mapped into the same subspace. The final loss function requires the samples contained in a positive pair to be sufficiently similar and the samples contained in a negative pair to be sufficiently far apart. In this way the constraint loss on data pairs ensures the consistency between the semantic similarity of the original heterogeneous network structure and the similarity in the feature space.
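A hedged sketch of the data-pair constraint follows: both elements of a pair are embedded with shared parameters, and a contrastive-style loss pulls positive pairs together while pushing negative pairs beyond a margin. The single shared linear-plus-tanh mapping and the margin value are assumptions, not the filing's exact networks or loss.

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(64, 32))  # parameters shared by the two towers

def embed(x):
    # Both elements of a data pair are mapped with the same parameters W.
    return np.tanh(x @ W)

def pair_loss(x1, x2, is_positive, margin=1.0):
    d = np.linalg.norm(embed(x1) - embed(x2))
    if is_positive:                   # same class / same cluster
        return d ** 2
    return max(0.0, margin - d) ** 2  # different class / different cluster

x1, x2 = rng.normal(size=64), rng.normal(size=64)
print(pair_loss(x1, x2, is_positive=True), pair_loss(x1, x2, is_positive=False))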
The unstructured electric power big data analysis method based on deep learning is aimed at practical application and fine-grained management. It processes and analyzes electric power big data efficiently and with high precision, laying a good data foundation for subsequent data information processing, analysis and display. It has the advantages of flexibility, scalability, security and concurrent processing at low cost, saving resource costs and greatly improving the unstructured data processing capability of the data platform.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
It will be understood by those skilled in the art that the present invention includes any combination of the summary and detailed description of the invention described above and those illustrated in the accompanying drawings, which is not intended to be limited to the details and which, for the sake of brevity of this description, does not describe every aspect which may be formed by such combination. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made in the above embodiments by those of ordinary skill in the art without departing from the principle and spirit of the present invention. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (10)

1. The unstructured electric power big data analysis method based on deep learning is characterized in that electric power big data are analyzed and processed by adopting video data preprocessing, image data preprocessing and text data preprocessing;
the video data preprocessing is used for analyzing and processing the video data;
the image data preprocessing is used for analyzing and processing the image data;
the text data preprocessing adopts a natural language processing technology to preprocess the big electric power data;
the image preprocessing comprises image graying processing, geometric transformation processing, image enhancement processing and decoding processing;
the graying processing includes: adopting a maximum-value method, taking the maximum of the three component brightnesses in the color image as the gray value of the grayscale image;
the geometric transformation process includes: processing the acquired image by adopting a geometric transformation method and a gray interpolation algorithm, and correcting errors of an image acquisition system and random errors of instrument positions;
the image enhancement processing includes: enhancing an image by adopting a spatial domain method, wherein the spatial domain method comprises point operation and neighborhood denoising operation;
the image decoding process includes: after the image enhancement operation, performing image decoding, decoding the image with the TensorFlow framework and converting it into the original pixel matrix of the image.
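For illustration only, the TensorFlow decoding and maximum-value graying of claim 1 could be sketched as follows; the file name is hypothetical and applying the graying to the decoded matrix is an assumed ordering.

import numpy as np
import tensorflow as tf

def decode_to_pixel_matrix(path):
    # Decode an image file into its original pixel matrix with TensorFlow.
    raw = tf.io.read_file(path)
    return tf.io.decode_image(raw, channels=3).numpy()

def gray_by_maximum(rgb):
    # Maximum-value method: the gray value is the maximum of the R, G, B components.
    return rgb.max(axis=2).astype(np.uint8)

pixels = decode_to_pixel_matrix('example_meter_photo.jpg')  # hypothetical file
gray = gray_by_maximum(pixels)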
2. The unstructured electric power big data analysis method based on deep learning of claim 1, further comprising power supply area division, wherein the power supply area division specifically includes:
step H1, counting each cell in the power supply area;
step H2, randomly initializing the power supply areas: each cell is randomly assigned to one of K clusters C_1, C_2, ..., C_K, wherein K represents the set number of power supply areas;
step H3, the cluster center of each power supply area is calculated using the following formula:

μ_C = (1 / |C|) · Σ_{i∈C} x̄_i

where |C| represents the number of cells included in the power supply area C and x̄_i represents the overall average daily electricity consumption record of the ith cell over t days;
step H4, calculating the cluster center similarity between the cell and the power supply area;
step H5, according to the similarity calculated in step H4, assigning each cell to the power supply area whose cluster center is most similar to it;
step H6, determining whether the power supply area assignment of each cell has converged; if so, outputting the power supply area division result of each cell, otherwise returning to step H3.
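A minimal NumPy sketch of steps H1 to H6 follows (random initialization, mean cluster centers, similarity-based reassignment, convergence check); using cosine similarity as the cluster-center similarity of step H4 is an assumption.

import numpy as np

def divide_power_supply_areas(cell_loads, K, iters=100, seed=0):
    # cell_loads: n x t matrix, one average daily consumption record per cell (step H1).
    rng = np.random.default_rng(seed)
    assign = rng.integers(0, K, size=len(cell_loads))           # step H2
    for _ in range(iters):
        centers = np.vstack([
            cell_loads[assign == k].mean(axis=0) if np.any(assign == k)
            else cell_loads[rng.integers(len(cell_loads))]
            for k in range(K)])                                  # step H3
        sims = (cell_loads @ centers.T) / (
            np.linalg.norm(cell_loads, axis=1, keepdims=True)
            * np.linalg.norm(centers, axis=1) + 1e-12)           # step H4
        new_assign = sims.argmax(axis=1)                         # step H5
        if np.array_equal(new_assign, assign):                   # step H6: converged
            break
        assign = new_assign
    return assign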
3. The unstructured electric big data analysis method based on deep learning as claimed in claim 1, further comprising electric prediction model construction, specifically:
step E1, setting a power supply area C_P and assuming that the electricity usage of a user U_i is linearly related over D+1 consecutive days, so that the usage y_i is predicted by a linear combination of the different elements of the usage vector x_i, wherein w_p is the linear combination parameter of the area C_P and b_i is the error variable, a random error term that varies for each user U_i; writing y_i, x_i, b_i and w_p in matrix form, the linear combination model of the power consumption of user U_i is:

y_i = x_i·w_p + b_i
step E2, training a separate model for each power supply area: the electricity consumption vectors x_i of all users in power supply area C_P are combined into a matrix X_p, the power consumption values to be predicted are combined into a vector Y_p, and all error terms b_i are combined into a vector B_p; all the linear models are then combined as:

Y_p = X_p·w_p + B_p

and the parameter w_p is estimated from the matrix X_p and the vector Y_p of C_P;
step E3, applying the least squares method to C_P: assuming the error B_p has finite variance and zero mean, i.e. E[B_p] = 0, the least squares solution of w_p is:

ŵ_p = (X_p^T · X_p)^(-1) · X_p^T · Y_p

and the error of the prediction model is:

B̂_p = Y_p - X_p·ŵ_p;
step E4, modeling each power supply area C_P with its own linear regression model and solving it using the least squares solution ŵ_p above, so as to obtain a user power consumption prediction model for each power supply area.
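A NumPy sketch of the per-area least-squares fit of steps E1 to E4; the synthetic data below are invented purely for illustration.

import numpy as np

def fit_area_model(X_p, Y_p):
    # Least squares solution w_p = (X_p^T X_p)^(-1) X_p^T Y_p for one power supply area.
    w_p, *_ = np.linalg.lstsq(X_p, Y_p, rcond=None)
    residual = Y_p - X_p @ w_p  # error of the prediction model
    return w_p, residual

rng = np.random.default_rng(0)
D, users = 7, 40                   # 7 previous days, 40 users in the area (assumed)
X_p = rng.normal(size=(users, D))  # each row: one user's usage over D days
true_w = rng.normal(size=D)
Y_p = X_p @ true_w + 0.1 * rng.normal(size=users)
w_p, B_p = fit_area_model(X_p, Y_p)
print(np.round(w_p - true_w, 2))   # estimated parameters are close to the true ones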
4. The method for analyzing unstructured electric power big data based on deep learning as claimed in claim 3, further comprising multi-task joint learning, wherein an iterative joint learning algorithm is adopted and, in each iteration, the prediction models of different power supply areas are optimized simultaneously through the sharing of user data, so as to improve model performance across the different power supply areas, specifically:
step F1: constructing a reference power consumption prediction model for each power supply area: in power supply area C_P, a linear prediction model is constructed from the matrix X_p and the vector Y_p, the parameter w_p and error B_p are solved by the least squares algorithm as the reference model of area C_P, and the overall regional electricity consumption behavior similarity matrix S is set;
step F2: performing data fusion between areas according to the overall electricity consumption behavior similarity matrix S: for every other power supply area C_q, according to the overall electricity demand similarity S_pq between C_P and C_q, i.e. the entry in the pth row and qth column of S, user data are randomly drawn from C_q with probability S_pq and fused with the user data in C_P to obtain X_{p∪q} and Y_{p∪q};
step F3: using the least squares algorithm, solving for the model parameter W_{p∪q} and prediction parameter B_{p∪q} that minimize the joint learning loss function, according to X_{p∪q} and Y_{p∪q};
Step F4: judging whether the models on all the areas are updated or not: if the models in all the regions are updated, performing step F5, otherwise, returning to step F2;
step F5: judging whether the least square algorithm converges: if convergence is reached, i.e., the models in all the regions are not updated, the result is output, otherwise, the process returns to step F2.
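A hedged sketch of one joint-learning iteration covering steps F2 and F3: user rows from a similar area C_q are drawn with probability S_pq, pooled with the users of C_P, and the linear model is refit by least squares; the Bernoulli sampling of individual users is an assumed reading of the probabilistic fusion.

import numpy as np

def joint_learning_step(X_p, Y_p, X_q, Y_q, s_pq, seed=0):
    # Step F2: randomly borrow users from area C_q with probability s_pq.
    rng = np.random.default_rng(seed)
    mask = rng.random(len(X_q)) < s_pq
    X_fused = np.vstack([X_p, X_q[mask]])
    Y_fused = np.concatenate([Y_p, Y_q[mask]])
    # Step F3: refit the model on the fused data by least squares.
    w, *_ = np.linalg.lstsq(X_fused, Y_fused, rcond=None)
    return w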
5. The method as claimed in claim 1, wherein the video data preprocessing includes shot segmentation and key frame extraction, and the shot segmentation includes: dividing the gray scale, brightness or color of each pixel of adjacent frames into N levels by a histogram-based method, and counting the number of pixels for each level to make a histogram comparison; the key frame extraction comprises: and classifying the images in the image library by adopting a K-means clustering algorithm.
6. The unstructured electric big data analysis method based on deep learning of claim 1, wherein the text data preprocessing comprises:
step Q1, performing word segmentation operation on the text data by adopting a Chinese word segmentation tool;
step Q2, removing stop words from the segmented data using a stop-word dictionary;
step Q3, converting the data after stop-word removal into a structured data form that the computer can recognize and analyze, using the Word2Vec toolkit.
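An illustrative sketch of steps Q1 to Q3 using jieba for Chinese word segmentation and gensim's Word2Vec for vectorization; both library choices, the tiny corpus and the three-entry stop-word set are assumptions made only for this example.

import jieba
from gensim.models import Word2Vec

stop_words = {'的', '了', '和'}  # placeholder stop-word dictionary

def preprocess(texts):
    # Q1: segment each text; Q2: drop stop words and whitespace tokens.
    return [[w for w in jieba.cut(t) if w.strip() and w not in stop_words]
            for t in texts]

corpus = preprocess(['变压器温度异常告警', '线路巡检记录正常'])  # example work-order texts
# Q3: train word vectors so that the text becomes structured numeric data.
model = Word2Vec(sentences=corpus, vector_size=50, window=2, min_count=1)
vector = model.wv[corpus[0][0]]  # 50-dimensional vector of the first word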
7. The unstructured electric big data analysis method based on deep learning as claimed in claim 1, further comprising video feature extraction, specifically:
step K1, acquiring the original image and extracting the color features in the original image using a color histogram;
step K2, performing smoothing filtering on the original image, calculating the pixel gradients, selecting the points with larger gradient change rates as image edge points, and extracting the edge features of the original image;
step K3, extracting the texture features of the original image by Gabor filtering;
step K4, dividing the image into pixel blocks and calling a matching algorithm to estimate a motion vector for each block, so as to express the motion features of the video.
8. The unstructured electric big data analysis method based on deep learning of claim 1, further comprising image analysis based on convolutional neural network, for image analysis, specifically:
step D1, acquiring an original pixel matrix of the image and representing the original pixel matrix as a three-dimensional matrix, wherein the length and the width of the three-dimensional matrix represent the size of the image, the depth represents the color channel of the image, the depth of the black-and-white picture is 1, and the depth of the image is 3 in an RGB color mode;
step D2, extracting various features of the image using a plurality of convolution kernels: the input image is convolved with a trainable filter and a bias is then added, and the convolutional layer is obtained after the feature extraction is finished;
step D3, performing a maximum or average operation on adjacent areas in the convolutional layer, adding the corresponding weights and biases, obtaining the output through an activation function, and outputting the feature map of the sampling layer;
step D4, extracting the local features of the image by connecting each convolution kernel with the local pixel points of the previous layer's feature map, and then convolving the convolution kernels with the whole previous-layer feature map to obtain the convolution result of the global features of the image;
step D5, adding the bias parameters to the convolution result and calculating the feature map of the convolutional layer through the activation function, with the specific operation as follows:
y(i, j) = f( Σ_{n=1}^{N} Σ_{m=1}^{M} w_{n,m} · u(i+n-1, j+m-1) + b )

wherein f represents the Sigmoid activation function, b represents the bias, w_{n,m} represents the weight at position (n, m) of the convolution kernel, N represents the length of the convolution kernel, M represents the width of the convolution kernel, and u represents the feature map output by the previous layer;
step D6, add a pooling layer between the convolutional layers.
9. The method for analyzing the unstructured electric big data based on the deep learning as set forth in claim 1, further comprising a text data analysis based on serialization for analyzing texts and other types of data, and constructing a text convolution operation based on one-dimensional sequence characteristics inside the data, specifically:
step P1, establishing a text input layer, and sequentially arranging word vectors corresponding to words in the sentence into a matrix;
step P2, establishing the text convolution layers, wherein the size of each text convolution kernel is filter_size multiplied by embedding_size;
filter_size represents the number of words covered by the text convolution kernel in the longitudinal direction, and embedding_size is the dimension of the word vectors;
and step P3, establishing a text pooling layer, and extracting the maximum value of the column vectors obtained by text convolution.
10. The unstructured electric big data analysis method based on deep learning of claim 1, further comprising heterogeneous stacked denoising automatic coding network, node semantic modeling based on multi-mode, heterogeneous feature fusion based on meta-path information propagation, node neighbor relation construction based on heterogeneous information network and deep scale learning based on heterogeneous information network;
the heterogeneous stacked denoising automatic coding network is used for learning the coding mode of input data and measuring loss by comparing original input with reconstructed output;
the multi-mode-based node semantic modeling is used for converting node contents in different modes into the same feature space and then carrying out uniform semantic modeling on all types of nodes;
the heterogeneous feature fusion based on meta-path information propagation enables mutual learning through the sharing and fusion of model parameters, further fusing and improving the feature representation of the nodes;
the heterogeneous information network-based node neighbor relation construction utilizes structural semantics embodied by a heterogeneous network to construct a clustering relation and establish a neighbor relation;
the deep heterogeneous scale learning based on the heterogeneous information network is used for constructing a deep heterogeneous scale learning method based on data pair constraint.
CN202210301556.9A 2022-03-24 2022-03-24 Unstructured electric power big data analysis method based on deep learning Pending CN114723583A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210301556.9A CN114723583A (en) 2022-03-24 2022-03-24 Unstructured electric power big data analysis method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210301556.9A CN114723583A (en) 2022-03-24 2022-03-24 Unstructured electric power big data analysis method based on deep learning

Publications (1)

Publication Number Publication Date
CN114723583A true CN114723583A (en) 2022-07-08

Family

ID=82240405

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210301556.9A Pending CN114723583A (en) 2022-03-24 2022-03-24 Unstructured electric power big data analysis method based on deep learning

Country Status (1)

Country Link
CN (1) CN114723583A (en)


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115049169A (en) * 2022-08-16 2022-09-13 国网湖北省电力有限公司信息通信公司 Regional power consumption prediction method, system and medium based on combination of frequency domain and spatial domain
CN115049169B (en) * 2022-08-16 2022-10-28 国网湖北省电力有限公司信息通信公司 Regional power consumption prediction method, system and medium based on combination of frequency domain and spatial domain
CN116432064A (en) * 2023-03-06 2023-07-14 北京车讯互联网股份有限公司 Data preprocessing system and method
CN116432064B (en) * 2023-03-06 2023-10-27 北京车讯互联网股份有限公司 Data preprocessing system and method
CN116842459A (en) * 2023-09-01 2023-10-03 国网信息通信产业集团有限公司 Electric energy metering fault diagnosis method and diagnosis terminal based on small sample learning
CN116842459B (en) * 2023-09-01 2023-11-21 国网信息通信产业集团有限公司 Electric energy metering fault diagnosis method and diagnosis terminal based on small sample learning
CN117555454A (en) * 2024-01-10 2024-02-13 深圳市极客智能科技有限公司 Data analysis method and system for realizing terminal AI display screen based on multiple modes
CN117555454B (en) * 2024-01-10 2024-04-16 深圳市极客智能科技有限公司 Data analysis method and system for realizing terminal AI display screen based on multiple modes


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination