CN116168352A - Power grid obstacle recognition processing method and system based on image processing - Google Patents


Info

Publication number
CN116168352A
CN116168352A
Authority
CN
China
Prior art keywords
image
sequence
training
classifier
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310461859.1A
Other languages
Chinese (zh)
Other versions
CN116168352B (en)
Inventor
李佩剑
邓清凤
伍强
黄渠洪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Ruitong Technology Co ltd
Original Assignee
Chengdu Ruitong Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Ruitong Technology Co ltd filed Critical Chengdu Ruitong Technology Co ltd
Priority to CN202310461859.1A priority Critical patent/CN116168352B/en
Publication of CN116168352A publication Critical patent/CN116168352A/en
Application granted granted Critical
Publication of CN116168352B publication Critical patent/CN116168352B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/467Encoded features or binary features, e.g. local binary patterns [LBP]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04SSYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

A method and a system for power grid obstacle recognition and processing based on image processing are provided. A monitoring image collected by a camera deployed on a power line tower is acquired and subjected to image blocking processing to obtain a sequence of image blocks. Each image block is passed through an automatic-codec-based image resolution enhancer to obtain a sequence of enhanced image blocks; each enhanced image block is passed through a convolutional neural network model serving as a filter to obtain a sequence of image local semantic feature vectors; the sequence of image local semantic feature vectors is passed through a transformer-based ViT model to obtain an image global context semantic understanding feature vector; and the feature vector is passed through a classifier to obtain a classification result indicating whether birds are contained in the monitoring range. When birds are detected, compressed air is controlled to expel them, ensuring the stability of power transmission.

Description

Power grid obstacle recognition processing method and system based on image processing
Technical Field
The present disclosure relates to the field of image recognition technologies, and in particular, to a method and a system for recognizing and processing a power grid obstacle based on image processing.
Background
In modern society the use of electric power is ubiquitous, and power transmission is of vital significance to the normal operation of the national economy and to people's daily life. Power line towers are important facilities in power transmission; however, many line towers stand in open country, and the junction of the tower body and the cross arm forms a fork-like structure. Birds are therefore easily attracted to nest and perch at the junction of the cross arm and the tower body, which affects the stability of power transmission.
When bird-repelling means are deployed, accurate and effective identification and detection of birds is the key to guaranteeing the repelling effect.
Chinese patent application No. 202110405605.9 discloses a method for identifying images of bird species associated with bird-related faults on power transmission lines. First, information on bird species around the transmission line is collected, an image database of fault-related bird species is established, and background-removal preprocessing is applied to the bird images based on a class activation map method. Then, learning models are built from four deep convolutional neural networks and pre-trained on the ImageNet dataset; the pre-trained network structures are fine-tuned, the fine-tuned models are retrained on the preprocessed training set of bird images, and the test set is classified and identified. Finally, an ensemble bird-species image recognition model integrating the four convolutional networks is built by linear weighting according to the classification accuracy of each network, and the bird images are classified and identified. The method gives transmission-line operation and maintenance personnel a means of correctly identifying birds, supports differentiated control of bird-related faults, and reduces the bird-fault tripping rate.
Further, Chinese patent application No. 201910775414.4 discloses a method, apparatus, device and storage medium for bird image recognition. The method comprises: acquiring an image to be identified that contains a bird target; performing local area positioning on the image with a preset positioning algorithm to obtain the region where the bird target is located; extracting features of that region with a multi-part feature extraction model to obtain a plurality of part features of the bird target; identifying each part feature against verification part features with a classifier, the verification part features corresponding one-to-one with the part features, to obtain a similarity score for each part feature; and computing the recognition result for the bird target in the image from all similarity scores. This addresses the technical problem that conventional image recognition methods have low recognition efficiency.
However, the above identification and bird-repelling techniques have the following drawbacks. Birds are small-sized objects in actual detection, so the traditional approach of relying on manual identification is error-prone, the precision of bird identification and detection is low, and the repelling effect suffers. Conventional methods are also slow, and considerable effort is required to perform bird identification and detection at the various positions of a power line tower.
Therefore, an optimized image processing-based power grid obstacle recognition processing scheme is desired.
Disclosure of Invention
The present application has been made to solve the above technical problems. Embodiments of the application provide a method and a system for power grid obstacle recognition based on image processing. A monitoring image collected by a camera deployed on a power line tower is acquired, and artificial intelligence technology based on deep learning is used to mine the implicit feature information about birds in the monitoring image, so that birds are identified and detected; when birds are detected, compressed air is controlled to expel them, ensuring the stability of power transmission.
In a first aspect, there is provided an image processing-based power grid obstacle recognition processing method, including:
acquiring a monitoring image acquired by a camera deployed on a power line tower;
performing image blocking processing on the monitoring image to obtain a sequence of image blocks;
passing each image block in the sequence of image blocks through an automatic codec-based image resolution enhancer, respectively, to obtain a sequence of enhanced image blocks;
passing each enhanced image block in the sequence of enhanced image blocks through a convolutional neural network model serving as a filter to obtain a sequence of image local semantic feature vectors;
passing the sequence of image local semantic feature vectors through a transformer-based ViT model to obtain an image global context semantic understanding feature vector;
passing the image global context semantic understanding feature vector through a classifier to obtain a classification result, the classification result indicating whether birds are contained in a monitoring range; and
controlling compressed air to expel birds in response to the classification result indicating that birds are contained in the monitoring range.
In the above image processing-based power grid obstacle recognition processing method, passing each image block in the sequence of image blocks through the automatic-codec-based image resolution enhancer to obtain the sequence of enhanced image blocks includes: performing explicit spatial encoding on each image block with a convolutional layer, by the image resolution encoder of the automatic codec, to obtain each image feature; and performing deconvolution processing on each image feature with a deconvolution layer, by the image resolution decoder of the automatic codec, to obtain the sequence of enhanced image blocks.
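The encoder/decoder pair described above can be sketched at the level of its two core operations: a strided convolution for encoding and a transposed (de)convolution for decoding. The single channel, the 2×2 averaging kernel, and the stride of 2 are illustrative assumptions not fixed by the application.

```python
import numpy as np

def conv2d(x, k, stride=1):
    """Valid 2-D convolution (single channel) with the given stride."""
    kh, kw = k.shape
    oh = (x.shape[0] - kh) // stride + 1
    ow = (x.shape[1] - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i*stride:i*stride+kh, j*stride:j*stride+kw] * k)
    return out

def deconv2d(x, k, stride=2):
    """Transposed convolution: zero-insertion upsampling followed by a
    padded convolution -- the standard decoder upsampling step."""
    h, w = x.shape
    up = np.zeros(((h - 1) * stride + 1, (w - 1) * stride + 1))
    up[::stride, ::stride] = x
    p = k.shape[0] - 1
    return conv2d(np.pad(up, p), k, stride=1)

block = np.ones((32, 32))                   # toy image block
kernel = np.full((2, 2), 0.25)              # averaging kernel (assumption)
code = conv2d(block, kernel, stride=2)      # encoder: strided convolution, 16x16
restored = deconv2d(code, np.ones((2, 2)))  # decoder: back to 32x32
```

With a 2×2 kernel and stride 2, the transposed convolution output size is (16 − 1) · 2 + 2 = 32, restoring the spatial resolution of the input block.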
In the above method for identifying and processing a grid obstacle based on image processing, passing each enhanced image block in the sequence of enhanced image blocks through the convolutional neural network model serving as a filter to obtain the sequence of image local semantic feature vectors includes performing, in each layer of the model during its forward pass: convolving the input data to obtain a convolution feature map; applying feature-matrix-based mean pooling to the convolution feature map to obtain a pooled feature map; and applying a nonlinear activation to the pooled feature map to obtain an activation feature map. The output of the last layer of the model is the sequence of image local semantic feature vectors, and the input to the first layer is each enhanced image block in the sequence of enhanced image blocks.
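The per-layer forward pass described above (convolution, mean pooling, nonlinear activation) can be sketched for a single channel as follows; the 2×2 pooling window and ReLU as the unspecified nonlinear activation are assumptions.

```python
import numpy as np

def conv(x, k):
    """Valid 2-D convolution of a single-channel block with kernel k."""
    kh, kw = k.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i+kh, j:j+kw] * k)
    return out

def cnn_layer(x, k, pool=2):
    f = conv(x, k)                           # convolution feature map
    h = f.shape[0] // pool * pool
    w = f.shape[1] // pool * pool
    pooled = f[:h, :w].reshape(h // pool, pool, w // pool, pool).mean(axis=(1, 3))
    return np.maximum(pooled, 0.0)           # nonlinear activation (ReLU, assumed)

out = cnn_layer(np.ones((9, 9)), np.ones((2, 2)))  # toy enhanced image block
```

Stacking several such layers and flattening the final activation map yields the local semantic feature vector for one enhanced image block.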
In the above method for identifying and processing a grid obstacle based on image processing, passing the sequence of image local semantic feature vectors through the transformer-based ViT model to obtain the image global context semantic understanding feature vector includes: arranging the sequence of image local semantic feature vectors one-dimensionally to obtain a global image semantic feature vector; computing the product between the global image semantic feature vector and the transpose of each image local semantic feature vector in the sequence to obtain a plurality of self-attention association matrices; normalizing each of the self-attention association matrices to obtain a plurality of normalized self-attention association matrices; passing each normalized self-attention association matrix through a Softmax classification function to obtain a plurality of probability values; and weighting each image local semantic feature vector by the corresponding probability value to obtain the image global context semantic understanding feature vector.
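A literal, simplified reading of the five listed steps might look as follows. The use of Frobenius-norm scaling for the normalization step and the reduction of each normalized matrix to a scalar softmax logit by summation are assumptions, since the application does not pin down either choice.

```python
import numpy as np

def fuse_local_features(locals_):
    """locals_: list of d-dimensional image local semantic feature vectors."""
    g = np.concatenate(locals_)                        # one-dimensional arrangement
    mats = [np.outer(g, v) for v in locals_]           # self-attention association matrices
    norm = [m / (np.linalg.norm(m) + 1e-12) for m in mats]  # normalisation (assumed)
    logits = np.array([m.sum() for m in norm])         # scalar summary per matrix (assumed)
    e = np.exp(logits - logits.max())
    p = e / e.sum()                                    # softmax -> probability values
    fused = sum(w * v for w, v in zip(p, locals_))     # probability-weighted local vectors
    return fused, p

out, p = fuse_local_features([np.array([1.0, 0.0]), np.array([0.0, 1.0])])
```

The probability values sum to one, so the fused vector is a convex combination of the local semantic feature vectors, mirroring the attention-weighted aggregation the text describes.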
In the above method for identifying and processing a grid obstacle based on image processing, passing the image global context semantic understanding feature vector through the classifier to obtain the classification result, the classification result indicating whether birds are contained in the monitoring range, includes: performing fully connected encoding of the image global context semantic understanding feature vector with a plurality of fully connected layers of the classifier to obtain an encoded classification feature vector; and passing the encoded classification feature vector through the Softmax classification function of the classifier to obtain the classification result.
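The two-stage classifier described above (fully connected encoding followed by Softmax) can be sketched as follows; the layer sizes, the ReLU between fully connected layers, and the toy weights are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def classify(x, layers):
    """layers: list of (W, b) fully connected layers; the final layer
    produces two logits (birds present / no birds)."""
    for W, b in layers[:-1]:
        x = np.maximum(W @ x + b, 0.0)   # fully connected encoding + ReLU (assumed)
    W, b = layers[-1]
    return softmax(W @ x + b)            # class probabilities

layers = [(np.eye(4), np.zeros(4)),                      # toy encoding layer
          (np.array([[1.0, 0.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0, 0.0]]), np.zeros(2))]  # toy output layer
probs = classify(np.array([2.0, 0.0, 0.0, 0.0]), layers)
```

The two output probabilities sum to one; the index of the larger one is the classification result.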
The image processing-based power grid obstacle recognition processing method further includes training the convolutional neural network model serving as a filter, the transformer-based ViT model and the classifier. The training comprises: acquiring training data, the training data including training monitoring images and the true value of whether birds are contained in the monitoring range; performing image blocking processing on each training monitoring image to obtain a sequence of training image blocks; passing each training image block through the automatic-codec-based image resolution enhancer to obtain a sequence of training enhanced image blocks; passing each training enhanced image block through the convolutional neural network model serving as a filter to obtain a sequence of training image local semantic feature vectors; passing that sequence through the transformer-based ViT model to obtain a training image global context semantic understanding feature vector; passing the training image global context semantic understanding feature vector through the classifier to obtain a classification loss function value; and training the three models by back-propagation of gradient descent based on the classification loss function value, wherein in each iteration of the training a feature affinity spatial affine learning iteration is performed on the weight matrix of the classifier.
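The training procedure can be illustrated, in heavily reduced form, by gradient descent on the cross-entropy loss of a softmax classifier over precomputed feature vectors; the upstream enhancer, CNN and ViT stages are omitted, and the learning rate, epoch count and toy data are assumptions.

```python
import numpy as np

def train_classifier(X, y, epochs=200, lr=0.5):
    """Softmax-regression stand-in for the pipeline's classifier, trained
    by per-sample gradient descent on the cross-entropy loss."""
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.01, size=(2, X.shape[1]))
    b = np.zeros(2)
    for _ in range(epochs):
        for x, t in zip(X, y):
            z = W @ x + b
            p = np.exp(z - z.max()); p /= p.sum()
            g = p.copy(); g[t] -= 1.0          # gradient of cross-entropy w.r.t. logits
            W -= lr * np.outer(g, x)
            b -= lr * g
    return W, b

X = np.array([[1.0, 0.0], [0.0, 1.0]])         # toy feature vectors
y = [0, 1]                                     # true values (birds / no birds)
W, b = train_classifier(X, y)
pred = [int(np.argmax(W @ x + b)) for x in X]
```

In the full method, the same loss gradient would be propagated further back through the ViT, CNN and enhancer stages rather than stopping at the classifier weights.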
In the image processing-based power grid obstacle recognition processing method, passing the training image global context semantic understanding feature vector through the classifier to obtain a classification loss function value includes: processing, by the classifier, the training image global context semantic understanding feature vector with a classification formula to generate a training classification result, wherein the classification formula is:

$O = \mathrm{softmax}\{(W_n, B_n) : \cdots : (W_1, B_1) \mid x\}$

wherein $x$ represents the training image global context semantic understanding feature vector, $W_1$ to $W_n$ are the weight matrices of the fully connected layers, and $B_1$ to $B_n$ are the corresponding bias matrices; and calculating the cross-entropy value between the training classification result and the true value as the classification loss function value.
In the above method for identifying and processing the grid obstacle based on image processing, in each iteration of the training, a feature affinity spatial affine learning iteration is performed on the weight matrix of the classifier according to the following optimization formula:

$M' = \dfrac{\|M\|_2}{\|M\|_*}\, M \odot \exp\!\left(\dfrac{M M^\top M}{\log_2 d}\right)$

wherein $M$ represents the weight matrix of the classifier, $M^\top$ the transpose of the weight matrix of the classifier, $\|M\|_2$ the two-norm of the weight matrix of the classifier, $\|M\|_*$ the kernel norm of the weight matrix of the classifier, $d$ the scale of the weight matrix of the classifier, $\log_2$ the logarithmic function with base 2, $\exp(\cdot)$ the exponential operation of a matrix, which computes the natural exponential of the value at each position in the matrix, $\odot$ position-wise multiplication, and $M'$ the weight matrix of the classifier after the iteration.
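One dimensionally consistent reading of this update can be sketched as follows. The exact arrangement of the terms is a reconstruction from the listed symbols, not a verbatim transcription of the original formula, so treat it as an assumption.

```python
import numpy as np

def affine_iteration(M):
    """One plausible reading of the feature-affinity spatial affine update
    applied to the classifier weight matrix in each training iteration."""
    d = M.shape[1]                               # scale of the weight matrix
    two_norm = np.linalg.norm(M, 2)              # two-norm (largest singular value)
    nuclear = np.linalg.norm(M, 'nuc')           # kernel (nuclear) norm
    core = (M @ M.T @ M) / np.log2(d)            # affinity term, same shape as M
    return (two_norm / nuclear) * M * np.exp(core)  # '*' is position-wise here

M_new = affine_iteration(np.eye(2))              # toy 2x2 weight matrix
```

Since the two-norm/kernel-norm ratio and the position-wise exponential both preserve the matrix shape, the update can be applied in place to the classifier weights between gradient steps.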
In a second aspect, there is provided an image processing-based power grid obstacle recognition processing system, including:
the image acquisition module is used for acquiring a monitoring image acquired by a camera arranged on the power line tower;
the image blocking processing module is used for carrying out image blocking processing on the monitoring image to obtain a sequence of image blocks;
the automatic encoding and decoding module is used for passing each image block in the sequence of image blocks through an automatic-codec-based image resolution enhancer to obtain a sequence of enhanced image blocks;
the feature extraction module is used for passing each enhanced image block in the sequence of enhanced image blocks through a convolutional neural network model serving as a filter to obtain a sequence of image local semantic feature vectors;
the global coding module is used for passing the sequence of image local semantic feature vectors through a transformer-based ViT model to obtain an image global context semantic understanding feature vector;
the monitoring result generation module is used for passing the image global context semantic understanding feature vector through a classifier to obtain a classification result, the classification result indicating whether birds are contained in a monitoring range; and
the control module is used for controlling compressed air to expel birds in response to the classification result indicating that birds are contained in the monitoring range.
In the above system for identifying and processing a power grid obstacle based on image processing, the automatic encoding and decoding module includes: an encoding unit, configured to perform explicit spatial encoding on each image block in the sequence of image blocks with a convolutional layer, by the image resolution encoder of the automatic codec, to obtain each image feature; and a decoding unit, configured to perform deconvolution processing on each image feature with a deconvolution layer, by the image resolution decoder of the automatic codec, to obtain the sequence of enhanced image blocks.
Compared with the prior art, the method and system for power grid obstacle recognition based on image processing acquire a monitoring image collected by a camera deployed on a power line tower and use artificial intelligence technology based on deep learning to mine the implicit feature information about birds in the monitoring image, identifying and detecting birds and improving the accuracy of bird image recognition; when birds are detected, compressed air is controlled to expel them, ensuring the stability of power transmission.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and other drawings may be obtained from them by a person skilled in the art without inventive effort.
Fig. 1 is a schematic view of a scenario of a power grid obstacle recognition processing method based on image processing according to an embodiment of the application.
Fig. 2 is a flowchart of a method for identifying and processing a grid obstacle based on image processing according to an embodiment of the application.
Fig. 3 is a schematic architecture diagram of an image processing-based power grid obstacle recognition processing method according to an embodiment of the present application.
Fig. 4 is a flowchart of the sub-steps of step 130 in the image processing-based grid obstacle recognition processing method according to an embodiment of the present application.
Fig. 5 is a flowchart of the sub-steps of step 150 in the image processing-based grid obstacle recognition processing method according to an embodiment of the present application.
Fig. 6 is a flowchart of the sub-steps of step 160 in the image processing-based grid obstacle recognition processing method according to an embodiment of the present application.
Fig. 7 is a flowchart of the sub-steps of step 180 in the image processing-based grid obstacle recognition processing method according to an embodiment of the present application.
Fig. 8 is a block diagram of an image processing-based grid obstacle recognition processing system according to an embodiment of the present application.
Description of the embodiments
The technical solutions in the embodiments of the present application are described below with reference to the drawings. Obviously, the described embodiments are some, but not all, embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art from the present disclosure without undue burden fall within the scope of protection of the present disclosure.
Unless defined otherwise, all technical and scientific terms used in the examples of this application have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in the present application is for the purpose of describing particular embodiments only and is not intended to limit the scope of the present application.
In the description of the embodiments of the present application, unless otherwise indicated and defined, the term "connected" should be construed broadly: it may be an electrical connection, communication between two elements, a direct connection, or an indirect connection via an intermediary. A person skilled in the art will understand the specific meaning of the term according to the specific circumstances.
It should be noted that the terms "first", "second" and "third" in the embodiments of the present application merely distinguish similar objects and do not imply a specific order; where permitted, the objects so distinguished may be interchanged, so that the embodiments described herein can be implemented in sequences other than those illustrated or described.
As described above, current tower bird-repelling technology has the following drawbacks. Birds are small-sized objects in actual detection, so the traditional approach of relying on manual identification is error-prone, the precision of bird identification and detection is low, and the repelling effect suffers. Conventional methods are also slow, and considerable effort is required to perform bird identification and detection at the various positions of a power line tower. Therefore, an optimized image processing-based power grid obstacle recognition processing scheme is desired.
Accordingly, considering that the transmission lines of a power line tower are long and that birds appear as small-sized objects during actual detection, bird information at every position of the tower cannot be detected effectively and accurately by manual means. On this basis, the technical solution of the present application performs image analysis on the monitoring images collected by cameras deployed on the power line tower to identify and detect birds. However, the amount of information in an image is large while birds constitute only small-scale feature information within it, which makes their features difficult to capture and extract; moreover, the complex field environment can degrade image resolution during acquisition and thus the representation accuracy of bird-related feature information. The difficulty therefore lies in how to mine the implicit feature information about birds in the monitoring image so as to identify and detect them, and then, when birds are detected, control compressed air to expel them and ensure the stability of power transmission.
In recent years, deep learning and neural networks have been widely used in the fields of computer vision, natural language processing, text signal processing, and the like. The development of deep learning and neural networks provides new solutions and schemes for mining implicit characteristic information about birds in the monitored images.
Specifically, in the technical scheme of the application, a monitoring image is first acquired by a camera deployed at the power line tower. It should be understood that, since power line towers are often erected in the wild, the monitoring image is easily disturbed by the external environment or by equipment factors during acquisition, lowering the resolution of the image and blurring the feature information about birds, which affects subsequent bird identification. In addition, since birds are small-sized objects in the monitoring image, directly applying preprocessing such as image filtering to the whole image would lose the bird targets and impair subsequent bird identification and detection.
On this basis, in order to improve the expression of bird features in the monitoring image and thereby the accuracy of bird identification and detection, the technical scheme of the application performs image blocking processing on the monitoring image to obtain a sequence of image blocks. It should be appreciated that the dimensions of each image block in the sequence are reduced compared with the original image, so the small-sized implicit bird features in the monitoring image are no longer small-sized objects within the individual image blocks, which facilitates subsequent bird identification.
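The blocking step can be sketched minimally as follows; the 64-pixel block size and the image dimensions are illustrative assumptions, not values fixed by the application.

```python
import numpy as np

def image_to_blocks(image: np.ndarray, block: int = 64):
    """Split an H x W x C image into a row-major sequence of
    non-overlapping block x block patches (ragged edges cropped)."""
    h, w = image.shape[:2]
    rows, cols = h // block, w // block
    return [image[r * block:(r + 1) * block, c * block:(c + 1) * block]
            for r in range(rows) for c in range(cols)]

img = np.zeros((256, 320, 3), dtype=np.uint8)  # toy monitoring image
blocks = image_to_blocks(img, 64)              # 4 x 5 = 20 blocks
```

Each block then passes independently through the resolution enhancer and the downstream feature extractors.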
Then, each image block in the sequence of image blocks is passed through an automatic-codec-based image resolution enhancer to enhance its resolution, thereby obtaining a sequence of enhanced image blocks. In particular, the automatic codec here comprises an image resolution encoder and an image resolution decoder: the encoder explicitly spatially encodes each image block with a convolutional layer to obtain each image feature, and the decoder deconvolves each image feature with a deconvolution layer to obtain the sequence of enhanced image blocks.
Further, since each enhanced image block in the sequence is image data, in order to express the bird feature information in each enhanced image block, the technical scheme of the application performs feature mining on each enhanced image block with a convolutional neural network model serving as a filter, which has excellent performance in implicit feature extraction from images, so as to extract the implicit feature distribution information about birds in each enhanced image block and obtain a sequence of image local semantic feature vectors.
Next, further consider that although the individual image local semantic feature vectors in the sequence carry bird feature information relating to the whole monitored image, a pure CNN approach has difficulty learning explicit global and long-range semantic interactions due to the inherent locality of convolution operations. Therefore, in the technical scheme of the application, the sequence of image local semantic feature vectors is encoded in a transformer-based ViT model to extract the contextual semantic association features of the implicit bird features in each image block, so as to obtain the image global context semantic understanding feature vector. It should be appreciated that ViT can process the individual image blocks directly through a Transformer-style self-attention mechanism to extract the contextual semantic association feature information about the implicit bird features in each image block.
Then, the image global context semantic understanding feature vector is used as a classification feature vector for classification processing in a classifier so as to obtain a classification result indicating whether birds are contained in the monitoring range. That is, birds in the image are recognized and detected by classifying with the contextual semantic association features of the implicit bird features in the respective image blocks of the monitoring image, and the compressed air is controlled to expel the birds in response to the classification result that birds are contained in the monitoring range.
That is, in the technical solution of the present application, the labels of the classifier include "birds are contained in the monitoring range" (first label) and "birds are not contained in the monitoring range" (second label), and the classifier determines which classification label the classification feature vector belongs to through a softmax function. It should be noted that the first label p1 and the second label p2 do not carry any human-defined concept: during training, the computer model has no notion of "whether birds are contained in the monitoring range"; there are only the two classification labels and the probabilities of the output feature under them, where p1 and p2 sum to one. The classification result is therefore actually a classification probability distribution over the two labels that conforms to a natural law, and what is used is the physical meaning of that probability distribution rather than the linguistic meaning of the label text. It should be understood that, since the classification label of the classifier is a detection label for whether birds are contained in the monitoring range, bird recognition and detection in the image can be performed based on the classification result once it is obtained; accordingly, in response to the classification result that birds are contained in the monitoring range, the compressed air is controlled to expel the birds, thereby ensuring the stability of power transmission.
In particular, consider the image global context semantic understanding feature vector: it is obtained by directly concatenating the plurality of contextual image local semantic feature vectors produced by the transformer-based ViT model, and although the ViT model promotes the contextual relevance of those vectors, explicit differences among their feature distributions still exist. Consequently, when the directly concatenated image global context semantic understanding feature vector passes through the classifier, the correlation among the individual local weight value distributions of the classifier's weight matrix is insufficient, which slows the training of the classifier.
Based on the above, in the technical solution of the present application, at each iteration of the weight matrix M of the classifier, feature affinity spatial affine learning is performed on M, expressed as:

M' = log2(d · ||M||_2 / ||M||_*) · exp(M Mᵀ / d) M ⊙ M

where ||M||_2 denotes the two-norm of the weight matrix, i.e. its maximum eigenvalue, ||M||_* denotes the kernel norm of the weight matrix, i.e. the sum of its eigenvalues, d is the scale of the weight matrix, i.e. width times height, exp(·) denotes the element-wise exponential of a matrix, ⊙ denotes position-wise multiplication, and M' is the weight matrix after the iteration.
Here, the feature affinity spatial affine learning performs a detailed structured information expression in a low-dimensional eigen-subspace for the high-resolution information representation within the weight value distribution space of the weight matrix, that is, an affine migration based on a spatial transformation of the relatively low-resolution information representation. In this way, a super-resolution, weight-by-weight activation of each local weight distribution is realized on the basis of a dense simulation of the affinity between weight value representations, which enhances the correlation between the local weight distributions of the weight matrix and thereby improves the training speed of the classifier. As a result, birds near the power line tower can be accurately recognized and detected, and when birds are detected, the compressed air is controlled to expel them, ensuring the stability of power transmission.
Fig. 1 is a schematic view of a scenario of a power grid obstacle recognition processing method based on image processing according to an embodiment of the application. As shown in fig. 1, in this application scenario, first, a monitoring image (e.g., C as illustrated in fig. 1) acquired by a camera disposed at a power line tower (e.g., M as illustrated in fig. 1) is acquired; then, the acquired monitoring image is input into a server (e.g., S as illustrated in fig. 1) in which an image processing-based grid obstacle recognition processing algorithm is deployed, wherein the server is capable of processing the monitoring image based on the image processing-based grid obstacle recognition processing algorithm to generate a classification result indicating whether birds are contained in the monitoring range, and controlling compressed air to repel birds in response to the classification result being that birds are contained in the monitoring range.
Having described the basic principles of the present application, various non-limiting embodiments of the present application will now be described in detail with reference to the accompanying drawings.
In one embodiment of the present application, fig. 2 is a flowchart of a method for identifying and processing a grid obstacle based on image processing according to an embodiment of the present application. As shown in fig. 2, the method 100 for identifying and processing a power grid obstacle based on image processing according to an embodiment of the present application includes: 110, acquiring a monitoring image acquired by a camera deployed on a power line tower; 120, performing image blocking processing on the monitoring image to obtain a sequence of image blocks; 130, passing each image block in the sequence of image blocks through an automatic codec based image resolution enhancer, respectively, to obtain a sequence of enhanced image blocks; 140, passing each enhanced image block in the sequence of enhanced image blocks through a convolutional neural network model serving as a filter to obtain a sequence of image local semantic feature vectors; 150, passing the sequence of image local semantic feature vectors through a transformer-based ViT model to obtain an image global context semantic understanding feature vector; 160, passing the image global context semantic understanding feature vector through a classifier to obtain a classification result, wherein the classification result is used for indicating whether birds are contained in the monitoring range; and 170, controlling compressed air to expel birds in response to the classification result that birds are contained in the monitoring range.
Fig. 3 is a schematic architecture diagram of an image processing-based power grid obstacle recognition processing method according to an embodiment of the present application. As shown in fig. 3, in the network architecture, first, a monitoring image acquired by a camera disposed at a power line tower is acquired; then, image blocking processing is performed on the monitoring image to obtain a sequence of image blocks; next, each image block in the sequence of image blocks is passed through an image resolution enhancer based on an automatic codec to obtain a sequence of enhanced image blocks; then, each enhanced image block in the sequence of enhanced image blocks is passed through a convolutional neural network model serving as a filter to obtain a sequence of image local semantic feature vectors; the sequence of image local semantic feature vectors is then passed through a transformer-based ViT model to obtain an image global context semantic understanding feature vector; the image global context semantic understanding feature vector is then passed through a classifier to obtain a classification result indicating whether birds are contained in the monitoring range; and finally, the compressed air is controlled to expel birds in response to the classification result that birds are contained in the monitoring range.
Specifically, in step 110, a monitoring image acquired by a camera deployed at a power line tower is acquired. As described above, the current tower bird-repellent technology has the following drawbacks: because birds are small-sized objects in actual detection, errors easily occur in the traditional mode of relying on manual identification, so that the recognition and detection precision of birds is low and the bird-repelling effect is affected. Moreover, the conventional method has poor timeliness in bird identification, and a great deal of effort is required to perform bird recognition and detection at the various positions of the power line tower. Therefore, an optimized image processing-based power grid obstacle recognition processing scheme is desired.
Accordingly, considering that in the actual bird-recognition process the power transmission line of a power line tower is long and birds are small-sized objects during detection, bird information at each position of the power line tower cannot be effectively and accurately detected by manpower alone. Based on this, in the technical solution of the present application, it is desirable to perform image analysis on a monitoring image collected by a camera disposed at the power line tower to realize recognition and detection of birds. However, the amount of information in an image is large while birds constitute only small-scale feature information within it, making them difficult to capture and extract; moreover, because the field environment is complex, the image resolution may be degraded during acquisition, which affects the accuracy with which the bird feature information is represented in the image. Therefore, the difficulty of this process lies in how to mine the implicit feature information about birds in the monitoring image so as to recognize and detect them, and then, when birds are detected, control the compressed air to expel them so as to ensure the stability of power transmission.
In recent years, deep learning and neural networks have been widely used in the fields of computer vision, natural language processing, text signal processing, and the like. The development of deep learning and neural networks provides new solutions and schemes for mining implicit characteristic information about birds in the monitored images.
Specifically, in the technical scheme of the application, first, a monitoring image is acquired through a camera deployed at a power line tower.
Specifically, in step 120, the monitoring image is subjected to image blocking processing to obtain a sequence of image blocks. It should be understood that, because power line towers are often erected in the wild, the monitoring image is easily disturbed by the external environment or equipment factors during acquisition, so the image resolution may be low and the feature information about birds in the monitoring image may become blurred, which affects subsequent bird identification. In addition, since birds are small-sized objects in the monitoring image, directly applying preprocessing such as image filtering to the monitoring image may lose the small-sized bird targets and affect subsequent bird recognition and detection.
Based on the above, in order to improve the expression of bird features in the monitoring image and thereby improve the accuracy of bird recognition and detection, in the technical scheme of the application, image blocking processing is performed on the monitoring image to obtain a sequence of image blocks. It should be appreciated that the dimensions of each image block in the sequence of image blocks are reduced compared with the original image, so birds that appear as small-sized objects in the full monitoring image are no longer small-sized objects within an individual image block, which facilitates subsequent bird identification.
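As a minimal sketch of this blocking step (assuming a single-channel image whose sides divide evenly by the patch size; the function name and the sizes are illustrative, not taken from the application):

```python
import numpy as np

def block_image(image: np.ndarray, patch: int) -> np.ndarray:
    """Split an (H, W) image into a row-major sequence of (patch, patch) blocks."""
    h, w = image.shape
    assert h % patch == 0 and w % patch == 0, "image sides must divide evenly"
    blocks = (image
              .reshape(h // patch, patch, w // patch, patch)
              .swapaxes(1, 2)               # (rows, cols, patch, patch)
              .reshape(-1, patch, patch))   # flatten the grid into a sequence
    return blocks

# A 640x640 monitoring image cut into 16x16 patches yields 1600 blocks;
# a bird spanning a dozen pixels is no longer a "small object" inside a block.
image = np.zeros((640, 640), dtype=np.float32)
patches = block_image(image, 16)
print(patches.shape)  # (1600, 16, 16)
```

The row-major ordering keeps neighboring blocks adjacent in the sequence, which matters later when the sequence is treated as context by the ViT stage.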
Specifically, in step 130, each image block in the sequence of image blocks is passed through an image resolution enhancer based on an automatic codec to perform resolution enhancement, thereby obtaining a sequence of enhanced image blocks. In particular, the automatic codec here includes an image resolution encoder and an image resolution decoder: the encoder performs explicit spatial encoding of each image block using convolution layers to obtain the respective image features, and the decoder deconvolves those image features using deconvolution layers to obtain the sequence of enhanced image blocks.
Fig. 4 is a flowchart of the sub-steps of step 130 in the image processing-based grid obstacle recognition processing method according to an embodiment of the present application. As shown in fig. 4, passing each image block in the sequence of image blocks through the image resolution enhancer based on an automatic codec to obtain a sequence of enhanced image blocks includes: 131, performing explicit spatial encoding on each image block in the sequence of image blocks by the image resolution encoder of the automatic codec using convolution layers to obtain the respective image features; and 132, performing deconvolution processing on the image features by the image resolution decoder of the automatic codec using deconvolution layers to obtain the sequence of enhanced image blocks.
It should be appreciated that the automatic codec includes an encoder and a decoder, the encoder having two convolutional layers. In one example, the first convolution layer has 1 input channel, 2 output channels, a convolution kernel size of 10, a sliding step of 10 and a zero-padding width of 1, followed by a normalization layer and a ReLU nonlinear activation layer; the second convolution layer has 25 input channels, 50 output channels, a convolution kernel size of 3, a sliding step of 3 and a zero-padding width of 0, followed by a normalization layer and a ReLU nonlinear activation layer; the end of the encoder is a fully connected layer with 10 neurons. The decoder head is a fully connected layer with 850 neurons, followed by two deconvolution layers: the first deconvolution layer has 50 input channels, 25 output channels, a convolution kernel size of 4, a sliding step of 3 and a zero-padding width of 1, followed by a normalization layer and a ReLU nonlinear activation layer; the second deconvolution layer has 25 input channels, 1 output channel, a convolution kernel size of 10, a sliding step of 10 and a zero-padding width of 1, followed by a Sigmoid nonlinear activation layer.
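To illustrate how a deconvolution (transposed convolution) layer enlarges spatial resolution, here is a hedged single-channel sketch; the kernel, stride and sizes are illustrative and do not reproduce the exact layer configuration above:

```python
import numpy as np

def conv_transpose2d(x: np.ndarray, k: np.ndarray, stride: int) -> np.ndarray:
    """Naive single-channel transposed convolution (no padding).

    Each input pixel stamps a scaled copy of the kernel into the output,
    spaced by `stride`, which is what enlarges the spatial resolution.
    """
    h, w = x.shape
    kh, kw = k.shape
    out = np.zeros((stride * (h - 1) + kh, stride * (w - 1) + kw))
    for i in range(h):
        for j in range(w):
            out[i * stride:i * stride + kh, j * stride:j * stride + kw] += x[i, j] * k
    return out

x = np.random.rand(8, 8)           # a low-resolution feature map
k = np.ones((4, 4)) / 16.0         # stand-in for a learned kernel
y = conv_transpose2d(x, k, stride=2)
print(y.shape)  # (18, 18): a stride-2 deconvolution roughly doubles resolution
```

The output size follows the usual transposed-convolution formula, stride * (in - 1) + kernel, before any padding adjustment.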
Specifically, in step 140, each enhanced image block in the sequence of enhanced image blocks is passed through a convolutional neural network model serving as a filter to obtain a sequence of image local semantic feature vectors. Since each enhanced image block is image data, in order to express the bird feature information it contains, the technical solution of the present application performs feature mining on each enhanced image block using a convolutional neural network model as a filter. Convolutional neural networks have excellent performance in extracting implicit image features, so this step extracts the implicit feature distribution information about birds in each enhanced image block.
Wherein, passing each enhanced image block in the sequence of enhanced image blocks through a convolutional neural network model as a filter to obtain a sequence of image local semantic feature vectors, respectively, comprising: each layer of the convolutional neural network model used as the filter performs the following steps on input data in forward transfer of the layer: carrying out convolution processing on the input data to obtain a convolution characteristic diagram; carrying out mean pooling treatment based on a feature matrix on the convolution feature map to obtain a pooled feature map; performing nonlinear activation on the pooled feature map to obtain an activated feature map; wherein the output of the last layer of the convolutional neural network model as a filter is a sequence of local semantic feature vectors of the image, and the input of the first layer of the convolutional neural network model as a filter is each enhanced image block in the sequence of enhanced image blocks.
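The per-layer recipe above (convolution, feature-matrix mean pooling, nonlinear activation) can be sketched for a single channel as follows; the sizes and kernels are illustrative only:

```python
import numpy as np

def conv2d_valid(x: np.ndarray, k: np.ndarray) -> np.ndarray:
    """Valid-mode 2-D cross-correlation of a single-channel input."""
    h, w = x.shape
    kh, kw = k.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def mean_pool2d(x: np.ndarray, s: int = 2) -> np.ndarray:
    """Feature-matrix mean pooling with an s x s window (edges cropped)."""
    h, w = x.shape
    return x[:h - h % s, :w - w % s].reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def cnn_layer(x: np.ndarray, k: np.ndarray) -> np.ndarray:
    """One filter layer: convolution -> mean pooling -> ReLU activation."""
    return np.maximum(mean_pool2d(conv2d_valid(x, k)), 0.0)

block = np.random.rand(18, 18)     # an enhanced image block
kernel = np.random.randn(3, 3)     # stand-in for a learned convolution kernel
feat = cnn_layer(block, kernel)
print(feat.shape)  # (8, 8)
```

Stacking several such layers and flattening the final map would yield the image local semantic feature vector for one block.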
The convolutional neural network (Convolutional Neural Network, CNN) is an artificial neural network and has wide application in the fields of image recognition and the like. The convolutional neural network may include an input layer, a hidden layer, and an output layer, where the hidden layer may include a convolutional layer, a pooling layer, an activation layer, a full connection layer, etc., where the previous layer performs a corresponding operation according to input data, outputs an operation result to the next layer, and obtains a final result after the input initial data is subjected to a multi-layer operation.
The convolutional neural network model has excellent performance in the aspect of image local feature extraction by taking a convolutional kernel as a feature filtering factor, and has stronger feature extraction generalization capability and fitting capability compared with the traditional image feature extraction algorithm based on statistics or feature engineering.
Specifically, in step 150, the sequence of image local semantic feature vectors is passed through a transformer-based ViT model to obtain the image global context semantic understanding feature vector. Consider that although the individual image local semantic feature vectors in the sequence carry bird feature information relating to the whole monitored image, a pure CNN approach has difficulty learning explicit global and long-range semantic interactions due to the inherent locality of convolution operations. Therefore, the sequence of image local semantic feature vectors is encoded in the transformer-based ViT model to extract the contextual semantic association features of the implicit bird features in each image block. It should be appreciated that ViT can process the individual image blocks directly through a Transformer-style self-attention mechanism to extract the contextual semantic association feature information about the implicit bird features in each image block.
Fig. 5 is a flowchart of the sub-steps of step 150 in the image processing-based grid obstacle recognition processing method according to an embodiment of the present application. As shown in fig. 5, passing the sequence of image local semantic feature vectors through a transformer-based ViT model to obtain an image global context semantic understanding feature vector includes: 151, performing one-dimensional arrangement on the sequence of image local semantic feature vectors to obtain a global image semantic feature vector; 152, calculating the product between the global image semantic feature vector and the transpose vector of each image local semantic feature vector in the sequence to obtain a plurality of self-attention correlation matrices; 153, respectively performing standardization processing on each of the plurality of self-attention correlation matrices to obtain a plurality of standardized self-attention correlation matrices; 154, obtaining a plurality of probability values by applying a Softmax classification function to each standardized self-attention correlation matrix; and 155, weighting each image local semantic feature vector in the sequence by using each of the plurality of probability values as a weight to obtain the image global context semantic understanding feature vector.
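The dimensional details of sub-steps 151 to 155 are loose as stated; the sketch below takes one plausible reading, assuming the global image semantic feature vector is a pooled summary of the local vectors so that the products of step 152 reduce to ordinary dot products (all names are illustrative):

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())
    return e / e.sum()

def global_context(local_vecs: np.ndarray):
    """Weight each local semantic vector by a self-attention-style score.

    local_vecs: (N, D) sequence of image local semantic feature vectors.
    Returns the (N, D) weighted sequence and the (N,) attention weights.
    """
    g = local_vecs.mean(axis=0)                              # pooled "global" vector
    scores = local_vecs @ g / np.sqrt(local_vecs.shape[1])   # standardized dot products
    w = softmax(scores)                                      # probability values
    return w[:, None] * local_vecs, w

vecs = np.random.randn(6, 32)          # 6 image blocks, 32-D local features
weighted, attn = global_context(vecs)
print(round(float(attn.sum()), 6))     # 1.0: the weights form a probability distribution
```

Blocks whose features align with the pooled summary receive larger weights, which is the intended effect of the attention-based context encoding.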
The context encoder aims to mine the hidden patterns between contexts in the sequence. Optional encoders include CNN (Convolutional Neural Network), Recursive Neural Network, Language Model, and the like. CNN-based methods extract local features well but handle Long-Term Dependency in sentences poorly, so Bi-LSTM (Long Short-Term Memory)-based encoders are widely used. The recursive neural network processes a sentence as a tree structure rather than a sequence and is theoretically more expressive, but it suffers from high sample-labeling difficulty, vanishing gradients when deep, and difficulty of parallel computation, so it is rarely used in practice. The Transformer is a widely applied network structure that combines characteristics of CNN and RNN (Recurrent Neural Network): it extracts global features well and has an advantage over RNN in parallel computation.
Specifically, in step 160 and step 170, the image global context semantic understanding feature vector is passed through a classifier to obtain a classification result indicating whether birds are contained in the monitoring range, and the compressed air is controlled to repel birds in response to the classification result that birds are contained in the monitoring range. That is, the image global context semantic understanding feature vector is used as a classification feature vector for classification processing in the classifier, and birds in the image are recognized and detected by classifying with the contextual semantic association features of the implicit bird features in the respective image blocks of the monitoring image.
That is, in the technical solution of the present application, the labels of the classifier include "birds are contained in the monitoring range" (first label) and "birds are not contained in the monitoring range" (second label), and the classifier determines which classification label the classification feature vector belongs to through a softmax function. It should be noted that the first label p1 and the second label p2 do not carry any human-defined concept: during training, the computer model has no notion of "whether birds are contained in the monitoring range"; there are only the two classification labels and the probabilities of the output feature under them, where p1 and p2 sum to one. The classification result is therefore actually a classification probability distribution over the two labels that conforms to a natural law, and what is used is the physical meaning of that probability distribution rather than the linguistic meaning of the label text.
It should be understood that, in the technical scheme of the present application, the classification label of the classifier is a detection label for whether birds are contained in the monitoring range, so after the classification result is obtained, bird recognition and detection in the image can be performed based on it; accordingly, in response to the classification result that birds are contained in the monitoring range, the compressed air is controlled to expel the birds, thereby ensuring the stability of power transmission.
Fig. 6 is a flowchart of a sub-step of step 160 in an image processing-based power grid obstacle recognition processing method according to an embodiment of the present application, as shown in fig. 6, the image global context semantic understanding feature vector is passed through a classifier to obtain a classification result, where the classification result is used to indicate whether birds are contained in a monitoring range, and the method includes: 161, performing full-connection coding on the image global context semantic understanding feature vector by using a plurality of full-connection layers of the classifier to obtain a coding classification feature vector; and 162, passing the encoded classification feature vector through a Softmax classification function of the classifier to obtain the classification result.
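Sub-steps 161 and 162 can be sketched as follows; the layer sizes and random weights are illustrative only (a real classifier would use trained weights):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())
    return e / e.sum()

def classify(v: np.ndarray, layers) -> np.ndarray:
    """Fully-connected encoding followed by a Softmax over the two labels."""
    for W, b in layers[:-1]:
        v = np.maximum(W @ v + b, 0.0)   # hidden fully connected layers with ReLU
    W, b = layers[-1]
    return softmax(W @ v + b)            # (p1, p2): birds / no birds in range

v = rng.standard_normal(64)              # image global context understanding vector
layers = [(rng.standard_normal((16, 64)), np.zeros(16)),
          (rng.standard_normal((2, 16)), np.zeros(2))]
p = classify(v, layers)
print(round(float(p.sum()), 6))          # 1.0: the two label probabilities sum to one
```

The decision is then simply whichever of p1 and p2 is larger, which drives the compressed-air control step.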
Further, the image processing-based power grid obstacle recognition processing method further comprises training the convolutional neural network model serving as a filter, the transformer-based ViT model, and the classifier. Fig. 7 is a flowchart of the sub-steps of step 180 in the image processing-based power grid obstacle recognition processing method according to an embodiment of the present application. As shown in fig. 7, training the convolutional neural network model as a filter, the transformer-based ViT model, and the classifier includes: 181, acquiring training data, wherein the training data comprises training monitoring images and the true values of whether birds are contained in the monitoring range; 182, performing image blocking processing on the training monitoring image to obtain a sequence of training image blocks; 183, passing each training image block in the sequence of training image blocks through the automatic codec based image resolution enhancer, respectively, to obtain a sequence of training enhanced image blocks; 184, passing each training enhanced image block in the sequence of training enhanced image blocks through the convolutional neural network model as a filter to obtain a sequence of training image local semantic feature vectors; 185, passing the sequence of training image local semantic feature vectors through the transformer-based ViT model to obtain a training image global context semantic understanding feature vector; 186, passing the training image global context semantic understanding feature vector through the classifier to obtain a classification loss function value; and 187, training the convolutional neural network model as a filter, the transformer-based ViT model, and the classifier based on the classification loss function value and by back-propagation along the direction of gradient descent, wherein, in each iteration of the training, a feature affinity spatial affine learning iteration is performed on the weight matrix of the classifier.
Wherein passing the training image global context semantic understanding feature vector through the classifier to obtain a classification loss function value comprises: the classifier processes the training image global context semantic understanding feature vector with a classification formula to generate a training classification result, wherein the classification formula is:

O = softmax{(Wn, Bn) : ... : (W1, B1) | V}

where V represents the training image global context semantic understanding feature vector, W1 to Wn are the weight matrices of the fully connected layers, and B1 to Bn are the corresponding bias matrices; and then a cross entropy value between the training classification result and the true value is calculated as the classification loss function value.
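The cross-entropy computation in the last step can be sketched as follows (the numeric values are hypothetical):

```python
import numpy as np

def cross_entropy(pred: np.ndarray, true_label: int) -> float:
    """Cross entropy between a softmax output and the ground-truth label index."""
    return float(-np.log(pred[true_label]))

pred = np.array([0.9, 0.1])              # classifier output (p1, p2)
loss_correct = cross_entropy(pred, 0)    # ground truth: birds are in range
loss_wrong = cross_entropy(pred, 1)      # ground truth: no birds
print(loss_correct < loss_wrong)         # True: a confident correct prediction costs less
```

Minimizing this loss by gradient descent pushes the predicted probability distribution toward the true label, which is exactly what step 187 performs.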
In particular, consider the image global context semantic understanding feature vector: it is obtained by directly concatenating the plurality of contextual image local semantic feature vectors produced by the transformer-based ViT model, and although the ViT model promotes the contextual relevance of those vectors, explicit differences among their feature distributions still exist. Consequently, when the directly concatenated image global context semantic understanding feature vector passes through the classifier, the correlation among the individual local weight value distributions of the classifier's weight matrix is insufficient, which slows the training of the classifier.
Based on the above, in the technical solution of the present application, in each iteration of the training, a feature affinity spatial affine learning iteration is performed on the weight matrix of the classifier according to the following optimization formula:

M' = log2(d · ||M||_2 / ||M||_*) · exp(M Mᵀ / d) M ⊙ M

where M denotes the weight matrix of the classifier, Mᵀ denotes the transpose of the weight matrix of the classifier, ||M||_2 denotes the two-norm of the weight matrix of the classifier, ||M||_* denotes the kernel norm of the weight matrix of the classifier, d is the scale of the weight matrix of the classifier, i.e. width times height, log2 denotes the logarithm with base 2, exp(·) denotes the element-wise exponential operation of a matrix, i.e. computing the natural exponential of the value at each position in the matrix, ⊙ denotes position-wise multiplication, and M' denotes the weight matrix of the classifier after the iteration.
Here, the feature affinity space affine learning performs a detailed structured information expression in a low-dimensional eigen-subspace on the high-resolution information representation in the weight value distribution space of the weight matrix, and then carries out an affine migration based on a spatial transformation of the relatively low-resolution information representation. In this way, a super-resolution (i.e., weight-by-weight) activation of each local weight distribution is achieved through dense affinity modeling between weight value representations, which strengthens the correlation between the local weight distributions of the weight matrix and thereby improves the training speed of the classifier. As a result, birds at the power line tower can be accurately identified and detected, and when birds are detected, compressed air is controlled to drive them away, ensuring the stability of power transmission.
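Numerically, a per-iteration weight-matrix update of this kind can be sketched as follows. The exact formula is only partially recoverable from the text, so the code assumes one plausible reading, M′ = exp(log₂(‖M‖_*/d) · (M Mᵀ M)/‖M‖₂) ⊙ M, assembled from the quantities the passage names (two-norm, kernel norm, scale d, base-2 logarithm, element-wise exponential, position-wise multiplication); it is an illustration, not the patented update.

```python
import numpy as np

def affinity_affine_step(M, d=None):
    """One 'feature affinity space affine learning' iteration on a weight
    matrix M, under the assumed update
        M' = exp(log2(||M||_* / d) * (M M^T M) / ||M||_2) ⊙ M."""
    if d is None:
        d = M.size                           # "scale" of the matrix (assumed)
    two_norm = np.linalg.norm(M, 2)          # two-norm (largest singular value)
    nuc_norm = np.linalg.norm(M, 'nuc')      # kernel (nuclear) norm
    scale = np.log2(nuc_norm / d)            # base-2 logarithm
    A = scale * (M @ M.T @ M) / two_norm     # same shape as M
    return np.exp(A) * M                     # element-wise exp, ⊙ by position

rng = np.random.default_rng(1)
M = 0.1 * rng.normal(size=(2, 16))           # classifier weight matrix (toy)
M_next = affinity_affine_step(M)
```

Note that M Mᵀ M has the same shape as M, so the position-wise product is well defined; the choice of d = M.size is an assumption.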
In summary, the image processing-based power grid obstacle recognition processing method 100 according to an embodiment of the present application has been illustrated. It acquires a monitoring image collected by a camera deployed at a power line tower, and adopts an artificial intelligence technology based on deep learning to mine the implicit feature information about birds in the monitoring image, so as to identify and detect birds; when birds are detected, compressed air is controlled to drive them away, thereby ensuring the stability of power transmission.
In one embodiment of the present application, fig. 8 is a block diagram of an image processing-based grid obstacle recognition processing system according to an embodiment of the present application. As shown in fig. 8, the image processing-based power grid obstacle recognition processing system 200 according to the embodiment of the present application includes: an image acquisition module 210 for acquiring a monitoring image acquired by a camera disposed at the power line tower; the image blocking processing module 220 is configured to perform image blocking processing on the monitoring image to obtain a sequence of image blocks; an automatic codec module 230 for passing each image block in the sequence of image blocks through an automatic codec-based image resolution enhancer, respectively, to obtain a sequence of enhanced image blocks; the feature extraction module 240 is configured to pass each enhanced image block in the sequence of enhanced image blocks through a convolutional neural network model serving as a filter to obtain a sequence of image local semantic feature vectors; the global encoding module 250 is configured to pass the sequence of image local semantic feature vectors through a ViT model based on a converter to obtain an image global context semantic understanding feature vector; the monitoring result generating module 260 is configured to pass the image global context semantic understanding feature vector through a classifier to obtain a classification result, where the classification result is used to indicate whether birds are contained in a monitoring range; and a control module 270 for controlling the compressed air to expel birds in response to the classification result being that birds are contained in the monitoring range.
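The image blocking processing module simply cuts the monitoring image into a regular grid of patches. A minimal sketch (the patch size and image shape are arbitrary choices, not values from the application; edge remainders are dropped for simplicity):

```python
import numpy as np

def block_image(img, patch=32):
    """Split an H x W x C image into a sequence of patch x patch blocks.
    Edge regions that do not fill a whole patch are discarded."""
    h, w = img.shape[:2]
    blocks = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            blocks.append(img[y:y + patch, x:x + patch])
    return blocks

frame = np.zeros((96, 128, 3), dtype=np.uint8)  # stand-in for a camera frame
seq = block_image(frame, patch=32)              # 3 rows x 4 cols = 12 blocks
```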
In a specific example, in the above image processing-based power grid obstacle recognition processing system, the automatic codec module includes: an encoding unit, configured to perform explicit spatial encoding on each image block in the sequence of image blocks by using a convolutional layer through an image resolution encoder of the automatic codec to obtain each image feature; and a decoding unit for performing deconvolution processing on the respective image features by an image resolution decoder of the automatic codec using a deconvolution layer to obtain the sequence of enhanced image blocks.
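The encoder's convolutional spatial encoding and the decoder's deconvolution (transposed convolution) can be illustrated with toy single-channel implementations (kernel sizes, the stride, and the naive loop-based form are assumptions chosen for clarity, not the patented design):

```python
import numpy as np

def conv2d(x, k):
    # Valid convolution (cross-correlation) of a 2-D image x with kernel k.
    kh, kw = k.shape
    H, W = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def deconv2d(x, k, stride=2):
    # Transposed convolution: scatter each input value through the kernel,
    # producing an upsampled (resolution-enhanced) output.
    kh, kw = k.shape
    H = (x.shape[0] - 1) * stride + kh
    W = (x.shape[1] - 1) * stride + kw
    out = np.zeros((H, W))
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i * stride:i * stride + kh,
                j * stride:j * stride + kw] += x[i, j] * k
    return out

img = np.arange(36, dtype=float).reshape(6, 6)
feat = conv2d(img, np.ones((3, 3)) / 9.0)  # encode: 6x6 -> 4x4 feature map
up = deconv2d(feat, np.ones((2, 2)), 2)    # decode: 4x4 -> 8x8 enhanced block
```

The output of the decoder is larger than its input, which is the sense in which the codec acts as a resolution enhancer.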
In a specific example, in the above image processing-based power grid obstacle recognition processing system, the feature extraction module includes: each layer of the convolutional neural network model used as the filter performs the following steps on input data in forward transfer of the layer: carrying out convolution processing on the input data to obtain a convolution characteristic diagram; carrying out mean pooling treatment based on a feature matrix on the convolution feature map to obtain a pooled feature map; performing nonlinear activation on the pooled feature map to obtain an activated feature map; wherein the output of the last layer of the convolutional neural network model as a filter is a sequence of local semantic feature vectors of the image, and the input of the first layer of the convolutional neural network model as a filter is each enhanced image block in the sequence of enhanced image blocks.
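The per-layer pooling and activation steps can be isolated in a small sketch (a 2×2 pooling window and a ReLU activation are assumptions; the application does not fix the window size or the activation function):

```python
import numpy as np

def mean_pool(fm, size=2):
    # Non-overlapping mean pooling over size x size windows of a feature map.
    H, W = fm.shape
    fm = fm[:H - H % size, :W - W % size]
    return fm.reshape(H // size, size, W // size, size).mean(axis=(1, 3))

def relu(fm):
    # Element-wise nonlinear activation.
    return np.maximum(fm, 0.0)

conv_map = np.array([[1., -2., 3., -4.],
                     [5., -6., 7., -8.],
                     [-1., 2., -3., 4.],
                     [-5., 6., -7., 8.]])
pooled = mean_pool(conv_map)   # 4x4 convolution feature map -> 2x2
activated = relu(pooled)       # pooled feature map -> activated feature map
```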
In a specific example, in the above image processing-based grid obstacle recognition processing system, the global encoding module includes: the one-dimensional arrangement unit is used for one-dimensionally arranging the sequence of the image local semantic feature vectors to obtain global image semantic feature vectors; the self-attention unit is used for calculating the product between the global image semantic feature vector and the transpose vector of each image local semantic feature vector in the sequence of the image local semantic feature vectors to obtain a plurality of self-attention association matrices; the normalization processing unit is used for respectively performing normalization processing on each self-attention correlation matrix in the plurality of self-attention correlation matrices to obtain a plurality of normalized self-attention correlation matrices; the classification function unit is used for obtaining a plurality of probability values through a Softmax classification function by each normalized self-attention correlation matrix in the normalized self-attention correlation matrices; and the weighting unit is used for weighting each image local semantic feature vector in the sequence of the image local semantic feature vectors by taking each probability value in the plurality of probability values as a weight so as to obtain the image global context semantic understanding feature vector.
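As a rough sketch of this attention-style weighting (dimensions are invented, and each self-attention association matrix is collapsed to a scalar score so that Softmax yields one probability per local vector, which is a simplification of the matrix form described in the module):

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def global_context(local_vecs):
    """Reweight image local semantic feature vectors with attention scores.
    Each association matrix is collapsed to its mean so the Softmax
    produces one probability value per local feature vector."""
    g = np.concatenate(local_vecs)                 # one-dimensional arrangement
    assoc = [np.outer(g, v) for v in local_vecs]   # association matrices
    assoc = [a / (np.linalg.norm(a) + 1e-8) for a in assoc]  # normalization
    scores = np.array([a.mean() for a in assoc])   # collapse to scalars
    p = softmax(scores)                            # probability values
    weighted = [pi * v for pi, v in zip(p, local_vecs)]
    return np.concatenate(weighted), p             # global context vector

rng = np.random.default_rng(2)
local_vecs = [rng.normal(size=4) for _ in range(3)]
ctx, p = global_context(local_vecs)
```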
In a specific example, in the above-mentioned grid obstacle recognition processing system based on image processing, the monitoring result generating module includes: the coding unit is used for carrying out full-connection coding on the image global context semantic understanding feature vector by using a plurality of full-connection layers of the classifier so as to obtain a coding classification feature vector; and the classification unit is used for passing the coding classification feature vector through a Softmax classification function of the classifier to obtain the classification result.
In a specific example, in the above image processing-based power grid obstacle recognition processing system, the system further includes a training module that trains the convolutional neural network model as a filter, the converter-based ViT model, and the classifier; wherein, training module includes: the training image acquisition unit is used for acquiring training data, wherein the training data comprises training monitoring images and whether the monitoring range contains the true value of birds or not; the training image blocking processing unit is used for carrying out image blocking processing on the training monitoring image to obtain a sequence of training image blocks; the training automatic coding and decoding unit is used for respectively passing each training image block in the sequence of training image blocks through the image resolution enhancer based on the automatic coder and decoder so as to obtain a sequence of training enhancement image blocks; the training feature extraction unit is used for respectively passing each training enhancement image block in the sequence of training enhancement image blocks through the convolutional neural network model serving as a filter to obtain a sequence of training image local semantic feature vectors; the training global coding unit is used for enabling the sequence of the training image local semantic feature vectors to pass through the ViT model based on the converter to obtain training image global context semantic understanding feature vectors; the classification loss function value calculation unit is used for enabling the training image global context semantic understanding feature vector to pass through the classifier to obtain a classification loss function value; and a training iteration unit for training the convolutional neural network model as a filter, the ViT model based on the converter and the classifier based on the classification loss function value and traveling in the direction of 
gradient descent, wherein in each round of the training, a feature affinity space affine learning iteration is performed on a weight matrix of the classifier.
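The training module's overall loop can be illustrated for the classifier head alone (random features stand in for the extractor output; the per-round weight-matrix iteration is exposed only as a hook, here a no-op, since its exact formula is not fully recoverable; all sizes and the two-class setup are assumptions):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def train(X, y, classes=2, lr=0.1, rounds=200, weight_hook=None):
    """Gradient-descent training of a softmax classifier on features X.
    weight_hook, if given, is applied to the weight matrix every round,
    standing in for the feature affinity space affine learning iteration."""
    rng = np.random.default_rng(0)
    W = 0.01 * rng.normal(size=(classes, X.shape[1]))
    b = np.zeros(classes)
    for _ in range(rounds):
        P = softmax(X @ W.T + b)            # forward pass
        G = P.copy()
        G[np.arange(len(y)), y] -= 1.0      # d(cross-entropy)/d(logits)
        W -= lr * (G.T @ X) / len(y)        # gradient descent step
        b -= lr * G.mean(axis=0)
        if weight_hook is not None:
            W = weight_hook(W)              # per-round weight-matrix iteration
    return W, b

rng = np.random.default_rng(3)
X0 = rng.normal(size=(40, 8)) - 1.0         # "no bird" features (synthetic)
X1 = rng.normal(size=(40, 8)) + 1.0         # "bird" features (synthetic)
X = np.vstack([X0, X1])
y = np.array([0] * 40 + [1] * 40)
W, b = train(X, y, weight_hook=lambda W: W) # no-op hook for illustration
acc = (softmax(X @ W.T + b).argmax(axis=1) == y).mean()
```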
In a specific example, in the above-described image processing-based power grid obstacle recognition processing system, the classification loss function value calculation unit includes: the training classification subunit is configured to process the training image global context semantic understanding feature vector by using the classifier according to the following classification formula to generate a training classification result, where the classification formula is:
O = softmax{ (Wₙ, Bₙ) : ⋯ : (W₁, B₁) | V }, where V represents the training image global context semantic understanding feature vector, W₁ to Wₙ represent the weight matrices of the fully connected layers of the classifier, and B₁ to Bₙ represent the corresponding bias matrices; and a calculation subunit for calculating a cross entropy value between the training classification result and a true value as the classification loss function value.
In a specific example, in the above image processing-based power grid obstacle recognition processing system, the training iteration unit is configured to: in each iteration of the training, carrying out characteristic affinity space affine learning iteration on the weight matrix of the classifier according to the following optimization formula; wherein, the optimization formula is:
M′ = exp( log₂( ‖M‖_* / d ) · ( M Mᵀ M ) / ‖M‖₂ ) ⊙ M

wherein M represents the weight matrix of the classifier, Mᵀ represents the transpose of the weight matrix of the classifier, ‖M‖₂ represents the two-norm of the weight matrix of the classifier, ‖M‖_* represents the kernel norm of the weight matrix of the classifier, d is the scale of the weight matrix of the classifier, log₂ represents the logarithmic function with base 2, exp(·) represents the exponential operation of a matrix, which calculates the natural exponential function value raised to the power of the value at each position in the matrix, ⊙ represents multiplication by position, and M′ represents the weight matrix of the classifier after the iteration.
Here, it will be understood by those skilled in the art that the specific functions and operations of the respective units and modules in the above-described image processing-based power grid obstacle recognition processing system have been described in detail in the above description of the image processing-based power grid obstacle recognition processing method with reference to fig. 1 to 7, and thus, repetitive descriptions thereof will be omitted.
As described above, the image processing-based power grid obstacle recognition processing system 200 according to the embodiment of the present application may be implemented in various terminal devices, for example, a server or the like for image processing-based power grid obstacle recognition processing. In one example, the image processing-based grid obstacle recognition processing system 200 according to embodiments of the present application may be integrated into the terminal device as one software module and/or hardware module. For example, the image processing-based grid obstacle recognition processing system 200 may be a software module in the operating system of the terminal device, or may be an application developed for the terminal device; of course, the image processing-based grid obstacle recognition processing system 200 may also be one of a plurality of hardware modules of the terminal device.
Alternatively, in another example, the image processing-based power grid obstacle recognition processing system 200 and the terminal device may be separate devices, and the image processing-based power grid obstacle recognition processing system 200 may be connected to the terminal device through a wired and/or wireless network and transmit the interactive information in a contracted data format.
The present application also provides a computer program product comprising instructions which, when executed, cause an apparatus to perform operations corresponding to the above-described methods.
In one embodiment of the present application, there is also provided a computer readable storage medium storing a computer program for executing the above-described method.
It should be appreciated that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) having computer-usable program code embodied therein.
Methods, systems, and computer program products of embodiments of the present application are described in terms of flow diagrams and/or block diagrams. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The basic principles of the present application have been described above in connection with specific embodiments, however, it should be noted that the advantages, benefits, effects, etc. mentioned in the present application are merely examples and not limiting, and these advantages, benefits, effects, etc. are not to be considered as necessarily possessed by the various embodiments of the present application. Furthermore, the specific details disclosed herein are for purposes of illustration and understanding only, and are not intended to be limiting, as the application is not intended to be limited to the details disclosed herein as such.
The block diagrams of the devices, apparatuses, and systems referred to in this application are only illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. As will be appreciated by one of skill in the art, these devices, apparatuses, and systems may be connected, arranged, and configured in any manner. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including but not limited to" and are used interchangeably therewith. The terms "or" and "and" as used herein refer to, and are used interchangeably with, the term "and/or" unless the context clearly indicates otherwise. The term "such as" as used herein refers to, and is used interchangeably with, the phrase "such as, but not limited to."
It is also noted that in the apparatus, devices and methods of the present application, the components or steps may be disassembled and/or assembled. Such decomposition and/or recombination should be considered as equivalent to the present application.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or terminal device comprising the element.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit the embodiments of the application to the form disclosed herein. Although a number of example aspects and embodiments have been discussed above, a person of ordinary skill in the art will recognize certain variations, modifications, alterations, additions, and subcombinations thereof.

Claims (10)

1. The method for identifying and processing the power grid obstacle based on the image processing is characterized by comprising the following steps of:
acquiring a monitoring image acquired by a camera deployed on a power line tower;
performing image blocking processing on the monitoring image to obtain a sequence of image blocks;
passing each image block in the sequence of image blocks through an automatic codec-based image resolution enhancer, respectively, to obtain a sequence of enhanced image blocks;
each enhanced image block in the sequence of enhanced image blocks is respectively passed through a convolutional neural network model serving as a filter to obtain a sequence of image local semantic feature vectors;
passing the sequence of image local semantic feature vectors through a ViT model based on a converter to obtain image global context semantic understanding feature vectors;
the image global context semantic understanding feature vector is passed through a classifier to obtain a classification result, and the classification result is used for indicating whether birds are contained in a monitoring range; and
controlling the compressed air to drive away birds in response to the classification result being that birds are contained in the monitoring range.
2. The image processing-based power grid obstacle recognition processing method according to claim 1, wherein passing each image block in the sequence of image blocks through an automatic codec-based image resolution enhancer to obtain a sequence of enhanced image blocks, respectively, comprises:
performing explicit spatial coding on each image block in the sequence of image blocks by an image resolution encoder of the automatic codec using a convolutional layer to obtain each image feature; and
performing deconvolution processing on the image features by an image resolution decoder of the automatic codec using a deconvolution layer to obtain the sequence of enhanced image blocks.
3. The method for identifying and processing the grid obstacle based on image processing according to claim 2, wherein the step of passing each enhanced image block in the sequence of enhanced image blocks through a convolutional neural network model as a filter to obtain the sequence of image local semantic feature vectors comprises: each layer of the convolutional neural network model used as the filter performs the following steps on input data in forward transfer of the layer:
Carrying out convolution processing on the input data to obtain a convolution characteristic diagram;
carrying out mean pooling treatment based on a feature matrix on the convolution feature map to obtain a pooled feature map; and
non-linear activation is carried out on the pooled feature map so as to obtain an activated feature map;
wherein the output of the last layer of the convolutional neural network model as a filter is a sequence of local semantic feature vectors of the image, and the input of the first layer of the convolutional neural network model as a filter is each enhanced image block in the sequence of enhanced image blocks.
4. A method of image processing based grid obstacle recognition processing according to claim 3, wherein passing the sequence of image local semantic feature vectors through a converter based ViT model to obtain image global context semantic understanding feature vectors comprises:
one-dimensional arrangement is carried out on the sequence of the image local semantic feature vectors so as to obtain global image semantic feature vectors;
calculating the product between the global image semantic feature vector and the transpose vector of each image local semantic feature vector in the sequence of image local semantic feature vectors to obtain a plurality of self-attention association matrices;
Respectively carrying out standardization processing on each self-attention correlation matrix in the plurality of self-attention correlation matrices to obtain a plurality of standardized self-attention correlation matrices;
obtaining a plurality of probability values by using a Softmax classification function through each normalized self-attention correlation matrix in the normalized self-attention correlation matrices; and
weighting each image local semantic feature vector in the sequence of image local semantic feature vectors by taking each probability value in the plurality of probability values as a weight so as to obtain the image global context semantic understanding feature vector.
5. The method for identifying and processing the grid obstacle based on image processing according to claim 4, wherein the step of passing the image global context semantic understanding feature vector through a classifier to obtain a classification result, wherein the classification result is used for indicating whether birds are contained in a monitoring range, and the method comprises the following steps:
performing full-connection coding on the image global context semantic understanding feature vector by using a plurality of full-connection layers of the classifier to obtain a coding classification feature vector; and
passing the coding classification feature vector through a Softmax classification function of the classifier to obtain the classification result.
6. The image processing-based power grid obstacle recognition processing method according to claim 5, further comprising training the convolutional neural network model as a filter, the converter-based ViT model, and the classifier;
wherein training the convolutional neural network model as a filter, the converter-based ViT model, and the classifier comprises:
acquiring training data, wherein the training data comprises training monitoring images and whether the monitoring range contains the true value of birds or not;
performing image blocking processing on the training monitoring image to obtain a sequence of training image blocks;
respectively passing each training image block in the sequence of training image blocks through the image resolution enhancer based on the automatic codec to obtain a sequence of training enhancement image blocks;
respectively passing each training enhancement image block in the sequence of training enhancement image blocks through the convolutional neural network model serving as a filter to obtain a sequence of training image local semantic feature vectors;
passing the sequence of training image local semantic feature vectors through the converter-based ViT model to obtain training image global context semantic understanding feature vectors;
Passing the training image global context semantic understanding feature vector through the classifier to obtain a classification loss function value; and
training the convolutional neural network model as a filter, the ViT model based on a converter and the classifier based on the classification loss function value and traveling in the direction of gradient descent, wherein in each round of iteration of the training, a feature affinity space affine learning iteration is performed on a weight matrix of the classifier.
7. The image processing-based power grid obstacle recognition processing method as recited in claim 6, wherein passing the training image global context semantic understanding feature vector through the classifier to obtain a classification loss function value, comprises:
the classifier processes the training image global context semantic understanding feature vector with a classification formula to generate a training classification result, wherein the classification formula is as follows:
O = softmax{ (Wₙ, Bₙ) : ⋯ : (W₁, B₁) | V }, where V represents the training image global context semantic understanding feature vector, W₁ to Wₙ represent the weight matrices of the fully connected layers of the classifier, and B₁ to Bₙ represent the corresponding bias matrices; and
calculating a cross entropy value between the training classification result and a true value as the classification loss function value.
8. The image processing-based power grid obstacle recognition processing method according to claim 7, wherein in each iteration of the training, feature affinity space affine learning is iterated on the weight matrix of the classifier with the following optimization formula;
wherein, the optimization formula is:
M′ = exp( log₂( ‖M‖_* / d ) · ( M Mᵀ M ) / ‖M‖₂ ) ⊙ M

wherein M represents the weight matrix of the classifier, Mᵀ represents the transpose of the weight matrix of the classifier, ‖M‖₂ represents the two-norm of the weight matrix of the classifier, ‖M‖_* represents the kernel norm of the weight matrix of the classifier, d is the scale of the weight matrix of the classifier, log₂ represents the logarithmic function with base 2, exp(·) represents the exponential operation of a matrix, which calculates the natural exponential function value raised to the power of the value at each position in the matrix, ⊙ represents multiplication by position, and M′ represents the weight matrix of the classifier after the iteration.
9. An image processing-based power grid obstacle recognition processing system is characterized by comprising:
the image acquisition module is used for acquiring a monitoring image acquired by a camera arranged on the power line tower;
the image blocking processing module is used for carrying out image blocking processing on the monitoring image to obtain a sequence of image blocks;
An automatic encoding and decoding module, configured to obtain a sequence of enhanced image blocks by respectively passing each image block in the sequence of image blocks through an image resolution enhancer based on an automatic encoder and decoder;
the feature extraction module is used for enabling each enhanced image block in the sequence of enhanced image blocks to pass through a convolutional neural network model serving as a filter respectively to obtain a sequence of image local semantic feature vectors;
the global coding module is used for enabling the sequence of the image local semantic feature vectors to pass through a ViT model based on a converter to obtain image global context semantic understanding feature vectors;
the monitoring result generation module is used for enabling the image global context semantic understanding feature vector to pass through a classifier to obtain a classification result, and the classification result is used for indicating whether birds are contained in a monitoring range; and
and the control module is used for controlling the compressed air to drive birds in response to the classification result that birds are contained in the monitoring range.
10. The image processing-based power grid obstacle recognition processing system of claim 9, wherein the automatic codec module comprises:
an encoding unit, configured to perform explicit spatial encoding on each image block in the sequence of image blocks by using a convolutional layer through an image resolution encoder of the automatic codec to obtain each image feature; and
And the decoding unit is used for carrying out deconvolution processing on the image features by using a deconvolution layer through an image resolution decoder of the automatic coder so as to obtain the sequence of the enhanced image blocks.
CN202310461859.1A 2023-04-26 2023-04-26 Power grid obstacle recognition processing method and system based on image processing Active CN116168352B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310461859.1A CN116168352B (en) 2023-04-26 2023-04-26 Power grid obstacle recognition processing method and system based on image processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310461859.1A CN116168352B (en) 2023-04-26 2023-04-26 Power grid obstacle recognition processing method and system based on image processing

Publications (2)

Publication Number Publication Date
CN116168352A true CN116168352A (en) 2023-05-26
CN116168352B CN116168352B (en) 2023-06-27

Family

ID=86414986

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310461859.1A Active CN116168352B (en) 2023-04-26 2023-04-26 Power grid obstacle recognition processing method and system based on image processing

Country Status (1)

Country Link
CN (1) CN116168352B (en)


Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019232830A1 (en) * 2018-06-06 2019-12-12 平安科技(深圳)有限公司 Method and device for detecting foreign object debris at airport, computer apparatus, and storage medium
AU2020101229A4 (en) * 2020-07-02 2020-08-06 South China University Of Technology A Text Line Recognition Method in Chinese Scenes Based on Residual Convolutional and Recurrent Neural Networks
US20210166350A1 (en) * 2018-07-17 2021-06-03 Xi'an Jiaotong University Fusion network-based method for image super-resolution and non-uniform motion deblurring
CN113255661A (en) * 2021-04-15 2021-08-13 南昌大学 Bird species image identification method related to bird-involved fault of power transmission line
WO2022047625A1 (en) * 2020-09-01 2022-03-10 深圳先进技术研究院 Image processing method and system, and computer storage medium
US20220230302A1 (en) * 2019-06-24 2022-07-21 Zhejiang University Three-dimensional automatic location system for epileptogenic focus based on deep learning
WO2022182353A1 (en) * 2021-02-26 2022-09-01 Hewlett-Packard Development Company, L.P. Captured document image enhancement
WO2022242029A1 (en) * 2021-05-18 2022-11-24 广东奥普特科技股份有限公司 Generation method, system and apparatus capable of visual resolution enhancement, and storage medium


Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116665134A (en) * 2023-07-28 2023-08-29 南京兴沧环保科技有限公司 Production monitoring equipment and method for radio frequency filter
CN116758359A (en) * 2023-08-16 2023-09-15 腾讯科技(深圳)有限公司 Image recognition method and device and electronic equipment
CN116844217A (en) * 2023-08-30 2023-10-03 成都睿瞳科技有限责任公司 Image processing system and method for generating face data
CN116844217B (en) * 2023-08-30 2023-11-14 成都睿瞳科技有限责任公司 Image processing system and method for generating face data
CN116864140A (en) * 2023-09-05 2023-10-10 天津市胸科医院 Intracardiac branch of academic or vocational study postoperative care monitoring data processing method and system thereof
CN116912831A (en) * 2023-09-15 2023-10-20 东莞市将为防伪科技有限公司 Method and system for processing acquired information of letter code anti-counterfeiting printed matter
CN117608283A (en) * 2023-11-08 2024-02-27 浙江孚宝智能科技有限公司 Autonomous navigation method and system for robot
CN117372528A (en) * 2023-11-21 2024-01-09 南昌工控机器人有限公司 Visual image positioning method for modularized assembly of mobile phone shell
CN117372528B (en) * 2023-11-21 2024-05-28 南昌工控机器人有限公司 Visual image positioning method for modularized assembly of mobile phone shell
CN117540935A (en) * 2024-01-09 2024-02-09 上海银行股份有限公司 DAO operation management method based on block chain technology
CN117540935B (en) * 2024-01-09 2024-04-05 上海银行股份有限公司 DAO operation management method based on block chain technology
CN117994253A (en) * 2024-04-03 2024-05-07 国网山东省电力公司东营供电公司 High-voltage distribution line ground fault identification method
CN117994253B (en) * 2024-04-03 2024-06-11 国网山东省电力公司东营供电公司 High-voltage distribution line ground fault identification method

Also Published As

Publication number Publication date
CN116168352B (en) 2023-06-27

Similar Documents

Publication Publication Date Title
CN116168352B (en) Power grid obstacle recognition processing method and system based on image processing
Zhang et al. Multi-scale attention with dense encoder for handwritten mathematical expression recognition
CN111325323B (en) Automatic power transmission and transformation scene description generation method integrating global information and local information
CN113936339B (en) Fighting identification method and device based on double-channel cross attention mechanism
CN109524006B (en) Chinese mandarin lip language identification method based on deep learning
CN110580292A (en) Text label generation method and device and computer readable storage medium
CN116245513B (en) Automatic operation and maintenance system and method based on rule base
CN109840322A (en) It is a kind of based on intensified learning cloze test type reading understand analysis model and method
Ma et al. Multi-feature fusion deep networks
CN115951883B (en) Service component management system of distributed micro-service architecture and method thereof
CN115471216B (en) Data management method of intelligent laboratory management platform
CN117058622A (en) Intelligent monitoring system and method for sewage treatment equipment
CN116152611B (en) Multistage multi-scale point cloud completion method, system, equipment and storage medium
CN117006654A (en) Air conditioner load control system and method based on edge calculation
Safdari et al. A hierarchical feature learning for isolated Farsi handwritten digit recognition using sparse autoencoder
Sharm et al. Deformable and Structural Representative Network for Remote Sensing Image Captioning.
CN116954113B (en) Intelligent robot driving sensing intelligent control system and method thereof
Wu CNN-Based Recognition of Handwritten Digits in MNIST Database
CN113868414A (en) Interpretable legal dispute focus summarizing method and system
Li et al. Supervised classification of plant image based on attention mechanism
CN112836752A (en) Intelligent sampling parameter control method based on feature map fusion of depth values
CN116524416B (en) Animal serum draws equipment for medical treatment experiments
Dutta et al. Sign Language Detection Using Action Recognition in Python
CN116258504B (en) Bank customer relationship management system and method thereof
CN111158640B (en) One-to-many demand analysis and identification method based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant