CN116363688A - Image processing method, device, equipment, medium and product


Info

Publication number
CN116363688A
Authority
CN
China
Prior art keywords
image
target
position information
instrument
image processing
Prior art date
Legal status
Pending
Application number
CN202310303092.XA
Other languages
Chinese (zh)
Inventor
吴新涛
Current Assignee
Jiayang Smart Security Technology Beijing Co ltd
Original Assignee
Jiayang Smart Security Technology Beijing Co ltd
Priority date
Filing date
Publication date
Application filed by Jiayang Smart Security Technology Beijing Co ltd filed Critical Jiayang Smart Security Technology Beijing Co ltd
Priority to CN202310303092.XA
Publication of CN116363688A

Classifications

    • G06V 30/42 - Document-oriented image-based pattern recognition based on the type of document
    • G06N 3/0464 - Convolutional networks [CNN, ConvNet]
    • G06V 10/82 - Image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V 20/40 - Scenes; scene-specific elements in video content
    • G06V 20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V 30/18 - Character recognition; extraction of features or characteristics of the image
    • G06V 30/19147 - Obtaining sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 2201/02 - Recognising information on displays, dials, clocks
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The embodiment of the present application provides an image processing method, device, equipment, medium and product, comprising the following steps: acquiring an image to be processed that contains a target object; extracting image features of the image to be processed with an image processing model, determining a target bounding box from the image features, and obtaining position information of a plurality of scale values in the meter image by performing text recognition on the meter image; switching the meter image selected by the target bounding box from image space to parameter space to obtain a switched meter image, and determining the position information of the meter pointer in the meter image from the switched meter image; and determining the target scale value pointed to by the meter pointer from the position information of the plurality of scale values and the position information of the meter pointer. The embodiment improves both the accuracy and the efficiency of reading scale values on a digital meter.

Description

Image processing method, device, equipment, medium and product
Technical Field
The present application belongs to the technical field of image processing, and in particular relates to an image processing method, device, equipment, medium and product.
Background
With the continuing spread of digitization, digital meters are used in most industrial scenes, so being able to read their parameters accurately is important. In practice, however, the scale values on a digital meter usually have to be read manually, and because such meters are scattered across every corner of an industrial site, staff often have to search the whole site for a meter before they can read and record the scale value it displays. This makes the readings both less accurate and less efficient to obtain.
Disclosure of Invention
The embodiments of the present application provide an image processing method, device, equipment, medium and product that improve both the accuracy and the efficiency of reading scale values on a digital meter.
In a first aspect, an embodiment of the present application provides an image processing method, including:
acquiring an image to be processed that contains a target object, the target object comprising a meter image;
extracting image features of the image to be processed with an image processing model, determining a target bounding box from the image features, the image selected by the target bounding box being the meter image, and obtaining position information of a plurality of scale values in the meter image by performing text recognition on the meter image;
switching the meter image selected by the target bounding box from image space to parameter space to obtain a switched meter image, and determining the position information of the meter pointer in the meter image from the switched meter image;
determining the target scale value pointed to by the meter pointer from the position information of the plurality of scale values and the position information of the meter pointer.
In an optional implementation of the first aspect, extracting image features of the image to be processed with the image processing model, determining the target bounding box from the image features, and obtaining position information of the plurality of scale values in the meter image by performing text recognition on the meter image includes:
extracting image features of the image to be processed with a first network of the image processing model, determining bounding box information for each of a plurality of first bounding boxes from the image features, the bounding box information comprising a bounding box confidence and bounding box position information, and determining the target bounding box from the confidences and position information of the first bounding boxes;
performing text recognition, with a second network of the image processing model, on the meter image framed by the target bounding box to obtain position information of the plurality of scale values in the meter image.
In an optional implementation of the first aspect, determining the target scale value pointed to by the meter pointer from the position information of the plurality of scale values and the position information of the meter pointer includes:
determining, from the plurality of scale values and based on the position information of a target intersection point and of the scale values, a first scale value nearest the target intersection point in the counter-clockwise direction of the meter pointer and a second scale value nearest it in the clockwise direction, the target intersection point being the point where the meter pointer intersects a scale-value line segment, and the scale-value line segment being obtained by connecting the plurality of scale values in a preset order;
determining the difference between the second scale value and the first scale value as a first value;
determining the sum of the first scale value and a target value as the target scale value pointed to by the meter pointer, the target value being the product of the first value and a second value, and the second value being determined from the relative distances between the meter pointer and the first and second scale values respectively.
In an optional implementation of the first aspect, determining the target bounding box from the bounding box confidences and bounding box position information of the plurality of first bounding boxes includes:
determining the first bounding box with the highest confidence among the plurality of first bounding boxes as a second bounding box, and the remaining first bounding boxes as third bounding boxes;
calculating the intersection-over-union (IoU) of the second bounding box with each third bounding box from their bounding box position information;
determining, among the third bounding boxes, those whose IoU with the second bounding box is below a preset IoU threshold, and taking them together with the second bounding box as the target bounding box.
In an optional implementation of the first aspect, the training method of the image processing model includes:
acquiring a training sample set, the training sample set comprising a plurality of image samples to be processed and position information of a plurality of label scale values corresponding to each image sample to be processed;
extracting reference image features of an image sample with a first network of a preset image processing model, determining reference bounding box information for each of a plurality of first reference bounding boxes from the reference image features, the reference bounding box information comprising a reference confidence and reference position information, and determining a reference bounding box from the reference confidences and reference position information of the first reference bounding boxes;
performing text recognition, with a second network of the preset image processing model, on the meter image sample selected by the reference bounding box to obtain position information of a plurality of reference scale values in the meter image sample;
determining a loss function value of the preset image processing model from the position information of the plurality of label scale values and the position information of the plurality of reference scale values of a target image sample, the target image sample being any one of the image samples to be processed;
training the preset image processing model with the image samples, guided by the loss function value, to obtain the trained image processing model.
In an optional implementation of the first aspect, before acquiring the training sample set, the method further includes:
acquiring a plurality of original images containing the target object;
preprocessing each of the original images according to a preset image preprocessing mode to obtain a plurality of image samples to be processed corresponding to each original image, the preset image preprocessing mode comprising an image enhancement operation and an image normalization operation.
In a second aspect, an embodiment of the present application provides an image processing apparatus, including:
an acquisition module for acquiring an image to be processed that contains a target object, the target object comprising a meter image;
a processing module for extracting image features of the image to be processed with the image processing model, determining a target bounding box from the image features, the image selected by the target bounding box being the meter image, and obtaining position information of a plurality of scale values in the meter image by performing text recognition on the meter image;
a switching module for switching the meter image selected by the target bounding box from image space to parameter space to obtain a switched meter image, and determining the position information of the meter pointer in the meter image from the switched meter image;
a determining module for determining the target scale value pointed to by the meter pointer from the position information of the plurality of scale values and the position information of the meter pointer.
In a third aspect, there is provided an electronic device comprising: a memory for storing computer program instructions; a processor for reading and executing computer program instructions stored in a memory to perform the image processing method provided in any of the alternative embodiments of the first aspect.
In a fourth aspect, there is provided a computer storage medium having stored thereon computer program instructions which, when executed by a processor, implement the image processing method provided by any of the alternative embodiments of the first aspect.
In a fifth aspect, a computer program product is provided; when its instructions are executed by a processor of an electronic device, they cause the electronic device to perform the image processing method provided in any optional embodiment of the first aspect.
In the embodiment of the present application, an image to be processed containing the target object is acquired, image features are extracted from it with the image processing model, and a target bounding box is determined from those features. Because the image selected by the target bounding box is the meter image, the position information of a plurality of scale values in the meter image can be obtained by performing text recognition on it. At the same time, the meter image selected by the target bounding box can be switched from image space to parameter space to obtain the switched meter image, and the position information of the meter pointer determined from it. The target scale value pointed to by the meter pointer can then be determined from the position information of the plurality of scale values and of the meter pointer, improving both the accuracy and the efficiency of reading scale values on the digital meter.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed for the embodiments are briefly described below; a person skilled in the art may obtain other drawings from these drawings without inventive effort.
Fig. 1 is a schematic diagram of a training flow of an image processing model in an image processing method according to an embodiment of the present application;
fig. 2 is a schematic flow chart of an image processing method according to an embodiment of the present application;
fig. 3 is a schematic structural view of an image processing apparatus according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Features and exemplary embodiments of various aspects of the present application are described in detail below to make the objects, technical solutions and advantages of the present application more apparent, and to further describe the present application in conjunction with the accompanying drawings and the detailed embodiments. It should be understood that the specific embodiments described herein are intended to be illustrative of the application and are not intended to be limiting. It will be apparent to one skilled in the art that the present application may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present application by showing examples of the present application.
It is noted that relational terms such as first and second are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between those entities or actions. Moreover, the terms "comprises", "comprising", or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to it. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article or apparatus that comprises it.
The term "and/or" is herein merely an association relationship describing an associated object, meaning that there may be three relationships, e.g., a and/or B, may represent: a exists alone, A and B exist together, and B exists alone.
In order to solve the prior-art problem that manually reading the scale values displayed on a digital meter yields low accuracy and low efficiency, the embodiments of the present application provide an image processing method, device, equipment, medium and product.
The execution subject of the image processing method provided in the embodiments of the present application may be an image processing apparatus, or a control module within the image processing apparatus for executing the method. In the embodiments below, the method is described taking an image processing apparatus as the execution subject.
In addition, the image processing method provided in the embodiments of the present application processes the image to be processed with a pre-trained image processing model, so the model must be trained before it is used for image processing. Accordingly, a specific implementation of the training method for the image processing model is described below with reference to the accompanying drawings.
The embodiment of the present application provides a training method for an image processing model, executed by an image processing apparatus; as shown in fig. 1, the method may be implemented by the following steps:
s110, acquiring a training sample set.
The training sample set may include a plurality of image samples to be processed and the position information of a plurality of label scale values corresponding to each image sample. Each image sample to be processed may include a reference object, which may be a meter image sample.
In order to obtain a more accurate training sample set and thus better train the image processing model, in a specific embodiment, obtaining the training sample set may specifically include the following steps:
and step 1, acquiring a plurality of image samples to be processed.
Specifically, the image processing apparatus may directly obtain, through the monitoring device, a plurality of image samples to be processed within a preset period of time. The preset time period may be set from practical experience or requirements, for example one month or three months, and is not specifically limited here.
Specifically, the image processing apparatus may capture meter video of the job site within the preset period of time with the monitoring device and extract a plurality of meter image samples from the video with a separation algorithm. The monitoring device may be a device installed at the work site, capable of imaging the work scene, and located within a horizontal distance of 100 meters of it. The separation algorithm may be any suitable algorithm chosen for the actual situation and is not limited here.
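The separation step might look like the following minimal Python sketch; the fixed sampling interval and the use of OpenCV are assumptions, since the patent does not name the separation algorithm.

```python
import cv2  # pip install opencv-python


def sample_frames(video_path: str, every_n_frames: int = 150) -> list:
    """Keep one frame out of every `every_n_frames` frames of a meter video."""
    cap = cv2.VideoCapture(video_path)
    frames, index = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of video
            break
        if index % every_n_frames == 0:
            frames.append(frame)
        index += 1
    cap.release()
    return frames
```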
Step 2: labeling the position information of the plurality of label scale values corresponding one-to-one to the plurality of image samples to be processed.
Specifically, the position information of the plurality of label scale values of each image sample may be labeled manually, or directly by the image processing apparatus; the specific labeling method is not limited here.
During labeling, 75% of the labeled sample data may be used as training samples and 25% as test samples; the exact split ratio between training and test samples is not restricted here.
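A minimal sketch of such a split, assuming a simple seeded random shuffle:

```python
import random


def split_samples(samples: list, train_ratio: float = 0.75, seed: int = 0):
    """Split labeled samples into train and test subsets (75/25 by default)."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```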
The image processing model needs multiple training iterations to adjust the loss function value until it meets the training stop condition, yielding the trained image processing model. If only one image sample were input per iteration, however, the sample size would be too small for effective training adjustment. The training sample set therefore needs to contain a plurality of image samples to be processed, so that the model can be trained iteratively on batches of them.
Thus, by annotating the acquired image samples, the label information corresponding one-to-one to the plurality of image samples is obtained, and a training sample set containing a plurality of image samples to be processed is formed, which facilitates the subsequent model training.
S120, extracting reference image features of an image sample to be processed with the first network of the preset image processing model, determining reference bounding box information for each of a plurality of first reference bounding boxes from the reference image features, and determining the reference bounding box from the reference confidences and reference position information of the first reference bounding boxes.
The first network may be a YOLOv5 network; YOLOv5 comprises a backbone network, a feature pyramid network and a network head. To achieve better performance, the embodiments of the present application incorporate an adaptive spatial feature fusion (ASFF) module. On this basis, the YOLOv5 network model of the embodiments may use feature maps downsampled by factors of 8, 16 and 32.
In some embodiments, the reference bounding box information mentioned above may include a reference confidence and reference position information. The image sample framed by a first reference bounding box may include a meter image sample. The position information of a first reference bounding box may be the pixel coordinates of its top-left and bottom-right vertices in the image to be processed, and is not specifically limited here.
Specifically, after acquiring the training sample set, the image processing apparatus may input the image samples into the first network of the preset image processing model. The first network extracts the reference bounding box information of each of a plurality of first reference bounding boxes in the image sample; since that information includes a reference confidence and reference position information, the reference bounding box can be determined from the reference confidences and reference position information of the first reference bounding boxes.
In one example, after acquiring the training sample set, the image processing apparatus may input the image samples into the first network of the preset image processing model; by extracting and processing the reference image features of an image sample, the first network may output the position information of the first reference bounding boxes whose reference confidence exceeds a first preset threshold. The first preset threshold may be set from practical experience or requirements, for example 0.7, and is not limited here.
It should be noted that when a sample to be processed is input into the preset image processing model, it may be scaled to a suitable size; in addition, the meter image is normalized in order to safeguard training speed and efficiency.
In addition, in a petroleum operation scene the camera is far from the equipment, the petroleum pipeline pressure gauge dial is a small target, and the scene contains much interference, so dial localization and scale recognition by a plain object detection algorithm are heavily disturbed by the background, introducing confusion and errors into the subsequent reading of the pointer meter. The embodiments of the present application therefore add the EAST text detection algorithm on top of object detection to localize and recognize the scales within the dial. The EAST algorithm can accurately locate the scale values and their positions in the dial, providing strong prior information for the subsequent dial reading.
Adaptive spatial feature fusion module (ASFF):
Pressure gauge dials in petroleum operation scenes may appear at different angles, which affects detection to some extent. To give the object detector stronger image-feature extraction on dials at different angles, so that the dial's position can be identified accurately and better prior knowledge provided for the subsequent steps, an adaptive spatial feature fusion (ASFF) module is added to the object detection model; it improves the feature pyramid's fusion capability and thereby the detector's performance.
In the YOLOv5 network, to make full use of the semantic information of high-level features and the fine-grained detail of low-level features, the network outputs multi-level features through a feature pyramid network (FPN) to predict over multiple feature maps, so that it can detect targets at various scales while fusing high-level and low-level features. However, conflicts exist between the feature maps at different levels of this structure; they interfere with gradient computation during training and reduce the feature pyramid's effectiveness.
The embodiments of the present application add the adaptive spatial feature fusion module to resolve this conflict. First, three feature maps of different sizes are generated by the FPN module; then the ASFF module spatially fuses them, connecting the feature maps of different sizes to weaken the conflict between them. The specific steps are as follows:
step 1: and carrying out up-down sampling operation on the rest feature images for the output of the first-level feature image to obtain feature images with the same size and depth, so that the subsequent fusion is convenient.
Step 2: the processed 3-level feature maps are output and input into a convolution of 1×1×n to obtain 3 spatial weight vectors, each size being n×h×w.
Step 3: then the channel direction is spliced to obtain a weight fusion graph of 3n multiplied by h multiplied by w
Step 4: in order to obtain a weight map with a channel of 3, a convolution of 1×1×3 is applied to the feature map to obtain a weight vector of 3×h×w.
Step 5: normalizing in the channel direction, and multiplying and adding 3 vectors to the 3 feature images to obtain a fused c×h×w feature image.
Step 6: a 3 x 3 convolution is used to obtain a predicted output result with an output channel of 256.
The fusion formula of step 5 is
$y_{ij}^{l} = \alpha_{ij}^{l}\,x_{ij}^{1\to l} + \beta_{ij}^{l}\,x_{ij}^{2\to l} + \gamma_{ij}^{l}\,x_{ij}^{3\to l}$
where $\alpha$, $\beta$ and $\gamma$ are the weight coefficients of the feature fusion and $x^{n\to l}$ denotes the features of the different hierarchical feature maps resampled to level $l$.
This weighted fusion learns to spatially filter out conflicting information. In the original feature pyramid, a location may be a positive sample in one level's feature map but a negative sample at the corresponding position in another level's; that discontinuity disturbs the gradient results and lowers training efficiency. With the ASFF module added, the weight parameters can suppress the weight coefficients of such negative samples so that their gradients no longer disturb the results, filtering out the conflicting information and further strengthening the network's feature fusion capability.
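Assuming a PyTorch implementation, steps 1 to 6 above might be sketched as follows; the channel count, the weight dimension n and the use of softmax for the channel-wise normalization are assumptions consistent with the description.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ASFF(nn.Module):
    """Minimal ASFF sketch: three same-sized feature maps fused per pixel."""

    def __init__(self, channels: int = 256, weight_dim: int = 16):
        super().__init__()
        # Step 2: a 1x1 conv per level produces an n x h x w weight vector.
        self.weight_convs = nn.ModuleList(
            [nn.Conv2d(channels, weight_dim, kernel_size=1) for _ in range(3)]
        )
        # Step 4: a 1x1 conv squeezes the 3n x h x w stack to 3 channels.
        self.weight_fuse = nn.Conv2d(3 * weight_dim, 3, kernel_size=1)
        # Step 6: a 3x3 conv produces the fused prediction features.
        self.out_conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x1, x2, x3):
        # Step 1 (resampling all levels to one size) is assumed done by the caller.
        levels = [x1, x2, x3]
        # Steps 2-3: per-level weight vectors, concatenated along channels.
        w = torch.cat([conv(x) for conv, x in zip(self.weight_convs, levels)], dim=1)
        # Steps 4-5: 3-channel weight map normalized with a channel softmax;
        # the three channels play the roles of alpha, beta and gamma.
        w = F.softmax(self.weight_fuse(w), dim=1)
        fused = sum(w[:, i:i + 1] * levels[i] for i in range(3))
        return self.out_conv(fused)
```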
S130, performing text recognition on the meter image sample selected by the reference bounding box with the second network of the preset image processing model to obtain position information of a plurality of reference scale values included in the meter image sample.
The second network may be determined by a text detection algorithm, which may be chosen from actual conditions or experience, for example the Efficient and Accurate Scene Text (EAST) detector, and is not specifically limited here.
Specifically, since the image sample framed by the first reference bounding box includes the meter image sample, the image processing device may perform text recognition on the meter image sample framed by the reference bounding box through the second network of the preset image processing model after obtaining the reference bounding box, so as to obtain position information of a plurality of reference scale values included in the meter image sample.
When the meter image is input to the second network of the preset image processing model, the meter image may be scaled to a proper size and normalized, and then the meter image may be input to the second network of the preset image processing model.
The application scene of this embodiment is petroleum operations, where obtaining the scale values of a petroleum pipeline pressure gauge dial is particularly troublesome. Moreover, reading the scales inside a dial differs from conventional text detection: the original EAST algorithm handles conventional text well, but scale-reading recognition in this scene requires a backbone network with stronger feature extraction capability.
The original EAST algorithm first feeds the image into an FCN structure and generates a single-channel pixel-level text score map and a multi-channel geometry map. Text regions take two geometric forms, rotated boxes and horizontal boxes, with a different loss function designed for each geometry. A threshold is then applied to each predicted region: geometries whose score exceeds the predetermined threshold are considered valid and kept for the subsequent non-maximum suppression, which outputs the network's final result.
The embodiments of the present application improve the EAST algorithm for the petroleum operation scene; the improved algorithm is more accurate than candidate-box-based object detection algorithms, detects text more accurately in images with widely varying text scales, uses cleaner training samples and an optimized network structure, and thereby further improves detection accuracy. The improved EAST algorithm consists of five main parts: the algorithm's neural network structure, a loss function optimized with focal loss, oblique locality-aware non-maximum suppression (NMS), image segmentation optimization based on variable scales, and scale-cropped training samples. The improved neural network structure has three main parts: a feature extraction branch, a feature merging branch and an output layer. In each merging stage of the feature merging branch, the feature map of the feature extraction branch's f1 stage is fed into an unpooling layer whose output is twice the size of the previous stage's input; the maps are then merged stage by stage, which incurs part of the computational cost. To improve the algorithm's efficiency, the number of channels of Conv1 is reduced, the local convolution features are then combined, and a Conv3 operation produces the output of the f3 stage. After all feature merging stages, the result of the feature extraction branch's f4 stage is fed into the output layer.
The EAST algorithm as improved in the embodiments of the present application is faster than the original, performs better, and is suited to complex petroleum operation scenes.
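As a rough illustration only, one feature-merging stage of the kind described above might be sketched in PyTorch as follows; the channel sizes and the use of bilinear upsampling in place of the unpooling layer are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MergeStage(nn.Module):
    """EAST-style merge: unpool 2x, concat a backbone feature, Conv1 then Conv3."""

    def __init__(self, in_ch: int, skip_ch: int, out_ch: int):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch + skip_ch, out_ch, kernel_size=1)  # channel reduction
        self.conv3 = nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1)  # local fusion

    def forward(self, x: torch.Tensor, skip: torch.Tensor) -> torch.Tensor:
        # Unpooling: the output is twice the size of the previous stage's input.
        x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
        x = torch.cat([x, skip], dim=1)  # merge with the extraction-branch feature
        return self.conv3(self.conv1(x))
```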
S140, determining a loss function value of the preset image processing model from the position information of the plurality of label scale values and the position information of the plurality of reference scale values of the target image sample to be processed.
The target image sample to be processed is any one of the plurality of image samples to be processed.
Specifically, the image processing apparatus may obtain the position information of the plurality of reference scale values for any one of the image samples and then accurately determine the loss function value of the preset image processing model from the position information of the plurality of label scale values corresponding to that sample, so that the preset model can be trained iteratively on the loss function value and a more accurate image processing model obtained.
S150, training the preset image processing model with the image samples to be processed, guided by the loss function value of the preset image processing model, to obtain the trained image processing model.
Specifically, to obtain a well-trained image processing model: while the loss function value does not meet the training stop condition, the model parameters of the preset image processing model are adjusted and the adjusted model continues to be trained on the image samples, until the loss function value meets the stop condition and the trained image processing model is obtained.
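A minimal sketch of this loop, assuming a PyTorch model whose forward pass returns the batch loss; the optimizer choice and the concrete stop condition are assumptions.

```python
import torch


def train(model, data_loader, max_epochs: int = 100, target_loss: float = 0.01):
    """Adjust model parameters until the loss meets the stop condition."""
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    for _ in range(max_epochs):
        epoch_loss = 0.0
        for batch in data_loader:
            loss = model(batch)  # loss between reference and label scale positions
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        if epoch_loss / len(data_loader) < target_loss:  # training stop condition
            break
    return model
```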
It should be noted that the image processing model mentioned above may also output the category of the image sample to be processed; the category may be that of the target object, for example a meter image, or another image category.
In this embodiment, the image processing apparatus acquires the training sample set. Since the set contains a plurality of image samples to be processed, the reference image features of each sample can be extracted with the preset image processing model, the reference bounding box information of a plurality of first reference bounding boxes determined from those features, and the reference bounding box determined from that information. Text recognition can then be performed, with the second network of the preset model, on the meter image sample framed by the reference bounding box to obtain the position information of the plurality of reference scale values it contains. Because the training sample set also contains the position information of the label scale values for each sample, the loss function value is determined from the label and reference scale positions of the target sample, and the preset model is trained on the image samples until the loss value meets the training stop condition, yielding a more accurate image processing model.
On this basis, considering that images with a single kind of image feature do not help model training, and in order to give the acquired images diversity so that the trained model is more robust, in one embodiment the image processing method described above may further include, before acquiring the training sample set:
acquiring a plurality of original images containing target objects;
and respectively preprocessing the plurality of original images according to a preset image preprocessing mode to obtain a plurality of image samples to be processed corresponding to each original image.
Specifically, the image processing device may acquire a plurality of original images before acquiring the training sample set, and respectively perform preprocessing on the plurality of original images according to a preset image preprocessing manner, so as to obtain a plurality of image samples to be processed corresponding to each original image.
In some embodiments, the preset image preprocessing mode mentioned above includes an image enhancement operation and an image normalization operation. The image enhancement operations may include multi-source and single-source data enhancement: multi-source enhancement includes Mosaic and Mixup; single-source enhancement includes HSV enhancement and random flipping. The image normalization operation limits the preprocessed image data to a range, eliminating the adverse effects of singular sample data; specifically, it is $I = (I - m)/\sigma$, where $m$ is the mean of the image pixels and $\sigma$ their variance. In addition, multiple image samples to be processed can be obtained by copying on the basis of the original image.
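A minimal sketch of the normalization I = (I - m)/σ and a random flip, assuming per-image statistics and H x W x C numpy arrays:

```python
import numpy as np


def normalize(image: np.ndarray) -> np.ndarray:
    """I = (I - m) / sigma, with sigma read as the pixel standard deviation."""
    m, sigma = image.mean(), image.std()
    return (image - m) / (sigma + 1e-8)  # epsilon guards against flat images


def random_flip(image: np.ndarray, p: float = 0.5) -> np.ndarray:
    """Randomly mirror the image horizontally (single-source enhancement)."""
    return image[:, ::-1].copy() if np.random.rand() < p else image
```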
It should be noted that a sensible data enhancement scheme can significantly improve detector performance. Because the data is confidential and not publicly available, the dataset of petroleum pipeline pressure gauges was collected from a variety of scenes. For the relatively small number of samples, a generative adversarial network (GAN) is used for data enhancement: in the GAN framework, the learning process is a minimax game between two networks, a generator that produces synthetic data from a given random noise vector and a discriminator that distinguishes real data from the generator's synthetic data. Simulating pictures under different lighting conditions in this way noticeably improves the detection effect.
In this embodiment, the image processing apparatus acquires a plurality of original images before acquiring the training sample set and preprocesses each according to the preset image preprocessing mode to obtain a plurality of image samples to be processed per original image. This avoids single-feature images and yields a large number of image samples, making it easier to train an accurate image processing model.
Based on the image processing model obtained through training in the above embodiment, the embodiment of the present application further provides a specific implementation manner of the image processing method, which is specifically described in detail with reference to fig. 2.
S210, acquiring a to-be-processed image containing the target object.
Wherein the target object referred to above may comprise a meter image.
The image processing apparatus may acquire surveillance video in real time through monitoring equipment installed at the work site and separate out, from the video, the image to be processed that contains the target object.
S220, extracting image features of the image to be processed with the image processing model, determining a target bounding box from the image features, and obtaining position information of a plurality of scale values in the meter image by performing text recognition on the meter image.
The image selected by the target bounding box is the meter image.
Specifically, after acquiring the image to be processed that contains the target object, the image processing apparatus may extract image features from it with the image processing model and determine a target bounding box from those features; since the image selected by the target bounding box is the meter image, position information of a plurality of scale values in the meter image can be obtained by performing text recognition on it.
S230, switching the meter image selected by the target bounding box from image space to parameter space to obtain a switched meter image, and determining the position information of the meter pointer in the meter image from the switched meter image.
After determining the target bounding box, the image processing apparatus switches the meter image it selects from image space to parameter space to obtain the switched meter image, from which the position information of the meter pointer can then be determined.
In one example, the image processing apparatus may switch the meter image from image space to parameter space with a Hough transform, detecting the dial center and the position information of the meter pointer in the parameter space of the switched meter image.
It should be noted that, to ensure that the meter image selected by the target bounding box can be recognized accurately, it may be preprocessed first to obtain a more accurate meter image. In one embodiment, before switching the image from image space to parameter space, the image processing apparatus may scale the meter image selected by the target bounding box to a specified size, apply basic image processing and correction to obtain a corrected meter image, binarize the corrected image, and remove the interference regions of the binarized image with an optimization algorithm.
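A minimal OpenCV sketch of this preprocessing together with the Hough step of S230; all parameter values, and the use of Otsu thresholding and a morphological opening for the unspecified binarization and interference-removal operations, are assumptions.

```python
import cv2
import numpy as np


def locate_pointer(meter_bgr: np.ndarray):
    """Binarize the cropped meter image, then find the dial circle and pointer line."""
    img = cv2.resize(meter_bgr, (320, 320))  # scale to a specified size
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Binarization (Otsu stands in for the unspecified method).
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # Remove small interference regions with a morphological opening.
    binary = cv2.morphologyEx(binary, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
    # Dial center via the circle Hough transform (image space -> parameter space).
    circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1, minDist=100,
                               param1=100, param2=30, minRadius=50, maxRadius=160)
    # Pointer via the probabilistic line Hough transform.
    lines = cv2.HoughLinesP(binary, 1, np.pi / 180, threshold=50,
                            minLineLength=60, maxLineGap=5)
    return circles, lines
```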
S240, determining a target scale value pointed by the meter pointer based on the position information of the scale values and the position information of the meter pointer.
After obtaining the position information of the plurality of scale values and of the meter pointer in the meter image, the image processing apparatus may determine the target scale value pointed to by the meter pointer from that position information.
In the embodiment of the present application, an image to be processed containing the target object is acquired, image features are extracted from it with the image processing model, and a target bounding box is determined from those features. Because the image selected by the target bounding box is the meter image, the position information of a plurality of scale values in the meter image can be obtained by performing text recognition on it. At the same time, the meter image selected by the target bounding box can be switched from image space to parameter space to obtain the switched meter image, and the position information of the meter pointer determined from it. The target scale value pointed to by the meter pointer can then be determined from the position information of the plurality of scale values and of the meter pointer, improving both the accuracy and the efficiency of reading scale values on the digital meter.
In order to describe the image processing method provided in the embodiments of the present application more precisely, in one embodiment the step S220 mentioned above may include the following steps:
extracting image features of the image to be processed with the first network of the image processing model, determining bounding box information for each of a plurality of first bounding boxes from the image features, the bounding box information comprising a bounding box confidence and bounding box position information, and determining the target bounding box from the confidences and position information of the first bounding boxes;
performing text recognition, with the second network of the image processing model, on the meter image framed by the target bounding box to obtain position information of the plurality of scale values in the meter image.
Specifically, since the image processing model mentioned above may include the first network and the second network, the image features of the image to be processed can be extracted with the first network after the image is obtained, and the bounding box information of each of a plurality of first bounding boxes determined from those features. Because the bounding box information includes a bounding box confidence and bounding box position information, the image processing apparatus can determine the target bounding box from the confidences and position information of the first bounding boxes, and then perform text recognition with the second network on the meter image selected by the target bounding box to obtain the position information of the plurality of scale values it includes.
In this embodiment, since the image processing model includes the first network and the second network, the image to be processed can be processed by the first network to determine the target bounding box, and the meter image framed by the target bounding box processed by the second network to determine the position information of the plurality of scale values in the meter image. In this way the target scale value pointed to by the meter pointer can be determined conveniently.
Because object detection usually generates a large number of mutually overlapping candidate boxes at the same location, in order to determine a more accurate target bounding box, in one embodiment the step of determining the target bounding box from the bounding box confidences and bounding box position information of the plurality of first bounding boxes may specifically include the following steps:
determining the first bounding box with the highest confidence among the plurality of first bounding boxes as a second bounding box, and the first bounding boxes other than the second bounding box as third bounding boxes;
calculating the intersection-over-union (IoU) of the second bounding box with each third bounding box from their bounding box position information;
determining, among the third bounding boxes, those whose IoU with the second bounding box is below a preset IoU threshold, and taking them together with the second bounding box as the target bounding box.
The preset IoU threshold may be set in advance from practical experience or circumstances and is not limited in any way here.
Specifically, after obtaining the bounding box information of the plurality of first bounding boxes, the image processing apparatus may determine the first bounding box with the highest confidence as the second bounding box and the remaining first bounding boxes as third bounding boxes; it may then calculate the IoU of the second bounding box with each third bounding box from their position information, and finally take the third bounding boxes whose IoU with the second bounding box is below the preset threshold, together with the second bounding box, as the target bounding box.
It should be noted that the intersection-over-union may be calculated by the following formula (1):
$\mathrm{IoU}(b_{high}, b_{rest}) = \dfrac{\operatorname{area}(b_{high} \cap b_{rest})}{\operatorname{area}(b_{high} \cup b_{rest})} \quad (1)$
where $b_{high}$ is the second bounding box and $b_{rest}$ is a third bounding box.
In this embodiment, the image processing apparatus accurately determines the target bounding box from the confidences and positions of the plurality of first bounding boxes, so that the text in the meter image framed by the target bounding box can be recognized accurately afterwards.
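A minimal sketch of this selection, assuming boxes given as (x1, y1, x2, y2) tuples with confidence scores:

```python
def iou(a, b) -> float:
    """Intersection-over-union of two (x1, y1, x2, y2) boxes, as in formula (1)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / (union + 1e-8)


def select_target_boxes(boxes, scores, iou_threshold: float = 0.5):
    """Keep the most confident box plus those overlapping it less than the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    best = order[0]  # the "second bounding box"
    return [best] + [i for i in order[1:]
                     if iou(boxes[best], boxes[i]) < iou_threshold]
```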
On this basis, in order to describe the image processing method provided in the embodiments more accurately and comprehensively, in one embodiment the step S240 mentioned above may include the following steps:
determining, from the plurality of scale values and based on the position information of the target intersection point and of the scale values, a first scale value nearest the target intersection point in the counter-clockwise direction of the meter pointer and a second scale value nearest it in the clockwise direction;
determining the difference between the second scale value and the first scale value as a first value;
determining the sum of the first scale value and a target value as the target scale value pointed to by the meter pointer.
The target intersection point is the point where the meter pointer intersects a scale-value line segment, the scale-value line segment being obtained by connecting the plurality of scale values in a preset order. The target value is the product of the first value and a second value, the second value being determined from the relative distances between the meter pointer and the first and second scale values respectively.
In one example, the plurality of scale values are connected in a preset order, based on their position information, to obtain the scale-value line segment; the preset order may be set from practical experience or circumstances and is not specifically limited here. From the position information of the target intersection point where the meter pointer crosses the line segment and the position information of the scale values, the first scale value nearest the intersection counter-clockwise of the pointer and the second scale value nearest it clockwise can be determined. The difference between the second and first scale values is then taken as the first value; the product of the first value and the second value, which is determined from the pointer's relative distances to the two scale values, gives the target value; and finally the sum of the first scale value and the target value is determined as the target scale value pointed to by the meter pointer.
In addition, because the distances between adjacent scale values on the instrument may not all be the same, the average distance between adjacent scale values can be calculated first, and the target scale value can then be calculated on that basis, as shown in formula (2) and formula (3):

λ = (1 / (n − 1)) · Σ_{i=1}^{n−1} |L_{i+1} − L_i|    (2)

d' = A + (B − A) · d_1 / (d_1 + d_2)    (3)

where d_1 is the relative distance between the meter pointer and the first scale value, d_2 is the relative distance between the meter pointer and the second scale value, A is the first scale value, B is the second scale value, d' is the target scale value, λ is the average distance between adjacent scale values, n is the number of scale values, and L_i is the position of the i-th scale value.
In this embodiment, the target scale value pointed to by the meter pointer can be accurately calculated based on the position information of the plurality of scale values and the position information of the meter pointer, which improves both the accuracy and the efficiency of reading the scale value of the instrument.
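For illustration only, the interpolation of formula (3) can be sketched in Python as below; the coordinate-based helper and its argument names are hypothetical, and the average-distance correction of formula (2) is omitted for brevity.

import math

def target_scale_value(A, B, pos_A, pos_B, p_intersect):
    """Interpolate between neighbouring scale values A and B (formula (3))."""
    d1 = math.dist(p_intersect, pos_A)   # relative distance to the first scale value
    d2 = math.dist(p_intersect, pos_B)   # relative distance to the second scale value
    second_value = d1 / (d1 + d2)        # fractional position between A and B
    first_value = B - A                  # difference of the two scale values
    return A + first_value * second_value

# Example: the pointer crosses 40% of the way from the 30 mark to the 40 mark.
print(target_scale_value(30, 40, (0.0, 0.0), (10.0, 0.0), (4.0, 0.0)))  # 34.0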
According to the image processing method provided by the embodiments of the present application, recognition and detection of petroleum pipeline pressure instrument readings are realized through the target detection technology, text detection, and image processing algorithms. The inventor also performed field tests: training and testing were carried out on a site-scene dataset captured and collected by the inventor, using an Intel Core i7 CPU, 4 GB of memory, and an NVIDIA GeForce 2080 Ti discrete graphics card. The specific results are shown in Table 1 below.
Table 1 statistical table of test results
Based on the same inventive concept, the embodiment of the application also provides an image processing device. The image processing apparatus provided in the embodiment of the present application will be described in detail with reference to fig. 3.
Fig. 3 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application.
As shown in fig. 3, the image processing apparatus 300 may include: an acquisition module 310, a processing module 320, a switching module 330, and a determination module 340.
An acquisition module 310 is configured to acquire a to-be-processed image including a target object, where the target object includes an instrument image.
The processing module 320 is configured to extract image features of the image to be processed based on an image processing model, determine a target bounding box based on the image features (the image framed by the target bounding box being the meter image), and obtain position information of a plurality of scale values included in the meter image by performing text recognition on the meter image.
The switching module 330 is configured to switch the meter image framed by the target bounding box from image space to parameter space to obtain a switched meter image, and to determine the position information of the meter pointer in the meter image based on the switched meter image.
The determining module 340 is configured to determine a target scale value pointed by the meter pointer based on the position information of the plurality of scale values and the position information of the meter pointer.
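The switch from image space to parameter space performed by the switching module 330 is not tied to a named transform in this application; a Hough line transform is one common realization, so the OpenCV sketch below is an assumption for illustration.

import cv2
import numpy as np

def detect_pointer(meter_bgr):
    """Return the longest detected line segment (x1, y1, x2, y2) as the meter pointer."""
    gray = cv2.cvtColor(meter_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    # Hough transform: each edge pixel votes for lines in (rho, theta) parameter space.
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=60,
                            minLineLength=40, maxLineGap=5)
    if lines is None:
        return None
    # The pointer is usually the longest straight segment inside the dial.
    return max((l[0] for l in lines),
               key=lambda l: (l[2] - l[0]) ** 2 + (l[3] - l[1]) ** 2)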
In one embodiment, the processing module is specifically configured to:
extracting image features of the image to be processed based on a first network of the image processing model, determining bounding box information of each of a plurality of first bounding boxes based on the image features, where the bounding box information includes a bounding box confidence and bounding box position information, and determining the target bounding box based on the bounding box confidence and bounding box position information of each first bounding box;

and performing text recognition, based on a second network of the image processing model, on the meter image framed by the target bounding box, so as to obtain the position information of the plurality of scale values included in the meter image.
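As a minimal sketch of how these two steps could be chained, the snippet below treats the first and second networks as callables and reuses select_target_boxes from the earlier sketch; these interfaces, and the assumption that the image is a NumPy array, are illustrative only.

def process_meter_image(image, first_network, second_network, iou_threshold=0.5):
    """Detect the meter, crop it, then run text recognition on the crop."""
    boxes, confidences = first_network(image)            # first network: detection
    target = select_target_boxes(boxes, confidences, iou_threshold)[0]
    x1, y1, x2, y2 = (int(v) for v in target)
    meter_image = image[y1:y2, x1:x2]                    # image framed by the target box
    return second_network(meter_image)                   # second network: scale values and positions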
In one embodiment, the determining module is specifically configured to:
determining, from the plurality of scale values, a first scale value nearest to the target intersection point in the counterclockwise direction of the meter pointer and a second scale value nearest to the target intersection point in the clockwise direction of the meter pointer, based on the position information of the target intersection point and the position information of the plurality of scale values, where the target intersection point is the point at which the meter pointer intersects the scale value line segment, and the scale value line segment is a line segment obtained by sequentially connecting the plurality of scale values according to a preset sequence;
determining a difference value between the second scale value and the first scale value as a first numerical value;
and determining the sum of the first scale value and a target value, wherein the sum is the target scale value pointed by the instrument pointer, the target value is the product of the first value and a second value, and the second value is determined based on the relative distance between the instrument pointer and the first scale value and the second scale value respectively.
In one embodiment, the processing module is further specifically configured to:
determining, based on the bounding box confidence of each of the plurality of first bounding boxes, the first bounding box with the highest confidence as a second bounding box, and determining the first bounding boxes other than the second bounding box as third bounding boxes;

calculating the intersection-over-union ratio of the second bounding box and each third bounding box based on their respective bounding box position information;

and determining, as target bounding boxes, the second bounding box together with each third bounding box whose intersection-over-union ratio with the second bounding box is smaller than a preset threshold.
In one embodiment, the image processing apparatus may further include a training module. The training module is specifically used for:
acquiring a training sample set, wherein the training sample set comprises a plurality of image samples to be processed and position information of a plurality of label scale values corresponding to each image sample to be processed;
extracting reference image features of an image sample to be processed based on a first network of a preset image processing model, determining reference bounding box information of each of a plurality of first reference bounding boxes based on the reference image features, where the reference bounding box information includes a reference confidence and reference position information, and determining a reference bounding box based on the reference confidence and reference position information of each of the plurality of first reference bounding boxes;

performing text recognition, based on a second network of the preset image processing model, on the instrument image sample framed by the reference bounding box, so as to obtain position information of a plurality of reference scale values included in the instrument image sample;

determining a loss function value of the preset image processing model according to the position information of the plurality of label scale values and the position information of the plurality of reference scale values of a target image sample to be processed, where the target image sample to be processed is any one of the image samples to be processed;
and training the preset image processing model by utilizing the image sample to be processed based on the loss function value of the preset image processing model to obtain a trained image processing model.
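A hedged PyTorch sketch of one such training step is given below. The smooth L1 loss and the model's output shape are assumptions; this application only specifies that a loss function value is computed from the label and reference scale-value positions.

import torch.nn.functional as F

def training_step(model, optimizer, images, label_positions):
    """images: (N, C, H, W); label_positions: (N, K, 2) label scale-value coordinates."""
    optimizer.zero_grad()
    reference_positions = model(images)          # (N, K, 2) predicted reference positions
    loss = F.smooth_l1_loss(reference_positions, label_positions)
    loss.backward()                              # back-propagate the loss function value
    optimizer.step()                             # update the preset image processing model
    return loss.item()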
In one embodiment, the acquiring module is further configured to acquire a plurality of original images including the target object;
the processing module is further configured to preprocess the plurality of original images according to a preset image preprocessing mode to obtain a plurality of image samples to be processed corresponding to each original image, where the preset image preprocessing mode includes an image enhancement operation and an image normalization operation.
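As an illustration, the preset preprocessing could be realized as below; CLAHE contrast enhancement and mean/standard-deviation normalization are assumptions, since the concrete enhancement and normalization operations are not fixed here.

import cv2
import numpy as np

def preprocess(original_bgr):
    """Apply an image enhancement operation, then an image normalization operation."""
    lab = cv2.cvtColor(original_bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = cv2.cvtColor(cv2.merge((clahe.apply(l), a, b)), cv2.COLOR_LAB2BGR)
    sample = enhanced.astype(np.float32) / 255.0             # scale to [0, 1]
    return (sample - sample.mean()) / (sample.std() + 1e-6)  # zero mean, unit variance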
In the embodiments of the present application, an image to be processed that includes the target object can be acquired, image features of the image to be processed can be extracted based on the image processing model, and a target bounding box can then be determined based on the image features, the image framed by the target bounding box being the instrument image. Position information of a plurality of scale values included in the instrument image can then be obtained by performing text recognition on the instrument image. Meanwhile, the instrument image framed by the target bounding box can be switched from image space to parameter space to obtain a switched instrument image, and the position information of the instrument pointer in the instrument image can be determined based on the switched instrument image. The target scale value pointed to by the instrument pointer can thus be determined based on the position information of the plurality of scale values and the position information of the instrument pointer, which improves both the accuracy and the efficiency of reading the instrument's scale values.
Each module in the image processing apparatus provided in the embodiment of the present application may implement the method steps in the embodiment shown in fig. 1 or fig. 2, and may achieve the technical effects corresponding to the steps, which are not described herein for brevity.
Fig. 4 shows a schematic hardware structure of an electronic device according to an embodiment of the present application.
The electronic device may include a processor 401 and a memory 402 in which computer program instructions are stored.
In particular, the processor 401 described above may include a Central Processing Unit (CPU), or an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or may be configured to implement one or more integrated circuits of embodiments of the present application.
Memory 402 may include mass storage for data or instructions. By way of example, and not limitation, memory 402 may comprise a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 402 may include removable or non-removable (or fixed) media, where appropriate. Memory 402 may be internal or external to the electronic device, where appropriate. In a particular embodiment, the memory 402 is a non-volatile solid-state memory.
The memory may include read-only memory (ROM), random access memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, or electrical, optical, or other physical/tangible memory storage devices. Thus, in general, the memory includes one or more tangible (non-transitory) computer-readable storage media (e.g., memory devices) encoded with software comprising computer-executable instructions, and when the software is executed (e.g., by one or more processors), it is operable to perform the operations described with reference to the methods in accordance with aspects of the present disclosure.
The processor 401 implements any of the image processing methods of the above embodiments by reading and executing computer program instructions stored in the memory 402.
In one example, the electronic device may also include a communication interface 403 and a bus 410. As shown in fig. 4, the processor 401, the memory 402, and the communication interface 403 are connected by a bus 410 and perform communication with each other.
The communication interface 403 is mainly used to implement communication between each module, device, unit and/or apparatus in the embodiments of the present application.
Bus 410 includes hardware, software, or both, coupling the components of the electronic device to each other. By way of example, and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local (VLB) bus, another suitable bus, or a combination of two or more of the above. Bus 410 may include one or more buses, where appropriate. Although the embodiments of the present application describe and illustrate a particular bus, the present application contemplates any suitable bus or interconnect.
In addition, in combination with the image processing method in the above embodiment, the embodiment of the application may be implemented by providing a computer storage medium. The computer storage medium has stored thereon computer program instructions; the computer program instructions, when executed by a processor, implement the image processing method provided in the embodiments of the present application.
The embodiments of the present application also provide a computer program product; when instructions in the computer program product are executed by a processor of the electronic device, they cause the electronic device to execute the image processing method provided by the embodiments of the present application.
It should be clear that the present application is not limited to the particular arrangements and processes described above and illustrated in the drawings. For the sake of brevity, a detailed description of known methods is omitted here. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present application are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications, and additions, or change the order between steps, after appreciating the spirit of the present application.
The functional blocks shown in the above block diagrams may be implemented in hardware, software, firmware, or a combination thereof. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, a plug-in, a function card, or the like. When implemented in software, the elements of the present application are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine readable medium or transmitted over transmission media or communication links by a data signal carried in a carrier wave. A "machine-readable medium" may include any medium that can store or transfer information. Examples of machine-readable media include electronic circuitry, semiconductor memory devices, ROM, flash memory, erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, radio Frequency (RF) links, and the like. The code segments may be downloaded via computer networks such as the internet, intranets, etc.
It should also be noted that the exemplary embodiments mentioned in this application describe some methods or systems based on a series of steps or devices. However, the present application is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be different from the order in the embodiments, or several steps may be performed simultaneously.
Aspects of the present disclosure are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable image processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable image processing apparatus, enable the implementation of the functions/acts specified in the flowchart and/or block diagram block or blocks. Such a processor may be, but is not limited to being, a general purpose processor, a special purpose processor, an application specific processor, or a field programmable logic circuit. It will also be understood that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware which performs the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In the foregoing, only the specific embodiments of the present application are described, and it will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, modules and units described above may refer to the corresponding processes in the foregoing method embodiments, which are not repeated herein. It should be understood that the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present application, which are intended to be included in the scope of the present application.

Claims (10)

1. An image processing method, the method comprising:
acquiring an image to be processed containing a target object, wherein the target object comprises an instrument image;
extracting image features of the image to be processed based on an image processing model, determining a target bounding box based on the image features, wherein the image selected by the target bounding box is the instrument image, and obtaining position information of a plurality of scale values included in the instrument image by performing text recognition on the instrument image;

switching the instrument image selected by the target bounding box from an image space to a parameter space to obtain a switched instrument image, and determining position information of an instrument pointer in the instrument image based on the switched instrument image;
And determining a target scale value pointed by the meter pointer based on the position information of the scale values and the position information of the meter pointer.
2. The method according to claim 1, wherein the extracting the image features of the image to be processed based on the image processing model, and determining the target bounding box based on the image features, and obtaining the position information of the plurality of scale values included in the meter image by performing text recognition on the meter image, includes:
extracting image features of the image to be processed based on a first network of an image processing model, determining bounding box information of each of a plurality of first bounding boxes based on the image features, wherein the bounding box information comprises a bounding box confidence and bounding box position information, and determining a target bounding box based on the bounding box confidence and bounding box position information of each of the plurality of first bounding boxes;

and performing text recognition on the instrument image selected by the target bounding box based on a second network of the image processing model, so as to obtain position information of a plurality of scale values included in the instrument image.
3. The method of claim 1, wherein the determining the target scale value pointed to by the meter pointer based on the position information of the plurality of scale values and the position information of the meter pointer comprises:
determining, from the plurality of scale values, a first scale value nearest to the target intersection point in the counterclockwise direction of the meter pointer and a second scale value nearest to the target intersection point in the clockwise direction of the meter pointer, based on the position information of the target intersection point and the position information of the plurality of scale values, wherein the target intersection point is a point at which the meter pointer intersects a scale value line segment, and the scale value line segment is a line segment obtained by sequentially connecting the plurality of scale values according to a preset sequence;
determining a difference value between the second scale value and the first scale value as a first numerical value;
and determining the sum of the first scale value and a target value, wherein the sum is the target scale value pointed by the instrument pointer, the target value is the product of the first value and a second value, and the second value is determined based on the relative distance between the instrument pointer and the first scale value and the second scale value respectively.
4. The method of claim 2, wherein the determining the target bounding box based on the bounding box confidence and bounding box position information for each of the plurality of first bounding boxes comprises:
determining, based on the bounding box confidence of each of the plurality of first bounding boxes, the first bounding box with the highest bounding box confidence as a second bounding box, and determining the first bounding boxes other than the second bounding box among the plurality of first bounding boxes as third bounding boxes;

calculating an intersection-over-union ratio of the second bounding box and the third bounding box based on the respective bounding box position information of the second bounding box and the third bounding box;

and determining, as target bounding boxes, the second bounding box and each third bounding box whose intersection-over-union ratio with the second bounding box is smaller than a preset intersection-over-union ratio threshold.
5. The method according to claim 1 or 2, characterized in that the training method of the image processing model comprises:
acquiring a training sample set, wherein the training sample set comprises a plurality of image samples to be processed and position information of a plurality of label scale values corresponding to each image sample to be processed;
extracting reference image features of the image sample to be processed based on a first network of a preset image processing model, determining reference bounding box information of each of a plurality of first reference bounding boxes based on the reference image features, wherein the reference bounding box information comprises a reference confidence and reference position information, and determining a reference bounding box based on the reference confidence and reference position information of each of the plurality of first reference bounding boxes;

performing text recognition on the instrument image sample selected by the reference bounding box based on a second network of the preset image processing model, to obtain position information of a plurality of reference scale values included in the instrument image sample;
determining a loss function value of a preset image processing model according to position information of a plurality of label scale values and position information of a plurality of reference scale values of a target image sample to be processed, wherein the target image sample to be processed is any one of the image samples to be processed;
and training the preset image processing model by using the image sample to be processed based on the loss function value of the preset image processing model to obtain a trained image processing model.
6. The method of claim 5, wherein prior to the acquiring a training sample set, the method further comprises:
acquiring a plurality of original images containing target objects;
and respectively preprocessing a plurality of original images according to a preset image preprocessing mode to obtain a plurality of image samples to be processed corresponding to each original image, wherein the preset image preprocessing mode comprises an image enhancement operation and an image normalization operation.
7. An image processing apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring an image to be processed containing a target object, wherein the target object comprises an instrument image;
the processing module is used for extracting image characteristics of the image to be processed based on an image processing model, determining a target boundary box based on the image characteristics, wherein an image selected by the target boundary box is the instrument image, and obtaining position information of a plurality of scale values included in the instrument image by carrying out text recognition on the instrument image;
the switching module is used for switching the instrument image selected by the target boundary frame from an image space to a parameter space to obtain a switched instrument image, and determining the position information of an instrument pointer in the instrument image based on the switched instrument image;
and the determining module is used for determining the target scale value pointed by the meter pointer based on the position information of the scale values and the position information of the meter pointer.
8. An electronic device, the device comprising: a processor and a memory storing computer program instructions;
the processor reads and executes the computer program instructions to implement the image processing method according to any of claims 1-6.
9. A computer storage medium having stored thereon computer program instructions which, when executed by a processor, implement the image processing method according to any of claims 1-6.
10. A computer program product, characterized in that instructions in the computer program product, when executed by a processor of an electronic device, cause the electronic device to perform the image processing method according to any of claims 1-6.
CN202310303092.XA 2023-03-23 2023-03-23 Image processing method, device, equipment, medium and product Pending CN116363688A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310303092.XA CN116363688A (en) 2023-03-23 2023-03-23 Image processing method, device, equipment, medium and product


Publications (1)

Publication Number Publication Date
CN116363688A true CN116363688A (en) 2023-06-30

Family

ID=86914040

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310303092.XA Pending CN116363688A (en) 2023-03-23 2023-03-23 Image processing method, device, equipment, medium and product

Country Status (1)

Country Link
CN (1) CN116363688A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170322951A1 (en) * 2010-03-29 2017-11-09 Ebay Inc. Finding products that are similar to a product selected from a plurality of products
US20180233028A1 (en) * 2008-08-19 2018-08-16 Digimarc Corporation Methods and systems for content processing
CN112115893A (en) * 2020-09-24 2020-12-22 深圳市赛为智能股份有限公司 Instrument panel pointer reading identification method and device, computer equipment and storage medium
CN114913233A (en) * 2022-06-10 2022-08-16 嘉洋智慧安全生产科技发展(北京)有限公司 Image processing method, apparatus, device, medium, and product
CN115205858A (en) * 2022-07-07 2022-10-18 南昌航空大学 Pointer type instrument automatic reading method based on rotating target detection
CN115546795A (en) * 2022-09-20 2022-12-30 华南理工大学 Automatic reading method of circular pointer instrument based on deep learning


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Wan Jilin; Wang Huifang; Guan Minyuan; Shen Jianliang; Wu Guoqiang; Gao Ao; Yang Bin: "Automatic recognition method for pointer instrument readings in substations based on Faster R-CNN and U-Net", Power System Technology, no. 08 *
Xu Fabing; Wu Huaiyu; Chen Zhihuan; Yu Han: "Research on detection and recognition of pointer instruments based on deep learning", High Technology Letters, no. 12, pages 1-3 *
Chen Shu; Wang Lei: "Detection method of pointer instruments based on machine vision", Computer and Digital Engineering, no. 09 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination