CN111680689A - Target detection method, system and storage medium based on deep learning - Google Patents

Target detection method, system and storage medium based on deep learning

Info

Publication number
CN111680689A
Authority
CN
China
Prior art keywords
target
rectangular roi
threshold value
characteristic image
positive
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010798762.6A
Other languages
Chinese (zh)
Other versions
CN111680689B (en)
Inventor
马卫飞
袁飞杨
张胜森
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Jingce Electronic Group Co Ltd
Wuhan Jingli Electronic Technology Co Ltd
Wuhan Jingce Electronic Technology Co Ltd
Original Assignee
Wuhan Jingce Electronic Group Co Ltd
Wuhan Jingli Electronic Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Jingce Electronic Group Co Ltd, Wuhan Jingli Electronic Technology Co Ltd filed Critical Wuhan Jingce Electronic Group Co Ltd
Priority to CN202010798762.6A priority Critical patent/CN111680689B/en
Publication of CN111680689A publication Critical patent/CN111680689A/en
Application granted granted Critical
Publication of CN111680689B publication Critical patent/CN111680689B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/84Systems specially adapted for particular applications
    • G01N21/88Investigating the presence of flaws or contamination
    • G01N21/8851Scan or image signal processing specially adapted therefor, e.g. for scan signal adjustment, for detecting different kinds of defects, for compensating for structures, markings, edges
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/84Systems specially adapted for particular applications
    • G01N21/88Investigating the presence of flaws or contamination
    • G01N21/8851Scan or image signal processing specially adapted therefor, e.g. for scan signal adjustment, for detecting different kinds of defects, for compensating for structures, markings, edges
    • G01N2021/8887Scan or image signal processing specially adapted therefor, e.g. for scan signal adjustment, for detecting different kinds of defects, for compensating for structures, markings, edges based on image processing techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30108Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Multimedia (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a target detection method, system and storage medium based on deep learning. Training in the method comprises the following steps: extracting a feature map of a training sample; extracting an ROI (region of interest) from the feature map to obtain a feature image of a positive (axis-aligned) rectangular ROI area; performing coordinate transformation on the positive rectangular ROI area to obtain a feature image of the corresponding oblique rectangular ROI area; calculating the IOU value between the feature image of the oblique rectangular ROI area and the real marking frame, and comparing the IOU value with a threshold to determine positive and negative samples, the threshold being dynamically adjusted; converting the feature image of the oblique rectangular ROI area of each positive sample into a feature image of the corresponding positive rectangular ROI area; and outputting a detection result according to the converted feature image of the positive rectangular ROI area. The method can improve target detection accuracy, is suitable for detecting inclined targets, and is particularly suitable for the field of panel defect detection.

Description

Target detection method, system and storage medium based on deep learning
Technical Field
The invention belongs to the technical field of target detection, and particularly relates to a target detection method and system based on deep learning and a storage medium.
Background
In AOI (automated optical inspection) defect detection, how well a detected bounding box fits the target defect is important. The more closely the detection result fits the target defect, the easier manual re-judgment of the defect becomes and the higher the detection precision. In addition, for two-stage deep learning detection frameworks, the more closely the bounding box fits the target, the more the feature extraction focuses on the defect itself.
As shown in FIG. 1, a prior-art deep learning detection method is described taking Faster R-CNN, a representative two-stage detection network currently used for defect detection, as an example. The Faster R-CNN detection network is mainly divided into three parts: a backbone network for feature extraction, a region proposal network (RPN) and a regression classification network. The RPN crops proposal regions of the form (x, y, w, h) from the feature map produced by the feature extraction layers; each region is then sent to the final regression classification layer, which refines it into the final target coordinates and category.
In this detection method, the output bounding box coordinates are also of the form (x, y, w, h), and the bounding box is a positive (axis-aligned) rectangle. When the method is used to detect general defects, the output bounding box can fit the target defect. However, when detecting long or oblique defects, such as scratches and stains, the output bounding box does not fit the target defect well. In addition, the coordinate frame obtained in the RPN is also a positive rectangle, so the proposal region fed into the final regression classification network carries a large amount of background features in addition to the defect, which is very unfavorable for regression and classification.
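The background problem described above can be quantified with a small sketch. Assuming a defect is well approximated by a w x h rectangle rotated by some angle, the function below (a hypothetical helper, not part of the patent) computes what fraction of the enclosing positive rectangle is actually covered by the defect; the remainder is background.

```python
import math


def aabb_fill_ratio(w: float, h: float, theta_deg: float) -> float:
    """Fraction of the axis-aligned bounding box that is covered by a
    w x h rectangle rotated by theta_deg degrees about its center."""
    t = math.radians(theta_deg)
    c, s = abs(math.cos(t)), abs(math.sin(t))
    # Width and height of the smallest upright box enclosing the rotated rectangle.
    box_w = w * c + h * s
    box_h = w * s + h * c
    return (w * h) / (box_w * box_h)


def report(shapes, angles):
    """Return (w, h, angle, fill ratio) rows for a few illustrative shapes."""
    return [(w, h, a, aabb_fill_ratio(w, h, a)) for w, h in shapes for a in angles]


rows = report([(10, 10), (100, 5)], [0, 15, 45])
```

For an elongated shape such as 100 x 5 at 45 degrees, less than a tenth of the upright box belongs to the defect, which illustrates why an oblique box is preferable for scratch-like targets.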
Disclosure of Invention
Aiming at least one defect or improvement requirement in the prior art, the invention provides a target detection method, a system and a storage medium based on deep learning, which can improve the target detection accuracy rate, are suitable for the detection of inclined targets, and are particularly suitable for the field of panel defect detection.
In order to achieve the above object, according to a first aspect of the present invention, there is provided a target detection method based on deep learning, including a step of inputting training samples into a convolutional neural network for training, and a step of performing inclined target detection using the trained convolutional neural network, where the training includes:
performing feature extraction on a training sample to obtain a feature mapping chart of the training sample, wherein the training sample is marked with a target marking frame in advance;
extracting an ROI (region of interest) region of the feature mapping graph to obtain a feature image of a positive rectangular ROI region;
carrying out coordinate transformation on the regular rectangular ROI area to obtain a characteristic image of a corresponding oblique rectangular ROI area;
calculating the IOU value between the feature image of the oblique rectangular ROI area and the target marking frame, and comparing the IOU value with a threshold value T to determine positive and negative samples, wherein a preset threshold value is predefined for the threshold value T, and the threshold value T is dynamically adjusted so that it decreases as the length-width difference of the target marking frame increases and increases as the length-width difference of the target marking frame decreases;
converting the characteristic image of the oblique rectangular ROI area of the positive sample into a characteristic image of a corresponding positive rectangular ROI area;
and outputting a detection result according to the converted characteristic image of the positive rectangular ROI area.
Preferably, the threshold T satisfies
[Formula image: T as a function of T0, w and h]
where T0 is the preset threshold value, w is the length of the target marking frame, and h is the width of the target marking frame.
Preferably, the preset threshold value T0 is 0.5.
Preferably, a uniform marking mode is predefined, and the training samples are marked according to the marking mode.
Preferably, the marking mode is as follows: a vertex in a preset position is selected as the initial point, and the remaining three vertices of the target marking frame are marked in sequence in a preset direction.
Preferably, the target detection method based on deep learning is applied to the field of panel defect detection, and the detection target is a tilt defect.
According to a second aspect of the present invention, there is provided a deep learning-based target detection system, including a training module for inputting training samples into a convolutional neural network for training and a detection module for performing inclined target detection by using the trained convolutional neural network, the training module including:
the characteristic extraction module is used for extracting characteristics of a training sample to obtain a characteristic mapping chart of the training sample, and the training sample is marked with a target marking frame in advance;
the ROI area extraction module is used for extracting an ROI area of the feature mapping graph to obtain a feature image of a positive rectangular ROI area;
the region transformation module is used for carrying out coordinate transformation on the regular rectangular ROI region to obtain a characteristic image of a corresponding oblique rectangular ROI region; the system is also used for calculating the IOU value of the characteristic image of the oblique rectangular ROI area and the target marking frame, comparing the IOU value with a threshold value T to determine a positive sample and a negative sample, wherein the threshold value T is predefined to have a preset threshold value, and dynamically adjusting the threshold value T so that the threshold value T is reduced along with the increase of the length-width difference of the target marking frame and is increased along with the decrease of the length-width difference of the target marking frame; the characteristic image of the inclined rectangular ROI area of the positive sample is converted into a characteristic image of a corresponding positive rectangular ROI area;
and the classification module is used for receiving the converted characteristic image of the positive rectangular ROI and outputting a detection result.
According to a third aspect of the present invention, there is provided a convolutional neural network data processing system for target detection, comprising:
the backbone network is used for extracting features of a training sample to obtain a feature mapping chart of the training sample, and the training sample is marked with a target marking frame in advance;
the region selection network is used for extracting the ROI region of the feature mapping graph to obtain a feature image of the regular rectangular ROI region;
the region transformation network is used for carrying out coordinate transformation on the regular rectangular ROI region to obtain a characteristic image of a corresponding oblique rectangular ROI region; the matching module is further used for calculating the IOU value of the characteristic image of the oblique rectangular ROI area and the oblique rectangular target marking frame, comparing the IOU value with a threshold value T to determine a positive sample and a negative sample, wherein the threshold value T is predefined to have a preset threshold value, and dynamically adjusting the threshold value T so that the threshold value T is reduced along with the increase of the length-width difference of the target marking frame and is increased along with the reduction of the length-width difference of the target marking frame; the characteristic image of the inclined rectangular ROI area of the positive sample is converted into a characteristic image of a corresponding positive rectangular ROI area;
and the regression classification network is used for receiving the converted characteristic image of the positive rectangular ROI and outputting a detection result.
According to a fourth aspect of the invention, there is provided a computer-readable storage medium having a computer program stored thereon, characterized in that the computer program realizes any of the above methods when executed by a processor.
In general, compared with the prior art, the invention has the following beneficial effects:
(1) the method for dynamically and adaptively adjusting the threshold is applied to target detection, can improve the target detection accuracy, is suitable for detection of an inclined target, is particularly suitable for the field of panel defect detection, and enables a detection result to be more fit with the defect shape.
(2) Aiming at the detection algorithm of the tilt defect, a marking method for fitting any quadrangle is provided, and the marking method is more beneficial to the training process of the tilt defect detection algorithm.
Drawings
FIG. 1 is a schematic diagram of a prior art convolutional neural network;
FIGS. 2 and 3 are schematic diagrams illustrating the alignment of the mark frame with the oblique rectangular ROI area according to the embodiment of the present invention;
FIG. 4 is a schematic diagram of a marking method according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The target detection method based on deep learning comprises the steps of inputting training samples into a convolutional neural network for training and detecting inclined targets by using the trained convolutional neural network, wherein the training comprises the steps S1 to S6.
S1: Perform feature extraction on the training sample to obtain a feature map of the training sample, wherein the training sample is marked with a target marking frame in advance.
The convolutional neural network adopted by the embodiment of the invention comprises a backbone network, a region selection network (RPN), a region transformation network (Roi Transformer) and a regression classification network. The Roi Transformer includes an RRoI Learner module and a RoI Align module; its implementation and principles are described in detail later.
The training sample is sent to the backbone network to obtain the required feature map.
S2: Extract the ROI area of the feature map obtained in the previous step to obtain a feature image of the positive rectangular ROI area.
The feature map obtained in step S1 is sent to the RPN to obtain the coordinates of the unrotated positive rectangular ROI area, and the feature image corresponding to the positive rectangular ROI area is cropped from the feature map output in step S1 according to those coordinates.
S3: Perform coordinate transformation on the positive rectangular ROI area from the previous step to obtain a feature image of the corresponding oblique rectangular ROI area.
The coordinates of the positive rectangular ROI obtained in step S2 are sent to the RRoI Learner module of the Roi Transformer to obtain the coordinates of an oblique rectangular ROI area with an inclination angle; the corresponding feature image is cropped from the feature map output in step S1 according to these coordinates.
S4: Calculate the IOU value between the feature image of the oblique rectangular ROI area from the previous step and the target marking frame of step S1, and compare the calculated IOU value with a threshold to determine positive and negative samples; the threshold can be dynamically adjusted.
The IOU (intersection over union) value measures the degree of overlap between the two regions being compared.
The feature image of the oblique rectangular ROI area is matched against the pre-marked target marking frame, and the IOU value between the oblique rectangular ROI and the manually marked target marking frame is calculated. If the calculated IOU value is greater than the threshold, the ROI is a positive sample; otherwise it is a negative sample.
The threshold used in this comparison can be dynamically adjusted, unlike the fixed threshold commonly used in the prior art. The implementation and principles of the dynamic threshold are described in detail below.
S5: Convert the feature image of the oblique rectangular ROI area of each positive sample into the feature image of the corresponding positive rectangular ROI area.
Because the regression classification network is suited to processing positive rectangular feature images, the feature images of the oblique rectangular ROI areas of the retained positive samples are input to the RoI Align module of the Roi Transformer and converted into feature images of the corresponding positive rectangular ROI areas.
S6: Input the feature image of the positive rectangular ROI converted in the previous step into the regression classification network to output a detection result.
The convolutional neural network may be embodied in the following manner.
The convolutional neural network of the embodiment of the invention is mainly different from the traditional two-stage detection network in that a Roi Transformer module is added between the RPN and the regression classification layer, thereby solving the problem of the conversion process from the regular rectangular coordinate to the oblique rectangular coordinate.
The Roi Transformer mainly consists of two parts, the RRoI Learner and RoI Align. After the RPN and RoI Align, a horizontal ROI region (HRoI, horizontal region of interest) of the form (x, y, w, h) is obtained. In step S3, the HRoI is fed into a fully connected layer with an output dimension of 5, and the regression target is the offset, relative to the HRoI, of the rotated ground truth (RGT), i.e. the rotated image region covered by the marking frame. The five offset components are defined by the following formula:
[Formula images: the five offset components and their defining formula]
In the formula, one vector stacks the coordinates, length and width of the oblique rectangular ROI after transformation, and the other denotes the true coordinates of the marking frame. This completes the work of the RRoI Learner. The obtained 5-dimensional coordinates are mapped onto the feature map of step S1, and a feature image of the oblique rectangular ROI is extracted.
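The offset regression of the RRoI Learner can be illustrated with a minimal sketch. The patent's own formula is not available in the text, so the encoding below follows the formulation commonly published for RoI Transformer style detectors (center offsets expressed in the ROI's own frame, log-scale size ratios, and a normalized angle difference); treating it as the patent's formula is an assumption, and the function names are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi


def encode_rroi_offsets(roi, gt):
    """Encode a ground-truth rotated box relative to an ROI.

    roi, gt: (x, y, w, h, theta) with (x, y) the box center and theta in radians.
    Returns (tx, ty, tw, th, ttheta). ASSUMED formulation, see the lead-in.
    """
    xr, yr, wr, hr, tr = roi
    xg, yg, wg, hg, tg = gt
    dx, dy = xg - xr, yg - yr
    cos_r, sin_r = math.cos(tr), math.sin(tr)
    # Center shift expressed in the ROI's rotated frame, normalized by ROI size.
    tx = (dx * cos_r + dy * sin_r) / wr
    ty = (dy * cos_r - dx * sin_r) / hr
    # Size change as log ratios, angle change as a fraction of a full turn.
    tw = math.log(wg / wr)
    th = math.log(hg / hr)
    ttheta = ((tg - tr) % TWO_PI) / TWO_PI
    return tx, ty, tw, th, ttheta


def decode_rroi_offsets(roi, offsets):
    """Invert encode_rroi_offsets: recover the rotated box from the offsets."""
    xr, yr, wr, hr, tr = roi
    tx, ty, tw, th, ttheta = offsets
    cos_r, sin_r = math.cos(tr), math.sin(tr)
    dx = tx * wr * cos_r - ty * hr * sin_r
    dy = tx * wr * sin_r + ty * hr * cos_r
    theta = (tr + ttheta * TWO_PI) % TWO_PI
    return xr + dx, yr + dy, wr * math.exp(tw), hr * math.exp(th), theta


# A horizontal ROI (theta = 0) and an illustrative rotated marking frame.
hroi = (50.0, 40.0, 20.0, 10.0, 0.0)
rgt = (54.0, 43.0, 30.0, 6.0, math.pi / 6)
offsets = encode_rroi_offsets(hroi, rgt)
recovered = decode_rroi_offsets(hroi, offsets)
```

For a horizontal ROI the rotation terms vanish, so the first two offsets reduce to the familiar center shifts divided by the ROI width and height.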
In step S5, to extract rotation-invariant features in the network, a rotated position-sensitive RoI Align module is used, which converts the rotated rectangle into a positive rectangle for use by the final regression classification layer. Each pixel in the oblique rectangular ROI can be calculated by the following transformation formula:
[Formula image: coordinate transformation from the oblique rectangle to the positive rectangle]
In the formula, x′ and y′ are the coordinates of each vertex of the transformed positive rectangle, and x and y are the coordinates of each vertex of the marking frame.
The transformed positive rectangular ROI is sent to the final regression classification layer to obtain the final rotated frame coordinates.
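The conversion from an oblique ROI to a positive rectangular feature image can be sketched as follows. This is a simplified illustration under stated assumptions, not the patent's module: it samples one point per output bin with bilinear interpolation, omits the position-sensitive channel grouping, and uses hypothetical function names and a plain 2D list as the feature map.

```python
import math


def rotated_roi_sample_points(cx, cy, w, h, theta, out_h, out_w):
    """Feature-map (x, y) location of each output bin center of an oblique ROI."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    grid = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Bin center in the ROI's own upright frame, origin at the ROI center.
            lx = ((j + 0.5) / out_w - 0.5) * w
            ly = ((i + 0.5) / out_h - 0.5) * h
            # Rotate by theta and translate to the ROI center on the feature map.
            row.append((cx + lx * cos_t - ly * sin_t, cy + lx * sin_t + ly * cos_t))
        grid.append(row)
    return grid


def bilinear(feature, x, y):
    """Bilinear sample of feature[row][col] at (x, y), clamped to the map bounds."""
    rows, cols = len(feature), len(feature[0])
    x = min(max(x, 0.0), cols - 1.0)
    y = min(max(y, 0.0), rows - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    ax, ay = x - x0, y - y0
    top = feature[y0][x0] * (1 - ax) + feature[y0][x1] * ax
    bottom = feature[y1][x0] * (1 - ax) + feature[y1][x1] * ax
    return top * (1 - ay) + bottom * ay


def rotated_roi_align(feature, roi, out_h, out_w):
    """Resample an oblique ROI (cx, cy, w, h, theta) into an upright out_h x out_w grid."""
    cx, cy, w, h, theta = roi
    points = rotated_roi_sample_points(cx, cy, w, h, theta, out_h, out_w)
    return [[bilinear(feature, x, y) for (x, y) in row] for row in points]


# Toy feature map whose value equals the column index.
feature_map = [[float(c) for c in range(9)] for _ in range(9)]
upright = rotated_roi_align(feature_map, (4.0, 4.0, 4.0, 2.0, 0.0), 1, 2)
turned = rotated_roi_align(feature_map, (4.0, 4.0, 4.0, 2.0, math.pi / 2), 1, 2)
```

Whatever the angle of the ROI, the output has the same upright shape, which is what allows an ordinary regression classification head to consume it.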
However, applying the Roi Transformer module directly to target defect detection has two problems:
(1) in the prior art, a fixed IOU threshold is generally used, and a uniform threshold, for example, 0.5, is set, which may cause a problem of too few or too many positive samples matching, thereby affecting the accuracy of detection. In some application scenarios, such as a panel defect detection scenario, a large number of samples with large length-width ratios exist, and the fixed IOU threshold is used to screen positive samples, which may cause the problems of insufficient number of positive samples and low model recall rate. Therefore, the embodiment of the invention adopts a dynamic threshold method to screen positive samples.
(2) At present, a unified rule is not formed in the labeling method of the Roi Transformer, so that a labeled sample cannot train a network well, and a corresponding labeling method needs to be designed according to the characteristics of panel defect detection. Therefore, in the embodiment of the present invention, a uniform labeling manner may be predefined, and the training samples are labeled according to the uniform labeling rule.
The adaptive dynamic threshold may be implemented in the following manner.
In target detection, matching the real mark box with the ROI area is a necessary process, and determines the allocation of positive and negative samples. The matching criterion is the IOU value, when the IOU is larger than a certain threshold value, the matching is successful, and the matching is considered as a positive sample, otherwise, the matching is considered as a negative sample. When matching is performed, the aspect ratio of the mark frame has a great influence on the matching.
As shown in fig. 2, GT represents the real target mark box, and the IOU is large when the difference between the length and width of the mark box and the oblique rectangular ROI area is small. As shown in fig. 3, the IOU is smaller when the angle is the same as that of fig. 2, but the difference in length and width is larger. If the fixed threshold method in the prior art is adopted, the characteristic image of the oblique rectangular ROI in FIG. 3 does not satisfy the matching condition.
Therefore, the embodiment of the invention adopts a method for self-adaptively adjusting the IOU threshold value.
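The effect illustrated by FIG. 2 and FIG. 3 can be reproduced numerically. The sketch below computes the IOU of two rotated rectangles with Sutherland-Hodgman polygon clipping, a standard technique chosen here for illustration and not named in the patent; all function names and the example sizes are hypothetical.

```python
import math


def rect_corners(cx, cy, w, h, theta):
    """Corners of a rotated rectangle, in counter-clockwise order (math convention)."""
    c, s = math.cos(theta), math.sin(theta)
    local = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in local]


def polygon_area(poly):
    """Unsigned polygon area by the shoelace formula."""
    acc = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        acc += x1 * y2 - x2 * y1
    return abs(acc) / 2.0


def clip_convex(subject, clipper):
    """Sutherland-Hodgman clipping of a convex polygon by a convex CCW polygon."""
    output = list(subject)
    for i in range(len(clipper)):
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        current, output = output, []
        if not current:
            break

        def side(p):
            # Positive when p lies to the left of the directed edge a -> b (inside).
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

        for j in range(len(current)):
            p, q = current[j], current[(j + 1) % len(current)]
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if (sp > 0 and sq < 0) or (sp < 0 and sq > 0):
                t = sp / (sp - sq)  # edge p -> q crosses the clip line here
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output


def rotated_iou(r1, r2):
    """IOU of two rotated rectangles given as (cx, cy, w, h, theta)."""
    overlap = clip_convex(rect_corners(*r1), rect_corners(*r2))
    inter = polygon_area(overlap) if len(overlap) >= 3 else 0.0
    return inter / (r1[2] * r1[3] + r2[2] * r2[3] - inter)


# Same 10 degree angular error, different aspect ratios.
angle = math.radians(10)
iou_square = rotated_iou((0, 0, 10, 10, 0.0), (0, 0, 10, 10, angle))
iou_long = rotated_iou((0, 0, 100, 5, 0.0), (0, 0, 100, 5, angle))
```

With an identical angular error, the near-square pair stays above a fixed threshold of 0.5 while the elongated pair falls well below it, which is the situation a fixed threshold handles poorly.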
Preferably, a preset threshold is predefined, and the IOU threshold is dynamically adjusted based on the preset threshold, so that the threshold decreases as the length-width difference of the mark frames increases and increases as the length-width difference of the mark frames decreases.
Preferably, with the threshold denoted as T, the threshold T satisfies
[Formula image: T as a function of T0, w and h]
where T0 is the preset threshold, w is the length of the marking frame and h is the width of the marking frame. When w = h, the threshold equals the preset threshold T0. As the ratio of w to h grows, the threshold T becomes smaller, because the overlap between the real marking frame GT and the ROI shrinks as the length-to-width ratio increases; as the ratio of w to h becomes smaller, the threshold tends to 0.5.
Preferably, T0 is set to 0.5, in which case the threshold T satisfies the formula:
[Formula image: T for T0 = 0.5]
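The behavior of the adaptive threshold can be sketched in code. The patent's exact formula is not recoverable from the text, so the function below is a hypothetical stand-in chosen only to satisfy the stated properties (T equals T0 when w = h, and T decreases as the length-to-width ratio grows); it should not be read as the patented formula. Using the larger side over the smaller side as the ratio is also an assumption.

```python
import math


def dynamic_iou_threshold(w: float, h: float, t0: float = 0.5) -> float:
    """HYPOTHETICAL stand-in for the adaptive IOU threshold.

    Equals t0 for a square marking frame and decreases monotonically as the
    frame becomes more elongated. The logarithmic form is an illustrative
    choice, not the formula from the patent.
    """
    if w <= 0 or h <= 0:
        raise ValueError("w and h must be positive")
    ratio = max(w, h) / min(w, h)  # always >= 1
    return t0 / (1.0 + math.log(ratio))


def assign_label(iou: float, w: float, h: float, t0: float = 0.5) -> str:
    """Label an ROI as positive or negative against the adaptive threshold."""
    return "positive" if iou > dynamic_iou_threshold(w, h, t0) else "negative"


thresholds = {
    (10, 10): dynamic_iou_threshold(10, 10),
    (20, 10): dynamic_iou_threshold(20, 10),
    (100, 5): dynamic_iou_threshold(100, 5),
}
```

Under this stand-in, an ROI with an IOU of 0.17 against a 100 x 5 marking frame is kept as a positive sample, whereas the same IOU against a square frame is rejected.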
The marking method of the target marking frame can be implemented as follows.
To simplify the labeling process, the real coordinate frame is labeled in the form of four coordinate points
[Formula image: four-point label form]
which can be automatically converted during training into the form
[Formula image: converted label form]
To make the learning of the coordinate frame angle consistent, the 4 points are marked in a uniform manner. Preferably, a vertex in a preset position is selected as the initial point, and the remaining three vertices of the target marking frame are marked in sequence in a preset direction. As shown in FIG. 4, the top-left corner of the target defect is selected as the initial point, and the vertices are marked 1, 2, 3 and 4 in the clockwise direction. The coordinate frame should, as far as possible, be a rotated rectangle rather than an arbitrary quadrilateral, so that it can be converted more conveniently into a coordinate form with an angle.
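The conversion from four ordered vertices to a coordinate form with an angle can be sketched as follows. The target form (cx, cy, w, h, theta) and the convention that theta is the direction of the edge from point 1 to point 2 are assumptions made for illustration, since the converted form appears only as an image in the source; the function name is hypothetical.

```python
import math


def quad_to_rotated_box(points):
    """Convert four ordered vertices of a rotated rectangle to (cx, cy, w, h, theta).

    points: [(x1, y1), (x2, y2), (x3, y3), (x4, y4)], starting at the initial
    vertex and proceeding around the frame in the agreed direction.
    theta is the angle of edge 1 -> 2 in radians (ASSUMED convention).
    """
    if len(points) != 4:
        raise ValueError("exactly four vertices are required")
    (x1, y1), (x2, y2), (x3, y3), _ = points
    # The center of a rectangle is the mean of its vertices.
    cx = sum(p[0] for p in points) / 4.0
    cy = sum(p[1] for p in points) / 4.0
    w = math.hypot(x2 - x1, y2 - y1)  # length of edge 1 -> 2
    h = math.hypot(x3 - x2, y3 - y2)  # length of edge 2 -> 3
    theta = math.atan2(y2 - y1, x2 - x1)
    return cx, cy, w, h, theta


# Upright frame, marked clockwise from the top-left corner in image coordinates.
upright_box = quad_to_rotated_box([(0, 0), (4, 0), (4, 2), (0, 2)])

# The same frame rotated by 30 degrees about its center.
c, s = math.cos(math.pi / 6), math.sin(math.pi / 6)
local = [(-2, -1), (2, -1), (2, 1), (-2, 1)]
rotated_box = quad_to_rotated_box([(x * c - y * s, x * s + y * c) for x, y in local])
```

Because the width and angle are both read from the edge between points 1 and 2, a consistent starting vertex and marking direction across all samples is what keeps the learned angle unambiguous.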
The target detection system based on deep learning comprises a training module for inputting training samples into a convolutional neural network for training and a detection module for performing inclined target detection using the trained convolutional neural network, wherein the training module comprises a feature extraction module, an ROI area extraction module, a region transformation module and a classification module:
the characteristic extraction module is used for extracting characteristics of the training samples to obtain a characteristic mapping chart of the training samples, and the training samples are marked with target marking frames in advance;
the ROI area extraction module is used for extracting an ROI area of the feature mapping graph to obtain a feature image of the positive rectangular ROI area;
the region transformation module is used for carrying out coordinate transformation on the positive rectangular ROI region to obtain a characteristic image of the corresponding oblique rectangular ROI region; the system is also used for calculating the IOU value of the characteristic image of the oblique rectangular ROI area and the target marking frame, comparing the IOU value with a threshold value to determine a positive sample and a negative sample, wherein the threshold value is predefined with a preset threshold value, and dynamically adjusting the threshold value to ensure that the threshold value is reduced along with the increase of the length-width difference of the target marking frame and is increased along with the reduction of the length-width difference of the target marking frame; the characteristic image of the inclined rectangular ROI area of the positive sample is converted into a characteristic image of a corresponding positive rectangular ROI area;
and the classification module is used for receiving the converted characteristic image of the positive rectangular ROI and outputting a detection result.
The implementation principle and technical effect of the target detection system based on deep learning are similar to those of the target detection method above and are not repeated here.
The target detection method and system based on deep learning of the embodiment of the invention can be applied to the field of panel defect detection, in which the detected target is an inclined defect. They can also be applied to the detection of other target objects, such as pedestrian detection and defect detection in other scenes.
The embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the technical solution of any one of the above embodiments of the target detection method. The implementation principle and technical effect are similar to those of the above method and are not repeated here.
It should be noted that, in any of the above embodiments, the steps of the method need not be executed in the order of their sequence numbers; unless the execution logic requires a particular order, the steps may be executed in any other feasible order.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention and is not intended to limit the invention; any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within its scope of protection.

Claims (10)

1. A target detection method based on deep learning, characterized by comprising a step of inputting training samples into a convolutional neural network for training and a step of performing inclined target detection using the trained convolutional neural network, wherein the training comprises the following steps:
performing feature extraction on a training sample to obtain a feature map of the training sample, the training sample being marked with a target marking frame in advance;
extracting an ROI (region of interest) region from the feature map to obtain a feature image of a positive rectangular ROI region;
performing coordinate transformation on the positive rectangular ROI region to obtain a feature image of the corresponding oblique rectangular ROI region;
calculating the IOU value between the feature image of the oblique rectangular ROI region and the target marking frame, and comparing the IOU value with a threshold T to determine positive samples and negative samples, wherein the threshold T has a predefined preset value and is dynamically adjusted so that T decreases as the length-width difference of the target marking frame increases and increases as the length-width difference decreases;
converting the feature image of the oblique rectangular ROI region of each positive sample into a feature image of the corresponding positive rectangular ROI region;
and outputting a detection result according to the converted feature image of the positive rectangular ROI region.
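For illustration only, the IOU value between an oblique rectangular ROI region and an oblique target marking frame can be computed as the area of the intersection of two convex quadrilaterals divided by the area of their union. The sketch below is one possible way to do this, using Sutherland-Hodgman polygon clipping and the shoelace formula; the claims do not prescribe a particular IOU algorithm, and all function names here are hypothetical.

```python
def signed_area(pts):
    """Shoelace formula; positive when the vertices have positive orientation."""
    return 0.5 * sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]))


def ensure_positive_orientation(pts):
    """Return the vertices as a list, reversed if their signed area is negative."""
    pts = list(pts)
    return pts if signed_area(pts) >= 0 else pts[::-1]


def clip_convex(subject, clipper):
    """Sutherland-Hodgman: clip polygon `subject` against convex `clipper`.
    `clipper` must have positive orientation (see ensure_positive_orientation)."""
    def inside(p, a, b):
        # True when p lies on or to the left of the directed edge a -> b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersect(p, q, a, b):
        # Intersection of line p-q with line a-b (called only when they cross).
        den = (p[0] - q[0]) * (a[1] - b[1]) - (p[1] - q[1]) * (a[0] - b[0])
        t = ((p[0] - a[0]) * (a[1] - b[1]) - (p[1] - a[1]) * (a[0] - b[0])) / den
        return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

    out = list(subject)
    for i in range(len(clipper)):
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        inp, out = out, []
        if not inp:
            break  # nothing left to clip: the polygons do not overlap
        prev = inp[-1]
        for cur in inp:
            if inside(cur, a, b):
                if not inside(prev, a, b):
                    out.append(intersect(prev, cur, a, b))
                out.append(cur)
            elif inside(prev, a, b):
                out.append(intersect(prev, cur, a, b))
            prev = cur
    return out


def rotated_iou(quad_a, quad_b):
    """IOU of two convex quadrilaterals given as lists of (x, y) vertices."""
    a = ensure_positive_orientation(quad_a)
    b = ensure_positive_orientation(quad_b)
    inter_pts = clip_convex(a, b)
    inter = abs(signed_area(inter_pts)) if len(inter_pts) >= 3 else 0.0
    union = abs(signed_area(a)) + abs(signed_area(b)) - inter
    return inter / union if union > 0 else 0.0
```

For example, `rotated_iou([(0, 0), (2, 0), (2, 2), (0, 2)], [(1, 0), (3, 0), (3, 2), (1, 2)])` evaluates to one third, since the two squares share an area of 2 out of a union of 6.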
2. The deep learning-based target detection method of claim 1, wherein the threshold T satisfies
[formula given as an image in the original publication; not reproduced in the text]
where T0 is the preset threshold, w is the length of the target marking frame, and h is the width of the target marking frame.
3. The deep learning-based target detection method of claim 2, wherein the preset threshold T0 is 0.5.
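As a hedged illustration of how a dynamically adjusted threshold could be used to split candidates into positive and negative samples: the formula of claim 2 appears only as an image and is not available in this text, so the function `stand_in_threshold` below is NOT that formula. It is a hypothetical stand-in chosen solely because it equals T0 when w = h and decreases as the relative length-width difference grows. Treating IOU >= T as a positive sample is likewise an assumption.

```python
def stand_in_threshold(t0, w, h):
    """Hypothetical stand-in for the threshold of claim 2 (the real formula is
    not reproduced in the text). Equals t0 for a square frame and decreases
    as the relative length-width difference |w - h| / max(w, h) increases."""
    if w <= 0 or h <= 0:
        raise ValueError("w and h must be positive")
    return t0 / (1.0 + abs(w - h) / max(w, h))


def label_samples(ious, w, h, t0=0.5, threshold_fn=stand_in_threshold):
    """Return True (positive sample) or False (negative sample) for each IOU
    value, given the length w and width h of the target marking frame.
    Assumption: a candidate is positive when its IOU is >= the threshold."""
    t = threshold_fn(t0, w, h)
    return [iou >= t for iou in ious]


if __name__ == "__main__":
    candidates = [0.30, 0.45, 0.60]
    print(label_samples(candidates, w=10, h=10))   # square marking frame
    print(label_samples(candidates, w=100, h=10))  # elongated marking frame
```

Passing the threshold as `threshold_fn` keeps the labelling logic independent of the particular formula, so the formula of claim 2 can be substituted without changing `label_samples`.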
4. The deep learning-based target detection method of any one of claims 1 to 3, wherein a uniform marking mode is predefined and the training samples are marked according to the marking mode.
5. The deep learning-based target detection method of claim 4, wherein the marking mode is as follows: selecting a point in a preset orientation as the starting point, and marking the remaining three vertices of the target marking frame sequentially in a preset direction.
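A uniform marking mode of this kind can be sketched as a function that reorders the four vertices of a marking frame. The claim leaves the starting point and direction as presets; the sketch below assumes, purely for illustration, that the starting point is the topmost vertex (the leftmost one in case of a tie) and that the direction is clockwise as seen on screen in image coordinates with the y axis pointing down. The name `order_vertices` is hypothetical.

```python
import math


def order_vertices(pts):
    """Reorder the vertices of a convex marking frame so that they start at
    the topmost (then leftmost) vertex and proceed clockwise on screen.

    Assumptions for illustration: image coordinates (y grows downward), and
    the start point and direction chosen here stand in for the presets that
    the claim leaves open.
    """
    cx = sum(x for x, _ in pts) / len(pts)
    cy = sum(y for _, y in pts) / len(pts)
    # With y pointing down, increasing atan2 angle runs clockwise on screen.
    ring = sorted(pts, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    # Index of the preset starting vertex: smallest y, ties broken by smallest x.
    start = min(range(len(ring)), key=lambda i: (ring[i][1], ring[i][0]))
    return ring[start:] + ring[:start]


if __name__ == "__main__":
    print(order_vertices([(2, 2), (0, 0), (2, 0), (0, 2)]))
    print(order_vertices([(1, 2), (0, 1), (1, 0), (2, 1)]))
```

Applying the same ordering to every training sample means that corresponding vertices of different marking frames always appear at the same index, regardless of the order in which they were originally annotated.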
6. The deep learning-based target detection method of any one of claims 1 to 3, wherein the method is applied to the field of panel defect detection, and the detected target is an inclined defect.
7. A deep learning-based target detection system, characterized by comprising a training module for inputting training samples into a convolutional neural network for training and a detection module for detecting inclined targets using the trained convolutional neural network, wherein the training module comprises:
a feature extraction module, used for extracting features from a training sample to obtain a feature map of the training sample, the training sample being marked with a target marking frame in advance;
an ROI extraction module, used for extracting an ROI region from the feature map to obtain a feature image of a positive rectangular ROI region;
a region transformation module, used for performing coordinate transformation on the positive rectangular ROI region to obtain a feature image of the corresponding oblique rectangular ROI region; also used for calculating the IOU value between the feature image of the oblique rectangular ROI region and the target marking frame, and comparing the IOU value with a threshold T to determine positive samples and negative samples, wherein the threshold T has a predefined preset value and is dynamically adjusted so that T decreases as the length-width difference of the target marking frame increases and increases as the length-width difference decreases; and further used for converting the feature image of the oblique rectangular ROI region of each positive sample into a feature image of the corresponding positive rectangular ROI region;
and a classification module, used for receiving the converted feature image of the positive rectangular ROI region and outputting a detection result.
8. The deep learning-based target detection system of claim 7, wherein the threshold T satisfies
[formula given as an image in the original publication; not reproduced in the text]
where T0 is the preset threshold, w is the length of the target marking frame, and h is the width of the target marking frame.
9. A convolutional neural network data processing system for use in target detection, comprising:
a backbone network, used for extracting features from a training sample to obtain a feature map of the training sample, the training sample being marked with a target marking frame in advance;
a region selection network, used for extracting an ROI region from the feature map to obtain a feature image of a positive rectangular ROI region;
a region transformation network, used for performing coordinate transformation on the positive rectangular ROI region to obtain a feature image of the corresponding oblique rectangular ROI region; further used for calculating the IOU value between the feature image of the oblique rectangular ROI region and the oblique rectangular target marking frame, and comparing the IOU value with a threshold T to determine positive samples and negative samples, wherein the threshold T has a predefined preset value and is dynamically adjusted so that T decreases as the length-width difference of the target marking frame increases and increases as the length-width difference decreases; and further used for converting the feature image of the oblique rectangular ROI region of each positive sample into a feature image of the corresponding positive rectangular ROI region;
and a regression classification network, used for receiving the converted feature image of the positive rectangular ROI region and outputting a detection result.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the method according to any one of claims 1 to 6.
CN202010798762.6A 2020-08-11 2020-08-11 Target detection method, system and storage medium based on deep learning Active CN111680689B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010798762.6A CN111680689B (en) 2020-08-11 2020-08-11 Target detection method, system and storage medium based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010798762.6A CN111680689B (en) 2020-08-11 2020-08-11 Target detection method, system and storage medium based on deep learning

Publications (2)

Publication Number Publication Date
CN111680689A CN111680689A (en) 2020-09-18
CN111680689B CN111680689B (en) 2021-03-23

Family

ID=72458212

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010798762.6A Active CN111680689B (en) 2020-08-11 2020-08-11 Target detection method, system and storage medium based on deep learning

Country Status (1)

Country Link
CN (1) CN111680689B (en)

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170124415A1 (en) * 2015-11-04 2017-05-04 Nec Laboratories America, Inc. Subcategory-aware convolutional neural networks for object detection
CN108088422A (en) * 2018-01-24 2018-05-29 成都纵横自动化技术有限公司 A kind of method for determining true Duplication
KR20190125702A (en) * 2018-04-30 2019-11-07 전자부품연구원 Tracking Optimization Method using Cosine Distance and Intersection Area in Deep Learning based Tracking Module
CN108985186A (en) * 2018-06-27 2018-12-11 武汉理工大学 A kind of unmanned middle pedestrian detection method based on improvement YOLOv2
CN109344772A (en) * 2018-09-30 2019-02-15 中国人民解放军战略支援部队信息工程大学 Ultrashort wave signal specific reconnaissance method based on spectrogram and depth convolutional network
CN110020651A (en) * 2019-04-19 2019-07-16 福州大学 Car plate detection localization method based on deep learning network
CN110084831A (en) * 2019-04-23 2019-08-02 江南大学 Based on the more Bernoulli Jacob's video multi-target detecting and tracking methods of YOLOv3
CN110288017A (en) * 2019-06-21 2019-09-27 河北数云堂智能科技有限公司 High-precision cascade object detection method and device based on dynamic structure optimization
CN110991523A (en) * 2019-11-29 2020-04-10 西安交通大学 Interpretability evaluation method for unmanned vehicle detection algorithm performance
US20200134837A1 (en) * 2019-12-19 2020-04-30 Intel Corporation Methods and apparatus to improve efficiency of object tracking in video frames
CN111401290A (en) * 2020-03-24 2020-07-10 杭州博雅鸿图视频技术有限公司 Face detection method and system and computer readable storage medium
CN111507391A (en) * 2020-04-13 2020-08-07 武汉理工大学 Intelligent identification method for nonferrous metal broken materials based on machine vision

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KHIN LAY MON et al.: "AUTOMATIC IMAGE SEGMENTATION USING MARKER CONTROLLED WATERSHED AND OVERLAP RATIO BASED REGION MERGING", 2018 IEEE 7TH GLOBAL CONFERENCE ON CONSUMER ELECTRONICS *
LI Liangfu et al.: "Research on a bridge crack detection algorithm based on deep learning", Acta Automatica Sinica *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112171668A (en) * 2020-09-21 2021-01-05 河南颂达信息技术有限公司 Rail-mounted robot anti-jamming detection method and device based on artificial intelligence
CN112418263A (en) * 2020-10-10 2021-02-26 上海鹰瞳医疗科技有限公司 Medical image focus segmentation and labeling method and system
CN112505049A (en) * 2020-10-14 2021-03-16 上海互觉科技有限公司 Mask inhibition-based method and system for detecting surface defects of precision components
CN114608801A (en) * 2020-12-08 2022-06-10 重庆云石高科技有限公司 Automatic detection algorithm for falling of connecting wire of locomotive axle temperature probe
CN114608801B (en) * 2020-12-08 2024-04-19 重庆云石高科技有限公司 Automatic detection algorithm for falling off of connecting wire of locomotive shaft temperature probe
CN112949614A (en) * 2021-04-29 2021-06-11 成都市威虎科技有限公司 Face detection method and device for automatically allocating candidate areas and electronic equipment
CN113408342A (en) * 2021-05-11 2021-09-17 深圳大学 Target detection method for determining intersection ratio threshold based on features
CN116342607A (en) * 2023-05-30 2023-06-27 尚特杰电力科技有限公司 Power transmission line defect identification method and device, electronic equipment and storage medium
CN116342607B (en) * 2023-05-30 2023-08-08 尚特杰电力科技有限公司 Power transmission line defect identification method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN111680689B (en) 2021-03-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant