CN114677568A - Linear target detection method, module and system based on neural network - Google Patents


Info

Publication number: CN114677568A
Application number: CN202210595785.6A
Authority: CN (China)
Prior art keywords: linear, target, length, linear target, angle
Legal status: Granted (Active)
Other languages: Chinese (zh)
Other versions: CN114677568B (en)
Inventors: He Chao (何超), Deng Fucheng (邓富城)
Current Assignee: Shandong Jijian Technology Co., Ltd.
Original Assignee: Shandong Jivisual Angle Technology Co., Ltd.
Application filed by Shandong Jivisual Angle Technology Co., Ltd.; priority to CN202210595785.6A
Publication of CN114677568A; application granted; publication of CN114677568B

Classifications

    • G06F18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/23: Pattern recognition; clustering techniques
    • G06N3/045: Neural networks; combinations of networks
    • G06N3/084: Neural networks; learning methods; backpropagation, e.g. using gradient descent


Abstract

The application discloses a linear target detection method, module, and system based on a neural network, used to improve the detection accuracy of linear targets. The method comprises the following steps: acquiring an image to be detected; inputting the image to be detected into a linear target detection model, wherein the linear target detection model is obtained by extracting the linear features of linear target image samples, determining anchor points according to the angles and lengths contained in the linear features, and training based on the anchor points and the angle, length, center, and type contained in the linear features, the linear features being the features of the linear targets contained in the linear target image samples, the angle being the included angle between the linear target and the horizontal direction, the length being the length of the linear target, and the center being the midpoint of the linear target; determining a linear feature prediction value of the image to be detected through the linear target detection model; and determining a linear target detection result of the image to be detected according to the linear feature prediction value.

Description

Linear target detection method, module and system based on neural network
Technical Field
The present application relates to the field of image data processing, and in particular, to a method, a module, and a system for detecting a linear target based on a neural network.
Background
Linear object detection, such as lane line detection, component contour detection, etc., is one of the necessary technologies in the fields of intelligent traffic, unmanned driving, intelligent industry, etc.
At present, there are two main types of linear target detection methods: those based on traditional vision and those based on deep neural networks. Traditional-vision methods mainly perform pixel-level transformations on the image to enhance image quality, binarize the image, and then extract linear targets from it using traditional algorithms such as the Hough transform. Deep-neural-network methods mainly attempt to directly regress the midpoint, angle, and length of the line segment representing the linear target.
However, linear target detection based on traditional vision can only identify line-segment features and struggles to distinguish the types of linear targets, while linear target detection based on neural networks suffers from poor model robustness because the angles, lengths, and other attributes of real-world linear targets vary widely, making high detection accuracy difficult to achieve by direct regression.
Disclosure of Invention
The application provides a linear target detection method, a module and a system based on a neural network, which are used for improving the detection precision of a linear target.
The application provides a linear target detection method based on a neural network in a first aspect, which comprises the following steps:
acquiring an image to be detected;
inputting the image to be detected into a linear target detection model, wherein the linear target detection model is obtained by extracting linear features of a linear target image sample, determining an anchor point according to the angle and the length contained in the linear features, and training based on the angle, the length, the center and the type contained in the anchor point and the linear features, the linear features are the features of the linear target contained in the linear target image sample, the angle is the included angle between the linear target and the horizontal direction, the length is the length of the linear target, and the center is the midpoint of the linear target;
determining a linear characteristic predicted value of the image to be detected through the linear target detection model;
and determining a linear target detection result of the image to be detected according to the linear characteristic predicted value.
Optionally, before the acquiring an image to be detected, the linear target detection method further includes:
constructing a linear target image set, wherein the linear target image set comprises a plurality of linear target image samples, and each linear target image sample is marked with a center, an angle, a length and a type corresponding to a linear target of the linear target image sample;
clustering the lengths and the angles respectively, and determining m length clustering centers and n angle clustering centers;
constructing a linear target detection model based on a neural network by taking the m length clustering centers and the n angle clustering centers as anchor points;
and circularly and iteratively inputting a linear target image sample of the linear target image set into the linear target detection model for linear feature extraction, training according to the linear feature, calculating a loss function, and performing optimization updating on the linear target detection model by using a back propagation algorithm according to the loss function and the linear target image set to obtain a trained linear target detection model, wherein the loss function is constructed by including central prediction loss, length prediction loss, angle prediction loss, target loss and category prediction loss.
Optionally, the constructing the linear target image set includes:
acquiring a plurality of linear target image samples containing linear targets;
marking the end point coordinates and the types of the linear targets on the linear target image samples;
determining the center, the angle and the length of the linear target according to the endpoint coordinates and a target conversion formula;
constructing a linear target image set according to the marked linear target image samples;
the target conversion formula is:
L = √((x_2 − x_1)^2 + (y_2 − y_1)^2)
(Cx, Cy) = ((x_1 + x_2)/2, (y_1 + y_2)/2)
T = arctan((y_2 − y_1)/(x_2 − x_1))
wherein L is the length, (Cx, Cy) is the center, T is the angle, and (x_1, y_1) and (x_2, y_2) are the coordinates of the two endpoints of the linear target.
Optionally, the iteratively inputting the linear target image samples of the linear target image set into the linear target detection model in a loop manner to perform linear feature extraction, training the linear target detection model according to the linear features, calculating a loss function, and performing optimization updating on the linear target detection model according to the loss function and the linear target image set by using a back propagation algorithm to obtain the trained linear target detection model includes:
inputting the linear target image samples into a linear feature extraction backbone network for linear feature extraction and converting them into a feature map, wherein the width and the height of the feature map are both S, and the linear target detection model comprises the linear feature extraction backbone network and a linear target detection network;
determining the labeled target of the linear target contained in the feature map according to the linear features, wherein the labeled target is (g_cx, g_cy, g_length, g_theta, g_c), in which (g_cx, g_cy) are the coordinates of the target center point, g_length is the target length, g_theta is the target angle, and g_c is the target category information;
determining, from the target center point coordinates (g_cx, g_cy), the integer coordinates of their upper-left corner (g_cxi, g_cyj);
matching the target angle and the target length of the labeled target respectively with the anchor points in the linear target detection network based on the integer coordinates, to determine a target anchor point (T_i, L_j);
training the labeled target by using the linear target detection network based on the target anchor point (T_i, L_j);
determining, according to the training result and on the target feature map of size (m × n) × S × S × C, the target prediction value (p_x, p_y, p_length, p_theta, p_obj, p_c1, …, p_ck) of the labeled target from the C-dimensional vector at position (i × m + n, g_cyj, g_cxi), wherein C is the number of channels, C = 5 + K, and K is the number of categories of the linear target;
calculating a loss function according to the labeled coordinates, the upper-left integer coordinates, and the target prediction value;
and optimizing and updating the linear target detection network by using a back propagation algorithm according to the loss function, and iteratively inputting linear target image samples from the linear target image set for training until the value of the loss function is minimized, to obtain the trained linear target detection model.
Optionally, the loss function is composed of a center prediction loss, a length prediction loss, an angle prediction loss, a target loss, and a category prediction loss;
the target loss comprises a prediction loss for whether a target exists and a loss for positions determined to have no target;
the loss function formula is:
L_s = γ_1·L_xy + γ_2·L_length + γ_3·L_theta + γ_4·L_obj + γ_5·L_cls + γ_6·L_nobj
where γ_i are the weighting coefficients;
the center prediction loss is L_xy = SmoothL1((p_x, p_y), (g_cx − g_cxi, g_cy − g_cyj));
the length prediction loss is L_length = SmoothL1(p_length, g_length / L_j);
the angle prediction loss is L_theta = SmoothL1(p_theta, g_theta / T_i);
the prediction loss for whether a target exists is L_obj = BCE(p_obj, 1);
the loss for positions determined to have no target is L_nobj = BCE(p_obj, 0);
and the category prediction loss is L_cls = BCE((p_c1, …, p_ck), g_c).
Optionally, the matching the target angle and the target length of the labeled target respectively with the anchor points in the linear target detection network to determine a target anchor point (T_i, L_j) comprises:
matching the target angle of the labeled target with the n anchor angles (T_1, T_2, …, T_n), and determining the anchor angle T_i closest to the target angle;
matching the target length of the labeled target with the m anchor lengths (L_1, L_2, …, L_m), and determining the anchor length L_j closest to the target length;
and determining the target anchor point (T_i, L_j) from the anchor angle T_i and the anchor length L_j.
Optionally, the determining, by the linear target detection model, a linear feature prediction value of the image to be detected includes:
determining the target characteristics to be detected of the image to be detected through the linear target detection model, and acquiring a target characteristic diagram;
traversing and matching all anchor points contained in the target feature map according to the target feature to be detected;
and determining a matching anchor point according to the matching result, and determining a linear characteristic predicted value according to the matching anchor point.
Optionally, the determining a linear target detection result of the image to be detected according to the linear feature prediction value includes:
judging whether a linear target exists according to the linear characteristic predicted value;
and if so, outputting a linear target detection result of the image to be detected according to the linear characteristic prediction value and a linear target reduction formula, wherein the linear target detection result comprises an angle, a length, a center and a type.
A second aspect of the present application provides a linear feature extraction module, which is used for linear feature extraction in the linear target detection method described in the first aspect, and includes:
the device comprises an input end, a 1 × 1 convolution module, a 1 × 3 convolution module, a 3 × 1 convolution module, a stacking operation module, a 3 × 3 convolution module and an output end;
the input end is connected with the 1 x 1 convolution module;
the 1 × 1 convolution module is respectively connected with the 1 × 3 convolution module and the 3 × 1 convolution module, the 1 × 3 convolution module is used for extracting linear gradient information biased towards the vertical direction, and the 3 × 1 convolution module is used for extracting linear gradient information biased towards the horizontal direction;
the stacking operation module is respectively connected with the 1 × 3 convolution module and the 3 × 1 convolution module, and is used for respectively acquiring linear features corresponding to the linear gradient information output by the 1 × 3 convolution module and the 3 × 1 convolution module, and performing stacking operation according to the linear features;
and the 3 × 3 convolution module is connected with the stacking operation module and the output end and is used for acquiring the linear features output by the stacking operation module, performing 3 × 3 convolution on the linear features and outputting the linear features after the 3 × 3 convolution to the output end.
A third aspect of the present application provides a linear target detection system based on a neural network, including:
the acquisition unit is used for acquiring an image to be detected;
the image detection device comprises an input unit, a detection unit and a processing unit, wherein the input unit is used for inputting the image to be detected into a linear target detection model, the linear target detection model is obtained by extracting linear characteristics of a linear target image sample, determining an anchor point according to an angle and a length contained in the linear characteristics and then training based on the angle, the length, the center and the type contained in the anchor point and the linear characteristics, the linear characteristics are characteristics of a linear target contained in the linear target image sample, the angle is an included angle between the linear target and the horizontal direction, the length is the length of the linear target, and the center is a midpoint of the linear target;
the first determining unit is used for determining a linear feature prediction value of the image to be detected through the linear target detection model;
and the second determining unit is used for determining a linear target detection result of the image to be detected according to the linear characteristic predicted value.
Optionally, the linear target detection system further comprises:
a first construction unit, used for constructing a linear target image set, wherein the linear target image set comprises a plurality of linear target image samples, and each linear target image sample is marked with the center, angle, length, and type corresponding to its linear target;
the clustering unit is used for clustering the lengths and the angles respectively and determining m length clustering centers and n angle clustering centers;
the second construction unit is used for constructing a linear target detection model based on the neural network by taking the m length clustering centers and the n angle clustering centers as anchor points;
and the training unit is used for inputting the linear target image samples of the linear target image set into the linear target detection model in a circulating iteration mode for linear feature extraction, training according to the linear features, calculating a loss function, and performing optimization updating on the linear target detection model by using a back propagation algorithm according to the loss function and the linear target image set to obtain the trained linear target detection model, wherein the loss function is constructed by including central prediction loss, length prediction loss, angle prediction loss, target loss and category prediction loss.
Optionally, the first constructing unit is specifically configured to obtain a plurality of linear target image samples including a linear target;
marking the end point coordinates and the types of the linear targets on the linear target image samples;
determining the center, the angle and the length of the linear target according to the endpoint coordinates and a target conversion formula;
constructing a linear target image set according to the marked linear target image samples;
the target conversion formula is:
L = √((x_2 − x_1)^2 + (y_2 − y_1)^2)
(Cx, Cy) = ((x_1 + x_2)/2, (y_1 + y_2)/2)
T = arctan((y_2 − y_1)/(x_2 − x_1))
wherein L is the length, (Cx, Cy) is the center, T is the angle, and (x_1, y_1) and (x_2, y_2) are the coordinates of the two endpoints of the linear target.
Optionally, the training unit is specifically configured to input the linear target image samples into a linear feature extraction backbone network for linear feature extraction and convert them into a feature map, wherein the width and the height of the feature map are both S, and the linear target detection model comprises the linear feature extraction backbone network and a linear target detection network;
determine the labeled target of the linear target contained in the feature map according to the linear features, wherein the labeled target is (g_cx, g_cy, g_length, g_theta, g_c), in which (g_cx, g_cy) are the coordinates of the target center point, g_length is the target length, g_theta is the target angle, and g_c is the target category information;
determine, from the target center point coordinates (g_cx, g_cy), the integer coordinates of their upper-left corner (g_cxi, g_cyj);
match the target angle and the target length of the labeled target respectively with the anchor points in the linear target detection network based on the integer coordinates, and determine a target anchor point (T_i, L_j);
train the labeled target by using the linear target detection network based on the target anchor point (T_i, L_j);
determine, according to the training result and on the target feature map of size (m × n) × S × S × C, the target prediction value (p_x, p_y, p_length, p_theta, p_obj, p_c1, …, p_ck) of the labeled target from the C-dimensional vector at position (i × m + n, g_cyj, g_cxi), wherein C is the number of channels, C = 5 + K, and K is the number of categories of the linear target;
calculate a loss function according to the labeled coordinates, the upper-left integer coordinates, and the target prediction value;
and optimize and update the linear target detection network by using a back propagation algorithm according to the loss function, and iteratively input linear target image samples from the linear target image set for training until the value of the loss function is minimized, to obtain the trained linear target detection model.
Optionally, the loss function is composed of a center prediction loss, a length prediction loss, an angle prediction loss, a target loss, and a category prediction loss;
the target loss comprises a prediction loss for whether a target exists and a loss for positions determined to have no target;
the loss function formula is:
L_s = γ_1·L_xy + γ_2·L_length + γ_3·L_theta + γ_4·L_obj + γ_5·L_cls + γ_6·L_nobj
where γ_i are the weighting coefficients;
the center prediction loss is L_xy = SmoothL1((p_x, p_y), (g_cx − g_cxi, g_cy − g_cyj));
the length prediction loss is L_length = SmoothL1(p_length, g_length / L_j);
the angle prediction loss is L_theta = SmoothL1(p_theta, g_theta / T_i);
the prediction loss for whether a target exists is L_obj = BCE(p_obj, 1);
the loss for positions determined to have no target is L_nobj = BCE(p_obj, 0);
and the category prediction loss is L_cls = BCE((p_c1, …, p_ck), g_c).
Optionally, the training unit is specifically configured to match the target angle of the labeled target with the n anchor angles (T_1, T_2, …, T_n), and determine the anchor angle T_i closest to the target angle;
match the target length of the labeled target with the m anchor lengths (L_1, L_2, …, L_m), and determine the anchor length L_j closest to the target length;
and determine the target anchor point (T_i, L_j) from the anchor angle T_i and the anchor length L_j.
Optionally, the first determining unit is specifically configured to determine, through the linear target detection model, a target feature to be detected of the image to be detected, and acquire a target feature map;
traversing and matching all anchor points contained in the target feature map according to the target feature to be detected;
and determining a matching anchor point according to the matching result, and determining a linear characteristic predicted value according to the matching anchor point.
Optionally, the second determining unit is specifically configured to determine whether a linear target exists according to the linear feature prediction value;
and if so, outputting a linear target detection result of the image to be detected according to the linear characteristic prediction value and a linear target reduction formula, wherein the linear target detection result comprises an angle, a length, a center and a type.
A fourth aspect of the present application provides a linear target detection apparatus based on a neural network, the apparatus including:
the device comprises a processor, a memory, an input and output unit and a bus;
the processor is connected with the memory, the input and output unit and the bus;
the memory holds a program that the processor calls to perform the linear target detection method of the first aspect or any optional implementation of the first aspect.
A fifth aspect of the present application provides a computer-readable storage medium having a program stored thereon, where the program, when executed on a computer, performs the linear target detection method of the first aspect or any optional implementation of the first aspect.
According to the technical scheme, the method has the following advantages:
firstly, acquiring an image to be detected; and inputting the image to be detected into a linear target detection model, wherein the linear target detection model is obtained by extracting the linear characteristics of a linear target image sample, determining an anchor point according to the angle and the length contained in the linear characteristics and training based on the angle, the length, the center and the type contained in the anchor point and the linear characteristics. And then determining a linear characteristic predicted value of the image to be detected through a linear target detection model, and determining a linear target detection result of the image to be detected according to the linear characteristic predicted value.
By the linear target detection method, the type of the linear target can be detected by the linear target detection model; the linear target detection model is obtained by training the angles and lengths contained in the linear target as anchor points, and the anchor points of the angles and the lengths are set as the initial points of model optimization, so that the network model is more stable, the possibility of poor robustness of the network model is reduced, and the detection precision of the linear target detection is improved.
Drawings
In order to more clearly illustrate the technical solutions in the present application, the drawings needed in the description of the embodiments are briefly introduced below. It is apparent that the drawings in the following description show only some embodiments of the present application, and other drawings can be obtained from them by those skilled in the art without creative effort.
Fig. 1 is a schematic flowchart of an embodiment of a neural network-based linear target detection method provided in the present application;
FIG. 2 is a schematic flow chart illustrating another embodiment of a neural network-based linear target detection method provided in the present application;
FIG. 3 is a schematic structural diagram of an embodiment of a linear feature extraction module provided in the present application;
FIG. 4 is a schematic structural diagram of an embodiment of a neural network-based linear target detection system provided in the present application;
FIG. 5 is a schematic structural diagram of another embodiment of a neural network-based linear target detection system provided in the present application;
fig. 6 is a schematic structural diagram of an embodiment of a linear target detection device based on a neural network provided in the present application.
Detailed Description
The application provides a linear target detection method, a module and a system based on a neural network, which are used for improving the precision of linear target detection.
It should be noted that the neural network-based linear target detection method provided by the present application may be applied to a terminal or to a server; the terminal may be, for example, a smartphone, a tablet computer, a smart television, a smart watch, a portable computer terminal, or a fixed terminal such as a desktop computer. For convenience of explanation, the terminal is taken as the execution subject in the present application.
Referring to fig. 1, fig. 1 is a schematic flowchart of an embodiment of the neural network-based linear target detection method provided in the present application, the method including:
101. acquiring an image to be detected;
in this embodiment, the terminal may obtain the image to be detected from locally stored images, crawl it from the network, or obtain it in other manners, which is not limited herein. Further, the image to be detected may be an image that conforms to the input size of the linear target detection model and has been normalized. Alternatively, the terminal may directly acquire an image, adjust its size to the preset input size of the linear target detection model, and divide the pixel values of the resized image by 255 for image normalization, thereby obtaining a resized and normalized image to be detected. Performing linear target detection on an image that conforms to the input size and is normalized improves the accuracy of linear target detection.
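As an illustration of the preprocessing just described, the following Python sketch resizes an image to the model input size and divides the pixel values by 255 for normalization; the 640-pixel input size, the use of OpenCV, and the helper name `preprocess` are illustrative assumptions, not values fixed by this application.

```python
import cv2
import numpy as np

def preprocess(image_path: str, input_size: int = 640) -> np.ndarray:
    """Resize an image to the model input size and normalize pixels to [0, 1]."""
    img = cv2.imread(image_path)                     # H x W x 3, uint8
    img = cv2.resize(img, (input_size, input_size))  # match the preset model input size
    img = img.astype(np.float32) / 255.0             # divide pixel values by 255
    return np.transpose(img, (2, 0, 1))              # C x H x W layout for the network
```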
102. Inputting an image to be detected into a linear target detection model, wherein the linear target detection model is obtained by extracting linear characteristics of a linear target image sample, determining an anchor point according to the angle and the length contained by the linear characteristics and training based on the angle, the length, the center and the type contained by the anchor point and the linear characteristics, the linear characteristics are characteristics of a linear target contained in the linear target image sample, the angle is the included angle between the linear target and the horizontal direction, the length is the length of the linear target, and the center is the midpoint of the linear target;
in this embodiment, the terminal inputs the acquired image to be detected into the linear target detection model, so that the linear target detection is performed by the linear target detection model. It should be noted that the linear target detection model is obtained by training a linear target image sample. Specifically, linear features in a linear target image sample are extracted, a plurality of anchor points are determined according to angles and lengths in the linear features, the anchor points are used as optimized starting points in a linear target detection model, a large number of linear target image samples are trained based on the anchor points, and a network is continuously optimized, so that the trained linear target detection model is obtained, and the trained linear target detection model can detect a linear target and determine the length, the angle, the center and the type corresponding to the linear target. Further, the specific training process of the linear object detection model is described in detail in the following embodiments, and is not specifically described here.
103. Determining a linear characteristic predicted value of the image to be detected through the linear target detection model;
in this embodiment, after the terminal inputs the image to be detected into the trained linear target detection model, the linear target feature in the image to be detected is output according to the linear target detection model, and then the linear feature prediction value of the image to be detected is determined according to the linear target feature.
104. And determining a linear target detection result of the image to be detected according to the linear characteristic prediction value.
In this embodiment, the terminal determines a linear target detection result of the image to be detected according to the linear feature prediction value, specifically, determines whether a linear target exists according to the linear feature prediction value, and if so, determines a center, an angle, a length, and a type corresponding to the linear target according to the linear feature prediction value.
In this embodiment, after the terminal acquires the image to be detected, the image to be detected is input to the linear target detection model. The linear target detection model is obtained by training according to the angle, the length, the center and the type of the corresponding linear target in the linear target image sample, so that the type of the linear target in the image to be detected can be determined through the linear target detection model. Meanwhile, the linear target detection model takes the length and the angle of a linear target as anchor points, model optimization is carried out based on the anchor points, the linear target detection model can be more stable during training, the problem that the robustness of the detection model is poor is reduced, and therefore the detection precision of the linear target detection is improved.
In order to make the neural network-based linear target detection method provided by the present application clearer and easier to understand, the method is described in detail below:
referring to fig. 2, fig. 2 is a schematic diagram illustrating another embodiment of a neural network-based linear target detection method according to the present application, the method including:
201. constructing a linear target image set, wherein the linear target image set comprises a plurality of linear target image samples, and each linear target image sample is marked with a center, an angle, a length and a type corresponding to a linear target of the linear target image sample;
in this embodiment, the terminal constructs a linear target image set including a large number of linear target image samples, where the linear target image set is used as a training sample for training a linear target detection model, and each linear target image sample in the image set includes a linear target and is labeled with a center, an angle, a length, and a type corresponding to the linear target. Specifically, a linear target image set may be constructed according to the following manner:
firstly, a large number of linear target image samples containing linear targets are obtained, for example from the network by means of a crawler, or in other manners, which is not limited herein. The linear target image samples may be preprocessed after being acquired, for example normalized or resized. The linear targets contained in the linear target image samples are then labeled, specifically with the center, angle, length, and type of each linear target. For labeling the center, angle, and length of a linear target, the endpoint coordinates may be determined first and then converted into the center, angle, and length according to the endpoint coordinates and a target conversion formula. For example, the linear targets in the linear target image sample may be abstracted into line-segment targets, the two endpoint coordinates of each line-segment target determined using annotation software, and then converted into a center, an angle, and a length according to the target conversion formula:
L = √((x_2 − x_1)^2 + (y_2 − y_1)^2)
(Cx, Cy) = ((x_1 + x_2)/2, (y_1 + y_2)/2)
T = arctan((y_2 − y_1)/(x_2 − x_1))
where L is the length, (Cx, Cy) is the center, T is the angle, and (x_1, y_1) and (x_2, y_2) are the coordinates of the two endpoints of the linear target. Alternatively, other labeling methods can be used to label the angle, center, and length of the linear target.
It should be noted that the labeling software may be the existing labeling software or may be designed by itself; the target conversion formula may be as described above, or may be other formulas, and is not limited herein.
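A minimal Python sketch of the target conversion follows; the use of atan2 for the angle is an assumption about the arctangent convention, and `endpoints_to_line` is a hypothetical helper name.

```python
import math

def endpoints_to_line(x1: float, y1: float, x2: float, y2: float):
    """Convert two labeled endpoints into (center, length, angle)."""
    length = math.hypot(x2 - x1, y2 - y1)    # L = sqrt((x2 - x1)^2 + (y2 - y1)^2)
    cx = (x1 + x2) / 2.0                     # Cx: midpoint x
    cy = (y1 + y2) / 2.0                     # Cy: midpoint y
    theta = math.atan2(y2 - y1, x2 - x1)     # T: angle to the horizontal (convention assumed)
    return (cx, cy), length, theta

# Example: a 3-4-5 segment -> center (1.5, 2.0), length 5.0, angle ~0.927 rad.
print(endpoints_to_line(0, 0, 3, 4))
```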
Then, after the linear target image samples are labeled, the terminal integrates the labeled linear target image samples, so as to construct a linear target image set.
202. Clustering the lengths and the angles respectively, and determining m length clustering centers and n angle clustering centers;
in this embodiment, the terminal clusters the angles and lengths corresponding to the linear targets in all linear target image samples of the linear target image set, respectively, obtaining n angle cluster centers (T_1, T_2, …, T_n) corresponding to all angle values and m length cluster centers (L_1, L_2, …, L_m) corresponding to all length values. It should be noted that the number n of angle cluster centers and the number m of length cluster centers may be set according to how the angle values and length values are distributed in the linear target image set: if the distribution of angle values/length values is concentrated, a smaller number of angle/length cluster centers may be used; if the distribution is dispersed, a larger number may be used, which is not limited herein.
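The application does not name a specific clustering algorithm, so the sketch below uses one-dimensional k-means from scikit-learn as one plausible choice; the sample label values and the choices n = 4 and m = 3 are purely illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_1d(values, k):
    """Cluster scalar labels (all lengths, or all angles) into k cluster centers."""
    arr = np.asarray(values, dtype=np.float32).reshape(-1, 1)
    km = KMeans(n_clusters=k, n_init=10).fit(arr)
    return sorted(float(c) for c in km.cluster_centers_.ravel())

# Hypothetical labels gathered from a linear target image set.
all_angles = [0.05, 0.10, 0.80, 0.85, 1.55, 1.60, 2.30, 2.40]   # radians
all_lengths = [35.0, 45.0, 110.0, 130.0, 290.0, 310.0]          # pixels

angle_centers = cluster_1d(all_angles, k=4)     # n = 4 angle cluster centers
length_centers = cluster_1d(all_lengths, k=3)   # m = 3 length cluster centers
```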
203. Constructing a linear target detection model based on a neural network by taking the m length clustering centers and the n angle clustering centers as anchor points;
in this embodiment, the terminal uses the m length cluster centers and the n angle cluster centers as the basis for setting anchor points, and constructs a neural network-based linear target detection model based on these anchor points. Specifically, m × n anchor points (T_1, L_1), …, (T_1, L_m), …, (T_n, L_1), …, (T_n, L_m) are generated from the m length cluster centers and the n angle cluster centers.
And building a network structure based on the m-n anchor points, setting corresponding parameters, such as feature length, category number and the like, and then building a linear target detection model by using the network structure and a deep learning framework.
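Generating the m × n anchor points from the cluster centers amounts to a Cartesian product, as in this sketch (the center values repeat the hypothetical ones above):

```python
from itertools import product

# Hypothetical cluster centers: n = 4 angles (radians), m = 3 lengths (pixels).
angle_centers = [0.0, 0.79, 1.57, 2.36]
length_centers = [40.0, 120.0, 300.0]

# m*n anchors: (T_1, L_1), ..., (T_1, L_m), ..., (T_n, L_1), ..., (T_n, L_m).
anchors = [(t, l) for t, l in product(angle_centers, length_centers)]
print(len(anchors))  # 12 = m * n
```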
204. Inputting linear target image samples of a linear target image set into the linear target detection model in a circulating iteration mode for linear feature extraction, training according to linear features, calculating a loss function, and performing optimization updating on the linear target detection model by using a back propagation algorithm according to the loss function and the linear target image set to obtain a trained linear target detection model, wherein the loss function is constructed by center prediction loss, length prediction loss, angle prediction loss, target loss and category prediction loss;
in this embodiment, after the linear target detection model is constructed, the terminal circularly inputs the linear target image samples included in the linear target image set into the linear target detection model, and extracts linear features from the linear target image samples. And then training according to the extracted linear features, calculating a loss function, and optimizing and updating parameters of the linear target detection model by using a back propagation algorithm according to the loss value of the loss function and the linear target image sample set so as to obtain the optimized linear target detection model. For example, the terminal may sort a plurality of linear target image samples in the linear target image set, input the linear target image samples of the first sequence number into the linear target detection model according to the sorting result to perform linear feature extraction, calculate a loss function, perform parameter optimization and update on the detection model by using a back propagation algorithm according to a loss value of the loss function, input the linear target image samples of the second sequence number into the detection model to perform feature extraction again, calculate the loss function and perform corresponding model parameter optimization and update, and circulate until the loss function converges or a function value of the loss function is minimized, thereby obtaining the trained linear target detection model.
Further, the following describes the training process of the linear target detection model in detail:
firstly, the linear target detection model comprises a linear feature extraction backbone network and a linear target detection network. The linear feature extraction backbone network is a residual network containing the linear feature extraction module shown in fig. 3; performing linear feature extraction with this module inside the continuously stacked residual structures of the residual network improves the effect of linear feature extraction.
In this embodiment, the terminal inputs the linear target image sample into the linear feature extraction backbone network for linear feature extraction and converts it into a feature map, i.e., the labeled target corresponding to the linear features in the linear target image sample is matched onto the feature map. Specifically, the labeled target of the linear target image sample can be converted into a labeled target on the feature map according to the extracted linear features and the size of the feature map, and the integer coordinates of the upper-left corner of the labeled target in the feature map are then determined from the target center point coordinates. For example, if the width and height of the feature map are both S, the original image size of the linear target image sample is scaled to the size of the feature map, and the angle, length, and center contained in the linear features of the sample are converted accordingly into the labeled target (g_cx, g_cy, g_length, g_theta, g_c) at feature-map scale, where (g_cx, g_cy) are the coordinates of the target center point, g_length is the target length, g_theta is the target angle, and g_c is the target category information; the integer coordinates of the upper-left corner (g_cxi, g_cyj) are then determined from the target center point coordinates (g_cx, g_cy). When the center coordinates in the linear target image sample are converted to feature-map scale, they undergo the corresponding scaling transformation, but the scaled target center coordinates (g_cx, g_cy) are then generally fractional, so the target center point coordinates (g_cx, g_cy) need to be converted into integer coordinates. In this embodiment, the integer coordinates (g_cxi, g_cyj) are determined by directly taking the upper-left corner. It should be noted that the target center point coordinates may also be converted into integer coordinates in other ways, which is not limited herein.
Then, based on the integer coordinates, the terminal matches the target angle and the target length in the labeled target against all anchor points to determine a target anchor point, learns and trains the labeled target based on the target anchor point using the network architecture of the constructed linear target detection network, and determines the target prediction value of the labeled target according to the training result. Specifically, the target angle g_theta of the labeled target (g_cx, g_cy, g_length, g_theta, g_c) is matched against the n anchor angles (T_1, T_2, …, T_n) to determine the anchor angle T_i closest to the target angle g_theta; the target length g_length of the labeled target is then matched against the m anchor lengths (L_1, L_2, …, L_m) to determine the anchor length L_j closest to the target length; the target anchor point (T_i, L_j) is determined from the anchor angle T_i and the anchor length L_j. Thus, in the linear target detection model, the labeled target (g_cx, g_cy, g_length, g_theta, g_c) is learned and trained with the target anchor point (T_i, L_j) as the starting point of training. Then, according to the training result, on the target feature map of size (m × n) × S × S × C, the target prediction value (p_x, p_y, p_length, p_theta, p_obj, p_c1, …, p_ck) of the labeled target is determined from the C-dimensional vector at position (i × m + n, g_cyj, g_cxi), where i × m + n is the result of anchor matching, (g_cyj, g_cxi) is the result of midpoint matching, C is the number of channels, C = 5 + K, and K is the number of classes of linear targets in the linear image sample set. Specifically, the target feature map has four dimensions: the first dimension m × n indicates how many (angle, length) anchor points there are, the second dimension S is the height of the feature map, the third dimension S is its width, and the fourth dimension C is the number of channels. Thus, m × n anchor points correspond to each position of the S × S feature map, i.e., the total number of anchor points in the target feature map is S × S × m × n. Each anchor point then corresponds to C channels, which are used to predict the different attribute values: the center point coordinates, the angle, the length, whether a target exists, the category, and so on. During training, based on the four-dimensional feature map of (m × n) × S × S × C and the labeled target, each position of the three dimensions (m × n) × S × S is traversed, and the angle, length, center, and type of the labeled target are predicted through each channel, so that the C-dimensional vector at position (i × m + n, g_cyj, g_cxi), corresponding to the C channels, finally outputs the target prediction value (p_x, p_y, p_length, p_theta, p_obj, p_c1, …, p_ck), where (p_x, p_y) is the center point coordinate prediction, p_length the length prediction, p_theta the angle prediction, p_obj the prediction of whether a linear target exists, and (p_c1, …, p_ck) the category prediction.
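A sketch of matching and encoding one labeled target as described above; the flat anchor index (angle index × m + length index, reading the text's i × m + n accordingly) and the requirement that the matched anchor angle be non-zero are assumptions:

```python
def encode_target(g_cx, g_cy, g_length, g_theta, angle_centers, length_centers):
    """Match one labeled line segment (at feature-map scale) to its grid cell and
    closest (angle, length) anchor, and build the regression targets.
    A sketch of the scheme described above, not this application's verbatim code."""
    i, j = int(g_cx), int(g_cy)  # integer upper-left corner (g_cxi, g_cyj)
    ai = min(range(len(angle_centers)), key=lambda a: abs(angle_centers[a] - g_theta))
    li = min(range(len(length_centers)), key=lambda l: abs(length_centers[l] - g_length))
    anchor_index = ai * len(length_centers) + li  # assumed flat index for the anchor dimension
    t_xy = (g_cx - i, g_cy - j)                   # center-offset targets for (p_x, p_y)
    t_length = g_length / length_centers[li]      # length target relative to L_j
    t_theta = g_theta / angle_centers[ai]         # angle target relative to T_i (assumes T_i != 0)
    return anchor_index, (i, j), t_xy, t_length, t_theta
```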
And finally, calculating a loss function according to the target predicted value, the labeled coordinate and the integer coordinate of the upper left corner, optimizing and updating the linear target detection network by using a back propagation algorithm according to the loss function, and circularly and iteratively inputting a new linear target image sample for training until the loss function value is minimized to obtain a trained linear target detection model.
It should be noted that the loss function is constructed from a center prediction loss, a length prediction loss, an angle prediction loss, a target loss, and a category prediction loss, where the target loss comprises a prediction loss for whether a target exists and a loss for positions determined to have no target. Specifically, a SmoothL1 loss (smoothed L1 norm) is calculated for the center prediction, the length prediction, and the angle prediction; a BCE loss (binary cross-entropy) is calculated for the prediction of whether a target exists, for the determination that no target exists (prediction points that match no labeled target), and for the category prediction. The loss function formula is:
L_s = γ_1·L_xy + γ_2·L_length + γ_3·L_theta + γ_4·L_obj + γ_5·L_cls + γ_6·L_nobj
where γ_i are the weighting coefficients;
the center prediction loss is L_xy = SmoothL1((p_x, p_y), (g_cx − g_cxi, g_cy − g_cyj));
the length prediction loss is L_length = SmoothL1(p_length, g_length / L_j);
the angle prediction loss is L_theta = SmoothL1(p_theta, g_theta / T_i);
the prediction loss for whether a target exists is L_obj = BCE(p_obj, 1);
the loss for positions determined to have no target is L_nobj = BCE(p_obj, 0);
and the category prediction loss is L_cls = BCE((p_c1, …, p_ck), g_c).
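A PyTorch-style sketch of the weighted loss above; the dictionary layout of matched predictions and targets, and the use of probabilities (not logits) for the BCE terms, are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def line_detection_loss(pred, target, gammas):
    """Weighted sum of the six terms: center, length, angle, objectness at
    positive and negative positions, and category. A sketch, not this
    application's verbatim implementation."""
    l_xy = F.smooth_l1_loss(pred["xy"], target["xy"])              # L_xy
    l_length = F.smooth_l1_loss(pred["length"], target["length"])  # L_length
    l_theta = F.smooth_l1_loss(pred["theta"], target["theta"])     # L_theta
    ones = torch.ones_like(pred["obj_pos"])
    zeros = torch.zeros_like(pred["obj_neg"])
    l_obj = F.binary_cross_entropy(pred["obj_pos"], ones)          # L_obj: target present
    l_nobj = F.binary_cross_entropy(pred["obj_neg"], zeros)        # L_nobj: no target
    l_cls = F.binary_cross_entropy(pred["cls"], target["cls"])     # L_cls
    g1, g2, g3, g4, g5, g6 = gammas
    return g1 * l_xy + g2 * l_length + g3 * l_theta + g4 * l_obj + g5 * l_cls + g6 * l_nobj
```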
205. Acquiring an image to be detected;
206. inputting an image to be detected into a linear target detection model, wherein the linear target detection model is obtained by extracting linear characteristics of a linear target image sample, determining an anchor point according to the angle and the length contained in the linear characteristics and training based on the angle, the length, the center and the type contained in the anchor point and the linear characteristics, the linear characteristics are characteristics of a linear target contained in the linear target image sample, the angle is an included angle between the linear target and the horizontal direction, the length is the length of the linear target, and the center is the midpoint of the linear target;
steps 205 and 206 in this embodiment are similar to steps 101 and 102 in the embodiment shown in fig. 1, and are not described herein again.
207. Determining the target characteristics to be detected of the image to be detected through a linear target detection model, and acquiring a target characteristic diagram;
optionally, in this embodiment, after the terminal inputs the image to be detected into the model and determines the target feature to be detected of the image to be detected, the terminal obtains the target feature map. Specifically, a four-dimensional feature map of size (m × n) × S × S × C is obtained.
208. Traversing and matching all anchor points contained in the target feature graph according to the target feature to be detected;
optionally, in this embodiment, the terminal traverses all anchor point positions in the target feature map according to the target feature to be detected, and matches the target feature to be detected against the anchor points. Specifically, after the four-dimensional target feature map of size (m × n) × S × S × C is acquired, each anchor point position in its first three dimensions, (m × n) × S × S, is traversed and matched.
209. Determining a matching anchor point according to the matching result, and determining a linear characteristic predicted value according to the matching anchor point;
optionally, in this embodiment, the terminal determines, according to the matching result, the matching anchor point that best matches the target feature to be detected, and determines the corresponding linear feature prediction value based on that matching anchor point; determining the prediction value from the matched anchor point improves the accuracy of the linear feature prediction value.
210. Judging whether a linear target exists according to the linear feature prediction value, if so, executing step 211;
optionally, in this embodiment, assume that the linear feature prediction value determined by the terminal is (p_x, p_y, p_length, p_theta, p_obj, p_c1, …, p_ck). Whether a linear target exists is first determined according to p_obj: specifically, it is determined whether p_obj is greater than 1; if p_obj > 1, a linear target exists, otherwise no linear target exists. When the terminal determines that a linear target exists, step 211 is performed. When it is determined that no linear target exists, the terminal may feed back to the user that no linear target exists, or end the process, which is not limited herein.
211. And outputting a linear target detection result of the image to be detected according to the linear characteristic prediction value and a linear target reduction formula, wherein the linear target detection result comprises an angle, a length, a center and a type.
Optionally, in this embodiment, after the terminal determines that a linear target exists, it outputs the linear target detection result of the image to be detected according to the linear feature prediction value and the linear target reduction formulas. Specifically, the coordinates (x, y) of the center are determined according to the center point position reduction formula:
x = (i + p_x) × W / S,  y = (j + p_y) × H / S
where i is the width position of the matching anchor point on the target feature map, j is its height position, and W and H are the width and height of the input image to be detected.
The angle θ and the length l are determined according to the angle and length reduction formulas:
θ = p_theta × T,  l = p_length × L
where L is the anchor length corresponding to the matching anchor point, and T is the anchor angle corresponding to the matching anchor point.
Finally, the type is determined by taking the index of the maximum of the k values (p_c1, …, p_ck): if p_c1 is the maximum, the target belongs to the first class; if p_c3 is the maximum, the target belongs to the third class; if p_ck is the maximum, the target belongs to the k-th class; and so on.
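Putting the reduction formulas and the class argmax together, decoding one anchor's prediction might look like the following sketch; the decode formulas mirror the training encoding and are assumptions where the original formula images are unrecoverable, and the p_obj > 1 test follows step 210:

```python
def decode_prediction(p, i, j, anchor, S, W, H):
    """Restore (center, length, angle, class) from one C-dimensional prediction
    p = (p_x, p_y, p_length, p_theta, p_obj, p_c1, ..., p_ck) at feature-map
    position (i, j) for anchor (T, L). A sketch, not the verbatim method."""
    p_x, p_y, p_length, p_theta, p_obj = p[:5]
    if p_obj <= 1:                 # objectness test as stated in step 210
        return None                # no linear target at this position
    T, L = anchor
    x = (i + p_x) * W / S          # center x restored to input-image scale
    y = (j + p_y) * H / S          # center y restored to input-image scale
    length = p_length * L          # length relative to the matched anchor length L
    theta = p_theta * T            # angle relative to the matched anchor angle T
    scores = p[5:]                 # (p_c1, ..., p_ck)
    cls = max(range(len(scores)), key=lambda k: scores[k])  # index of the maximum value
    return (x, y), length, theta, cls
```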
In the embodiment, the angle clustering center and the length clustering center are determined according to the linear target image sample in the constructed linear target image set, the linear target detection model is constructed based on the anchor points determined by the angle clustering center and the length clustering center, and the anchor points are used as the starting points of model training, so that the model training is more stable, and the problem of poor robustness is reduced. Therefore, when the target detection is carried out on the image to be detected, the linear target can be effectively detected through the linear target detection model, the angle, the center, the length and the type of the image to be detected are accurately output, the linear target can be determined according to the angle, the center, the length and the type subsequently, and the detection precision is improved.
The above describes the linear target detection method based on the neural network, and the following describes the linear feature extraction module provided by the present application:
referring to fig. 3, fig. 3 is a block diagram of an embodiment of a linear feature extraction module provided in the present application, the module including:
an input 301, a 1 × 1 convolution module 302, a 1 × 3 convolution module 303, a 3 × 1 convolution module 304, a stack operation module 305, a 3 × 3 convolution module 306, and an output 307; the input terminal 301 is connected to the 1 × 1 convolution module 302; the 1 × 1 convolution module 302 is respectively connected to the 1 × 3 convolution module 303 and the 3 × 1 convolution module 304, the 1 × 3 convolution module 303 being configured to extract linear gradient information biased toward the vertical direction, and the 3 × 1 convolution module 304 being configured to extract linear gradient information biased toward the horizontal direction; the stacking operation module 305 is connected to the 1 × 3 convolution module 303 and the 3 × 1 convolution module 304, respectively, and is configured to obtain the linear features corresponding to the linear gradient information output by the 1 × 3 convolution module 303 and the 3 × 1 convolution module 304 and to perform a stacking operation on those linear features; the 3 × 3 convolution module 306 is connected to the stack operation module 305 and the output terminal 307, and is configured to obtain the linear features output by the stacking operation module 305, perform a 3 × 3 convolution, and output the convolved linear features to the output terminal 307.
In this embodiment, the linear feature extraction module may be integrated into various linear feature extraction backbone networks to improve the effect of linear feature extraction. Specifically, as shown in fig. 3, the 1 × 1 boxes represent the 1 × 1 convolution module 302 for performing the 1 × 1 convolution operation, the 1 × 3 boxes represent the 1 × 3 convolution module 303 for performing the 1 × 3 convolution operation, the 3 × 1 boxes represent the 3 × 1 convolution module 304 for performing the 3 × 1 convolution operation, and the 3 × 3 boxes represent the 3 × 3 convolution module 306 for performing the 3 × 3 convolution operation.
In linear feature extraction, features carrying gradient information of the linear target, which includes pixel changes, image contours, and the like, are first received at the input terminal 301. A 1 × 1 convolution is applied, followed in parallel by a 1 × 3 convolution and a 3 × 1 convolution, and the stacking operation then outputs the combined features. The 1 × 3 convolution extracts gradient information biased toward the vertical direction, and the 3 × 1 convolution extracts gradient information biased toward the horizontal direction. Stacking the features output by the 1 × 3 and 3 × 1 convolutions allows the linear features in the image to be expressed better, improving the effect of linear feature extraction.
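A minimal PyTorch sketch of this module follows. The channel widths and the interpretation of the stacking operation as channel-wise concatenation are assumptions made here for illustration; the patent does not fix them.

    import torch
    import torch.nn as nn

    class LinearFeatureExtraction(nn.Module):
        """Sketch of the module of fig. 3: a 1x1 convolution feeding parallel
        1x3 and 3x1 convolutions, whose outputs are stacked and passed
        through a 3x3 convolution."""
        def __init__(self, in_ch: int, mid_ch: int = 64, out_ch: int = 128):
            super().__init__()
            self.conv1x1 = nn.Conv2d(in_ch, mid_ch, kernel_size=1)
            # per the description, the 1x3 branch extracts gradient information
            # biased toward the vertical direction
            self.conv1x3 = nn.Conv2d(mid_ch, mid_ch, kernel_size=(1, 3), padding=(0, 1))
            # and the 3x1 branch extracts gradient information biased toward
            # the horizontal direction
            self.conv3x1 = nn.Conv2d(mid_ch, mid_ch, kernel_size=(3, 1), padding=(1, 0))
            self.conv3x3 = nn.Conv2d(2 * mid_ch, out_ch, kernel_size=3, padding=1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            x = self.conv1x1(x)
            v = self.conv1x3(x)                 # vertical-biased features
            h = self.conv3x1(x)                 # horizontal-biased features
            stacked = torch.cat([v, h], dim=1)  # stacking operation (channel concat)
            return self.conv3x3(stacked)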
The following describes a linear target detection system based on a neural network provided by the present application:
referring to fig. 4, fig. 4 is a block diagram illustrating an embodiment of a neural network-based linear target detection system provided in the present application, the linear target detection system including:
an acquiring unit 401, configured to acquire an image to be detected;
an input unit 402, configured to input the image to be detected into a linear target detection model, where the linear target detection model is obtained by extracting a linear feature of a linear target image sample, determining an anchor point according to an angle and a length included in the linear feature, and training based on the angle, the length, a center, and a type included in the anchor point and the linear feature, where the linear feature is a feature of a linear target included in the linear target image sample, the angle is an included angle between the linear target and a horizontal direction, the length is a length of the linear target, and the center is a midpoint of the linear target;
a first determining unit 403, configured to determine a linear feature prediction value of the image to be detected through the linear target detection model;
a second determining unit 404, configured to determine a linear target detection result of the image to be detected according to the linear feature prediction value.
In the system of this embodiment, the functions executed by each unit correspond to the steps in the method embodiment shown in fig. 1, and detailed description thereof is omitted here.
In this embodiment, the obtaining unit 401 obtains an image to be detected; the input unit 402 inputs the image to be detected into a linear target detection model, which is obtained by extracting the linear features of a linear target image sample, determining anchor points according to the angles and lengths contained in the linear features, and training based on the anchor points and the angle, length, center, and type contained in the linear features. The first determining unit 403 then determines a linear feature prediction value of the image to be detected through the linear target detection model, and the second determining unit 404 determines a linear target detection result of the image to be detected according to the linear feature prediction value.
Therefore, through the above units, the type of the linear target can be detected by the linear target detection model. Because the model is trained with the angles and lengths of the linear targets as anchor points, and those anchor points serve as the starting points of model optimization, the network model is more stable, the possibility of poor robustness is reduced, and the detection precision of linear target detection is improved.
Referring to fig. 5, fig. 5 is a schematic diagram illustrating another embodiment of a neural network-based linear object detection system provided in the present application, where the system includes:
an acquiring unit 505, configured to acquire an image to be detected;
an input unit 506, configured to input the image to be detected into a linear target detection model, where the linear target detection model is obtained by extracting a linear feature of a linear target image sample, determining an anchor point according to an angle and a length included in the linear feature, and training based on the angle, the length, a center, and a type included in the anchor point and the linear feature, where the linear feature is a feature of a linear target included in the linear target image sample, the angle is an included angle between the linear target and a horizontal direction, the length is a length of the linear target, and the center is a midpoint of the linear target;
a first determining unit 507, configured to determine a linear feature prediction value of the image to be detected through the linear target detection model;
a second determining unit 508, configured to determine a linear target detection result of the image to be detected according to the linear feature prediction value.
Optionally, the linear target detection system further comprises:
a first constructing unit 501, configured to construct a linear target image set, where the linear target image set includes a plurality of linear target image samples, and each linear target image sample is labeled with the center, angle, length, and type corresponding to the linear target it contains;
a clustering unit 502, configured to cluster the lengths and the angles respectively and determine m length cluster centers and n angle cluster centers (see the clustering sketch after this list);
a second constructing unit 503, configured to construct a linear target detection model based on a neural network with the m length cluster centers and the n angle cluster centers as anchor points;
a training unit 504, configured to iteratively input a linear target image sample of the linear target image set to the linear target detection model in a loop manner to perform linear feature extraction, train according to the linear feature, calculate a loss function, and perform optimization update on the linear target detection model according to the loss function and the linear target image set by using a back propagation algorithm to obtain a trained linear target detection model, where the loss function is constructed by including a center prediction loss, a length prediction loss, an angle prediction loss, a target loss, and a category prediction loss.
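The clustering sketch referenced in the list above: a minimal version of the step performed by the clustering unit 502, assuming standard k-means (the patent does not name the clustering algorithm) and illustrative values of m and n:

    import numpy as np
    from sklearn.cluster import KMeans

    def cluster_anchors(lengths, angles, m=3, n=6):
        """Cluster labeled lengths into m centers and labeled angles into n
        centers; the centers become the anchor values (T_i, L_j)."""
        lengths = np.asarray(lengths, dtype=float).reshape(-1, 1)
        angles = np.asarray(angles, dtype=float).reshape(-1, 1)
        length_centers = KMeans(n_clusters=m, n_init=10).fit(lengths).cluster_centers_.ravel()
        angle_centers = KMeans(n_clusters=n, n_init=10).fit(angles).cluster_centers_.ravel()
        return sorted(length_centers.tolist()), sorted(angle_centers.tolist())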
Optionally, the first constructing unit 501 is specifically configured to obtain a plurality of linear target image samples including a linear target;
marking the end point coordinates and the types of the linear targets on the linear target image samples;
determining the center, the angle and the length of the linear target according to the endpoint coordinate and a target conversion formula;
constructing a linear target image set according to the marked linear target image samples;
the target conversion formula is:
L = sqrt((x2 - x1)^2 + (y2 - y1)^2), Cx = (x1 + x2)/2, Cy = (y1 + y2)/2, T = arctan((y2 - y1)/(x2 - x1))
wherein L is the length, (Cx, Cy) is the center, T is the angle, and (x1, y1) and (x2, y2) are the coordinates of the two end points of the linear target.
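The conversion transcribed as a Python helper; using atan2 rather than a plain arctangent is an implementation choice made here so the angle stays defined for vertical segments:

    import math

    def endpoints_to_line(x1, y1, x2, y2):
        """Convert two labeled endpoints into (center, length, angle)."""
        length = math.hypot(x2 - x1, y2 - y1)      # L
        cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0  # (Cx, Cy), the midpoint
        theta = math.atan2(y2 - y1, x2 - x1)       # T, angle to the horizontal
        return cx, cy, length, theta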
Optionally, the training unit 504 is specifically configured to input the linear target image sample into a linear feature extraction backbone network to perform linear feature extraction and convert it into a feature map, where the width and the height of the feature map are both S, and the linear target detection model includes the linear feature extraction backbone network and a linear target detection network;
determining, according to the linear features, the labeled target of the linear target contained in the feature map, where each labeled target is (g_cx, g_cy, g_length, g_theta, g_c), in which (g_cx, g_cy) are the coordinates of the target center point, g_length is the target length, g_theta is the target angle, and g_c is the target category information;
determining, according to the target center point coordinates (g_cx, g_cy), the integer coordinates (g_cxi, g_cyj) of its upper-left corner;
matching, based on the integer coordinates, the target angle and the target length of the labeled target respectively with the anchor points in the linear target detection network, and determining a target anchor point (T_i, L_j);
training the labeled target by using the linear target detection network based on the target anchor point (T_i, L_j);
according to the training result, determining, on the target feature map of size (m × n) × S × S × C, from the C-dimensional vector at position (i × m + j, g_cyj, g_cxi), the target prediction value (p_x, p_y, p_length, p_theta, p_obj, p_c1, …, p_ck) of the labeled target, where C is the number of channels, C = 5 + K, and K represents the number of categories of the linear target;
calculating a loss function according to the labeling coordinate, the integer coordinate of the upper left corner and the target predicted value;
and optimizing and updating the linear target detection network by using a back propagation algorithm according to the loss function, and circularly and iteratively inputting linear target image samples in the linear target image set to train until the value of the loss function is minimized to obtain a trained linear target detection model.
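Under the tensor layout just described, reading the C-dimensional prediction vector for one matched target might look like the sketch below; the anchor counts, grid size, and coordinates are illustrative assumptions:

    import torch

    m, n, S, K = 3, 6, 52, 4  # hypothetical anchor counts, grid size, classes
    C = 5 + K                 # (p_x, p_y, p_length, p_theta, p_obj) + K class scores

    # hypothetical network output: one C-vector per anchor pair per grid cell
    pred = torch.randn(m * n, S, S, C)

    i, j = 2, 1               # matched angle anchor T_i and length anchor L_j
    gcxi, gcyj = 17, 30       # integer upper-left cell coordinates of the center

    vec = pred[i * m + j, gcyj, gcxi]        # the C-dimensional prediction vector
    px, py, plength, ptheta, pobj = vec[:5]  # geometry and objectness terms
    class_scores = vec[5:]                   # (p_c1, ..., p_ck)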
Optionally, the loss function is formed by including a center prediction loss, a length prediction loss, an angle prediction loss, a target loss, and a category prediction loss;
the target loss includes a loss for predicting whether a target exists and a loss for determining that no target exists;
the loss function is formulated as:
L_s = γ_1·L_xy + γ_2·L_length + γ_3·L_theta + γ_4·L_obj + γ_5·L_cls + γ_6·L_nobj
wherein γ_i represents a weighting value;
the center prediction loss is L_xy = SMOOTHL1((p_x, p_y), (g_cx - g_cxi, g_cy - g_cyj));
the length prediction loss is L_length = SMOOTHL1(p_length, g_length / L_j);
the angle prediction loss is L_theta = SMOOTHL1(p_theta, g_theta / T_i);
the loss for predicting whether a target exists is L_obj = BCE(p_obj, 1);
the loss for determining that no target exists is L_nobj = BCE(p_obj, 0);
the class prediction loss is L_cls = BCE((p_c1, …, p_ck), g_c).
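A sketch of this composite loss using PyTorch's smooth-L1 and binary cross-entropy primitives; the dictionary keys, the assumption that sigmoids have already been applied where BCE is used, and the separate background term are illustrative choices, not taken from the patent:

    import torch
    import torch.nn.functional as F

    def detection_loss(p, t, T_i, L_j, gammas):
        """Composite loss for one matched cell plus background cells.
        p and t are dicts of tensors; gammas = (gamma_1, ..., gamma_6)."""
        l_xy = F.smooth_l1_loss(p["xy"], t["xy_offset"])          # center loss
        l_len = F.smooth_l1_loss(p["length"], t["length"] / L_j)  # length loss
        l_theta = F.smooth_l1_loss(p["theta"], t["theta"] / T_i)  # angle loss
        l_obj = F.binary_cross_entropy(p["obj"], torch.ones_like(p["obj"]))
        l_cls = F.binary_cross_entropy(p["cls"], t["cls_onehot"])
        # cells holding no target contribute BCE(p_obj, 0) instead
        l_nobj = F.binary_cross_entropy(p["obj_background"],
                                        torch.zeros_like(p["obj_background"]))
        g = gammas
        return (g[0] * l_xy + g[1] * l_len + g[2] * l_theta
                + g[3] * l_obj + g[4] * l_cls + g[5] * l_nobj)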
Optionally, the training unit 504 is specifically configured to match the target angle of the labeled target against the n anchor angles (T_1, T_2, …, T_n) and determine the anchor angle T_i closest to the target angle;
match the target length of the labeled target against the m anchor lengths (L_1, L_2, …, L_m) and determine the anchor length L_j closest to the target length;
and determine the target anchor point (T_i, L_j) according to the anchor angle T_i and the anchor length L_j.
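Nearest-neighbour matching against the two anchor sets, as a minimal sketch (the function and argument names are assumptions):

    import numpy as np

    def match_anchor(g_theta, g_length, angle_anchors, length_anchors):
        """Return the indices and values of the (T_i, L_j) pair closest to
        the labeled target angle and length."""
        i = int(np.argmin(np.abs(np.asarray(angle_anchors) - g_theta)))
        j = int(np.argmin(np.abs(np.asarray(length_anchors) - g_length)))
        return i, j, angle_anchors[i], length_anchors[j]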
Optionally, the first determining unit 507 is specifically configured to determine, through the linear target detection model, a target feature to be detected of the image to be detected, and obtain a target feature map;
traversing and matching all anchor points contained in the target feature map according to the target features to be detected;
and determining a matching anchor point according to the matching result, and determining a linear characteristic predicted value according to the matching anchor point.
Optionally, the second determining unit 508 is specifically configured to determine whether a linear target exists according to the linear feature prediction value;
and if so, outputting a linear target detection result of the image to be detected according to the linear characteristic prediction value and a linear target reduction formula, wherein the linear target detection result comprises an angle, a length, a center and a type.
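The linear target reduction formula itself is not written out in this passage; a plausible sketch, assuming it simply inverts the training-time encoding (cell offsets added back to the grid corner, length and angle rescaled by the matched anchors, and endpoints recovered from center, length, and angle), is:

    import math

    def decode_line(px, py, plength, ptheta, gcxi, gcyj, T_i, L_j, stride=1.0):
        """Assumed inverse of the encoding: recover the center, length, and
        angle, then the two endpoints; stride maps grid cells to pixels."""
        cx = (gcxi + px) * stride
        cy = (gcyj + py) * stride
        length = plength * L_j
        theta = ptheta * T_i
        dx = 0.5 * length * math.cos(theta)
        dy = 0.5 * length * math.sin(theta)
        return (cx - dx, cy - dy), (cx + dx, cy + dy)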
In this embodiment, the clustering unit 502 determines the angle cluster centers and length cluster centers from the linear target image samples in the linear target image set constructed by the first construction unit 501, and the second construction unit 503 constructs a linear target detection model with anchor points determined by those cluster centers; the anchor points serve as the starting points of model training, which makes training more stable and reduces the problem of poor robustness. Therefore, when an image to be detected is supplied through the input unit 506 for target detection, the linear target can be detected effectively by the first determining unit 507 and the second determining unit 508, and its angle, center, length, and type are output accurately, so that the linear target can subsequently be determined from them, improving detection precision.
Referring to fig. 6, fig. 6 is a schematic diagram of an embodiment of a linear target detection apparatus based on a neural network, where the apparatus includes:
a processor 601, a memory 602, an input-output unit 603, a bus 604;
the processor 601 is connected with the memory 602, the input/output unit 603 and the bus 604;
the memory 602 holds a program that the processor 601 calls to perform any of the neural-network-based linear target detection methods described above.
The present application also relates to a computer-readable storage medium having a program stored thereon which, when run on a computer, causes the computer to perform any of the neural-network-based linear target detection methods described above.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a unit is merely a logical division, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing over the prior art, or in whole or in part, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Claims (10)

1. A linear target detection method based on a neural network is characterized by comprising the following steps:
acquiring an image to be detected;
inputting the image to be detected into a linear target detection model, wherein the linear target detection model is obtained by extracting linear features of a linear target image sample, determining an anchor point according to the angle and the length contained in the linear features, and training based on the angle, the length, the center and the type contained in the anchor point and the linear features, the linear features are the features of a linear target contained in the linear target image sample, the angle is the included angle between the linear target and the horizontal direction, the length is the length of the linear target, and the center is the midpoint of the linear target;
determining a linear characteristic predicted value of the image to be detected through the linear target detection model;
and determining a linear target detection result of the image to be detected according to the linear characteristic predicted value.
2. The linear target detection method according to claim 1, wherein before the acquiring an image to be detected, the linear target detection method further comprises:
constructing a linear target image set, wherein the linear target image set comprises a plurality of linear target image samples, and each linear target image sample is marked with a center, an angle, a length and a type corresponding to a linear target of the linear target image sample;
clustering the lengths and the angles respectively, and determining m length clustering centers and n angle clustering centers;
constructing a linear target detection model based on a neural network by taking the m length clustering centers and the n angle clustering centers as anchor points;
and circularly and iteratively inputting a linear target image sample of the linear target image set into the linear target detection model for linear feature extraction, training according to the linear feature, calculating a loss function, and performing optimization updating on the linear target detection model by using a back propagation algorithm according to the loss function and the linear target image set to obtain a trained linear target detection model, wherein the loss function is constructed by including central prediction loss, length prediction loss, angle prediction loss, target loss and category prediction loss.
3. The linear target detection method of claim 2, wherein the constructing a linear target image set comprises:
acquiring a plurality of linear target image samples containing linear targets;
marking the end point coordinates and the types of the linear targets on the linear target image samples;
determining the center, the angle and the length of the linear target according to the endpoint coordinate and a target conversion formula;
constructing a linear target image set according to the marked linear target image samples;
the target conversion formula is:
L = sqrt((x2 - x1)^2 + (y2 - y1)^2), Cx = (x1 + x2)/2, Cy = (y1 + y2)/2, T = arctan((y2 - y1)/(x2 - x1))
wherein L is the length, (Cx, Cy) is the center, T is the angle, and (x1, y1) and (x2, y2) are the coordinates of the two end points of the linear target.
4. The linear target detection method according to claim 2, wherein the iteratively inputting the linear target image samples of the linear target image set into the linear target detection model in a loop to perform linear feature extraction, training the linear target detection model according to the linear features, calculating a loss function, and performing optimization updating on the linear target detection model according to the loss function and the linear target image set by using a back propagation algorithm to obtain the trained linear target detection model comprises:
inputting the linear target image sample into a linear feature extraction backbone network to perform linear feature extraction and converting the linear feature extraction backbone network into a feature map, wherein the width and the height of the feature map are S, and the linear target detection model comprises the linear feature extraction backbone network and a linear target detection network;
determining, according to the linear features, the labeled target of the linear target contained in the feature map, wherein each labeled target is (g_cx, g_cy, g_length, g_theta, g_c), in which (g_cx, g_cy) are the coordinates of the target center point, g_length is the target length, g_theta is the target angle, and g_c is the target category information;
determining, according to the target center point coordinates (g_cx, g_cy), the integer coordinates (g_cxi, g_cyj) of its upper-left corner;
matching, based on the integer coordinates, the target angle and the target length of the labeled target respectively with the anchor points in the linear target detection network to determine a target anchor point (T_i, L_j);
training the labeled target by using the linear target detection network based on the target anchor point (T_i, L_j);
according to the training result, determining, on the target feature map of size (m × n) × S × S × C, from the C-dimensional vector at position (i × m + j, g_cyj, g_cxi), the target prediction value (p_x, p_y, p_length, p_theta, p_obj, p_c1, …, p_ck) of the labeled target, wherein C is the number of channels, C = 5 + K, and K represents the number of categories of the linear target;
calculating a loss function according to the labeling coordinate, the integer coordinate of the upper left corner and the target predicted value;
and optimizing and updating the linear target detection network by using a back propagation algorithm according to the loss function, and circularly and iteratively inputting linear target image samples in the linear target image set to train until the value of the loss function is minimized to obtain a trained linear target detection model.
5. The linear target detection method of claim 4, wherein the loss function is constructed by including a center prediction loss, a length prediction loss, an angle prediction loss, a target loss, and a category prediction loss;
the target loss includes a loss for predicting whether a target exists and a loss for determining that no target exists;
the loss function formula is:
L_s = γ_1·L_xy + γ_2·L_length + γ_3·L_theta + γ_4·L_obj + γ_5·L_cls + γ_6·L_nobj
wherein γ_i represents a weighting value;
the center prediction loss is L_xy = SMOOTHL1((p_x, p_y), (g_cx - g_cxi, g_cy - g_cyj));
the length prediction loss is L_length = SMOOTHL1(p_length, g_length / L_j);
the angle prediction loss is L_theta = SMOOTHL1(p_theta, g_theta / T_i);
the loss for predicting whether a target exists is L_obj = BCE(p_obj, 1);
the loss for determining that no target exists is L_nobj = BCE(p_obj, 0);
the class prediction loss is L_cls = BCE((p_c1, …, p_ck), g_c).
6. The linear target detection method according to claim 4, wherein the matching the target angle and the target length of the labeled target respectively with the anchor points in the linear target detection network to determine a target anchor point (T_i, L_j) comprises:
matching the target angle of the labeled target against the n anchor angles (T_1, T_2, …, T_n), and determining the anchor angle T_i closest to the target angle;
matching the target length of the labeled target against the m anchor lengths (L_1, L_2, …, L_m), and determining the anchor length L_j closest to the target length;
and determining the target anchor point (T_i, L_j) according to the anchor angle T_i and the anchor length L_j.
7. The linear target detection method according to any one of claims 1 to 6, wherein the determining the linear feature prediction value of the image to be detected by the linear target detection model comprises:
determining the target features to be detected of the image to be detected through the linear target detection model, and acquiring a target feature map;
traversing and matching all anchor points contained in the target feature map according to the target feature to be detected;
and determining a matching anchor point according to the matching result, and determining a linear characteristic predicted value according to the matching anchor point.
8. The linear target detection method according to any one of claims 1 to 6, wherein the determining the linear target detection result of the image to be detected according to the linear feature prediction value comprises:
judging whether a linear target exists according to the linear characteristic predicted value;
and if so, outputting a linear target detection result of the image to be detected according to the linear characteristic prediction value and a linear target reduction formula, wherein the linear target detection result comprises an angle, a length, a center and a type.
9. A linear feature extraction module for use in linear feature extraction in a linear object detection method as claimed in any one of claims 1 to 8, the module comprising:
the device comprises an input end, a 1 × 1 convolution module, a 1 × 3 convolution module, a 3 × 1 convolution module, a stacking operation module, a 3 × 3 convolution module and an output end;
the input end is connected with the 1 x 1 convolution module;
the 1 × 1 convolution module is respectively connected with the 1 × 3 convolution module and the 3 × 1 convolution module, the 1 × 3 convolution module is used for extracting linear gradient information biased to the vertical direction, and the 3 × 1 convolution module is used for extracting linear gradient information biased to the horizontal direction;
the stacking operation module is respectively connected with the 1 × 3 convolution module and the 3 × 1 convolution module, and is used for respectively acquiring linear features corresponding to the linear gradient information output by the 1 × 3 convolution module and the 3 × 1 convolution module, and performing stacking operation according to the linear features;
and the 3 × 3 convolution module is connected with the stacking operation module and the output end, and is used for acquiring the linear feature output by the stacking operation module, performing 3 × 3 convolution on the linear feature, and outputting the linear feature after the 3 × 3 convolution to the output end.
10. A linear target detection system based on a neural network, the linear target detection system comprising:
the acquisition unit is used for acquiring an image to be detected;
the image detection device comprises an input unit, a detection unit and a processing unit, wherein the input unit is used for inputting the image to be detected into a linear target detection model, the linear target detection model is obtained by extracting linear characteristics of a linear target image sample, determining an anchor point according to an angle and a length contained in the linear characteristics and then training based on the angle, the length, the center and the type contained in the anchor point and the linear characteristics, the linear characteristics are characteristics of a linear target contained in the linear target image sample, the angle is an included angle between the linear target and the horizontal direction, the length is the length of the linear target, and the center is a midpoint of the linear target;
the first determining unit is used for determining a linear characteristic predicted value of the image to be detected through the linear target detection model;
and the second determining unit is used for determining a linear target detection result of the image to be detected according to the linear characteristic predicted value.
CN202210595785.6A 2022-05-30 2022-05-30 Linear target detection method, module and system based on neural network Active CN114677568B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210595785.6A CN114677568B (en) 2022-05-30 2022-05-30 Linear target detection method, module and system based on neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210595785.6A CN114677568B (en) 2022-05-30 2022-05-30 Linear target detection method, module and system based on neural network

Publications (2)

Publication Number Publication Date
CN114677568A (en) 2022-06-28
CN114677568B CN114677568B (en) 2022-08-23

Family

ID=82079933

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210595785.6A Active CN114677568B (en) 2022-05-30 2022-05-30 Linear target detection method, module and system based on neural network

Country Status (1)

Country Link
CN (1) CN114677568B (en)

Patent Citations (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050182764A1 (en) * 2004-02-13 2005-08-18 Evans Lynne M. System and method for arranging concept clusters in thematic neighborhood relationships in a two-dimensional visual display space
CN109858618A (en) * 2019-03-07 2019-06-07 电子科技大学 The neural network and image classification method of a kind of convolutional Neural cell block, composition
CN110852320A (en) * 2019-11-08 2020-02-28 积成电子股份有限公司 Transmission channel foreign matter intrusion detection method based on deep learning
CN111145174A (en) * 2020-01-02 2020-05-12 南京邮电大学 3D target detection method for point cloud screening based on image semantic features
CN111488804A (en) * 2020-03-19 2020-08-04 山西大学 Labor insurance product wearing condition detection and identity identification method based on deep learning
CN111753666A (en) * 2020-05-21 2020-10-09 西安科技大学 Method and system for detecting faults of small targets in power transmission line and storage medium
CN111797940A (en) * 2020-07-20 2020-10-20 中国科学院长春光学精密机械与物理研究所 Image identification method based on ocean search and rescue and related device
CN112199993A (en) * 2020-09-01 2021-01-08 广西大学 Method for identifying transformer substation insulator infrared image detection model in any direction based on artificial intelligence
CN112052817A (en) * 2020-09-15 2020-12-08 中国人民解放军海军大连舰艇学院 Improved YOLOv3 model side-scan sonar sunken ship target automatic identification method based on transfer learning
CN112257566A (en) * 2020-10-20 2021-01-22 哈尔滨工程大学 Artificial intelligence target identification ranging method based on big data
CN112668469A (en) * 2020-12-28 2021-04-16 西安电子科技大学 Multi-target detection and identification method based on deep learning
CN112861744A (en) * 2021-02-20 2021-05-28 哈尔滨工程大学 Remote sensing image target rapid detection method based on rotation anchor point clustering
CN113222949A (en) * 2021-05-19 2021-08-06 云南电网有限责任公司电力科学研究院 X-ray image automatic detection method for plugging position of power equipment conductor
CN113516053A (en) * 2021-05-28 2021-10-19 西安空间无线电技术研究所 Ship target refined detection method with rotation invariance
CN113298169A (en) * 2021-06-02 2021-08-24 浙江工业大学 Convolutional neural network-based rotating target detection method and device
CN113420648A (en) * 2021-06-22 2021-09-21 深圳市华汉伟业科技有限公司 Target detection method and system with rotation adaptability
CN113344115A (en) * 2021-06-25 2021-09-03 南京邮电大学 Target detection method based on lightweight model
CN113486961A (en) * 2021-07-12 2021-10-08 安徽耀峰雷达科技有限公司 Radar RD image target detection method and system based on deep learning under low signal-to-noise ratio and computer equipment
CN113409314A (en) * 2021-08-18 2021-09-17 南京市特种设备安全监督检验研究院 Unmanned aerial vehicle visual detection and evaluation method and system for corrosion of high-altitude steel structure
CN113903028A (en) * 2021-09-07 2022-01-07 武汉大学 Target detection method and electronic equipment
CN113962954A (en) * 2021-10-20 2022-01-21 上海师范大学 Surface defect detection method based on SE-R-YOLOV4 automobile steel part
CN114445482A (en) * 2022-01-29 2022-05-06 福州大学 Method and system for detecting target in image based on Libra-RCNN and elliptical shape characteristics

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
CHANGQING CAO et al.: "Research on Airplane and Ship Detection of Aerial Remote Sensing Images Based on Convolutional Neural Network", Sensors *
JING LI et al.: "A PCB Electronic Components Detection Network Design Based on Effective Receptive Field Size and Anchor Size Matching", Computational Intelligence and Neuroscience *
WANZENG KONG et al.: "YOLOv3-DPFIN: A Dual-Path Feature Fusion Neural Network for Robust Real-Time Sonar Target Detection", IEEE Sensors Journal *
JIANG Hongyi et al.: "A Survey of Object Detection Models and Their Optimization Methods", Acta Automatica Sinica *
HUANG Tongyu et al.: "Multi-scale Feature Fusion Object Detection Method for Driving Scenarios", Computer Engineering and Applications *

Also Published As

Publication number Publication date
CN114677568B (en) 2022-08-23

Similar Documents

Publication Publication Date Title
US11062123B2 (en) Method, terminal, and storage medium for tracking facial critical area
CN107229904B (en) Target detection and identification method based on deep learning
CN107529650B (en) Closed loop detection method and device and computer equipment
CN107229942B (en) Convolutional neural network classification method based on multiple classifiers
CN111079674B (en) Target detection method based on global and local information fusion
CN107273458B (en) Depth model training method and device, and image retrieval method and device
CN109726195B (en) Data enhancement method and device
CN112836692B (en) Method, apparatus, device and medium for processing image
CN116152254B (en) Industrial leakage target gas detection model training method, detection method and electronic equipment
CN113989604B (en) Tire DOT information identification method based on end-to-end deep learning
CN111931572B (en) Target detection method for remote sensing image
CN115937552A (en) Image matching method based on fusion of manual features and depth features
EP4343616A1 (en) Image classification method, model training method, device, storage medium, and computer program
Alsanad et al. Real-time fuel truck detection algorithm based on deep convolutional neural network
WO2024174726A1 (en) Handwritten and printed text detection method and device based on deep learning
CN107274425A (en) A kind of color image segmentation method and device based on Pulse Coupled Neural Network
CN116824609B (en) Document format detection method and device and electronic equipment
CN109583584B (en) Method and system for enabling CNN with full connection layer to accept indefinite shape input
CN114677568B (en) Linear target detection method, module and system based on neural network
CN114511862B (en) Form identification method and device and electronic equipment
CN112487927B (en) Method and system for realizing indoor scene recognition based on object associated attention
CN115761332A (en) Smoke and flame detection method, device, equipment and storage medium
CN115376195A (en) Method for training multi-scale network model and method for detecting key points of human face
CN111291820B (en) Target detection method combining positioning information and classification information
CN112668643A (en) Semi-supervised significance detection method based on lattice tower rule

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 266000 F3, Jingkong building, No. 57 Lushan Road, Huangdao District, Qingdao, Shandong

Patentee after: Shandong Jijian Technology Co.,Ltd.

Address before: 266000 F3, Jingkong building, No. 57 Lushan Road, Huangdao District, Qingdao, Shandong

Patentee before: Shandong jivisual angle Technology Co.,Ltd.