CN109977943B - Image target recognition method, system and storage medium based on YOLO - Google Patents
- Publication number: CN109977943B (application CN201910114621.5A)
- Authority
- CN
- China
- Legal status: Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
Abstract
The invention relates to artificial intelligence technology and provides a YOLO-based image target recognition method, system and storage medium, wherein the method comprises the following steps: receiving an image to be detected; adjusting the size of the image to be detected according to a preset requirement to generate a first detection image; sending the first detection image to a neural network model for matching identification to generate a detection frame, classification identification information and a classification probability value corresponding to the classification identification information; judging whether the classification probability value is larger than a preset classification probability threshold; and if so, taking the detection frame and the classification identification information as the recognition result. The technical scheme effectively improves detection precision and reduces detection time. Compared with detection methods in the prior art, the method provided by the invention improves identification accuracy and increases operation speed.
Description
Technical Field
The present invention relates to the field of computer learning and image recognition, and more particularly, to a YOLO-based image target recognition method, system and storage medium.
Background
With the rapid development of artificial intelligence technology, deep learning is increasingly applied to the field of computer vision, especially image target detection.
In recent years, target detection algorithms have made great breakthroughs. The popular algorithms can be divided into two types. One type comprises two-stage algorithms based on region proposals, such as the R-CNN family (R-CNN, Fast R-CNN, Faster R-CNN), which first generate region proposals with a heuristic method (selective search) or a CNN network (RPN) and then classify and regress on those proposals. The other type comprises one-stage algorithms such as YOLO (You Only Look Once) and SSD, which use a single CNN network to directly predict the classes and locations of different targets. The first type of method is more accurate but slower; the second type is faster but less accurate. More and more target detection methods are implemented based on YOLO, and many deep networks are improved based on YOLO. YOLO treats object detection as a regression problem, completing the path from the original image input to the output of object locations and classes with a single end-to-end network.
The key idea of YOLO is to use the whole image as the input of the network and directly regress the position of the bounding box, and the category to which it belongs, at the output layer. On the basis of YOLO's high-speed operation, how to design a method capable of improving YOLO's accuracy is a problem urgently to be solved.
Disclosure of Invention
In order to solve at least one technical problem, the invention provides an image target identification method, an image target identification system and a storage medium based on YOLO.
In order to achieve the above object, the technical scheme of the present invention provides a YOLO-based image target recognition method, which includes:
Receiving an image to be detected;
The size of the image to be detected is adjusted according to a preset requirement, and a first detection image is generated;
the first detection image is sent to a neural network model for matching identification, and a detection frame, classification identification information and a classification probability value corresponding to the classification identification information are generated;
judging whether the classification probability value is larger than a preset classification probability threshold value or not;
and if the classification probability value is larger than the preset classification probability threshold, taking the detection frame and the classification identification information as the recognition result.
In this solution, before the receiving the image to be detected, the method further includes:
Training pictures to obtain a neural network model; the neural network model is trained by the following steps:
acquiring a training image dataset;
performing image preprocessing on the training image data set to obtain a preprocessed image set;
training the preprocessed image set to obtain the neural network model with the input interface and the output interface.
In this scheme, the step of generating the detection frame specifically includes:
generating an initial detection frame according to initial preset coordinate points;
Performing prediction of a dynamic detection frame, performing iterative prediction on the generated detection frame, and generating a latest detection frame;
calculating the coincidence ratio of the latest detection frame;
if the latest detection frame overlap ratio is greater than or equal to a preset overlap ratio threshold value, reserving the latest detection frame; if the latest detection frame overlap ratio is smaller than a preset overlap ratio threshold value, continuing to predict the dynamic detection frame;
and finally generating N candidate detection frames of the same class.
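The iterative loop described in the steps above can be sketched as follows. This is an illustration only, not the claimed implementation: `predict_frame` and `overlap_ratio` are hypothetical callables standing in for the network's box regression and the coincidence-ratio calculation, which the scheme does not name concretely.

```python
def refine_detection_frame(initial_frame, predict_frame, overlap_ratio,
                           overlap_threshold=0.5, max_iters=100):
    """Sketch of dynamic detection-frame prediction: iterate the
    prediction until the coincidence ratio reaches the preset
    threshold, then keep the latest frame."""
    frame = initial_frame
    for _ in range(max_iters):
        frame = predict_frame(frame)               # iterative prediction
        if overlap_ratio(frame) >= overlap_threshold:
            return frame                           # reserve the latest frame
    return frame                                   # fallback after max_iters
```

In the scheme, frames below the threshold simply trigger another prediction round; `max_iters` is an added safeguard so the sketch always terminates.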
In this scheme, the performing of dynamic detection frame prediction, i.e. iteratively predicting on the generated detection frame and generating the latest detection frame, specifically includes:
predicting 4 coordinates (t_x, t_y, t_w, t_h) for each detection frame; if the cell is offset from the upper left corner of the image by (c_x, c_y) and the prior detection frame has width and height p_w, p_h, the coordinates of the latest detection frame are:
b_x = σ(t_x) + c_x
b_y = σ(t_y) + c_y
b_w = p_w · e^(t_w)
b_h = p_h · e^(t_h)
wherein b_x, b_y, b_w, b_h are respectively the four position values of the latest detection frame. The detection frame is a quadrilateral whose position can be determined through these 4 values.
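As an illustration only (not part of the claimed scheme), the decoding formulas above can be written as a short Python function; the sigmoid σ squashes the predicted centre offsets into the cell, and the prior width/height scale the exponentiated size predictions:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Decode raw predictions (tx, ty, tw, th) into the latest
    detection frame (bx, by, bw, bh), given the cell offset
    (cx, cy) and the prior frame's width/height (pw, ph)."""
    bx = sigmoid(tx) + cx        # b_x = σ(t_x) + c_x
    by = sigmoid(ty) + cy        # b_y = σ(t_y) + c_y
    bw = pw * math.exp(tw)       # b_w = p_w · e^(t_w)
    bh = ph * math.exp(th)       # b_h = p_h · e^(t_h)
    return bx, by, bw, bh
```

With zero raw predictions the decoded frame sits at the cell centre offset by 0.5 and takes exactly the prior's size.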
Preferably, 53 layers of convolution operations are adopted in the neural network model, alternating between 3×3 and 1×1 convolution layers.
In this embodiment, the adjusted image size is the size specified by the neural network model.
The technical scheme of the invention also provides a YOLO-based image target recognition system, which comprises: the image target recognition system comprises a memory, a processor and an image pickup device, wherein the memory comprises a YOLO-based image target recognition method program, and the image target recognition method program based on the YOLO realizes the following steps when being executed by the processor:
Receiving an image to be detected;
The size of the image to be detected is adjusted according to a preset requirement, and a first detection image is generated;
the first detection image is sent to a neural network model for matching identification, and a detection frame, classification identification information and a classification probability value corresponding to the classification identification information are generated;
judging whether the classification probability value is larger than a preset classification probability threshold value or not;
and if the classification probability value is larger than the preset classification probability threshold, taking the detection frame and the classification identification information as the recognition result.
In this scheme, the step of generating the detection frame specifically includes:
generating an initial detection frame according to initial preset coordinate points;
Performing prediction of a dynamic detection frame, performing iterative prediction on the generated detection frame, and generating a latest detection frame;
calculating the coincidence ratio of the latest detection frame;
if the latest detection frame overlap ratio is greater than or equal to a preset overlap ratio threshold value, reserving the latest detection frame; if the latest detection frame overlap ratio is smaller than a preset overlap ratio threshold value, continuing to predict the dynamic detection frame;
And finally generating N similar detection frames of the same class.
In this scheme, the performing of dynamic detection frame prediction, i.e. iteratively predicting on the generated detection frame and generating the latest detection frame, specifically includes:
predicting 4 coordinates (t_x, t_y, t_w, t_h) for each detection frame; if the cell is offset from the upper left corner of the image by (c_x, c_y) and the prior detection frame has width and height p_w, p_h, the coordinates of the latest detection frame are:
b_x = σ(t_x) + c_x
b_y = σ(t_y) + c_y
b_w = p_w · e^(t_w)
b_h = p_h · e^(t_h)
wherein b_x, b_y, b_w, b_h are respectively the four position values of the latest detection frame. The detection frame is a quadrilateral whose position can be determined through these 4 values.
In this solution, before the receiving the image to be detected, the method further includes:
training pictures to obtain a neural network model; the neural network model is a model with an input interface and an output interface, which is obtained by performing image training on pictures of different categories.
Preferably, 53 layers of convolution operations are adopted in the neural network model, alternating between 3×3 and 1×1 convolution layers.
In this embodiment, the adjusted image size is the size specified by the neural network model.
The third aspect of the present invention also provides a computer-readable storage medium having embodied therein a YOLO-based image target recognition method program which, when executed by a processor, implements the steps of a YOLO-based image target recognition method as described above.
The invention provides a YOLO-based image target recognition method, system and storage medium. The method judges the classification recognition probability and takes the recognition information as the recognition result only when the preset classification probability threshold is reached, thereby improving the accuracy of image recognition. The invention can also adjust the position of the detection frame in real time, effectively improving detection efficiency and precision, and reduces detection time by optimizing the detection calculation. Experiments and verification show that the method of the invention is superior to detection methods in the prior art: identification accuracy is improved and operation speed is increased.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
FIG. 1 is a flowchart of a method for recognizing an image object based on YOLO according to the present invention;
FIG. 2 shows a schematic diagram of convolution operation in the classification process of the present invention;
FIG. 3 shows a block diagram of a YOLO-based image target recognition system of the present invention;
fig. 4 shows a schematic diagram of an embodiment of the invention.
Detailed Description
In order that the above-recited objects, features and advantages of the present application will be more clearly understood, a more particular description of the application will be rendered by reference to the appended drawings and appended detailed description. It should be noted that, without conflict, the embodiments of the present application and features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways than those described herein, and therefore the scope of the present invention is not limited to the specific embodiments disclosed below.
FIG. 1 is a flowchart of a method for recognizing an image object based on YOLO according to the present invention.
As shown in fig. 1, the technical scheme of the present invention provides a YOLO-based image target recognition method, which includes:
s102, receiving an image to be detected;
s104, adjusting the size of the image to be detected according to a preset requirement to generate a first detection image;
S106, the first detection image is sent to a neural network model for matching identification, and a detection frame, classification identification information and a classification probability value corresponding to the classification identification information are generated;
s108, judging whether the classification probability value is larger than a preset classification probability threshold value;
and S110, if the classification probability value is larger than the preset classification probability threshold, taking the detection frame and the classification identification information as the recognition result.
The adjusted size is the size specified by the neural network model. This size is generally selected to be smaller than the image to be detected, which ensures the speed of the operation processing so that class identification can be carried out rapidly. It will be appreciated by those skilled in the art that the size may be set according to actual needs; it is not limited to the above and is not intended to limit the scope of the present invention.
The first detection image is sent to the neural network model, generating a detection frame, classification identification information and a classification probability value corresponding to the classification identification information. A person skilled in the art can set the classification probability threshold according to actual needs. For example, with the threshold set to 90%, when detecting a picture containing a kitten, if the probability of identifying a kitten in the detection frame exceeds 90%, the kitten is circled by the detection frame and has been identified. If the classification probability value is smaller than the preset classification probability threshold, the method returns to step S106 for re-identification until the classification probability value is larger than the threshold. The neural network model performs multi-layer convolution operations on the image; the YOLO convolution operation is conventional in the field and belongs to the prior art, so it is not described in detail.
In this solution, before receiving the image to be detected in the step S102, the method further includes:
Training pictures to obtain a neural network model; the neural network model is trained by the following steps:
acquiring a training image dataset;
performing image preprocessing on the training image data set to obtain a preprocessed image set;
training the preprocessed image set to obtain the neural network model with the input interface and the output interface.
It should be noted that the training image data set has 1000 object categories and 1.2 million training images. The data set is preprocessed before training; the preprocessing comprises one or more of rotation, contrast enhancement, tilting and scaling. After preprocessing, the images have a certain distortion, and training on the distorted images can increase the accuracy of the final image recognition.
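As an illustrative sketch (the scheme does not prescribe a concrete implementation, and real pipelines would use an image library), two of the named preprocessing distortions, contrast enhancement and rotation, can be expressed on a greyscale pixel grid as:

```python
def enhance_contrast(img, factor=1.5, mid=128):
    """Scale each pixel's deviation from mid-grey by `factor`,
    clamping the result to the valid [0, 255] range."""
    return [[max(0, min(255, int(mid + (p - mid) * factor))) for p in row]
            for row in img]

def rotate_90(img):
    """Rotate a 2-D pixel grid 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def augment(img):
    """Yield distorted variants of one training image (a subset of
    the rotation / contrast / tilt / scale operations named above)."""
    yield img                     # original
    yield enhance_contrast(img)   # contrast-enhanced variant
    yield rotate_90(img)          # rotated variant
```

The `factor` and `mid` values are illustrative defaults, not parameters taken from the patent.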
In this scheme, the step of generating the detection frame specifically includes:
generating an initial detection frame according to initial preset coordinate points;
Performing prediction of a dynamic detection frame, performing iterative prediction on the generated detection frame, and generating a latest detection frame;
calculating the coincidence ratio of the latest detection frame;
if the latest detection frame overlap ratio is greater than or equal to a preset overlap ratio threshold value, reserving the latest detection frame; if the latest detection frame overlap ratio is smaller than a preset overlap ratio threshold value, continuing to predict the dynamic detection frame;
and finally generating N candidate detection frames of the same class.
It should be noted that, the initial preset coordinate point may be a coordinate point of a preset detection frame, which may be automatically generated during training and recognition detection, or may be generated by a person skilled in the art according to actual needs.
In this scheme, the performing of dynamic detection frame prediction, i.e. iteratively predicting on the generated detection frame and generating the latest detection frame, specifically includes:
predicting 4 coordinates (t_x, t_y, t_w, t_h) for each detection frame; if the cell is offset from the upper left corner of the image by (c_x, c_y) and the prior detection frame has width and height p_w, p_h, the coordinates of the latest detection frame are:
b_x = σ(t_x) + c_x
b_y = σ(t_y) + c_y
b_w = p_w · e^(t_w)
b_h = p_h · e^(t_h)
wherein b_x, b_y, b_w, b_h are respectively the four position values of the latest detection frame. The detection frame is a quadrilateral whose position can be determined through these 4 values.
The network predicts 4 coordinates (t_x, t_y, t_w, t_h) per detection frame. Given the offset (c_x, c_y) of the cell from the upper left corner of the image, the coordinates of the latest detection frame expressed by the above formulas can be derived.
It should be noted that each frame uses multiple labels to classify the classes that the predicted bounding box may contain. In the process of category identification, the application uses a binary cross-entropy loss to conduct category prediction. The reason for using binary cross-entropy loss is that the inventors found a softmax is not required for good performance; independent logistic classifiers are used instead, so this step does not require the softmax technique. The binary cross-entropy loss provides more assistance when the method of the application is migrated to more complex category identification areas. Binary cross-entropy loss is a common technique in the field which a person skilled in the art can implement as required, so the application does not elaborate.
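A minimal sketch of the binary cross-entropy loss used for multi-label category prediction, averaged over classes (an illustration, not the claimed implementation). Each class gets an independent logistic term, so labels are not mutually exclusive, unlike a softmax over classes:

```python
import math

def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over independent class predictions.
    `probs` are per-class probabilities in [0, 1]; `labels` are the
    0/1 ground-truth indicators for the same classes."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(probs)
```

A perfectly confident correct prediction gives a loss near zero; a 0.5 prediction for a positive class gives exactly log 2.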
Preferably, 53 layers of convolution operations are adopted in the neural network model, alternating between 3×3 and 1×1 convolution layers. The inventors found in limited practical tests that alternating the convolution layers can increase accuracy and effectively improve operation speed. Specifically, a 3×3 convolution operation is applied first, then a 1×1 convolution operation, and the two alternate in turn until all convolution layers have participated in the operation.
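The alternation of kernel sizes can be sketched as follows. Channel widths and any residual connections of the 53-layer network are not specified in this description, so only the 3×3/1×1 alternation is shown:

```python
def alternating_conv_plan(n_layers=53):
    """Build a layer plan where odd-numbered layers use 3x3 kernels
    and even-numbered layers use 1x1 kernels, alternating until all
    n_layers have been assigned."""
    plan = []
    for i in range(n_layers):
        kernel = 3 if i % 2 == 0 else 1   # 3x3 first, then 1x1, alternating
        plan.append((i + 1, f"{kernel}x{kernel}"))
    return plan
```

With the default of 53 layers, the plan starts 3×3, 1×1, 3×3, … and ends on a 3×3 layer.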
In this embodiment, the adjusted image size is the size specified by the neural network model.
In order to better explain the technical scheme of the present invention, the following describes the technical scheme of the present invention in detail.
After the detection frames are generated, the classification probability values in the detection frames are calculated, and the optimal N detection frames of the same class are screened out. It should be noted that the size of the detection frame is dynamically predicted; the dynamic prediction process is the scheme described above. The M classification probability values of each detection frame are screened using a probability threshold according to the following screening rules:
Calculate the classification probability values of each detection frame, arrange them in descending order, and select the highest-ranking classification. In this first round of screening, the M categories of each detection frame are compared and the "champion" category with the highest probability value is selected.
Compare the highest-ranking classification with a preset probability threshold; if it is larger than or equal to the threshold, the detection frame is reserved, and if it is smaller, the detection frame is deleted. In this second round of screening, the champion classification is compared with the probability threshold, and only detection frames whose values exceed the threshold qualify. For example, the probability threshold may be set to 0.24 (24%): after comparison, only detection frames whose classification probability value is larger than or equal to 0.24 are displayed on the picture. Those skilled in the art may set the probability threshold according to actual needs; the probability threshold described in the application does not limit its protection scope.
Then, calculate the coincidence degree of the N same-class detection frames and reserve the detection frame with the highest coincidence degree.
For example, suppose that after the screening step described above, three detection frames all have the classification "horse".
Sort the detection probabilities of the three detection frames in descending order.
Calculate the coincidence degree (IoU) pairwise; if the calculated IoU is more than 0.3, eliminate the detection frame with the lower probability.
The result is a unique detection frame classified as "horse".
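The screening by coincidence degree can be sketched as a greedy suppression, under the assumption (not stated in the source) that boxes are axis-aligned and given as (x1, y1, x2, y2) corners:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def suppress(frames, iou_threshold=0.3):
    """Greedy suppression as described above: sort same-class frames by
    probability descending, then drop any frame whose IoU with an
    already-kept frame exceeds the threshold. `frames` is a list of
    (probability, box) pairs."""
    kept = []
    for prob, box in sorted(frames, key=lambda f: f[0], reverse=True):
        if all(iou(box, kept_box) <= iou_threshold for _, kept_box in kept):
            kept.append((prob, box))
    return kept
```

Applied to the "horse" example, overlapping lower-probability frames are eliminated and only the strongest frame per cluster of overlaps survives.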
FIG. 2 shows a schematic diagram of convolution operation in the classification process of the present invention.
As shown in fig. 2, the neural network model adopts 53-layer convolution operations, and the convolution operations of each layer are alternately performed for 3×3 and 1×1 convolution layers.
This feature extraction scheme achieves the highest measured floating-point operations per second. This means the neural network architecture can better utilize the machine's GPU, improving evaluation efficiency and thus speed. Compared with ResNets, which have many layers and are not very efficient, the convolution operation of the application is more efficient and accurate.
For example, each neural network is trained with the same settings and tested at 256×256 single-crop accuracy. The classifier using the feature extraction of the application performs comparably to the most advanced classifiers in the prior art, but with fewer floating-point operations and faster speed.
FIG. 3 shows a block diagram of a YOLO-based image object recognition system of the present invention.
As shown in fig. 3, the technical solution of the present invention further provides a YOLO-based image target recognition system 2, which includes: a memory 201, a processor 202 and an image pickup device 203, wherein the memory 201 includes a YOLO-based image target recognition method program which, when executed by the processor, implements the following steps:
Receiving an image to be detected;
The size of the image to be detected is adjusted according to a preset requirement, and a first detection image is generated;
the first detection image is sent to a neural network model for matching identification, and a detection frame, classification identification information and a classification probability value corresponding to the classification identification information are generated;
judging whether the classification probability value is larger than a preset classification probability threshold value or not;
and if the classification probability value is larger than the preset classification probability threshold, taking the detection frame and the classification identification information as the recognition result.
The adjusted size is the size specified by the neural network model. This size is generally selected to be smaller than the image to be detected, which ensures the speed of the operation processing so that class identification can be carried out rapidly. It will be appreciated by those skilled in the art that the size may be set according to actual needs; it is not limited to the above and is not intended to limit the scope of the present invention.
The first detection image is sent to the neural network model, generating a detection frame, classification identification information and a classification probability value corresponding to the classification identification information. A person skilled in the art can set the classification probability threshold according to actual needs. For example, with the threshold set to 90%, when detecting a picture containing a kitten, if the probability of identifying a kitten in the detection frame exceeds 90%, the kitten is circled by the detection frame and has been identified. If the classification probability value is smaller than the preset classification probability threshold, the system returns to the matching identification step for re-identification until the classification probability value is larger than the threshold. The neural network model performs multi-layer convolution operations on the image; the YOLO convolution operation is conventional in the field and belongs to the prior art, so it is not described in detail.
In this solution, before the receiving the image to be detected, the method further includes:
Training pictures to obtain a neural network model; the neural network model is trained by the following steps:
acquiring a training image dataset;
performing image preprocessing on the training image data set to obtain a preprocessed image set;
training the preprocessed image set to obtain the neural network model with the input interface and the output interface.
It should be noted that the training image data set has 1000 object categories and 1.2 million training images. The data set is preprocessed before training; the preprocessing comprises one or more of rotation, contrast enhancement, tilting and scaling. After preprocessing, the images have a certain distortion, and training on the distorted images can increase the accuracy of the final image recognition.
In this scheme, the step of generating the detection frame specifically includes:
generating an initial detection frame according to initial preset coordinate points;
Performing prediction of a dynamic detection frame, performing iterative prediction on the generated detection frame, and generating a latest detection frame;
calculating the coincidence ratio of the latest detection frame;
if the latest detection frame overlap ratio is greater than or equal to a preset overlap ratio threshold value, reserving the latest detection frame; if the latest detection frame overlap ratio is smaller than a preset overlap ratio threshold value, continuing to predict the dynamic detection frame;
And finally generating N similar detection frames of the same class.
It should be noted that, the initial preset coordinate point may be a coordinate point of a preset detection frame, which may be automatically generated during training and recognition detection, or may be generated by a person skilled in the art according to actual needs.
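The dynamic-detection-frame loop described in the steps above can be sketched as follows, assuming a caller-supplied predict_step callable stands in for the network's iterative prediction; the names iou and refine_box are illustrative:

```python
def iou(a, b):
    """Coincidence ratio (intersection-over-union) of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / float(union)

def refine_box(initial_box, predict_step, iou_threshold=0.5, max_iter=20):
    """Keep predicting a latest box from the current one; stop and retain it
    once its overlap with the previous box reaches the preset threshold."""
    box = initial_box
    for _ in range(max_iter):
        new_box = predict_step(box)
        if iou(box, new_box) >= iou_threshold:
            return new_box      # overlap high enough: keep the latest frame
        box = new_box           # otherwise continue the dynamic prediction
    return box

# Toy predictor that moves the box halfway toward (0, 0, 10, 10) each step.
target = (0, 0, 10, 10)
step = lambda b: tuple((bi + ti) / 2 for bi, ti in zip(b, target))
final = refine_box((0, 0, 40, 40), step)
print(iou(final, target))
```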
In this scheme, the performing prediction of the dynamic detection frame, performing iterative prediction on the generated detection frame, and generating the latest detection frame specifically includes:
Predicting 4 coordinates (tx, ty, tw, th) for each detection frame; if the cell is offset from the upper-left corner of the image by (cx, cy) and the prior detection frame has width pw and height ph, the coordinates of the latest detection frame are:
bx=σ(tx)+cx
by=σ(ty)+cy
bw=pw·e^tw
bh=ph·e^th
In the invention, dimension clusters may be used as anchor boxes to dynamically predict detection frames; a detection frame is also a bounding box. The network predicts 4 coordinates tx, ty, tw, th for each detection frame. If the cell is offset from the upper-left corner of the image by (cx, cy), the coordinates of the latest detection frame given by the above formulas can be derived, where bx, by, bw, bh are the four coordinate values of the latest detection frame. The detection frame is a quadrilateral whose position can be determined by these 4 values.
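The coordinate decoding can be sketched as follows; the width/height terms assume the standard YOLOv3 form bw = pw·e^tw and bh = ph·e^th, and the function name decode_box is illustrative:

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Decode network outputs (tx, ty, tw, th) into box coordinates: the
    center offsets are squashed with a sigmoid and added to the cell offset
    (cx, cy); the width/height terms scale the prior box (pw, ph)."""
    bx = sigmoid(tx) + cx
    by = sigmoid(ty) + cy
    bw = pw * math.exp(tw)
    bh = ph * math.exp(th)
    return bx, by, bw, bh

bx, by, bw, bh = decode_box(0.0, 0.0, 0.0, 0.0, cx=6, cy=6, pw=3.0, ph=4.5)
print(bx, by, bw, bh)  # 6.5 6.5 3.0 4.5
```

Note the sigmoid keeps the predicted center inside its own grid cell, which is what ties each detection frame to one picture region.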
It should be noted that each box uses multiple labels to classify the classes that the predicted bounding box may contain. In the process of category identification, the application uses binary cross-entropy loss for class prediction. The reason for using binary cross-entropy loss is mainly that the inventors found that the softmax technique is not required for good performance; independent logistic classifiers are used instead, so this step does not need softmax. Binary cross-entropy loss is even more helpful when the method of the present application is migrated to more complex category-identification domains. Binary cross-entropy loss is a common technique in the field which a person skilled in the art can implement as required, so it is not described again here.
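A minimal sketch of per-class binary cross-entropy follows, illustrating why independent logistic outputs need no softmax (each class is scored on its own, so several classes can be positive at once); function names are illustrative:

```python
import math

def binary_cross_entropy(predicted, target, eps=1e-12):
    """Per-class binary cross-entropy: each class gets an independent logistic
    prediction in (0, 1), so classes are multi-label and no softmax over
    classes is required."""
    p = min(max(predicted, eps), 1.0 - eps)  # clamp for numerical safety
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def multilabel_loss(predictions, targets):
    """Sum of independent per-class losses for one detection box."""
    return sum(binary_cross_entropy(p, t) for p, t in zip(predictions, targets))

loss = multilabel_loss([0.9, 0.1, 0.8], [1.0, 0.0, 1.0])
print(round(loss, 4))  # 0.4339
```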
In this solution, before the receiving the image to be detected, the method further includes:
training pictures to obtain a neural network model; the neural network model is a model with an input interface and an output interface, which is obtained by performing image training on pictures of different categories.
Preferably, 53 convolution layers are adopted in the neural network model, alternating between 3×3 and 1×1 convolutions. The inventors found in practical tests that alternating the convolution layers in this way increases accuracy and effectively improves operation speed. Specifically, a 3×3 convolution operation is applied first, then a 1×1 convolution operation, and so on alternately until all convolution layers have participated in the operation.
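The alternating 3×3 / 1×1 ordering can be sketched schematically as follows; this lists kernel sizes only and is not a full network definition, and build_layer_spec is an illustrative name:

```python
def build_layer_spec(n_layers=53):
    """Schematic of the alternating layer pattern described above:
    even-indexed layers use 3x3 kernels, odd-indexed layers use 1x1,
    starting with a 3x3 convolution."""
    return [(3, 3) if i % 2 == 0 else (1, 1) for i in range(n_layers)]

spec = build_layer_spec()
print(spec[:4], len(spec))  # [(3, 3), (1, 1), (3, 3), (1, 1)] 53
```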
In this embodiment, the size is the size specified by the neural network model.
It should be noted that, in the neural network model, 53-layer convolution operation is adopted, and the convolution operation of each layer is alternately performed by 3×3 and 1×1 convolution layers.
This feature-extraction scheme achieves the highest measured floating-point operations per second, meaning the neural network architecture makes better use of the machine's GPU, improving evaluation efficiency and therefore speed. The convolution operation of the present application can be more efficient and accurate because it avoids the excessive depth and inefficiency of ResNets.
For example, each neural network was trained with the same settings and tested at 256×256 single-crop accuracy. The classifier using the feature extraction of the application performs on par with the most advanced classifiers in the prior art, but with fewer floating-point operations and higher speed.
The classification probability values in the detection frames are calculated, and the optimal N detection frames of the same class are screened out. The M classification probability values of each detection frame are screened using the probability threshold according to the following set of screening rules:
The classification probability values of each detection frame are calculated, arranged in descending order, and the highest-ranking classification is selected. This first screening round can be described as comparing the M categories of each detection frame and selecting the "champion" category with the highest probability value.
The highest-ranking classification is compared with a preset probability threshold; if it is greater than or equal to the preset probability threshold, the detection frame is retained, and if it is smaller, the detection frame is deleted. In this second screening round, the champion classification is compared with the probability threshold, and only detection frames exceeding it qualify. For example, the probability threshold may be set to 0.24 (24%). After the comparison, the detection frames that pass are displayed on the picture: a detection frame is displayed as long as its classification probability value is equal to or greater than the 0.24 probability threshold.
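The two screening rounds above can be sketched as follows; the function name, data layout, and sample values are illustrative:

```python
def screen_boxes(detections, probability_threshold=0.24):
    """Two-round screening: for each box, pick the top-ranked ("champion")
    class by probability, then keep the box only if that top probability
    reaches the preset threshold (e.g. 0.24)."""
    kept = []
    for box, class_probs in detections:
        best_class = max(class_probs, key=class_probs.get)    # round 1: ranking
        if class_probs[best_class] >= probability_threshold:  # round 2: threshold
            kept.append((box, best_class, class_probs[best_class]))
    return kept

dets = [((0, 0, 10, 10), {"cat": 0.70, "dog": 0.20}),
        ((5, 5, 15, 15), {"cat": 0.10, "dog": 0.15})]
print(screen_boxes(dets))  # only the first box survives, as class "cat"
```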
And calculating the coincidence degree of the N similar detection frames, and reserving the detection frame with the highest coincidence degree.
For example, after the screening steps described above, three detection frames all carry the classification "horse".
The detection probabilities of the three detection frames are sorted in descending order.
The coincidence degree (IoU) is calculated pairwise; if the calculated IoU value is greater than 0.3, the detection frame with the lower probability is eliminated.
The result is a unique detection frame classified as "horse".
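The elimination procedure above can be sketched as an IoU-based suppression; the names and sample boxes are illustrative:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / float(union)

def suppress(boxes, iou_threshold=0.3):
    """Sort same-class boxes by probability (descending), compare pairwise,
    and drop the lower-probability box whenever IoU exceeds the threshold."""
    boxes = sorted(boxes, key=lambda b: b[1], reverse=True)
    kept = []
    for box, prob in boxes:
        if all(iou(box, k) <= iou_threshold for k, _ in kept):
            kept.append((box, prob))
    return kept

horses = [((10, 10, 50, 50), 0.9), ((12, 12, 52, 52), 0.8), ((11, 9, 49, 51), 0.7)]
result = suppress(horses)
print(len(result))  # the three overlapping "horse" boxes collapse to one
```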
In order to better explain the technical scheme of the invention, the following detailed description is given by an embodiment. Fig. 4 shows a schematic diagram of an embodiment of the invention.
As shown in fig. 4, the convolution layers in the neural network model are numbered 0-52. The resized first detection image is then received; its size is 416×416 (the specific size can be set according to actual operation requirements and computing capability, and 416×416 is chosen for this embodiment), and it is a color photo. Layer 0 of the neural network model receives the 416×416, 3-channel (RGB) first color detection image and performs the convolution operation.
After the convolution operations of layers 0-51, a feature map with a size of 13×13 and 425 channels is obtained.
The 52nd layer performs a convolution operation on the feature map and finally outputs a one-dimensional prediction array comprising 13×13×5×85 numerical values. The multi-dimensional array or matrix is reduced to a one-dimensional array through a series of operations; this one-dimensional array is the prediction array.
In the 13×13×5×85 values, 13×13 represents the width × height of the feature map; there are 13×13 feature cells in total. YOLO equally divides the original picture (416×416) into 13×13 regions (cells), one picture region for each feature cell. The specific size may be set by a person skilled in the art according to actual operation requirements and computing capability.
The number 5 represents 5 detection boxes (bounding boxes) of different shapes: YOLO generates 5 differently shaped detection boxes in each picture region, using the center point of the region as the center point of the detection boxes to detect objects, so YOLO uses 13×13×5 detection boxes to detect one picture or image.
The number 85 can be split into 3 parts: 85 = 4 + 1 + 80.
4: each detection frame contains 4 coordinate values (x, y, width, height).
1: each detection frame has 1 confidence value (0-1) for the detected object, understood as the confidence probability that an object has been detected.
80: each detection frame has 80 classification detection probability values (0-1), understood as the probability that the object in the detection frame belongs to each respective classification.
In summary, a 416×416 picture is divided evenly into 13×13 picture regions, each picture region generates 5 detection frames, and each detection frame contains 85 values (4 coordinate values + 1 object confidence value + 80 classification detection values). The resulting one-dimensional prediction array (predictions) represents the objects detected in the picture and contains 13×13×5×85 numerical values, predictions[0] to predictions[13×13×5×85−1].
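As a minimal sketch, indexing into this flattened prediction array can be written as follows; row-major flattening is an assumption, and the helper name prediction_index is illustrative:

```python
S, B, V = 13, 5, 85   # grid size, boxes per cell, values per box (4 + 1 + 80)

def prediction_index(row, col, box, value):
    """Index into the flattened 13*13*5*85 prediction array for grid cell
    (row, col), detection box `box`, and value slot `value` (0-3 coordinates,
    4 objectness confidence, 5-84 class probabilities). Row-major layout
    is assumed here."""
    return ((row * S + col) * B + box) * V + value

total = S * S * B * V
print(total)                            # 71825 values in the array
print(prediction_index(12, 12, 4, 84))  # last element: total - 1
```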
The invention provides a YOLO-based image target identification method, system, and storage medium. The method can effectively improve detection precision and reduce detection time. Experiments and verification show that the method of the invention outperforms prior-art detection methods, improving identification accuracy and increasing operation speed.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The device embodiments described above are only illustrative; for example, the division of the units is only a logical functional division, and there may be other divisions in practice, such as: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; can be located in one place or distributed to a plurality of network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present invention may be integrated in one processing unit, or each unit may be separately used as one unit, or two or more units may be integrated in one unit; the integrated units may be implemented in hardware or in hardware plus software functional units.
Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be implemented by hardware related to program instructions; the foregoing program may be stored in a computer-readable storage medium, and when executed, the program performs the steps of the above method embodiments. The aforementioned storage medium includes: a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
Alternatively, the above-described integrated units of the invention may be stored in a computer-readable storage medium if implemented in the form of software functional modules and sold or used as independent products. Based on such understanding, the technical solutions of the embodiments of the present invention, in essence or the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the methods described in the embodiments of the present invention. The aforementioned storage medium includes: a removable storage device, ROM, RAM, a magnetic disk, an optical disk, or other media capable of storing program code.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (6)
1. A YOLO-based image target recognition method, comprising:
Receiving an image to be detected;
The size of the image to be detected is adjusted according to a preset requirement, and a first detection image is generated;
the first detection image is sent to a neural network model for matching identification, and a detection frame, classification identification information and a classification probability value corresponding to the classification identification information are generated;
Arranging the classification probability values in each detection frame from large to small, and selecting the classification with highest ranking; then comparing the classification probability value of the highest-ranking classification with a preset probability threshold value, and judging whether the classification probability value of the highest-ranking classification is larger than the preset probability threshold value;
if the classification probability value of the highest-ranking classification is smaller than the preset probability threshold, deleting the detection frame; if it is larger than the preset probability threshold, taking the detection frame and the classification identification information as the identification classification result;
the step of generating the detection frame specifically comprises the following steps:
generating an initial detection frame according to initial preset coordinate points;
Performing prediction of a dynamic detection frame, performing iterative prediction on the generated detection frame, and generating a latest detection frame;
calculating the coincidence ratio of the latest detection frame;
if the latest detection frame overlap ratio is greater than or equal to a preset overlap ratio threshold value, reserving the latest detection frame; if the latest detection frame overlap ratio is smaller than a preset overlap ratio threshold value, continuing to predict the dynamic detection frame;
finally generating N similar detection frames of the same class;
the step of carrying out the prediction of the dynamic detection frame, carrying out iterative prediction on the generated detection frame, and generating the latest detection frame specifically comprises the following steps:
predicting 4 coordinates (tx, ty, tw, th) for each detection frame; if the cell is offset from the upper-left corner of the image by (cx, cy) and the prior detection frame has width pw and height ph, the coordinates of the latest detection frame are:
bx=σ(tx)+cx
by=σ(ty)+cy
bw=pw·e^tw
bh=ph·e^th
wherein bx, by, bw, bh are respectively the four coordinate values of the latest detection frame.
2. The YOLO-based image object recognition method according to claim 1, further comprising, before the receiving the image to be detected:
Training pictures to obtain a neural network model; the neural network model is trained by the following steps:
acquiring a training image dataset;
performing image preprocessing on the training image data set to obtain a preprocessed image set;
training the preprocessed image set to obtain the neural network model with the input interface and the output interface.
3. The YOLO-based image object recognition method of claim 2, wherein 53-layer convolution operations are used in the neural network model, and the convolution operations of each layer are calculated alternately for 3 x 3 and 1 x 1 convolution layers.
4. The YOLO-based image object recognition method of claim 1, wherein the size is a size specified by a neural network model.
5. A YOLO-based image target recognition system, the system comprising: the image target recognition system comprises a memory, a processor and an image pickup device, wherein the memory comprises a YOLO-based image target recognition method program, and the image target recognition method program based on the YOLO realizes the following steps when being executed by the processor:
Receiving an image to be detected;
The size of the image to be detected is adjusted according to a preset requirement, and a first detection image is generated;
the first detection image is sent to a neural network model for matching identification, and a detection frame, classification identification information and a classification probability value corresponding to the classification identification information are generated;
Arranging the classification probability values in each detection frame from large to small, and selecting the classification with highest ranking; then comparing the classification probability value of the highest-ranking classification with a preset probability threshold value, and judging whether the classification probability value of the highest-ranking classification is larger than the preset probability threshold value;
if the classification probability value of the highest-ranking classification is smaller than the preset probability threshold, deleting the detection frame; if it is larger than the preset probability threshold, taking the detection frame and the classification identification information as the identification classification result;
the step of generating the detection frame specifically comprises the following steps:
generating an initial detection frame according to initial preset coordinate points;
Performing prediction of a dynamic detection frame, performing iterative prediction on the generated detection frame, and generating a latest detection frame;
calculating the coincidence ratio of the latest detection frame;
if the latest detection frame overlap ratio is greater than or equal to a preset overlap ratio threshold value, reserving the latest detection frame; if the latest detection frame overlap ratio is smaller than a preset overlap ratio threshold value, continuing to predict the dynamic detection frame;
finally generating N similar detection frames of the same class;
the step of carrying out the prediction of the dynamic detection frame, carrying out iterative prediction on the generated detection frame, and generating the latest detection frame specifically comprises: predicting 4 coordinates (tx, ty, tw, th) for each detection frame; if the cell is offset from the upper-left corner of the image by (cx, cy) and the prior detection frame has width pw and height ph, the coordinates of the latest detection frame are:
bx=σ(tx)+cx
by=σ(ty)+cy
bw=pw·e^tw
bh=ph·e^th
wherein bx, by, bw, bh are respectively the four coordinate values of the latest detection frame.
6. A computer-readable storage medium, characterized in that a YOLO-based image object recognition method program is included in the computer-readable storage medium, which, when executed by a processor, implements the steps of a YOLO-based image object recognition method according to any one of claims 1 to 4.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910114621.5A CN109977943B (en) | 2019-02-14 | 2019-02-14 | Image target recognition method, system and storage medium based on YOLO |
PCT/CN2019/118499 WO2020164282A1 (en) | 2019-02-14 | 2019-11-14 | Yolo-based image target recognition method and apparatus, electronic device, and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109977943A CN109977943A (en) | 2019-07-05 |
CN109977943B true CN109977943B (en) | 2024-05-07 |
Families Citing this family (130)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109977943B (en) * | 2019-02-14 | 2024-05-07 | 平安科技(深圳)有限公司 | Image target recognition method, system and storage medium based on YOLO |
CN110348304A (en) * | 2019-06-06 | 2019-10-18 | 武汉理工大学 | A kind of maritime affairs distress personnel search system being equipped on unmanned plane and target identification method |
CN110738125B (en) * | 2019-09-19 | 2023-08-01 | 平安科技(深圳)有限公司 | Method, device and storage medium for selecting detection frame by Mask R-CNN |
CN111223343B (en) * | 2020-03-07 | 2022-01-28 | 上海中科教育装备集团有限公司 | Artificial intelligence scoring experimental equipment and scoring method for lever balance experiment |
CN111582021A (en) * | 2020-03-26 | 2020-08-25 | 平安科技(深圳)有限公司 | Method and device for detecting text in scene image and computer equipment |
CN111695559B (en) * | 2020-04-28 | 2023-07-18 | 深圳市跨越新科技有限公司 | YoloV3 model-based waybill picture information coding method and system |
CN113705591A (en) * | 2020-05-20 | 2021-11-26 | 上海微创卜算子医疗科技有限公司 | Readable storage medium, and support specification identification method and device |
CN111626256B (en) * | 2020-06-03 | 2023-06-27 | 兰波(苏州)智能科技有限公司 | High-precision diatom detection and identification method and system based on scanning electron microscope image |
CN111738259A (en) * | 2020-06-29 | 2020-10-02 | 广东电网有限责任公司 | Tower state detection method and device |
CN111523621B (en) * | 2020-07-03 | 2020-10-20 | 腾讯科技(深圳)有限公司 | Image recognition method and device, computer equipment and storage medium |
CN111857350A (en) * | 2020-07-28 | 2020-10-30 | 海尔优家智能科技(北京)有限公司 | Method, device and equipment for rotating display equipment |
CN112101134B (en) * | 2020-08-24 | 2024-01-02 | 深圳市商汤科技有限公司 | Object detection method and device, electronic equipment and storage medium |
CN112036286A (en) * | 2020-08-25 | 2020-12-04 | 北京华正明天信息技术股份有限公司 | Method for achieving temperature sensing and intelligently analyzing and identifying flame based on yoloV3 algorithm |
CN111986255B (en) * | 2020-09-07 | 2024-04-09 | 凌云光技术股份有限公司 | Multi-scale anchor initializing method and device of image detection model |
CN112132018A (en) * | 2020-09-22 | 2020-12-25 | 平安国际智慧城市科技股份有限公司 | Traffic police recognition method, traffic police recognition device, traffic police recognition medium and electronic equipment |
CN112116582A (en) * | 2020-09-24 | 2020-12-22 | 深圳爱莫科技有限公司 | Cigarette detection and identification method under stock or display scene |
CN112036507B (en) * | 2020-09-25 | 2023-11-14 | 北京小米松果电子有限公司 | Training method and device of image recognition model, storage medium and electronic equipment |
CN112149748B (en) * | 2020-09-28 | 2024-05-21 | 商汤集团有限公司 | Image classification method and device, electronic equipment and storage medium |
CN112183358B (en) * | 2020-09-29 | 2024-04-23 | 新石器慧通(北京)科技有限公司 | Training method and device for target detection model |
CN112132088B (en) * | 2020-09-29 | 2024-01-12 | 动联(山东)电子科技有限公司 | Inspection point missing inspection identification method |
CN112200186B (en) * | 2020-10-15 | 2024-03-15 | 上海海事大学 | Vehicle logo identification method based on improved YOLO_V3 model |
CN112231497B (en) * | 2020-10-19 | 2024-04-09 | 腾讯科技(深圳)有限公司 | Information classification method and device, storage medium and electronic equipment |
CN112348778B (en) * | 2020-10-21 | 2023-10-27 | 深圳市优必选科技股份有限公司 | Object identification method, device, terminal equipment and storage medium |
CN112288003B (en) * | 2020-10-28 | 2023-07-25 | 北京奇艺世纪科技有限公司 | Neural network training and target detection method and device |
CN112381773B (en) * | 2020-11-05 | 2023-04-18 | 东风柳州汽车有限公司 | Key cross section data analysis method, device, equipment and storage medium |
CN112365465B (en) * | 2020-11-09 | 2024-02-06 | 浙江大华技术股份有限公司 | Synthetic image category determining method and device, storage medium and electronic device |
CN112287884B (en) * | 2020-11-19 | 2024-02-20 | 长江大学 | Examination abnormal behavior detection method and device and computer readable storage medium |
CN112348112B (en) * | 2020-11-24 | 2023-12-15 | 深圳市优必选科技股份有限公司 | Training method and training device for image recognition model and terminal equipment |
CN112364807B (en) * | 2020-11-24 | 2023-12-15 | 深圳市优必选科技股份有限公司 | Image recognition method, device, terminal equipment and computer readable storage medium |
CN112560586B (en) * | 2020-11-27 | 2024-05-10 | 国家电网有限公司大数据中心 | Method and device for obtaining structural data of pole and tower signboard and electronic equipment |
CN112634202A (en) * | 2020-12-04 | 2021-04-09 | 浙江省农业科学院 | Method, device and system for detecting behavior of polyculture fish shoal based on YOLOv3-Lite |
CN112508915A (en) * | 2020-12-11 | 2021-03-16 | 中信银行股份有限公司 | Target detection result optimization method and system |
CN112215308B (en) * | 2020-12-13 | 2021-03-30 | 之江实验室 | Single-order detection method and device for hoisted object, electronic equipment and storage medium |
CN112507896B (en) * | 2020-12-14 | 2023-11-07 | 大连大学 | Method for detecting cherry fruits by adopting improved YOLO-V4 model |
CN113723157B (en) * | 2020-12-15 | 2024-02-09 | 京东科技控股股份有限公司 | Crop disease identification method and device, electronic equipment and storage medium |
CN112613097A (en) * | 2020-12-15 | 2021-04-06 | 中铁二十四局集团江苏工程有限公司 | BIM rapid modeling method based on computer vision |
CN112507912A (en) * | 2020-12-15 | 2021-03-16 | 网易(杭州)网络有限公司 | Method and device for identifying illegal picture |
CN112633352B (en) * | 2020-12-18 | 2023-08-29 | 浙江大华技术股份有限公司 | Target detection method and device, electronic equipment and storage medium |
CN112634327A (en) * | 2020-12-21 | 2021-04-09 | 合肥讯图信息科技有限公司 | Tracking method based on YOLOv4 model |
CN112633159B (en) * | 2020-12-22 | 2024-04-12 | 北京迈格威科技有限公司 | Human-object interaction relation identification method, model training method and corresponding device |
CN112580523A (en) * | 2020-12-22 | 2021-03-30 | 平安国际智慧城市科技股份有限公司 | Behavior recognition method, behavior recognition device, behavior recognition equipment and storage medium |
CN112699925A (en) * | 2020-12-23 | 2021-04-23 | 国网安徽省电力有限公司检修分公司 | Transformer substation meter image classification method |
CN112633286B (en) * | 2020-12-25 | 2022-09-09 | 北京航星机器制造有限公司 | Intelligent security inspection system based on similarity rate and recognition probability of dangerous goods |
CN112541483B (en) * | 2020-12-25 | 2024-05-17 | 深圳市富浩鹏电子有限公司 | Dense face detection method combining YOLO and blocking-fusion strategy |
CN112580734B (en) * | 2020-12-25 | 2023-12-29 | 深圳市优必选科技股份有限公司 | Target detection model training method, system, terminal equipment and storage medium |
CN112597915B (en) * | 2020-12-26 | 2024-04-09 | 上海有个机器人有限公司 | Method, device, medium and robot for identifying indoor close-distance pedestrians |
CN112613570A (en) * | 2020-12-29 | 2021-04-06 | 深圳云天励飞技术股份有限公司 | Image detection method, image detection device, equipment and storage medium |
CN112784694A (en) * | 2020-12-31 | 2021-05-11 | 杭州电子科技大学 | EVP-YOLO-based indoor article detection method |
CN112560799B (en) * | 2021-01-05 | 2022-08-05 | 北京航空航天大学 | Unmanned aerial vehicle intelligent vehicle target detection method based on adaptive target area search and game and application |
CN112733741A (en) * | 2021-01-14 | 2021-04-30 | 苏州挚途科技有限公司 | Traffic signboard identification method and device and electronic equipment |
CN112818980A (en) * | 2021-01-15 | 2021-05-18 | 湖南千盟物联信息技术有限公司 | Steel ladle number detection and identification method based on Yolov3 algorithm |
CN112766170B (en) * | 2021-01-21 | 2024-04-16 | 广西财经学院 | Self-adaptive segmentation detection method and device based on cluster unmanned aerial vehicle image |
CN112906478B (en) * | 2021-01-22 | 2024-01-09 | 北京百度网讯科技有限公司 | Target object identification method, device, equipment and storage medium |
CN112906495B (en) * | 2021-01-27 | 2024-04-30 | 深圳安智杰科技有限公司 | Target detection method and device, electronic equipment and storage medium |
CN112800971A (en) * | 2021-01-29 | 2021-05-14 | 深圳市商汤科技有限公司 | Neural network training and point cloud data processing method, device, equipment and medium |
CN114821288A (en) * | 2021-01-29 | 2022-07-29 | 中强光电股份有限公司 | Image identification method and unmanned aerial vehicle system |
CN112911171B (en) * | 2021-02-04 | 2022-04-22 | 上海航天控制技术研究所 | Intelligent photoelectric information processing system and method based on accelerated processing |
CN112861711A (en) * | 2021-02-05 | 2021-05-28 | 深圳市安软科技股份有限公司 | Regional intrusion detection method and device, electronic equipment and storage medium |
CN112861716A (en) * | 2021-02-05 | 2021-05-28 | 深圳市安软科技股份有限公司 | Illegal article placement monitoring method, system, equipment and storage medium |
CN112906794A (en) * | 2021-02-22 | 2021-06-04 | 珠海格力电器股份有限公司 | Target detection method, device, storage medium and terminal |
CN113095133B (en) * | 2021-03-04 | 2023-12-29 | 北京迈格威科技有限公司 | Model training method, target detection method and corresponding devices |
CN112906621A (en) * | 2021-03-10 | 2021-06-04 | 北京华捷艾米科技有限公司 | Hand detection method, device, storage medium and equipment |
CN112966618B (en) * | 2021-03-11 | 2024-02-09 | 京东科技信息技术有限公司 | Dressing recognition method, apparatus, device and computer readable medium |
CN113011319B (en) * | 2021-03-16 | 2024-04-16 | 上海应用技术大学 | Multi-scale fire target identification method and system |
CN112966762B (en) * | 2021-03-16 | 2023-12-26 | 南京恩博科技有限公司 | Wild animal detection method and device, storage medium and electronic equipment |
CN112991304A (en) * | 2021-03-23 | 2021-06-18 | 武汉大学 | Molten pool sputtering detection method based on laser directional energy deposition monitoring system |
CN113033398B (en) * | 2021-03-25 | 2022-02-11 | 深圳市康冠商用科技有限公司 | Gesture recognition method and device, computer equipment and storage medium |
CN112965604A (en) * | 2021-03-29 | 2021-06-15 | 深圳市优必选科技股份有限公司 | Gesture recognition method and device, terminal equipment and computer readable storage medium |
CN112990334A (en) * | 2021-03-29 | 2021-06-18 | 西安电子科技大学 | Small sample SAR image target identification method based on improved prototype network |
CN113222889B (en) * | 2021-03-30 | 2024-03-12 | 大连智慧渔业科技有限公司 | Industrial aquaculture counting method and device for aquaculture under high-resolution image |
CN113052127A (en) * | 2021-04-09 | 2021-06-29 | 上海云从企业发展有限公司 | Behavior detection method, behavior detection system, computer equipment and machine readable medium |
CN113139597B (en) * | 2021-04-19 | 2022-11-04 | 中国人民解放军91054部队 | Statistical thought-based image distribution external detection method |
CN113158922A (en) * | 2021-04-26 | 2021-07-23 | 平安科技(深圳)有限公司 | Traffic flow statistical method, device and equipment based on YOLO neural network |
CN113128522B (en) * | 2021-05-11 | 2024-04-05 | 四川云从天府人工智能科技有限公司 | Target identification method, device, computer equipment and storage medium |
CN113240638B (en) * | 2021-05-12 | 2023-11-10 | 上海联影智能医疗科技有限公司 | Target detection method, device and medium based on deep learning |
CN113205067B (en) * | 2021-05-26 | 2024-04-09 | 北京京东乾石科技有限公司 | Method and device for monitoring operators, electronic equipment and storage medium |
WO2022252089A1 (en) * | 2021-05-31 | 2022-12-08 | 京东方科技集团股份有限公司 | Training method for object detection model, and object detection method and device |
CN113435260A (en) * | 2021-06-07 | 2021-09-24 | 上海商汤智能科技有限公司 | Image detection method, related training method, related device, equipment and medium |
CN113392833A (en) * | 2021-06-10 | 2021-09-14 | 沈阳派得林科技有限责任公司 | Method for identifying type number of industrial radiographic negative image |
CN113269188B (en) * | 2021-06-17 | 2023-03-14 | 华南农业大学 | Mark point and pixel coordinate detection method thereof |
CN113486746A (en) * | 2021-06-25 | 2021-10-08 | 海南电网有限责任公司三亚供电局 | Power cable external damage prevention method based on biological induction and video monitoring |
CN113536963B (en) * | 2021-06-25 | 2023-08-15 | 西安电子科技大学 | SAR image airplane target detection method based on lightweight YOLO network |
CN113377888B (en) * | 2021-06-25 | 2024-04-02 | 北京百度网讯科技有限公司 | Method for training object detection model and detection object |
CN113591566A (en) * | 2021-06-28 | 2021-11-02 | 北京百度网讯科技有限公司 | Training method and device of image recognition model, electronic equipment and storage medium |
CN113553948A (en) * | 2021-07-23 | 2021-10-26 | 中远海运科技(北京)有限公司 | Automatic recognition and counting method for tobacco insects and computer readable medium |
CN113486857B (en) * | 2021-08-03 | 2023-05-12 | 云南大学 | YOLOv4-based ascending safety detection method and system |
CN113723217A (en) * | 2021-08-09 | 2021-11-30 | 南京邮电大学 | Intelligent object detection method and system based on improved YOLO |
CN113705643B (en) * | 2021-08-17 | 2022-10-28 | 荣耀终端有限公司 | Target detection method and device and electronic equipment |
CN113657280A (en) * | 2021-08-18 | 2021-11-16 | 广东电网有限责任公司 | Power transmission line target defect detection warning method and system |
CN113948190A (en) * | 2021-09-02 | 2022-01-18 | 上海健康医学院 | Method and equipment for automatically identifying X-ray skull positive position film cephalogram measurement mark points |
CN113723406B (en) * | 2021-09-03 | 2023-07-18 | 乐普(北京)医疗器械股份有限公司 | Method and device for processing support positioning of coronary angiography image |
CN114119455B (en) * | 2021-09-03 | 2024-04-09 | 乐普(北京)医疗器械股份有限公司 | Method and device for positioning vascular stenosis part based on target detection network |
CN113743339B (en) * | 2021-09-09 | 2023-10-03 | 三峡大学 | Indoor falling detection method and system based on scene recognition |
CN113792656B (en) * | 2021-09-15 | 2023-07-18 | 山东大学 | Behavior detection and alarm system using mobile communication equipment in personnel movement |
CN114022705B (en) * | 2021-10-29 | 2023-08-04 | 电子科技大学 | Self-adaptive target detection method based on scene complexity pre-classification |
CN114022554B (en) * | 2021-11-03 | 2023-02-03 | 北华航天工业学院 | Massage robot acupoint detection and positioning method based on YOLO |
CN114120358B (en) * | 2021-11-11 | 2024-04-26 | 国网江苏省电力有限公司技能培训中心 | Super-pixel-guided deep learning-based personnel head-mounted safety helmet recognition method |
CN114255389A (en) * | 2021-11-15 | 2022-03-29 | 浙江时空道宇科技有限公司 | Target object detection method, device, equipment and storage medium |
CN113989939B (en) * | 2021-11-16 | 2024-05-14 | 河北工业大学 | Small target pedestrian detection system based on improved YOLO algorithm |
CN114373075A (en) * | 2021-12-31 | 2022-04-19 | 西安电子科技大学广州研究院 | Target component detection data set construction method, detection method, device and equipment |
US11756288B2 (en) * | 2022-01-05 | 2023-09-12 | Baidu Usa Llc | Image processing method and apparatus, electronic device and storage medium |
CN114565848B (en) * | 2022-02-25 | 2022-12-02 | 佛山读图科技有限公司 | Liquid medicine level detection method and system in complex scene |
CN114662594B (en) * | 2022-03-25 | 2022-10-04 | 浙江省通信产业服务有限公司 | Target feature recognition analysis system |
CN114742204A (en) * | 2022-04-08 | 2022-07-12 | 黑龙江惠达科技发展有限公司 | Method and device for detecting straw coverage rate |
CN114782778B (en) * | 2022-04-25 | 2023-01-06 | 广东工业大学 | Assembly state monitoring method and system based on machine vision technology |
CN114842315B (en) * | 2022-05-07 | 2024-02-02 | 无锡雪浪数制科技有限公司 | Looseness-prevention identification method and device for lightweight high-speed railway hub gasket |
CN114881763B (en) * | 2022-05-18 | 2023-05-26 | 中国工商银行股份有限公司 | Post-loan supervision method, device, equipment and medium for aquaculture |
CN115029209A (en) * | 2022-06-17 | 2022-09-09 | 杭州天杭空气质量检测有限公司 | Colony image acquisition processing device and processing method thereof |
CN114972891B (en) * | 2022-07-07 | 2024-05-03 | 智云数创(洛阳)数字科技有限公司 | Automatic identification method for CAD (computer aided design) component and BIM (building information modeling) method |
CN115082661B (en) * | 2022-07-11 | 2024-05-10 | 阿斯曼尔科技(上海)有限公司 | Sensor assembly difficulty reducing method |
CN115187982B (en) * | 2022-07-12 | 2023-05-23 | 河北华清环境科技集团股份有限公司 | Algae detection method and device and terminal equipment |
CN115909358B (en) * | 2022-07-27 | 2024-02-13 | 广州市玄武无线科技股份有限公司 | Commodity specification identification method, commodity specification identification device, terminal equipment and computer storage medium |
CN115346170B (en) * | 2022-08-11 | 2023-05-30 | 北京市燃气集团有限责任公司 | Intelligent monitoring method and device for gas facility area |
CN115346172B (en) * | 2022-08-16 | 2023-04-21 | 哈尔滨市科佳通用机电股份有限公司 | Method and system for detecting lost and broken hook lifting rod reset spring |
CN115297263B (en) * | 2022-08-24 | 2023-04-07 | 广州方图科技有限公司 | Automatic photographing control method and system suitable for cube shooting |
CN115690565B (en) * | 2022-09-28 | 2024-02-20 | 大连海洋大学 | Method for detecting cultivated takifugu rubripes target by fusing knowledge and improving YOLOv5 |
CN115546566A (en) * | 2022-11-24 | 2022-12-30 | 杭州心识宇宙科技有限公司 | Intelligent body interaction method, device, equipment and storage medium based on article identification |
CN116051985B (en) * | 2022-12-20 | 2023-06-23 | 中国科学院空天信息创新研究院 | Semi-supervised remote sensing target detection method based on multi-model mutual feedback learning |
CN115690570B (en) * | 2023-01-05 | 2023-03-28 | 中国水产科学研究院黄海水产研究所 | Fish shoal feeding intensity prediction method based on ST-GCN |
CN116452858B (en) * | 2023-03-24 | 2023-12-15 | 哈尔滨市科佳通用机电股份有限公司 | Rail wagon connecting pull rod round pin breaking fault identification method and system |
CN116403163B (en) * | 2023-04-20 | 2023-10-27 | 慧铁科技有限公司 | Method and device for identifying opening and closing states of handles of cut-off plug doors |
CN116342316A (en) * | 2023-05-31 | 2023-06-27 | 青岛希尔信息科技有限公司 | Accounting and project financial management system and method |
CN116681687A (en) * | 2023-06-20 | 2023-09-01 | 广东电网有限责任公司广州供电局 | Wire detection method and device based on computer vision and computer equipment |
CN116758547B (en) * | 2023-06-27 | 2024-03-12 | 北京中超伟业信息安全技术股份有限公司 | Paper medium carbonization method, system and storage medium |
CN117201834A (en) * | 2023-09-11 | 2023-12-08 | 南京天创电子技术有限公司 | Real-time double-spectrum fusion video stream display method and system based on target detection |
CN116916166B (en) * | 2023-09-12 | 2023-11-17 | 湖南湘银河传感科技有限公司 | Telemetry terminal based on AI image analysis |
CN116935232A (en) * | 2023-09-15 | 2023-10-24 | 青岛国测海遥信息技术有限公司 | Remote sensing image processing method and device for offshore wind power equipment, equipment and medium |
CN117671597A (en) * | 2023-12-25 | 2024-03-08 | 北京大学长沙计算与数字经济研究院 | Method for constructing mouse detection model and mouse detection method and device |
CN117523318B (en) * | 2023-12-26 | 2024-04-16 | 宁波微科光电股份有限公司 | Anti-light interference subway shielding door foreign matter detection method, device and medium |
CN117893895A (en) * | 2024-03-15 | 2024-04-16 | 山东省海洋资源与环境研究院(山东省海洋环境监测中心、山东省水产品质量检验中心) | Method, system, equipment and storage medium for identifying Portunus trituberculatus |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107247956A (en) * | 2016-10-09 | 2017-10-13 | 成都快眼科技有限公司 | Fast target detection method based on grid judgment |
CN107423760A (en) * | 2017-07-21 | 2017-12-01 | 西安电子科技大学 | Deep learning object detection method based on pre-segmentation and regression |
CN108154098A (en) * | 2017-12-20 | 2018-06-12 | 歌尔股份有限公司 | Target recognition method and device for a robot, and robot |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107527009B (en) * | 2017-07-11 | 2020-09-04 | 浙江汉凡软件科技有限公司 | Abandoned-object detection method based on YOLO target detection |
CN109117794A (en) * | 2018-08-16 | 2019-01-01 | 广东工业大学 | Moving target behavior tracking method, apparatus, device and readable storage medium |
CN109977943B (en) * | 2019-02-14 | 2024-05-07 | 平安科技(深圳)有限公司 | Image target recognition method, system and storage medium based on YOLO |
- 2019-02-14 CN CN201910114621.5A patent/CN109977943B/en active Active
- 2019-11-14 WO PCT/CN2019/118499 patent/WO2020164282A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN109977943A (en) | 2019-07-05 |
WO2020164282A1 (en) | 2020-08-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109977943B (en) | Image target recognition method, system and storage medium based on YOLO | |
CN109918969B (en) | Face detection method and device, computer device and computer readable storage medium | |
EP3493101B1 (en) | Image recognition method, terminal, and nonvolatile storage medium | |
CN108470172B (en) | Text information identification method and device | |
CN111079674B (en) | Target detection method based on global and local information fusion | |
CN111814902A (en) | Target detection model training method, target identification method, device and medium | |
WO2018052586A1 (en) | Method and system for multi-scale cell image segmentation using multiple parallel convolutional neural networks | |
CN110991311A (en) | Target detection method based on dense connection deep network | |
Wang et al. | Fast and robust object detection using asymmetric totally corrective boosting | |
CN111368636B (en) | Object classification method, device, computer equipment and storage medium | |
CN110766017B (en) | Mobile terminal text recognition method and system based on deep learning | |
CN109993221B (en) | Image classification method and device | |
CN109934216B (en) | Image processing method, device and computer readable storage medium | |
CN112508094A (en) | Junk picture identification method, device and equipment | |
CN110008899B (en) | Method for extracting and classifying candidate targets of visible light remote sensing image | |
CN111444976A (en) | Target detection method and device, electronic equipment and readable storage medium | |
CN111724342A (en) | Method for detecting thyroid nodule in ultrasonic image | |
CN115239644B (en) | Concrete defect identification method, device, computer equipment and storage medium | |
CN114187311A (en) | Image semantic segmentation method, device, equipment and storage medium | |
CN111461145A (en) | Method for detecting target based on convolutional neural network | |
CN111696080A (en) | Face fraud detection method, system and storage medium based on static texture | |
CN111414910B (en) | Small target enhancement detection method and device based on double convolution neural network | |
CN111597875A (en) | Traffic sign identification method, device, equipment and storage medium | |
CN116152226A (en) | Method for detecting defects of image on inner side of commutator based on fusible feature pyramid | |
CN112926595B (en) | Training device of deep learning neural network model, target detection system and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||