CN109977943B - Image target recognition method, system and storage medium based on YOLO - Google Patents

Image target recognition method, system and storage medium based on YOLO

Info

Publication number
CN109977943B
CN109977943B CN201910114621.5A CN201910114621A
Authority
CN
China
Prior art keywords
detection frame
classification
image
detection
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910114621.5A
Other languages
Chinese (zh)
Other versions
CN109977943A (en)
Inventor
赵峰
王健宗
肖京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910114621.5A priority Critical patent/CN109977943B/en
Publication of CN109977943A publication Critical patent/CN109977943A/en
Priority to PCT/CN2019/118499 priority patent/WO2020164282A1/en
Application granted granted Critical
Publication of CN109977943B publication Critical patent/CN109977943B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to artificial intelligence technology and provides a YOLO-based image target recognition method, system and storage medium, wherein the method comprises the following steps: receiving an image to be detected; adjusting the size of the image to be detected according to a preset requirement to generate a first detection image; sending the first detection image to a neural network model for matching identification, and generating a detection frame, classification identification information and a classification probability value corresponding to the classification identification information; judging whether the classification probability value is larger than a preset classification probability threshold value; and if it is larger, taking the detection frame and the classification identification information as the recognized classification result. The technical scheme can effectively improve detection precision and reduce detection time. Compared with detection methods in the prior art, the method of the invention improves recognition accuracy and increases operation speed.

Description

Image target recognition method, system and storage medium based on YOLO
Technical Field
The present invention relates to the field of computer learning and image recognition, and more particularly, to a YOLO-based image target recognition method, system and storage medium.
Background
With the high-speed development of artificial intelligence technology, deep learning is increasingly applied to the field of computer vision, especially image target detection.
In recent years, target detection algorithms have made great breakthroughs. The more popular algorithms can be divided into two classes. One class comprises two-stage algorithms based on region proposals, such as R-CNN and Fast R-CNN, which require a heuristic method (Selective Search) or a CNN network (RPN) to generate region proposals and then perform classification and regression on those proposals. The other class comprises one-stage algorithms such as YOLO (You Only Look Once) and SSD, which use a single CNN network to directly predict the classes and locations of different targets. The first class of methods is more accurate but slower; the second class is faster but less accurate. More and more target detection methods are implemented based on YOLO, and many deep networks are also improved based on YOLO. YOLO treats object detection as a regression problem, going from the raw image input to the output of object locations and classes with a single end-to-end network.
The key idea of YOLO is to use the whole image as the input of the network and directly regress the position of the bounding box and the category to which it belongs at the output layer. Building on YOLO's high-speed operation, how to design a method that improves YOLO's accuracy is a problem urgently awaiting a solution.
Disclosure of Invention
In order to solve at least one technical problem, the invention provides an image target identification method, an image target identification system and a storage medium based on YOLO.
In order to achieve the above object, the technical scheme of the present invention provides a YOLO-based image target recognition method, which includes:
Receiving an image to be detected;
The size of the image to be detected is adjusted according to a preset requirement, and a first detection image is generated;
the first detection image is sent to a neural network model for matching identification, and a detection frame, classification identification information and a classification probability value corresponding to the classification identification information are generated;
judging whether the classification probability value is larger than a preset classification probability threshold value or not;
And if the classification probability value is larger than the preset classification probability threshold value, using the detection frame and the classification identification information as the recognized classification result.
In this solution, before the receiving the image to be detected, the method further includes:
Training pictures to obtain a neural network model; the neural network model is trained by the following steps:
acquiring a training image dataset;
performing image preprocessing on the training image data set to obtain a preprocessed image set;
training the preprocessed image set to obtain the neural network model with the input interface and the output interface.
In this scheme, the step of generating the detection frame specifically includes:
generating an initial detection frame according to initial preset coordinate points;
Performing prediction of a dynamic detection frame, performing iterative prediction on the generated detection frame, and generating a latest detection frame;
calculating the coincidence ratio of the latest detection frame;
if the latest detection frame overlap ratio is greater than or equal to a preset overlap ratio threshold value, reserving the latest detection frame; if the latest detection frame overlap ratio is smaller than a preset overlap ratio threshold value, continuing to predict the dynamic detection frame;
And finally generating N similar detection frames of the same class.
In this scheme, the performing prediction of the dynamic detection frame, performing iterative prediction on the generated detection frame, and generating the latest detection frame specifically includes:
Predicting 4 coordinates (tx, ty, tw, th) for each detection frame; if the cell is offset from the upper left corner of the image by (cx, cy) and the prior detection frame has width pw and height ph, the coordinates of the latest detection frame are:
bx = σ(tx) + cx
by = σ(ty) + cy
bw = pw·e^(tw)
bh = ph·e^(th)
wherein bx, by, bw, bh are respectively the four coordinate point position values of the latest detection frame. The detection frame is a quadrilateral whose position can be determined from these 4 values.
Preferably, the neural network model adopts 53 layers of convolution operations, alternating between 3×3 and 1×1 convolution layers.
In this embodiment, the size is defined by the neural network model.
The technical scheme of the invention also provides a YOLO-based image target recognition system, which comprises: the image target recognition system comprises a memory, a processor and an image pickup device, wherein the memory comprises a YOLO-based image target recognition method program, and the image target recognition method program based on the YOLO realizes the following steps when being executed by the processor:
Receiving an image to be detected;
The size of the image to be detected is adjusted according to a preset requirement, and a first detection image is generated;
the first detection image is sent to a neural network model for matching identification, and a detection frame, classification identification information and a classification probability value corresponding to the classification identification information are generated;
judging whether the classification probability value is larger than a preset classification probability threshold value or not;
And if the classification probability value is larger than the preset classification probability threshold value, using the detection frame and the classification identification information as the recognized classification result.
In this scheme, the step of generating the detection frame specifically includes:
generating an initial detection frame according to initial preset coordinate points;
Performing prediction of a dynamic detection frame, performing iterative prediction on the generated detection frame, and generating a latest detection frame;
calculating the coincidence ratio of the latest detection frame;
if the latest detection frame overlap ratio is greater than or equal to a preset overlap ratio threshold value, reserving the latest detection frame; if the latest detection frame overlap ratio is smaller than a preset overlap ratio threshold value, continuing to predict the dynamic detection frame;
And finally generating N similar detection frames of the same class.
In this scheme, the performing prediction of the dynamic detection frame, performing iterative prediction on the generated detection frame, and generating the latest detection frame specifically includes:
Predicting 4 coordinates (tx, ty, tw, th) for each detection frame; if the cell is offset from the upper left corner of the image by (cx, cy) and the prior detection frame has width pw and height ph, the coordinates of the latest detection frame are:
bx = σ(tx) + cx
by = σ(ty) + cy
bw = pw·e^(tw)
bh = ph·e^(th)
wherein bx, by, bw, bh are respectively the four coordinate point position values of the latest detection frame. The detection frame is a quadrilateral whose position can be determined from these 4 values.
In this solution, before the receiving the image to be detected, the method further includes:
training pictures to obtain a neural network model; the neural network model is a model with an input interface and an output interface, which is obtained by performing image training on pictures of different categories.
Preferably, 53 layers of convolution operations are adopted in the neural network model, and the convolution operation of each layer is calculated alternately for 3×3 and 1×1 convolution layers.
In this embodiment, the size is defined by the neural network model.
The third aspect of the present invention also provides a computer-readable storage medium having embodied therein a YOLO-based image target recognition method program which, when executed by a processor, implements the steps of a YOLO-based image target recognition method as described above.
The invention provides a YOLO-based image target recognition method, system and storage medium. By judging the classification recognition probability and taking the recognition information as the result only when the preset classification probability threshold is reached, the method improves the accuracy of image recognition and the recognition experience. The invention can also adjust the position of the detection frame in real time, effectively improving detection efficiency and precision, and reduces detection time by optimizing the detection calculation. Experiments and verification show that the method of the invention outperforms prior-art detection methods: recognition accuracy is improved and operation speed is increased.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
FIG. 1 is a flowchart of a method for recognizing an image object based on YOLO according to the present invention;
FIG. 2 shows a schematic diagram of convolution operation in the classification process of the present invention;
FIG. 3 shows a block diagram of a YOLO-based image target recognition system of the present invention;
fig. 4 shows a schematic diagram of an embodiment of the invention.
Detailed Description
In order that the above-recited objects, features and advantages of the present application will be more clearly understood, a more particular description of the application will be rendered by reference to the appended drawings and appended detailed description. It should be noted that, without conflict, the embodiments of the present application and features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways than those described herein, and therefore the scope of the present invention is not limited to the specific embodiments disclosed below.
FIG. 1 is a flowchart of a method for recognizing an image object based on YOLO according to the present invention.
As shown in fig. 1, the technical scheme of the present invention provides a YOLO-based image target recognition method, which includes:
S102, receiving an image to be detected;
S104, adjusting the size of the image to be detected according to a preset requirement to generate a first detection image;
S106, sending the first detection image to a neural network model for matching identification, and generating a detection frame, classification identification information and a classification probability value corresponding to the classification identification information;
S108, judging whether the classification probability value is larger than a preset classification probability threshold value;
And S110, if the classification probability value is larger than the preset classification probability threshold value, using the detection frame and the classification identification information as the recognized classification result.
The adjusted size is the size specified by the neural network model. The model input size is generally chosen smaller than the image to be detected, which keeps the operation fast so that class identification can be carried out rapidly. It will be appreciated by those skilled in the art that the size may be set according to actual needs; it is not limited to the sizes mentioned above and does not limit the scope of the present invention.
The first detection image is sent to the neural network model, which generates a detection frame, classification identification information and a classification probability value corresponding to the classification identification information. A person skilled in the art can set the classification probability threshold according to actual needs. For example, with the threshold set to 90%, when detecting a picture containing a kitten, if the probability of identifying the kitten inside a detection frame exceeds 90%, the kitten is circled by the detection frame and has been recognized. If the classification probability value is smaller than the preset classification probability threshold, the process returns to step S106 for re-identification until the classification probability value is larger than the preset threshold. The neural network model performs multi-layer convolution operations on the image. The YOLO convolution operation is conventional in the field and belongs to the prior art, so it is not described in detail.
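For illustration only, the flow of steps S102-S110 can be sketched as follows in Python; the model object, its predict interface and the 0.9 example threshold are assumptions for this sketch, not the patented implementation:

    # Minimal sketch of steps S102-S110; `model` and its `predict` interface
    # are assumptions for illustration, not the patented implementation.
    import cv2  # assumption: OpenCV is used for the resize step

    CLS_PROB_THRESHOLD = 0.9  # example preset classification probability threshold

    def recognize(image, model, input_size=(416, 416)):
        # S104: adjust the image to the size specified by the neural network model
        first_detection_image = cv2.resize(image, input_size)
        # S106: matching identification -> (detection frame, class id, probability)
        detections = model.predict(first_detection_image)
        results = []
        for box, class_id, prob in detections:
            # S108/S110: keep the frame and class information only above the threshold
            if prob > CLS_PROB_THRESHOLD:
                results.append((box, class_id))
        return results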
In this solution, before receiving the image to be detected in the step S102, the method further includes:
Training pictures to obtain a neural network model; the neural network model is trained by the following steps:
acquiring a training image dataset;
performing image preprocessing on the training image data set to obtain a preprocessed image set;
training the preprocessed image set to obtain the neural network model with the input interface and the output interface.
It should be noted that the training image data set has 1,000 object categories and 1.2 million training images. The data set is preprocessed before training; the preprocessing comprises one or more of rotation, contrast enhancement, tilting and scaling. After preprocessing, the images carry a certain amount of distortion, and training on the distorted images increases the accuracy of the final image recognition.
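A minimal sketch of this preprocessing, assuming Pillow is used; the parameter values (rotation angle, contrast factor, shear, scale) are illustrative assumptions rather than values from the patent:

    # Sketch of the described preprocessing (rotation, contrast enhancement,
    # tilting, scaling) using Pillow; parameter values are illustrative
    # assumptions, not values taken from the patent.
    from PIL import Image, ImageEnhance

    def distort(img: Image.Image) -> Image.Image:
        img = img.rotate(10, expand=True)                 # rotation
        img = ImageEnhance.Contrast(img).enhance(1.3)     # contrast enhancement
        # tilting, approximated here by an affine shear
        img = img.transform(img.size, Image.Transform.AFFINE, (1, 0.1, 0, 0, 1, 0))
        img = img.resize((int(img.width * 0.9), int(img.height * 0.9)))  # scaling
        return img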
In this scheme, the step of generating the detection frame specifically includes:
generating an initial detection frame according to initial preset coordinate points;
Performing prediction of a dynamic detection frame, performing iterative prediction on the generated detection frame, and generating a latest detection frame;
calculating the coincidence ratio of the latest detection frame;
if the latest detection frame overlap ratio is greater than or equal to a preset overlap ratio threshold value, reserving the latest detection frame; if the latest detection frame overlap ratio is smaller than a preset overlap ratio threshold value, continuing to predict the dynamic detection frame;
And finally generating N similar detection frames of the same class.
It should be noted that, the initial preset coordinate point may be a coordinate point of a preset detection frame, which may be automatically generated during training and recognition detection, or may be generated by a person skilled in the art according to actual needs.
In this scheme, the performing prediction of the dynamic detection frame, performing iterative prediction on the generated detection frame, and generating the latest detection frame specifically includes:
Predicting 4 coordinates (tx, ty, tw, th) for each detection frame; if the cell is offset from the upper left corner of the image by (cx, cy) and the prior detection frame has width pw and height ph, the coordinates of the latest detection frame are:
bx = σ(tx) + cx
by = σ(ty) + cy
bw = pw·e^(tw)
bh = ph·e^(th)
wherein bx, by, bw, bh are respectively the four coordinate point position values of the latest detection frame. The detection frame is a quadrilateral whose position can be determined from these 4 values.
The network predicts 4 coordinates (tx, ty, tw, th) per detection frame. Given the offset (cx, cy) of the cell from the upper left corner of the image, the coordinates of the latest detection frame expressed by the above formulas can be derived.
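As a sketch of the coordinate decoding just described (the formulas follow the text; the function itself is illustrative):

    # Sketch of decoding (tx, ty, tw, th) into the latest detection frame
    # (bx, by, bw, bh); follows the formulas in the text, function name is
    # illustrative.
    import math

    def sigmoid(x: float) -> float:
        return 1.0 / (1.0 + math.exp(-x))

    def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
        bx = sigmoid(tx) + cx   # center x: sigmoid offset inside the cell
        by = sigmoid(ty) + cy   # center y: sigmoid offset inside the cell
        bw = pw * math.exp(tw)  # width scaled from the prior width
        bh = ph * math.exp(th)  # height scaled from the prior height
        return bx, by, bw, bh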
It should be noted that each detection frame uses multiple labels to classify the classes that the predicted bounding box may contain. In the process of category identification, the application uses a binary cross entropy loss for class prediction. The reason for using binary cross entropy loss is that the inventors found that the softmax technique is not required for good performance; independent logistic classifiers are used instead, so softmax is unnecessary at this step. The binary cross entropy loss is all the more helpful when the method of the application is migrated to more complex category identification areas. Binary cross entropy loss is a common technology in the field and a person skilled in the art can implement it as needed, so the application does not repeat it.
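A plain-Python sketch of binary cross entropy over independent logistic classifiers (multi-label class prediction with no softmax); the epsilon clamp and names are illustrative assumptions:

    # Plain-Python sketch of binary cross entropy over independent logistic
    # classifiers (multi-label class prediction, no softmax); the epsilon
    # clamp and names are illustrative assumptions.
    import math

    def bce_class_loss(logits, targets):
        """logits: raw per-class scores; targets: 0/1 multi-label vector."""
        eps, loss = 1e-7, 0.0
        for z, t in zip(logits, targets):
            p = 1.0 / (1.0 + math.exp(-z))   # independent sigmoid per class
            p = min(max(p, eps), 1.0 - eps)  # clamp for numerical safety
            loss -= t * math.log(p) + (1 - t) * math.log(1 - p)
        return loss / len(logits)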
Preferably, the neural network model adopts 53 layers of convolution operations, alternating between 3×3 and 1×1 convolution layers. In limited practical tests, the inventors found that alternating the convolution layers in this way increases accuracy and effectively improves operation speed. Specifically, a 3×3 convolution operation is applied first, then a 1×1 convolution operation, alternating in sequence until all convolution layers have participated in the operation.
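A PyTorch sketch of this alternating pattern; the channel width, activation choice and number of pairs are illustrative assumptions, not the patented 53-layer configuration:

    # PyTorch sketch of alternating 3x3 and 1x1 convolution layers; channel
    # width, activation and pair count are illustrative assumptions.
    import torch.nn as nn

    def alternating_convs(channels: int, pairs: int) -> nn.Sequential:
        layers = []
        for _ in range(pairs):
            layers += [
                nn.Conv2d(channels, channels, kernel_size=3, padding=1),  # 3x3 first
                nn.LeakyReLU(0.1),
                nn.Conv2d(channels, channels, kernel_size=1),             # then 1x1
                nn.LeakyReLU(0.1),
            ]
        return nn.Sequential(*layers)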
In this embodiment, the size is defined by the neural network model.
In order to better explain the technical scheme of the present invention, the following describes the technical scheme of the present invention in detail.
After the detection frames are generated, the classification probability values in the detection frames are calculated, and the optimal N detection frames of the same class are screened out. It should be noted that the size of the detection frame is dynamically predicted; the dynamic prediction process is the scheme described above. The M classification probability values of all detection frames are screened with a probability threshold, according to the following screening rules:
Calculate the classification probability values of each detection frame, arrange them in descending order, and select the highest-ranking classification. This first screening round compares the M categories of each detection frame and selects the "champion" category with the highest probability value.
Compare the highest-ranking classification with a preset probability threshold; if it is greater than or equal to the preset probability threshold, keep the detection frame, and if it is smaller, delete the detection frame. In this second screening round, the champion classification is compared with the probability threshold, and only detection frames whose value exceeds the threshold qualify for the final round. For example, the probability threshold may be set to 0.24 (24%). After the comparison, the detection frames that pass the preliminary screening, i.e. those whose classification probability value is greater than or equal to the 0.24 threshold, are displayed on the picture. Those skilled in the art may set the probability threshold according to actual needs; the probability threshold described in the present application does not limit the protection scope of the present application.
The overlap degrees of the N same-class detection frames are calculated, and the best detection frame is retained.
For example, after the screening steps described above, three detection frames all have the classification "horse".
Sort the detection probabilities of the three detection frames in descending order.
Perform pairwise overlap calculation (IoU), and if the calculated IoU value is greater than 0.3, eliminate the detection frame with the lower probability.
The result is a unique detection frame classified as "horse".
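Putting the two screening rounds and the overlap elimination together, a sketch might look as follows; the 0.24 and 0.3 thresholds come from the text above, while the data layout and names are illustrative assumptions:

    # Sketch of the screening rules plus overlap (IoU) elimination described
    # above; data layout and names are illustrative assumptions.
    def iou(a, b):
        """a, b: detection frames as (x1, y1, x2, y2)."""
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        union = ((a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter)
        return inter / union if union > 0 else 0.0

    def screen(frames, prob_threshold=0.24, iou_threshold=0.3):
        """frames: list of (box, {class_name: probability}) pairs."""
        # Rounds 1 and 2: champion class per frame, kept only above the threshold.
        kept = []
        for box, class_probs in frames:
            cls, p = max(class_probs.items(), key=lambda kv: kv[1])
            if p >= prob_threshold:
                kept.append((box, cls, p))
        # Overlap elimination: sort by probability, drop a frame whose IoU with
        # an already-kept same-class frame exceeds the threshold.
        kept.sort(key=lambda t: -t[2])
        result = []
        for box, cls, p in kept:
            if all(c != cls or iou(box, b) <= iou_threshold for b, c, _ in result):
                result.append((box, cls, p))
        return result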
FIG. 2 shows a schematic diagram of convolution operation in the classification process of the present invention.
As shown in fig. 2, the neural network model adopts 53 layers of convolution operations, alternating between 3×3 and 1×1 convolution layers.
This feature extraction scheme achieves the highest measured floating-point operations per second, which means the neural network structure makes better use of the machine's GPU, making evaluation more efficient and therefore faster. Compared with ResNets, which have too many layers and are inefficient, the convolution operations of the present application are more efficient and accurate.
For example, each neural network is trained with identical settings and tested at 256×256 single-crop accuracy. The classifier using the feature extraction of the present application performs on par with state-of-the-art classifiers in the prior art, but with fewer floating-point operations and more speed.
FIG. 3 shows a block diagram of a YOLO-based image object recognition system of the present invention.
As shown in fig. 3, the technical solution of the present invention further provides a YOLO-based image target recognition system 2, which includes: a memory 201, a processor 202 and an image capturing device 203. The memory 201 includes a YOLO-based image target recognition method program which, when executed by the processor, implements the following steps:
Receiving an image to be detected;
The size of the image to be detected is adjusted according to a preset requirement, and a first detection image is generated;
the first detection image is sent to a neural network model for matching identification, and a detection frame, classification identification information and a classification probability value corresponding to the classification identification information are generated;
judging whether the classification probability value is larger than a preset classification probability threshold value or not;
And if the classification probability value is larger than the preset classification probability threshold value, using the detection frame and the classification identification information as the recognized classification result.
The adjusted size is the size specified by the neural network model. The model input size is generally chosen smaller than the image to be detected, which keeps the operation fast so that class identification can be carried out rapidly. It will be appreciated by those skilled in the art that the size may be set according to actual needs; it is not limited to the sizes mentioned above and does not limit the scope of the present invention.
The first detection image is sent to the neural network model, which generates a detection frame, classification identification information and a classification probability value corresponding to the classification identification information. A person skilled in the art can set the classification probability threshold according to actual needs. For example, with the threshold set to 90%, when detecting a picture containing a kitten, if the probability of identifying the kitten inside a detection frame exceeds 90%, the kitten is circled by the detection frame and has been recognized. If the classification probability value is smaller than the preset classification probability threshold, the matching identification step is performed again until the classification probability value is larger than the preset threshold. The neural network model performs multi-layer convolution operations on the image. The YOLO convolution operation is conventional in the field and belongs to the prior art, so it is not described in detail.
In this solution, before the receiving the image to be detected, the method further includes:
Training pictures to obtain a neural network model; the neural network model is trained by the following steps:
acquiring a training image dataset;
performing image preprocessing on the training image data set to obtain a preprocessed image set;
training the preprocessed image set to obtain the neural network model with the input interface and the output interface.
It should be noted that the training image data set has 1,000 object categories and 1.2 million training images. The data set is preprocessed before training; the preprocessing comprises one or more of rotation, contrast enhancement, tilting and scaling. After preprocessing, the images carry a certain amount of distortion, and training on the distorted images increases the accuracy of the final image recognition.
In this scheme, the step of generating the detection frame specifically includes:
generating an initial detection frame according to initial preset coordinate points;
Performing prediction of a dynamic detection frame, performing iterative prediction on the generated detection frame, and generating a latest detection frame;
calculating the coincidence ratio of the latest detection frame;
if the latest detection frame overlap ratio is greater than or equal to a preset overlap ratio threshold value, reserving the latest detection frame; if the latest detection frame overlap ratio is smaller than a preset overlap ratio threshold value, continuing to predict the dynamic detection frame;
And finally generating N similar detection frames of the same class.
It should be noted that, the initial preset coordinate point may be a coordinate point of a preset detection frame, which may be automatically generated during training and recognition detection, or may be generated by a person skilled in the art according to actual needs.
In this scheme, the performing prediction of the dynamic detection frame, performing iterative prediction on the generated detection frame, and generating the latest detection frame specifically includes:
Predicting 4 coordinates (tx, ty, tw, th) for each detection frame; if the cell is offset from the upper left corner of the image by (cx, cy) and the prior detection frame has width pw and height ph, the coordinates of the latest detection frame are:
bx = σ(tx) + cx
by = σ(ty) + cy
bw = pw·e^(tw)
bh = ph·e^(th)
In the invention, dimension clusters can be used as anchor boxes to dynamically predict the detection frames, which are also bounding boxes. The network predicts 4 coordinates (tx, ty, tw, th) per detection frame. Given the offset (cx, cy) of the cell from the upper left corner of the image, the coordinates of the latest detection frame expressed by the above formulas can be derived, where bx, by, bw, bh are respectively the four coordinate point position values of the latest detection frame. The detection frame is a quadrilateral whose position can be determined from these 4 values.
It should be noted that each detection frame uses multiple labels to classify the classes that the predicted bounding box may contain. In the process of category identification, the application uses a binary cross entropy loss for class prediction. The reason for using binary cross entropy loss is that the inventors found that the softmax technique is not required for good performance; independent logistic classifiers are used instead, so softmax is unnecessary at this step. The binary cross entropy loss is all the more helpful when the method of the application is migrated to more complex category identification areas. Binary cross entropy loss is a common technology in the field and a person skilled in the art can implement it as needed, so the application does not repeat it.
In this solution, before the receiving the image to be detected, the method further includes:
training pictures to obtain a neural network model; the neural network model is a model with an input interface and an output interface, which is obtained by performing image training on pictures of different categories.
Preferably, the neural network model adopts 53 layers of convolution operations, alternating between 3×3 and 1×1 convolution layers. In limited practical tests, the inventors found that alternating the convolution layers in this way increases accuracy and effectively improves operation speed. Specifically, a 3×3 convolution operation is applied first, then a 1×1 convolution operation, alternating in sequence until all convolution layers have participated in the operation.
In this embodiment, the size is defined by the neural network model.
It should be noted that the neural network model adopts 53 layers of convolution operations, alternating between 3×3 and 1×1 convolution layers.
This feature extraction scheme achieves the highest measured floating-point operations per second, which means the neural network structure makes better use of the machine's GPU, making evaluation more efficient and therefore faster. Compared with ResNets, which have too many layers and are inefficient, the convolution operations of the present application are more efficient and accurate.
For example, each neural network is trained with identical settings and tested at 256×256 single-crop accuracy. The classifier using the feature extraction of the present application performs on par with state-of-the-art classifiers in the prior art, but with fewer floating-point operations and more speed.
The classification probability values in the detection frames are calculated, and the optimal N detection frames of the same class are screened out. The M classification probability values of all detection frames are screened with a probability threshold, according to the following screening rules:
Calculate the classification probability values of each detection frame, arrange them in descending order, and select the highest-ranking classification. This first screening round compares the M categories of each detection frame and selects the "champion" category with the highest probability value.
Compare the highest-ranking classification with a preset probability threshold; if it is greater than or equal to the preset probability threshold, keep the detection frame, and if it is smaller, delete the detection frame. In this second screening round, the champion classification is compared with the probability threshold, and only detection frames whose value exceeds the threshold qualify for the final round. For example, the probability threshold may be set to 0.24 (24%). After the comparison, the detection frames that pass the preliminary screening are displayed on the picture; a detection frame is displayed as long as its classification probability value is greater than or equal to the 0.24 threshold.
The overlap degrees of the N same-class detection frames are calculated, and the best detection frame is retained.
For example, after the screening steps described above, three detection frames all have the classification "horse".
Sort the detection probabilities of the three detection frames in descending order.
Perform pairwise overlap calculation (IoU), and if the calculated IoU value is greater than 0.3, eliminate the detection frame with the lower probability.
The result is a unique detection frame classified as "horse".
In order to better explain the technical scheme of the invention, the following detailed description is given by an embodiment. Fig. 4 shows a schematic diagram of an embodiment of the invention.
As shown in fig. 4, the number of convolution layers in the neural network model is set to layers 0-52. The model then receives the resized first detection image, whose size is 416×416; the specific size can be set according to actual operation requirements and computing capability, and this embodiment uses 416×416 for explanation. The image is a color photo. Layer 0 of the neural network model receives the first detection image of size 416×416 with 3 channels (RGB) and performs a convolution operation.
After the convolution operations of layers 0-51, a feature map of size 13×13 with 425 channels is obtained.
Layer 52 performs a convolution operation on the feature map and finally outputs a one-dimensional prediction array comprising 13 × 13 × 5 × 85 values. The multi-dimensional array or matrix is reduced to a one-dimensional array by a series of operations; this one-dimensional array is the prediction array.
Among the 13 × 13 × 5 × 85 values, 13 × 13 represents the width × height of the feature map; there are 13 × 13 feature cells in total. YOLO equally divides the original picture (416×416) into 13×13 regions (cells), with one picture region for each feature cell. The specific size may be set by those skilled in the art according to actual operation requirements and computing capability.
The number 5 represents 5 detection frames (bounding boxes) of different shapes: YOLO generates 5 differently shaped detection frames in each picture region and uses the center point of the region as the center point of the detection frames to detect objects, so YOLO uses 13 × 13 × 5 detection frames to detect one picture or image.
The number 85 can be split into 3 parts: 85 = 4 + 1 + 80.
4: each detection frame contains 4 coordinate values (x, y, width, height).
1: each detection frame has 1 object confidence value (0-1), understood as the confidence probability that an object has been detected.
80: each detection frame has 80 classification probability values (0-1), understood as the probabilities that the object in the detection frame belongs to each respective classification.
To summarize the above procedure: a 416×416 picture is divided evenly into 13×13 picture regions, each picture region generates 5 detection frames, and each detection frame contains 85 values (4 coordinate values + 1 object confidence value + 80 classification probability values). The finally obtained one-dimensional prediction array (predictions) represents the objects detected in the picture and contains 13 × 13 × 5 × 85 values, predictions[0] to predictions[13 × 13 × 5 × 85 − 1].
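A sketch of how one detection frame's 85 values could be read out of the flattened prediction array; the row-major layout order is an assumption for illustration:

    # Sketch of indexing the flattened 13 x 13 x 5 x 85 prediction array
    # described above; the row-major layout order is an assumption.
    GRID, BOXES, VALUES = 13, 5, 85  # 85 = 4 coords + 1 confidence + 80 classes

    def box_values(predictions, row, col, box):
        """Return the 85 values of one detection frame from the 1-D array."""
        start = ((row * GRID + col) * BOXES + box) * VALUES
        vals = predictions[start:start + VALUES]
        coords, confidence, class_probs = vals[:4], vals[4], vals[5:]
        return coords, confidence, class_probs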
The invention provides a YOLO-based image target recognition method, system and storage medium. The method can effectively improve detection precision and reduce detection time. Experiments and verification show that the method of the invention outperforms prior-art detection methods: recognition accuracy is improved and operation speed is increased.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The device embodiments described above are only illustrative; for example, the division of the units is only a logical function division, and there may be other divisions in practice, such as: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the components shown or discussed may be coupled, directly coupled, or communicatively connected to each other through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; can be located in one place or distributed to a plurality of network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present invention may be integrated in one processing unit, or each unit may be separately used as one unit, or two or more units may be integrated in one unit; the integrated units may be implemented in hardware or in hardware plus software functional units.
Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be completed by hardware under the direction of program instructions; the foregoing program may be stored in a computer-readable storage medium, and when executed, the program performs the steps of the above method embodiments. The aforementioned storage medium includes media that can store program code, such as a removable storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Or the above-described integrated units of the invention may be stored in a computer-readable storage medium if implemented in the form of software functional modules and sold or used as separate products. Based on such understanding, the technical solutions of the embodiments of the present invention may be embodied in essence or a part contributing to the prior art in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, ROM, RAM, magnetic or optical disk, or other medium capable of storing program code.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (6)

1. A YOLO-based image target recognition method, comprising:
Receiving an image to be detected;
The size of the image to be detected is adjusted according to a preset requirement, and a first detection image is generated;
the first detection image is sent to a neural network model for matching identification, and a detection frame, classification identification information and a classification probability value corresponding to the classification identification information are generated;
Arranging the classification probability values in each detection frame from large to small, and selecting the classification with highest ranking; then comparing the classification probability value of the highest-ranking classification with a preset probability threshold value, and judging whether the classification probability value of the highest-ranking classification is larger than the preset probability threshold value;
If the classification probability value is smaller than the preset probability threshold value, deleting the detection frame; if it is larger than the preset probability threshold value, using the detection frame and the classification identification information as the recognized classification result;
the step of generating the detection frame specifically comprises the following steps:
generating an initial detection frame according to initial preset coordinate points;
Performing prediction of a dynamic detection frame, performing iterative prediction on the generated detection frame, and generating a latest detection frame;
calculating the coincidence ratio of the latest detection frame;
if the latest detection frame overlap ratio is greater than or equal to a preset overlap ratio threshold value, reserving the latest detection frame; if the latest detection frame overlap ratio is smaller than a preset overlap ratio threshold value, continuing to predict the dynamic detection frame;
finally generating N similar detection frames of the same class;
the step of carrying out the prediction of the dynamic detection frame, carrying out iterative prediction on the generated detection frame, and generating the latest detection frame specifically comprises the following steps:
predicting 4 coordinates (tx, ty, tw, th) of each detection frame; if the cell is offset from the upper left corner of the image by (cx, cy) and the prior detection frame has width pw and height ph, the coordinates of the latest detection frame are:
bx = σ(tx) + cx
by = σ(ty) + cy
bw = pw·e^(tw)
bh = ph·e^(th)
wherein bx, by, bw, bh are respectively the four coordinate point position values of the latest detection frame.
2. The YOLO-based image object recognition method according to claim 1, further comprising, before the receiving the image to be detected:
Training pictures to obtain a neural network model; the neural network model is trained by the following steps:
acquiring a training image dataset;
performing image preprocessing on the training image data set to obtain a preprocessed image set;
training the preprocessed image set to obtain the neural network model with the input interface and the output interface.
3. The YOLO-based image object recognition method of claim 2, wherein the neural network model adopts 53 layers of convolution operations, alternating between 3×3 and 1×1 convolution layers.
4. The YOLO-based image object recognition method of claim 1, wherein the size is a size specified by a neural network model.
5. A YOLO-based image target recognition system, the system comprising: the image target recognition system comprises a memory, a processor and an image pickup device, wherein the memory comprises a YOLO-based image target recognition method program, and the image target recognition method program based on the YOLO realizes the following steps when being executed by the processor:
Receiving an image to be detected;
The size of the image to be detected is adjusted according to a preset requirement, and a first detection image is generated;
the first detection image is sent to a neural network model for matching identification, and a detection frame, classification identification information and a classification probability value corresponding to the classification identification information are generated;
Arranging the classification probability values in each detection frame from large to small, and selecting the classification with highest ranking; then comparing the classification probability value of the highest-ranking classification with a preset probability threshold value, and judging whether the classification probability value of the highest-ranking classification is larger than the preset probability threshold value;
If the classification probability value is smaller than the preset probability threshold value, deleting the detection frame; if it is larger than the preset probability threshold value, using the detection frame and the classification identification information as the recognized classification result;
the step of generating the detection frame specifically comprises the following steps:
generating an initial detection frame according to initial preset coordinate points;
Performing prediction of a dynamic detection frame, performing iterative prediction on the generated detection frame, and generating a latest detection frame;
calculating the coincidence ratio of the latest detection frame;
if the latest detection frame overlap ratio is greater than or equal to a preset overlap ratio threshold value, reserving the latest detection frame; if the latest detection frame overlap ratio is smaller than a preset overlap ratio threshold value, continuing to predict the dynamic detection frame;
finally generating N similar detection frames of the same class;
the step of carrying out the prediction of the dynamic detection frame, carrying out iterative prediction on the generated detection frame, and generating the latest detection frame specifically comprises the following steps: predicting 4 coordinates (tx, ty, tw, th) of each detection frame; if the cell is offset from the upper left corner of the image by (cx, cy) and the prior detection frame has width pw and height ph, the coordinates of the latest detection frame are:
bx = σ(tx) + cx
by = σ(ty) + cy
bw = pw·e^(tw)
bh = ph·e^(th)
wherein bx, by, bw, bh are respectively the four coordinate point position values of the latest detection frame.
6. A computer-readable storage medium, characterized in that a YOLO-based image object recognition method program is included in the computer-readable storage medium, which, when executed by a processor, implements the steps of a YOLO-based image object recognition method according to any one of claims 1 to 4.
CN201910114621.5A 2019-02-14 2019-02-14 Image target recognition method, system and storage medium based on YOLO Active CN109977943B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910114621.5A CN109977943B (en) 2019-02-14 2019-02-14 Image target recognition method, system and storage medium based on YOLO
PCT/CN2019/118499 WO2020164282A1 (en) 2019-02-14 2019-11-14 Yolo-based image target recognition method and apparatus, electronic device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910114621.5A CN109977943B (en) 2019-02-14 2019-02-14 Image target recognition method, system and storage medium based on YOLO

Publications (2)

Publication Number Publication Date
CN109977943A CN109977943A (en) 2019-07-05
CN109977943B true CN109977943B (en) 2024-05-07

Family

ID=67076997

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910114621.5A Active CN109977943B (en) 2019-02-14 2019-02-14 Image target recognition method, system and storage medium based on YOLO

Country Status (2)

Country Link
CN (1) CN109977943B (en)
WO (1) WO2020164282A1 (en)

Families Citing this family (130)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109977943B (en) * 2019-02-14 2024-05-07 平安科技(深圳)有限公司 Image target recognition method, system and storage medium based on YOLO
CN110348304A (en) * 2019-06-06 2019-10-18 武汉理工大学 A kind of maritime affairs distress personnel search system being equipped on unmanned plane and target identification method
CN110738125B (en) * 2019-09-19 2023-08-01 平安科技(深圳)有限公司 Method, device and storage medium for selecting detection frame by Mask R-CNN
CN111223343B (en) * 2020-03-07 2022-01-28 上海中科教育装备集团有限公司 Artificial intelligence scoring experimental equipment and scoring method for lever balance experiment
CN111582021A (en) * 2020-03-26 2020-08-25 平安科技(深圳)有限公司 Method and device for detecting text in scene image and computer equipment
CN111695559B (en) * 2020-04-28 2023-07-18 深圳市跨越新科技有限公司 YoloV3 model-based waybill picture information coding method and system
CN113705591A (en) * 2020-05-20 2021-11-26 上海微创卜算子医疗科技有限公司 Readable storage medium, and support specification identification method and device
CN111626256B (en) * 2020-06-03 2023-06-27 兰波(苏州)智能科技有限公司 High-precision diatom detection and identification method and system based on scanning electron microscope image
CN111738259A (en) * 2020-06-29 2020-10-02 广东电网有限责任公司 Tower state detection method and device
CN111523621B (en) * 2020-07-03 2020-10-20 腾讯科技(深圳)有限公司 Image recognition method and device, computer equipment and storage medium
CN111857350A (en) * 2020-07-28 2020-10-30 海尔优家智能科技(北京)有限公司 Method, device and equipment for rotating display equipment
CN112101134B (en) * 2020-08-24 2024-01-02 深圳市商汤科技有限公司 Object detection method and device, electronic equipment and storage medium
CN112036286A (en) * 2020-08-25 2020-12-04 北京华正明天信息技术股份有限公司 Method for achieving temperature sensing and intelligently analyzing and identifying flame based on yoloV3 algorithm
CN111986255B (en) * 2020-09-07 2024-04-09 凌云光技术股份有限公司 Multi-scale anchor initializing method and device of image detection model
CN112132018A (en) * 2020-09-22 2020-12-25 平安国际智慧城市科技股份有限公司 Traffic police recognition method, traffic police recognition device, traffic police recognition medium and electronic equipment
CN112116582A (en) * 2020-09-24 2020-12-22 深圳爱莫科技有限公司 Cigarette detection and identification method under stock or display scene
CN112036507B (en) * 2020-09-25 2023-11-14 北京小米松果电子有限公司 Training method and device of image recognition model, storage medium and electronic equipment
CN112149748B (en) * 2020-09-28 2024-05-21 商汤集团有限公司 Image classification method and device, electronic equipment and storage medium
CN112183358B (en) * 2020-09-29 2024-04-23 新石器慧通(北京)科技有限公司 Training method and device for target detection model
CN112132088B (en) * 2020-09-29 2024-01-12 动联(山东)电子科技有限公司 Inspection point missing inspection identification method
CN112200186B (en) * 2020-10-15 2024-03-15 上海海事大学 Vehicle logo identification method based on improved YOLO_V3 model
CN112231497B (en) * 2020-10-19 2024-04-09 腾讯科技(深圳)有限公司 Information classification method and device, storage medium and electronic equipment
CN112348778B (en) * 2020-10-21 2023-10-27 深圳市优必选科技股份有限公司 Object identification method, device, terminal equipment and storage medium
CN112288003B (en) * 2020-10-28 2023-07-25 北京奇艺世纪科技有限公司 Neural network training and target detection method and device
CN112381773B (en) * 2020-11-05 2023-04-18 东风柳州汽车有限公司 Key cross section data analysis method, device, equipment and storage medium
CN112365465B (en) * 2020-11-09 2024-02-06 浙江大华技术股份有限公司 Synthetic image category determining method and device, storage medium and electronic device
CN112287884B (en) * 2020-11-19 2024-02-20 长江大学 Examination abnormal behavior detection method and device and computer readable storage medium
CN112348112B (en) * 2020-11-24 2023-12-15 深圳市优必选科技股份有限公司 Training method and training device for image recognition model and terminal equipment
CN112364807B (en) * 2020-11-24 2023-12-15 深圳市优必选科技股份有限公司 Image recognition method, device, terminal equipment and computer readable storage medium
CN112560586B (en) * 2020-11-27 2024-05-10 国家电网有限公司大数据中心 Method and device for obtaining structural data of pole and tower signboard and electronic equipment
CN112634202A (en) * 2020-12-04 2021-04-09 浙江省农业科学院 Method, device and system for detecting behavior of polyculture fish shoal based on YOLOv3-Lite
CN112508915A (en) * 2020-12-11 2021-03-16 中信银行股份有限公司 Target detection result optimization method and system
CN112215308B (en) * 2020-12-13 2021-03-30 之江实验室 Single-order detection method and device for hoisted object, electronic equipment and storage medium
CN112507896B (en) * 2020-12-14 2023-11-07 大连大学 Method for detecting cherry fruits by adopting improved YOLO-V4 model
CN113723157B (en) * 2020-12-15 2024-02-09 京东科技控股股份有限公司 Crop disease identification method and device, electronic equipment and storage medium
CN112613097A (en) * 2020-12-15 2021-04-06 中铁二十四局集团江苏工程有限公司 BIM rapid modeling method based on computer vision
CN112507912A (en) * 2020-12-15 2021-03-16 网易(杭州)网络有限公司 Method and device for identifying illegal picture
CN112633352B (en) * 2020-12-18 2023-08-29 浙江大华技术股份有限公司 Target detection method and device, electronic equipment and storage medium
CN112634327A (en) * 2020-12-21 2021-04-09 合肥讯图信息科技有限公司 Tracking method based on YOLOv4 model
CN112633159B (en) * 2020-12-22 2024-04-12 北京迈格威科技有限公司 Human-object interaction relation identification method, model training method and corresponding device
CN112580523A (en) * 2020-12-22 2021-03-30 平安国际智慧城市科技股份有限公司 Behavior recognition method, behavior recognition device, behavior recognition equipment and storage medium
CN112699925A (en) * 2020-12-23 2021-04-23 国网安徽省电力有限公司检修分公司 Transformer substation meter image classification method
CN112633286B (en) * 2020-12-25 2022-09-09 北京航星机器制造有限公司 Intelligent security inspection system based on similarity rate and recognition probability of dangerous goods
CN112541483B (en) * 2020-12-25 2024-05-17 深圳市富浩鹏电子有限公司 Dense face detection method combining YOLO and blocking-fusion strategy
CN112580734B (en) * 2020-12-25 2023-12-29 深圳市优必选科技股份有限公司 Target detection model training method, system, terminal equipment and storage medium
CN112597915B (en) * 2020-12-26 2024-04-09 上海有个机器人有限公司 Method, device, medium and robot for identifying indoor close-distance pedestrians
CN112613570A (en) * 2020-12-29 2021-04-06 深圳云天励飞技术股份有限公司 Image detection method, image detection device, equipment and storage medium
CN112784694A (en) * 2020-12-31 2021-05-11 杭州电子科技大学 EVP-YOLO-based indoor article detection method
CN112560799B (en) * 2021-01-05 2022-08-05 北京航空航天大学 Intelligent vehicle target detection method for unmanned aerial vehicles based on adaptive target-area search and game theory, and application thereof
CN112733741A (en) * 2021-01-14 2021-04-30 苏州挚途科技有限公司 Traffic signboard identification method and device and electronic equipment
CN112818980A (en) * 2021-01-15 2021-05-18 湖南千盟物联信息技术有限公司 Steel ladle number detection and identification method based on Yolov3 algorithm
CN112766170B (en) * 2021-01-21 2024-04-16 广西财经学院 Self-adaptive segmentation detection method and device based on cluster unmanned aerial vehicle image
CN112906478B (en) * 2021-01-22 2024-01-09 北京百度网讯科技有限公司 Target object identification method, device, equipment and storage medium
CN112906495B (en) * 2021-01-27 2024-04-30 深圳安智杰科技有限公司 Target detection method and device, electronic equipment and storage medium
CN112800971A (en) * 2021-01-29 2021-05-14 深圳市商汤科技有限公司 Neural network training and point cloud data processing method, device, equipment and medium
CN114821288A (en) * 2021-01-29 2022-07-29 中强光电股份有限公司 Image identification method and unmanned aerial vehicle system
CN112911171B (en) * 2021-02-04 2022-04-22 上海航天控制技术研究所 Intelligent photoelectric information processing system and method based on accelerated processing
CN112861711A (en) * 2021-02-05 2021-05-28 深圳市安软科技股份有限公司 Regional intrusion detection method and device, electronic equipment and storage medium
CN112861716A (en) * 2021-02-05 2021-05-28 深圳市安软科技股份有限公司 Illegal article placement monitoring method, system, equipment and storage medium
CN112906794A (en) * 2021-02-22 2021-06-04 珠海格力电器股份有限公司 Target detection method, device, storage medium and terminal
CN113095133B (en) * 2021-03-04 2023-12-29 北京迈格威科技有限公司 Model training method, target detection method and corresponding devices
CN112906621A (en) * 2021-03-10 2021-06-04 北京华捷艾米科技有限公司 Hand detection method, device, storage medium and equipment
CN112966618B (en) * 2021-03-11 2024-02-09 京东科技信息技术有限公司 Dressing recognition method, apparatus, device and computer readable medium
CN113011319B (en) * 2021-03-16 2024-04-16 上海应用技术大学 Multi-scale fire target identification method and system
CN112966762B (en) * 2021-03-16 2023-12-26 南京恩博科技有限公司 Wild animal detection method and device, storage medium and electronic equipment
CN112991304A (en) * 2021-03-23 2021-06-18 武汉大学 Molten pool sputtering detection method based on laser directional energy deposition monitoring system
CN113033398B (en) * 2021-03-25 2022-02-11 深圳市康冠商用科技有限公司 Gesture recognition method and device, computer equipment and storage medium
CN112965604A (en) * 2021-03-29 2021-06-15 深圳市优必选科技股份有限公司 Gesture recognition method and device, terminal equipment and computer readable storage medium
CN112990334A (en) * 2021-03-29 2021-06-18 西安电子科技大学 Small sample SAR image target identification method based on improved prototype network
CN113222889B (en) * 2021-03-30 2024-03-12 大连智慧渔业科技有限公司 Counting method and device for industrial aquaculture under high-resolution images
CN113052127A (en) * 2021-04-09 2021-06-29 上海云从企业发展有限公司 Behavior detection method, behavior detection system, computer equipment and machine readable medium
CN113139597B (en) * 2021-04-19 2022-11-04 中国人民解放军91054部队 Statistics-based out-of-distribution image detection method
CN113158922A (en) * 2021-04-26 2021-07-23 平安科技(深圳)有限公司 Traffic flow statistical method, device and equipment based on YOLO neural network
CN113128522B (en) * 2021-05-11 2024-04-05 四川云从天府人工智能科技有限公司 Target identification method, device, computer equipment and storage medium
CN113240638B (en) * 2021-05-12 2023-11-10 上海联影智能医疗科技有限公司 Target detection method, device and medium based on deep learning
CN113205067B (en) * 2021-05-26 2024-04-09 北京京东乾石科技有限公司 Method and device for monitoring operators, electronic equipment and storage medium
WO2022252089A1 (en) * 2021-05-31 2022-12-08 京东方科技集团股份有限公司 Training method for object detection model, and object detection method and device
CN113435260A (en) * 2021-06-07 2021-09-24 上海商汤智能科技有限公司 Image detection method, related training method, related device, equipment and medium
CN113392833A (en) * 2021-06-10 2021-09-14 沈阳派得林科技有限责任公司 Method for identifying type number of industrial radiographic negative image
CN113269188B (en) * 2021-06-17 2023-03-14 华南农业大学 Mark point and pixel coordinate detection method thereof
CN113486746A (en) * 2021-06-25 2021-10-08 海南电网有限责任公司三亚供电局 Power cable external damage prevention method based on biological induction and video monitoring
CN113536963B (en) * 2021-06-25 2023-08-15 西安电子科技大学 SAR image airplane target detection method based on lightweight YOLO network
CN113377888B (en) * 2021-06-25 2024-04-02 北京百度网讯科技有限公司 Method for training object detection model and detection object
CN113591566A (en) * 2021-06-28 2021-11-02 北京百度网讯科技有限公司 Training method and device of image recognition model, electronic equipment and storage medium
CN113553948A (en) * 2021-07-23 2021-10-26 中远海运科技(北京)有限公司 Automatic recognition and counting method for tobacco insects and computer readable medium
CN113486857B (en) * 2021-08-03 2023-05-12 云南大学 YOLOv4-based climbing safety detection method and system
CN113723217A (en) * 2021-08-09 2021-11-30 南京邮电大学 Intelligent object detection method and system based on improved YOLO
CN113705643B (en) * 2021-08-17 2022-10-28 荣耀终端有限公司 Target detection method and device and electronic equipment
CN113657280A (en) * 2021-08-18 2021-11-16 广东电网有限责任公司 Power transmission line target defect detection warning method and system
CN113948190A (en) * 2021-09-02 2022-01-18 上海健康医学院 Method and equipment for automatically identifying cephalometric landmarks on frontal skull X-ray films
CN113723406B (en) * 2021-09-03 2023-07-18 乐普(北京)医疗器械股份有限公司 Method and device for processing support positioning of coronary angiography image
CN114119455B (en) * 2021-09-03 2024-04-09 乐普(北京)医疗器械股份有限公司 Method and device for positioning vascular stenosis part based on target detection network
CN113743339B (en) * 2021-09-09 2023-10-03 三峡大学 Indoor falling detection method and system based on scene recognition
CN113792656B (en) * 2021-09-15 2023-07-18 山东大学 Behavior detection and alarm system using mobile communication equipment in personnel movement
CN114022705B (en) * 2021-10-29 2023-08-04 电子科技大学 Self-adaptive target detection method based on scene complexity pre-classification
CN114022554B (en) * 2021-11-03 2023-02-03 北华航天工业学院 Massage robot acupoint detection and positioning method based on YOLO
CN114120358B (en) * 2021-11-11 2024-04-26 国网江苏省电力有限公司技能培训中心 Super-pixel-guided deep learning-based personnel head-mounted safety helmet recognition method
CN114255389A (en) * 2021-11-15 2022-03-29 浙江时空道宇科技有限公司 Target object detection method, device, equipment and storage medium
CN113989939B (en) * 2021-11-16 2024-05-14 河北工业大学 Small target pedestrian detection system based on improved YOLO algorithm
CN114373075A (en) * 2021-12-31 2022-04-19 西安电子科技大学广州研究院 Target component detection data set construction method, detection method, device and equipment
US11756288B2 (en) * 2022-01-05 2023-09-12 Baidu Usa Llc Image processing method and apparatus, electronic device and storage medium
CN114565848B (en) * 2022-02-25 2022-12-02 佛山读图科技有限公司 Liquid medicine level detection method and system in complex scene
CN114662594B (en) * 2022-03-25 2022-10-04 浙江省通信产业服务有限公司 Target feature recognition analysis system
CN114742204A (en) * 2022-04-08 2022-07-12 黑龙江惠达科技发展有限公司 Method and device for detecting straw coverage rate
CN114782778B (en) * 2022-04-25 2023-01-06 广东工业大学 Assembly state monitoring method and system based on machine vision technology
CN114842315B (en) * 2022-05-07 2024-02-02 无锡雪浪数制科技有限公司 Looseness-prevention identification method and device for lightweight high-speed railway hub gasket
CN114881763B (en) * 2022-05-18 2023-05-26 中国工商银行股份有限公司 Post-loan supervision method, device, equipment and medium for aquaculture
CN115029209A (en) * 2022-06-17 2022-09-09 杭州天杭空气质量检测有限公司 Colony image acquisition processing device and processing method thereof
CN114972891B (en) * 2022-07-07 2024-05-03 智云数创(洛阳)数字科技有限公司 Automatic identification method for CAD components and BIM modeling method
CN115082661B (en) * 2022-07-11 2024-05-10 阿斯曼尔科技(上海)有限公司 Method for reducing sensor assembly difficulty
CN115187982B (en) * 2022-07-12 2023-05-23 河北华清环境科技集团股份有限公司 Algae detection method and device and terminal equipment
CN115909358B (en) * 2022-07-27 2024-02-13 广州市玄武无线科技股份有限公司 Commodity specification identification method, commodity specification identification device, terminal equipment and computer storage medium
CN115346170B (en) * 2022-08-11 2023-05-30 北京市燃气集团有限责任公司 Intelligent monitoring method and device for gas facility area
CN115346172B (en) * 2022-08-16 2023-04-21 哈尔滨市科佳通用机电股份有限公司 Method and system for detecting lost and broken hook lifting rod reset spring
CN115297263B (en) * 2022-08-24 2023-04-07 广州方图科技有限公司 Automatic photographing control method and system suitable for cube shooting
CN115690565B (en) * 2022-09-28 2024-02-20 大连海洋大学 Method for detecting cultured Takifugu rubripes targets by fusing knowledge with an improved YOLOv5
CN115546566A (en) * 2022-11-24 2022-12-30 杭州心识宇宙科技有限公司 Intelligent body interaction method, device, equipment and storage medium based on article identification
CN116051985B (en) * 2022-12-20 2023-06-23 中国科学院空天信息创新研究院 Semi-supervised remote sensing target detection method based on multi-model mutual feedback learning
CN115690570B (en) * 2023-01-05 2023-03-28 中国水产科学研究院黄海水产研究所 Fish shoal feeding intensity prediction method based on ST-GCN
CN116452858B (en) * 2023-03-24 2023-12-15 哈尔滨市科佳通用机电股份有限公司 Rail wagon connecting pull rod round pin breaking fault identification method and system
CN116403163B (en) * 2023-04-20 2023-10-27 慧铁科技有限公司 Method and device for identifying opening and closing states of handles of cut-off plug doors
CN116342316A (en) * 2023-05-31 2023-06-27 青岛希尔信息科技有限公司 Accounting and project financial management system and method
CN116681687A (en) * 2023-06-20 2023-09-01 广东电网有限责任公司广州供电局 Wire detection method and device based on computer vision and computer equipment
CN116758547B (en) * 2023-06-27 2024-03-12 北京中超伟业信息安全技术股份有限公司 Paper medium carbonization method, system and storage medium
CN117201834A (en) * 2023-09-11 2023-12-08 南京天创电子技术有限公司 Real-time double-spectrum fusion video stream display method and system based on target detection
CN116916166B (en) * 2023-09-12 2023-11-17 湖南湘银河传感科技有限公司 Telemetry terminal based on AI image analysis
CN116935232A (en) * 2023-09-15 2023-10-24 青岛国测海遥信息技术有限公司 Remote sensing image processing method and device for offshore wind power equipment, equipment and medium
CN117671597A (en) * 2023-12-25 2024-03-08 北京大学长沙计算与数字经济研究院 Method for constructing mouse detection model and mouse detection method and device
CN117523318B (en) * 2023-12-26 2024-04-16 宁波微科光电股份有限公司 Anti-light interference subway shielding door foreign matter detection method, device and medium
CN117893895A (en) * 2024-03-15 2024-04-16 山东省海洋资源与环境研究院(山东省海洋环境监测中心、山东省水产品质量检验中心) Method, system, equipment and storage medium for identifying portunus trituberculatus

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107527009B (en) * 2017-07-11 2020-09-04 浙江汉凡软件科技有限公司 Abandoned-object detection method based on YOLO target detection
CN109117794A (en) * 2018-08-16 2019-01-01 广东工业大学 Moving target behavior tracking method, apparatus, device and readable storage medium
CN109977943B (en) * 2019-02-14 2024-05-07 平安科技(深圳)有限公司 Image target recognition method, system and storage medium based on YOLO

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107247956A (en) * 2016-10-09 2017-10-13 成都快眼科技有限公司 Fast target detection method based on grid judgment
CN107423760A (en) * 2017-07-21 2017-12-01 西安电子科技大学 Deep learning object detection method based on pre-segmentation and regression
CN108154098A (en) * 2017-12-20 2018-06-12 歌尔股份有限公司 Target identification method and device for a robot, and robot

Also Published As

Publication number Publication date
CN109977943A (en) 2019-07-05
WO2020164282A1 (en) 2020-08-20

Similar Documents

Publication Publication Date Title
CN109977943B (en) Image target recognition method, system and storage medium based on YOLO
CN109918969B (en) Face detection method and device, computer device and computer readable storage medium
EP3493101B1 (en) Image recognition method, terminal, and nonvolatile storage medium
CN108470172B (en) Text information identification method and device
CN111079674B (en) Target detection method based on global and local information fusion
CN111814902A (en) Target detection model training method, target identification method, device and medium
WO2018052586A1 (en) Method and system for multi-scale cell image segmentation using multiple parallel convolutional neural networks
CN110991311A (en) Target detection method based on dense connection deep network
Wang et al. Fast and robust object detection using asymmetric totally corrective boosting
CN111368636B (en) Object classification method, device, computer equipment and storage medium
CN110766017B (en) Mobile terminal text recognition method and system based on deep learning
CN109993221B (en) Image classification method and device
CN109934216B (en) Image processing method, device and computer readable storage medium
CN112508094A (en) Junk picture identification method, device and equipment
CN110008899B (en) Method for extracting and classifying candidate targets of visible light remote sensing image
CN111444976A (en) Target detection method and device, electronic equipment and readable storage medium
CN111724342A (en) Method for detecting thyroid nodule in ultrasonic image
CN115239644B (en) Concrete defect identification method, device, computer equipment and storage medium
CN114187311A (en) Image semantic segmentation method, device, equipment and storage medium
CN111461145A (en) Method for detecting target based on convolutional neural network
CN111696080A (en) Face fraud detection method, system and storage medium based on static texture
CN111414910B (en) Small target enhancement detection method and device based on double convolution neural network
CN111597875A (en) Traffic sign identification method, device, equipment and storage medium
CN116152226A (en) Method for detecting defects of image on inner side of commutator based on fusible feature pyramid
CN112926595B (en) Training device of deep learning neural network model, target detection system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant