WO2020164282A1 - Yolo-based image target recognition method and apparatus, electronic device, and storage medium - Google Patents

Yolo-based image target recognition method and apparatus, electronic device, and storage medium Download PDF

Info

Publication number
WO2020164282A1
WO2020164282A1 · PCT/CN2019/118499 · CN2019118499W
Authority
WO
WIPO (PCT)
Prior art keywords
detection frame
image
classification
yolo
preset
Prior art date
Application number
PCT/CN2019/118499
Other languages
French (fr)
Chinese (zh)
Inventor
Zhao Feng (赵峰)
Wang Jianzong (王健宗)
Xiao Jing (肖京)
Original Assignee
Ping An Technology (Shenzhen) Co., Ltd. (平安科技(深圳)有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology (Shenzhen) Co., Ltd. (平安科技(深圳)有限公司)
Publication of WO2020164282A1 publication Critical patent/WO2020164282A1/en

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]

Definitions

  • This application relates to the fields of computer learning and image recognition, and more specifically to a YOLO-based image target recognition method, apparatus, electronic device, and storage medium.
  • YOLO treats object detection as a regression problem: a single end-to-end network maps the original input image directly to the output of object positions and categories.
  • The core idea of YOLO is to take the entire image as the network input and regress the positions of the bounding boxes, and the categories to which they belong, directly in the output layer. The inventors realized that, given YOLO's high-speed operation, designing a method that improves YOLO's accuracy is a problem that urgently needs to be solved.
  • To this end, this application proposes a YOLO-based image target recognition method, apparatus, electronic device, and storage medium.
  • The technical solution of the present application provides a YOLO-based image target recognition method, including:
  • using the detection frame and the classification identification information as the recognized classification result.
  • The technical solution of the present application also proposes a YOLO-based image target recognition apparatus, which includes: an input module that receives the image to be detected;
  • an adjustment module that adjusts the size of the image to be detected received by the input module according to preset requirements and generates the first detection image;
  • a matching recognition module that sends the first detection image generated by the adjustment module to the neural network model for matching recognition and generates a detection frame, classification identification information, and classification probability values corresponding to the classification identification information;
  • a judging module that judges whether the classification probability value is greater than a preset classification probability threshold and, if not, sends a signal to the matching recognition module, or, if so, sends a signal to the classification module;
  • a classification module that uses the detection frame and the classification identification information as the recognized classification result.
  • The technical solution of the present application also proposes an electronic device, including a memory, a processor, and a camera device.
  • The memory stores a YOLO-based image target recognition program.
  • When the YOLO-based image target recognition program is executed by the processor, the steps of the above-mentioned YOLO-based image target recognition method are realized.
  • A fourth aspect of the present application also provides a computer non-volatile readable storage medium, which includes a YOLO-based image target recognition program; when the program is executed by a processor, the steps of the above-mentioned YOLO-based image target recognition method are realized.
  • This application proposes a YOLO-based image target recognition method, apparatus, system, and storage medium.
  • The method judges the classification and recognition probability, and only uses the recognition information as the recognition result when the preset classification probability threshold is reached, which improves the accuracy of image recognition and the recognition experience.
  • The present application can also adjust the position of the detection frame in real time, which effectively improves detection efficiency and accuracy, and reduces detection time by optimizing the detection calculation. Experiments and verification show that the method of this application outperforms prior-art detection methods, mainly in improved recognition accuracy and increased computing speed.
  • Figure 1 is a flow chart of the YOLO-based image target recognition method of this application;
  • Figure 2 is a schematic diagram of the convolution operation in the classification process of this application;
  • Figure 3 is a block diagram of an electronic device of the present application;
  • Fig. 4 shows a schematic diagram of a specific embodiment of the present application.
  • Fig. 1 is a flow chart of an image target recognition method based on YOLO in this application.
  • the technical solution of the present application provides a YOLO-based image target recognition method, including:
  • S104: Adjust the size of the image to be detected according to a preset requirement to generate a first detection image;
  • S106: Send the first detection image to a neural network model for matching recognition, and generate a detection frame, classification identification information, and classification probability values corresponding to the classification identification information;
  • The size is the size specified by the neural network model.
  • The selected size will generally be smaller than the size of the image to be detected, which ensures the speed of calculation processing and allows fast class recognition.
  • Typically, 448*448 or 416*416 is selected.
  • The size in this step can be set according to actual needs; it is not limited to the above-mentioned sizes, which do not limit the scope of protection of this application.
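The resize step above can be sketched as follows. This is only an illustrative nearest-neighbour resize on a nested-list image, written in plain Python so it needs no imaging library; the function name and the use of a 600x800 stand-in image are assumptions, not from the patent.

```python
def resize_to_network_input(image, size=416):
    """Nearest-neighbour resize of a nested-list H x W image to size x size.

    Hypothetical helper illustrating step S104; real systems would use an
    imaging library and often letterboxing to preserve aspect ratio.
    """
    h, w = len(image), len(image[0])
    # Map each output pixel back to its nearest source pixel.
    return [[image[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]

# Stand-in "image to be detected": 600x800, one RGB tuple per pixel.
original = [[(0, 0, 0)] * 800 for _ in range(600)]
first_detection_image = resize_to_network_input(original, 416)
print(len(first_detection_image), len(first_detection_image[0]))  # 416 416
```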
  • The first detection image is sent to a neural network model to generate a detection frame, classification identification information, and classification probability values corresponding to the classification identification information.
  • For example, when the classification probability threshold is set to 90% and a picture containing a kitten is detected, if the probability of identifying the kitten in the detection frame exceeds 90%, a kitten has been circled in the detection frame and the cat in the picture has been identified.
  • If the classification probability value is less than the preset classification probability threshold, the process returns to step S106 for re-identification until the classification probability value is greater than the preset classification probability threshold.
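The S106-S108 decision loop can be sketched as below. `match_recognize` is a hypothetical stand-in for the neural network model (its rising confidence is fabricated for illustration only); the loop structure, i.e. re-identifying until the threshold is cleared, is what the text describes.

```python
def match_recognize(image, attempt):
    # Stand-in for S106: pretend recognition confidence improves per attempt.
    box = (10, 20, 100, 80)
    return box, "cat", min(0.5 + 0.25 * attempt, 0.99)

def recognize(image, threshold=0.90, max_attempts=10):
    for attempt in range(max_attempts):
        box, label, prob = match_recognize(image, attempt)
        if prob > threshold:       # S108: compare with the preset threshold
            return box, label      # S110: frame + label are the result
    return None                    # gave up (cap added so the sketch halts)

result = recognize(None)
print(result)  # ((10, 20, 100, 80), 'cat')
```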
  • The neural network model performs multi-layer convolution operations on the image.
  • The described YOLO convolution operation is a conventional operation in the field and belongs to the prior art, so it is not repeated in this application.
  • After step S106, that is, after the step of generating the detection frame, the classification identification information, and the classification probability values corresponding to the classification identification information, the method includes:
  • calculating the coincidence degree of the remaining detection frames of the same type and retaining the detection frame with the highest coincidence degree.
  • Before receiving the image to be detected in step S102, the method further includes:
  • training the neural network model through the following steps:
  • training the preprocessed image set to obtain a neural network model with an input interface and an output interface.
  • The step of obtaining the training image data set includes:
  • selecting from the picture library a set number of positive samples and negative samples of each tag in the total set of identification tags to form the training set and the validation set, where a positive sample of a label is a picture containing the object corresponding to the label, and a negative sample of a label is a picture that does not contain the object corresponding to the label.
  • The training set is the image data of the positive samples and negative samples.
  • The validation set is the label sequences of the positive samples and negative samples.
  • The output of the neural network model is the predicted label sequence of the samples in the training set.
  • Illustratively, the training image data set has 1,000 object categories and 1.2 million training images.
  • The preprocessing includes one or more of rotation, contrast enhancement, tilt, and scaling.
  • The preprocessing distorts the image to a certain extent.
  • Training on the distorted images can increase the accuracy of the final image recognition.
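Three of the preprocessing operations named above can be sketched with plain-Python array transforms. This is a minimal illustration on a tiny grayscale image, not the patent's pipeline; real systems would use an imaging library, and the contrast factor 1.5 is an assumed value.

```python
def rotate90(img):
    # Rotate a nested-list image 90 degrees (transpose, then reverse rows).
    return [list(row) for row in zip(*img)][::-1]

def enhance_contrast(img, factor=1.5):
    # Stretch pixel values away from the mean, clamped to [0, 255].
    flat = [p for row in img for p in row]
    mean = sum(flat) / len(flat)
    return [[max(0, min(255, round((p - mean) * factor + mean))) for p in row]
            for row in img]

def scale_half(img):
    # Downscale by dropping every other row and column.
    return [row[::2] for row in img[::2]]

img = [[100, 120], [80, 100]]  # tiny stand-in grayscale image
print(rotate90(img))           # [[120, 100], [100, 80]]
print(enhance_contrast(img))   # [[100, 130], [70, 100]]
print(scale_half(img))         # [[100]]
```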
  • The step of generating the detection frame is specifically as follows:
  • predicting the dynamic detection frame, performing iterative prediction on the generated detection frame, and generating the latest detection frame;
  • if the coincidence degree of the latest detection frame is greater than or equal to the preset coincidence degree threshold, keeping the latest detection frame; if it is less than the preset coincidence degree threshold, continuing to predict the dynamic detection frame;
  • The initial preset coordinate point is the coordinate point of the preset detection frame, which may be automatically generated during training and recognition detection, or may be set by a person skilled in the art according to actual needs.
  • The prediction of the dynamic detection frame, i.e. the iterative prediction that generates the latest detection frame, is specifically as follows:
  • The network predicts 4 coordinates (t_x, t_y, t_w, t_h) for each detection frame. With the cell offset from the upper left corner of the image by (c_x, c_y), and the prior detection frame width p_w and height p_h, the coordinates of the latest detection frame are: b_x = σ(t_x) + c_x, b_y = σ(t_y) + c_y, b_w = p_w·e^(t_w), b_h = p_h·e^(t_h), where σ is the logistic sigmoid.
  • b_x, b_y, b_w, and b_h are the four coordinate values of the latest detection frame. It should be noted that the detection frame is a quadrilateral, and the position of the quadrilateral detection frame can be determined by these 4 values.
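This box decoding can be written out directly. The sketch below assumes the standard YOLOv3-style formulas (sigmoid on the center offsets, exponential on the width/height against the prior p_w, p_h); the function name and the sample numbers are illustrative.

```python
import math

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Decode raw network outputs (t_x, t_y, t_w, t_h) into box coordinates,
    given the cell offset (c_x, c_y) and prior box size (p_w, p_h)."""
    sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))
    bx = sigmoid(tx) + cx          # b_x = sigma(t_x) + c_x
    by = sigmoid(ty) + cy          # b_y = sigma(t_y) + c_y
    bw = pw * math.exp(tw)         # b_w = p_w * e^(t_w)
    bh = ph * math.exp(th)         # b_h = p_h * e^(t_h)
    return bx, by, bw, bh

# With all raw outputs at 0, sigmoid(0) = 0.5 and e^0 = 1, so the box sits
# half a cell past (c_x, c_y) with exactly the prior's width and height.
bx, by, bw, bh = decode_box(0.0, 0.0, 0.0, 0.0, cx=3, cy=4, pw=2.0, ph=5.0)
print(bx, by, bw, bh)  # 3.5 4.5 2.0 5.0
```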
  • Each box uses multi-label classification to predict the classes that the bounding box may contain.
  • This application uses the binary cross-entropy loss technique for class prediction.
  • The main reason for using binary cross-entropy loss for category prediction is that the applicant found that the softmax technique is not required for good performance; independent logistic classifiers suffice, so this step does not need to use softmax.
  • The binary cross-entropy loss technique provides more help here.
  • Binary cross-entropy loss is a common technique in the field; those skilled in the art can implement it according to requirements, so this application does not describe it further.
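A minimal sketch of binary cross-entropy over independent per-class logistic outputs, which is the loss named above. Averaging over classes and the epsilon clamp are common conventions assumed here, not specified by the patent.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over independent class probabilities.

    y_true: 0/1 multi-label targets; y_pred: per-class logistic outputs.
    Each class is scored independently -- no softmax across classes.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

loss = binary_cross_entropy([1, 0, 1], [0.9, 0.1, 0.8])
print(round(loss, 4))  # ~0.1446: all three predictions are confidently right
```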
  • The convolution operation of each layer is calculated by alternating 3×3 and 1×1 convolution layers.
  • The applicant has found through a limited number of actual tests that alternating the above-mentioned convolution layers can increase accuracy and effectively increase operation speed.
  • The alternating calculation of the convolution layers is specifically as follows: first a 3×3 convolution operation is used, then a 1×1 convolution operation, and the operations alternate in turn until all the convolution layers have participated in the operation.
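The alternation can be expressed as a simple layer plan. This generates only the kernel-size sequence described above; channel counts, strides, and residual connections of the real network are deliberately omitted, so this is a structural illustration, not a network definition.

```python
def alternating_kernel_sizes(num_layers):
    """Kernel size per layer: 3x3 on even indices, 1x1 on odd indices,
    matching the 'first 3x3, then 1x1, alternating' rule in the text."""
    return [3 if i % 2 == 0 else 1 for i in range(num_layers)]

plan = alternating_kernel_sizes(6)
print(plan)  # [3, 1, 3, 1, 3, 1]
```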
  • the size is the size specified by the neural network model.
  • The classification probability values in the detection frames are calculated, and the optimal N detection frames of the same type are selected. It should be noted that the size of the detection frame is dynamically predicted, by the process described above.
  • The probability threshold is used to filter the M classification probability values of all detection frames, according to the following screening rules:
  • Calculate the classification probability value of each detection frame, arrange the values from largest to smallest, and select the highest-ranked category. This step can be regarded as the first round of screening.
  • The M categories of each detection frame are evaluated first, and the champion category with the highest probability value is selected.
  • The highest-ranked category is compared with a preset probability threshold. If it is greater than or equal to the preset probability threshold, the detection frame is retained; if it is less than the preset probability threshold, the detection frame is deleted. In this second round of screening, the champion classification is compared with the probability threshold, and a detection frame whose value is greater than the probability threshold is eligible to enter the final round.
  • Illustratively, the probability threshold can be set to 0.24 (24%). After comparison, the retained detection frames are displayed on the picture: any detection frame whose classification probability value is greater than or equal to the 0.24 threshold can be displayed.
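The two rounds of screening can be sketched as below: round one picks each frame's "champion" class, round two compares it with the 0.24 threshold used in the example. Data structures (a dict of class probabilities per box) are an assumed representation.

```python
def screen_boxes(boxes, threshold=0.24):
    """boxes: list of (box, {class_name: probability}) pairs.
    Returns (box, champion_class, probability) for surviving frames."""
    survivors = []
    for box, class_probs in boxes:
        # Round 1: rank the M class probabilities, take the champion.
        best_class = max(class_probs, key=class_probs.get)
        best_prob = class_probs[best_class]
        # Round 2: keep the frame only if the champion clears the threshold.
        if best_prob >= threshold:
            survivors.append((box, best_class, best_prob))
    return survivors

candidates = [
    ((0, 0, 10, 10), {"cat": 0.6, "dog": 0.3}),
    ((5, 5, 12, 12), {"cat": 0.2, "dog": 0.1}),  # champion 0.2 < 0.24: dropped
]
print(screen_boxes(candidates))  # [((0, 0, 10, 10), 'cat', 0.6)]
```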
  • The overlap degree calculation is performed on the N detection frames of the same type, and the detection frame with the highest overlap degree is retained.
  • The coincidence degree (IoU) is calculated in pairs; if the calculated value satisfies IoU > 0.3, the detection frame with the lower probability is eliminated.
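The pairwise IoU elimination is essentially non-maximum suppression, sketched here for one class with (x1, y1, x2, y2) corner boxes; the box format and the greedy keep-highest-probability-first order are standard conventions assumed for illustration.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def suppress(detections, thresh=0.3):
    """Greedy suppression: keep a box unless it overlaps an already-kept,
    higher-probability box of the same class with IoU > thresh."""
    kept = []
    for box, prob in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, k) <= thresh for k, _ in kept):
            kept.append((box, prob))
    return kept

dets = [((0, 0, 10, 10), 0.9), ((1, 1, 11, 11), 0.8), ((50, 50, 60, 60), 0.7)]
# The second box overlaps the first with IoU ~0.68 > 0.3, so it is eliminated.
print(suppress(dets))  # [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.7)]
```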
  • Figure 2 shows a schematic diagram of the convolution operation in the classification process of this application.
  • The neural network model adopts 53 layers of convolution operations, and the convolution operation of each layer alternates 3×3 and 1×1 convolution layers.
  • This feature extraction method achieves the highest measured floating-point operations per second, which means the neural network structure can make better use of the machine's GPU, improving evaluation efficiency and thus increasing speed. Because ResNets have too many layers and are not efficient, the convolution operation described in this application achieves higher efficiency and higher accuracy.
  • Each neural network is trained with the same settings and tested with single-crop accuracy at 256×256.
  • The performance of a classifier using the feature extraction of the present application is comparable to the most advanced prior-art classifiers, but with fewer floating-point operations and faster speed.
  • FIG. 3 shows a block diagram of the application of the above-mentioned YOLO-based image target recognition method to an electronic device.
  • The technical solution of the present application also proposes an electronic device 2, which includes a memory 201, a processor 202, and a camera 203.
  • The memory 201 stores a YOLO-based image target recognition program.
  • When the YOLO-based image target recognition program is executed by the processor, the following steps are implemented:
  • using the detection frame and the classification identification information as the recognized classification result.
  • The first detection image is sent to a neural network model to generate a detection frame, classification identification information, and classification probability values corresponding to the classification identification information.
  • For example, when the classification probability threshold is set to 90% and a picture containing a kitten is detected, if the probability of identifying the kitten in the detection frame exceeds 90%, a kitten has been circled in the detection frame and the cat in the picture has been identified.
  • If the classification probability value is less than the preset classification probability threshold, the process returns to step S106 for re-identification until the classification probability value is greater than the preset classification probability threshold.
  • The neural network model performs multi-layer convolution operations on the image.
  • The described YOLO convolution operation is a conventional operation in the field and belongs to the prior art, so it is not repeated in this application.
  • Before receiving the image to be detected, the method further includes:
  • training the neural network model through the following steps:
  • training the preprocessed image set to obtain a neural network model with an input interface and an output interface.
  • The step of generating the detection frame is specifically as follows:
  • if the coincidence degree of the latest detection frame is greater than or equal to the preset coincidence degree threshold, keeping the latest detection frame; if it is less than the preset coincidence degree threshold, continuing to predict the dynamic detection frame;
  • The prediction of the dynamic detection frame, i.e. the iterative prediction that generates the latest detection frame, is specifically as follows:
  • Dimensional clustering can be used to obtain anchor frames for dynamically predicting the detection frame; the detection frame is also a bounding box.
  • The network predicts 4 coordinates t_x, t_y, t_w, and t_h for each detection frame. With the cell offset from the upper left corner of the image by (c_x, c_y), and the prior detection frame width p_w and height p_h, the coordinates of the latest detection frame are: b_x = σ(t_x) + c_x, b_y = σ(t_y) + c_y, b_w = p_w·e^(t_w), b_h = p_h·e^(t_h), where b_x, b_y, b_w, and b_h are the four coordinate values of the latest detection frame.
  • It should be noted that the detection frame is a quadrilateral, and the position of the quadrilateral detection frame can be determined by these 4 values.
  • Each box uses multi-label classification to predict the classes that the bounding box may contain.
  • This application uses the binary cross-entropy loss technique for class prediction.
  • The main reason for using binary cross-entropy loss for category prediction is that the applicant found that the softmax technique is not required for good performance; independent logistic classifiers suffice, so this step does not need to use softmax.
  • The binary cross-entropy loss technique provides more help here.
  • Binary cross-entropy loss is a common technique in the field; those skilled in the art can implement it according to requirements, so this application does not describe it further.
  • Before receiving the image to be detected, the method further includes:
  • performing image training to obtain a neural network model; the neural network model is a model with an input interface and an output interface, obtained by image training on different types of pictures.
  • The convolution operation of each layer is calculated by alternating 3×3 and 1×1 convolution layers.
  • The applicant has found through a limited number of actual tests that alternating the above-mentioned convolution layers can increase accuracy and effectively increase operation speed.
  • The alternating calculation of the convolution layers is specifically as follows: first a 3×3 convolution operation is used, then a 1×1 convolution operation, and the operations alternate in turn until all the convolution layers have participated in the operation.
  • the size is the size specified by the neural network model.
  • This feature extraction method achieves the highest measured floating-point operations per second, which means the neural network structure can make better use of the machine's GPU, improving evaluation efficiency and thus increasing speed. Because ResNets have too many layers and are not efficient, the convolution operation described in this application achieves higher efficiency and higher accuracy.
  • Each neural network is trained with the same settings and tested with single-crop accuracy at 256×256.
  • The performance of a classifier using the feature extraction of the present application is comparable to the most advanced prior-art classifiers, but with fewer floating-point operations and faster speed.
  • The probability threshold is used to filter the M classification probability values of all detection frames, according to the following screening rules:
  • Calculate the classification probability value of each detection frame, arrange the values from largest to smallest, and select the highest-ranked category. This step can be regarded as the first round of screening.
  • The M categories of each detection frame are evaluated first, and the champion category with the highest probability value is selected.
  • The highest-ranked category is compared with a preset probability threshold. If it is greater than or equal to the preset probability threshold, the detection frame is retained; if it is less than the preset probability threshold, the detection frame is deleted. In this second round of screening, the champion classification is compared with the probability threshold, and a detection frame whose value is greater than the probability threshold is eligible to enter the final round.
  • Illustratively, the probability threshold can be set to 0.24 (24%). After comparison, the detection frames that passed the preliminary rounds are displayed on the picture: any detection frame whose classification probability value is greater than or equal to the 0.24 threshold can be displayed.
  • The overlap degree calculation is performed on the N detection frames of the same type, and the detection frame with the highest overlap degree is retained.
  • The coincidence degree (IoU) is calculated in pairs; if the calculated value satisfies IoU > 0.3, the detection frame with the lower probability is eliminated.
  • This application also proposes a YOLO-based image target recognition apparatus, including: an input module that receives the image to be detected;
  • an adjustment module that adjusts the size of the image to be detected received by the input module according to preset requirements and generates the first detection image;
  • a matching recognition module that sends the first detection image generated by the adjustment module to the neural network model for matching recognition and generates a detection frame, classification identification information, and classification probability values corresponding to the classification identification information;
  • a judging module that judges whether the classification probability value is greater than a preset classification probability threshold and, if not, sends a signal to the matching recognition module, or, if so, sends a signal to the classification module;
  • a classification module that uses the detection frame and the classification identification information as the recognized classification result.
  • The apparatus further includes a training module that performs image training to obtain a neural network model, and the training module includes:
  • a data set acquisition unit that acquires the training image data set;
  • a preprocessing unit that performs image preprocessing on the training image data set to obtain a preprocessed image set;
  • a training unit that trains the preprocessed image set to obtain a neural network model with an input interface and an output interface.
  • The aforementioned data set acquisition unit includes:
  • a tag library, which stores the different tags and tag sequences corresponding to different objects;
  • a picture library, which stores the image data and label sequences of pictures;
  • a screening unit that selects from the picture library a set number of positive samples and negative samples of each tag in the total identification tag set to form a training set and a validation set, where a positive sample of a label is a picture containing the object corresponding to the label and a negative sample of a label is a picture that does not contain the object corresponding to the label; the training set is the image data of the positive and negative samples, the validation set is their label sequences, and the output of the neural network model is the predicted label sequence of the samples in the training set.
  • The above-mentioned matching recognition module includes:
  • an initial detection frame generating unit that generates the initial detection frame according to the initial preset coordinate points;
  • a prediction unit that predicts the dynamic detection frame, iteratively predicts the generated detection frame, and generates the latest detection frame;
  • a coincidence degree obtaining unit that calculates the coincidence degree of the latest detection frame;
  • a screening unit that, if the coincidence degree of the latest detection frame is greater than or equal to the preset coincidence degree threshold, keeps the latest detection frame, and, if it is less than the preset coincidence degree threshold, sends a signal to the prediction unit to continue predicting the dynamic detection frame;
  • a detection frame generation unit that generates N detection frames of the same category.
  • The above prediction unit includes:
  • a prediction sub-unit that predicts the 4 coordinates (t_x, t_y, t_w, t_h) of each detection frame;
  • an update sub-unit that, using the width p_w and height p_h of the detection frame predicted by the prediction sub-unit, updates the coordinates of the detection frame; the coordinates of the latest detection frame are: b_x = σ(t_x) + c_x, b_y = σ(t_y) + c_y, b_w = p_w·e^(t_w), b_h = p_h·e^(t_h),
  • where b_x, b_y, b_w, and b_h are the four coordinate values of the latest detection frame.
  • 53 layers of convolution operation are used in the neural network model, and the convolution operation of each layer is alternately calculated by 3 ⁇ 3 and 1 ⁇ 1 convolution layers.
  • The judgment module includes:
  • a classification probability obtaining unit that calculates the classification probability value of each detection frame;
  • a first screening unit that arranges the classification probability values of each detection frame from largest to smallest and selects the highest-ranked category;
  • a second screening unit that compares the highest-ranked category with a preset probability threshold and, if it is greater than or equal to the preset probability threshold, keeps the detection frame, or, if it is less than the preset probability threshold, deletes the detection frame.
  • The classification module includes a third screening unit, which calculates the coincidence degree of the retained detection frames of the same type, retains the detection frame with the highest coincidence degree, and uses that detection frame and its corresponding classification identification information as the recognized classification result.
  • Figure 4 shows a schematic diagram of an embodiment of the present application.
  • The convolutional layers in the neural network model are numbered 0-52 (53 layers in total). The model then receives the first detection image after size adjustment.
  • The size of the first detection image is 416*416.
  • The specific size can be set according to actual computing requirements and computing capabilities.
  • Here, 416*416 is selected for description, and the image is a color photo.
  • The 0th layer of the neural network model receives the 416*416, 3-channel (RGB) color first detection image and performs the convolution operation.
  • The 52nd layer performs a convolution operation on the feature picture, and the final output is a one-dimensional prediction array containing 13*13*5*85 values; the multi-dimensional array or matrix is reduced to a one-dimensional array through a series of operations.
  • This one-dimensional array is the prediction array.
  • The number 13*13 in the 13*13*5*85 values represents the width*height of the feature map; there are 13*13 feature units in total.
  • YOLO divides the original picture (416*416) evenly into 13*13 cells, and each feature unit corresponds to a picture area.
  • The specific size can be set by those skilled in the art according to actual computing requirements and computing capabilities.
  • The number 5 represents 5 bounding boxes with different shapes. YOLO generates 5 bounding boxes in each image area and uses the center of the area as the center of the detection frame to detect objects, so YOLO uses 13*13*5 detection frames to detect a picture or image.
  • Each detection frame contains 4 coordinate values (x, y, width, height).
  • Each detection frame has one confidence value for the detected object, the above-mentioned confidence (0-1), understood as the confidence probability of detecting the object.
  • Each detection frame has 80 classification detection probability values (0-1), giving the probability that the object in the detection frame belongs to each classification.
  • In summary, the above process divides a 416*416 picture into 13*13 picture areas.
  • Each picture area generates 5 detection frames, and each detection frame contains 85 values (4 coordinates, 1 confidence value, and 80 classification probability values).
  • The final one-dimensional prediction array (predictions) represents the detected objects in the picture; the array contains 13*13*5*85 values in total, predictions[0] through predictions[13*13*5*85-1].
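The indexing into the flat 13*13*5*85 array can be sketched as follows. The row-major cell/box/value layout is an assumption (the patent does not state the ordering); the 85 values per box (4 coordinates + 1 confidence + 80 class probabilities) follow the text above.

```python
S, B, V = 13, 5, 85  # grid size, boxes per cell, values per box

def box_slice(predictions, row, col, box):
    """Return the 85 values of one detection frame from the flat array,
    assuming row-major (cell, then box) layout."""
    offset = ((row * S + col) * B + box) * V
    return predictions[offset:offset + V]

# Stand-in flat prediction array: predictions[i] == i, so offsets are visible.
predictions = list(range(S * S * B * V))  # 13*13*5*85 = 71825 values
vals = box_slice(predictions, row=0, col=0, box=1)
print(len(predictions), len(vals), vals[0])  # 71825 85 85
```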
  • this application also proposes a computer-readable storage medium including a YOLO-based image target recognition program, which, when executed by a processor, implements the steps of the above-mentioned YOLO-based image target recognition method.
  • This application proposes a YOLO-based image target recognition method, apparatus, electronic device, and storage medium. The method can effectively improve detection accuracy and reduce detection time. Experiments and verification show that the method of this application outperforms prior-art detection methods, mainly in improved recognition accuracy and increased computing speed.
  • The disclosed device and method may be implemented in other ways.
  • The device embodiments described above are merely illustrative.
  • The division of the units is only a logical function division; in actual implementation there may be other divisions, for example: multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • The coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
  • The units described above as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • The functional units in the embodiments of the present application may all be integrated into one processing unit, or each unit may serve as a unit individually, or two or more units may be integrated into one unit.
  • A unit can be implemented in the form of hardware, or in the form of hardware plus software functional units.
  • The foregoing program can be stored in a computer-readable storage medium.
  • When executed, the program performs the steps of the foregoing method embodiment; the foregoing storage medium includes media that can store program code, such as removable storage devices, read-only memory (ROM), random access memory (RAM), magnetic disks or optical disks.
  • If the above-mentioned integrated unit of this application is implemented in the form of a software function module and sold or used as an independent product, it can also be stored in a computer-readable storage medium.
  • The computer software product is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the methods described in the various embodiments of the present application.
  • The aforementioned storage media include media that can store program code, such as removable storage devices, ROM, RAM, magnetic disks, or optical disks.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

An artificial intelligence technique, providing a YOLO-based image target recognition method, system, and storage medium, the method comprising: receiving an image to be detected (S102); on the basis of a preset requirement, adjusting the size of the image to be detected to generate a first detection image (S104); sending the first detection image to a neural network model to implement matching recognition, and generating a detection frame and class recognition information, and a class probability value corresponding to the class recognition information (S106); determining whether the class probability value is greater than a preset class probability value (S108); if so, then setting the detection frame and the class recognition information as the recognised class result (S110). The present method can effectively improve detection precision and reduce detection time.

Description

YOLO-based image target recognition method, device, electronic equipment and storage medium
This application claims priority to the patent application with application number 201910114621.5, filed on February 14, 2019, and entitled "A YOLO-based image target recognition method, system and storage medium".
Technical Field
This application relates to the field of computer learning and image recognition, and more specifically to a YOLO-based image target recognition method, device, electronic equipment and storage medium.
Background
With the rapid development of artificial intelligence technology, deep learning is increasingly applied in computer vision, especially in the field of image target detection.
In recent years, target detection algorithms have made great breakthroughs. The more popular algorithms can be divided into two categories. One is the R-CNN family based on region proposals (R-CNN, Fast R-CNN, Faster R-CNN); these are two-stage methods that first generate region proposals with a heuristic method (selective search) or a CNN network (RPN), and then perform classification and regression on the proposals. The other is one-stage algorithms such as YOLO (You Only Look Once) and SSD, which use a single CNN network to directly predict the categories and positions of different targets. The first category is more accurate but slower; the second is faster but less accurate. More and more target detection methods are implemented based on YOLO, and many deep networks are also improvements on YOLO. YOLO treats object detection as a regression problem: a single end-to-end network goes from the input of the original image to the output of object positions and categories.
The core idea of YOLO is to use the entire image as the input of the network and directly regress, at the output layer, the position of the bounding box and the category it belongs to. The inventor realized that, building on YOLO's high-speed operation, designing a method that can improve YOLO's accuracy is a problem that urgently needs to be solved.
Summary
In order to solve at least one of the above technical problems, this application proposes a YOLO-based image target recognition method, device, electronic equipment and storage medium.
In order to achieve the above objectives, the technical solution of the present application provides a YOLO-based image target recognition method, including:
receiving an image to be detected;
adjusting the size of the image to be detected according to preset requirements to generate a first detection image;
sending the first detection image to a neural network model for matching recognition, and generating a detection frame, classification identification information, and a classification probability value corresponding to the classification identification information;
judging whether the classification probability value is greater than a preset classification probability threshold;
if it is greater, using the detection frame and the classification identification information as the recognized classification result.
The technical solution of the present application also proposes a YOLO-based image target recognition device, including: an input module, which receives the image to be detected;
an adjustment module, which adjusts the size of the image to be detected received by the input module according to preset requirements and generates a first detection image;
a matching recognition module, which sends the first detection image generated by the adjustment module to the neural network model for matching recognition, and generates a detection frame, classification identification information, and a classification probability value corresponding to the classification identification information;
a judgment module, which judges whether the classification probability value is greater than a preset classification probability threshold, sending a signal to the matching recognition module if it is not greater, and to the classification module if it is greater;
a classification module, which uses the detection frame and the classification identification information as the recognized classification result.
The technical solution of the present application also proposes an electronic device, including a memory, a processor and a camera device, where the memory includes a YOLO-based image target recognition program which, when executed by the processor, implements the steps of the above-mentioned YOLO-based image target recognition method.
A fourth aspect of the present application also provides a computer non-volatile readable storage medium, which includes a YOLO-based image target recognition program; when the program is executed by a processor, the steps of the above-mentioned YOLO-based image target recognition method are realized.
This application proposes a YOLO-based image target recognition method, device, system and storage medium. The method judges the classification probability and only takes the recognition information as the recognition result when the preset classification probability threshold is reached, which improves the accuracy of image recognition and the recognition experience. The application can also adjust the position of the detection frame in real time, effectively improving detection efficiency and precision, and reduces detection time by optimizing the detection calculation. Experiments and verification show that the method of this application outperforms prior-art detection methods, mainly in higher recognition accuracy and faster computation.
Description of the Drawings
Figure 1 is a flow chart of a YOLO-based image target recognition method of this application;
Figure 2 is a schematic diagram of the convolution operation in the classification process of this application;
Figure 3 is a block diagram of an electronic device of the present application;
Figure 4 is a schematic diagram of a specific embodiment of the present application.
Detailed Description
In order that the above objectives, features and advantages of the application can be understood more clearly, the application is further described in detail below in conjunction with the accompanying drawings and specific implementations. It should be noted that, where there is no conflict, the embodiments of the application and the features in the embodiments can be combined with each other.
Figure 1 is a flow chart of a YOLO-based image target recognition method of this application.
As shown in Figure 1, the technical solution of the present application provides a YOLO-based image target recognition method, including:
S102, receiving an image to be detected;
S104, adjusting the size of the image to be detected according to preset requirements to generate a first detection image;
S106, sending the first detection image to a neural network model for matching recognition, and generating a detection frame, classification identification information, and a classification probability value corresponding to the classification identification information;
S108, judging whether the classification probability value is greater than a preset classification probability threshold;
S110, if it is greater, using the detection frame and the classification identification information as the recognized classification result.
It should be noted that the size is the size specified by the above neural network model. In the neural network model, the selected image size is generally smaller than the size of the image to be detected, which ensures the speed of processing and allows fast class recognition. Generally 448*448 or 416*416 is selected. Those skilled in the art should understand that the size in this step can be set according to actual needs; it is not limited to the sizes mentioned above, which do not limit the protection scope of this application.
The first detection image is sent to the neural network model, generating a detection frame, classification identification information, and a classification probability value corresponding to the classification identification information. Those skilled in the art can set the classification probability threshold according to actual needs. For example, if the classification probability threshold is set to 90%, then when detecting a picture containing a kitten, if the probability of identifying the kitten in the detection frame exceeds 90%, the kitten is circled in the detection frame and the cat in the picture has been recognized. If the classification probability value is less than the preset classification probability threshold, the method returns to step S106 for re-recognition until the classification probability value is greater than the preset classification probability threshold. The neural network model performs multi-layer convolution operations on the image. The YOLO convolution operation is a conventional operation in this field and belongs to the prior art, so it is not described here in detail.
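Steps S102–S110 can be sketched as follows. This is an illustrative outline only: `model` and `resize_fn` are hypothetical stand-ins for the trained YOLO network and the resizing routine, neither of which is reproduced here.

```python
def recognize(image, model, resize_fn, target_size=(416, 416), prob_threshold=0.9):
    """Outline of S102-S110; `model` returns (boxes, labels, probs)."""
    resized = resize_fn(image, target_size)      # S104: scale to the model's input size
    boxes, labels, probs = model(resized)        # S106: detection frames + classifications
    results = []
    for box, label, prob in zip(boxes, labels, probs):
        if prob > prob_threshold:                # S108: compare with preset threshold
            results.append((box, label))         # S110: keep as the recognized result
    return results
```

The 0.9 default mirrors the 90% example threshold above; in practice it is a preset chosen by the practitioner.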
Preferably, after step S106, i.e. after the step of generating the detection frame, the classification identification information and the corresponding classification probability value, the method includes:
calculating the classification probability values of each detection frame, sorting them from largest to smallest, and selecting the highest-ranked classification;
comparing the highest-ranked classification with a preset probability threshold; if it is greater than or equal to the preset probability threshold, keeping the detection frame; if it is less than the preset probability threshold, deleting the detection frame;
performing a coincidence-degree calculation on the retained detection frames of the same class, and keeping the detection frame with the highest coincidence degree.
In this solution, before receiving the image to be detected in step S102, the method further includes:
performing picture training to obtain a neural network model; the neural network model is trained through the following steps:
obtaining a training image data set;
performing image preprocessing on the training image data set to obtain a preprocessed image set;
training on the preprocessed image set to obtain a neural network model with an input interface and an output interface.
Preferably, the step of obtaining the training image data set includes:
establishing a label library, which stores the different labels and label orders corresponding to different objects;
building a picture library, which stores the image data and label sequences of pictures;
selecting, from the picture library, a set number of positive and negative samples for each label in the total identification label set to form a training set and a verification set, where a positive sample of a label is a picture containing the object corresponding to the label and a negative sample is a picture not containing that object; the training set is the image data of the positive and negative samples, the verification set is the label sequences of the positive and negative samples, and the output of the neural network model is the predicted label sequences of the samples in the training set.
It should be noted that the training image data set has 1,000 object categories and 1.2 million training images. Before training, the data set is preprocessed; preprocessing includes one or more of rotation, contrast enhancement, tilt, and scaling. After preprocessing the image is distorted to a certain extent, and training on the distorted images can increase the accuracy of the final image recognition.
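The distortion-based preprocessing can be illustrated with a toy pipeline. The transforms below are deliberately simple stand-ins operating on a nested-list "image"; a real pipeline would apply rotation, contrast enhancement, tilt and scaling with an image library.

```python
import random

def adjust_contrast(img, factor):
    """Toy contrast/brightness change: scale every pixel, capped at 255."""
    return [[min(255, int(p * factor)) for p in row] for row in img]

def flip_horizontal(img):
    """Toy geometric distortion standing in for rotation/tilt."""
    return [row[::-1] for row in img]

def augment(img, rng=None):
    """Apply one or more random distortions, as in the preprocessing step."""
    rng = rng or random.Random()
    transforms = [lambda i: adjust_contrast(i, rng.uniform(0.8, 1.2)),
                  flip_horizontal]
    out = img
    for t in rng.sample(transforms, k=rng.randint(1, len(transforms))):
        out = t(out)
    return out
```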
In this solution, the step of generating the detection frame is specifically:
generating an initial detection frame according to initial preset coordinate points;
performing dynamic detection frame prediction, iteratively predicting on the already generated detection frame to generate the latest detection frame;
calculating the coincidence degree of the latest detection frame;
if the coincidence degree of the latest detection frame is greater than or equal to a preset coincidence threshold, keeping the latest detection frame; if it is less than the preset coincidence threshold, continuing the dynamic detection frame prediction;
finally generating N detection frames of the same category.
It should be noted that the initial preset coordinate points are the coordinate points of the preset detection frame, which can be generated automatically during training and recognition, or generated by those skilled in the art according to actual needs.
In this solution, performing dynamic detection frame prediction and iteratively predicting on the already generated detection frame to generate the latest detection frame is specifically:
predicting the 4 coordinates of each detection frame, (t_x, t_y, t_w, t_h); if the cell is offset from the top-left corner of the image by (c_x, c_y), and the detection frame predicted in the previous step has width p_w and height p_h, then the coordinates of the latest detection frame are:
b_x = σ(t_x) + c_x
b_y = σ(t_y) + c_y
b_w = p_w · e^(t_w)
b_h = p_h · e^(t_h)
where b_x, b_y, b_w, b_h are the four coordinate values of the latest detection frame. It should be noted that the detection frame is a quadrilateral, and its position can be determined from these 4 values.
The network predicts the 4 coordinates (t_x, t_y, t_w, t_h) of each detection frame; given the cell's offset (c_x, c_y) from the top-left corner of the image, the coordinates of the latest detection frame expressed by the above formulas can be obtained.
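The four update equations translate directly into code. This is a minimal sketch under the assumption that σ is the logistic sigmoid, the standard reading of this decoding step.

```python
import math

def sigmoid(x):
    """Logistic sigmoid, the σ in the formulas above."""
    return 1.0 / (1.0 + math.exp(-x))

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Map predicted offsets (t_x, t_y, t_w, t_h) to box coordinates,
    given the cell offset (c_x, c_y) and the prior size (p_w, p_h)."""
    bx = sigmoid(tx) + cx
    by = sigmoid(ty) + cy
    bw = pw * math.exp(tw)
    bh = ph * math.exp(th)
    return bx, by, bw, bh
```

With all-zero offsets the box stays centered on the cell with the prior's size: `decode_box(0, 0, 0, 0, 2, 3, 4, 5)` gives `(2.5, 3.5, 4.0, 5.0)`.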
It should be noted that each frame uses multi-label classification to predict the classes the bounding box may contain. In the class recognition process, this application uses the binary cross-entropy loss technique for class prediction. The main reason for using binary cross-entropy loss for category prediction is that the applicant found a softmax is not needed for good performance; independent logistic classifiers are used instead, so this step does not require the softmax technique. When the method of this application is migrated to more complex category recognition domains, the binary cross-entropy loss technique provides more help. Binary cross-entropy loss is a common technique in this field, and those skilled in the art can implement it as required, so it is not described here in detail.
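A minimal scalar version of this per-class binary cross-entropy (our illustration, not the patent's implementation): each of the 80 classes gets an independent logistic prediction, so one detection frame can carry several labels at once.

```python
import math

def bce(y_true, y_pred, eps=1e-7):
    """Binary cross-entropy for one class; y_pred is clamped to avoid log(0)."""
    p = min(max(y_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

def multilabel_loss(targets, preds):
    """Sum of independent per-class BCE terms for one detection frame."""
    return sum(bce(t, p) for t, p in zip(targets, preds))
```

Because each class term is independent, no normalization across classes is imposed, which is exactly why a softmax is unnecessary here.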
Preferably, the neural network model uses 53 convolutional layers, and the convolution operations alternate between 3×3 and 1×1 convolutional layers. The applicant found, through a limited number of actual tests, that alternating these convolutional layers can increase accuracy and also effectively increase operation speed. Specifically, a 3×3 convolution is applied first, then a 1×1 convolution, and so on alternately until all convolutional layers have taken part in the operation.
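The alternation rule can be written down as a simple schedule. This is illustrative only: it captures the 3×3-then-1×1 alternation across 53 layers, not the actual channel widths or shortcut connections of the network.

```python
def kernel_schedule(num_layers=53):
    """Kernel size of each convolutional layer: 3x3 first, then 1x1, alternating."""
    return [3 if i % 2 == 0 else 1 for i in range(num_layers)]
```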
In this solution, the size is the size specified by the neural network model.
In order to better illustrate the technical solution of the present application, it is explained in detail with an example below.
After the detection frames are generated, the classification probability values in the detection frames are calculated, and the optimal N detection frames of the same class are selected. It should be noted that the size of a detection frame is predicted dynamically, following the scheme described above. A probability threshold is used to filter the M classification probability values of all detection frames, with the following screening rules:
Calculate the classification probability values of each detection frame, sort them from largest to smallest, and select the highest-ranked classification. This step can be seen as the first round of screening: the M classifications within each detection frame are first compared internally, and the champion classification with the highest probability value is selected.
Compare the highest-ranked classification with the preset probability threshold; if it is greater than or equal to the preset probability threshold, keep the detection frame; if it is less, delete the detection frame. In this second round of screening, the champion classification is compared with the probability threshold, and only detection frames whose value exceeds the threshold qualify for the final round. For example, the probability threshold can be set to 0.24 (24%). After the comparison, the qualifying detection frames are displayed on the picture: any classification probability value greater than or equal to the 0.24 threshold is displayed. Those skilled in the art can set the probability threshold according to actual needs; the thresholds described here do not limit the protection scope of this application.
Perform a coincidence-degree calculation on the N detection frames of the same class, and keep the detection frame with the highest coincidence degree.
For example, after the above screening steps, three detection frames all detect the classification "horse".
Sort the three detections by their respective detection probabilities in descending order.
Calculate the coincidence degree (IoU) pairwise; if the calculated IoU > 0.3, eliminate the detection frame with the lower probability.
Finally, a unique detection frame classified as "horse" is obtained.
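The pairwise coincidence-degree elimination in the "horse" example can be sketched as follows, with boxes given as (x, y, width, height). This follows the worked example's rule of dropping the lower-probability frame when IoU > 0.3; the function names are ours.

```python
def iou(a, b):
    """Intersection-over-union of two (x, y, width, height) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def suppress(detections, threshold=0.3):
    """Keep the highest-probability frame among heavily overlapping ones."""
    kept = []
    for box, prob in sorted(detections, key=lambda d: -d[1]):
        if all(iou(box, k) <= threshold for k, _ in kept):
            kept.append((box, prob))
    return kept
```

Applied to three same-class frames, two overlapping frames collapse to the higher-probability one, leaving a single "horse" detection per location.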
Figure 2 shows a schematic diagram of the convolution operation in the classification process of this application.
As shown in Figure 2, the neural network model uses 53 convolutional layers, alternating between 3×3 and 1×1 convolutional layers.
This feature extraction method achieves the highest measured floating-point operations per second. This means the neural network structure can make better use of the machine's GPU, improving evaluation efficiency and therefore speed. Because ResNets have too many layers and are not efficient, the convolution operation of this application can achieve higher efficiency and higher accuracy.
For example, each neural network is trained with the same settings and tested with single-crop accuracy at 256×256. The performance of the classifier using the feature extraction of this application is comparable to the most advanced classifiers in the prior art, but with fewer floating-point operations and higher speed.
Figure 3 shows a block diagram of an electronic device to which the above YOLO-based image target recognition method of this application is applied.
As shown in Figure 3, the technical solution of the present application also proposes an electronic device 2, including: a memory 201, a processor 202, and a camera device 203, where the memory 201 includes a YOLO-based image target recognition program which, when executed by the processor, implements the following steps:
receiving an image to be detected;
adjusting the size of the image to be detected according to preset requirements to generate a first detection image;
sending the first detection image to a neural network model for matching recognition, and generating a detection frame, classification identification information, and a classification probability value corresponding to the classification identification information;
judging whether the classification probability value is greater than a preset classification probability threshold;
if it is greater, using the detection frame and the classification identification information as the recognized classification result.
The first detection image is sent to the neural network model, generating a detection frame, classification identification information, and a classification probability value corresponding to the classification identification information. Those skilled in the art can set the classification probability threshold according to actual needs. For example, if the classification probability threshold is set to 90%, then when detecting a picture containing a kitten, if the probability of identifying the kitten in the detection frame exceeds 90%, the kitten is circled in the detection frame and the cat in the picture has been recognized. If the classification probability value is less than the preset classification probability threshold, the method returns to step S106 for re-recognition until the classification probability value is greater than the preset classification probability threshold. The neural network model performs multi-layer convolution operations on the image. The YOLO convolution operation is a conventional operation in this field and belongs to the prior art, so it is not described here in detail.
In this solution, before receiving the image to be detected, the method further includes:
进行图片训练,得到神经网络模型;所述神经网络模型通过如下步骤进行训练:Perform image training to obtain a neural network model; the neural network model is trained through the following steps:
获取训练图像数据集;Obtain a training image data set;
将所述训练图像数据集进行图像预处理,得到预处理后的图像集;Image preprocessing the training image data set to obtain a preprocessed image set;
将所述预处理后的图像集进行训练,得到具备输入接口和输出接口的神经网络模型。The preprocessed image set is trained to obtain a neural network model with an input interface and an output interface.
In this solution, the step of generating the detection frame specifically includes:
generating an initial detection frame according to initial preset coordinate points;
performing dynamic detection frame prediction, iteratively predicting on the detection frames already generated to produce the latest detection frame;
calculating the coincidence degree of the latest detection frame;
if the coincidence degree of the latest detection frame is greater than or equal to a preset coincidence degree threshold, retaining the latest detection frame; if the coincidence degree of the latest detection frame is less than the preset coincidence degree threshold, continuing the dynamic detection frame prediction;
finally generating N detection frames of the same category.
In this solution, performing the dynamic detection frame prediction and iteratively predicting on the detection frames already generated to produce the latest detection frame specifically includes:
predicting the 4 coordinates (t_x, t_y, t_w, t_h) of each detection frame; if the cell is offset from the upper-left corner of the image by (c_x, c_y), and the detection frame predicted in the previous step has width p_w and height p_h, the coordinates of the latest detection frame are:
b_x = σ(t_x) + c_x
b_y = σ(t_y) + c_y
b_w = p_w · e^(t_w)
b_h = p_h · e^(t_h)
In this application, dimension clustering can be used to obtain anchor boxes for dynamically predicting the detection frame; the detection frame is also referred to as a bounding box. The network predicts the 4 coordinates t_x, t_y, t_w, t_h of each detection frame. Given the offset (c_x, c_y) of the cell from the upper-left corner of the image, the coordinates of the latest detection frame expressed by the above formulas can be obtained, where b_x, b_y, b_w, and b_h are the four coordinate values of the latest detection frame. It should be noted that the detection frame is a quadrilateral, and its position can be determined by these 4 values.
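The coordinate equations above can be decoded numerically as in the following sketch, where σ is the logistic sigmoid and the cell offset (c_x, c_y) and prior dimensions (p_w, p_h) are assumed inputs, as in the description; the concrete values are illustrative only.

```python
import math

# Illustrative decoding of the box-coordinate equations above. `sigmoid` is
# the logistic function sigma; cell offsets (cx, cy) and prior box dimensions
# (pw, ph) are assumed inputs. Example values are placeholders.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    bx = sigmoid(tx) + cx          # b_x = sigma(t_x) + c_x
    by = sigmoid(ty) + cy          # b_y = sigma(t_y) + c_y
    bw = pw * math.exp(tw)         # b_w = p_w * e^(t_w)
    bh = ph * math.exp(th)         # b_h = p_h * e^(t_h)
    return bx, by, bw, bh

# With all-zero network outputs, the center lands at the cell corner plus
# sigmoid(0) = 0.5 and the box keeps the prior's width and height.
print(decode_box(0.0, 0.0, 0.0, 0.0, cx=3, cy=4, pw=2.0, ph=1.5))  # (3.5, 4.5, 2.0, 1.5)
```

The sigmoid keeps the predicted center inside its cell, while the exponential scales the prior box, which is why the x/y and w/h terms use different transforms.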
It should be noted that each box uses multi-label classification to predict the classes that the bounding box may contain. During class recognition, this application uses a binary cross-entropy loss for class prediction. The main reason for using binary cross-entropy loss for class prediction is that the applicant found that softmax is not required for good performance; independent logistic classifiers suffice, so softmax is not used in this step. When the method of this application is migrated to more complex class recognition domains, the binary cross-entropy loss is even more helpful. Binary cross-entropy loss is a common technique in the art, and those skilled in the art can implement it as needed, so it is not described in detail in this application.
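A minimal sketch of the independent-logistic-classifier idea discussed above follows. The logits and target labels are made-up example values; the averaging over classes is one common convention, not mandated by the description.

```python
import math

# Minimal sketch of multi-label class prediction with independent logistic
# classifiers and a binary cross-entropy loss, as discussed above. Each class
# gets its own sigmoid, so several classes can be "on" at once (unlike softmax).
# Example logits/targets are placeholders.
def bce_loss(logits, targets):
    total = 0.0
    for z, y in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-z))                       # per-class sigmoid
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(logits)                                # mean over classes

loss = bce_loss([2.0, -1.0, 0.5], [1, 0, 1])
print(round(loss, 4))
```

Because each class is scored independently, this formulation transfers naturally to domains where labels overlap (e.g., "woman" and "person"), which is the advantage alluded to above.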
In this solution, before receiving the image to be detected, the method further includes:
performing picture training to obtain a neural network model; the neural network model is a model having an input interface and an output interface, obtained by image training on different categories of pictures.
Preferably, the neural network model uses 53 convolution layers, with 3×3 and 1×1 convolution layers computed alternately. The applicant found through a limited number of actual tests that alternating the convolution layers in this way can increase accuracy and also effectively increase operation speed. Specifically, the alternating computation first applies a 3×3 convolution operation, then a 1×1 convolution operation, and continues alternating in turn until all convolution layers have participated in the operation.
In this solution, the size is the size specified by the neural network model.
It should be noted that the neural network model uses 53 convolution layers, with the 3×3 and 1×1 convolution layers alternating.
This feature extraction approach achieves the highest measured floating-point operations per second, which means the network structure makes better use of the machine's GPU, improving evaluation efficiency and thus speed. Because ResNets have too many layers and are not efficient, the convolution operation of this application achieves higher efficiency and higher accuracy.
For example, each neural network is trained with the same settings and tested at a single-crop resolution of 256×256. The classifier using the feature extraction of this application performs comparably to the most advanced classifiers in the prior art, but with fewer floating-point operations and higher speed.
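The alternating 3×3 / 1×1 layer schedule described above can be sketched as a simple layer-specification list. This is only an illustration of the alternation pattern; the layer count parameter and the absence of channel widths, strides, and residual connections are simplifying assumptions, not the patented architecture.

```python
# Illustrative construction of an alternating 3x3 / 1x1 convolution-layer
# schedule, as described for the 53-layer feature extractor above. Only the
# kernel-size alternation is modeled; channels, strides, and shortcut
# connections are omitted as simplifying assumptions.
def build_layer_schedule(num_layers=53):
    """Return kernel sizes alternating 3x3, 1x1, 3x3, ... for num_layers layers."""
    return [(3, 3) if i % 2 == 0 else (1, 1) for i in range(num_layers)]

schedule = build_layer_schedule()
print(len(schedule), schedule[:4])  # 53 [(3, 3), (1, 1), (3, 3), (1, 1)]
```

The 1×1 layers act as cheap channel-mixing steps between the spatial 3×3 convolutions, which is one way to read the efficiency claim above.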
The classification probability values in the detection frames are calculated, and the optimal N detection frames of the same category are screened out. A probability threshold is used to screen the M classification probability values of all detection frames according to the following screening rules:
Calculate the classification probability value of each detection frame, arrange the classification probability values in descending order, and select the highest-ranked classification. This step can be regarded as the first round of screening: the M classifications within each detection frame are compared, and the "champion" classification with the highest probability value is selected.
Compare the highest-ranked classification with a preset probability threshold; if it is greater than or equal to the preset probability threshold, retain the detection frame; if it is less than the preset probability threshold, delete the detection frame. This can be regarded as the second round of screening: the champion classification is compared with the probability threshold, and only detection frames whose value is greater than the probability threshold are eligible for the final round. For example, the probability threshold can be set to 0.24 (24%). After the comparison, the detection frames that pass this preliminary round are displayed on the picture; any classification probability value greater than or equal to the 0.24 threshold is displayed.
The coincidence degree of the N detection frames of the same category is then calculated, and the detection frame with the highest coincidence degree is retained.
For example, suppose that after the above screening steps three detection frames all detect the classification "horse".
The detection probabilities of the three detections are sorted in descending order.
The coincidence degree (IoU) is calculated pairwise; if the calculated value satisfies IoU > 0.3, the detection frame with the lower probability is eliminated.
Finally, a unique detection frame classified as "horse" is obtained.
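The two-round screening rules above can be sketched as follows. The 0.24 probability threshold and 0.3 IoU threshold follow the text; the `(x1, y1, x2, y2)` box format and the per-box `(box, prob)` tuples are assumptions for the example.

```python
# Hedged sketch of the screening rules above: probability thresholding, then
# pairwise IoU suppression keeping the higher-probability box. Box format
# (x1, y1, x2, y2) is an assumption; 0.24 and 0.3 thresholds follow the text.
def iou(a, b):
    """Intersection-over-union (coincidence degree) of two boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def screen(boxes, prob_threshold=0.24, iou_threshold=0.3):
    # Round 1: drop boxes whose champion-class probability is below the threshold.
    kept = [b for b in boxes if b[1] >= prob_threshold]
    # Round 2: sort by probability and suppress lower-probability overlapping boxes.
    kept.sort(key=lambda b: b[1], reverse=True)
    final = []
    for box in kept:
        if all(iou(box[0], f[0]) <= iou_threshold for f in final):
            final.append(box)
    return final

boxes = [((0, 0, 10, 10), 0.9), ((1, 1, 10, 10), 0.8), ((50, 50, 60, 60), 0.1)]
print(screen(boxes))  # the overlapping 0.8 box and the 0.1 box are removed
```

In the example, the 0.8 box overlaps the 0.9 box with IoU 0.81 > 0.3 and is eliminated, and the 0.1 box fails the probability threshold, matching the "unique detection frame" outcome described above.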
In addition, this application also proposes a YOLO-based image target recognition apparatus, including: an input module that receives an image to be detected;
an adjustment module that adjusts the size of the image to be detected received by the input module according to preset requirements to generate a first detection image;
a matching recognition module that sends the first detection image generated by the adjustment module to a neural network model for matching recognition, generating a detection frame, classification identification information, and a classification probability value corresponding to the classification identification information;
a judgment module that judges whether the classification probability value is greater than a preset classification probability threshold and, if not, sends a signal to the matching recognition module, or, if so, sends a signal to the classification module;
a classification module that uses the detection frame and the classification identification information as the recognized classification result.
Preferably, the apparatus further includes a training module that performs picture training to obtain the neural network model, the training module including:
a data set obtaining unit that obtains a training image data set;
a preprocessing unit that performs image preprocessing on the training image data set to obtain a preprocessed image set;
a training unit that trains on the preprocessed image set to obtain a neural network model having an input interface and an output interface.
Further and preferably, the data set obtaining unit includes:
a label library storing different labels corresponding to different objects and the label order;
a picture library storing the image data and label sequences of pictures;
a screening unit that selects, from the picture library, a set number of positive samples and negative samples for each label in the total identification label set to form a training set and a validation set, wherein a positive sample of a label is a picture containing the object corresponding to the label, a negative sample of a label is a picture not containing the object corresponding to the label, the training set is the image data of the positive and negative samples, the validation set is the label sequences of the positive and negative samples, and the output of the neural network model is the predicted label sequences of the samples in the training set.
Preferably, the matching recognition module includes:
an initial detection frame generation unit that generates an initial detection frame according to initial preset coordinate points;
a prediction unit that performs dynamic detection frame prediction, iteratively predicting on the detection frames already generated to produce the latest detection frame;
a coincidence degree obtaining unit that calculates the coincidence degree of the latest detection frame;
a screening unit that retains the latest detection frame if its coincidence degree is greater than or equal to a preset coincidence degree threshold, or sends a signal to the prediction unit to continue the dynamic detection frame prediction if its coincidence degree is less than the preset coincidence degree threshold;
a detection frame generation unit that generates N detection frames of the same category.
Preferably, the prediction unit includes:
a prediction subunit that predicts the 4 coordinates (t_x, t_y, t_w, t_h) of each detection frame;
a judgment subunit that judges whether the cell is offset from the upper-left corner of the image by (c_x, c_y) and, if so, sends a signal to the update subunit;
an update subunit that updates the coordinates of the detection frame using the width p_w and height p_h of the detection frame predicted by the prediction subunit, the coordinates of the latest detection frame being:
b_x = σ(t_x) + c_x
b_y = σ(t_y) + c_y
b_w = p_w · e^(t_w)
b_h = p_h · e^(t_h)
where b_x, b_y, b_w, and b_h are the four coordinate values of the latest detection frame.
Preferably, the neural network model uses 53 convolution layers, with 3×3 and 1×1 convolution layers computed alternately.
Preferably, the judgment module includes:
a classification probability obtaining unit that calculates the classification probability value of each detection frame;
a first screening unit that arranges the classification probability values of each detection frame in descending order and selects the highest-ranked classification;
a second screening unit that compares the highest-ranked classification with a preset probability threshold and retains the detection frame if it is greater than or equal to the preset probability threshold, or deletes the detection frame if it is less than the preset probability threshold;
and the classification module includes a third screening unit that calculates the coincidence degree of the retained detection frames of the same category, retains the detection frame with the highest coincidence degree, and uses the detection frame with the highest coincidence degree and its corresponding classification identification information as the recognized classification result.
To better explain the technical solution of this application, an embodiment is described in detail below. FIG. 4 shows a schematic diagram of an embodiment of this application.
As shown in FIG. 4, the convolution layers in the neural network model are numbered 0 to 52. The size-adjusted first detection image is then received, where the size of the first detection image is 416×416; the specific size can be set according to actual computing requirements and computing capability, and 416×416 is selected for description in this embodiment; the image is a color photograph. Layer 0 of the neural network model receives the 416×416, 3-channel (RGB) color first detection image and performs the convolution operation.
After the convolution operations of layers 0 to 51, a 13×13, 425-channel feature map is obtained.
Layer 52 performs a convolution operation on the feature map and finally outputs a one-dimensional prediction array containing 13×13×5×85 values. The multi-dimensional array or matrix is reduced to a one-dimensional array through a series of operations; this one-dimensional array is the prediction array.
Among the 13×13×5×85 values, 13×13 represents the width × height of the feature map, giving a total of 13×13 feature units. YOLO evenly divides the original picture (416×416) into 13×13 regions (cells), one picture region per feature unit. The specific size can be set by those skilled in the art according to actual computing requirements and computing capability.
The number 5 represents 5 detection frames (bounding boxes) of different shapes. YOLO generates 5 differently shaped detection frames in each picture region, using the center point of the region as the center point of the detection frames to detect objects, so YOLO uses 13×13×5 detection frames in total to detect one picture or image.
The number 85 can be understood in 3 parts: 85 = 4 + 1 + 80.
4: each detection frame contains 4 coordinate values (x, y, width, height).
1: each detection frame has 1 object-confidence value, i.e., the above-mentioned confidence (0 to 1), understood as the confidence probability that an object has been detected.
80: each detection frame has 80 classification probability values (0 to 1), understood as the probabilities that the object in the detection frame belongs to each of the classifications.
In summary, the above process divides a 416×416 picture evenly into 13×13 picture regions; each picture region generates 5 detection frames, and each detection frame contains 85 values (4 coordinate values + 1 object-confidence value + 80 classification values). The resulting one-dimensional prediction array (predictions) represents the objects detected in the picture, and the array contains a total of 13×13×5×85 values, predictions[0] to predictions[13×13×5×85−1].
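The layout of the flattened prediction array described above can be made concrete with a small indexing sketch. Only the index arithmetic reflects the described layout (cell row, cell column, box, value); the array contents here are zero placeholders, and the row-major ordering of cells is an assumption.

```python
# Illustrative indexing into the flattened 13*13*5*85 prediction array
# described above. Row-major cell ordering is an assumption; the array
# contents are placeholders, only the offsets model the layout.
GRID, BOXES, VALUES = 13, 5, 85   # 13x13 cells, 5 boxes per cell, 85 values per box

def prediction_index(row, col, box, value):
    """Offset of one value inside the one-dimensional predictions array."""
    return ((row * GRID + col) * BOXES + box) * VALUES + value

predictions = [0.0] * (GRID * GRID * BOXES * VALUES)
print(len(predictions))              # 13*13*5*85 = 71825
print(prediction_index(0, 0, 0, 4))  # confidence value of the first box: index 4
```

Within each 85-value block, indices 0-3 are the box coordinates, index 4 is the object-confidence value, and indices 5-84 are the 80 classification probabilities, mirroring the 85 = 4 + 1 + 80 decomposition above.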
In addition, this application also proposes a computer-readable storage medium including a YOLO-based image target recognition program which, when executed by a processor, implements the steps of the above YOLO-based image target recognition method.
The specific implementation of the computer-readable storage medium of this application is substantially the same as the specific implementations of the above YOLO-based image target recognition method and electronic apparatus, and is not repeated here.
This application proposes a YOLO-based image target recognition method, apparatus, electronic device, and storage medium. The method can effectively improve detection accuracy and reduce detection time. Experiments and verification show that the method of this application is superior to prior-art detection methods, mainly in improved recognition accuracy and increased operation speed.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may all be integrated into one processing unit, or each unit may serve individually as one unit, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
Those of ordinary skill in the art can understand that all or part of the steps of the above method embodiments may be implemented by program instructions and related hardware; the foregoing program may be stored in a computer-readable storage medium, and when executed, the program performs the steps of the above method embodiments. The foregoing storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Alternatively, if the above integrated unit of this application is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium and including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the methods described in the embodiments of this application. The foregoing storage medium includes various media that can store program code, such as a removable storage device, a ROM, a RAM, a magnetic disk, or an optical disc.
The above are only specific implementations of this application, but the protection scope of this application is not limited thereto. Any person skilled in the art can easily conceive of changes or substitutions within the technical scope disclosed in this application, and such changes or substitutions shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (20)

1. A YOLO-based image target recognition method, comprising:
receiving an image to be detected;
adjusting the size of the image to be detected according to preset requirements to generate a first detection image;
sending the first detection image to a neural network model for matching recognition to generate a detection frame, classification identification information, and a classification probability value corresponding to the classification identification information;
judging whether the classification probability value is greater than a preset classification probability threshold;
if it is greater, using the detection frame and the classification identification information as the recognized classification result.
2. The YOLO-based image target recognition method according to claim 1, wherein before receiving the image to be detected, the method further comprises:
performing picture training to obtain the neural network model; the neural network model is trained through the following steps:
obtaining a training image data set;
performing image preprocessing on the training image data set to obtain a preprocessed image set;
training on the preprocessed image set to obtain a neural network model having an input interface and an output interface.
3. The YOLO-based image target recognition method according to claim 2, wherein the step of obtaining the training image data set comprises:
establishing a label library storing different labels corresponding to different objects and the label order;
constructing a picture library storing the image data and label sequences of pictures;
selecting, from the picture library, a set number of positive samples and negative samples for each label in the total identification label set to form a training set and a validation set, wherein a positive sample of a label is a picture containing the object corresponding to the label, a negative sample of a label is a picture not containing the object corresponding to the label, the training set is the image data of the positive and negative samples, the validation set is the label sequences of the positive and negative samples, and the output of the neural network model is the predicted label sequences of the samples in the training set.
4. The YOLO-based image target recognition method according to claim 1, wherein the step of generating the detection frame specifically comprises:
generating an initial detection frame according to initial preset coordinate points;
performing dynamic detection frame prediction, iteratively predicting on the detection frames already generated to produce the latest detection frame;
calculating the coincidence degree of the latest detection frame;
if the coincidence degree of the latest detection frame is greater than or equal to a preset coincidence degree threshold, retaining the latest detection frame; if the coincidence degree of the latest detection frame is less than the preset coincidence degree threshold, continuing the dynamic detection frame prediction;
finally generating N detection frames of the same category.
5. The YOLO-based image target recognition method according to claim 4, wherein performing the dynamic detection frame prediction and iteratively predicting on the detection frames already generated to produce the latest detection frame specifically comprises:
predicting the 4 coordinates (t_x, t_y, t_w, t_h) of each detection frame; if the cell is offset from the upper-left corner of the image by (c_x, c_y), and the detection frame predicted in the previous step has width p_w and height p_h, the coordinates of the latest detection frame are:
b_x = σ(t_x) + c_x
b_y = σ(t_y) + c_y
b_w = p_w · e^(t_w)
b_h = p_h · e^(t_h)
where b_x, b_y, b_w, and b_h are the four coordinate values of the latest detection frame.
6. The YOLO-based image target recognition method according to claim 2, wherein the neural network model uses 53 convolution layers, with 3×3 and 1×1 convolution layers computed alternately.
7. The YOLO-based image target recognition method according to claim 1, wherein the size is the size specified by the neural network model.
8. The YOLO-based image target recognition method according to claim 1, wherein after the step of generating the detection frame, the classification identification information, and the classification probability value corresponding to the classification identification information, the method comprises:
calculating the classification probability value of each detection frame, arranging the classification probability values in descending order, and selecting the highest-ranked classification;
comparing the highest-ranked classification with a preset probability threshold; if it is greater than or equal to the preset probability threshold, retaining the detection frame; if it is less than the preset probability threshold, deleting the detection frame;
calculating the coincidence degree of the retained detection frames of the same category, and retaining the detection frame with the highest coincidence degree.
  9. A YOLO-based image target recognition apparatus, comprising:
    an input module, which receives an image to be detected;
    an adjustment module, which adjusts the size of the image to be detected received by the input module according to preset requirements to generate a first detection image;
    a matching recognition module, which sends the first detection image generated by the adjustment module to a neural network model for matching recognition, generating a detection frame, classification identification information, and a classification probability value corresponding to the classification identification information;
    a judgment module, which judges whether the classification probability value is greater than a preset classification probability threshold and, if not, sends a signal to the matching recognition module or, if so, sends a signal to the classification module;
    a classification module, which takes the detection frame and the classification identification information as the recognized classification result.
  10. The YOLO-based image target recognition apparatus according to claim 9, further comprising a training module that performs image training to obtain the neural network model, the training module comprising:
    a data set acquisition unit, which acquires a training image data set;
    a preprocessing unit, which performs image preprocessing on the training image data set to obtain a preprocessed image set;
    a training unit, which trains on the preprocessed image set to obtain a neural network model with an input interface and an output interface.
  11. The YOLO-based image target recognition apparatus according to claim 10, wherein the data set acquisition unit comprises:
    a label library, which stores the different labels corresponding to different objects and the label order;
    a picture library, which stores the image data and label sequences of pictures;
    a screening unit, which selects from the picture library a set number of positive and negative samples for each label in the overall label set to form a training set and a validation set, wherein a positive sample of a label is a picture containing the object corresponding to that label, a negative sample of a label is a picture not containing that object, the training set is the image data of the positive and negative samples, the validation set is the label sequences of the positive and negative samples, and the output of the neural network model is the predicted label sequences of the samples in the training set.
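The screening unit's sampling can be sketched as below. The `(image_data, label_set)` pair format for the picture library, the function name, and the fixed seed are all illustrative assumptions, not part of the claims:

```python
import random

def build_train_set(picture_library, labels, n_per_label=2, seed=0):
    """For each label, pick a set number of positive samples (pictures
    containing the labelled object) and negative samples (pictures that
    do not), collecting image data (training set) and label sequences
    (validation set)."""
    rng = random.Random(seed)
    train, valid = [], []
    for label in labels:
        positives = [p for p in picture_library if label in p[1]]
        negatives = [p for p in picture_library if label not in p[1]]
        chosen = (rng.sample(positives, min(n_per_label, len(positives))) +
                  rng.sample(negatives, min(n_per_label, len(negatives))))
        for img, tags in chosen:
            train.append(img)            # training set: image data
            valid.append(sorted(tags))   # validation set: label sequence
    return train, valid
```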
  12. The YOLO-based image target recognition apparatus according to claim 9, wherein the matching recognition module comprises:
    an initial detection frame generation unit, which generates an initial detection frame from initial preset coordinate points;
    a prediction unit, which predicts a dynamic detection frame, iteratively predicting over the already generated detection frames to produce the latest detection frame;
    a coincidence degree acquisition unit, which calculates the coincidence degree of the latest detection frame;
    a screening unit, which retains the latest detection frame if its coincidence degree is greater than or equal to a preset coincidence threshold, and otherwise sends a signal to the prediction unit to continue predicting the dynamic detection frame;
    a detection frame generation unit, which generates N detection frames of the same category.
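The coincidence degree computed by the coincidence degree acquisition unit is conventionally an intersection-over-union between two boxes; a minimal sketch, assuming `(x_min, y_min, x_max, y_max)` box coordinates:

```python
def iou(box_a, box_b):
    """Intersection-over-union ("coincidence degree") of two axis-aligned
    boxes given as (x_min, y_min, x_max, y_max)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)  # overlap area
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Identical boxes score 1.0; disjoint boxes score 0.0, so the preset coincidence threshold falls in [0, 1].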
  13. The YOLO-based image target recognition apparatus according to claim 12, wherein the prediction unit comprises:
    a prediction subunit, which predicts the four coordinates (t_x, t_y, t_w, t_h) of each detection frame;
    a judgment subunit, which judges whether the cell deviates from the upper-left corner coordinates (c_x, c_y) of the image and, if so, sends a signal to the update subunit;
    an update subunit, which updates the coordinates of the detection frame using the width p_w and height p_h of the detection frame predicted by the prediction subunit, the coordinates of the latest detection frame being:
    b_x = σ(t_x) + c_x
    b_y = σ(t_y) + c_y
    b_w = p_w · e^(t_w)
    b_h = p_h · e^(t_h)
    where b_x, b_y, b_w, and b_h are the four coordinate values of the latest detection frame.
  14. The YOLO-based image target recognition apparatus according to claim 9, wherein the neural network model uses 53 convolutional layers, alternating 3×3 and 1×1 convolutions.
  15. The YOLO-based image target recognition apparatus according to claim 9, wherein the judgment module comprises:
    a classification probability acquisition unit, which calculates the classification probability value of each detection frame;
    a first screening unit, which sorts the classification probability values of the detection frames in descending order and selects the highest-ranked classification;
    a second screening unit, which compares the highest-ranked classification with a preset probability threshold and retains the detection frame if it is greater than or equal to the preset probability threshold, or deletes the detection frame if it is less than the preset probability threshold;
    and wherein the classification module comprises a third screening unit, which computes the coincidence degree of the retained detection frames of the same class, retains the detection frame with the highest coincidence degree, and takes that detection frame and its corresponding classification identification information as the recognized classification result.
  16. An electronic device, comprising a memory, a processor, and a camera device, wherein the memory includes a YOLO-based image target recognition program which, when executed by the processor, implements the following steps:
    receiving an image to be detected;
    adjusting the size of the image to be detected according to preset requirements to generate a first detection image;
    sending the first detection image to a neural network model for matching recognition, generating a detection frame, classification identification information, and a classification probability value corresponding to the classification identification information;
    judging whether the classification probability value is greater than a preset classification probability threshold;
    if it is greater, taking the detection frame and the classification identification information as the recognized classification result.
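The steps above can be sketched end to end as below. The resize and model callables are stand-ins for the claimed adjustment and matching-recognition steps, and the 416×416 default input size is an assumption, not stated in the claims:

```python
def recognize(image, model_fn, resize_fn, input_size=(416, 416), prob_threshold=0.5):
    """Resize the received image, run the network, and keep only
    detections whose classification probability exceeds the preset
    threshold; model_fn returns (box, label, probability) tuples."""
    first_image = resize_fn(image, input_size)   # adjust to the model's required size
    detections = model_fn(first_image)           # matching recognition
    return [(box, label)                         # classification result
            for box, label, prob in detections
            if prob > prob_threshold]            # preset classification probability threshold
```

With stub callables, a detection scoring 0.9 survives a 0.5 threshold while one scoring 0.4 is dropped.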
  17. The electronic device according to claim 16, wherein the step of generating the detection frame comprises:
    generating an initial detection frame from initial preset coordinate points;
    predicting a dynamic detection frame, performing iterative prediction on the already generated detection frame to generate the latest detection frame;
    calculating the coincidence degree of the latest detection frame;
    if the coincidence degree of the latest detection frame is greater than or equal to a preset coincidence threshold, retaining the latest detection frame; if it is less than the preset coincidence threshold, continuing to predict the dynamic detection frame;
    finally generating N detection frames of the same category.
  18. The electronic device according to claim 16, wherein the step of predicting the dynamic detection frame, performing iterative prediction on the already generated detection frame, and generating the latest detection frame comprises:
    predicting the four coordinates (t_x, t_y, t_w, t_h) of each detection frame; if the cell deviates from the upper-left corner coordinates (c_x, c_y) of the image, and the detection frame predicted in the previous step has width p_w and height p_h, the coordinates of the latest detection frame are:
    b_x = σ(t_x) + c_x
    b_y = σ(t_y) + c_y
    b_w = p_w · e^(t_w)
    b_h = p_h · e^(t_h)
    where b_x, b_y, b_w, and b_h are the four coordinate values of the latest detection frame.
  19. The electronic device according to claim 16, wherein the training of the neural network model comprises:
    acquiring a training image data set;
    performing image preprocessing on the training image data set to obtain a preprocessed image set;
    training on the preprocessed image set to obtain a neural network model with an input interface and an output interface.
  20. A computer non-volatile readable storage medium, wherein the computer non-volatile readable storage medium includes a YOLO-based image target recognition program which, when executed by a processor, implements the steps of the YOLO-based image target recognition method according to any one of claims 1 to 8.
PCT/CN2019/118499 2019-02-14 2019-11-14 Yolo-based image target recognition method and apparatus, electronic device, and storage medium WO2020164282A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910114621.5A CN109977943B (en) 2019-02-14 2019-02-14 Image target recognition method, system and storage medium based on YOLO
CN201910114621.5 2019-02-14

Publications (1)

Publication Number Publication Date
WO2020164282A1 true WO2020164282A1 (en) 2020-08-20

Family

ID=67076997

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118499 WO2020164282A1 (en) 2019-02-14 2019-11-14 Yolo-based image target recognition method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN109977943B (en)
WO (1) WO2020164282A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107527009A (en) * 2017-07-11 2017-12-29 浙江汉凡软件科技有限公司 Abandoned object detection method based on YOLO target detection
CN108154098A (en) * 2017-12-20 2018-06-12 歌尔股份有限公司 Robot target recognition method, device, and robot
CN109117794A (en) * 2018-08-16 2019-01-01 广东工业大学 Moving target behavior tracking method, apparatus, equipment, and readable storage medium
CN109977943A (en) * 2019-02-14 2019-07-05 平安科技(深圳)有限公司 YOLO-based image target recognition method, system, and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107247956B (en) * 2016-10-09 2020-03-27 成都快眼科技有限公司 Rapid target detection method based on grid judgment
CN107423760A (en) * 2017-07-21 2017-12-01 西安电子科技大学 Deep learning object detection method based on pre-segmentation and regression

Cited By (183)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112101134A (en) * 2020-08-24 2020-12-18 深圳市商汤科技有限公司 Object detection method and device, electronic device and storage medium
CN112101134B (en) * 2020-08-24 2024-01-02 深圳市商汤科技有限公司 Object detection method and device, electronic equipment and storage medium
CN112036286A (en) * 2020-08-25 2020-12-04 北京华正明天信息技术股份有限公司 Method for temperature sensing and intelligent flame analysis and recognition based on the YOLOv3 algorithm
CN111986255A (en) * 2020-09-07 2020-11-24 北京凌云光技术集团有限责任公司 Multi-scale anchor initialization method and device of image detection model
CN111986255B (en) * 2020-09-07 2024-04-09 凌云光技术股份有限公司 Multi-scale anchor initializing method and device of image detection model
CN112036507A (en) * 2020-09-25 2020-12-04 北京小米松果电子有限公司 Training method and device of image recognition model, storage medium and electronic equipment
CN112036507B (en) * 2020-09-25 2023-11-14 北京小米松果电子有限公司 Training method and device of image recognition model, storage medium and electronic equipment
CN112149748B (en) * 2020-09-28 2024-05-21 商汤集团有限公司 Image classification method and device, electronic equipment and storage medium
CN112149748A (en) * 2020-09-28 2020-12-29 商汤集团有限公司 Image classification method and device, electronic equipment and storage medium
CN112183358B (en) * 2020-09-29 2024-04-23 新石器慧通(北京)科技有限公司 Training method and device for target detection model
CN112183358A (en) * 2020-09-29 2021-01-05 新石器慧拓(北京)科技有限公司 Training method and device for target detection model
CN112200186A (en) * 2020-10-15 2021-01-08 上海海事大学 Car logo identification method based on improved YOLO_V3 model
CN112200186B (en) * 2020-10-15 2024-03-15 上海海事大学 Vehicle logo identification method based on improved YOLO_V3 model
CN112231497A (en) * 2020-10-19 2021-01-15 腾讯科技(深圳)有限公司 Information classification method and device, storage medium and electronic equipment
CN112231497B (en) * 2020-10-19 2024-04-09 腾讯科技(深圳)有限公司 Information classification method and device, storage medium and electronic equipment
CN112348778A (en) * 2020-10-21 2021-02-09 深圳市优必选科技股份有限公司 Object identification method and device, terminal equipment and storage medium
CN112348778B (en) * 2020-10-21 2023-10-27 深圳市优必选科技股份有限公司 Object identification method, device, terminal equipment and storage medium
CN112288003A (en) * 2020-10-28 2021-01-29 北京奇艺世纪科技有限公司 Neural network training and target detection method and device
CN112365465B (en) * 2020-11-09 2024-02-06 浙江大华技术股份有限公司 Synthetic image category determining method and device, storage medium and electronic device
CN112330641A (en) * 2020-11-09 2021-02-05 迩言(上海)科技有限公司 Grain imperfect grain identification method and system based on deep learning
CN112365465A (en) * 2020-11-09 2021-02-12 浙江大华技术股份有限公司 Method and apparatus for determining type of synthesized image, storage medium, and electronic apparatus
CN112287884B (en) * 2020-11-19 2024-02-20 长江大学 Examination abnormal behavior detection method and device and computer readable storage medium
CN112287884A (en) * 2020-11-19 2021-01-29 长江大学 Examination abnormal behavior detection method and device and computer readable storage medium
CN112364807A (en) * 2020-11-24 2021-02-12 深圳市优必选科技股份有限公司 Image recognition method and device, terminal equipment and computer readable storage medium
CN112348112B (en) * 2020-11-24 2023-12-15 深圳市优必选科技股份有限公司 Training method and training device for image recognition model and terminal equipment
CN112348112A (en) * 2020-11-24 2021-02-09 深圳市优必选科技股份有限公司 Training method and device for image recognition model and terminal equipment
CN112364807B (en) * 2020-11-24 2023-12-15 深圳市优必选科技股份有限公司 Image recognition method, device, terminal equipment and computer readable storage medium
CN112560586A (en) * 2020-11-27 2021-03-26 国家电网有限公司大数据中心 Method and device for obtaining structured data of pole and tower signboard and electronic equipment
CN112560586B (en) * 2020-11-27 2024-05-10 国家电网有限公司大数据中心 Method and device for obtaining structural data of pole and tower signboard and electronic equipment
CN112489015A (en) * 2020-11-27 2021-03-12 广州高新兴机器人有限公司 Chemical fiber impurity floating identification method for mobile robot
CN112634202A (en) * 2020-12-04 2021-04-09 浙江省农业科学院 Method, device and system for detecting behavior of polyculture fish shoal based on YOLOv3-Lite
CN113723157A (en) * 2020-12-15 2021-11-30 京东数字科技控股股份有限公司 Crop disease identification method and device, electronic equipment and storage medium
CN112613097A (en) * 2020-12-15 2021-04-06 中铁二十四局集团江苏工程有限公司 BIM rapid modeling method based on computer vision
CN112507912B (en) * 2020-12-15 2024-06-11 杭州网易智企科技有限公司 Method and device for identifying illegal pictures
CN113723157B (en) * 2020-12-15 2024-02-09 京东科技控股股份有限公司 Crop disease identification method and device, electronic equipment and storage medium
CN112507912A (en) * 2020-12-15 2021-03-16 网易(杭州)网络有限公司 Method and device for identifying illegal picture
CN112633352A (en) * 2020-12-18 2021-04-09 浙江大华技术股份有限公司 Target detection method and device, electronic equipment and storage medium
CN112633352B (en) * 2020-12-18 2023-08-29 浙江大华技术股份有限公司 Target detection method and device, electronic equipment and storage medium
CN112634327A (en) * 2020-12-21 2021-04-09 合肥讯图信息科技有限公司 Tracking method based on YOLOv4 model
CN112633159B (en) * 2020-12-22 2024-04-12 北京迈格威科技有限公司 Human-object interaction relation identification method, model training method and corresponding device
CN112580523A (en) * 2020-12-22 2021-03-30 平安国际智慧城市科技股份有限公司 Behavior recognition method, behavior recognition device, behavior recognition equipment and storage medium
CN112633159A (en) * 2020-12-22 2021-04-09 北京迈格威科技有限公司 Human-object interaction relation recognition method, model training method and corresponding device
CN112699925A (en) * 2020-12-23 2021-04-23 国网安徽省电力有限公司检修分公司 Transformer substation meter image classification method
CN112529020A (en) * 2020-12-24 2021-03-19 携程旅游信息技术(上海)有限公司 Animal identification method, system, equipment and storage medium based on neural network
CN112529020B (en) * 2020-12-24 2024-05-24 携程旅游信息技术(上海)有限公司 Animal identification method, system, equipment and storage medium based on neural network
CN112580734A (en) * 2020-12-25 2021-03-30 深圳市优必选科技股份有限公司 Target detection model training method, system, terminal device and storage medium
CN112541483A (en) * 2020-12-25 2021-03-23 三峡大学 Dense face detection method combining YOLO and blocking-fusion strategy
CN112541483B (en) * 2020-12-25 2024-05-17 深圳市富浩鹏电子有限公司 Dense face detection method combining YOLO and blocking-fusion strategy
CN112580734B (en) * 2020-12-25 2023-12-29 深圳市优必选科技股份有限公司 Target detection model training method, system, terminal equipment and storage medium
CN112633286A (en) * 2020-12-25 2021-04-09 北京航星机器制造有限公司 Intelligent security inspection system based on similarity rate and recognition probability of dangerous goods
CN112597915A (en) * 2020-12-26 2021-04-02 上海有个机器人有限公司 Method, device, medium and robot for identifying indoor close-distance pedestrians
CN112597915B (en) * 2020-12-26 2024-04-09 上海有个机器人有限公司 Method, device, medium and robot for identifying indoor close-distance pedestrians
CN112734641A (en) * 2020-12-31 2021-04-30 百果园技术(新加坡)有限公司 Training method and device of target detection model, computer equipment and medium
CN112734641B (en) * 2020-12-31 2024-05-31 百果园技术(新加坡)有限公司 Training method and device for target detection model, computer equipment and medium
CN112784694A (en) * 2020-12-31 2021-05-11 杭州电子科技大学 EVP-YOLO-based indoor article detection method
CN112560799B (en) * 2021-01-05 2022-08-05 北京航空航天大学 Unmanned aerial vehicle intelligent vehicle target detection method based on adaptive target area search and game and application
CN112560799A (en) * 2021-01-05 2021-03-26 北京航空航天大学 Unmanned aerial vehicle intelligent vehicle target detection method based on adaptive target area search and game and application
CN112733741A (en) * 2021-01-14 2021-04-30 苏州挚途科技有限公司 Traffic signboard identification method and device and electronic equipment
CN112818980A (en) * 2021-01-15 2021-05-18 湖南千盟物联信息技术有限公司 Steel ladle number detection and identification method based on Yolov3 algorithm
CN112766170B (en) * 2021-01-21 2024-04-16 广西财经学院 Self-adaptive segmentation detection method and device based on cluster unmanned aerial vehicle image
CN112766170A (en) * 2021-01-21 2021-05-07 广西财经学院 Self-adaptive segmentation detection method and device based on cluster unmanned aerial vehicle image
CN112906478A (en) * 2021-01-22 2021-06-04 北京百度网讯科技有限公司 Target object identification method, device, equipment and storage medium
CN112906478B (en) * 2021-01-22 2024-01-09 北京百度网讯科技有限公司 Target object identification method, device, equipment and storage medium
CN112989924B (en) * 2021-01-26 2024-05-24 深圳市优必选科技股份有限公司 Target detection method, target detection device and terminal equipment
CN112989924A (en) * 2021-01-26 2021-06-18 深圳市优必选科技股份有限公司 Target detection method, target detection device and terminal equipment
CN112906495B (en) * 2021-01-27 2024-04-30 深圳安智杰科技有限公司 Target detection method and device, electronic equipment and storage medium
CN112906495A (en) * 2021-01-27 2021-06-04 深圳安智杰科技有限公司 Target detection method and device, electronic equipment and storage medium
CN114821288A (en) * 2021-01-29 2022-07-29 中强光电股份有限公司 Image identification method and unmanned aerial vehicle system
CN112800971A (en) * 2021-01-29 2021-05-14 深圳市商汤科技有限公司 Neural network training and point cloud data processing method, device, equipment and medium
CN112911171A (en) * 2021-02-04 2021-06-04 上海航天控制技术研究所 Intelligent photoelectric information processing system and method based on accelerated processing
CN112911171B (en) * 2021-02-04 2022-04-22 上海航天控制技术研究所 Intelligent photoelectric information processing system and method based on accelerated processing
CN112861716A (en) * 2021-02-05 2021-05-28 深圳市安软科技股份有限公司 Illegal article placement monitoring method, system, equipment and storage medium
CN112861711A (en) * 2021-02-05 2021-05-28 深圳市安软科技股份有限公司 Regional intrusion detection method and device, electronic equipment and storage medium
CN113762023B (en) * 2021-02-18 2024-05-24 北京京东振世信息技术有限公司 Object identification method and device based on article association relation
CN113762023A (en) * 2021-02-18 2021-12-07 北京京东振世信息技术有限公司 Object identification method and device based on article incidence relation
CN112906794A (en) * 2021-02-22 2021-06-04 珠海格力电器股份有限公司 Target detection method, device, storage medium and terminal
CN113095133A (en) * 2021-03-04 2021-07-09 北京迈格威科技有限公司 Model training method, target detection method and corresponding device
CN113095133B (en) * 2021-03-04 2023-12-29 北京迈格威科技有限公司 Model training method, target detection method and corresponding devices
CN112906621A (en) * 2021-03-10 2021-06-04 北京华捷艾米科技有限公司 Hand detection method, device, storage medium and equipment
CN112966618B (en) * 2021-03-11 2024-02-09 京东科技信息技术有限公司 Dressing recognition method, apparatus, device and computer readable medium
CN112966618A (en) * 2021-03-11 2021-06-15 京东数科海益信息科技有限公司 Dressing identification method, device, equipment and computer readable medium
CN113011319B (en) * 2021-03-16 2024-04-16 上海应用技术大学 Multi-scale fire target identification method and system
CN112966762A (en) * 2021-03-16 2021-06-15 南京恩博科技有限公司 Wild animal detection method and device, storage medium and electronic equipment
CN113011319A (en) * 2021-03-16 2021-06-22 上海应用技术大学 Multi-scale fire target identification method and system
CN112966762B (en) * 2021-03-16 2023-12-26 南京恩博科技有限公司 Wild animal detection method and device, storage medium and electronic equipment
CN112991304A (en) * 2021-03-23 2021-06-18 武汉大学 Molten pool sputtering detection method based on laser directional energy deposition monitoring system
CN112990334A (en) * 2021-03-29 2021-06-18 西安电子科技大学 Small sample SAR image target identification method based on improved prototype network
CN113222889A (en) * 2021-03-30 2021-08-06 大连智慧渔业科技有限公司 Industrial aquaculture counting method and device for aquatic aquaculture objects under high-resolution images
CN113222889B (en) * 2021-03-30 2024-03-12 大连智慧渔业科技有限公司 Industrial aquaculture counting method and device for aquaculture under high-resolution image
CN113052127A (en) * 2021-04-09 2021-06-29 上海云从企业发展有限公司 Behavior detection method, behavior detection system, computer equipment and machine readable medium
CN113139597B (en) * 2021-04-19 2022-11-04 中国人民解放军91054部队 Statistical thought-based image distribution external detection method
CN113139597A (en) * 2021-04-19 2021-07-20 中国人民解放军91054部队 Statistical thought-based image distribution external detection method
CN113128522A (en) * 2021-05-11 2021-07-16 四川云从天府人工智能科技有限公司 Target identification method and device, computer equipment and storage medium
CN113128522B (en) * 2021-05-11 2024-04-05 四川云从天府人工智能科技有限公司 Target identification method, device, computer equipment and storage medium
CN113240638B (en) * 2021-05-12 2023-11-10 上海联影智能医疗科技有限公司 Target detection method, device and medium based on deep learning
CN113240638A (en) * 2021-05-12 2021-08-10 上海联影智能医疗科技有限公司 Target detection method, device and medium based on deep learning
CN113205067A (en) * 2021-05-26 2021-08-03 北京京东乾石科技有限公司 Method and device for monitoring operator, electronic equipment and storage medium
CN113205067B (en) * 2021-05-26 2024-04-09 北京京东乾石科技有限公司 Method and device for monitoring operators, electronic equipment and storage medium
WO2022252089A1 (en) * 2021-05-31 2022-12-08 京东方科技集团股份有限公司 Training method for object detection model, and object detection method and device
CN113435260A (en) * 2021-06-07 2021-09-24 上海商汤智能科技有限公司 Image detection method, related training method, related device, equipment and medium
CN113392833A (en) * 2021-06-10 2021-09-14 沈阳派得林科技有限责任公司 Method for identifying type number of industrial radiographic negative image
CN113377888A (en) * 2021-06-25 2021-09-10 北京百度网讯科技有限公司 Training target detection model and method for detecting target
CN113536963B (en) * 2021-06-25 2023-08-15 西安电子科技大学 SAR image airplane target detection method based on lightweight YOLO network
CN113377888B (en) * 2021-06-25 2024-04-02 北京百度网讯科技有限公司 Method for training object detection model and detection object
CN113486746A (en) * 2021-06-25 2021-10-08 海南电网有限责任公司三亚供电局 Power cable external damage prevention method based on biological induction and video monitoring
CN113536963A (en) * 2021-06-25 2021-10-22 西安电子科技大学 SAR image airplane target detection method based on lightweight YOLO network
CN113591566A (en) * 2021-06-28 2021-11-02 北京百度网讯科技有限公司 Training method and device of image recognition model, electronic equipment and storage medium
CN113553948A (en) * 2021-07-23 2021-10-26 中远海运科技(北京)有限公司 Automatic recognition and counting method for tobacco insects and computer readable medium
CN113486857B (en) * 2021-08-03 2023-05-12 云南大学 YOLOv 4-based ascending safety detection method and system
CN113486857A (en) * 2021-08-03 2021-10-08 云南大学 Ascending safety detection method and system based on YOLOv4
CN113723217A (en) * 2021-08-09 2021-11-30 南京邮电大学 Object intelligent detection method and system based on yolo improvement
CN113948190A (en) * 2021-09-02 2022-01-18 上海健康医学院 Method and equipment for automatically identifying X-ray skull positive position film cephalogram measurement mark points
CN113723406A (en) * 2021-09-03 2021-11-30 乐普(北京)医疗器械股份有限公司 Processing method and device for positioning bracket of coronary angiography image
CN113723406B (en) * 2021-09-03 2023-07-18 乐普(北京)医疗器械股份有限公司 Method and device for processing support positioning of coronary angiography image
CN114119455B (en) * 2021-09-03 2024-04-09 乐普(北京)医疗器械股份有限公司 Method and device for positioning vascular stenosis part based on target detection network
CN114119455A (en) * 2021-09-03 2022-03-01 乐普(北京)医疗器械股份有限公司 Method and device for positioning blood vessel stenosis part based on target detection network
CN113743339A (en) * 2021-09-09 2021-12-03 三峡大学 Indoor fall detection method and system based on scene recognition
CN113743339B (en) * 2021-09-09 2023-10-03 三峡大学 Indoor falling detection method and system based on scene recognition
CN113870196A (en) * 2021-09-10 2021-12-31 苏州浪潮智能科技有限公司 Image processing method, device, equipment and medium based on anchor point cutting graph
CN113792656B (en) * 2021-09-15 2023-07-18 山东大学 Behavior detection and alarm system using mobile communication equipment in personnel movement
CN113792656A (en) * 2021-09-15 2021-12-14 山东大学 Behavior detection and alarm system for using mobile communication equipment in personnel movement
CN113807449A (en) * 2021-09-23 2021-12-17 合肥工业大学 Sedimentary rock category identification method and device, electronic equipment and storage medium
CN114022705B (en) * 2021-10-29 2023-08-04 电子科技大学 Self-adaptive target detection method based on scene complexity pre-classification
CN114022705A (en) * 2021-10-29 2022-02-08 电子科技大学 Adaptive target detection method based on scene complexity pre-classification
CN114022554A (en) * 2021-11-03 2022-02-08 北华航天工业学院 Massage robot acupuncture point detection and positioning method based on YOLO
CN114022554B (en) * 2021-11-03 2023-02-03 北华航天工业学院 Massage robot acupoint detection and positioning method based on YOLO
CN114022446A (en) * 2021-11-04 2022-02-08 广东工业大学 Leather flaw detection method and system based on improved YOLOv3
CN114120358B (en) * 2021-11-11 2024-04-26 国网江苏省电力有限公司技能培训中心 Super-pixel-guided deep learning-based personnel head-mounted safety helmet recognition method
CN114120358A (en) * 2021-11-11 2022-03-01 国网江苏省电力有限公司技能培训中心 Super-pixel-guided deep learning-based identification method for head-worn safety helmet of person
CN114255389A (en) * 2021-11-15 2022-03-29 浙江时空道宇科技有限公司 Target object detection method, device, equipment and storage medium
CN113989939A (en) * 2021-11-16 2022-01-28 河北工业大学 Small-target pedestrian detection system based on improved YOLO algorithm
CN113989939B (en) * 2021-11-16 2024-05-14 河北工业大学 Small target pedestrian detection system based on improved YOLO algorithm
CN114387219A (en) * 2021-12-17 2022-04-22 依未科技(北京)有限公司 Method, device, medium and equipment for detecting arteriovenous cross compression characteristics of eyeground
CN114373075A (en) * 2021-12-31 2022-04-19 西安电子科技大学广州研究院 Target component detection data set construction method, detection method, device and equipment
US11756288B2 (en) * 2022-01-05 2023-09-12 Baidu Usa Llc Image processing method and apparatus, electronic device and storage medium
US20220130139A1 (en) * 2022-01-05 2022-04-28 Baidu Usa Llc Image processing method and apparatus, electronic device and storage medium
CN114359222A (en) * 2022-01-05 2022-04-15 多伦科技股份有限公司 Method for detecting arbitrary polygon target, electronic device and storage medium
CN114565848A (en) * 2022-02-25 2022-05-31 佛山读图科技有限公司 Liquid medicine level detection method and system in complex scene
CN114565848B (en) * 2022-02-25 2022-12-02 佛山读图科技有限公司 Liquid medicine level detection method and system in complex scene
CN114662594A (en) * 2022-03-25 2022-06-24 浙江省通信产业服务有限公司 Target feature recognition analysis system
CN114662594B (en) * 2022-03-25 2022-10-04 浙江省通信产业服务有限公司 Target feature recognition analysis system
CN114742204A (en) * 2022-04-08 2022-07-12 黑龙江惠达科技发展有限公司 Method and device for detecting straw coverage rate
CN114782778B (en) * 2022-04-25 2023-01-06 广东工业大学 Assembly state monitoring method and system based on machine vision technology
CN114782778A (en) * 2022-04-25 2022-07-22 广东工业大学 Assembly state monitoring method and system based on machine vision technology
CN114842315B (en) * 2022-05-07 2024-02-02 无锡雪浪数制科技有限公司 Looseness-prevention identification method and device for lightweight high-speed railway hub gasket
CN114842315A (en) * 2022-05-07 2022-08-02 无锡雪浪数制科技有限公司 Anti-loosening identification method and device for lightweight high-speed rail hub gasket
CN114881763B (en) * 2022-05-18 2023-05-26 中国工商银行股份有限公司 Post-loan supervision method, device, equipment and medium for aquaculture
CN114881763A (en) * 2022-05-18 2022-08-09 中国工商银行股份有限公司 Method, device, equipment and medium for post-loan supervision of aquaculture
CN115029209A (en) * 2022-06-17 2022-09-09 杭州天杭空气质量检测有限公司 Colony image acquisition processing device and processing method thereof
CN114972891A (en) * 2022-07-07 2022-08-30 智云数创(洛阳)数字科技有限公司 CAD component automatic identification method and BIM modeling method
CN114972891B (en) * 2022-07-07 2024-05-03 智云数创(洛阳)数字科技有限公司 Automatic identification method for CAD (computer aided design) component and BIM (building information modeling) method
CN115082661B (en) * 2022-07-11 2024-05-10 阿斯曼尔科技(上海)有限公司 Sensor assembly difficulty reducing method
CN115082661A (en) * 2022-07-11 2022-09-20 阿斯曼尔科技(上海)有限公司 Method for reducing assembly difficulty of sensor
CN115187982A (en) * 2022-07-12 2022-10-14 河北华清环境科技集团股份有限公司 Algae detection method and device and terminal equipment
CN115909358B (en) * 2022-07-27 2024-02-13 广州市玄武无线科技股份有限公司 Commodity specification identification method, commodity specification identification device, terminal equipment and computer storage medium
CN115909358A (en) * 2022-07-27 2023-04-04 广州市玄武无线科技股份有限公司 Commodity specification identification method and device, terminal equipment and computer storage medium
CN115346170B (en) * 2022-08-11 2023-05-30 北京市燃气集团有限责任公司 Intelligent monitoring method and device for gas facility area
CN115346170A (en) * 2022-08-11 2022-11-15 北京市燃气集团有限责任公司 Intelligent monitoring method and device for gas facility area
CN115346172A (en) * 2022-08-16 2022-11-15 哈尔滨市科佳通用机电股份有限公司 Method and system for detecting loss and breakage of hook lifting rod return spring
CN115346172B (en) * 2022-08-16 2023-04-21 哈尔滨市科佳通用机电股份有限公司 Method and system for detecting lost and broken hook lifting rod reset spring
CN115297263A (en) * 2022-08-24 2022-11-04 广州方图科技有限公司 Automatic photographing control method and system suitable for cube shooting and cube shooting
CN115297263B (en) * 2022-08-24 2023-04-07 广州方图科技有限公司 Automatic photographing control method and system suitable for cube shooting and cube shooting
CN115690565A (en) * 2022-09-28 2023-02-03 大连海洋大学 Target detection method for cultivated fugu rubripes by fusing knowledge and improving YOLOv5
CN115690565B (en) * 2022-09-28 2024-02-20 大连海洋大学 Method for detecting cultivated takifugu rubripes target by fusing knowledge and improving YOLOv5
CN115546566A (en) * 2022-11-24 2022-12-30 杭州心识宇宙科技有限公司 Intelligent body interaction method, device, equipment and storage medium based on article identification
CN116051985B (en) * 2022-12-20 2023-06-23 中国科学院空天信息创新研究院 Semi-supervised remote sensing target detection method based on multi-model mutual feedback learning
CN116051985A (en) * 2022-12-20 2023-05-02 中国科学院空天信息创新研究院 Semi-supervised remote sensing target detection method based on multi-model mutual feedback learning
CN115690570A (en) * 2023-01-05 2023-02-03 中国水产科学研究院黄海水产研究所 Fish shoal feeding intensity prediction method based on ST-GCN
CN115690570B (en) * 2023-01-05 2023-03-28 中国水产科学研究院黄海水产研究所 Fish shoal feeding intensity prediction method based on ST-GCN
CN116452858A (en) * 2023-03-24 2023-07-18 哈尔滨市科佳通用机电股份有限公司 Rail wagon connecting pull rod round pin breaking fault identification method and system
CN116452858B (en) * 2023-03-24 2023-12-15 哈尔滨市科佳通用机电股份有限公司 Rail wagon connecting pull rod round pin breaking fault identification method and system
CN116403163B (en) * 2023-04-20 2023-10-27 慧铁科技有限公司 Method and device for identifying opening and closing states of handles of cut-off plug doors
CN116403163A (en) * 2023-04-20 2023-07-07 慧铁科技有限公司 Method and device for identifying opening and closing states of handles of cut-off plug doors
CN116681687A (en) * 2023-06-20 2023-09-01 广东电网有限责任公司广州供电局 Wire detection method and device based on computer vision and computer equipment
CN116758547B (en) * 2023-06-27 2024-03-12 北京中超伟业信息安全技术股份有限公司 Paper medium carbonization method, system and storage medium
CN116758547A (en) * 2023-06-27 2023-09-15 北京中超伟业信息安全技术股份有限公司 Paper medium carbonization method, system and storage medium
CN116916166A (en) * 2023-09-12 2023-10-20 湖南湘银河传感科技有限公司 Telemetry terminal based on AI image analysis
CN116916166B (en) * 2023-09-12 2023-11-17 湖南湘银河传感科技有限公司 Telemetry terminal based on AI image analysis
CN116935232A (en) * 2023-09-15 2023-10-24 青岛国测海遥信息技术有限公司 Remote sensing image processing method and device for offshore wind power equipment, equipment and medium
CN117671597A (en) * 2023-12-25 2024-03-08 北京大学长沙计算与数字经济研究院 Method for constructing mouse detection model and mouse detection method and device
CN117523318B (en) * 2023-12-26 2024-04-16 宁波微科光电股份有限公司 Anti-light interference subway shielding door foreign matter detection method, device and medium
CN117523318A (en) * 2023-12-26 2024-02-06 宁波微科光电股份有限公司 Anti-light interference subway shielding door foreign matter detection method, device and medium
CN117893895A (en) * 2024-03-15 2024-04-16 山东省海洋资源与环境研究院(山东省海洋环境监测中心、山东省水产品质量检验中心) Method, system, equipment and storage medium for identifying portunus trituberculatus

Also Published As

Publication number Publication date
CN109977943B (en) 2024-05-07
CN109977943A (en) 2019-07-05

Similar Documents

Publication Publication Date Title
WO2020164282A1 (en) Yolo-based image target recognition method and apparatus, electronic device, and storage medium
CN110533084B (en) Multi-scale target detection method based on self-attention mechanism
CN112052787B (en) Target detection method and device based on artificial intelligence and electronic equipment
CN109978893B (en) Training method, device, equipment and storage medium of image semantic segmentation network
CN110941594B (en) Splitting method and device of video file, electronic equipment and storage medium
US8401292B2 (en) Identifying high saliency regions in digital images
CN110378297B (en) Remote sensing image target detection method and device based on deep learning and storage medium
US20170032247A1 (en) Media classification
CN111079674B (en) Target detection method based on global and local information fusion
EP3493101A1 (en) Image recognition method, terminal, and nonvolatile storage medium
KR20210110823A (en) Image recognition method, training method of recognition model, and related apparatus and devices
WO2017105655A1 (en) Methods for object localization and image classification
CN110991311A (en) Target detection method based on dense connection deep network
CN109871821A (en) Adaptive-network-based pedestrian re-identification method, device, equipment and storage medium
WO2021027157A1 (en) Vehicle insurance claim settlement identification method and apparatus based on picture identification, and computer device and storage medium
CN110008899B (en) Method for extracting and classifying candidate targets of visible light remote sensing image
CN110069959A (en) Face detection method, device and user equipment
CN113487610B (en) Herpes image recognition method and device, computer equipment and storage medium
CN111724342A (en) Method for detecting thyroid nodule in ultrasonic image
CN112949578B (en) Vehicle lamp state identification method, device, equipment and storage medium
CN111274972A (en) Dish identification method and device based on metric learning
CN114139564B (en) Two-dimensional code detection method and device, terminal equipment and training method of detection network
CN111414930B (en) Deep learning model training method and device, electronic equipment and storage medium
CN112784494A (en) Training method of false positive recognition model, target recognition method and device
CN115512207A (en) Single-stage target detection method based on multipath feature fusion and high-order loss sensing sampling

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19915076

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19915076

Country of ref document: EP

Kind code of ref document: A1