WO2022116720A1 - Target detection method and apparatus, and electronic device - Google Patents

Info

Publication number
WO2022116720A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
detection
detection result
network
sub
Prior art date
Application number
PCT/CN2021/124860
Other languages
French (fr)
Chinese (zh)
Inventor
张辉
高巍
Original Assignee
歌尔股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 歌尔股份有限公司 filed Critical 歌尔股份有限公司
Publication of WO2022116720A1 publication Critical patent/WO2022116720A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30108Industrial image inspection
    • G06T2207/30121CRT, LCD or plasma display
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Definitions

  • The present application relates to the technical field of machine vision, and in particular to a target detection method, apparatus and electronic device.
  • Machine vision technology based on image processing mainly uses computers to simulate or reproduce certain intelligent behaviors related to human vision: it extracts information from images of objective things, processes and understands that information, and finally applies it to actual detection and control, for example industrial inspection, industrial flaw detection, precision measurement and control, and automated production lines.
  • Machine vision detection can not only greatly improve production efficiency and the degree of production automation, but also makes information integration easy, which meets the requirements of digital and automated production.
  • Embodiments of the present application provide a target detection method, apparatus, and electronic device, so as to improve the detection level of difficult targets such as screen defects.
  • An embodiment of the present application provides a target detection method, including: inputting a detection image into a first sub-network of a target detection model to obtain a first detection result output by the first sub-network; if there is a non-salient target in the first detection result, inputting the part of the detection image corresponding to the non-salient target into a second sub-network of the target detection model to obtain a second detection result output by the second sub-network; and determining a final detection result according to the first detection result and the second detection result.
  • an embodiment of the present application further provides a target detection device, the device comprising:
  • the first detection unit is used to input the detection image into the first sub-network of the target detection model, and obtain the first detection result output by the first sub-network;
  • the second detection unit is configured to, if there is a non-salient target in the first detection result, input the part of the detection image corresponding to the non-salient target into the second sub-network of the target detection model, and obtain the second detection result output by the second sub-network;
  • a determination unit configured to determine the final detection result according to the first detection result and the second detection result.
  • embodiments of the present application further provide an electronic device, including: a processor; and a memory arranged to store computer-executable instructions, the executable instructions, when executed, cause the processor to execute the above target detection method.
  • Embodiments of the present application further provide a computer-readable storage medium, where the computer-readable storage medium stores one or more programs, and when the one or more programs are executed by an electronic device including multiple application programs, the electronic device is caused to perform the target detection method described above.
  • The at least one technical solution adopted in the embodiments of the present application can achieve the following beneficial effects: the first sub-network of the target detection model detects the targets in the detection image, but some of these targets may be non-salient, that is, the first detection result output by the first sub-network may not be completely accurate.
  • The image part corresponding to each non-salient target is therefore input into the second sub-network to obtain the second detection result.
  • Combining the first detection result and the second detection result yields a more accurate final detection result.
  • When the target detection model is used in industry to identify product defects, the defects may be large or small, and for small defects it is difficult to obtain accurate detection results with a single network.
  • The solution of the present application achieves high detection accuracy, and because the second sub-network does not need to perform secondary detection on all targets, the efficiency is also high.
  • FIG. 1 shows a schematic flowchart of a target detection method according to an embodiment of the present application
  • FIG. 2 shows a schematic diagram of a defect detection process in a diaphragm product according to an embodiment of the present application
  • FIG. 3 shows a schematic structural diagram of a target detection apparatus according to an embodiment of the present application
  • FIG. 4 is a schematic structural diagram of an electronic device in an embodiment of the present application.
  • the technical idea of the present application is to use two sub-networks to construct a target detection model.
  • the first sub-network detects all the targets as much as possible, while the second sub-network performs secondary detection on the non-salient targets, taking into account the detection accuracy and efficiency.
  • FIG. 1 shows a schematic flowchart of a target detection method according to an embodiment of the present application. As shown in FIG. 1 , the method includes:
  • Step S110 the detection image is input into the first sub-network of the target detection model, and the first detection result output by the first sub-network is obtained.
  • Step S120 if there is a non-salient target in the first detection result, input the corresponding part of the non-salient target in the detection image into the second sub-network of the target detection model to obtain the second detection result output by the second sub-network.
  • The target to be detected can be determined according to actual needs, and the method works well for targets that the industry urgently needs to detect, such as screen defects.
  • Step S130 determining a final detection result according to the first detection result and the second detection result.
  • The method shown in FIG. 1 uses the first sub-network of the target detection model to detect the targets in the detection image, but some of these targets may be non-salient, that is, the first detection result output by the first sub-network may not be completely accurate.
  • The image part corresponding to each non-salient target is therefore input into the second sub-network to obtain the second detection result.
  • The first detection result and the second detection result can then be combined to obtain a more accurate final detection result.
  • When the target detection model is used in industry to identify product defects, the defects may be large or small, and for small defects it is difficult to obtain accurate detection results with a single network.
  • The solution of the present application achieves high detection accuracy, and because the second sub-network does not need to perform secondary detection on all targets, the efficiency is also high.
  • The first sub-network is a target detection network, and the first detection result includes the position of the target and a first classification; the second sub-network is a target classification network, and the second detection result includes a second classification of the target; determining the final detection result according to the first detection result and the second detection result includes: replacing the first classification of the non-salient target with its second classification as the final classification of the non-salient target.
  • A pixel coordinate system is established based on the pixel size of the detection image, and in the first detection result the position of the target can be represented by pixel coordinates, for example by the pixel coordinates of the four vertices of a rectangular frame, or by the pixel coordinates of two diagonal corners.
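  • The two position representations mentioned above carry the same information for an axis-aligned rectangle. The following is a minimal sketch of converting between them; the function names and the `(x_min, y_min, x_max, y_max)` tuple layout are illustrative assumptions and are not taken from the application.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # assumed layout: (x_min, y_min, x_max, y_max)


def corners_from_vertices(vertices: List[Point]) -> Box:
    """Collapse the four vertices of an axis-aligned rectangle into two diagonal corners."""
    if len(vertices) != 4:
        raise ValueError("expected exactly four vertices")
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (min(xs), min(ys), max(xs), max(ys))


def vertices_from_corners(box: Box) -> List[Point]:
    """Expand two diagonal corners back into four vertices, in clockwise order."""
    x_min, y_min, x_max, y_max = box
    return [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]


box = corners_from_vertices([(10, 20), (50, 20), (50, 80), (10, 80)])
print(box)  # (10, 20, 50, 80)
```

  • The two-corner form is the more compact of the two and is the one from which an area can be computed directly.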
  • the first classification includes the type of the target, and may specifically include the confidence score of the type.
  • In the first detection result, the position of the target is generally accurate, but the classification of the target may be wrong, that is, a false detection may occur. Therefore, to keep detection efficient, the second sub-network does not re-detect the position and only re-detects the classification; the first classification of the non-salient target is then replaced with its second classification. In other words, for a non-salient target, the final detection result consists of the position of the target from the first detection result and the classification of the target from the second detection result.
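  • The replacement rule described above can be sketched as follows. This is only an illustration under stated assumptions: the `Detection` record, the `merge_results` function and the index-keyed `second` mapping are hypothetical names, not structures defined in the application.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Detection:
    box: Tuple[int, int, int, int]  # position from the first sub-network
    label: str                      # classification
    score: float                    # confidence score of that classification


def merge_results(first: List[Detection],
                  second: Dict[int, Tuple[str, float]]) -> List[Detection]:
    """Keep every position from `first`; overwrite the classification only for
    the targets (keyed by index) that were re-classified by the second sub-network."""
    final = []
    for index, detection in enumerate(first):
        if index in second:
            label, score = second[index]
            detection = replace(detection, label=label, score=score)
        final.append(detection)
    return final


first_result = [
    Detection((0, 0, 100, 100), "scratch", 0.95),  # salient: left untouched
    Detection((5, 5, 9, 9), "dust", 0.40),         # non-salient: re-classified
]
final_result = merge_results(first_result, {1: ("pinhole", 0.88)})
```

  • Salient targets pass through unchanged, so the second sub-network only affects the entries it was actually asked to re-classify.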
  • This application gives two examples of how to determine non-salient targets.
  • In the first example, the method further includes: calculating the area of each target according to the position of each target in the first detection result; if the area of a target is smaller than a first threshold, the target is a non-salient target.
  • A first threshold can be set, and if the area of a detected defect is smaller than the first threshold, the defect is considered to be at risk of misclassification and is regarded as a non-salient target.
  • In the second example, a target is a non-salient target if the confidence score of its first classification is less than a second threshold.
  • When outputting the first detection result, the first sub-network actually predicts the probability that the target belongs to each type, and then outputs the type with the highest probability.
  • This probability can be expressed in the form of a confidence score. Therefore, if the confidence score of the first classification of a target is too low, for example smaller than the second threshold, other types may have similar confidence scores and the target may be misclassified.
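  • The two criteria above can be sketched in a few lines. The helper names are hypothetical, the box layout `(x_min, y_min, x_max, y_max)` is an assumption, and combining the two criteria with a logical "or" follows the step-by-step diaphragm example in this description rather than either criterion on its own.

```python
from typing import Sequence, Tuple

Box = Tuple[int, int, int, int]  # assumed layout: (x_min, y_min, x_max, y_max)


def box_area(box: Box) -> int:
    """Area in square pixels of an axis-aligned box."""
    x_min, y_min, x_max, y_max = box
    return max(0, x_max - x_min) * max(0, y_max - y_min)


def top_class(probabilities: Sequence[float]) -> Tuple[int, float]:
    """Return (type index, confidence score) of the most probable type."""
    index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return index, probabilities[index]


def is_non_salient(box: Box, score: float,
                   area_threshold: float, score_threshold: float) -> bool:
    """A target is treated as non-salient if it is small OR weakly classified."""
    return box_area(box) < area_threshold or score < score_threshold


label, score = top_class([0.10, 0.70, 0.20])
print(label, score)                                          # 1 0.7
print(is_non_salient((0, 0, 10, 5), score, 100.0, 0.5))      # True: area 50 < 100
print(is_non_salient((0, 0, 100, 100), score, 100.0, 0.5))   # False
```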
  • the present application also proposes an example of determining the first threshold and the second threshold according to the training of the target detection model.
  • The method further includes: in the training phase of the target detection model, determining the area of each target in the first detection result, sorting the targets by area, and using the area at a first sequence position in the area sequence as the first threshold; or, sorting the targets in the first detection result by classification confidence score, and using the confidence score at a second sequence position in the confidence score sequence as the second threshold.
  • The targets detected by the first sub-network are sorted by area or by confidence score, and the last third of the targets are regarded as non-salient targets. According to the sorting, the area at the two-thirds position can then be used as the first threshold, and the confidence score at the two-thirds position as the second threshold, so that the thresholds adapt to the training of the model and are highly robust.
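  • A minimal sketch of this threshold selection is given below. It assumes the values are sorted from high to low and that the "two-thirds position" is the integer index `2 * n // 3`; the exact rounding and whether the boundary element itself counts as non-salient are not specified in the description, so both are assumptions. The function name is hypothetical.

```python
from typing import Sequence


def threshold_at_last_third(values: Sequence[float]) -> float:
    """Sort from high to low and return the value at which the last third begins."""
    if not values:
        raise ValueError("at least one value is required")
    ordered = sorted(values, reverse=True)
    index = min((2 * len(ordered)) // 3, len(ordered) - 1)
    return ordered[index]


areas = [900, 800, 700, 600, 500, 400, 300, 200, 100]
scores = [0.99, 0.95, 0.90, 0.85, 0.80, 0.70, 0.60, 0.50, 0.40]

first_threshold = threshold_at_last_third(areas)    # area threshold
second_threshold = threshold_at_last_third(scores)  # confidence threshold
print(first_threshold, second_threshold)            # 300 0.6
```

  • Because the thresholds are read off the detector's own outputs, they move with the detector as training progresses instead of being fixed by hand.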
  • The target detection network is implemented based on the Mask_RCNN algorithm, and the target classification network is implemented based on the EfficientNet algorithm.
  • Mask_RCNN integrates the two functions of target detection and instance segmentation, which can achieve the two effects of classifying the target and determining the position of the target in the detection image, and has the characteristics of simple training and significant detection effect.
  • EfficientNet is a multi-dimensional hybrid model scaling algorithm that combines the three dimensions of network depth, network width, and image resolution, which can take into account both speed and accuracy.
  • the advantages of these two algorithms can be used to implement a target detection network and a target classification network respectively.
  • Faster_RCNN or the like may also be selected to implement the target detection network, and ResNet or the like may be used to implement the target classification network.
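  • The joint scaling of depth, width and resolution attributed to EfficientNet above can be illustrated numerically. The coefficients below are the ones reported in the published EfficientNet work for its baseline network; they are not stated in this application, and the function name is a hypothetical one chosen for this sketch.

```python
# Base multipliers for depth, width and input resolution. These values come
# from the published EfficientNet work, not from this application.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi: float):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return (ALPHA ** phi, BETA ** phi, GAMMA ** phi)


# Computational cost grows roughly with depth * width^2 * resolution^2, so each
# unit step of phi multiplies the cost by about this factor (close to 2).
cost_factor_per_step = ALPHA * BETA ** 2 * GAMMA ** 2

depth, width, resolution = compound_scale(1)
print(round(depth, 3), round(width, 3), round(resolution, 3))
print(round(cost_factor_per_step, 3))
```

  • A single coefficient therefore trades speed against accuracy across all three dimensions at once, which is the property the description relies on when choosing a classification network.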
  • The method further includes: counting the number of targets in the first detection result and assigning that number to a control variable; if the value of the control variable is not 0, selecting from the first detection result a target that has not yet been selected, and judging whether the target selected this time is a non-salient target; if the target selected this time is a non-salient target, performing the step of inputting the part of the detection image corresponding to the non-salient target into the second sub-network of the target detection model to obtain the second detection result output by the second sub-network, and decrementing the value of the control variable by 1; if the value of the control variable is 0, performing the step of determining the final detection result according to the first detection result and the second detection result.
  • Step 1: Determine the detection image to be input into the first sub-network.
  • The diaphragm image, image, can be obtained by photographing with a camera and scaled to a preset size of 1778 × 1778 pixels (usually the same size as the sample images used in the training stage); the scaled image, image_resize, is used as the detection image.
  • Step 2: The detection image (e.g. image_resize) is sent to the first sub-network detect_model for defect detection, and the detection result detect_result is obtained.
  • Step 3: Count the total number of defect instances in detect_result, instance_num.
  • If instance_num is equal to 0, the diaphragm product is free of defects; otherwise, traverse each defect instance and execute Step 4 until instance_num reaches 0; when instance_num is equal to 0, execute Step 6 and output the final detection result.
  • Step 4: Determine whether the score (confidence score) of the instance is greater than the second threshold and whether its area is greater than the first threshold; if both are greater than the corresponding thresholds, the defect has been identified, and the next defect is judged; otherwise, go to Step 5.
  • Step 5: Defects whose score or area is less than the corresponding threshold (non-salient defects) are sent to the second sub-network classify_model for separate classification, and the highest confidence score in the classification result, that is, the highest probability value, together with the corresponding category number, is assigned to the defect.
  • Step 6: Output the final detection result. That is, the defects finally determined by detection and by classification are marked in the image according to the coordinates of their upper-left and lower-right corners, and the final detection result image_result is output.
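  • Steps 2 to 6 above can be sketched end to end as follows. This is an illustration under several assumptions: the two models are replaced by stand-in functions with made-up outputs, images are plain nested lists, each detection is a dictionary with hypothetical keys `box`, `label` and `score`, and the counter is decremented once per visited instance, which is one reading of the loop described above. Drawing the boxes onto image_result is omitted.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # assumed layout: (x_min, y_min, x_max, y_max)
Image = List[List[int]]


def crop(image: Image, box: Box) -> Image:
    """Cut the image part corresponding to one defect."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max] for row in image[y_min:y_max]]


def detect_defects(image_resize: Image,
                   detect_model: Callable[[Image], List[Dict]],
                   classify_model: Callable[[Image], Sequence[float]],
                   area_threshold: float,
                   score_threshold: float) -> List[Dict]:
    detect_result = detect_model(image_resize)   # Step 2
    instance_num = len(detect_result)            # Step 3
    final_result = []
    index = 0
    while instance_num != 0:
        instance = dict(detect_result[index])
        x_min, y_min, x_max, y_max = instance["box"]
        area = (x_max - x_min) * (y_max - y_min)
        # Step 4: both score and area must exceed their thresholds.
        if not (instance["score"] > score_threshold and area > area_threshold):
            # Step 5: re-classify only the non-salient defect.
            probabilities = classify_model(crop(image_resize, instance["box"]))
            best = max(range(len(probabilities)), key=lambda i: probabilities[i])
            instance["label"], instance["score"] = best, probabilities[best]
        final_result.append(instance)
        index += 1
        instance_num -= 1
    return final_result                          # Step 6 (drawing omitted)


# Stand-in models with made-up outputs, for illustration only.
def fake_detect_model(image: Image) -> List[Dict]:
    return [{"box": (0, 0, 4, 4), "label": 0, "score": 0.95},
            {"box": (1, 1, 2, 2), "label": 0, "score": 0.40}]


def fake_classify_model(patch: Image) -> Sequence[float]:
    return [0.1, 0.8, 0.1]


image = [[0] * 4 for _ in range(4)]
result = detect_defects(image, fake_detect_model, fake_classify_model,
                        area_threshold=4, score_threshold=0.5)
```

  • In this run the first defect keeps its original label, while the second is re-labelled by the stand-in classifier and keeps the position found by the stand-in detector.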
  • The target detection model is trained in the following manner: inputting a first training image into the first sub-network to obtain a first detection result output by the first sub-network; determining the non-salient targets in the first detection result, and extracting the part of the first training image corresponding to each non-salient target as a second training image; inputting the second training image into the second sub-network to obtain a second detection result output by the second sub-network; determining a training loss value according to the first detection result, the second detection result and the labeling information of the first training image, and updating the parameters of the first sub-network and the second sub-network according to the training loss value.
  • The defect ranges are marked on sample images of a diaphragm product to obtain a detection data set detect_data containing the first training images; the detect_data data set is input into the first sub-network detect_model for preliminary target detection, and the detection result detect_result is obtained; then the scores and areas of each type of detected defect are sorted from high to low, the thresholds of score and area are set at the last third of each ordering, and the hard-to-detect defect data classify_data is extracted from detect_result according to these thresholds to obtain the second training images; the classify_data data set is input into the second sub-network to obtain the second detection result output by the second sub-network.
  • a preset loss function is used to calculate the training loss value, and a back propagation algorithm is used to update the parameters, and the training of the target detection model is iteratively completed.
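  • The description only says that a preset loss function is used, so the following is purely a hypothetical illustration of how one scalar training loss could combine both sub-networks: a detection loss value (taken as given) plus the mean cross-entropy of the second sub-network on the non-salient crops. The weighted sum, the `weight` parameter and all names are assumptions, not the application's method.

```python
import math
from typing import Sequence


def cross_entropy(probabilities: Sequence[float], true_index: int) -> float:
    """Negative log-likelihood of the labelled type for one crop."""
    return -math.log(max(probabilities[true_index], 1e-12))


def training_loss(detection_loss: float,
                  class_probabilities: Sequence[Sequence[float]],
                  true_labels: Sequence[int],
                  weight: float = 1.0) -> float:
    """Assumed combination: detection loss + weight * mean classification loss."""
    if not class_probabilities:
        return detection_loss  # no non-salient crops in this batch
    losses = [cross_entropy(p, t) for p, t in zip(class_probabilities, true_labels)]
    return detection_loss + weight * sum(losses) / len(losses)


loss = training_loss(1.0, [[0.5, 0.5], [0.25, 0.75]], [0, 1])
print(round(loss, 4))
```

  • With a single scalar of this kind, one backward pass can update the parameters of both sub-networks, which matches the joint update described above.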
  • Embodiments of the present application further provide a target detection apparatus, which is used to implement the target detection method described in any of the above.
  • FIG. 3 shows a schematic structural diagram of a target detection apparatus according to an embodiment of the present application.
  • the target detection apparatus 300 includes:
  • the first detection unit 310 is configured to input the detection image into the first sub-network of the target detection model to obtain the first detection result output by the first sub-network.
  • the second detection unit 320 is configured to, if there is a non-salient target in the first detection result, input the part of the detection image corresponding to the non-salient target into the second sub-network of the target detection model, and obtain the second detection result output by the second sub-network.
  • the determining unit 330 is configured to determine the final detection result according to the first detection result and the second detection result.
  • The first sub-network is a target detection network, and the first detection result includes the position of the target and a first classification; the second sub-network is a target classification network, and the second detection result includes a second classification of the target; the determining unit 330 is configured to replace the first classification of the non-salient target with its second classification as the final classification of the non-salient target.
  • the first detection unit 310 is configured to calculate the area of each target according to the position of each target in the first detection result; if the area of one target is smaller than the first threshold, the target is a non-salient target.
  • A target is a non-salient target if the confidence score of its first classification is less than a second threshold.
  • The apparatus further includes a training unit, configured to, in the training stage of the target detection model, determine the area of each target in the first detection result, sort the targets by area, and use the area at a first sequence position in the area sequence as the first threshold; or, sort the targets in the first detection result by classification confidence score, and use the confidence score at a second sequence position in the confidence score sequence as the second threshold.
  • The target detection network is implemented based on the Mask_RCNN algorithm, and the target classification network is implemented based on the EfficientNet algorithm.
  • The first detection unit 310 is configured to count the number of targets in the first detection result and assign that number to a control variable; if the value of the control variable is not 0, select from the first detection result a target that has not yet been selected, and judge whether the target selected this time is a non-salient target; if the target selected this time is a non-salient target, cause the second detection unit 320 to input the part of the detection image corresponding to the non-salient target into the second sub-network of the target detection model to obtain the second detection result output by the second sub-network, and decrement the value of the control variable by 1; if the value of the control variable is 0, cause the determining unit 330 to determine the final detection result according to the first detection result and the second detection result.
  • The apparatus further includes a training unit, configured to input a first training image into the first sub-network to obtain a first detection result output by the first sub-network; determine the non-salient targets in the first detection result, and extract the part of the first training image corresponding to each non-salient target as a second training image; input the second training image into the second sub-network to obtain a second detection result output by the second sub-network; determine a training loss value according to the first detection result, the second detection result and the labeling information of the first training image, and update the parameters of the first sub-network and the second sub-network according to the training loss value.
  • The above target detection apparatus can implement each step of the target detection method provided in the foregoing embodiments, and the relevant explanations of the target detection method also apply to the target detection apparatus, so they are not repeated here.
  • FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • the electronic device includes a processor, and optionally an internal bus, a network interface, and a memory.
  • The memory may include internal memory, such as high-speed random-access memory (RAM), and may also include non-volatile memory, such as at least one disk memory.
  • the electronic equipment may also include hardware required for other services.
  • The processor, network interface and memory can be connected to each other through an internal bus, which can be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, etc.
  • the bus can be divided into address bus, data bus, control bus and so on. For ease of presentation, only one bidirectional arrow is used in FIG. 4, but it does not mean that there is only one bus or one type of bus.
  • the program may include program code, and the program code includes computer operation instructions.
  • The memory may include internal memory and non-volatile memory, and provides instructions and data to the processor.
  • The processor reads the corresponding computer program from the non-volatile memory into the internal memory and runs it, forming a target detection apparatus at the logical level.
  • the processor executes the program stored in the memory, and is specifically used to perform the following operations:
  • the detection image is input into the first sub-network of the target detection model, and the first detection result output by the first sub-network is obtained; if there is a non-salient target in the first detection result, the corresponding part of the non-salient target in the detection image is Input into the second sub-network of the target detection model to obtain the second detection result output by the second sub-network; determine the final detection result according to the first detection result and the second detection result.
  • the processor executes other operations that can be performed by the program stored in the memory, and reference is made to the relevant content of the method and device embodiments of the present application, and details are not described herein again.
  • the above-mentioned method performed by the target detection apparatus disclosed in the embodiment shown in FIG. 1 of the present application may be applied to a processor, or implemented by a processor.
  • a processor may be an integrated circuit chip with signal processing capabilities.
  • each step of the above-mentioned method can be completed by a hardware integrated logic circuit in a processor or an instruction in the form of software.
  • The above-mentioned processor can be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it can also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the steps of the method disclosed in conjunction with the embodiments of the present application may be directly embodied as executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
  • the software modules may be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other storage media mature in the art.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps of the above method in combination with its hardware.
  • the electronic device can also execute the method performed by the target detection apparatus in FIG. 1 , and implement the functions of the target detection apparatus in the embodiment shown in FIG. 1 , and details are not described herein again in this embodiment of the present application.
  • The embodiments of the present application also provide a computer-readable storage medium storing one or more programs, the one or more programs including instructions which, when executed by an electronic device including multiple application programs, cause the electronic device to execute the method executed by the target detection apparatus in the embodiment shown in FIG. 1, and specifically to execute:
  • the detection image is input into the first sub-network of the target detection model, and the first detection result output by the first sub-network is obtained; if there is a non-salient target in the first detection result, the corresponding part of the non-salient target in the detection image is Input into the second sub-network of the target detection model to obtain the second detection result output by the second sub-network; determine the final detection result according to the first detection result and the second detection result.
  • the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, and the instruction means implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • Memory may include forms of non-persistent memory, random access memory (RAM) and/or non-volatile memory in computer readable media, such as read only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
  • Computer-readable media include both permanent and non-permanent, removable and non-removable media, and storage of information may be implemented by any method or technology.
  • Information may be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device.
  • computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

A target detection method and apparatus, and an electronic device. The method comprises: inputting a detection image into a first sub-network of a target detection model to obtain a first detection result output by the first sub-network (S110); if there is a non-salient target in the first detection result, inputting the part of the detection image corresponding to the non-salient target into a second sub-network of the target detection model to obtain a second detection result output by the second sub-network (S120); and determining a final detection result according to the first detection result and the second detection result (S130). The method can achieve high detection accuracy, and because the second sub-network does not need to perform secondary detection on all targets, it is also efficient.

Description

Target detection method, apparatus, and electronic device
Technical Field
The present application relates to the technical field of machine vision, and in particular to a target detection method, apparatus, and electronic device.
Background of the Invention
In recent years, research on machine vision has received growing attention internationally. Machine vision technology, which is based on image processing, mainly uses computers to simulate humans or to reproduce certain intelligent behaviors related to human vision: information is extracted from images of objective things, processed, and understood, and is finally used for actual detection and control, for example in industrial inspection, industrial flaw detection, precision measurement and control, and automated production lines. Machine vision inspection can greatly improve production efficiency and the degree of automation, and machine vision also makes information integration easy, meeting the requirements of digital and automated production.
However, on industrial production lines for products such as offset printing plates, paper, and aluminum plate and strip, and in fields such as televisions, computers, and mobile phones where TFT (Thin Film Transistor) and LCD (Liquid Crystal Display) panels are widely used, the products sometimes have low-contrast defects that are not easy to detect.
Summary of the Invention
Embodiments of the present application provide a target detection method, apparatus, and electronic device, so as to improve the detection of hard-to-detect targets such as screen defects.
The embodiments of the present application adopt the following technical solutions:
In a first aspect, an embodiment of the present application provides a target detection method, including: inputting a detection image into a first sub-network of a target detection model to obtain a first detection result output by the first sub-network; if a non-salient target exists in the first detection result, inputting the part of the detection image corresponding to the non-salient target into a second sub-network of the target detection model to obtain a second detection result output by the second sub-network; and determining a final detection result according to the first detection result and the second detection result.
In a second aspect, an embodiment of the present application further provides a target detection apparatus, the apparatus including:
a first detection unit, configured to input a detection image into a first sub-network of a target detection model to obtain a first detection result output by the first sub-network;
a second detection unit, configured to, if a non-salient target exists in the first detection result, input the part of the detection image corresponding to the non-salient target into a second sub-network of the target detection model to obtain a second detection result output by the second sub-network;
a determination unit, configured to determine a final detection result according to the first detection result and the second detection result.
In a third aspect, an embodiment of the present application further provides an electronic device, including: a processor; and a memory arranged to store computer-executable instructions which, when executed, cause the processor to execute the target detection method described above.
In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium storing one or more programs which, when executed by an electronic device that includes multiple application programs, cause the electronic device to execute the target detection method described above.
At least one of the above technical solutions adopted in the embodiments of the present application can achieve the following beneficial effects. The first sub-network of the target detection model detects the targets in the detection image, but some of these targets may be non-salient, which means that the first detection result output by the first sub-network may not be fully accurate. In that case, the image part corresponding to each non-salient target is input into the second sub-network to obtain the second detection result, and combining the first and second detection results yields a more accurate final detection result. For example, when a target detection model is used in industry to identify product defects, the defects may be large or small, and it is difficult to obtain accurate detection results for small defects with a single network. The solution of the present application can achieve high detection accuracy, and because the second sub-network does not need to perform secondary detection on all targets, it is also efficient.
Brief Description of the Drawings
The drawings described here are provided for further understanding of the present application and form a part of it. The illustrative embodiments of the present application and their descriptions are used to explain the present application and do not unduly limit it. In the drawings:
FIG. 1 shows a schematic flowchart of a target detection method according to an embodiment of the present application;
FIG. 2 shows a schematic flowchart of defect detection in a diaphragm product according to an embodiment of the present application;
FIG. 3 shows a schematic structural diagram of a target detection apparatus according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the technical solutions of the present application are described clearly and completely below with reference to specific embodiments and the corresponding drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
The technical idea of the present application is to build the target detection model from two sub-networks: the first sub-network detects as many targets as possible, while the second sub-network performs secondary detection only on the non-salient targets among them, balancing detection accuracy and efficiency.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the drawings.
FIG. 1 shows a schematic flowchart of a target detection method according to an embodiment of the present application. As shown in FIG. 1, the method includes:
Step S110: input the detection image into the first sub-network of the target detection model to obtain the first detection result output by the first sub-network.
Step S120: if a non-salient target exists in the first detection result, input the part of the detection image corresponding to the non-salient target into the second sub-network of the target detection model to obtain the second detection result output by the second sub-network.
In the present application, the target to be detected can be determined according to actual needs, and the method performs well on targets for which industry has a pressing need, such as screen defects.
Taking screen defects as an example, there are many defect types, such as dead pixels and stains. Some defects cover a large area and some a small area, and the smaller defects are generally hard to detect. In the technical solution of the present application, targets that are hard to detect can be treated as non-salient targets. Such targets can usually still be detected, but the probability of false detection (for example, a wrong type judgment) is high. The present application therefore uses the second sub-network to detect the non-salient targets again to improve detection accuracy.
Step S130: determine the final detection result according to the first detection result and the second detection result.
It can be seen that, in the method shown in FIG. 1, the first sub-network of the target detection model detects the targets in the detection image, but some of these targets may be non-salient, which means that the first detection result output by the first sub-network may not be fully accurate. In that case, the image part corresponding to each non-salient target is input into the second sub-network to obtain the second detection result, and combining the first and second detection results yields a more accurate final detection result. For example, when a target detection model is used in industry to identify product defects, the defects may be large or small, and it is difficult to obtain accurate detection results for small defects with a single network. The solution of the present application can achieve high detection accuracy, and because the second sub-network does not need to perform secondary detection on all targets, it is also efficient.
In some embodiments, the first sub-network is a target detection network, and the first detection result includes the position and a first classification of each target; the second sub-network is a target classification network, and the second detection result includes a second classification of the target. Determining the final detection result according to the first detection result and the second detection result includes: replacing the first classification of a non-salient target with its second classification, which becomes the final classification of the non-salient target.
For example, if a pixel coordinate system is established based on the pixel dimensions of the detection image, the position of a target in the first detection result can be expressed in pixel coordinates, for example by the pixel coordinates of the four vertices of a rectangular box, or by the pixel coordinates of two diagonally opposite vertices. The first classification includes the type of the target and may also include a confidence score for that type.
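As a minimal sketch of the two-corner box representation described above, the following Python snippet crops the part of a detection image that corresponds to one target. The `Box` alias, the `crop_target` function, and the nested-list image are illustrative assumptions and do not come from the application.

```python
from typing import List, Tuple

# Hypothetical box type: (x_min, y_min, x_max, y_max) in pixel coordinates,
# i.e. two diagonally opposite vertices of the rectangular box.
Box = Tuple[int, int, int, int]


def crop_target(image: List[List[int]], box: Box) -> List[List[int]]:
    """Return the sub-image covered by the box (max edges are exclusive)."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max] for row in image[y_min:y_max]]


# A 4x4 toy "detection image" whose pixel value encodes its (row, column).
image = [[10 * y + x for x in range(4)] for y in range(4)]
patch = crop_target(image, (1, 1, 3, 3))
```

Treating the maximum edges as exclusive is a design choice of this sketch; a real pipeline would follow whatever convention its detection network uses.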
Generally speaking, when a target can be detected at all, its position is usually fairly accurate, but its classification may be wrong, which is a false detection. Therefore, to keep detection efficient, the second sub-network does not re-detect the position; it only re-detects the classification, and the second classification of the non-salient target then replaces the first classification. That is, for a non-salient target, the final detection result consists of the position of the target from the first detection result and the classification of the target from the second detection result.
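The replacement rule above can be sketched as follows. The `Detection` record and the `merge_classification` helper are hypothetical names introduced for illustration, and the labels and scores are made-up values.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Detection:
    """Hypothetical record for one target in a detection result."""
    box: Tuple[int, int, int, int]  # position from the first sub-network
    label: str                      # classification
    score: float                    # confidence score of the classification


def merge_classification(first: Detection, second_label: str,
                         second_score: float) -> Detection:
    """Keep the position from the first result, take the class from the second."""
    return replace(first, label=second_label, score=second_score)


first = Detection(box=(12, 30, 20, 37), label="stain", score=0.41)
final = merge_classification(first, "dead_pixel", 0.93)
```

Because the record is immutable, the first detection result stays available for comparison after the merge.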
The present application gives two examples of how to determine non-salient targets.
In the first example, in some embodiments, the method further includes: calculating the area of each target according to its position in the first detection result; if the area of a target is smaller than a first threshold, the target is a non-salient target.
For example, some defects in diaphragm products are very small, which makes them hard to classify. A first threshold can therefore be set: if the area of a detected defect is smaller than the first threshold, the defect is considered to be at risk of misclassification and is treated as a non-salient target.
In the second example, in some embodiments, if the confidence score of the first classification of a target is smaller than a second threshold, the target is a non-salient target.
When outputting the first detection result, the first sub-network actually predicts the probability that the target belongs to each type and then outputs the type with the highest probability. This probability can be expressed as a confidence score. Therefore, if the confidence score of the first classification of a target is too low, for example smaller than the second threshold, other types may have similar confidence scores, and the target may have been misclassified.
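The two criteria can be sketched together as below. Combining them with a logical OR is an assumption of this sketch (the text presents them as separate examples), and the function names and threshold values are illustrative only.

```python
from typing import Tuple

# Hypothetical box type: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[int, int, int, int]


def box_area(box: Box) -> int:
    """Area in pixels of an axis-aligned box; degenerate boxes count as 0."""
    x_min, y_min, x_max, y_max = box
    return max(0, x_max - x_min) * max(0, y_max - y_min)


def is_non_salient(box: Box, score: float,
                   area_threshold: float, score_threshold: float) -> bool:
    """A target is non-salient if it is too small OR its class is too uncertain."""
    return box_area(box) < area_threshold or score < score_threshold


# Made-up thresholds: first threshold (area) = 100, second threshold (score) = 0.5.
small_but_confident = is_non_salient((0, 0, 5, 5), 0.95, 100, 0.5)
large_but_uncertain = is_non_salient((0, 0, 20, 20), 0.30, 100, 0.5)
large_and_confident = is_non_salient((0, 0, 20, 20), 0.90, 100, 0.5)
```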
To reduce the dependence of the first and second thresholds on manual experience, the present application also proposes determining them from the training of the target detection model. In some embodiments, the method further includes: in the training phase of the target detection model, determining the area of each target in the first detection result, sorting the targets by area, and using the area at a first sequence position in the area sequence as the first threshold; or sorting the targets in the first detection result by classification confidence score, and using the confidence score at a second sequence position in the confidence score sequence as the second threshold.
For example, during training, the targets detected by the first sub-network are sorted by area or by confidence score, and the last third of the targets are regarded as non-salient. Based on this ordering, the area at the position two-thirds of the way from the start can be used as the first threshold, and the confidence score at the position two-thirds of the way from the start can be used as the second threshold. The thresholds are thus adapted to the model training, which makes them robust.
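A minimal sketch of picking a threshold at the two-thirds position of a descending ordering is given below. The function name `threshold_at_rank` and the use of integer division to locate the position are assumptions, and the sample values are made up.

```python
from typing import Sequence


def threshold_at_rank(values: Sequence[float],
                      numerator: int = 2, denominator: int = 3) -> float:
    """Sort from high to low and return the value at the given fractional rank."""
    if not values:
        raise ValueError("at least one value is required")
    ordered = sorted(values, reverse=True)
    # Integer arithmetic avoids floating-point rounding when locating the rank.
    index = min(len(ordered) - 1, (len(ordered) * numerator) // denominator)
    return ordered[index]


# Made-up areas (in pixels) and confidence scores of detected targets.
areas = [90, 10, 50, 30, 70, 20, 80, 60, 40]
area_threshold = threshold_at_rank(areas)
score_threshold = threshold_at_rank([0.9, 0.8, 0.3])
```

Whether the value at the cut position itself counts as non-salient depends on whether the comparison against the threshold is strict, which the text leaves open.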
In some embodiments, the target detection network is implemented based on the Mask_RCNN algorithm, and the target classification network is implemented based on the EfficientNet algorithm.
Mask_RCNN integrates target detection and instance segmentation, so it can both classify a target and determine its position in the detection image, and it is simple to train and detects well.
EfficientNet is a multi-dimensional compound model scaling algorithm that combines three dimensions, namely network depth, network width, and image resolution, and balances speed and accuracy.
The embodiments of the present application can use the strengths of these two algorithms to implement the target detection network and the target classification network respectively. In other embodiments, the target detection network may instead be implemented with Faster_RCNN or similar, and the target classification network with Resnet or similar.
In some embodiments, the method further includes: counting the number of targets in the first detection result and assigning that number to a control variable; if the value of the control variable is not 0, selecting a target in the first detection result that has not yet been selected and judging whether it is a non-salient target; if the selected target is a non-salient target, performing the step of inputting the part of the detection image corresponding to the non-salient target into the second sub-network of the target detection model to obtain the second detection result output by the second sub-network, and decrementing the control variable by 1; and if the value of the control variable is 0, performing the step of determining the final detection result according to the first detection result and the second detection result.
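The control-variable loop can be sketched as below. This sketch assumes the control variable is decremented once for every examined target, salient or not, since otherwise the loop would not terminate; `process_targets` and its callable parameters are hypothetical names, and the scores are made up.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def process_targets(targets: List[T],
                    is_non_salient: Callable[[T], bool],
                    reclassify: Callable[[T], T]) -> List[T]:
    """Visit each target once, re-classifying only the non-salient ones."""
    remaining = len(targets)  # the control variable
    next_index = 0            # index of the next target not yet selected
    final: List[T] = []
    while remaining != 0:
        target = targets[next_index]
        next_index += 1
        if is_non_salient(target):
            target = reclassify(target)  # stand-in for the second sub-network
        final.append(target)
        remaining -= 1  # assumption: decremented for every examined target
    return final


# Toy run: targets are bare confidence scores; low scores get "re-classified".
scores = [0.9, 0.2, 0.7]
final = process_targets(scores, lambda s: s < 0.5, lambda s: 1.0)
```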
An example of target detection in which the targets are defects in a diaphragm product is described below with reference to FIG. 2.
Step 1: determine the detection image to be input into the first sub-network. For example, a diaphragm image `image` can first be captured with a camera and then scaled to a preset size of 1778×1778 pixels (generally the same size as the sample images used in the training phase), giving the scaled image `image_resize` as the detection image.
Step 2: send the detection image (for example `image_resize`) to the first sub-network `detect_model` for defect detection to obtain the detection result `detect_result`.
Step 3: count the total number `instance_num` of defect instances in `detect_result`.
If `instance_num` equals 0, the diaphragm product has no defects. Otherwise, traverse each defect instance and perform step 4 until `instance_num` equals 0; when `instance_num` equals 0, perform step 6 and output the final detection result.
Step 4: judge whether the `score` (confidence score) of the instance is greater than the second threshold and whether its `area` is greater than the first threshold. If both are greater than the corresponding thresholds, the defect has been identified, and the judgment continues with the next defect; otherwise, perform step 5.
Step 5: send each defect whose `score` or `area` is below the corresponding threshold (a non-salient defect) to the second sub-network `classify_model` for separate classification, and assign to the defect the confidence score of the top class in the classification result, that is, the highest probability value in the classification result, together with the corresponding class number.
Step 6: output the final detection result. That is, for the final detected and classified defects, mark each target defect in `image` according to the coordinates of its upper-left and lower-right corners, and output the final detection result `image_result`.
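Steps 2 to 5 above can be sketched as follows, with both sub-networks replaced by stand-in callables. The dictionary layout of an instance, the `inspect_image` function, and the stub models are assumptions of this sketch; camera capture, resizing, and drawing the result image are omitted.

```python
from typing import Callable, Dict, List

Image = List[List[int]]
Instance = Dict[str, object]  # assumed keys: "box", "label", "score", "area"


def inspect_image(image_resize: Image,
                  detect_model: Callable[[Image], List[Instance]],
                  classify_model: Callable[[Image], Dict[int, float]],
                  score_threshold: float,
                  area_threshold: float) -> List[Instance]:
    """Detect defects, then re-classify only the non-salient ones."""
    detect_result = detect_model(image_resize)            # step 2
    final: List[Instance] = []
    for instance in detect_result:                        # step 3: traverse
        salient = (instance["score"] > score_threshold    # step 4
                   and instance["area"] > area_threshold)
        if salient:
            final.append(instance)
            continue
        x0, y0, x1, y1 = instance["box"]                  # step 5: crop patch
        patch = [row[x0:x1] for row in image_resize[y0:y1]]
        probabilities = classify_model(patch)
        label = max(probabilities, key=probabilities.get)
        final.append({**instance, "label": label,
                      "score": probabilities[label]})
    return final


# Stand-in models returning fixed, made-up outputs.
def fake_detect(image: Image) -> List[Instance]:
    return [{"box": (0, 0, 3, 3), "label": 1, "score": 0.95, "area": 9},
            {"box": (1, 1, 2, 2), "label": 1, "score": 0.40, "area": 1}]


def fake_classify(patch: Image) -> Dict[int, float]:
    return {1: 0.2, 2: 0.8}


image_resize = [[0] * 4 for _ in range(4)]
result = inspect_image(image_resize, fake_detect, fake_classify,
                       score_threshold=0.5, area_threshold=4)
```

In this toy run the first defect passes both thresholds and is kept as is, while the second is routed through the stand-in classifier and keeps only its box from the first stage.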
In some embodiments, the target detection model is trained as follows: input a first training image into the first sub-network to obtain the first detection result output by the first sub-network; determine the non-salient targets in the first detection result, and extract the part of the first training image corresponding to each non-salient target as a second training image; input the second training image into the second sub-network to obtain the second detection result output by the second sub-network; determine a training loss value according to the first detection result, the second detection result, and the annotation information of the first training image, and update the parameters of the first sub-network and the second sub-network according to the training loss value.
For example, sample images of the diaphragm product are acquired and the defect regions are annotated, giving the detection data set `detect_data` that contains the first training images. The `detect_data` data set is input into the first sub-network `detect_model` for preliminary target detection, giving the detection result `detect_result`. Then, for each defect class, the scores and areas of the detected defects are sorted from high to low, and the values at the start of the last third are set as the score and area thresholds respectively. Based on these thresholds, the hard-to-detect defect data `classify_data` is extracted from `detect_result`, giving the second training images. The `classify_data` data set is input into the second sub-network to obtain the second detection result output by the second sub-network. A preset loss function is then used to calculate the training loss value, the parameters are updated by back propagation or a similar method, and the training of the target detection model is completed iteratively.
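The per-class extraction of hard-to-detect defects can be sketched as below. The instance layout, the function name `select_hard_instances`, and the choice to treat a value equal to the threshold as hard (mirroring the "greater than" test of step 4) are assumptions of this sketch; the sample data is made up.

```python
from collections import defaultdict
from typing import Dict, List

Instance = Dict[str, object]  # assumed keys: "label", "score", "area"


def select_hard_instances(detect_result: List[Instance]) -> List[Instance]:
    """Per class, keep instances at or below the last-third score/area cut."""
    by_label: Dict[object, List[Instance]] = defaultdict(list)
    for instance in detect_result:
        by_label[instance["label"]].append(instance)

    hard: List[Instance] = []
    for instances in by_label.values():
        cut = (2 * len(instances)) // 3  # start of the last third
        score_threshold = sorted((i["score"] for i in instances),
                                 reverse=True)[cut]
        area_threshold = sorted((i["area"] for i in instances),
                                reverse=True)[cut]
        hard.extend(i for i in instances
                    if not (i["score"] > score_threshold
                            and i["area"] > area_threshold))
    return hard


# Six made-up detections of one defect class.
detect_result = [{"label": "a", "score": s, "area": a}
                 for s, a in [(0.9, 60), (0.8, 50), (0.7, 40),
                              (0.6, 30), (0.2, 20), (0.1, 10)]]
classify_data = select_hard_instances(detect_result)
```

The image crops for these instances would then form the second training images; the loss function and the parameter update are not specified in the text and are left out here.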
本申请的实施例还提供了一种目标检测装置,用于实现如上任一所述的目标检测方法。Embodiments of the present application further provide a target detection apparatus, which is used to implement the target detection method described in any of the above.
具体地,图3示出了根据本申请一个实施例的一种目标检测装置的结构示意图。如图3所示,目标检测装置300包括:Specifically, FIG. 3 shows a schematic structural diagram of a target detection apparatus according to an embodiment of the present application. As shown in FIG. 3 , the target detection apparatus 300 includes:
第一检测单元310,用于将检测图像输入到目标检测模型的第一子网络中,得到第一子网络输出的第一检测结果。The first detection unit 310 is configured to input the detection image into the first sub-network of the target detection model to obtain the first detection result output by the first sub-network.
第二检测单元320,用于若第一检测结果中存在非显著目标,则将非显著目标在检测图像中的对应部分输入到目标检测模型的第二子网络中,得到第二子网络输出的第二检测结果。The second detection unit 320 is configured to input the corresponding part of the non-salient target in the detection image into the second sub-network of the target detection model if there is a non-salient target in the first detection result, and obtain the output of the second sub-network. The second test result.
确定单元330,用于根据第一检测结果和第二检测结果确定最终检测结果。The determining unit 330 is configured to determine the final detection result according to the first detection result and the second detection result.
在一些实施例中,第一子网络为目标检测网络,第一检测结果包括目标的位置和第一分类;第二子网络为目标分类网络,第二检测结果包括目标的第二分类;确定单元330,用于以非显著目标的第二分类替换第一分类,作为非显著目标的最终分类。In some embodiments, the first sub-network is a target detection network, and the first detection result includes the location of the target and the first classification; the second sub-network is a target classification network, and the second detection result includes the second classification of the target; the determining unit 330, for replacing the first classification with the second classification of the non-salient object as the final classification of the non-salient object.
在一些实施例中,第一检测单元310,用于根据第一检测结果中各目标的位置,计算各目标的面积;若一个目标的面积小于第一阈值,则该目标为非显著目标。In some embodiments, the first detection unit 310 is configured to calculate the area of each target according to the position of each target in the first detection result; if the area of one target is smaller than the first threshold, the target is a non-salient target.
在一些实施例中,若一个目标的第一分类的置信度得分小于第二阈值,则该目标为非显著目标。In some embodiments, an object is a non-salient object if its confidence score for the first classification is less than a second threshold.
在一些实施例中,该装置还包括,训练单元,用于在目标检测模型的训练阶段,确定第一检测结果中的各目标的面积,将各目标按面积大小进行排序,将面积序列中第一序列位置处的面积作为第一阈值;或者,将第一检测结果中的各目标按分类置信度得分高低进行排序,将置信度得分序列中第二序列位置处的置信度得分作为第二阈值。In some embodiments, the apparatus further includes a training unit, configured to determine the area of each target in the first detection result in the training stage of the target detection model, sort each target according to the size of the area, and place the first target in the area sequence. The area at a sequence position is used as the first threshold; or, the targets in the first detection result are sorted according to their classification confidence scores, and the confidence score at the second sequence position in the sequence of confidence scores is used as the second threshold .
在一些实施例中,目标检测网络基于Mask_RCNN算法实现,目标分类网络基于EfficientNet算法实现。In some embodiments, the object detection network is implemented based on the Mask_RCNN algorithm, and the object classification network is implemented based on the EfficientNet algorithm.
在一些实施例中,第一检测单元310,用于统计第一检测结果中的目标数量,将目标数量赋值给控制变量;若控制变量的值不为0,则选择第一检测结果中一个未被选择过的目标,对本次选择的目标进行是否为非显著目标的判断;若本次选择的目标为非显著目标,则使第二检测单元320将非显著目标在检测图像中的对应部分输入到目标检测模型的第二子网络中,得到第二子网络输出的第二检测结果,并将控制变量的值减1;若控制变量的值为0,则使确定单元330根据第一检测结果和第二检测结果确定最终检测结果。In some embodiments, the first detection unit 310 is configured to count the number of targets in the first detection result, and assign the target number to the control variable; if the value of the control variable is not 0, select one of the target numbers in the first detection result. For the selected target, judge whether the selected target is a non-salient target this time; if the target selected this time is a non-salient target, the second detection unit 320 will detect the corresponding part of the non-salient target in the detection image. Input into the second sub-network of the target detection model, obtain the second detection result output by the second sub-network, and subtract 1 from the value of the control variable; if the value of the control variable is 0, make the determination unit 330 according to the first detection The result and the second detection result determine the final detection result.
In some embodiments, the apparatus further includes a training unit configured to: input a first training image into the first sub-network to obtain the first detection result output by the first sub-network; determine the non-salient targets in the first detection result, and extract the corresponding part of each non-salient target in the first training image as a second training image; input the second training image into the second sub-network to obtain the second detection result output by the second sub-network; and determine a training loss value according to the first detection result, the second detection result and the annotation information of the first training image, and update the parameters of the first sub-network and the second sub-network according to the training loss value.
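Two parts of this training procedure can be sketched in plain Python: cropping the second training images and combining the losses. The application does not give a loss formula, so the weighted sum in `combined_loss` is purely an assumption, as are the area-based non-salience test, the `(x0, y0, x1, y1)` box convention, and all function names.

```python
from typing import Any, Dict, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # assumed convention: (x0, y0, x1, y1), end-exclusive
Image = List[List[int]]          # a toy image as rows of pixel values


def crop(image: Image, box: Box) -> Image:
    """Return the part of the image covered by the box."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def second_training_images(
    image: Image, detections: Sequence[Dict[str, Any]], area_threshold: float
) -> List[Image]:
    """Crop every detection whose box area is below the threshold (assumed criterion)."""
    crops = []
    for det in detections:
        x0, y0, x1, y1 = det["box"]
        if (x1 - x0) * (y1 - y0) < area_threshold:
            crops.append(crop(image, det["box"]))
    return crops


def combined_loss(
    detection_loss: float, classification_losses: Sequence[float], weight: float = 1.0
) -> float:
    """Assumed form: detection loss plus a weighted mean of the second-network losses."""
    if not classification_losses:
        return detection_loss
    mean_cls = sum(classification_losses) / len(classification_losses)
    return detection_loss + weight * mean_cls


# Toy 4x4 training image whose pixel values equal their row-major index.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
detections = [{"box": (0, 0, 4, 4)}, {"box": (1, 1, 3, 3)}]
crops = second_training_images(image, detections, area_threshold=5)
loss = combined_loss(0.8, [0.3, 0.5], weight=0.5)
print(crops, loss)
```

In a real training loop the combined value would be backpropagated through both sub-networks, so that both sets of parameters are updated from the same loss.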
It can be understood that the above target detection apparatus can implement each step of the target detection method provided in the foregoing embodiments, and the explanations given for the target detection method apply equally to the target detection apparatus; they are not repeated here.
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application. Referring to FIG. 4, at the hardware level, the electronic device includes a processor, and optionally further includes an internal bus, a network interface, and a memory. The memory may include internal memory, such as high-speed random-access memory (RAM), and may also include non-volatile memory, such as at least one disk memory. Of course, the electronic device may also include hardware required by other services.
The processor, the network interface and the memory may be connected to each other through the internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of presentation, only one bidirectional arrow is shown in FIG. 4, but this does not mean that there is only one bus or one type of bus.
The memory is used to store a program. Specifically, the program may include program code, and the program code includes computer operation instructions. The memory may include internal memory and non-volatile memory, and provides instructions and data to the processor.
The processor reads the corresponding computer program from the non-volatile memory into the internal memory and then runs it, forming the target detection apparatus at the logical level. The processor executes the program stored in the memory, and is specifically configured to perform the following operations:
inputting a detection image into the first sub-network of the target detection model to obtain the first detection result output by the first sub-network; if a non-salient target exists in the first detection result, inputting the corresponding part of the non-salient target in the detection image into the second sub-network of the target detection model to obtain the second detection result output by the second sub-network; and determining the final detection result according to the first detection result and the second detection result.
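The three operations above can be sketched end to end in Python, with both sub-networks replaced by stub callables. This is only an illustrative sketch: the detection dictionary layout (`box`, `label`, `score`), the `(x0, y0, x1, y1)` box convention, and the choice to treat a target as non-salient when either its area or its confidence score is below a threshold are assumptions. Replacing the first classification with the second follows the embodiment in which the second sub-network is a classification network.

```python
from typing import Any, Callable, Dict, List

Image = List[List[int]]
Detection = Dict[str, Any]  # assumed keys: "box" (x0, y0, x1, y1), "label", "score"


def detect(
    image: Image,
    first_subnetwork: Callable[[Image], List[Detection]],
    second_subnetwork: Callable[[Image], str],
    area_threshold: float,
    score_threshold: float,
) -> List[Detection]:
    """Run the first sub-network, then re-classify non-salient targets with the second."""
    first_result = first_subnetwork(image)
    final_result = []
    for det in first_result:
        x0, y0, x1, y1 = det["box"]
        area = (x1 - x0) * (y1 - y0)
        # Assumption: either criterion alone marks the target as non-salient.
        non_salient = area < area_threshold or det["score"] < score_threshold
        label = det["label"]
        if non_salient:
            region = [row[x0:x1] for row in image[y0:y1]]
            # The second classification replaces the first classification.
            label = second_subnetwork(region)
        final_result.append({**det, "label": label})
    return final_result


# Stub sub-networks standing in for trained models.
def fake_detector(img: Image) -> List[Detection]:
    return [
        {"box": (0, 0, 6, 6), "label": "A", "score": 0.90},
        {"box": (1, 1, 3, 3), "label": "A", "score": 0.95},  # small area
        {"box": (0, 0, 5, 5), "label": "B", "score": 0.30},  # low confidence
    ]


image = [[0] * 8 for _ in range(8)]
final = detect(image, fake_detector, lambda region: "C",
               area_threshold=10, score_threshold=0.5)
print([d["label"] for d in final])
```

Only the targets flagged as non-salient incur the cost of the second sub-network; salient targets keep their first classification unchanged.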
For other operations that the processor in this embodiment can perform by executing the program stored in the memory, refer to the relevant content of the method and apparatus embodiments of the present application; details are not repeated here.
The method performed by the target detection apparatus disclosed in the embodiment shown in FIG. 1 of the present application may be applied to a processor, or implemented by a processor. The processor may be an integrated circuit chip with signal processing capability. During implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps and logic block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium that is mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
The electronic device can also execute the method performed by the target detection apparatus in FIG. 1 and implement the functions of the target detection apparatus in the embodiment shown in FIG. 1; details are not repeated here.
The embodiments of the present application also provide a computer-readable storage medium that stores one or more programs. The one or more programs include instructions which, when executed by an electronic device that includes multiple application programs, cause the electronic device to execute the method performed by the target detection apparatus in the embodiment shown in FIG. 1, and specifically to perform:
inputting a detection image into the first sub-network of the target detection model to obtain the first detection result output by the first sub-network; if a non-salient target exists in the first detection result, inputting the corresponding part of the non-salient target in the detection image into the second sub-network of the target detection model to obtain the second detection result output by the second sub-network; and determining the final detection result according to the first detection result and the second detection result.
Those skilled in the art will appreciate that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture that includes instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include non-persistent memory in computer-readable media, in the form of random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include persistent and non-persistent, removable and non-removable media, which may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
The above descriptions are merely embodiments of the present application and are not intended to limit it. Those skilled in the art may make various modifications and variations to the present application. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present application shall fall within the scope of its claims.

Claims (13)

  1. A target detection method, comprising:
    inputting a detection image into a first sub-network of a target detection model to obtain a first detection result output by the first sub-network;
    if a non-salient target exists in the first detection result, inputting the corresponding part of the non-salient target in the detection image into a second sub-network of the target detection model to obtain a second detection result output by the second sub-network;
    determining a final detection result according to the first detection result and the second detection result.
  2. The method according to claim 1, characterized in that the first sub-network is a target detection network, and the first detection result includes a position and a first classification of a target;
    the second sub-network is a target classification network, and the second detection result includes a second classification of the target;
    the determining a final detection result according to the first detection result and the second detection result comprises: replacing the first classification of the non-salient target with its second classification, as the final classification of the non-salient target.
  3. The method according to claim 2, characterized in that the method further comprises:
    calculating the area of each target according to the position of each target in the first detection result;
    if the area of a target is smaller than a first threshold, the target is a non-salient target.
  4. The method according to claim 2, characterized in that, if the confidence score of the first classification of a target is smaller than a second threshold, the target is a non-salient target.
  5. The method according to claim 3 or 4, characterized in that the method further comprises:
    in the training stage of the target detection model,
    determining the area of each target in the first detection result, sorting the targets by area, and using the area at a first sequence position in the area sequence as the first threshold;
    or,
    sorting the targets in the first detection result by classification confidence score, and using the confidence score at a second sequence position in the confidence score sequence as the second threshold.
  6. The method according to claim 2, characterized in that the target detection network is implemented based on the Mask_RCNN algorithm, and the target classification network is implemented based on the EfficientNet algorithm.
  7. The method according to claim 1, characterized in that the method further comprises:
    counting the number of targets in the first detection result, and assigning the number of targets to a control variable;
    if the value of the control variable is not 0, selecting a target in the first detection result that has not yet been selected, and judging whether the selected target is a non-salient target; if the selected target is a non-salient target, performing the step of inputting the corresponding part of the non-salient target in the detection image into the second sub-network of the target detection model to obtain the second detection result output by the second sub-network, and decrementing the value of the control variable by 1;
    if the value of the control variable is 0, performing the step of determining the final detection result according to the first detection result and the second detection result.
  8. The method according to any one of claims 1-7, characterized in that the target detection model is trained in the following manner:
    inputting a first training image into the first sub-network to obtain the first detection result output by the first sub-network;
    determining the non-salient targets in the first detection result, and extracting the corresponding part of each non-salient target in the first training image as a second training image;
    inputting the second training image into the second sub-network to obtain the second detection result output by the second sub-network;
    determining a training loss value according to the first detection result, the second detection result and the annotation information of the first training image, and updating the parameters of the first sub-network and the second sub-network according to the training loss value.
  9. A target detection apparatus, characterized in that the target detection apparatus comprises:
    a first detection unit, configured to input a detection image into a first sub-network of a target detection model to obtain a first detection result output by the first sub-network;
    a second detection unit, configured to, if a non-salient target exists in the first detection result, input the corresponding part of the non-salient target in the detection image into a second sub-network of the target detection model to obtain a second detection result output by the second sub-network;
    a determination unit, configured to determine a final detection result according to the first detection result and the second detection result.
  10. The apparatus according to claim 9, wherein
    the determination unit is configured to replace the first classification of the non-salient target with its second classification, as the final classification of the non-salient target;
    the first detection unit is configured to calculate the area of each target according to the position of each target in the first detection result; if the area of a target is smaller than a first threshold, the target is a non-salient target.
  11. The apparatus according to claim 9, wherein the apparatus further comprises:
    a training unit, configured to, in the training stage of the target detection model, determine the area of each target in the first detection result, sort the targets by area, and use the area at a first sequence position in the area sequence as the first threshold; or, sort the targets in the first detection result by classification confidence score, and use the confidence score at a second sequence position in the confidence score sequence as the second threshold.
  12. The apparatus according to claim 9, wherein
    the first detection unit is configured to count the number of targets in the first detection result and assign the number of targets to a control variable; if the value of the control variable is not 0, select a target in the first detection result that has not yet been selected, and judge whether the selected target is a non-salient target; if the selected target is a non-salient target, cause the second detection unit to input the corresponding part of the non-salient target in the detection image into the second sub-network of the target detection model to obtain the second detection result output by the second sub-network, and decrement the value of the control variable by 1; if the value of the control variable is 0, cause the determination unit to determine the final detection result according to the first detection result and the second detection result.
  13. An electronic device, comprising:
    a processor; and
    a memory arranged to store computer-executable instructions which, when executed, cause the processor to perform the following target detection method:
    inputting a detection image into a first sub-network of a target detection model to obtain a first detection result output by the first sub-network; if a non-salient target exists in the first detection result, inputting the corresponding part of the non-salient target in the detection image into a second sub-network of the target detection model to obtain a second detection result output by the second sub-network; and determining a final detection result according to the first detection result and the second detection result.
PCT/CN2021/124860 2020-12-02 2021-10-20 Target detection method and apparatus, and electronic device WO2022116720A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011398827.4 2020-12-02
CN202011398827.4A CN112634201B (en) 2020-12-02 2020-12-02 Target detection method and device and electronic equipment

Publications (1)

Publication Number Publication Date
WO2022116720A1 true WO2022116720A1 (en) 2022-06-09

Family

ID=75307573

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/124860 WO2022116720A1 (en) 2020-12-02 2021-10-20 Target detection method and apparatus, and electronic device

Country Status (2)

Country Link
CN (1) CN112634201B (en)
WO (1) WO2022116720A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116311080A (en) * 2023-05-12 2023-06-23 重庆华悦生态环境工程研究院有限公司深圳分公司 Monitoring image detection method and device
CN118247493A (en) * 2024-05-23 2024-06-25 杭州海康威视数字技术股份有限公司 Fake picture detection and positioning method and device based on segmentation integrated learning

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112634201B (en) * 2020-12-02 2023-12-05 歌尔股份有限公司 Target detection method and device and electronic equipment
CN113657143B (en) * 2021-06-25 2023-06-23 中国计量大学 Garbage classification method based on classification and detection combined judgment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190130230A1 (en) * 2017-10-26 2019-05-02 Samsung Sds Co., Ltd. Machine learning-based object detection method and apparatus
CN110726724A (en) * 2019-10-22 2020-01-24 北京百度网讯科技有限公司 Defect detection method, system and device
CN110827247A (en) * 2019-10-28 2020-02-21 上海悦易网络信息技术有限公司 Method and equipment for identifying label
CN110991443A (en) * 2019-10-29 2020-04-10 北京海益同展信息科技有限公司 Key point detection method, image processing method, key point detection device, image processing device, electronic equipment and storage medium
CN111931920A (en) * 2020-09-25 2020-11-13 北京智芯微电子科技有限公司 Target detection method, device and storage medium based on cascade neural network
CN112634201A (en) * 2020-12-02 2021-04-09 歌尔股份有限公司 Target detection method and device and electronic equipment

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108573238A (en) * 2018-04-23 2018-09-25 济南浪潮高新科技投资发展有限公司 A kind of vehicle checking method based on dual network structure
CN109117879B (en) * 2018-08-03 2021-06-22 南京旷云科技有限公司 Image classification method, device and system
CN109635666B (en) * 2018-11-16 2023-04-18 南京航空航天大学 Image target rapid detection method based on deep learning
CN109727229B (en) * 2018-11-28 2023-10-20 歌尔股份有限公司 Method and device for detecting false solder
CN111044525B (en) * 2019-12-30 2021-10-29 歌尔股份有限公司 Product defect detection method, device and system
CN111414910B (en) * 2020-03-18 2023-05-02 上海嘉沃光电科技有限公司 Small target enhancement detection method and device based on double convolution neural network

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116311080A (en) * 2023-05-12 2023-06-23 重庆华悦生态环境工程研究院有限公司深圳分公司 Monitoring image detection method and device
CN116311080B (en) * 2023-05-12 2023-09-12 重庆华悦生态环境工程研究院有限公司深圳分公司 Monitoring image detection method and device
CN118247493A (en) * 2024-05-23 2024-06-25 杭州海康威视数字技术股份有限公司 Fake picture detection and positioning method and device based on segmentation integrated learning

Also Published As

Publication number Publication date
CN112634201A (en) 2021-04-09
CN112634201B (en) 2023-12-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21899754

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21899754

Country of ref document: EP

Kind code of ref document: A1