WO2021212482A1 - Method and apparatus for mining difficult case during target detection - Google Patents

Publication number
WO2021212482A1
WO2021212482A1 PCT/CN2020/086742 CN2020086742W WO2021212482A1 WO 2021212482 A1 WO2021212482 A1 WO 2021212482A1 CN 2020086742 W CN2020086742 W CN 2020086742W WO 2021212482 A1 WO2021212482 A1 WO 2021212482A1
Authority
WO
WIPO (PCT)
Prior art keywords
result
image
detection
detected
tracking
Prior art date
Application number
PCT/CN2020/086742
Other languages
French (fr)
Chinese (zh)
Inventor
晋周南
孙叠
刘新春
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2020/086742 (WO2021212482A1)
Priority to CN202080004676.1A (CN112639872B)
Publication of WO2021212482A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Definitions

  • This application relates to the field of artificial intelligence, and in particular to a method and device for mining difficult cases in target detection.
  • the data-driven deep learning target detection algorithm is a method to obtain information such as location information and category information of objects such as vehicles and people by analyzing the acquired images.
  • the target detection algorithm usually includes two stages: training and inference. The training stage is the process in which the algorithm learns from data, and the inference stage is the stage in which the output of the training stage is used to analyze the information contained in an image.
  • the analysis result of the inference stage may be correct or incorrect. Errors in the analysis result fall into two cases: missed detection and false detection. An object that is missed is a difficult case of missed detection, and an object that is falsely detected is a difficult case of false detection. Exemplarily, (a), (b) and (c) of FIG.
  • the road image includes a traffic sign 101, a traffic sign 102, and a vehicle 103, and a car sticker 104 is affixed to the vehicle 103.
  • detection frame 1 correctly marks traffic sign 101
  • detection frame 2 correctly marks traffic sign 102
  • detection frame 1 correctly marks traffic sign 101, but traffic sign 102 is not detected; that is, a missed detection occurs.
  • the traffic sign 102 in this image is a difficult case of missed detection; in Figure 1(c), detection frame 1 correctly marks traffic sign 101, and detection frame 2 correctly marks traffic sign 102.
  • detection frame 3 marks the car sticker 104; that is, the car sticker 104 is mistakenly detected as a traffic sign, and a false detection occurs.
  • the car sticker 104 in this image is a difficult case of false detection.
  • the difficult cases of missed detection and the difficult cases of false detection constitute the difficult case data. Adding difficult case data to the training stage is an effective way to reduce the probability of errors in the analysis results.
  • Difficult case mining is a method of extracting a difficult case data set. Its purpose is to extract the difficult cases of missed detection and the difficult cases of false detection.
  • Commonly used difficult case mining methods fall into two categories: supervised difficult case mining and unsupervised difficult case mining.
  • supervised difficult case mining requires a large amount of labeled data, and data labeling requires a lot of manpower; the cost is very high, especially for large-scale data.
  • for unsupervised difficult case mining, how to mine difficult cases effectively and extract the difficult cases of missed detection and false detection more accurately and efficiently is a problem that needs to be solved.
  • the embodiments of the present application provide a method and device for mining difficult cases in target detection, which can implement difficult case mining more effectively and extract difficult case data more accurately.
  • this application provides a method and device for mining difficult cases in target detection.
  • the method may include: obtaining the image to be detected; analyzing the image to be detected with a preset target detection algorithm to obtain the detection result of the image to be detected; analyzing the image to be detected with a preset single target tracking algorithm to obtain the tracking result of the image to be detected; and obtaining, according to the detection result, the tracking result and preset rules, the discrimination result of each object in the image to be detected.
  • the discrimination result includes: successful matching, missed detection, false detection, new appearance and end; an object whose discrimination result is a missed detection is determined to be a difficult case of missed detection, and an object whose discrimination result is a false detection is determined to be a difficult case of false detection.
  • the detection result includes: the category of one or more detection objects, the detection position of one or more detection objects, and the classification accuracy value of one or more detection objects;
  • the tracking result includes: the tracking position of one or more tracking objects and the tracking confidence of one or more tracking objects;
  • the discrimination results include: successful matching, missed detection, false detection, new appearance and end.
  • the single target tracking algorithm and the target detection algorithm are combined to mine difficult cases in target detection: the tracking results of the single target tracking algorithm are applied to difficult case mining, and difficult cases of missed detection are distinguished from difficult cases of false detection. In this way, both kinds of difficult cases can be extracted more accurately, and difficult cases can be mined more effectively.
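The last step of the method, turning per-object discrimination results into the two kinds of difficult cases, can be sketched as follows. This is a minimal illustration, not the application's implementation: the `Discrimination` enum, the `extract_difficult_cases` function, and the object identifiers are hypothetical names introduced here.

```python
from enum import Enum


class Discrimination(Enum):
    """The five discrimination results named in the text (enum is hypothetical)."""
    MATCHED = "successful matching"
    MISSED = "missed detection"
    FALSE = "false detection"
    NEW = "new appearance"
    END = "end"


def extract_difficult_cases(results):
    """Split objects into difficult cases of missed and of false detection.

    `results` maps a hypothetical object id to its Discrimination value.
    Objects with any other result are not difficult cases and are dropped.
    """
    missed = [oid for oid, r in results.items() if r is Discrimination.MISSED]
    false = [oid for oid, r in results.items() if r is Discrimination.FALSE]
    return missed, false
```

With the road image example above, a missed traffic sign 102 would land in the first list and a falsely detected car sticker 104 in the second.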
  • the method includes: obtaining the association result of the first object in the image to be detected according to the detection result, the tracking result and preset association rules; making a preliminary discrimination of the first object in the image to be detected according to the association result, and obtaining the preliminary discrimination result of the first object in the image to be detected; and determining the discrimination result of the first object in the image to be detected according to the preliminary discrimination result of the first object;
  • the association results include: successful matching, the first association result, and the second association result;
  • the preliminary discrimination results include: successful matching, missed detection, false detection, possible missed detection, possible false detection, end;
  • the first association result is detection
  • because the tracking result is used in target detection, the position information of the detected target can be effectively predicted, which improves the accuracy of difficult case mining.
  • obtaining the association result of the first object in the image to be detected includes: selecting the objects of the same category in the detection result and the tracking result; constructing a matrix with the tracking objects as rows and the detection objects as columns, or with the detection objects as rows and the tracking objects as columns; and applying an association matching algorithm to the matrix to obtain the association result of the first object in the image to be detected.
  • the association matching algorithm is used to match the detection result against the tracking result, which ensures that each detection object in the detection result corresponds to at most one tracking object in the tracking result and improves the accuracy of difficult case mining.
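The matrix-based association can be illustrated with a small sketch. The text does not say which association matching algorithm or which matrix entries are used, so everything specific here is an assumption: the entries are intersection-over-union (IoU) scores, the matcher is a simple greedy one-to-one assignment (a stand-in, not necessarily the algorithm intended by the application), and the names `iou`, `associate`, and `min_iou` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, min_iou=0.5):
    """Greedy one-to-one association (assumed stand-in for the matching algorithm).

    `tracks` and `detections` are lists of (category, box). Returns the matched
    (track_index, detection_index) pairs, the detections with no track (first
    association result) and the tracks with no detection (second association result).
    """
    # Rows are tracking objects, columns are detection objects;
    # objects of different categories can never be matched.
    matrix = [[iou(tb, db) if tc == dc else 0.0 for dc, db in detections]
              for tc, tb in tracks]
    pairs = sorted(((matrix[i][j], i, j)
                    for i in range(len(tracks))
                    for j in range(len(detections))), reverse=True)
    used_t, used_d, matches = set(), set(), []
    for score, i, j in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to be the same object
        if i in used_t or j in used_d:
            continue  # each row and each column is used at most once
        matches.append((i, j))
        used_t.add(i)
        used_d.add(j)
    unmatched_detections = [j for j in range(len(detections)) if j not in used_d]
    unmatched_tracks = [i for i in range(len(tracks)) if i not in used_t]
    return matches, unmatched_detections, unmatched_tracks
```

Because every row and column is consumed at most once, each detection object ends up with at most one tracking object, which is the property the text attributes to the association step.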
  • the preliminary discrimination rules include: for a first object whose association result is a successful match, the preliminary discrimination result is a successful match; for a first object whose association result is the first association result, the preliminary discrimination result is a possible false detection; for a first object whose association result is the second association result and whose tracking confidence is greater than or equal to the first confidence threshold, the preliminary discrimination result is a missed detection; for a first object whose association result is the second association result and whose tracking confidence is less than or equal to the second confidence threshold, the preliminary discrimination result is end; for a first object whose association result is the second association result and whose tracking confidence is less than the first confidence threshold and greater than the second confidence threshold, the preliminary discrimination result is a possible missed detection;
  • wherein the second confidence threshold is less than the first confidence threshold.
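The preliminary discrimination rules above amount to a small decision function. The sketch below follows the stated rules; the string labels, the function name `preliminary_discrimination`, and the threshold values 0.8 and 0.3 are illustrative assumptions, since the text only requires that the second threshold be less than the first.

```python
def preliminary_discrimination(association, tracking_confidence=None, t1=0.8, t2=0.3):
    """Map an association result to a preliminary discrimination result.

    `association` is "matched", "first" (in detection result only) or
    "second" (in tracking result only). t1 and t2 are the first and second
    confidence thresholds; their default values are placeholders.
    """
    if not t2 < t1:
        raise ValueError("second confidence threshold must be less than the first")
    if association == "matched":
        return "successful matching"
    if association == "first":
        return "possible false detection"
    if association == "second":
        if tracking_confidence is None:
            raise ValueError("a tracked object needs a tracking confidence")
        if tracking_confidence >= t1:
            return "missed detection"
        if tracking_confidence <= t2:
            return "end"
        return "possible missed detection"  # t2 < confidence < t1
    raise ValueError(f"unknown association result: {association!r}")
```

Only the two "possible" outcomes remain undecided after this step and need the adjacent frames to be resolved.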
  • determining the discrimination result of the first object in the image to be detected according to its preliminary discrimination result includes: for a first object whose preliminary discrimination result is a successful match, determining that the discrimination result is a successful match; for a first object whose preliminary discrimination result is a missed detection, determining that the discrimination result is a missed detection; for a first object whose preliminary discrimination result is end, determining that the discrimination result is end;
  • for a first object whose preliminary discrimination result is a possible false detection or a possible missed detection, determining the discrimination result in combination with the discrimination results of the S frame images adjacent to the image to be detected, where S>1.
  • the judgment results of the adjacent S-frame images are combined to determine the judgment result, which can effectively reduce the probability of misjudgment of the difficult cases of false detection and the difficult cases of missed detection.
  • the association result of the object corresponding to the first object whose preliminary discrimination result is a possible missed detection is the second association result, and
  • the tracking confidence of that object is less than or equal to the second confidence threshold, the discrimination result of the first object whose preliminary discrimination result is a possible missed detection is determined to be end; if, in each frame image from the first frame image to the S-th frame image adjacent to the image to be detected, the association result of the object corresponding to that first object is the second association result, and
  • the tracking confidence of the object is less than the
  • for a first object whose preliminary discrimination result is a possible false detection: if, in each frame image from the first frame image to the S-th frame image adjacent to the image to be detected, the association result of the object corresponding to the first object is a successful match and the tracking confidence of that object is greater than the first confidence threshold, the discrimination result of the first object is determined to be new appearance; otherwise, the discrimination result of the first object is determined to be a false detection.
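The rule for resolving a possible false detection can be sketched as below. Only this branch is shown, because the corresponding rule for a possible missed detection is incomplete in the text above. The function name, the tuple layout of the per-frame records, and the default threshold 0.8 are hypothetical.

```python
def resolve_possible_false_detection(adjacent_frames, t1=0.8):
    """Decide between "new appearance" and "false detection".

    `adjacent_frames` holds one (association, tracking_confidence) pair per
    adjacent frame, for the object corresponding to the candidate. The text
    requires S > 1 adjacent frames; t1 is the first confidence threshold
    (its value here is a placeholder).
    """
    if len(adjacent_frames) <= 1:
        raise ValueError("S must be greater than 1")
    # The candidate is a genuinely new object only if it is matched with
    # high tracking confidence in every one of the S adjacent frames.
    confirmed = all(assoc == "matched" and conf > t1
                    for assoc, conf in adjacent_frames)
    return "new appearance" if confirmed else "false detection"
```

A single adjacent frame in which the object is unmatched, or matched with confidence at or below the threshold, is enough to classify it as a false detection.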
  • the S frame images adjacent to the image to be detected are S frame images whose frame numbers are greater than the frame number of the image to be detected;
  • if the number of remaining undetected image frames is less than S, the S frame images adjacent to the image to be detected are S frame images whose frame numbers are smaller than the frame number of the image to be detected. In this way, when the number of remaining undetected image frames is insufficient, the information of the S frame images before the image to be detected can be used to assist the discrimination of the current image to be detected, which solves the problem of insufficient remaining frames.
  • reverse-order discrimination, from back to front, can be adopted.
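Choosing the S adjacent frames can be sketched as a small index helper. It assumes 1-based frame numbers (the text states t>0) and returns the nearest frame first; the function name and the error handling for sequences that are too short are assumptions.

```python
def adjacent_frame_indices(t, total_frames, s):
    """Return the frame numbers of the S frames adjacent to frame t (1-based).

    Later frames are preferred. If fewer than S frames remain after t,
    the S frames before t are used instead, nearest frame first.
    """
    remaining = total_frames - t
    if remaining >= s:
        return list(range(t + 1, t + s + 1))
    if t - s < 1:
        raise ValueError("sequence too short to provide S adjacent frames")
    return list(range(t - 1, t - s - 1, -1))
```

For a 10-frame sequence with S = 3, frame 3 looks ahead to frames 4 to 6, while frame 9 has only one frame left and therefore looks back to frames 8, 7 and 6.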
  • the preset single target tracking algorithm is a single target tracking algorithm based on deep learning.
  • Using a single target tracking algorithm based on deep learning, which analyzes images using deep semantic features, can effectively predict the position information of the matching result and improve the accuracy of extracting difficult cases of missed detection and difficult cases of false detection.
  • the present application also provides a device for mining difficult cases in target detection, which can implement the method for mining difficult cases in target detection described in the first aspect.
  • the device can implement the above method by software, hardware, or by hardware executing corresponding software.
  • the device may include: an image acquisition unit, a target detection unit, a target tracking unit, and a difficult case mining unit.
  • the image acquisition unit is used to obtain the image to be detected;
  • the target detection unit is used to analyze the image to be detected using a preset target detection algorithm to obtain the detection result of the image to be detected;
  • the target tracking unit is used to analyze the image to be detected using a preset single target tracking algorithm to obtain the tracking result of the image to be detected;
  • the detection result includes: the category of one or more detection objects, the detection position of one or more detection objects, and the classification accuracy value of one or more detection objects;
  • the tracking result includes: the tracking position of one or more tracking objects and the tracking confidence of one or more tracking objects;
  • the difficult case mining unit is used to obtain the discrimination result of each object in the image to be detected according to the detection result, the tracking result and preset rules;
  • the discrimination result includes: successful matching, missed detection, false detection, new appearance and end; the difficult case mining unit is also used
  • the difficult case mining unit is specifically used to: obtain the association result of the first object in the image to be detected according to the detection result, the tracking result and preset association rules;
  • make a preliminary discrimination of the first object in the image to be detected according to the association result of the first object, and obtain the preliminary discrimination result of the first object in the image to be detected;
  • determine the discrimination result of the first object in the image to be detected according to the preliminary discrimination result of the first object; wherein the association result includes: successful matching, the first association result and the second association result;
  • the first association result is that the first object exists in the detection result and does not exist in the tracking result;
  • the second association result is that the first object does not exist in the detection result and exists in the tracking result;
  • the preliminary discrimination result includes: successful matching, missed detection, false detection, possible missed detection, possible false detection and end.
  • the difficult case mining unit obtains the association result of the first object in the image to be detected according to the detection result, the tracking result and the preset association rules, which specifically includes: selecting the objects of the same category in the detection result and the tracking result;
  • constructing a matrix with the tracking objects as rows and the detection objects as columns, or with the detection objects as rows and the tracking objects as columns; and applying the association matching algorithm to the matrix to obtain the association result of the first object in the image to be detected.
  • the preliminary discrimination rules include: for a first object whose association result is a successful match, the preliminary discrimination result is a successful match; for a first object whose association result is the first association result, the preliminary discrimination result is a possible false detection; for a first object whose association result is the second association result and whose tracking confidence is greater than or equal to the first confidence threshold, the preliminary discrimination result is a missed detection; for a first object whose association result is the second association result and whose tracking confidence is less than or equal to the second confidence threshold, the preliminary discrimination result is end; for a first object whose association result is the second association result and whose tracking confidence is less than the first confidence threshold and greater than the second confidence threshold, the preliminary discrimination result is a possible missed detection;
  • wherein the second confidence threshold is less than the first confidence threshold.
  • the difficult case mining unit determines the discrimination result of the first object in the image to be detected according to the preliminary discrimination result of the first object, which specifically includes: for a first object whose preliminary discrimination result is a successful match, determining that the discrimination result is a successful match; for a first object whose preliminary discrimination result is a missed detection, determining that the discrimination result is a missed detection; for a first object whose preliminary discrimination result is end, determining that the discrimination result is end;
  • for a first object whose preliminary discrimination result is a possible false detection or a possible missed detection, determining the discrimination result in combination with the discrimination results of the S frame images adjacent to the image to be detected, where S>1.
  • the difficult case mining unit determines, for a first object whose preliminary discrimination result is a possible false detection or a possible missed detection, the discrimination result in combination with the discrimination results of the S frame images adjacent to the image to be detected, which specifically includes:
  • for a first object whose preliminary discrimination result is a possible missed detection:
  • the association result of the object corresponding to the first object whose preliminary discrimination result is a possible missed detection is the second association result and the tracking confidence of that object is less than or equal to the second confidence threshold, the discrimination result of the first object is determined to be end; if, in the frame images from the first frame image to the S-th frame image adjacent to the image to be detected,
  • the association results of the objects corresponding to the first object that may be missed by the preliminary discrimination result are all second association results, and the tracking confidence of the objects corresponding to the first object that may be missed by the preliminary discrimination result is less than that of the first object.
  • the discrimination result of the first object whose preliminary discrimination result is a possible missed detection is determined to be a missed detection.
  • the difficult case mining unit determines, for a first object whose preliminary discrimination result is a possible false detection or a possible missed detection, the discrimination result in combination with the discrimination results of the S frame images adjacent to the image to be detected, which specifically includes:
  • for a first object whose preliminary discrimination result is a possible false detection: if, in each frame image from the first frame image to the S-th frame image adjacent to the image to be detected,
  • the association result of the object corresponding to the first object is a successful match and the tracking confidence of that object is greater than the first confidence threshold,
  • the discrimination result of the first object is determined to be new appearance; otherwise, the discrimination result of the first object is determined to be a false detection.
  • the S frame images adjacent to the image to be detected are S frame images whose frame numbers are greater than the frame number of the image to be detected;
  • if the number of remaining undetected images is less than the preset frame number S, the S frame images adjacent to the image to be detected are S frame images whose frame numbers are smaller than the frame number of the image to be detected.
  • the preset single target tracking algorithm is a single target tracking algorithm based on deep learning.
  • an embodiment of the present application provides a device that can implement the method for mining difficult cases in target detection described in the first aspect.
  • the device may be a server.
  • the device may include a processor and a memory.
  • the processor is configured to support the device to perform the corresponding function in the method of the first aspect described above.
  • the memory is used for coupling with the processor, and it stores the necessary program instructions and data of the device.
  • embodiments of the present application provide a computer-readable storage medium that includes computer instructions; when the computer instructions run on a device, the device is caused to perform the method for mining difficult cases in target detection described in any of the above aspects and possible designs.
  • the embodiments of the present application provide a computer program product.
  • when the computer program product runs on a computer,
  • the computer is caused to execute the method for mining difficult cases in target detection described in any of the above aspects and possible designs.
  • the embodiments of the present application also provide a chip system, which includes a processor and may also include a memory, and which is used to implement the method for mining difficult cases in target detection described in any of the above aspects and possible designs.
  • any device, computer-readable storage medium, computer program product or chip system provided above is used to execute the corresponding method provided above. Therefore, for the beneficial effects that can be achieved, refer to the beneficial effects of the corresponding solution in the corresponding method; they are not repeated here.
  • FIG. 1 is a schematic diagram of a scenario to which the technical solution provided by an embodiment of the application is applicable;
  • FIG. 2 is a schematic diagram of a device to which the technical solution provided in an embodiment of the application is applicable;
  • FIG. 3 is a first schematic diagram of a method for mining difficult cases in target detection according to an embodiment of this application;
  • FIG. 4 is a second schematic diagram of a method for mining difficult cases in target detection according to an embodiment of this application;
  • FIG. 5 is a first structural diagram of an apparatus provided by an embodiment of this application;
  • FIG. 6 is a second structural diagram of an apparatus provided by an embodiment of this application;
  • FIG. 7 is a third structural diagram of an apparatus provided by an embodiment of this application.
  • the term “plurality” herein refers to two or more.
  • the terms “first” and “second” herein are used to distinguish different objects, rather than to describe a specific order of objects.
  • the first threshold and the second threshold are only for distinguishing different thresholds, and their order is not limited.
  • the term “and/or” herein only describes an association relationship between associated objects and indicates that three relationships are possible; for example, A and/or B can mean: A exists alone, A and B exist at the same time, or B exists alone.
  • words such as “exemplary” or “for example” are used to give examples, illustrations, or explanations. Any embodiment or design solution described as “exemplary” or “for example” in the embodiments of the present application should not be construed as being more preferable or advantageous than other embodiments or design solutions. Rather, words such as “exemplary” or “for example” are used to present related concepts in a specific manner.
  • FIG. 2 is a schematic structural diagram of a device 100 provided in an embodiment of this application.
  • the device 100 includes at least one processor 110, a communication line 120, a memory 130, and at least one communication interface 140.
  • the processor 110 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the program of this application.
  • the communication line 120 may include a path to transmit information between the aforementioned components.
  • the communication interface 140 uses any transceiver-like device to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
  • the memory 130 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM) or another type of dynamic storage device that can store information and instructions.
  • the memory 130 can also be an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these.
  • the memory 130 may exist independently, and is connected to the processor 110 through a communication line 120.
  • the memory 130 may also be integrated with the processor 110.
  • the memory 130 is used to store the computer-executable instructions for executing the solution of the present application, and their execution is controlled by the processor 110.
  • the processor 110 is configured to execute computer-executable instructions stored in the memory 130, so as to implement the method for mining difficult cases in target detection provided in the following embodiments of the present application.
  • the computer-executable instructions in the embodiments of the present application may also be referred to as application program codes, which are not specifically limited in the embodiments of the present application.
  • the processor 110 may include one or more CPUs, such as CPU0 and CPU1 in FIG. 2.
  • the device 100 may include multiple processors, such as the processor 110 and the processor 111 in FIG. 2. Each of these processors can be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor.
  • the processor here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).
  • the device 100 may further include an output device 150 and an input device 160.
  • the output device 150 communicates with the processor 110 and can display information in a variety of ways.
  • the output device 150 may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, a projector, or the like.
  • the input device 160 communicates with the processor 110, and can receive user input in a variety of ways.
  • the input device 160 may be a mouse, a keyboard, a touch screen device, a sensor device, or the like.
  • the above-mentioned device 100 may be a general-purpose device or a special-purpose device.
  • the device 100 may be a vehicle-mounted device, a desktop computer, a portable computer, a network server, a personal digital assistant (PDA), a mobile phone, a tablet computer, a wireless terminal device, an embedded device, or a device with a structure similar to that in Figure 2.
  • the embodiment of the present application does not limit the type of the device 100.
  • the structure of the device 100 shown in FIG. 2 is only used as an example, and is not used to limit the technical solution of the present application.
  • the device 100 may also be in other forms and may also include other components.
  • the embodiment of the present application provides a method for mining difficult cases in target detection, which combines target detection algorithms and single target tracking algorithms to mine difficult cases in target detection. Difficult case mining can be carried out more effectively, and difficult case data can be extracted more accurately.
  • the method for mining difficult cases in target detection can be executed by the above-mentioned hardware device or by an apparatus that supports scientific computation; the apparatus can be a chip or a chip system, a computer-readable storage medium, or a computer program product; the embodiment of this application does not limit this.
  • the embodiment of the present application provides a method for mining difficult cases in target detection, which can be applied to the device shown in FIG. 2. As shown in Figure 3, the method may include:
  • S301 Acquire an image to be detected.
  • a sequence of images is a group of images with time continuity obtained by decimating video frames.
  • the video can be captured by a camera.
  • Perform frame extraction processing on the captured video to obtain the original sequence image.
  • Analyze each frame of the original sequence image to obtain the information of one or more objects in each frame.
  • an object whose analysis result is wrong is hard case data; the hard case data includes hard cases of missed detection and hard cases of false detection.
  • the t-th frame image is the image to be detected; where t>0.
  • the images to be detected are acquired in the order from front to back.
  • the images to be detected are acquired in the order from back to front.
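The frame extraction and acquisition order described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `extract_frames`, the fixed sampling `step`, and the `reverse` flag are assumptions introduced here, and the video is stood in for by any Python sequence.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def extract_frames(video_frames: Sequence[T], step: int, reverse: bool = False) -> List[T]:
    """Keep every `step`-th frame so the result stays time-continuous.

    `reverse=False` yields the front-to-back order; `reverse=True` yields
    the back-to-front order. Both names are hypothetical.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    sampled = list(video_frames[::step])
    return sampled[::-1] if reverse else sampled


# Frames are represented by their indices purely for illustration.
front_to_back = extract_frames(list(range(10)), step=3)
back_to_front = extract_frames(list(range(10)), step=3, reverse=True)
```

Each element of the returned list would then be taken in turn as the image to be detected.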
  • the embodiment of the present application takes difficult case mining for target detection on sequence images as an example; this does not constitute a limitation on the technical solution of this application.
  • the method for mining difficult cases in target detection provided by the embodiments of the present application can also be applied to mining difficult cases in sequence data such as laser point clouds.
  • S302 Use a preset target detection algorithm to analyze the image to be detected, and obtain a detection result of the image to be detected.
  • Object detection is a method of identifying objects and their positions in an image.
  • the preset target detection algorithm can be any target detection algorithm in conventional technologies; for example, the YOLO (you only look once: unified, real-time object detection) algorithm.
  • the detection result may include: the category of one or more detection objects, the position of each detection object (referred to as the detection position in this application), the classification accuracy value of each detection object, and other information.
  • the detection object is the recognition target.
  • the category of the detection object is used to distinguish the type of the detection object; for example, the category of the detection object may include people, traffic signs, buildings, vehicles, etc.; for example, in (a), (b) and (c) of Figure 1, the category of the detected object is a traffic sign.
  • the position of the detection object may be the coordinates of the detection object in the image.
  • the classification accuracy value of a detection object is the probability, output by the target detection algorithm, that the detection object belongs to the given category; when the classification accuracy value is greater than the set first value, the target detection algorithm determines that a detection object of the given category has been recognized.
  • the device can save the detection result of the image to be detected.
  • the device saves a template information table
  • the template information table includes the detection result of the image to be detected and related information; for example, the image frame number, object serial number, object category, object location, classification accuracy value.
  • the template information table includes the information shown in Table 1.
  • the image frame number is the sequence number of the frame in which the detection object is located.
  • the object serial number is the serial number of the detected object (optionally, the serial number of the detected object may be manually labeled).
  • the object category is the category of the detection object.
  • the object position is the position information of the detected object; for example, (x1, y1) and (x2, y2) are the coordinates of the upper left corner and the lower right corner of object 1 in the first frame of image, and represent the position of object 1 in the first frame of image.
  • the classification accuracy value is the classification accuracy value of the detection object.
  • the relevant information in the template information table can be updated.
  • the first frame image is called a template image.
  • the second frame of the sequence image is obtained as the image to be detected; the second frame image is analyzed using the preset target detection algorithm to obtain the detection result of the second frame image; the relevant information in the template information table is updated according to the detection result of the second frame image, and the information in the template information table is then as shown in Table 2.
  • the template information table in the embodiment of the present application is only an exemplary description. In actual applications, the template information table may also be in other forms, which are not limited in the embodiment of the present application.
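One possible in-memory form of the template information table is sketched below. The field names follow the columns listed above (image frame number, object serial number, object category, object location, classification accuracy value); the class name `TemplateEntry`, the helper `update_template_table`, the dictionary keyed by object serial number, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# (x1, y1, x2, y2): upper-left and lower-right corners of the object
Box = Tuple[float, float, float, float]


@dataclass
class TemplateEntry:
    frame_number: int               # sequence number of the frame containing the object
    object_id: int                  # object serial number
    category: str                   # object category, e.g. "traffic sign"
    box: Box                        # object position in the image
    classification_accuracy: float  # probability output by the detector


def update_template_table(table: Dict[int, TemplateEntry], entry: TemplateEntry) -> None:
    """Insert a new row, or overwrite the existing row for the same object."""
    table[entry.object_id] = entry


table: Dict[int, TemplateEntry] = {}
# Detection of object 1 in frame 1, then the same object detected again in frame 2.
update_template_table(table, TemplateEntry(1, 1, "traffic sign", (10.0, 20.0, 50.0, 60.0), 0.93))
update_template_table(table, TemplateEntry(2, 1, "traffic sign", (12.0, 21.0, 52.0, 61.0), 0.95))
```

Keying by object serial number means the table always holds the most recent information for each object, which is one way to read "the relevant information in the template information table can be updated".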
  • S303 Use a preset single-target tracking algorithm to analyze the image to be detected, and obtain a tracking result of the image to be detected.
  • the single-target tracking algorithm is an algorithm that predicts the size and position of the target in subsequent frames under the condition of a given target size and position in the initial frame.
  • the preset single-target tracking algorithm may be any single-target tracking algorithm based on deep learning in conventional technologies; for example, a correlation filter (CF, correlation filter) algorithm.
  • Using a single-target tracking algorithm based on deep learning and analyzing images based on deep semantic features can effectively predict the location information of the matching results, thereby improving the accuracy of difficult case mining.
  • the given initial frame can be the first frame of the sequence image.
  • the target of the initial frame may be a detection object obtained by performing target detection on the first frame of image.
  • the information of the first frame of the sequence image can be obtained from the template information table saved by the device.
  • the tracking result may include information such as the position of one or more tracking objects (referred to as the tracking position in this application), and the tracking confidence of each tracking object.
  • the tracking confidence is used to reflect the reliability of each tracking result. The higher the tracking confidence, the higher the reliability of the tracking result; when the tracking confidence is greater than the set second value, the single-target tracking algorithm determines that the given target has been tracked correctly.
  • the output of the single-target tracking algorithm based on deep learning is the last-layer feature map f of the network part of the single-target tracking algorithm, and the position information of each tracking object; where f is a matrix of size f_w × f_h.
  • the tracking confidence is the maximum response score in f; it can be expressed as max(f(i,j)).
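The tracking confidence max(f(i,j)) can be computed directly from the response map. The sketch below represents f as a nested list so it needs no external library; the function names and the sample values are assumptions for illustration only.

```python
from typing import List


def tracking_confidence(response_map: List[List[float]]) -> float:
    """Return max(f(i, j)) over the last-layer feature map f."""
    return max(max(row) for row in response_map)


def is_tracked(response_map: List[List[float]], second_value: float) -> bool:
    """The target counts as correctly tracked when the confidence exceeds the set second value."""
    return tracking_confidence(response_map) > second_value


# A hypothetical 2 x 2 response map.
f = [[0.1, 0.4],
     [0.9, 0.2]]
confidence = tracking_confidence(f)
```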
  • S304 Obtain a discrimination result of each object in the image to be detected according to the detection result of the image to be detected, the tracking result and the preset rule.
  • step S304 may include:
  • IOU intersection over union
  • the preset association rule is to use an association matching algorithm (for example, the Hungarian algorithm) and use a matrix to obtain the association result of each object in the image to be detected.
  • the association result includes: a successful match, a first association result, and a second association result.
  • a successful match indicates that the detection object in the detection result and the tracking object in the tracking result are the same object, that is, the pairing is successful.
  • the first association result indicates that, after the matching, no tracking object in the tracking result matches the detection object; that is, the object exists in the detection result, and the object does not exist in the tracking result.
  • the second association result indicates that, after the matching, no detection object in the detection result matches the tracking object; that is, the object does not exist in the detection result, and the object exists in the tracking result. Further, the classification accuracy value and the IOU value of each successfully paired object can also be checked. If the classification accuracy value is less than the first threshold or the IOU value is less than the second threshold, the association result of the corresponding detection object is determined as the first association result, and the association result of the corresponding tracking object is determined as the second association result.
  • Using the correlation matching algorithm to match the detection result and the tracking result can effectively ensure that each detection object in the detection result is matched with at most one tracking object in the tracking result.
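The association step can be sketched as below, under several stated assumptions: the matrix entries are taken to be IOU values between detection boxes and tracking boxes, all boxes are assumed to already share one category, and the classification accuracy check is left out. To keep the example dependency-free, the optimal one-to-one assignment is found by brute force over permutations, which is a stand-in for the Hungarian algorithm and only practical for a handful of objects. The names `iou` and `associate` are hypothetical.

```python
from itertools import permutations
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def associate(detections: List[Box], tracks: List[Box], iou_threshold: float):
    """Return (matched pairs, first-association detections, second-association tracks)."""
    n_det, n_trk = len(detections), len(tracks)
    size = max(n_det, n_trk)
    # Detection objects as rows, tracking objects as columns; pad with zeros to a square matrix.
    matrix = [[iou(detections[i], tracks[j]) if i < n_det and j < n_trk else 0.0
               for j in range(size)] for i in range(size)]
    # Brute-force search for the assignment with the largest total IOU
    # (the Hungarian algorithm would find the same optimum far more efficiently).
    best = max(permutations(range(size)),
               key=lambda p: sum(matrix[i][p[i]] for i in range(size)))
    # A pair only counts as a successful match if its IOU reaches the threshold.
    pairs = [(i, best[i]) for i in range(n_det)
             if best[i] < n_trk and matrix[i][best[i]] >= iou_threshold]
    matched_det = {i for i, _ in pairs}
    matched_trk = {j for _, j in pairs}
    first = [i for i in range(n_det) if i not in matched_det]   # detected but not tracked
    second = [j for j in range(n_trk) if j not in matched_trk]  # tracked but not detected
    return pairs, first, second


dets = [(0.0, 0.0, 10.0, 10.0), (100.0, 100.0, 110.0, 110.0)]
trks = [(1.0, 1.0, 11.0, 11.0), (50.0, 50.0, 60.0, 60.0)]
pairs, first, second = associate(dets, trks, iou_threshold=0.3)
```

Because the assignment is a permutation, each detection object is paired with at most one tracking object, which is the property the text attributes to the association matching algorithm.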
  • the discrimination rules include:
  • the preliminary judgment results include: successful matching, missed detection, false detection, possible missed detection, possible false detection, and ended.
  • the preliminary judgment rules include:
  • the preliminary judgment result is a successful match.
  • the preliminary judgment result is a possible misdetection.
  • the preliminary judgment result is a missed detection.
  • the preliminary judgment result is the end.
  • the second confidence threshold is less than the first confidence threshold.
  • the preliminary judgment result is a possible missed detection.
  • the discrimination results include: successful matching, missed detection, false detection, new appearance and end.
  • the discrimination result is determined to be a successful match.
  • the discrimination result is determined to be end.
  • the discrimination results of the S frames of images adjacent to the image to be detected are combined to determine the discrimination result.
  • the S frames of images adjacent to the image to be detected are the S frames of images after the image to be detected (that is, with frame numbers greater than the frame number of the image to be detected), or the S frames of images before the image to be detected (that is, with frame numbers smaller than the frame number of the image to be detected).
  • the S frames of images adjacent to the image to be detected are the S frames of images after the image to be detected, and the order of discrimination is from front to back.
  • for an object whose preliminary judgment result is a possible missed detection: if the association result of the object corresponding to this object is the second association result, and the tracking confidence of the corresponding object is less than or equal to the second confidence threshold, the discrimination result is determined to be end; if, in each frame of image from the 1st frame to the Sth frame, the association result of the corresponding object is the second association result, and the tracking confidence of the corresponding object is less than the first confidence threshold and greater than the second confidence threshold, the discrimination result is determined to be end; for cases that belong to neither of the above two, the discrimination result is determined to be a missed detection.
  • for an object whose preliminary judgment result is a possible false detection: if the association result of the corresponding object is a successful match in every frame, and the tracking confidence of the corresponding object is greater than the first confidence threshold in every frame, the discrimination result is determined to be new appearance; for cases that do not belong to the above situation, the discrimination result is determined to be a false detection.
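The resolution of "possible" results over the S adjacent frames can be sketched as follows. This reflects one reading of the rules above and rests on assumptions that the source text does not settle: the first "end" condition is assumed to apply when any one of the S frames satisfies it, each later frame is summarized as an (association result, tracking confidence) pair, and the string labels and function names are hypothetical.

```python
from typing import List, Tuple

# (association result, tracking confidence) of the same object in one later frame;
# association result is "match", "first", or "second" (labels are assumptions).
FrameObs = Tuple[str, float]


def resolve_possible_missed(later: List[FrameObs], first_conf: float, second_conf: float) -> str:
    """Decide between "end" and "missed_detection" for a possible missed detection."""
    # Assumed reading: one frame with a lost track and very low confidence means the object ended.
    if any(a == "second" and c <= second_conf for a, c in later):
        return "end"
    # All S frames stay undetected with only moderate tracking confidence: also treated as end.
    if later and all(a == "second" and second_conf < c < first_conf for a, c in later):
        return "end"
    return "missed_detection"


def resolve_possible_false(later: List[FrameObs], first_conf: float) -> str:
    """Decide between "new" and "false_detection" for a possible false detection."""
    # Matched with high confidence in every one of the S frames: a newly appearing object.
    if later and all(a == "match" and c > first_conf for a, c in later):
        return "new"
    return "false_detection"


# S = 2 later frames, with hypothetical thresholds 0.8 and 0.3.
result_missed = resolve_possible_missed([("second", 0.5), ("match", 0.9)], 0.8, 0.3)
result_false = resolve_possible_false([("match", 0.9), ("match", 0.85)], 0.8)
```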
  • the device may maintain a temporary template information table to store information used to identify objects that are possible missed detections and objects that are possible false detections.
  • the temporary template information table may include: image frame number, object serial number, image detection result (for example, object category, object location, classification accuracy value), missed detection identification, false detection identification, possible missed detection identification, possible false detection Identification, matching success identification, etc.
  • after each of the image to be detected and the 1st to Sth frame images adjacent to the image to be detected has been discriminated, the temporary template information table is updated.
  • the update rules are:
  • the detection result of the object is used to update the corresponding location information in the temporary template information table, and a possible misdetection flag is set.
  • the discrimination result of each object in the image to be detected can be determined according to the temporary template information table.
  • the image to be detected is the t-th frame image.
  • the temporary template information table is updated.
  • the temporary template information table includes the information shown in Table 3.
  • the discrimination result of object 1 in the t-th frame image is a successful match, the discrimination result of object 2 is a missed detection, the discrimination result of object 3 is a false detection, the discrimination result of object 4 is a new appearance, and the discrimination result of object 5 is end; object 6 is a possible missed detection, and object 7 is a possible false detection.
  • the S (for example, S = 2) frames of images adjacent to the t-th frame image can be combined to determine the discrimination results of object 6 and object 7.
  • the image to be detected is the (t+1)th frame image.
  • the temporary template information table includes the information shown in Table 4.
  • the information of successful matching, missed detection, misdetection, and end in the tth frame may be deleted.
  • Table 4 may also include information about other objects in the (t+1)th frame of image except for object 6 and object 7. This part of the content is omitted in this application.
  • the image to be detected is the (t+2)th frame image.
  • the temporary template information table includes the information shown in Table 5.
  • Table 5 may also include the information of other objects in the (t+2)th frame image except for object 6 and object 7. This part of the content is omitted in this application.
  • objects in the image to be detected whose discrimination result is a missed detection are determined as difficult cases of missed detection; objects in the image to be detected whose discrimination result is a false detection are determined as difficult cases of false detection; that is, difficult case data is obtained.
  • difficult cases of missed detection and/or difficult cases of false detection can be added to the difficult case data set.
  • the template information table is updated according to the discrimination result of each object in the image to be detected. In this way, the effective use of the detected results can be realized.
  • the template information table update rules are:
  • the information in the template information table is not updated for objects whose judgment results are false detections;
  • the information of the object is removed from the template information table.
  • the method for mining difficult cases in target detection combines a single-target tracking algorithm and a target detection algorithm to mine difficult cases in target detection, and at the same time distinguishes difficult cases of missed detection from difficult cases of false detection, so that both can be extracted more accurately and difficult case mining is realized more effectively.
  • a single-target tracking algorithm based on deep learning is adopted to analyze the image based on deep semantic features and effectively predict the location information of the matching result; an association matching algorithm is used to match the detection result with the tracking result, which effectively ensures that each detection object in the detection result corresponds to at most one tracking object in the tracking result; the accuracy of difficult case mining is thereby improved.
  • the above-mentioned device includes hardware structures and/or software modules corresponding to each function.
  • the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods for each specific application to implement the described functions, but such implementation should not be considered beyond the scope of this application.
  • each functional module can be divided corresponding to each function, or two or more functions can be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software functional modules. It should be noted that the division of modules in the embodiments of the present application is illustrative, and is only a logical function division, and there may be other division methods in actual implementation. The following is an example of dividing each function module corresponding to each function.
  • FIG. 5 is a schematic diagram of the logical structure of a device 500 provided by an embodiment of the present application.
  • the device 500 may be a device for mining difficult cases in target detection, and can implement the method for mining difficult cases in target detection provided by an embodiment of the present application.
  • the apparatus 500 may be a hardware structure, a software module, or a hardware structure plus a software module.
  • the device 500 includes an initialization module 501, a target detection module 502, an information update module 503, a single target tracking module 504, and a difficult case mining module 505.
  • the initialization module 501 is used to initialize various information, such as the frame number of the image to be detected; for example, the frame number is initialized to 1.
  • the initialization module 501 can also be used to initialize the temporary template information table and the template information table to be empty.
  • the target detection module 502 is configured to use a preset target detection algorithm to analyze the image to be detected and obtain the detection result of the image to be detected.
  • the information update module 503 is used to update the temporary template information table and the template information table. For example, the detection result of the image to be detected is updated to the temporary template information table and the template information table.
  • the single target tracking module 504 is configured to use a preset single target tracking algorithm to analyze the image to be detected and obtain the tracking result of the image to be detected.
  • the hard case mining module 505 obtains the discrimination result of each object in the image to be detected according to the detection result output by the target detection module 502, the tracking result output by the single target tracking module 504, and the preset association rules and discrimination rules.
  • the hard cases of missed detection and the hard cases of misdetection output by the hard case mining module 505 are added to the hard case data set.
  • the information update module 503 updates the template information table.
  • the information update module 503 is also used to update the temporary template information table when the preliminary judgment made by the difficult case mining module 505 identifies an object as a possible false detection or a possible missed detection. After the discrimination result of each object in the current image to be detected is obtained, the initialization module 501 adds 1 to the frame number of the current image to be detected to obtain the next image to be detected.
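The cooperation of the modules of the device 500 can be sketched as a per-frame loop. The detector, tracker, and discrimination logic are passed in as plain callables because the text does not fix their interfaces; the function name `mine_sequence`, the result labels, and the stub callables in the usage lines are all assumptions for illustration, and the template table updates are omitted.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def mine_sequence(
    frames: Iterable[Any],
    detect: Callable[[Any], Any],                       # stands in for target detection module 502
    track: Callable[[Any], Any],                        # stands in for single target tracking module 504
    discriminate: Callable[[Any, Any], Dict[int, str]]  # stands in for difficult case mining module 505
) -> List[Tuple[int, int, str]]:
    """Run detection and tracking on each frame and collect the difficult cases."""
    hard_cases: List[Tuple[int, int, str]] = []
    frame_number = 1  # the initialization module starts the frame number at 1
    for frame in frames:
        detection = detect(frame)
        tracking = track(frame)
        results = discriminate(detection, tracking)  # {object serial number: discrimination result}
        for object_id, result in results.items():
            # Only missed detections and false detections enter the difficult case data set.
            if result in ("missed_detection", "false_detection"):
                hard_cases.append((frame_number, object_id, result))
        frame_number += 1  # move on to the next image to be detected
    return hard_cases


# Stub callables: object 1 is matched in frame "a" and missed in frame "b".
mined = mine_sequence(
    ["a", "b"],
    detect=lambda frame: frame,
    track=lambda frame: frame,
    discriminate=lambda det, trk: {1: "missed_detection" if det == "b" else "match"},
)
```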
  • FIG. 7 is a schematic diagram of the logical structure of an apparatus 700 provided by an embodiment of the present application.
  • the apparatus 700 may be a device for mining difficult cases in target detection, and can implement the method for mining difficult cases in target detection provided in the embodiments of the present application.
  • the apparatus 700 may be a hardware structure, a software module, or a hardware structure plus a software module.
  • the apparatus 700 includes an image acquisition unit 701, a target detection unit 702, a target tracking unit 703, and a difficult case mining unit 704.
  • the image acquisition unit 701 may be used to perform S301 in FIG. 3, and/or perform other steps described in this application.
  • the target detection unit 702 may be used to perform S302 in FIG. 3, and/or perform other steps described in this application.
  • the target tracking unit 703 may be used to perform S303 in FIG. 3, and/or perform other steps described in this application.
  • the hard case mining unit 704 may be used to perform S304 and S305 in FIG. 3, and/or perform other steps described in this application.
  • the embodiment of the present application also provides a storage medium, and the storage medium may include a memory.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, network equipment, user equipment, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from a website, computer, server, or data center.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a tape), an optical medium (for example, a digital video disc (DVD)), a semiconductor medium (for example, a solid state disk (SSD)), or the like.

Abstract

Disclosed are a method and apparatus for mining a difficult case during target detection, which relate to the fields of artificial intelligence and intelligent automobiles. By means of the present application, the mining of a difficult case during target detection can be realized more effectively, and difficult case data can be extracted more accurately. The method can comprise: analyzing, by using a preset target detection algorithm, an image to be subjected to detection, so as to obtain a detection result of said image; analyzing said image by using a preset single-target tracking algorithm, so as to obtain a tracking result of said image; acquiring a determination result of each object in said image according to the detection result, the tracking result, and a preset rule, wherein the determination result comprises successful matching, missed detection, false detection, new occurrence, and an end; and determining an object, the determination result of which indicates missed detection, as a missed-detection difficult case, and determining an object, the determination result of which indicates false detection, as a false-detection difficult case.

Description

Method and device for mining difficult cases in target detection
Technical field
This application relates to the field of artificial intelligence, and in particular to a method and device for mining difficult cases in target detection.
Background
Autonomous driving, smart healthcare, smart cities, and the like are currently attracting wide attention, and they need data-driven deep learning target detection algorithms to analyze image information.
A data-driven deep learning target detection algorithm obtains information such as the position and category of objects such as vehicles and people by analyzing acquired images. A target detection algorithm usually includes two stages: training and inference. The training stage is the process in which the algorithm learns from data, and the inference stage uses the output of the training stage to analyze the information contained in an image. The analysis result of the inference stage may be correct or wrong. Wrong analysis results fall into two cases, missed detection and false detection: an object that is missed is a difficult case of missed detection, and an object that is falsely detected is a difficult case of false detection. For example, (a), (b) and (c) of FIG. 1 are detection results of detecting traffic signs in one frame of a road image. The road image includes traffic sign 101, traffic sign 102 and vehicle 103, and a car sticker 104 is affixed to vehicle 103. In (a) of FIG. 1, detection frame 1 correctly marks traffic sign 101 and detection frame 2 correctly marks traffic sign 102. In (b) of FIG. 1, detection frame 1 correctly marks traffic sign 101, but traffic sign 102 is not detected, that is, a missed detection occurs; traffic sign 102 in this image is a difficult case of missed detection. In (c) of FIG. 1, detection frame 1 correctly marks traffic sign 101, detection frame 2 correctly marks traffic sign 102, and detection frame 3 marks car sticker 104, that is, car sticker 104 is wrongly detected as a traffic sign and a false detection occurs; car sticker 104 in this image is a difficult case of false detection. Difficult cases of missed detection and difficult cases of false detection together constitute difficult case data. Adding difficult case data to the training stage is an effective way to reduce the probability of wrong analysis results.
Difficult case mining is a method of extracting a difficult case data set; its purpose is to extract difficult cases of missed detection and difficult cases of false detection. Commonly used difficult case mining methods fall into two categories: supervised and unsupervised. Supervised difficult case mining requires a large amount of labeled data, and data labeling consumes a great deal of manpower; the cost is very high, especially for large-scale data. For unsupervised difficult case mining, how to mine effectively and extract difficult cases of missed detection and false detection more accurately and efficiently is a problem that needs to be solved.
Summary of the invention
The embodiments of the present application provide a method and device for mining difficult cases in target detection, which can implement difficult case mining more effectively and extract difficult case data more accurately.
To achieve the foregoing objectives, the embodiments of the present application adopt the following technical solutions:
In a first aspect, this application provides a method and device for mining difficult cases in target detection.
In one possible design, the method may include: acquiring an image to be detected; analyzing the image to be detected using a preset target detection algorithm to obtain a detection result of the image to be detected; analyzing the image to be detected using a preset single-target tracking algorithm to obtain a tracking result of the image to be detected; obtaining a discrimination result of each object in the image to be detected according to the detection result, the tracking result, and a preset rule; and determining an object whose discrimination result is a missed detection as a difficult case of missed detection, and an object whose discrimination result is a false detection as a difficult case of false detection. The detection result includes: the category, the detection position, and the classification accuracy value of one or more detection objects; the tracking result includes: the tracking position and the tracking confidence of one or more tracking objects; the discrimination result includes: successful match, missed detection, false detection, new appearance, and end.
In this method, a single-target tracking algorithm and a target detection algorithm are combined to mine difficult cases in target detection: the tracking result of the single-target tracking algorithm is applied to difficult case mining for target detection, and difficult cases of missed detection and of false detection are discriminated at the same time, so that both can be extracted more accurately and difficult case mining is realized more effectively.
In one possible design, the method includes: obtaining an association result of a first object in the image to be detected according to the detection result, the tracking result, and a preset association rule; performing a preliminary discrimination on the first object according to a preliminary discrimination rule and the association result of the first object, to obtain a preliminary discrimination result of the first object; and determining the discrimination result of the first object according to its preliminary discrimination result. The association result includes: successful match, first association result, and second association result; the preliminary discrimination result includes: successful match, missed detection, false detection, possible missed detection, possible false detection, and end. The first association result means that the first object exists in the detection result but not in the tracking result; the second association result means that the first object does not exist in the detection result but exists in the tracking result. Because the tracking result is used in target detection, the position information of the detection target can be effectively predicted, which improves the accuracy of difficult case mining.
In a possible design, obtaining the association result of the first object in the image to be detected according to the detection result, the tracking result, and the preset association rule includes: selecting objects in the detection result whose category is the same as that of objects in the tracking result; constructing a matrix with the tracked objects as rows and the detected objects as columns, or with the detected objects as rows and the tracked objects as columns; and applying an association matching algorithm to the matrix to obtain the association result of the first object in the image to be detected.
In this way, the association matching algorithm matches the detection result against the tracking result and ensures that each detected object in the detection result corresponds to at most one tracked object in the tracking result, which improves the accuracy of difficult case mining.
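The matrix-based association described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the text only requires same-category objects, a rows-by-columns matrix, and some association matching algorithm. The IoU score, the greedy one-to-one selection, the 0.5 cutoff, the box format, the category attached to each tracked object, and the names `Box`, `iou`, and `associate` are all hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def associate(tracks: List[Tuple[str, Box]],
              detections: List[Tuple[str, Box]],
              min_iou: float = 0.5) -> Dict[str, list]:
    """Match tracked objects (rows) to detected objects (columns)."""
    # Build the matrix; pairs with different categories score 0.
    matrix = [[iou(tb, db) if tc == dc else 0.0 for dc, db in detections]
              for tc, tb in tracks]
    # Greedy one-to-one selection (an assumed stand-in for the
    # association matching algorithm): best-scoring pairs first.
    pairs = sorted(((matrix[r][c], r, c)
                    for r in range(len(tracks))
                    for c in range(len(detections))), reverse=True)
    used_rows, used_cols, matched = set(), set(), []
    for score, r, c in pairs:
        if score < min_iou:
            break
        if r not in used_rows and c not in used_cols:
            used_rows.add(r)
            used_cols.add(c)
            matched.append((r, c))
    return {
        "matched": sorted(matched),
        # First association result: detected but not tracked.
        "first": [c for c in range(len(detections)) if c not in used_cols],
        # Second association result: tracked but not detected.
        "second": [r for r in range(len(tracks)) if r not in used_rows],
    }
```

Because each row and each column is consumed at most once, every detected object ends up paired with at most one tracked object, which is the property the paragraph above relies on. An optimal assignment method could be substituted for the greedy loop without changing the returned structure.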
In a possible design, the preliminary discrimination rule includes: for a first object whose association result is a successful match, the preliminary discrimination result is a successful match; for a first object whose association result is the first association result, the preliminary discrimination result is a possible false detection; for a first object whose association result is the second association result and whose tracked object has a tracking confidence greater than or equal to a first confidence threshold, the preliminary discrimination result is a missed detection; for a first object whose association result is the second association result and whose tracked object has a tracking confidence less than or equal to a second confidence threshold, the preliminary discrimination result is ended; and for a first object whose association result is the second association result and whose tracked object has a tracking confidence less than the first confidence threshold and greater than the second confidence threshold, the preliminary discrimination result is a possible missed detection. The second confidence threshold is less than the first confidence threshold.
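The preliminary discrimination rule above amounts to a small decision function. The sketch below is illustrative only: the string labels, the function name `preliminary_result`, and the default threshold values 0.8 and 0.3 are assumptions, since the text does not specify concrete values.

```python
def preliminary_result(association: str, track_conf: float = 0.0,
                       t1: float = 0.8, t2: float = 0.3) -> str:
    """Map an association result to a preliminary discrimination result.

    association: "matched", "first" (detected, not tracked) or
                 "second" (tracked, not detected).
    t1, t2: first and second confidence thresholds (assumed values).
    """
    if not t2 < t1:
        raise ValueError("second threshold must be less than the first")
    if association == "matched":
        return "matched"
    if association == "first":
        return "possible_false"
    if association == "second":
        if track_conf >= t1:
            return "missed"
        if track_conf <= t2:
            return "ended"
        return "possible_missed"  # t2 < track_conf < t1
    raise ValueError(f"unknown association result: {association}")
```

Only the second association result consults the tracking confidence; the other two branches ignore it.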
In a possible design, determining the discrimination result of the first object in the image to be detected according to its preliminary discrimination result includes: for a first object whose preliminary discrimination result is a successful match, determining that the discrimination result is a successful match; for a first object whose preliminary discrimination result is a missed detection, determining that the discrimination result is a missed detection; for a first object whose preliminary discrimination result is ended, determining that the discrimination result is ended; and for a first object whose preliminary discrimination result is a possible false detection or a possible missed detection, determining the discrimination result in combination with the discrimination results of S frames of images adjacent to the image to be detected, where S>1.
For an object that is a possible false detection or a possible missed detection, determining the discrimination result in combination with the discrimination results of the S adjacent frames effectively reduces the probability of misjudging false-detection and missed-detection difficult cases.
In a possible design, a first object whose preliminary discrimination result is a possible missed detection is judged in order from the 1st to the S-th frame of image adjacent to the image to be detected. If, in at least one of these frames, the association result of the object corresponding to the first object is the second association result and the tracking confidence of that corresponding object is less than or equal to the second confidence threshold, the discrimination result of the first object is determined to be ended. If, in every one of these frames, the association result of the corresponding object is the second association result and the tracking confidence of the corresponding object is less than the first confidence threshold and greater than the second confidence threshold, the discrimination result of the first object is determined to be ended. Otherwise, the discrimination result of the first object is determined to be a missed detection.
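The resolution of a possible missed detection can be sketched as below. This is a hedged illustration: the function name `resolve_possible_missed`, the `(association, confidence)` tuple layout, the string labels, and the default thresholds are assumptions. The sketch evaluates the two conditions exactly as stated in the text; it does not model any early exit that a frame-by-frame implementation might use.

```python
from typing import List, Tuple


def resolve_possible_missed(neighbors: List[Tuple[str, float]],
                            t1: float = 0.8, t2: float = 0.3) -> str:
    """Resolve a 'possible missed detection' using S adjacent frames.

    neighbors: for adjacent frames 1..S in order, the association result
               and tracking confidence of the corresponding object.
    """
    if len(neighbors) < 2:
        raise ValueError("S must be greater than 1")
    # Condition 1: in at least one frame the object is tracked but not
    # detected, and tracking confidence has dropped to t2 or below.
    if any(a == "second" and c <= t2 for a, c in neighbors):
        return "ended"
    # Condition 2: in every frame the object is tracked but not detected,
    # with confidence strictly between the two thresholds.
    if all(a == "second" and t2 < c < t1 for a, c in neighbors):
        return "ended"
    return "missed"
```

For example, with S = 2, a neighbor sequence in which the tracker regains high confidence or the detector matches the object again falls through both conditions and is reported as a missed detection.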
In a possible design, for a first object whose preliminary discrimination result is a possible false detection: if, in every frame from the 1st to the S-th frame of image adjacent to the image to be detected, the association result of the object corresponding to the first object is a successful match and the tracking confidence of that corresponding object is greater than the first confidence threshold, the discrimination result of the first object is determined to be newly appeared; otherwise, the discrimination result of the first object is determined to be a false detection.
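The corresponding check for a possible false detection is a single all-frames condition. As before, this is only a sketch: the name `resolve_possible_false`, the tuple layout, the labels, and the default threshold are assumptions made for illustration.

```python
from typing import List, Tuple


def resolve_possible_false(neighbors: List[Tuple[str, float]],
                           t1: float = 0.8) -> str:
    """Resolve a 'possible false detection' using S adjacent frames.

    neighbors: for adjacent frames 1..S, the association result and
               tracking confidence of the corresponding object.
    """
    if len(neighbors) < 2:
        raise ValueError("S must be greater than 1")
    # The object counts as newly appeared only if it is matched with
    # tracking confidence above t1 in every adjacent frame.
    if all(a == "matched" and c > t1 for a, c in neighbors):
        return "new"
    return "false"
```

A detection that the tracker confirms confidently in all S adjacent frames is treated as a real object entering the scene; anything less is kept as a false-detection difficult case.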
In a possible design, if the number of remaining undetected frames of images to be detected is greater than or equal to a preset number of frames S, the S frames of images adjacent to the image to be detected are the S frames whose frame numbers are greater than that of the image to be detected; if the number of remaining undetected frames is less than the preset number S, the S adjacent frames are the S frames whose frame numbers are less than that of the image to be detected. In this way, when too few undetected frames remain, information from the S frames preceding the image to be detected can assist the discrimination of the current image, which resolves the shortage of remaining frames. In a possible design, when the number of remaining undetected frames is less than the preset number S, discrimination may proceed in reverse order, from back to front.
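The choice of adjacent frames can be sketched with plain index arithmetic. The sketch assumes 0-based frame indices and interprets "remaining undetected frames" as the frames after the current one; the function name `adjacent_frames` and the nearest-first ordering of earlier frames are also assumptions.

```python
from typing import List


def adjacent_frames(index: int, total: int, s: int) -> List[int]:
    """Return the indices of the S frames adjacent to frame `index`.

    index: 0-based index of the image to be detected.
    total: number of frames in the sequence.
    s:     preset number of adjacent frames (S > 1).
    """
    if s <= 1:
        raise ValueError("S must be greater than 1")
    remaining = total - index - 1  # frames after the current one
    if remaining >= s:
        # Enough later frames: use the next S frames.
        return list(range(index + 1, index + s + 1))
    # Too few later frames: fall back to earlier frames, nearest first.
    return [i for i in range(index - 1, index - s - 1, -1) if i >= 0]
```

For a 10-frame sequence with S = 3, frame 2 looks ahead to frames 3, 4, 5, while frame 8 has only one later frame and therefore looks back to frames 7, 6, 5.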
In a possible design, the preset single-target tracking algorithm is a deep-learning-based single-target tracking algorithm. Such an algorithm analyzes images using deep semantic features, so it can effectively predict the position of a matched object and can also improve the accuracy of extracting missed-detection and false-detection difficult cases.
Correspondingly, this application further provides an apparatus for mining difficult cases in target detection, which can implement the method for mining difficult cases in target detection described in the first aspect. The apparatus may implement the method by software, by hardware, or by hardware executing corresponding software.
In a possible design, the apparatus may include an image acquisition unit, a target detection unit, a target tracking unit, and a difficult case mining unit. The image acquisition unit is configured to acquire an image to be detected. The target detection unit is configured to analyze the image to be detected using a preset target detection algorithm to obtain a detection result of the image. The target tracking unit is configured to analyze the image to be detected using a preset single-target tracking algorithm to obtain a tracking result of the image. The detection result includes the category, the detection position, and the classification accuracy value of each of one or more detected objects; the tracking result includes the tracking position and the tracking confidence of each of one or more tracked objects. The difficult case mining unit is configured to obtain a discrimination result of each object in the image to be detected according to the detection result, the tracking result, and preset rules, where the discrimination result is one of: successful match, missed detection, false detection, newly appeared, or ended. The difficult case mining unit is further configured to determine an object whose discrimination result is a missed detection as a missed-detection difficult case, and an object whose discrimination result is a false detection as a false-detection difficult case.
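The data exchanged between these units can be sketched with two small record types and a final selection step. All names here (`DetectedObject`, `TrackedObject`, `split_difficult_cases`), the box format, and the string labels are hypothetical and serve only to make the structure concrete.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


@dataclass
class DetectedObject:
    category: str   # category of the detected object
    box: Box        # detection position
    score: float    # classification accuracy value


@dataclass
class TrackedObject:
    box: Box            # tracking position
    confidence: float   # tracking confidence


def split_difficult_cases(results: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Select difficult cases from per-object discrimination results.

    results maps an object id to one of: "matched", "missed", "false",
    "new", "ended". Only "missed" and "false" are difficult cases.
    """
    missed = [oid for oid, r in results.items() if r == "missed"]
    false = [oid for oid, r in results.items() if r == "false"]
    return missed, false
```

Objects judged as successful match, newly appeared, or ended are not collected, matching the description of the difficult case mining unit above.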
In a possible design, the difficult case mining unit is specifically configured to: obtain an association result of a first object in the image to be detected according to the detection result, the tracking result, and a preset association rule; perform preliminary discrimination on the first object according to a preliminary discrimination rule and the association result of the first object, to obtain a preliminary discrimination result of the first object; and determine a discrimination result of the first object according to its preliminary discrimination result. The association result is one of: successful match, first association result, or second association result. The first association result means that the first object exists in the detection result but not in the tracking result; the second association result means that the first object does not exist in the detection result but exists in the tracking result. The preliminary discrimination result is one of: successful match, missed detection, false detection, possible missed detection, possible false detection, or ended.
In a possible design, the difficult case mining unit obtaining the association result of the first object in the image to be detected according to the detection result, the tracking result, and the preset association rule specifically includes: selecting objects in the detection result whose category is the same as that of objects in the tracking result; constructing a matrix with the tracked objects as rows and the detected objects as columns, or with the detected objects as rows and the tracked objects as columns; and applying an association matching algorithm to the matrix to obtain the association result of the first object in the image to be detected.
In a possible design, the preliminary discrimination rule includes: for a first object whose association result is a successful match, the preliminary discrimination result is a successful match; for a first object whose association result is the first association result, the preliminary discrimination result is a possible false detection; for a first object whose association result is the second association result and whose tracked object has a tracking confidence greater than or equal to the first confidence threshold, the preliminary discrimination result is a missed detection; for a first object whose association result is the second association result and whose tracked object has a tracking confidence less than or equal to the second confidence threshold, the preliminary discrimination result is ended; and for a first object whose association result is the second association result and whose tracked object has a tracking confidence less than the first confidence threshold and greater than the second confidence threshold, the preliminary discrimination result is a possible missed detection. The second confidence threshold is less than the first confidence threshold.
In a possible design, the difficult case mining unit determining the discrimination result of the first object in the image to be detected according to its preliminary discrimination result specifically includes: for a first object whose preliminary discrimination result is a successful match, determining that the discrimination result is a successful match; for a first object whose preliminary discrimination result is a missed detection, determining that the discrimination result is a missed detection; for a first object whose preliminary discrimination result is ended, determining that the discrimination result is ended; and for a first object whose preliminary discrimination result is a possible false detection or a possible missed detection, determining the discrimination result in combination with the discrimination results of S frames of images adjacent to the image to be detected, where S>1.
In a possible design, the difficult case mining unit determining, for a first object whose preliminary discrimination result is a possible false detection or a possible missed detection, the discrimination result in combination with the discrimination results of the S frames adjacent to the image to be detected specifically includes the following. A first object whose preliminary discrimination result is a possible missed detection is judged in order from the 1st to the S-th frame of image adjacent to the image to be detected. If, in at least one of these frames, the association result of the object corresponding to the first object is the second association result and the tracking confidence of that corresponding object is less than or equal to the second confidence threshold, the discrimination result of the first object is determined to be ended. If, in every one of these frames, the association result of the corresponding object is the second association result and the tracking confidence of the corresponding object is less than the first confidence threshold and greater than the second confidence threshold, the discrimination result of the first object is determined to be ended. Otherwise, the discrimination result of the first object is determined to be a missed detection.
In a possible design, the difficult case mining unit determining, for a first object whose preliminary discrimination result is a possible false detection or a possible missed detection, the discrimination result in combination with the discrimination results of the S frames adjacent to the image to be detected specifically includes: for a first object whose preliminary discrimination result is a possible false detection, if, in every frame from the 1st to the S-th frame of image adjacent to the image to be detected, the association result of the object corresponding to the first object is a successful match and the tracking confidence of that corresponding object is greater than the first confidence threshold, determining that the discrimination result of the first object is newly appeared; otherwise, determining that the discrimination result of the first object is a false detection.
In a possible design, if the number of remaining undetected frames of images to be detected is greater than or equal to the preset number of frames S, the S frames of images adjacent to the image to be detected are the S frames whose frame numbers are greater than that of the image to be detected; if the number of remaining undetected frames is less than the preset number S, the S adjacent frames are the S frames whose frame numbers are less than that of the image to be detected.
In a possible design, the preset single-target tracking algorithm is a deep-learning-based single-target tracking algorithm.
In a second aspect, an embodiment of this application provides a device that can implement the method for mining difficult cases in target detection described in the first aspect; for example, the device may be a server. In a possible design, the device may include a processor and a memory. The processor is configured to support the device in performing the corresponding functions of the method of the first aspect. The memory is coupled to the processor and stores the program instructions and data necessary for the device.
In a third aspect, an embodiment of this application provides a computer-readable storage medium that includes computer instructions. When the computer instructions are run on a device, the device is caused to perform the method for mining difficult cases in target detection described in any of the above aspects and their possible designs.
In a fourth aspect, an embodiment of this application provides a computer program product. When the computer program product is run on a computer, the computer is caused to perform the method for mining difficult cases in target detection described in any of the above aspects and their possible designs.
In a fifth aspect, an embodiment of this application further provides a chip system. The chip system includes a processor and may further include a memory, and is configured to implement the method for mining difficult cases in target detection described in any of the above aspects and their possible designs.
Any apparatus, device, computer-readable storage medium, computer program product, or chip system provided above is used to perform the corresponding method provided above. For the beneficial effects it can achieve, refer to the beneficial effects of the corresponding solution in that method; they are not repeated here.
Description of the drawings
FIG. 1 is a schematic diagram of a scenario to which the technical solution provided by an embodiment of this application is applicable;
FIG. 2 is a schematic diagram of a device to which the technical solution provided by an embodiment of this application is applicable;
FIG. 3 is a first schematic diagram of a method for mining difficult cases in target detection according to an embodiment of this application;
FIG. 4 is a second schematic diagram of a method for mining difficult cases in target detection according to an embodiment of this application;
FIG. 5 is a first schematic structural diagram of an apparatus provided by an embodiment of this application;
FIG. 6 is a second schematic structural diagram of an apparatus provided by an embodiment of this application;
FIG. 7 is a third schematic structural diagram of an apparatus provided by an embodiment of this application.
Detailed description
The term "multiple" herein means two or more. The terms "first" and "second" herein are used to distinguish different objects, not to describe a particular order of objects. For example, a first threshold and a second threshold merely distinguish different thresholds and do not define their order. The term "and/or" herein merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean that A exists alone, that A and B both exist, or that B exists alone.
In the embodiments of this application, words such as "exemplary" or "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of this application should not be construed as more preferable or advantageous than other embodiments or designs. Rather, such words are intended to present a related concept in a concrete manner.
The method and apparatus for mining difficult cases in target detection provided by the embodiments of this application are described in detail below with reference to the accompanying drawings.
The technical solution provided in this application can be applied to various hardware devices that support scientific computing, such as personal computers (PCs), servers, laptops, tablet computers, vehicle-mounted computers, mobile phones, mobile terminals, smart cameras, smart watches, and embedded devices. The embodiments of this application place no particular limitation on the specific form of the hardware device.
For example, FIG. 2 is a schematic structural diagram of a device 100 provided in an embodiment of this application. The device 100 includes at least one processor 110, a communication line 120, a memory 130, and at least one communication interface 140.
The processor 110 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the program of the solution of this application.
The communication line 120 may include a path for transferring information between the above components.
The communication interface 140 uses any transceiver-type apparatus to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
The memory 130 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM) or another type of dynamic storage device that can store information and instructions. It may also be an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other compact disc storage, optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these. The memory 130 may exist independently and be connected to the processor 110 through the communication line 120. The memory 130 may also be integrated with the processor 110.
The memory 130 is configured to store computer-executable instructions for executing the solution of this application, and execution is controlled by the processor 110. The processor 110 is configured to execute the computer-executable instructions stored in the memory 130, thereby implementing the method for mining difficult cases in target detection provided in the following embodiments of this application.
Optionally, the computer-executable instructions in the embodiments of this application may also be referred to as application program code, which is not specifically limited in the embodiments of this application.
In a specific implementation, as an embodiment, the processor 110 may include one or more CPUs, such as CPU0 and CPU1 in FIG. 2.
In a specific implementation, as an embodiment, the device 100 may include multiple processors, such as the processor 110 and the processor 111 in FIG. 2. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).
In a specific implementation, as an embodiment, the device 100 may further include an output device 150 and an input device 160. The output device 150 communicates with the processor 110 and can display information in a variety of ways. For example, the output device 150 may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, or a projector. The input device 160 communicates with the processor 110 and can receive user input in a variety of ways. For example, the input device 160 may be a mouse, a keyboard, a touch screen device, or a sensing device.
The device 100 may be a general-purpose device or a special-purpose device. In a specific implementation, the device 100 may be a vehicle-mounted device, a desktop computer, a portable computer, a network server, a personal digital assistant (PDA), a mobile phone, a tablet computer, a wireless terminal device, an embedded device, or a device with a structure similar to that in FIG. 2. The embodiments of this application do not limit the type of the device 100. Note that the structure of the device 100 shown in FIG. 2 is only an example and does not limit the technical solution of this application. Those skilled in the art will understand that, in a specific implementation, the device 100 may take other forms and may include other components.
本申请实施例提供了一种目标检测中难例挖掘的方法,结合目标检测算法和单目标跟踪算法进行目标检测中的难例挖掘。可以更有效的进行难例挖掘,更准确的提取出难例数据。执行该目标检测中难例挖掘的方法的可以为上述支持科学运算的硬件设备或者装置;该装置可以是芯片或者芯片系统;也可以是一种计算机可读存储介质;也可以是一种计算机程序产品;本申请实施例对此不进行限定。The embodiment of the present application provides a method for mining difficult cases in target detection, which combines target detection algorithms and single target tracking algorithms to mine difficult cases in target detection. Difficult case mining can be carried out more effectively, and difficult case data can be extracted more accurately. The method for mining difficult cases in target detection can be the above-mentioned hardware equipment or device that supports scientific operations; the device can be a chip or a chip system; it can also be a computer-readable storage medium; it can also be a computer program Product; the embodiment of this application does not limit this.
An embodiment of this application provides a method for mining difficult cases in target detection, which can be applied to the device shown in FIG. 2. As shown in FIG. 3, the method may include:
S301: Acquire an image to be detected.
The method for mining difficult cases in target detection provided by the embodiments of this application can be applied to difficult case mining when target detection is performed on sequence images. Sequence images are a temporally continuous group of images obtained by extracting frames from a video. For example, the video may be captured by a camera. Frame extraction is performed on the captured video to obtain the original sequence images. Each frame of the original sequence images is analyzed to obtain information about one or more objects in that frame; an object whose analysis result is wrong is difficult case data. Difficult case data includes missed-detection difficult cases and false-detection difficult cases.
When the t-th frame of the original sequence images is analyzed, that t-th frame is the image to be detected, where t > 0.
In one implementation, the images to be detected are acquired in front-to-back order, starting from the first frame of the sequence images (that is, t = 1).
In one implementation, if the number of remaining undetected images is less than a preset number of frames S (that is, t > N - S, where N is the total number of frames in the sequence images), the images to be detected are acquired in back-to-front order, starting from the last frame of the sequence images (that is, t = N).
It should be noted that the embodiments of this application take difficult case mining for target detection on sequence images as an example. This does not limit the technical solution of this application. The method provided by the embodiments of this application can also be applied to difficult case mining on sequence data such as laser point clouds.
S302: Analyze the image to be detected with a preset target detection algorithm to obtain the detection result of the image to be detected.
Target detection is a method of identifying the objects in an image and their positions.
The image to be detected (the t-th frame) is analyzed with a preset target detection algorithm to obtain its detection result. The preset target detection algorithm may be any conventional target detection algorithm, for example the YOLO (You Only Look Once: Unified, Real-Time Object Detection) algorithm.
The detection result may include the category of one or more detection objects, the position of each detection object (called the detection position in this application), and the classification accuracy value of each detection object. A detection object is a recognized target. The category distinguishes the kind of detection object; for example, categories may include person, traffic sign, building, and vehicle. In (a), (b), and (c) of FIG. 1, for example, the category of the detection objects is traffic sign. The position of a detection object may be its coordinates in the image. The classification accuracy value is the probability, output by the target detection algorithm, that a detection object belongs to its category; when the classification accuracy value is greater than a set first value, the target detection algorithm determines that a detection object of the given category has been recognized.
In one implementation, the device may save the detection result of the image to be detected. For example, the device saves a template information table that includes the detection result of the image to be detected and related information, such as the image frame number, object number, object category, object position, and classification accuracy value. For example, the template information table includes the information shown in Table 1.
Table 1
Image frame number | Object number | Object category | Object position | Classification accuracy value
1 | 1 | Traffic sign | (x1, y1), (x2, y2) | 0.87
1 | 2 | Traffic sign | (x3, y3), (x4, y4) | 0.88
The image frame number is the number of the frame in which the detection object is located. The object number is the number of the detection object (optionally, it may be assigned by manual labeling). The object category is the category of the detection object. The object position is the position information of the detection object; for example, (x1, y1) and (x2, y2) are the coordinates of the upper-left and lower-right corners of object 1 in the first frame, and together they represent the position of object 1 in that frame. The classification accuracy value is the classification accuracy value of the detection object.
Further, each time the detection result of an image to be detected is obtained, the related information in the template information table may be updated. For example, the first frame of the sequence images is analyzed with the preset target detection algorithm to obtain the detection result of the first frame, and the template information table is initialized from that result, as shown in Table 1. The first frame is called the template image. Then, in front-to-back order, the second frame of the sequence images is acquired as the image to be detected and analyzed with the preset target detection algorithm to obtain its detection result; the template information table is updated from that result, as shown in Table 2.
Table 2
Image frame number | Object number | Object category | Object position | Classification accuracy value
1 | 1 | Traffic sign | (x1, y1), (x2, y2) | 0.87
1 | 2 | Traffic sign | (x3, y3), (x4, y4) | 0.88
2 | 1 | Traffic sign | (x5, y5), (x6, y6) | 0.86
2 | 2 | Traffic sign | (x7, y7), (x8, y8) | 0.9
It should be noted that the template information table in the embodiments of this application is only an example. In practice, the template information table may take other forms, which the embodiments of this application do not limit.
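As an illustration only, the template information table could be held in memory as a dictionary keyed by frame number and object number. The sketch below is not taken from the application: the names `TemplateEntry` and `update_template_table`, the automatic numbering of objects, and the pixel coordinates are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2): upper-left and lower-right corners


@dataclass
class TemplateEntry:
    frame: int        # image frame number
    object_id: int    # object number
    category: str     # object category
    box: Box          # object position
    score: float      # classification accuracy value


def update_template_table(
    table: Dict[Tuple[int, int], TemplateEntry],
    frame: int,
    detections: List[Tuple[str, Box, float]],
) -> Dict[Tuple[int, int], TemplateEntry]:
    """Add (or overwrite) one row per detection, keyed by (frame, object number).

    Numbering objects in detection order is an assumption of this sketch;
    the text notes that object numbers may also be assigned manually.
    """
    for object_id, (category, box, score) in enumerate(detections, start=1):
        table[(frame, object_id)] = TemplateEntry(frame, object_id, category, box, score)
    return table


# Placeholder coordinates; the scores mirror Tables 1 and 2.
table: Dict[Tuple[int, int], TemplateEntry] = {}
update_template_table(table, 1, [("traffic sign", (10, 20, 50, 60), 0.87),
                                 ("traffic sign", (100, 20, 140, 60), 0.88)])
update_template_table(table, 2, [("traffic sign", (12, 21, 52, 61), 0.86),
                                 ("traffic sign", (103, 22, 143, 62), 0.9)])
```

After the two calls the dictionary holds four rows, matching the shape of Table 2.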
S303: Analyze the image to be detected with a preset single-target tracking algorithm to obtain the tracking result of the image to be detected.
A single-target tracking algorithm predicts the size and position of a target in subsequent frames, given the size and position of that target in an initial frame.
In one implementation, the preset single-target tracking algorithm may be any conventional deep-learning-based single-target tracking algorithm, for example a correlation filter (CF) algorithm. Because a deep-learning-based tracker analyzes the image using deep semantic features, it can predict the position of the matching result effectively, which improves the accuracy of difficult case mining.
The given initial frame may be the first frame of the sequence images. The target in the initial frame may be a detection object obtained by performing target detection on the first frame. The information about the first frame of the sequence images can be obtained from the template information table saved by the device.
The tracking result may include the position of one or more tracking objects (called the tracking position in this application) and the tracking confidence of each tracking object. The tracking confidence reflects how reliable each tracking result is: the higher the tracking confidence, the more reliable the tracking result. When the tracking confidence is greater than a set second value, the single-target tracking algorithm determines that the given target has been tracked correctly.
For example, the output of the deep-learning-based single-target tracking algorithm is the feature map f of the last layer of the tracker's network part, together with the position information of each tracking object, where f is a matrix of size f_w × f_h. The tracking confidence is the maximum response score in f and can be expressed as max(f(i, j)).
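The relationship between the feature map f and the tracking confidence can be sketched in a few lines. This is a minimal illustration, assuming f is available as a nested list of response scores; the function name `tracking_confidence`, the sample values, and the value chosen for the "set second value" (`SECOND_VALUE`) are hypothetical.

```python
from typing import List


def tracking_confidence(response_map: List[List[float]]) -> float:
    """Return max(f(i, j)), the largest response score in the feature map f."""
    return max(max(row) for row in response_map)


# Hypothetical 2 x 3 response map and a hypothetical "set second value".
f = [
    [0.10, 0.25, 0.05],
    [0.30, 0.91, 0.40],
]
SECOND_VALUE = 0.5

confidence = tracking_confidence(f)
is_tracked = confidence > SECOND_VALUE  # tracker considers the target correctly tracked
```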
S304: Obtain the discrimination result of each object in the image to be detected according to the detection result of the image to be detected, the tracking result, and preset rules.
As shown in FIG. 4, step S304 may include:
S3041: Obtain the association result of each object in the image to be detected according to the detection result of the image to be detected, the tracking result, and a preset association rule.
Objects of the same category are selected from the detection result and the tracking result of the image to be detected. The intersection over union (IOU) value is computed for each pair, and a matrix is built with tracking objects as rows and detection objects as columns, or with detection objects as rows and tracking objects as columns. The IOU is the ratio of the intersection to the union of the bounding box of a tracking object and the bounding box of a detection object.
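The IOU matrix described above can be sketched as follows. The sketch assumes boxes are given as (x1, y1, x2, y2) with upper-left and lower-right corners, as in Table 1, and that the two lists already contain only objects of one category; the function names `iou` and `iou_matrix` are illustrative.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_matrix(track_boxes: List[Box], det_boxes: List[Box]) -> List[List[float]]:
    """Rows are tracking objects, columns are detection objects."""
    return [[iou(t, d) for d in det_boxes] for t in track_boxes]


matrix = iou_matrix(
    track_boxes=[(0, 0, 2, 2)],
    det_boxes=[(1, 1, 3, 3), (0, 0, 2, 2), (5, 5, 6, 6)],
)
```

In this example the single tracking box overlaps the first detection by 1/7, coincides with the second, and does not touch the third.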
The preset association rule is to apply an association matching algorithm (for example, the Hungarian algorithm) to the matrix to obtain the association result of each object in the image to be detected. The association result is one of: successful match, first association result, or second association result. A successful match means that a detection object in the detection result and a tracking object in the tracking result are the same object, that is, they have been paired. The first association result denotes a detection object for which, after matching, no matching tracking object exists in the tracking result; that is, the object exists in the detection result but not in the tracking result. The second association result denotes a tracking object for which, after matching, no matching detection object exists in the detection result; that is, the object exists in the tracking result but not in the detection result. Further, the classification accuracy value and the IOU value of each successfully matched pair may also be checked: if the classification accuracy value is less than a first threshold or the IOU value is less than a second threshold, the association result of the corresponding detection object is set to the first association result, and the association result of the corresponding tracking object is set to the second association result.
In this way, the association result of each object in the image to be detected is obtained.
Matching the detection result against the tracking result with an association matching algorithm ensures that each detection object in the detection result corresponds to at most one tracking object in the tracking result.
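The association step can be sketched as below. To keep the example self-contained, it finds the one-to-one assignment with the largest total IOU by brute force over permutations, which is only practical for a handful of objects; this is a stand-in for the Hungarian algorithm named in the text, not that algorithm itself. The function name `associate` and the threshold values are assumptions.

```python
from itertools import permutations
from typing import List, Tuple


def associate(
    iou_mat: List[List[float]],   # rows: tracking objects, columns: detection objects
    det_scores: List[float],      # classification accuracy value of each detection
    score_thr: float = 0.5,       # hypothetical "first threshold"
    iou_thr: float = 0.3,         # hypothetical "second threshold"
) -> Tuple[List[Tuple[int, int]], List[int], List[int]]:
    """Return (matches, first_association, second_association).

    matches            -> (track index, detection index) pairs, successfully matched
    first_association  -> detection indices with no matching tracking object
    second_association -> track indices with no matching detection object
    """
    n_trk, n_det = len(iou_mat), len(det_scores)
    size = max(n_trk, n_det)
    # Pad to a square matrix with zeros so every row can be assigned a column.
    padded = [
        [iou_mat[r][c] if r < n_trk and c < n_det else 0.0 for c in range(size)]
        for r in range(size)
    ]
    # Brute-force optimal one-to-one assignment (maximizes the total IOU).
    best = max(
        permutations(range(size)),
        key=lambda p: sum(padded[r][p[r]] for r in range(size)),
    )
    matches = []
    for r in range(n_trk):
        c = best[r]
        # Pairs with a low score or a low IOU are demoted to "unmatched".
        if c < n_det and iou_mat[r][c] >= iou_thr and det_scores[c] >= score_thr:
            matches.append((r, c))
    matched_trk = {r for r, _ in matches}
    matched_det = {c for _, c in matches}
    first_association = [c for c in range(n_det) if c not in matched_det]
    second_association = [r for r in range(n_trk) if r not in matched_trk]
    return matches, first_association, second_association


# Two tracking objects, three detections; detection 1 has a low score.
matches, first_assoc, second_assoc = associate(
    [[0.8, 0.1, 0.0],
     [0.0, 0.6, 0.0]],
    det_scores=[0.9, 0.4, 0.95],
)
```

For realistic object counts, a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` (with `maximize=True`) would replace the permutation search.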
S3042: Obtain the discrimination result of each object in the image to be detected according to the discrimination rules and the association result of each object in the image to be detected.
The discrimination rules include:
I. Perform a preliminary discrimination on each object in the image to be detected, according to the preliminary discrimination rules and the association result of each object, to obtain its preliminary discrimination result.
The preliminary discrimination result is one of: successful match, missed detection, false detection, possible missed detection, possible false detection, or ended.
The preliminary discrimination rules include:
1. For an object whose association result is a successful match, the preliminary discrimination result is successful match.
2. For an object whose association result is the first association result, the preliminary discrimination result is possible false detection.
3. For an object whose association result is the second association result and whose tracking confidence is greater than or equal to a first confidence threshold, the preliminary discrimination result is missed detection.
4. For an object whose association result is the second association result and whose tracking confidence is less than or equal to a second confidence threshold, the preliminary discrimination result is ended. The second confidence threshold is less than the first confidence threshold.
5. For an object whose association result is the second association result and whose tracking confidence is less than the first confidence threshold and greater than the second confidence threshold, the preliminary discrimination result is possible missed detection.
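The five preliminary discrimination rules map directly onto a small decision function. The sketch below is illustrative: the string labels, the function name `preliminary_result`, and the two threshold values are assumptions, chosen only so that the second confidence threshold is less than the first.

```python
from typing import Optional

T_FIRST = 0.7   # hypothetical first confidence threshold
T_SECOND = 0.3  # hypothetical second confidence threshold (less than T_FIRST)


def preliminary_result(association: str, tracking_conf: Optional[float] = None) -> str:
    """Apply preliminary discrimination rules 1 to 5 to one object.

    association is "matched", "first" (detected but not tracked),
    or "second" (tracked but not detected).
    """
    if association == "matched":
        return "matched"                       # rule 1
    if association == "first":
        return "possible_false_detection"      # rule 2
    if association == "second":
        if tracking_conf is None:
            raise ValueError("a tracking confidence is required for tracked objects")
        if tracking_conf >= T_FIRST:
            return "missed_detection"          # rule 3
        if tracking_conf <= T_SECOND:
            return "ended"                     # rule 4
        return "possible_missed_detection"     # rule 5
    raise ValueError(f"unknown association result: {association!r}")
```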
II. Determine the discrimination result of each object in the image to be detected according to its preliminary discrimination result.
The discrimination result is one of: successful match, missed detection, false detection, newly appeared, or ended.
1. For an object whose preliminary discrimination result is successful match, the discrimination result is successful match.
2. For an object whose preliminary discrimination result is missed detection, the discrimination result is missed detection.
3. For an object whose preliminary discrimination result is ended, the discrimination result is ended.
4. For an object whose preliminary discrimination result is possible false detection or possible missed detection, the discrimination result is determined together with the discrimination results of the S frames adjacent to the image to be detected. The S adjacent frames are either the S frames after the image to be detected (that is, with frame numbers larger than that of the image to be detected) or the S frames before it (that is, with frame numbers smaller than that of the image to be detected). For example, suppose the image to be detected is the t-th frame. If the number of remaining undetected images is greater than or equal to the preset number of frames S (that is, t <= N - S, where N is the total number of frames in the sequence images and S > 1), the S adjacent frames are the S frames after the image to be detected, and they are discriminated in front-to-back order. If the number of remaining undetected images is less than the preset number of frames S (that is, t > N - S), the S adjacent frames are the S frames before the image to be detected, and they are discriminated in back-to-front order.
①. For an object whose preliminary discrimination result is possible missed detection
The adjacent frames are examined in order, from the 1st to the S-th frame adjacent to the image to be detected. If, in at least one of these frames, the association result of the corresponding object is the second association result and its tracking confidence is less than or equal to the second confidence threshold, the discrimination result is ended. If, in every one of these frames, the association result of the corresponding object is the second association result and its tracking confidence is less than the first confidence threshold and greater than the second confidence threshold, the discrimination result is also ended. In all other cases, the discrimination result is missed detection.
②. For an object whose preliminary discrimination result is possible false detection
If, in every one of the 1st to S-th frames adjacent to the image to be detected, the association result of the corresponding object is a successful match and its tracking confidence is greater than the first confidence threshold, the discrimination result is newly appeared. Otherwise, the discrimination result is false detection.
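Rule 4 and its two sub-cases can be sketched as follows. The sketch assumes that, for each of the S adjacent frames, the association result and tracking confidence of the corresponding object are already known and passed in as a list of pairs. The function names, string labels, and threshold values are hypothetical, and dropping frame numbers below 1 is an added safeguard not stated in the text.

```python
from typing import List, Optional, Tuple

T_FIRST = 0.7   # hypothetical first confidence threshold
T_SECOND = 0.3  # hypothetical second confidence threshold

Observation = Tuple[str, Optional[float]]  # (association result, tracking confidence)


def adjacent_frames(t: int, n_frames: int, s: int) -> List[int]:
    """Frame numbers of the S frames adjacent to frame t, in discrimination order."""
    if t <= n_frames - s:
        return list(range(t + 1, t + s + 1))                 # following frames, front to back
    return [k for k in range(t - 1, t - s - 1, -1) if k >= 1]  # preceding frames, back to front


def resolve_possible_missed(observations: List[Observation]) -> str:
    """Sub-case 1: final result for a 'possible missed detection' object."""
    if any(a == "second" and c <= T_SECOND for a, c in observations):
        return "ended"
    if observations and all(a == "second" and T_SECOND < c < T_FIRST
                            for a, c in observations):
        return "ended"
    return "missed_detection"


def resolve_possible_false(observations: List[Observation]) -> str:
    """Sub-case 2: final result for a 'possible false detection' object."""
    if observations and all(a == "matched" and c > T_FIRST for a, c in observations):
        return "newly_appeared"
    return "false_detection"
```

For instance, with N = 10 and S = 2, frame 3 is resolved using frames 4 and 5, while frame 10 is resolved using frames 9 and 8.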
In one implementation, the device may maintain a temporary template information table that stores the information used to discriminate possible missed detections and possible false detections. For example, the temporary template information table may include the image frame number, the object number, the detection result of the image (for example, object category, object position, and classification accuracy value), a missed detection flag, a false detection flag, a possible missed detection flag, a possible false detection flag, a successful match flag, and so on.
After the image to be detected and each of the 1st to S-th frames adjacent to it are discriminated, the temporary template information table is updated. The update rules are:
For an object whose preliminary discrimination result is successful match, update the corresponding position information in the temporary template information table with the detection result of the object, and set the successful match flag;
For an object whose preliminary discrimination result is missed detection, update the corresponding position information in the temporary template information table with the tracking result of the object, and set the missed detection flag;
For an object whose preliminary discrimination result is false detection, set the false detection flag;
For an object whose preliminary discrimination result is newly appeared, add the object's information to the temporary template information table and set the newly appeared flag;
For an object whose preliminary discrimination result is ended, set the ended flag;
For an object whose preliminary discrimination result is possible missed detection, update the corresponding position information in the temporary template information table with the tracking result of the object, and set the possible missed detection flag;
For an object whose preliminary discrimination result is possible false detection, update the corresponding position information in the temporary template information table with the detection result of the object, and set the possible false detection flag.
In this way, the discrimination result of each object in the image to be detected can be determined from the temporary template information table.
For example, the image to be detected is the t-th frame. After the t-th frame is discriminated, the temporary template information table is updated. The temporary template information table includes the information shown in Table 3.
Table 3
Figure PCTCN2020086742-appb-000001
From the temporary template information table, it can be determined that, in the t-th frame, the discrimination result of object 1 is successful match, that of object 2 is missed detection, that of object 3 is false detection, that of object 4 is newly appeared, and that of object 5 is ended. Object 6 is a possible missed detection and object 7 is a possible false detection; their discrimination results can be determined together with the S (for example, S = 2) frames adjacent to the t-th frame.
Further, taking front-to-back order as an example, the image to be detected is the (t+1)-th frame. After the (t+1)-th frame is discriminated, the temporary template information table is updated. The temporary template information table includes the information shown in Table 4.
Table 4
Figure PCTCN2020086742-appb-000002
In one implementation, after the (t+1)-th frame is discriminated, the entries of the t-th frame for successful match, missed detection, false detection, and ended may be deleted. It should be noted that Table 4 may also include information about objects in the (t+1)-th frame other than object 6 and object 7; that content is omitted in this application.
After the (t+1)-th frame is discriminated, it can be determined that object 7 in the t-th frame is a false detection.
Further, taking front-to-back order as an example, the image to be detected is the (t+2)-th frame. After the (t+2)-th frame is discriminated, the temporary template information table is updated. The temporary template information table includes the information shown in Table 5.
Table 5
Figure PCTCN2020086742-appb-000003
It should be noted that Table 5 may also include information about objects in the (t+2)-th frame other than object 6 and object 7; that content is omitted in this application.
After the (t+2)-th frame is discriminated, it can be determined that the discrimination result of object 6 in the t-th frame is ended.
In this way, the discrimination result of every object in the t-th frame has been determined.
S305: Obtain difficult case data according to the discrimination result of each object in the image to be detected.
An object in the image to be detected whose discrimination result is missed detection is determined to be a missed-detection difficult case; an object whose discrimination result is false detection is determined to be a false-detection difficult case. Obtaining the missed-detection difficult cases and/or false-detection difficult cases is obtaining the difficult case data. Optionally, the missed-detection and/or false-detection difficult cases may be added to a difficult case data set.
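Step S305 amounts to filtering the per-object discrimination results. A minimal sketch, assuming the results for one frame are available as a mapping from object number to a string label; the function name `collect_difficult_cases` and the labels are illustrative.

```python
from typing import Dict, List


def collect_difficult_cases(results: Dict[int, str]) -> Dict[str, List[int]]:
    """Split the objects of one frame into missed-detection and false-detection cases."""
    return {
        "missed_detection": sorted(k for k, v in results.items() if v == "missed_detection"),
        "false_detection": sorted(k for k, v in results.items() if v == "false_detection"),
    }


# Final results for frame t as described in the worked example (objects 1 to 7).
frame_results = {
    1: "matched", 2: "missed_detection", 3: "false_detection",
    4: "newly_appeared", 5: "ended", 6: "ended", 7: "false_detection",
}
difficult_cases = collect_difficult_cases(frame_results)
```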
In one implementation, the template information table is updated according to the discrimination result of each object in the image to be detected. This makes effective use of results that have already been obtained.
The update rules for the template information table are:
For an object whose discrimination result is successful match, update the corresponding position information in the template information table with the detection result of the object;
For an object whose discrimination result is missed detection, update the corresponding position information in the template information table with the tracking result of the object;
For an object whose discrimination result is false detection, do not update the template information table;
For an object whose discrimination result is newly appeared, add the information from the object's detection result to the template information table;
For an object whose discrimination result is ended, remove the object's information from the template information table.
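The five update rules can be sketched as a single function over a simplified template table that maps an object number to its current bounding box. The simplification, the function name `apply_final_result`, and the string labels are assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def apply_final_result(
    template: Dict[int, Box],
    object_id: int,
    result: str,
    det_box: Optional[Box] = None,   # position from the detection result
    trk_box: Optional[Box] = None,   # position from the tracking result
) -> Dict[int, Box]:
    """Update the (simplified) template information table for one object."""
    if result == "matched":
        template[object_id] = det_box        # use the detection result
    elif result == "missed_detection":
        template[object_id] = trk_box        # use the tracking result
    elif result == "false_detection":
        pass                                 # table is left unchanged
    elif result == "newly_appeared":
        template[object_id] = det_box        # add the new object
    elif result == "ended":
        template.pop(object_id, None)        # remove the object
    else:
        raise ValueError(f"unknown discrimination result: {result!r}")
    return template


template = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30), 5: (50, 50, 60, 60)}
apply_final_result(template, 1, "matched", det_box=(1, 1, 11, 11))
apply_final_result(template, 2, "missed_detection", trk_box=(21, 21, 31, 31))
apply_final_result(template, 3, "false_detection", det_box=(70, 70, 80, 80))
apply_final_result(template, 4, "newly_appeared", det_box=(40, 40, 45, 45))
apply_final_result(template, 5, "ended")
```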
The method for mining difficult cases in target detection provided by the embodiments of this application combines a single-target tracking algorithm with a target detection algorithm and discriminates missed-detection and false-detection difficult cases at the same time, so both kinds of difficult case can be extracted more accurately and difficult case mining is more effective. A deep-learning-based single-target tracking algorithm analyzes the image using deep semantic features and effectively predicts the position of the matching result. An association matching algorithm matches the detection result against the tracking result, which ensures that each detection object in the detection result corresponds to at most one tracking object in the tracking result. Together these improve the accuracy of difficult case mining.
上述主要对本申请实施例提供的方案进行了介绍。可以理解的是,上述设备为了实现上述功能,其包含了执行各个功能相应的硬件结构和/或软件模块。本领域技术人员应该可以理解,结合本文中所公开的实施例描述的各示例的单元及算法步骤,本申请能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。The foregoing mainly introduces the solutions provided by the embodiments of the present application. It can be understood that, in order to realize the above-mentioned functions, the above-mentioned device includes hardware structures and/or software modules corresponding to each function. Those skilled in the art should understand that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or computer software-driven hardware depends on the specific application and design constraint conditions of the technical solution. Professionals and technicians can use different methods for each specific application to implement the described functions, but such implementation should not be considered beyond the scope of this application.
In the embodiments of this application, the apparatus for mining difficult cases in target detection may be divided into functional modules according to the foregoing method examples. For example, one functional module may be defined for each function, or two or more functions may be integrated into one processing module. An integrated module may be implemented in hardware or as a software functional module. Note that the division into modules in the embodiments of this application is illustrative and is merely a logical division of functions; other divisions are possible in an actual implementation. The following description uses the example in which one functional module is defined for each function.
FIG. 5 is a schematic diagram of the logical structure of an apparatus 500 provided by an embodiment of this application. The apparatus 500 may be an apparatus for mining difficult cases in target detection and can implement the method for mining difficult cases in target detection provided by the embodiments of this application. The apparatus 500 may be a hardware structure, a software module, or a hardware structure plus a software module. As shown in FIG. 5, the apparatus 500 includes an initialization module 501, a target detection module 502, an information update module 503, a single-target tracking module 504, and a difficult case mining module 505.
Referring to FIG. 6, in difficult case mining for target detection, the initialization module 501 is configured to initialize various information, such as the frame number of the image to be detected. For example, when the first frame is processed, the frame number of the current image to be detected is initialized to 1. The initialization module 501 may also initialize the temporary template information table and the template information table to empty. The target detection module 502 is configured to analyze the image to be detected using a preset target detection algorithm and obtain the detection result of the image. The information update module 503 is configured to update the temporary template information table and the template information table, for example by writing the detection result of the image to be detected into both tables. The single-target tracking module 504 is configured to analyze the image to be detected using a preset single-target tracking algorithm and obtain the tracking result of the image. The difficult case mining module 505 obtains the discrimination result of each object in the image to be detected according to the detection result output by the target detection module 502, the tracking result output by the single-target tracking module 504, and the preset association rules and discrimination rules. The missed-detection difficult cases and false-detection difficult cases output by the difficult case mining module 505 are added to the difficult case data set. After the difficult case mining module 505 obtains the discrimination result of each object in the image to be detected, the information update module 503 updates the template information table. The information update module 503 is also configured to update the temporary template information table while the difficult case mining module 505 discriminates objects whose preliminary discrimination result is possible false detection or possible missed detection. After the discrimination result of each object in the current image to be detected is obtained, the initialization module 501 increments the frame number of the current image by 1 and obtains the next image to be detected.
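As a rough illustration of the discrimination step performed by the difficult case mining module 505, the preliminary discrimination rule recited in the claims can be sketched as a small function. The string labels, the function name `preliminary_verdict`, and the default threshold values `t_high` and `t_low` are hypothetical and chosen only for this example; the application requires only that the second confidence threshold be smaller than the first.

```python
def preliminary_verdict(association: str, track_conf: float = 0.0,
                        t_high: float = 0.8, t_low: float = 0.3) -> str:
    """Map an association result to a preliminary discrimination result.

    association: "matched", "detection_only" (first association result) or
                 "tracking_only" (second association result).
    track_conf:  tracking confidence, used only for "tracking_only".
    t_high/t_low: first and second confidence thresholds (assumed values).
    """
    if not t_low < t_high:
        raise ValueError("second threshold must be smaller than the first")

    if association == "matched":
        return "matched"
    if association == "detection_only":
        # Detected but not tracked: may be a false detection or a new object.
        return "possible_false_detection"
    if association == "tracking_only":
        if track_conf >= t_high:
            return "missed_detection"
        if track_conf <= t_low:
            return "ended"
        # Confidence strictly between the thresholds: defer to adjacent frames.
        return "possible_missed_detection"
    raise ValueError(f"unknown association result: {association}")
```

Objects that receive one of the two "possible" labels are the ones that the module resolves later by looking at adjacent frames, which is why the temporary template information table is kept up to date during that stage.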
FIG. 7 is a schematic diagram of the logical structure of an apparatus 700 provided by an embodiment of this application. The apparatus 700 may be a device for mining difficult cases in target detection and can implement the method for mining difficult cases in target detection provided by the embodiments of this application. The apparatus 700 may be a hardware structure, a software module, or a hardware structure plus a software module. As shown in FIG. 7, the apparatus 700 includes an image acquisition unit 701, a target detection unit 702, a target tracking unit 703, and a difficult case mining unit 704. The image acquisition unit 701 may be used to perform S301 in FIG. 3 and/or other steps described in this application. The target detection unit 702 may be used to perform S302 in FIG. 3 and/or other steps described in this application. The target tracking unit 703 may be used to perform S303 in FIG. 3 and/or other steps described in this application. The difficult case mining unit 704 may be used to perform S304 and S305 in FIG. 3 and/or other steps described in this application.
All relevant content of the steps in the foregoing method embodiments can be cited in the functional descriptions of the corresponding functional units, and details are not repeated here.
A person of ordinary skill in the art will understand that all or some of the steps in the foregoing method can be completed by a program instructing relevant hardware. The program can be stored in a computer-readable storage medium such as a ROM, a RAM, or an optical disc.
An embodiment of this application further provides a storage medium, which may include a memory.
For explanations of the relevant content and the beneficial effects of any of the apparatuses provided above, refer to the corresponding method embodiments provided above; details are not repeated here.
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When a software program is used, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, user equipment, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, a solid state disk (SSD)).
Although this application has been described herein in conjunction with various embodiments, a person skilled in the art can, in the course of implementing the claimed application, understand and implement other variations of the disclosed embodiments by studying the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "one" does not exclude a plurality. A single processor or other unit may fulfill several functions recited in the claims. The fact that certain measures are recited in mutually different dependent claims does not mean that these measures cannot be combined to good effect.
Although this application has been described in conjunction with specific features and embodiments, it is evident that various modifications and combinations can be made without departing from the spirit and scope of this application. Accordingly, the specification and drawings are merely exemplary descriptions of this application as defined by the appended claims, and are deemed to cover any and all modifications, variations, combinations, or equivalents within the scope of this application. A person skilled in the art can make various changes and variations to this application without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of this application and their technical equivalents, this application is intended to include them as well.

Claims (21)

  1. A method for mining difficult cases in target detection, characterized in that the method comprises:
    obtaining an image to be detected;
    analyzing the image to be detected using a preset target detection algorithm to obtain a detection result of the image to be detected, wherein the detection result includes: categories of one or more detection objects, detection positions of the one or more detection objects, and classification accuracy values of the one or more detection objects;
    analyzing the image to be detected using a preset single-target tracking algorithm to obtain a tracking result of the image to be detected, wherein the tracking result includes: tracking positions of one or more tracking objects and tracking confidences of the one or more tracking objects;
    obtaining a discrimination result of each object in the image to be detected according to the detection result, the tracking result, and preset rules, wherein the discrimination result includes: successful match, missed detection, false detection, newly appeared, and ended; and
    determining an object whose discrimination result is missed detection as a missed-detection difficult case, and determining an object whose discrimination result is false detection as a false-detection difficult case.
  2. The method for mining difficult cases in target detection according to claim 1, characterized in that the method comprises:
    obtaining an association result of a first object in the image to be detected according to the detection result, the tracking result, and a preset association rule, wherein the association result includes: successful match, a first association result, and a second association result;
    the first association result being that the first object exists in the detection result and does not exist in the tracking result;
    the second association result being that the first object does not exist in the detection result and exists in the tracking result;
    performing preliminary discrimination on the first object in the image to be detected according to a preliminary discrimination rule and the association result of the first object, to obtain a preliminary discrimination result of the first object in the image to be detected, wherein the preliminary discrimination result includes: successful match, missed detection, false detection, possible missed detection, possible false detection, and ended; and
    determining the discrimination result of the first object in the image to be detected according to the preliminary discrimination result of the first object in the image to be detected.
  3. The method for mining difficult cases in target detection according to claim 2, characterized in that obtaining the association result of the first object in the image to be detected according to the detection result, the tracking result, and the preset association rule comprises:
    selecting the objects in the detection result whose category is the same as that of objects in the tracking result, and constructing a matrix with tracking objects as rows and detection objects as columns, or with detection objects as rows and tracking objects as columns; and
    obtaining the association result of the first object in the image to be detected from the matrix using an association matching algorithm.
  4. The method for mining difficult cases in target detection according to claim 2, characterized in that the preliminary discrimination rule comprises:
    for a first object whose association result is successful match, the preliminary discrimination result is successful match;
    for a first object whose association result is the first association result, the preliminary discrimination result is possible false detection;
    for a first object whose association result is the second association result and whose tracking object has a tracking confidence greater than or equal to a first confidence threshold, the preliminary discrimination result is missed detection;
    for a first object whose association result is the second association result and whose tracking object has a tracking confidence less than or equal to a second confidence threshold, the preliminary discrimination result is ended, wherein the second confidence threshold is less than the first confidence threshold; and
    for a first object whose association result is the second association result and whose tracking object has a tracking confidence less than the first confidence threshold and greater than the second confidence threshold, the preliminary discrimination result is possible missed detection.
  5. The method for mining difficult cases in target detection according to claim 2, characterized in that determining the discrimination result of the first object in the image to be detected according to the preliminary discrimination result of the first object in the image to be detected comprises:
    for a first object whose preliminary discrimination result is successful match, determining that the discrimination result is successful match;
    for a first object whose preliminary discrimination result is missed detection, determining that the discrimination result is missed detection;
    for a first object whose preliminary discrimination result is ended, determining that the discrimination result is ended; and
    for a first object whose preliminary discrimination result is possible false detection or possible missed detection, determining the discrimination result in combination with the discrimination results of S frames of images adjacent to the image to be detected, wherein S>1.
  6. The method for mining difficult cases in target detection according to claim 5, characterized in that, for a first object whose preliminary discrimination result is possible false detection or possible missed detection, determining the discrimination result in combination with the discrimination results of the S frames of images adjacent to the image to be detected comprises:
    for a first object whose preliminary discrimination result is possible missed detection, examining the 1st to the S-th frames of images adjacent to the image to be detected in order;
    if, in at least one of the 1st to the S-th adjacent frames, the association result of the object corresponding to the first object is the second association result and the tracking confidence of that object is less than or equal to the second confidence threshold, determining that the discrimination result of the first object is ended;
    if, in every one of the 1st to the S-th adjacent frames, the association result of the object corresponding to the first object is the second association result and the tracking confidence of that object is less than the first confidence threshold and greater than the second confidence threshold, determining that the discrimination result of the first object is ended; and
    otherwise, determining that the discrimination result of the first object is missed detection.
  7. The method for mining difficult cases in target detection according to claim 5, characterized in that, for a first object whose preliminary discrimination result is possible false detection or possible missed detection, determining the discrimination result in combination with the discrimination results of the S frames of images adjacent to the image to be detected comprises:
    for a first object whose preliminary discrimination result is possible false detection,
    if, in every one of the 1st to the S-th frames of images adjacent to the image to be detected, the association result of the object corresponding to the first object is successful match and the tracking confidence of that object is greater than the first confidence threshold, determining that the discrimination result of the first object is newly appeared; and
    otherwise, determining that the discrimination result of the first object is false detection.
  8. The method for mining difficult cases in target detection according to claim 5, characterized in that:
    if the number of remaining undetected images to be detected is greater than or equal to a preset frame count S, the S frames of images adjacent to the image to be detected are the S frames whose frame numbers are greater than the frame number of the image to be detected; and
    if the number of remaining undetected images to be detected is less than the preset frame count S, the S frames of images adjacent to the image to be detected are the S frames whose frame numbers are smaller than the frame number of the image to be detected.
  9. The method for mining difficult cases in target detection according to any one of claims 1 to 8, characterized in that:
    the preset single-target tracking algorithm is a single-target tracking algorithm based on deep learning.
  10. An apparatus for mining difficult cases in target detection, characterized in that the apparatus comprises:
    an image acquisition unit, configured to obtain an image to be detected;
    a target detection unit, configured to analyze the image to be detected using a preset target detection algorithm to obtain a detection result of the image to be detected, wherein the detection result includes: categories of one or more detection objects, detection positions of the one or more detection objects, and classification accuracy values of the one or more detection objects;
    a target tracking unit, configured to analyze the image to be detected using a preset single-target tracking algorithm to obtain a tracking result of the image to be detected, wherein the tracking result includes: tracking positions of one or more tracking objects and tracking confidences of the one or more tracking objects; and
    a difficult case mining unit, configured to obtain a discrimination result of each object in the image to be detected according to the detection result, the tracking result, and preset rules, wherein the discrimination result includes: successful match, missed detection, false detection, newly appeared, and ended;
    the difficult case mining unit being further configured to determine an object whose discrimination result is missed detection as a missed-detection difficult case, and to determine an object whose discrimination result is false detection as a false-detection difficult case.
  11. The apparatus for mining difficult cases in target detection according to claim 10, characterized in that the difficult case mining unit is specifically configured to:
    obtain an association result of a first object in the image to be detected according to the detection result, the tracking result, and a preset association rule, wherein the association result includes: successful match, a first association result, and a second association result;
    the first association result being that the first object exists in the detection result and does not exist in the tracking result;
    the second association result being that the first object does not exist in the detection result and exists in the tracking result;
    perform preliminary discrimination on the first object in the image to be detected according to a preliminary discrimination rule and the association result of the first object, to obtain a preliminary discrimination result of the first object in the image to be detected, wherein the preliminary discrimination result includes: successful match, missed detection, false detection, possible missed detection, possible false detection, and ended; and
    determine the discrimination result of the first object in the image to be detected according to the preliminary discrimination result of the first object in the image to be detected.
  12. The apparatus for mining difficult cases in target detection according to claim 11, characterized in that the difficult case mining unit obtaining the association result of the first object in the image to be detected according to the detection result, the tracking result, and the preset association rule specifically comprises:
    selecting the objects in the detection result whose category is the same as that of objects in the tracking result, and constructing a matrix with tracking objects as rows and detection objects as columns, or with detection objects as rows and tracking objects as columns; and
    obtaining the association result of the first object in the image to be detected from the matrix using an association matching algorithm.
  13. The apparatus for mining difficult cases in target detection according to claim 11, characterized in that the preliminary discrimination rule comprises:
    for a first object whose association result is successful match, the preliminary discrimination result is successful match;
    for a first object whose association result is the first association result, the preliminary discrimination result is possible false detection;
    for a first object whose association result is the second association result and whose tracking object has a tracking confidence greater than or equal to a first confidence threshold, the preliminary discrimination result is missed detection;
    for a first object whose association result is the second association result and whose tracking object has a tracking confidence less than or equal to a second confidence threshold, the preliminary discrimination result is ended, wherein the second confidence threshold is less than the first confidence threshold; and
    for a first object whose association result is the second association result and whose tracking object has a tracking confidence less than the first confidence threshold and greater than the second confidence threshold, the preliminary discrimination result is possible missed detection.
  14. The apparatus for mining difficult cases in target detection according to claim 11, characterized in that the difficult case mining unit determining the discrimination result of the first object in the image to be detected according to the preliminary discrimination result of the first object in the image to be detected specifically comprises:
    for a first object whose preliminary discrimination result is successful match, determining that the discrimination result is successful match;
    for a first object whose preliminary discrimination result is missed detection, determining that the discrimination result is missed detection;
    for a first object whose preliminary discrimination result is ended, determining that the discrimination result is ended; and
    for a first object whose preliminary discrimination result is possible false detection or possible missed detection, determining the discrimination result in combination with the discrimination results of S frames of images adjacent to the image to be detected, wherein S>1.
  15. The apparatus for mining difficult cases in target detection according to claim 14, wherein the difficult case mining unit determining, for a first object whose preliminary discrimination result is possible false detection or possible missed detection, the discrimination result in combination with the discrimination results of the S frames of images adjacent to the image to be detected specifically comprises:
    for a first object whose preliminary discrimination result is possible missed detection, judging in order from the 1st frame image to the S-th frame image adjacent to the image to be detected:
    if, in at least one frame among the 1st to S-th frame images adjacent to the image to be detected, the association result of the object corresponding to that first object is the second association result, and the tracking confidence of that corresponding object is less than or equal to the second confidence threshold, determining that the discrimination result of that first object is ended;
    if, in every frame among the 1st to S-th frame images adjacent to the image to be detected, the association result of the object corresponding to that first object is the second association result, and the tracking confidence of that corresponding object is in every frame less than the first confidence threshold and greater than the second confidence threshold, determining that the discrimination result of that first object is ended;
    otherwise, determining that the discrimination result of that first object is missed detection.
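The decision rule of claim 15 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the names `NeighborObservation`, `SECOND_ASSOCIATION` and `resolve_possible_miss` are hypothetical, the "second association result" is treated as an opaque label because it is defined in an earlier claim, and the first confidence threshold is assumed to be larger than the second.

```python
from typing import NamedTuple, Sequence

# Hypothetical label standing in for the "second association result",
# which is defined in an earlier claim and treated as opaque here.
SECOND_ASSOCIATION = "second"


class NeighborObservation(NamedTuple):
    """What was recorded, in one adjacent frame, for the object that
    corresponds to the candidate (assumed structure)."""
    association: str
    tracking_confidence: float


def resolve_possible_miss(
    neighbors: Sequence[NeighborObservation],
    first_threshold: float,
    second_threshold: float,
) -> str:
    """Resolve a 'possible missed detection' using the S adjacent frames.

    `neighbors` is ordered from the 1st to the S-th adjacent frame.
    Returns "ended" or "missed".
    """
    # Condition 1: in at least one adjacent frame the association result is
    # the second association result and the tracking confidence has dropped
    # to the second threshold or below.
    for obs in neighbors:
        if (obs.association == SECOND_ASSOCIATION
                and obs.tracking_confidence <= second_threshold):
            return "ended"

    # Condition 2: in every adjacent frame the association result is the
    # second association result and the tracking confidence lies strictly
    # between the second and first thresholds.
    if neighbors and all(
        obs.association == SECOND_ASSOCIATION
        and second_threshold < obs.tracking_confidence < first_threshold
        for obs in neighbors
    ):
        return "ended"

    # Otherwise the candidate is confirmed as a missed detection.
    return "missed"
```

With illustrative thresholds of 0.7 and 0.2, a candidate whose adjacent frames include one with a different association result and a high tracking confidence meets neither condition and is therefore resolved as a missed detection.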
  16. The apparatus for mining difficult cases in target detection according to claim 14, wherein the difficult case mining unit determining, for a first object whose preliminary discrimination result is possible false detection or possible missed detection, the discrimination result in combination with the discrimination results of the S frames of images adjacent to the image to be detected specifically comprises:
    for a first object whose preliminary discrimination result is possible false detection:
    if, in every frame among the 1st to S-th frame images adjacent to the image to be detected, the association result of the object corresponding to that first object is a successful match, and the tracking confidence of that corresponding object is in every frame greater than the first confidence threshold, determining that the discrimination result of that first object is newly appeared;
    otherwise, determining that the discrimination result of that first object is false detection.
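The rule of claim 16 can be sketched in the same way. The sketch below is illustrative only: `FrameRecord`, `MATCHED` and `resolve_possible_false_detection` are hypothetical names, and the string label used for a successful match is an assumption.

```python
from typing import NamedTuple, Sequence

# Hypothetical label for the "successful match" association result.
MATCHED = "matched"


class FrameRecord(NamedTuple):
    """Association result and tracking confidence recorded in one adjacent
    frame for the object corresponding to the candidate (assumed structure)."""
    association: str
    tracking_confidence: float


def resolve_possible_false_detection(
    neighbors: Sequence[FrameRecord],
    first_threshold: float,
) -> str:
    """Resolve a 'possible false detection' using the S adjacent frames.

    Returns "new" when the object is matched with a tracking confidence
    above the first threshold in every adjacent frame, and
    "false_detection" otherwise.
    """
    confirmed_everywhere = bool(neighbors) and all(
        rec.association == MATCHED and rec.tracking_confidence > first_threshold
        for rec in neighbors
    )
    return "new" if confirmed_everywhere else "false_detection"
```

A single adjacent frame in which the object is unmatched, or matched with a tracking confidence at or below the first threshold, is enough to resolve the candidate as a false detection.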
  17. The apparatus for mining difficult cases in target detection according to claim 14, wherein:
    if the number of remaining undetected frames of images to be detected is greater than or equal to the preset frame number S, the S frames of images adjacent to the image to be detected are the S frames whose frame numbers are greater than the frame number of the image to be detected;
    if the number of remaining undetected frames of images to be detected is less than the preset frame number S, the S frames of images adjacent to the image to be detected are the S frames whose frame numbers are smaller than the frame number of the image to be detected.
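The frame-selection rule of claim 17 can be sketched as follows. Several points are assumptions rather than statements of the claim: frames are indexed from 0, the "remaining undetected" frames are taken to be the frames after the current one, adjacent frames are listed nearest first, and the backward window is clipped at the first frame. The function name `adjacent_frame_indices` is hypothetical.

```python
from typing import List


def adjacent_frame_indices(current: int, total: int, s: int) -> List[int]:
    """Pick the S frames adjacent to frame `current` (0-based).

    Looks forward when at least S frames remain after `current`,
    and backward otherwise. Indices are returned nearest first.
    """
    remaining = total - 1 - current  # frames after the current one (assumption)
    if remaining >= s:
        # Enough later frames: use the S frames with larger frame numbers.
        return list(range(current + 1, current + s + 1))
    # Too few later frames: use frames with smaller frame numbers,
    # clipped at frame 0 (the claim does not cover this edge case).
    return list(range(current - 1, max(current - s, 0) - 1, -1))


print(adjacent_frame_indices(2, 10, 3))  # [3, 4, 5]
print(adjacent_frame_indices(8, 10, 3))  # [7, 6, 5]
```

In a 10-frame sequence with S = 3, frame 2 has seven later frames and so uses frames 3 to 5, while frame 8 has only one later frame and falls back to frames 7, 6 and 5.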
  18. The apparatus for mining difficult cases in target detection according to any one of claims 10 to 17, wherein
    the preset single-target tracking algorithm is a deep-learning-based single-target tracking algorithm.
  19. A device, comprising a processor and a memory, wherein the memory is coupled to the processor; the memory is configured to store computer program code; and the computer program code comprises computer instructions which, when executed by the processor, cause the device to perform the method for mining difficult cases in target detection according to any one of claims 1 to 9.
  20. A computer-readable storage medium, comprising computer instructions which, when run on a device, cause the device to perform the method for mining difficult cases in target detection according to any one of claims 1 to 9.
  21. A computer program product which, when run on a computer, causes the computer to perform the method for mining difficult cases in target detection according to any one of claims 1 to 9.
PCT/CN2020/086742 2020-04-24 2020-04-24 Method and apparatus for mining difficult case during target detection WO2021212482A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/086742 WO2021212482A1 (en) 2020-04-24 2020-04-24 Method and apparatus for mining difficult case during target detection
CN202080004676.1A CN112639872B (en) 2020-04-24 2020-04-24 Method and device for difficult mining in target detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/086742 WO2021212482A1 (en) 2020-04-24 2020-04-24 Method and apparatus for mining difficult case during target detection

Publications (1)

Publication Number Publication Date
WO2021212482A1 true WO2021212482A1 (en) 2021-10-28

Family

ID=75291201

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/086742 WO2021212482A1 (en) 2020-04-24 2020-04-24 Method and apparatus for mining difficult case during target detection

Country Status (2)

Country Link
CN (1) CN112639872B (en)
WO (1) WO2021212482A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113361413A (en) * 2021-06-08 2021-09-07 南京三百云信息科技有限公司 Mileage display area detection method, device, equipment and storage medium
CN113468365B (en) * 2021-09-01 2022-01-25 北京达佳互联信息技术有限公司 Training method of image type recognition model, image retrieval method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108647577A (en) * 2018-04-10 2018-10-12 华中科技大学 A kind of pedestrian's weight identification model that adaptive difficult example is excavated, method and system
CN108647587A (en) * 2018-04-23 2018-10-12 腾讯科技(深圳)有限公司 Demographic method, device, terminal and storage medium
CN109635649A (en) * 2018-11-05 2019-04-16 航天时代飞鸿技术有限公司 A kind of high speed detection method and system of unmanned plane spot
EP3540634A1 (en) * 2018-03-13 2019-09-18 InterDigital CE Patent Holdings Method for audio-visual events classification and localization and corresponding apparatus computer readable program product and computer readable storage medium

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9092841B2 (en) * 2004-06-09 2015-07-28 Cognex Technology And Investment Llc Method and apparatus for visual detection and inspection of objects
US8712096B2 (en) * 2010-03-05 2014-04-29 Sri International Method and apparatus for detecting and tracking vehicles
CN104123532B (en) * 2013-04-28 2017-05-10 浙江大华技术股份有限公司 Target object detection and target object quantity confirming method and device
CN105046220A (en) * 2015-07-10 2015-11-11 华为技术有限公司 Multi-target tracking method, apparatus and equipment
CN106874894B (en) * 2017-03-28 2020-04-14 电子科技大学 Human body target detection method based on regional full convolution neural network
CN107516303A (en) * 2017-09-01 2017-12-26 成都通甲优博科技有限责任公司 Multi-object tracking method and system
CN108053427B (en) * 2017-10-31 2021-12-14 深圳大学 Improved multi-target tracking method, system and device based on KCF and Kalman
CN108446622A (en) * 2018-03-14 2018-08-24 海信集团有限公司 Detecting and tracking method and device, the terminal of target object
CN109460702B (en) * 2018-09-14 2022-02-15 华南理工大学 Passenger abnormal behavior identification method based on human body skeleton sequence
CN110751096B (en) * 2019-10-21 2022-02-22 陕西师范大学 Multi-target tracking method based on KCF track confidence
CN110852283A (en) * 2019-11-14 2020-02-28 南京工程学院 Helmet wearing detection and tracking method based on improved YOLOv3

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115359308A (en) * 2022-04-06 2022-11-18 北京百度网讯科技有限公司 Model training method, apparatus, device, storage medium, and program for identifying difficult cases
CN115359308B (en) * 2022-04-06 2024-02-13 北京百度网讯科技有限公司 Model training method, device, equipment, storage medium and program for identifying difficult cases
CN117710944A (en) * 2024-02-05 2024-03-15 虹软科技股份有限公司 Model defect detection method, model training method, target detection method and target detection system

Also Published As

Publication number Publication date
CN112639872B (en) 2022-02-11
CN112639872A (en) 2021-04-09

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20931937; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 20931937; Country of ref document: EP; Kind code of ref document: A1)