CN110555339A - target detection method, system, device and storage medium - Google Patents

Target detection method, system, device and storage medium

Info

Publication number
CN110555339A
Authority
CN
China
Prior art keywords
target
detected
detection
data
training
Legal status
Pending
Application number
CN201810547022.8A
Other languages
Chinese (zh)
Inventor
赵元 (Zhao Yuan)
沈海峰 (Shen Haifeng)
Current Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Original Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Application filed by Beijing Didi Infinity Technology and Development Co Ltd
Priority to CN201810547022.8A
Priority to PCT/CN2019/087015 (published as WO2019223582A1)
Publication of CN110555339A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 - Classification techniques relating to the classification model based on distances to training or reference patterns
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/59 - Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions

Abstract

The invention provides a target detection method, system, device, and storage medium. The system comprises an acquisition module and a detection module. The acquisition module acquires data to be detected. The detection module detects the data to be detected and outputs a detection result containing both the class identification of the target to be detected and the class identification of the interfering object, so as to obtain the target to be detected; an interfering object is an object that has been mistakenly identified as the target to be detected at least once during target detection. With the disclosed method, once the target to be detected and its interfering objects have been identified, the influence of the interfering objects on the target can be removed based on the differences between them, yielding a better target detection result.

Description

Target detection method, system, device and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method, a system, an apparatus, and a storage medium for target detection.
Background
Target detection is a technique for detecting whether a target of interest exists in a complex image/video scene; sometimes the category and coordinate position of the target also need to be acquired. During detection, some background objects have characteristics similar to those of the target and therefore easily interfere with detection; for example, a tie and a safety belt are both long, narrow strips. The performance of a target detection algorithm is thus easily degraded, and more complicated techniques are required to distinguish the target to be detected from interfering objects, which in turn means more investment and higher cost. A simple and accurate target detection method and/or system is needed to address these problems.
Disclosure of Invention
To solve the prior-art problem that a target to be detected is difficult to distinguish from an interfering object during target detection, embodiments of the present invention provide a target detection method, system, device, and storage medium.
To achieve this purpose, the technical solutions provided by the present invention are as follows:
An object detection system. The system comprises an acquisition module and a detection module. The acquisition module is configured to acquire data to be detected. The detection module is configured to detect the data to be detected and output a detection result containing the class identification of the target to be detected and the class identification of the interfering object, so as to obtain the target to be detected; an interfering object is an object that has been mistakenly identified as the target to be detected at least once during target detection.
In the present invention, the data to be detected is image data, and the detection result is image data in which the target to be detected and the interfering object are marked using different identification modes.
In the present invention, the detection module may be further configured to detect the data to be detected using a target detection model. The system further comprises a training module configured to train an initial model with training data containing the labeling information of the target to be detected and the labeling information of the interfering object, to obtain the target detection model. An interfering object is an object that has been mistakenly identified as the target to be detected at least once during target detection.
In the present invention, the training module further includes a preliminary training unit, a detection model test unit, and a retraining unit. The preliminary training unit is configured to train an initial model with the original training data to obtain an initial target detection model. The detection model test unit is configured to perform target detection using the initial target detection model and output detection results. The retraining unit is configured to acquire training data labeled with interfering objects based on the detection results, and to train the initial target detection model with that data to obtain the target detection model. The original training data contains at least the labeling information of the target to be detected.
In the present invention, the original training data further includes labeling information of a priori interfering objects. An a priori interfering object is an object that has been mistakenly labeled as the target to be detected at least once in a target detection process other than the training of the present target detection model.
A target detection method. The method may be implemented on a device that includes a processor and a memory, and may include one or more of the following operations. Data to be detected may be acquired. The data to be detected may be detected, and a detection result including the category identification of the target to be detected and the category identification of the interfering object may be output, so as to obtain the target to be detected; an interfering object is an object that has been mistakenly identified as the target to be detected at least once during target detection.
In the present invention, the data to be detected is image data, and the detection result is image data in which the target to be detected and the interfering object are marked using different identification modes.
In the present invention, detecting the data to be detected may include at least one of the following operations. The data to be detected may be detected using a target detection model. The target detection model may be obtained with the following training method: an initial model is trained with training data containing the labeling information of the target to be detected and the labeling information of the interfering object, to obtain the target detection model. An interfering object is an object that has been mistakenly identified as the target to be detected at least once during target detection.
In the present invention, the training method may further include at least one of the following operations, as sketched in the example below. Original training data may be acquired, and an initial model may be trained with the original training data to obtain an initial target detection model. Target detection may be performed using the initial target detection model, and detection results may be output. Training data labeled with interfering objects based on the detection results may be acquired. The initial target detection model may be trained with the training data labeled with interfering objects to obtain the target detection model. The original training data contains at least the labeling information of the target to be detected.
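By way of illustration only, this training flow can be sketched as the following Python skeleton, where `train`, `detect`, and `relabel` are caller-supplied placeholders and the `is_false_positive` field is an assumed annotation on each detection record; none of these names come from the disclosure:

```python
from typing import Any, Callable, List

def build_detector(train: Callable[[Any, List[dict]], Any],
                   detect: Callable[[Any, List[dict]], List[dict]],
                   relabel: Callable[[List[dict], List[dict]], List[dict]],
                   initial_model: Any,
                   original_data: List[dict],
                   test_data: List[dict]) -> Any:
    """Two-stage flow: train an initial detector, mine its false
    positives, relabel them as interfering objects, then retrain."""
    # Train an initial target detection model on the original training data.
    initial_detector = train(initial_model, original_data)
    # Run the initial model and collect its detection results.
    detections = detect(initial_detector, test_data)
    # Detections wrongly identified as the target become interferer labels.
    false_positives = [d for d in detections if d.get("is_false_positive")]
    # Retrain on data that now carries interfering-object labeling information.
    retrain_data = relabel(original_data, false_positives)
    return train(initial_detector, retrain_data)
```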
In the present invention, the original training data further includes labeling information of a priori interfering objects. An a priori interfering object is an object that has been mistakenly labeled as the target to be detected at least once in a target detection process other than the training of the present target detection model.
An object detection apparatus, comprising at least one processor and at least one memory. The at least one memory is configured to store computer instructions; the at least one processor is configured to execute at least a portion of the computer instructions to implement the object detection method described in any one of the above.
A computer-readable storage medium storing computer instructions. When the computer instructions in the storage medium are read by a computer, the computer executes the object detection method according to any one of the above.
Additional features will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following and the accompanying drawings or may be learned by production or operation of the examples. The features of the present invention may be realized and obtained by means of the instruments and methods set forth in the detailed description below.
Drawings
The present application is further described in terms of exemplary embodiments, which are described in detail with reference to the accompanying drawings. The described embodiments are not limiting. In the drawings, like reference numerals represent similar structures throughout the several views:
FIG. 1 is a schematic diagram of an exemplary object detection system, shown in accordance with some embodiments of the present invention;
FIG. 2 is a schematic diagram of exemplary hardware and/or software components of an exemplary computing device according to some embodiments of the present invention;
FIG. 3 is a schematic diagram of exemplary hardware components and/or software components of an exemplary mobile device shown in accordance with some embodiments of the present invention;
FIG. 4 is a block diagram of an exemplary processing engine shown in accordance with some embodiments of the invention;
FIG. 5 is an exemplary flow diagram illustrating target detection according to some embodiments of the invention;
FIG. 6 is an exemplary flow diagram illustrating the acquisition of a target detection improvement model according to some embodiments of the invention;
FIG. 7 is an exemplary flow diagram illustrating acquisition of a set of interfering objects of a target to be detected according to some embodiments of the invention;
FIG. 8 is an exemplary flow chart illustrating the determination of detection accuracy of a target detection improvement model according to some embodiments of the invention.
Detailed Description
To more clearly illustrate the technical solutions of the embodiments of the present application, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description are only examples or embodiments of the application, based on which a person skilled in the art can apply the application to other similar scenarios without inventive effort. Unless otherwise apparent from the context or otherwise indicated, like reference numbers in the figures refer to the same structure or operation.
As used in this application and the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. In general, the terms "comprise" and "include" merely indicate that the explicitly identified steps and elements are included; the steps and elements do not form an exclusive list, and a method or apparatus may also include other steps or elements.
Although various references are made herein to certain modules in a system according to embodiments of the present application, any number of different modules may be used and run on a vehicle client and/or server. The modules are merely illustrative and different aspects of the systems and methods may use different modules.
Flow charts are used herein to illustrate operations performed by systems according to embodiments of the present application. It should be understood that the operations are not necessarily performed exactly in the order shown. Instead, various steps may be processed in reverse order or simultaneously. Other operations may also be added to these processes, or one or more steps may be removed from them.
Embodiments of the present application may be applied to different application scenarios, including but not limited to face detection and recognition, video/surveillance analysis, picture analysis, intelligent driving, three-dimensional image vision, industrial vision inspection, medical image diagnosis, text recognition, image and video editing, and the like, or any combination thereof. Face detection and recognition may include, but is not limited to, attendance, access control, authentication, face attribute recognition, face detection and tracking, live-person detection, face comparison, face search, face key point localization, and the like, or any combination thereof. Video/surveillance analysis may include, but is not limited to, intelligent recognition, analysis, and localization of objects and commodities, pedestrian attributes, pedestrian analysis and tracking, crowd density and passenger flow analysis, road vehicle behavior analysis, and the like, or any combination thereof. Picture analysis may include, but is not limited to, image search, object/scene recognition, vehicle type recognition, person attribute analysis, clothing analysis, merchandise recognition, pornographic/violent content screening, and the like, or any combination thereof. Intelligent driving may include, but is not limited to, vehicle and object detection with collision warning, lane detection with deviation warning, traffic sign recognition, pedestrian detection, vehicle distance detection, and the like, or any combination thereof. Three-dimensional image vision may include, but is not limited to, three-dimensional machine vision, binocular stereo vision, three-dimensional reconstruction, three-dimensional scanning, mapping, industrial simulation, and the like, or any combination thereof. Industrial vision inspection may include, but is not limited to, industrial cameras, industrial vision monitoring, industrial vision measurement, industrial control, and the like, or any combination thereof. Medical image diagnosis may include, but is not limited to, tissue detection and identification, tissue localization, lesion detection and identification, lesion localization, and the like, or any combination thereof. Text recognition may include, but is not limited to, character detection, character extraction from images, character recognition, and the like, or any combination thereof. Image and video editing may include, but is not limited to, image/video authoring, image/video inpainting, image/video beautification, image/video effect transformation, and the like, or any combination thereof. Different embodiments of the present application can be applied to different industries, including but not limited to one or more of the internet, finance, smart home, e-commerce shopping, security, transportation, justice, military, public security, border inspection, government, aerospace, electric power, manufacturing, agriculture and forestry, education, entertainment, medical treatment, and the like. It should be understood that the application scenarios of the system and method of the present application are merely examples or embodiments, and a person skilled in the art can also apply the present application to other similar scenarios without inventive effort.
FIG. 1 is a schematic diagram of an object detection system 100 according to some embodiments of the present invention. For example, the object detection system 100 may be a platform that provides services for image/video detection. The object detection system 100 may include a server 110, a storage device 120, a network 130, and one or more terminals 140. The server 110 may include a processing engine 112.
In some embodiments, the server 110 may be a single server or a server group. The server group may be centralized or distributed (e.g., server 110 may be a distributed system). In some embodiments, the server 110 may be local or remote. For example, server 110 may access information and/or data stored in storage device 120 and/or terminal 140 via network 130. As another example, server 110 may be directly connected to storage device 120 and/or terminal 140 to access stored information and/or data. In some embodiments, the server 110 may be implemented on a cloud platform. By way of example only, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, the like, or any combination of the above. In some embodiments, server 110 may be implemented on a computing device similar to that shown in FIG. 2 or FIG. 3 of the present application. For example, server 110 may be implemented on one computing device 200 as shown in FIG. 2, including one or more components of computing device 200. As another example, server 110 may be implemented on a mobile device 300 as shown in FIG. 3, including one or more components of mobile device 300.
In some embodiments, the server 110 may include a processing engine 112. Processing engine 112 may process information and/or data related to a service request to perform one or more of the functions described herein. For example, the processing engine 112 may detect a target to be detected in an image and/or video. In some embodiments, processing engine 112 may include one or more processors (e.g., a single-core processor or a multi-core processor). For example only, the processing engine 112 may include one or more hardware processors, such as a central processing unit (CPU), an application-specific integrated circuit (ASIC), an application-specific instruction-set processor (ASIP), a graphics processing unit (GPU), a physics processing unit (PPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), a programmable logic device (PLD), a controller, a microcontroller unit, a reduced instruction set computer (RISC), a microprocessor, or the like, or any combination of the above.
Storage device 120 may store data and/or instructions. In some embodiments, storage device 120 may store data obtained from terminal 140. In some embodiments, storage device 120 may store data and/or instructions executed or used by server 110 to implement the example methods described herein. In some embodiments, storage device 120 may include mass storage, removable storage, volatile read-write memory, read-only memory (ROM), the like, or any combination of the above. Exemplary mass storage devices may include magnetic disks, optical disks, solid state drives, and the like. Exemplary removable memory may include flash drives, floppy disks, optical disks, memory cards, compact disks, magnetic tape, and the like. Exemplary volatile read-write memory may include random access memory (RAM). Exemplary RAM may include dynamic random access memory (DRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), static random access memory (SRAM), thyristor random access memory (T-RAM), zero-capacitor random access memory (Z-RAM), and the like. Exemplary ROM may include mask read-only memory (MROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), compact disk read-only memory (CD-ROM), digital versatile disk read-only memory (DVD-ROM), and the like. In some embodiments, storage device 120 may be implemented on a cloud platform. By way of example only, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, the like, or any combination of the above.
In some embodiments, storage device 120 may be connected to network 130 to enable communication with one or more components (e.g., server 110, terminal 140, etc.) in object detection system 100. One or more components of object detection system 100 may access data or instructions stored in storage device 120 via network 130. In some embodiments, the storage device 120 may be directly connected to or in communication with one or more components of the object detection system 100 (e.g., the terminal 140, etc.). In some embodiments, storage device 120 may be part of server 110.
The network 130 may facilitate the exchange of information and/or data. In some embodiments, one or more components of the object detection system 100 (e.g., server 110, storage device 120, and terminal 140) may send information and/or data to other components of the object detection system 100 via network 130. For example, the server 110 may obtain a request from the terminal 140 via the network 130. In some embodiments, the network 130 may be a wired network or a wireless network, or any combination thereof. For example, network 130 may include a cable network, a wireline network, a fiber optic network, a telecommunications network, an intranet, the internet, a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), a metropolitan area network (MAN), a public switched telephone network (PSTN), a Bluetooth network, a ZigBee network, a near field communication (NFC) network, the like, or any combination of the above. In some embodiments, the network 130 may include one or more network access points. For example, the network 130 may include wired or wireless network access points, such as base stations and/or internet exchange points 130-1, 130-2, and so on. Through an access point, one or more components of the object detection system 100 may connect to the network 130 to exchange data and/or information.
The terminal 140 may include one or more devices with picture taking and/or image capturing capabilities. Such as a desktop computer 140-1, a laptop computer 140-2, a camera device 140-3, a smart mobile device 140-4, etc. In some embodiments, the camera device 140-3 may include, but is not limited to, a video camera, a still camera, a security detection device, and the like, or any combination thereof. In some embodiments, mobile device 140-4 may include, but is not limited to, a smartphone, a Personal Digital Assistant (PDA), a tablet computer, a handheld game console, smart glasses, a smart watch, a wearable device, a virtual display device, a display enhancement device, and the like, or any combination thereof. In some embodiments, the terminal 140 may transmit the image/video to one or more devices in the object detection system 100. For example, the terminal 140 may send the image/video to the server 110 for processing.
FIG. 2 is a schematic diagram of an exemplary computing device 200 shown in accordance with some embodiments of the invention. Server 110, storage 120, and/or terminal 140 may be implemented on computing device 200. For example, the processing engine 112 may be implemented on the computing device 200 and configured to implement the functionality disclosed herein.
Computing device 200 may include any components used to implement the systems described herein. For example, the processing engine 112 may be implemented on the computing device 200 by its hardware, software programs, firmware, or a combination thereof. For convenience, only one computer is depicted in the figures, but the computational functions described herein in relation to the object detection system 100 may be implemented in a distributed manner by a set of similar platforms to distribute the processing load of the system.
Computing device 200 may include a communication port 250 for connecting to a network to enable data communication. Computing device 200 may include a processor (e.g., a CPU) 220 that may execute program instructions in the form of one or more processors. An exemplary computer platform may include an internal bus 210 and various forms of program memory and data storage, including, for example, a hard disk 270, read-only memory (ROM) 230, or random access memory (RAM) 240, for storing various data files that are processed and/or transmitted by the computer. An exemplary computing device may include program instructions stored in read-only memory 230, random access memory 240, and/or other types of non-transitory storage media that are executed by processor 220. The methods and/or processes of the present application may be embodied in the form of program instructions. Computing device 200 also includes input/output component 260 for supporting input/output between the computer and other components. Computing device 200 may also receive the programs and data of the present disclosure via network communication.
For ease of understanding, only one processor is exemplarily depicted in fig. 2. However, it should be noted that the computing device 200 in the present application may include multiple processors, and thus the operations and/or methods described in the present application that are implemented by one processor may also be implemented by multiple processors, collectively or independently. For example, if in the present application the processors of computing device 200 perform steps 1 and 2, it should be understood that steps 1 and 2 may also be performed by two different processors of computing device 200, either collectively or independently (e.g., a first processor performing step 1, a second processor performing step 2, or a first and second processor performing steps 1 and 2 collectively).
FIG. 3 is a schematic diagram of exemplary hardware and/or software of an exemplary mobile device 300, shown in accordance with some embodiments of the present invention. The terminal 140 may be implemented on the mobile device 300. As shown in FIG. 3, the mobile device 300 may include a communication unit 310, a display unit 320, a graphics processor 330, a processor 340, an input/output unit 350, a memory 360, and a storage unit 390. The mobile device 300 may also include a bus or a controller. In some embodiments, a mobile operating system 370 and one or more application programs 380 may be loaded from the storage unit 390 into the memory 360 and executed by the processor 340. In some embodiments, the application 380 may receive and display information for image processing or other information related to the processing engine 112. The input/output unit 350 may enable user interaction with the object detection system 100 and provide interaction-related information to other components of the object detection system 100, such as the server 110, via the network 130.
To implement the various modules, units, and functionality described in this application, a computer hardware platform may be used as the hardware platform for one or more of the elements mentioned herein. A computer with user interface elements may be implemented as a personal computer (PC) or any other form of workstation or terminal device. A computer may also act as a server if suitably programmed.
FIG. 4 is a block diagram illustrating an exemplary processing engine 112 according to some embodiments of the invention. As shown, the processing engine 112 may include an acquisition module 410 and a detection module 420.
The acquisition module 410 may acquire data. In some embodiments, the acquisition module 410 may acquire data from one or more components of the object detection system 100 (e.g., the storage device 120 or the terminal 140), or from any other device or component disclosed herein capable of storing data. The acquired data may include one or any combination of image data, video data, user instructions, algorithms, models, and the like. In some embodiments, the acquisition module 410 may acquire data to be detected. The data to be detected may comprise image data, video data, or a combination thereof. The image data may include one or more pictures, and the video data may include the frames that constitute a video. In some embodiments, one frame of video data may be treated as image data, and a plurality of consecutive images constitute a piece of video data. In some embodiments, the acquisition module 410 may acquire raw training data. The raw training data may be used to train an initial target detection model to obtain a target detection model. The raw data may be image data, video data, or a combination thereof, including the numbering of each image and the labeling of the target to be detected. In some embodiments, the acquisition module 410 may acquire verification data. The verification data may be image data, video data, or a combination thereof, and is used to check whether the detection accuracy of the trained target detection model meets a preset condition. In some embodiments, after acquiring the aforementioned data, the acquisition module 410 may transmit it to other modules of the processing engine 112 (e.g., the detection module 420 and/or the training module 440) for subsequent operations, or to the storage device 120 via the network 130 for storage.
The detection module 420 may be configured to detect the data to be detected (e.g., the images and/or videos acquired by the acquisition module 410) and output a detection result including the category identification of the target to be detected and the category identification of the interfering object, so as to obtain the target to be detected. The target to be detected is the target that needs to be found in the data to be detected, and an interfering object of that target is a background object that has been erroneously detected as the target in some target detection process. For example, an interfering object may be an object mistaken for the target by other detection models during target detection, an object judged by a human to have the same or similar properties as the target and thus easily misdetected, an object misdetected as the target by the target detection model of the present invention, an object misdetected as the target by the model during the training stage, and the like. The detection result contains all detected objects; a detected object may be the target to be detected or an interfering object, and the categories of the target and the interfering objects are marked respectively. The detection result is expressed in a form that includes an identification, a score, and the like. The identification may include an identification box, i.e., a detection box indicating a detected object in the data to be detected; the score may be the probability value with which the detected object is judged to be the target to be detected or an interfering object. In some embodiments, the detection result may further include category-distinguishing information for the target and the interfering objects. For example, the target and the interfering objects may be distinguished by identification frames of different colors or shapes, e.g., a green identification frame for the target to be detected and a red identification frame for an interfering object. As another example, a text description may be added near the identification frame to indicate whether the boxed object is the target or an interferer. In some embodiments, the detection module 420 may detect the objects in the data to be detected based on a target detection model, obtain a detection result, and obtain the target to be detected from the detection result, for example from the recognition result of the detected object within the category identification (e.g., the identification box). The target detection model is a model trained with training data that includes the labeling information of the target to be detected and the labeling information of the interfering objects of the target. The training process of the target detection model is completed by the target detection model training system 600.
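As a concrete illustration of such class-distinguished output, the following sketch draws green identification frames for the target and red ones for interfering objects with OpenCV; the detection-record format and file names are assumptions made for this example, not part of the disclosure:

```python
import cv2  # OpenCV, assumed available

# Illustrative detections: (x1, y1, x2, y2, label, category, score).
detections = [
    (120, 80, 180, 300, "seat_belt", "target", 0.97),
    (200, 60, 240, 280, "tie", "interferer", 0.85),
]
COLORS = {"target": (0, 255, 0), "interferer": (0, 0, 255)}  # BGR

image = cv2.imread("cabin.jpg")  # hypothetical in-vehicle frame
for x1, y1, x2, y2, label, category, score in detections:
    color = COLORS[category]
    cv2.rectangle(image, (x1, y1), (x2, y2), color, 2)
    # Text near the identification frame distinguishes target vs. interferer.
    cv2.putText(image, f"{category}: {label} {score:.2f}", (x1, y1 - 5),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 1)
cv2.imwrite("cabin_annotated.jpg", image)
```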
It should be understood that the system and its modules shown in FIG. 4 may be implemented in a variety of ways. For example, in some embodiments, the system and its modules may be implemented in hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portions may be stored in a memory for execution by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the methods and systems described above may be implemented using computer executable instructions and/or embodied in processor control code, such code being provided, for example, on a carrier medium such as a diskette, CD- or DVD-ROM, a programmable memory such as read-only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The system and its modules of the present application may be implemented not only by hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field programmable gate arrays and programmable logic devices, but also, for example, by software executed by various types of processors, or by a combination of the above hardware circuits and software (e.g., firmware).
It should be noted that the above description is merely for convenience and should not be taken as limiting the scope of the present application. It will be understood by those skilled in the art that, having the benefit of the teachings of this system, various modifications and changes in form and detail may be made to the fields to which the method and system described above are applied, without departing from these teachings. However, such changes and modifications do not depart from the scope of the present application.
FIG. 5 is an exemplary flow diagram illustrating target detection according to some embodiments of the invention. In some embodiments, flow 500 may be performed by processing logic that may comprise hardware (e.g., circuitry, dedicated logic, programmable logic, microcode, etc.), software (instructions run on a processing device to perform hardware simulation), etc., or any combination thereof. One or more of the operations in the target detection flow 500 shown in FIG. 5 may be implemented by the object detection system 100 shown in FIG. 1. For example, the flow 500 may be stored in the storage device 120 in the form of instructions and invoked and/or executed by the processing engine 112 (e.g., the processor 220 of the computing device 200 shown in FIG. 2, or the central processor 340 of the mobile device 300 shown in FIG. 3).
At 510, data to be detected may be acquired. Operation 510 may be performed by the acquisition module 410. In some embodiments, the data to be detected may include image data, video data, or any combination thereof. The image data may include one or more pictures, and the video data may include the frames that constitute a video. In some embodiments, a picture or video acquired in real time by an acquisition device (e.g., a camera) of the terminal 140 may be used as the data to be detected, or a picture or video previously acquired by the terminal 140 and stored in the storage device 120 may be accessed via the network 130 as the data to be detected.
In 520, the data to be detected may be detected, and a detection result including the category identification of the target to be detected and the category identification of the interfering object may be output, so as to obtain the target to be detected. The detection result may be obtained based on a target detection model. Operation 520 may be performed by the detection module 420. In some embodiments, the target to be detected is the target that needs to be obtained from the data to be detected, and an interfering object is a background target that is erroneously detected as the target during target detection because it has the same and/or similar physical properties (e.g., shape, color, texture) as the target; a background target is any object in the data to be detected, other than the target itself, that does not need to be detected. For example, assuming that a seat belt needs to be detected from a picture or video of an in-vehicle scene, the target to be detected may be designated as the seat belt, and targets of all other categories may be uniformly designated as background targets. Some objects, such as a tie, the hanging rope of a rear-view-mirror pendant, a light shadow, or a lane line, are easily confused with a seat belt because their shapes are similar (e.g., long and narrow), and are therefore prone to being erroneously detected as the target; such targets may be designated as interfering objects. In some embodiments, there may be one or more targets to be detected. For example, when detecting targets in an image or video stream of an in-vehicle scene, the target to be detected may be one of a seat belt, a driver's face, a passenger's face, and the like, or any combination thereof. The interfering objects of each target may differ according to the properties (e.g., physical properties) of that target, as in the mapping sketched below. For example, the interfering objects of a seat belt may include a tie, the hanging rope of a rear-view-mirror pendant, a light shadow, a lane line, and the like; the interfering objects of a driver's face may include a passenger's face, a pedestrian's face, and the like.
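One simple way to record such per-target interferer categories is a plain mapping; the class names below merely restate the in-vehicle examples above and are otherwise illustrative:

```python
# Hypothetical registry of a priori interfering objects per target class.
INTERFERING_OBJECTS = {
    "seat_belt":   ["tie", "pendant_rope", "light_shadow", "lane_line"],
    "driver_face": ["passenger_face", "pedestrian_face"],
}
```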
In some embodiments, the interfering objects may include, but are not limited to, objects once misdetected by the present target detection model, objects misdetected by existing detection models in the target detection field, objects misdetected by detection algorithms and/or models in other fields, objects misdetected in manual classification, and the like, or any combination thereof. For example, if the target detection model disclosed in the present application erroneously detects a safety belt as the target to be detected (a tie) in one detection pass, the safety belt is designated as an interfering object of the tie. As another example, if an existing target detection model such as OverFeat erroneously detects a safety belt as the target tie, the safety belt is likewise designated as an interfering object of the tie. As another example, if target detection performed with the AdaBoost detection algorithm erroneously detects a safety belt as the target tie, the safety belt is also designated as an interfering object. Similarly, if a safety belt is manually classified or identified as the target tie, the safety belt is also designated as an interfering object of the target.
In some embodiments, the detection result may include the category identification of the target to be detected and the category identification of the interfering object. In its expressed form, the category identification may include an identification mode, a score, and the like. The identification may comprise a detection box. A detection box may enclose a detected object in the data to be detected, for example, the target to be detected, an interfering object of the target, or a background target. The detection result may include one or more detection boxes, with the target, the interfering objects, and/or the background targets each located in different boxes. The score may be the probability value with which the object in a detection box is judged to be the target to be detected or an interfering object, and may be 1, 0.99, 0.98, 0.97, or any other value between 0 and 1. For example, if the target to be detected is sought in the data to be detected and the score in the detection result is 0.98, the object in the detection box is the target to be detected with a probability of 98%. In some embodiments, the detection result may further include category-distinguishing information for the target and the interfering objects. For example, the target and the interfering objects may be marked in different ways, e.g., a green identification frame for the target and a red identification frame for an interfering object. As another example, a text description may be added near the identification frame to indicate whether the boxed object is the target or an interferer.
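A compact record type for one entry of such a detection result might look as follows; the field names are assumptions for illustration, not defined by the disclosure:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Detection:
    """One detected object in the data to be detected."""
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2) of the detection box
    category: str                   # "target" or "interferer"
    label: str                      # class name, e.g. "seat_belt" or "tie"
    score: float                    # probability in [0, 1] for this judgment

# A score of 0.98 means the boxed object is judged to be the target
# to be detected with 98% probability.
example = Detection(box=(120, 80, 180, 300), category="target",
                    label="seat_belt", score=0.98)
```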
In some embodiments, the data to be detected may be input to a target detection model to obtain the detection result. The target detection model may be obtained by training a related model with training data. The related model may be a classical learning model used in the field of target detection, such as the Deformable Parts Model (DPM), OverFeat, R-CNN, SPP-Net, Fast R-CNN, R-FCN, or DSOD. In some embodiments, the training data of the target detection model may be image data, video data, or a combination of both. The training data may include labeling information of the target to be detected and labeling information of the interfering objects of the target. For each picture in the training data and/or each frame of each video, the position information and classification information of the target and its interfering objects may be displayed on the picture (image) through identifiers such as marker boxes, characters, arrows, or any combination thereof. A position may be a coordinate or a set of coordinates of the target or an interfering object on the image. The classification information distinguishes the target from its interfering objects; for example, their positions may be displayed with identification frames of different colors or shapes, or characters near the identification frames may explain which is which. The identifiers, the associated position information, and the classification information of the target and its interfering objects together constitute the labeling information. For example, for an image of an in-vehicle scene, assuming the target is a seat belt and the interfering object is a tie, the seat belt may be highlighted in the image with a labeling frame (e.g., a green rectangular identification frame) whose coordinate range contains the coordinate set of the seat belt, and the tie may be highlighted with another labeling frame (e.g., a red rectangular identification frame) whose coordinate range contains the coordinate set of the tie. The labeling of the target and its interfering objects in the training data may be done manually or by a high-precision classifier, which is not limited in this application. A specific description of obtaining the target detection model can be found elsewhere in this application (e.g., FIG. 7) and is not repeated here.
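Concretely, the labeling information of one training image could be serialized along these lines; the schema and file path are assumptions for illustration:

```python
# Hypothetical annotation for one in-vehicle training image: the target
# (seat belt) and one interfering object (tie), each with a box, a class,
# and a frame color distinguishing the two categories.
annotation = {
    "image": "train/000137.jpg",
    "objects": [
        {"bbox": [120, 80, 180, 300], "label": "seat_belt",
         "category": "target", "frame_color": "green"},
        {"bbox": [200, 60, 240, 280], "label": "tie",
         "category": "interferer", "frame_color": "red"},
    ],
}
```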
In some embodiments, the target to be detected may be screened out based on the category of each detected object in the detection result. A detected object may be the target to be detected and/or an interfering object of the target. In some embodiments, the category of a detected object may be obtained by identifying the target within its identifier, either manually or by machine recognition. After identification is complete, the target to be detected can be obtained from the identifiers.
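The screening step itself can then be as simple as keeping only the detections whose category is the target, for example as follows (a sketch reusing the illustrative record format above; the score threshold is an assumption):

```python
def screen_targets(detections, score_threshold=0.5):
    """Keep detections classified as the target to be detected; discard
    interfering objects and low-confidence boxes."""
    return [d for d in detections
            if d["category"] == "target" and d["score"] >= score_threshold]

detections = [
    {"bbox": [120, 80, 180, 300], "label": "seat_belt",
     "category": "target", "score": 0.98},
    {"bbox": [200, 60, 240, 280], "label": "tie",
     "category": "interferer", "score": 0.85},
]
print(screen_targets(detections))  # only the seat belt remains
```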
The foregoing describes the present application and/or some other examples. The present application can be modified in various ways in light of the above. The subject matter disclosed herein can be implemented in various forms and examples, and the present application can be applied to a wide variety of applications. All applications, modifications, and variations claimed in the following claims are within the scope of this application.
FIG. 6 is a block diagram illustrating an exemplary object detection model training system 600, according to some embodiments of the invention. As shown, the object detection model training system 600 may include a training module 610. Training module 610 may include a preliminary training unit 612, a detection model test unit 614, and a retraining unit 616. In some embodiments, the object detection model training system 600 may be part of the processing engine 112.
The preliminary training unit 612 may train the initial model with the original training data to obtain an initial target detection model. In some embodiments, the raw training data may be image data, video data, or any combination thereof. The image data may include a plurality of images, and the video data may include a plurality of frames of video images. The original training data may include only the labeling information of the target to be detected; in other embodiments, it further includes the labeling information of a priori interfering objects. An a priori interfering object is a predetermined interfering object of the target to be detected. In some embodiments, the initial model may be a classical learning model, such as DPM, OverFeat, R-CNN, SPP-Net, Fast R-CNN, R-FCN, or DSOD. The training process of the initial model can be found in the prior art and is not described here. The trained model is designated as the initial target detection model. The detection model test unit 614 may be configured to perform target detection using the initial target detection model and output a detection result. The detection result may be the detection output of the initial target detection model for the objects in its input (e.g., newly acquired images and/or videos). In some embodiments, the detection result may be one or more detection boxes added to the input, each detection box enclosing a detected object, for example, the target to be detected, an interfering object of the target, or a background target. The detection result may also be that no detection box is added, i.e., the input and output of the model are identical and unchanged. In some embodiments, the detection result may further include a score indicating the probability that the object in a detection box is the target to be detected. The score may be 1, 0.99, 0.98, 0.97, or any other value between 0 and 1. For example, a detection result with a score of 0.98 indicates that the object in the detection box is the target to be detected with a probability of 98%.
The retraining unit 616 may acquire training data labeled with interfering objects based on the detection results, and train the initial target detection model with this data to obtain the target detection model. In some embodiments, the training data may include the original training data and may also include newly acquired data. The training data includes the labeling information of all interfering objects in the interfering object set, as well as the labeling information of the target to be detected and of the a priori interfering objects. In some embodiments, the training data labeled with interfering objects may be input into the initial target detection model and training continued. When a predetermined condition is satisfied, for example, the number of training samples reaches a predetermined number, the detection accuracy of the model exceeds a set accuracy, or the loss function value falls below a set value, the training process stops, and the final trained model is output as the target detection model.
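A sketch of how the retraining unit might assemble the relabeled training set, merging the original annotations with newly mined interferer annotations (the record format matches the illustrative annotation example above and is an assumption, not part of the disclosure):

```python
def assemble_retraining_data(original_data, mined_interferers):
    """Merge original annotations (target and a priori interferers) with
    interfering objects mined from the initial model's false detections."""
    by_image = {item["image"]: dict(item, objects=list(item["objects"]))
                for item in original_data}
    for det in mined_interferers:  # one mined false positive per record
        entry = by_image.setdefault(det["image"],
                                    {"image": det["image"], "objects": []})
        entry["objects"].append({"bbox": det["bbox"], "label": det["label"],
                                 "category": "interferer"})
    return list(by_image.values())
```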
FIG. 7 is an exemplary flow diagram illustrating the acquisition of a target detection model according to some embodiments of the invention. In some embodiments, the flow 700 may be performed by the object detection model training system 600 (e.g., the training module 610). In some embodiments, flow 700 may be performed by processing logic that may comprise hardware (e.g., circuitry, dedicated logic, programmable logic, microcode, etc.), software (instructions run on a processing device to perform hardware simulation), etc., or any combination thereof. One or more of the operations in the flow 700 for obtaining a target detection model shown in FIG. 7 may be implemented by the object detection system 100 shown in FIG. 1. For example, flow 700 may be stored in the storage device 120 in the form of instructions and invoked and/or executed by the processing engine 112 (e.g., the processor 220 of the computing device 200 shown in FIG. 2, or the central processor 340 of the mobile device 300 shown in FIG. 3).
At 710, raw training data may be acquired, and an initial model may be trained with the raw training data to obtain an initial target detection model. Operation 710 may be performed by the preliminary training unit 612. In some embodiments, the raw training data may be image data, video data, or any combination thereof. The image data may include a plurality of images, and the video data may include a plurality of frames of video images. The training data may be collected in advance or acquired in real time. In some embodiments, the images and/or video frames contained in the raw training data may be numbered with symbols, for example, numbers, letters, or combinations thereof. In some embodiments, the raw training data may include labeling information of the target to be detected; for example, for any image in the image data or any frame in the video data, a square frame, rectangular frame, circular frame, or any other identifier may be used to mark the target to be detected and display its position in the image (for example, coordinate information and/or coordinate-set information). For instance, one target may be marked with a green rectangular identification frame and another with a green circular identification frame. In some embodiments, the labeling of the target may be done manually or by a high-precision classifier, which is not limited in this application. In some embodiments, the raw training data further includes labeling information of a priori interfering objects. An a priori interfering object is a predetermined interfering object of the target to be detected. For example, assuming the target is a tie, an object with the same and/or similar properties as a tie, such as a safety belt, may be determined to be an a priori interfering object, and the safety belt is then marked in the raw data. A priori interfering objects may be obtained from a statistical analysis of the detection results of existing target detection algorithms and/or models, or by other methods; this application is not particularly limited in this respect. The labeling of a priori interfering objects is similar to that of the target and is not repeated here.
In some embodiments, the initial model may be a classical learning model, such as DPM, OverFeat, R-CNN, SPP-Net, Fast R-CNN, R-FCN, or DSOD. The initial model may have a number of initial model parameters, e.g., learning rate, hyper-parameters, etc. The initial model parameters may take system default values or may be adjusted and modified according to the actual application. The training process of the initial model can be found in the prior art and is not described here. When a predetermined condition is met, for example, the number of training samples reaches a predetermined number, the detection accuracy of the model exceeds a predetermined accuracy threshold, or the loss function value is less than a predetermined value, the training process stops. The trained model is designated as the initial target detection model, and subsequent operations, such as performance testing of the initial target detection model, are performed.
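The stopping logic described here could be expressed roughly as follows; the thresholds and the `step`/`evaluate` callables are illustrative placeholders, not part of the disclosure:

```python
def train_until_converged(model, data, step, evaluate,
                          max_samples=100_000,
                          target_accuracy=0.95,
                          loss_floor=0.01):
    """Run training steps until one predetermined condition holds: enough
    samples seen, accuracy above a threshold, or loss below a set value."""
    samples_seen = 0
    while True:
        loss, batch_size = step(model, data)  # one optimization step
        samples_seen += batch_size
        if (samples_seen >= max_samples
                or evaluate(model) >= target_accuracy
                or loss <= loss_floor):
            return model
```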
In 720, target detection may be performed using the initial target detection model, and a detection result may be output. Operation 720 may be performed by the detection model test unit 614. In some embodiments, at detection time the input of the initial target detection model may be newly acquired or previously prepared images and/or videos, for example, a picture or video acquired in real time by an acquisition device (e.g., a camera) of the terminal 140, or a picture or video different from the original training data stored in advance in the storage device 120. The detection result may be the detection output of the initial target detection model for the objects in the input. In some embodiments, the detection result may be one or more detection boxes added to the input, each detection box enclosing a detected object, for example, the target to be detected, an interfering object of the target, or a background target. The detection result may also be that no detection box is added, i.e., the input and output of the model are identical and unchanged. In some embodiments, the detection result may further include a score indicating the probability that the object in a detection box is the target to be detected. The score may be 1, 0.99, 0.98, 0.97, or any other value between 0 and 1. For example, a detection result with a score of 0.98 indicates that the object in the detection box is the target to be detected with a probability of 98%.
In 730, training data after labeling the interfering object based on the detection result may be obtained. Operation 730 may be performed by retraining unit 616. In some embodiments, the training data may include the original training data, and may also include newly acquired training data.
In some embodiments, the target detection process may falsely detect other objects as the target to be detected; that is, the object in a detection box is actually not the target to be detected. In this case, the non-target object may be designated as an interfering object of the target to be detected. In some embodiments, all detection results in which false detection occurred during the detection process may be obtained and analyzed to obtain all interfering objects of the target to be detected. A set of interfering objects may be generated based on the result of sorting all the interfering objects. In some embodiments, the sorting may be based on a policy. The policy may include a detection result score, a sample number, or a combination thereof. The detection result score may be the similarity score output by the target detection model when an interfering object is detected as the target to be detected in a detection sample. For example, for a detection sample whose detection result is wrong, it has been determined that the object in the detection box is an interfering object of the target to be detected. Due to the structure, parameters, and/or training condition of the target detection model, the model determined that interfering object to be the target to be detected in that detection sample. The similarity score is the probability, given by the target detection model, that the interfering object is the target to be detected, and can be any value from 0 to 1. This value may be assigned as the detection result score of the detection sample. For example, assuming that a safety belt needs to be detected, for a certain detection sample the target detection model detects a tie as a safety belt and gives a probability of 85% that the detected object (i.e., the tie) is a safety belt. Then 85% may be the detection result score of that detection sample. The sample number may be the number of detection samples, among all detection samples, in which the target detection model detects the same interfering object as the target to be detected. For example, assuming that the target to be detected is a safety belt, after detection there are 5 detection samples with wrong detection results: in 3 of them the target detection model detects a tie as a safety belt, in 1 a sling is detected as a safety belt, and in 1 a light shadow is detected as a safety belt. The sample numbers corresponding to the tie, the sling, and the light shadow are then 3, 1, and 1, respectively. In some embodiments, the interfering objects may be sorted by detection result score or sample number, for example, in descending order, to obtain a sorting result. The sorting result may express the degree to which each interfering object is falsely detected by the target detection model during target detection. For example, assuming that the target to be detected is a safety belt, the sorting result in descending order of sample number is tie, sling, light shadow: the tie is most often falsely detected as a safety belt during target detection, followed by the sling and the light shadow.
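Continuing the seat-belt example, a minimal sketch of tallying and sorting the interfering objects might look as follows; the record format and values are illustrative assumptions.

```python
from collections import Counter

# Hypothetical false-detection records: in each wrongly detected sample, the
# model detected the named object as a safety belt with the given similarity
# score (the detection result score of that sample).
false_detections = [
    {"actual": "tie", "score": 0.85}, {"actual": "tie", "score": 0.79},
    {"actual": "tie", "score": 0.91}, {"actual": "sling", "score": 0.66},
    {"actual": "light_shadow", "score": 0.58},
]

# Sort interfering objects by sample number in descending order.
by_count = Counter(r["actual"] for r in false_detections)
ranking = [obj for obj, _ in by_count.most_common()]
print(ranking)  # ['tie', 'sling', 'light_shadow']
```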
In some embodiments, according to the sorting result, all or part of the interfering objects of the target to be detected may be selected and combined with the target to be detected to generate the interference object set. For example, the first two interfering objects in the sorting result may be selected, or all of them may be selected, and then combined with the target to be detected to generate the interference object set; a selection of this kind is sketched below. In some embodiments, the number of selected interfering objects may be a default value, or may be adjusted for different targets to be detected, which is not limited in this application. In some embodiments, the interference object set may be expressed in a vector-like form, e.g., [(safety belt, tie, sling)], or as a list. The expression form of the interference object set is not limited in this application.
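Continuing the sketch above (which produced `ranking`), selecting the first two interfering objects and combining them with the target to be detected could be written as follows; k = 2 is an illustrative default, not a value fixed by this application.

```python
# Take the top-k interfering objects from the ranking and prepend the target
# to be detected, yielding the vector-like interference object set.
k = 2
interference_set = ["safety_belt"] + ranking[:k]
print(interference_set)  # ['safety_belt', 'tie', 'sling']
```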
In some embodiments, the training data may be the original training data with the interfering objects labeled according to the interference object set; the training data may also be newly acquired training data to which labeling data of the interfering objects has been added. That is to say, the training data may include the labeling information of all interfering objects in the interference object set, as well as the labeling information of the target to be detected and the labeling information of the prior interfering objects. The training module 610 may obtain the training data labeled with interfering objects based on the detection result through a communication port (e.g., the input/output 260).
It should be noted that the above description is for illustrative purposes only, and is not intended to limit the scope of the present application. For example, the number of interfering objects may be the same or different for each target. For another example, the expression form of each target to be measured and its interfering object may be the same or different. Such variations and modifications are intended to be within the scope of the present application.
In 740, the initial target detection model may be trained using the training data labeled with the interfering objects, and a target detection model may be obtained. Operation 740 may be performed by the retraining unit 616. In some embodiments, the training data labeled with the interfering objects may be input into the initial target detection model, and training may continue. During training, the model parameters (such as the learning rate and hyper-parameters) can be further adjusted to improve the precision of the model. When a preset condition is met, for example, the number of training samples reaches a preset number, the detection accuracy of the model is greater than a set accuracy, or the value of the loss function (Loss Function) is less than a set value, the training process will stop, and the trained model will be output as the target detection improvement model. The detection accuracy of the target detection improvement model is higher than that of the initial target detection model. The determination of the detection accuracy of the target detection improvement model is described elsewhere in this application (e.g., FIG. 8) and is not detailed here.
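As a schematic sketch only, the retraining driver of operation 740 might be organized as below; `train_step` and the stopping thresholds are caller-supplied assumptions, not an API defined by this application.

```python
def retrain(model, labeled_data, train_step,
            max_epochs: int = 10,
            accuracy_threshold: float = 0.95,
            loss_threshold: float = 0.05):
    """Continue training the initial target detection model on training data
    labeled with interfering objects. `train_step` performs one training pass
    and returns (accuracy, loss); it stands in for whatever training routine
    the underlying framework provides."""
    for _ in range(max_epochs):
        accuracy, loss = train_step(model, labeled_data)
        if accuracy > accuracy_threshold or loss < loss_threshold:
            break  # a preset condition is met; stop training
    return model  # the target detection improvement model
```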
FIG. 8 is an exemplary flow chart illustrating the determination of the detection accuracy of the target detection improvement model according to some embodiments of the invention. In some embodiments, flow 800 may be performed by processing logic that may comprise hardware (e.g., circuitry, dedicated logic, programmable logic, microcode, etc.), software (instructions run on a processing device to perform hardware simulation), etc., or any combination thereof. One or more operations of the process 800 illustrated in FIG. 8 may be implemented by the target detection system 100 illustrated in FIG. 1. For example, the flow 800 may be stored in the storage device 120 in the form of instructions and executed and/or invoked by the processing engine 112 (e.g., the processor 220 of the computing device 200 shown in FIG. 2, or the central processor 340 of the mobile device 300 shown in FIG. 3).
In 810, verification data may be used to detect whether a first detection accuracy of the target detection improvement model exceeds a first threshold. In some embodiments, the verification data may be newly acquired data, including image data, video data, and the like. The verification data and the original training data are independent and identically distributed, and their intersection is empty; that is, they follow the same distribution, do not influence each other, and share no data. The first detection accuracy may be the ratio of the verification data from which the target detection improvement model accurately detects the target to be detected to all the verification data. For example, assuming that the verification data includes 100 images and a safety belt needs to be detected from them, if the target detection improvement model correctly detects a safety belt in 97 images, the first detection accuracy may be 97%. In some embodiments, the required first detection accuracy may be a preset value of the target detection system 100, or may be adjusted according to different application scenarios. For example, a fixed requirement may apply to any target to be detected, or different first detection accuracy requirements may be determined for different targets to be detected. In some embodiments, the first threshold may be a preset value of the target detection system 100, for example, 95%, and may also be set manually to meet the requirements of different application scenarios.
At 820, it may be determined whether the first detection accuracy exceeds the first threshold. If the first detection accuracy is not less than the first threshold, it may be determined that the detection accuracy of the target detection improvement model has reached the preset requirement, and the flow 800 proceeds to operation 830. In 830, the target detection improvement model may be used as the final target detection improvement model, which can be used directly for target detection without further training or optimization. If the first detection accuracy is less than the first threshold, it may be determined that the detection accuracy of the target detection improvement model is insufficient and has not reached the preset requirement, and the model needs to be trained and optimized further; the process 800 then proceeds to operation 840. At 840, the training data of the target detection improvement model may be updated and training may continue. In some embodiments, based on the detection results of the target detection improvement model on the verification data, the interfering objects of the target to be detected may be obtained again, interference object labeling may be further performed on the training data, and the labeled training data may be used to continue training the target detection improvement model. For the specific steps, reference may be made to the descriptions of FIG. 6 and FIG. 7 in this application, which are not repeated here. After retraining is completed, the detection accuracy of the retrained target detection improvement model may again be verified; if it meets the preset requirement (for example, the detection accuracy exceeds the first threshold), the trained model may be used as the final target detection improvement model. If it does not (for example, the detection accuracy is less than the first threshold), the retraining step may be repeated until the detection accuracy of the model meets the preset requirement.
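By way of illustration only, operations 810 through 840 can be read as the accuracy check and loop sketched below; `predict` and the sample format are assumptions made for the example.

```python
FIRST_THRESHOLD = 0.95  # illustrative preset value of the first threshold

def first_detection_accuracy(predict, verification_set) -> float:
    """Ratio of verification samples in which the target to be detected is
    correctly detected, e.g., 97 correct images out of 100 gives 0.97.
    `predict` maps an image to the detected label; `verification_set` is a
    sequence of (image, ground_truth_label) pairs -- both are assumptions."""
    correct = sum(1 for image, truth in verification_set
                  if predict(image) == truth)
    return correct / len(verification_set)

def accuracy_meets_requirement(predict, verification_set) -> bool:
    # Operation 820: if accuracy is not less than the threshold, the model is
    # taken as final (operation 830); otherwise interference object relabeling
    # and retraining (operation 840) are repeated before checking again.
    return first_detection_accuracy(predict, verification_set) >= FIRST_THRESHOLD
```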
It should be noted that the above description is merely for convenience and should not be taken as limiting the scope of the present application. It will be understood by those skilled in the art that, having the benefit of the teachings of this system, various modifications and changes in form and detail may be made to the above method and system, and to the fields in which they are applied, without departing from these teachings.
Compared with the prior art, the beneficial effects that the above embodiments of the present application may bring include, but are not limited to:
(1) The problem that fine features of the target to be detected and the interfering object are difficult to construct is avoided;
(2) Only the target to be detected and its interfering objects need to be distinguished and labeled in the model's original training data; after training, the model can effectively distinguish the target to be detected from its interfering objects, removing the influence of the interfering objects on the target to be detected and achieving a better detection effect.
It is to be noted that different embodiments may produce different advantages, and in different embodiments, any one or combination of the above advantages may be produced, or any other advantages may be obtained.
The foregoing describes the present application and/or some other examples. The present application can be modified in various ways in light of the above. The subject matter disclosed herein can be implemented in various forms and examples, and the present application can be applied to a wide variety of applications. All applications, modifications and variations that are claimed in the following claims are within the scope of this application.
Also, this application uses specific language to describe embodiments of the application. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with at least one embodiment of the present application is included in at least one embodiment of the present application. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the present application may be combined as appropriate.
Those skilled in the art will appreciate that various modifications and improvements may be made to the disclosure herein. For example, the different system components described above may be implemented by hardware devices, but may also be implemented by software solutions only, for example, by installing the described system on an existing server. Further, the solutions disclosed herein may be implemented via firmware, a firmware/software combination, a firmware/hardware combination, or a hardware/firmware/software combination.
All or a portion of the software may sometimes communicate over a network, such as the Internet or other communication networks. Such communication enables software to be loaded from one computer device or processor to another, for example, from a management server or host computer of the target detection system to the hardware platform of a computing environment, or to another computing environment implementing the system or providing similar functions related to target detection. Accordingly, other media capable of transferring software elements, such as optical, electric, and electromagnetic waves propagating through cables, optical cables, or the air, may also be used as physical connections between local devices. The physical media used for such carrier waves, such as electric cables, wireless links, or optical cables, may likewise be considered media carrying the software. As used herein, unless limited to a tangible "storage" medium, terms referring to a computer or machine "readable medium" refer to media that participate in the execution of instructions by a processor.
Computer program code required for the operation of various portions of the present application may be written in any one or more programming languages, including object-oriented programming languages such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, and Python, conventional procedural programming languages such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, and ABAP, dynamic programming languages such as Python, Ruby, and Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any network, such as a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet), or in a cloud computing environment, or as a service such as software as a service (SaaS).
Additionally, the order in which elements and sequences of the processes described herein are processed, the use of alphanumeric characters, or the use of other designations, is not intended to limit the order of the processes and methods described herein, unless explicitly claimed. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the application, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not to be interpreted as requiring more features than are expressly recited in the claims. Indeed, claimed subject matter may lie in fewer than all features of a single embodiment disclosed above.
Some embodiments use numbers to describe attributes, quantities, and the like. It should be understood that such numbers are, in some instances, modified by the terms "about", "approximately", or "substantially". Unless otherwise indicated, "about", "approximately", or "substantially" indicates that the number allows a variation of ±20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may vary depending upon the desired properties of the individual embodiments. In some embodiments, the numerical parameters should take into account the specified significant digits and adopt a general method of preserving digits. Notwithstanding that the numerical ranges and parameters setting forth the broad scope in some embodiments of the application are approximations, in specific examples such numerical values are set as precisely as practicable.
Each patent, patent application, patent application publication, and other material, such as articles, books, specifications, publications, and documents, cited in this application is hereby incorporated by reference in its entirety, except for application history documents that are inconsistent with or conflict with the contents of this application, and documents (currently or later appended to this application) that limit the broadest scope of the claims of this application. It is noted that if the description, definition, and/or use of a term in the materials accompanying this application is inconsistent with or contrary to what is stated in this application, the description, definition, and/or use of the term in this application shall prevail.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments of the present application. Other variations are also possible within the scope of the present application. Thus, by way of example and not limitation, alternative configurations of the embodiments of the present application may be viewed as consistent with the teachings of the present application. Accordingly, the embodiments of the present application are not limited to those explicitly described and depicted herein.

Claims (12)

1. A target detection method, comprising:
acquiring data to be detected;
detecting the data to be detected, and outputting a detection result containing a class identification of a target to be detected and a class identification of an interference object, thereby obtaining the target to be detected;
wherein the interference object has been falsely identified as the target to be detected at least once in a target detection process.
2. The target detection method according to claim 1, wherein the data to be detected is image data, and the detection result is image data in which the target to be detected and the interference object are labeled using different identification modes.
3. The method according to claim 1, wherein the detecting the data to be detected comprises:
detecting the data to be detected by using a target detection model;
wherein the target detection model is obtained based on the following training method:
training an initial model by using training data containing labeling information of the target to be detected and labeling information of the interference object, to obtain the target detection model;
wherein the interference object has been falsely identified as the target to be detected at least once in a target detection process.
4. The method of claim 3, wherein the training method further comprises:
acquiring original training data, and training the initial model by using the original training data to obtain an initial target detection model;
carrying out target detection by using the initial target detection model and outputting a detection result;
acquiring training data labeled with the interference object based on the detection result; and
training the initial target detection model by using the training data labeled with the interference object, to obtain the target detection model;
wherein the original training data at least comprises the labeling information of the target to be detected.
5. The method of claim 4, wherein the original training data further comprises labeling information of a priori interference object;
wherein the prior interference object has been falsely labeled as the target to be detected at least once in a target detection process other than the training process of the target detection model.
6. A target detection system, characterized by comprising an acquisition module and a detection module, wherein
the acquisition module is used for acquiring data to be detected;
the detection module is used for detecting the data to be detected and outputting a detection result containing a class identification of a target to be detected and a class identification of an interference object, so as to obtain the target to be detected;
wherein the interference object has been falsely identified as the target to be detected at least once in a target detection process.
7. The system of claim 6, wherein the data to be detected is image data, and the detection result is image data in which the target to be detected and the interference object are labeled using different identification modes.
8. The system of claim 6, wherein the detection module is further used for detecting the data to be detected by using a target detection model;
the system further comprises a training module;
the training module is used for training an initial model by using training data containing labeling information of the target to be detected and labeling information of the interference object, to obtain the target detection model;
wherein the interference object has been falsely identified as the target to be detected at least once in a target detection process.
9. The system of claim 8, wherein the training module further comprises:
an initial training unit, used for training the initial model by using original training data to obtain an initial target detection model;
a detection model testing unit, used for carrying out target detection by using the initial target detection model and outputting a detection result; and
a retraining unit, used for acquiring training data labeled with the interference object based on the detection result, and training the initial target detection model by using the training data labeled with the interference object to obtain the target detection model;
wherein the original training data at least comprises the labeling information of the target to be detected.
10. The system of claim 9, wherein the original training data further comprises labeling information of a priori interference object;
wherein the prior interference object has been falsely labeled as the target to be detected at least once in a target detection process other than the training process of the target detection model.
11. A target detection apparatus, comprising at least one processor and at least one memory, wherein
the at least one memory is configured to store computer instructions; and
the at least one processor is configured to execute at least some of the computer instructions to implement the method according to any one of claims 1-5.
12. A computer-readable storage medium storing computer instructions, wherein when at least some of the computer instructions are executed by a processor, the method according to any one of claims 1-5 is performed.
