CN112541948A - Object detection method and device, terminal equipment and storage medium - Google Patents

Object detection method and device, terminal equipment and storage medium

Info

Publication number
CN112541948A
CN112541948A
Authority
CN
China
Prior art keywords
image
rgbd
object detection
rgbd image
detection result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011453067.2A
Other languages
Chinese (zh)
Other versions
CN112541948B (en)
Inventor
黄冠文
程骏
庞建新
谭欢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ubtech Robotics Corp
Original Assignee
Ubtech Robotics Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ubtech Robotics Corp filed Critical Ubtech Robotics Corp
Priority to CN202011453067.2A
Publication of CN112541948A
Application granted
Publication of CN112541948B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/90 Determination of colour characteristics

Abstract

The invention belongs to the technical field of object detection and provides an object detection method, an object detection apparatus, a terminal device, and a storage medium. An RGBD image is preprocessed to determine whether it is usable, which effectively improves the success rate of object detection; when the RGBD image is usable, object detection is performed on the preprocessed RGBD image to determine whether an object is present; when an object is present in the preprocessed RGBD image, an object detection result comprising the three-dimensional position information and category information of the object is output. The three-dimensional position and category of an object in an RGBD image can thus be obtained without three-dimensional modeling of the object, the method is fast, and it can be widely applied to detecting objects of many categories.

Description

Object detection method and device, terminal equipment and storage medium
Technical Field
The invention belongs to the technical field of object detection, and particularly relates to an object detection method, an object detection device, terminal equipment and a storage medium.
Background
Object detection is one of the classic problems in computer vision: its task is to mark the position of each object in an image with a bounding box and to give the object's class. From traditional frameworks built on hand-crafted features and shallow classifiers to end-to-end detection frameworks based on deep learning, object detection has matured step by step. At present, object detection technology is widely applied in robots to realize hand-eye coordinated grasping: when a robot grasps an object with hand-eye coordination, it needs to know both where the object is and what class it belongs to, so that it can grasp the specified object at the corresponding position. Traditional object detection methods require three-dimensional modeling of the object, are time-consuming, and support detection of only a small number of object classes.
Disclosure of Invention
In view of this, embodiments of the present invention provide an object detection method, an object detection apparatus, a terminal device, and a storage medium, so as to address the problems that conventional object detection methods require three-dimensional modeling of the object, are time-consuming, and support detection of only a small number of object classes.
A first aspect of an embodiment of the present invention provides an object detection method, including:
preprocessing an RGBD image to determine whether the RGBD image is usable;
when the RGBD image is available, performing object detection on the preprocessed RGBD image, and determining whether an object exists in the preprocessed RGBD image;
when an object exists in the preprocessed RGBD image, outputting an object detection result, wherein the object detection result comprises three-dimensional position information and category information of the object.
A second aspect of an embodiment of the present invention provides an object detection apparatus, including:
the image preprocessing unit is used for preprocessing the RGBD image and determining whether the RGBD image is available;
the object detection unit is used for carrying out object detection on the preprocessed RGBD image when the RGBD image is available and determining whether an object exists in the preprocessed RGBD image;
and the result output unit is used for outputting an object detection result when an object exists in the preprocessed RGBD image, wherein the object detection result comprises three-dimensional position information and category information of the object.
A third aspect of the embodiments of the present invention provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the object detection method according to the first aspect of the embodiments of the present invention when executing the computer program.
A fourth aspect of embodiments of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the object detection method according to the first aspect of embodiments of the present invention.
According to the object detection method provided by the first aspect of the embodiments of the present invention, the RGBD image is preprocessed to determine whether it is usable, which effectively improves the success rate of object detection; when the RGBD image is usable, object detection is performed on the preprocessed RGBD image to determine whether an object is present; when an object is present in the preprocessed RGBD image, an object detection result comprising the three-dimensional position information and category information of the object is output. The three-dimensional position and category of an object in an RGBD image can thus be obtained without three-dimensional modeling of the object, the method is fast, and it can be widely applied to detecting objects of many categories.
It is understood that the beneficial effects of the second to fourth aspects can be seen from the description of the first aspect, and are not described herein again.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed for the embodiments or the prior-art descriptions are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a first method for detecting an object according to an embodiment of the present invention;
fig. 2 is a second schematic flow chart of an object detection method according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of a third method for detecting an object according to an embodiment of the present invention;
fig. 4 is a fourth flowchart illustrating an object detection method according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an object detection apparatus according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted contextually to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]", or "in response to detecting [the described condition or event]".
Furthermore, in the description of the present invention and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present invention. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
The object detection method provided by the embodiments of the present invention can be applied to terminal devices that are equipped with, or can be communicatively connected to, a manipulator and an RGBD camera, such as robots, tablet computers, notebook computers, netbooks, personal digital assistants (PDAs), personal computers (PCs), industrial personal computers, and servers. The embodiments of the present invention place no limitation on the specific type of the terminal device.
In application, the camera may be fixed relative to the manipulator, or mounted on the manipulator so that it moves with the manipulator during its motion, together with the manipulator forming an eye-in-hand or eye-to-hand manipulator system.
As shown in fig. 1, an object detection method provided in an embodiment of the present invention includes the following steps S101 to S103 executed by a processor of a terminal device:
step S101, preprocessing the RGBD image, and determining whether the RGBD image is available.
In application, the RGBD image may be a frame of RGBD image acquired by controlling the RGBD camera when the terminal device controls the manipulator and the RGBD camera to perform the hand-eye coordination operation, or may be a frame of RGBD image acquired and stored by the terminal device in advance.
In one embodiment, before step S101, the method includes:
an RGBD image of a scene is acquired by an RGBD camera.
In application, the RGBD camera can be controlled to acquire RGBD images of any scene within its field of view, for example, a production line, an express-delivery warehouse, a worktable, the ground, or any other scene where hand-eye coordination work can take place. The scene may or may not contain an object; that is, the RGBD image may contain both an object and a background region, or only one of the two.
In application, when the RGBD image is unusable, an object cannot be detected (or can only be detected with difficulty) from it. In this case the terminal device may automatically control the RGBD camera to reacquire the RGBD image, or may wait for the user to input an instruction for the next operation and then proceed according to that instruction.
In one embodiment, after step S101, the method includes:
when the RGBD image is not available, adjusting parameters of the RGBD camera, and re-acquiring an RGBD image of the scene through the RGBD camera;
or, when the RGBD image is unavailable, acquiring an RGBD image of a next scene through the RGBD camera.
In application, when the RGBD image is unusable, the cause may be that the resolution of the RGBD camera is too low or that performance parameters such as the focal length are inappropriate; in that case, performance parameters such as resolution and focal length can be adjusted to improve the camera's performance. If the re-acquired RGBD image of the same scene is still unusable after the performance parameters have been adjusted, the RGBD camera can be controlled to acquire an RGBD image of the next scene instead.
In one embodiment, after step S101, the method includes:
when the RGBD image is unavailable, outputting prompt information for representing that the RGBD image is unavailable.
In application, the prompt information indicating that the RGBD image is unusable may be output in any human-computer interaction mode supported by the terminal device, for example, sound output by a voice device, text or images displayed on a display screen, light indications shown by an indicator lamp, or somatosensory feedback output by the manipulator or a vibration motor.
In application, before object detection is performed on the RGBD image, the RGBD image may be preprocessed to determine whether it is usable, so as to facilitate subsequent object detection; if it is not usable, prompt information indicating this may be output. Whether the RGBD image is usable can be determined from its image characteristics: when the image characteristics meet the requirements, the RGBD image is determined to be usable; otherwise it is determined to be unusable.
As shown in fig. 2, in one embodiment, step S101 includes the following steps S201 to S203:
step S201, converting the RGBD image into an RGB image.
In application, since image characteristics of the RGBD image such as sharpness, brightness, chromaticity, and resolution relate only to the image information of the three RGB channels, the RGBD image can be converted into an RGB image by removing its depth information, which facilitates detection of the characteristics related to the RGB channels.
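The conversion described above amounts to dropping the depth channel. A minimal sketch in Python/NumPy (the H x W x 4 array layout with depth as the last channel is an assumption; actual RGBD layouts vary by camera SDK):

```python
import numpy as np

def rgbd_to_rgb(rgbd: np.ndarray) -> np.ndarray:
    """Drop the depth channel of an H x W x 4 RGBD array, keeping R, G, B."""
    if rgbd.ndim != 3 or rgbd.shape[2] < 4:
        raise ValueError("expected an H x W x 4 RGBD array")
    return rgbd[:, :, :3]  # depth (last channel) is discarded
```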
Step S202, detecting the image characteristics of the RGB image; wherein the image characteristics include at least one of sharpness, chrominance, luminance, and resolution;
step S203, when the image characteristic detection of the RGB image passes, determining that the RGBD image is available.
In application, the image characteristics may include, but are not limited to, sharpness, brightness, chromaticity, size, and resolution; if any of them fails to meet the requirements, the RGBD image may be determined to be unusable and prompt information indicating this may be output. When at least two of sharpness, chromaticity, brightness, and resolution are to be detected, each characteristic of the RGB image can be checked in turn: when one characteristic meets the requirements, the next characteristic is checked; if some characteristic does not meet the requirements, the RGBD image is directly determined to be unusable and the prompt information is output; the RGBD image is determined to be usable only when all the characteristics meet the requirements (that is, the detection passes).
As shown in fig. 3, in one embodiment, before step S202, the following steps S301 and S302 are included:
step S301, cutting the RGB image to intercept the RGB image in a preset area in the RGB image;
step S302, zooming the RGB image of the preset area to obtain the RGB image with the preset size.
In application, when the terminal device performs object detection, usually only objects in a preset region of the RGBD camera's field of view are detected rather than all regions of the field of view, so the RGB image can be cropped to remove the irrelevant background and keep only the image of the preset region. The preset region may be the central region of the field of view. Because the object detection algorithm adopted by the terminal device usually supports only images of a fixed size, the RGB image of the preset region then needs to be scaled to obtain an RGB image of the preset size supported by the algorithm.
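The crop-then-scale preprocessing of steps S301 and S302 can be sketched as follows. The center crop, the nearest-neighbour resize, and the 416 x 416 target size (typical for YOLO-family detectors) are illustrative assumptions, not values specified by the patent:

```python
import numpy as np

def crop_center(img: np.ndarray, crop_h: int, crop_w: int) -> np.ndarray:
    # Keep only the preset central region of the field of view (step S301).
    h, w = img.shape[:2]
    top, left = (h - crop_h) // 2, (w - crop_w) // 2
    return img[top:top + crop_h, left:left + crop_w]

def resize_nearest(img: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    # Nearest-neighbour scaling to the fixed size the detector expects (step S302).
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h   # source row for each output row
    cols = np.arange(out_w) * w // out_w   # source column for each output column
    return img[rows[:, None], cols]
```

In practice a library resize with interpolation (e.g. bilinear) would normally replace `resize_nearest`; the version above only avoids external dependencies.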
As shown in fig. 3, in one embodiment, step S202 includes the following steps S303 to S305:
step S303, detecting the definition of the RGB image;
step S304, when the definition of the RGB image is greater than a preset definition threshold value, detecting the chromaticity of the RGB image;
step S305, when the chromaticity of the RGB image is larger than a preset chromaticity threshold value, detecting the brightness of the RGB image;
step S203 includes:
step S306, when the brightness of the RGB image is larger than a preset brightness threshold value, determining that the RGBD image is available.
In application, the image features may include sharpness, chromaticity, and brightness, detected in that order: when the previous feature satisfies its threshold requirement, the next feature is detected; when any feature fails its threshold requirement, the RGBD image may be determined to be unusable and prompt information indicating this is output.
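A sequential quality gate of this kind might look like the following sketch. The Laplacian-variance sharpness measure, the max-minus-min chroma proxy, and all thresholds are illustrative assumptions; the patent does not specify which metrics are used:

```python
import numpy as np

def laplacian_variance(gray: np.ndarray) -> float:
    # Variance of a 4-neighbour Laplacian response: a common sharpness proxy.
    lap = (-4.0 * gray[1:-1, 1:-1] + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return float(lap.var())

def image_usable(rgb: np.ndarray, sharp_thr: float, chroma_thr: float,
                 bright_thr: float) -> bool:
    """Check sharpness, then chromaticity, then brightness, failing fast."""
    gray = rgb.mean(axis=2)
    if laplacian_variance(gray) <= sharp_thr:                    # sharpness gate
        return False
    chroma = float((rgb.max(axis=2) - rgb.min(axis=2)).mean())   # crude saturation proxy
    if chroma <= chroma_thr:                                     # chromaticity gate
        return False
    return bool(gray.mean() > bright_thr)                        # brightness gate
```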
In application, when a certain feature of the RGBD image is unsatisfactory, corresponding optimization processing may be performed on the image to improve that feature, after which the feature is checked again; if it still fails, the RGBD image is determined to be unusable and the prompt information is output.
As shown in fig. 3, in one embodiment, step S101 further includes the following steps performed after step S305:
step S307, when the brightness of the RGB image is smaller than or equal to a preset brightness threshold and within a preset brightness range, performing high dynamic illumination rendering on the RGB image to obtain a high dynamic range image, and determining that the RGBD image is available.
In application, when the brightness of the RGB image does not meet the threshold requirement (that is, it is less than or equal to the preset brightness threshold), it may be further determined whether the brightness lies within the range where improvement processing is possible (that is, within the preset brightness range); if so, the brightness can be improved so that the brightness-enhanced RGB image becomes usable. When the brightness is less than or equal to the preset brightness threshold and outside the preset brightness range, the RGBD image is determined to be unusable, and prompt information indicating this is output.
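The three-way brightness branch (pass / improvable / unusable) can be sketched as below; the mean-brightness statistic, the gamma lift standing in for HDR rendering, and both thresholds are all assumptions made for illustration:

```python
import numpy as np

def brightness_gate(rgb: np.ndarray, bright_thr: float = 100.0,
                    range_low: float = 40.0):
    """Return (possibly enhanced image, usable flag) per the brightness branch."""
    mean_b = float(rgb.mean())
    if mean_b > bright_thr:                    # bright enough: usable as-is
        return rgb, True
    if range_low <= mean_b <= bright_thr:      # dim but within the improvable range
        lifted = 255.0 * (rgb / 255.0) ** 0.5  # gamma lift, a stand-in for HDR rendering
        return lifted, True
    return rgb, False                          # too dark to rescue: unusable
```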
Step S102, when the RGBD image is available, performing object detection on the preprocessed RGBD image, and determining whether an object exists in the preprocessed RGBD image.
In application, after the RGBD image has been preprocessed and determined to be usable, object detection is performed on the preprocessed RGBD image (that is, the RGB image of the preset size) to determine whether an object is present. When no object is present in the preprocessed RGBD image, the subsequent object-grasping operation need not be executed; the terminal device may then automatically control the RGBD camera to acquire an RGBD image of the next scene, or wait for the user to input an instruction for the next operation and proceed according to it.
In one embodiment, after step S102, the method includes:
when no object exists in the preprocessed RGBD image, acquiring an RGBD image of a next scene through the RGBD camera.
In one embodiment, after step S102, the method includes:
when no object exists in the preprocessed RGBD image, outputting prompt information representing that no object exists in the preprocessed RGBD image.
In application, the prompt information indicating that no object is present in the preprocessed RGBD image may be output in any human-computer interaction mode supported by the terminal device, for example, sound output by a voice device, text or images displayed on a display screen, light indications shown by an indicator lamp, or somatosensory feedback output by the manipulator or a vibration motor.
As shown in fig. 4, in one embodiment, step S102 includes the following steps S401 and S402:
step S401, when the RGBD image is available, performing object detection on the preprocessed RGBD image to obtain at least one first detection result, wherein the first detection result comprises an object frame, a category corresponding to the object frame and a confidence coefficient of the category;
step S402, when at least one second detection result with retention reliability larger than a preset confidence threshold exists in the at least one first detection result, determining that an object exists in the RGBD image after preprocessing.
In application, the method for determining whether an object exists in the preprocessed RGBD image may be to perform object detection on the preprocessed RGBD image to obtain at least one first detection result, and if only one first detection result is obtained and the result is empty, it may be determined that an object does not exist in the preprocessed RGBD image, and prompt information indicating that the object does not exist in the preprocessed RGBD image is output. And if the obtained first detection result is not empty, further determining whether an object exists in the preprocessed RGBD image according to the confidence of the category in the first detection result.
In application, an object frame (bounding box) identifies the position of an object in the preprocessed RGBD image. Since the preprocessed RGBD image is an RGB image and contains no depth information, the two-dimensional position of the object in the RGB image, namely its x-axis and y-axis information, can be obtained from the object frame, along with the object's height h and width w. The category of the object frame indicates the class to which the identified object belongs and can be used to obtain the object's category attribute, for example, a coarse category such as person, animal, or article, or a fine category such as football, cup, or pen.
In application, when at least one second detection result whose confidence is greater than the preset confidence threshold exists among the first detection results, it can be determined that an object exists in the preprocessed RGBD image; all the second detection results are retained, and the first detection results whose confidence is less than or equal to the preset confidence threshold are discarded. When no first detection result has confidence greater than the preset confidence threshold, it can be determined that no object exists in the preprocessed RGBD image, and prompt information indicating this is output. The preset confidence threshold may be set according to actual needs, for example, any value from 0.5 to 0.95, and specifically 0.9.
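The confidence filtering of steps S401 and S402 reduces to keeping detections above a threshold; a sketch, where the `(box, category, confidence)` tuple layout is an assumed representation:

```python
def object_present(first_results, conf_thr=0.9):
    """Keep the 'second detection results' whose confidence exceeds the threshold.

    first_results: list of (box, category, confidence) tuples.
    Returns (object_exists, second_results).
    """
    second = [d for d in first_results if d[2] > conf_thr]
    return len(second) > 0, second
```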
Step S103, when an object exists in the preprocessed RGBD image, outputting an object detection result, wherein the object detection result comprises three-dimensional position information and category information of the object.
In application, when it is determined that an object exists in the preprocessed RGBD image, an object detection result is output, and the object detection result includes three-dimensional position information and category information of the detected object. The three-dimensional position information may specifically include x-axis information, y-axis information, and z-axis information, where the x-axis information and the y-axis information may be identified by an object frame, and the z-axis information is obtained by depth information included in the RGBD image, and specifically, two-dimensional position information (i.e., x-axis information and y-axis information) of the object in the RGBD image may be obtained based on the object frame, and then corresponding depth information (i.e., z-axis information) in the RGBD image may be obtained according to the two-dimensional position information; the category information includes a category corresponding to the object frame and may also include a confidence of the category.
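Recovering the z coordinate from the depth channel can be sketched as follows; taking the median depth over the box region (rather than reading a single pixel) is an assumption made for robustness to depth holes, not something the patent prescribes:

```python
import numpy as np

def object_xyz(depth: np.ndarray, box):
    """Combine the box centre (x, y) with depth information to get (x, y, z).

    depth: H x W depth map (the D channel of the RGBD image).
    box:   (x1, y1, x2, y2) object frame from the detector.
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) // 2, (y1 + y2) // 2   # 2-D position from the object frame
    patch = depth[y1:y2, x1:x2]
    z = float(np.median(patch))               # median depth inside the frame
    return cx, cy, z
```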
As shown in fig. 4, in one embodiment, step S103 includes the following steps S403 to S408:
step S403, performing non-maximum suppression on all the second detection results with the same category to obtain at least one third detection result;
s404, detecting the intersection ratio among the object frames of all the third detection results of different classes;
step S405, obtaining the third detection result with the cross-over ratio larger than a preset cross-over ratio threshold value and with higher confidence coefficient, and obtaining at least one fourth detection result;
step S406, obtaining the third detection result with the intersection ratio smaller than or equal to a preset intersection ratio threshold value to obtain at least one fourth detection result;
step 407, acquiring z-axis information of the object in each fourth detection result according to the object frame and the RGBD image in each fourth detection result;
step S408, outputting an object detection result, where the object detection result includes all the fourth detection results and z-axis information of objects in all the fourth detection results.
In application, all the obtained second detection results include results of the same category and results of different categories. Non-maximum suppression (NMS) is first performed on the second detection results of each category to remove redundant object frames. Specifically, all object frames of the same category can be sorted by confidence; the frame with the highest confidence is selected, and its intersection over union (IoU) with each remaining frame in the sequence is computed. If the IoU between the highest-confidence frame and some other frame is greater than a preset IoU threshold, that lower-confidence frame is removed from the sequence; if the IoU is less than or equal to the threshold, the two frames are considered to mark different objects of the same category, and the lower-confidence frame is retained. This is repeated until all object frames of the category have been traversed, yielding at least one third detection result. The IoU is then computed between all third detection results of different categories: for any pair of different-category frames whose IoU is greater than the preset IoU threshold, the lower-confidence frame is removed and the higher-confidence frame retained; for pairs whose IoU is less than or equal to the threshold, both frames are retained. This yields at least one fourth detection result.
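The per-category NMS and cross-category IoU suppression described above can be sketched as follows; the 0.5 IoU threshold, the `(box, category, confidence)` tuples, and the simple O(n²) pairwise pass are illustrative assumptions:

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    if inter == 0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(dets, iou_thr=0.5):
    """NMS over one category: dets is a list of (box, category, confidence)."""
    dets = sorted(dets, key=lambda d: d[2], reverse=True)
    kept = []
    while dets:
        best = dets.pop(0)                    # highest remaining confidence
        kept.append(best)
        dets = [d for d in dets if iou(best[0], d[0]) <= iou_thr]
    return kept

def cross_class_suppress(dets, iou_thr=0.5):
    """Drop the lower-confidence box of any different-category pair whose IoU exceeds the threshold."""
    drop = set()
    for i in range(len(dets)):
        for j in range(i + 1, len(dets)):
            if dets[i][1] != dets[j][1] and iou(dets[i][0], dets[j][0]) > iou_thr:
                drop.add(j if dets[i][2] >= dets[j][2] else i)
    return [d for k, d in enumerate(dets) if k not in drop]
```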
In application, since the fourth detection result only includes the object frame, the category corresponding to the object frame, and the confidence of the category, and only the two-dimensional position information of the object in the RGB image, that is, the x-axis and y-axis information, can be obtained according to the object frame, it is further necessary to obtain the z-axis information of the object through the depth information included in the RGBD image.
In one embodiment, before step S101, the method further includes:
and training the object detection model through the RGBD images under various scenes.
In application, the object detection model may be a darknet YOLOv4 model. After the model has been trained on RGBD images of multiple scenes, it can detect objects of different categories in various scenes, and the trained model is then used to execute the object detection method. It should be understood that the process the model runs during training is the same as the object detection method above; that is, each execution of the object detection method by the model is equivalent to one round of training, so the detection performance of the model can be improved to a certain extent each time it is applied.
According to the object detection method provided by the embodiment of the invention, the RGBD image is preprocessed to determine whether it is usable, which effectively improves the success rate of object detection. When the RGBD image is usable, object detection is performed on the preprocessed RGBD image to determine whether an object exists in it. When an object exists in the preprocessed RGBD image, an object detection result comprising the three-dimensional position information and category information of the object is output. The three-dimensional position and category of an object in an RGBD image are thus obtained without three-dimensional modeling of the object, the time consumed is short, and the method can be widely applied to the detection of objects of many categories.
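The three-stage flow summarized in this paragraph can be sketched as a small pipeline; `preprocess`, `detect_2d`, and `depth_lookup` are hypothetical callables standing in for the stages described above, not the patented implementation.

```python
def detect_objects(rgbd, preprocess, detect_2d, depth_lookup):
    """Three-stage pipeline: (1) check image usability, (2) run 2D
    detection on the usable image, (3) attach a z coordinate from the
    depth channel to each 2D detection result."""
    image, usable = preprocess(rgbd)
    if not usable:
        return None  # caller should re-capture or adjust camera parameters
    results = []
    for box, category, confidence in detect_2d(image):
        z = depth_lookup(rgbd, box)
        results.append({"box": box, "category": category,
                        "confidence": confidence, "z": z})
    return results  # empty list means no object was found
```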
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
The embodiment of the invention also provides an object detection device, which is used for executing the steps in the embodiment of the object detection method. The object detection device may be a virtual apparatus in the terminal device that is executed by a processor of the terminal device, or may be the terminal device itself.
As shown in fig. 5, an object detection apparatus 100 according to an embodiment of the present invention includes:
the image preprocessing unit 101 is configured to preprocess an RGBD image and determine whether the RGBD image is usable;
an object detection unit 102, configured to perform object detection on the pre-processed RGBD image when the RGBD image is available, and determine whether an object exists in the pre-processed RGBD image;
a result output unit 103, configured to output an object detection result when an object exists in the RGBD image after the preprocessing, where the object detection result includes three-dimensional position information and category information of the object.
In one embodiment, the object detection apparatus further comprises:
the image acquisition unit is used for acquiring an RGBD image of the scene through the RGBD camera.
In one embodiment, the object detection apparatus further comprises:
a parameter adjusting unit, configured to adjust parameters of the RGBD camera when the RGBD image is unavailable, and reacquire an RGBD image of the scene by the RGBD camera;
an image acquisition unit, configured to acquire, by the RGBD camera, an RGBD image of a next scene when the RGBD image is unavailable.
In one embodiment, the object detection apparatus further comprises:
a prompting unit, configured to output prompt information indicating that the RGBD image is unavailable when the RGBD image is unavailable.
In one embodiment, the object detection apparatus further comprises:
and the image acquisition unit is used for acquiring an RGBD image of the next scene through the RGBD camera when no object exists in the preprocessed RGBD image.
In one embodiment, the object detection apparatus further comprises:
a prompting unit, configured to output prompt information indicating that no object exists in the preprocessed RGBD image when no object exists in the preprocessed RGBD image.
In one embodiment, the object detection apparatus further comprises:
and the training unit is used for training the object detection model through the RGBD images under various scenes.
In application, each unit in the object detection apparatus may be a software program unit, may be implemented by different logic circuits integrated in a processor, or may be implemented by a plurality of distributed processors.
As shown in fig. 6, an embodiment of the present invention further provides a terminal device 200, including: at least one processor 201 (only one processor is shown in fig. 6), a memory 202, and a computer program 203 stored in the memory 202 and executable on the at least one processor 201; the terminal device further includes a manipulator 204 and a camera 205 connected to the processor 201. The steps of any of the above method embodiments are implemented when the computer program 203 is executed by the processor 201.
In application, the terminal device may include, but is not limited to, a processor, a memory, an ultra-wideband transceiver module, a radar sensor, a vision sensor, and an odometer. Those skilled in the art will appreciate that fig. 6 is merely an example of a terminal device and does not constitute a limitation of the terminal device 200, which may include more or fewer components than those shown, combine some components, or use different components, such as an input/output device, a network access device, and the like.
In application, the processor may be a Central Processing Unit (CPU), or another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
In some embodiments, the memory may be an internal storage unit of the terminal device, such as a hard disk or a memory of the terminal device. In other embodiments, the memory may be an external storage device of the terminal device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the terminal device. Further, the memory may include both an internal storage unit of the terminal device and an external storage device. The memory is used for storing an operating system, an application program, a boot loader, data, and other programs, such as the program code of the computer program. The memory may also be used to temporarily store data that has been output or is to be output.
It should be noted that, because the contents of information interaction, execution process, and the like between the above-mentioned apparatuses/units are based on the same concept as the method embodiment of the present invention, specific functions and technical effects thereof can be referred to specifically in the method embodiment section, and are not described herein again.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above division of functional units is illustrated. In practical applications, the above functions may be allocated to different functional units or modules as required; that is, the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above. Each functional unit in the embodiments may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit, and the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units are only for convenience of distinguishing them from each other and are not used to limit the protection scope of the present invention. For the specific working process of the units in the system, reference may be made to the corresponding process in the foregoing method embodiment, which is not described here again.
An embodiment of the present invention further provides a network device, where the network device includes: at least one processor, a memory, and a computer program stored in the memory and executable on the at least one processor, the processor implementing the steps of the above-described method embodiments when executing the computer program.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps in the above method embodiments.
Embodiments of the present invention provide a computer program product, which, when running on a terminal device, enables the terminal device to implement the steps in the above method embodiments when executed.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may be implemented by a computer program, which is stored in a computer readable storage medium and, when executed, instructs related hardware to implement the steps of the method embodiments. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer readable medium may include at least: any entity or device capable of carrying the computer program code to the apparatus/device, a recording medium, a computer memory, a Read-Only Memory (ROM), a Random-Access Memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, such as a USB flash disk, a removable hard disk, a magnetic disk, or an optical disk. In certain jurisdictions, in accordance with legislation and patent practice, computer-readable media may not include electrical carrier signals or telecommunications signals.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed apparatus/device and method can be implemented in other ways. For example, the above-described apparatus/device embodiments are merely illustrative, and for example, a module or a unit may be divided into only one logic function, and may be implemented in other ways, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (10)

1. An object detection method, comprising:
preprocessing an RGBD image to determine whether the RGBD image is usable;
when the RGBD image is available, performing object detection on the preprocessed RGBD image, and determining whether an object exists in the preprocessed RGBD image;
when an object exists in the preprocessed RGBD image, outputting an object detection result, wherein the object detection result comprises three-dimensional position information and category information of the object.
2. The object detection method of claim 1, wherein the preprocessing the RGBD image to determine whether the RGBD image is usable comprises:
converting the RGBD image into an RGB image;
detecting image characteristics of the RGB image; wherein the image characteristics include at least one of sharpness, chrominance, luminance, and resolution;
determining that the RGBD image is available when image feature detection of the RGB image passes.
3. The object detection method of claim 2, wherein the detecting the image features of the RGB image comprises:
detecting the definition of the RGB image;
when the definition of the RGB image is greater than a preset definition threshold value, detecting the chromaticity of the RGB image;
when the chromaticity of the RGB image is larger than a preset chromaticity threshold value, detecting the brightness of the RGB image;
the determining that the RGBD image is available when the image feature detection of the RGB image passes comprises:
and when the brightness of the RGB image is larger than a preset brightness threshold value, determining that the RGBD image is available.
4. The object detection method of claim 3, wherein the preprocessing the RGBD image to determine whether the RGBD image is usable further comprises:
and when the brightness of the RGB image is smaller than or equal to a preset brightness threshold and within a preset brightness range, performing high-dynamic illumination rendering on the RGB image to obtain a high-dynamic-range image, and determining that the RGBD image is available.
5. The object detection method according to any one of claims 2 to 4, wherein before detecting the image features of the RGB image, the method comprises:
cutting the RGB image to intercept the RGB image in a preset area in the RGB image;
and zooming the RGB image in the preset area to obtain an RGB image with a preset size.
6. The object detection method according to any one of claims 1 to 4, wherein performing object detection on the pre-processed RGBD image when the RGBD image is available to determine whether an object exists in the pre-processed RGBD image comprises:
when the RGBD image is available, performing object detection on the preprocessed RGBD image to obtain at least one first detection result, wherein the first detection result comprises an object frame, a category corresponding to the object frame and a confidence coefficient of the category;
and when at least one second detection result with a confidence greater than a preset confidence threshold exists among the at least one first detection result, determining that an object exists in the preprocessed RGBD image.
7. The object detection method according to claim 6, wherein outputting the object detection result when the object exists in the RGBD image after the preprocessing comprises:
performing non-maximum suppression on all the second detection results of the same category to obtain at least one third detection result;
detecting the intersection ratio among the object frames of all the third detection results of different classes;
for the third detection results of different classes whose intersection ratio is greater than a preset intersection ratio threshold, retaining the third detection result with the higher confidence, so as to obtain at least one fourth detection result;
retaining the third detection results whose intersection ratio is less than or equal to the preset intersection ratio threshold, so as to obtain at least one fourth detection result;
acquiring z-axis information of the object in each fourth detection result according to the object frame and the RGBD image in each fourth detection result;
and outputting an object detection result, wherein the object detection result comprises all the fourth detection results and z-axis information of the objects in all the fourth detection results.
8. An object detecting device, comprising:
the image preprocessing unit is used for preprocessing the RGBD image and determining whether the RGBD image is available;
the object detection unit is used for carrying out object detection on the preprocessed RGBD image when the RGBD image is available and determining whether an object exists in the preprocessed RGBD image;
and the result output unit is used for outputting an object detection result when an object exists in the preprocessed RGBD image, wherein the object detection result comprises three-dimensional position information and category information of the object.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the object detection method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the object detection method according to any one of claims 1 to 7.
CN202011453067.2A 2020-12-11 2020-12-11 Object detection method, device, terminal equipment and storage medium Active CN112541948B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011453067.2A CN112541948B (en) 2020-12-11 2020-12-11 Object detection method, device, terminal equipment and storage medium


Publications (2)

Publication Number Publication Date
CN112541948A true CN112541948A (en) 2021-03-23
CN112541948B CN112541948B (en) 2023-11-21

Family

ID=75018362

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011453067.2A Active CN112541948B (en) 2020-12-11 2020-12-11 Object detection method, device, terminal equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112541948B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108702452A (en) * 2017-06-09 2018-10-23 华为技术有限公司 A kind of image capturing method and device
CN109544540A (en) * 2018-11-28 2019-03-29 东北大学 A kind of diabetic retina picture quality detection method based on image analysis technology
CN110149482A (en) * 2019-06-28 2019-08-20 Oppo广东移动通信有限公司 Focusing method, device, electronic equipment and computer readable storage medium
CN110852258A (en) * 2019-11-08 2020-02-28 北京字节跳动网络技术有限公司 Object detection method, device, equipment and storage medium
CN110991201A (en) * 2019-11-25 2020-04-10 浙江大华技术股份有限公司 Bar code detection method and related device
CN111626350A (en) * 2020-05-25 2020-09-04 腾讯科技(深圳)有限公司 Target detection model training method, target detection method and device


Also Published As

Publication number Publication date
CN112541948B (en) 2023-11-21


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant