CN106650662B - Target object occlusion detection method and device - Google Patents


Info

Publication number
CN106650662B
Authority
CN
China
Prior art keywords
face
heat map
neural network
target object
occlusion
Prior art date
Legal status
Active
Application number
CN201611191940.9A
Other languages
Chinese (zh)
Other versions
CN106650662A (en)
Inventor
周舒畅
何蔚然
张弛
Current Assignee
Beijing Kuangshi Technology Co Ltd
Beijing Megvii Technology Co Ltd
Original Assignee
Beijing Kuangshi Technology Co Ltd
Beijing Megvii Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Kuangshi Technology Co Ltd, Beijing Megvii Technology Co Ltd filed Critical Beijing Kuangshi Technology Co Ltd
Priority to CN201611191940.9A
Publication of CN106650662A
Application granted
Publication of CN106650662B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V 20/53 Recognition of crowd images, e.g. recognition of crowd congestion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 Detection; Localisation; Normalisation


Abstract

The invention provides a target object occlusion detection method and device. The method comprises: receiving an input image; detecting a target object in the input image using a trained neural network and outputting a heat map of the target object; and detecting, based on the heat map, whether the target object is occluded. The method and device convert the target object in the input image into a heat map using a trained neural network and detect occlusion from that heat map, which effectively reduces the difficulty of recognizing the target object and improves recognition accuracy and stability.

Description

Target object occlusion detection method and device
Technical Field
The invention relates to the technical field of image processing, and in particular to a target object occlusion detection method and device.
Background
When recognizing an object in an image, occlusion of the object is a serious source of interference; during face recognition, for example, masks, glasses, bangs, hats, and other accessories can occlude the face. Moreover, in some applications, such as face recognition in surveillance video, recognition may be deliberately disturbed or countered by occlusion. All of these seriously affect recognition accuracy and stability.
Because the ways and types of occlusion are varied and inexhaustible, training a separate classifier for every occlusion category would be very labor intensive and severely constrained by the amount of available data. A technique or system that can detect whether an object in an image is occluded is therefore needed.
Disclosure of Invention
The present invention has been made to address the above problems. According to one aspect of the present invention, a target object occlusion detection method is provided, comprising: receiving an input image; detecting a target object in the input image using a trained neural network and outputting a heat map of the target object; and detecting, based on the heat map, whether the target object is occluded.
In one embodiment of the present invention, the trained neural network is a fully convolutional neural network.
In one embodiment of the invention, the fully convolutional neural network is generated by replacing the fully connected layers of a trained convolutional neural network with convolutional layers.
In one embodiment of the invention, the convolutional layers are 1 × 1 convolutional layers.
In one embodiment of the invention, the fully convolutional neural network further comprises an upsampling layer.
In one embodiment of the invention, the number of upsampling layers depends on the desired resolution of the output heat map.
In one embodiment of the present invention, the method further comprises: after the heat map is output, labeling the target object in the heat map; occlusion detection is then based on the labeled heat map.
In one embodiment of the present invention, detecting whether the target object is occluded based on the heat map comprises: detecting whether the shape of the target object in the heat map conforms to the shape expected for objects of its category; if it conforms, determining that the target object is not occluded, and otherwise determining that it is occluded.
In one embodiment of the present invention, detecting whether the target object is occluded based on the heat map comprises: detecting whether the region of the target object in the heat map is a connected region; if it is, determining that the target object is not occluded, and otherwise determining that it is occluded.
In one embodiment of the present invention, the method further comprises: when the target object in the heat map is detected to be occluded, determining the occlusion region.
According to another aspect of the present invention, a target object occlusion detection apparatus is provided, comprising: a receiving module for receiving an input image; a heat map output module for detecting a target object in the input image using a trained neural network and outputting a heat map of the target object; and an occlusion detection module for detecting, based on the heat map, whether the target object is occluded.
In one embodiment of the present invention, the trained neural network is a fully convolutional neural network.
In one embodiment of the invention, the fully convolutional neural network is generated by replacing the fully connected layers of a trained convolutional neural network with convolutional layers.
In one embodiment of the invention, the convolutional layers are 1 × 1 convolutional layers.
In one embodiment of the invention, the fully convolutional neural network further comprises an upsampling layer.
In one embodiment of the invention, the number of upsampling layers depends on the desired resolution of the output heat map.
In one embodiment of the present invention, the apparatus further comprises an automatic labeling module for labeling the target object in the heat map, and the occlusion detection module is further configured to detect whether the target object is occluded based on the labeled heat map.
In one embodiment of the invention, the occlusion detection module is further configured to: detect whether the shape of the target object in the heat map conforms to the shape expected for objects of its category; if it conforms, determine that the target object is not occluded, and otherwise determine that it is occluded.
In one embodiment of the invention, the occlusion detection module is further configured to: detect whether the region of the target object in the heat map is a connected region; if it is, determine that the target object is not occluded, and otherwise determine that it is occluded.
In one embodiment of the invention, the occlusion detection module is further configured to determine the occlusion region when the target object in the heat map is detected to be occluded.
The target object occlusion detection method and apparatus according to embodiments of the invention convert the target object in the input image into a heat map using a trained neural network and detect occlusion from that heat map, which effectively reduces the difficulty of recognizing the target object and improves recognition accuracy and stability.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent by describing in more detail embodiments of the present invention with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings, like reference numbers generally represent like parts or steps.
FIG. 1 shows a schematic block diagram of an example electronic device for implementing a target object occlusion detection method and apparatus according to embodiments of the invention;
FIG. 2 shows a schematic flow chart of a target object occlusion detection method according to an embodiment of the invention;
FIGS. 3A and 3B show schematic block diagrams of a convolutional neural network and a fully convolutional neural network, respectively, according to an embodiment of the present invention;
FIGS. 4A and 4B show an example input image and the corresponding output heat map, respectively;
FIG. 5 shows a schematic block diagram of a target object occlusion detection apparatus according to an embodiment of the present invention; and
FIG. 6 shows a schematic block diagram of a target object occlusion detection system according to an embodiment of the invention.
Detailed Description
To make the objects, technical solutions, and advantages of the present invention clearer, exemplary embodiments according to the present invention are described in detail below with reference to the accompanying drawings. It is to be understood that the described embodiments are only some, not all, of the embodiments of the invention, and that the invention is not limited to the example embodiments described herein. All other embodiments that a person skilled in the art can derive from the embodiments described herein without inventive effort shall fall within the scope of protection of the invention.
First, an example electronic device 100 for implementing the target object occlusion detection method and apparatus of the embodiments of the present invention is described with reference to fig. 1.
As shown in FIG. 1, electronic device 100 includes one or more processors 102, one or more memory devices 104, an input device 106, an output device 108, and an image sensor 110, which are interconnected via a bus system 112 and/or other form of connection mechanism (not shown). It should be noted that the components and structure of the electronic device 100 shown in fig. 1 are exemplary only, and not limiting, and the electronic device may have other components and structures as desired.
The processor 102 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 100 to perform desired functions.
The storage device 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage media and executed by the processor 102 to implement the client-side functionality (implemented by the processor) and/or other desired functionality of the embodiments of the invention described below. Various applications and data, such as data used and/or generated by those applications, may also be stored in the computer-readable storage media.
The input device 106 may be a device used by a user to input instructions and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like.
The output device 108 may output various information (e.g., images or sounds) to an external (e.g., user), and may include one or more of a display, a speaker, and the like.
The image sensor 110 may capture images (e.g., photographs, videos, etc.) desired by the user and store them in the storage device 104 for use by other components. The image sensor 110 may be a camera. It should be understood that the image sensor 110 is merely an example, and the electronic device 100 may omit it; in that case, the image to be recognized may be captured by another image capture device and transmitted to the electronic device 100.
Illustratively, the example electronic device for implementing the target object occlusion detection method and apparatus according to embodiments of the present invention may be implemented as a smartphone, a tablet computer, or the like.
In the following, a target object occlusion detection method 200 according to an embodiment of the invention will be described with reference to fig. 2.
In step S210, an input image is received.
In one embodiment, the received input image may be an image containing a target object to be recognized. The target object may be of any one or more types (e.g., a human face, an animal, various objects, etc.). Performing occlusion detection on the target object before attempting to recognize it reduces recognition difficulty and improves recognition accuracy.
In one example, the received input image may be an image acquired in real-time. In other examples, the received input image may also be an image from any source. Here, the received input image may be video data or picture data.
In step S220, a trained neural network is used to detect a target object in the input image and to output a heat map of the target object.
In one embodiment, the neural network that outputs the heat map in step S220 may be generated from a trained convolutional neural network that serves as a generic classifier.
In this embodiment, the category of the target object to undergo occlusion detection may be determined from the category of the target object to be recognized; a classifier capable of recognizing that category is then trained on a recognition data set for the category, and the heat-map network is generated from that classifier.
For example, a convolutional neural network that can recognize the category of the target object may be trained first. FIG. 3A shows a schematic block diagram of a convolutional neural network 300A according to an embodiment of the present invention. As shown in FIG. 3A, the convolutional neural network 300A, which can recognize one or more categories of target objects, may include convolutional layers, pooling layers, and fully connected layers. The input image is fed into the convolutional neural network 300A to obtain a classification result.
Based on the trained convolutional neural network, the fully connected layers (e.g., of the convolutional neural network 300A shown in FIG. 3A) may then be replaced with convolutional layers to generate a fully convolutional neural network, which may be trained as the heat-map network used in step S220. FIG. 3B shows a schematic block diagram of a fully convolutional neural network 300B according to an embodiment of the present invention. As shown in FIG. 3B, the fully convolutional neural network 300B may include convolutional layers and pooling layers. Feeding the input image into the fully convolutional neural network 300B yields the heat map.
In one example, the convolutional layer replacing the fully connected layers of the convolutional neural network 300A may be a 1 × 1 convolutional layer. Using 1 × 1 convolutional layers reduces the amount of computation while achieving the desired function, improving computation speed. In other examples, the replacing convolutional layers may be of any other suitable size.
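The replacement described above works because a fully connected layer applied to the channel vector at one spatial position is mathematically identical to a 1 × 1 convolution applied across the whole feature map. The following NumPy sketch illustrates this equivalence on toy data (the shapes, weights, and helper names are illustrative, not from the patent):

```python
import numpy as np

def fc_layer(features, W, b):
    """Fully connected layer on one feature vector: (C_in,) -> (C_out,)."""
    return W @ features + b

def conv1x1(feature_map, W, b):
    """1x1 convolution over a (C_in, H, W) feature map -> (C_out, H, W).

    Applies the same FC weights independently at every spatial position,
    which is exactly what a 1x1 convolution does."""
    c_in, h, w = feature_map.shape
    flat = feature_map.reshape(c_in, h * w)   # (C_in, H*W)
    out = W @ flat + b[:, None]               # (C_out, H*W)
    return out.reshape(-1, h, w)

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 4))        # 4 input channels -> 2 class scores
b = rng.normal(size=2)
fmap = rng.normal(size=(4, 3, 3))  # toy convolutional feature map

# The classifier's score at position (1, 2) equals the FC layer applied
# to that pixel's channel vector; running it everywhere yields a spatial
# score map, i.e. the heat map the patent describes.
dense = fc_layer(fmap[:, 1, 2], W, b)
scores = conv1x1(fmap, W, b)
assert np.allclose(dense, scores[:, 1, 2])
```

Because the 1 × 1 convolution carries no spatial constraint, the converted network also accepts inputs of arbitrary size, which is what lets the classifier produce a per-location heat map instead of a single label.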
With continued reference to FIG. 3B, the fully convolutional neural network 300B may also include an upsampling layer, which increases the resolution of the output heat map so that occlusion of the target object can be detected more reliably from it. In one example, the number of upsampling layers may depend on the desired resolution of the output heat map. In other examples, the number of upsampling layers may be set with other factors taken into account as well.
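The patent does not specify the upsampling scheme, but a minimal sketch, assuming simple nearest-neighbour 2x upsampling, shows how the number of upsampling layers follows from the ratio between the coarse network output and the desired heat-map resolution:

```python
import numpy as np

def upsample2x(heatmap):
    """Nearest-neighbour 2x upsampling of an (H, W) heat map."""
    return np.repeat(np.repeat(heatmap, 2, axis=0), 2, axis=1)

def upsample_to(heatmap, target_h):
    """Stack 2x upsampling layers until the heat map reaches target_h rows.

    Returns the upsampled map and the number of layers used, mirroring how
    the layer count depends on the desired output resolution."""
    h = heatmap.shape[0]
    n_layers = 0
    while h < target_h:
        heatmap = upsample2x(heatmap)
        h *= 2
        n_layers += 1
    return heatmap, n_layers

raw = np.arange(16.0).reshape(4, 4)   # coarse 4x4 network output
hi, n = upsample_to(raw, 16)          # request a 16x16 heat map
assert hi.shape == (16, 16) and n == 2
```

In a real network the upsampling would typically be learned (e.g., transposed convolution) or bilinear rather than nearest-neighbour; the layer-count logic is the same.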
In the above example, the step of training the convolutional neural network 300A may be called the pre-training phase, in which a generic classifier capable of recognizing the category of the target object is trained; the step of training the fully convolutional neural network 300B may be called the fine-tuning phase, in which the fully convolutional neural network 300B is generated from the convolutional neural network 300A and trained to become the heat-map network used in step S220. The fully convolutional neural network 300B can be regarded as a new network built by extracting the embedding from the convolutional neural network 300A and connecting it to a heat-map output layer.
The above example illustrates the training process of the heat-map network used in step S220. Those skilled in the art will understand that although FIGS. 3A and 3B show schematic structures of the convolutional neural network used in training and of the resulting fully convolutional neural network, these are merely exemplary, and any other suitable structures may be used. Furthermore, the heat-map network used in step S220 may also be trained directly, without first training a generic classifier.
Based on a trained neural network (e.g., the fully convolutional neural network 300B shown in FIG. 3B), a target object in an input image may be detected and a corresponding heat map output, as shown in FIGS. 4A and 4B. FIG. 4A shows an example input image. Assuming the target object to be detected is a human face, the trained network outputs the corresponding heat map shown in FIG. 4B, in which the faces appear in heat-map form; an occluded face (e.g., the leftmost face in FIG. 4B) is clearly different from an unoccluded one.
The description of the steps of the target object occlusion detection method 200 according to an embodiment of the invention continues below with reference back to FIG. 2.
In step S230, whether the target object is occluded is detected based on the heat map.
In one embodiment, detecting whether the target object is occluded based on the heat map may comprise: detecting whether the shape of the target object in the heat map conforms to the shape expected for objects of its category; if it conforms, determining that the target object is not occluded, and otherwise determining that it is occluded. In this embodiment, "shape" should be understood as the area enclosed by the boundary rather than the boundary itself.
For example, when detecting face occlusion, if the face has long bangs or is covered by sunglasses (e.g., the leftmost face in FIG. 4A), a "black hole" (as in the leftmost face in FIG. 4B) or other kind of occluded region appears over the face region in the heat map, and occlusion can be determined by examining its shape.
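The patent leaves the shape test unspecified. One simple proxy, sketched below with toy NumPy heat maps (the threshold, fill-ratio criterion, and function name are assumptions, not the patent's method), is the fill ratio: an unoccluded face produces a roughly solid blob, so if only a small fraction of the active region's bounding box is actually active, part of the expected shape is missing:

```python
import numpy as np

def fill_ratio(heatmap, thresh=0.5):
    """Fraction of the active region's bounding box that is active.

    A hole punched in the blob by an occluder (the "black hole" above)
    lowers this ratio while leaving the bounding box unchanged."""
    mask = heatmap > thresh
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        return 0.0
    box_area = (ys.max() - ys.min() + 1) * (xs.max() - xs.min() + 1)
    return mask.sum() / box_area

# Toy 8x8 "face" heat maps: a solid blob vs. one with a sunglasses-style hole.
face = np.zeros((8, 8)); face[1:7, 1:7] = 1.0
occluded = face.copy(); occluded[2:5, 2:5] = 0.0

assert fill_ratio(face) == 1.0        # solid blob: shape conforms
assert fill_ratio(occluded) < 0.8     # hole in the blob: likely occluded
```

A production system would use a richer shape model (e.g., comparison against a learned face template), but the decision rule, conforming shape means no occlusion, is the same.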
In another embodiment, it may be detected whether the region of the target object in the heat map is a connected region. When the target object is occluded, its region may no longer be fully connected but instead be "cut through" by the occluder, leaving a black area inside the target object's region in the heat map (as in the leftmost face region in FIG. 4B). Whether the target object is occluded can therefore also be determined from whether its heat-map region is connected.
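The connectedness test above can be sketched with a standard flood-fill component count on the thresholded heat map (a toy illustration; the patent does not prescribe a particular connectivity algorithm):

```python
import numpy as np
from collections import deque

def count_components(mask):
    """Number of 4-connected components in a boolean mask (BFS flood fill)."""
    mask = mask.copy()
    h, w = mask.shape
    n = 0
    for sy in range(h):
        for sx in range(w):
            if not mask[sy, sx]:
                continue
            n += 1                      # found an unvisited component
            q = deque([(sy, sx)])
            mask[sy, sx] = False
            while q:
                y, x = q.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny, nx]:
                        mask[ny, nx] = False
                        q.append((ny, nx))
    return n

# A face region "cut through" by an occluder splits into multiple components.
face = np.ones((6, 6), dtype=bool)
cut = face.copy(); cut[:, 3] = False    # vertical occlusion band

assert count_components(face) == 1      # single connected region: no occlusion
assert count_components(cut) == 2       # region broken apart: occlusion
```

In practice a library routine such as `scipy.ndimage.label` would replace the hand-rolled flood fill; the decision rule, one component means no occlusion, is unchanged.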
In other examples, occlusion of the target object may also be detected from the heat map in any other suitable manner.
Further, in one embodiment, the target object in the heat map may be labeled after the heat map is output (e.g., the bounding boxes around the target objects in FIG. 4B), and occlusion may then be detected from the labeled heat map. In this way, target objects that are close to each other can be separated, and target objects that appear small in the image because they are far from the camera can still be clearly framed, avoiding missed or erroneous occlusion detections, making detection more accurate, and improving the reliability of the results. In one example, a neural network may be used to label the target object; in another example, a suitable algorithm may be used; in other examples, any other suitable method may be used.
In yet another embodiment, when the target object in the heat map is detected to be occluded, the occlusion region may be further determined. For example, the occlusion region may be segmented to locate the occluded part of the target object, providing a more reliable basis for subsequent recognition. In a real-time application scenario, a prompt may also be given to the subject to be recognized (e.g., a person) so that they can adjust their position or remove the occluder, improving recognition efficiency and accuracy.
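One minimal way to segment the occlusion region, sketched below under the assumption that a labeled face bounding box is available, is to take the pixels inside that box whose heat-map response is low (the box format, threshold, and function name are illustrative, not from the patent):

```python
import numpy as np

def occlusion_mask(heatmap, bbox, thresh=0.5):
    """Suspected occlusion pixels inside a labeled face box.

    bbox = (y0, y1, x0, x1) in heat-map coordinates; pixels in the box
    with response <= thresh are treated as occluded."""
    y0, y1, x0, x1 = bbox
    roi = heatmap[y0:y1, x0:x1]
    return roi <= thresh

# Toy face heat map with a low-response band where bangs cover the brow.
heat = np.zeros((8, 8)); heat[1:7, 1:7] = 1.0
heat[2:4, 2:6] = 0.0

occ = occlusion_mask(heat, (1, 7, 1, 7))
ys, xs = np.nonzero(occ)
# The recovered occlusion band sits at the top of the face box, which is
# the kind of localisation that could drive a "remove the occluder" prompt.
assert occ.sum() == 8 and ys.min() == 1 and ys.max() == 2
```

The mask's location within the box (top, side, etc.) gives the occlusion position referred to above.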
Based on the above description, the target object occlusion detection method according to embodiments of the invention converts the target object in the input image into a heat map using a trained neural network and detects occlusion from that heat map, effectively reducing the difficulty of recognizing the target object and improving recognition accuracy and stability.
Illustratively, the target object occlusion detection method according to embodiments of the present invention may be implemented in a device, apparatus or system having a memory and a processor.
In addition, the target object occlusion detection method provided by the embodiments of the invention is fast and has a small model size, so it can readily be deployed on mobile devices such as smartphones, tablet computers, and personal computers. Alternatively, it may be deployed on the server (or cloud) side, or distributed across the server (or cloud) side and personal terminals.
In the following, a target object occlusion detection apparatus provided by another aspect of the present invention is described with reference to fig. 5. Fig. 5 shows a schematic block diagram of a target object occlusion detection apparatus 500 according to an embodiment of the present invention.
As shown in FIG. 5, the target object occlusion detection apparatus 500 according to an embodiment of the present invention includes a receiving module 510, a heat map output module 520, and an occlusion detection module 530. These modules perform the corresponding steps/functions of the target object occlusion detection method described above in connection with FIG. 2. Only the main functions of the modules are described below; details already covered are omitted.
The receiving module 510 receives an input image. The heat map output module 520 detects a target object in the input image using the trained neural network and outputs a heat map of the target object. The occlusion detection module 530 detects whether the target object is occluded based on the heat map. The receiving module 510, heat map output module 520, and occlusion detection module 530 may all be implemented by the processor 102 of the electronic device shown in FIG. 1 executing program instructions stored in the storage device 104.
In one embodiment, the input image received by the receiving module 510 may be an image containing a target object to be recognized. The target object may be of any one or more types (e.g., a human face, an animal, various objects, etc.). Performing occlusion detection on the target object before recognizing it reduces recognition difficulty and improves recognition accuracy.
In one example, the input image received by the receiving module 510 may be an image acquired in real time. In other examples, the input image received by the receiving module 510 may also be an image from any source. Here, the input image received by the receiving module 510 may be video data or picture data.
In one embodiment, the neural network used by the heat map output module 520 is a fully convolutional neural network (such as the fully convolutional neural network 300B shown in FIG. 3B). Illustratively, the fully convolutional neural network may be generated by replacing the fully connected layers of a trained convolutional neural network (such as the convolutional neural network 300A shown in FIG. 3A) with convolutional layers. For the training of this network, refer to the process described above in connection with FIGS. 3A and 3B; for brevity, it is not repeated here.
In one example, the neural network used by the heat map output module 520 may also include an upsampling layer, which increases the resolution of the output heat map so that the occlusion detection module 530 can detect occlusion of the target object more reliably. In one example, the number of upsampling layers may depend on the desired resolution of the output heat map; in other examples, it may be set with other factors taken into account as well.
In one embodiment, the step of the occlusion detection module 530 detecting whether the target object has an occlusion based on thermodynamic diagrams may include: detecting whether the shape of the target object in the thermodynamic diagram conforms to the shape of the object of the category of the target object; and when the shape of the target object in the thermodynamic diagram conforms to the shape of the object in the category of the target object, determining that the target object is not occluded, and otherwise, determining that the target object is occluded. In this embodiment, "shape" may be understood as a face constituted by a boundary (i.e., an area constituted by a boundary) rather than the boundary itself.
In another embodiment, the occlusion detection module 530 may also be configured to detect whether the region of the target object in the thermodynamic diagram is a connected region. When the target object is occluded, it may no longer form a single complete connected region but may instead be "truncated" by the occluder, so that a black region appears within the target object's region in the thermodynamic diagram (as shown by the black region in the leftmost face region in fig. 4B). The occlusion detection module 530 may therefore determine whether the target object is occluded based on whether its region is connected.
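As a non-limiting illustration (not part of the original disclosure), the connected-region test can be sketched by binarizing the thermodynamic diagram and counting 4-connected components; the threshold is an illustrative assumption.

```python
from collections import deque

import numpy as np

def count_components(mask):
    """Count 4-connected components of True pixels with a breadth-first search."""
    seen = np.zeros_like(mask, dtype=bool)
    h, w = mask.shape
    n = 0
    for sy in range(h):
        for sx in range(w):
            if mask[sy, sx] and not seen[sy, sx]:
                n += 1
                seen[sy, sx] = True
                queue = deque([(sy, sx)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
    return n

def is_occluded(heatmap, threshold=0.5):
    """Occlusion is suspected when the face does not form exactly one region."""
    return count_components(heatmap >= threshold) != 1

face = np.zeros((8, 8)); face[1:7, 1:7] = 1.0   # one connected face region
cut = face.copy();       cut[3:5, :] = 0.0      # an occluder "truncates" the face

print(is_occluded(face), is_occluded(cut))      # False True
```

In a production system a library routine such as `scipy.ndimage.label` would typically replace the hand-written BFS.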
In one embodiment, the target object occlusion detection apparatus 500 may further include an automatic labeling module (not shown in fig. 5) for labeling the target object in the thermodynamic diagram output by the thermodynamic diagram output module 520 (e.g., the boxes drawn around the target objects in fig. 4B). The occlusion detection module 530 may then detect whether the target object is occluded based on the thermodynamic diagram labeled by the automatic labeling module. Such labeling isolates target objects that are close to one another and clearly frames target objects that appear small in the image because they are far from the lens, avoiding missed or erroneous occlusion detections, making occlusion detection more accurate, and improving the reliability of the detection results.
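As a non-limiting illustration (not part of the original disclosure), a minimal labeling step can be sketched as drawing a bounding box around each thresholded face region. For simplicity the sketch assumes faces are separated by empty columns and splits regions on the column projection; a full implementation would use connected-component analysis.

```python
import numpy as np

def label_boxes(heatmap, threshold=0.5):
    """Bounding boxes (top, left, bottom, right) of thresholded face regions."""
    mask = heatmap >= threshold
    cols_active = mask.any(axis=0).tolist() + [False]  # sentinel closes the last region
    boxes, start = [], None
    for x, active in enumerate(cols_active):
        if active and start is None:
            start = x                                  # a new region begins
        elif not active and start is not None:
            rows = np.nonzero(mask[:, start:x].any(axis=1))[0]
            boxes.append((int(rows.min()), start, int(rows.max()), x - 1))
            start = None
    return boxes

heat = np.zeros((10, 20))
heat[2:6, 1:5] = 1.0        # a large, near face
heat[4:6, 10:12] = 1.0      # a small face, far from the lens
print(label_boxes(heat))    # [(2, 1, 5, 4), (4, 10, 5, 11)]
```

Both the nearby large face and the distant small face receive their own box, which is the isolation effect described above.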
In yet another embodiment, the occlusion detection module 530 may be further configured to determine the occlusion region when it detects that the target object in the thermodynamic diagram is occluded. For example, the occlusion detection module 530 may segment the occlusion region to determine where on the target object the occlusion lies, providing a more reliable basis for subsequent identification of the target object. In a real-time application scenario, the occlusion region determined by the occlusion detection module 530 may also be used to prompt the target to be identified (e.g., a person to be identified) to adjust position or remove the occluder, improving the efficiency and accuracy of identification.
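As a non-limiting illustration (not part of the original disclosure), one simple way to segment the occlusion region is to mark the "cold" pixels inside the bounding box of the detected face; the threshold is an illustrative assumption, and a real system might instead compare against an expected face shape.

```python
import numpy as np

def occlusion_region(heatmap, threshold=0.5):
    """'Cold' pixels inside the bounding box of the detected face region."""
    mask = heatmap >= threshold
    ys, xs = np.nonzero(mask)
    box = np.zeros_like(mask)
    box[ys.min():ys.max() + 1, xs.min():xs.max() + 1] = True
    return box & ~mask      # inside the face box but not "hot": likely occluder

heat = np.zeros((8, 8))
heat[1:7, 1:7] = 1.0
heat[3:5, 1:7] = 0.0        # an occluding band across the face
occ = occlusion_region(heat)
print(int(occ.sum()))       # 12 occluded pixels (the two zeroed rows)
```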
Based on the above description, the target object occlusion detection apparatus according to the embodiment of the invention converts the target object in the input image into a thermodynamic diagram using a trained neural network and detects through the thermodynamic diagram whether the target object is occluded, which effectively reduces the difficulty of identifying the target object and improves identification accuracy and stability.
FIG. 6 shows a schematic block diagram of a target object occlusion detection system 600 according to an embodiment of the invention. The target object occlusion detection system 600 comprises a storage 610 and a processor 620.
The storage device 610 stores program code for implementing the respective steps of the target object occlusion detection method according to an embodiment of the invention. The processor 620 is configured to run the program code stored in the storage device 610 to perform the respective steps of the target object occlusion detection method according to the embodiment of the present invention, and to implement the respective modules of the target object occlusion detection apparatus according to the embodiment of the present invention. Furthermore, the target object occlusion detection system 600 may further comprise an image acquisition device (not shown in fig. 6) for acquiring the input image. The image acquisition device is not required, however; input images may also be received directly from other sources.
In one embodiment, the program code, when executed by the processor 620, causes the target object occlusion detection system 600 to perform the steps of: receiving an input image; detecting a target object in the input image by using a trained neural network and outputting a thermodynamic diagram of the target object; and detecting whether the target object has an occlusion based on the thermodynamic diagram.
In one embodiment, the trained neural network is a fully convolutional neural network.
In one embodiment, the fully convolutional neural network is generated based on replacing fully connected layers of the trained convolutional neural network with convolutional layers.
In one embodiment, the convolutional layers are 1 × 1 convolutional layers.
In one embodiment, the fully convolutional neural network further comprises an upsampling layer.
In one embodiment, the number of upsampling layers depends on the desired resolution of the output thermodynamic diagram.
In one embodiment, the program code, when executed by the processor 620, further causes the target object occlusion detection system 600 to perform the following step: after the thermodynamic diagram is output, labeling the target object in the thermodynamic diagram. In that case, the detection of whether the target object is occluded is performed based on the labeled thermodynamic diagram.
In one embodiment, when the program code is executed by the processor 620, the step performed by the target object occlusion detection system 600 of detecting, based on the thermodynamic diagram, whether the target object is occluded includes: detecting whether the shape of the target object in the thermodynamic diagram conforms to the shape of objects of the target object's category; and when it does, determining that the target object is not occluded, and otherwise determining that the target object is occluded.
In one embodiment, when the program code is executed by the processor 620, the step performed by the target object occlusion detection system 600 of detecting, based on the thermodynamic diagram, whether the target object is occluded includes: detecting whether the region of the target object in the thermodynamic diagram is a connected region; and when the region of the target object in the thermodynamic diagram is detected to be a connected region, determining that the target object is not occluded, and otherwise determining that the target object is occluded.
In one embodiment, the program code when executed by the processor 620 further causes the target object occlusion detection system 600 to perform the steps of: when the target object in the thermodynamic diagram is detected to have occlusion, determining an occlusion area.
Furthermore, according to an embodiment of the present invention, there is also provided a storage medium on which program instructions are stored, which when executed by a computer or a processor are used for executing the corresponding steps of the target object occlusion detection method according to an embodiment of the present invention, and for implementing the corresponding modules in the target object occlusion detection apparatus according to an embodiment of the present invention. The storage medium may include, for example, a memory card of a smart phone, a storage component of a tablet computer, a hard disk of a personal computer, a Read Only Memory (ROM), an Erasable Programmable Read Only Memory (EPROM), a portable compact disc read only memory (CD-ROM), a USB memory, or any combination of the above storage media. The computer readable storage medium can be any combination of one or more computer readable storage media, e.g., one computer readable storage medium containing computer readable program code for receiving an input image, another computer readable storage medium containing computer readable program code for detecting a target object and outputting a thermodynamic diagram, and yet another computer readable storage medium containing computer readable program code for detecting occlusion.
In an embodiment, the computer program instructions may, when executed by a computer, implement the functional modules of the target object occlusion detection apparatus according to the embodiment of the present invention, and/or may perform the target object occlusion detection method according to the embodiment of the present invention.
In one embodiment, the computer program instructions, when executed by a computer or processor, cause the computer or processor to perform the steps of: receiving an input image; detecting a target object in the input image by using a trained neural network and outputting a thermodynamic diagram of the target object; and detecting whether the target object has an occlusion based on the thermodynamic diagram.
In one embodiment, the trained neural network is a fully convolutional neural network.
In one embodiment, the fully convolutional neural network is generated based on replacing fully connected layers of the trained convolutional neural network with convolutional layers.
In one embodiment, the convolutional layers are 1 × 1 convolutional layers.
In one embodiment, the fully convolutional neural network further comprises an upsampling layer.
In one embodiment, the number of upsampling layers depends on the desired resolution of the output thermodynamic diagram.
In one embodiment, the computer program instructions, when executed by a computer or processor, further cause the computer or processor to perform the following step: after the thermodynamic diagram is output, labeling the target object in the thermodynamic diagram. In that case, the detection of whether the target object is occluded is performed based on the labeled thermodynamic diagram.
In one embodiment, when the computer program instructions are executed by a computer or processor, the step of detecting whether the target object is occluded based on the thermodynamic diagram includes: detecting whether the shape of the target object in the thermodynamic diagram conforms to the shape of objects of the target object's category; and when it does, determining that the target object is not occluded, and otherwise determining that the target object is occluded.
In one embodiment, when the computer program instructions are executed by a computer or processor, the step of detecting whether the target object is occluded based on the thermodynamic diagram includes: detecting whether the region of the target object in the thermodynamic diagram is a connected region; and when the region of the target object in the thermodynamic diagram is detected to be a connected region, determining that the target object is not occluded, and otherwise determining that the target object is occluded.
In one embodiment, the computer program instructions, when executed by a computer or processor, cause the computer or processor to perform the steps of: when the target object in the thermodynamic diagram is detected to have occlusion, determining an occlusion area.
The modules in the target object occlusion detection apparatus according to the embodiment of the present invention may be implemented by a processor of an electronic device for target object occlusion detection according to the embodiment of the present invention running computer program instructions stored in a memory, or may be implemented when computer instructions stored in a computer readable storage medium of a computer program product according to the embodiment of the present invention are run by a computer.
According to the target object occlusion detection method, apparatus, and system, and the storage medium, of the embodiments of the present invention, the target object in the input image is converted into a thermodynamic diagram by a trained neural network, and whether the target object is occluded is detected through the thermodynamic diagram, which effectively reduces the difficulty of identifying the target object and improves identification accuracy and stability.
Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the foregoing illustrative embodiments are merely exemplary and are not intended to limit the scope of the invention thereto. Various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the present invention. All such changes and modifications are intended to be included within the scope of the present invention as set forth in the appended claims.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative, and for example, the division of the units is only one logical functional division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another device, or some features may be omitted, or not executed.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the invention and aiding in the understanding of one or more of the various inventive aspects. However, the method of the present invention should not be construed to reflect the intent: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
It will be understood by those skilled in the art that all of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where such features are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. It will be appreciated by those skilled in the art that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some of the modules in an apparatus according to embodiments of the present invention. The present invention may also be embodied as device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The usage of the words first, second, third, etcetera does not indicate any ordering; these words may be interpreted as names.
The above description concerns only specific embodiments of the present invention, and the protection scope of the present invention is not limited thereto. Any person skilled in the art can easily conceive of changes or substitutions within the technical scope disclosed by the present invention, and such changes or substitutions shall be covered within the protection scope of the present invention. The protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (20)

1. A face occlusion detection method, characterized in that the face occlusion detection method comprises:
receiving an input image;
detecting a human face in the input image by using a trained neural network, and outputting a thermodynamic diagram of the human face, wherein the thermodynamic diagram presents the human face in heat-map form; and
detecting whether an occlusion exists on the face based on the thermodynamic diagram;
the step of detecting whether the face has an occlusion based on the thermodynamic diagram comprises:
detecting whether the shape of the face in the thermodynamic diagram conforms to the shape of objects of the face's class, and determining that the face is not occluded when the shape of the face in the thermodynamic diagram conforms to the shape of objects of the face's class, otherwise determining that the face is occluded; or
detecting whether the region of the face in the thermodynamic diagram is a connected region, determining that the face is not occluded when the region of the face in the thermodynamic diagram is detected to be a connected region, and otherwise determining that the face is occluded.
2. The face occlusion detection method of claim 1, wherein the trained neural network is a fully convolutional neural network; the fully convolutional neural network is formed by extracting an embedding mapping from a convolutional neural network capable of recognizing human faces and then appending a thermodynamic diagram output layer; the thermodynamic diagram output layer includes a convolutional layer.
3. The face occlusion detection method of claim 1, wherein the trained neural network is a fully convolutional neural network; the fully convolutional neural network is generated by replacing a fully-connected layer of a trained convolutional neural network capable of recognizing human faces with a convolutional layer.
4. The face occlusion detection method of claim 3, wherein the convolutional layer is a 1 x 1 convolutional layer.
5. The face occlusion detection method of claim 3, wherein the fully convolutional neural network further comprises an upsampling layer.
6. The face occlusion detection method of claim 2, wherein the thermodynamic diagram output layer further comprises an upsampling layer.
7. The face occlusion detection method of claim 5 or 6, characterized in that the number of upsampling layers depends on the desired resolution of the output thermodynamic diagram.
8. The face occlusion detection method of claim 2 or 3, characterized in that the fully convolutional neural network is trained by the following steps:
in the pre-training stage, training a convolutional neural network capable of recognizing a human face;
in the fine tuning stage, a fully convolutional neural network is generated based on the convolutional neural network capable of recognizing the human face, and the fully convolutional neural network is trained to become a neural network that outputs a thermodynamic diagram.
9. The face occlusion detection method of claim 1, further comprising:
after the thermodynamic diagram is output, labeling the human face in the thermodynamic diagram;
wherein the detection of whether the face is occluded is based on the labeled thermodynamic diagram.
10. The face occlusion detection method of claim 1, further comprising:
when the fact that the face in the thermodynamic diagram is occluded is detected, an occlusion area is determined.
11. A face occlusion detection device, comprising:
a receiving module for receiving an input image;
the thermodynamic diagram output module is used for detecting a human face in the input image by using the trained neural network and outputting a thermodynamic diagram of the human face, wherein the thermodynamic diagram presents the human face in heat-map form; and
an occlusion detection module for detecting whether an occlusion exists in the face based on the thermodynamic diagram;
the occlusion detection module is further configured to:
detecting whether the shape of the face in the thermodynamic diagram conforms to the shape of objects of the face's class, and determining that the face is not occluded when the shape of the face in the thermodynamic diagram conforms to the shape of objects of the face's class, otherwise determining that the face is occluded; or
detecting whether the region of the face in the thermodynamic diagram is a connected region, determining that the face is not occluded when the region of the face in the thermodynamic diagram is detected to be a connected region, and otherwise determining that the face is occluded.
12. The face occlusion detection device of claim 11, wherein the trained neural network is a fully convolutional neural network; the fully convolutional neural network is formed by extracting an embedding mapping from a convolutional neural network capable of recognizing human faces and then appending a thermodynamic diagram output layer; the thermodynamic diagram output layer includes a convolutional layer.
13. The face occlusion detection device of claim 11, wherein the trained neural network is a fully convolutional neural network; the fully convolutional neural network is generated by replacing a fully-connected layer of a trained convolutional neural network capable of recognizing human faces with a convolutional layer.
14. The face occlusion detection device of claim 13, wherein the convolutional layers are 1 x 1 convolutional layers.
15. The face occlusion detection device of claim 13, wherein the fully convolutional neural network further comprises an upsampling layer.
16. The face occlusion detection device of claim 12, wherein the thermodynamic diagram output layer further comprises an upsampling layer.
17. The face occlusion detection apparatus of claim 15 or 16, wherein the number of upsampling layers depends on a desired resolution of the output thermodynamic diagram.
18. The face occlusion detection device of claim 12 or 13, wherein the fully convolutional neural network is trained by:
in the pre-training stage, training a convolutional neural network capable of recognizing a human face;
in the fine tuning stage, a fully convolutional neural network is generated based on the convolutional neural network capable of recognizing the human face, and the fully convolutional neural network is trained to become a neural network that outputs a thermodynamic diagram.
19. The face occlusion detection device of claim 11, further comprising:
the automatic labeling module is used for labeling the human face in the thermodynamic diagram; and is
The occlusion detection module is further configured to detect whether an occlusion exists in the face based on the labeled thermodynamic diagram.
20. The face occlusion detection device of claim 11, wherein the occlusion detection module is further configured to:
when the fact that the face in the thermodynamic diagram is occluded is detected, an occlusion area is determined.
CN201611191940.9A 2016-12-21 2016-12-21 Target object shielding detection method and device Active CN106650662B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611191940.9A CN106650662B (en) 2016-12-21 2016-12-21 Target object shielding detection method and device


Publications (2)

Publication Number Publication Date
CN106650662A CN106650662A (en) 2017-05-10
CN106650662B true CN106650662B (en) 2021-03-23

Family

ID=58833491

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611191940.9A Active CN106650662B (en) 2016-12-21 2016-12-21 Target object shielding detection method and device

Country Status (1)

Country Link
CN (1) CN106650662B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108932456B (en) * 2017-05-23 2022-01-28 北京旷视科技有限公司 Face recognition method, device and system and storage medium
CN107330900A (en) * 2017-06-22 2017-11-07 成都品果科技有限公司 A kind of automatic portrait dividing method
CN108875537B (en) * 2018-02-28 2022-11-08 北京旷视科技有限公司 Object detection method, device and system and storage medium
CN108875540B (en) * 2018-03-12 2021-11-05 北京旷视科技有限公司 Image processing method, device and system and storage medium
CN109146924B (en) * 2018-07-18 2020-09-08 苏州飞搜科技有限公司 Target tracking method and device based on thermodynamic diagram
CN109063695A (en) * 2018-09-18 2018-12-21 图普科技(广州)有限公司 A kind of face critical point detection method, apparatus and its computer storage medium
CN109389599A (en) * 2018-10-25 2019-02-26 北京阿丘机器人科技有限公司 A kind of defect inspection method and device based on deep learning
CN110287760A (en) * 2019-03-28 2019-09-27 电子科技大学 A kind of human face five-sense-organ point occlusion detection method based on deep learning
CN110428394B (en) * 2019-06-14 2022-04-26 北京迈格威科技有限公司 Method, apparatus and computer storage medium for target movement detection
CN112288676A (en) * 2019-07-12 2021-01-29 富士通株式会社 Method and device for detecting product defects
CN111126402B (en) * 2019-11-04 2023-11-03 京东科技信息技术有限公司 Image processing method and device, electronic equipment and storage medium
CN111126379B (en) * 2019-11-22 2022-05-17 苏州浪潮智能科技有限公司 Target detection method and device
CN111046858B (en) * 2020-03-18 2020-09-08 成都大熊猫繁育研究基地 Image-based animal species fine classification method, system and medium
CN111695495B (en) * 2020-06-10 2023-11-14 杭州萤石软件有限公司 Face recognition method, electronic equipment and storage medium
CN113468975A (en) * 2021-06-09 2021-10-01 浙江大华技术股份有限公司 Fighting behavior detection method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103942535B (en) * 2014-03-28 2017-04-12 广东威创视讯科技股份有限公司 Multi-target tracking method and device
CN104978582B (en) * 2015-05-15 2018-01-30 苏州大学 Shelter target recognition methods based on profile angle of chord feature
CN105631427A (en) * 2015-12-29 2016-06-01 北京旷视科技有限公司 Suspicious personnel detection method and system
US9424494B1 (en) * 2016-01-28 2016-08-23 International Business Machines Corporation Pure convolutional neural network localization
CN106204587B (en) * 2016-05-27 2019-01-08 浙江德尚韵兴图像科技有限公司 Multiple organ dividing method based on depth convolutional neural networks and region-competitive model



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100190 Beijing, Haidian District Academy of Sciences, South Road, No. 2, block A, No. 313

Applicant after: MEGVII INC.

Applicant after: Beijing maigewei Technology Co., Ltd.

Address before: 100190 Beijing, Haidian District Academy of Sciences, South Road, No. 2, block A, No. 313

Applicant before: MEGVII INC.

Applicant before: Beijing aperture Science and Technology Ltd.

GR01 Patent grant