CN108563982B - Method and apparatus for detecting image - Google Patents

Method and apparatus for detecting image

Info

Publication number
CN108563982B
CN108563982B (application CN201810011481.4A)
Authority
CN
China
Prior art keywords: face, image, preset, images, local
Prior art date
Legal status (an assumption, not a legal conclusion; Google has not performed a legal analysis): Active
Application number
CN201810011481.4A
Other languages
Chinese (zh)
Other versions
CN108563982A (en)
Inventor
汤旭
Current Assignee (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (an assumption, not a legal conclusion; Google has not performed a legal analysis)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201810011481.4A priority Critical patent/CN108563982B/en
Publication of CN108563982A publication Critical patent/CN108563982A/en
Application granted granted Critical
Publication of CN108563982B publication Critical patent/CN108563982B/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Abstract

An embodiment of the present application discloses a method and apparatus for detecting an image. In one embodiment, the method comprises: starting from a target position of a target image that presents at least one face, moving a preset rectangular frame in the target image at least once according to a preset direction and a preset moving distance, to obtain at least two local images framed by the rectangular frame; determining the degree of overlap between each of the at least one face and at least one of the at least two local images; selecting a preset number of local images from the at least one local image in descending order of overlap; and performing face detection on the selected preset number of local images. By framing at least two local images and determining their overlap, the recall rate of face detection is improved.

Description

Method and apparatus for detecting image
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to the technical field of internet, and particularly relates to a method and a device for detecting images.
Background
Face detection refers to searching any given image, according to a certain strategy, to determine whether it contains a face, and, if so, returning information such as the position, size, and pose of the face.
Disclosure of Invention
The embodiment of the application provides a method and a device for detecting an image.
In a first aspect, an embodiment of the present application provides a method for detecting an image, including: starting from a target position of a target image that presents at least one face, moving a preset rectangular frame in the target image at least once according to a preset direction and a preset moving distance, to obtain at least two local images framed by the rectangular frame; determining the degree of overlap between each of the at least one face and at least one of the at least two local images; selecting a preset number of local images from the at least one local image in descending order of overlap; and performing face detection on the selected preset number of local images.
In some embodiments, determining the degree of overlap of each of the at least one face with at least one of the at least two partial images comprises: for each face of the at least one face, selecting from the at least two partial images at least one partial image that overlaps the face; and determining the size of the intersection and the size of the union of the face and each of the selected partial images, and taking the ratio of the intersection size to the union size as the degree of overlap.
In some embodiments, the performing face detection on the selected preset number of partial images includes: and performing face detection on the selected local images with the preset number by adopting a convolutional neural network.
In some embodiments, the performing face detection on the selected preset number of partial images includes: performing face detection on the selected preset number of partial images using a non-maximum suppression (NMS) algorithm.
In a second aspect, an embodiment of the present application provides an apparatus for detecting an image, including: the moving unit is configured to move a preset rectangular frame in the target image at least once according to a preset direction and a preset moving distance from the target position of the target image to obtain at least two local images framed by the rectangular frame, wherein the target image presents at least one face; the determining unit is configured to determine the overlapping degree of each face in the at least one face and at least one partial image in the at least two partial images; the selecting unit is configured to select a preset number of local images from at least one local image according to the sequence of the overlapping degrees from large to small; and the detection unit is configured to perform face detection on the selected local images with the preset number.
In some embodiments, the determining unit is further configured to: for each face in at least one face, selecting at least one partial image overlapped with the face from at least two partial images; and determining the size of the intersection and the size of the union of the face and each local image in at least one local image, and determining the ratio of the size of the intersection to the size of the union as the overlapping degree.
In some embodiments, the detection unit is further configured to: and performing face detection on the selected local images with the preset number by adopting a convolutional neural network.
In some embodiments, the detection unit is further configured to: perform face detection on the selected preset number of partial images using a non-maximum suppression (NMS) algorithm.
In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; a storage device to store one or more programs that, when executed by one or more processors, cause the one or more processors to implement a method as in any embodiment of a method for detecting an image.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, which when executed by a processor, implements a method as in any one of the embodiments of the method for detecting an image.
According to the method and apparatus for detecting an image, a preset rectangular frame is first moved at least once in a target image, according to a preset direction and a preset moving distance, starting from a target position of the target image, to obtain at least two local images framed by the rectangular frame, where the target image presents at least one face. Then, the degree of overlap between each of the at least one face and at least one of the at least two local images is determined. Next, a preset number of local images are selected from the at least one local image in descending order of overlap. Finally, face detection is performed on the selected preset number of local images. By framing at least two local images and determining their overlap, the recall rate of face detection is improved.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for detecting an image according to the present application;
FIG. 3 is a schematic illustration of an application scenario of a method for detecting an image according to the present application;
FIG. 4 is a flow diagram of yet another embodiment of a method for detecting an image according to the present application;
FIG. 5 is a flow diagram of yet another embodiment of a method for detecting an image according to the present application;
FIG. 6 is a schematic block diagram of one embodiment of an apparatus for detecting an image according to the present application;
FIG. 7 is a block diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method for detecting an image or the apparatus for detecting an image of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. Various communication client applications, such as an image acquisition application, an image processing application, a face recognition application, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, e-book readers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server that provides various services, such as a background server that provides support for content displayed on the terminal devices 101, 102, 103. The background server can analyze and process the received data such as the target image and the like, and feed back the processing result to the terminal equipment.
It should be noted that the method for detecting an image provided in the embodiment of the present application may be executed by the terminal devices 101, 102, 103 or by the server 105; accordingly, the apparatus for detecting an image may be disposed in the terminal devices 101, 102, 103 or in the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for detecting an image in accordance with the present application is shown. The method for detecting the image comprises the following steps:
step 201, moving a preset rectangular frame in the target image at least once according to a preset direction and a preset moving distance from the target position of the target image to obtain at least two local images framed by the rectangular frame.
In this embodiment, an electronic device (for example, the server shown in fig. 1) on which the method for detecting an image runs may move a preset rectangular frame in the target image at least once, according to a preset direction and a preset moving distance, starting from a target position of the target image, so as to obtain at least two partial images framed by the rectangular frame. The size of the rectangular frame is preset. The target position may be any position set in the target image. The preset direction may include one direction, or two or more directions. For example, the rectangular frame may be moved along multiple rows of the image, where each row starts at a preset point such as the left end and the preset direction is from left to right. Alternatively, the first row may start at the left end and move left to right, while the second row starts at the right end and moves right to left (a serpentine scan).
In practice, the partial images may include the partial image framed by the rectangular frame at its initial position, as well as the partial images framed after each movement.
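As an illustration, the sliding-window procedure of step 201 can be sketched as follows. This is a minimal sketch under assumed conventions (boxes represented as (x, y, w, h) tuples, a single left-to-right, top-to-bottom scan); the patent does not prescribe a concrete implementation, and the function name and parameters are illustrative.

```python
def sliding_windows(img_size, box_size, step, start=(0, 0)):
    """Enumerate the (x, y, w, h) regions framed by a fixed-size rectangular
    frame slid over an image, including the initial position (step 201).

    img_size: (width, height) of the target image.
    box_size: (width, height) of the preset rectangular frame.
    step: preset moving distance in pixels.
    start: target position where the frame is initially placed.
    """
    img_w, img_h = img_size
    box_w, box_h = box_size
    x0, y0 = start
    windows = []
    # Move the frame row by row; every framed region is one local image.
    for y in range(y0, img_h - box_h + 1, step):
        for x in range(x0, img_w - box_w + 1, step):
            windows.append((x, y, box_w, box_h))
    return windows
```

For a 100x100 image, a 50x50 frame, and a step of 25, this yields a 3x3 grid of nine local images, matching the "at least two local images" the method requires.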
Step 202, determining the overlapping degree of each face in the at least one face and at least one partial image in the at least two partial images.
In this embodiment, the electronic device determines the degree of overlap between each of the at least one face presented in the target image and at least one of the at least two partial images. The at least one partial image may be selected randomly from the at least two partial images, or according to a preset rule. The degree of overlap is the ratio of the size of the intersection between a partial image and a presented face to the size of their union. The size here may be computed as an area or as a number of pixels.
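The overlap degree described here is the intersection-over-union (IoU) ratio, which can be sketched as follows for axis-aligned (x, y, w, h) boxes; the function name is illustrative, and sizes are computed as areas (the pixel-count variant mentioned above would differ only in how intersection and union are measured).

```python
def overlap_degree(box_a, box_b):
    """Ratio of intersection size to union size of two (x, y, w, h) boxes,
    i.e. the degree of overlap used in steps 202-203."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width/height of the intersection rectangle (0 if the boxes are disjoint).
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0
```

Identical boxes give an overlap degree of 1.0, disjoint boxes give 0.0, and partially overlapping boxes fall in between.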
Step 203, selecting a preset number of partial images from at least one partial image according to the sequence of the overlapping degrees from large to small.
In this embodiment, the electronic device selects a preset number of partial images from the at least one partial image in descending order of overlap. Specifically, the electronic device may sort the overlap degrees in descending order, take the preset number of largest overlap degrees from the head of the sorted sequence, and determine the corresponding preset number of partial images.
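The ranking-and-selection of step 203 can be sketched as follows. This is an illustrative implementation under the same (x, y, w, h) box convention as above, scoring each candidate window by its best overlap with any face; the helper and its parameters are assumptions, not the patent's prescribed code.

```python
def select_top_k(windows, faces, k):
    """Rank candidate windows by their best overlap degree with any face,
    then keep the k highest-ranked windows (step 203)."""
    def iou(a, b):
        # Intersection-over-union of two (x, y, w, h) boxes.
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
        ih = max(0, min(ay + ah, by + bh) - max(ay, by))
        inter = iw * ih
        union = aw * ah + bw * bh - inter
        return inter / union if union else 0.0

    scored = [(max(iou(f, w) for f in faces), w) for w in windows]
    scored.sort(key=lambda t: t[0], reverse=True)  # descending overlap
    return [w for _, w in scored[:k]]
```

Only the top-k windows proceed to face detection, which is what avoids spending detection effort on low-overlap local images.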
And 204, performing face detection on the selected local images with the preset number.
In this embodiment, after the preset number of partial images are selected, the electronic device may perform face detection on the selected partial images to obtain a face detection result. The face detection result may be a face feature, for example, the face feature may be position information of each feature point, or a length between feature points, or the like.
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the method for detecting an image according to the present embodiment. In the application scenario of fig. 3, the electronic device 301 moves a preset rectangular frame in the image 302 at least 9 times according to a preset direction and a preset movement distance from the coordinates (0,0) of the image 302 to obtain 10 partial images 303 framed by the rectangular frame, where the image 302 presents 2 faces. The electronic device 301 determines 304 the degree of overlap of each of the 2 faces with 6 of the 10 partial images. The 4 partial images 305 are selected from the 6 partial images in the order of the degree of overlap 304 from large to small. Face detection 306 is performed on the selected 4 partial images.
The method provided by this embodiment frames at least two local images, so that more faces can be detected from the local images, reducing missed detections and thereby improving the recall rate of face detection. Meanwhile, by selecting only a preset number of local images, the embodiment avoids wasted detection on local images with low overlap, improving detection efficiency.
With further reference to FIG. 4, a flow 400 of yet another embodiment of a method for detecting an image is shown. The flow 400 of the method for detecting an image comprises the steps of:
step 401, moving a preset rectangular frame in the target image at least once according to a preset direction and a preset moving distance from the target position of the target image to obtain at least two local images framed by the rectangular frame.
In this embodiment, the server may move the preset rectangular frame in the target image at least once according to the preset direction and the preset moving distance from the target position of the target image, so as to obtain at least two partial images framed by the rectangular frame. The size of the rectangular frame is preset. The target position may be an arbitrary position set in the target image. The preset direction may include one or two or more directions.
Step 402, for each face of at least one face, selecting at least one partial image overlapped with the face from at least two partial images.
In this embodiment, for each face of the at least one face, the server selects from the at least two partial images at least one partial image that overlaps the face. Among the at least two partial images, at least one overlaps the face, and it is these overlapping partial images that the server selects.
Step 403, determining the size of the intersection and the size of the union of the face and each local image in the at least one local image, and determining that the ratio of the size of the intersection to the size of the union is the overlapping degree.
In this embodiment, the server determines the size of the intersection and the size of the union of the face and each of the at least one partial image. Then, the ratio of the size of the intersection to the size of the union is determined, and the ratio is used as the overlapping degree.
And step 404, selecting a preset number of partial images from at least one partial image according to the sequence of the overlapping degrees from large to small.
In this embodiment, the server selects a preset number of partial images from the at least one partial image in descending order of overlap. Specifically, the server may sort the overlap degrees in descending order, take the preset number of largest overlap degrees from the head of the sorted sequence, and determine the corresponding preset number of partial images.
And 405, performing face detection on the selected local images with the preset number by using a convolutional neural network.
In this embodiment, the server performs face detection on the selected preset number of local images using a convolutional neural network. Specifically, the server may input each of the selected local images into the convolutional neural network to obtain the face features it outputs. In practice, a convolutional neural network (CNN) is a feed-forward neural network whose artificial neurons respond to units within a local receptive field; CNNs perform well on image processing, so a multilayer CNN can be used for face detection to obtain face features. It should be noted that the multilayer CNN can be obtained by supervised training of an existing deep convolutional neural network (e.g., ResNet-50) using machine learning methods and training samples. The training samples may include a large number of images and the face features of the faces those images present.
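To illustrate the "local receptive field" operation a CNN stacks in layers, here is a toy valid-mode 2-D convolution (strictly, cross-correlation, as deep-learning libraries compute it) over nested lists. This is not the trained detector itself; it only shows the core operation, and a real implementation would use a deep-learning framework with learned kernels.

```python
def conv2d(image, kernel):
    """Valid 2-D cross-correlation of an image with a kernel, the building
    block a multilayer CNN applies repeatedly to extract features.

    image, kernel: 2-D lists of numbers (rows of equal length).
    Returns a 2-D list of shape (ih-kh+1, iw-kw+1).
    """
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            # Each output value responds only to a kh x kw local region.
            s = sum(image[y + i][x + j] * kernel[i][j]
                    for i in range(kh) for j in range(kw))
            row.append(s)
        out.append(row)
    return out
```

A 3x3 image convolved with a 2x2 kernel of ones yields a 2x2 output of local sums, showing how each neuron covers only part of its input.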
In the embodiment, more accurate face detection can be realized through the convolutional neural network.
With further reference to FIG. 5, a flow 500 of yet another embodiment of a method for detecting an image is shown. The flow 500 of the method for detecting an image comprises the following steps:
step 501, moving a preset rectangular frame in the target image at least once according to a preset direction and a preset moving distance from the target position of the target image to obtain at least two local images framed by the rectangular frame.
In this embodiment, the terminal device may move the preset rectangular frame in the target image at least once, starting from the target position of the target image, according to the preset direction and the preset moving distance, so as to obtain at least two local images framed by the rectangular frame. The size of the rectangular frame is preset. The target position may be any position set in the target image. The preset direction may include one direction, or two or more directions. For example, the rectangular frame may be moved along multiple rows of the image, where each row starts at a preset point such as the left end and the preset direction is from left to right. Alternatively, the first row may start at the left end, the second row at the right end, and so on.
Step 502, determining the overlapping degree of each face in at least one face and at least one partial image in at least two partial images.
In this embodiment, the terminal device determines the degree of overlap between each of the at least one face presented in the target image and at least one of the at least two partial images. The at least one partial image may be selected randomly from the at least two partial images, or according to a preset rule. The degree of overlap is the ratio of the size of the intersection between a partial image and a presented face to the size of their union. The size here may be computed as an area or as a number of pixels.
Step 503, selecting a preset number of partial images from at least one partial image according to the sequence of the overlapping degrees from large to small.
In this embodiment, the terminal device selects a preset number of partial images from the at least one partial image in descending order of overlap. Specifically, the terminal device may sort the overlap degrees in descending order, take the preset number of largest overlap degrees from the head of the sorted sequence, and determine the corresponding preset number of partial images.
And step 504, performing face detection on the selected preset number of local images using a non-maximum suppression (NMS) algorithm.
In this embodiment, the terminal device performs face detection on the selected preset number of local images using a non-maximum suppression algorithm. Non-maximum suppression selects, for each presented face, the local image with the highest confidence from among the candidate local images that overlap the face and one another, and suppresses the rest; face detection is then performed on the remaining, mutually non-overlapping local images.
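The greedy non-maximum suppression step can be sketched as follows, again under the (x, y, w, h) box convention; the function name, the score inputs, and the 0.5 default threshold are illustrative assumptions, not values fixed by the patent.

```python
def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: repeatedly keep the highest-scoring box and discard any
    remaining box that overlaps it above iou_threshold (step 504).

    Returns the indices of the kept boxes.
    """
    def iou(a, b):
        # Intersection-over-union of two (x, y, w, h) boxes.
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
        ih = max(0, min(ay + ah, by + bh) - max(ay, by))
        inter = iw * ih
        union = aw * ah + bw * bh - inter
        return inter / union if union else 0.0

    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Suppress every remaining box that overlaps the kept one too much.
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep
```

Two near-duplicate candidate boxes collapse to the single higher-confidence one, while a distant box survives, which is exactly the de-duplication behavior this step relies on.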
According to the embodiment, more accurate face detection can be realized through a non-maximum suppression algorithm.
With further reference to fig. 6, as an implementation of the methods shown in the above-mentioned figures, the present application provides an embodiment of an apparatus for detecting an image, which corresponds to the method embodiment shown in fig. 2, and which is particularly applicable in various electronic devices.
As shown in fig. 6, the apparatus 600 for detecting an image of the present embodiment includes: a moving unit 601, a determining unit 602, a selecting unit 603 and a detecting unit 604. The moving unit 601 is configured to move a preset rectangular frame in the target image at least once according to a preset direction and a preset moving distance from a target position of the target image to obtain at least two local images framed by the rectangular frame, where the target image presents at least one face; a determining unit 602 configured to determine an overlapping degree of each face of the at least one face with at least one partial image of the at least two partial images; a selecting unit 603 configured to select a preset number of partial images from the at least one partial image according to a sequence of overlapping degrees from large to small; the detecting unit 604 is configured to perform face detection on the selected preset number of partial images.
In this embodiment, the moving unit 601 may move a preset rectangular frame in the target image at least once according to a preset direction and a preset moving distance from the target position of the target image, so as to obtain at least two partial images framed by the rectangular frame. The size of the rectangular frame is preset. The target position may be an arbitrary position set in the target image. The preset direction may include one or two or more directions.
In the present embodiment, the determining unit 602 determines the degree of overlap between each of at least one face presented in the target image and at least one of the at least two partial images. The at least one partial image may be selected randomly from the at least two partial images, or according to a preset rule. The degree of overlap is the ratio of the size of the intersection between a partial image and a presented face to the size of their union. The size here may be computed as an area or as a number of pixels.
In this embodiment, the selecting unit 603 selects a preset number of partial images from the at least one partial image in descending order of overlap. Specifically, the selecting unit 603 may sort the overlap degrees in descending order, take the preset number of largest overlap degrees from the head of the sorted sequence, and determine the corresponding preset number of partial images.
In this embodiment, after selecting a preset number of partial images, the detection unit 604 may perform face detection on the selected partial images to obtain a face detection result. The face detection result may be a face feature, for example, the face feature may be position information of each feature point, or a length between feature points, or the like.
In some optional implementations of this embodiment, the determining unit is further configured to: for each face in at least one face, selecting at least one partial image overlapped with the face from at least two partial images; and determining the size of the intersection and the size of the union of the face and each local image in at least one local image, and determining the ratio of the size of the intersection to the size of the union as the overlapping degree.
In some optional implementations of this embodiment, the detection unit is further configured to: and performing face detection on the selected local images with the preset number by adopting a convolutional neural network.
In some optional implementations of this embodiment, the detection unit is further configured to: perform face detection on the selected preset number of partial images using a non-maximum suppression (NMS) algorithm.
Fig. 7 shows a schematic structural diagram of a computer device suitable for implementing an electronic apparatus according to an embodiment of the present application. As shown in fig. 7, the computer apparatus 700 includes a Central Processing Unit (CPU)701, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the apparatus 700 are also stored. The CPU701, the ROM 702, and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to the I/O interface 705: an input portion 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is mounted into the storage section 708 as necessary.
In particular, according to embodiments of the application, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program, when executed by the Central Processing Unit (CPU)701, performs the above-described functions defined in the method of the present application. It should be noted that the computer readable medium of the present application can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In the present application, by contrast, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic signals, optical signals, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wireline, optical fiber cable, RF, and the like, or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or by hardware. The described units may also be provided in a processor, which may be described as: a processor comprising a moving unit, a determining unit, a selecting unit, and a detecting unit. The names of these units do not in some cases constitute a limitation on the units themselves; for example, the determining unit may also be described as "a unit for determining the degree of overlap of each face of the at least one face with at least one of the at least two local images".
As another aspect, the present application also provides a computer-readable medium, which may be included in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: move a preset rectangular frame within a target image at least once, starting from a target position of the target image, according to a preset direction and a preset moving distance, to obtain at least two local images framed by the rectangular frame, wherein the target image presents at least one face; determine the degree of overlap of each face of the at least one face with at least one local image of the at least two local images; select a preset number of local images from the at least one local image in descending order of the degree of overlap; and perform face detection on the selected preset number of local images.
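The steps carried by the programs above can be sketched in a few lines of Python. The following is an illustrative sketch only, not the patented implementation: all function names, the two scan directions (right and down), and the parameter values are assumptions made for demonstration.

```python
# Hypothetical sketch of the described pipeline: slide a preset rectangular
# frame across a target image, score each local image by its degree of
# overlap (intersection over union) with the face boxes, and keep the
# preset number of local images with the largest overlap. Boxes are
# (x, y, w, h) tuples; the frame moves right and down by a fixed step.

def slide_frame(image_w, image_h, frame_w, frame_h, step, start=(0, 0)):
    """Move a preset frame from the start (target) position by a preset
    moving distance; collect every local image the frame delimits."""
    x0, y0 = start
    boxes = []
    for y in range(y0, image_h - frame_h + 1, step):
        for x in range(x0, image_w - frame_w + 1, step):
            boxes.append((x, y, frame_w, frame_h))
    return boxes

def overlap_degree(face, box):
    """Ratio of the size of the intersection to the size of the union
    of a face box and a local-image box."""
    fx, fy, fw, fh = face
    bx, by, bw, bh = box
    ix = max(0, min(fx + fw, bx + bw) - max(fx, bx))
    iy = max(0, min(fy + fh, by + bh) - max(fy, by))
    inter = ix * iy
    union = fw * fh + bw * bh - inter
    return inter / union if union else 0.0

def select_top_k(faces, boxes, k):
    """Score every local image by its best overlap with any face and
    return the k local images in descending order of overlap degree."""
    scored = [(max(overlap_degree(f, b) for f in faces), b) for b in boxes]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [b for _, b in scored[:k]]

boxes = slide_frame(8, 8, 4, 4, 2)        # 3 x 3 positions on an 8x8 image
faces = [(0, 0, 4, 4)]                    # one face in the top-left corner
top = select_top_k(faces, boxes, 3)       # local images most worth detecting
```

The selected `top` boxes would then be cropped from the image and passed to the trained face detector; only the selection logic is shown here.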
The above description is only a preferred embodiment of the present application and an illustration of the technical principles employed. Those skilled in the art will appreciate that the scope of the invention disclosed herein is not limited to the particular combination of features described above, but also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the inventive concept, for example, solutions in which the above features are replaced with, but not limited to, features having similar functions disclosed in the present application.

Claims (6)

1. A method for detecting an image, comprising:
moving a preset rectangular frame within a target image at least once, starting from a target position of the target image, according to a preset direction and a preset moving distance, to obtain at least two local images framed by the rectangular frame, wherein the target image presents at least one face, and the preset direction comprises one or more directions;
determining the degree of overlap of each face of the at least one face with at least one local image of the at least two local images;
selecting a preset number of local images from the at least one local image in descending order of the degree of overlap;
performing face detection on the selected preset number of local images using a trained convolutional neural network;
wherein the determining the degree of overlap of each face of the at least one face with at least one local image of the at least two local images comprises:
for each face of the at least one face, selecting, from the at least two local images, at least one local image that overlaps with the face; determining the size of the intersection and the size of the union of the face and each local image of the at least one local image, and determining the ratio of the size of the intersection to the size of the union as the degree of overlap.
2. The method for detecting an image according to claim 1, wherein the performing face detection on the selected preset number of local images comprises:
performing face detection on the selected preset number of local images using a non-maximum suppression algorithm.
3. An apparatus for detecting an image, comprising:
a moving unit configured to move a preset rectangular frame within a target image at least once, starting from a target position of the target image, according to a preset direction and a preset moving distance, to obtain at least two local images framed by the rectangular frame, wherein the target image presents at least one face, and the preset direction comprises one or more directions;
a determining unit configured to determine the degree of overlap of each face of the at least one face with at least one local image of the at least two local images;
a selecting unit configured to select a preset number of local images from the at least one local image in descending order of the degree of overlap;
a detecting unit configured to perform face detection on the selected preset number of local images using a trained convolutional neural network;
wherein the determining unit is further configured to:
for each face of the at least one face, select, from the at least two local images, at least one local image that overlaps with the face; determine the size of the intersection and the size of the union of the face and each local image of the at least one local image, and determine the ratio of the size of the intersection to the size of the union as the degree of overlap.
4. The apparatus for detecting an image according to claim 3, wherein the detecting unit is further configured to:
perform face detection on the selected preset number of local images using a non-maximum suppression algorithm.
5. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-2.
6. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-2.
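Claims 2 and 4 perform the final detection step with a non-maximum suppression algorithm, which merges overlapping candidate detections. A minimal sketch follows; it is illustrative only (the patent does not disclose this exact code), and the IoU threshold of 0.5 is an assumed default.

```python
# Minimal non-maximum suppression sketch: repeatedly keep the highest-
# scoring detection and discard any remaining detection whose IoU with
# it exceeds a threshold. Boxes are (x, y, w, h) tuples.

def iou(a, b):
    """Intersection-over-union of two boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def non_max_suppression(detections, iou_threshold=0.5):
    """detections: list of (score, box) pairs; returns the kept pairs."""
    remaining = sorted(detections, key=lambda d: d[0], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        # Suppress every detection that overlaps the kept one too much.
        remaining = [d for d in remaining
                     if iou(best[1], d[1]) <= iou_threshold]
    return kept

dets = [(0.9, (0, 0, 10, 10)),    # strongest detection of a face
        (0.8, (1, 1, 10, 10)),    # near-duplicate, should be suppressed
        (0.7, (20, 20, 10, 10))]  # distinct face, should survive
kept = non_max_suppression(dets)
```

Here the second detection overlaps the first heavily (IoU ≈ 0.68 > 0.5) and is dropped, while the non-overlapping third detection is kept.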
CN201810011481.4A 2018-01-05 2018-01-05 Method and apparatus for detecting image Active CN108563982B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810011481.4A CN108563982B (en) 2018-01-05 2018-01-05 Method and apparatus for detecting image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810011481.4A CN108563982B (en) 2018-01-05 2018-01-05 Method and apparatus for detecting image

Publications (2)

Publication Number Publication Date
CN108563982A CN108563982A (en) 2018-09-21
CN108563982B true CN108563982B (en) 2020-01-17

Family

ID=63530200

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810011481.4A Active CN108563982B (en) 2018-01-05 2018-01-05 Method and apparatus for detecting image

Country Status (1)

Country Link
CN (1) CN108563982B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109242801B (en) * 2018-09-26 2021-07-02 Beijing ByteDance Network Technology Co., Ltd. Image processing method and device
CN111291825A (en) * 2020-02-25 2020-06-16 Ping An Technology (Shenzhen) Co., Ltd. Focus classification model training method and device, computer equipment and storage medium
CN111353473B (en) * 2020-03-30 2023-04-14 Zhejiang Dahua Technology Co., Ltd. Face detection method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6445819B1 (en) * 1998-09-10 2002-09-03 Fuji Photo Film Co., Ltd. Image processing method, image processing device, and recording medium
CN102750929A (en) * 2011-04-20 2012-10-24 Sony Corporation Image display control device, image display system, image display control method and computer program
CN103400391A (en) * 2013-08-09 2013-11-20 Beijing Bositing Technology Co., Ltd. Multiple-target tracking method and device based on improved random forest
CN106803263A (en) * 2016-11-29 2017-06-06 Shenzhen Intellifusion Technologies Co., Ltd. Target tracking method and device
CN107066943A (en) * 2017-03-06 2017-08-18 Institute of Information Engineering, Chinese Academy of Sciences Face detection method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3855939B2 (en) * 2003-01-31 2006-12-13 Sony Corporation Image processing apparatus, image processing method, and photographing apparatus
CN103996186B (en) * 2014-04-29 2017-03-15 Xiaomi Technology Co., Ltd. Image cropping method and device
CN106295515B (en) * 2016-07-28 2019-10-15 Beijing Xiaomi Mobile Software Co., Ltd. Method and device for determining the face region in an image
CN106529398B (en) * 2016-09-23 2018-06-29 Shenzhen Intellifusion Technologies Co., Ltd. Fast and accurate face detection method based on a cascade structure
CN106372624B (en) * 2016-10-15 2020-04-14 Hangzhou Aimi Robot Co., Ltd. Face recognition method and system
CN107220618B (en) * 2017-05-25 2019-12-24 Institute of Automation, Chinese Academy of Sciences Face detection method and device, computer readable storage medium and equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Is the Fusiform Face Area Specialized for Faces, Individuation, or Expert Individuation?; Gillian Rhodes et al.; Journal of Cognitive Neuroscience; 20141231; pp. 189-203 *
Research on Target Tracking Algorithm Based on the TLD Framework; Gong Xiaobiao; China Master's Theses Full-text Database (Information Science and Technology); 20140915; pp. I135-111 *

Also Published As

Publication number Publication date
CN108563982A (en) 2018-09-21

Similar Documents

Publication Publication Date Title
US10902245B2 (en) Method and apparatus for facial recognition
US10762387B2 (en) Method and apparatus for processing image
US10691928B2 (en) Method and apparatus for facial recognition
CN108230346B (en) Method and device for segmenting semantic features of image and electronic equipment
CN109308469B (en) Method and apparatus for generating information
CN108960316B (en) Method and apparatus for generating a model
CN109255337B (en) Face key point detection method and device
CN109389072B (en) Data processing method and device
CN109472264B (en) Method and apparatus for generating an object detection model
CN109242801B (en) Image processing method and device
CN110516678B (en) Image processing method and device
US20210200971A1 (en) Image processing method and apparatus
CN113436100B (en) Method, apparatus, device, medium, and article for repairing video
CN108563982B (en) Method and apparatus for detecting image
CN110059623B (en) Method and apparatus for generating information
CN109446442B (en) Method and apparatus for processing information
CN109118456B (en) Image processing method and device
CN112631947B (en) Test control method and device for application program, electronic equipment and storage medium
US20220398689A1 (en) Facial image deformation method, apparatus, electronic device and computer readable medium
CN112306793A (en) Method and device for monitoring webpage
CN111461967B (en) Picture processing method, device, equipment and computer readable medium
CN114792355A (en) Virtual image generation method and device, electronic equipment and storage medium
CN110223220B (en) Method and device for processing image
CN108664948B (en) Method and apparatus for generating information
CN115756461A (en) Annotation template generation method, image identification method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant