CN112948614A - Image processing method, image processing device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN112948614A
CN112948614A
Authority
CN
China
Prior art keywords
image, cluster, face, clusters, images
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110220469.6A
Other languages
Chinese (zh)
Other versions
CN112948614B (en)
Inventor
周洋杰
陈亮辉
付琰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110220469.6A priority Critical patent/CN112948614B/en
Publication of CN112948614A publication Critical patent/CN112948614A/en
Application granted granted Critical
Publication of CN112948614B publication Critical patent/CN112948614B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/5866 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/587 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/166 Detection; Localisation; Normalisation using acquisition arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Library & Information Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides an image processing method, an image processing apparatus, an electronic device, and a storage medium, relating to the field of artificial intelligence and in particular to image processing. The scheme is implemented as follows. An image processing method includes: acquiring a set of images of a first face captured within a first time period, the set comprising a plurality of face images whose mutual similarity is greater than a first threshold; clustering the plurality of face images based on the geographic location at which each was captured, to construct a plurality of image clusters; determining a first image cluster of the plurality of image clusters whose difference from a second image cluster of the plurality exceeds a second threshold, and setting the second image cluster as a main image cluster and the first image cluster as an alternative image cluster; and, in response to determining that at least one face image in the alternative image cluster has an association with the face images in the main image cluster with respect to capture time and capture location, removing the other face images of the alternative image cluster from the image set.

Description

Image processing method, image processing device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence, in particular to the field of image processing, and provides an image processing method, an image processing apparatus, an electronic device, a computer-readable storage medium, and a computer program product.
Background
With the development of image processing techniques related to face recognition, these techniques are increasingly used in application scenarios such as smart cities and public safety. As the number of images grows, it becomes desirable to aggregate face images belonging to the same person into an associated face image archive for further processing.
The approaches described in this section are not necessarily approaches that have been previously conceived or pursued. Unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, unless otherwise indicated, the problems mentioned in this section should not be considered as having been acknowledged in any prior art.
Disclosure of Invention
The present disclosure provides an image processing method, an image processing apparatus, an electronic device, a computer-readable storage medium, and a computer program product.
According to an aspect of the present disclosure, there is provided an image processing method including: acquiring a set of images of a first face captured within a first time period, the set comprising a plurality of face images whose mutual similarity is greater than a first threshold; clustering the plurality of face images based on the geographic location at which each was captured, to construct a plurality of image clusters; determining a first image cluster of the plurality of image clusters such that the difference in face image similarity between the first image cluster and a second image cluster is greater than a second threshold, the second image cluster being an image cluster of the plurality other than the first image cluster, and setting the second image cluster as a main image cluster and the first image cluster as an alternative image cluster; and, in response to determining that at least one face image in the alternative image cluster has an association with the face images in the main image cluster with respect to capture time and capture location, removing the face images other than the at least one face image in the alternative image cluster from the image set.
According to another aspect of the present disclosure, there is provided an image processing apparatus including: an acquisition module configured to acquire a set of images of a first face captured within a first time period, the set comprising a plurality of face images whose mutual similarity is greater than a first threshold; a clustering module configured to cluster the plurality of face images based on the geographic location at which each was captured, to construct a plurality of image clusters; a splitting module configured to determine a first image cluster of the plurality of image clusters such that the difference in face image similarity between the first image cluster and a second image cluster is greater than a second threshold, the second image cluster being an image cluster of the plurality other than the first image cluster, and to set the second image cluster as a main image cluster and the first image cluster as an alternative image cluster; and a recall module configured to remove, from the image set, the face images other than at least one face image in the alternative image cluster, in response to determining that the at least one face image has an association with the face images in the main image cluster with respect to capture time and capture location.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the image processing method as described above.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the image processing method as described above.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program, wherein the computer program implements the image processing method described above when executed by a processor.
According to one or more embodiments of the present disclosure, face images that do not belong to the same person can be accurately excluded, and face images that were left unaggregated can be recalled, thereby improving both the precision and the recall of face image aggregation.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the disclosure and, together with the description, serve to explain their exemplary implementations. The illustrated embodiments are for purposes of illustration only and do not limit the scope of the claims. Throughout the drawings, identical reference numbers designate similar, but not necessarily identical, elements.
FIG. 1 shows a flow diagram of an image processing method according to an embodiment of the present disclosure;
FIG. 2 shows a schematic diagram of steps for recalling a face image according to an embodiment of the present disclosure;
FIG. 3 shows a block diagram of an image processing apparatus according to an embodiment of the present disclosure;
FIG. 4 shows a block diagram of an image processing apparatus according to another embodiment of the present disclosure; and
FIG. 5 illustrates a block diagram of an exemplary electronic device that can be used to implement embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the present disclosure, unless otherwise specified, the use of the terms "first", "second", etc. to describe various elements is not intended to limit the positional relationship, the timing relationship, or the importance relationship of the elements, and such terms are used only to distinguish one element from another. In some examples, a first element and a second element may refer to the same instance of the element, and in some cases, based on the context, they may also refer to different instances.
The terminology used in the description of the various described examples in this disclosure is for the purpose of describing particular examples only and is not intended to be limiting. Unless the context clearly indicates otherwise, if the number of elements is not specifically limited, the elements may be one or more. Furthermore, the term "and/or" as used in this disclosure is intended to encompass any and all possible combinations of the listed items.
Embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.
In the related art, when face images belonging to the same person are aggregated, two problems can arise: face images of different people may be aggregated together, and face images that do belong to the same person may fail to be aggregated.
In order to solve the above technical problem, the present disclosure provides an image processing method. Fig. 1 shows a flowchart of an image processing method according to an embodiment of the present disclosure. As shown in fig. 1, the image processing method of the present disclosure may include:
s101, acquiring an image set shot in a first time period and related to a first face, wherein the image set comprises a plurality of face images, and the similarity between the face images is greater than a first threshold value;
s102, clustering the face images based on the shot geographic position of each face image to construct a plurality of image clusters;
s103, determining a first image cluster in the plurality of image clusters, wherein the difference of the similarity of the face images between the first image cluster and a second image cluster is larger than a second threshold value, the second image cluster is an image cluster except the first image cluster in the plurality of image clusters, the second image cluster is set as a main image cluster, and the first image cluster is set as an alternative image cluster; and
s104, in response to the fact that at least one face image in the alternative image cluster and the face image in the main image cluster have an incidence relation with respect to shooting time and shooting place, removing the face images except the at least one face image in the alternative image cluster from the image set.
According to the image processing method of the embodiments of the present disclosure, on the one hand, a spatial factor beyond mere similarity is introduced into the initial image set to cluster the face images, so that face images that do not belong to the same person can be accurately excluded. On the other hand, clustering on a spatial factor may also eliminate some face images that do belong to the same person. By further examining the spatio-temporal association between the face images in the alternative image cluster and those in the main image cluster, the face images that truly belong to the same person can be identified among those previously excluded, while the face images that truly belong to someone else are removed; the unaggregated face images are thereby recalled. In this way, both the precision and the recall of face image aggregation are improved.
In step S101, the first face may belong to any person of interest. Accordingly, the first time period may be any selected period of interest, such as a particular day or week. In one example, the image set may be provided externally, for instance by a specialized image provider. The image set may comprise the plurality of face images that are most similar, i.e. closest, to the face of a person, and the similarity between these face images may be greater than a first threshold. Here, the first threshold may be chosen differently in different embodiments. In the present disclosure, a face image refers to an image of a person's face, and accordingly, a plurality of face images refers to a plurality of such images of that person.
Gathering face images by their similarity may also be called vector recall of the face images. In other words, the image set may be a set of face images retrieved by vector recall, i.e. a preliminary face image aggregation that needs further optimization to improve its accuracy. However, since the image set is formed using face image similarity alone, face images that do not belong to the same person may also end up aggregated in it. It is therefore necessary to exclude such face images.
Optionally, after the image set is acquired, face images that are determined not to belong to the first face, based on the time and place at which they were captured, may be removed. For example, suppose one face image of a person is captured at location D1 at time T, and another face image is captured at a location D2, 10 km away from D1, 5 minutes later. If the two face images were assumed to originate from the same person, this combination of time and location would be a clear violation of spatio-temporal logic, since covering 10 km in 5 minutes can be considered almost impossible. The two face images therefore do not originate from the same person. Thus, if the earlier face image is determined to belong to the person, the later face image can be excluded by this spatio-temporal constraint, and vice versa. In this way, face images that clearly cannot belong to the same person can be removed even from a set of face images that share a certain similarity.
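The spatio-temporal plausibility test described above can be expressed as a simple speed check. The maximum travel speed below is an assumed parameter, not a value specified by the patent:

```python
def spatiotemporally_feasible(distance_km, minutes, max_speed_kmh=60.0):
    """Return True if a person could plausibly cover distance_km in the
    given number of minutes; max_speed_kmh is an assumed upper bound."""
    if minutes <= 0:
        return distance_km == 0
    # implied speed in km/h must not exceed the assumed maximum
    return distance_km / (minutes / 60.0) <= max_speed_kmh
```

The example from the text, 10 km in 5 minutes, implies 120 km/h, so under this bound the two sightings cannot be the same person.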
In step S102, as described above, the image set, i.e. the preliminary face image aggregation, may be further optimized to improve the accuracy of aggregation. To this end, a factor other than similarity may be introduced to cluster the face images within the image set and exclude face images that do not belong to the same person. The embodiments of the present disclosure use a spatial factor, namely the geographic location (i.e. the place) at which each face image was captured, which facilitates accurately determining whether face images belong to the same person by means of spatio-temporal logic. Accordingly, the plurality of face images are clustered based on the geographic location at which each was captured, to construct a plurality of image clusters. In other words, an image cluster is a clustered subset of the plurality of face images in the image set. If there are face images that do not belong to the same person, the clustering process will clearly separate them from the other face images. This is explained in more detail below in connection with step S103.
Optionally, clustering the plurality of face images based on the geographic location at which each was captured to construct a plurality of image clusters comprises: selecting, from among the geographic locations that appear more than a predetermined number of times, at least two geographic locations whose mutual distance exceeds a predetermined distance; and clustering the face images with the face images corresponding to the at least two geographic locations as cluster centers. With this clustering approach, face images that do not belong to the same person are likely to be clustered into an image cluster that differs markedly from the other image clusters, making them easy to identify.
In one example, the clustering approach may be K-Means clustering, for which initialization centers need to be selected. To this end, according to the embodiments of the present disclosure, both the frequency of occurrence of the capture locations and their mutual distances may be considered together. Specifically, the locations at which the plurality of face images were captured may first be ranked by frequency of occurrence to determine how many times each location appears. Then, among the locations appearing more than a predetermined number of times, at least two locations whose mutual distance exceeds a predetermined distance may be selected. Here, the predetermined number and the predetermined distance may be set differently in different embodiments. The plurality of face images may then be clustered with the face images corresponding to the selected locations as cluster centers. For example, suppose the ranking identifies location A, location B close to A, and location C far from A as the three most frequent locations; the face images of locations A and C may then be selected as cluster centers. With this clustering method, face images that do not belong to the same person are likely to be clustered into an image cluster that differs markedly from the rest.
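One way to realize this center-selection rule is sketched below. The thresholds (`min_count`, `min_km`) and the use of haversine distance on (latitude, longitude) pairs are assumptions for illustration:

```python
import math
from collections import Counter

def haversine_km(p, q):
    # great-circle distance between two (lat, lon) points in kilometers
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 6371.0 * 2 * math.asin(math.sqrt(a))

def select_cluster_centers(locations, min_count=2, min_km=5.0):
    """Pick K-Means initialization centers: capture locations that occur
    at least min_count times and lie at least min_km from each other."""
    counts = Counter(locations)
    frequent = [loc for loc, c in counts.items() if c >= min_count]
    centers = []
    # consider the most frequent locations first
    for loc in sorted(frequent, key=lambda l: -counts[l]):
        if all(haversine_km(loc, c) >= min_km for c in centers):
            centers.append(loc)
    return centers
```

In the text's example, a location B close to A is skipped because it lies within `min_km` of A, while the distant location C becomes the second center. The resulting centers can then seed a standard K-Means pass (e.g. scikit-learn's `KMeans` with the `init` parameter).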
In step S103, as described above, if there are face images that do not belong to the same person, the clustering process will clearly separate them from the other face images. Accordingly, a first image cluster of the plurality of image clusters is determined such that the difference between the first image cluster and a second image cluster (i.e. the remaining image clusters) is greater than a second threshold; the second image cluster is set as the main image cluster and the first image cluster as the alternative image cluster. Here, the second threshold may be set differently in different embodiments. In other words, by evaluating the differences between the image clusters, the cluster that differs most markedly, for example the image cluster with the largest difference, can be identified as the alternative image cluster. The significance of this is that if there are face images that do not belong to the same person, they will most likely have been clustered into the alternative image cluster and thus be clearly distinguished from the main image cluster.
Optionally, determining a first image cluster of the plurality of image clusters may comprise: calculating the face image similarity between the image clusters; and, in response to determining that the difference in face image similarity between at least one image cluster and the remaining image clusters is greater than the second threshold, taking that at least one image cluster as the first image cluster. In this way, an image cluster that differs markedly can be identified through inter-cluster similarity, and its face images are likely not to belong to the same person.
Optionally, determining a first image cluster of the plurality of image clusters may comprise: calculating the face image similarity between the image clusters; and, in response to the differences in face image similarity between the image clusters all being smaller than or equal to the second threshold, determining, from the times and places at which the face images in the clusters were captured, that at least one image cluster does not belong to the same person as the remaining image clusters, and taking that image cluster as the first image cluster. That is, the at least one image cluster and the remaining image clusters present a clear violation of spatio-temporal logic, indicating that the clusters do not represent the same person. Thus, if no markedly different image cluster can be found through inter-cluster similarity, one can still be found through the spatio-temporal constraints between the image clusters, and its face images are likely not to belong to the same person.
In one example, calculating the similarity between the plurality of image clusters may include calculating the similarity between a face image included in one image cluster and a face image included in another image cluster in the plurality of image clusters, for example, an average distance, a maximum distance, a minimum distance, and/or the like may be calculated.
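A possible way to compute such inter-cluster distances over face embeddings is sketched below; the choice of cosine distance is an assumption, and any embedding metric would serve:

```python
import numpy as np

def cluster_distances(emb_a, emb_b):
    """Pairwise cosine distances between the face embeddings of two
    clusters (rows of emb_a and emb_b); returns the (average, maximum,
    minimum) distance across all pairs."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    d = 1.0 - a @ b.T  # cosine distance for unit vectors
    return d.mean(), d.max(), d.min()
```

Any of the three statistics (or a combination) could then be compared against the second threshold to decide whether a cluster stands apart.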
Since the alternative image cluster is determined in step S103 from the differences between image clusters, which indicates at the cluster level that the two clusters do not represent the same person, individual face images within the alternative cluster may nevertheless belong to the same person as the main cluster. These face images need to be recalled so that face images that belong to the same person but were not aggregated can be aggregated.
In step S104, if it is determined that at least one face image in the alternative image cluster has an association with the face images in the main image cluster with respect to capture time and capture location, the at least one face image is retained in the image set and the remaining face images of the alternative image cluster are removed from it. In this way, the face images that do not belong to the same person are accurately removed from the image set, while those that do belong to the same person are recalled.
Optionally, determining that at least one face image in the alternative image cluster has an association with the face images in the main image cluster with respect to capture time and capture location comprises: constructing a time window from the capture times of two face images in the main image cluster captured at different times; constructing a spatial boundary corresponding to the time window based on the capture locations of those two face images; and determining the face images that fall within the time window and the spatial boundary as having the association. In this way, face images belonging to the same person can be accurately recalled along both the temporal and the spatial dimension. This is explained in more detail below in connection with FIG. 2.
According to the image processing method of the embodiments of the present disclosure, on the one hand, a spatial factor beyond mere similarity is introduced into the initial image set to cluster the face images, so that face images that do not belong to the same person can be accurately excluded. On the other hand, by examining the spatio-temporal association between face images, unaggregated face images can be recalled. Both the precision and the recall of face image aggregation are thereby improved.
FIG. 2 shows a schematic diagram of steps for recalling a face image according to an embodiment of the present disclosure. These steps may correspond to step S104 described in conjunction with FIG. 1, in which a time window and a spatial boundary are constructed to determine whether at least one face image in the alternative image cluster has an association with the face images in the main image cluster with respect to capture time and capture location.
In the schematic diagram of fig. 2, the vertical direction may represent the passage of time. For example, the plane 202 and the plane 210 may respectively correspond to points in time at which two face images in the main image cluster are captured. For example, plane 202 may correspond to time point 12:00, while plane 210 may correspond to time point 13:00, i.e., a 1 hour time window is constructed through plane 202 and plane 210. Accordingly, planes 204, 206, and 208 may, for example, correspond to three points in time 12:15, 12:30, and 12:45 in time windows 12:00 through 13:00, respectively.
In addition, in the schematic diagram of FIG. 2, the horizontal direction may represent a spatial range. Each of the planes 204, 206, and 208 is shown to include a plurality of grids, each of which may represent a location that is spatially reachable at the time point corresponding to the respective plane, based on the capture locations corresponding to planes 202 and 210. A spatial boundary corresponding to the time window is thereby constructed.
Taking plane 206, corresponding to time point 12:30, as an example, it is shown as including a plurality of grids 2060. Each grid 2060 may represent a location that can be reached by time point 12:30 from the shooting locations corresponding to planes 202 and 210. In one example, the speed of movement through space may be estimated at 100 meters per minute. Assuming that the shooting location corresponding to plane 202 is location A and the shooting location corresponding to plane 210 is location B, the grids 2060 of plane 206 may represent possible locations within 3 km of location A and within 3 km of location B.
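The spatial boundary described above can be sketched in code. The following is a minimal illustration, not part of the disclosure: it assumes straight-line (Euclidean) distance, the 100 m/min travel speed from the example, and an arbitrary 1 km grid spacing; the function name and parameters are hypothetical.

```python
import math

def boundary_cells(loc_a, loc_b, mins_from_a, mins_to_b,
                   speed=100.0, cell=1000.0, half_extent=5):
    """Enumerate grid locations (like the grids 2060 of plane 206) that are
    reachable both from place A after `mins_from_a` minutes and from place B
    with `mins_to_b` minutes still remaining."""
    cells = []
    for i in range(-half_extent, half_extent + 1):
        for j in range(-half_extent, half_extent + 1):
            p = (loc_a[0] + i * cell, loc_a[1] + j * cell)
            if (math.dist(p, loc_a) <= speed * mins_from_a       # within reach of A
                    and math.dist(p, loc_b) <= speed * mins_to_b):  # and of B
                cells.append(p)
    return cells

# Plane 206 of the example: 30 min after A and 30 min before B -> 3 km radii.
cells = boundary_cells((0.0, 0.0), (2000.0, 0.0), 30, 30)
print(len(cells))
```

The boundary is the intersection of two disks, one around each anchor shooting location, which is why the middle plane 206 of fig. 2 is drawn wider than planes 204 and 208.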
If at least one face image in the candidate image cluster falls within the time window and the spatial boundary, that face image can be determined to be associated, with respect to shooting time and shooting place, with the face images in the main image cluster. For example, assuming that the shooting time of a face image in the candidate image cluster is 12:28 and its shooting location is the location represented by one of the grids 2060, the face image can be considered to fall within the time window and the spatial boundary; it should therefore be considered associated with the face images in the main image cluster and be recalled.
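The recall decision for the 12:28 example can be written as a short check. This is an illustrative sketch, not the claimed method itself: times are expressed in minutes since midnight, distance is straight-line, and the 100 m/min speed is the assumed value carried over from the example above.

```python
import math

def should_recall(candidate, anchor_early, anchor_late, speed=100.0):
    """Return True if a candidate face image (capture minute, (x, y) location)
    falls inside the time window spanned by two main-cluster images and
    inside the corresponding spatial boundary."""
    t, loc = candidate
    t1, loc1 = anchor_early
    t2, loc2 = anchor_late
    if not (t1 <= t <= t2):
        return False  # outside the time window
    # Must be reachable forward from the earlier image and backward to the later one.
    return (math.dist(loc, loc1) <= speed * (t - t1)
            and math.dist(loc, loc2) <= speed * (t2 - t))

# A photo at 12:28 (minute 748), 1 km from anchors shot at 12:00 and 13:00.
print(should_recall((748, (1000.0, 0.0)), (720, (0.0, 0.0)), (780, (2000.0, 0.0))))
```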
It is to be understood that although fig. 2 shows only three planes 204, 206, and 208 within the time window, comprising 9, 25, and 9 grids respectively, this is for purposes of illustration and explanation only. The number of planes and grids may be set differently in different embodiments.
According to another aspect of the present disclosure, there is also provided an image processing apparatus. Fig. 3 illustrates a block diagram of an image processing apparatus according to an embodiment of the present disclosure. As shown in fig. 3, the image processing apparatus 300 may include:
an obtaining module 302, which may be configured to obtain a set of images captured within a first time period with respect to a first face, the set of images including a plurality of face images, a similarity between the plurality of face images being greater than a first threshold;
a clustering module 304, which may be configured to cluster the plurality of facial images based on the geographic location at which each of the plurality of facial images was captured to construct a plurality of image clusters;
a splitting module 306, which may be configured to determine a first image cluster of the plurality of image clusters such that the difference in face image similarity between the first image cluster and a second image cluster is greater than a second threshold, the second image cluster being an image cluster of the plurality of image clusters other than the first image cluster, and to set the second image cluster as the main image cluster and the first image cluster as the candidate image cluster;
a recall module 308, which may be configured to remove, from the image set, the face images in the candidate image cluster other than at least one face image, in response to determining that the at least one face image in the candidate image cluster is associated, with respect to shooting time and shooting place, with the face images in the main image cluster.
Optionally, the obtaining module 302 may be further configured to: after the image set is acquired, remove face images that are determined, according to the time and place at which they were captured, not to belong to the first face.
The operations of the above-mentioned modules 302, 304, 306, and 308 of the image processing apparatus 300 may respectively correspond to the operations of steps S101, S102, S103, and S104 described above with reference to fig. 1, and are not described again here.
Fig. 4 illustrates a block diagram of an image processing apparatus according to another embodiment of the present disclosure. As shown in fig. 4, the image processing apparatus 400 may include an acquisition module 402, a clustering module 404, a splitting module 406, and a recall module 408. The modules 402, 404, 406, and 408 operate similarly to the modules 302, 304, 306, and 308 described in conjunction with fig. 3, and are therefore not described in detail here.
Optionally, the clustering module 404 may further include: a selecting module 4040 configured to select, according to the geographic location at which each face image was captured, at least two geographic locations whose mutual distance exceeds a predetermined distance from among the geographic locations whose number of occurrences is higher than a predetermined number; and an operating module 4042 configured to cluster the plurality of face images with the face images corresponding to the at least two geographic locations as cluster centers.
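A possible reading of the selecting module 4040 is sketched below; the frequency and distance thresholds (`min_count`, `min_distance`) are illustrative placeholders, since the disclosure only calls them a "predetermined number" and a "predetermined distance", and the function name is hypothetical.

```python
import math
from collections import Counter

def select_cluster_centers(locations, min_count=3, min_distance=1000.0):
    """Keep geographic locations occurring more than `min_count` times, then
    greedily retain only those whose mutual distance exceeds `min_distance`,
    most frequent first."""
    frequent = [loc for loc, n in Counter(locations).items() if n > min_count]
    centers = []
    for loc in sorted(frequent, key=locations.count, reverse=True):
        if all(math.dist(loc, c) > min_distance for c in centers):
            centers.append(loc)
    return centers

locs = [(0, 0)] * 5 + [(5000, 0)] * 4 + [(100, 0)] * 4 + [(9000, 9000)]
print(select_cluster_centers(locs))  # the two frequent, well-separated locations
```

The face images captured at the returned locations would then serve as the cluster centers for grouping the remaining images.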
Optionally, the splitting module 406 may further include: a calculating module 4060 configured to calculate face image similarity between the plurality of image clusters; and a first performing module 4062 configured to treat at least one of the plurality of image clusters as a first image cluster in response to determining that a difference in face image similarity between the at least one image cluster and the remaining image clusters is greater than the second threshold.
Alternatively, the splitting module 406 may further include: a calculating module 4060 configured to calculate the face image similarity between the plurality of image clusters; and a second execution module 4064 configured to, in response to the difference in face image similarity among the plurality of image clusters being less than or equal to the second threshold, and in response to at least one image cluster among the plurality of image clusters being determined, according to the time and place at which the face images in the image clusters were captured, not to belong to the same face as the remaining image clusters, take the at least one image cluster as the first image cluster.
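One way to realize the calculating module 4060 and the first execution module 4062 is sketched below, assuming each image cluster is summarized by a mean face-embedding vector. The disclosure does not fix a formula for the "similarity difference", so interpreting it as the gap between each cluster's mean cosine similarity to the others and the best such value is an assumption, as are the threshold value and the toy embeddings.

```python
import numpy as np

def find_candidate_clusters(centroids, second_threshold=0.2):
    """Return indices of clusters whose mean cosine similarity to the other
    clusters falls short of the best cluster's by more than the threshold."""
    c = np.asarray(centroids, dtype=float)
    c = c / np.linalg.norm(c, axis=1, keepdims=True)  # unit-normalise embeddings
    sims = c @ c.T                                    # pairwise cosine similarity
    np.fill_diagonal(sims, np.nan)                    # ignore self-similarity
    mean_sim = np.nanmean(sims, axis=1)               # each cluster vs. the rest
    gap = mean_sim.max() - mean_sim
    return [i for i, g in enumerate(gap) if g > second_threshold]

# Two nearby clusters and one outlier; the outlier is flagged.
print(find_candidate_clusters([[1.0, 0.0], [0.98, 0.2], [0.0, 1.0]]))
```

Clusters flagged here would be set as candidate image clusters, with the remainder forming the main image cluster.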
Optionally, the recall module 408 may further include: a first constructing module 4080 configured to construct a time window from the shooting times of two face images in the main image cluster that have different shooting times; a second constructing module 4082 configured to construct a spatial boundary corresponding to the time window based on the shooting locations of the two face images; and a determining module 4084 configured to determine face images falling within the time window and the spatial boundary as having the association relationship.
According to another aspect of the present disclosure, there is also provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the image processing method as described above.
According to another aspect of the present disclosure, there is also provided a non-transitory computer readable storage medium storing computer instructions for causing the computer to execute the image processing method as described above.
According to another aspect of the present disclosure, there is also provided a computer program product comprising a computer program, wherein the computer program realizes the image processing method as described above when executed by a processor.
Referring to fig. 5, a block diagram of an electronic device 500, which is an example of a hardware device that can be applied to aspects of the present disclosure, will now be described. The term electronic device is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. An electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown here, their connections and relationships, and their functions are meant to be examples only and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 5, the electronic device 500 includes a computing unit 501, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 502 or a computer program loaded from a storage unit 508 into a random access memory (RAM) 503. The RAM 503 can also store various programs and data required for the operation of the electronic device 500. The computing unit 501, the ROM 502, and the RAM 503 are connected to one another by a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
A number of components in the electronic device 500 are connected to the I/O interface 505, including: an input unit 506, an output unit 507, a storage unit 508, and a communication unit 509. The input unit 506 may be any type of device capable of inputting information to the electronic device 500; it may receive input numeric or character information and generate key signal inputs related to user settings and/or function control of the electronic device, and may include, but is not limited to, a mouse, a keyboard, a touch screen, a track pad, a trackball, a joystick, a microphone, and/or a remote controller. The output unit 507 may be any type of device capable of presenting information, and may include, but is not limited to, a display, speakers, a video/audio output terminal, a vibrator, and/or a printer. The storage unit 508 may include, but is not limited to, a magnetic disk or an optical disc. The communication unit 509 allows the electronic device 500 to exchange information/data with other devices over a computer network such as the Internet and/or various telecommunication networks, and may include, but is not limited to, a modem, a network card, an infrared communication device, a wireless communication transceiver, and/or a chipset, such as Bluetooth™ devices, 802.11 devices, WiFi devices, WiMax devices, cellular communication devices, and/or the like.
The computing unit 501 may be a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 501 performs the methods and processes described above, such as the image processing method. For example, in some embodiments, the image processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of the image processing method described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the image processing method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be performed in parallel, sequentially or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
In the technical solutions of the present disclosure, the acquisition, storage, and application of the personal information involved all comply with relevant laws and regulations and do not violate public order and good morals. It is the intention of the present disclosure that personal information data be managed and processed in a manner that minimizes the risk of inadvertent or unauthorized access or use. Risk is minimized by limiting data collection and deleting data when it is no longer needed. All personal information in the present application is collected with the knowledge and consent of the person concerned.
Although embodiments or examples of the present disclosure have been described with reference to the accompanying drawings, it is to be understood that the above-described methods, systems, and apparatuses are merely exemplary embodiments or examples, and that the scope of the invention is limited not by these embodiments or examples but only by the granted claims and their equivalents. Various elements of the embodiments or examples may be omitted or replaced by equivalents. Further, the steps may be performed in an order different from that described in the present disclosure, and various elements of the embodiments or examples may be combined in various ways. Importantly, as technology evolves, many of the elements described herein may be replaced by equivalent elements that appear after the present disclosure.

Claims (15)

1. An image processing method comprising:
acquiring an image set shot in a first time period relative to a first face, wherein the image set comprises a plurality of face images, and the similarity between the face images is larger than a first threshold value;
clustering the plurality of facial images based on the geographic location at which each of the plurality of facial images was captured to construct a plurality of image clusters;
determining a first image cluster in the plurality of image clusters, wherein the difference of the similarity of the face images between the first image cluster and a second image cluster is larger than a second threshold value, the second image cluster is an image cluster in the plurality of image clusters except the first image cluster, the second image cluster is set as a main image cluster, and the first image cluster is set as an alternative image cluster; and
in response to determining that at least one face image in the alternative image cluster has an association relationship with a face image in the main image cluster with respect to shooting time and shooting place, removing face images other than the at least one face image in the alternative image cluster from the image set.
2. The method of claim 1, further comprising: after the image set is acquired, removing face images that are determined, according to the time and place at which they were captured, not to belong to the first face.
3. The method of claim 1, wherein said clustering the plurality of facial images based on the geographic location at which each of the plurality of facial images was captured to construct a plurality of image clusters comprises:
selecting at least two geographical positions having a mutual distance exceeding a predetermined distance from among a plurality of geographical positions having an appearance number higher than a predetermined number, according to the geographical position where each face image is photographed;
clustering the face images with the face images corresponding to the at least two geographic locations as a cluster center.
4. The method of any of claims 1 to 3, wherein the determining a first image cluster of the plurality of image clusters comprises:
calculating the similarity of the face images among the image clusters; and
in response to determining that a difference in facial image similarity between at least one of the plurality of image clusters and the remaining image clusters is greater than the second threshold, treating the at least one image cluster as the first image cluster.
5. The method of any of claims 1 to 3, wherein the determining a first image cluster of the plurality of image clusters comprises:
calculating the similarity of the face images among the image clusters;
in response to the difference in face image similarity among the plurality of image clusters being less than or equal to the second threshold, and in response to at least one image cluster existing among the plurality of image clusters that is determined, according to the time and place at which the face images in the image clusters were captured, not to belong to the same face as the remaining image clusters, taking the at least one image cluster as the first image cluster.
6. The method according to any one of claims 1 to 3, wherein the determining that at least one face image in the alternative image cluster has an association relationship with a face image in the main image cluster with respect to shooting time and shooting place comprises:
constructing a time window according to the corresponding shooting time of two face images with different shooting time in the main image cluster;
constructing a space boundary corresponding to the time window based on the shooting places corresponding to the two face images;
determining the face images falling within the time window and the spatial boundary as having the association relationship.
7. An image processing apparatus comprising:
an acquisition module configured to acquire a set of images captured within a first time period with respect to a first face, the set of images including a plurality of face images, a similarity between the plurality of face images being greater than a first threshold;
a clustering module configured to cluster the plurality of facial images based on the geographic location at which each of the plurality of facial images was captured to construct a plurality of image clusters;
a splitting module configured to determine a first image cluster of the plurality of image clusters, a difference in face image similarity between the first image cluster and a second image cluster being an image cluster of the plurality of image clusters other than the first image cluster, being greater than a second threshold, and set the second image cluster as a main image cluster and the first image cluster as an alternative image cluster;
a recall module configured to remove, from the image set, face images in the alternative image cluster other than at least one face image, in response to determining that the at least one face image in the alternative image cluster has an association relationship with the face images in the main image cluster with respect to shooting time and shooting place.
8. The apparatus of claim 7, wherein the acquisition module is further configured to: and after the image set is obtained, removing the face images which are determined not to belong to the first face according to the time and the place of the face images.
9. The apparatus of claim 7, wherein the clustering module further comprises:
a selection module configured to select at least two geographical positions whose mutual distances exceed a predetermined distance from among a plurality of geographical positions whose number of occurrences is higher than a predetermined number, according to the geographical position at which each face image is photographed; and
an operation module configured to cluster the plurality of facial images with facial images corresponding to the at least two geographic locations as a cluster center.
10. The apparatus of any of claims 7 to 9, wherein the splitting module further comprises:
a calculation module configured to calculate a face image similarity between the plurality of image clusters; and
a first execution module configured to treat at least one of the plurality of image clusters as the first image cluster in response to determining that a difference in facial image similarity between the at least one image cluster and a remaining image cluster is greater than the second threshold.
11. The apparatus of any of claims 7 to 9, wherein the splitting module further comprises:
a calculation module configured to calculate a face image similarity between the plurality of image clusters; and
a second execution module configured to, in response to the difference in face image similarity among the plurality of image clusters being less than or equal to the second threshold, and in response to at least one image cluster among the plurality of image clusters being determined, according to the time and place at which the face images in the image clusters were captured, not to belong to the same face as the remaining image clusters, take the at least one image cluster as the first image cluster.
12. The apparatus of any of claims 7 to 9, wherein the recall module further comprises:
the first construction module is configured to construct a time window according to the corresponding shooting time of two face images with different shooting time in the main image cluster;
the second construction module is configured to construct a space boundary corresponding to the time window based on shooting places corresponding to the two face images; and
a determination module configured to determine face images falling within the time window and the spatial boundary as having the association relationship.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor, wherein
The memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
14. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-6.
15. A computer program product comprising a computer program, wherein the computer program realizes the method according to any of claims 1-6 when executed by a processor.
CN202110220469.6A 2021-02-26 2021-02-26 Image processing method, device, electronic equipment and storage medium Active CN112948614B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110220469.6A CN112948614B (en) 2021-02-26 2021-02-26 Image processing method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112948614A true CN112948614A (en) 2021-06-11
CN112948614B CN112948614B (en) 2024-05-14

Family

ID=76246656

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110220469.6A Active CN112948614B (en) 2021-02-26 2021-02-26 Image processing method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112948614B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113742510A (en) * 2021-08-26 2021-12-03 浙江大华技术股份有限公司 Determination method and device for cluster center of gathering files, computer equipment and storage medium
CN115953650A (en) * 2023-03-01 2023-04-11 杭州海康威视数字技术股份有限公司 Training method and device of feature fusion model
CN116049464A (en) * 2022-08-05 2023-05-02 荣耀终端有限公司 Image sorting method and electronic equipment
CN116719962A (en) * 2023-08-11 2023-09-08 世优(北京)科技有限公司 Image clustering method and device and electronic equipment
CN116881485A (en) * 2023-06-19 2023-10-13 北京百度网讯科技有限公司 Method and device for generating image retrieval index, electronic equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012112449A1 (en) * 2011-02-18 2012-08-23 Google Inc. Automatic event recognition and cross-user photo clustering
CN109783685A (en) * 2018-12-28 2019-05-21 上海依图网络科技有限公司 A kind of querying method and device
CN112232148A (en) * 2020-09-28 2021-01-15 浙江大华技术股份有限公司 Image clustering method, target track tracking method, electronic device and storage medium
WO2021027193A1 (en) * 2019-08-12 2021-02-18 佳都新太科技股份有限公司 Face clustering method and apparatus, device and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DING Baojian; YANG Dongquan; QIN Wei: "A face clustering algorithm based on neighboring faces", Application Research of Computers, no. 1 *
LIU Huijuan; ZHAO Dongming: "Research and application of a smart business hall system based on face recognition technology", Digital Technology & Application, no. 02 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113742510A (en) * 2021-08-26 2021-12-03 浙江大华技术股份有限公司 Determination method and device for cluster center of gathering files, computer equipment and storage medium
CN116049464A (en) * 2022-08-05 2023-05-02 荣耀终端有限公司 Image sorting method and electronic equipment
CN116049464B (en) * 2022-08-05 2023-10-20 荣耀终端有限公司 Image sorting method and electronic equipment
CN115953650A (en) * 2023-03-01 2023-04-11 杭州海康威视数字技术股份有限公司 Training method and device of feature fusion model
CN116881485A (en) * 2023-06-19 2023-10-13 北京百度网讯科技有限公司 Method and device for generating image retrieval index, electronic equipment and medium
CN116719962A (en) * 2023-08-11 2023-09-08 世优(北京)科技有限公司 Image clustering method and device and electronic equipment
CN116719962B (en) * 2023-08-11 2023-10-27 世优(北京)科技有限公司 Image clustering method and device and electronic equipment

Also Published As

Publication number Publication date
CN112948614B (en) 2024-05-14


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant