CN112115740A - Method and apparatus for processing image

Info

Publication number: CN112115740A
Application number: CN201910533007.2A
Authority: CN
Inventors: 施皓; 陈孟飞
Original assignee: Beijing Haiyi Tongzhan Information Technology Co Ltd
Current assignee: Beijing Haiyi Tongzhan Information Technology Co Ltd
Priority date: 2019-06-19
Filing date: 2019-06-19
Publication date: 2020-12-22
Anticipated expiration: 2039-06-19
Also published as: CN112115740B

Abstract

Embodiments of the present disclosure disclose a method and an apparatus for processing images. One embodiment of the method comprises: acquiring a target human body image; determining, from a category set, a target category to which the target human body image belongs, wherein each category in the category set corresponds to a preset human body image set, the category of each preset human body image in a preset human body image set is the category corresponding to that set, each preset human body image set comprises at least one preset human body image subset, and the preset human body images in a preset human body image subset correspond to the same person; acquiring the preset human body image set corresponding to the target category as a candidate human body image set; determining whether the similarity between the person corresponding to a preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to a target similarity threshold; and, in response to determining that it is, determining that preset human body image subset as a result human body image subset. This embodiment can improve the precision and efficiency of image matching and reduce the resources consumed by image processing.

Description

Method and apparatus for processing image

Technical Field

Embodiments of the present disclosure relate to the field of computer technologies, and in particular, to a method and an apparatus for processing an image.

Background

Pedestrian re-identification (person re-identification) is a technique that uses computer vision to determine whether a particular pedestrian is present in a captured image.

At present, because of differences between camera devices and because pedestrians have both rigid and flexible characteristics, a pedestrian's appearance is easily affected by clothing, scale, occlusion, posture, viewing angle, and the like. As a result, pedestrian re-identification has become a popular subject in the field of computer vision that is both valuable to research and challenging.

Disclosure of Invention

Embodiments of the present disclosure propose methods and apparatuses for processing an image.

In a first aspect, an embodiment of the present disclosure provides a method for processing an image, the method including: acquiring a target human body image; determining, from a predetermined category set, the category to which the target human body image belongs as a target category, wherein each category in the category set corresponds to a preset human body image set, the category of each preset human body image in a preset human body image set is the category corresponding to that preset human body image set, each preset human body image set comprises at least one preset human body image subset, and the preset human body images in a preset human body image subset correspond to the same person; acquiring the preset human body image set corresponding to the target category as a candidate human body image set; determining whether the similarity between the person corresponding to a preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to a target similarity threshold; and in response to determining that the similarity is greater than or equal to the target similarity threshold, determining that preset human body image subset as a result human body image subset corresponding to the target human body image.
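The steps of the first aspect can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: images are reduced to plain feature vectors, the gallery is a nested dictionary, and `match_person`, `toy_classify`, and `toy_similarity` are hypothetical names introduced only for this example.

```python
from typing import Callable, Dict, List, Optional, Sequence

Image = Sequence[float]                      # assumption: an image reduced to a feature vector
Gallery = Dict[str, Dict[str, List[Image]]]  # category -> person id -> preset images (one subset per person)


def match_person(
    target: Image,
    gallery: Gallery,
    classify: Callable[[Image], str],
    person_similarity: Callable[[List[Image], Image], float],
    threshold: float,
) -> Optional[str]:
    """Return the person id of the first subset in the target category that meets the threshold."""
    target_category = classify(target)             # determine the target category
    candidates = gallery.get(target_category, {})  # candidate human body image set
    for person_id, subset in candidates.items():
        if person_similarity(subset, target) >= threshold:
            return person_id                       # key of the result human body image subset
    return None


# Toy stand-ins for the classifier and the similarity measure (both hypothetical).
def toy_classify(img: Image) -> str:
    return "face" if img[0] > 0 else "no_face"


def toy_similarity(subset: List[Image], img: Image) -> float:
    # 1 / (1 + smallest L1 distance between the target and any preset image)
    return max(1.0 / (1.0 + sum(abs(a - b) for a, b in zip(p, img))) for p in subset)


gallery: Gallery = {
    "face": {"person_a": [[1.0, 0.0]], "person_b": [[1.0, 5.0]]},
    "no_face": {"person_c": [[-1.0, 0.0]]},
}
result = match_person([1.0, 0.2], gallery, toy_classify, toy_similarity, 0.8)
```

Only the subsets stored under the target category are compared, which is where the reduction in matching work described in this disclosure comes from.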

In some embodiments, the set of classes includes a face-containing class and a face-not-containing class that are classified based on whether the human image contains a face object; and determining a category to which the target human body image belongs from a predetermined category set as a target category includes: inputting a target human body image into a pre-trained human face detection model to obtain a detection result, wherein the detection result is used for indicating whether the target human body image contains a human face object; and determining the category to which the target human body image belongs from the category set as a target category based on the detection result.

In some embodiments, the set of categories includes a front class, a side class, and a back class divided based on an orientation of the human object in the human image; and determining a category to which the target human body image belongs from a predetermined category set as a target category includes: inputting a target human body image into a pre-trained human face detection model to obtain a detection result, wherein the detection result is used for indicating whether the target human body image contains a human face object; inputting the target human body image into a human body skeleton key point identification model trained in advance to obtain an identification result, wherein the identification result is used for indicating the position of the human body skeleton key point in the target human body image; based on the obtained detection result and recognition result, a category to which the target human body image belongs is determined from the category set as a target category.

In some embodiments, the method further comprises: and adding the target human body image into the result human body image subset to obtain an updated result human body image subset.

In some embodiments, the target human body image is a human body image extracted from a target human body video; and after determining the preset human body image subset whose similarity is greater than or equal to the target similarity threshold as the result human body image subset corresponding to the target human body image, the method further comprises: selecting a target number of human body images from the target human body video; and for each human body image of the target number of human body images, performing the following processing steps: determining the category to which the human body image belongs from the category set; and in response to the category to which the human body image belongs being the same as the target category, adding the human body image to the result human body image subset to obtain an updated result human body image subset.

In some embodiments, the processing steps further comprise: in response to the category to which the human body image belongs being different from the target category, determining, from the preset human body image set corresponding to the category to which the human body image belongs, the preset human body image subset that corresponds to the same person as the result human body image subset; and adding the human body image to the determined preset human body image subset to obtain an updated preset human body image subset.
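The per-frame update described in the two preceding paragraphs might look like the sketch below. The nested-dictionary gallery, the string frame identifiers, and the function name `add_video_frames` are assumptions made for illustration; creating a missing subset on the fly is also an assumption, since the text only describes looking up an existing one.

```python
from typing import Callable, Dict, List

Gallery = Dict[str, Dict[str, List[str]]]  # category -> person id -> image identifiers


def add_video_frames(
    gallery: Gallery,
    result_person: str,
    target_category: str,
    frames: List[str],
    classify: Callable[[str], str],
) -> None:
    """Add frames selected from the target video to the matching person's subsets."""
    for frame in frames:
        category = classify(frame)
        if category == target_category:
            # Same category: the frame extends the result human body image subset.
            gallery[target_category][result_person].append(frame)
        else:
            # Different category: the frame extends the same person's subset in the
            # preset set for that category (created here if absent, an assumption).
            gallery.setdefault(category, {}).setdefault(result_person, []).append(frame)


demo_gallery: Gallery = {"front": {"p1": ["a.jpg"]}, "back": {"p1": []}}
add_video_frames(
    demo_gallery,
    result_person="p1",
    target_category="front",
    frames=["f_front_1", "f_back_1"],
    classify=lambda f: "front" if "front" in f else "back",
)
```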

In some embodiments, determining whether the similarity between the person corresponding to the preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to the target similarity threshold value includes: for a preset human body image subset in the candidate human body image set, executing the following steps: determining the similarity between the preset human body image included in the preset human body image subset and the target human body image as candidate similarity; generating the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image based on the obtained candidate similarity; and determining whether the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image is greater than or equal to a target similarity threshold value.
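As a hedged illustration of these steps, the sketch below computes one candidate similarity per preset image and combines them into a person-level similarity. The text does not fix the image similarity measure or the way candidates are combined; cosine similarity over feature vectors and the arithmetic mean are assumptions chosen for this example, and the function names are hypothetical.

```python
import math
from statistics import mean
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors (0.0 if either vector is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def person_similarity(subset_features: List[Sequence[float]], target_feature: Sequence[float]) -> float:
    """Combine per-image candidate similarities into one person-level similarity."""
    # One candidate similarity per preset image in the subset (subset assumed non-empty).
    candidates = [cosine_similarity(f, target_feature) for f in subset_features]
    # Assumption: the mean is used as the combining rule; a maximum or median would also fit the text.
    return mean(candidates)


def meets_threshold(subset_features: List[Sequence[float]], target_feature: Sequence[float], threshold: float) -> bool:
    return person_similarity(subset_features, target_feature) >= threshold
```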

In some embodiments, the preset subset of human body images corresponds to a target similarity threshold; and determining whether the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image is greater than or equal to a target similarity threshold comprises: and determining whether the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image is greater than or equal to a target similarity threshold corresponding to the preset human body image subset.

In some embodiments, the target similarity threshold corresponding to the preset human body image subset is obtained by: determining the number of preset human body images in the preset human body image subset; and, based on a pre-established correspondence between quantities and similarity thresholds, acquiring the similarity threshold corresponding to the determined number as the target similarity threshold corresponding to the preset human body image subset, wherein the quantity is positively correlated with the similarity threshold.
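One way to realize such a correspondence is a small lookup table, sketched below. The count boundaries and threshold values are invented for illustration and do not come from the disclosure; the only property taken from the text is that the threshold does not decrease as the number of images grows.

```python
import bisect

# Hypothetical correspondence: subsets with at least COUNT_BOUNDS[i] images use THRESHOLDS[i].
COUNT_BOUNDS = [1, 5, 20]
THRESHOLDS = [0.60, 0.70, 0.80]  # rises with the image count (positive correlation)


def threshold_for_count(count: int) -> float:
    """Look up the target similarity threshold for a subset containing `count` images."""
    if count < 1:
        raise ValueError("a preset human body image subset must contain at least one image")
    # Index of the last boundary that is <= count.
    index = bisect.bisect_right(COUNT_BOUNDS, count) - 1
    return THRESHOLDS[index]
```

With this table, a subset of 4 images would be held to 0.60 and a subset of 100 images to 0.80.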

In some embodiments, determining the preset human body image subset whose similarity is greater than or equal to the target similarity threshold as the result human body image subset corresponding to the target human body image includes: determining whether the candidate human body image set comprises at least two preset human body image subsets whose similarity is greater than or equal to the corresponding target similarity threshold; and in response to determining that it does, selecting, from the at least two preset human body image subsets whose similarity is greater than or equal to the corresponding target similarity threshold, the preset human body image subset that includes the largest number of preset human body images as the result human body image subset corresponding to the target human body image.
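The selection rule can be sketched as below, assuming the per-subset similarities and thresholds have already been computed and are keyed by a hypothetical person id. The function name and the dictionary layout are assumptions for illustration.

```python
from typing import Dict, List, Optional


def select_result_subset(
    subsets: Dict[str, List[str]],
    similarities: Dict[str, float],
    thresholds: Dict[str, float],
) -> Optional[str]:
    """Pick the qualifying subset; when several qualify, prefer the one with the most images."""
    qualifying = [pid for pid in subsets if similarities[pid] >= thresholds[pid]]
    if not qualifying:
        return None
    # With exactly one qualifying subset this returns it; with two or more it
    # returns the subset that contains the largest number of preset images.
    return max(qualifying, key=lambda pid: len(subsets[pid]))


demo_subsets = {"a": ["a1", "a2"], "b": ["b1", "b2", "b3"], "c": ["c1", "c2", "c3", "c4"]}
demo_similarities = {"a": 0.90, "b": 0.85, "c": 0.50}
demo_thresholds = {"a": 0.70, "b": 0.70, "c": 0.80}
chosen = select_result_subset(demo_subsets, demo_similarities, demo_thresholds)
```

Here subsets "a" and "b" both meet their thresholds, and "b" is chosen because it holds more images; "c" is larger still but does not qualify.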

In some embodiments, after determining the preset human body image subset whose similarity is greater than or equal to the target similarity threshold as the result human body image subset corresponding to the target human body image, the method further includes: outputting prompt information indicating that the result human body image subset corresponding to the target human body image has been determined.

In a second aspect, an embodiment of the present disclosure provides an apparatus for processing an image, the apparatus including: a first acquisition unit configured to acquire a target human body image; a first determining unit, configured to determine a category to which a target human body image belongs as a target category from a predetermined category set, wherein the category in the category set corresponds to a preset human body image set, the category of the preset human body image in the preset human body image set is a category corresponding to the preset human body image set, the preset human body image set includes at least one preset human body image subset, and the preset human body images in the preset human body image subset correspond to the same person; the second acquisition unit is configured to acquire a preset human body image set corresponding to the target category as a candidate human body image set; a second determining unit configured to determine whether the similarity between the person corresponding to the preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to a target similarity threshold; a third determining unit configured to determine, in response to determining that the similarity between the person corresponding to the preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to a target similarity threshold, the preset human body image subset greater than or equal to the target similarity threshold as a result human body image subset corresponding to the target human body image.

In some embodiments, the category set includes a face-containing class and a face-not-containing class that are divided based on whether the human body image contains a face object; and the first determination unit includes: a first input module configured to input the target human body image into a pre-trained face detection model to obtain a detection result, wherein the detection result is used for indicating whether the target human body image contains a face object; and a first determination module configured to determine, based on the detection result, the category to which the target human body image belongs from the category set as the target category.

In some embodiments, the set of categories includes a front class, a side class, and a back class divided based on an orientation of the human object in the human image; and the first determination unit includes: the second input module is configured to input the target human body image into a pre-trained human face detection model to obtain a detection result, wherein the detection result is used for indicating whether the target human body image contains a human face object; a third input module configured to input the target human body image into a pre-trained human body bone key point identification model to obtain an identification result, wherein the identification result is used for indicating the position of the human body bone key point in the target human body image; and a second determination module configured to determine, as a target category, a category to which the target human body image belongs from the category set based on the obtained detection result and the recognition result.

In some embodiments, the apparatus further comprises: an adding unit configured to add the target human body image to the result human body image subset, obtaining an updated result human body image subset.

In some embodiments, the target human body image is a human body image extracted from a target human body video; and the apparatus further comprises: a selecting unit configured to select a target number of human body images from the target human body video; and an execution unit configured to execute the following processing steps for each human body image among the target number of human body images: determining the category to which the human body image belongs from the category set; and in response to the category to which the human body image belongs being the same as the target category, adding the human body image to the result human body image subset to obtain an updated result human body image subset.

In some embodiments, the second determination unit is further configured to: for a preset human body image subset in the candidate human body image set, executing the following steps: determining the similarity between the preset human body image included in the preset human body image subset and the target human body image as candidate similarity; generating the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image based on the obtained candidate similarity; and determining whether the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image is greater than or equal to a target similarity threshold value.

In some embodiments, the preset subset of human body images corresponds to a target similarity threshold; and the second determination unit is further configured to: and determining whether the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image is greater than or equal to a target similarity threshold corresponding to the preset human body image subset.

In some embodiments, the third determination unit comprises: a third determining module configured to determine whether the candidate human body image set includes at least two preset human body image subsets whose similarity is greater than or equal to the corresponding target similarity threshold; and a selecting module configured to, in response to determining that it does, select, from the at least two preset human body image subsets whose similarity is greater than or equal to the corresponding target similarity threshold, the preset human body image subset that includes the largest number of preset human body images as the result human body image subset corresponding to the target human body image.

In some embodiments, the apparatus further comprises: an output unit configured to output prompt information indicating that the result human body image subset corresponding to the target human body image has been determined.

In a third aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; a storage device having one or more programs stored thereon which, when executed by one or more processors, cause the one or more processors to implement the method of any of the embodiments of the method for processing images described above.

In a fourth aspect, embodiments of the present disclosure provide a computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any of the above-described embodiments of the method for processing an image.

According to the method and apparatus for processing images provided by the embodiments of the present disclosure, a category to which a target human body image belongs is determined from a predetermined category set as a target category, wherein each category in the category set corresponds to a preset human body image set, the category of each preset human body image in a preset human body image set is the category corresponding to that set, each preset human body image set includes at least one preset human body image subset, and the preset human body images in a preset human body image subset correspond to the same person. The preset human body image set corresponding to the target category is then acquired as a candidate human body image set, and it is determined whether the similarity between the person corresponding to a preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to a target similarity threshold. In response to determining that it is, that preset human body image subset is determined as a result human body image subset corresponding to the target human body image. Because images of the same category generally contain similar image features, matching the target human body image against the preset human body image set corresponding to its target category can improve matching precision and yield a more accurate result human body image subset. In addition, only the preset human body image subsets in the preset human body image set corresponding to the target category need to be matched; compared with matching against all preset human body image subsets without classification, this reduces the complexity of matching, improves the efficiency of image processing, and reduces the resources consumed in the image processing process.

Drawings

Other features, objects and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:

FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present disclosure may be applied;

FIG. 2 is a flow diagram of one embodiment of a method for processing an image according to the present disclosure;

FIG. 3 is a schematic illustration of one application scenario of a method for processing an image according to an embodiment of the present disclosure;

FIG. 4 is a flow diagram of yet another embodiment of a method for processing an image according to the present disclosure;

FIG. 5 is a schematic block diagram of one embodiment of an apparatus for processing images according to the present disclosure;

FIG. 6 is a schematic block diagram of a computer system suitable for use with an electronic device implementing embodiments of the present disclosure.

Detailed Description

The present disclosure is described in further detail below with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.

It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.

Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the method for processing images or the apparatus for processing images of the present disclosure may be applied.

As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.

The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as an image processing application, a web browser application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like.

The terminal devices 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, and 103 are hardware, they may be various electronic devices with cameras, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III, mpeg compression standard Audio Layer 3), MP4 players (Moving Picture Experts Group Audio Layer IV, mpeg compression standard Audio Layer 4), laptop computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above. They may be implemented as multiple pieces of software or software modules (e.g., multiple pieces of software or software modules to provide distributed services) or as a single piece of software or software module. This is not specifically limited herein.

The server 105 may be a server that provides various services, such as an image processing server that processes a target human body image captured by the terminal devices 101, 102, 103. The image processing server may perform processing such as analysis on received data such as the target human body image, and obtain a processing result (e.g., a result human body image subset). In some application scenarios, the image processing server may also feed back the processing result to the terminal device.

It should be noted that the method for processing the image provided by the embodiment of the present disclosure may be executed by the terminal devices 101, 102, and 103, or may be executed by the server 105, and accordingly, the apparatus for processing the image may be disposed in the terminal devices 101, 102, and 103, or may be disposed in the server 105.

The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., multiple pieces of software or software modules used to provide distributed services), or as a single piece of software or software module. This is not specifically limited herein.

It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. In particular, in the case where the data used in generating the resulting subset of human body images need not be obtained remotely, the system architecture described above may not include a network, but only a terminal device or server.

With continued reference to FIG. 2, a flow 200 of one embodiment of a method for processing an image in accordance with the present disclosure is shown. The method for processing the image comprises the following steps:

Step 201, acquiring a target human body image.

In this embodiment, an execution subject (e.g., the server shown in fig. 1) of the method for processing an image may acquire the target human body image remotely or locally through a wired or wireless connection. The target human body image may be a human body image for which it is to be determined whether the corresponding person is a predetermined person.

In practice, the target human body image may be an image obtained by photographing a human body (e.g., a pedestrian). Specifically, the executing body or other electronic device may control the camera to perform continuous shooting, and in response to shooting of the human body object, acquire an image of the shot human body object as the target human body image.

Step 202, determining a category to which the target human body image belongs from a predetermined category set as a target category.

In this embodiment, based on the target human body image obtained in step 201, the execution subject may determine a category to which the target human body image belongs as a target category from a predetermined category set. The categories in the category set are categories divided by technicians according to a specific human body image classification mode. For example, the categories may be divided according to whether the human body image contains a human face object, in which case the category set may include a face-containing class and a face-not-containing class; alternatively, the categories may be divided according to the gender of the person corresponding to the human body image, in which case the category set may include a male class and a female class.

In the present embodiment, the categories in the category set correspond to preset human body image sets. Specifically, for each category in the category set, a preset human body image set corresponding to the category is predetermined, and the preset human body images in the preset human body image set corresponding to the category all belong to that category (for example, the persons corresponding to the preset human body images in the preset human body image set corresponding to the male class are all male).

In this embodiment, the preset human body image set may be a human body image set formed in advance from existing human body images. Each preset human body image set comprises at least one preset human body image subset. The preset human body images in the preset human body image subset correspond to the same person.

As an example, a preset human body image set A may include a preset human body image subset a and a preset human body image subset b. The preset human body images in the preset human body image subset a may be images obtained by photographing person a in advance, so that the preset human body images in subset a correspond to person a; the preset human body images in the preset human body image subset b may be images obtained by photographing person b in advance, so that the preset human body images in subset b correspond to person b.

Specifically, the preset human body image subset may be used for matching with the target human body image to determine whether the person corresponding to the preset human body image subset is the same as the person corresponding to the target human body image. Here, combining the preset human body images corresponding to a certain person into the preset human body image subset in advance can increase the number of the preset human body images for matching the person, thereby contributing to improving the matching accuracy when matching the person corresponding to the target human body image with the person.

In this embodiment, the execution subject may determine the target category to which the target human body image belongs from the category set by using various methods based on the classification manner of the categories in the category set.

In some optional implementations of this embodiment, the category set may include a face-containing class and a face-not-containing class divided based on whether the human body image contains a face object; and the executing body can determine the target class to which the target human body image belongs from the class set through the following steps: firstly, the execution subject can input the target human body image into a human face detection model trained in advance to obtain a detection result. Then, the executing body may determine a target class to which the target human body image belongs from the class set based on the detection result.

In this implementation, the face detection model is used to detect whether a face object is included in an input image. Specifically, as an example, the face detection model may be a model obtained by training an initial model (e.g., a convolutional neural network) by using a machine learning method based on a training sample. The detection result is used to indicate whether the target human body image contains a human face object, and may include, but is not limited to, at least one of the following: characters, numbers, symbols, images. For example, the detection result may include a digital "0" or a digital "1". Wherein the number "0" may be used to indicate that the target human body image does not contain a human face object; the number "1" may be used to indicate that the target human image contains a human face object.

Specifically, the executing body may, in response to determining that the detection result indicates that the target human body image contains a human face object, determine that the target class to which the target human body image belongs is the face-containing class; and, in response to determining that the detection result indicates that the target human body image does not contain a human face object, determine that the target class to which the target human body image belongs is the face-not-containing class.
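The mapping from detection result to category can be sketched as below, assuming the detection result takes the "0"/"1" form given as an example above. The class labels and the function name are hypothetical, and the face detection model itself is outside the scope of the sketch.

```python
FACE_CLASS = "contains_face"     # hypothetical label for the face-containing class
NO_FACE_CLASS = "contains_no_face"  # hypothetical label for the face-not-containing class


def category_from_detection(detection_result: str) -> str:
    """Map a face detection result ("1" = face present, "0" = no face) to a category."""
    if detection_result == "1":
        return FACE_CLASS
    if detection_result == "0":
        return NO_FACE_CLASS
    # Other result encodings (characters, symbols, images) would need their own mapping.
    raise ValueError(f"unexpected detection result: {detection_result!r}")
```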

In some optional implementations of the present embodiment, the set of categories may include a front class, a side class, and a back class divided based on an orientation of the human object in the human image; and the executing body may determine a target class to which the target human body image belongs from a predetermined class set by:

First, inputting the target human body image into a pre-trained face detection model to obtain a detection result, wherein the detection result is used for indicating whether the target human body image contains a face object.

Then, inputting the target human body image into a pre-trained human skeleton key point recognition model to obtain a recognition result, wherein the recognition result is used for indicating the positions of the human skeleton key points in the target human body image.

Here, the human skeleton key point recognition model may be used to recognize a human body image to determine the positions of the human skeleton key points in the human body image. Specifically, as an example, the human skeleton key point recognition model may be a model obtained by training an initial model (e.g., a convolutional neural network) with a machine learning method based on training samples.

In practice, the human skeleton key points refer to key skeleton points on the human object, and generally may include points corresponding to the head, shoulders, crotch, knees, and the like.

Finally, determining, based on the obtained detection result and recognition result, the category to which the target human body image belongs from the category set as the target category.

Specifically, the executing body may determine the target category to which the target human body image belongs from the category set by using various methods based on the obtained detection result and the recognition result.

As an example, the executing body may directly determine the back class in the class set as the target class to which the target human body image belongs in response to determining that the detection result indicates that the target human body image does not contain the human face object. It is to be understood that when the orientation of the human body object in the target human body image is a back side, the human face object cannot be detected from the target human body image, and when the orientation of the human body object in the target human body image is a front side or a side, the human face object can be detected from the target human body image, so that the execution main body can determine whether the target human body image belongs to a back side class based on whether the target human body image includes the human face object.

In this example, if the detection result indicates that the target human body image contains a face object, it may be determined that the target human body image belongs to the front class or the side class. Which of the two classes it belongs to may then be determined by the execution subject based on the recognition result output by the human skeleton key point recognition model.

Specifically, the executing body may determine the target category to which the target human body image belongs from the front class and the side class by various methods. For example, where the recognition result includes the positions of the key points corresponding to the shoulders, the executing body may determine whether the distance between the key points corresponding to the two shoulders is greater than or equal to a preset threshold; in response to the distance being greater than or equal to the preset threshold, determine that the target human body image belongs to the front class; and in response to the distance being smaller than the preset threshold, determine that the target human body image belongs to the side class. It should be noted that, in order to improve the accuracy of the target category, the execution subject may determine the target category based on several kinds of human skeleton key points at the same time, for example, based on both the key points corresponding to the shoulders and the key points corresponding to the crotch (here, the distance between the two crotch key points is large for the front class and small for the side class).

Specifically, for the back class, the execution subject may also determine whether the target human body image belongs to the back class based on whether the target human body image includes a human face object and whether human skeleton key points in the target human body image satisfy a preset requirement (for example, a distance between key points corresponding to two shoulders is greater than or equal to a preset threshold). Thus, the accuracy of the category identification can be further improved.
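The orientation rule from the example above (no face means back class; otherwise the shoulder distance separates front from side) can be sketched as follows. This is only an illustration under stated assumptions: the key point names "left_shoulder" and "right_shoulder", the pixel coordinates, and the function name classify_orientation are hypothetical, and the stricter back-class check that also tests the key points is not included.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]  # assumed (x, y) pixel coordinates


def classify_orientation(
    has_face: bool,
    keypoints: Dict[str, Point],
    shoulder_threshold: float,
) -> str:
    """Return "front", "side" or "back" for a human body image."""
    if not has_face:
        # No face object detected: treat the image as the back class.
        return "back"
    # Distance between the key points corresponding to the two shoulders.
    distance = math.dist(keypoints["left_shoulder"], keypoints["right_shoulder"])
    # A wide shoulder span suggests a frontal view, a narrow one a side view.
    return "front" if distance >= shoulder_threshold else "side"
```

A suitable threshold depends on image resolution and on how the person is cropped, so in practice it would likely be set relative to the image or body size rather than as a fixed pixel value.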

Step 203, acquiring a preset human body image set corresponding to the target category as a candidate human body image set.

In this embodiment, based on the target category determined in step 202, the executing entity may obtain a preset human body image set corresponding to the target category as a candidate human body image set based on a corresponding relationship between the category and the preset human body image set.
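One possible way to hold the correspondence between categories and preset human body image sets is a nested mapping, sketched below. The layout (category, then person identifier, then a list of image identifiers) and the name get_candidate_set are assumptions made for illustration; the disclosure does not prescribe a storage structure.

```python
from typing import Dict, List

# category -> person identifier -> images of that person in that category.
# Each inner list plays the role of one preset human body image subset.
PresetSets = Dict[str, Dict[str, List[str]]]


def get_candidate_set(
    preset_sets: PresetSets, target_category: str
) -> Dict[str, List[str]]:
    """Return the preset image set for the target category as the candidate set."""
    # Assumption: an unknown category yields an empty candidate set.
    return preset_sets.get(target_category, {})
```

Keying the subsets by a person identifier also makes it easy to find the subsets of the same person under other categories.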

Step 204, determining whether the similarity between the person corresponding to the preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to a target similarity threshold.

In this embodiment, for the preset human body image subset in the candidate human body image set obtained in step 203, the executing entity may first determine whether the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image is greater than or equal to the target similarity threshold. The similarity is a numerical value representing the degree of similarity between two objects (e.g., two people, two images, etc.); specifically, the greater the similarity, the more alike the two objects are. The target similarity threshold may be a predetermined similarity threshold, or may be a similarity threshold determined while step 204 is being executed. The similarity threshold is the minimum similarity required for two objects to be regarded as matching.

Specifically, for a preset human body image subset in the candidate human body image set, the executing entity may determine whether the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image is greater than or equal to the target similarity threshold value by using various methods.

In some optional implementations of the embodiment, for a preset human body image subset in the candidate human body image set, the executing body may perform the following steps: determining the similarity between the preset human body image included in the preset human body image subset and the target human body image as candidate similarity; generating the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image based on the obtained candidate similarity; and determining whether the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image is greater than or equal to a target similarity threshold value.

In this implementation, the executing entity may employ various methods to generate, based on the obtained candidate similarities, the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image. For example, the mean of the obtained candidate similarities may be calculated and used as the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image; or the smallest of the obtained candidate similarities may be determined as that similarity.

The executing entity may determine the candidate similarity by using various image similarity calculation methods, for example, a similarity calculation method based on a feature point, a similarity calculation method based on a perceptual hash algorithm, or a similarity calculation method based on a peak signal-to-noise ratio.
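The two aggregation options mentioned above (mean and minimum of the candidate similarities) can be sketched as follows. The candidate similarities are assumed to be precomputed by some image similarity method; the function names person_similarity and meets_threshold and the mode parameter are hypothetical.

```python
from statistics import mean
from typing import Sequence


def person_similarity(candidate_similarities: Sequence[float], mode: str = "mean") -> float:
    """Aggregate per-image candidate similarities into one person-level similarity."""
    if not candidate_similarities:
        raise ValueError("a preset subset must contain at least one image")
    if mode == "mean":
        # Average over all preset images in the subset.
        return mean(candidate_similarities)
    if mode == "min":
        # Most conservative option: the least similar preset image decides.
        return min(candidate_similarities)
    raise ValueError(f"unknown mode: {mode!r}")


def meets_threshold(
    candidate_similarities: Sequence[float], threshold: float, mode: str = "mean"
) -> bool:
    """Check whether the person-level similarity reaches the target threshold."""
    return person_similarity(candidate_similarities, mode) >= threshold
```

The minimum is stricter than the mean, since a single dissimilar preset image is enough to lower the person-level similarity below the threshold.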

In some optional implementations of this embodiment, the preset human body image subset corresponds to a target similarity threshold (i.e., different preset human body image subsets correspond to different target similarity thresholds); and for a preset human body image subset in the candidate human body image set, the executing body may determine whether the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image is greater than or equal to a target similarity threshold corresponding to the preset human body image subset.

It can be understood that the human body features of different people differ, and so does the difficulty of matching them. Setting different target similarity thresholds for different preset human body image subsets in this implementation therefore improves the adaptability of matching and enables more flexible image matching.

In some optional implementation manners of this embodiment, the target similarity threshold corresponding to the preset human body image subset may be obtained by the executing main body or other electronic device through the following steps: determining the number of preset human body images in a preset human body image subset; based on the pre-established corresponding relationship between the number and the similarity threshold, the similarity threshold corresponding to the determined number is obtained as the target similarity threshold corresponding to the preset human body image subset, wherein the number and the similarity threshold are positively correlated (i.e., the greater the number, the greater the similarity threshold).

It can be understood that the more the number of the preset human body images included in the preset human body image subset is, the more the reference data for performing person matching is, and the more accurate the obtained matching result is, at this time, a stricter evaluation criterion (corresponding to a larger similarity threshold) may be set to determine whether the person corresponding to the target human body image matches with the person corresponding to the preset human body image subset; on the contrary, the smaller the number of the preset human body images included in the preset human body image subset, the less the reference data for performing person matching, and the more inaccurate the obtained matching result, at this time, a more relaxed judgment criterion (corresponding to a smaller similarity threshold) may be set to determine whether the person corresponding to the target human body image matches with the person corresponding to the preset human body image subset. Therefore, on the premise of ensuring the matching precision, the passing rate of matching is improved, and the application range of image matching is enlarged.
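A pre-established correspondence between the number of images and the similarity threshold could, for instance, take the form of a step table, sketched below. The breakpoints and threshold values are invented purely for illustration, and the step function is only non-decreasing in the number of images, which is one possible reading of "positively correlated".

```python
from bisect import bisect_right

# Hypothetical correspondence: subsets with fewer than 5 images use 0.60,
# 5 to 19 images use 0.70, 20 to 49 images use 0.80, 50 or more use 0.90.
COUNT_BREAKPOINTS = [5, 20, 50]
THRESHOLDS = [0.60, 0.70, 0.80, 0.90]


def threshold_for_count(image_count: int) -> float:
    """Look up the target similarity threshold for a subset of the given size."""
    if image_count < 0:
        raise ValueError("image_count must be non-negative")
    # bisect_right returns how many breakpoints are <= image_count,
    # which is exactly the index of the matching threshold band.
    return THRESHOLDS[bisect_right(COUNT_BREAKPOINTS, image_count)]
```

Because the lookup depends only on the current size of the subset, the threshold rises on its own as images are added to the subset.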

Step 205, in response to determining that the similarity between the person corresponding to the preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to the target similarity threshold, determining the preset human body image subset whose similarity is greater than or equal to the target similarity threshold as the result human body image subset corresponding to the target human body image.

In this embodiment, the executing entity may, in response to determining that the similarity between the person corresponding to the preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to the target similarity threshold, determine the preset human body image subset whose similarity is greater than or equal to the target similarity threshold as the result human body image subset corresponding to the target human body image. The result human body image subset is the matched preset human body image subset that most probably corresponds to the same person as the target human body image.

Specifically, the executing entity may first determine a similarity between a person corresponding to each preset human body image subset in the candidate human body image set and a person corresponding to the target human body image. Then, the executing body may obtain a preset human body image subset with a corresponding similarity greater than or equal to the target similarity threshold as a result human body image subset corresponding to the target human body image.
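The two-stage procedure above (compute a similarity per subset, then keep the subsets that reach the threshold) can be sketched as a simple filter. The person-level similarities are assumed to be already computed and keyed by a hypothetical person identifier; the name matching_subsets is likewise an assumption.

```python
from typing import Dict, List


def matching_subsets(similarities: Dict[str, float], threshold: float) -> List[str]:
    """Return the identifiers of subsets whose similarity reaches the threshold."""
    return [
        person_id
        for person_id, similarity in similarities.items()
        if similarity >= threshold  # "greater than or equal to" as in step 205
    ]
```

An empty list means that no preset human body image subset in the candidate set matched the target human body image.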

In some optional implementation manners of this embodiment, after obtaining the result human body image subset corresponding to the target human body image, the execution main body may add the target human body image to the result human body image subset to obtain an updated result human body image subset. Therefore, the number of the human body images in the result human body image subset can be increased, and the matching precision can be improved when the updated result human body image subset is used for image matching subsequently.

It should be noted that, if each preset human body image subset corresponds to a target similarity threshold, and the target similarity threshold is determined based on the number of human body images included in the corresponding preset human body image subset, in this implementation manner, as the number of human body images in the updated result human body image subset increases, the target similarity threshold corresponding to the updated result human body image subset increases.

In some optional implementations of this embodiment, the executing main body may determine the preset human body image subset whose similarity is greater than or equal to the target similarity threshold as the result human body image subset corresponding to the target human body image through the following steps: first, the execution subject may determine whether the candidate human body image set includes at least two preset human body image subsets whose similarity is greater than or equal to the corresponding target similarity threshold. Then, in response to determining that the candidate human body image set includes at least two such preset human body image subsets, the executing entity may select, from them, the preset human body image subset that includes the largest number of preset human body images as the result human body image subset corresponding to the target human body image.

It can be understood that the more the number of the preset human body images in the preset human body image subset is, the larger the target similarity threshold corresponding to the preset human body image subset can be, and further, the higher the matching precision when the preset human body image subset is used for image matching is, therefore, in the implementation mode, the accuracy of the matched result human body image subset can be further improved by selecting the preset human body image subset with the largest number of the preset human body images from at least two preset human body image subsets with the similarity greater than or equal to the target similarity threshold as the result human body image subset.
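The selection rule above can be sketched as follows, assuming each subset has its own threshold and a precomputed person-level similarity. The function name select_result_subset and the dictionary layout are hypothetical, and the tie-breaking behaviour (the first of several equally large subsets wins) is an assumption the disclosure does not address.

```python
from typing import Dict, List, Optional


def select_result_subset(
    subsets: Dict[str, List[str]],
    similarities: Dict[str, float],
    thresholds: Dict[str, float],
) -> Optional[str]:
    """Pick the matching subset that contains the most preset images."""
    # Keep only subsets whose similarity reaches their own threshold.
    matched = [
        person_id
        for person_id in subsets
        if similarities[person_id] >= thresholds[person_id]
    ]
    if not matched:
        return None  # no subset matched the target image
    # Among the matches, prefer the subset with the largest number of images.
    return max(matched, key=lambda person_id: len(subsets[person_id]))
```

When only one subset matches, the same function simply returns that subset, so the single-match and multi-match cases do not need separate handling.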

In some optional implementations of this embodiment, after obtaining the result human body image subset, the execution main body may further output prompt information indicating that the result human body image subset corresponding to the target human body image has been determined. Specifically, the execution main body may output and present the prompt information itself, or may output the prompt information to another communicatively connected electronic device (such as the terminal device shown in fig. 1), so that the electronic device presents the prompt information. Here, the prompt information may take various forms (e.g., text, image, audio, video, etc.). Specifically, the prompt information may be predetermined information (for example, "1"), or may be information generated based on the result human body image subset (for example, the result human body image subset itself may be used as the prompt information).

With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the method for processing an image according to the present embodiment.

In the application scenario of fig. 3, the server 31 may first acquire the target human body image 33 transmitted by the mobile phone 32. Then, the server 31 may determine, from a predetermined category set 34, the category to which the target human body image 33 belongs as the target category 35. The category set 34 includes a first category 341 (for example, the face-containing class) and a second category 342 (for example, the face-not-containing class), which correspond to a preset human body image set 361 and a preset human body image set 362, respectively. The category of each preset human body image in the preset human body image set 361 is the first category 341, and the category of each preset human body image in the preset human body image set 362 is the second category 342. Each of the preset human body image set 361 and the preset human body image set 362 includes at least one preset human body image subset, and the preset human body images in a preset human body image subset correspond to the same person.

Then, the server 31 may acquire a preset human image set corresponding to the target category 35 as the candidate human image set 37.

Finally, the server 31 may determine whether the similarity between the person corresponding to the preset human body image subset in the candidate human body image set 37 and the person corresponding to the target human body image 33 is greater than or equal to a target similarity threshold, and, in response to determining that it is, determine the preset human body image subset whose similarity is greater than or equal to the target similarity threshold as the result human body image subset 38 corresponding to the target human body image 33.

The method provided by the embodiment of the disclosure can match the human body image subset from the preset human body image set corresponding to the target category based on the target category to which the target human body image belongs, so that the matching precision can be improved, and a more accurate result human body image subset can be obtained; in the process of matching the result human body image subsets, the matching is only needed to be performed on the preset human body image subsets in the preset human body image set corresponding to the target categories, and compared with the method that all the preset human body image subsets are matched without classification, the matching complexity can be reduced, the image processing efficiency is improved, and the resources consumed in the image processing process are reduced; in addition, the method and the device can determine the human body images with the similarity meeting the preset requirement with the target human body images in batches based on the matching of the preset human body image subset and the target human body images, so that the diversity of the determined human body images can be improved, and the accuracy of image matching can be further improved.

With further reference to FIG. 4, a flow 400 of yet another embodiment of a method for processing an image is shown. The flow 400 of the method for processing an image comprises the steps of:

Step 401, acquiring a target human body image.

In this embodiment, an execution subject (e.g., the server shown in fig. 1) of the method for processing an image may acquire the target human body image from a remote or local place through a wired or wireless connection. The target human body image is a human body image extracted from the target human body video. The target human body video may be a human body video for which it is to be determined whether the corresponding person is a predetermined person. Specifically, the target human body video may be a video obtained by photographing a human body (e.g., a pedestrian).

It can be understood that, since the target human body video is obtained by shooting a human body, the video frame in the target human body video may include a human body object, and further, the video frame in the target human body video is substantially a human body image, and extracting a human body image from the target human body video is equivalent to extracting a video frame from the target human body video. Specifically, the execution subject or other electronic device may extract a human body image from the target human body video as the target human body image by using various methods. For example, random extraction may be adopted, or a human body image with high definition may be extracted as the target human body image.

Step 402, determining a category to which the target human body image belongs from a predetermined category set as a target category.

In this embodiment, based on the target human body image obtained in step 401, the execution subject may determine, from a predetermined category set, the category to which the target human body image belongs as the target category. Each category in the category set corresponds to a preset human body image set. A preset human body image set may be a human body image set formed in advance from existing human body images. Each preset human body image set includes at least one preset human body image subset. The preset human body images in a preset human body image subset correspond to the same person.

Step 403, acquiring a preset human body image set corresponding to the target category as a candidate human body image set.

In this embodiment, based on the target category determined in step 402, the executing entity may obtain a preset human body image set corresponding to the target category as a candidate human body image set based on a corresponding relationship between the category and the preset human body image set.

Step 404, determining whether the similarity between the person corresponding to the preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to a target similarity threshold.

In this embodiment, for the preset human body image subset in the candidate human body image set obtained in step 403, the executing entity may first determine whether the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image is greater than or equal to the target similarity threshold. The similarity is a numerical value used for representing the degree of similarity between two objects, and specifically, the greater the similarity, the higher the degree of similarity that can represent two objects. The target similarity threshold may be a predetermined similarity threshold, or may be a similarity threshold determined in the process of executing step 404.

Step 405, in response to determining that the similarity between the person corresponding to the preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to the target similarity threshold, determining the preset human body image subset greater than or equal to the target similarity threshold as a result human body image subset corresponding to the target human body image.

Step 401, step 402, step 403, step 404, and step 405 may be performed in a manner similar to that of step 201, step 202, step 203, step 204, and step 205 in the foregoing embodiment, respectively, and the above description for step 201, step 202, step 203, step 204, and step 205 also applies to step 401, step 402, step 403, step 404, and step 405, and is not repeated here.

Step 406, selecting a target number of human body images from the target human body video.

In this embodiment, the executing body may obtain a target human body video to which the target human body image belongs, and select a target number of human body images from the target human body video. The target number may be a preset number (e.g., "3"), or a number determined based on the number of video frames included in the target human body video (e.g., the target number is one-half of the number of video frames included in the target human body video).

Specifically, the execution main body may select the target number of human body images from the target human body video by various methods. For example, the execution main body may select the target number of human body images at random, or may select the human body images ranked first in the human body image sequence (i.e., the video frame sequence) corresponding to the target human body video.
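The two selection strategies mentioned above can be sketched as follows, treating the video as an already decoded sequence of frames. The function name select_frames, the strategy labels, and the optional rng parameter are hypothetical; capping the result at the number of available frames is an assumption.

```python
import random
from typing import List, Optional, Sequence, TypeVar

Frame = TypeVar("Frame")


def select_frames(
    frames: Sequence[Frame],
    target_number: int,
    strategy: str = "first",
    rng: Optional[random.Random] = None,
) -> List[Frame]:
    """Select up to target_number human body images from a frame sequence."""
    # Assumption: never request more frames than the video contains.
    count = min(target_number, len(frames))
    if strategy == "first":
        # Take the frames ranked first in the video frame sequence.
        return list(frames[:count])
    if strategy == "random":
        # Draw distinct frames at random (without replacement).
        return (rng or random.Random()).sample(list(frames), count)
    raise ValueError(f"unknown strategy: {strategy!r}")
```

A target number derived from the video length, such as one-half of the frames, could be passed as len(frames) // 2.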

Step 407, for the human body images in the target number of human body images, executing the following processing steps: determining the category of the human body image from the category set; and in response to that the category to which the human body image belongs is the same as the target category, adding the human body image to the result human body image subset to obtain an updated result human body image subset.

In the present embodiment, for the human body image out of the target number of human body images obtained in step 406, the executing body may execute the following processing steps:

First, determining the category to which the human body image belongs from the category set.

Here, the category to which the human body image belongs may be determined by referring to the determination method of the target category in step 402, and details are not repeated here.

Second, in response to the category to which the human body image belongs being the same as the target category, adding the human body image to the result human body image subset to obtain an updated result human body image subset.

It can be understood that matching the result human body image subset can indicate that the person corresponding to the target human body image is matched with the person corresponding to the result human body image subset, and further, can indicate that the person corresponding to the target human body video is matched with the person corresponding to the result human body image subset. However, the result human body image subset also corresponds to a target category (for example, a front category), and therefore, if a human body image selected from the target human body video is to be added to the result human body image subset, it is also necessary to ensure that the category of the selected human body image is the target category.

In some optional implementations of this embodiment, the processing step may further include:

Third, in response to the category to which the human body image belongs being different from the target category, determining, from the preset human body image set corresponding to the category to which the human body image belongs, the preset human body image subset that corresponds to the same person as the result human body image subset; and adding the human body image to the determined preset human body image subset to obtain an updated preset human body image subset.

In practice, the preset human body image subsets corresponding to the same person may be associated with one another in advance (for example, stored in association, or carrying the same person identifier). Accordingly, in response to the category to which the human body image belongs being different from the target category, the execution main body may first obtain the preset human body image set corresponding to the category (for example, the side class) to which the human body image belongs, and then determine, from that preset human body image set, the preset human body image subset associated with the result human body image subset, that is, the preset human body image subset whose category is the same as the category to which the human body image belongs and which corresponds to the same person as the human body image.

Here, through the above processing steps, the number of human body images included in the human body image subset (including the result human body image subset and the preset human body image subset) can be increased, which is helpful for improving the matching precision when performing person matching by using the updated human body image subset.

It should be noted that, if each preset human body image subset corresponds to a target similarity threshold, and the target similarity threshold is determined based on the number of human body images included in the corresponding preset human body image subset, after the processing step is executed, as the number of human body images in the updated human body image subset increases, the target similarity threshold corresponding to the updated human body image subset increases.
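The processing steps of step 407, including the optional third step, can be sketched as follows. The sketch assumes that subsets of the same person share a person identifier across categories, so that the second and third steps reduce to the same lookup; the names route_images and classify, and the choice to skip an image when no associated subset exists, are assumptions not taken from the disclosure.

```python
from typing import Callable, Dict, List, Tuple

# category -> person identifier -> images (one list per preset subset).
PresetSets = Dict[str, Dict[str, List[str]]]


def route_images(
    preset_sets: PresetSets,
    person_id: str,
    images: List[str],
    classify: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Add each selected image to the matched person's subset for its category."""
    added = []
    for image in images:
        # Determine the category to which this human body image belongs.
        category = classify(image)
        # Find the subset of the same person under that category.
        subset = preset_sets.get(category, {}).get(person_id)
        if subset is None:
            # Assumption: without an associated subset the image is skipped.
            continue
        subset.append(image)
        added.append((category, image))
    return added
```

If the thresholds are derived from the subset sizes, they would need to be looked up again after this update, since the sizes have changed.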

As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the flow 400 of the method for processing images in this embodiment highlights that after the result human body image subset corresponding to the target human body image is determined, a target number of human body images are selected from the target human body video to which the target human body image belongs, and for a human body image in the target number of human body images, a category to which the human body image belongs is determined from the category set, and in response to that the category to which the human body image belongs is the same as the target category, the human body image is added to the result human body image subset, so as to obtain an updated result human body image subset. Therefore, according to the scheme described in this embodiment, after the result human body image subset is determined, the number of human body images in the result human body image subset can be increased based on the target human body video to which the target human body image belongs, so that the diversity of the result human body image subset can be improved, the data for matching can be increased when the result human body image subset is subsequently used for matching a new target human body image, and the accuracy of image matching can be improved.

With further reference to fig. 5, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an apparatus for processing an image, which corresponds to the method embodiment shown in fig. 2, and which is particularly applicable in various electronic devices.

As shown in fig. 5, the apparatus 500 for processing an image of the present embodiment includes: a first acquiring unit 501, a first determining unit 502, a second acquiring unit 503, a second determining unit 504, and a third determining unit 505. The first acquiring unit 501 is configured to acquire a target human body image; the first determining unit 502 is configured to determine a category to which the target human body image belongs as a target category from a predetermined category set, wherein the category in the category set corresponds to a preset human body image set, the category of the preset human body image in the preset human body image set is the category corresponding to the preset human body image set, the preset human body image set includes at least one preset human body image subset, and the preset human body images in the preset human body image subset correspond to the same person; the second acquiring unit 503 is configured to acquire a preset human body image set corresponding to the target category as a candidate human body image set; the second determining unit 504 is configured to determine whether the similarity between the person corresponding to the preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to a target similarity threshold; the third determining unit 505 is configured to, in response to determining that the similarity between the person corresponding to the preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to the target similarity threshold, determine the preset human body image subset whose similarity is greater than or equal to the target similarity threshold as a result human body image subset corresponding to the target human body image.

In this embodiment, the first acquisition unit 501 of the apparatus 500 for processing an image may acquire the target human body image remotely or locally through a wired or wireless connection. The target human body image may be a human body image for which it is to be determined whether the corresponding person is a predetermined person.

In this embodiment, based on the target human body image acquired by the first acquisition unit 501, the first determination unit 502 may determine, from a predetermined category set, the category to which the target human body image belongs as a target category. Each category in the category set corresponds to a preset human body image set. A preset human body image set may be a set of human body images formed in advance from existing human body images. Each preset human body image set comprises at least one preset human body image subset, and the preset human body images in a preset human body image subset correspond to the same person.

In this embodiment, based on the target category determined by the first determination unit 502, the second acquisition unit 503 may acquire the preset human body image set corresponding to the target category as a candidate human body image set.

In this embodiment, for a preset human body image subset in the candidate human body image set acquired by the second acquisition unit 503, the second determination unit 504 may determine whether the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image is greater than or equal to a target similarity threshold. The similarity is a numerical value representing the degree of similarity between two objects, and the target similarity threshold may be a predetermined similarity threshold or a currently determined similarity threshold.

In this embodiment, in response to determining that the similarity between the person corresponding to a preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to the target similarity threshold, the third determination unit 505 may determine that preset human body image subset as the result human body image subset corresponding to the target human body image. The result human body image subset is the matched human body image subset that most probably corresponds to the same person as the target human body image.

It will be understood that the units described in the apparatus 500 correspond to the respective steps in the method described with reference to fig. 2. Thus, the operations, features and resulting advantages described above with respect to the method also apply to the apparatus 500 and the units included therein, and are not described herein again.

The apparatus 500 provided in the above embodiment of the present disclosure may match human body image subsets from the preset human body image set corresponding to the target category to which the target human body image belongs, which improves the matching precision and yields a more accurate result human body image subset. Because matching is performed only on the preset human body image subsets in the preset human body image set corresponding to the target category, rather than on all preset human body image subsets without classification, the matching complexity can be reduced, the image processing efficiency is improved, and the resources consumed in the image processing process are reduced. In addition, because the target human body image is matched against preset human body image subsets, human body images whose similarity to the target human body image meets the preset requirement can be determined in batches, so that the diversity of the determined human body images can be improved and the accuracy of image matching can be further improved.
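The cooperation of units 501 to 505 described above can be illustrated with a minimal Python sketch. It is not the implementation of the disclosure: the representation of images as feature vectors, the cosine similarity measure, the use of the mean as the person-level similarity, and the names `match_subsets`, `classify`, and `gallery` are all assumptions made for illustration.

```python
from math import sqrt
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]
# Assumed layout: category -> person id -> feature vectors of that person's images.
Gallery = Dict[str, Dict[str, List[Vector]]]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity, an assumed stand-in for the image similarity measure."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_subsets(
    target: Vector,
    classify: Callable[[Vector], str],
    gallery: Gallery,
    threshold: float,
) -> List[str]:
    """Return the ids of the persons whose subset matches the target image."""
    category = classify(target)              # role of determination unit 502
    candidates = gallery.get(category, {})   # role of acquisition unit 503
    results = []
    for person_id, images in candidates.items():
        if not images:
            continue
        # Role of determination unit 504: person-level similarity, here the
        # mean of the image-level similarities (an assumption).
        similarity = sum(cosine(target, img) for img in images) / len(images)
        if similarity >= threshold:          # role of determination unit 505
            results.append(person_id)
    return results


if __name__ == "__main__":
    demo_gallery = {
        "front": {"alice": [[1.0, 0.0], [0.9, 0.1]], "bob": [[0.0, 1.0]]},
        "back": {"carol": [[1.0, 0.0]]},
    }
    print(match_subsets([1.0, 0.0], lambda v: "front", demo_gallery, 0.8))
```

Only the subsets stored under the target category are compared, which is the source of the reduced matching complexity noted above; subsets under other categories are never visited.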

Referring now to fig. 6, a schematic diagram of an electronic device (e.g., a terminal device or a server in fig. 1) 600 suitable for implementing embodiments of the present disclosure is shown. The terminal device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), and the like, and a stationary terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 6 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiments of the present disclosure.

As shown in fig. 6, electronic device 600 may include a processing means (e.g., central processing unit, graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic device 600 are also stored. The processing means 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.

Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage devices 608 including, for example, magnetic tape, hard disk, etc.; and a communication device 609. The communication device 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 illustrates an electronic device 600 having various devices, it is to be understood that not all illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.

In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of the embodiments of the present disclosure.

It should be noted that the computer readable medium in the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.

The computer readable medium may be embodied in the electronic device, or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire a target human body image; determine, from a predetermined category set, the category to which the target human body image belongs as a target category, wherein the category in the category set corresponds to a preset human body image set, the category of the preset human body image in the preset human body image set is the category corresponding to the preset human body image set, the preset human body image set comprises at least one preset human body image subset, and the preset human body images in the preset human body image subset correspond to the same person; acquire the preset human body image set corresponding to the target category as a candidate human body image set; determine whether the similarity between the person corresponding to a preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to a target similarity threshold; and, in response to determining that the similarity is greater than or equal to the target similarity threshold, determine the preset human body image subset as a result human body image subset corresponding to the target human body image.

Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented by software or hardware. Here, the name of the unit does not constitute a limitation to the unit itself in some cases, and for example, the first acquisition unit may also be described as a "unit that acquires an image of the target human body".

The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to the particular combination of features described above, but also encompasses other embodiments in which any combination of the features described above or their equivalents does not depart from the spirit of the disclosure. For example, a technical solution may be formed by replacing the above features with (but not limited to) features having similar functions disclosed in the present disclosure.

Claims

1. A method for processing an image, comprising:

acquiring a target human body image;

determining a category to which the target human body image belongs as a target category from a predetermined category set, wherein the category in the category set corresponds to a preset human body image set, the category of the preset human body image in the preset human body image set is the category corresponding to the preset human body image set, the preset human body image set comprises at least one preset human body image subset, and the preset human body images in the preset human body image subset correspond to the same person;

acquiring a preset human body image set corresponding to a target category as a candidate human body image set;

determining whether the similarity between the person corresponding to a preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to a target similarity threshold value or not;

and in response to determining that the similarity is greater than or equal to the target similarity threshold, determining the preset human body image subset whose similarity is greater than or equal to the target similarity threshold as a result human body image subset corresponding to the target human body image.

2. The method of claim 1, wherein the set of categories comprises a face-containing class and a non-face-containing class divided based on whether the human body image contains a face object; and

the determining the category to which the target human body image belongs from a predetermined category set as a target category includes:

inputting the target human body image into a pre-trained human face detection model to obtain a detection result, wherein the detection result is used for indicating whether the target human body image contains a human face object;

and determining the category to which the target human body image belongs from the category set as a target category based on the detection result.

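3. The method of claim 1, wherein the set of categories includes a front class, a side class, and a back class divided based on an orientation of a human object in a human image; and

the determining the category to which the target human body image belongs from a predetermined category set as a target category includes: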

inputting the target human body image into a human body skeleton key point identification model trained in advance to obtain an identification result, wherein the identification result is used for indicating the position of the human body skeleton key point in the target human body image;

and determining the category to which the target human body image belongs from the category set as a target category based on the obtained detection result and the recognition result.

4. The method of claim 1, wherein the method further comprises:

and adding the target human body image into the result human body image subset to obtain an updated result human body image subset.

5. The method of claim 1, wherein the target human body image is a human body image extracted from a target human body video; and

after determining the preset human body image subset larger than or equal to the target similarity threshold value as a result human body image subset corresponding to the target human body image, the method further includes:

selecting a target number of human body images from the target human body video;

for a human body image of the target number of human body images, performing the following processing steps: determining the category to which the human body image belongs from the category set; and in response to the category to which the human body image belongs being the same as the target category, adding the human body image to the result human body image subset to obtain an updated result human body image subset.

6. The method of claim 5, wherein the processing step further comprises:

in response to the category to which the human body image belongs being different from the target category, determining, from the preset human body image set corresponding to the category to which the human body image belongs, a preset human body image subset corresponding to the same person as the result human body image subset; and adding the human body image into the determined preset human body image subset to obtain an updated preset human body image subset.

7. The method of claim 1, wherein the determining whether the similarity between the person corresponding to the preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to a target similarity threshold comprises:

for a preset human body image subset in the candidate human body image set, executing the following steps: determining the similarity between the preset human body image included in the preset human body image subset and the target human body image as a candidate similarity; generating the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image based on the obtained candidate similarity; and determining whether the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image is greater than or equal to a target similarity threshold value.

8. The method according to claim 7, wherein the preset human body image subset corresponds to a target similarity threshold; and

the determining whether the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image is greater than or equal to a target similarity threshold value comprises:

and determining whether the similarity between the person corresponding to the preset human body image subset and the person corresponding to the target human body image is greater than or equal to a target similarity threshold corresponding to the preset human body image subset.

9. The method according to claim 8, wherein the target similarity threshold corresponding to the preset human body image subset is obtained by:

determining the number of preset human body images in a preset human body image subset;

and acquiring a similarity threshold corresponding to the determined quantity as a target similarity threshold corresponding to a preset human body image subset based on the corresponding relation between the quantity and the similarity threshold established in advance, wherein the quantity is positively correlated with the similarity threshold.

10. The method according to claim 9, wherein the determining a preset human body image subset larger than or equal to a target similarity threshold as a result human body image subset corresponding to the target human body image comprises:

determining whether the candidate human body image set comprises at least two preset human body image subsets with the similarity greater than or equal to the corresponding target similarity threshold;

and in response to determining that it does, selecting, from the preset human body image subsets with the similarity greater than or equal to the corresponding target similarity threshold, the preset human body image subset that includes the largest number of preset human body images as the result human body image subset corresponding to the target human body image.

11. The method according to one of claims 1 to 10, wherein after determining the preset subset of human images greater than or equal to the target similarity threshold as the result subset of human images corresponding to the target human image, the method further comprises:

and outputting prompt information indicating that the result human body image subset corresponding to the target human body image has been determined.

12. An apparatus for processing an image, comprising:

a first acquisition unit configured to acquire a target human body image;

a first determining unit, configured to determine a category to which the target human body image belongs as a target category from a predetermined category set, wherein the category in the category set corresponds to a preset human body image set, the category of the preset human body image in the preset human body image set is a category corresponding to the preset human body image set, the preset human body image set includes at least one preset human body image subset, and the preset human body images in the preset human body image subset correspond to the same person;

a second acquisition unit configured to acquire a preset human body image set corresponding to the target category as a candidate human body image set;

a second determining unit configured to determine whether the similarity between the person corresponding to the preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to a target similarity threshold;

a third determining unit configured to determine, in response to determining that the similarity between the person corresponding to the preset human body image subset in the candidate human body image set and the person corresponding to the target human body image is greater than or equal to a target similarity threshold, the preset human body image subset greater than or equal to the target similarity threshold as a result human body image subset corresponding to the target human body image.

13. An electronic device, comprising:

one or more processors;

a storage device having one or more programs stored thereon,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-11.

14. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-11.
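As an illustration of the threshold selection and subset selection recited in claims 8 to 10, the following Python sketch looks up a per-subset threshold from the number of images in the subset and, when several subsets pass, keeps the one with the most images. The table `SIZE_TO_THRESHOLD`, its values, and the function names are hypothetical; the claims only require that the threshold be positively correlated with the number of images.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical correspondence between subset size and similarity threshold,
# given as (minimum subset size, threshold). Larger subsets get a higher
# threshold, so the number of images and the threshold are positively correlated.
SIZE_TO_THRESHOLD: List[Tuple[int, float]] = [(1, 0.60), (5, 0.70), (20, 0.80)]


def threshold_for(size: int) -> float:
    """Return the threshold of the largest size bracket that `size` reaches."""
    threshold = SIZE_TO_THRESHOLD[0][1]
    for min_size, value in SIZE_TO_THRESHOLD:
        if size >= min_size:
            threshold = value
    return threshold


def select_result(
    subset_sizes: Dict[str, int],
    similarities: Dict[str, float],
) -> Optional[str]:
    """Pick the result subset among candidate subsets keyed by person id.

    `subset_sizes` maps a person id to the number of preset images in that
    person's subset; `similarities` maps it to the person-level similarity
    with the target image.
    """
    matched = [
        pid
        for pid, size in subset_sizes.items()
        if similarities[pid] >= threshold_for(size)
    ]
    if not matched:
        return None
    # With two or more matches, prefer the subset containing the most images.
    return max(matched, key=lambda pid: subset_sizes[pid])


if __name__ == "__main__":
    sizes = {"a": 3, "b": 25, "c": 6}
    sims = {"a": 0.65, "b": 0.82, "c": 0.69}
    print(select_result(sizes, sims))
```

In this sketch a single matching subset is returned directly, and `None` signals that no subset reached its threshold; how the disclosure handles those cases beyond what the claims state is not assumed here.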