CN110442742A9 - Method and device for retrieving image, processor, electronic equipment and storage medium - Google Patents

Method and device for retrieving image, processor, electronic equipment and storage medium Download PDF

Info

Publication number
CN110442742A9
CN110442742A9 · CN201910704558.0A · CN110442742A
Authority
CN
China
Prior art keywords
image
face
database
processed
retrieved
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201910704558.0A
Other languages
Chinese (zh)
Other versions
CN110442742A (en
Inventor
黄潇莹
李蔚琳
杨松
李晓通
付豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Sensetime Technology Co Ltd
Original Assignee
Shenzhen Sensetime Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Sensetime Technology Co Ltd filed Critical Shenzhen Sensetime Technology Co Ltd
Priority to CN201910704558.0A priority Critical patent/CN110442742A/en
Publication of CN110442742A publication Critical patent/CN110442742A/en
Publication of CN110442742A9 publication Critical patent/CN110442742A9/en
Withdrawn legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification

Abstract

The application discloses a method and a device for retrieving images, a processor, an electronic device and a storage medium. The method comprises the following steps: acquiring reference person information; retrieving a database by using the reference person information, and obtaining images in the database whose feature data match the reference person information as images to be retrieved; in the case that a retrieval request for the images to be retrieved is received, retrieving the images to be retrieved by using the reference face attribute in the retrieval request, and obtaining images among them whose feature data match the reference face attribute as target images. A corresponding apparatus, processor, electronic device and storage medium are also disclosed. The method and the device can be applied to the security field: the database is retrieved by using the reference person information and the reference face attribute to obtain the target image.

Description

Method and device for retrieving image, processor, electronic equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a method and an apparatus for retrieving an image, a processor, an electronic device, and a storage medium.
Background
At present, in order to enhance safety in work, life and social environments, surveillance cameras are installed in many kinds of places so that security work can be carried out based on the video streams they capture. As the number of cameras in public places grows rapidly, effectively identifying images that contain a target person within massive video streams, and determining information such as the target person's track from those images, is of great significance.
In the conventional technology, relevant personnel can determine images containing the target person by traversing the images in the video streams captured by the cameras, but this method is inefficient.
Disclosure of Invention
The application provides a method and a device for retrieving images, a processor, electronic equipment and a storage medium, which are used for retrieving images.
In a first aspect, a method for retrieving an image is provided, the method comprising: acquiring reference person information; retrieving a database by using the reference person information, and obtaining an image in the database whose feature data match the reference person information as an image to be retrieved; in the case that a retrieval request for the image to be retrieved is received, retrieving the image to be retrieved by using the reference face attribute in the retrieval request, and obtaining an image among the images to be retrieved whose feature data match the reference face attribute as a target image.
In this aspect, the images to be retrieved are first obtained by retrieving the database using the reference person information. When the user would otherwise need to spend a long time reviewing the images to be retrieved one by one, retrieving them with the reference face attribute to obtain the target image narrows the scope the user must review and improves the user's review efficiency.
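The two-stage flow of this aspect can be sketched as a pair of filters over the database. The record layout (`ImageRecord`), its fields, and the exact-match helpers below are hypothetical illustrations, not part of the disclosure:

```python
from dataclasses import dataclass, field

@dataclass
class ImageRecord:
    """A database entry: the image plus its feature data (hypothetical schema)."""
    image_id: str
    person_features: dict                                # e.g. capture time/place, body/face match keys
    face_attributes: dict = field(default_factory=dict)  # e.g. {"mask": True, "glasses": False}

def retrieve_target_images(database, reference_person_info, reference_face_attrs):
    """Stage 1: filter the database by the reference person information.
    Stage 2: filter the resulting images-to-be-retrieved by face attributes."""
    # Stage 1: images whose feature data match the reference person information
    to_be_retrieved = [
        rec for rec in database
        if all(rec.person_features.get(k) == v
               for k, v in reference_person_info.items())
    ]
    # Stage 2: among those, keep images matching every reference face attribute
    return [
        rec for rec in to_be_retrieved
        if all(rec.face_attributes.get(k) == v
               for k, v in reference_face_attrs.items())
    ]
```

Only the second stage runs when the retrieval request arrives, so the expensive per-attribute matching operates on the already-narrowed candidate set.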
In one possible implementation, the reference personal information includes a reference face image; the using the reference person information retrieval database to obtain an image of feature data matched with the reference person information in the database as an image to be retrieved includes: and retrieving the database by using the reference face image, and obtaining an image matched with the reference face image in the database as the image to be retrieved.
In this possible implementation, the database is first retrieved using the reference face image to obtain the images to be retrieved, and those images are then retrieved using the reference face attribute to obtain the target image, so that the target image containing the target person can be retrieved from the database quickly and accurately when the user has both a face image and a face attribute of the target person.
In another possible implementation manner, the reference person information includes a reference face image and a reference body image; the using the reference person information to search the database to obtain an image with characteristic data matched with the reference person information in the database as an image to be searched comprises the following steps: and retrieving the database by using the reference face image and the reference human body image, and obtaining an image matched with the reference face image and an image matched with the reference human body image in the database as an image to be retrieved.
In this possible implementation manner, the images to be retrieved may be obtained by retrieving the database using the reference face image and the reference human body image respectively, and the images to be retrieved may include the target person.
In yet another possible implementation manner, the reference person information includes a reference time range and/or a reference geographical position range, and the feature data of the images in the database includes an acquisition time and an acquisition position; the using the reference person information to search the database to obtain an image with characteristic data matched with the reference person information in the database as an image to be searched comprises the following steps: and taking the image with the acquisition time within the reference time range and/or the acquisition position within the reference geographical position range in the database as the image to be retrieved.
In this possible implementation, the database is first retrieved using the reference time range and/or the reference geographical position range to obtain the images to be retrieved, and those images are then retrieved using the reference face attribute to obtain the target image, so that the target image containing the target person can be retrieved from the database quickly and accurately when the user has the whereabouts information (time and/or location) and the face attribute of the target person.
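The time/position filter of this implementation can be sketched as follows; the record fields (`capture_time`, `lat`, `lon`) and the rectangular geographic range are illustrative assumptions, since the disclosure does not fix a storage format:

```python
def filter_by_whereabouts(database, time_range=None, geo_range=None):
    """Keep images whose acquisition time falls within time_range and/or whose
    acquisition position falls within geo_range; either filter may be None,
    mirroring the "and/or" of the implementation."""
    def in_time(rec):
        t0, t1 = time_range
        return t0 <= rec["capture_time"] <= t1

    def in_geo(rec):
        (lat0, lat1), (lon0, lon1) = geo_range
        return lat0 <= rec["lat"] <= lat1 and lon0 <= rec["lon"] <= lon1

    result = []
    for rec in database:
        if time_range is not None and not in_time(rec):
            continue
        if geo_range is not None and not in_geo(rec):
            continue
        result.append(rec)  # this image becomes an image to be retrieved
    return result
```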
In yet another possible implementation manner, the reference personal information includes a reference human body image; the using the reference person information to search the database to obtain an image with characteristic data matched with the reference person information in the database as an image to be searched comprises the following steps: and retrieving the database by using the reference human body image, and obtaining an image matched with the reference human body image in the database as the image to be retrieved.
In this possible implementation, the database is first retrieved using the reference human body image to obtain the images to be retrieved, and those images are then retrieved using the reference face attribute to obtain the target image, so that the target image containing the target person can be retrieved from the database quickly and accurately when the user has both a human body image and a face attribute of the target person.
In yet another possible implementation manner, before the obtaining of the reference person information, the method further includes: acquiring a video stream to be processed; performing face attribute extraction processing on the image to be processed in the video stream to be processed to obtain the face attribute of the image to be processed; and taking the face attribute as the feature data of the image to be processed to obtain the database, wherein the database comprises the image to be processed and the feature data of the image to be processed.
In this possible implementation manner, face attribute extraction processing is performed on the images to be processed in the video stream to be processed to obtain their face attributes, and those face attributes are used as the feature data of the images to build the database, so that the images in the database can subsequently be retrieved by reference face attributes.
In another possible implementation manner, before performing the face attribute extraction process on the image to be processed in the video stream to be processed to obtain the face attribute of the image to be processed, the method further includes: performing feature extraction processing on the image to be processed in the video stream to obtain feature data; and under the condition that the image to be processed contains the face according to the features in the feature data, executing the step of carrying out face attribute extraction processing on the image to be processed in the video stream to be processed to obtain the face attribute of the image to be processed.
In this possible implementation, it is first determined whether the image to be processed contains a face, and face attribute extraction processing is performed on the image only when it does, which reduces the amount of data processed into the database and increases the speed at which the database is built.
In another possible implementation manner, the performing a face attribute extraction process on the to-be-processed image in the to-be-processed video stream to obtain the face attribute of the to-be-processed image includes: according to the features in the feature data, obtaining the position of the face in the image to be processed, wherein the position is the position of any pair of opposite angles of a rectangular frame containing the face in the image to be processed; intercepting a rectangular area determined by the position in the image to be processed to obtain a face image; and carrying out face attribute extraction processing on the face image to obtain the face attribute of the image to be processed.
In this possible implementation, the position of the face in the image to be processed is determined from the extracted feature data, the face image is cropped according to that position, and face attribute extraction processing is performed on the face image to obtain the face attributes; this further reduces the amount of data processed into the database and speeds up database construction.
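The cropping step, given any pair of opposite corners of the face rectangle, can be sketched like this; representing the image as a plain 2D pixel array is an assumption made to keep the example self-contained:

```python
def crop_face_region(image, corner_a, corner_b):
    """image: 2D pixel array (list of rows); corner_a / corner_b: (x, y) of any
    pair of opposite corners of the rectangular frame containing the face.
    Sorting the coordinates first means the caller may pass top-left/bottom-right
    or bottom-left/top-right, in either order."""
    (xa, ya), (xb, yb) = corner_a, corner_b
    x0, x1 = sorted((xa, xb))
    y0, y1 = sorted((ya, yb))
    # Intercept the rectangular region determined by the two positions
    return [row[x0:x1 + 1] for row in image[y0:y1 + 1]]
```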
In another possible implementation manner, after the retrieving the image to be retrieved by using the reference face attribute in the retrieval request, and obtaining an image in the image to be retrieved, which has feature data matching with the reference face attribute, as a target image, the method further includes: sending a target face image with characteristic data matched with the reference face attribute to a terminal; and under the condition of receiving a detail display request aiming at the target face image sent by the terminal, sending a target image to be processed with feature data matched with the reference face attribute to the terminal.
In this possible implementation manner, a face image whose feature data match the reference face attribute is first sent to the terminal so that the user can confirm whether the person object contained in the face image is the target person. The corresponding image to be processed can then also be sent to the terminal so that the user can confirm further, which improves the efficiency with which the user identifies the target person.
In yet another possible implementation manner, the feature data of the image to be processed in the database includes an acquisition time and an acquisition position, and the method further includes: and under the condition of receiving a track display request sent by the terminal, sending an instruction to the terminal, wherein the instruction is used for indicating the terminal to display the acquisition time and the acquisition position of the image to be processed in a map.
In this possible implementation manner, the feature data of the image to be processed further include its acquisition time and acquisition position, so that in addition to obtaining the target image, the whereabouts of the person object in the target image can be determined from the target image's acquisition position and acquisition time.
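Turning the matched captures into a displayable track amounts to ordering them by acquisition time; the record fields (`capture_time`, `position`) below are hypothetical names for the feature data the implementation stores:

```python
def trajectory(target_records):
    """Order a target person's matched captures by acquisition time, yielding
    the sequence of (time, position) points a terminal could plot on a map."""
    return [(rec["capture_time"], rec["position"])
            for rec in sorted(target_records, key=lambda r: r["capture_time"])]
```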
In another possible implementation manner, after determining that the image to be processed includes a human face according to the features in the feature data, the method further includes: obtaining the quality score of the image to be processed according to a preset image quality evaluation index, wherein the image quality evaluation index comprises at least one of the following: the definition of a face region and the shielding condition of the face region; the feature extraction processing is performed on the image to be processed in the video stream to be processed to obtain feature data, and the feature data includes: and performing feature extraction processing on the image to be processed with the quality score reaching a threshold value to obtain the feature data.
In this possible implementation, the quality of the image to be processed is evaluated against the image quality evaluation index, and feature extraction processing is performed only on images whose quality score reaches the threshold; this improves the accuracy of the feature data in the database and, in turn, the accuracy of image retrieval.
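A minimal sketch of this quality gate; the weighting of clarity against occlusion and the default threshold are invented for illustration, since the disclosure names the evaluation indexes but does not fix a scoring formula:

```python
def quality_score(clarity, occlusion_ratio):
    """Hypothetical scoring rule combining the two named indexes:
    face-region clarity (0..1, higher is sharper) and the occlusion
    condition (fraction of the face occluded, 0..1)."""
    return 0.7 * clarity + 0.3 * (1.0 - occlusion_ratio)

def images_worth_processing(images, threshold=0.6):
    """Only images whose quality score reaches the threshold go on to
    feature extraction, reducing the load when building the database."""
    return [img for img in images
            if quality_score(img["clarity"], img["occlusion"]) >= threshold]
```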
In yet another possible implementation manner, the reference face attribute includes at least one of: whether a mask is worn, whether glasses are worn, glasses style, sex, age group, race, ethnicity, and beard type.
In this possible implementation manner, the target image may be obtained by retrieving the images to be retrieved using at least one of: whether a mask is worn, whether glasses are worn, glasses style, sex, age group, race, ethnicity, and beard type.
In another possible implementation manner, before the retrieving the image to be retrieved by using the reference face attribute in the retrieval request, and obtaining an image in the image to be retrieved, which has feature data matching with the reference face attribute, as a target image, the method further includes: determining a retrieval sequence according to the priority of preset attributes under the condition that the reference face attributes comprise at least two attributes; the retrieving the image to be retrieved by using the reference face attribute in the retrieval request to obtain an image with feature data matched with the reference face attribute in the image to be retrieved, wherein the image to be retrieved is used as a target image, and the retrieving method comprises the following steps: and sequentially using the attributes in the reference face attributes to retrieve the images to be retrieved according to the retrieval sequence, and obtaining images with characteristic data matched with the reference face attributes in the images to be retrieved as the target images.
In this possible implementation, the retrieval order of the reference face attributes is determined according to the priority of the face attributes, which can speed up retrieval and improve retrieval efficiency.
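The priority-ordered retrieval can be sketched as follows; the `ATTRIBUTE_PRIORITY` table is a hypothetical preset, and simple equality matching stands in for whatever attribute comparison the implementation actually uses:

```python
# Preset priority: lower number = retrieved first (hypothetical ordering)
ATTRIBUTE_PRIORITY = {"mask": 0, "glasses": 1, "sex": 2, "age_group": 3}

def retrieve_in_priority_order(images, reference_face_attrs):
    """Apply the reference attributes one by one, highest priority first.
    Filtering on the higher-priority attributes first shrinks the candidate
    set early, which is the speed-up this embodiment describes."""
    order = sorted(reference_face_attrs,
                   key=lambda a: ATTRIBUTE_PRIORITY.get(a, len(ATTRIBUTE_PRIORITY)))
    candidates = images
    for attr in order:
        candidates = [img for img in candidates
                      if img["attrs"].get(attr) == reference_face_attrs[attr]]
    return candidates
```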
In yet another possible implementation manner, the video stream to be processed is captured by a camera; the parameters of the camera include: a recognizable face yaw angle in the range of -45 degrees to +45 degrees, a recognizable face pitch angle in the range of -30 degrees to +30 degrees, and an inter-pupil distance in captured images of at least 18 pixels.
In this possible implementation manner, the face attributes may be extracted from the video stream acquired by the camera satisfying the above conditions (i.e., parameters) and a database may be established, and then the database may be retrieved by using the reference face attributes to obtain the target image.
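The camera-parameter constraints listed above translate directly into a predicate on a detected face; the angle and pixel thresholds are taken from the embodiment, while the function signature itself is illustrative:

```python
def face_meets_camera_constraints(yaw_deg, pitch_deg, pupil_distance_px):
    """Check a detected face against the embodiment's camera parameters:
    yaw within [-45, +45] degrees, pitch within [-30, +30] degrees, and
    an inter-pupil distance of at least 18 pixels."""
    return (-45 <= yaw_deg <= 45
            and -30 <= pitch_deg <= 30
            and pupil_distance_px >= 18)
```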
In a second aspect, there is provided an apparatus for retrieving an image, the apparatus comprising: an acquisition unit configured to acquire reference person information; a first retrieval unit configured to retrieve a database by using the reference person information, and obtain an image in the database whose feature data match the reference person information as an image to be retrieved; and a second retrieval unit configured to, in the case that a retrieval request for the image to be retrieved is received, retrieve the image to be retrieved by using the reference face attribute in the retrieval request, and obtain an image whose feature data match the reference face attribute as a target image.
In one possible implementation, the reference personal information includes a reference face image; the first retrieval unit is to: and retrieving the database by using the reference face image, and obtaining an image matched with the reference face image in the database as the image to be retrieved.
In another possible implementation manner, the reference person information includes a reference face image and a reference body image; the first retrieval unit is to: and retrieving the database by using the reference face image and the reference human body image, and obtaining an image matched with the reference face image and an image matched with the reference human body image in the database as an image to be retrieved.
In yet another possible implementation manner, the reference person information includes a reference time range and/or a reference geographical position range, and the feature data of the images in the database includes an acquisition time and an acquisition position; the first retrieval unit is to: and taking the image with the acquisition time within the reference time range and/or the acquisition position within the reference geographical position range in the database as the image to be retrieved.
In yet another possible implementation manner, the reference personal information includes a reference human body image; the first retrieval unit is to: retrieve the database by using the reference human body image, and obtain an image in the database matching the reference human body image as the image to be retrieved.
In yet another possible implementation manner, the apparatus for retrieving an image further includes: the acquiring unit is used for acquiring a video stream to be processed before the reference character information is acquired; the face attribute extraction unit is used for carrying out face attribute extraction processing on the image to be processed in the video stream to be processed to obtain the face attribute of the image to be processed; and the determining unit is used for taking the face attribute as the feature data of the image to be processed to obtain the database, wherein the database comprises the image to be processed and the feature data of the image to be processed.
In yet another possible implementation manner, the apparatus for retrieving an image further includes: the feature extraction processing unit is used for performing feature extraction processing on the image to be processed in the video stream to obtain feature data before performing face attribute extraction processing on the image to be processed in the video stream to obtain the face attribute of the image to be processed; the face attribute extraction unit is configured to, when it is determined that the image to be processed includes a face according to the features in the feature data, perform the step of performing face attribute extraction processing on the image to be processed in the video stream to be processed to obtain the face attribute of the image to be processed.
In yet another possible implementation manner, the face attribute extraction unit is configured to: according to the features in the feature data, obtaining the position of the face in the image to be processed, wherein the position is the position of any pair of opposite angles of a rectangular frame containing the face in the image to be processed; intercepting a rectangular area determined by the position in the image to be processed to obtain a face image; and performing face attribute extraction processing on the face image to obtain the face attribute of the image to be processed.
In yet another possible implementation manner, the apparatus for retrieving an image further includes: a sending unit, configured to, after the image to be retrieved is retrieved by using the reference face attribute in the retrieval request, obtain an image in the image to be retrieved, where the image has feature data matching the reference face attribute, and serve as a target image, send the target face image having feature data matching the reference face attribute to a terminal; the sending unit is further configured to send a target image to be processed having feature data matching the reference face attribute to the terminal in a case where a detail showing request for the target face image sent by the terminal is received.
In yet another possible implementation manner, the feature data of the image to be processed in the database includes an acquisition time and an acquisition location, and the apparatus for retrieving the image further includes: the sending unit is used for sending an instruction to the terminal under the condition of receiving a track display request sent by the terminal, wherein the instruction is used for indicating the terminal to display the acquisition time and the acquisition position of the image to be processed in a map.
In yet another possible implementation manner, the apparatus for retrieving an image further includes: an evaluation unit, configured to, after the determination that the image to be processed includes a human face according to the features in the feature data, obtain a quality score of the image to be processed according to a preset image quality evaluation index, where the image quality evaluation index includes at least one of: the definition of a face region and the shielding condition of the face region; the feature extraction processing unit is configured to: and performing feature extraction processing on the image to be processed with the quality score reaching a threshold value to obtain the feature data.
In yet another possible implementation manner, the reference face attribute includes at least one of: whether a mask is worn, whether glasses are worn, glasses style, sex, age group, race, ethnicity, and beard type.
In yet another possible implementation manner, the apparatus for retrieving an image further includes: a sorting unit, configured to, before the image to be retrieved is retrieved by using the reference face attribute in the retrieval request to obtain an image whose feature data match the reference face attribute as a target image, determine a retrieval order according to the priority of preset attributes when the reference face attribute includes at least two attributes; the second retrieval unit is to: retrieve the images to be retrieved by sequentially using the attributes in the reference face attribute according to the retrieval order, and obtain images among the images to be retrieved whose feature data match the reference face attribute as the target images.
In yet another possible implementation manner, the video stream to be processed is captured by a camera; the parameters of the camera include: a recognizable face yaw angle in the range of -45 degrees to +45 degrees, a recognizable face pitch angle in the range of -30 degrees to +30 degrees, and an inter-pupil distance in captured images of at least 18 pixels.
In a third aspect, a processor is provided, which is configured to perform the method according to the first aspect and any one of the possible implementations thereof.
In a fourth aspect, an electronic device is provided, comprising: a processor, transmitting means, input means, output means, and a memory for storing computer program code comprising computer instructions which, when executed by the processor, cause the electronic device to perform the method of the first aspect and any one of its possible implementations.
In a fifth aspect, there is provided a computer readable storage medium having stored therein a computer program comprising program instructions which, when executed by a processor of an electronic device, cause the processor to perform the method of the first aspect and any one of its possible implementations.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments or the background art of the present application, the drawings required to be used in the embodiments or the background art of the present application will be described below.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a schematic flowchart of a method for retrieving an image according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of another method for retrieving an image according to an embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of another method for retrieving an image according to an embodiment of the present disclosure;
fig. 4 is a flowchart illustrating a method for retrieving an image in a case where a user only has a face image of a target person according to an embodiment of the present application;
fig. 5 is a flowchart illustrating a method for retrieving an image in a case where a user only has a human body image of a target person according to an embodiment of the present application;
fig. 6 is a schematic flowchart of a method for creating a database according to an embodiment of the present application;
fig. 7 is a schematic diagram of a rectangular frame of a human face in an image to be processed according to an embodiment of the present application;
FIG. 8 is a flowchart illustrating another method for retrieving images according to an embodiment of the present disclosure;
fig. 9 is a schematic diagram of a face image and an image to be processed according to an embodiment of the present disclosure;
FIG. 10 is a flowchart illustrating another method for retrieving images according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of an apparatus for retrieving an image according to an embodiment of the present application;
fig. 12 is a schematic hardware structure diagram of an apparatus for retrieving an image according to an embodiment of the present disclosure.
Detailed Description
The terms "first," "second," and the like in the description and claims of the present application and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
The term "and/or" herein merely describes an association between associated objects, indicating that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality; for example, "including at least one of A, B and C" may mean including any one or more elements selected from the set consisting of A, B and C.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
In order to improve social security control capability and maintain a good social security environment, monitoring cameras are being installed in more and more places. When relevant personnel need to find a target person, the track of the target person can be determined from the video streams collected by cameras installed at different positions, according to the target person's human body image, face image, clothing characteristics, accessory characteristics, and the like.
The embodiments of the present application will be described below with reference to the drawings.
Referring to fig. 1, fig. 1 is a flowchart illustrating a method for retrieving an image according to embodiment (one) of the present application.
101. Reference person information is acquired.
In the embodiment of the application, the reference person information may be identity information or whereabouts information. The identity information can be a human body image or a human face image. The human body image refers to an image including a trunk and four limbs, and the human face image refers to an image including a human face. The whereabouts information includes the time when the person appears and/or the place where the person appears.
For example (example 1), a robbery occurs in place A. A witness provides the police with an image a of the suspect; if the image does not contain the suspect's face, image a can be used as a human body image. Another witness, Li, provides the police with a face image b of the suspect; if image b contains only the suspect's face, without the torso and limbs, it can be used as a face image. Meanwhile, the witness Wang provides the police with the time and/or place at which the suspect appeared (e.g., where and when the suspect was seen), and this time and/or place can be used as the whereabouts information.
The reference person information may be acquired by receiving reference person information input by a user through an input component, where the input component includes a keyboard, mouse, touch screen, touch pad, audio input device, etc. It may also be acquired by receiving reference person information sent by a terminal, where the terminal includes a mobile phone, computer, tablet computer, server, etc.
102. The database is retrieved using the reference person information, and images in the database whose feature data matches the reference person information are obtained as images to be retrieved.
In the embodiment of the present application, the database may be established before the reference person information is acquired, and the database includes the image and the feature data of the image. The images comprise human faces and trunks, and can be images captured by a monitoring camera or video images selected from video streams acquired by the monitoring camera. The feature data of the image includes an acquisition time and an acquisition position of the image.
As indicated at 101, the reference person information may be a human body image, a face image, or whereabouts information. Therefore, retrieving the database using the reference person information may mean retrieving the database using the reference human body image in the reference information and taking the images in the database that match the reference human body image as images to be retrieved. Continuing from example 1 (example 2): when image a is used as the reference human body image to retrieve the database, feature extraction processing is performed on image a to obtain the features of the human body in image a, those features are matched against the human body features of each image in the database, and the images whose matching degree reaches the human body feature threshold are taken as images to be retrieved.
The human body features in this embodiment are used to characterize the identity of a human body. The human body feature may be a feature vector (hereinafter, referred to as a human body feature vector), and whether different human bodies are the same person may be determined according to a similarity between different human body feature vectors. For example, the human feature vector extracted from the image 1 is a, the human feature vector extracted from the image 2 is b, and whether the human in the image 1 and the human in the image 2 are the same person can be determined according to the similarity of a and b (e.g., cosine similarity of a and b, and euclidean distance of a and b).
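As a concrete illustration of the similarity comparison described above, the sketch below computes the cosine similarity and Euclidean distance between two feature vectors in plain Python; the example vectors and the 0.8 threshold are illustrative assumptions, not values taken from the application.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    # Straight-line distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def same_person(a, b, threshold=0.8):
    # Treat two human bodies as the same person when the similarity
    # of their feature vectors reaches the (assumed) threshold.
    return cosine_similarity(a, b) >= threshold
```

In practice the feature vectors would come from the feature extraction network, and the threshold would be tuned on labeled data rather than fixed at 0.8.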
Retrieving the database using the reference person information may also mean retrieving the database using the reference face image in the reference information and taking the images in the database that match the reference face image as images to be retrieved. Continuing from example 1 (example 3): because image b contains only a face, without a torso or limbs, when image b is used as the reference face image to retrieve the database, feature extraction processing is performed on image b to obtain the face features in image b (the face features are used to characterize the identity of the person in image b), those face features are matched against the face features of each image in the database, and the images whose matching degree reaches the face feature threshold are taken as images to be retrieved.
Since there may be images in the database whose face regions are unclear, incomplete, or occluded (hereinafter referred to as low-quality images), when the reference face image is matched with a low-quality image, the face features extracted from the low-quality image contain little information, so the matching degree is lower than the face feature threshold, and the images obtained by retrieving the database with the reference face image will not include the low-quality images. However, a low-quality image may contain the target person. To reduce the omission of images containing the target person, the reference face image and the reference human body image can be used to retrieve the database separately, and the images matching the reference face image together with the images matching the reference human body image are taken as images to be retrieved. That is, low-quality images containing the target person are found in the database through the reference human body image, improving the retrieval effect.
The face region of an image in the database may be unclear, incomplete, or occluded (hereinafter such an image is referred to as a low-quality face image), and the human body region of an image may likewise be unclear, incomplete, or occluded (hereinafter such an image is referred to as a low-quality human body image). When the reference face image is matched with a low-quality face image, or the reference human body image is matched with a low-quality human body image, the face features extracted from the low-quality face image and the human body features extracted from the low-quality human body image contain little information, so the matching degree obtained is lower than the face feature threshold or the human body feature threshold, and the images obtained by retrieving the database with the reference face image or the reference human body image will not include the low-quality face images or the low-quality human body images. However, a low-quality face image or a low-quality human body image may contain the target person; therefore, if only the reference face image or only the reference human body image is used to retrieve the database, images containing the target person may be missed. To reduce the probability of missing images containing the target person, the reference face image and the reference human body image in the reference person information can be used to retrieve the database separately, and the images matching the reference face image together with the images matching the reference human body image are taken as images to be retrieved. That is, low-quality face images containing the target person are found in the database through the reference human body image, and low-quality human body images containing the target person are found through the reference face image, improving the retrieval effect.
Retrieving the database using the reference human body image or the reference face image belongs to retrieval using an image. In some cases, the user has no image of the target person, i.e., neither a reference human body image nor a reference face image; in this case, if the user has whereabouts information (when and where the target person appeared), the whereabouts information can be used to retrieve the database to obtain the images to be retrieved. Thus, retrieving the database using the reference person information may also mean retrieving the database using the reference time range and/or the reference geographic position range in the reference information, and taking the images in the database whose acquisition time is within the reference time range and/or whose acquisition position is within the reference geographic position range as images to be retrieved.
Continuing from example 1 (example 4): if it is determined from the whereabouts information provided by the witness Wang that the suspect appeared near a hotel between 10 am and 11 am on 29 July 2019, then 10 am to 11 am on 29 July 2019 may be used as the reference time range, and the geographic position range near the hotel (e.g., east longitude 116°23′, north latitude 39°54′ to east longitude 117°10′, north latitude 39°10′) may be used as the reference geographic position range. The database is then retrieved using the reference time range and the reference geographic position range, and the images in the database whose acquisition time lies between 10 am and 11 am on 29 July 2019 and whose acquisition position lies within east longitude 116°23′, north latitude 39°54′ to east longitude 117°10′, north latitude 39°10′ are obtained as images to be retrieved.
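The whereabouts-based retrieval in example 4 can be sketched as a simple filter over acquisition time and acquisition position. The record layout, file names, and decimal coordinate ranges below are illustrative assumptions (116°23′ ≈ 116.3833°, etc.), not part of the application.

```python
from datetime import datetime

# Each record carries the image's acquisition time and position
# (longitude, latitude); the data below is made up for illustration.
database = [
    {"image": "img_001.jpg",
     "time": datetime(2019, 7, 29, 10, 30), "lon": 116.5, "lat": 39.6},
    {"image": "img_002.jpg",
     "time": datetime(2019, 7, 29, 14, 0), "lon": 116.5, "lat": 39.6},
]

def retrieve_by_whereabouts(records, start, end, lon_range, lat_range):
    # Keep images whose acquisition time lies in the reference time
    # range and whose acquisition position lies in the reference
    # geographic position range.
    return [r for r in records
            if start <= r["time"] <= end
            and lon_range[0] <= r["lon"] <= lon_range[1]
            and lat_range[0] <= r["lat"] <= lat_range[1]]

hits = retrieve_by_whereabouts(
    database,
    datetime(2019, 7, 29, 10, 0), datetime(2019, 7, 29, 11, 0),
    (116.3833, 117.1667),   # east longitude 116°23' to 117°10'
    (39.1667, 39.9),        # north latitude 39°10' to 39°54'
)
```

Here only `img_001.jpg` survives the filter, since `img_002.jpg` falls outside the reference time range.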
In the embodiment of the present application, when the number of images obtained by retrieval with the reference person information is large, the user would spend a large amount of time studying and judging the retrieved images one by one. Therefore, the images obtained by retrieval with the reference person information can be taken as images to be retrieved, and the images to be retrieved can subsequently be retrieved again to obtain the target images.
It should be understood that, when the technical solution provided by the present application is applied to retrieval, if the number of images obtained using the reference person information is small, the user can directly study and judge the retrieved images one by one to obtain the target image.
103. Under the condition that a retrieval request aiming at the image to be retrieved is received, the image to be retrieved is retrieved by using the reference face attribute in the retrieval request, and the image with the characteristic data matched with the reference face attribute in the image to be retrieved is obtained and used as a target image.
As described in 102, when the number of images obtained by retrieving the database with the reference person information is large, it takes the user a lot of time to judge the retrieved images one by one. The images to be retrieved can therefore be further retrieved using the reference face attributes, so as to reduce the number of images the user needs to judge and thereby reduce the judging time.
In the embodiment of the present application, the feature data of the images in the database further includes face attributes of the images, where the face attributes include at least one of the following: whether a mask is worn, whether glasses are worn, the style of glasses, gender, age group, race, ethnicity, and beard type. The candidate options for whether a mask is worn include: wearing a mask and not wearing a mask; the candidate options for whether glasses are worn include: with glasses and without glasses; the candidate options for the style of glasses include: transparent lens glasses and sunglasses; the candidate options for gender include: male and female; the candidate options for age group include: elderly, adult, and child; the candidate options for race include: yellow, black, and white; the candidate options for ethnicity include specific ethnic groups; and the candidate options for beard type include: no beard and one or more beard styles.
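The candidate options above can be represented as a simple schema against which stored attribute values are validated; the attribute names and string values below are assumptions for illustration, since the application does not prescribe a storage format.

```python
# Candidate options for each face attribute, mirroring the lists
# enumerated above; a real database may use different encodings.
FACE_ATTRIBUTE_OPTIONS = {
    "mask": {"wearing a mask", "not wearing a mask"},
    "glasses": {"with glasses", "without glasses"},
    "glasses_style": {"transparent lens glasses", "sunglasses"},
    "gender": {"male", "female"},
    "age_group": {"elderly", "adult", "child"},
    "race": {"yellow", "black", "white"},
}

def is_valid_attribute(name, value):
    # A stored attribute value must be one of the candidate options
    # for that attribute; unknown attributes are rejected.
    return value in FACE_ATTRIBUTE_OPTIONS.get(name, set())
```

Such a schema makes attribute-based retrieval requests easy to check before they are run against the feature data.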
Because the feature data of each image in the database has the face attribute, when a retrieval request for the image to be retrieved is received, the reference face attribute in the retrieval request is used for retrieving the image to be retrieved, namely the feature data matched with the reference face attribute is determined from the image to be retrieved, and then the target image is determined. It should be understood that the number of target images may be one or more.
For example, the reference face attributes are: an adult Han male wearing sunglasses and having a beard. Retrieving with the reference face attributes means determining, from the images to be retrieved, those images whose feature data indicates that the glasses style is sunglasses, the beard type matches, the age group is adult, the ethnicity is Han, and the gender is male, thereby obtaining the target images.
Optionally, the database includes an image sub-library and a feature data sub-library: the image sub-library contains the images, the feature data sub-library contains the feature data, and the images in the image sub-library correspond one-to-one to the feature data in the feature data sub-library. Retrieving the database using the reference face image and/or the reference human body image means retrieving the image sub-library using the reference face image and/or the reference human body image, taking the images in the image sub-library that match the reference face image and/or the reference human body image as images to be retrieved, and taking the feature data corresponding to those images in the feature data sub-library as feature data to be retrieved. Retrieving the images to be retrieved using the reference face attributes then means retrieving the feature data to be retrieved using the reference face attributes, taking the feature data that matches the reference face attributes as target feature data, and taking the images corresponding to the target feature data as target images.
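The sub-library layout and the two-stage retrieval it supports might be sketched as follows. The identifiers, file names, precomputed matching degrees, and the 0.9 threshold are all assumptions for illustration; in a real system the first stage would compute matching degrees from extracted features.

```python
# Image sub-library and feature-data sub-library, keyed by the same
# image ids so that images and feature data correspond one-to-one.
image_sublibrary = {1: "capture_001.jpg", 2: "capture_002.jpg"}
feature_sublibrary = {
    1: {"body_match": 0.92, "glasses_style": "sunglasses",
        "gender": "male"},
    2: {"body_match": 0.95, "glasses_style": "transparent lens glasses",
        "gender": "male"},
}

def first_stage(threshold=0.9):
    # Stage 1: images whose human body features match the reference
    # human body image become the images to be retrieved.  (The
    # matching degree is assumed precomputed here for brevity.)
    return [i for i, f in feature_sublibrary.items()
            if f["body_match"] >= threshold]

def second_stage(candidates, reference_attributes):
    # Stage 2: keep only candidates whose feature data matches every
    # reference face attribute; these are the target images.
    return [image_sublibrary[i] for i in candidates
            if all(feature_sublibrary[i].get(k) == v
                   for k, v in reference_attributes.items())]

targets = second_stage(first_stage(),
                       {"glasses_style": "sunglasses", "gender": "male"})
```

Both stored images pass the first stage, but only `capture_001.jpg` also matches the reference face attributes, so it alone becomes a target image.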
In this embodiment, the database is first retrieved using the reference person information to obtain the images to be retrieved. When studying and judging the images to be retrieved one by one would take the user a long time, the images to be retrieved are further retrieved using the reference face attributes to obtain the target images, which narrows the user's study-and-judgment range and improves the user's study-and-judgment efficiency.
Referring to fig. 2, fig. 2 is a schematic flowchart illustrating another method for retrieving an image according to a second embodiment of the present application.
201. A reference body image is acquired.
In this embodiment, the reference person information includes a reference human body image. As indicated at 102, the reference human body image may not include a face, but includes a torso and/or limbs. For example: a passerby photographs a fleeing criminal suspect from behind with a mobile phone; the photograph does not contain the suspect's face but contains the suspect's torso and limbs, so it can be used as a reference human body image. Another example: robbers wear masks during a crime to disguise themselves; the images of the robbers captured by witnesses contain no faces, but some contain torsos and limbs, some contain upper bodies, and some contain lower bodies, and these captured images can be used as reference human body images.
The manner of obtaining the reference body image can be referred to as 101, and will not be described herein.
202. And searching the database by using the reference human body image, and obtaining an image matched with the reference human body image in the database as an image to be searched.
Retrieving the database using the reference human body image means comparing the reference human body image with the images in the database to obtain matching degrees, and taking the images in the database whose matching degree with the reference human body image reaches the human body feature threshold as images to be retrieved.
Optionally, the comparison between the reference human body image and the image in the database may be performed by performing feature extraction processing on the reference human body image to obtain human body features in the reference human body image, and then comparing the human body features with the human body features of the image to be processed in the database to obtain a human body feature matching degree, which is used as the matching degree between the reference human body image and the image to be processed in the database.
The above-mentioned feature extraction processing on the reference human body image to obtain the human body features in the reference human body image can be realized by a pre-trained convolutional neural network. The training of the convolutional neural network can be completed by taking a plurality of marked images as a training set, wherein the marking information of the images in the training set is the identity information of people, and the people contained in at least two images in the training set are the same person. For example, the training set includes a human body image a, a human body image B, a human body image C, a human body image d, a human body image e, and a human body image f, and both the human figures in the human body image a and the human body image C are a, both the human figures in the human body image B and the human body image f are B, and both the human figures in the human body image d and the human body image e are C. And training the convolutional neural network by using a training set, wherein the convolutional neural network extracts human body features from the images in the training set, and gives an identification result of the images according to the extracted human body features, and the identification result is the identity (such as A) of a person in the images. And updating parameters of the convolutional neural network according to the recognition result given by the convolutional neural network aiming at the image, the marking information of the image and the loss function, so that the recognition accuracy of the convolutional neural network reaches a preset value, and the training of the convolutional neural network is completed. The identification accuracy refers to the ratio of the number of identification results which are the same as the image labeling information in all the identification results obtained in the training to the number of all the identification results. 
In this way, the trained convolutional neural network can be used to extract human features for identifying human identities from the reference human image.
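As a minimal stand-in for the identity recognition described above, the sketch below assigns each query feature vector the identity of its nearest (Euclidean) gallery vector and computes the recognition accuracy as defined above, i.e., the ratio of recognition results that equal the labeled identity to the total number of results. The feature vectors and labels are made up for illustration; a real system would obtain them from the trained convolutional neural network.

```python
import math

def nearest_identity(query, gallery):
    # Assign the identity of the gallery feature vector closest to
    # the query, as a simple proxy for the network's recognition.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(gallery, key=lambda item: dist(query, item[0]))[1]

def recognition_accuracy(samples, gallery):
    # Ratio of recognition results that match the labeled identity
    # to the total number of results.
    correct = sum(1 for feat, label in samples
                  if nearest_identity(feat, gallery) == label)
    return correct / len(samples)

gallery = [([1.0, 0.0], "A"), ([0.0, 1.0], "B")]
samples = [([0.9, 0.1], "A"), ([0.1, 0.8], "B"), ([0.6, 0.4], "B")]
acc = recognition_accuracy(samples, gallery)
```

With these made-up vectors, two of the three samples are recognized correctly, so the accuracy is 2/3; training would continue until this figure reaches the preset value.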
The above-mentioned feature extraction processing on the reference human body image to obtain the human body features in the human body image can also be realized by other human body feature extraction algorithms, and the application is not limited.
203. Under the condition that a retrieval request aiming at the image to be retrieved is received, the image to be retrieved is retrieved by using the reference face attribute in the retrieval request, and the image with the characteristic data matched with the reference face attribute in the image to be retrieved is obtained and used as a target image.
As described in 102, the number of images to be retrieved may be very large, and a user may spend a lot of time judging them one by one. Therefore, this embodiment provides a "secondary filtering" retrieval mode to narrow the scope of the user's study and judgment and reduce the judgment time.
Secondary filtering means that the database is first retrieved using the reference person information to obtain the images to be retrieved; when the number of images to be retrieved is large, the user can then retrieve the images to be retrieved using the reference face attributes to narrow the range of study and judgment.
Therefore, in the case that an execution subject (e.g., a computer, a server) receives a retrieval request for an image to be retrieved, the execution subject retrieves the image to be retrieved by using the reference face attribute in the retrieval request, and takes an image having feature data matching the reference face attribute in the image to be retrieved as a target image.
For example, the police want to catch a fugitive, and among the persons in the images to be retrieved obtained by retrieving the database with the reference human body image, some wear glasses and some do not, and some wear hats and some do not. According to clues provided by witnesses, the fugitive wears a red hat and black glasses. The police can therefore use the red hat and black glasses as the reference face attributes to continue retrieving the images to be retrieved, obtaining the images in which a red hat and black glasses are worn as the target images. That is, retrieving the images to be retrieved with the reference face attributes further reduces the number of retrieved images.
Optionally, when the user does not want to miss any image that may contain the target person, the database may also be retrieved by using the reference human body image and the reference face attribute in the manner of "secondary filtering" provided by this embodiment.
For example, the police want to catch a suspect and have both a human body image and a face attribute of the suspect, where the face attribute is wearing glasses. If the database were retrieved using the human body image and the face attribute at the same time, all the persons in the resulting target images would wear glasses, and any images in the database of the suspect not wearing glasses would be missed. With the "secondary filtering" method provided in this embodiment, the human body image can first be used to retrieve the database to obtain the images to be retrieved, and the face attribute can then be used to retrieve the images to be retrieved to obtain the target images. When the police have sufficient time, they can study and judge the images to be retrieved and the target images one by one, which can greatly improve the accuracy of the police's study and judgment and improve the efficiency of catching the suspect.
According to the embodiment, the reference human body image retrieval database is used for obtaining the image to be retrieved, and then the reference face attribute is used for retrieving the image to be retrieved to obtain the target image, so that the target image containing the target person can be rapidly and accurately retrieved from the database under the condition that the user has the human body image and the face attribute of the target person.
Referring to fig. 3, fig. 3 is a schematic flow chart of another method for retrieving an image according to the third embodiment of the present application.
301. And acquiring a reference face image.
In this embodiment, the reference person information includes a reference face image. As indicated at 102, the reference face image includes a human face. For example: a passerby photographs the face of a fleeing criminal suspect with a mobile phone; the photograph contains only the suspect's face, without the torso and limbs, so it can be used as a reference face image.
The manner of obtaining the reference face image can be referred to as 101, and will not be described herein.
302. And using the reference face image to search the database, and obtaining an image matched with the reference face image in the database as an image to be searched.
Retrieving the database using the reference face image means comparing the reference face image with the images in the database to obtain matching degrees, and taking the images in the database whose matching degree with the reference face image reaches the face feature threshold as images to be retrieved.
Optionally, the comparison between the reference face image and an image in the database may be performed by performing feature extraction processing on the reference face image to obtain the face features in the reference face image, and then comparing those face features with the face features of the image to be processed in the database to obtain a face feature matching degree, which is used as the matching degree between the reference face image and the image to be processed in the database. The feature extraction processing on the reference face image to obtain the face features may be implemented by a convolutional neural network, or by other face feature extraction algorithms, which is not limited in this application.
303. Under the condition that a retrieval request aiming at the image to be retrieved is received, the image to be retrieved is retrieved by using the reference face attribute in the retrieval request, and the image with the characteristic data matched with the reference face attribute in the image to be retrieved is obtained and used as a target image.
Please refer to 203, which will not be described herein.
In the embodiment, the reference face image is used for searching the database to obtain the image to be searched, and then the reference face attribute is used for searching the image to be searched to obtain the target image, so that the target image containing the target person can be quickly and accurately searched from the database under the condition that the user has the face image and the face attribute of the target person.
As described in 102, since there are low-quality images in the database, the images obtained by retrieval using the reference face image may not contain a low-quality image that contains the target person (hereinafter referred to as the case of missing target images), and retrieving the database using both the reference face image and the reference human body image can reduce the probability of missing target images. However, in a real situation, the user may have only a face image of the target person, or only a human body image of the target person. For these situations, the embodiments of the present application provide further methods for retrieving an image.
Referring to fig. 4, fig. 4 is a flowchart illustrating a method for retrieving an image in a case where a user only has a face image of a target person according to an embodiment (four) of the present application.
401. And acquiring a reference face image.
Please refer to 301, which will not be described herein.
402. And searching the database by using the reference face image to obtain a first image set matched with the reference face image in the database.
The implementation process of obtaining the first image set by using the reference facial image retrieval database may refer to the process of obtaining the image to be retrieved by using the reference facial image retrieval database in 302, which will not be described herein again.
403. A reference body image is acquired.
In the case where an image including a human body (i.e., an image including both a human face and a human body) exists in the first image set, the terminal (i.e., the execution subject of the present embodiment) may cut out a human body region from the image including a human body, and obtain a reference human body image.
Optionally, the terminal (i.e., the execution subject of this embodiment) may display the first image set after obtaining it. The user can select, according to the user's own judgment, an image that may contain the target person from the first image set as a first image to be cropped. The user then crops the human body region from the first image to be cropped to obtain a reference human body image and sends it to the terminal.
Optionally, the terminal (i.e., the execution subject of this embodiment) may display the first image set after obtaining it. The user may select, according to the user's own judgment, an image that may contain the target person from the first image set as a second image to be cropped, and send a cropping instruction to the terminal (i.e., the execution subject of this embodiment). Upon receiving the instruction, the terminal crops the human body region from the second image to be cropped to obtain a reference human body image.
404. And searching the database by using the reference human body image, obtaining a second image set matched with the reference human body image in the database, and taking the images in the first image set and the images in the second image set as images to be searched.
The process of obtaining the second image set by using the reference human body image retrieval database may be referred to as the process of obtaining the image to be retrieved by using the reference human body image retrieval database in 202, and will not be described herein again.
By taking all images in the first image set and all images in the second image set as images to be retrieved, the probability of occurrence of a situation in which a target image is missed can be reduced.
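Taking all images of both sets as images to be retrieved amounts to a deduplicated union of the two retrieval results, which might be sketched as follows; the file names are illustrative assumptions.

```python
def merge_result_sets(first_set, second_set):
    # Union of the two retrieval results, preserving order; an image
    # found by both the reference face image and the reference human
    # body image appears only once.
    merged, seen = [], set()
    for image in first_set + second_set:
        if image not in seen:
            seen.add(image)
            merged.append(image)
    return merged

to_retrieve = merge_result_sets(
    ["img_03.jpg", "img_07.jpg"],   # matched the reference face image
    ["img_07.jpg", "img_12.jpg"],   # matched the reference body image
)
```

Here `img_07.jpg` was found by both retrievals but is kept only once among the images to be retrieved.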
405. Under the condition that a retrieval request aiming at the image to be retrieved is received, the image to be retrieved is retrieved by using the reference face attribute in the retrieval request, and the image with the characteristic data matched with the reference face attribute in the image to be retrieved is obtained and used as a target image.
Please refer to 103, which will not be described herein.
In the embodiment, the reference human face image is used for retrieving the database to obtain the first image set, the reference human body image is obtained based on the first image set, and the images in the second image set and the images in the first image set, which are obtained by retrieving the database by using the reference human body image, are used as the images to be retrieved, so that the probability of missing target images can be reduced, and the retrieval effect can be improved.
Referring to fig. 5, fig. 5 is a flowchart illustrating a method for retrieving an image in a case where a user only has a human body image of a target person according to an embodiment (five) of the present application.
501. A reference body image is acquired.
Please refer to 201, which will not be described herein.
502. And searching the database by using the reference human body image to obtain a third image set matched with the reference human body image in the database.
The process of obtaining the third image set by using the reference human body image retrieval database may be referred to as the process of obtaining the image to be retrieved by using the reference human body image retrieval database in 202, and will not be described herein again.
503. And acquiring a reference face image.
In the case where an image including a face (i.e., an image including both a face and a human body) exists in the third image set, the terminal (i.e., the execution subject of the present embodiment) may cut out a face region from the image including a face, and obtain a reference face image.
Optionally, the terminal (i.e. the executing entity in this embodiment) may display the third image set after obtaining the third image set. The user can select an image possibly containing the target person from the third image set according to the judgment of the user, and the image is used as a third image to be intercepted. And the user intercepts the face area from the third image to be intercepted to obtain a reference face image and sends the reference face image to the terminal.
Optionally, the terminal (i.e. the executing entity in this embodiment) may display the third image set after obtaining the third image set. The user may select an image that may include the target person from the third image set according to the determination of the user, and use the image as a fourth image to be intercepted, and send an intercepting instruction to the terminal (i.e., the execution subject of this embodiment). And the terminal intercepts a face area from the fourth image to be intercepted under the condition of receiving the instruction, and obtains a reference face image.
504. And using the reference face image to search the database, obtaining a fourth image set matched with the reference face image in the database, and using the images in the third image set and the images in the fourth image set as images to be searched.
The implementation process of obtaining the fourth image set by using the reference facial image retrieval database may refer to the process of obtaining the image to be retrieved by using the reference facial image retrieval database in 302, which will not be described herein again.
By taking all the images in the third image set and all the images in the fourth image set as the images to be retrieved, the probability of occurrence of a situation in which the target image is missed can be reduced.
505. Under the condition that a retrieval request aiming at the image to be retrieved is received, the image to be retrieved is retrieved by using the reference face attribute in the retrieval request, and the image with the characteristic data matched with the reference face attribute in the image to be retrieved is obtained and used as a target image.
Please refer to fig. 103, which will not be described herein.
In the embodiment, the reference human face image is used for retrieving the database to obtain the third image set, the reference face image is obtained based on the third image set, and the images in the fourth image set and the images in the third image set, which are obtained by retrieving the database by using the reference face image, are used as the images to be retrieved, so that the probability of missing the target images can be reduced, and the retrieval effect can be improved.
Alternatively, when the user does not want to miss any image that may contain the target person, the database may be retrieved by using the reference human body image and/or the reference face image and the reference face attribute through the "second filtering" provided in embodiments (two) to (five).
For example, an police party wants to catch a suspect and have a human image or face image of the suspect, and face attributes, wherein the face attributes are that the glasses are worn. If the database is searched by using the human body image and the human face attribute at the same time, the finally obtained persons in the target image are all glasses-worn, and if the database contains images of the suspected persons who do not wear glasses, the images are missed. The database is retrieved by the 'secondary filtering' method provided by the embodiment, the database can be retrieved by using the human body image or the human face image to obtain the image to be retrieved, and then the image to be retrieved is retrieved by using the human face attribute to obtain the target image. Under the condition that the police has sufficient time, the images to be retrieved and the target images can be researched and judged one by one, so that the accuracy of the research and judgment of the police can be greatly improved, and the efficiency of catching the suspect is improved.
Referring to fig. 6, fig. 6 is a flowchart illustrating a method for creating a database according to an embodiment (six) of the present application.
601. And acquiring a video stream to be processed.
In the embodiment of the application, the server is connected with the plurality of cameras, the installation positions of the cameras in the plurality of cameras are different, and the server can acquire the video stream acquired in real time from each camera, namely the video stream to be processed.
It should be understood that the number of cameras connected to the server is not fixed, and the network address of the camera is input to the server, i.e. the captured video stream can be obtained from the camera through the server.
For example, if a controller in the B-site wants to establish a database in the B-site by using the technical solution provided by the present application (i.e., via the server), the controller in the B-site can acquire the video stream acquired by the camera in the B-site via the server by inputting the network address of the camera in the B-site to the server, and can perform subsequent processing on the video stream acquired by the camera in the B-site to establish the database in the B-site.
The video stream to be processed comprises continuous multiframe images to be processed, and the server can decode the video stream to be processed before the subsequent processing of the video stream to be processed to obtain one frame of image.
602. And performing face attribute extraction processing on the image to be processed in the video stream to be processed to obtain the face attribute of the image to be processed.
In this embodiment of the present application, the image to be processed is a frame-by-frame image obtained from the video stream to be processed in 601, that is, in this embodiment, each frame of image in the video stream to be processed is subjected to face attribute extraction processing, so as to obtain a face attribute of the image to be processed. The face attribute extraction processing can be realized by a pre-trained neural network or a feature extraction model, and the method is not limited in the application.
In some possible implementation manners, the feature extraction processing of the image to be processed is completed by performing convolution processing on the image to be processed layer by layer through a plurality of layers of convolution layers which are randomly stacked, wherein the feature content and the semantic information extracted by each convolution layer are different, and the concrete expression is that the feature extraction processing abstracts the features of the image step by step and removes relatively secondary feature data step by step, so that the content and the semantic information are more concentrated when the feature data extracted later is smaller. The method comprises the steps of carrying out convolution processing on an image to be processed step by step through a plurality of layers of convolution layers and extracting corresponding feature data, so that the size of the image can be reduced while main content information (namely the feature data of the image to be processed) of the image to be processed is obtained, the calculated amount of a system is reduced, the operation speed is improved, and finally the face attribute is obtained according to the feature data of the image to be processed.
In one possible implementation, the convolution process is implemented as follows: and performing convolution processing on the image to be processed by the convolution layer, namely sliding on the image to be processed by utilizing a convolution kernel, multiplying pixels on the image to be processed by numerical values on the corresponding convolution kernel, adding all multiplied values to serve as pixel values on the image corresponding to the middle pixel of the convolution kernel, finally finishing sliding processing on all pixels in the image to be processed, and extracting characteristic data.
In a possible implementation manner, the pre-trained neural network is used to perform face detection processing on the image to be processed to determine whether the image to be processed contains a face. The face detection processing can also be realized by a feature extraction model or a feature extraction algorithm, which is not limited in the present application.
And under the condition that the image to be processed contains the face, carrying out face attribute extraction processing on the image to be processed to obtain the face attribute of the image to be processed.
It should be understood that the feature extraction process for obtaining the feature data and the face attribute extraction process for obtaining the face attribute of the image to be processed may be implemented by different neural networks or different feature extraction algorithms. And for the to-be-processed images which do not contain the human face and the human body in the to-be-processed video stream, the human face attribute extraction processing is not carried out any more, and the to-be-processed images which do not contain the human face and the human body are not stored. Thus, the data processing amount can be greatly reduced, and the data storage space is reduced.
Since a piece of image to be processed may contain multiple faces or multiple person objects, it is convenient for a user (here, a user who inputs a reference face attribute) to quickly obtain information of each face in the image to be processed so as to determine whether the person object in the image to be processed is a target person. Optionally, the face attribute extraction processing is performed on the image to be processed, and obtaining the face attribute of the image to be processed may include the following steps:
according to the features in the feature data obtained by performing feature extraction processing on the image to be processed, obtaining the position of a human face in the image to be processed, wherein the position is the position of any pair of opposite angles of a rectangular frame containing the human face in the image to be processed;
intercepting a rectangular area determined by the position in the image to be processed to obtain a face image;
and carrying out face attribute extraction processing on the face image to obtain the face attribute of the image to be processed.
It should be understood that, the above-mentioned obtaining the position of the face in the image to be processed according to the features in the feature data is performed after determining that the image to be processed contains the face according to the features in the first feature data, that is, obtaining the position of the face in the image to be processed according to the features in the first feature data of the image to be processed containing the face.
As shown in fig. 7, the image C to be processed includes a human face D, the coordinate system in the image C to be processed is xoy, the rectangular frame including the human face D is a (x1, y1) b (x2, y2) C (x3, y3) D (x4, y4), and the positions of the human face D in the image C to be processed are a (x1, y1) and C (x3, y3), or b (x2, y2) and D (x4, y 4). It is to be understood that the rectangular frame abcd in fig. 5 is drawn for convenience of understanding, and in the process of obtaining the position of the human face D in the image C to be processed, the rectangular frame abcd does not exist in the image C to be processed, but the coordinates of the point a and the point C, or the coordinates of the point b and the point D are directly given.
The above process of obtaining the position of the face in the image to be processed can be realized by a neural network or a feature extraction model of feature data obtained by performing feature extraction processing on the image to be processed.
According to the position of the face in the image to be processed, a corresponding rectangular region, for example, a region included in the rectangular frame abcd, may be determined in the image to be processed, as shown in fig. 5. And intercepting the rectangular area from the image to be processed to obtain a face image.
It should be understood that each face image only contains one face, and for an image to be processed containing multiple faces at the same time, multiple face images are obtained. In addition, after the rectangular region determined by the position is intercepted from the image to be processed and the face image is obtained, the rectangular region is not missed in the image to be processed.
603. And taking the face attribute as the feature data of the image to be processed to obtain a database, wherein the database comprises the image to be processed and the feature data of the image to be processed.
The method comprises the steps of carrying out face attribute extraction processing on a face image through a neural network or a feature extraction model to obtain the face attribute of the face image, using the extracted face attribute as feature data of an image to be processed, storing the image to be processed and the feature data, and obtaining a database.
For example (example 5), the image to be processed E includes a face of the human object F, and the image of the face including F is G, and the face attributes of F include: the glasses are not worn, the beard is not worn, the adult is in the age group, the ethnic group is other, and the sex is male. Further, since G is obtained based on E, the feature data of G is the same as that of F. And finally, storing the face attribute of the face image F, the image E to be processed and the face image G into a database.
It should be understood that the feature data, the image to be processed, and the face image are related to each other, that is, the image to be processed may be determined by the feature data, the face image may be determined by the feature data, or the related image to be processed may be determined by the face image.
Continuing the example following example 5, the to-be-processed image E further includes a face of the human object H, the image including the face of the human object H is I, and the face attributes of H include: wear sunglasses, have no beard, the age bracket is adult, the nation is other, gender is women. If the feature data matched with the reference face attribute is the feature data of F, determining the target images as a face image G and an image E to be processed; and if the feature data matched with the reference face attribute is the feature data of H, determining the target images to be the face image I and the image E to be processed.
The position of each camera is determined, that is, longitude information and latitude information of the camera are stored in the server, so that the acquisition position of the image to be processed is determined, and the time of the video stream acquired by the camera is also determined, that is, the acquisition time of the image to be processed in the video stream is determined. Optionally, the acquisition position and the acquisition time of the image to be processed may also be used as feature data, that is, the feature data of the image to be processed includes the face attribute, the acquisition time and the acquisition position of the image to be processed. The face image is obtained based on the image to be processed, the acquisition time of the face image is the acquisition time of the image to be processed, and the acquisition position of the face image is the acquisition position of the image to be processed.
In this embodiment, by performing feature extraction processing on the to-be-processed image in the acquired to-be-processed video stream, the face attribute of the to-be-processed image can be used as feature data of the to-be-processed image and the face image, and a database is established. Therefore, under the condition that all the acquired video streams to be processed are not stored, the data size in the database can be reduced by storing the feature data, the images to be processed containing the human faces and the human bodies and the human face images into the database.
If the reference face attribute is sent to the server by the terminal, the server can also send the target image to the terminal after obtaining the target image, and the terminal can display the target image so that a user can confirm whether the person object in the target image is the target person.
Referring to fig. 8, fig. 8 is a schematic flowchart illustrating another method for retrieving an image according to a seventh embodiment of the present application.
801. And sending the target face image with the characteristic data matched with the reference face attribute to the terminal.
In this embodiment, the terminal is a terminal that sends the reference face attribute.
As shown in 603, the database includes a face image and an image to be processed, and the feature data, the face image, and the image to be processed are associated with each other. Therefore, after the characteristic data matched with the reference face attribute is retrieved and obtained from the image to be retrieved, the target face image with the characteristic data is sent to the terminal. The terminal can display the target face image sent by the server so that the user can confirm whether the character object in the target face image is the target character or not.
802. And under the condition of receiving a detail display request aiming at the target face image sent by the terminal, sending the target image to be processed with the characteristic data matched with the reference face attribute to the terminal.
There may be a plurality of face images obtained in 801, when the face images are displayed, thumbnails of all the face images may be displayed in a list form, and when it is determined that a person object in a target face image has a high probability of being a target person, a user may send a detail display request for the target face image to a server to obtain detailed information of the target face image.
Optionally, the user may send a request for showing details of the face image to the server by clicking the face image.
The server can send the target to-be-processed image associated with the target face image to the terminal under the condition that the server receives a detail display request aiming at the face image sent by the terminal, and the terminal can display the target to-be-processed image sent by the server so that a user can know the details of the target face image and further confirm whether a character object in the target face image is a target character or not.
Optionally, if the target to-be-processed image includes a plurality of faces, each of the face images includes a position of a face in the target to-be-processed image, so that when the target to-be-processed image is displayed to the user according to the detail display request of the face image, the face in the target to-be-processed image may be framed in a rectangular frame according to the position of the person object in the target to-be-processed image, so that the user may determine the person object in the face image from the plurality of faces in the target to-be-processed image.
For example, as shown in fig. 9, after receiving a detail display request for a sent by a terminal, an image to be processed a that includes a face of a person c is sent to the terminal, where b has a rectangular frame that includes the face of c.
Since the data size contained in the face image is smaller than the data size contained in the image to be processed, in the implementation manners provided by 601 and 602, the data size required to be processed by the terminal display image can be reduced by first sending the target face image to the terminal, and then sending the target image to be processed to the terminal after receiving the detail display request sent by the terminal. And under the condition that the sizes of the human faces in the images are the same, the number of the terminal displaying the target human face images is more than that of the target images to be processed, so that the efficiency of determining the target person by the user can be improved.
803. And under the condition of receiving a track display request sent by the terminal, sending an instruction to the terminal, wherein the instruction is used for instructing the terminal to display the acquisition position and the acquisition time of the target image to be processed in the map.
In the present embodiment, the track display indicates that the place where the person object in the image to be processed appears and the time of the appearance are displayed in the map.
When the user confirms that the person object in the target image to be processed is the target object, the user can send a track display request to the server through the terminal.
Optionally, after the terminal displays the target face image, the user may send a track display request to the server through the terminal, that is, the user confirms that the person object in the target face image is the target person, and may directly send the track display request to the server through the terminal without sending a detail display request to the server through the terminal. The user can also send a trace display request to the server through the terminal after the terminal displays the image to be processed, that is, the user cannot confirm whether the person object in the face image is the target object through the target face image, and needs to further confirm whether the person object is the target object through the target image to be processed, and the user can send the trace display request to the server through the terminal under the condition that the user confirms that the person object is the target object through the target image to be processed. The way for the user to send the track display request to the server through the terminal may be to click a preset button in a display page of the terminal.
The server sends an instruction to the terminal under the condition of receiving a track display request sent by the terminal, wherein the instruction is used for instructing the terminal to display the acquisition position and the acquisition time of the image to be processed with the characteristic data matched with the reference face attribute in the map. In this way, the user can more intuitively acquire the whereabouts, i.e., when and where the person object appears in the target image to be processed.
In this embodiment, the server first sends a target face image with feature data matched with the reference face attribute to the terminal, so that the user can confirm whether a person object included in the face image is a target person. Further, the target image to be processed can be sent to the terminal, so that the user can further confirm whether the character object contained in the target face image is the target character. In this way, the data amount required to be processed by the terminal to display the image can be reduced, and the efficiency of the user for determining the target person can be improved.
Referring to fig. 10, fig. 10 is a flowchart illustrating a possible implementation manner of step 103 in embodiment (a) provided in embodiment (nine) of the present application.
1001. And determining a retrieval sequence according to the priority of the preset attributes under the condition that the reference face attributes comprise at least two attributes.
As described above, the reference face attributes include: whether wearing a mask, whether wearing glasses, at least one of the styles, sexes, age groups, race, nationality and beard types of the glasses. Clearly, there are differences between the significance of different attributes, such as: the proportion of people wearing the sunglasses in the crowd is small, if the attribute of the reference face comprises the attribute of wearing the sunglasses, the sunglasses are used for searching the image to be searched, the searching range can be quickly reduced, and the searching speed is improved. On the contrary, if the attribute of the reference face includes gender, the proportion of people who accord with the attribute in the crowd is large no matter whether the gender is male or female, the image to be retrieved is retrieved by the gender, and the retrieval speed is slow.
Therefore, before retrieving an image to be retrieved using the attributes of the reference face, priorities may be set for the features in the attributes, and the retrieval order may be determined according to the priorities of the attributes.
For example (example 6), if the attributes of the face include 7 features in table 1, the retrieval order is to retrieve the image to be retrieved by using the attribute of wearing a hat, then retrieving the image to be retrieved by using the attribute of nationality of han, …, and finally retrieving the image to be retrieved by using the attribute of gender.
Priority (from high to low) Feature(s)
1 Sunglasses with glasses
2 Chinese family
3 Wearing mask
4 Hu-luohu
5 Race of a person
6 Age group
7 Sex
TABLE 1
It should be understood that the priority recited in table 1 is for example only and not limiting.
1002. And sequentially using the attributes in the reference face attributes to retrieve the images to be retrieved according to the retrieval sequence, and obtaining the images with the characteristic data matched with the reference face attributes in the images to be retrieved as target images.
After the retrieval sequence is determined by the priority of the attributes, the attributes in the attributes of the reference human face can be sequentially used for retrieving the image to be retrieved according to the retrieval sequence, and the target image is obtained.
Continuing the example following example 6, the reference face attributes include the following features: the method comprises the steps that an adult man who wears sunglasses or a mask and has no beard searches an image to be searched by using the attribute of the sunglasses, and an image with characteristic data of the sunglasses in the image to be searched is obtained and serves as a third image to be searched. Then, the attribute of the wearing mask is used for retrieving the third image to be retrieved, and an image with characteristic data of the wearing mask in the third image to be retrieved is obtained and serves as a fourth image to be retrieved; …, respectively; and finally, searching the sixth image to be searched (namely the image obtained by searching the attribute of the adult with the age group) by using the attribute of the sex of the male, and obtaining an image containing the characteristic data of the sex of the male in the sixth image to be searched as a target image.
Optionally, the priority may also be set according to the position of the camera that acquires the video stream to be processed. For example, the position of the camera acquiring the video stream to be processed may be understood as an area monitored by the camera acquiring the video stream to be processed. Obviously, the attributes of people in different areas (here, geographical areas) are different, for example: the number of people of yellow race is large among people in inland regions in china, but the number of people of white race and black race is small. In northern europe, white race is more, but yellow race and black race are less. Therefore, the priority of the features can be set according to the region monitored by the camera acquiring the video stream to be processed.
By applying the embodiment, the priority can be set for the features, and the retrieval speed can be increased and the retrieval efficiency can be improved by determining the retrieval sequence according to the priority of the features.
As described above, by performing the feature extraction processing on the image to be processed, the feature data of the image to be processed and the face image can be obtained. Obviously, the more abundant the features extracted from the image to be processed, the higher the accuracy of the obtained feature data, optionally, before the feature extraction processing is performed on the image to be processed, the image quality detection may be performed on the image to be processed, and the feature extraction processing may be performed on the image to be processed with high image quality, so as to extract the richer features, thereby improving the accuracy of the feature data.
The following is a possible implementation manner of a method for detecting image quality of an image to be processed according to embodiment (ten) of the present application.
Before the database is established, an image quality evaluation index can be preset, and the image quality evaluation index comprises at least one of the following: the image quality evaluation index includes at least one of: the definition of the face area and the shielding condition of the face area.
The clearer the face region in the image to be processed is, the richer the face attributes and face features extracted from the image to be processed are, and accordingly, the accuracy of database retrieval using the reference face image is improved. Meanwhile, the accuracy rate of retrieving the image to be retrieved by using the reference face attribute to obtain the target image can be improved. In addition, the smaller the blocked area in the face area in the image to be processed is, the richer the subsequently extracted features are, and the accuracy of using the reference face image to search the database is improved. Meanwhile, the accuracy rate of retrieving the image to be retrieved by using the reference face attribute to obtain the target image can be improved.
For example, the quality of the image to be processed may be scored according to the above image quality evaluation indexes, such as: the larger the area of the image in which the face area is occluded, the more the subtraction is, such as: the area of the blocked area is less than or equal to 15%, the score is reduced by 0.5 point, the area of the blocked area is greater than 15% and less than or equal to 40%, the score is reduced by 1 point, the area of the blocked area is greater than 40% and less than or equal to 70%, the score is reduced by 2 points, the area of the blocked area is greater than 70%, and the score is reduced by 3.5 points. In addition, corresponding scores can be obtained from 1-5 points according to the definition of the face area in the image, and finally, the image quality scores are obtained through all the scores. It should be understood that the determination of the sharpness of the face region in the image may be implemented by any image sharpness algorithm, such as: the gray variance function, the gray variance product function, and the energy gradient function are not specifically limited in this application.
Images to be processed whose quality score does not reach the threshold are not processed further. Images whose quality score reaches the threshold proceed to face attribute extraction, yielding the face attributes of the image to be processed (for this process, refer to 202).
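The tiered occlusion deduction and threshold gate described above can be sketched as follows. This is a minimal illustration of the worked example, not the patent's actual implementation; the way the sharpness score and occlusion deduction are combined, and the threshold value, are assumptions for illustration.

```python
def occlusion_deduction(occluded_fraction):
    # Deduction tiers from the example in the text: the larger the
    # occluded area of the face region, the larger the deduction.
    if occluded_fraction <= 0.15:
        return 0.5
    if occluded_fraction <= 0.40:
        return 1.0
    if occluded_fraction <= 0.70:
        return 2.0
    return 3.5

def quality_score(sharpness_score, occluded_fraction):
    # sharpness_score is assumed to lie in [1, 5]; combining it with the
    # occlusion deduction by subtraction is a hypothetical choice.
    return sharpness_score - occlusion_deduction(occluded_fraction)

def passes_threshold(score, threshold=2.0):
    # Only images whose quality score reaches the threshold proceed to
    # face attribute extraction; the threshold value here is assumed.
    return score >= threshold
```

A sharp face (sharpness 5) with 10% occlusion scores 4.5 and passes; a blurry face (sharpness 1) with 90% occlusion scores -2.5 and is discarded.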
Optionally, the video streams to be processed mentioned in embodiments (six) to (ten) may be collected by a camera with the following parameters: a recognizable face deflection angle range of -45 degrees to +45 degrees, a recognizable face pitch angle range of -30 degrees to +30 degrees, and an inter-pupil distance of at least 18 pixels in the collected face image.
The recognizable face deflection angle is the angle between the shooting direction of the camera lens and the vertical line through the face of the photographed person. Viewed from above the photographed person, the deflection angle is positive when the shooting direction is rotated clockwise from that vertical line, and negative when it is rotated counterclockwise.
When the camera's recognizable face deflection angle range is -45 degrees to +45 degrees, the images collected in the video stream score higher under the image quality evaluation index.
The recognizable face pitch angle is the angle between the shooting direction of the camera lens and the horizontal line through the face of the photographed person. Viewed from the photographed person's left side toward the right, the pitch angle is positive when the shooting direction is rotated clockwise from that horizontal line, and negative when it is rotated counterclockwise.
When the camera's recognizable face deflection angle range is -45 degrees to +45 degrees, its recognizable face pitch angle range is -30 degrees to +30 degrees, and the inter-pupil distance in a collected face image is at least 18 pixels, the images in the collected video stream score high under the image quality evaluation index and can be used for face recognition.
Since images in a video stream collected by a camera with these parameters have high quality scores, the reference images can also be used to search a database built from such a video stream to determine the target image.
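The camera parameter constraints above can be checked per frame with a simple predicate. This is a hedged sketch: the text specifies the angle ranges and minimum inter-pupil distance, but how a deployment measures yaw, pitch, and pupil distance per face is not part of the disclosure and is assumed to come from an upstream face detector.

```python
def frame_usable_for_recognition(yaw_deg, pitch_deg, pupil_distance_px):
    # Ranges from the text: deflection (yaw) in [-45, +45] degrees,
    # pitch in [-30, +30] degrees, inter-pupil distance >= 18 pixels.
    # The three measurements are assumed to be supplied by a face detector.
    return (-45 <= yaw_deg <= 45
            and -30 <= pitch_deg <= 30
            and pupil_distance_px >= 18)
```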
It will be understood by those skilled in the art that in the method of the present invention, the order of writing the steps does not imply a strict order of execution and any limitations on the implementation, and the specific order of execution of the steps should be determined by their function and possible inherent logic.
The method of the embodiments of the present application has been set forth in detail above; the apparatus of the embodiments of the present application is provided below.
Referring to fig. 11, fig. 11 is a schematic structural diagram of an apparatus for retrieving an image according to an embodiment of the present application, where the apparatus 1 includes: the face recognition system comprises an acquisition unit 11, a first retrieval unit 12, a second retrieval unit 13, a face attribute extraction unit 14, a determination unit 15, a feature extraction processing unit 16, a transmission unit 17, an evaluation unit 18 and a sorting unit 19. Wherein:
an acquisition unit 11 for acquiring reference person information;
a first retrieval unit 12, configured to retrieve a database using the reference personal information, and obtain an image in the database, which has feature data matching the reference personal information, as an image to be retrieved;
and a second retrieving unit 13, configured to, in a case that a retrieval request for the image to be retrieved is received, retrieve the image to be retrieved by using the reference face attribute in the retrieval request, and obtain an image, which has feature data matching the reference face attribute, in the image to be retrieved as a target image.
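The two-stage retrieval performed by the first retrieval unit 12 and the second retrieval unit 13 can be sketched as below. This is a minimal illustration under assumed predicates (`matches_person`, `matches_face_attrs` stand in for the matching logic, which the patent does not fix to any particular algorithm); it only shows the narrowing flow from database to images to be retrieved to target images.

```python
def two_stage_retrieve(database, matches_person, matches_face_attrs):
    # Stage 1: retrieve the database with the reference person information;
    # the matching images become the images to be retrieved.
    to_retrieve = [img for img in database if matches_person(img)]
    # Stage 2: on a retrieval request, retrieve only within that subset
    # using the reference face attribute to obtain the target images.
    return [img for img in to_retrieve if matches_face_attrs(img)]
```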
In one possible implementation, the reference personal information includes a reference face image; the first retrieving unit 12 is configured to: and retrieving the database by using the reference face image, and obtaining an image matched with the reference face image in the database as the image to be retrieved.
In another possible implementation manner, the reference person information includes a reference face image and a reference body image; the first retrieving unit 12 is configured to: and retrieving the database by using the reference face image and the reference human body image, and obtaining an image matched with the reference face image and an image matched with the reference human body image in the database as an image to be retrieved.
In yet another possible implementation manner, the reference person information includes a reference time range and/or a reference geographical position range, and the feature data of the images in the database includes an acquisition time and an acquisition position; the first retrieval unit 12 is configured to: take the images in the database whose acquisition time falls within the reference time range and/or whose acquisition position falls within the reference geographical position range as the images to be retrieved.
In yet another possible implementation manner, the reference person information includes a reference human body image; the first retrieving unit 12 is configured to: retrieve the database using the reference human body image, and obtain an image in the database that matches the reference human body image as the image to be retrieved.
In yet another possible implementation manner, the apparatus 1 for retrieving an image further includes: the acquiring unit 11 is configured to acquire a video stream to be processed before the reference character information is acquired; a face attribute extraction unit 14, configured to perform face attribute extraction processing on the image to be processed in the video stream to be processed, so as to obtain a face attribute of the image to be processed; a determining unit 15, configured to use the face attribute as feature data of the image to be processed to obtain the database, where the database includes the image to be processed and feature data of the image to be processed.
In yet another possible implementation manner, the apparatus for retrieving an image further includes: a feature extraction processing unit 16, configured to perform feature extraction processing on the image to be processed in the video stream to obtain feature data before performing face attribute extraction processing on the image to be processed in the video stream to obtain a face attribute of the image to be processed; the face attribute extraction unit 14 is configured to, when it is determined that the image to be processed includes a face according to the features in the feature data, perform the step of performing face attribute extraction processing on the image to be processed in the video stream to be processed to obtain the face attribute of the image to be processed.
In yet another possible implementation manner, the face attribute extraction unit 14 is configured to: according to the features in the feature data, obtaining the position of the face in the image to be processed, wherein the position is the position of any pair of opposite angles of a rectangular frame containing the face in the image to be processed; intercepting a rectangular area determined by the position in the image to be processed to obtain a face image; and performing face attribute extraction processing on the face image to obtain the face attribute of the image to be processed.
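The face-cropping step above (the position is any pair of opposite corners of the face bounding box) can be sketched as follows. This is an illustrative sketch assuming a row-major image representation; since the two points may be any pair of opposite corners, the coordinates are normalised before slicing.

```python
def crop_face(image, corner_a, corner_b):
    # `image` is a row-major list of pixel rows. corner_a and corner_b are
    # any pair of opposite corners (x, y) of the rectangular face box, so
    # normalise them into (x0 < x1, y0 < y1) before slicing out the region.
    (xa, ya), (xb, yb) = corner_a, corner_b
    x0, x1 = sorted((xa, xb))
    y0, y1 = sorted((ya, yb))
    return [row[x0:x1] for row in image[y0:y1]]
```

The returned face image would then be passed to face attribute extraction.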
In yet another possible implementation manner, the apparatus 1 for retrieving an image further includes: a sending unit 17, configured to send the target face image whose feature data matches the reference face attribute to a terminal after the image to be retrieved has been retrieved using the reference face attribute in the retrieval request and the target image has been obtained; the sending unit 17 is further configured to send the target image to be processed whose feature data matches the reference face attribute to the terminal upon receiving a detail display request for the target face image sent by the terminal.
In yet another possible implementation manner, the feature data of the image to be processed in the database includes an acquisition time and an acquisition position, and the apparatus 1 for retrieving an image further includes: the sending unit 17 is configured to send an instruction to the terminal when receiving a trace display request sent by the terminal, where the instruction is used to instruct the terminal to display the acquisition time and the acquisition position of the image to be processed in a map.
In yet another possible implementation manner, the apparatus 1 for retrieving an image further includes: an evaluation unit 18, configured to, after the determination that the image to be processed includes a human face according to the features in the feature data, obtain a quality score of the image to be processed according to a preset image quality evaluation index, where the image quality evaluation index includes at least one of: the definition of a face region and the shielding condition of the face region; the feature extraction processing unit is configured to: and performing feature extraction processing on the image to be processed with the quality score reaching a threshold value to obtain the feature data.
In yet another possible implementation manner, the reference face attribute includes at least one of: whether a mask is worn, whether glasses are worn, the style of the glasses, sex, age group, race, ethnicity, and beard type.
In yet another possible implementation manner, the apparatus 1 for retrieving an image further includes: a sorting unit 19, configured to determine a retrieval order according to a preset attribute priority when the reference face attribute includes at least two attributes, before the image to be retrieved is retrieved using the reference face attribute in the retrieval request; the second retrieving unit 13 is configured to: retrieve the image to be retrieved using the attributes in the reference face attribute in sequence according to the retrieval order, and obtain an image in the image to be retrieved whose feature data matches the reference face attribute as the target image.
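The priority-ordered attribute retrieval can be sketched as below. This is a hedged illustration: the patent does not specify the priority values or the matching predicate, so a simple priority map and exact-equality match are assumed; candidates are narrowed one attribute at a time in the determined retrieval order.

```python
def retrieve_by_attributes(candidates, reference_attrs, priority):
    # Determine the retrieval order from the preset priority (lower value
    # means retrieved first; unknown attributes sort last). The priority
    # map itself is an assumption for illustration.
    ordered = sorted(reference_attrs,
                     key=lambda a: priority.get(a, float("inf")))
    results = list(candidates)
    # Narrow the images to be retrieved one attribute at a time; images
    # surviving every attribute become the target images.
    for attr in ordered:
        wanted = reference_attrs[attr]
        results = [img for img in results
                   if img["attrs"].get(attr) == wanted]
    return results
```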
In yet another possible implementation manner, the video stream to be processed is acquired by a camera; the parameters of the camera include: the range of the human face recognizable deflection angle is-45 degrees to +45 degrees, the range of the human face recognizable pitch angle is-30 degrees to +30 degrees, and the distance between the pupils of two eyes in the collected image is larger than or equal to 18 pixel points.
In some embodiments, functions of or modules included in the apparatus provided in the embodiments of the present disclosure may be used to execute the method described in the above method embodiments, and specific implementation thereof may refer to the description of the above method embodiments, and for brevity, will not be described again here.
Fig. 12 is a schematic hardware structure diagram of an apparatus for retrieving an image according to an embodiment of the present disclosure. The apparatus 2 comprises a processor 21 and may further comprise an input device 22, an output device 23, and a memory 24. The input device 22, the output device 23, the memory 24, and the processor 21 are connected to each other via a bus.
The memory includes, but is not limited to, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), or a portable read-only memory (CD-ROM), which is used for storing instructions and data.
The input means are for inputting data and/or signals and the output means are for outputting data and/or signals. The output means and the input means may be separate devices or may be an integral device.
The processor may include one or more processors, for example, one or more Central Processing Units (CPUs), and in the case of one CPU, the CPU may be a single-core CPU or a multi-core CPU.
The memory is used to store program codes and data of the network device.
The processor is used for calling the program codes and data in the memory and executing the steps in the method embodiment. Specifically, reference may be made to the description of the method embodiment, which is not repeated herein.
It will be appreciated that fig. 12 only shows a simplified design of an apparatus for retrieving images. In practical applications, the image retrieving apparatus may further include other necessary components, including but not limited to any number of input/output devices, processors, controllers, memories, etc., and all apparatuses capable of implementing the image retrieving apparatus of the embodiments of the present application are within the scope of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. It is also clear to those skilled in the art that the descriptions of the various embodiments of the present application have different emphasis, and for convenience and brevity of description, the same or similar parts may not be repeated in different embodiments, so that the parts that are not described or not described in detail in a certain embodiment may refer to the descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the application occur in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in or transmitted over a computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., Digital Versatile Disc (DVD)), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
One of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by hardware related to instructions of a computer program, which may be stored in a computer-readable storage medium, and when executed, may include the processes of the above method embodiments. And the aforementioned storage medium includes: various media that can store program codes, such as a read-only memory (ROM) or a Random Access Memory (RAM), a magnetic disk, or an optical disk.

Claims (10)

1. A method of retrieving an image, the method comprising:
acquiring reference person information;
using the reference person information to search a database, and obtaining an image with characteristic data matched with the reference person information in the database as an image to be searched;
under the condition that a retrieval request aiming at the image to be retrieved is received, retrieving the image to be retrieved by using the reference face attribute in the retrieval request, and obtaining an image with characteristic data matched with the reference face attribute in the image to be retrieved as a target image.
2. The method of claim 1, wherein the reference personal information includes a reference face image;
the using the reference person information retrieval database to obtain an image of feature data matched with the reference person information in the database as an image to be retrieved includes:
and retrieving the database by using the reference face image, and obtaining an image matched with the reference face image in the database as the image to be retrieved.
3. The method according to claim 1, wherein the reference personal information includes a reference face image and a reference body image;
the using the reference person information to search the database to obtain an image with characteristic data matched with the reference person information in the database as an image to be searched comprises the following steps:
and retrieving the database by using the reference face image and the reference human body image, and obtaining an image matched with the reference face image and an image matched with the reference human body image in the database as an image to be retrieved.
4. The method of claim 1, wherein the reference personal information includes a reference time range and/or a reference geographical location range, and the feature data of the images in the database includes an acquisition time and an acquisition location;
the using the reference person information to search the database to obtain an image with characteristic data matched with the reference person information in the database as an image to be searched comprises the following steps:
and taking the image with the acquisition time within the reference time range and/or the acquisition position within the reference geographical position range in the database as the image to be retrieved.
5. The method of claim 1, wherein the reference personal information includes a reference human body image;
the using the reference person information to search the database to obtain an image with characteristic data matched with the reference person information in the database as an image to be searched comprises the following steps:
and retrieving the database by using the reference human body image, and obtaining an image matched with the reference human body image in the database as the image to be retrieved.
6. The method according to any one of claims 1 to 5, wherein before the acquiring the reference personal information, the method further comprises:
acquiring a video stream to be processed;
performing face attribute extraction processing on the image to be processed in the video stream to be processed to obtain the face attribute of the image to be processed;
and taking the face attribute as the feature data of the image to be processed to obtain the database, wherein the database comprises the image to be processed and the feature data of the image to be processed.
7. An apparatus for retrieving an image, the apparatus comprising:
an acquisition unit configured to acquire reference person information;
the first retrieval unit is used for retrieving a database by using the reference person information, and obtaining an image with characteristic data matched with the reference person information in the database as an image to be retrieved;
and the second retrieval unit is used for retrieving the image to be retrieved by using the reference face attribute in the retrieval request under the condition that the retrieval request aiming at the image to be retrieved is received, and obtaining an image with characteristic data matched with the reference face attribute in the image to be retrieved as a target image.
8. A processor configured to perform the method of any one of claims 1 to 6.
9. An electronic device, comprising: a processor, transmitting means, input means, output means and a memory for storing computer program code comprising computer instructions which, when executed by the processor, cause the electronic device to perform the method of any of claims 1 to 6.
10. A computer-readable storage medium, in which a computer program is stored, the computer program comprising program instructions which, when executed by a processor of an electronic device, cause the processor to carry out the method of any one of claims 1 to 6.
CN201910704558.0A 2019-07-31 2019-07-31 Retrieve method and device, processor, electronic equipment and the storage medium of image Withdrawn CN110442742A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910704558.0A CN110442742A (en) 2019-07-31 2019-07-31 Retrieve method and device, processor, electronic equipment and the storage medium of image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910704558.0A CN110442742A (en) 2019-07-31 2019-07-31 Retrieve method and device, processor, electronic equipment and the storage medium of image

Publications (2)

Publication Number Publication Date
CN110442742A CN110442742A (en) 2019-11-12
CN110442742A9 true CN110442742A9 (en) 2021-03-16

Family

ID=68432774

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910704558.0A Withdrawn CN110442742A (en) 2019-07-31 2019-07-31 Retrieve method and device, processor, electronic equipment and the storage medium of image

Country Status (1)

Country Link
CN (1) CN110442742A (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110942036B (en) * 2019-11-29 2023-04-18 深圳市商汤科技有限公司 Person identification method and device, electronic equipment and storage medium
CN110990609B (en) * 2019-12-13 2023-06-16 云粒智慧科技有限公司 Searching method, searching device, electronic equipment and storage medium
CN111597928A (en) * 2020-04-29 2020-08-28 深圳市商汤智能传感科技有限公司 Three-dimensional model processing method and device, electronic device and storage medium
CN111739182A (en) * 2020-05-19 2020-10-02 深圳市商汤科技有限公司 Attendance checking method and device, electronic equipment and storage medium
CN111813979A (en) * 2020-07-14 2020-10-23 杭州海康威视数字技术股份有限公司 Information retrieval method and device and electronic equipment
CN112241696A (en) * 2020-09-28 2021-01-19 深圳市商汤科技有限公司 Image processing method and device, electronic device and storage medium
CN112632300A (en) * 2020-09-29 2021-04-09 深圳市商汤科技有限公司 Image retrieval method and device, electronic device and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7715597B2 (en) * 2004-12-29 2010-05-11 Fotonation Ireland Limited Method and component for image recognition
CN107783995B (en) * 2016-08-26 2020-09-04 杭州海康威视数字技术股份有限公司 Target object retrieval method and device
CN106447625A (en) * 2016-09-05 2017-02-22 北京中科奥森数据科技有限公司 Facial image series-based attribute identification method and device
CN106649538A (en) * 2016-10-26 2017-05-10 北京旷视科技有限公司 Method and device for finding human faces
CN108228696B (en) * 2017-08-31 2021-03-23 深圳市商汤科技有限公司 Face image retrieval method and system, shooting device and computer storage medium
CN109446364A (en) * 2018-10-23 2019-03-08 北京旷视科技有限公司 Capture search method, image processing method, device, equipment and storage medium
CN109710789A (en) * 2018-12-28 2019-05-03 北京旷视科技有限公司 Search method, device, electronic equipment and the computer storage medium of image data

Also Published As

Publication number Publication date
CN110442742A (en) 2019-11-12

Similar Documents

Publication Publication Date Title
CN110442742A9 (en) Method and device for retrieving image, processor, electronic equipment and storage medium
US11501514B2 (en) Universal object recognition
US10755086B2 (en) Picture ranking method, and terminal
CN108009521B (en) Face image matching method, device, terminal and storage medium
CN107169458B (en) Data processing method, device and storage medium
CN105590097B (en) Dual camera collaboration real-time face identification security system and method under the conditions of noctovision
US8130285B2 (en) Automated searching for probable matches in a video surveillance system
US9965494B2 (en) Sharing photos
CN108228792B (en) Picture retrieval method, electronic device and storage medium
CN111259751A (en) Video-based human behavior recognition method, device, equipment and storage medium
WO2014100277A1 (en) Collecting photos
CN107479715A (en) The method and apparatus that virtual reality interaction is realized using gesture control
CN110688952B (en) Video analysis method and device
WO2015043128A1 (en) Auxiliary observing method and auxiliary observing apparatus
CN117011929A (en) Head posture estimation method, device, equipment and storage medium
CN113642519A (en) Face recognition system and face recognition method
CN114299569A (en) Safe face authentication method based on eyeball motion
WO2014062969A1 (en) Optimizing photos based on an attention map acquired by eye-tracking
CN116453005A (en) Video cover extraction method and related device
WO2021017289A1 (en) Method and apparatus for locating object in video, and computer device and storage medium
CN112925941A (en) Data processing method and device, electronic equipment and computer readable storage medium
CN111079704A (en) Face recognition method and device based on quantum computation
CN112541384B (en) Suspicious object searching method and device, electronic equipment and storage medium
Wang et al. Person-of-interest detection system using cloud-supported computerized-eyewear
WO2022068024A1 (en) Image retrieval method and apparatus, electronic device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CI02 Correction of invention patent application

Correction item: Description

Correct: Revised to "Han nationality"

False: The content includes "Uygur"

Number: 46-01

Page: ??

Volume: 35

CI02 Correction of invention patent application
WW01 Invention patent application withdrawn after publication

Application publication date: 20191112

WW01 Invention patent application withdrawn after publication