CN113157956A - Picture searching method, system, mobile terminal and storage medium - Google Patents

Picture searching method, system, mobile terminal and storage medium

Info

Publication number
CN113157956A
CN113157956A (application CN202110440522.3A)
Authority
CN
China
Prior art keywords
face
picture
target
category
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110440522.3A
Other languages
Chinese (zh)
Other versions
CN113157956B (en)
Inventor
林杰
赵复军
王淞泽
严昀
陈婧萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yamaha Motor Solutions Co Ltd Xiamen
Original Assignee
Yamaha Motor Solutions Co Ltd Xiamen
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yamaha Motor Solutions Co Ltd Xiamen filed Critical Yamaha Motor Solutions Co Ltd Xiamen
Priority to CN202110440522.3A priority Critical patent/CN113157956B/en
Publication of CN113157956A publication Critical patent/CN113157956A/en
Application granted granted Critical
Publication of CN113157956B publication Critical patent/CN113157956B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53Querying
    • G06F16/532Query formulation, e.g. graphical querying
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Evolutionary Computation (AREA)
  • Library & Information Science (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a picture searching method, a system, a mobile terminal and a storage medium, wherein the method comprises the following steps: extracting features of a target face in a picture to be searched to obtain a face image and face feature points; carrying out image calibration on the face image according to the face feature points, and carrying out feature coding on the calibrated face image to obtain a face feature vector; and carrying out category identification on a target object and a target scene in the picture to be searched to obtain a target object category and a target scene category, and carrying out picture searches according to the face feature vector, the target object category and the target scene category to obtain a face search picture, an object search picture and a scene search picture. Based on the face feature vector, the target object category and the target scene category corresponding to the picture to be searched, pictures can be effectively searched by face feature, object category and scene category, so that pictures containing the same people, objects and scenes as the picture to be searched are obtained, thereby improving the user's picture searching experience.

Description

Picture searching method, system, mobile terminal and storage medium
Technical Field
The invention relates to the technical field of Internet of things, in particular to a picture searching method, a picture searching system, a mobile terminal and a storage medium.
Background
At present, with the rapid development of intelligent terminals, their hardware performance, especially the specifications of the camera, keeps improving, and imaging quality is getting higher and higher. Because the photographing functions of intelligent terminals are increasingly powerful, for example intelligent focusing, scene analysis, smiling-face recognition, voice input and quick continuous shooting, more and more users like to take pictures with a smartphone: the results are good, the device is convenient to use, and users can record their lives more conveniently and quickly.
In the prior art, because the number of stored pictures is large, when a user wants to search for pictures of the same person, object or scene, the user often cannot find them, or cannot find all of them, which degrades the user's picture searching experience.
Disclosure of Invention
The embodiment of the invention aims to provide a picture searching method, a picture searching system, a mobile terminal and a storage medium, so as to solve the problem of a poor picture searching experience caused by the fact that, in existing picture searches, pictures of the same person, object or scene cannot be found, or cannot all be found.
The embodiment of the invention is realized in such a way that an image searching method comprises the following steps:
respectively determining a target face, a target object and a target scene in a picture to be searched, and extracting the characteristics of the target face in the picture to be searched to obtain a face image and face characteristic points in the face image;
carrying out image calibration on the face image according to the face feature point, and carrying out feature coding on the face image after image calibration to obtain a face feature vector;
and respectively carrying out category identification on the target object and the target scene to obtain a target object category and a target scene category, and respectively carrying out picture search according to the face feature vector, the target object category and the target scene category to obtain a face search picture, an object search picture and a scene search picture.
Further, the performing feature extraction on the target face in the picture to be searched to obtain a face image and a face feature point in the face image includes:
inputting the picture to be searched into a pre-trained face detection model for face analysis to obtain the face image and the face characteristic points;
before the picture to be searched is input into the pre-trained face detection model for face analysis, the method further comprises the following steps:
inputting a sample picture into a convolutional neural network in the face detection model for convolution processing to obtain a characteristic picture;
performing picture sampling by adopting an up-sampling mode and connecting feature pictures with the same scale by adopting a transverse connection mode aiming at the convolutional neural network in the face detection model;
respectively carrying out picture combination on the convolutional neural network of the upper layer and the characteristic picture obtained by transverse connection aiming at the convolutional neural network in the face detection model to obtain a combined picture, and setting the combined picture as the characteristic picture output by the corresponding convolutional neural network;
and performing model training on the face detection model according to the characteristic picture output by the convolutional neural network to obtain the pre-trained face detection model.
Further, the inputting the picture to be searched into the pre-trained face detection model for face analysis includes:
carrying out face analysis on the picture to be searched according to the pre-trained face detection model to obtain face anchor frame coordinates and the face characteristic points;
and calibrating the coordinates of the human face anchor frame and the human face characteristic points, and outputting the calibrated coordinates of the human face anchor frame and the human face characteristic points.
Further, the performing picture search according to the face feature vector, the target object category, and the target scene category includes:
matching the target object type and the target scene type with a picture database respectively to obtain the object search picture and the scene search picture, wherein the picture database stores corresponding relations between different target object types and different target scene types and corresponding search pictures;
and matching the face feature vector with the feature vector database to obtain the face search picture, wherein the feature vector database stores the corresponding relation between different feature vectors and corresponding search pictures.
Further, the matching the face feature vector with the feature vector database includes:
clustering the feature vectors in the feature vector database to obtain a clustering category and an abnormal category, wherein the abnormal category comprises the feature vectors which cannot be clustered;
determining a target category in the cluster categories, and calculating feature similarity between feature vectors in the target category and the abnormal category and the face feature vector;
and if the feature similarity is greater than a similarity threshold value, setting the feature vector corresponding to the feature similarity as a target vector, and outputting a search picture corresponding to the target vector as the face search picture.
Further, the determining the target category in the cluster category includes:
respectively carrying out vector sampling in different clustering categories to obtain sampling vectors, and respectively calculating the vector similarity between the face feature vector and the sampling vectors;
and determining the cluster category corresponding to the maximum vector similarity as a target category.
Further, the image calibration of the face image according to the face feature point includes:
determining eye characteristic points in the face characteristic points, and calculating an eye included angle between the eye characteristic points;
and carrying out rotation calibration on the face image according to the eye included angle.
Another object of an embodiment of the present invention is to provide an image search system, including:
the characteristic extraction module is used for respectively determining a target face, a target object and a target scene in a picture to be searched, and extracting the characteristics of the target face in the picture to be searched to obtain a face image and face characteristic points in the face image;
the feature coding module is used for carrying out image calibration on the face image according to the face feature point and carrying out feature coding on the face image after image calibration to obtain a face feature vector;
and the picture searching module is used for respectively carrying out category identification on the target object and the target scene to obtain a target object category and a target scene category, and respectively carrying out picture searching according to the face feature vector, the target object category and the target scene category to obtain a face searching picture, an object searching picture and a scene searching picture.
Another objective of an embodiment of the present invention is to provide a mobile terminal, including a storage device and a processor, where the storage device is used to store a computer program, and the processor runs the computer program to make the mobile terminal execute the above-mentioned picture search method.
Another object of an embodiment of the present invention is to provide a storage medium, which stores a computer program used in the mobile terminal, wherein the computer program, when executed by a processor, implements the steps of the image searching method.
By extracting the features of the target face in the picture to be searched, the embodiment of the invention can effectively extract the face image and the face feature points in the picture to be searched; carrying out image calibration on the face image according to the face feature points improves the accuracy of the feature coding of the face image; carrying out feature coding on the calibrated face image effectively converts the image features into vector features; and carrying out category identification on the target object and the target scene respectively effectively determines their corresponding categories. Based on the face feature vector, the target object category and the target scene category corresponding to the picture to be searched, picture searches by face feature, object category and scene category can then be carried out effectively, so that pictures containing the same people, objects and scenes as the picture to be searched are obtained, the situation in which pictures cannot be found, or cannot all be found, is prevented, and the user's picture searching experience is improved.
Drawings
Fig. 1 is a flowchart of a picture searching method according to a first embodiment of the present invention;
fig. 2 is a flowchart of a picture searching method according to a second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a picture search system according to a third embodiment of the present invention;
fig. 4 is a schematic structural diagram of a mobile terminal according to a fourth embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
Example one
Referring to fig. 1, which is a flowchart of a picture searching method according to a first embodiment of the present invention, the picture searching method may be applied to any terminal device, where the terminal device may be a mobile phone, a tablet, a server, or a wearable smart device, and the picture searching method includes the steps of:
step S10, respectively determining a target face, a target object and a target scene in a picture to be searched, and extracting the characteristics of the target face in the picture to be searched to obtain a face image and face characteristic points in the face image;
the method includes determining a target face in a picture to be searched by using a convolutional neural network, performing category identification on an object in the picture to be searched to obtain an object category, matching the obtained object category with a pre-stored category determination table, and determining an object corresponding to the object category which is successfully matched as the target object, where the pre-stored category determination table stores specified categories preset by a user, for example, when the picture to be searched includes an object a1, an object a2, and an object a3, and the object categories corresponding to the object a1, the object a2, and the object a3 are a category b1, a category b2, and a category b3, and if the category b1 and the category b2 are successfully matched with the pre-stored category determination table, determining the object a1 and the object a2 as the target object, and a plurality of different target faces and target objects may exist in the same picture to be searched. In the step, the target scene is determined and obtained by extracting the background image in the picture to be searched.
Optionally, in this step, the performing feature extraction on the target face in the picture to be searched to obtain a face image and a face feature point in the face image includes:
and inputting the picture to be searched into a pre-trained face detection model for face analysis to obtain the face image and the face characteristic points.
Further, in this step, before the image to be searched is input into the pre-trained face detection model for face analysis, the method further includes:
inputting sample pictures into a convolutional neural network in the face detection model for convolution processing to obtain feature pictures, wherein the number and content of the sample pictures can be set according to user requirements and the sample pictures are used for model training of the face detection model; in this step, the sample pictures are input into the convolutional neural network for convolution processing to obtain feature pictures of different scales;
for the convolutional neural network in the face detection model, performing picture sampling in an up-sampling mode and connecting feature pictures of the same scale by transverse connections, wherein up-sampling is applied from top to bottom and each layer of the convolutional neural network is connected by a transverse connection to the feature picture of the same scale;
for the convolutional neural network in the face detection model, merging the feature picture up-sampled from the upper layer with the transversely connected feature picture to obtain a merged picture, and setting the merged picture as the feature picture output by the corresponding layer; the up-sampled feature picture of the upper layer and the transversely connected feature picture are added to obtain the feature picture output by that layer, so that feature pictures of different scales carry spatial context at different scales and faces of different sizes can be detected;
and performing model training on the face detection model according to the feature pictures output by the convolutional neural network to obtain the pre-trained face detection model; when the face detection model is detected to have converged after model training, it is output as the pre-trained face detection model.
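The top-down up-sampling and transverse (lateral) connection scheme described above corresponds to a feature-pyramid structure. The following is a minimal sketch, assuming a PyTorch implementation and illustrative channel sizes; the patent does not name a framework or exact dimensions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeaturePyramid(nn.Module):
    """Top-down up-sampling plus transverse 1x1 connections; merged maps are the per-level outputs."""
    def __init__(self, in_channels=(64, 128, 256), out_channels=64):
        super().__init__()
        # transverse 1x1 convolutions bring every backbone level to a common channel width
        self.lateral = nn.ModuleList([nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels])
        # 3x3 smoothing convolutions applied after each merge
        self.smooth = nn.ModuleList([nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
                                     for _ in in_channels])

    def forward(self, feats):
        # feats: backbone feature pictures ordered from fine to coarse resolution
        laterals = [lat(f) for lat, f in zip(self.lateral, feats)]
        merged = [laterals[-1]]                              # start from the coarsest level
        for i in range(len(laterals) - 2, -1, -1):
            up = F.interpolate(merged[0], size=laterals[i].shape[-2:], mode="nearest")
            merged.insert(0, laterals[i] + up)               # upper-layer picture + transverse connection
        return [s(m) for s, m in zip(self.smooth, merged)]   # feature pictures of different scales

# faces of different sizes are then detected on the different output scales
fpn = FeaturePyramid()
c3, c4, c5 = torch.randn(1, 64, 80, 80), torch.randn(1, 128, 40, 40), torch.randn(1, 256, 20, 20)
p3, p4, p5 = fpn([c3, c4, c5])
```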
Further, the inputting the picture to be searched into the pre-trained face detection model for face analysis includes:
carrying out face analysis on the picture to be searched according to the pre-trained face detection model to obtain face anchor frame coordinates and the face characteristic points;
acquiring an anchor frame error and a characteristic point error of the pre-trained face detection model, and calibrating the face anchor frame coordinate and the face characteristic point according to the anchor frame error and the characteristic point error respectively;
adopting a non-maximum suppression algorithm to screen the calibrated face anchor frame coordinates and the face characteristic points, and outputting the screened face anchor frame coordinates and the face characteristic points;
in this step, the pre-trained face detection model regresses, in different receptive fields, an anchor frame error between the predicted anchor frame and the ground-truth picture frame, and a feature point error between the predicted face feature points and the actual face feature points; the corresponding anchor frame error and feature point error are added to the face anchor frame coordinates and face feature points generated in the different receptive fields, so that corrected face anchor frame coordinates and face feature points are calculated; finally, a non-maximum suppression algorithm is used to screen out face anchor frames with low confidence and keep the anchor frame that best meets the requirements, so that correct face anchor frame coordinates and face feature points are obtained.
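The calibration and screening step can be pictured with the following NumPy sketch; the [x1, y1, x2, y2] box format, the thresholds and the additive error model are assumptions for illustration only.

```python
import numpy as np

def calibrate(anchors, anchor_errors, landmarks, landmark_errors):
    """Add the regressed errors back onto the raw anchor coordinates and face feature points."""
    return anchors + anchor_errors, landmarks + landmark_errors

def nms(boxes, scores, iou_thresh=0.4, score_thresh=0.6):
    """Screen out low-confidence and heavily overlapping face anchor frames."""
    order = np.argsort(scores)[::-1]
    order = order[scores[order] > score_thresh]          # drop low-confidence anchors first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_rest = (boxes[order[1:], 2] - boxes[order[1:], 0]) * (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_rest - inter + 1e-9)
        order = order[1:][iou < iou_thresh]              # keep only anchors that do not overlap too much
    return keep
```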
Step S20, carrying out image calibration on the face image according to the face characteristic point, and carrying out characteristic coding on the face image after image calibration to obtain a face characteristic vector;
the accuracy of the face image is improved by carrying out image calibration on the face image according to the face characteristic points, and the image characteristics can be effectively converted into vector characteristics by carrying out characteristic coding on the face image after image calibration.
Optionally, in this step, the performing image calibration on the face image according to the face feature point includes:
determining two eye feature points among the face feature points, and calculating the eye included angle between the two eye feature points, namely the angle of the line connecting the two eye feature points;
taking the midpoint between the two eye feature points as the rotation centre and the eye included angle as the rotation angle, rotation calibration is performed on the face image; in this way, the line connecting the two eyes serves as a reference line, and the angle of the face image can be effectively calibrated.
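The rotation calibration can be realised, for example, with OpenCV; the sketch below assumes the two eye feature points are available as (x, y) pixel coordinates and that cv2 is used, neither of which is named in the patent.

```python
import cv2
import numpy as np

def align_face(face_img, left_eye, right_eye):
    """Rotate the face image about the eye midpoint by the eye included angle."""
    (lx, ly), (rx, ry) = left_eye, right_eye
    angle = np.degrees(np.arctan2(ry - ly, rx - lx))       # eye included angle w.r.t. the horizontal
    center = ((lx + rx) / 2.0, (ly + ry) / 2.0)            # midpoint between the two eyes
    rot = cv2.getRotationMatrix2D(center, angle, 1.0)      # rotation that levels the eye line
    h, w = face_img.shape[:2]
    return cv2.warpAffine(face_img, rot, (w, h), flags=cv2.INTER_LINEAR)
```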
Step S30, performing category identification on the target object and the target scene respectively to obtain a target object category and a target scene category, and performing picture search respectively according to the face feature vector, the target object category and the target scene category to obtain a face search picture, an object search picture and a scene search picture.
For an object, the object detection deep neural network can be used to identify the category of the object, and all pictures containing the object can be searched according to the category, so as to achieve the picture search effect for the object.
Optionally, in the network training step of the object detection deep neural network, part of the region of an input sample picture can be erased and randomly filled with pixel values from other regions in the training set to perform data enhancement, and the sample picture can be mixed with four pictures carrying different semantic information, so that the object detection deep neural network learns targets beyond their conventional context, which further improves its robustness. The feature hierarchy of the object detection deep neural network is strengthened with accurate low-level localisation signals through a bottom-up path, which shortens the information path between the lower levels and the topmost level. An adaptive feature pooling layer in the object detection deep neural network connects the feature grid with all feature levels, so that useful information at every feature level is propagated directly to the proposal sub-network.
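The erase-and-fill and four-picture mixing can be sketched as follows; the exact cropping, filling and label handling are assumptions, since the patent only states the general idea, and all inputs are assumed to be colour images of the same H×W×3 shape.

```python
import numpy as np

def mosaic(images, out_size=640):
    """Mix four training pictures with different semantic information into one sample."""
    assert len(images) == 4
    half = out_size // 2
    canvas = np.zeros((out_size, out_size, 3), dtype=images[0].dtype)
    slots = [(0, 0), (0, half), (half, 0), (half, half)]
    for img, (top, left) in zip(images, slots):
        patch = img[:half, :half]                          # crude crop; real pipelines randomly crop/scale
        canvas[top:top + patch.shape[0], left:left + patch.shape[1]] = patch
        # the box labels of each picture would be shifted by (left, top) here as well
    return canvas

def random_erase(image, donor, max_frac=0.3):
    """Erase a random region and fill it with pixel values taken from another training picture."""
    h, w = image.shape[:2]
    eh = np.random.randint(1, max(2, int(h * max_frac)))
    ew = np.random.randint(1, max(2, int(w * max_frac)))
    y, x = np.random.randint(0, h - eh), np.random.randint(0, w - ew)
    image[y:y + eh, x:x + ew] = donor[y:y + eh, x:x + ew]
    return image
```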
In this embodiment, by extracting the features of the target face in the picture to be searched, the face image and the face feature points in the picture to be searched can be effectively extracted; carrying out image calibration on the face image according to the face feature points improves the accuracy of the feature coding of the face image; carrying out feature coding on the calibrated face image effectively converts the image features into vector features; and carrying out category identification on the target object and the target scene respectively effectively determines their corresponding categories. Picture searches can then be carried out effectively on the basis of the face feature vector, the target object category and the target scene category corresponding to the picture to be searched, so that pictures containing the same people, objects and scenes as the picture to be searched are obtained, the situation in which pictures cannot be found, or cannot all be found, is prevented, and the user's picture searching experience is improved.
Example two
Please refer to fig. 2, which is a flowchart illustrating a picture searching method according to a second embodiment of the present invention, wherein the method for further refining step S30 includes the steps of:
step S31, matching the target object type and the target scene type with a picture database respectively to obtain the object search picture and the scene search picture;
in the step, the target object type and the target scene type are respectively matched with the picture database so as to search the picture containing the target object and the target scene in the picture database.
Step S32, matching the face feature vector with the feature vector database to obtain the face search picture;
In this step, several data tables are designed to store different user information and picture information. For example, the face analysis data table stores a face number, the corrected and cropped face image, and the position of the face in the corresponding picture. The picture data table stores a picture number, the name of the original picture, its storage location, the uploader name, the time, the feature vector of the picture and the state of the picture. The face-picture data table associates face analysis with picture information and contains a picture number, a face number and some basic information, and is used to look up the correspondence between a face and its original picture. The user information data table contains a user number, user name, gender, representative picture and some basic user information, and the user-face data table associates user information with picture information and contains a user number, a face number and some other information, and is used to look up the correspondence between a user and a face.
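A minimal sketch of the data tables described above, written as Python dataclasses; all table and field names are illustrative assumptions, since the patent only lists the kind of information each table holds.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class FaceRecord:                  # face analysis data table
    face_id: int
    aligned_face_path: str         # corrected and cropped face image
    position_in_picture: Tuple[int, int, int, int]

@dataclass
class PictureRecord:               # picture data table
    picture_id: int
    original_name: str
    storage_path: str
    uploader: str
    upload_time: str
    feature_vector: List[float]
    state: str = "active"

@dataclass
class PictureFaceLink:             # face-picture association table
    picture_id: int
    face_id: int

@dataclass
class UserRecord:                  # user information data table
    user_id: int
    user_name: str
    gender: str
    representative_picture_id: int

@dataclass
class UserFaceLink:                # user-face association table
    user_id: int
    face_id: int
```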
Optionally, in this step, the matching the face feature vector with the feature vector database includes:
clustering the feature vectors in the feature vector database to obtain clustering categories and an abnormal category, wherein the abnormal category contains the feature vectors that could not be clustered; in this step, a density-based clustering algorithm can be used to cluster the feature vectors in the feature vector database, yielding the clustered categories, the abnormal category of vectors that could not be clustered, and an unprocessed picture category;
determining a target category in the cluster categories, and calculating feature similarity between feature vectors in the target category and the abnormal category and the face feature vector;
and if the feature similarity is greater than a similarity threshold value, setting the feature vector corresponding to the feature similarity as a target vector, and outputting a search picture corresponding to the target vector as the face search picture.
Further, the determining the target class in the cluster class includes:
respectively carrying out vector sampling in different clustering categories to obtain sampling vectors, and respectively calculating the vector similarity between the face feature vector and the sampling vectors;
determining the cluster category corresponding to the maximum vector similarity as a target category;
in this embodiment, all the feature vectors in the target category and all the feature vectors in the abnormal category and the unprocessed category are used as the search range of the face feature vector, and a picture with similarity greater than a similarity threshold is output to obtain the face search picture.
In the step, the clustering processing is carried out on the feature vectors in the feature vector database to obtain the clustering categories and the abnormal categories, and the target categories in the clustering categories are determined, so that the range of picture searching is effectively reduced, and the time of picture searching is reduced.
Optionally, if the vector similarity between any face feature vector and the feature vectors in the multiple cluster categories is greater than the similarity threshold, the face feature vector is stored in the feature vector database, and the feature vectors in the feature vector database are continuously re-clustered according to a preset time interval, so as to update the cluster category.
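The cluster-then-match flow of this embodiment can be sketched as follows, assuming DBSCAN as the density-based clustering algorithm, cosine similarity as the feature similarity, and NumPy feature vectors; none of these specifics are fixed by the patent.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cosine_sim(query, matrix):
    # query: (d,) face feature vector; matrix: (n, d) feature vectors from the database
    return matrix @ query / (np.linalg.norm(matrix, axis=1) * np.linalg.norm(query) + 1e-9)

def search_faces(query_vec, db_vecs, db_picture_ids, sim_thresh=0.6, sample_k=5):
    labels = DBSCAN(eps=0.5, min_samples=3, metric="cosine").fit_predict(db_vecs)
    clusters = [c for c in set(labels) if c != -1]          # label -1 is the abnormal category

    # determine the target category: sample a few vectors per cluster and keep the
    # cluster whose samples are most similar to the query face feature vector
    rng = np.random.default_rng(0)
    best_cluster, best_sim = None, -1.0
    for c in clusters:
        idx = np.where(labels == c)[0]
        sample_idx = rng.choice(idx, size=min(sample_k, len(idx)), replace=False)
        sim = cosine_sim(query_vec, db_vecs[sample_idx]).max()
        if sim > best_sim:
            best_cluster, best_sim = c, sim

    # search range = target category + abnormal category; keep matches above the threshold
    mask = labels == -1 if best_cluster is None else (labels == best_cluster) | (labels == -1)
    candidates = np.where(mask)[0]
    sims = cosine_sim(query_vec, db_vecs[candidates])
    return [db_picture_ids[i] for i in candidates[sims > sim_thresh]]
```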
In this embodiment, the target object category and the target scene category are respectively matched with the image database to search an image containing a target object and a target scene in the image database, and the face feature vector is matched with the feature vector database to obtain a face search image of the same person as that in the image to be searched.
EXAMPLE III
Please refer to fig. 3, which is a schematic structural diagram of a picture search system 100 according to a third embodiment of the present invention, including: the image processing system comprises a feature extraction module 10, a feature coding module 11 and an image searching module 12, wherein:
the feature extraction module 10 is configured to determine a target face, a target object, and a target scene in a picture to be searched, and perform feature extraction on the target face in the picture to be searched to obtain a face image and a face feature point in the face image.
Wherein, the feature extraction module 10 is further configured to: and inputting the picture to be searched into a pre-trained face detection model for face analysis to obtain the face image and the face characteristic points.
Optionally, the feature extraction module 10 is further configured to: inputting a sample picture into a convolutional neural network in the face detection model for convolution processing to obtain a characteristic picture;
performing picture sampling by adopting an up-sampling mode and connecting feature pictures with the same scale by adopting a transverse connection mode aiming at the convolutional neural network in the face detection model;
respectively carrying out picture combination on the convolutional neural network of the upper layer and the characteristic picture obtained by transverse connection aiming at the convolutional neural network in the face detection model to obtain a combined picture, and setting the combined picture as the characteristic picture output by the corresponding convolutional neural network;
and performing model training on the face detection model according to the characteristic picture output by the convolutional neural network to obtain the pre-trained face detection model.
Further, the feature extraction module 10 is further configured to: carrying out face analysis on the picture to be searched according to the pre-trained face detection model to obtain face anchor frame coordinates and the face characteristic points;
and calibrating the coordinates of the human face anchor frame and the human face characteristic points, and outputting the calibrated coordinates of the human face anchor frame and the human face characteristic points.
And the feature coding module 11 is configured to perform image calibration on the face image according to the face feature point, and perform feature coding on the face image after image calibration to obtain a face feature vector.
Wherein, the feature encoding module 11 is further configured to: determining eye characteristic points in the face characteristic points, and calculating an eye included angle between the eye characteristic points;
and carrying out rotation calibration on the face image according to the eye included angle.
The image searching module 12 is configured to perform category identification on the target object and the target scene respectively to obtain a target object category and a target scene category, and perform image searching respectively according to the face feature vector, the target object category, and the target scene category to obtain a face search image, an object search image, and a scene search image.
Wherein, the picture searching module 12 is further configured to: matching the target object type and the target scene type with a picture database respectively to obtain the object search picture and the scene search picture, wherein the picture database stores corresponding relations between different target object types and different target scene types and corresponding search pictures;
and matching the face feature vector with the feature vector database to obtain the face search picture, wherein the feature vector database stores the corresponding relation between different feature vectors and corresponding search pictures.
Optionally, the picture searching module 12 is further configured to: clustering the feature vectors in the feature vector database to obtain a clustering category and an abnormal category, wherein the abnormal category comprises the feature vectors which cannot be clustered;
determining a target category in the cluster categories, and calculating feature similarity between feature vectors in the target category and the abnormal category and the face feature vector;
and if the feature similarity is greater than a similarity threshold value, setting the feature vector corresponding to the feature similarity as a target vector, and outputting a search picture corresponding to the target vector as the face search picture.
Further, the picture searching module 12 is further configured to: respectively carrying out vector sampling in different clustering categories to obtain sampling vectors, and respectively calculating the vector similarity between the face feature vector and the sampling vectors;
and determining the cluster category corresponding to the maximum vector similarity as a target category.
Optionally, in this embodiment, a front-end display interface is provided, which includes a search-by-picture interface and a clustering interface for finding the cluster of pictures that contains a target user.
The search-by-picture interface comprises a component for uploading pictures, with which the user can browse a local folder and select a picture to upload. The selected picture is converted from JPG or PNG format into base64-encoded data by a format conversion function, and the base64-encoded picture data is sent to the back end in a web request. The interface also comprises a component that displays the uploaded picture and a component that displays the returned search pictures with a high matching degree. The back end returns the database numbers of the matched search pictures by calling the interface of the face picture search function; the back end then accesses the database to retrieve the base64-encoded data of the related pictures, converts each search picture back to its original display format with the conversion function, and displays it at the front end.
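The upload path of the search-by-picture interface can be sketched as follows; the endpoint URL, JSON field names and response shape are assumptions, since the patent does not define the web API.

```python
import base64
import requests

def upload_picture(path, endpoint="http://localhost:8000/api/search_by_picture"):
    """Read a JPG/PNG picture, convert it to base64-encoded data and send it to the back end."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    resp = requests.post(endpoint, json={"image_base64": encoded}, timeout=30)
    resp.raise_for_status()
    # assumed response: numbers of the matched search pictures plus their base64 data
    return resp.json()

def restore_picture(item, out_path):
    """Convert a returned base64 picture back into its original display format."""
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(item["image_base64"]))
```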
The clustering interface for finding pictures that contain the target user comprises a representative-picture component that displays, for each category, a representative picture of the previously clustered person; clicking a representative picture opens a sub-page. The sub-page contains a component that displays the information of the person and a component that displays the pictures containing that person; with these two components, relevant information of the selected person and the whole set of pictures containing that person can be displayed. Clicking a picture sends the number of the selected person to the back end, which queries the location of the corresponding pictures with a database query statement and returns the matching base64-encoded data.
In this embodiment, by extracting the features of the target face in the picture to be searched, the face image and the face feature points in the picture to be searched can be effectively extracted; carrying out image calibration on the face image according to the face feature points improves the accuracy of the feature coding of the face image; carrying out feature coding on the calibrated face image effectively converts the image features into vector features; and carrying out category identification on the target object and the target scene respectively effectively determines their corresponding categories. Picture searches can then be carried out effectively on the basis of the face feature vector, the target object category and the target scene category corresponding to the picture to be searched, so that pictures containing the same people, objects and scenes as the picture to be searched are obtained, the situation in which pictures cannot be found, or cannot all be found, is prevented, and the user's picture searching experience is improved.
Example four
Referring to fig. 4, a mobile terminal 101 according to a fourth embodiment of the present invention includes a storage device and a processor, where the storage device is used to store a computer program, and the processor runs the computer program to enable the mobile terminal 101 to execute the above-mentioned picture searching method.
The present embodiment also provides a storage medium on which a computer program used in the above-mentioned mobile terminal 101 is stored, which when executed, includes the steps of:
respectively determining a target face, a target object and a target scene in a picture to be searched, and extracting the characteristics of the target face in the picture to be searched to obtain a face image and face characteristic points in the face image;
carrying out image calibration on the face image according to the face feature point, and carrying out feature coding on the face image after image calibration to obtain a face feature vector;
and respectively carrying out category identification on the target object and the target scene to obtain a target object category and a target scene category, and respectively carrying out picture search according to the face feature vector, the target object category and the target scene category to obtain a face search picture, an object search picture and a scene search picture. The storage medium, such as: ROM/RAM, magnetic disk, optical disk, etc.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is used as an example, in practical applications, the above-mentioned function distribution may be performed by different functional units or modules according to needs, that is, the internal structure of the storage device is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit, and the integrated unit may be implemented in a form of hardware, or may be implemented in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application.
It will be understood by those skilled in the art that the constituent structure shown in fig. 3 does not constitute a limitation of the picture search system of the present invention, and may include more or less components than those shown, or combine some components, or different arrangement of components, and the picture search methods in fig. 1 and 2 are also implemented using more or less components than those shown in fig. 3, or combine some components, or different arrangement of components. The units, modules, etc. referred to herein are a series of computer programs that can be executed by a processor (not shown) in the target picture searching system and that can perform specific functions, and all of the computer programs can be stored in a storage device (not shown) of the target picture searching system.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (10)

1. A picture searching method, characterized in that the method comprises:
respectively determining a target face, a target object and a target scene in a picture to be searched, and extracting the characteristics of the target face in the picture to be searched to obtain a face image and face characteristic points in the face image;
carrying out image calibration on the face image according to the face feature point, and carrying out feature coding on the face image after image calibration to obtain a face feature vector;
and respectively carrying out category identification on the target object and the target scene to obtain a target object category and a target scene category, and respectively carrying out picture search according to the face feature vector, the target object category and the target scene category to obtain a face search picture, an object search picture and a scene search picture.
2. The image searching method of claim 1, wherein the extracting the features of the target face in the image to be searched to obtain a face image and face feature points in the face image comprises:
inputting the picture to be searched into a pre-trained face detection model for face analysis to obtain the face image and the face characteristic points;
before the picture to be searched is input into the pre-trained face detection model for face analysis, the method further comprises the following steps:
inputting a sample picture into a convolutional neural network in the face detection model for convolution processing to obtain a characteristic picture;
performing picture sampling by adopting an up-sampling mode and connecting feature pictures with the same scale by adopting a transverse connection mode aiming at the convolutional neural network in the face detection model;
respectively carrying out picture combination on the convolutional neural network of the upper layer and the characteristic picture obtained by transverse connection aiming at the convolutional neural network in the face detection model to obtain a combined picture, and setting the combined picture as the characteristic picture output by the corresponding convolutional neural network;
and performing model training on the face detection model according to the characteristic picture output by the convolutional neural network to obtain the pre-trained face detection model.
3. The image searching method of claim 2, wherein the inputting the image to be searched into a pre-trained face detection model for face analysis comprises:
carrying out face analysis on the picture to be searched according to the pre-trained face detection model to obtain face anchor frame coordinates and the face characteristic points;
and calibrating the coordinates of the human face anchor frame and the human face characteristic points, and outputting the calibrated coordinates of the human face anchor frame and the human face characteristic points.
4. The picture searching method according to claim 1, wherein the picture searching according to the face feature vector, the target object class and the target scene class respectively comprises:
matching the target object type and the target scene type with a picture database respectively to obtain the object search picture and the scene search picture, wherein the picture database stores corresponding relations between different target object types and different target scene types and corresponding search pictures;
and matching the face feature vector with the feature vector database to obtain the face search picture, wherein the feature vector database stores the corresponding relation between different feature vectors and corresponding search pictures.
5. The picture searching method of claim 4, wherein the matching the face feature vector with the feature vector database comprises:
clustering the feature vectors in the feature vector database to obtain a clustering category and an abnormal category, wherein the abnormal category comprises the feature vectors which cannot be clustered;
determining a target category in the cluster categories, and calculating feature similarity between feature vectors in the target category and the abnormal category and the face feature vector;
and if the feature similarity is greater than a similarity threshold value, setting the feature vector corresponding to the feature similarity as a target vector, and outputting a search picture corresponding to the target vector as the face search picture.
6. The picture searching method of claim 5, wherein the determining the target category in the cluster categories comprises:
respectively carrying out vector sampling in different clustering categories to obtain sampling vectors, and respectively calculating the vector similarity between the face feature vector and the sampling vectors;
and determining the cluster category corresponding to the maximum vector similarity as a target category.
7. The picture searching method according to claim 1, wherein the image calibration of the face image according to the face feature point comprises:
determining eye characteristic points in the face characteristic points, and calculating an eye included angle between the eye characteristic points;
and carrying out rotation calibration on the face image according to the eye included angle.
8. A picture search system, the system comprising:
the characteristic extraction module is used for respectively determining a target face, a target object and a target scene in a picture to be searched, and extracting the characteristics of the target face in the picture to be searched to obtain a face image and face characteristic points in the face image;
the feature coding module is used for carrying out image calibration on the face image according to the face feature point and carrying out feature coding on the face image after image calibration to obtain a face feature vector;
and the picture searching module is used for respectively carrying out category identification on the target object and the target scene to obtain a target object category and a target scene category, and respectively carrying out picture searching according to the face feature vector, the target object category and the target scene category to obtain a face searching picture, an object searching picture and a scene searching picture.
9. A mobile terminal, characterized by comprising a storage device for storing a computer program and a processor for executing the computer program to cause the mobile terminal to execute the picture search method according to any one of claims 1 to 7.
10. A storage medium, characterized in that it stores a computer program for use in a mobile terminal according to claim 9, which computer program, when executed by a processor, implements the steps of the picture search method according to any one of claims 1 to 7.
CN202110440522.3A 2021-04-23 2021-04-23 Picture searching method, system, mobile terminal and storage medium Active CN113157956B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110440522.3A CN113157956B (en) 2021-04-23 2021-04-23 Picture searching method, system, mobile terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110440522.3A CN113157956B (en) 2021-04-23 2021-04-23 Picture searching method, system, mobile terminal and storage medium

Publications (2)

Publication Number Publication Date
CN113157956A true CN113157956A (en) 2021-07-23
CN113157956B CN113157956B (en) 2022-08-05

Family

ID=76870085

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110440522.3A Active CN113157956B (en) 2021-04-23 2021-04-23 Picture searching method, system, mobile terminal and storage medium

Country Status (1)

Country Link
CN (1) CN113157956B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190026605A1 (en) * 2017-07-19 2019-01-24 Baidu Online Network Technology (Beijing) Co., Ltd . Neural network model training method and apparatus, living body detecting method and apparatus, device and storage medium
CN107886064A (en) * 2017-11-06 2018-04-06 安徽大学 A kind of method that recognition of face scene based on convolutional neural networks adapts to
WO2019128367A1 (en) * 2017-12-26 2019-07-04 广州广电运通金融电子股份有限公司 Face verification method and apparatus based on triplet loss, and computer device and storage medium
CN109376596A (en) * 2018-09-14 2019-02-22 广州杰赛科技股份有限公司 Face matching process, device, equipment and storage medium
CN110046266A (en) * 2019-03-28 2019-07-23 广东紫晶信息存储技术股份有限公司 A kind of intelligent management and device of photo
CN110750656A (en) * 2019-10-29 2020-02-04 上海德拓信息技术股份有限公司 Multimedia detection method based on knowledge graph
CN111274947A (en) * 2020-01-19 2020-06-12 广州广电卓识智能科技有限公司 Multi-task multi-thread face recognition method, system and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113536005A (en) * 2021-09-17 2021-10-22 网娱互动科技(北京)股份有限公司 Method and system for searching similar pictures or fonts
CN113536005B (en) * 2021-09-17 2021-12-24 网娱互动科技(北京)股份有限公司 Method and system for searching similar pictures or fonts
CN114283347A (en) * 2022-03-03 2022-04-05 粤港澳大湾区数字经济研究院(福田) Target detection method, system, intelligent terminal and computer readable storage medium
CN114283347B (en) * 2022-03-03 2022-07-15 粤港澳大湾区数字经济研究院(福田) Target detection method, system, intelligent terminal and computer readable storage medium

Also Published As

Publication number Publication date
CN113157956B (en) 2022-08-05

Similar Documents

Publication Publication Date Title
CN109146892B (en) Image clipping method and device based on aesthetics
US11294953B2 (en) Similar face retrieval method, device and storage medium
US10762387B2 (en) Method and apparatus for processing image
CN109784181B (en) Picture watermark identification method, device, equipment and computer readable storage medium
CN112348117B (en) Scene recognition method, device, computer equipment and storage medium
US20170351934A1 (en) Object recognition device, object recognition method, and program
CN111950723A (en) Neural network model training method, image processing method, device and terminal equipment
CN113157956B (en) Picture searching method, system, mobile terminal and storage medium
CN111209970A (en) Video classification method and device, storage medium and server
CN113837257B (en) Target detection method and device
CN111783712A (en) Video processing method, device, equipment and medium
CN112818995B (en) Image classification method, device, electronic equipment and storage medium
CN111078924B (en) Image retrieval method, device, terminal and storage medium
WO2022116104A1 (en) Image processing method and apparatus, and device and storage medium
JP2022540101A (en) POSITIONING METHOD AND APPARATUS, ELECTRONIC DEVICE, COMPUTER-READABLE STORAGE MEDIUM
CN112614110A (en) Method and device for evaluating image quality and terminal equipment
CN110489659A (en) Data matching method and device
CN113221718A (en) Formula identification method and device, storage medium and electronic equipment
CN116628507B (en) Data processing method, device, equipment and readable storage medium
CN114091551A (en) Pornographic image identification method and device, electronic equipment and storage medium
CN112052352B (en) Video ordering method, device, server and storage medium
CN116824609B (en) Document format detection method and device and electronic equipment
CN115797291B (en) Loop terminal identification method, loop terminal identification device, computer equipment and storage medium
CN111860623A (en) Method and system for counting tree number based on improved SSD neural network
CN115393756A (en) Visual image-based watermark identification method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant