WO2017032243A1 - Image feature extraction method, apparatus, terminal device, and system - Google Patents

Image feature extraction method, apparatus, terminal device, and system

Info

Publication number
WO2017032243A1
Authority
WO
WIPO (PCT)
Prior art keywords: image, structured, feature, training, sub
Application number
PCT/CN2016/095524
Other languages
French (fr)
Chinese (zh)
Inventor
刘荣
易东
张帆
张伦
楚汝峰
Original Assignee
阿里巴巴集团控股有限公司
刘荣
易东
张帆
张伦
楚汝峰
Application filed by 阿里巴巴集团控股有限公司, 刘荣, 易东, 张帆, 张伦, 楚汝峰
Publication of WO2017032243A1 publication Critical patent/WO2017032243A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168 Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/172 Classification, e.g. identification

Definitions

  • the present application relates to the field of electronic technologies, and in particular, to an image feature extraction method, an image feature extraction device, an image feature extraction terminal device, and an image feature extraction system.
  • Face recognition research began in the 1990s. Early work proposed the eigenface method, which describes faces with image principal components, and the fisherface method, which describes face images with discriminative features. Since the turn of the century, face local feature description methods based on LBP and Gabor, together with boosting-based discriminative feature learning methods, quickly became mainstream; in recent years, with the introduction of deep learning methods, face recognition technology has been pushed to a new level. At present, there are several representative leading-edge technologies in the field of face recognition:
  • The first is the American company Facebook, which introduced deep learning to face recognition for the first time. Using a deep neural network built from five convolutional layers and two fully connected layers, it extracts a 4096-dimensional visual feature from the entire face image for description, significantly improving recognition accuracy.
  • The domestic company Face++ likewise uses deep learning, training a deep neural network hierarchically with a pyramid structure to analyze the entire face image, also achieving a breakthrough in face recognition technology.
  • the present application provides an image feature extraction method, an image feature extraction device, an image feature extraction terminal device, and an image feature extraction system.
  • the technical solution adopted in this application is:
  • The application provides an image feature extraction method, including: receiving an image input by a user; registering the image input by the user to obtain a registered image; constructing a plurality of structured sub-images on the registered image; extracting a visual feature of each structured sub-image using feature models obtained by multi-model training; structurally fusing the visual features of the plurality of structured sub-images to obtain structured feature data; and
  • operating on the structured feature data using the model obtained by structured model training to obtain image feature data.
  • The constructing of the multiple structured sub-images on the registered image comprises: determining structured reference point positions of the registered image; determining a shape parameter of the sub-images; and
  • cutting the registered image according to the structured reference point positions and the shape parameter of the sub-images to obtain a plurality of structured sub-images.
  • The determining of the structured reference point positions of the registered image comprises: determining the structured reference point positions of the registered image according to image feature points; or,
  • determining the structured reference point positions of the registered image according to spatial location.
  • the mathematical algorithm for cutting the registered image according to the structured reference point position and the shape parameter of the sub-image to obtain a plurality of structured sub-images is:
  • a_ij = C(a, p_ij(x, y), s_ij)
  • where a_ij denotes the structured sub-image whose structural order is i-th in the horizontal direction and j-th in the vertical direction,
  • C is the construction function of the structured sub-images,
  • a represents the image input by the user,
  • p_ij represents the structured reference point that is i-th in the horizontal direction and j-th in the vertical direction,
  • p_ij(x, y) indicates that the structured reference point p_ij is at coordinates (x, y) of the image input by the user, and
  • s_ij represents the shape parameter of the structured sub-image, covering arbitrary planar shapes such as rectangles, circles, and ellipses, together with their dimensions.
  • The feature models obtained by multi-model training are obtained by the following method: selecting a predetermined training image library; registering each training image in the predetermined training image library according to a unified registration method to obtain a plurality of registered training images; constructing a plurality of structured sub-training images for each of the registered training images; and
  • performing feature model training on the plurality of structured sub-training images using a visual feature learning algorithm to extract the corresponding sub-training image visual features and obtain the feature models.
  • The visual feature learning algorithm includes any one of the following: a deep learning method, a boosting algorithm, an SVM algorithm, or a local feature combination learning algorithm.
  • The mathematical expression of the feature model is:
  • v_ij = M_ij(a_ij, q_ij)
  • where a_ij denotes the sub-training image whose structural order is i-th in the horizontal direction and j-th in the vertical direction,
  • M_ij is the feature model trained on the corresponding sub-training image a_ij,
  • q_ij is the feature model parameter obtained by training, and
  • v_ij is the sub-training image visual feature extracted from the sub-training image a_ij by the feature model M_ij.
  • The structural fusion of the visual features of the plurality of structured sub-images to obtain the structured feature data includes: structurally fusing the visual features according to the structured reference point positions determined when constructing the structured sub-images, where the resulting structured feature data includes both feature space relationships and feature information.
  • The mathematical expression of the structured feature data is:
  • d(i, j, k) = v_ij(k)
  • where v_ij represents the visual feature of a structured sub-image,
  • k indexes the k-th dimension of the data, and
  • d is the structured feature data after fusion.
  • The model obtained by structured model training is obtained by: structurally fusing the plurality of sub-training image visual features to obtain training image structured feature data; and
  • performing structured model training on the training image structured feature data using a visual feature learning algorithm to obtain the model.
  • The mathematical expression of the model obtained by structured model training is:
  • v = M(d, q)
  • where M is the model obtained by structured model training based on the fused training image feature data d,
  • q is the model parameter obtained by training, and
  • v is the corresponding visual feature obtained by the model M operating on the training image feature data d.
  • Optionally, the image feature extraction method further includes: sequentially comparing the image feature data with each predetermined image feature data in a predetermined image database, and outputting the comparison result.
  • The sequential comparison of the image feature data with each predetermined image feature data in the predetermined image database includes: sequentially calculating the difference between the image feature data and each predetermined image feature data in the predetermined image database.
  • Outputting the comparison result includes: sequentially determining whether each difference is greater than a predetermined difference threshold;
  • if every difference is greater than the predetermined threshold, information that there is no similar image is output; otherwise, the image corresponding to the predetermined image feature data with the smallest difference from the image feature data, and/or that image's information, is output.
  • The algorithm for calculating the difference between the image feature data and each predetermined image feature data in the predetermined image database includes any one of the following:
  • the Euclidean distance calculation method, the Cosine distance calculation method, or the Joint Bayesian distance calculation method.
  • the image includes: a face image.
  • the application also provides an image feature extraction device, including:
  • An image receiving unit configured to receive an image input by a user
  • a registration unit configured to register an image input by the user to obtain a registered image
  • a sub-image construction unit configured to construct a plurality of structured sub-images on the registered image
  • a visual feature extraction unit configured to extract a visual feature of each of the structured sub-images by using a feature model obtained by multi-model training
  • a merging unit configured to structurally fuse the visual features of the plurality of structured sub-images to obtain structured feature data
  • an operation unit, configured to operate on the structured feature data using the model obtained by structured model training to obtain image feature data.
  • the registration unit includes:
  • a reference point determining subunit for determining a structured reference point position of the registered image
  • a shape parameter determining subunit for determining a shape parameter of the sub image
  • a cutting subunit configured to cut the registered image according to the structured reference point position and the shape parameter of the sub image to obtain a plurality of structured sub-images.
  • the reference point determining subunit includes:
  • a feature reference point determining subunit configured to determine a structured reference point position of the registered image according to the image feature point
  • a spatial reference point determining subunit is configured to determine a structured reference point position of the registered image based on the spatial location.
  • The mathematical algorithm used by the cutting subunit is:
  • a_ij = C(a, p_ij(x, y), s_ij)
  • where a_ij denotes the structured sub-image whose structural order is i-th in the horizontal direction and j-th in the vertical direction,
  • C is the construction function of the structured sub-images,
  • a represents the image input by the user,
  • p_ij represents the structured reference point that is i-th in the horizontal direction and j-th in the vertical direction,
  • p_ij(x, y) indicates that the structured reference point p_ij is at coordinates (x, y) of the image input by the user, and
  • s_ij represents the shape parameter of the structured sub-image, covering arbitrary planar shapes such as rectangles, circles, and ellipses, together with their dimensions.
  • the image feature extraction device further includes:
  • a multi-model training unit for obtaining a feature model by multi-model training
  • the multi-model training unit includes:
  • Training image library selection subunit for selecting a predetermined training image library
  • a training image registration sub-unit configured to register each training image in the predetermined training image library according to a unified registration method, to obtain a plurality of the registered training images
  • a sub-training image construction sub-unit configured to respectively construct a plurality of structured sub-training images for the plurality of registered training images
  • the feature model acquisition sub-unit is configured to perform feature model training on the plurality of structured sub-training images using a visual feature learning algorithm to extract the corresponding sub-training image visual features and obtain the feature models.
  • The visual feature learning algorithm adopted by the feature model acquisition subunit includes any one of the following: a deep learning method, a boosting algorithm, an SVM algorithm, or a local feature combination learning algorithm.
  • the fusion unit includes:
  • a reference point fusion subunit configured to structurally fuse the visual features of the plurality of structured sub-images according to the determined structured reference point position when constructing the plurality of structured sub-images, to obtain structured feature data
  • the structured feature data includes feature space relationships and feature information.
  • the image feature extraction device further includes:
  • a structured model training unit for obtaining a model through structured model training
  • the structured model training unit includes:
  • a sub-training image fusion sub-unit configured to structurally fuse the plurality of sub-training image visual features to obtain training image structured feature data
  • the model acquisition subunit is configured to perform structured model training on the training image structured feature data using a visual feature learning algorithm, and obtain the model produced by the structured model training.
  • the image feature extraction device further includes:
  • a comparison unit configured to sequentially compare the image feature data with each predetermined image feature data in a predetermined image database
  • An output unit for outputting the comparison result.
  • the comparison unit includes:
  • a difference calculation subunit configured to sequentially calculate a difference between the image feature data and each predetermined image feature data in a predetermined image database
  • the output unit includes:
  • a difference determining subunit configured to sequentially determine whether each of the difference values is greater than a predetermined difference threshold
  • an information output subunit, configured to output information that there is no similar image if each of the differences is greater than the predetermined difference threshold; otherwise, to output the image corresponding to the predetermined image feature data with the smallest difference from the image feature data, and/or that image's information.
  • The algorithm used by the comparison unit for calculating the difference between the image feature data and each predetermined image feature data in the predetermined image database includes any one of the following:
  • the Euclidean distance calculation method, the Cosine distance calculation method, or the Joint Bayesian distance calculation method.
  • The application also provides an image feature extraction terminal device, including a memory in which a program implementing the image feature extraction method provided by the present application is stored; after startup, the terminal device can run according to the above method.
  • the present application also provides an image feature extraction system, including a client and a remote server.
  • The client captures an image and/or selects an image from the album and sends it to the remote server;
  • the remote server extracts the image feature data, compares it with the images in a predetermined image database, and sends the comparison result back to the client, which finally outputs the comparison result.
  • The image feature extraction method provided by the present application first receives an image input by a user; the image input by the user is then registered to obtain a registered image; a plurality of structured sub-images is then constructed on the registered image; the feature models obtained by multi-model training are used to extract the visual features of each structured sub-image; the visual features of the plurality of structured sub-images are then structurally fused to obtain structured feature data; and finally, the model obtained by structured model training operates on the structured feature data to obtain the image feature data.
  • The spatial position information between the structured sub-images is preserved by constructing the structured sub-images, so the extracted visual features of the structured sub-images include both the feature space relationships and the feature information. This not only retains the descriptiveness of each visual feature but also preserves the spatial relationships among the visual features, so that the final image feature data is a feature vector whose feature distance can describe the difference between different images; and because the feature vector and the model in this method better maintain the structural characteristics of the image during training, the image feature data has higher accuracy and identifiability.
  • Applying the image feature extraction method provided by the present application to image recognition, especially face recognition, yields higher accuracy and thus a better recognition effect.
  • FIG. 1 is a flowchart of an embodiment of an image feature extraction method provided by the present application.
  • FIG. 2 is a flow chart of constructing a plurality of structured sub-images in an embodiment of an image feature extraction method provided by the present application
  • FIG. 3 is a diagram showing an example of determining a structured reference point according to a spatial position relationship provided by the present application
  • FIG. 4 is an exemplary diagram of determining a structured reference point according to a face feature point provided by the present application
  • FIG. 5 is a flowchart of multi-model training in an embodiment of an image feature extraction method provided by the present application.
  • FIG. 6 is a schematic diagram of the feature structural fusion provided by the present application.
  • FIG. 7 is a schematic diagram of an embodiment of an image feature extraction device provided by the present application.
  • the present application provides an image feature extraction method, an image feature extraction device, an image feature extraction terminal device, and an image feature extraction system.
  • the embodiments of the present application are described in detail below with reference to the accompanying drawings.
  • FIG. 1 is a flowchart of an embodiment of an image feature extraction method provided by the present application.
  • the image feature extraction method includes the following steps:
  • Step S101 Receive an image input by the user.
  • The image input by the user is first received; the user can select an image from the electronic album of the terminal device, or capture an image with the camera and input it.
  • The purpose of the present application is image recognition, so the image input by the user is preferably a static image; however, to improve the general applicability of the method, in one embodiment of the present application a dynamic image input by the user may be received, and
  • pre-processing is performed to extract only a specific frame (such as the first frame) of the dynamic image as the image input by the user; all of the above are within the protection scope of the present application.
  • the present image feature extraction method is used for face image recognition, and therefore, the image includes a face image.
  • Step S102 register the image input by the user to obtain a registered image.
  • In step S101, the image input by the user has been received; next, the image input by the user needs to be registered.
  • The registration method commonly used in the prior art is to first detect the image feature points and then perform an affine transformation of the image according to the feature points, normalizing the image to a predetermined size and scale to obtain the registered image for identification and comparison.
  • In this embodiment, the image feature extraction method is used for face image recognition, and the image is a face image. A minimal alignment sketch follows.
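  • To make the registration step concrete, here is a minimal Python sketch of landmark-based alignment, assuming facial landmarks are already available from an off-the-shelf detector; the canonical template coordinates and the 128x128 output size are illustrative assumptions, not values from the application.

```python
import cv2
import numpy as np

def register_image(image, landmarks):
    """Warp `image` so its detected landmarks align with a canonical template.

    image     : H x W x 3 uint8 array.
    landmarks : sequence of (x, y) points; the first three are assumed to be
                the left eye, right eye, and nose tip (a hypothetical layout).
    """
    # Hypothetical canonical positions in a 128 x 128 registered frame.
    template = np.float32([[40, 50], [88, 50], [64, 80]])
    src = np.float32(landmarks[:3])
    # Estimate a similarity transform (rotation + uniform scale + translation).
    matrix, _ = cv2.estimateAffinePartial2D(src, template)
    # Normalize the image to the predetermined size and scale.
    return cv2.warpAffine(image, matrix, (128, 128))
```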
  • Step S103 Construct a plurality of structured sub-images on the registered images.
  • In step S102, the registered image has been obtained by registering the image input by the user; next, a plurality of structured sub-images needs to be constructed on the registered image.
  • Referring to FIG. 2, which is a flowchart of constructing a plurality of structured sub-images in an embodiment of the image feature extraction method provided by the present application, the construction of a plurality of structured sub-images on the registered image may be performed through the following sub-steps:
  • Step S1031 Determine a structured reference point position of the registered image.
  • a plurality of structured sub-images are constructed, that is, a plurality of sub-images are segmented from the image according to a certain structure, position, and constraints.
  • the structured reference point position of the registered image is determined to determine the cutting position of the structured sub-image.
  • The structured reference points are used as the center points for cutting the structured sub-images, and their relative up-down and left-right relations remain basically unchanged.
  • The structured reference point positions of the registered image may be determined according to spatial position, or according to image feature points.
  • For example, a set of 4x4 structured reference points may be determined according to spatial positional relationships, with the distances between them completely fixed with respect to the image (see FIG. 3);
  • alternatively, 3x3 structured reference points may be determined according to the face feature points (see FIG. 4).
  • The nine structured reference points in the figure are, from top to bottom and from left to right: the right-eye center point, the between-eyes center point, the left-eye center point, the right cheek point, the nose point, the left cheek point, the right mouth-corner point, the lip center point, and the left mouth-corner point.
  • The positional relationship of the nine structured reference points varies slightly for different people, poses, and expressions, but still approximately satisfies a rectangular structural relationship.
  • The method of determining the structured reference points can be selected according to the main content of the image, and the number of structured reference points is not limited to the above 4x4 and
  • 3x3 cases; it can be flexibly determined according to the actual situation, which will not be further described herein, and all such variations are within the protection scope of the present application. A sketch of the two determination strategies follows.
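  • As a sketch of the two strategies above, the following Python fragment generates a uniform 4x4 grid of reference points fixed relative to the image; the 3x3 landmark-based variant would take its nine points from a face landmark detector instead (assumed available, not shown).

```python
import numpy as np

def grid_reference_points(height, width, rows=4, cols=4):
    """Evenly spaced structured reference points, fixed relative to the image."""
    ys = np.linspace(0, height - 1, rows + 2)[1:-1]  # interior rows only
    xs = np.linspace(0, width - 1, cols + 2)[1:-1]   # interior columns only
    # points[i][j] is the reference point p_ij as (x, y) coordinates.
    return [[(int(x), int(y)) for x in xs] for y in ys]

# For the 3x3 landmark case, points[0] would instead hold the right-eye,
# between-eyes, and left-eye centers detected on the face, and so on.
```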
  • Step S1032 Determine a shape parameter of the sub image.
  • In step S1031, the structured reference point positions of the registered image have been determined; next, the shape parameter of the sub-images needs to be determined, that is, using each structured reference point position as a reference, a sub-image area of a certain proportional size is determined. The shape parameter includes the shape of the sub-image, such as an arbitrary planar shape (a rectangle, a circle, an ellipse, and so on), and the size of the sub-image, such as the length and width of a rectangle or the radius of a circle.
  • For example, two rectangular sub-image areas of different sizes may be centered on the upper-left and lower-right structured reference points, respectively.
  • Step S1033 Cutting the registered image according to the structured reference point position and the shape parameter of the sub-image to obtain a plurality of structured sub-images.
  • The structured reference point positions and the shape parameter of the sub-images have been determined through steps S1031 and S1032; next, the registered image needs to be cut according to the structured reference point positions and the shape parameter of the sub-images to extract a plurality of structured sub-images, and the positional relationships of the structured reference points are recorded and stored as structural information.
  • The mathematical algorithm of the structured sub-image construction may be (a code sketch follows the definitions below):
  • a_ij = C(a, p_ij(x, y), s_ij)
  • where a_ij denotes the structured sub-image whose structural order is i-th in the horizontal direction and j-th in the vertical direction,
  • C is the construction function of the structured sub-images,
  • a represents the image input by the user,
  • p_ij represents the structured reference point that is i-th in the horizontal direction and j-th in the vertical direction,
  • p_ij(x, y) indicates that the structured reference point p_ij is at coordinates (x, y) of the image input by the user, and
  • s_ij represents the shape parameter of the structured sub-image, covering arbitrary planar shapes such as rectangles, circles, and ellipses, together with their dimensions.
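  • A minimal Python sketch of the construction function C for the rectangular case; circular or elliptical shape parameters would additionally need a mask, and the border clamping is an implementation assumption.

```python
def construct_sub_image(a, p, s):
    """a_ij = C(a, p_ij(x, y), s_ij) for a rectangular shape parameter.

    a : the registered image as an H x W (x C) array.
    p : (x, y) coordinates of the structured reference point p_ij.
    s : (w, h) width and height of the rectangular sub-image.
    """
    x, y = p
    w, h = s
    top = max(0, y - h // 2)    # clamp at the image border
    left = max(0, x - w // 2)
    return a[top:top + h, left:left + w]

# sub_images[i][j] then holds a_ij for each reference point p_ij:
# sub_images = [[construct_sub_image(a, p, (32, 32)) for p in row]
#               for row in points]
```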
  • Step S104 Extract a visual feature of each of the structured sub-images by using a feature model obtained by multi-model training.
  • In step S103, a plurality of structured sub-images has been constructed on the registered image; next, the feature models obtained by multi-model training are needed to extract the visual features of each structured sub-image. A feature model is a mathematical expression, obtained through multi-model training, for extracting the visual features of an image:
  • its input is an overall or partial image, and
  • its output is the corresponding visual feature.
  • the visual feature is a mathematical expression based on an image that can describe the overall or local shape, texture, color, etc. of the image, and is generally expressed in the form of a vector.
  • the multi-model training is a process of estimating the parameters of the feature model, and the estimation of the feature model parameters is generally completed according to a certain criterion by a large number of images.
  • Referring to FIG. 5, which is a flowchart of multi-model training in an embodiment of the image feature extraction method provided by the present application, the feature models obtained by multi-model training are achieved through the following sub-steps:
  • Step S1041 Select a predetermined training image library.
  • A predetermined training image library is first selected. The predetermined training image library is a set of multiple training images consistent with the subject content of the image input by the user. Taking a face image as an example, if the image input by the user is a face image, the predetermined training image library is a face training image library; the face training image library can adopt a representative open face database in the industry, such as LFW or CASIA-WebFace, or a proprietary face database organized according to uniform standards.
  • Step S1042 Register each training image in the predetermined training image library according to a unified registration method to obtain a plurality of registered training images.
  • In step S1041, a predetermined training image library has been selected; next, each training image in the predetermined training image library is registered according to a unified registration method to obtain a plurality of registered training images.
  • The registration method is the same as that described in step S102; for details, refer to the description of step S102 above, which is not repeated here and is within the protection scope of the present application.
  • Step S1043 respectively construct a plurality of structured sub-training images for the plurality of registered training images.
  • In step S1042, each training image in the predetermined training image library has been registered according to a unified registration method, yielding a plurality of registered training images; next, a plurality of structured
  • sub-training images needs to be constructed for each of the registered training images.
  • Step S1044 Perform feature model training on the plurality of structured sub-training images by using a visual feature learning algorithm to extract corresponding plurality of sub-training image visual features, and obtain a feature model.
  • In step S1043, a plurality of structured sub-training images has been constructed for each of the registered training images; next, feature model training is performed on the plurality of structured sub-training images using a visual feature learning algorithm to extract the corresponding sub-training image visual features and obtain the feature models.
  • multi-model training is performed on each structured sub-training image to extract the most characteristic visual features for each structured sub-training image.
  • The visual feature learning algorithm includes any one of the following: a deep learning method, a boosting algorithm, an SVM algorithm, or a local feature combination learning algorithm.
  • The mathematical expression of the feature model is:
  • v_ij = M_ij(a_ij, q_ij)
  • where a_ij denotes the sub-training image whose structural order is i-th in the horizontal direction and j-th in the vertical direction,
  • M_ij is the feature model trained on the corresponding sub-training image a_ij,
  • q_ij is the feature model parameter obtained by training, and
  • v_ij is the sub-training image visual feature extracted from the sub-training image a_ij by the feature model M_ij.
  • Through steps S1041 to S1044, multi-model training is completed and the feature models and feature model parameters are determined. Next, the plurality of structured sub-images is substituted into the feature models, and the visual features of each structured sub-image can be calculated, as sketched below.
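  • The following Python sketch illustrates this substitution with one stand-in model per grid position; the linear projection inside FeatureModel is only a placeholder for whichever learner (deep network, boosting, SVM, local features) plays the role of M_ij.

```python
import numpy as np

class FeatureModel:
    """Stand-in for a trained per-position model M_ij with parameters q_ij."""
    def __init__(self, q):
        self.q = q  # assumed learned projection, shape (K, w * h * channels)

    def extract(self, sub_image):
        flat = sub_image.astype(np.float32).ravel()
        return self.q @ flat  # v_ij: a K-dimensional visual feature

def extract_all(models, sub_images):
    """v[i][j] = M_ij(a_ij, q_ij) for every structured position (i, j)."""
    return [[models[i][j].extract(sub_images[i][j])
             for j in range(len(sub_images[i]))]
            for i in range(len(sub_images))]
```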
  • Step S105 Structurally merging the visual features of the plurality of structured sub-images to obtain structured feature data.
  • In step S104, the feature models obtained by multi-model training have been used to extract the visual features of each structured sub-image; next, the visual features of the plurality of structured sub-images are structurally fused to obtain the structured feature data.
  • The structural fusion of the visual features of the plurality of structured sub-images to obtain the structured feature data includes:
  • spatially fusing the visual features of the structured sub-images according to the structured reference point positions determined in step S103 above, so that the spatial plane reflects the visual feature of each structured sub-image according to the spatial relationships of the structured reference point positions, while the feature axis of the visual features of the structured sub-images reflects the feature information of each structured sub-image, its length representing the feature dimension.
  • Referring to FIG. 6, which is a schematic diagram of the feature structural fusion provided by the present application: the visual feature 603 of the sub-image 602 centered on the structured reference point 601 is extracted by the corresponding feature model, and structural fusion yields the structured feature data 604.
  • Since the process of structural fusion maintains the spatial positional relationship of the structured reference point 601 relative to the other structured reference points, the structured feature data 604 includes both feature space relationships and feature information.
  • The mathematical representation of the structured feature data is:
  • d(i, j, k) = v_ij(k)
  • where v_ij represents the visual feature of a structured sub-image,
  • k indexes the k-th dimension of the data, and
  • d is the structured feature data obtained after fusion. A fusion sketch follows.
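  • A minimal fusion sketch in Python: stacking the per-position features into a tensor keeps the grid layout, so the spatial axes carry the feature space relationships and the last axis carries the feature information. Each v_ij is assumed to have the same dimension K.

```python
import numpy as np

def fuse(features):
    """features[i][j] is the K-dimensional visual feature v_ij.

    Returns d with shape (rows, cols, K) such that d[i, j, k] = v_ij(k),
    preserving both the spatial relations and the feature information.
    """
    return np.stack([np.stack(row, axis=0) for row in features], axis=0)
```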
  • Step S106 The model obtained by training the structured model is used to perform operation on the structured feature data to obtain image feature data.
  • In step S105, the visual features of the plurality of structured sub-images have been structurally fused to obtain the structured feature data;
  • next, the model obtained by structured model training is used to operate on the structured feature data to obtain the image feature data.
  • The structured model training is a subsequent step of the multi-model training described in steps S1041 to S1044 above.
  • The structured model training trains on the structured feature data, better fusing the feature information while maintaining the feature space relationships.
  • The structured model training includes: structurally fusing the plurality of sub-training image visual features to obtain training image structured feature data; and
  • performing structured model training on the training image structured feature data using a visual feature learning algorithm to obtain the model.
  • The mathematical expression of the model obtained by structured model training is:
  • v = M(d, q)
  • where M is the model obtained by structured model training based on the fused training image feature data d,
  • q is the model parameter obtained by training, and
  • v is the corresponding visual feature obtained by the model M operating on the training image feature data d.
  • Through the structured model training, the model and model parameters can be determined;
  • by substituting the structured feature data obtained in step S105 into the model, the final image feature data v can be calculated, as sketched below.
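  • As a sketch of the final operation stage in Python, the trained structured model is applied to the fused tensor; the flatten-plus-projection form below is an assumed stand-in for M, whose actual form depends on the chosen learning algorithm.

```python
import numpy as np

def structured_model(d, q):
    """v = M(d, q): map the fused tensor d (rows x cols x K) to image
    feature data, here via an assumed learned projection q."""
    return q @ d.ravel()  # q shape: (output_dim, rows * cols * K)
```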
  • the process of the image feature extraction method provided by the present application is completed in steps S101 to S106.
  • In the image feature extraction method provided by the present application, the spatial position information between the structured sub-images is preserved by constructing the structured sub-images, so the extracted visual features of the structured sub-images include both the feature space relationships and the feature information.
  • When the structural fusion is performed, the descriptiveness of each visual feature is retained and the spatial relationships among the visual features are preserved, so that the finally obtained image feature data is a feature vector, and the feature distance between feature vectors can be used to describe the difference between different images. Because the feature vector and model in this method better preserve the structural characteristics of the image during the training process, the image feature data is more accurate and identifiable. Applying the image feature extraction method provided by the present application to image recognition, especially face recognition, yields higher accuracy and thus a better recognition effect.
  • At this point, the image feature data of the image input by the user has been extracted; the image feature data may then be used to identify the image input by the user, for example to determine whether the image input by the user is similar to a certain image, to determine whether a certain image database contains a picture similar to the image input by the user, or to select pictures similar to the image input by the user from a certain image database.
  • To this end, in a preferred embodiment, the image feature extraction method further includes the steps of: sequentially comparing the image feature data with each predetermined image feature data in a predetermined image database, and outputting the comparison result.
  • The comparison result may be the degree of similarity between the image input by the user and each picture in the predetermined image database, or may be the pictures in the predetermined image database whose degree of similarity reaches a predetermined threshold, together with their information.
  • The predetermined image database may be a criminal face database in a public security pursuit application, an employee face database in an attendance system, a member face database in a member management system, or a star face database in a star face retrieval system.
  • Correspondingly, the comparison result may be whether the image input by the user is a fugitive, whether the image input by the user is a registered employee or member, whether the appearance of the person checking in is consistent with the record in the attendance system, which star the image input by the user most resembles, and the like.
  • The degree of similarity can be characterized by the distance between vectors: the smaller the distance, the higher the degree of similarity. The distance may be, for example, the Euclidean distance, the Cosine distance, or the Joint Bayesian distance.
  • The sequential comparison of the image feature data with each predetermined image feature data in the predetermined image database includes: sequentially calculating the difference between the image feature data and each predetermined image feature data in the predetermined image database.
  • Outputting the comparison result includes: sequentially determining whether each difference is greater than a predetermined difference threshold; if every difference is greater than the threshold, outputting information that there is no similar image; otherwise, outputting the image corresponding to the predetermined image feature data with the smallest difference from the image feature data, and/or that image's information.
  • The algorithm for calculating the difference between the image feature data and each predetermined image feature data in the predetermined image database includes any one of the following:
  • the Euclidean distance calculation method, the Cosine distance calculation method, or the Joint Bayesian distance calculation method. A comparison sketch follows.
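  • A sketch of the comparison step in Python covering the Euclidean and Cosine options; the Joint Bayesian variant requires trained covariance terms and is omitted, and the threshold value is an assumed tuning parameter rather than one given in the text.

```python
import numpy as np

def euclidean_distance(u, v):
    return float(np.linalg.norm(u - v))

def cosine_distance(u, v):
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def compare(feature, database, threshold, distance=euclidean_distance):
    """Return (name, difference) of the most similar database entry, or
    None when every difference exceeds the threshold (the 'no similar
    image' case described above)."""
    diffs = {name: distance(feature, ref) for name, ref in database.items()}
    best = min(diffs, key=diffs.get)
    return None if diffs[best] > threshold else (best, diffs[best])
```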
  • FIG. 7 is a schematic diagram of an embodiment of an image feature extraction apparatus provided by the present application. Since the device embodiment is basically similar to the method embodiment, the description is relatively simple, and the relevant parts can be referred to the description of the method embodiment.
  • the device embodiments described below are merely illustrative.
  • The image feature extraction device includes: an image receiving unit 701, configured to receive an image input by a user; a registration unit 702, configured to register the image input by the user to obtain a registered image; a sub-image construction unit 703, configured to construct a plurality of structured sub-images on the registered image; a visual feature extraction unit 704, configured to extract the visual features of each structured sub-image using the feature models obtained by multi-model training; a fusion unit 705, configured to structurally fuse the visual features of the plurality of structured sub-images to obtain structured feature data; and an operation unit 706, configured to
  • operate on the structured feature data using the model obtained by structured model training to obtain image feature data.
  • the registration unit 702 includes:
  • a reference point determining subunit for determining a structured reference point position of the registered image
  • a shape parameter determining subunit for determining a shape parameter of the sub image
  • a cutting subunit configured to cut the registered image according to the structured reference point position and the shape parameter of the sub image to obtain a plurality of structured sub-images.
  • the reference point determining subunit includes:
  • a feature reference point determining subunit configured to determine a structured reference point position of the registered image according to the image feature point
  • a spatial reference point determining subunit is configured to determine a structured reference point position of the registered image based on the spatial location.
  • the mathematical algorithm used by the cutting subunit is:
  • a_ij = C(a, p_ij(x, y), s_ij)
  • where a_ij denotes the structured sub-image whose structural order is i-th in the horizontal direction and j-th in the vertical direction,
  • C is the construction function of the structured sub-images,
  • a represents the image input by the user,
  • p_ij represents the structured reference point that is i-th in the horizontal direction and j-th in the vertical direction,
  • p_ij(x, y) indicates that the structured reference point p_ij is at coordinates (x, y) of the image input by the user, and
  • s_ij represents the shape parameter of the structured sub-image, covering arbitrary planar shapes such as rectangles, circles, and ellipses, together with their dimensions.
  • the image feature extraction device further includes: a multi-model training unit, configured to obtain a feature model by multi-model training.
  • the multi-model training unit includes:
  • Training image library selection subunit for selecting a predetermined training image library
  • a training image registration sub-unit configured to register each training image in the predetermined training image library according to a unified registration method, to obtain a plurality of the registered training images
  • a sub-training image construction sub-unit configured to respectively construct a plurality of structured sub-training images for the plurality of registered training images
  • the feature model acquisition sub-unit is configured to perform feature model training on the plurality of structured sub-training images using a visual feature learning algorithm to extract the corresponding sub-training image visual features and obtain the feature models.
  • The visual feature learning algorithm adopted by the feature model acquisition subunit includes any one of the following: a deep learning method, a boosting algorithm, an SVM algorithm, or a local feature combination learning algorithm.
  • the merging unit 705 includes:
  • a reference point fusion subunit configured to structurally fuse the visual features of the plurality of structured sub-images according to the determined structured reference point position when constructing the plurality of structured sub-images, to obtain structured feature data
  • the structured feature data includes feature space relationships and feature information.
  • the image feature extraction device further includes:
  • a structured model training unit for training models through structured model training.
  • the structured model training unit includes:
  • a sub-training image fusion sub-unit configured to structurally fuse the plurality of sub-training image visual features to obtain training image structured feature data
  • the model acquisition subunit is configured to perform structured model training on the training image structured feature data using a visual feature learning algorithm, and obtain the model produced by the structured model training.
  • the image feature extraction device further includes:
  • a comparison unit, configured to sequentially compare the image feature data with each predetermined image feature data in a predetermined image database;
  • An output unit for outputting the comparison result.
  • the comparison unit includes:
  • a difference calculation subunit configured to sequentially calculate a difference between the image feature data and each predetermined image feature data in a predetermined image database
  • the output unit includes:
  • a difference determining subunit configured to sequentially determine whether each of the difference values is greater than a predetermined difference threshold
  • an information output subunit, configured to output information that there is no similar image if each of the differences is greater than the predetermined difference threshold; otherwise, to output the image corresponding to the predetermined image feature data with the smallest difference from the image feature data, and/or that image's information.
  • The algorithm used by the comparison unit for calculating the difference between the image feature data and each predetermined image feature data in the predetermined image database includes any one of the following:
  • the Euclidean distance calculation method, the Cosine distance calculation method, or the Joint Bayesian distance calculation method.
  • The application also provides an image feature extraction terminal device, including a memory in which a program implementing the image feature extraction method provided by the present application is stored; after startup, the terminal device can run according to the above method.
  • For example, when the client is a tablet computer, the user takes a photo with the tablet or selects a face photo from the album; the tablet calls the image feature extraction method provided by the application to extract the image feature data of the photo, compares it with the images in a pre-stored star face image database to obtain the star image with the highest similarity to the photo, retrieves the character information of that star, and then displays the star image and character information.
  • the present application also provides an image feature extraction system, including a client and a remote server.
  • the system is provided with the image feature extraction device provided by the application.
  • The client captures an image and/or selects an image from the album and sends it to the remote server; the remote server extracts the image feature data, compares it with the images in a predetermined image database, and sends the comparison result back to the client, which finally outputs the comparison result.
  • For example, when the client is a smart phone, the user takes a photo with the smart phone or selects a face photo from the album and sends it to the remote server; the remote server invokes the image feature extraction method provided by the application to extract the image feature data of the photo, compares it with the images in a pre-stored star face image database, obtains the star image with the highest similarity to the photo, and retrieves the character information of that star; the star image and character information are then sent to the client and finally output on the display screen of the client.
  • The image feature extraction system uses the image feature extraction method described above; for details, refer to the description of the image feature extraction method embodiment, which is not repeated herein.
  • a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • the memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in a computer readable medium, such as read only memory (ROM) or flash memory.
  • Memory is an example of a computer readable medium.
  • Computer readable media, including permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology.
  • the information can be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disk read only memory (CD-ROM), digital versatile disk (DVD) or other optical storage, magnetic tape cartridges, magnetic tape storage or other magnetic storage devices, or any other non-transmission media, which can be used to store information that can be accessed by a computing device.
  • As defined herein, computer readable media does not include transitory computer readable media, such as modulated data signals and carrier waves.
  • Embodiments of the present application can be provided as a method, a system, or a computer program product. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.

Abstract

The present application provides an image feature extraction method. Firstly, an image input by a user is received. Then, the image input by the user is registered to obtain a registered image. Then, a plurality of structured sub-images is constructed for the registered image. Then, visual features of each structured sub-image are extracted by means of feature models obtained by multi-model training. Then, the visual features of the plurality of structured sub-images are structurally fused to obtain structured feature data. Finally, a model obtained by structured model training operates on the structured feature data to obtain image feature data. Compared with the prior art, the present application has the advantage that the obtained image feature data is a feature vector; because the feature vector and the model keep the structured characteristics of the image during the training process, the image feature data has higher accuracy and recognizability, and yields higher accuracy when applied to image recognition, particularly face recognition, thereby obtaining a better recognition effect.

Description

Image feature extraction method, device, terminal device and system

Technical field

The present application relates to the field of electronic technologies, and in particular, to an image feature extraction method, an image feature extraction device, an image feature extraction terminal device, and an image feature extraction system.

Background art

Face recognition research began in the 1990s. Early work proposed the eigenface method, which describes faces with image principal components, and the fisherface method, which describes face images with discriminative features. Since the turn of the century, face local feature description methods based on LBP and Gabor, together with boosting-based discriminative feature learning methods, quickly became mainstream; in recent years, with the introduction of deep learning methods, face recognition technology has been pushed to a new level. At present, representative leading-edge technologies in the field of face recognition include the following:
The first is the American company Facebook, which introduced deep learning to face recognition for the first time. Using a deep neural network built from five convolutional layers and two fully connected layers, it extracts a 4096-dimensional visual feature from the entire face image for description, significantly improving recognition accuracy.

The domestic company Face++ likewise uses deep learning, training a deep neural network hierarchically with a pyramid structure to analyze the entire face image, also achieving a breakthrough in face recognition technology.

The research group of Prof. Tang Xiaoou at the Chinese University of Hong Kong has studied deep-learning-based face recognition in greater depth. They train deep neural networks on multiple face sub-images separately and then concatenate the features output by the sub-networks, obtaining a better recognition effect; however, this simple concatenation of the features extracted from each sub-image loses the structural characteristics of the image itself.
Summary of the invention

In view of the above problems, the present application provides an image feature extraction method, an image feature extraction device, an image feature extraction terminal device, and an image feature extraction system. The technical solution adopted in this application is as follows.

The application provides an image feature extraction method, including:

receiving an image input by a user;

registering the image input by the user to obtain a registered image;

constructing a plurality of structured sub-images for the registered image;

extracting a visual feature of each of the structured sub-images using feature models obtained by multi-model training;

structurally fusing the visual features of the plurality of structured sub-images to obtain structured feature data;

operating on the structured feature data using the model obtained by structured model training to obtain image feature data.
Optionally, the constructing of the plurality of structured sub-images for the registered image includes:

determining structured reference point positions of the registered image;

determining a shape parameter of the sub-images;

cutting the registered image according to the structured reference point positions and the shape parameter of the sub-images to obtain a plurality of structured sub-images.

Optionally, the determining of the structured reference point positions of the registered image includes:

determining the structured reference point positions of the registered image according to image feature points; or,

determining the structured reference point positions of the registered image according to spatial location.

Optionally, the mathematical algorithm for cutting the registered image according to the structured reference point positions and the shape parameter of the sub-images to obtain the plurality of structured sub-images is:

a_ij = C(a, p_ij(x, y), s_ij)

where a_ij denotes the structured sub-image whose structural order is i-th in the horizontal direction and j-th in the vertical direction, C is the construction function of the structured sub-images, a represents the image input by the user, p_ij represents the structured reference point that is i-th in the horizontal direction and j-th in the vertical direction, p_ij(x, y) indicates that the structured reference point p_ij is at coordinates (x, y) of the image input by the user, and s_ij represents the shape parameter of the structured sub-image, covering arbitrary planar shapes such as rectangles, circles, and ellipses, together with their dimensions.
Optionally, the feature models obtained by multi-model training are obtained by the following method:

selecting a predetermined training image library;

registering each training image in the predetermined training image library according to a unified registration method to obtain a plurality of registered training images;

constructing a plurality of structured sub-training images for each of the registered training images;

performing feature model training on the plurality of structured sub-training images using a visual feature learning algorithm to extract the corresponding sub-training image visual features and obtain the feature models.

Optionally, the visual feature learning algorithm includes any one of the following:

a deep learning method, a boosting algorithm, an SVM algorithm, or a local feature combination learning algorithm.

Optionally, the mathematical expression of the feature model is:

v_ij = M_ij(a_ij, q_ij)

where a_ij denotes the sub-training image whose structural order is i-th in the horizontal direction and j-th in the vertical direction, M_ij is the feature model trained on the corresponding sub-training image a_ij, q_ij is the feature model parameter obtained by training, and v_ij is the sub-training image visual feature extracted from the sub-training image a_ij by the feature model M_ij.
Optionally, the structural fusion of the visual features of the plurality of structured sub-images to obtain the structured feature data includes:

structurally fusing the visual features of the plurality of structured sub-images according to the structured reference point positions determined when constructing the structured sub-images to obtain the structured feature data, which includes both feature space relationships and feature information.

Optionally, the mathematical expression of the structured feature data is:

d(i, j, k) = v_ij(k)

where v_ij represents the visual feature of a structured sub-image, k indexes the k-th dimension of the data, and d is the structured feature data after fusion.

Optionally, the model obtained by structured model training is obtained in the following manner:

structurally fusing the plurality of sub-training image visual features to obtain training image structured feature data;

performing structured model training on the training image structured feature data using a visual feature learning algorithm to obtain the model.

Optionally, the mathematical expression of the model obtained by structured model training is:

v = M(d, q)

where M is the model obtained by structured model training based on the fused training image feature data d, q is the model parameter obtained by training, and v is the corresponding visual feature obtained by the model M operating on the training image feature data d.
Optionally, the image feature extraction method further includes:
sequentially comparing the image feature data with each predetermined image feature data in a predetermined image database;
outputting the comparison result.
Optionally, sequentially comparing the image feature data with each predetermined image feature data in the predetermined image database includes:
sequentially calculating the difference between the image feature data and each predetermined image feature data in the predetermined image database;
and outputting the comparison result includes:
determining in turn whether each difference is greater than a predetermined difference threshold;
if every difference is greater than the predetermined difference threshold, outputting information that no similar image exists; otherwise, outputting the image corresponding to the predetermined image feature data with the smallest difference from the image feature data, and/or the information of that image.
Optionally, the algorithm for calculating the difference between the image feature data and each predetermined image feature data in the predetermined image database includes any one of the following:
a Euclidean distance calculation method, a Cosine distance calculation method, or a Joint Bayesian distance calculation method.
Optionally, the image includes: a face image.
The present application further provides an image feature extraction apparatus, including:
an image receiving unit, configured to receive an image input by a user;
a registration unit, configured to register the image input by the user to obtain a registered image;
a sub-image construction unit, configured to construct a plurality of structured sub-images from the registered image;
a visual feature extraction unit, configured to extract the visual feature of each structured sub-image using feature models obtained by multi-model training;
a fusion unit, configured to structurally fuse the visual features of the plurality of structured sub-images to obtain structured feature data;
an operation unit, configured to operate on the structured feature data using the model obtained by structured model training to obtain image feature data.
Optionally, the sub-image construction unit includes:
a reference point determination subunit, configured to determine the structured reference point positions of the registered image;
a shape parameter determination subunit, configured to determine the shape parameters of the sub-images;
a cutting subunit, configured to cut the registered image according to the structured reference point positions and the shape parameters of the sub-images to obtain a plurality of structured sub-images.
Optionally, the reference point determination subunit includes:
a feature reference point determination subunit, configured to determine the structured reference point positions of the registered image according to image feature points; or
a spatial reference point determination subunit, configured to determine the structured reference point positions of the registered image according to spatial positions.
Optionally, the mathematical algorithm used by the cutting subunit is:
a_ij = C(a, p_ij(x, y), s_ij)
where a_ij denotes the structured sub-image located at the i-th position horizontally and the j-th position vertically in the structural order, C is the construction function of the structured sub-images, a denotes the image input by the user, p_ij denotes the structured reference point located at the i-th position horizontally and the j-th position vertically, p_ij(x, y) indicates that the structured reference point p_ij is at coordinates (x, y) of the image input by the user, and s_ij denotes the shape parameters of the structured sub-image, including any planar shape such as a rectangle, circle, or ellipse, together with its dimensions.
Optionally, the image feature extraction apparatus further includes:
a multi-model training unit, configured to obtain the feature models through multi-model training;
the multi-model training unit includes:
a training image library selection subunit, configured to select a predetermined training image library;
a training image registration subunit, configured to register each training image in the predetermined training image library according to a unified registration method to obtain a plurality of registered training images;
a sub-training image construction subunit, configured to construct a plurality of structured sub-training images for each of the registered training images;
a feature model acquisition subunit, configured to perform feature model training on the plurality of structured sub-training images using a visual feature learning algorithm to extract the corresponding visual features of the sub-training images and obtain the feature models.
Optionally, the visual feature learning algorithm used by the feature model acquisition subunit includes any one of the following:
a deep learning method, a boosting algorithm, an SVM algorithm, or a learning algorithm based on combinations of local features.
Optionally, the fusion unit includes:
a reference point fusion subunit, configured to structurally fuse the visual features of the plurality of structured sub-images according to the structured reference point positions determined when the plurality of structured sub-images were constructed, to obtain structured feature data, where the structured feature data includes both feature spatial relationships and feature information.
Optionally, the image feature extraction apparatus further includes:
a structured model training unit, configured to obtain the model through structured model training;
the structured model training unit includes:
a sub-training image fusion subunit, configured to structurally fuse the visual features of the plurality of sub-training images to obtain structured feature data of the training images;
a model acquisition subunit, configured to perform structured model training on the structured feature data of the training images using a visual feature learning algorithm to obtain the model.
Optionally, the image feature extraction apparatus further includes:
a comparison unit, configured to sequentially compare the image feature data with each predetermined image feature data in a predetermined image database;
an output unit, configured to output the comparison result.
Optionally, the comparison unit includes:
a difference calculation subunit, configured to sequentially calculate the difference between the image feature data and each predetermined image feature data in the predetermined image database;
and the output unit includes:
a difference determination subunit, configured to determine in turn whether each difference is greater than a predetermined difference threshold;
an information output unit, configured to output information that no similar image exists if every difference is greater than the predetermined difference threshold, and otherwise to output the image corresponding to the predetermined image feature data with the smallest difference from the image feature data, and/or the information of that image.
Optionally, the algorithm used by the comparison unit to calculate the difference between the image feature data and each predetermined image feature data in the predetermined image database includes any one of the following:
a Euclidean distance calculation method, a Cosine distance calculation method, or a Joint Bayesian distance calculation method.
The present application further provides an image feature extraction terminal device, including:
a central processing unit;
an input/output unit;
a memory, where the memory stores the image feature extraction method provided by the present application, and after startup the device can run according to the above method.
The present application further provides an image feature extraction system, including a client and a remote server, using the image feature extraction apparatus provided by the present application. The client captures an image and/or selects an image from an album and sends it to the remote server; the remote server extracts the image feature data, compares it with the images in a predetermined image database, and sends the comparison result to the client; finally, the client outputs the comparison result.
Compared with the prior art, the present application has the following advantages:
The image feature extraction method provided by the present application first receives an image input by a user; then registers the image input by the user to obtain a registered image; then constructs a plurality of structured sub-images from the registered image; next, extracts the visual feature of each structured sub-image using feature models obtained by multi-model training; then structurally fuses the visual features of the plurality of structured sub-images to obtain structured feature data; and finally operates on the structured feature data using a model obtained by structured model training to obtain image feature data. Compared with prior-art image feature extraction methods, in the present application the spatial position information between the structured sub-images is preserved by constructing the structured sub-images, so the extracted visual features of the structured sub-images include both feature spatial relationships and feature information. The structured fusion preserves both the descriptiveness of each visual feature and the spatial relationships among the visual features, so the finally obtained image feature data is a feature vector, and the feature distance between feature vectors can be used to describe the difference between different images. Moreover, because the feature vectors and the models in this method better preserve the structural characteristics of the image during training, the image feature data has higher accuracy and distinguishability. Applying the image feature extraction method provided by the present application to image recognition, and especially to face recognition, yields higher accuracy and thus a better recognition effect.
DRAWINGS
FIG. 1 is a flowchart of an embodiment of an image feature extraction method provided by the present application;
FIG. 2 is a flowchart of constructing a plurality of structured sub-images in an embodiment of the image feature extraction method provided by the present application;
FIG. 3 is an example diagram, provided by the present application, of determining structured reference points according to spatial position relationships;
FIG. 4 is an example diagram, provided by the present application, of determining structured reference points according to face feature points;
FIG. 5 is a flowchart of multi-model training in an embodiment of the image feature extraction method provided by the present application;
FIG. 6 is a schematic diagram of structured feature fusion provided by the present application;
FIG. 7 is a schematic diagram of an embodiment of an image feature extraction apparatus provided by the present application.
DETAILED DESCRIPTION
Numerous specific details are set forth in the following description in order to provide a thorough understanding of the present application. However, the present application can be implemented in many ways other than those described herein, and those skilled in the art can make similar generalizations without departing from the spirit of the present application; therefore, the present application is not limited by the specific embodiments disclosed below.
The present application provides an image feature extraction method, an image feature extraction apparatus, an image feature extraction terminal device, and an image feature extraction system. Embodiments of the present application are described in detail below with reference to the accompanying drawings.
Please refer to FIG. 1, which is a flowchart of an embodiment of the image feature extraction method provided by the present application. The image feature extraction method includes the following steps:
Step S101: Receive an image input by a user.
In this step, an image input by the user is first received. The user may select an image from an electronic album on the terminal device, or capture an image with a camera and input it. It should be noted that the purpose of the present application is image recognition, so a static image is preferred as the user input. However, to improve the general applicability of the method, in one embodiment of the present application a dynamic image input by the user may be received and preprocessed, extracting only a specific frame of the dynamic image (such as the first frame) as the image input by the user. All of the above fall within the protection scope of the present application.
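As a minimal sketch of such frame-extraction preprocessing (assuming OpenCV is available; the file path is a hypothetical input, not something prescribed by this application), the first frame of a dynamic image may be obtained as follows:

import cv2  # OpenCV, assumed available

def first_frame(path):
    """Extract the first frame of a dynamic image (video/animation) as a static image."""
    capture = cv2.VideoCapture(path)  # 'path' is a hypothetical input file
    ok, frame = capture.read()        # reads frame 0
    capture.release()
    if not ok:
        raise ValueError("could not decode a frame from " + path)
    return frame  # BGR array, usable as the user-input image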
In a preferred embodiment of the present application, this image feature extraction method is used for face image recognition; therefore, the image includes a face image.
Step S102: Register the image input by the user to obtain a registered image.
Through step S101, the image input by the user has been received. Next, the image input by the user needs to be registered. A registration method commonly used in the prior art is to first detect image feature points, then perform an affine transformation of the image according to the feature points, and normalize the image to a predetermined size and scale to obtain a registered image for recognition and comparison.
In a preferred embodiment of the present application, this image feature extraction method is used for face image recognition and the image is a face image. During registration, the feature points of the face image, such as the positions of the eyes, mouth, and nose, are first detected; then an affine transformation of the image is performed according to the feature points, normalizing it to a predetermined size and scale. In this way, the images to be compared with the face image are also registered so that they match the size and scale of the face image, allowing the comparison to be performed under the same standard and thus improving its accuracy.
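A minimal sketch of such landmark-based registration (assuming NumPy and OpenCV; the landmark template coordinates and the 128x128 output size are illustrative assumptions, not values prescribed by this application):

import numpy as np
import cv2

# Hypothetical template: where the eyes and mouth should land in a 128x128 crop.
TEMPLATE = np.float32([[40, 50], [88, 50], [64, 100]])  # left eye, right eye, mouth

def register(image, landmarks):
    """Affine-warp 'image' so its detected landmarks align with TEMPLATE."""
    src = np.float32(landmarks)                  # detected (x, y) feature points
    m = cv2.getAffineTransform(src, TEMPLATE)    # affine estimated from 3 point pairs
    return cv2.warpAffine(image, m, (128, 128))  # normalized size and scale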
Step S103: Construct a plurality of structured sub-images from the registered image.
Through step S102, the registered image has been obtained by registering the image input by the user. Next, a plurality of structured sub-images need to be constructed from the registered image. Please refer to FIG. 2, which is a flowchart of constructing a plurality of structured sub-images in an embodiment of the image feature extraction method provided by the present application. Constructing a plurality of structured sub-images from the registered image may be performed through the following sub-steps:
Step S1031: Determine the structured reference point positions of the registered image.
Constructing a plurality of structured sub-images means segmenting a plurality of sub-images from the image according to a certain structure, position, and constraints. First, the structured reference point positions of the registered image must be determined in order to fix the cutting positions of the structured sub-images.
In an embodiment provided by the present application, the structured reference points are used as the center points for cutting the structured sub-images. In order to preserve the structural characteristics of the image and facilitate subsequent computation, a set of reference points in a roughly rectangular arrangement, whose up-down and left-right relationships remain essentially unchanged, is generally selected.
There are several ways to determine the structured reference points: the structured reference point positions of the registered image may be determined according to spatial positions, or according to image feature points.
Still taking the above preferred embodiment of the face image as an example, FIG. 3 shows a set of 4x4 structured reference points determined according to spatial position relationships; the distances between them are completely fixed with respect to the image. FIG. 4 shows 3x3 structured reference points determined according to face feature points. The nine structured reference points in the figure, from top to bottom and left to right, are: the right eye center, the point between the eyes, the left eye center, the right cheek point, the nose tip, the left cheek point, the right mouth corner, the lip center, and the left mouth corner. The positional relationships of these nine structured reference points vary slightly for different people, poses, and expressions, but still satisfy an approximately rectangular structural relationship.
The above is illustrated with face images only as an example. For different categories of images, the method of determining the structured reference points may be selected according to the main content of the image during implementation, and the number of structured reference points is not limited to the above 4x4 and 3x3 cases; it may be flexibly determined according to the actual situation, which will not be described in detail here. All such variations fall within the protection scope of the present application.
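As a minimal sketch of the spatial-position variant (a uniform grid; the 4x4 layout and the 15% margin are illustrative assumptions):

import numpy as np

def grid_reference_points(width, height, rows=4, cols=4, margin=0.15):
    """Return a rows x cols grid of (x, y) structured reference points, inset by 'margin'."""
    xs = np.linspace(margin * width, (1 - margin) * width, cols)
    ys = np.linspace(margin * height, (1 - margin) * height, rows)
    return [[(x, y) for x in xs] for y in ys]  # fixed with respect to the image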
Step S1032: Determine the shape parameters of the sub-images.
Through step S1031, the structured reference point positions of the registered image have been determined. Next, the shape parameters of the sub-images need to be determined; that is, a sub-image region is determined around each structured reference point position, taken as a reference, at a certain scale and size. The shape parameters include the shape of the sub-image, such as any planar shape like a rectangle, circle, or ellipse, as well as the dimensions of the sub-image, such as the length and width of a rectangle or the radius of a circle.
Still taking the above preferred embodiment of the face image as an example, FIG. 3 shows two rectangular sub-image regions of different sizes, centered respectively on the upper-left and lower-right structured reference points.
Step S1033: Cut the registered image according to the structured reference point positions and the shape parameters of the sub-images to obtain a plurality of structured sub-images.
Through steps S1031 and S1032, the structured reference point positions and the shape parameters of the sub-images have been determined. Next, the registered image needs to be cut according to the structured reference point positions and the shape parameters of the sub-images, thereby extracting a plurality of structured sub-images; at the same time, the positional relationships of the structured reference points are recorded and stored as structural information.
Still taking the above preferred embodiment of the face image as an example, the mathematical algorithm for the structured sub-images may be:
a_ij = C(a, p_ij(x, y), s_ij)
where a_ij denotes the structured sub-image located at the i-th position horizontally and the j-th position vertically in the structural order, C is the construction function of the structured sub-images, a denotes the image input by the user, p_ij denotes the structured reference point located at the i-th position horizontally and the j-th position vertically, p_ij(x, y) indicates that the structured reference point p_ij is at coordinates (x, y) of the image input by the user, and s_ij denotes the shape parameters of the structured sub-image, including any planar shape such as a rectangle, circle, or ellipse, together with its dimensions.
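A minimal sketch of one possible construction function C for rectangular sub-images (assuming the registered image is a NumPy array; the 32-pixel patch size is an illustrative assumption):

import numpy as np

def crop(image, point, size):
    """C(a, p_ij, s_ij) for a square s_ij: cut a size x size patch centered on p_ij."""
    x, y = int(point[0]), int(point[1])
    half = size // 2
    h, w = image.shape[:2]
    x0, y0 = max(x - half, 0), max(y - half, 0)  # clamp so the patch stays inside
    x1, y1 = min(x + half, w), min(y + half, h)
    return image[y0:y1, x0:x1]

# Keeping the row/column order of the reference points preserves the structural information:
# sub_images = [[crop(registered, p, 32) for p in row] for row in points]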
Step S104: Extract the visual feature of each structured sub-image using the feature models obtained by multi-model training.
Through step S103, a plurality of structured sub-images have been constructed from the registered image. Next, the feature models obtained by multi-model training are used to extract the visual feature of each structured sub-image. A feature model is a mathematical expression, obtained through multi-model training, used to extract the visual features of an image; its input is a whole or partial image, and its output is the corresponding visual feature. A visual feature is a mathematical expression, distilled from an image, that can describe characteristics such as the overall or local shape, texture, and color of the image, and is generally represented in vector form. Multi-model training is the process of estimating the feature model parameters; the estimation is generally completed over a large batch of images according to some criterion.
In an embodiment provided by the present application, please refer to FIG. 5, which is a flowchart of multi-model training in an embodiment of the image feature extraction method provided by the present application. The feature models obtained by multi-model training are obtained through the following sub-steps:
Step S1041: Select a predetermined training image library.
In this step, a predetermined training image library is first selected. The predetermined training image library is a collection of training images whose subject matter is consistent with that of the image input by the user. Taking the above preferred embodiment of the face image as an example, if the image input by the user is a face image, the predetermined training image library selected is a face training image library. The face training image library may use a representative public face database in the industry, such as LFW or CASIA_WebFace, or a face database organized by oneself according to a unified standard.
Step S1042: Register each training image in the predetermined training image library according to a unified registration method to obtain a plurality of registered training images.
Through step S1041, the predetermined training image library has been selected. Next, in order to ensure that the feature models obtained by multi-model training are applicable to the image input by the user, all training images in the predetermined training image library need to be registered using a registration method consistent with the one described in step S102. For details, please refer to the description of step S102 above, which will not be repeated here; all of this falls within the protection scope of the present application.
Step S1043: Construct a plurality of structured sub-training images for each of the registered training images.
Through step S1042, each training image in the predetermined training image library has been registered according to the unified registration method, and a plurality of registered training images have been obtained. Next, a plurality of structured sub-training images need to be constructed for each of the registered training images. For the specific implementation, please refer to the description of step S103 above, which will not be repeated here; all of this falls within the protection scope of the present application.
Step S1044: Perform feature model training on the plurality of structured sub-training images using a visual feature learning algorithm to extract the corresponding visual features of the sub-training images and obtain the feature models.
Through step S1043, a plurality of structured sub-training images have been constructed for each of the registered training images. Next, a visual feature learning algorithm is used to perform feature model training on the plurality of structured sub-training images to extract the corresponding visual features of the sub-training images and obtain the feature models. In this step, multi-model training is performed separately on each structured sub-training image, so that the most representative visual features are extracted for each structured sub-training image.
The visual feature learning algorithm includes any one of the following: a deep learning method, a boosting algorithm, an SVM algorithm, or a learning algorithm based on combinations of local features. All of the above are mature learning algorithms in the prior art and will not be described in detail here; they all fall within the protection scope of the present application.
In an embodiment provided by the present application, the mathematical expression of the feature model is:
v_ij = M_ij(a_ij, q_ij)
where a_ij denotes the sub-training image located at the i-th position horizontally and the j-th position vertically in the structural order, M_ij is the feature model trained on the corresponding sub-training image a_ij, q_ij is the feature model parameter obtained by training, and v_ij is the visual feature of the sub-training image extracted from a_ij by the feature model M_ij.
Through steps S1041 to S1044, the multi-model training is completed, and the feature models and feature model parameters are determined. Next, the plurality of structured sub-images are substituted into the above feature models, and the visual feature of each structured sub-image can be computed.
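A minimal sketch of applying one trained feature model per grid position (a small convolutional network stands in for each M_ij; PyTorch and the layer sizes are assumptions, since this application does not prescribe a particular learning framework):

import torch
import torch.nn as nn

def make_patch_model(feat_dim=64):
    """One M_ij: maps a 3 x 32 x 32 sub-image tensor to a feat_dim-dimensional visual feature."""
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        nn.Linear(16 * 4 * 4, feat_dim),
    )

# models[i][j] is M_ij, each carrying its own trained parameters q_ij.
models = [[make_patch_model() for _ in range(3)] for _ in range(3)]

def extract_features(patches):
    """v_ij = M_ij(a_ij, q_ij): patches[i][j] is a 3 x 32 x 32 float tensor."""
    return [[models[i][j](patches[i][j].unsqueeze(0)).squeeze(0)
             for j in range(3)] for i in range(3)]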
Step S105: Structurally fuse the visual features of the plurality of structured sub-images to obtain structured feature data.
Through step S104, the visual feature of each structured sub-image has been extracted using the feature models obtained by multi-model training. Next, the visual features of the plurality of structured sub-images are structurally fused to obtain the structured feature data.
In an embodiment provided by the present application, structurally fusing the visual features of the plurality of structured sub-images to obtain the structured feature data includes:
structurally fusing the visual features of the plurality of structured sub-images according to the structured reference point positions determined when the plurality of structured sub-images were constructed, to obtain the structured feature data, where the structured feature data includes both feature spatial relationships and feature information.
Still taking the above preferred embodiment of the face image as an example, the visual features of the structured sub-images are spatially fused in a structured manner according to the structured reference point positions determined in step S103. In this way, the spatial plane reflects the spatial relationships of the visual features of the structured sub-images based on the structured reference point positions, while the feature axis of the visual features of the structured sub-images reflects the feature information of each structured sub-image, its length representing the feature dimension. Please refer to FIG. 6, which is a schematic diagram of structured feature fusion provided by the present application. The feature value image 602 at the position of the structured reference point 601 passes through the corresponding feature model to extract a feature vector 603, and the feature vectors 603 are structurally fused to obtain the structured feature data 604. Since the process of structured fusion preserves the spatial positional relationship of the structured reference point 601 relative to the other structured reference points, the structured feature data 604 also contains the feature spatial relationships and feature information.
In an embodiment provided by the present application, the mathematical expression of the structured feature data is:
d(i, j, k) = v_ij(k)
where v_ij denotes the visual feature of a structured sub-image, k indexes the k-th dimension of the feature, and d is the structured feature data obtained after fusion.
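A minimal sketch of this fusion (stacking the per-position feature vectors into a three-dimensional array d whose first two axes keep the grid's spatial layout; NumPy is an assumption):

import numpy as np

def fuse(features):
    """d(i, j, k) = v_ij(k): stack per-position features into a rows x cols x K array."""
    return np.stack([np.stack([np.asarray(v) for v in row]) for row in features])

# Example: a 3 x 3 grid of 64-dimensional features fuses into d of shape (3, 3, 64),
# preserving the spatial relationships (axes i, j) and the feature information (axis k).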
Step S106: Operate on the structured feature data using the model obtained by structured model training to obtain image feature data.
Through step S105, the visual features of the plurality of structured sub-images have been structurally fused to obtain the structured feature data. Next, the model obtained by structured model training is used to operate on the structured feature data to obtain the image feature data.
The structured model training is a subsequent step of the multi-model training described in steps S1041 to S1044 above. For related details, please refer to the description of steps S1041 to S1044, which will not be repeated here. The structured model training is described below.
The structured model training trains on the structured feature data, fusing the feature information more effectively while preserving the feature spatial relationships. In an embodiment provided by the present application, the structured model training includes:
structurally fusing the visual features of the plurality of sub-training images to obtain structured feature data of the training images;
performing structured model training on the structured feature data of the training images using a visual feature learning algorithm to obtain the model.
In an embodiment provided by the present application, the mathematical expression of the model obtained by structured model training is:
v = M(d, q)
where M is the model obtained by structured model training based on the fused training image feature data d, q is the model parameter obtained by training, and v is the corresponding visual feature obtained by applying the model M to the training image feature data d.
Through the above structured model training, the model and the model parameters can be determined. Next, the structured feature data is substituted for d in the above model, and the final image feature data v can be computed.
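A minimal sketch of applying a trained structured model M to the fused data d (a two-dimensional convolution over the grid axes is one plausible choice that fuses feature information while keeping the spatial structure; PyTorch and the layer sizes are assumptions):

import torch
import torch.nn as nn

class StructuredModel(nn.Module):
    """M(d, q): fuse a rows x cols x K array into one image-level feature vector v."""
    def __init__(self, k=64, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(k, 128, kernel_size=2),  # mixes neighboring grid cells
            nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(out_dim),            # the final image feature data v
        )

    def forward(self, d):
        # d: (rows, cols, K) -> (1, K, rows, cols), so the convolution runs over the grid.
        x = torch.as_tensor(d, dtype=torch.float32).permute(2, 0, 1).unsqueeze(0)
        return self.net(x).squeeze(0)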
At this point, through steps S101 to S106, the flow of the embodiment of the image feature extraction method provided by the present application is completed. In the present application, the spatial position information between the structured sub-images is preserved by constructing the structured sub-images, so the extracted visual features of the structured sub-images include both feature spatial relationships and feature information. The structured fusion preserves both the descriptiveness of each visual feature and the spatial relationships among the visual features, so the finally obtained image feature data is a feature vector, and the feature distance between feature vectors can be used to describe the difference between different images. Moreover, because the feature vectors and the models in this method better preserve the structural characteristics of the image during training, the image feature data has higher accuracy and distinguishability. Applying the image feature extraction method provided by the present application to image recognition, and especially to face recognition, yields higher accuracy and thus a better recognition effect.
Through the above steps, the image feature data of the image input by the user has been extracted. Next, the image feature data can be used to recognize the image input by the user, for example to judge how similar the image input by the user is to a certain image, to judge whether an image database contains a picture similar to the image input by the user, or to filter out pictures similar to the image input by the user from an image database. In an embodiment provided by the present application, the image feature extraction method further includes the steps of:
sequentially comparing the image feature data with each predetermined image feature data in a predetermined image database;
outputting the comparison result.
The comparison result may be the degree of similarity between the image input by the user and each picture in the predetermined image database, or may be the pictures, together with their information, in the predetermined image database whose degree of similarity reaches a predetermined threshold, and so on. In practical applications, the predetermined image database may be a criminal face database in a public-security fugitive-tracking application, an employee face database in an attendance system, a member face database in a membership management system, a celebrity face database in a celebrity look-alike retrieval system, and so on. The comparison result may accordingly be whether the image input by the user is of a fugitive, whether the image input by the user is of a registered employee or member, whether the appearance of the person clocking in is consistent with the record in the attendance system, which celebrity the image input by the user most resembles, and so on.
Considering that the image feature data is a vector, the degree of similarity can be characterized by the distance between vectors: the smaller the distance, the higher the degree of similarity. Examples include the Euclidean distance, the Cosine distance, and the Joint Bayesian distance.
In an embodiment provided by the present application, sequentially comparing the image feature data with each predetermined image feature data in the predetermined image database includes:
sequentially calculating the difference between the image feature data and each predetermined image feature data in the predetermined image database;
and outputting the comparison result includes:
determining in turn whether each calculated difference is greater than a predetermined difference threshold;
if every calculated difference is greater than the predetermined difference threshold, outputting information that no similar image exists; otherwise, outputting the image corresponding to the predetermined image feature data with the smallest difference from the image feature data, and/or the information of that image.
The algorithm for calculating the difference between the image feature data and each predetermined image feature data in the predetermined image database includes any one of the following:
a Euclidean distance calculation method, a Cosine distance calculation method, or a Joint Bayesian distance calculation method.
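A minimal sketch of this comparison using the Euclidean distance (NumPy is assumed; the threshold value is an illustrative assumption):

import numpy as np

def compare(query, database, threshold=1.0):
    """Return the index of the most similar database entry, or None when every
    feature distance exceeds the predetermined difference threshold."""
    distances = [float(np.linalg.norm(query - feat)) for feat in database]
    best = int(np.argmin(distances))
    return None if distances[best] > threshold else best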
The above is an embodiment of the image feature extraction method provided by the present application. Correspondingly, the present application further provides an image feature extraction apparatus. Please refer to FIG. 7, which is a schematic diagram of an embodiment of the image feature extraction apparatus provided by the present application. Since the apparatus embodiment is basically similar to the method embodiment, it is described relatively simply; for related details, please refer to the corresponding description of the method embodiment. The apparatus embodiment described below is merely illustrative.
In the embodiment of the image feature extraction apparatus provided by the present application, the image feature extraction apparatus includes: an image receiving unit 701, configured to receive an image input by a user; a registration unit 702, configured to register the image input by the user to obtain a registered image; a sub-image construction unit 703, configured to construct a plurality of structured sub-images from the registered image; a visual feature extraction unit 704, configured to extract the visual feature of each structured sub-image using feature models obtained by multi-model training; a fusion unit 705, configured to structurally fuse the visual features of the plurality of structured sub-images to obtain structured feature data; and an operation unit 706, configured to operate on the structured feature data using the model obtained by structured model training to obtain image feature data.
Optionally, the sub-image construction unit 703 includes:
a reference point determination subunit, configured to determine the structured reference point positions of the registered image;
a shape parameter determination subunit, configured to determine the shape parameters of the sub-images;
a cutting subunit, configured to cut the registered image according to the structured reference point positions and the shape parameters of the sub-images to obtain a plurality of structured sub-images.
Optionally, the reference point determination subunit includes:
a feature reference point determination subunit, configured to determine the structured reference point positions of the registered image according to image feature points; or
a spatial reference point determination subunit, configured to determine the structured reference point positions of the registered image according to spatial positions.
Optionally, the mathematical algorithm used by the cutting subunit is:
a_ij = C(a, p_ij(x, y), s_ij)
where a_ij denotes the structured sub-image located at the i-th position horizontally and the j-th position vertically in the structural order, C is the construction function of the structured sub-images, a denotes the image input by the user, p_ij denotes the structured reference point located at the i-th position horizontally and the j-th position vertically, p_ij(x, y) indicates that the structured reference point p_ij is at coordinates (x, y) of the image input by the user, and s_ij denotes the shape parameters of the structured sub-image, including any planar shape such as a rectangle, circle, or ellipse, together with its dimensions.
Optionally, the image feature extraction apparatus further includes: a multi-model training unit, configured to obtain the feature models through multi-model training.
Optionally, the multi-model training unit includes:
a training image library selection subunit, configured to select a predetermined training image library;
a training image registration subunit, configured to register each training image in the predetermined training image library according to a unified registration method to obtain a plurality of registered training images;
a sub-training image construction subunit, configured to construct a plurality of structured sub-training images for each of the registered training images;
a feature model acquisition subunit, configured to perform feature model training on the plurality of structured sub-training images using a visual feature learning algorithm to extract the corresponding visual features of the sub-training images and obtain the feature models.
Optionally, the visual feature learning algorithm used by the feature model acquisition subunit includes any one of the following:
a deep learning method, a boosting algorithm, an SVM algorithm, or a learning algorithm based on combinations of local features.
Optionally, the fusion unit 705 includes:
a reference point fusion subunit, configured to structurally fuse the visual features of the plurality of structured sub-images according to the structured reference point positions determined when the plurality of structured sub-images were constructed, to obtain structured feature data, where the structured feature data includes both feature spatial relationships and feature information.
Optionally, the image feature extraction apparatus further includes:
a structured model training unit, configured to obtain the model through structured model training.
Optionally, the structured model training unit includes:
a sub-training image fusion subunit, configured to structurally fuse the visual features of the plurality of sub-training images to obtain structured feature data of the training images;
a model acquisition subunit, configured to perform structured model training on the structured feature data of the training images using a visual feature learning algorithm to obtain the model.
Optionally, the image feature extraction apparatus further includes:
a comparison unit, configured to sequentially compare the image feature data with each predetermined image feature data in a predetermined image database;
an output unit, configured to output the comparison result.
Optionally, the comparison unit includes:
a difference calculation subunit, configured to sequentially calculate the difference between the image feature data and each predetermined image feature data in the predetermined image database;
and the output unit includes:
a difference determination subunit, configured to determine in turn whether each difference is greater than a predetermined difference threshold;
an information output unit, configured to output information that no similar image exists if every difference is greater than the predetermined difference threshold, and otherwise to output the image corresponding to the predetermined image feature data with the smallest difference from the image feature data, and/or the information of that image.
Optionally, the algorithm used by the comparison unit to calculate the difference between the image feature data and each predetermined image feature data in the predetermined image database includes any one of the following:
a Euclidean distance calculation method, a Cosine distance calculation method, or a Joint Bayesian distance calculation method.
The above is an embodiment of the image feature extraction apparatus provided by the present application.
The present application further provides an image feature extraction terminal device, including:
a central processing unit;
an input/output unit;
a memory, where the memory stores the image feature extraction method provided by the present application, and after startup the device can run according to the above method.
For example, the terminal device is a tablet computer. The user takes a selfie with the tablet or selects a face photo from an album; the tablet then invokes the image feature extraction method provided by the present application to extract the image feature data of the photo, compares it with the images in a pre-stored celebrity face image database to obtain the celebrity image most similar to the photo, retrieves the celebrity's personal information, and then outputs the celebrity image and personal information on the display screen.
Since this terminal device uses the above image feature extraction method, for related details please refer to the description of the embodiment of the image feature extraction method above, which will not be repeated here.
The present application further provides an image feature extraction system, including a client and a remote server, in which the image feature extraction apparatus provided by the present application is deployed. During operation, the client captures an image and/or selects an image from an album and sends it to the remote server; the remote server extracts the image feature data, compares it with the images in a predetermined image database, and sends the comparison result to the client; finally, the client outputs the comparison result.
For example, the client is a smartphone. The user takes a selfie with the smartphone or selects a face photo from an album and sends it to the remote server; the remote server invokes the image feature extraction method provided by the present application to extract the image feature data of the photo, compares it with the images in a pre-stored celebrity face image database to obtain the celebrity image most similar to the photo, retrieves the celebrity's personal information, and then sends the celebrity image and personal information to the client, which finally outputs them on its display screen.
Since this image feature extraction system uses the above image feature extraction method, for related details please refer to the description of the embodiment of the image feature extraction method above, which will not be repeated here.
Although the present application is disclosed above with preferred embodiments, they are not intended to limit the present application. Any person skilled in the art can make possible changes and modifications without departing from the spirit and scope of the present application; therefore, the protection scope of the present application shall be subject to the scope defined by the claims of the present application.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms in computer-readable media such as non-persistent memory, random access memory (RAM), and/or non-volatile memory, for example read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
1. Computer-readable media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission media that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
2. Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.

Claims (28)

  1. An image feature extraction method, comprising:
    receiving an image input by a user;
    registering the image input by the user to obtain a registered image;
    constructing a plurality of structured sub-images from the registered image;
    extracting a visual feature of each of the structured sub-images by using feature models obtained through multi-model training;
    structurally fusing the visual features of the plurality of structured sub-images to obtain structured feature data; and
    performing an operation on the structured feature data by using a model obtained through structured model training, to obtain image feature data.
  2. The image feature extraction method according to claim 1, wherein constructing a plurality of structured sub-images from the registered image comprises:
    determining structured reference point positions of the registered image;
    determining shape parameters of the sub-images; and
    cutting the registered image according to the structured reference point positions and the shape parameters of the sub-images, to obtain the plurality of structured sub-images.
  3. The image feature extraction method according to claim 2, wherein determining the structured reference point positions of the registered image comprises:
    determining the structured reference point positions of the registered image according to image feature points; or
    determining the structured reference point positions of the registered image according to spatial positions.
  4. The image feature extraction method according to claim 2, wherein the mathematical algorithm for cutting the registered image according to the structured reference point positions and the shape parameters of the sub-images to obtain the plurality of structured sub-images is:
    a_ij = C(a, p_ij(x, y), s_ij)
    where a_ij denotes the structured sub-image at horizontal position i and vertical position j of the structure; C is the construction function of the structured sub-images; a denotes the image input by the user; p_ij denotes the structured reference point at horizontal position i and vertical position j, with p_ij(x, y) indicating that the structured reference point p_ij lies at coordinates (x, y) of the image input by the user; and s_ij denotes the shape parameters of the sub-image, covering any planar shape, such as a rectangle, circle, or ellipse, together with its dimensions.
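Purely as an illustration, the construction function C of this claim can be sketched in Python as a rectangular crop around each structured reference point. The function names are hypothetical, and the fixed 32x32 rectangle is one possible choice of the shape parameter s_ij; circular or elliptical shapes would mask the patch instead.

```python
import numpy as np

def crop_subimage(image, point, size):
    """A minimal construction function C: cut a rectangular patch of the
    given size centred on a structured reference point (x, y)."""
    x, y = point
    h, w = size
    top, left = max(y - h // 2, 0), max(x - w // 2, 0)
    return image[top:top + h, left:left + w]

def build_structured_subimages(image, points, size=(32, 32)):
    """a_ij = C(a, p_ij(x, y), s_ij) over an I x J grid of reference
    points, here with one fixed rectangular shape parameter."""
    return [[crop_subimage(image, p, size) for p in row] for row in points]

face = np.zeros((128, 128))  # a dummy registered face image
points = [[(32 * j + 16, 32 * i + 16) for j in range(4)] for i in range(4)]
patches = build_structured_subimages(face, points)
print(len(patches), len(patches[0]), patches[0][0].shape)  # 4 4 (32, 32)
```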
  5. The image feature extraction method according to claim 1, wherein the feature models obtained through multi-model training are obtained by:
    selecting a predetermined training image library;
    registering each training image in the predetermined training image library according to a unified registration method, to obtain a plurality of registered training images;
    constructing a plurality of structured sub-training images from each of the registered training images; and
    performing feature model training on the plurality of structured sub-training images by using a visual feature learning algorithm, to extract visual features of the corresponding sub-training images and obtain the feature models.
  6. The image feature extraction method according to claim 5, wherein the visual feature learning algorithm comprises any one of the following:
    a deep learning method, a boosting algorithm, an SVM algorithm, or a learning algorithm based on combinations of local features.
  7. The image feature extraction method according to claim 5, wherein the mathematical expression of the feature model is:
    v_ij = M_ij(a_ij, q_ij)
    where a_ij denotes the sub-training image at horizontal position i and vertical position j of the structure; M_ij is the feature model trained on the corresponding sub-training image a_ij; q_ij is the feature model parameter obtained through training; and v_ij is the visual feature extracted from the sub-training image a_ij by the feature model M_ij.
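A minimal sketch of this per-cell feature extraction, assuming each trained model M_ij together with its parameters q_ij can be represented as a callable; the linear projections below are illustrative stand-ins for whatever models the training of claim 5 produces.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K = 4, 4, 64            # grid layout and feature dimension

def make_model(weights):
    """A hypothetical feature model M_ij; weights play the role of q_ij."""
    def model(patch):
        return patch.reshape(-1) @ weights
    return model

models = [[make_model(rng.normal(size=(32 * 32, K))) for _ in range(J)]
          for _ in range(I)]

def extract_patch_features(subimages, models):
    """v_ij = M_ij(a_ij, q_ij): apply each cell's own model to its sub-image."""
    return [[m(a) for a, m in zip(row_a, row_m)]
            for row_a, row_m in zip(subimages, models)]

subimages = [[rng.normal(size=(32, 32)) for _ in range(J)] for _ in range(I)]
features = extract_patch_features(subimages, models)
print(np.array(features).shape)  # (4, 4, 64)
```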
  8. The image feature extraction method according to claim 1, wherein structurally fusing the visual features of the plurality of structured sub-images to obtain the structured feature data comprises:
    structurally fusing the visual features of the plurality of structured sub-images according to the structured reference point positions determined when the plurality of structured sub-images were constructed, to obtain the structured feature data, the structured feature data comprising feature spatial relationships and feature information.
  9. The image feature extraction method according to claim 8, wherein the mathematical expression of the structured feature data is:
    d(i, j, k) = v_ij(k)
    where v_ij denotes the visual feature of a structured sub-image, k indexes the k-th dimension of that feature, and d is the structured feature data after fusion.
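Under the assumption that every sub-image yields a feature vector of the same dimension K, the fusion of this claim amounts to stacking the vectors on the spatial grid; a minimal sketch:

```python
import numpy as np

def fuse_structured_features(features):
    """d(i, j, k) = v_ij(k): stack the per-sub-image feature vectors into a
    3-D array whose first two axes preserve the spatial grid of sub-images,
    so the fused data keeps both feature information and spatial relations."""
    return np.stack([np.stack(row) for row in features])

# e.g. a 4 x 4 grid of 64-dimensional features -> shape (4, 4, 64)
grid = [[np.zeros(64) for _ in range(4)] for _ in range(4)]
print(fuse_structured_features(grid).shape)
```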
  10. The image feature extraction method according to claim 5, wherein the model obtained through structured model training is obtained by:
    structurally fusing the visual features of the plurality of sub-training images to obtain structured feature data of the training images; and
    performing structured model training on the structured feature data of the training images by using a visual feature learning algorithm, to obtain the model of the structured model training.
  11. The image feature extraction method according to claim 5, wherein the mathematical expression of the model obtained through structured model training is:
    v = M(d, q)
    where M is the model obtained by performing structured model training on the fused training image feature data d, q is the model parameter obtained through training, and v is the corresponding visual feature obtained by applying the model M to the fused training image feature data d.
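A deliberately simple stand-in for the structured model M of this claim, with a single linear map playing the role of the trained parameters q; a real model would be obtained through the training described in claim 10.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, OUT = 4, 4, 64, 128  # illustrative dimensions only

# Trained parameters q, here a random linear map purely for illustration.
q_w = rng.normal(size=(I * J * K, OUT))
q_b = np.zeros(OUT)

def structured_model(d):
    """v = M(d, q): map fused structured feature data of shape (I, J, K)
    to the final image feature vector v."""
    return d.reshape(-1) @ q_w + q_b

v = structured_model(rng.normal(size=(I, J, K)))
print(v.shape)  # (128,)
```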
  12. The image feature extraction method according to claim 1, further comprising:
    comparing the image feature data in turn with each piece of predetermined image feature data in a predetermined image database; and
    outputting a comparison result.
  13. The image feature extraction method according to claim 12, wherein comparing the image feature data in turn with each piece of predetermined image feature data in the predetermined image database comprises:
    calculating in turn the difference between the image feature data and each piece of predetermined image feature data in the predetermined image database;
    and wherein outputting the comparison result comprises:
    judging in turn whether each difference is greater than a predetermined difference threshold; and
    if every difference is greater than the predetermined difference threshold, outputting information that no similar image exists; otherwise, outputting the image corresponding to the predetermined image feature data with the smallest difference from the image feature data, and/or information about that image.
  14. The image feature extraction method according to claim 13, wherein the algorithm for calculating the difference between the image feature data and each piece of predetermined image feature data in the predetermined image database comprises any one of the following:
    a Euclidean distance calculation method, a cosine distance calculation method, or a Joint Bayesian distance calculation method.
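Of the three listed methods, the Euclidean and cosine distances can be sketched directly; the Joint Bayesian method additionally requires learned covariance matrices and is omitted here. The comparison loop of claims 12 and 13 then reduces to a nearest-neighbour search against a difference threshold:

```python
import numpy as np

def euclidean_distance(u, v):
    return float(np.linalg.norm(u - v))

def cosine_distance(u, v):
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def compare(query, database, threshold, distance=euclidean_distance):
    """Sequential comparison of claims 12-13: return the entry with the
    smallest difference, or None when every difference exceeds the
    predetermined difference threshold."""
    diffs = {name: distance(query, feat) for name, feat in database.items()}
    best = min(diffs, key=diffs.get)
    return None if diffs[best] > threshold else (best, diffs[best])

db = {"a": np.array([1.0, 0.0]), "b": np.array([0.0, 1.0])}
print(compare(np.array([0.9, 0.1]), db, threshold=0.5))  # ('a', 0.141...)
```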
  15. The image feature extraction method according to any one of claims 1 to 14, wherein the image comprises a face image.
  16. An image feature extraction apparatus, comprising:
    an image receiving unit, configured to receive an image input by a user;
    a registration unit, configured to register the image input by the user to obtain a registered image;
    a sub-image construction unit, configured to construct a plurality of structured sub-images from the registered image;
    a visual feature extraction unit, configured to extract a visual feature of each of the structured sub-images by using feature models obtained through multi-model training;
    a fusion unit, configured to structurally fuse the visual features of the plurality of structured sub-images to obtain structured feature data; and
    an operation unit, configured to perform an operation on the structured feature data by using a model obtained through structured model training, to obtain image feature data.
  17. The image feature extraction apparatus according to claim 16, wherein the sub-image construction unit comprises:
    a reference point determining subunit, configured to determine structured reference point positions of the registered image;
    a shape parameter determining subunit, configured to determine shape parameters of the sub-images; and
    a cutting subunit, configured to cut the registered image according to the structured reference point positions and the shape parameters of the sub-images, to obtain the plurality of structured sub-images.
  18. The image feature extraction apparatus according to claim 17, wherein the reference point determining subunit comprises:
    a feature reference point determining subunit, configured to determine the structured reference point positions of the registered image according to image feature points; or
    a spatial reference point determining subunit, configured to determine the structured reference point positions of the registered image according to spatial positions.
  19. The image feature extraction apparatus according to claim 17, wherein the mathematical algorithm used by the cutting subunit is:
    a_ij = C(a, p_ij(x, y), s_ij)
    where a_ij denotes the structured sub-image at horizontal position i and vertical position j of the structure; C is the construction function of the structured sub-images; a denotes the image input by the user; p_ij denotes the structured reference point at horizontal position i and vertical position j, with p_ij(x, y) indicating that the structured reference point p_ij lies at coordinates (x, y) of the image input by the user; and s_ij denotes the shape parameters of the sub-image, covering any planar shape, such as a rectangle, circle, or ellipse, together with its dimensions.
  20. The image feature extraction apparatus according to claim 16, further comprising:
    a multi-model training unit, configured to obtain the feature models through multi-model training;
    wherein the multi-model training unit comprises:
    a training image library selection subunit, configured to select a predetermined training image library;
    a training image registration subunit, configured to register each training image in the predetermined training image library according to a unified registration method, to obtain a plurality of registered training images;
    a sub-training image construction subunit, configured to construct a plurality of structured sub-training images from each of the registered training images; and
    a feature model acquisition subunit, configured to perform feature model training on the plurality of structured sub-training images by using a visual feature learning algorithm, to extract visual features of the corresponding sub-training images and obtain the feature models.
  21. The image feature extraction apparatus according to claim 20, wherein the visual feature learning algorithm used by the feature model acquisition subunit comprises any one of the following:
    a deep learning method, a boosting algorithm, an SVM algorithm, or a learning algorithm based on combinations of local features.
  22. The image feature extraction apparatus according to claim 16, wherein the fusion unit comprises:
    a reference point fusion subunit, configured to structurally fuse the visual features of the plurality of structured sub-images according to the structured reference point positions determined when the plurality of structured sub-images were constructed, to obtain the structured feature data, the structured feature data comprising feature spatial relationships and feature information.
  23. The image feature extraction apparatus according to claim 20, further comprising:
    a structured model training unit, configured to obtain the model through structured model training;
    wherein the structured model training unit comprises:
    a sub-training image fusion subunit, configured to structurally fuse the visual features of the plurality of sub-training images to obtain structured feature data of the training images; and
    a model acquisition subunit, configured to perform structured model training on the structured feature data of the training images by using a visual feature learning algorithm, to obtain the model of the structured model training.
  24. The image feature extraction apparatus according to claim 16, further comprising:
    a comparison unit, configured to compare the image feature data in turn with each piece of predetermined image feature data in a predetermined image database; and
    an output unit, configured to output a comparison result.
  25. The image feature extraction apparatus according to claim 24, wherein the comparison unit comprises:
    a difference calculation subunit, configured to calculate in turn the difference between the image feature data and each piece of predetermined image feature data in the predetermined image database;
    and wherein the output unit comprises:
    a difference judging subunit, configured to judge in turn whether each difference is greater than a predetermined difference threshold; and
    an information output unit, configured to: if every difference is greater than the predetermined difference threshold, output information that no similar image exists; otherwise, output the image corresponding to the predetermined image feature data with the smallest difference from the image feature data, and/or information about that image.
  26. The image feature extraction apparatus according to claim 25, wherein the algorithm used by the comparison unit to calculate the difference between the image feature data and each piece of predetermined image feature data in the predetermined image database comprises any one of the following:
    a Euclidean distance calculation method, a cosine distance calculation method, or a Joint Bayesian distance calculation method.
  27. An image feature extraction terminal device, comprising:
    a central processing unit;
    an input/output unit; and
    a memory, the memory storing the image feature extraction method according to any one of claims 1 to 15, the terminal device being operable according to the method after startup.
  28. An image feature extraction system, comprising a client and a remote server, wherein the image feature extraction apparatus according to any one of claims 16 to 26 is used; the client captures an image and/or selects an image from an album and sends it to the remote server; the remote server extracts image feature data, compares it with images in a predetermined image database, and sends a comparison result to the client; and the client finally outputs the comparison result.
PCT/CN2016/095524 2015-08-26 2016-08-16 Image feature extraction method, apparatus, terminal device, and system WO2017032243A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510531886.7A CN106485186B (en) 2015-08-26 2015-08-26 Image feature extraction method and device, terminal equipment and system
CN201510531886.7 2015-08-26

Publications (1)

Publication Number Publication Date
WO2017032243A1 true WO2017032243A1 (en) 2017-03-02

Family

ID=58099586

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/095524 WO2017032243A1 (en) 2015-08-26 2016-08-16 Image feature extraction method, apparatus, terminal device, and system

Country Status (2)

Country Link
CN (1) CN106485186B (en)
WO (1) WO2017032243A1 (en)


Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108319894A (en) * 2017-12-28 2018-07-24 杭州乔戈里科技有限公司 Fruit recognition methods based on deep learning and device
CN109064578B (en) * 2018-09-12 2020-11-03 山西巨擘天浩科技有限公司 Attendance system and method based on cloud service
CN112184843B (en) * 2020-11-09 2021-06-29 新相微电子(上海)有限公司 Redundant data removing system and method for image data compression
CN112712501B (en) * 2020-12-28 2021-10-26 江苏合泰飞梵科技有限公司 Rearview mirror assembly production method based on artificial intelligence

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101477692A (en) * 2009-02-13 2009-07-08 阿里巴巴集团控股有限公司 Method and apparatus for image characteristic extraction
WO2010150156A1 (en) * 2009-06-24 2010-12-29 Koninklijke Philips Electronics N.V. Establishing a contour of a structure based on image information
CN102654903A (en) * 2011-03-04 2012-09-05 井维兰 Face comparison method
CN102663351A (en) * 2012-03-16 2012-09-12 江南大学 Face characteristic point automation calibration method based on conditional appearance model
CN103218827A (en) * 2013-03-21 2013-07-24 上海交通大学 Contour tracing method based on shape-transmitting united division and image-matching correction
CN103886589A (en) * 2014-02-27 2014-06-25 四川农业大学 Goal-oriented automatic high-precision edge extraction method

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101599121A (en) * 2009-06-30 2009-12-09 徐勇 The authenticating colorized face images system and method
CN101770578B (en) * 2010-03-24 2011-07-27 上海交通大学 Image characteristic extraction method
CN102592136B (en) * 2011-12-21 2013-10-16 东南大学 Three-dimensional human face recognition method based on intermediate frequency information in geometry image
CN102722906B (en) * 2012-05-23 2015-01-07 温州大学 Feature-based top-down image modeling method
CN102903126B (en) * 2012-08-08 2015-11-04 公安部第三研究所 The system and method for a kind of video image texture feature extraction and structural description
CN103093237B (en) * 2013-01-15 2015-12-23 中国科学院自动化研究所 A kind of method for detecting human face of structure based model
CN104361131B (en) * 2014-12-08 2018-06-15 黑龙江大学 The method for building up of four-dimensional faceform's database


Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522778A (en) * 2017-09-20 2019-03-26 顾泽苍 A kind of image-recognizing method can reach image understanding level
CN108875530A (en) * 2018-01-12 2018-11-23 北京旷视科技有限公司 Vivo identification method, vivo identification equipment, electronic equipment and storage medium
CN110569765A (en) * 2019-08-02 2019-12-13 北京旷视科技有限公司 picture identification method, picture comparison method and device
CN110569765B (en) * 2019-08-02 2022-04-22 北京旷视科技有限公司 Picture identification method, picture comparison method and device
CN113256822A (en) * 2020-02-11 2021-08-13 阿里巴巴集团控股有限公司 Spatial relationship prediction, data processing method, device and storage medium
CN113256822B (en) * 2020-02-11 2024-02-13 阿里巴巴集团控股有限公司 Spatial relationship prediction, data processing method, device and storage medium
CN112241704A (en) * 2020-10-16 2021-01-19 百度(中国)有限公司 Method and device for judging portrait infringement, electronic equipment and storage medium
CN113408208A (en) * 2021-06-25 2021-09-17 成都欧珀通信科技有限公司 Model training method, information extraction method, related device and storage medium
CN113408208B (en) * 2021-06-25 2023-06-09 成都欧珀通信科技有限公司 Model training method, information extraction method, related device and storage medium

Also Published As

Publication number Publication date
CN106485186B (en) 2020-02-18
CN106485186A (en) 2017-03-08

Similar Documents

Publication Publication Date Title
WO2017032243A1 (en) Image feature extraction method, apparatus, terminal device, and system
CN109214343B (en) Method and device for generating face key point detection model
CN109284733B (en) Shopping guide negative behavior monitoring method based on yolo and multitask convolutional neural network
WO2019218824A1 (en) Method for acquiring motion track and device thereof, storage medium, and terminal
US9892344B1 (en) Activation layers for deep learning networks
US10616475B2 (en) Photo-taking prompting method and apparatus, an apparatus and non-volatile computer storage medium
CN106203242B (en) Similar image identification method and equipment
WO2017101267A1 (en) Method for identifying living face, terminal, server, and storage medium
TW201911130A (en) Method and device for remake image recognition
WO2017088432A1 (en) Image recognition method and device
US20190325197A1 (en) Methods and apparatuses for searching for target person, devices, and media
CN108734185B (en) Image verification method and device
US9626552B2 (en) Calculating facial image similarity
CN109614910B (en) Face recognition method and device
CN110427859A (en) A kind of method for detecting human face, device, electronic equipment and storage medium
US9171195B1 (en) Recognizing three-dimensional objects
JP2018524678A (en) Business discovery from images
Chen et al. Shape-Former: Bridging CNN and Transformer via ShapeConv for multimodal image matching
US10489636B2 (en) Lip movement capturing method and device, and storage medium
WO2020073601A1 (en) Goods recognition method, goods recognition apparatus, and storage medium
CN111814620A (en) Face image quality evaluation model establishing method, optimization method, medium and device
US9129152B2 (en) Exemplar-based feature weighting
US9633272B2 (en) Real time object scanning using a mobile phone and cloud-based visual search engine
CN109886223B (en) Face recognition method, bottom library input method and device and electronic equipment
US20230326173A1 (en) Image processing method and apparatus, and computer-readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16838502

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16838502

Country of ref document: EP

Kind code of ref document: A1