CN116452886A - Image recognition method, device, equipment and storage medium - Google Patents



Publication number
CN116452886A
Authority
CN
China
Prior art keywords
image
target
target object
feature
static
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310446706.XA
Other languages
Chinese (zh)
Inventor
许啸
叶红
王雪霏
李沅坷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC filed Critical Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202310446706.XA priority Critical patent/CN116452886A/en
Publication of CN116452886A publication Critical patent/CN116452886A/en
Pending legal-status Critical Current

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 — Arrangements for image or video recognition or understanding
    • G06V10/70 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 — Image analysis
    • G06T7/20 — Analysis of motion
    • G06T7/246 — Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 — Arrangements for image or video recognition or understanding
    • G06V10/40 — Extraction of image or video features
    • G06V10/42 — Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 — Scenes; Scene-specific elements
    • G06V20/40 — Scenes; Scene-specific elements in video content
    • G06V20/46 — Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides an image recognition method, apparatus, device, and storage medium, which can be applied to the fields of computer vision technology, financial technology, information security technology, and artificial intelligence technology. The method comprises: acquiring a static image of a target object, a dynamic image, and image acquisition parameters of the static image, wherein the dynamic image is acquired while the target object completes a specified action for liveness detection, and the image acquisition parameters are randomly generated when the static image of the target object is acquired; performing image transformation processing on the static image according to the image acquisition parameters to obtain a target image; recognizing the target image to obtain a first recognition result; extracting motion trajectory features of the target object from the dynamic image, and extracting image features of the target object from the target image; recognizing the motion trajectory features and the image features to obtain a second recognition result; and obtaining a target recognition result according to the first recognition result and the second recognition result.

Description

Image recognition method, device, equipment and storage medium
Technical Field
The present disclosure relates to the fields of computer vision technology, financial technology, information security technology, and artificial intelligence technology, and more particularly, to an image recognition method, apparatus, device, medium, and program product.
Background
With the rapid development of network technology, the public increasingly uses the internet for online business such as paying living expenses, querying personal information, or buying and selling goods.
When a user undergoes identity verification while handling business, face image recognition technology is mainly used, and the user is required to cooperate by completing specific actions so that the authenticity of the face can be verified. In the related art, identity verification is generally performed based on a comparison between an image acquired in real time and a stored image, but this single-dimension image recognition method suffers from low accuracy when applied to identity verification scenarios.
Disclosure of Invention
In view of the foregoing, the present disclosure provides image recognition methods, apparatuses, devices, media, and program products.
According to a first aspect of the present disclosure, there is provided an image recognition method including: acquiring a static image of a target object, a dynamic image, and image acquisition parameters of the static image, wherein the dynamic image is acquired while the target object completes a specified action for liveness detection, and the image acquisition parameters are randomly generated when the static image of the target object is acquired;
performing image transformation processing on the static image according to the image acquisition parameters to obtain a target image;
recognizing the target image to obtain a first recognition result;
extracting a motion trajectory feature of the target object from the dynamic image, and extracting an image feature of the target object from the target image;
recognizing the motion trajectory feature and the image feature to obtain a second recognition result; and
obtaining a target recognition result according to the first recognition result and the second recognition result.
According to an embodiment of the present disclosure, extracting the motion trajectory feature of the target object from the dynamic image includes:
performing frame extraction processing on the dynamic image to obtain a plurality of image frames;
performing optical flow detection on the plurality of image frames to obtain global motion track characteristics of the target object;
detecting key pixel points of the plurality of image frames to obtain local motion track characteristics of the target object; and
obtaining the motion trajectory feature of the target object according to the global motion trajectory feature and the local motion trajectory feature.
According to an embodiment of the present disclosure, the plurality of image frames comprises S image frames, where S is an integer greater than 1; performing optical flow detection on the plurality of image frames to obtain the global motion trajectory feature of the target object includes:
processing the s-th image frame and the (s+1)-th image frame using an optical flow algorithm to obtain the s-th piece of pixel change information, where s is an integer greater than or equal to 1 and less than S;
in the case where s is determined to be less than S-1, incrementing s and returning to the processing operation for the s-th image frame and the (s+1)-th image frame; and
in the case where s is equal to S-1, obtaining the global motion trajectory feature of the target object according to the S-1 pieces of pixel change information.
According to an embodiment of the present disclosure, the detecting key pixel points of the plurality of image frames to obtain a local motion trajectory feature of the target object includes:
extracting key pixel point features from the plurality of image frames;
detecting the key pixel point features to obtain position information of the key pixel points in each image frame; and
obtaining the local motion trajectory feature of the target object according to the position information of the key pixel points in each image frame.
According to an embodiment of the present disclosure, the identifying the motion trajectory feature and the image feature to obtain a second identification result includes:
performing splicing processing on the motion trail feature and the image feature to obtain an intermediate feature; and
recognizing the intermediate feature to obtain the second recognition result.
According to an embodiment of the present disclosure, the extracting the image feature of the target object from the target image includes:
performing wavelet decomposition processing on the target image to obtain frequency domain information of the target image; and
performing feature extraction processing on the frequency domain information to obtain the image features.
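To make the frequency-domain step above concrete, here is a minimal, hand-written sketch of a single-level 1-D Haar wavelet decomposition. It is not the patent's implementation: a real system would more likely apply a 2-D discrete wavelet transform to the whole target image (e.g. via a wavelet library) and feed the sub-band coefficients to a feature extractor; the row of pixel intensities below is a toy stand-in.

```python
def haar_decompose(signal):
    """Split a 1-D signal into approximation (low-frequency) and
    detail (high-frequency) coefficients, one Haar level."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail

row = [10, 12, 8, 8, 50, 2, 7, 9]   # one row of pixel intensities (made up)
approx, detail = haar_decompose(row)
# `approx` smooths the row; large |detail| values flag sharp local
# transitions — the kind of frequency-domain information the feature
# extraction step can draw on.
```

Texture details of this kind are one reason wavelet features are useful for telling a live face from a replayed screen or printout, though the patent does not commit to a particular wavelet family.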
According to an embodiment of the disclosure, the performing image transformation processing on the static image according to the image acquisition parameter to obtain a target image includes:
determining image transformation parameters according to the image acquisition parameters; and
performing image transformation processing on the static image according to the image transformation parameters to obtain the target image.
According to an embodiment of the present disclosure, determining the image transformation parameter according to the image acquisition parameter includes:
determining deviation information of an image acquisition frame of the static image and an image acquisition frame of the target image according to the image acquisition parameters; and
determining the image transformation parameters according to the deviation information.
According to an embodiment of the present disclosure, recognizing the target image to obtain a first recognition result includes:
extracting position features of at least two preset key points in the target image;
determining the inclination of the target object in the target image according to the position features of the at least two preset key points; and
in the case where the inclination is smaller than a predetermined threshold, obtaining the first recognition result according to the target image and a historical image of the target object.
A second aspect of the present disclosure provides an image recognition apparatus, comprising: an acquisition module for acquiring a static image of a target object, a dynamic image, and image acquisition parameters of the static image, wherein the dynamic image is acquired while the target object completes a specified action for liveness detection, and the image acquisition parameters are randomly generated when the static image of the target object is acquired;
the processing module is used for carrying out image transformation processing on the static image according to the image acquisition parameters to obtain a target image;
the first identification module is used for identifying the target image to obtain a first identification result;
the feature extraction module is used for extracting the motion trail feature of the target object from the dynamic image and extracting the image feature of the target object from the target image;
The second recognition module is used for recognizing the motion trail features and the image features to obtain a second recognition result; and
the obtaining module is used for obtaining a target recognition result according to the first recognition result and the second recognition result.
A third aspect of the present disclosure provides an electronic device, comprising: one or more processors; and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method described above.
A fourth aspect of the present disclosure also provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the above-described method.
A fifth aspect of the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the above method.
According to the image recognition method, apparatus, device, medium, and program product provided by the present disclosure, a static image of a target object, a dynamic image, and the image acquisition parameters of the static image are acquired; the static image is subjected to image transformation processing according to the image acquisition parameters to obtain a target image, and the target image is recognized to obtain a first recognition result; a motion trajectory feature of the target object is extracted from the dynamic image, an image feature of the target object is extracted from the target image, and the motion trajectory feature and the image feature are recognized to obtain a second recognition result; the target recognition result is then obtained according to the first recognition result and the second recognition result. Because the target recognition result draws on the recognition result of the static image, the recognition result of the dynamic image, and the image acquisition parameters of the static image, the single-dimension limitation of image recognition methods in identity verification scenarios is at least partially overcome, and the accuracy of image recognition verification is improved.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be more apparent from the following description of embodiments of the disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario diagram of an image recognition method, apparatus, device, medium, and program product according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow chart of an image recognition method according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow chart for extracting motion trajectory features of a target object from a dynamic image according to an embodiment of the disclosure;
FIG. 4 schematically illustrates an overall block diagram of a model according to an embodiment of the disclosure;
FIG. 5 schematically illustrates an overall flow chart of an image recognition method according to an embodiment of the present disclosure;
fig. 6 schematically shows a block diagram of a structure of an image recognition apparatus according to an embodiment of the present disclosure; and
fig. 7 schematically illustrates a block diagram of an electronic device adapted to implement an image recognition method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where expressions like "at least one of A, B and C" are used, they should generally be interpreted according to the meaning commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" shall include, but not be limited to, systems having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together).
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and application of the data involved (including but not limited to users' personal information) all comply with the provisions of relevant laws and regulations, necessary security measures are taken, and public order and good morals are not violated.
When users conduct online business such as paying living expenses, querying personal information, or buying and selling goods, identity verification is mainly performed using face image recognition technology. In the process of acquiring face information for identity authentication, the user usually needs to complete a specific action so that the authenticity of the face can be verified. However, with existing verification methods, the input image or video can be forged or imitated using information prepared in advance, so the accuracy of image recognition verification needs to be improved.
The embodiments of the present disclosure provide an image recognition method comprising: acquiring a static image of a target object, a dynamic image, and image acquisition parameters of the static image, wherein the dynamic image is acquired while the target object completes a specified action for liveness detection, and the image acquisition parameters are randomly generated when the static image of the target object is acquired; performing image transformation processing on the static image according to the image acquisition parameters to obtain a target image; recognizing the target image to obtain a first recognition result; extracting motion trajectory features of the target object from the dynamic image, and extracting image features of the target object from the target image; recognizing the motion trajectory features and the image features to obtain a second recognition result; and obtaining a target recognition result according to the first recognition result and the second recognition result.
Fig. 1 schematically illustrates an application scenario diagram of an image recognition method according to an embodiment of the present disclosure.
As shown in fig. 1, an application scenario 100 according to this embodiment may include a first terminal device 101, a second terminal device 102, a third terminal device 103, a network 104, and a server 105. The network 104 is a medium used to provide a communication link between the first terminal device 101, the second terminal device 102, the third terminal device 103, and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 through the network 104 using at least one of the first terminal device 101, the second terminal device 102, the third terminal device 103, to receive or send messages, etc. Various communication client applications, such as a shopping class application, a web browser application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc. (by way of example only) may be installed on the first terminal device 101, the second terminal device 102, and the third terminal device 103.
The first terminal device 101, the second terminal device 102, the third terminal device 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (by way of example only) providing support for websites browsed by the user using the first terminal device 101, the second terminal device 102, and the third terminal device 103. The background management server may analyze and process the received data such as the user request, and feed back the processing result (e.g., the web page, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that, the image recognition method provided by the embodiment of the present disclosure may be generally performed by the server 105. Accordingly, the image recognition apparatus provided by the embodiments of the present disclosure may be generally provided in the server 105. The image recognition method provided by the embodiments of the present disclosure may also be performed by a server or a server cluster that is different from the server 105 and is capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and/or the server 105. Accordingly, the image recognition apparatus provided by the embodiments of the present disclosure may also be provided in a server or a server cluster that is different from the server 105 and is capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and/or the server 105.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
The image recognition method of the disclosed embodiment will be described in detail below with reference to fig. 2 to 5 based on the scene described in fig. 1.
Fig. 2 schematically illustrates a flowchart of an image recognition method according to an embodiment of the present disclosure.
As shown in fig. 2, the image recognition method of this embodiment includes operations S210 to S260.
In operation S210, a static image of the target object, a dynamic image, and the image acquisition parameters of the static image are acquired, wherein the dynamic image is acquired while the target object completes a specified action for liveness detection, and the image acquisition parameters are randomly generated when the static image is captured.
According to the embodiment of the present disclosure, when identity verification is performed through face image recognition technology, the target object shoots and records, in real time and within a designated area of the client, static images of its face contour, facial features, and the like. The image is acquired only with the user's authorization and in compliance with relevant laws.
The target object completes the real-time recording of the dynamic image by performing the specified action for liveness detection, so that the dynamic image of the target object is acquired. Note that the specified action may be nodding, blinking, turning the head, or the like; the embodiments of the present disclosure do not limit the specific actions or their order.
According to an embodiment of the present disclosure, the image acquisition parameter may be a rotation angle of the image, a magnification of the image, or the like, which is not limited herein. During static image acquisition, the image acquisition parameters are randomly generated, and their range can be preset so as not to degrade the user experience.
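A minimal sketch of how such a randomly generated acquisition parameter might be produced within a preset range. The ±20-degree bound is an assumed illustrative value, not one specified by the patent.

```python
import random

def generate_acquisition_parameter(max_abs_degrees=20.0):
    """Draw a random rotation angle within a preset range so the
    capture prompt stays comfortable for the user.
    The default bound is an illustrative assumption."""
    return random.uniform(-max_abs_degrees, max_abs_degrees)

angle = generate_acquisition_parameter()
# `angle` is stored alongside the captured static image so the server
# can later invert the transformation (operation S220).
```

Because the parameter is drawn fresh per session, a pre-recorded or injected image captured under a different parameter will not invert correctly, which is the anti-forgery point of the scheme.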
In operation S220, an image conversion process is performed on the still image according to the image acquisition parameters to obtain a target image.
According to an embodiment of the present disclosure, the image transformation processing may be an inverse transformation according to the image acquisition parameters, thereby obtaining the target image. For example, an image acquisition parameter of 15 degrees of left rotation indicates that the static image was acquired with the target object rotated 15 degrees to the left; performing image transformation processing on that static image yields the target image, i.e., the static image rotated 15 degrees to the right.
In operation S230, the target image is identified, and a first identification result is obtained.
According to the embodiment of the present disclosure, the target image is recognized by comparing and analyzing it against a standard image of the target object, yielding the first recognition result. The first recognition result is either pass or fail.
In operation S240, motion trajectory features of the target object are extracted from the dynamic image, and image features of the target object are extracted from the target image.
According to embodiments of the present disclosure, the motion trajectory feature may be a global motion trajectory feature of a face contour, and may also be a local motion trajectory feature of eyes, nose, mouth, etc. of the face.
According to embodiments of the present disclosure, the image features of the target object may be image features of facial spots or the like of the target object.
In operation S250, the motion trajectory features and the image features are recognized, and a second recognition result is obtained.
According to the embodiment of the present disclosure, the second recognition result is obtained by recognizing the motion trajectory features and the image features, and is either pass or fail. The motion trajectory features and the image features increase the difficulty of forging the dynamic image used for recognition and verification, facilitate screening out pre-injected counterfeit videos by exposing problems such as inconsistent face backgrounds and incoherent motion trajectories, and improve the accuracy of image recognition verification.
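The splicing of the two feature types described for operation S250 can be sketched as a simple concatenation. In a real model the two inputs would be learned embedding vectors and the classifier a trained network; plain Python lists stand in for them here.

```python
def splice_features(trajectory_feature, image_feature):
    """Concatenate the motion-trajectory feature and the image feature
    into a single intermediate feature vector (the 'splicing' step)."""
    return list(trajectory_feature) + list(image_feature)

# Made-up toy embeddings:
intermediate = splice_features([0.12, 0.80, 0.33], [0.05, 0.91])
# A downstream classifier scores this joint vector to produce the
# pass / fail second recognition result.
```

Concatenation lets a single classifier jointly weigh motion coherence and image texture, rather than thresholding each modality separately.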
In operation S260, a target recognition result is obtained according to the first recognition result and the second recognition result.
According to the embodiment of the present disclosure, a static image of the target object, a dynamic image, and the image acquisition parameters of the static image are acquired; the static image is transformed according to the image acquisition parameters to obtain a target image, and the target image is recognized to obtain a first recognition result; a motion trajectory feature of the target object is extracted from the dynamic image, an image feature of the target object is extracted from the target image, and both are recognized to obtain a second recognition result; the target recognition result is then obtained from the first recognition result and the second recognition result, in combination with the image acquisition parameters of the static image. Basing the target recognition result on both the static-image and dynamic-image recognition results at least partially solves the single-dimension problem of image recognition methods in identity verification scenarios and improves the accuracy of image recognition verification.
According to an embodiment of the present disclosure, in operation S220, performing image transformation processing on a still image according to image acquisition parameters to obtain a target image, including determining image transformation parameters according to the image acquisition parameters; and performing image transformation processing on the static image according to the image transformation parameters to obtain a target image.
According to an embodiment of the disclosure, determining an image transformation parameter according to an image acquisition parameter includes determining deviation information of an image acquisition frame of a still image and an image acquisition frame of a target image according to the image acquisition parameter; and determining the image transformation parameters according to the deviation information.
According to embodiments of the present disclosure, the image acquisition parameter may be a rotation angle or a magnification of the acquired image. For example, an image acquisition parameter of 15 degrees of left rotation means that when the static image was acquired, the image acquisition frame was randomly rotated and the static image of the target object was captured rotated 15 degrees to the left. The deviation information between the image acquisition frame of the static image and that of the target image is determined, and from this deviation information the image transformation parameter is determined to be a 15-degree right rotation. Image transformation processing (a 15-degree right rotation) is then performed on the static image according to the image transformation parameter to obtain the target image.
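The inverse-transformation idea in this example can be sketched numerically. Rotating a single point stands in for warping the whole image (which would typically use an affine warp, e.g. OpenCV's `warpAffine`); this is an illustration, not the patent's implementation.

```python
import math

def rotate_point(x, y, degrees):
    """Rotate (x, y) about the origin by `degrees`, counter-clockwise."""
    r = math.radians(degrees)
    return (x * math.cos(r) - y * math.sin(r),
            x * math.sin(r) + y * math.cos(r))

acquisition_angle = 15.0              # left rotation recorded at capture time
transform_angle = -acquisition_angle  # deviation info -> inverse rotation

x, y = rotate_point(1.0, 0.0, acquisition_angle)   # point as captured
x2, y2 = rotate_point(x, y, transform_angle)       # point in the target image
# (x2, y2) is back at (1.0, 0.0) up to floating-point error: the target
# image undoes the randomly applied capture rotation.
```

An injected video rendered without knowledge of the session's random angle would fail to invert to an upright face, which is what the subsequent inclination check (operation S230) detects.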
According to the embodiment of the present disclosure, the parameter information of the image acquisition frame of the static image is determined through the image acquisition parameters, so that the image acquisition frame can be randomly adjusted; this makes pre-injected counterfeit videos easier to identify and improves the accuracy of image recognition verification.
According to an embodiment of the present disclosure, in operation S230, recognizing the target image to obtain a first recognition result includes: extracting position features of at least two preset key points in the target image; determining the inclination of the target object in the target image according to the position features of the at least two preset key points; and, in the case where the inclination is smaller than a predetermined threshold, obtaining the first recognition result according to the target image and a historical image of the target object.
According to the embodiment of the present disclosure, the position features of the preset key points may be the position features of the two eyes, of the highest points of the two eyebrows, and the like, which are not limited herein. A horizontal reference line is derived from the position features of two preset key points to determine the inclination of the target object in the target image, which conveniently reflects the degree of distortion of the target object. When the inclination is smaller than the predetermined threshold, the target image is determined to be a standard image, and the first recognition result is obtained from the target image and a historical image of the target object: if the target image is consistent with the historical image, the first recognition result is pass; otherwise, it is fail.
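The inclination check above can be sketched from the two eye keypoints. The coordinates and the 10-degree threshold are illustrative assumptions, not values taken from the patent.

```python
import math

def inclination_degrees(left_eye, right_eye):
    """Angle of the line through the two eye keypoints relative to the
    horizontal, in degrees (absolute value)."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return abs(math.degrees(math.atan2(dy, dx)))

# Hypothetical landmark positions in pixel coordinates:
tilt = inclination_degrees((100, 120), (180, 126))
is_standard = tilt < 10.0   # only a near-level face is compared with history
```

If the inclination exceeds the threshold, the inverse transformation did not restore an upright face, suggesting the input was not captured under this session's random acquisition parameter.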
Fig. 3 schematically illustrates a flowchart for extracting motion trajectory features of a target object from a dynamic image according to an embodiment of the present disclosure.
As shown in fig. 3, the extraction of the motion trajectory characteristics of the target object of this embodiment includes operations S310 to S340.
In operation S310, a frame extraction process is performed on a moving image to obtain a plurality of image frames.
According to the embodiment of the present disclosure, the dynamic image comprises multiple frames of images, so frame extraction processing is needed; the plurality of image frames is obtained after frame extraction and screening.
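The frame extraction of operation S310 can be sketched as simple uniform sampling. This is an assumption for illustration: the disclosure does not fix a sampling policy, and a real system would decode the video with a library such as OpenCV before sampling.

```python
# Minimal sketch of operation S310 (assumption: uniform every-k-th sampling).
def extract_frames(frames, step=5):
    """Keep every `step`-th frame of a decoded dynamic image."""
    return frames[::step]

video = list(range(23))          # stand-in for 23 decoded frames
sampled = extract_frames(video)
print(sampled)                   # [0, 5, 10, 15, 20]
```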
In operation S320, optical flow detection is performed on the plurality of image frames to obtain global motion trajectory characteristics of the target object.
According to the embodiment of the present disclosure, the size of the image frame data is M×M×(number of frames), where M is the width and height of a single frame image. Parameters of an I3D pre-training model are adjusted to extract features, so as to obtain the global motion track feature of the target object, where the global motion track feature may be a motion track feature of the face contour.
In operation S330, key pixel point detection is performed on the plurality of image frames to obtain local motion trail features of the target object.
According to the embodiment of the present disclosure, the key pixel points may be local positions of the face such as the eyes, nose and mouth. The key pixel point data has size N×2×(number of frames), where N is the number of face key pixel points and 2 is the coordinate dimension. Parameters of a ResNet model are adjusted to extract features, so as to obtain the local motion track feature of the target object, where the local motion track feature may be a motion track feature of local positions of the face such as the eyes, nose and mouth.
In operation S340, the motion profile of the target object is obtained according to the global motion profile and the local motion profile.
According to the embodiment of the disclosure, the frame extraction processing is performed on the dynamic image to obtain each image frame, the optical flow detection and the key pixel point detection are performed on a plurality of image frames to respectively obtain the global motion track characteristic and the local motion track characteristic of the target object, so that the identification of the dynamic image is enhanced, the investigation of the pre-injected fake video is facilitated, the problems of uncertain face background, incoherent motion track and the like of the fake video are identified, and the accuracy of image identification verification is improved.
According to an embodiment of the present disclosure, the plurality of image frames comprises S frames, S being an integer greater than 1. In operation S320, performing optical flow detection on the plurality of image frames to obtain the global motion track feature of the target object includes: processing the s-th image frame and the (s+1)-th image frame by using an optical flow algorithm to obtain the s-th piece of pixel point change information, where s is an integer greater than or equal to 1 and less than S; in the case where s is determined to be less than S-1, incrementing s and returning to perform the processing operation on the s-th image frame and the (s+1)-th image frame; and in the case where s is equal to S-1, obtaining the global motion trail feature of the target object according to the S-1 pieces of pixel point change information.
For example, in the case where S is 5 and s is 1, the plurality of image frames includes 5 frames. The 1st image frame and the 2nd image frame are processed by using the optical flow algorithm to obtain the 1st piece of pixel point change information; at this point s is smaller than S-1, so s is incremented and the processing operation is performed on the 2nd and 3rd image frames to obtain the 2nd piece of pixel point change information; then on the 3rd and 4th image frames to obtain the 3rd piece; and finally on the 4th and 5th image frames to obtain the 4th piece of pixel point change information, after which the global motion trail feature of the target object is obtained according to the 4 pieces of pixel point change information.
According to the embodiment of the disclosure, when the optical flow detection is carried out on a plurality of image frames, the optical flow algorithm is utilized to process the adjacent image frames to obtain the pixel point change information of the plurality of image frames, and the global motion track characteristic of the target object is obtained, so that the problem of unstable face background of the forged video is conveniently identified, and the accuracy of image identification verification is improved.
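The adjacent-frame loop of operation S320 can be sketched as follows. A mean absolute pixel difference stands in for the optical flow algorithm, which the disclosure does not name (a real system might use a dense method such as Farnebäck optical flow); all names here are illustrative.

```python
# Sketch of the S320 loop: S image frames yield S-1 pieces of pixel point
# change information. `pixel_change` is a stand-in for the optical flow
# algorithm (assumption: the patent does not specify one).
def pixel_change(frame_a, frame_b):
    """Mean absolute difference between two flattened frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)

def global_motion_features(frames):
    changes = []
    s = 0                          # the text uses 1-based s; 0-based here
    while s < len(frames) - 1:     # loop until s reaches S-1
        changes.append(pixel_change(frames[s], frames[s + 1]))
        s += 1                     # "increment s"
    return changes                 # S-1 change records

frames = [[0, 0, 0], [1, 1, 1], [3, 3, 3], [3, 3, 3], [7, 7, 7]]  # S = 5
print(global_motion_features(frames))  # [1.0, 2.0, 0.0, 4.0]
```

With S = 5 frames, the loop produces exactly 4 pieces of change information, matching the worked example above.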
According to an embodiment of the present disclosure, in operation S330, key pixel point detection is performed on a plurality of image frames to obtain local motion trajectory features of a target object, including extracting key pixel point features of the plurality of image frames; detecting the characteristics of the key pixel points to obtain the position information of the key pixel points in each image frame; and obtaining the local motion track characteristics of the target object according to the position information of the key pixel points in each image frame.
For example, the key pixel points may be the eyes, or may be other parts of the face, which is not limited herein. The eye features of the plurality of image frames are extracted and detected to obtain the position information of the eyes in each image frame, and the local motion track feature of the eyes of the target object is obtained according to this position information, so that the continuity of the local motion track feature can be checked, and whether the specified action was completed can be identified.
According to the embodiment of the present disclosure, the local motion trail feature of the target object is obtained according to the position information of the key pixel points in each image frame, so that a continuous motion trail of the key pixel points can be obtained. This makes it convenient to identify the discontinuous motion trails found in non-real-time video, and improves the accuracy of image recognition verification.
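The continuity check on the local motion trail can be sketched as follows, assuming per-frame (x, y) positions of one key pixel point (e.g. an eye corner). The jump threshold is an illustrative assumption; the disclosure does not give one.

```python
# Sketch of operation S330's continuity test: a jump larger than `max_jump`
# pixels between consecutive frames suggests the discontinuity typical of an
# injected, non-real-time video. The threshold is an illustrative assumption.
import math

def trajectory_is_continuous(positions, max_jump=20.0):
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        if math.hypot(x1 - x0, y1 - y0) > max_jump:
            return False
    return True

smooth = [(100, 100), (103, 101), (106, 103), (108, 104)]
jumpy  = [(100, 100), (103, 101), (180, 160), (108, 104)]
print(trajectory_is_continuous(smooth))  # True
print(trajectory_is_continuous(jumpy))   # False
```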
According to an embodiment of the present disclosure, in S240, extracting an image feature of a target object from a target image includes performing wavelet decomposition processing on the target image to obtain frequency domain information of the target image; and carrying out feature extraction processing on the frequency domain information to obtain image features.
According to the embodiment of the present disclosure, wavelet decomposition processing is performed on the target image to obtain the frequency domain information of the target image; the high-frequency information for a single face has size W×W×1, where W is the face size. Feature extraction processing is then performed on the frequency domain information by using a VGG feature extraction model to obtain the image features.
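A single-level 2-D Haar decomposition illustrates the wavelet step; the disclosure does not name the wavelet, so Haar is an assumption, and a real system would more likely use a library such as PyWavelets before feeding the sub-bands to the VGG extractor.

```python
# Minimal single-level 2-D Haar decomposition (illustrative stand-in for the
# wavelet decomposition in S240; assumes an even-sized grayscale image given
# as a list of rows).
def haar2d(img):
    """Return (LL, LH, HL, HH) sub-bands of an even-sized grayscale image."""
    h, w = len(img), len(img[0])
    LL = [[0.0] * (w // 2) for _ in range(h // 2)]
    LH = [[0.0] * (w // 2) for _ in range(h // 2)]
    HL = [[0.0] * (w // 2) for _ in range(h // 2)]
    HH = [[0.0] * (w // 2) for _ in range(h // 2)]
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            LL[i // 2][j // 2] = (a + b + c + d) / 4.0  # low-frequency average
            LH[i // 2][j // 2] = (a + b - c - d) / 4.0  # horizontal detail
            HL[i // 2][j // 2] = (a - b + c - d) / 4.0  # vertical detail
            HH[i // 2][j // 2] = (a - b - c + d) / 4.0  # diagonal detail
    return LL, LH, HL, HH

flat = [[5.0] * 4 for _ in range(4)]   # a constant image
LL, LH, HL, HH = haar2d(flat)
print(LL[0][0], HH[0][0])              # 5.0 0.0: no high-frequency energy
```

The high-frequency sub-bands (LH, HL, HH) carry the detail information that helps expose splicing artifacts in forged faces.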
According to the embodiment of the disclosure, in S250, the motion trail feature and the image feature are identified to obtain a second identification result, including performing stitching processing on the motion trail feature and the image feature to obtain an intermediate feature; and identifying the intermediate features to obtain a second identification result.
Fig. 4 schematically illustrates an overall block diagram of a model according to an embodiment of the disclosure.
As shown in fig. 4, according to an embodiment of the present disclosure, frame extraction processing is performed on a dynamic image to obtain a plurality of image frames, optical flow detection is performed on the plurality of image frames, parameter adjustment is performed by using an I3D pre-training model to extract features, and global motion trail features of a target object are obtained; detecting key pixel points of a plurality of image frames, and performing parameter adjustment and extraction on the characteristics by utilizing a resnet model to obtain local motion trail characteristics of a target object; extracting image features of a target object from a target image, performing wavelet decomposition processing on the target image to obtain frequency domain information of the target image, and performing feature extraction processing on the frequency domain information by using a VGG feature extraction model to obtain image features.
According to the embodiment of the disclosure, vector stitching is performed on the global motion trail feature, the local motion trail feature and the image feature to obtain an intermediate feature, and the intermediate feature is processed by using a two-class network to obtain a second recognition result. The method solves the problem of single dimension of the image recognition method in the scene applied to identity verification, and achieves the technical effect of improving the accuracy of image recognition verification.
According to the embodiment of the disclosure, the target recognition result is obtained according to the first recognition result and the second recognition result. Under the condition that the first recognition result and the second recognition result are both passing, the target recognition result is passing; in other cases, the target recognition result is failed.
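The feature stitching and the final result fusion described above can be sketched as follows. The two-class network itself is omitted, and the function names are illustrative.

```python
# Sketch of vector stitching (S250) and target-result fusion: the target
# recognition passes only when both the first and the second result pass.
def stitch(global_feat, local_feat, image_feat):
    """Vector stitching: concatenate the three feature vectors."""
    return list(global_feat) + list(local_feat) + list(image_feat)

def target_result(first_passed, second_passed):
    """Target recognition passes only if both sub-results pass."""
    return first_passed and second_passed

intermediate = stitch([0.1, 0.2], [0.3], [0.4, 0.5])
print(intermediate)               # [0.1, 0.2, 0.3, 0.4, 0.5]
print(target_result(True, True))  # True  -> verification passes
print(target_result(True, False)) # False -> verification fails
```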
Fig. 5 schematically illustrates an overall flowchart of an image recognition method according to an embodiment of the present disclosure.
As shown in fig. 5, according to an embodiment of the present disclosure, an image recognition method includes a client and a server. In the static evidence obtaining stage, a client collects a static image of a target object, in the collecting process, the client automatically and randomly adjusts the distance and the angle of an image collecting frame, and a server receives the static image and image collecting parameters submitted by the client; performing image transformation processing on the static image according to the image acquisition parameters to obtain a target image, and determining whether the target image is a standard image; and identifying the target image, and obtaining a first identification result according to the target image and the historical image of the target object.
According to an embodiment of the present disclosure, in a dynamic evidence obtaining stage, a client acquires a dynamic image when a target object completes a specified action of living body detection. The server receives the dynamic image, performs frame extraction processing on the dynamic image, performs optical flow detection and key pixel point detection, extracts global motion track features and local motion track features of the target object from the dynamic image, and extracts image features of the target object from the target image; vector stitching is carried out on the global motion trail feature, the local motion trail feature and the image feature to obtain an intermediate feature, and the intermediate feature is processed by using a two-class network to obtain a second recognition result.
According to the embodiment of the disclosure, a target recognition result is obtained according to the first recognition result and the second recognition result.
Based on the image recognition method, the disclosure also provides an image recognition device. The device will be described in detail below in connection with fig. 6.
Fig. 6 schematically shows a block diagram of the structure of an image recognition apparatus according to an embodiment of the present disclosure.
As shown in fig. 6, the image recognition apparatus 600 of this embodiment includes an acquisition module 610, a processing module 620, a first recognition module 630, a feature extraction module 640, a second recognition module 650, and an obtaining module 660.
The acquiring module 610 is configured to acquire a still image, a moving image, and an image acquisition parameter of the still image, where the moving image is acquired when the target object completes a specified motion of living body detection, and the image acquisition parameter is randomly generated when the target object performs still image acquisition. In an embodiment, the obtaining module 610 may be configured to perform the operation S210 described above, which is not described herein.
The processing module 620 is configured to perform image transformation processing on the still image according to the image acquisition parameters, so as to obtain a target image. In an embodiment, the processing module 620 may be configured to perform the operation S220 described above, which is not described herein.
The first recognition module 630 is configured to recognize the target image, and obtain a first recognition result. In an embodiment, the first identifying module 630 may be used to perform the operation S230 described above, which is not described herein.
And the feature extraction module 640 is configured to extract a motion trail feature of the target object from the dynamic image, and extract an image feature of the target object from the target image. In an embodiment, the feature extraction module 640 may be configured to perform the operation S240 described above, which is not described herein.
The second recognition module 650 is configured to recognize the motion trail feature and the image feature, and obtain a second recognition result. In an embodiment, the second recognition module 650 may be configured to perform the operation S250 described above, which is not described herein.
The obtaining module 660 is configured to obtain a target recognition result according to the first recognition result and the second recognition result. In an embodiment, the obtaining module 660 may be configured to perform the operation S260 described above, which is not described herein.
According to an embodiment of the disclosure, the feature extraction module includes a first processing sub-module, a first detection sub-module, a second detection sub-module, and a first obtaining sub-module. The first processing sub-module is used for performing frame extraction processing on the dynamic image to obtain a plurality of image frames. The first detection submodule is used for carrying out optical flow detection on a plurality of image frames to obtain global motion track characteristics of the target object. The second detection sub-module is used for detecting key pixel points of the plurality of image frames to obtain local motion track characteristics of the target object. The first obtaining submodule is used for obtaining the motion trail feature of the target object according to the global motion trail feature and the local motion trail feature.
According to an embodiment of the present disclosure, the first detection sub-module includes a first obtaining unit, an executing unit, and a second obtaining unit. The first obtaining unit is used for processing the s-th image frame and the (s+1)-th image frame by utilizing an optical flow algorithm to obtain the s-th piece of pixel point change information, wherein s is an integer greater than or equal to 1 and less than S. The execution unit is configured to, in the case where s is determined to be smaller than S-1, increment s and return to perform the processing operation on the s-th image frame and the (s+1)-th image frame. The second obtaining unit is used for obtaining the global motion track feature of the target object according to the S-1 pieces of pixel point change information in the case where s is equal to S-1.
According to an embodiment of the present disclosure, the second detection sub-module includes an extraction unit, a detection unit, and a third obtaining unit. The extraction unit is used for extracting key pixel point characteristics of a plurality of image frames. The detection unit is used for detecting the characteristics of the key pixel points to obtain the position information of the key pixel points in each image frame. The third obtaining unit is used for obtaining the local motion track characteristics of the target object according to the position information of the key pixel point in each image frame.
According to an embodiment of the disclosure, the second recognition module includes a second processing sub-module, a second obtaining sub-module. The second processing sub-module is used for performing splicing processing on the motion trail feature and the image feature to obtain an intermediate feature. The second obtaining sub-module is used for identifying the intermediate features to obtain a second identification result.
According to an embodiment of the present disclosure, the feature extraction module further comprises a decomposition sub-module and a first extraction sub-module. The decomposition submodule is used for carrying out wavelet decomposition processing on the target image to obtain frequency domain information of the target image. The first extraction submodule is used for carrying out feature extraction processing on the frequency domain information to obtain image features.
According to an embodiment of the present disclosure, the processing module includes a first determination sub-module and a transformation processing sub-module. The first determining submodule is used for determining image transformation parameters according to the image acquisition parameters. The transformation processing sub-module is used for carrying out image transformation processing on the static image according to the image transformation parameters to obtain a target image.
According to an embodiment of the present disclosure, the first determining sub-module includes a first determining unit and a second determining unit. The first determining unit is used for determining deviation information between the image acquisition frame of the static image and the image acquisition frame of the target image according to the image acquisition parameters. The second determining unit is used for determining the image transformation parameters according to the deviation information.
According to an embodiment of the present disclosure, the first recognition module includes a second extraction sub-module, a second determination sub-module, and a third acquisition sub-module. The second extraction sub-module is used for extracting the position features of at least two preset key points in the target image. The second determining submodule is used for determining the gradient of the target object in the target image according to the position characteristics of at least two preset key points. The third obtaining submodule is used for obtaining a first identification result according to the target image and the historical image of the target object under the condition that the gradient is smaller than a preset threshold value.
Any of the acquisition module 610, the processing module 620, the first identification module 630, the feature extraction module 640, the second identification module 650, and the obtaining module 660 may be combined in one module to be implemented, or any of the modules may be split into multiple modules, according to embodiments of the present disclosure. Alternatively, at least some of the functionality of one or more of the modules may be combined with at least some of the functionality of other modules and implemented in one module. According to embodiments of the present disclosure, at least one of the acquisition module 610, the processing module 620, the first identification module 630, the feature extraction module 640, the second identification module 650, and the obtaining module 660 may be implemented at least in part as hardware circuitry, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system-on-chip, a system-on-substrate, a system-in-package, an Application Specific Integrated Circuit (ASIC), or as hardware or firmware in any other reasonable manner of integrating or packaging the circuitry, or as any one of, or a suitable combination of, software, hardware, and firmware. Alternatively, at least one of the acquisition module 610, the processing module 620, the first identification module 630, the feature extraction module 640, the second identification module 650, and the obtaining module 660 may be at least partially implemented as computer program modules that, when executed, perform the respective functions.
Fig. 7 schematically illustrates a block diagram of an electronic device adapted to implement an image recognition method according to an embodiment of the present disclosure.
As shown in fig. 7, an electronic device 700 according to an embodiment of the present disclosure includes a processor 701 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. The processor 701 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. The processor 701 may also include on-board memory for caching purposes. The processor 701 may comprise a single processing unit or a plurality of processing units for performing different actions of the method flows according to embodiments of the disclosure.
In the RAM 703, various programs and data necessary for the operation of the electronic apparatus 700 are stored. The processor 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. The processor 701 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 702 and/or the RAM 703. Note that the program may be stored in one or more memories other than the ROM 702 and the RAM 703. The processor 701 may also perform various operations of the method flow according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, the electronic device 700 may further include an input/output (I/O) interface 705, the input/output (I/O) interface 705 also being connected to the bus 704. The electronic device 700 may also include one or more of the following components connected to an input/output (I/O) interface 705: an input section 706 including a keyboard, a mouse, and the like; an output portion 707 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, a speaker, and the like; a storage section 708 including a hard disk or the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. The drive 710 is also connected to an input/output (I/O) interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read therefrom is mounted into the storage section 708 as necessary.
The present disclosure also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, the computer-readable storage medium may include ROM 702 and/or RAM 703 and/or one or more memories other than ROM 702 and RAM 703 described above.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the methods shown in the flowcharts. The program code, when executed in a computer system, causes the computer system to implement the image recognition method provided by embodiments of the present disclosure.
The above-described functions defined in the system/apparatus of the embodiments of the present disclosure are performed when the computer program is executed by the processor 701. The systems, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
In one embodiment, the computer program may be carried on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted and distributed over a network medium in the form of signals, downloaded and installed via the communication section 709, and/or installed from the removable medium 711. The computer program may include program code that may be transmitted using any appropriate network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
According to embodiments of the present disclosure, the program code of the computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages; in particular, such computer programs may be implemented in high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, C or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments of the disclosure and/or in the claims may be provided in a variety of combinations and/or combinations, even if such combinations or combinations are not explicitly recited in the disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or the claims may be variously combined and/or combined without departing from the spirit and teachings of the present disclosure. All such combinations and/or combinations fall within the scope of the present disclosure.
The embodiments of the present disclosure are described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described above separately, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be made by those skilled in the art without departing from the scope of the disclosure, and such alternatives and modifications are intended to fall within the scope of the disclosure.

Claims (13)

1. An image recognition method, comprising:
acquiring a static image, a dynamic image and image acquisition parameters of the static image of a target object, wherein the dynamic image is acquired when the target object completes a specified motion of living body detection, and the image acquisition parameters are randomly generated when the target object acquires the static image;
Performing image transformation processing on the static image according to the image acquisition parameters to obtain a target image;
identifying the target image to obtain a first identification result;
extracting the motion trail feature of the target object from the dynamic image, and extracting the image feature of the target object from the target image;
identifying the motion trail features and the image features to obtain a second identification result; and
and obtaining a target recognition result according to the first recognition result and the second recognition result.
2. The method of claim 1, wherein the extracting the motion profile feature of the target object from the dynamic image comprises:
performing frame extraction processing on the dynamic image to obtain a plurality of image frames;
performing optical flow detection on the plurality of image frames to obtain global motion track characteristics of the target object;
detecting key pixel points of the plurality of image frames to obtain local motion track characteristics of the target object; and
and obtaining the motion trail feature of the target object according to the global motion trail feature and the local motion trail feature.
3. The method of claim 2, wherein the plurality of image frames comprises S image frames, S being an integer greater than 1; and performing optical flow detection on the plurality of image frames to obtain the global motion track feature of the target object comprises:
processing the S-th image frame and the s+1th image frame by utilizing an optical flow algorithm to obtain the S-th pixel point change information, wherein S is an integer greater than or equal to 1 and less than S;
in the case where it is determined that S is less than S-1, returning to perform a processing operation for the S-th image frame and the s+1th image frame, and incrementing S; and
and under the condition that S is equal to S-1, obtaining the global motion track characteristic of the target object according to the S-1 pixel point change information.
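The loop in claim 3 walks consecutive frame pairs and collects S-1 pieces of pixel change information. A NumPy sketch of that loop, with plain inter-frame differencing standing in for a real dense optical flow algorithm (e.g. Farneback) so it stays self-contained — the differencing stand-in and the mean-motion aggregation are assumptions, not the claimed algorithm:

```python
import numpy as np

def global_motion_feature(frames: list) -> np.ndarray:
    """Process frame s and frame s+1 for s = 1 .. S-1, collecting S-1
    pieces of pixel change information, then aggregate them into one
    global motion trajectory feature (here: mean motion per pair)."""
    S = len(frames)
    assert S > 1, "the claim requires S > 1"
    changes = []
    s = 0  # 0-based index; the claim counts from 1
    while True:
        # "Pixel point change information" for the pair (s, s+1);
        # a real system would run an optical flow algorithm here.
        changes.append(np.abs(frames[s + 1].astype(float)
                              - frames[s].astype(float)))
        if s < S - 2:   # claim: s < S-1 -> increment s and repeat
            s += 1
        else:           # claim: s == S-1 -> aggregate the S-1 results
            break
    return np.array([c.mean() for c in changes])
```

For three frames the loop runs twice and yields a two-element feature, matching the S-1 count in the claim.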
4. The method of claim 2, wherein performing key pixel point detection on the plurality of image frames to obtain the local motion trajectory feature of the target object comprises:
extracting key pixel point features from the plurality of image frames;
detecting the key pixel point features to obtain position information of the key pixel points in each image frame; and
obtaining the local motion trajectory feature of the target object according to the position information of the key pixel points in each image frame.
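Once the per-frame key pixel point positions are available, one plausible way to turn them into a local motion trajectory feature — assumed here, since the claim leaves the construction open — is the frame-to-frame displacement of each point:

```python
import numpy as np

def local_motion_feature(keypoints_per_frame: list) -> np.ndarray:
    """Given the (x, y) positions of K key pixel points in each of T
    frames, form the local motion trajectory feature as the per-point
    frame-to-frame displacements, flattened into one vector."""
    positions = np.stack(keypoints_per_frame)   # shape (T, K, 2)
    displacements = np.diff(positions, axis=0)  # shape (T-1, K, 2)
    return displacements.reshape(-1)
```

A single point moving from (0, 0) to (1, 2) across two frames yields the feature [1, 2].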
5. The method of claim 1, wherein recognizing the motion trajectory feature and the image feature to obtain the second recognition result comprises:
concatenating the motion trajectory feature and the image feature to obtain an intermediate feature; and
recognizing the intermediate feature to obtain the second recognition result.
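The two steps of claim 5 can be sketched directly; the linear scorer standing in for the unspecified recognition model is an assumption for illustration:

```python
import numpy as np

def second_recognition(trajectory_feat: np.ndarray,
                       image_feat: np.ndarray,
                       weights: np.ndarray,
                       bias: float = 0.0) -> float:
    """Concatenate the motion trajectory feature with the image feature
    into an intermediate feature, then score it. A trained classifier
    would replace the linear scorer in practice."""
    intermediate = np.concatenate([trajectory_feat, image_feat])
    return float(intermediate @ weights + bias)
```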
6. The method of claim 1, wherein extracting the image feature of the target object from the target image comprises:
performing wavelet decomposition on the target image to obtain frequency domain information of the target image; and
performing feature extraction on the frequency domain information to obtain the image feature.
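The wavelet decomposition in claim 6 can be illustrated with one level of a 2-D Haar transform; in practice a library such as PyWavelets would be used, and the choice of the Haar basis here is an assumption to keep the sketch NumPy-only:

```python
import numpy as np

def haar_decompose(img: np.ndarray):
    """One level of 2-D Haar wavelet decomposition. Returns the
    approximation sub-band (LL) and the detail sub-bands (LH, HL, HH),
    which together form the frequency-domain information of the image.
    Height and width must be even."""
    a = img.astype(float)
    # Row transform: averages and differences of adjacent column pairs.
    lo = (a[:, 0::2] + a[:, 1::2]) / 2.0
    hi = (a[:, 0::2] - a[:, 1::2]) / 2.0
    # Column transform on both halves to obtain the four sub-bands.
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return ll, lh, hl, hh
```

For a constant image all detail sub-bands vanish and the LL band reproduces the constant, which is a quick sanity check on the transform.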
7. The method of claim 1, wherein performing image transformation processing on the static image according to the image acquisition parameters to obtain the target image comprises:
determining image transformation parameters according to the image acquisition parameters; and
performing image transformation processing on the static image according to the image transformation parameters to obtain the target image.
8. The method of claim 7, wherein determining the image transformation parameters according to the image acquisition parameters comprises:
determining deviation information between an image acquisition frame of the static image and an image acquisition frame of the target image according to the image acquisition parameters; and
determining the image transformation parameters according to the deviation information.
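Claim 8 maps a frame deviation to transformation parameters without fixing the parameterisation. A sketch under the assumption that the deviation is a pixel offset (dx, dy) plus an optional scale, expressed as a 2x3 affine matrix:

```python
import numpy as np

def transform_from_deviation(dx: float, dy: float,
                             scale: float = 1.0) -> np.ndarray:
    """Build a 2x3 affine matrix that cancels the deviation (dx, dy)
    between the static image's acquisition frame and the target frame.
    The scale-plus-translation model is an assumed parameterisation."""
    return np.array([[scale, 0.0, -dx],
                     [0.0, scale, -dy]])

def apply_affine(points: np.ndarray, m: np.ndarray) -> np.ndarray:
    """Apply the affine matrix to an (N, 2) array of pixel coordinates."""
    homog = np.hstack([points, np.ones((len(points), 1))])
    return homog @ m.T
```

A point at the deviated offset maps back to the origin, i.e. the transform aligns the static image with the target acquisition frame.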
9. The method of claim 1, wherein recognizing the target image to obtain the first recognition result comprises:
extracting position features of at least two preset key points in the target image;
determining an inclination of the target object in the target image according to the position features of the at least two preset key points; and
in a case where the inclination is less than a preset threshold, obtaining the first recognition result according to the target image and a historical image of the target object.
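The inclination check in claim 9 can be sketched from two key points; the choice of eye corners as the key points and the 15-degree threshold are assumptions, since the claim only says "at least two preset key points" and "a preset threshold":

```python
import math

def inclination_degrees(p1: tuple, p2: tuple) -> float:
    """Estimate the tilt of the target object from two preset key
    points (e.g. the outer eye corners of a face): the absolute angle
    of the line through them relative to the horizontal, in degrees."""
    (x1, y1), (x2, y2) = p1, p2
    return abs(math.degrees(math.atan2(y2 - y1, x2 - x1)))

THRESHOLD = 15.0  # assumed value; the claim only says "preset threshold"
```

A nearly level pair of points passes the threshold check and recognition proceeds against the historical image; a strongly rotated pair does not.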
10. An image recognition apparatus, comprising:
an acquisition module configured to acquire a static image of a target object, a dynamic image of the target object, and image acquisition parameters of the static image, wherein the dynamic image is acquired while the target object completes a specified action for living-body detection, and the image acquisition parameters are randomly generated when the static image of the target object is acquired;
a processing module configured to perform image transformation processing on the static image according to the image acquisition parameters to obtain a target image;
a first recognition module configured to recognize the target image to obtain a first recognition result;
a feature extraction module configured to extract a motion trajectory feature of the target object from the dynamic image and an image feature of the target object from the target image;
a second recognition module configured to recognize the motion trajectory feature and the image feature to obtain a second recognition result; and
an obtaining module configured to obtain a target recognition result according to the first recognition result and the second recognition result.
11. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any one of claims 1 to 9.
12. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any of claims 1 to 9.
13. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 9.
CN202310446706.XA 2023-04-24 2023-04-24 Image recognition method, device, equipment and storage medium Pending CN116452886A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310446706.XA CN116452886A (en) 2023-04-24 2023-04-24 Image recognition method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310446706.XA CN116452886A (en) 2023-04-24 2023-04-24 Image recognition method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116452886A 2023-07-18

Family

ID=87127263

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310446706.XA Pending CN116452886A (en) 2023-04-24 2023-04-24 Image recognition method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116452886A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116737991A (en) * 2023-08-11 2023-09-12 陕西龙朔通信技术有限公司 Network video monitoring data processing method and system
CN116737991B (en) * 2023-08-11 2023-10-20 陕西龙朔通信技术有限公司 Network video monitoring data processing method and system

Similar Documents

Publication Publication Date Title
WO2021139324A1 (en) Image recognition method and apparatus, computer-readable storage medium and electronic device
CN108509915B (en) Method and device for generating face recognition model
CN111080628B (en) Image tampering detection method, apparatus, computer device and storage medium
CN114913565B (en) Face image detection method, model training method, device and storage medium
US20230021661A1 (en) Forgery detection of face image
Almaev et al. Local gabor binary patterns from three orthogonal planes for automatic facial expression recognition
Boutros et al. Iris and periocular biometrics for head mounted displays: Segmentation, recognition, and synthetic data generation
Seow et al. A comprehensive overview of Deepfake: Generation, detection, datasets, and opportunities
US9152888B2 (en) System and method for automated object detection in an image
Li et al. End-to-end attack on text-based CAPTCHAs based on cycle-consistent generative adversarial network
WO2021047482A1 (en) Method and system for performing steganographic technique
Chen et al. From eyes to face synthesis: a new approach for human-centered smart surveillance
Wang et al. Deepfake forensics via an adversarial game
CN114663871A (en) Image recognition method, training method, device, system and storage medium
CN111738199B (en) Image information verification method, device, computing device and medium
CN116452886A (en) Image recognition method, device, equipment and storage medium
Ahmed et al. Evaluating the effectiveness of rationale-augmented convolutional neural networks for deepfake detection
Khosravian et al. Enhancing the robustness of the convolutional neural networks for traffic sign detection
CN113657498B (en) Biological feature extraction method, training method, authentication method, device and equipment
Liu et al. A multi-stream convolutional neural network for micro-expression recognition using optical flow and evm
Qin et al. Face inpainting network for large missing regions based on weighted facial similarity
Gupta et al. Normalization free Siamese network for object tracking
Yeom et al. Person-specific face detection in a scene with optimum composite filtering and colour-shape information
Kim et al. AFA-Net: Adaptive Feature Attention Network in image deblurring and super-resolution for improving license plate recognition
Lou et al. Black-box attack against GAN-generated image detector with contrastive perturbation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination