CN111738199B

CN111738199B - Image information verification method, device, computing device and medium

Info

Publication number: CN111738199B
Application number: CN202010616687.7A
Authority: CN
Inventors: 李桂锋; 陈永录; 张飞燕
Original assignee: Industrial and Commercial Bank of China Ltd ICBC
Current assignee: Industrial and Commercial Bank of China Ltd ICBC
Priority date: 2020-06-30
Filing date: 2020-06-30
Publication date: 2023-12-19
Anticipated expiration: 2040-06-30
Also published as: CN111738199A

Abstract

The present disclosure provides an image information verification method, including: acquiring a target video sequence of a target user when information to be verified is input, wherein the target video sequence comprises a plurality of frame images containing human faces; dividing a face image from each of a plurality of frame images to obtain a plurality of face images; extracting target features of a target video sequence from a plurality of face images according to a local binary pattern algorithm; and processing the target feature using the trained classification model to verify the authenticity of the information to be verified. The present disclosure also provides an image information verification apparatus, a computing apparatus, and a medium.

Description

Image information verification method, device, computing device and medium

Technical Field

The present disclosure relates to the field of computer vision, and more particularly, to an image information verification method, apparatus, computing apparatus, and medium.

Background

Micro-expressions are an instinctive behavior of humans, which is part of psychological stress, and non-linguistic behavior of humans to express their own emotional information. Micro-expressions are often subconscious, uncontrolled by thought, unable to disguise, and unable to disguise. The duration of the microexpressions is short, typically between 1/25 second and 1/2 second, and the amplitude of the facial muscle movements is small when microexpressions occur.

In the past, the business of the bank mainly takes off-line, and when a user transacts the banking business, the user needs to transact at off-line network points of the bank, which wastes time and labor for the user. With the advent of 5G, online customer service has gone to the era stage. In the past, the business which the user needs to go to the website for face-to-face treatment can be transacted on line through online customer service, so that the user can transact the required business quickly without going to the website.

However, when a user handles complicated services such as "revocation of loss", "information maintenance" or "expiration of certificate information and re-upload" on line, it is necessary to verify personal identity information. The online system of the bank is difficult to distinguish the authenticity of the information, so that certain potential safety hazards are brought to the user account.

Disclosure of Invention

One aspect of the present disclosure provides an image information verification method, including: acquiring a target video sequence of a target user when information to be verified is input, wherein the target video sequence comprises a plurality of frame images containing human faces; dividing a face image from each frame image in the plurality of frame images to obtain a plurality of face images; extracting target features of the target video sequence from the plurality of face images according to a local binary pattern algorithm; and processing the target feature with a trained classification model to verify the authenticity of the information to be verified.

Optionally, the method further comprises: training the classification model, wherein the training the classification model comprises: acquiring a plurality of sample video sequences and authenticity labels corresponding to the plurality of sample video sequences, wherein each sample video sequence in the plurality of sample video sequences comprises a plurality of sample images; extracting sample features from sample images of each sample video sequence according to a local binary pattern algorithm; inputting the classification model according to the sample characteristics to obtain a classification result; and adjusting parameters of a local binary pattern algorithm and parameters of the classification model according to the classification result and the authenticity label.

Optionally, the method further comprises, before the dividing the face image from each of the plurality of frame images, respectively: determining a key point set according to each frame image; and carrying out normalization processing on each frame image according to the key point set.

Optionally, the normalizing processing is performed on each frame image according to the key point set, including: acquiring a template frame image, and determining a plurality of first face key points from the template image; determining a plurality of second face key points from the frame images for each of the plurality of frame images; determining the weight of the frame image according to the plurality of first face key points and the plurality of second face key points; and weighting each pixel value of the frame image by using the weight to obtain the normalized frame image.

Optionally, the method further comprises, after the dividing the face image from each of the plurality of frame images, respectively: and carrying out interpolation operation on a plurality of face images according to a time interpolation algorithm so as to normalize the number of image sequence frames.

Optionally, the dividing the face image from each of the plurality of frame images includes: acquiring two pupil coordinates in the template frame image; determining a region of interest according to the two pupil coordinates; and dividing the face image from each frame image according to the region of interest.

Another aspect of the present disclosure provides an image information authentication apparatus, including: the acquisition module is used for acquiring a target video sequence of a target user when the target user inputs information to be verified, wherein the target video sequence comprises a plurality of frame images containing human faces; the segmentation module is used for respectively segmenting the face image from each frame image in the plurality of frame images to obtain a plurality of face images; the feature extraction module is used for extracting target features of the target video sequence from the plurality of face images according to a local binary pattern algorithm; and a classification module for processing the target features using a trained classification model to verify the authenticity of the information to be verified.

Optionally, the apparatus further comprises: the training module is used for training the classification model, wherein the training module comprises: a sample obtaining sub-module, configured to obtain a plurality of sample video sequences and an authenticity tag corresponding to the plurality of sample video sequences, where each sample video sequence in the plurality of sample video sequences includes a plurality of sample images; the extraction submodule is used for extracting sample features from sample images of each sample video sequence according to a local binary pattern algorithm; the input sub-module is used for inputting the classification model according to the sample characteristics to obtain a classification result; and the adjustment sub-module is used for adjusting parameters of a local binary pattern algorithm and parameters of the classification model according to the classification result and the authenticity label.

Another aspect of the present disclosure provides a computing device comprising: one or more processors; and a storage means for storing one or more programs, which when executed by the one or more processors cause the one or more processors to implement the methods as described above.

Another aspect of the present disclosure provides a computer-readable storage medium storing computer-executable instructions that, when executed, are configured to implement a method as described above.

Another aspect of the present disclosure provides a computer program comprising computer executable instructions which when executed are for implementing a method as described above.

According to embodiments of the present disclosure, features are extracted from a target video sequence by obtaining the target video sequence of a target user at the time of inputting information to be verified, and the target features are processed using a trained classification model in order to verify the authenticity of the information to be verified. The user can verify the information required by the business without going to the appointed offline website, the online business handling range of the bank is expanded, the work efficiency of the related business for verifying the bank information is improved, and the potential safety hazard of the customer account is reduced.

Drawings

For a more complete understanding of the present disclosure and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:

fig. 1 schematically illustrates a system architecture of an image information verification method and an image information verification apparatus according to an embodiment of the present disclosure;

Fig. 2 schematically illustrates a flowchart of an image information verification method according to an embodiment of the present disclosure;

FIG. 3 schematically illustrates a flow chart of training a classification model according to an embodiment of the disclosure;

fig. 4 schematically illustrates a flowchart of an image information verification method according to another embodiment of the present disclosure;

FIG. 5 schematically illustrates a flowchart of normalizing each frame image according to a set of key points, according to an embodiment of the present disclosure;

FIG. 6 schematically illustrates a flow chart for segmenting a face image from each of a plurality of frame images, according to an embodiment of the disclosure;

fig. 7 schematically illustrates a flowchart of an image information verification method according to another embodiment of the present disclosure;

fig. 8 schematically shows a schematic diagram of a sequence of face images according to an embodiment of the disclosure;

FIG. 9 schematically illustrates a schematic view of a set of face images according to an embodiment of the present disclosure;

FIG. 10 schematically illustrates a schematic diagram of dividing a face image set into feature blocks according to an embodiment of the present disclosure;

FIG. 11 schematically illustrates a schematic diagram of obtaining three-dimensional LBP features for a face image set using an LBP-TOP algorithm in accordance with an embodiment of the present disclosure;

FIG. 12 schematically illustrates a schematic diagram of an confusion matrix according to an embodiment of the disclosure;

FIG. 13 schematically illustrates an online identity verification system operation interface schematic in accordance with an embodiment of the present disclosure;

fig. 14 schematically illustrates a block diagram of an image information verification apparatus according to an embodiment of the present disclosure;

fig. 15 schematically illustrates a block diagram of an image information verification apparatus according to an embodiment of the present disclosure; and

fig. 16 schematically illustrates a block diagram of a computer system suitable for implementing the above-described method according to an embodiment of the present disclosure.

Detailed Description

Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.

All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.

Where expressions like at least one of "A, B and C, etc. are used, the expressions should generally be interpreted in accordance with the meaning as commonly understood by those skilled in the art (e.g.," a system having at least one of A, B and C "shall include, but not be limited to, a system having a alone, B alone, C alone, a and B together, a and C together, B and C together, and/or A, B, C together, etc.). Where a formulation similar to at least one of "A, B or C, etc." is used, in general such a formulation should be interpreted in accordance with the ordinary understanding of one skilled in the art (e.g. "a system with at least one of A, B or C" would include but not be limited to systems with a alone, B alone, C alone, a and B together, a and C together, B and C together, and/or A, B, C together, etc.).

Some of the block diagrams and/or flowchart illustrations are shown in the figures. It will be understood that some blocks of the block diagrams and/or flowchart illustrations, or combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the instructions, when executed by the processor, create means for implementing the functions/acts specified in the block diagrams and/or flowchart. The techniques of this disclosure may be implemented in hardware and/or software (including firmware, microcode, etc.). Additionally, the techniques of this disclosure may take the form of a computer program product on a computer-readable storage medium having instructions stored thereon, the computer program product being for use by or in connection with an instruction execution system.

Embodiments of the present disclosure provide an image information verification method and an image information verification apparatus capable of applying the method. The method comprises the steps of obtaining a target video sequence of a target user when information to be verified is input, wherein the target video sequence of the user comprises a plurality of frame images containing human faces; dividing a face image from each of a plurality of frame images of a user to obtain a plurality of face images; extracting target features of a user target video sequence from a plurality of face images of a user according to a local binary pattern algorithm; and processing the user target feature using the trained classification model to verify the authenticity of the information to be verified by the user.

Fig. 1 schematically illustrates a system architecture of an image information verification method and an image information verification apparatus according to an embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a system architecture to which embodiments of the present disclosure may be applied to assist those skilled in the art in understanding the technical content of the present disclosure, but does not mean that embodiments of the present disclosure may not be used in other devices, systems, environments, or scenarios.

As shown in fig. 1, the system architecture 100 may include a terminal device 101, a server 102, and a network 103. The network 103 is a medium used to provide a communication link between the terminal device 101 and the server 102. The network 103 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.

A user may interact with the server 102 via the network 103 using the terminal device 101 to receive or send messages or the like. The terminal device 101 may have various communication client applications installed thereon, such as a mobile banking client, a shopping class application, a web browser application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc. (as examples only).

The terminal device 101 may be a variety of electronic devices having image capturing means (e.g., camera) and input means (e.g., keyboard, touch screen, etc.), including but not limited to smartphones, tablet computers, laptop and desktop computers, and the like. The terminal apparatus 101 may collect a facial image of a user when the user inputs information through an input device, and transmit the information input by the user and the collected facial image to the server 102 through the network 103.

The server 102 may be a server providing various services, such as a background management server providing support for websites browsed by the user using the terminal device 101. The background management server can analyze, verify and the like the received face image of the user, obtain a verification result, and determine the authenticity of the information input by the user according to the verification result.

It should be noted that, the image information verification method provided by the embodiment of the present disclosure may be generally performed by the server 102. Accordingly, the image information authentication apparatus provided by the embodiments of the present disclosure may be generally provided in the server 102. The image information verification method provided by the embodiment of the present disclosure may also be performed by a server or a server cluster that is different from the server 102 and is capable of communicating with the terminal device 101 and/or the server 102. Accordingly, the image information verification apparatus provided by the embodiments of the present disclosure may also be provided in a server or a server cluster that is different from the server 102 and is capable of communicating with the terminal device 101 and/or the server 102.

It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

Fig. 2 schematically illustrates a flowchart of an image information verification method according to an embodiment of the present disclosure.

As shown in fig. 2, the method includes operations S210 to S240.

In operation S210, a target video sequence of a target user when inputting information to be verified is acquired.

Wherein the target video sequence comprises a plurality of frame images (frames) comprising faces.

According to embodiments of the present disclosure, a target user may record information to be verified as a video clip in the process of inputting the information, and then serialize the video clip to obtain a target video sequence.

In operation S220, a face image is divided from each of a plurality of frame images, respectively, to obtain a plurality of face images.

According to the embodiment of the disclosure, the region which does not contain the face information in the frame image can be deleted, and only the region containing the face information is segmented, so that the face image is obtained for subsequent processing.

In operation S230, target features of a target video sequence are extracted from a plurality of face images according to a local binary pattern algorithm.

According to embodiments of the present disclosure, the local binary pattern algorithm may include, for example, an LBP-TOP (Local Binary Pattems on Three Orthogonal Planes, three orthogonal plane local binary pattern feature) algorithm. Based on this, operation S230 may include dividing a plurality of face images in a target video sequence into a×b×c feature blocks in a three-dimensional space using an LBP-TOP algorithm, where a is a column-wise block number, determined by the number of pixels in a unit column of the face image, B is a row-wise block number, determined by the number of pixels in a unit row of the face image, and C is a time-wise block number, determined by a duration of the target video sequence, for example.

In operation S240, the target features are processed using the trained classification model in order to verify the authenticity of the information to be verified.

When a person is doing his or her work, the facial expression may change slightly. Based on this principle, a classification model according to embodiments of the present disclosure, after training, may be used to identify facial microexpressive features in an image, thereby determining whether a user in the image is doing his or her work. The classification model may be, for example, LSVM (Linear Support Vector Machine ).

Based on this, operation S240 may include, for example, inputting the target feature of the target video sequence into the trained LSVM to obtain the authenticity of the user, and then verifying the authenticity of the information to be verified, which is input by the user, according to the authenticity of the user.

In addition to operations S210-S240, the method may further include operation S310, training a classification model. Operation S310 may be performed, for example, before operation S210.

FIG. 3 schematically illustrates a flow chart of training a classification model according to an embodiment of the disclosure.

As shown in FIG. 3, training the classification model may include, for example, operations S311-S314.

In operation S311, a plurality of sample video sequences and an authenticity tag corresponding to the plurality of sample video sequences are acquired. Wherein each sample video sequence of the plurality of sample video sequences comprises a plurality of sample images.

According to the embodiment of the disclosure, a plurality of video clips of the face under the condition of being falsified and the video clips of the face under the condition of not being falsified can be acquired, and the video clips containing the face information are serialized into a sample video sequence. Each sample video sequence has a corresponding true-false label, if the sample video sequence contains face information when fake is made, the true-false label is false, and if the sample video sequence contains face information when fake is not made, the true-false label is true.

In operation S312, for each sample video sequence, sample features are extracted from sample images of the sample video sequence according to a local binary pattern algorithm.

According to embodiments of the present disclosure, frame images containing face information may be extracted from a sample video sequence. And obtaining three-dimensional LBP characteristics by adopting an LBP-TOP algorithm to the frame image sets.

In operation S313, a classification model is input according to the sample features, resulting in classification results.

According to embodiments of the present disclosure, the classification model may be, for example, a linear support vector machine. Operation S313 may include, for example, creating a classification model, and inputting the three-dimensional LBP features obtained in operation S312 into a linear support vector machine to obtain a classification result. According to embodiments of the present disclosure, classification results may include true (true) and false (false), for example.

In operation S314, parameters of the local binary pattern algorithm and parameters of the classification model are adjusted according to the classification result and the genuine-genuine label.

According to embodiments of the present disclosure, parameters of the local binary pattern algorithm may include, for example, any one or more of a column-wise block number, a row-wise block number, and a time-wise block number.

According to an embodiment of the present disclosure, operations S312 to S314 may be repeatedly performed until the accuracy of the obtained classification result meets a preset requirement.

In this embodiment, the accuracy of the classification result may be represented in the form of an confusion matrix, where each column of the confusion matrix represents a prediction class, each row represents a true attribution class of the data, and the value of the confusion matrix on the diagonal represents the accuracy of model prediction. Illustratively, the confusion matrix is shown, for example, in table 1. The values on the diagonal of the confusion matrix are 71.60 and 68.60, respectively, that is, the accuracy of the model prediction true is 71.60%, and the accuracy of the prediction false is 68.60.

TABLE 1

According to the embodiment of the present disclosure, if the values of the confusion matrix on the diagonal line reach the locally optimal solution after repeating operations S312 to S314 for several times, that is, the accuracy of the classification result obtained later is not increased, the accuracy of the classification result meets the preset requirement.

Fig. 4 schematically illustrates a flowchart of an image information verification method according to another embodiment of the present disclosure.

As shown in fig. 4, the method may further include operations S410 to S420 in addition to operations S210 to S240. Wherein, operation S410 may be performed before dividing the face image from each of the plurality of frame images, respectively.

In operation S410, a set of key points is determined from each frame image.

According to the embodiment of the disclosure, a preset number of face key points can be calibrated for each frame image by using an ASM (Active Shape Model ) algorithm, and all face key points corresponding to each frame image are used as a key point set of the frame image. Illustratively, in the present embodiment, the preset number may be 68, for example.

When multiple sets of keypoints are identified in a frame, screening of the sets of keypoints is required according to embodiments of the present disclosure. More specifically, for k key point sets (k. Gtoreq.1) marked in a frame, the relative distance between any two key points in each key point set can be calculated according to formula I, and L can be selected ^(k) (I _i ) The largest two key points belong to the key point set which is used as the key point set of the face to be detected.

Wherein I is _i Representing the ith frame image in a video segment, i is greater than or equal to 1, L ^(k) (I _i ) Representing the relative distance between any two key points of the kth key point set of the ith frame image, wherein p and q refer to the same key point setAny two key points of p, q E [1, 68 ]]And is also provided with

In operation S420, normalization processing is performed on each frame image according to the set of key points.

Fig. 5 schematically illustrates a flowchart of normalization processing for each frame image according to a set of key points according to an embodiment of the present disclosure.

As shown in fig. 5, operations S521 to S524 may be included in addition to operation S420, for example.

In operation S521, a template frame image is acquired, and a plurality of first face key points are determined from the template image.

In operation S522, a plurality of second face key points are determined from the frame images for each of the plurality of frame images.

In operation S523, the weight of the frame image is determined according to the plurality of first face keypoints and the plurality of second face keypoints.

In operation S524, each pixel value of the frame image is weighted with a weight to obtain a normalized frame image.

According to the embodiment of the disclosure, a frame image with a centered face position and a right angle can be determined from all frame images and used as a template frame image. A plurality of first face keypoints is then determined from the set of keypoints of the template frame image. Illustratively, in the present embodiment, all the keypoints in the set of keypoints of the template frame image are taken as the first keypoints.

According to an embodiment of the present disclosure, a plurality of second face key points may be determined from each key point set in other frame images than the template frame image. Illustratively, in the present embodiment, all the keypoints in each of the keypoints sets in the other frame images are taken as the second keypoints.

According to embodiments of the present disclosure, a LWM (Local Weighted Mean, local weighted average algorithm) function may be utilized to establish a correspondence T between a set of keypoints of a template frame image and a set of keypoints of a first frame image in a video sequence:

wherein I is _mod For template frame image, I ₁ For the first frame image of a video sequence,for the key point set of the template frame image, +.>For the key point set of the first frame image, LWM () is an LWM function.

According to embodiments of the present disclosure, when I _mod For the first frame of a certain video sequence, T is a constant 1.

According to an embodiment of the present disclosure, T may be taken as the weight of a frame image, and then the frame images in the same video sequence may be normalized to a pattern of model-like frame images according to the following formula (formula III):

I′ _i ＝T·I _i (formula III)

Wherein I' _i Is the normalized i frame image.

Fig. 6 schematically illustrates a flowchart for separately segmenting a face image from each of a plurality of frame images according to an embodiment of the present disclosure.

As shown in fig. 6, operation S220 may include operations S610 to S630, for example.

In operation S610, two pupil coordinates in a template frame image are acquired.

In operation S620, a region of interest is determined from the two pupil coordinates.

In operation S630, a face image is segmented from each frame image according to the region of interest.

According to embodiments of the present disclosure, two pupil coordinates may be obtained by image recognition, or manually calibrated.

According to the embodiment of the disclosure, the pupil distance between two pupils can be calculated according to the two pupil coordinates, and then the region where the face is located, namely the region of interest, is calculated according to the pupil distance. Illustratively, in this embodiment, the area of interest is twice as wide and high as the pupil distance.

According to the embodiment of the disclosure, the face image is obtained by dividing the image in the region of interest from each frame image, the background region which does not contain the face information in the frame image can be deleted, only the region containing the face information is reserved, the interference of environmental factors on recognition can be reduced, and the accuracy of micro expression recognition is improved.

Fig. 7 schematically illustrates a flowchart of an image information verification method according to another embodiment of the present disclosure.

As shown in fig. 7, the method may further include an operation S710 of interpolating a plurality of face images according to a temporal interpolation algorithm to normalize the number of image sequence frames, in addition to operations S210 to S240 and S310.

According to embodiments of the present disclosure, if the number of frames of each video sequence is different, a difference operation may be performed for each video sequence using a TIM (Temporal Interpolation Model ) to unify the video sequences to the same number of frames.

According to an embodiment of the present disclosure, operation S710 may be performed, for example, after dividing a face image from each of a plurality of frame images, respectively.

The method of embodiments of the present disclosure is further described below in conjunction with figures 8-13 and the specific embodiments. Those skilled in the art will appreciate that the following example embodiments are merely for the understanding of the present disclosure, and the present disclosure is not limited thereto.

Step 1: collecting a plurality of video clips containing face information, serializing each video clip into a video sequence, and processing each video sequence by using the steps 1.1 to 1.3 to obtain a preprocessed face image sequence.

Step 1.1: and selecting a model frame image of the video fragment, and calibrating a key point set of the face to be identified by using an ASM face calibration algorithm on the model frame image and the first frame image of the video sequence respectively.

Step 1.2: and (2) establishing a corresponding relation model for the key point set of the model frame image and the first frame image obtained in the step (1.1), and respectively inputting the rest frame images in the video sequence into the corresponding relation model to obtain the video sequence with uniform gesture.

Step 1.3: and (3) carrying out background segmentation on the video sequence with uniform gestures obtained in the step (1.2) according to the pupil distance of the face to be recognized, and obtaining a plurality of segmented face images, namely a face image sequence, as shown in fig. 8.

Step 2: and (3) unifying the number of frames of the face image sequence obtained in the step (1) by utilizing a TIM algorithm to obtain a video sequence after the difference, namely a face image set, as shown in fig. 9.

Step 3: and (3) obtaining three-dimensional LBP characteristics by adopting an LBP-TOP algorithm on the face image set obtained in the step (2).

More specifically, as shown in fig. 10, the face image set is divided into a plurality of feature blocks (also called cubes) of 8×8×1 in three-dimensional space, wherein the first parameter (8) is the number of column-wise blocks, the second parameter (8) is the number of row-wise blocks, and the third parameter (1) is the number of time-wise blocks. As shown in fig. 11, the feature is extracted from the LBP model with the encoding mode of uniform code, the radius value R of 2, and the sampling number P of 8, for the face image set of fig. 10. The LBP characteristics of XY, XT and YT directions are extracted from a single cube, and the three LBP characteristics are combined in series to form the LBP-TOP characteristic (LBP-TOP) of the single cube ¹ ) The corpus is then traversed to obtain global LBP-TOP features (LBP-TOP) ² )。

Step 4: inputting the three-dimensional LBP characteristics into an LSVM classifier to obtain the microexpressive category corresponding to the face information contained in the face image set.

Step 5: and (3) determining and analyzing the confusion matrix according to the LSVM classification result obtained in the step (3), and changing the LBP-TOP parameters, and repeating the step (3) until the confusion matrix is clear in area, as shown in figure 12.

Step 6: and establishing an online information verification system according to the model face key point set, the corresponding relation, the face segmentation size, the TIM algorithm parameter, the LSVM algorithm parameter and the blocking parameter of the LBP-TOP feature extraction algorithm.

Step 7: when a user handles online business, the communication process between customer service personnel and a customer is recorded in real time, an online information verification system is utilized to analyze the micro expression change of the user in the conversation process according to the recorded video clips, an analysis result is obtained, the analysis result is displayed on a system interface in real time, and the customer service personnel can judge the authenticity of information provided by the user through the analysis result.

FIG. 13 illustrates an exemplary online identity verification system operation interface diagram in accordance with an embodiment of the present disclosure. As shown in fig. 13, the left side is a video acquisition and analysis result display area, and the right side is an audit service handling area. The auditor can determine the authenticity of the information provided by the user when the user transacts the business by taking the video image and the analysis result of the user displayed in the display area as references.

Fig. 14 schematically shows a block diagram of an image information authentication apparatus according to an embodiment of the present disclosure.

As shown in fig. 14, the image information verification apparatus 1400 includes an acquisition module 1410, a segmentation module 1420, a feature extraction module 1430, and a classification module 1440.

Specifically, the acquiring module 1410 is configured to acquire a target video sequence when the target user inputs information to be verified, where the target video sequence includes a plurality of frame images including a face.

The segmentation module 1420 is configured to segment a face image from each of a plurality of frame images to obtain a plurality of face images.

The feature extraction module 1430 is configured to extract target features of the target video sequence from the plurality of face images according to a local binary pattern algorithm.

The classification module 1440 is configured to process the target feature using the trained classification model to verify the authenticity of the information to be verified.

Fig. 15 schematically shows a block diagram of an image information authentication apparatus according to an embodiment of the present disclosure.

As shown in fig. 15, the image information verification apparatus 1500 may further include 1510 in addition to the acquisition module 1410, the segmentation module 1420, the feature extraction module 1430, and the classification module 1440.

Specifically, the training module 1510 is configured to train the classification model, where the training module may include:

a sample acquiring submodule 1511 is configured to acquire a plurality of sample video sequences and an authenticity tag corresponding to the plurality of sample video sequences, where each sample video sequence in the plurality of sample video sequences includes a plurality of sample images.

An extraction sub-module 1512 is configured to extract, for each sample video sequence, sample features from sample images of the sample video sequence according to a local binary pattern algorithm.

And an input submodule 1513, configured to input a classification model according to the sample features, and obtain a classification result.

And the adjustment submodule 1514 is used for adjusting parameters of the local binary pattern algorithm and parameters of the classification model according to the classification result and the authenticity label.

Any number of modules, sub-modules, units, sub-units, or at least some of the functionality of any number of the sub-units according to embodiments of the present disclosure may be implemented in one module. Any one or more of the modules, sub-modules, units, sub-units according to embodiments of the present disclosure may be implemented as split into multiple modules. Any one or more of the modules, sub-modules, units, sub-units according to embodiments of the present disclosure may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system-on-chip, a system-on-substrate, a system-on-package, an Application Specific Integrated Circuit (ASIC), or in any other reasonable manner of hardware or firmware that integrates or encapsulates the circuit, or in any one of or a suitable combination of three of software, hardware, and firmware. Alternatively, one or more of the modules, sub-modules, units, sub-units according to embodiments of the present disclosure may be at least partially implemented as computer program modules, which when executed, may perform the corresponding functions.

For example, any of the acquisition module 1410, the segmentation module 1420, the feature extraction module 1430, the classification module 1440, and the training module 1510 may be combined in one module to be implemented, or any of the modules may be split into multiple modules. Alternatively, at least some of the functionality of one or more of the modules may be combined with at least some of the functionality of other modules and implemented in one module. At least one of the acquisition module 1410, the segmentation module 1420, the feature extraction module 1430, the classification module 1440, and the training module 1510 may be implemented, at least in part, as hardware circuitry, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or as hardware or firmware in any other reasonable manner of integrating or packaging the circuitry, or as any one of or a suitable combination of three of software, hardware, and firmware, in accordance with embodiments of the present disclosure. Alternatively, at least one of the acquisition module 1410, the segmentation module 1420, the feature extraction module 1430, the classification module 1440, and the training module 1510 may be implemented at least in part as a computer program module that, when executed, performs the corresponding functions.

Fig. 16 schematically illustrates a block diagram of a computer system suitable for implementing the above-described method according to an embodiment of the present disclosure. The computer system illustrated in fig. 16 is merely an example, and should not be construed as limiting the functionality and scope of use of the embodiments of the present disclosure.

As shown in fig. 16, the image information authentication apparatus 1600 includes a processor 1610 and a computer-readable storage medium 1620. The computer system 1600 may perform methods according to embodiments of the present disclosure.

In particular, processor 1610 can include, for example, a general purpose microprocessor, an instruction set processor, and/or an associated chipset and/or special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. Processor 1610 may also include on-board memory for caching purposes. Processor 1610 may be a single processing unit or multiple processing units for performing different actions of a method flow according to an embodiment of the disclosure.

Computer-readable storage medium 1620, which may be, for example, a non-volatile computer-readable storage medium, specific examples include, but are not limited to: magnetic storage devices such as magnetic tape or hard disk (HDD); optical storage devices such as compact discs (CD-ROMs); a memory, such as a Random Access Memory (RAM) or a flash memory; etc.

The computer-readable storage medium 1620 may include a computer program 1621, which computer program 1621 may include code/computer-executable instructions that, when executed by the processor 1610, cause the processor 1610 to perform a method according to an embodiment of the disclosure or any variation thereof.

Computer program 1621 may be configured to have computer program code comprising, for example, computer program modules. For example, in an example embodiment, code in computer program 1621 may include one or more program modules, including for example 1621A, modules 1621B, … …. It should be noted that the division and number of modules is not fixed, and a person skilled in the art may use suitable program modules or combinations of program modules according to the actual situation, which when executed by the processor 1610, enable the processor 1610 to perform the methods according to embodiments of the present disclosure or any variations thereof.

At least one of the acquisition module 1410, the segmentation module 1420, the feature extraction module 1430, the classification module 1440, and the training module 1510 may be implemented as computer program modules described with reference to fig. 16, which when executed by the processor 1610, may implement the respective operations described above, in accordance with embodiments of the invention.

The present disclosure also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present disclosure.

According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Those skilled in the art will appreciate that the features recited in the various embodiments of the disclosure and/or in the claims may be combined in various combinations and/or combinations, even if such combinations or combinations are not explicitly recited in the disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or the claims may be variously combined and/or combined without departing from the spirit and teachings of the present disclosure. All such combinations and/or combinations fall within the scope of the present disclosure.

While the present disclosure has been shown and described with reference to certain exemplary embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present disclosure as defined by the appended claims and their equivalents. The scope of the disclosure should, therefore, not be limited to the above-described embodiments, but should be determined not only by the following claims, but also by the equivalents of the following claims.

Claims

1. An image information verification method, comprising:

acquiring a target video sequence of a target user when information to be verified is input, wherein the target video sequence comprises a plurality of frame images containing human faces;

Dividing a face image from each frame image in the plurality of frame images to obtain a plurality of face images;

extracting target features of the target video sequence from the plurality of face images according to a local binary pattern algorithm; and

processing the target feature with a trained classification model to verify the authenticity of the information to be verified;

the method for acquiring the target video sequence of the target user when inputting the information to be verified comprises the following steps:

under the condition that the target user handles online business, the online information verification system records the communication process between the target user and customer service personnel into a video clip in real time;

serializing the video segment to obtain the target video sequence;

the method further includes, after the dividing the face image from each of the plurality of frame images, respectively:

performing interpolation operation on a plurality of face images according to a time interpolation algorithm to normalize the number of image sequence frames;

the method further comprises, prior to the segmenting the face image from each of the plurality of frame images, respectively:

determining a key point set according to each frame image; and

Carrying out normalization processing on each frame image according to the key point set;

the online information verification system is constructed according to the key point set, the face images, the time interpolation algorithm parameters, the classification model parameters and the blocking parameters of the local binary pattern algorithm, and is used for displaying the real-time verification of the information to be verified.

2. The method of claim 1, further comprising:

training the classification model, wherein the training the classification model comprises:

acquiring a plurality of sample video sequences and authenticity labels corresponding to the plurality of sample video sequences, wherein each sample video sequence in the plurality of sample video sequences comprises a plurality of sample images;

extracting sample features from sample images of each sample video sequence according to a local binary pattern algorithm;

inputting the classification model according to the sample characteristics to obtain a classification result; and

and adjusting parameters of a local binary pattern algorithm and parameters of the classification model according to the classification result and the authenticity label.

3. The method of claim 1, wherein said normalizing each frame image according to the set of keypoints comprises:

acquiring a template frame image, and determining a plurality of first face key points from the template frame image;

determining a plurality of second face key points from the frame images for each of the plurality of frame images;

determining the weight of the frame image according to the plurality of first face key points and the plurality of second face key points; and

and weighting each pixel value of the frame image by using the weight to obtain the normalized frame image.

4. A method according to claim 3, wherein the separately segmenting the face image from each of the plurality of frame images comprises:

acquiring two pupil coordinates in the template frame image;

determining a region of interest according to the two pupil coordinates; and

and dividing the face image from each frame image according to the region of interest.

5. An image information authentication apparatus comprising:

the acquisition module is used for acquiring a target video sequence of a target user when the target user inputs information to be verified, wherein the target video sequence comprises a plurality of frame images containing human faces;

The segmentation module is used for respectively segmenting the face image from each frame image in the plurality of frame images to obtain a plurality of face images;

the feature extraction module is used for extracting target features of the target video sequence from the plurality of face images according to a local binary pattern algorithm; and

a classification module for processing the target features using a trained classification model to verify the authenticity of the information to be verified;

serializing the video segment to obtain the target video sequence;

after the face image is segmented from each of the plurality of frame images, the method further comprises:

before the face image is segmented from each of the plurality of frame images, the method further comprises:

Determining a key point set according to each frame image; and

6. The apparatus of claim 5, further comprising:

the training module is used for training the classification model, wherein the training module comprises:

a sample obtaining sub-module, configured to obtain a plurality of sample video sequences and an authenticity tag corresponding to the plurality of sample video sequences, where each sample video sequence in the plurality of sample video sequences includes a plurality of sample images;

the extraction submodule is used for extracting sample features from sample images of each sample video sequence according to a local binary pattern algorithm;

the input sub-module is used for inputting the classification model according to the sample characteristics to obtain a classification result; and

And the adjustment sub-module is used for adjusting parameters of a local binary pattern algorithm and parameters of the classification model according to the classification result and the authenticity label.

7. A computing device, comprising:

one or more processors;

a memory for storing one or more computer programs,

wherein the one or more computer programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1 to 4.

8. A computer readable storage medium having stored thereon executable instructions which when executed by a processor cause the processor to implement the method of any one of claims 1 to 4.