WO2020211396A1 - Silent living body picture recognition method, apparatus, computer device, and storage medium - Google Patents

Silent living body picture recognition method, apparatus, computer device, and storage medium

Info

Publication number
WO2020211396A1
WO2020211396A1 (PCT/CN2019/122920)
Authority
WO
WIPO (PCT)
Prior art keywords
picture
verified
data
channel
image data
Prior art date
Application number
PCT/CN2019/122920
Other languages
English (en)
French (fr)
Inventor
王德勋
徐国强
邱寒
Original Assignee
深圳壹账通智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司 filed Critical 深圳壹账通智能科技有限公司
Publication of WO2020211396A1 publication Critical patent/WO2020211396A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40Spoof detection, e.g. liveness detection
    • G06V40/45Detection of the body part being alive

Definitions

  • This application relates to a silent living body picture recognition method, apparatus, computer device, and storage medium.
  • With the development of computer vision, face recognition has also developed greatly.
  • In face recognition, it is necessary to obtain face information through a camera and then recognize that face information to determine the person's identity.
  • However, it is impossible to confirm that the collected face information comes from a living body, so face recognition is not secure.
  • a method, apparatus, computer equipment, and storage medium for silent living picture recognition are provided.
  • a silent type living body picture recognition method includes:
  • obtaining a picture to be verified;
  • constructing multi-channel picture data of the picture to be verified according to the color information and brightness information of the picture to be verified; the color information is the pixel data of the picture to be verified, and the brightness information is the brightness performance information of the picture to be verified;
  • inputting the multi-channel picture data into a preset deep convolutional network to obtain a feature label corresponding to the multi-channel picture data; and
  • when the feature label matches a target label, determining that the picture to be verified is a living picture.
  • a silent type living body picture recognition device includes:
  • the data acquisition module is used to acquire the image to be verified
  • the feature extraction module is used to construct the multi-channel picture data of the picture to be verified according to the color information and brightness information of the picture to be verified; the color information is the pixel data of the picture to be verified, and the brightness information is The brightness performance information of the picture to be verified;
  • a prediction module configured to input the multi-channel picture data into a preset deep convolutional network to obtain a feature label corresponding to the multi-channel picture data
  • the recognition module is configured to determine that the picture to be verified is a live picture when the feature tag matches the target tag.
  • A computer device includes a memory and one or more processors, where the memory stores computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to execute the following steps:
  • obtaining a picture to be verified;
  • constructing multi-channel picture data of the picture to be verified according to the color information and brightness information of the picture to be verified; the color information is the pixel data of the picture to be verified, and the brightness information is the brightness performance information of the picture to be verified;
  • inputting the multi-channel picture data into a preset deep convolutional network to obtain a feature label corresponding to the multi-channel picture data; and
  • when the feature label matches the target label, determining that the picture to be verified is a living picture.
  • One or more non-volatile storage media storing computer-readable instructions.
  • When the computer-readable instructions are executed by one or more processors, the one or more processors perform the following steps:
  • obtaining a picture to be verified;
  • constructing multi-channel picture data of the picture to be verified according to the color information and brightness information of the picture to be verified; the color information is the pixel data of the picture to be verified, and the brightness information is the brightness performance information of the picture to be verified;
  • inputting the multi-channel picture data into a preset deep convolutional network to obtain a feature label corresponding to the multi-channel picture data; and
  • when the feature label matches the target label, determining that the picture to be verified is a living picture.
  • Fig. 1 is an application scenario diagram of a silent living picture recognition method according to one or more embodiments.
  • Fig. 2 is a schematic flowchart of a silent live picture recognition method according to one or more embodiments.
  • Fig. 3 is a schematic flowchart of the steps of constructing multi-channel picture data according to one or more embodiments.
  • FIG. 4 is a schematic flowchart of a silent living picture recognition method in another embodiment.
  • Fig. 5 is a block diagram of a silent live picture recognition device according to one or more embodiments.
  • Figure 6 is a block diagram of a computer device according to one or more embodiments.
  • the silent living picture recognition method provided in this application can be applied to the application environment as shown in FIG. 1.
  • the terminal 102 and the server 104 communicate through the network.
  • the terminal 102 can be, but is not limited to, various camera devices, personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • the server 104 can be implemented by an independent server or a server cluster composed of multiple servers.
  • the terminal 102 When the terminal 102 is a camera device, the terminal 102 is connected to the server 104 through a local area network or the Internet. When the terminal 102 receives a shooting instruction, it takes a picture and sends the picture data obtained by the shooting to the server 104 via the network.
  • When the terminal 102 is a personal computer, there are two ways to obtain picture data: one is through the camera function of the personal computer, in which case the personal computer is equivalent to a camera device; the other is to read picture data stored on the personal computer's storage medium and upload it to the server 104 via the network.
  • After the server 104 obtains the picture data, it defines it as the picture to be verified, extracts the color information and brightness information of the picture data to construct multi-channel data of the picture to be verified, and inputs the multi-channel data into the deep convolutional network set in the server 104, which outputs the feature label corresponding to the multi-channel picture data.
  • the server matches the feature tag with the target tag, and if the two match, it determines that the picture to be verified is a live picture.
  • a silent live picture recognition method is provided. Taking the method applied to the server in FIG. 1 as an example for description, the method includes the following steps:
  • Step 202 Obtain a picture to be verified.
  • The picture to be verified is picture data containing a face that the server has received; containing a face means that at least one face appears in the picture area.
  • the server may receive picture data from the camera device, or picture data from a terminal such as a personal computer.
  • When the face of the target person appears in the camera coverage area, the camera device photographs that area, the resulting picture data containing the face is uploaded to the server, and the server marks the picture data as the picture to be verified.
  • Step 204 Construct multi-channel picture data of the picture to be verified according to the color information and brightness information of the picture to be verified.
  • Color information refers to the pixel data of the picture to be verified.
  • the color distribution of the picture to be verified can be obtained through the color information. It can be expressed in standard RGB format or other color formats;
  • Brightness information refers to the brightness performance information of the picture to be verified; the brightness display effect of the picture can be obtained through it, and it can be represented by the HSV model.
  • Multi-channel picture data refers to picture data carrying multiple kinds of information; in this step it specifically means picture data containing both color information and brightness information.
  • the multi-channel image data can be obtained by fusion, or the multi-channel image data can be obtained by fitting and superposition.
  • Step 206 Input the multi-channel image data into a preset deep convolutional network to obtain a feature label corresponding to the multi-channel image data.
  • the deep convolutional network is obtained by deep learning of the convolutional neural network.
  • the multiple convolutional layers in the deep convolutional network establish the connection between the input image data and the preset label through the deep learning of a large amount of image data. Therefore, when multi-channel data is input, the feature label corresponding to the multi-channel image data can be output. It is worth noting that the feature label is one of the labels of the output layer of the deep convolutional network.
  • a preliminary convolutional neural network can be set in the server in advance, and then the convolutional neural network can be trained through a large amount of data collected to obtain a deep convolutional network that meets the accuracy requirements.
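The convolution step can be illustrated with a toy example. The sketch below is plain NumPy with arbitrary filter values, not the patent's trained network; it shows how a single filter spanning all six input channels collapses them into one feature map:

```python
import numpy as np

def conv2d_multichannel(x, kernel):
    """Valid-mode 2D convolution summing over input channels.
    x: (C, H, W); kernel: (C, kH, kW) -> output (H-kH+1, W-kW+1)."""
    c, h, w = x.shape
    _, kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # multiply-accumulate over all channels at once
            out[i, j] = np.sum(x[:, i:i + kh, j:j + kw] * kernel)
    return out

x = np.random.rand(6, 8, 8)   # 6-channel input (RGB + HSV planes)
k = np.random.rand(6, 3, 3)   # one 3x3 filter spanning all six channels
feat = conv2d_multichannel(x, k)
print(feat.shape)             # (6, 6)
```

A real deep convolutional network stacks many such filters per layer, so each successive layer combines the brightness and color information into progressively higher-level features.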
  • Step 208 When the feature tag matches the target tag, it is determined that the picture to be verified is a living picture.
  • the target label is a label preset in the server, and the target label can be selected from the labels in the output layer of the deep convolutional network according to the matching logic.
  • Living pictures refer to the picture data obtained by shooting real living objects, which are distinguished from secondary pictures obtained by shooting fake faces.
  • In the above silent living picture recognition method, silent picture data is obtained, multi-channel picture data is constructed from its color features and brightness features, and the multi-channel picture data is used as the input of the deep convolutional network.
  • Since the deep convolutional network is obtained by deep learning on a large amount of picture data, it can, for multi-channel picture data, extract low-level features and convert them into high-level features. Because the low-level features are derived from brightness features and color features, the high-level features further deepen the connection between the brightness features and the color features.
  • The fully connected layer outputs the corresponding feature label according to the result of mapping the high-level features to each label.
  • When the output feature label matches the target label, the picture data is determined to be a living picture. It is therefore possible to determine whether picture data is a living picture without acquiring a time-ordered sequence of pictures.
  • the technical solution of the embodiment of the present invention has simpler operations when realizing living picture recognition, thereby improving the efficiency of living picture recognition.
  • The technical solutions of the above embodiments are very convenient to operate in various usage scenarios. For example, when applying for a credit card online, it is necessary to photograph the applicant's face and confirm that the operation is performed by the applicant in person.
  • The terminal used by the applicant obtains the applicant's face picture. After the face picture is uploaded to the server, the server performs picture preprocessing, data fusion, model input, and other operations, and confirms from the model output whether the operation was performed by the applicant in person, which makes the operation convenient.
  • the server may also obtain video data, and then extract the picture to be verified from the video data.
  • The video data is decomposed into multiple video frames, and the multiple video frames are analyzed.
  • The analysis process includes noise analysis of the picture data corresponding to each video frame and an edge algorithm that calculates the size of the face area in that picture data, so that the video frame with the smallest noise and the largest face area is selected as the picture to be verified.
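The frame-selection rule above can be sketched as follows. The noise metric and the combined score here are hypothetical stand-ins, since the patent does not specify the exact noise analysis or edge algorithm:

```python
import numpy as np

def noise_score(gray):
    """Crude noise estimate: mean absolute deviation of each pixel from
    the mean of its 3x3 neighbourhood (a stand-in for the patent's
    unspecified noise analysis)."""
    h, w = gray.shape
    padded = np.pad(gray.astype(float), 1, mode="edge")
    neigh = sum(padded[i:i + h, j:j + w]
                for i in range(3) for j in range(3)) / 9.0
    return float(np.mean(np.abs(gray - neigh)))

def pick_frame(frames, face_areas):
    """Prefer the frame with the smallest noise and the largest face
    area; combining them into one ratio is a hypothetical choice."""
    scores = [area / (noise_score(f) + 1e-6)
              for f, area in zip(frames, face_areas)]
    return int(np.argmax(scores))

rng = np.random.default_rng(0)
noisy = rng.integers(0, 256, (32, 32))
smooth = np.full((32, 32), 128)
idx = pick_frame([noisy, smooth], face_areas=[300, 300])
print(idx)   # 1: with equal face areas, the less noisy frame wins
```

In practice the face area would come from a face or edge detector run on each frame rather than being supplied directly.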
  • video data can be obtained through a single camera, thereby reducing the difficulty of obtaining data from the data source.
  • In FIG. 3, a schematic flowchart for constructing multi-channel picture data is provided; the specific steps are as follows:
  • Step 302 Obtain RGB three-channel picture data of the picture to be verified according to the color information of the picture to be verified.
  • the data of the R (red), G (green) and B (blue) channels represented by the RGB three-channel image data can be obtained by inputting the image to be verified through the RGB three-channel input.
  • Step 304 Obtain HSV three-channel picture data of the picture to be verified according to the brightness information of the picture to be verified.
  • the data of the H (hue) channel, S (saturation) channel and V (lightness) channel identified by the HSV three-channel image data can be obtained by inputting the image to be verified through the HSV three-channel input.
  • Step 306 Obtain multi-channel image data according to the RGB three-channel image data and the HSV three-channel image data.
  • a large amount of information in the picture to be verified is extracted by using a multi-channel input method, thereby increasing the completeness of the description of the picture to be verified, thereby improving the efficiency of model prediction training and model prediction during model training and model prediction. Accuracy.
  • The RGB three channels refer to separating the R value, G value, and B value of the picture to be verified by inputting it into the pixel separation tool. For example, if the RGB values in a segment of a pixel matrix are [(128, 255, 255), (0, 255, 255), (128, 0, 255)], then after RGB three-channel separation, the data of the R channel is [128, 0, 128], the data of the G channel is [255, 255, 0], and the data of the B channel is [255, 255, 255].
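The channel separation in this example amounts to a simple transposition of the pixel tuples, as this minimal Python illustration shows:

```python
# Separating a run of RGB pixels into per-channel lists,
# using the example values from the text above.
pixels = [(128, 255, 255), (0, 255, 255), (128, 0, 255)]
r_channel = [p[0] for p in pixels]
g_channel = [p[1] for p in pixels]
b_channel = [p[2] for p in pixels]
print(r_channel)  # [128, 0, 128]
print(g_channel)  # [255, 255, 0]
print(b_channel)  # [255, 255, 255]
```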
  • The HSV three channels refer to separating the H value, S value, and V value of the picture to be verified by inputting it into the pixel separation tool.
  • For example, if the HSV values of a segment of pixels are [(1, 0.5, 0.5), (2, 0.3, 0.3), (3, 0.2, 0.2)], where the H value is measured in degrees (an H value of 1 must be converted to the corresponding angle), then the H values of the picture to be verified are [1, 2, 3], the S values are [0.5, 0.3, 0.2], and the V values are [0.5, 0.3, 0.2].
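A corresponding sketch for the HSV separation; the `colorsys` call also shows how HSV can be derived from RGB with the Python standard library (note that `colorsys` returns H as a fraction of a full turn, not in degrees):

```python
import colorsys

# Per-channel separation of HSV pixel tuples (H in degrees, as above).
pixels = [(1, 0.5, 0.5), (2, 0.3, 0.3), (3, 0.2, 0.2)]
h_channel, s_channel, v_channel = (list(c) for c in zip(*pixels))
print(h_channel)   # [1, 2, 3]
print(s_channel)   # [0.5, 0.3, 0.2]

# Deriving HSV from an RGB pixel (128, 255, 255) normalised to [0, 1];
# multiply colorsys's H by 360 to express it as an angle.
h, s, v = colorsys.rgb_to_hsv(128 / 255, 255 / 255, 255 / 255)
print(round(h * 360))   # 180
```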
  • the multi-channel image data may superimpose the three-channel RGB value and the three-channel HSV value, and then input the same convolution layer for convolution operation, thereby establishing the relationship between the values of the channels.
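A minimal sketch of this superposition, assuming the input is an 8-bit RGB image and using the standard library's `colorsys` for the HSV planes (a production pipeline would more likely use a vectorised library such as OpenCV):

```python
import colorsys
import numpy as np

def to_six_channel(rgb_image):
    """Superimpose the RGB planes and HSV planes into one (H, W, 6)
    array, ready to feed a single convolution layer.
    rgb_image: uint8 array of shape (H, W, 3)."""
    norm = rgb_image.astype(float) / 255.0
    hsv = np.array([[colorsys.rgb_to_hsv(*px) for px in row]
                    for row in norm])
    return np.concatenate([norm, hsv], axis=-1)

img = np.zeros((4, 4, 3), dtype=np.uint8)
img[..., 0] = 255                 # a pure-red test image
six = to_six_channel(img)
print(six.shape)                  # (4, 4, 6)
print(six[0, 0])                  # [1. 0. 0. 0. 1. 1.]
```

Because the six planes share the same spatial grid, one convolution kernel can relate the values of all channels at each pixel, which is the relationship the text describes.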
  • When the multi-channel picture data is input into the deep convolutional network, the following operations are specifically performed: the multi-channel picture data is input into the preset deep convolutional network, the RGB three-channel picture data and the HSV three-channel picture data are convolved through its convolution layers to obtain the image features corresponding to the multi-channel picture data, and the feature label corresponding to the multi-channel picture data is obtained according to the image features.
  • RGB three-channel picture data and HSV three-channel picture data are both low-level features; through the convolution operation, high-level image features can be obtained.
  • Therefore, through the deep convolutional network, the high-level features of the picture to be verified can be extracted, improving the accuracy of living picture prediction.
  • The process from image features to the output feature label specifically performs the following operations: according to the fully connected layer of the deep convolutional network, the probability that the image features map to each preset label is obtained, and a normalized exponential function outputs one of the preset labels as the feature label corresponding to the multi-channel picture data.
  • the connection relationship between the nodes in the image feature and the fully connected layer nodes is established, and then the normalized exponential function (softmax layer) is used for regression prediction, thereby outputting the feature labels corresponding to the multi-channel image data.
  • the activation function of the fully connected layer can select the Relu function for nonlinear mapping.
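The fully connected output described above can be sketched as follows; the weights here are arbitrary placeholders rather than trained values, and ReLU would be applied on hidden layers upstream of this softmax output:

```python
import numpy as np

def softmax(z):
    """Normalized exponential function over a logit vector."""
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

# Toy fully connected output layer mapping image features to the two
# preset labels {0, 1}.
features = np.array([0.2, 0.7, 0.1])
W = np.array([[0.5, -0.3, 0.8],
              [-0.2, 0.6, 0.1]])
b = np.array([0.0, 0.1])
probs = softmax(W @ features + b)       # probability per preset label
feature_label = int(np.argmax(probs))   # the emitted feature label
print(feature_label)                    # 1
```

The softmax guarantees that the per-label scores form a probability distribution, so taking the argmax yields exactly one of the preset labels.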
  • In FIG. 4, a schematic flowchart of a method for training the deep convolutional network is provided; the specific steps are as follows:
  • Step 402 Construct a secondary picture corresponding to the primary picture according to the preset primary picture.
  • The secondary picture is picture data obtained by photographing the primary picture, and the primary picture refers to a living picture.
  • A large number of primary pictures can be obtained through the Internet or by physical shooting, which provides data support.
  • step 404 a training set and a verification set of the deep convolutional network are established based on the primary picture and the secondary picture.
  • the training set includes a large number of primary pictures and a corresponding number of secondary pictures, and the verification set also includes an appropriate amount of primary pictures and a corresponding number of secondary pictures.
  • the data in the training set is responsible for training the initial convolutional neural network, and the validation set is responsible for verifying the training effect.
  • step 406 the initial convolutional neural network is trained through the training set and the preset loss function, and when the accuracy of the initial convolutional neural network in the verification set reaches the threshold, a deep convolutional neural network is obtained.
  • the default output value of the loss function is set in the server.
  • During training, the parameters of the convolution layers and the parameters of the fully connected layer are adjusted according to the output value of the loss function to train the initial convolutional neural network.
  • the accuracy rate refers to the statistical accuracy rate obtained by inputting the primary image or the secondary image in the verification set into the trained initial convolutional neural network.
  • the training set and the verification set are designed through the primary picture and the secondary picture, so as to achieve the purpose of training the initial convolutional neural network, thereby improving the accuracy of the prediction of the deep convolutional network.
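The train-until-threshold loop can be illustrated with a toy stand-in for the network: a logistic model trained by gradient descent on a cross-entropy loss, stopping once validation accuracy reaches a threshold. All data and hyperparameters here are synthetic:

```python
import numpy as np

rng = np.random.default_rng(42)

def train(x_train, y_train, x_val, y_val, lr=0.5, threshold=0.9, epochs=500):
    """Fit a logistic model, checking validation accuracy each epoch and
    stopping when it reaches the threshold (a stand-in for the patent's
    accuracy criterion on the verification set)."""
    w = np.zeros(x_train.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(x_train @ w)))          # predictions
        grad = x_train.T @ (p - y_train) / len(y_train)   # cross-entropy gradient
        w -= lr * grad                                    # parameter update
        val_pred = (1.0 / (1.0 + np.exp(-(x_val @ w))) > 0.5).astype(int)
        acc = float(np.mean(val_pred == y_val))
        if acc >= threshold:                              # threshold reached
            break
    return w, acc

# Linearly separable toy features for "secondary" (0) vs "living" (1).
x = rng.normal(size=(200, 2)) + np.array([[2.0, 2.0]])
x[:100] -= 4.0
y = np.array([0] * 100 + [1] * 100)
w, acc = train(x[::2], y[::2], x[1::2], y[1::2])   # even/odd split
```

The real training replaces the logistic model with the initial convolutional neural network and the closed-form gradient with backpropagation, but the stop-on-validation-accuracy control flow is the same.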
  • The data source of a primary picture may be video data: video frames are extracted from the video data and preliminarily screened, that is, frames with excessive noise are filtered out, so that one piece of video data yields more than one picture. This greatly expands the amount of data and can further improve the training level of the deep convolutional network.
  • the number of primary pictures is equal to the number of secondary pictures, so that during training, each picture is guaranteed to have a higher accuracy for prediction.
  • The specific steps for establishing the training set and verification set are as follows: perform data enhancement operations on the primary pictures to obtain multiple enhanced primary pictures corresponding to each primary picture; the data enhancement operations include the rotation operation, the zoom operation, and the flip operation; perform data enhancement operations on the secondary pictures to obtain multiple enhanced secondary pictures corresponding to each secondary picture.
  • According to the enhanced primary pictures and enhanced secondary pictures, the training set and validation set of the deep convolutional network are established.
  • a method for expanding training set and validation set samples is proposed, so that the training level of the deep convolutional network can be improved, and the prediction accuracy of the deep convolutional network can be further improved.
  • The rotation operation can copy the original picture and rotate the copy by a certain angle to obtain a new picture.
  • Multiple rotations yield multiple pictures.
  • By the same operation, multiple secondary pictures can be obtained from the secondary pictures as samples.
  • The scaling operation refers to scaling the pixel size; for example, a 1920*1080 primary picture scaled to a 1280*720 primary picture expands one picture into two, and scaling to different degrees yields multiple pictures.
  • Although the scaling operation does not change the display effect, it changes the dimensions of the input data during feature extraction.
  • the number of primary pictures and secondary pictures can also be expanded through the flip operation.
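The three enhancement operations can be sketched in NumPy; the 2x nearest-neighbour downscale is one hypothetical way to implement the zoom operation:

```python
import numpy as np

def augment(img):
    """Data enhancement: rotation, flips, and a zoom (nearest-neighbour
    2x downscale). Returns several enhanced copies of one picture."""
    rotated = [np.rot90(img, k) for k in (1, 2, 3)]   # 90/180/270 degrees
    flipped = [np.fliplr(img), np.flipud(img)]        # horizontal/vertical flip
    h, w = img.shape[:2]
    rows = np.arange(h // 2) * 2                      # keep every other row
    cols = np.arange(w // 2) * 2                      # keep every other column
    zoomed = img[np.ix_(rows, cols)]
    return rotated + flipped + [zoomed]

img = np.arange(16).reshape(4, 4)
copies = augment(img)
print(len(copies))        # 6 enhanced pictures from one original
print(copies[-1].shape)   # (2, 2)
```

Applying the same function to both primary and secondary pictures keeps the two classes balanced while multiplying the sample count.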
  • The preset labels output by the deep convolutional network include 1 and 0, so the feature label may be 1 or 0.
  • When the target label is set to 1 and the feature label is also 1, it is determined that the feature label matches the target label, and the picture to be verified is determined to be a living picture. It is worth noting that when the label output by the deep convolutional network is 1, the input picture to be verified is a living picture.
  • a silent live picture recognition device including: a data acquisition module 502, a feature extraction module 504, a prediction module 506, and a recognition module 508, wherein:
  • the data acquisition module 502 is used to acquire the image to be verified.
  • the feature extraction module 504 is configured to construct multi-channel image data of the image to be verified according to the color information and brightness information of the image to be verified.
  • the prediction module 506 is configured to input the multi-channel image data into a preset deep convolutional network to obtain a feature label corresponding to the multi-channel image data.
  • the recognition module 508 is configured to determine that the picture to be verified is a live picture when the feature label matches the target label.
  • the feature extraction module 504 is further configured to obtain the RGB three-channel picture data of the picture to be verified according to the color information of the picture to be verified; obtain the HSV three-channel picture of the picture to be verified according to the brightness information of the picture to be verified Data; According to RGB three-channel picture data and HSV three-channel picture data, multi-channel picture data is obtained.
  • The feature extraction module 504 is also used to input the multi-channel picture data into a preset deep convolutional network, convolve the RGB three-channel picture data and the HSV three-channel picture data through the convolution layer of the deep convolutional network to obtain the image features corresponding to the multi-channel picture data, and obtain the feature label corresponding to the multi-channel picture data according to the image features.
  • the feature extraction module 504 is further configured to obtain the probability of the image feature mapped to each preset label according to the fully connected layer of the deep convolutional network, and output the preset label through the preset normalized exponential function One of them is used as a feature label corresponding to the multi-channel image data.
  • In one embodiment, the device also includes a model training module, which is used to construct a secondary picture corresponding to the primary picture according to the preset primary picture, where the secondary picture is picture data obtained by photographing the primary picture; establish the training set and verification set of the deep convolutional network according to the primary picture and the secondary picture; and train the initial convolutional neural network through the training set and the preset loss function, obtaining the deep convolutional neural network when the accuracy of the initial convolutional neural network on the verification set reaches the threshold.
  • The model training module is also used to perform data enhancement operations on the primary pictures to obtain multiple enhanced primary pictures corresponding to each primary picture;
  • the data enhancement operations include: rotation operation, zoom operation, and flip operation;
  • Data enhancement operations are performed on the secondary pictures to obtain multiple enhanced secondary pictures corresponding to the secondary pictures; according to the enhanced primary pictures and enhanced secondary pictures, a training set and a validation set of the deep convolutional network are established.
  • the feature tag includes 1 or 0; the target tag is 1; the identification module 508 is further configured to determine that the feature tag matches the target tag when the feature tag is 1, and determine that the picture to be verified is a live picture.
  • Each module in the above silent living body picture recognition device can be implemented in whole or in part by software, hardware, and a combination thereof.
  • the foregoing modules may be embedded in the form of hardware or independent of the processor in the computer device, or may be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the foregoing modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 6.
  • the computer equipment includes a processor, a memory, a network interface and a database connected through a system bus. Among them, the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the database of the computer equipment is used to store picture data.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instructions are executed by the processor to realize a silent live picture recognition method.
  • FIG. 6 is only a block diagram of part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied.
  • The specific computer device may include more or fewer components than shown in the figure, combine some components, or have a different arrangement of components.
  • a computer device includes a memory and one or more processors.
  • the memory stores computer readable instructions.
  • the one or more processors execute the following steps:
  • obtaining a picture to be verified;
  • constructing the multi-channel picture data of the picture to be verified according to the color information and brightness information of the picture to be verified;
  • the color information is the pixel data of the picture to be verified, and the brightness information is the brightness performance information of the picture to be verified;
  • inputting the multi-channel picture data into a preset deep convolutional network to obtain a feature label corresponding to the multi-channel picture data; and
  • when the feature label matches the target label, determining that the picture to be verified is a living picture.
  • the processor further implements the following steps when executing the computer-readable instructions:
  • the RGB three-channel picture data of the picture to be verified is obtained according to the color information of the picture to be verified;
  • the HSV three-channel picture data of the picture to be verified is obtained according to the brightness information of the picture to be verified; and
  • the multi-channel picture data is obtained according to the RGB three-channel picture data and the HSV three-channel picture data.
  • the processor further implements the following steps when executing the computer-readable instructions:
  • the multi-channel picture data is input into the preset deep convolutional network, the RGB three-channel picture data and the HSV three-channel picture data are convolved through the convolution layer of the deep convolutional network to obtain the image features corresponding to the multi-channel picture data, and the feature label corresponding to the multi-channel picture data is obtained according to the image features.
  • the processor further implements the following steps when executing the computer-readable instructions:
  • the probability that the image feature is mapped to each preset label is obtained, and one of the preset labels is output as the feature label corresponding to the multi-channel image data through the preset normalized exponential function.
  • the processor further implements the following steps when executing the computer-readable instructions:
  • the secondary picture is the picture data obtained by shooting the primary picture
  • the initial convolutional neural network is trained through the training set and the preset loss function.
  • when the accuracy of the initial convolutional neural network on the verification set reaches the threshold, the deep convolutional neural network is obtained.
  • the processor further implements the following steps when executing the computer-readable instructions:
  • data enhancement operations include: rotation operation, zoom operation and flip operation;
  • the training set and verification set of the deep convolutional network are established.
  • the processor further implements the following steps when executing the computer-readable instructions:
  • the feature label includes 1 or 0; the target label is 1;
  • the picture to be verified is a live picture, including:
  • the feature tag is 1, it is determined that the feature tag matches the target tag, and the picture to be verified is determined to be a live picture.
  • the processor further implements the following steps when executing the computer-readable instructions:
  • the multi-channel image data is obtained based on the fusion method; or according to the color information and brightness information of the image to be verified, the multi-channel image data is obtained based on the fitting and superposition method.
  • the processor further implements the following steps when executing the computer-readable instructions:
  • the video frame with the smallest noise and the largest face area is determined as the picture to be verified.
  • One or more non-volatile storage media storing computer-readable instructions.
  • the computer-readable instructions When executed by one or more processors, the one or more processors perform the following steps:
  • the color information and brightness information of the image to be verified construct the multi-channel image data of the image to be verified;
  • the color information is the pixel data of the image to be verified, and the brightness information is the brightness performance information of the image to be verified;
  • the picture to be verified is a live picture.
  • the RGB three-channel image data of the image to be verified is obtained
  • the multi-channel picture data is obtained.
  • the feature tags corresponding to the multi-channel image data are obtained.
  • the probability that the image feature is mapped to each preset label is obtained, and one of the preset labels is output as the feature label corresponding to the multi-channel image data through the preset normalized exponential function.
  • the secondary picture is the picture data obtained by shooting the primary picture
  • the initial convolutional neural network is trained.
  • the accuracy of the initial convolutional neural network in the verification set reaches the threshold, the deep convolutional neural network is obtained.
  • data enhancement operations include: rotation operation, zoom operation and flip operation;
  • the training set and verification set of the deep convolutional network are established.
  • the feature label includes 1 or 0; the target label is 1;
  • the picture to be verified is a live picture, including:
  • the feature tag is 1, it is determined that the feature tag matches the target tag, and the picture to be verified is determined to be a live picture.
  • the multi-channel image data is obtained based on the fusion method; or according to the color information and brightness information of the image to be verified, the multi-channel image data is obtained based on the fitting and superposition method.
  • the video frame with the smallest noise and the largest face area is determined as the picture to be verified.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • ROM read only memory
  • PROM programmable ROM
  • EPROM electrically programmable ROM
  • EEPROM electrically erasable programmable ROM
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous chain Channel (Synchlink) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A machine-learning-based silent live picture recognition method, comprising: acquiring a picture to be verified; constructing multi-channel picture data of the picture to be verified according to color information and brightness information of the picture to be verified; inputting the multi-channel picture data into a preset deep convolutional network to obtain a feature label corresponding to the multi-channel picture data; and, when the feature label matches a target label, determining that the picture to be verified is a live picture.

Description

Silent live picture recognition method and apparatus, computer device, and storage medium

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to Chinese Patent Application No. 2019102984826, entitled "Silent live picture recognition method and apparatus, computer device, and storage medium" and filed with the China National Intellectual Property Administration on April 15, 2019, the entire contents of which are incorporated herein by reference.

TECHNICAL FIELD

This application relates to a silent live picture recognition method and apparatus, a computer device, and a storage medium.

BACKGROUND

With the development of computer technology, face recognition has also advanced considerably. Face recognition requires capturing face information through a camera and then recognizing that information to determine a person's identity. In this approach, however, there is no way to confirm that the captured face information comes from a living body, which makes face recognition insecure.

In the conventional art, liveness recognition can be achieved with a binocular camera that captures three-dimensional geometric information. The inventors realized, however, that this approach demands much of the hardware and is difficult to implement. Liveness recognition can instead be implemented in software, but software methods require the user's cooperation to capture multiple frames of face pictures and then use the temporal information contained in those pictures to confirm whether they show a living body. This approach is complicated to operate and cannot work without user cooperation.
SUMMARY

According to various embodiments disclosed in this application, a silent live picture recognition method and apparatus, a computer device, and a storage medium are provided.

A silent live picture recognition method includes:

acquiring a picture to be verified;

constructing multi-channel picture data of the picture to be verified according to color information and brightness information of the picture to be verified, the color information being pixel data of the picture to be verified and the brightness information being brightness performance information of the picture to be verified;

inputting the multi-channel picture data into a preset deep convolutional network to obtain a feature label corresponding to the multi-channel picture data; and

when the feature label matches a target label, determining that the picture to be verified is a live picture.

A silent live picture recognition apparatus includes:

a data acquisition module configured to acquire a picture to be verified;

a feature extraction module configured to construct multi-channel picture data of the picture to be verified according to color information and brightness information of the picture to be verified, the color information being pixel data of the picture to be verified and the brightness information being brightness performance information of the picture to be verified;

a prediction module configured to input the multi-channel picture data into a preset deep convolutional network to obtain a feature label corresponding to the multi-channel picture data; and

a recognition module configured to determine that the picture to be verified is a live picture when the feature label matches a target label.

A computer device includes a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the processors, cause the one or more processors to perform the following steps:

acquiring a picture to be verified;

constructing multi-channel picture data of the picture to be verified according to color information and brightness information of the picture to be verified, the color information being pixel data of the picture to be verified and the brightness information being brightness performance information of the picture to be verified;

inputting the multi-channel picture data into a preset deep convolutional network to obtain a feature label corresponding to the multi-channel picture data; and

when the feature label matches a target label, determining that the picture to be verified is a live picture.

One or more non-volatile storage media store computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:

acquiring a picture to be verified;

constructing multi-channel picture data of the picture to be verified according to color information and brightness information of the picture to be verified, the color information being pixel data of the picture to be verified and the brightness information being brightness performance information of the picture to be verified;

inputting the multi-channel picture data into a preset deep convolutional network to obtain a feature label corresponding to the multi-channel picture data; and

when the feature label matches a target label, determining that the picture to be verified is a live picture.

Details of one or more embodiments of this application are set forth in the accompanying drawings and the description below. Other features and advantages of this application will become apparent from the specification, the drawings, and the claims.
BRIEF DESCRIPTION OF THE DRAWINGS

To describe the technical solutions in the embodiments of this application more clearly, the accompanying drawings needed in the embodiments are briefly introduced below. Apparently, the drawings described below show only some embodiments of this application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.

FIG. 1 is a diagram of an application scenario of a silent live picture recognition method according to one or more embodiments.

FIG. 2 is a schematic flowchart of a silent live picture recognition method according to one or more embodiments.

FIG. 3 is a schematic flowchart of the steps of constructing multi-channel picture data according to one or more embodiments.

FIG. 4 is a schematic flowchart of a silent live picture recognition method in another embodiment.

FIG. 5 is a block diagram of a silent live picture recognition apparatus according to one or more embodiments.

FIG. 6 is a block diagram of a computer device according to one or more embodiments.
DETAILED DESCRIPTION

To make the technical solutions and advantages of this application clearer, this application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain this application and are not intended to limit it.

The silent live picture recognition method provided in this application can be applied in the application environment shown in FIG. 1. A terminal 102 communicates with a server 104 over a network. The terminal 102 may be, but is not limited to, a camera device, a personal computer, a notebook computer, a smartphone, a tablet computer, or a portable wearable device, and the server 104 may be implemented as a standalone server or as a server cluster composed of multiple servers.

When the terminal 102 is a camera device, it is connected to the server 104 through a local area network or the Internet. On receiving a shooting instruction, the terminal 102 takes a picture and sends the captured picture data to the server 104 over the network.

When the terminal 102 is a personal computer, the server 104 can obtain picture data in two ways. The first is through the camera function of the personal computer, in which case the personal computer is equivalent to a camera device. The second is from picture data stored on the personal computer's storage medium, which is read from the storage medium and uploaded to the server 104 over the network.

Having obtained picture data in the above manner, the server 104 defines it as the picture to be verified, extracts its color information and brightness information, and constructs the multi-channel data of the picture to be verified. The multi-channel data is then fed into the deep convolutional network configured on the server 104, which outputs the feature label corresponding to the multi-channel picture data. The server then matches the feature label against the target label; if the two match, the picture to be verified is determined to be a live picture.
In one embodiment, as shown in FIG. 2, a silent live picture recognition method is provided. Taking the method as applied to the server in FIG. 1 as an example, it includes the following steps:

Step 202: acquire a picture to be verified.

The picture to be verified is picture data containing a face received by the server, where containing a face means that at least one face is present in the picture area.

Specifically, the server may receive picture data from a camera device or from a terminal such as a personal computer. In a specific liveness verification scenario, the face of a target person appears within the camera's coverage area, the camera device photographs that area, and the picture data containing the face is uploaded to the server, which marks it as the picture to be verified.

Step 204: construct multi-channel picture data of the picture to be verified according to its color information and brightness information.

The color information is the pixel data of the picture to be verified; it describes the color distribution of the picture and may be expressed in the standard RGB format or in another color format. The brightness information is the brightness performance information of the picture to be verified; it describes how bright the picture appears and may be represented with an HSV model. Multi-channel picture data is picture data carrying several kinds of information; in this step it specifically means picture data that contains both color information and brightness information.

Specifically, after the color information and brightness information of the picture to be verified are extracted, the multi-channel picture data may be obtained by fusion, or by fitting and superposition.
Step 206: input the multi-channel picture data into a preset deep convolutional network to obtain the feature label corresponding to the multi-channel picture data.

The deep convolutional network is obtained by deep learning on a convolutional neural network and contains multiple convolutional layers. Through deep learning on a large amount of picture data, it establishes the relationship between input picture data and preset labels, so that when multi-channel data is input it can output the feature label corresponding to the multi-channel picture data. It is worth noting that the feature label is one of the labels of the output layer of the deep convolutional network.

An initial convolutional neural network may be preconfigured on the server and trained on a large amount of collected data to obtain a deep convolutional network that meets the accuracy requirement.

Step 208: when the feature label matches the target label, determine that the picture to be verified is a live picture.

The target label is a label preset on the server; it may be chosen, according to the matching logic, from the labels of the output layer of the deep convolutional network. A live picture is picture data obtained by photographing a real living subject, as distinguished from a secondary picture obtained by photographing a fake face.

In the above silent live picture recognition method, silent picture data is acquired and fed in over multiple channels built from its color features and brightness features to construct multi-channel picture data, which serves as the input of the deep convolutional network. Because the deep convolutional network is obtained by deep learning on a large amount of picture data, it can extract low-level features from the multi-channel picture data and transform them into high-level features. Since the low-level features are derived from the brightness and color features, the high-level features further deepen the relationship between brightness and color. During liveness detection, the fully connected layer outputs the corresponding feature label according to the result of mapping the high-level features to each label; when the output feature label matches the target label, the picture data is determined to be a live picture. It is therefore possible to determine whether picture data is a live picture without acquiring a time-ordered sequence of pictures, so the technical solution of the embodiments of the present invention is simpler to operate when recognizing live pictures, which improves the efficiency of live picture recognition.

The technical solution of the above embodiment is convenient to operate in every usage scenario. For example, when a credit card is applied for online, a face picture of the applicant must be recorded and the operation must be confirmed as the applicant's own. The applicant's face picture can be captured through the terminal the applicant is using; after the face picture is uploaded to the server, the server processes the picture data, fuses the data, and feeds it into the model, then confirms from the model output whether the operation was performed by the applicant in person, which is convenient to operate.
In one embodiment, for step 202, the server may also obtain video data and then extract the picture to be verified from the video data.

Specifically, the video data is decomposed into multiple video frames and the frames are analyzed. The analysis includes performing noise analysis on the picture data corresponding to each video frame and computing, with an edge algorithm, the area of the face in the picture data corresponding to each video frame, and then selecting the frame with the least noise and the largest face area as the picture to be verified.

In addition, the video data can be captured with a single camera, reducing the difficulty of acquiring the source data.
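The frame-selection step above can be sketched as follows. The description fixes neither the noise metric nor the edge algorithm, so each candidate frame is assumed to arrive already scored as a (frame_id, noise_level, face_area) tuple, and a single ratio score is used to combine "least noise" with "largest face area"; both the scoring inputs and the combination rule are illustrative assumptions, not the patent's prescribed implementation.

```python
def select_frame(frames, eps=1e-6):
    """Return the frame_id that maximizes face area while minimizing noise.

    frames: iterable of (frame_id, noise_level, face_area) tuples.
    A large-face, low-noise frame gets a high score; other combinations
    (e.g. lexicographic ordering) would also satisfy the description.
    """
    best_id, best_score = None, float("-inf")
    for frame_id, noise, face_area in frames:
        score = face_area / (noise + eps)  # low noise and big face -> high score
        if score > best_score:
            best_id, best_score = frame_id, score
    return best_id

# Hypothetical per-frame scores for a three-frame clip.
candidates = [
    ("frame_0", 0.40, 9000.0),   # noisy
    ("frame_1", 0.05, 12000.0),  # clean, largest face
    ("frame_2", 0.05, 8000.0),   # clean, smaller face
]
print(select_frame(candidates))  # -> frame_1
```

The selected frame would then be passed on as the picture to be verified.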
In one embodiment, as shown in FIG. 3, a schematic flow for constructing multi-channel picture data is provided, with the following specific steps:

Step 302: obtain the RGB three-channel picture data of the picture to be verified according to its color information.

The RGB three-channel picture data is the data of the R (red), G (green), and B (blue) channels; it is obtained by feeding the picture to be verified in through the three RGB channels.

Step 304: obtain the HSV three-channel picture data of the picture to be verified according to its brightness information.

The HSV three-channel picture data is the data of the H (hue), S (saturation), and V (value) channels; it is obtained by feeding the picture to be verified in through the three HSV channels.

Step 306: obtain the multi-channel picture data from the RGB three-channel picture data and the HSV three-channel picture data.

In this embodiment, multi-channel input is used to extract a large amount of information from the picture to be verified, which makes the description of the picture more complete and thereby improves the efficiency of model training and the accuracy of model prediction.
For step 302, in one embodiment, the three RGB channels are obtained by feeding the picture to be verified into a pixel separation tool that separates out its R, G, and B values. For example, if the RGB values of a run of pixels are [(128, 255, 255), (0, 255, 255), (128, 0, 255)], then after three-channel RGB input the R-channel data is [128, 0, 128], the G-channel data is [255, 255, 0], and the B-channel data is [255, 255, 255].

For step 304, in one embodiment, the three HSV channels are obtained by feeding the picture to be verified into a pixel separation tool that separates out its H, S, and V values. For example, if the HSV values of a run of pixels are [(1, 0.5, 0.5), (2, 0.3, 0.3), (3, 0.2, 0.2)], where the H value is in degrees (that is, an H value of 1 is interpreted as the corresponding angle), then after three-channel HSV input the H values of the picture are [1, 2, 3], the S values are [0.5, 0.3, 0.2], and the V values are [0.5, 0.3, 0.2].

For step 306, in one embodiment, the multi-channel picture data can be formed by stacking the three RGB channel values and the three HSV channel values and feeding them into the same convolutional layer for convolution, thereby relating the values of the individual channels to one another.
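Steps 302 to 306 can be sketched with the standard library alone: split each pixel into its R, G, B values, derive H, S, V from the same pixel, and stack the six values into one multi-channel representation. The per-pixel lists stand in for real image arrays, and a practical implementation would use an image library and numpy instead; note also that `colorsys` returns H as a fraction of a full turn in [0, 1), not in degrees as in the example above.

```python
import colorsys

def to_six_channels(rgb_pixels):
    """rgb_pixels: list of (r, g, b) tuples with 0-255 components.

    Returns six parallel channel lists: R, G, B, H, S, V.
    colorsys expects components scaled to [0, 1].
    """
    channels = [[] for _ in range(6)]
    for r, g, b in rgb_pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        for ch, value in zip(channels, (r, g, b, h, s, v)):
            ch.append(value)
    return channels

# The pixel run from the RGB example above.
pixels = [(128, 255, 255), (0, 255, 255), (128, 0, 255)]
r_ch, g_ch, b_ch, h_ch, s_ch, v_ch = to_six_channels(pixels)
print(r_ch)  # -> [128, 0, 128]
print(v_ch)  # -> [1.0, 1.0, 1.0]
```

The six stacked channel lists correspond to the multi-channel picture data fed into the first convolutional layer.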
In one embodiment, inputting the multi-channel picture data into the deep convolutional network specifically involves the following operations: the multi-channel picture data is input into the preset deep convolutional network, the convolutional layers of the deep convolutional network perform convolution over the RGB three-channel picture data and the HSV three-channel picture data to obtain the picture features corresponding to the multi-channel picture data, and the feature label corresponding to the multi-channel picture data is obtained from the picture features.

Specifically, the RGB three-channel picture data and the HSV three-channel picture data are both low-level features; convolution through multiple convolutional layers yields high-level picture features. The deep convolutional network can therefore extract the high-level features of the picture to be verified, improving the accuracy of live picture prediction.

In some embodiments, the process from picture features to the output feature label specifically involves the following operations: the fully connected layer of the deep convolutional network yields the probability of the picture features mapping to each preset label, and a normalized exponential function then outputs one of the preset labels as the feature label corresponding to the multi-channel picture data.

Specifically, the fully connected layer connects the nodes of the picture features to its own nodes, and a normalized exponential function (softmax layer) performs regression prediction to output the feature label corresponding to the multi-channel picture data. The ReLU function can be chosen as the activation function of the fully connected layer for nonlinear mapping.
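The output stage described above can be sketched as follows: per-label logits from the fully connected layer are mapped to probabilities with the normalized exponential function (softmax), and the highest-probability preset label is emitted. The two-label setup (0 = secondary picture, 1 = live picture) mirrors the description; the logit values are invented for illustration.

```python
import math

def softmax(logits):
    """Normalized exponential function over a list of logits."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_label(logits, labels=(0, 1)):
    """Emit the preset label with the highest softmax probability."""
    probs = softmax(logits)
    return labels[probs.index(max(probs))]

# Hypothetical fully-connected-layer outputs for one input picture.
probs = softmax([0.3, 1.7])
print(round(sum(probs), 6))       # -> 1.0 (probabilities sum to one)
print(predict_label([0.3, 1.7]))  # -> 1 (the "live picture" label)
```

Subtracting the maximum logit before exponentiating does not change the result but avoids overflow for large logits.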
In one embodiment, as shown in FIG. 4, a schematic flow of a way of training the deep convolutional network is provided, with the following specific steps:

Step 402: construct, from preset primary pictures, secondary pictures corresponding to the primary pictures.

A secondary picture is picture data obtained by photographing a primary picture, and a primary picture is a live picture. A large number of primary pictures can be collected from the Internet or by physical shooting to provide data support.

Step 404: establish the training set and validation set of the deep convolutional network from the primary pictures and secondary pictures.

The training set contains a large number of primary pictures and a corresponding number of secondary pictures, and the validation set likewise contains an appropriate number of primary pictures and a corresponding number of secondary pictures. The data in the training set is used to train the initial convolutional neural network, while the validation set is used to verify the training effect.

Step 406: train the initial convolutional neural network with the training set and a preset loss function, and obtain the deep convolutional neural network when the accuracy of the initial convolutional neural network on the validation set reaches a threshold.

A preset output value of the loss function is configured on the server. When the output of the loss function has not reached the preset output value, the parameters of the convolutional layers and of the fully connected layer are adjusted according to the value output by the loss function, thereby training the initial convolutional neural network. The accuracy is the statistical accuracy obtained by feeding the primary and secondary pictures of the validation set into the trained initial convolutional neural network.

In this embodiment, the training set and validation set are designed from primary and secondary pictures so as to train the initial convolutional neural network, which can improve the prediction accuracy of the deep convolutional network.
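The loss-driven training loop of step 406 can be sketched in miniature. A one-weight logistic model stands in for the convolutional network and logistic loss for the preset loss function; the 1-D samples (label 1 = primary/live, 0 = secondary) are invented for illustration. What the sketch preserves from the description is the control flow: update parameters from the loss gradient, then stop once validation-set accuracy reaches the threshold.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def accuracy(w, b, data):
    """Fraction of samples whose 0.5-thresholded prediction matches the label."""
    return sum((sigmoid(w * x + b) >= 0.5) == (y == 1) for x, y in data) / len(data)

def train_until_threshold(train, val, threshold=0.9, lr=0.1, max_epochs=200):
    w, b = 0.0, 0.0
    for _ in range(max_epochs):
        for x, y in train:  # gradient of logistic loss w.r.t. the logit: (p - y)
            p = sigmoid(w * x + b)
            w -= lr * (p - y) * x
            b -= lr * (p - y)
        if accuracy(w, b, val) >= threshold:  # stop on validation accuracy
            break
    return w, b, accuracy(w, b, val)

# Toy samples: negative x -> secondary (0), positive x -> primary (1).
train = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
val = [(-1.5, 0), (1.5, 1)]
w, b, acc = train_until_threshold(train, val)
print(acc >= 0.9)  # -> True
```

In the real setting the parameter update would be backpropagation through the convolutional and fully connected layers rather than this two-parameter update.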
For step 402, in some embodiments, the data source of the primary pictures may be video data. Video frames are extracted from the video data and screened preliminarily, that is, frames with excessive noise are filtered out, so that multiple primary pictures can be obtained from a single piece of video data. This greatly expands the amount of data and can further raise the training level of the deep convolutional network.

It is worth noting that, within a given training set or validation set, the number of primary pictures equals the number of secondary pictures, so that during training every picture is predicted with high accuracy.

For step 404, in one embodiment, the specific steps of establishing the training set and validation set are as follows: perform data enhancement operations on the primary pictures to obtain multiple enhanced primary pictures corresponding to each primary picture, the data enhancement operations including rotation, scaling, and flipping; perform the data enhancement operations on the secondary pictures to obtain multiple enhanced secondary pictures corresponding to each secondary picture; and establish the training set and validation set of the deep convolutional network from the enhanced primary pictures and enhanced secondary pictures.

In this embodiment, a method of expanding the training-set and validation-set samples is proposed, which can raise the training level of the deep convolutional network and further improve its prediction accuracy.

Further, the rotation operation may take the original primary picture as the source, copy it, and rotate the copy by a certain angle to obtain a new primary picture; multiple rotations yield multiple primary pictures, and in the same way one secondary picture yields multiple secondary pictures as samples.

Further, the scaling operation scales the pixel dimensions; for example, a 1920x1080 primary picture scaled to 1280x720 expands one picture into two, and scaling to different degrees yields multiple pictures. Although scaling does not change what the picture shows, it changes the dimensions of the input data during feature extraction. Likewise, the flipping operation can also be used to expand the number of primary and secondary pictures.
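The three enhancement operations named above can be sketched on a tiny grayscale "picture" held as a list of rows. A real pipeline would use an image library on full-size pictures; this only illustrates how one source picture fans out into several training samples, and the 90-degree rotation and factor-of-2 nearest-neighbor scale are illustrative choices of the "certain angle" and "degree of scaling".

```python
def rotate_90(img):
    """Rotate clockwise by 90 degrees: reverse the rows, then transpose."""
    return [list(row) for row in zip(*img[::-1])]

def flip_horizontal(img):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in img]

def scale_half(img):
    """Nearest-neighbor downscale by 2: keep every other row and column."""
    return [row[::2] for row in img[::2]]

def augment(img):
    # one original picture becomes four samples
    return [img, rotate_90(img), flip_horizontal(img), scale_half(img)]

picture = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
samples = augment(picture)
print(len(samples))  # -> 4
print(samples[3])    # -> [[1, 3], [9, 11]]
```

Applying the same fan-out to primary and secondary pictures alike keeps their counts equal, as the description requires.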
In some embodiments, the preset labels output by the deep convolutional network include 1 and 0, so the feature label may be 1 or 0. With the target label set to 1, when the feature label is also 1, it is determined that the feature label matches the target label and that the picture to be verified is a live picture. It is worth noting that when the label output by the deep convolutional network is 1, the input picture to be verified is a live picture.

It should be understood that, although the steps in the flowcharts of FIGS. 2-4 are displayed in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in FIGS. 2-4 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; their execution order is not necessarily sequential either, and they may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 5, a silent live picture recognition apparatus is provided, including a data acquisition module 502, a feature extraction module 504, a prediction module 506, and a recognition module 508, where:

the data acquisition module 502 is configured to acquire a picture to be verified;

the feature extraction module 504 is configured to construct multi-channel picture data of the picture to be verified according to its color information and brightness information;

the prediction module 506 is configured to input the multi-channel picture data into a preset deep convolutional network to obtain the feature label corresponding to the multi-channel picture data; and

the recognition module 508 is configured to determine that the picture to be verified is a live picture when the feature label matches the target label.

In one embodiment, the feature extraction module 504 is further configured to obtain the RGB three-channel picture data of the picture to be verified according to its color information; obtain the HSV three-channel picture data of the picture to be verified according to its brightness information; and obtain the multi-channel picture data from the RGB three-channel picture data and the HSV three-channel picture data.

In one embodiment, the feature extraction module 504 is further configured to input the multi-channel picture data into a preset deep convolutional network, perform computation on the RGB three-channel picture data and the HSV three-channel picture data through the convolutional layers of the deep convolutional network to obtain the picture features corresponding to the multi-channel picture data, and obtain the feature label corresponding to the multi-channel picture data from the picture features.

In one embodiment, the feature extraction module 504 is further configured to obtain, from the fully connected layer of the deep convolutional network, the probability of the picture features mapping to each preset label, and to output, through a preset normalized exponential function, one of the preset labels as the feature label corresponding to the multi-channel picture data.

In one embodiment, the apparatus further includes a model training module configured to construct, from preset primary pictures, secondary pictures corresponding to the primary pictures, a secondary picture being picture data obtained by photographing a primary picture; establish the training set and validation set of the deep convolutional network from the primary pictures and secondary pictures; and train the initial convolutional neural network with the training set and a preset loss function, the deep convolutional neural network being obtained when the accuracy of the initial convolutional neural network on the validation set reaches a threshold.

In one embodiment, the model training module is further configured to perform data enhancement operations on the primary pictures to obtain multiple enhanced primary pictures corresponding to each primary picture, the data enhancement operations including rotation, scaling, and flipping; perform the data enhancement operations on the secondary pictures to obtain multiple enhanced secondary pictures corresponding to each secondary picture; and establish the training set and validation set of the deep convolutional network from the enhanced primary pictures and enhanced secondary pictures.

In one embodiment, the feature label includes 1 or 0 and the target label is 1; the recognition module 508 is further configured to determine, when the feature label is 1, that the feature label matches the target label and that the picture to be verified is a live picture.

For the specific limitations of the silent live picture recognition apparatus, reference may be made to the limitations of the silent live picture recognition method above, which are not repeated here. Each module of the above silent live picture recognition apparatus may be implemented wholly or partly in software, hardware, or a combination thereof. The modules may be embedded in or independent of a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server whose internal structure may be as shown in FIG. 6. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capability. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for the operation of the operating system and the computer-readable instructions in the non-volatile storage medium. The database of the computer device is used to store picture data. The network interface of the computer device is used to communicate with an external terminal over a network connection. The computer-readable instructions, when executed by the processor, implement a silent live picture recognition method.

Those skilled in the art can understand that the structure shown in FIG. 6 is only a block diagram of part of the structure related to the solution of this application and does not constitute a limitation on the computer device to which the solution of this application is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
A computer device includes a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the processor, cause the one or more processors to perform the following steps:

acquiring a picture to be verified;

constructing multi-channel picture data of the picture to be verified according to its color information and brightness information, the color information being the pixel data of the picture to be verified and the brightness information being the brightness performance information of the picture to be verified;

inputting the multi-channel picture data into a preset deep convolutional network to obtain the feature label corresponding to the multi-channel picture data; and

when the feature label matches the target label, determining that the picture to be verified is a live picture.
In one embodiment, the processor further implements the following steps when executing the computer-readable instructions:

obtaining the RGB three-channel picture data of the picture to be verified according to its color information;

obtaining the HSV three-channel picture data of the picture to be verified according to its brightness information; and

obtaining the multi-channel picture data from the RGB three-channel picture data and the HSV three-channel picture data.

In one embodiment, the processor further implements the following steps when executing the computer-readable instructions:

inputting the multi-channel picture data into a preset deep convolutional network and performing computation on the RGB three-channel picture data and the HSV three-channel picture data through the convolutional layers of the deep convolutional network to obtain the picture features corresponding to the multi-channel picture data; and

obtaining the feature label corresponding to the multi-channel picture data from the picture features.

In one embodiment, the processor further implements the following steps when executing the computer-readable instructions:

obtaining, from the fully connected layer of the deep convolutional network, the probability of the picture features mapping to each preset label, and outputting, through a preset normalized exponential function, one of the preset labels as the feature label corresponding to the multi-channel picture data.

In one embodiment, the processor further implements the following steps when executing the computer-readable instructions:

constructing, from preset primary pictures, secondary pictures corresponding to the primary pictures, a secondary picture being picture data obtained by photographing a primary picture;

establishing the training set and validation set of the deep convolutional network from the primary pictures and secondary pictures; and

training the initial convolutional neural network with the training set and a preset loss function, and obtaining the deep convolutional neural network when the accuracy of the initial convolutional neural network on the validation set reaches a threshold.

In one embodiment, the processor further implements the following steps when executing the computer-readable instructions:

performing data enhancement operations on the primary pictures to obtain multiple enhanced primary pictures corresponding to the primary pictures, the data enhancement operations including rotation, scaling, and flipping;

performing the data enhancement operations on the secondary pictures to obtain multiple enhanced secondary pictures corresponding to the secondary pictures; and

establishing the training set and validation set of the deep convolutional network from the enhanced primary pictures and enhanced secondary pictures.

In one embodiment, the processor further implements the following steps when executing the computer-readable instructions:

the feature label includes 1 or 0, and the target label is 1;

determining that the picture to be verified is a live picture when the feature label matches the target label includes:

when the feature label is 1, determining that the feature label matches the target label and determining that the picture to be verified is a live picture.

In one embodiment, the processor further implements the following steps when executing the computer-readable instructions:

obtaining the multi-channel picture data by fusion according to the color information and brightness information of the picture to be verified; or obtaining the multi-channel picture data by fitting and superposition according to the color information and brightness information of the picture to be verified.

In one embodiment, the processor further implements the following steps when executing the computer-readable instructions:

acquiring video data to be verified;

decomposing the video data to be verified into multiple video frames and performing noise analysis on the picture data corresponding to the video frames;

computing, based on an edge algorithm, the area of the face in the picture data corresponding to each video frame; and

determining the video frame with the least noise and the largest face area as the picture to be verified.
One or more non-volatile storage media store computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:

acquiring a picture to be verified;

constructing multi-channel picture data of the picture to be verified according to its color information and brightness information, the color information being the pixel data of the picture to be verified and the brightness information being the brightness performance information of the picture to be verified;

inputting the multi-channel picture data into a preset deep convolutional network to obtain the feature label corresponding to the multi-channel picture data; and

when the feature label matches the target label, determining that the picture to be verified is a live picture.

In one embodiment, the computer-readable instructions, when executed by the processor, further implement the following steps:

obtaining the RGB three-channel picture data of the picture to be verified according to its color information;

obtaining the HSV three-channel picture data of the picture to be verified according to its brightness information; and

obtaining the multi-channel picture data from the RGB three-channel picture data and the HSV three-channel picture data.

In one embodiment, the computer-readable instructions, when executed by the processor, further implement the following steps:

inputting the multi-channel picture data into a preset deep convolutional network and performing computation on the RGB three-channel picture data and the HSV three-channel picture data through the convolutional layers of the deep convolutional network to obtain the picture features corresponding to the multi-channel picture data; and

obtaining the feature label corresponding to the multi-channel picture data from the picture features.

In one embodiment, the computer-readable instructions, when executed by the processor, further implement the following steps:

obtaining, from the fully connected layer of the deep convolutional network, the probability of the picture features mapping to each preset label, and outputting, through a preset normalized exponential function, one of the preset labels as the feature label corresponding to the multi-channel picture data.

In one embodiment, the computer-readable instructions, when executed by the processor, further implement the following steps:

constructing, from preset primary pictures, secondary pictures corresponding to the primary pictures, a secondary picture being picture data obtained by photographing a primary picture;

establishing the training set and validation set of the deep convolutional network from the primary pictures and secondary pictures; and

training the initial convolutional neural network with the training set and a preset loss function, and obtaining the deep convolutional neural network when the accuracy of the initial convolutional neural network on the validation set reaches a threshold.

In one embodiment, the computer-readable instructions, when executed by the processor, further implement the following steps:

performing data enhancement operations on the primary pictures to obtain multiple enhanced primary pictures corresponding to the primary pictures, the data enhancement operations including rotation, scaling, and flipping;

performing the data enhancement operations on the secondary pictures to obtain multiple enhanced secondary pictures corresponding to the secondary pictures; and

establishing the training set and validation set of the deep convolutional network from the enhanced primary pictures and enhanced secondary pictures.

In one embodiment, the computer-readable instructions, when executed by the processor, further implement the following steps:

the feature label includes 1 or 0, and the target label is 1;

determining that the picture to be verified is a live picture when the feature label matches the target label includes:

when the feature label is 1, determining that the feature label matches the target label and determining that the picture to be verified is a live picture.

In one embodiment, the computer-readable instructions, when executed by the processor, further implement the following steps:

obtaining the multi-channel picture data by fusion according to the color information and brightness information of the picture to be verified; or obtaining the multi-channel picture data by fitting and superposition according to the color information and brightness information of the picture to be verified.

In one embodiment, the computer-readable instructions, when executed by the processor, further implement the following steps:

acquiring video data to be verified;

decomposing the video data to be verified into multiple video frames and performing noise analysis on the picture data corresponding to the video frames;

computing, based on an edge algorithm, the area of the face in the picture data corresponding to each video frame; and

determining the video frame with the least noise and the largest face area as the picture to be verified.
Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by computer-readable instructions instructing the relevant hardware. The computer-readable instructions can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

The technical features of the above embodiments can be combined arbitrarily. To keep the description concise, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features contains no contradiction, it should be considered within the scope of this specification.

The above embodiments express only several implementations of this application, and their descriptions are relatively specific and detailed, but they should not therefore be understood as limiting the scope of the invention patent. It should be pointed out that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of this application, all of which fall within the protection scope of this application. Therefore, the protection scope of this application patent shall be subject to the appended claims.

Claims (20)

  1. A silent live picture recognition method, comprising:
    acquiring a picture to be verified;
    constructing multi-channel picture data of the picture to be verified according to color information and brightness information of the picture to be verified, the color information being pixel data of the picture to be verified and the brightness information being brightness performance information of the picture to be verified;
    inputting the multi-channel picture data into a preset deep convolutional network to obtain a feature label corresponding to the multi-channel picture data; and
    when the feature label matches a target label, determining that the picture to be verified is a live picture.
  2. The method according to claim 1, wherein the constructing multi-channel picture data of the picture to be verified according to color information and brightness information of the picture to be verified comprises:
    obtaining RGB three-channel picture data of the picture to be verified according to the color information of the picture to be verified;
    obtaining HSV three-channel picture data of the picture to be verified according to the brightness information of the picture to be verified; and
    obtaining the multi-channel picture data from the RGB three-channel picture data and the HSV three-channel picture data.
  3. The method according to claim 2, wherein the inputting the multi-channel picture data into a preset deep convolutional network to obtain a feature label corresponding to the multi-channel picture data comprises:
    inputting the multi-channel picture data into the preset deep convolutional network and performing computation on the RGB three-channel picture data and the HSV three-channel picture data through convolutional layers of the deep convolutional network to obtain picture features corresponding to the multi-channel picture data; and
    obtaining the feature label corresponding to the multi-channel picture data from the picture features.
  4. The method according to claim 3, wherein the obtaining the feature label corresponding to the multi-channel picture data from the picture features comprises:
    obtaining, from a fully connected layer of the deep convolutional network, a probability of the picture features mapping to each preset label, and outputting, through a preset normalized exponential function, one of the preset labels as the feature label corresponding to the multi-channel picture data.
  5. The method according to claim 1, wherein a way of training the deep convolutional network comprises:
    constructing, from preset primary pictures, secondary pictures corresponding to the primary pictures, a secondary picture being picture data obtained by photographing a primary picture;
    establishing a training set and a validation set of the deep convolutional network from the primary pictures and the secondary pictures; and
    training an initial convolutional neural network with the training set and a preset loss function, and obtaining the deep convolutional neural network when an accuracy of the initial convolutional neural network on the validation set reaches a threshold.
  6. The method according to claim 5, wherein the establishing a training set and a validation set of the deep convolutional network from the primary pictures and the secondary pictures comprises:
    performing data enhancement operations on the primary pictures to obtain multiple enhanced primary pictures corresponding to the primary pictures, the data enhancement operations comprising a rotation operation, a scaling operation, and a flipping operation;
    performing the data enhancement operations on the secondary pictures to obtain multiple enhanced secondary pictures corresponding to the secondary pictures; and
    establishing the training set and the validation set of the deep convolutional network from the enhanced primary pictures and the enhanced secondary pictures.
  7. The method according to any one of claims 1 to 6, wherein the feature label comprises 1 or 0 and the target label is 1; and
    the determining that the picture to be verified is a live picture when the feature label matches the target label comprises:
    when the feature label is 1, determining that the feature label matches the target label and determining that the picture to be verified is a live picture.
  8. The method according to claim 1, wherein the constructing multi-channel picture data of the picture to be verified according to color information and brightness information of the picture to be verified comprises:
    obtaining the multi-channel picture data by fusion according to the color information and brightness information of the picture to be verified; or obtaining the multi-channel picture data by fitting and superposition according to the color information and brightness information of the picture to be verified.
  9. The method according to claim 1, wherein the acquiring a picture to be verified comprises:
    acquiring video data to be verified;
    decomposing the video data to be verified into multiple video frames and performing noise analysis on picture data corresponding to the video frames;
    computing, based on an edge algorithm, an area of a face in the picture data corresponding to each video frame; and
    determining the video frame with the least noise and the largest face area as the picture to be verified.
  10. A silent live picture recognition apparatus, comprising:
    a data acquisition module configured to acquire a picture to be verified;
    a feature extraction module configured to construct multi-channel picture data of the picture to be verified according to color information and brightness information of the picture to be verified, the color information being pixel data of the picture to be verified and the brightness information being brightness performance information of the picture to be verified;
    a prediction module configured to input the multi-channel picture data into a preset deep convolutional network to obtain a feature label corresponding to the multi-channel picture data; and
    a recognition module configured to determine that the picture to be verified is a live picture when the feature label matches a target label.
  11. The apparatus according to claim 10, wherein the feature extraction module is further configured to:
    obtain RGB three-channel picture data of the picture to be verified according to the color information of the picture to be verified;
    obtain HSV three-channel picture data of the picture to be verified according to the brightness information of the picture to be verified; and
    obtain the multi-channel picture data from the RGB three-channel picture data and the HSV three-channel picture data.
  12. The apparatus according to claim 11, wherein the prediction module is further configured to:
    input the multi-channel picture data into the preset deep convolutional network and perform computation on the RGB three-channel picture data and the HSV three-channel picture data through convolutional layers of the deep convolutional network to obtain picture features corresponding to the multi-channel picture data; and
    obtain the feature label corresponding to the multi-channel picture data from the picture features.
  13. The apparatus according to claim 12, wherein the prediction module is further configured to:
    obtain, from a fully connected layer of the deep convolutional network, a probability of the picture features mapping to each preset label, and output, through a preset normalized exponential function, one of the preset labels as the feature label corresponding to the multi-channel picture data.
  14. The apparatus according to claim 10, further comprising a model training module configured to:
    construct, from preset primary pictures, secondary pictures corresponding to the primary pictures, a secondary picture being picture data obtained by photographing a primary picture;
    establish a training set and a validation set of the deep convolutional network from the primary pictures and the secondary pictures; and
    train an initial convolutional neural network with the training set and a preset loss function, and obtain the deep convolutional neural network when an accuracy of the initial convolutional neural network on the validation set reaches a threshold.
  15. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    acquiring a picture to be verified;
    constructing multi-channel picture data of the picture to be verified according to color information and brightness information of the picture to be verified, the color information being pixel data of the picture to be verified and the brightness information being brightness performance information of the picture to be verified;
    inputting the multi-channel picture data into a preset deep convolutional network to obtain a feature label corresponding to the multi-channel picture data; and
    when the feature label matches a target label, determining that the picture to be verified is a live picture.
  16. The computer device according to claim 15, wherein the processor further performs the following steps when executing the computer-readable instructions:
    obtaining RGB three-channel picture data of the picture to be verified according to the color information of the picture to be verified;
    obtaining HSV three-channel picture data of the picture to be verified according to the brightness information of the picture to be verified; and
    obtaining the multi-channel picture data from the RGB three-channel picture data and the HSV three-channel picture data.
  17. The computer device according to claim 16, wherein the processor further performs the following steps when executing the computer-readable instructions:
    inputting the multi-channel picture data into the preset deep convolutional network and performing computation on the RGB three-channel picture data and the HSV three-channel picture data through convolutional layers of the deep convolutional network to obtain picture features corresponding to the multi-channel picture data; and
    obtaining the feature label corresponding to the multi-channel picture data from the picture features.
  18. One or more non-volatile computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
    acquiring a picture to be verified;
    constructing multi-channel picture data of the picture to be verified according to color information and brightness information of the picture to be verified, the color information being pixel data of the picture to be verified and the brightness information being brightness performance information of the picture to be verified;
    inputting the multi-channel picture data into a preset deep convolutional network to obtain a feature label corresponding to the multi-channel picture data; and
    when the feature label matches a target label, determining that the picture to be verified is a live picture.
  19. The storage media according to claim 18, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    obtaining RGB three-channel picture data of the picture to be verified according to the color information of the picture to be verified;
    obtaining HSV three-channel picture data of the picture to be verified according to the brightness information of the picture to be verified; and
    obtaining the multi-channel picture data from the RGB three-channel picture data and the HSV three-channel picture data.
  20. The storage media according to claim 19, wherein the computer-readable instructions, when executed by the processor, further perform the following steps:
    inputting the multi-channel picture data into the preset deep convolutional network and performing computation on the RGB three-channel picture data and the HSV three-channel picture data through convolutional layers of the deep convolutional network to obtain picture features corresponding to the multi-channel picture data; and
    obtaining the feature label corresponding to the multi-channel picture data from the picture features.
PCT/CN2019/122920 2019-04-15 2019-12-04 Silent live picture recognition method and apparatus, computer device and storage medium WO2020211396A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910298482.6A CN110135259A (zh) 2019-04-15 2019-04-15 Silent live picture recognition method and apparatus, computer device and storage medium
CN201910298482.6 2019-04-15

Publications (1)

Publication Number Publication Date
WO2020211396A1 true WO2020211396A1 (zh) 2020-10-22

Family

ID=67569940

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/122920 WO2020211396A1 (zh) 2019-04-15 2019-12-04 静默式活体图片识别方法、装置、计算机设备和存储介质

Country Status (2)

Country Link
CN (1) CN110135259A (zh)
WO (1) WO2020211396A1 (zh)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110135259A (zh) * 2019-04-15 2019-08-16 深圳壹账通智能科技有限公司 静默式活体图片识别方法、装置、计算机设备和存储介质
CN112257685A (zh) * 2020-12-08 2021-01-22 成都新希望金融信息有限公司 人脸翻拍识别方法、装置、电子设备及存储介质
CN112464873A (zh) * 2020-12-09 2021-03-09 携程计算机技术(上海)有限公司 模型的训练方法、人脸活体识别方法、系统、设备及介质
CN112926497B (zh) * 2021-03-20 2024-07-05 杭州知存智能科技有限公司 基于多通道数据特征融合的人脸识别活体检测方法和装置
CN113111750A (zh) * 2021-03-31 2021-07-13 智慧眼科技股份有限公司 人脸活体检测方法、装置、计算机设备及存储介质
CN114360074A (zh) * 2022-01-10 2022-04-15 北京百度网讯科技有限公司 检测模型的训练方法、活体检测方法、装置、设备和介质
CN116259091B (zh) * 2023-01-18 2023-11-10 北京飞腾时光信息科技有限公司 一种静默活体检测的方法和装置

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107818313A (zh) * 2017-11-20 2018-03-20 腾讯科技(深圳)有限公司 Living body recognition method and apparatus, storage medium and computer device
CN107992842A (zh) * 2017-12-13 2018-05-04 深圳云天励飞技术有限公司 Liveness detection method, computer apparatus and computer-readable storage medium
CN109034102A (zh) * 2018-08-14 2018-12-18 腾讯科技(深圳)有限公司 Face liveness detection method, apparatus, device and storage medium
CN109101925A (zh) * 2018-08-14 2018-12-28 成都智汇脸卡科技有限公司 Liveness detection method
CN109271863A (zh) * 2018-08-15 2019-01-25 北京小米移动软件有限公司 Face liveness detection method and apparatus
CN110135259A (zh) * 2019-04-15 2019-08-16 深圳壹账通智能科技有限公司 Silent live picture recognition method and apparatus, computer device and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109376518A (zh) * 2018-10-18 2019-02-22 深圳壹账通智能科技有限公司 Face recognition-based privacy leakage prevention method and related device


Also Published As

Publication number Publication date
CN110135259A (zh) 2019-08-16

Similar Documents

Publication Publication Date Title
WO2020211396A1 (zh) Silent live picture recognition method and apparatus, computer device and storage medium
US11373275B2 (en) Method for generating high-resolution picture, computer device, and storage medium
CN110135406B (zh) Image recognition method and apparatus, computer device and storage medium
US10832086B2 (en) Target object presentation method and apparatus
US20230021661A1 (en) Forgery detection of face image
WO2020147445A1 (zh) Recaptured image recognition method and apparatus, computer device and computer-readable storage medium
CN111553267B (zh) Image processing method, image processing model training method and device
CN113435330B (zh) Video-based micro-expression recognition method, apparatus, device and storage medium
CN110020582B (zh) Deep-learning-based face emotion recognition method, apparatus, device and medium
US20230034040A1 (en) Face liveness detection method, system, and apparatus, computer device, and storage medium
JP2022133378A (ja) Face liveness detection method, apparatus, electronic device, and storage medium
WO2022033219A1 (zh) Face liveness detection method, system, apparatus, computer device and storage medium
CN110287836B (zh) Image classification method and apparatus, computer device and storage medium
CN110427972B (zh) Certificate video feature extraction method and apparatus, computer device and storage medium
CN111275685A (zh) Recaptured image recognition method, apparatus, device and medium for identity documents
CN111191521B (zh) Face liveness detection method and apparatus, computer device and storage medium
CN113469092B (zh) Character recognition model generation method and apparatus, computer device and storage medium
WO2021169616A1 (zh) Non-living face detection method and apparatus, computer device and storage medium
CN111339897B (zh) Liveness recognition method and apparatus, computer device and storage medium
US20210374476A1 (en) Method and system for identifying authenticity of an object
US20230143452A1 (en) Method and apparatus for generating image, electronic device and storage medium
WO2021169625A1 (zh) Detection method and apparatus for network-recaptured photos, computer device and storage medium
WO2022089185A1 (zh) Image processing method and image processing apparatus
CN109784154B (zh) Deep-neural-network-based emotion recognition method, apparatus, device and medium
CN108460811B (zh) Facial image processing method and apparatus, and computer device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19925152

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 02.02.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 19925152

Country of ref document: EP

Kind code of ref document: A1