WO2020211248A1 - Method, apparatus, storage medium, and computer device for parsing living body detection logs


Publication number
WO2020211248A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
log
sample
test
sub
Prior art date
Application number
PCT/CN2019/103199
Other languages
English (en)
French (fr)
Inventor
孙丹
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020211248A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40 Spoof detection, e.g. liveness detection
    • G06V40/45 Detection of the body part being alive

Definitions

  • This application relates to the technical field of log analysis, and in particular to a method, device, storage medium, and computer equipment for analyzing a living body detection log.
  • Living body detection is a method of determining the true physiological characteristics of a subject in certain identity verification scenarios. For example, in face recognition applications, living body detection can use combined actions such as blinking, opening the mouth, shaking the head, and nodding, together with technologies such as facial key point positioning and face tracking, to verify whether the user is a real living person.
  • At present, the living body detection technology needs to cover multiple models (such as the xfaceM1 model, the Ace model, the xface online model, etc.) during testing, and competitive product analysis against industry counterparts is also required. As a result, the number of samples in a living body detection test generally ranges from tens of thousands to hundreds of thousands. With so many samples, the results cannot be tallied purely by hand as in a conventional version test, and test efficiency is low.
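  • To solve the above technical problem, the present application provides a method, device, storage medium and computer equipment for analyzing a living body detection log.
  • According to a first aspect of the present application, a method for parsing a living body detection log is provided, including:
  • obtaining a test log of living body detection and determining the log format of the test log;
  • determining the corresponding parsing method according to the log format of the test log, and parsing the test log with that parsing method to determine the parsing result of each piece of log data in the test log, the parsing result including one or more sub-data in the log data;
  • determining the data field corresponding to each sub-data of the log data, and storing the sub-data at the corresponding position in the database according to the correspondence between the data fields and the table fields of the database;
  • performing statistics on the data stored in the database and generating the test result of the test log.
  • According to a second aspect of the present application, an apparatus for analyzing a living body detection log is provided, including:
  • an obtaining module, used to obtain a test log of living body detection and determine the log format of the test log;
  • a parsing module, used to determine the corresponding parsing method according to the log format of the test log, and to parse the test log with that parsing method to determine the parsing result of each piece of log data in the test log, the parsing result including one or more sub-data in the log data;
  • a processing module, used to determine the data field corresponding to each sub-data of the log data, and to store the sub-data at the corresponding position in the database according to the correspondence between the data fields and the table fields of the database;
  • a statistics module, used to perform statistics on the data stored in the database and generate the test result of the test log.
  • A computer-readable storage medium is provided, having computer-readable instructions stored thereon; when the computer-readable instructions are executed by a processor, the steps of parsing a living body detection log are implemented.
  • A computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; when the processor executes the computer-readable instructions, the steps of parsing a living body detection log are implemented.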
  • The method for parsing living body detection logs applies the corresponding parsing method to test logs of different models and different formats, determines the data field of each sub-data of every piece of log data, and stores the log data in a unified database, so that the test results of the test logs of different models can then be determined.
  • The method supports multi-sample, multi-model collection: by configuring the mapping between data and fields, logs are parsed quickly and test results are quickly counted and determined, which greatly improves efficiency and saves time.
  • FIG. 1 is a schematic flowchart of a method for analyzing a living body detection log provided by an embodiment of the application
  • FIG. 2 is a schematic flowchart of a method for determining the data field corresponding to each sub-data of the log data in the living body detection log analysis method provided by the embodiment of the application;
  • FIG. 3 is a schematic flowchart of a method for determining a test result of a test log in the method for analyzing a living body detection log provided by an embodiment of the application;
  • FIG. 4 is a schematic structural diagram of a living body detection log analysis device provided by an embodiment of the application.
  • FIG. 5 is a schematic structural diagram of a computer device for executing a method for analyzing a living body detection log provided by an embodiment of the application.
  • A method for parsing a living body detection log provided by an embodiment of the present application, as shown in FIG. 1, includes:
  • Step 101 Obtain the test log of the living body detection, and determine the log format of the test log.
  • Living body detection is a method of determining the true physiological characteristics of a subject in certain identity verification scenarios.
  • After a living body detection test is run, a corresponding test log is generated.
  • The test log contains the test results of the living body detection, such as a judgment result indicating the presence of a human face.
  • The test log may be in different formats: the log format includes the txt file format, pos file format, log file format, etc., or the data in the test log is in json (JavaScript Object Notation) format.
  • The living body detection technology needs to cover multiple models (such as the xfaceM1 model, the Ace model, the xface online model, etc.) during testing.
  • Different models may return test logs in different formats. For example, some models return test logs in json format, and some return test logs whose fields are joined by other separators (such as commas, spaces, etc.).
  • The data in non-standard test logs can be captured and formatted into a specified format for output, such as the txt file format, pos file format, etc.
  • Step 102 Determine the corresponding parsing method according to the log format of the test log, and parse the test log with that parsing method to determine the parsing result of each piece of log data in the test log.
  • The parsing result contains one or more sub-data in the log data.
  • Different test logs may adopt different log formats, so different parsing methods may be required; the specific parsing method is selected according to actual conditions.
  • Each test log contains one or more pieces of log data, and each piece of log data contains specific information about the test.
  • Each piece of log data can be divided into one or more sub-data.
  • The log data in test logs of different log formats may use different separators. In this case, different parsing methods are required to identify the separators so that the log data can be parsed correctly.
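  • The format-dependent parsing of Steps 101-102 can be illustrated with a minimal Python sketch. It is not the application's implementation: the function names `detect_format` and `parse_line`, the "json"/"delimited" labels, and the rule used to recognize json lines are assumptions made for illustration.

```python
import json


def detect_format(line: str) -> str:
    """Classify one log line as 'json' or 'delimited' (hypothetical labels)."""
    stripped = line.strip()
    if stripped.startswith("{") and stripped.endswith("}"):
        try:
            json.loads(stripped)
            return "json"
        except json.JSONDecodeError:
            pass  # looks like json but is not; fall back to separator parsing
    return "delimited"


def parse_line(line: str, separator: str = ",") -> list:
    """Split one piece of log data into its sub-data according to its format."""
    if detect_format(line) == "json":
        record = json.loads(line)
        # A json object keeps its key order, so the values come out in order.
        return [str(value) for value in record.values()]
    # Non-json log data: split on the separator used by this log format.
    return [part.strip() for part in line.strip().split(separator)]
```

  • With this sketch, `parse_line("M1, M1_Living_Detect_Score:, -0.398854")` splits on commas, `parse_line("a b c", separator=" ")` splits on spaces, and a line holding a json object is decoded with `json.loads`.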
  • The log data in a test log in non-json format contains separators (such as commas, spaces, etc.).
  • A piece of log data in non-json format with a comma as the separator is as follows:
  • A piece of log data in json format with a comma as the separator can be specifically as follows:
  • Step 103 Determine the data field corresponding to each sub-data of the log data, and store the sub-data at the corresponding position in the database according to the correspondence between the data field and the table field of the database.
  • Each sub-data of the log data corresponds to a field, that is, a data field; the data fields have a one-to-one correspondence with the table fields in the database.
  • For example, if the log data contains a score field, the database table also has a score field, and the two are in one-to-one correspondence.
  • "-0.398854" corresponds to the score field of the database table, "/home/_live/A4Paper_2121/VID20180628164054_525.jpg" corresponds to the path field of the database table, and so on.
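  • The storage part of Step 103 can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the table name `test_log`, the `model` data field, the `model_name` column, and the `FIELD_TO_COLUMN` mapping are hypothetical; only the score and path fields are named in the text.

```python
import sqlite3

# Hypothetical one-to-one mapping from data fields to database table fields.
FIELD_TO_COLUMN = {"model": "model_name", "score": "score", "path": "path"}


def store_record(conn: sqlite3.Connection, record: dict) -> None:
    """Store each sub-data of one log record in the column its field maps to."""
    fields = [field for field in record if field in FIELD_TO_COLUMN]
    columns = ", ".join(FIELD_TO_COLUMN[field] for field in fields)
    placeholders = ", ".join("?" for _ in fields)
    # Column names come from the fixed mapping above; values are bound safely.
    conn.execute(
        f"INSERT INTO test_log ({columns}) VALUES ({placeholders})",
        [record[field] for field in fields],
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_log (model_name TEXT, score REAL, path TEXT)")
store_record(conn, {
    "model": "M1",
    "score": -0.398854,
    "path": "/home/_live/A4Paper_2121/VID20180628164054_525.jpg",
})
```

  • Keeping the mapping in one dictionary means that supporting a new model's log only requires a new mapping, not new storage code, which matches the idea of configuring the relationship between data and fields.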
  • Step 104 Perform statistics on the data stored in the database to determine the test result of the test log.
  • Through the above process, the log data in the test logs of different models can be stored in a unified database in a formatted form, so that the log data can be counted conveniently and the corresponding test results obtained quickly.
  • The method for parsing living body detection logs applies the corresponding parsing method to test logs of different models and different formats, determines the data field of each sub-data of every piece of log data, and stores the log data in a unified database, so that the test results of the test logs of different models can then be determined.
  • The method supports multi-sample, multi-model collection: by configuring the mapping between data and fields, logs are parsed quickly and test results are quickly counted and determined, which greatly improves efficiency and saves time.
  • Another embodiment of the present application provides a method for parsing a living body detection log.
  • The method includes steps 101-104 in the foregoing embodiment.
  • For the implementation principles and technical effects of steps 101-104, refer to the embodiment corresponding to FIG. 1.
  • When the log data contains multiple sub-data, different sub-data correspond to different data fields. Some data fields have special formats, such as path fields, so the data fields of those sub-data can be easily identified; but different sub-data may also have the same or similar forms.
  • For example, the serial number of the log data in the test log is numeric, and the sample score (i.e., the score field) is also numeric.
  • For such sub-data, the recognition rate of traditional identification methods (such as regular expressions) may be low.
  • In this embodiment, the data field of each sub-data of the log data is determined by predetermining the sequence of the fields in the standard log data. For details, refer to FIG. 2: in step 103, "determining the data field corresponding to each sub-data of the log data" includes:
  • Step 1031 Determine the standard log data.
  • The standard log data is log data determined according to the log format of the test log, or log data selected from the test log that conforms to the log format.
  • The standard log data corresponding to each log format is predetermined. Specifically, the standard log data only needs to satisfy the log format, or the standard log data is one piece of log data in the test log.
  • The log data in this format is one piece of log data in the test log and contains five sub-data, namely "M1", "M1_Living_Detect_Score:", "-0.398854", "VID20180628164054_525.jpg", and "/home/_live/A4Paper_2121/VID20180628164054_525.jpg"; this piece of log data can be used directly as the standard log data for the non-json format.
  • Step 1032 Determine the standard sub-data contained in the standard log data and the sequence of each standard sub-data in the standard log data, and set a corresponding standard field for the sequence of each standard sub-data.
  • The standard sub-data are the sub-data contained in the standard log data.
  • The standard log data includes multiple standard sub-data, and different standard sub-data occupy different positions in the standard log data, that is, different ranks. In the non-json format log data, for example, the sub-data "M1" has the first rank, the sub-data "-0.398854" has the third rank, and so on.
  • The standard field of each standard sub-data is then determined, so that each standard sub-data has a unique sequence and a unique standard field, and a correspondence between sequence and field is formed.
  • For example, the sub-data "M1" is the image name field and the sub-data "-0.398854" is the sample score field, so the first rank corresponds to the image name field and the third rank corresponds to the sample score field; the other ranks are handled similarly and are not repeated here.
  • Because only one piece of standard log data is involved, more methods can be applied to determine the standard field of each standard sub-data more accurately; the extra processing this adds is very small and can be ignored.
  • Step 1033 Determine the sequence of each sub-data of the log data within the log data, and use the standard field corresponding to that sequence as the data field of the sub-data.
  • Sub-data of ordinary log data and sub-data of the standard log data that occupy the same sequence should have the same field, that is, they share the same correspondence between sequence and field.
  • For example, if the third-ranked standard sub-data corresponds to the standard field "sample score field", the data field of the third-ranked sub-data of every piece of log data is the sample score field.
  • In this embodiment, the data field of each sub-data of the log data is determined by predetermining the sequence of the fields in the standard log data, so that the data field of the sub-data can be determined accurately without increasing the workload, which facilitates the subsequent statistical processing.
  • For example, if a piece of log data (i.e., the standard log data) contains three sub-data whose fields are k1, k2, and k3, then k1, k2, and k3 are used as the fields of the three sub-data parsed from every other piece of log data, and there is no need to analyze the field corresponding to each sub-data in each piece of log data.
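  • The position-based assignment of Steps 1031-1033 can be sketched in a few lines of Python. The field names in `STANDARD_FIELDS` are hypothetical except for the first rank (image name), the third rank (sample score), and the path field, which follow the example in the text; `assign_fields` is an illustrative name.

```python
# Standard fields in sequence order, derived once from the standard log data.
# "label" and "file_name" are hypothetical names for the 2nd and 4th ranks.
STANDARD_FIELDS = ["image_name", "label", "score", "file_name", "path"]


def assign_fields(sub_data: list, standard_fields: list = STANDARD_FIELDS) -> dict:
    """Give each sub-data the standard field that shares its position."""
    if len(sub_data) != len(standard_fields):
        raise ValueError("log data does not match the standard log data layout")
    return dict(zip(standard_fields, sub_data))


record = assign_fields([
    "M1",
    "M1_Living_Detect_Score:",
    "-0.398854",
    "VID20180628164054_525.jpg",
    "/home/_live/A4Paper_2121/VID20180628164054_525.jpg",
])
```

  • Because the field is read off the position alone, two numeric sub-data (such as a serial number and a score) are never confused, which is the case where pattern matching tends to fail.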
  • The test result is generated based on the positive sample pass rate and/or the negative sample misrecognition rate.
  • In step 104 above, "determining the test result of the test log" includes:
  • Step 1041 Determine the sample attribute of each piece of log data in the test log.
  • The sample attribute is either positive sample or negative sample.
  • The sample attribute of the log data can be predetermined, that is, whether the log data is a positive sample or a negative sample is known in advance; alternatively, whether each piece of log data corresponds to a positive sample or a negative sample can be determined when the test log of the living body detection is obtained in step 101.
  • the log data is a positive sample; here, the resolution of an HD photo is generally greater than 640*480, and the resolution of an SD photo is ≤ 640*480.
  • the log data is a negative sample; here, an A4 paper or coated paper remake means that a photo is printed on A4 paper or coated paper and then re-photographed, and an origami mask remake means that a face is printed on coated paper and re-photographed with the eyes and mouth of the face exposed.
  • Step 1042 Determine the statistical parameters of the test log according to the n positive samples and/or m negative samples in the test log. The statistical parameters include the positive sample pass rate and/or the negative sample misrecognition rate: the positive sample pass rate is the ratio of the number of positive samples whose sample score is greater than the first living body threshold to n, and the negative sample misrecognition rate is the ratio of the number of negative samples whose sample score is greater than the second living body threshold to m. The sample score corresponds to a data field in the log data.
  • The test log contains multiple pieces of log data, that is, it generally contains multiple positive samples and/or multiple negative samples. In this embodiment, the statistical parameters of the test log, namely the positive sample pass rate and/or the negative sample misrecognition rate, are determined based on the n positive samples and/or m negative samples.
  • One data field in the log data is the sample score field, such as "-0.398854" in the non-json format log data.
  • The sample score (i.e., score) is the score that living body detection gives to a picture. Whether it can be negative depends on the model or algorithm used; for example, the score of the M1 model can be positive or negative.
  • The positive sample pass rate is the ratio of the number of positive samples with a sample score greater than the first living body threshold to n, and the negative sample misrecognition rate is the ratio of the number of negative samples with a sample score greater than the second living body threshold to m.
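  • The two statistical parameters of Step 1042 reduce to two ratios, which can be sketched as follows. The function names and the sample scores are illustrative assumptions, and the sketch assumes n > 0 and m > 0.

```python
def positive_pass_rate(positive_scores: list, first_threshold: float) -> float:
    """Share of the n positive samples whose score exceeds the first threshold."""
    passed = sum(1 for score in positive_scores if score > first_threshold)
    return passed / len(positive_scores)


def negative_misrecognition_rate(negative_scores: list, second_threshold: float) -> float:
    """Share of the m negative samples whose score exceeds the second threshold."""
    misrecognized = sum(1 for score in negative_scores if score > second_threshold)
    return misrecognized / len(negative_scores)


# Made-up scores: 3 of 4 positives pass, 1 of 5 negatives is misrecognized.
pass_rate = positive_pass_rate([0.9, 0.8, 0.4, 0.7], first_threshold=0.5)
miss_rate = negative_misrecognition_rate([0.1, 0.6, 0.2, 0.3, -0.4], second_threshold=0.5)
```

  • Both functions use a strict "greater than" comparison, as in the definitions above, so a sample whose score equals the threshold is not counted.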
  • Determining the negative sample misrecognition rate in the statistical parameters includes:
  • Step A1 Arrange the n positive samples in reverse order, determine the sample score of the n0-th positive sample among the positive samples arranged in reverse order, and use that sample score as the second living body threshold, where n0 = f(n × Rt0), Rt0 is the preset standard positive sample pass rate, and f() represents a rounding function.
  • Step A2 The ratio of the number of negative samples with sample scores greater than the second living body threshold to the total number of negative samples m is used as the negative sample misrecognition rate.
  • In this embodiment, the standard positive sample pass rate Rt0 is preset, and the position in the n positive samples that corresponds to the standard positive sample pass rate, namely n0, can then be determined.
  • The function f() represents a rounding function; specifically, it can be the function INT(), which returns the largest integer not greater than the target value, or a round-up function, a round-to-nearest function, etc., which is not limited in this embodiment.
  • n0 means that, among the n positive samples, n0 positive samples satisfy the standard positive sample pass rate Rt0.
  • The n positive samples are arranged in reverse (descending) order of sample score to determine which sample is the n0-th positive sample, and the sample score of that sample is used as the second living body threshold.
  • For example, if the sample score of the n0-th positive sample is 0.6, then 0.6 can be used as the second living body threshold.
  • The negative sample misrecognition rate corresponding to the standard positive sample pass rate can then be determined, that is, the ratio of the number of negative samples whose sample score is greater than the second living body threshold (for example, 0.6) to the total number of negative samples m.
  • Determining the negative sample misrecognition rate from the standard positive sample pass rate allows the performance of the living body detection algorithm to be evaluated more comprehensively against the dual criteria of positive sample pass rate and negative sample misrecognition rate.
  • Multiple standard positive sample pass rates can be set, such as 90%, 95%, 98%, etc.
  • The corresponding second living body thresholds can then be determined in turn, such as 0.95, 0.8, 0.6, etc., followed by the corresponding negative sample misrecognition rates.
  • Under this procedure, a higher standard positive sample pass rate selects a lower-ranked positive sample and therefore a lower threshold, so the resulting negative sample misrecognition rate is higher (or at least no lower).
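  • The threshold selection described above (sort the positive scores in descending order, take the score at position n0 = f(n × Rt0), then count the negatives above it) can be sketched as follows. The sketch assumes f() is the INT()-style floor, which is only one of the rounding choices the text allows; the function names and the scores are made up for illustration.

```python
import math


def second_threshold(positive_scores: list, standard_pass_rate: float) -> float:
    """Score of the n0-th positive sample in descending order, n0 = floor(n * Rt0)."""
    n0 = math.floor(len(positive_scores) * standard_pass_rate)
    if n0 < 1:
        raise ValueError("standard pass rate too low for this number of samples")
    ranked = sorted(positive_scores, reverse=True)
    return ranked[n0 - 1]  # positions are 1-based in the text


def misrecognition_rate_at(positive_scores: list, negative_scores: list,
                           standard_pass_rate: float) -> float:
    """Negative sample misrecognition rate at the threshold implied by Rt0."""
    threshold = second_threshold(positive_scores, standard_pass_rate)
    above = sum(1 for score in negative_scores if score > threshold)
    return above / len(negative_scores)


positives = [0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55, 0.3]
negatives = [0.1, 0.58, 0.2, 0.8]
rate_at_90 = misrecognition_rate_at(positives, negatives, 0.9)
rate_at_50 = misrecognition_rate_at(positives, negatives, 0.5)
```

  • Running the same function for several standard pass rates gives one (pass rate, misrecognition rate) pair per setting, which is the dual-criteria comparison the embodiment describes.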
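  • Determining the positive sample pass rate in the statistical parameters includes:
  • Step B1 Arrange the m negative samples in reverse order, determine the sample score of the m0-th negative sample among the negative samples arranged in reverse order, and use that sample score as the first living body threshold, where m0 = f(m × Rf0), Rf0 is the preset standard negative sample misrecognition rate, and f() represents a rounding function.
  • Step B2 The ratio of the number of positive samples with a sample score greater than the first living body threshold to the total number of positive samples n is taken as the positive sample pass rate.
  • In this embodiment, the positive sample pass rate is determined from the standard negative sample misrecognition rate. As with steps A1-A2 above, this allows the performance of the living body detection algorithm to be evaluated more comprehensively against the dual criteria of positive sample pass rate and negative sample misrecognition rate.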
  • Step 1043 Generate the test result of the test log according to the statistical parameters.
  • the test results of the test log can be generated after the statistical parameters are determined, and the pros and cons of the living body detection algorithm can be determined.
  • This embodiment eliminates useless or erroneous log data by setting the number of valid data columns.
  • Specifically, in step 102 above, "parsing the test log to determine the parsing result of each piece of log data in the test log" includes:
  • Step C1 Determine the number of valid data columns of the test log according to the log format of the test log.
  • The number of valid data columns is the number of sub-data that each piece of log data in the test log should contain.
  • Step C2 Parse the test log and determine the number of sub-data contained in each piece of parsed log data. When the number of sub-data contained in the log data is inconsistent with the number of valid data columns, the log data is excluded; when the number of sub-data is consistent with the number of valid data columns, the parsing result of the log data is determined.
  • Each piece of log data in the test log contains multiple sub-data, and each sub-data corresponds to one data field. Since the sub-data of different data fields are stored as "columns" in the database, this embodiment sets a number of valid data columns for the test log to indicate how many sub-data each piece of log data should contain. If the number of columns (number of fields) of a piece of log data is inconsistent with the number of valid data columns, that piece of data may be faulty or may have been misidentified. The erroneous log data can then be re-identified or eliminated, so that it does not adversely affect the subsequent analysis and cause inaccurate results.
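  • The column-count check of Steps C1-C2 can be sketched as a simple filter. The function name `split_by_column_count` and the sample rows are assumptions for illustration; the sketch keeps the rejected rows so that they can be re-identified or discarded later.

```python
def split_by_column_count(parsed_rows: list, valid_columns: int) -> tuple:
    """Separate parsed log data into rows with the expected number of sub-data
    and rows that must be excluded or re-identified."""
    kept, rejected = [], []
    for row in parsed_rows:
        if len(row) == valid_columns:
            kept.append(row)
        else:
            rejected.append(row)
    return kept, rejected


rows = [
    ["M1", "M1_Living_Detect_Score:", "-0.398854"],
    ["M1", "M1_Living_Detect_Score:"],          # one sub-data missing
    ["M1", "M1_Living_Detect_Score:", "0.812"],
]
kept, rejected = split_by_column_count(rows, valid_columns=3)
```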
  • In step 103 above, "storing the sub-data at the corresponding location of the database" includes: eliminating the unnecessary data fields, and storing the sub-data corresponding to the remaining valid data fields at the corresponding location of the database.
  • An "unnecessary data field" may be an entire field that is not needed, such as the "picture name" field.
  • An "unnecessary data field" may also be part of the data under a certain data field.
  • For example, a data field may correspond to multiple types of sub-data; when determining the test result, some sub-data under this data field is useful and some is useless, and the useless sub-data can also be regarded as data that needs to be eliminated.
  • For example, a test log in json format has an error code field (error_code), and statistics should cover only the log data whose error_code is 0 and that has a sample score (score); that is, sub-data whose error_code is not 0 is useless data.
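  • The error_code rule can be sketched as a generator over json log lines. The field names `error_code` and `score` come from the text; the function name `usable_records` and the sample lines are assumptions for illustration.

```python
import json


def usable_records(lines):
    """Yield json log records that have error_code 0 and carry a score."""
    for line in lines:
        record = json.loads(line)
        if record.get("error_code") == 0 and "score" in record:
            yield record


sample_lines = [
    '{"error_code": 0, "score": 0.91}',
    '{"error_code": 3}',                 # failed call: excluded
    '{"error_code": 0}',                 # no score: excluded
]
usable = list(usable_records(sample_lines))
```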
  • The method for parsing living body detection logs applies the corresponding parsing method to test logs of different models and different formats, determines the data field of each sub-data of every piece of log data, and stores the log data in a unified database, so that the test results of the test logs of different models can then be determined.
  • The method supports multi-sample, multi-model collection: by configuring the mapping between data and fields, logs are parsed quickly and test results are quickly counted and determined, which greatly improves efficiency and saves time.
  • In addition, by using the standard positive sample pass rate to determine the negative sample misrecognition rate, the performance of the living body detection algorithm can be evaluated more comprehensively against the dual criteria of positive sample pass rate and negative sample misrecognition rate.
  • the process of the method for analyzing the living body detection log is described in detail above.
  • the method can also be implemented by a corresponding device.
  • the structure and function of the device are described in detail below.
  • An apparatus for analyzing a living body detection log provided by an embodiment of the present application, as shown in FIG. 4, includes:
  • the obtaining module 41 is used to obtain the test log of the living body detection and determine the log format of the test log;
  • the parsing module 42 is used to determine the corresponding parsing method according to the log format of the test log, and to analyze the test log through the parsing method to determine the parsing result of each log data in the test log.
  • the analysis result includes one or more sub-data in the log data;
  • the processing module 43 is configured to determine the data field corresponding to each sub-data of the log data, and to store the sub-data at the corresponding position in the database according to the correspondence between the data field and the table field of the database;
  • the statistics module 44 is configured to perform statistics on the data stored in the database and generate test results of the test log.
  • the processing module 43 determines the data field corresponding to each sub-data of the log data, including:
  • Standard log data is log data determined according to the log format of the test log, or log data selected from the test log that meets the log format;
  • the sequence of the sub-data of the log data in the log data is determined, and the standard field corresponding to the sequence of the sub-data is used as the data field of the sub-data.
  • the statistics module 44 includes:
  • a sample determining unit configured to determine a sample attribute of each log data in the test log, the sample attribute includes a positive sample and a negative sample;
  • a parameter determination unit configured to determine statistical parameters of the test log according to n positive samples and/or m negative samples in the test log, the statistical parameters including a positive sample pass rate and/or a negative sample misrecognition rate
  • the positive sample pass rate is the ratio of the number of positive samples with sample scores greater than the first living body threshold to n
  • the negative sample misidentification rate is the ratio of the number of negative samples with sample scores greater than the second living body threshold to m
  • the sample score corresponds to a data field in the log data
  • the statistical unit is configured to generate the test result of the test log according to the statistical parameter.
  • the parameter determination unit determining the statistical parameters of the test log includes:
  • arranging the n positive samples in reverse order, determining the sample score of the n0-th positive sample among the positive samples arranged in reverse order, and taking the sample score of the n0-th positive sample as the second living body threshold;
  • n0 = f(n × Rt0), where Rt0 is the preset standard positive sample pass rate and f() represents the rounding function;
  • taking the ratio of the number of negative samples with sample scores greater than the second living body threshold to the total number of negative samples m as the negative sample misrecognition rate.
  • the parameter determination unit determining the statistical parameters of the test log includes:
  • arranging the m negative samples in reverse order, determining the sample score of the m0-th negative sample among the negative samples arranged in reverse order, and taking the sample score of the m0-th negative sample as the first living body threshold;
  • m0 = f(m × Rf0), where Rf0 is the preset standard negative sample misrecognition rate and f() represents the rounding function;
  • taking the ratio of the number of positive samples with a sample score greater than the first living body threshold to the total number of positive samples n as the positive sample pass rate.
  • the parsing module 42 parsing the test log to determine the parsing result of each piece of log data in the test log includes:
  • the processing module 43 stores the sub-data in a corresponding location of the database, including:
  • the sub-data corresponding to the valid data field is stored in a corresponding location of the database.
  • The apparatus for analyzing a living body detection log applies the corresponding parsing method to test logs of different models and different formats, determines the data field of each sub-data of every piece of log data, and stores the log data in a unified database, so that the test results of the test logs of different models can then be determined.
  • The approach supports multi-sample, multi-model collection: by configuring the mapping between data and fields, logs are parsed quickly and test results are quickly counted and determined, which greatly improves efficiency and saves time.
  • In addition, by using the standard positive sample pass rate to determine the negative sample misrecognition rate, the performance of the living body detection algorithm can be evaluated more comprehensively against the dual criteria of positive sample pass rate and negative sample misrecognition rate.
  • An embodiment of the present application also provides a computer storage medium that stores computer-executable instructions containing a program for executing the above-mentioned method for analyzing a living body detection log; the computer-executable instructions can execute the method in any of the above-mentioned method embodiments.
  • the computer storage medium may be any available medium or data storage device that the computer can access, including but not limited to magnetic storage (such as floppy disk, hard disk, magnetic tape, magneto-optical disk (MO), etc.), optical storage (such as CD, DVD, BD, HVD, etc.), and semiconductor memory (such as ROM, EPROM, EEPROM, non-volatile memory (NANDFLASH), solid state drive (SSD)), etc.
  • FIG. 5 shows a structural block diagram of a computer device according to another embodiment of the present application.
  • The computer device 1100 may be a host server with computing capabilities, a personal computer (PC), or a portable computer or terminal.
  • The specific embodiments of the present application do not limit the specific implementation of the computer device.
  • The computer device 1100 includes at least one processor 1110, a communication interface 1120, a memory 1130, and a bus 1140. The processor 1110, the communication interface 1120, and the memory 1130 communicate with each other through the bus 1140.
  • the communication interface 1120 is used to communicate with network elements, where the network elements include, for example, a virtual machine management center, shared storage, and the like.
  • the processor 1110 is used to execute programs.
  • the processor 1110 may be a central processing unit CPU, or an application specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
  • The memory 1130 is used to store executable instructions.
  • The memory 1130 may include high-speed RAM, and may also include non-volatile memory, for example, at least one disk memory.
  • the memory 1130 may also be a memory array.
  • the memory 1130 may also be divided into blocks, and the blocks may be combined into a virtual volume according to certain rules.
  • the instructions stored in the memory 1130 may be executed by the processor 1110, so that the processor 1110 can execute the method in any of the foregoing method embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Debugging And Monitoring (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

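A method, apparatus, storage medium and computer device for parsing liveness detection logs. The method includes: obtaining a test log of liveness detection and determining the log format of the test log (101); determining a corresponding parsing method according to the log format of the test log, parsing the test log by the parsing method, and determining a parsing result for each log data entry in the test log (102); determining the data field corresponding to each sub-data item of the log data entry, and storing the sub-data item at the corresponding position in a database according to the correspondence between data fields and table fields of the database (103); and compiling statistics on the data stored in the database to generate a test result of the test log (104). The method supports collection across many samples and many models. By configuring the mapping between data and fields, it parses logs quickly and quickly computes the test result, which greatly improves efficiency and saves time.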

Description

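Method, Apparatus, Storage Medium and Computer Device for Parsing Liveness Detection Logs
Technical Field
This application relates to the technical field of log parsing, and in particular to a method, apparatus, storage medium and computer device for parsing liveness detection logs.
Background
Liveness detection is a method of determining the real physiological characteristics of a subject in certain identity verification scenarios. For example, in face recognition applications, liveness detection can verify whether the user is a real, live person operating in person, through combined actions such as blinking, opening the mouth, shaking the head and nodding, using techniques such as facial keypoint localization and face tracking.
At present, testing of liveness detection technology must cover multiple models (for example the xfaceM1 model, the Ace model and the xface online model) and must include competitive analysis against industry peers. As a result, the number of samples in a liveness detection test typically ranges from tens of thousands to hundreds of thousands. With so many samples, results cannot be tallied purely by hand as in routine version testing, and test efficiency is low.
Summary
To solve the above technical problem, this application provides a method, apparatus, storage medium and computer device for parsing liveness detection logs.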
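According to a first aspect of this application, a method for parsing liveness detection logs is provided, including:
obtaining a test log of liveness detection, and determining the log format of the test log;
determining a corresponding parsing method according to the log format of the test log, parsing the test log by the parsing method, and determining a parsing result for each log data entry in the test log, the parsing result containing one or more sub-data items of the log data entry;
determining the data field corresponding to each sub-data item of the log data entry, and storing the sub-data item at the corresponding position in a database according to the correspondence between the data field and the table fields of the database;
compiling statistics on the data stored in the database to generate a test result of the test log.
According to a second aspect of this application, an apparatus for parsing liveness detection logs is provided, including:
an acquisition module, configured to obtain a test log of liveness detection and determine the log format of the test log;
a parsing module, configured to determine a corresponding parsing method according to the log format of the test log, parse the test log by the parsing method, and determine a parsing result for each log data entry in the test log, the parsing result containing one or more sub-data items of the log data entry;
a processing module, configured to determine the data field corresponding to each sub-data item of the log data entry, and store the sub-data item at the corresponding position in a database according to the correspondence between the data field and the table fields of the database;
a statistics module, configured to compile statistics on the data stored in the database to generate a test result of the test log.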
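According to a third aspect of this application, a computer-readable storage medium is provided, on which computer-readable instructions are stored. When executed by a processor, the computer-readable instructions implement the steps of liveness detection log parsing.
According to a fourth aspect of this application, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor. When executing the computer-readable instructions, the processor implements the steps of liveness detection log parsing.
The method for parsing liveness detection logs provided by the embodiments of this application handles test logs of different models and different formats by using a corresponding parsing method to determine the field of each sub-data item in each log data entry, and stores the log data in a unified database, so that the test results of the test logs of different models can be determined. The approach supports collection across many samples and many models. By configuring the mapping between data and fields, it parses logs quickly and quickly computes the test result, which greatly improves efficiency and saves time.
Other features and advantages of this application will be set forth in the description that follows, and will in part become apparent from the description or be understood by practicing this application. The objectives and other advantages of this application can be realized and obtained by the structures particularly pointed out in the written description, the claims and the accompanying drawings.
The technical solution of this application is described in further detail below with reference to the drawings and embodiments.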
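Brief Description of the Drawings
The drawings provide a further understanding of this application and form part of the specification. Together with the embodiments of this application they serve to explain this application and do not limit it. In the drawings:
Fig. 1 is a schematic flowchart of a method for parsing liveness detection logs provided by an embodiment of this application;
Fig. 2 is a schematic flowchart of the method for determining the data field corresponding to each sub-data item of a log data entry, within the method for parsing liveness detection logs provided by an embodiment of this application;
Fig. 3 is a schematic flowchart of the method for determining the test result of a test log, within the method for parsing liveness detection logs provided by an embodiment of this application;
Fig. 4 is a schematic structural diagram of an apparatus for parsing liveness detection logs provided by an embodiment of this application;
Fig. 5 is a schematic structural diagram of a computer device for executing the method for parsing liveness detection logs provided by an embodiment of this application.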
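Detailed Description
Preferred embodiments of this application are described below with reference to the drawings. It should be understood that the preferred embodiments described here serve only to illustrate and explain this application and are not intended to limit it.
A method for parsing liveness detection logs provided by an embodiment of this application, referring to Fig. 1, includes:
Step 101: obtain a test log of liveness detection, and determine the log format of the test log.
Liveness detection is a method of determining the real physiological characteristics of a subject in certain identity verification scenarios. A corresponding test log is generated when liveness detection is performed, and the test log contains the test results of liveness detection, for example a judgment that a face is present. In the embodiments of this application, test logs may be in different formats: the log format of a test log includes the txt file format, the pos file format, the log file format and so on, or the data in the test log is in json (JavaScript Object Notation) format. Specifically, testing of liveness detection technology must cover multiple models (for example the xfaceM1 model, the Ace model and the xface online model), and different models may return test logs in different formats, which are recorded in different ways. For example, some models return test logs in json format, while others return test logs whose values are joined by other delimiters (such as commas or spaces). In addition, data in non-standard test logs can be extracted and formatted for output in a specified style, such as the txt file format or the pos file format.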
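As an illustration of step 101, the following Python sketch guesses the log format from the first line of a log and falls back to the file extension. The function name `detect_log_format`, the returned labels and the heuristic itself are assumptions made for this example; the text does not specify how the format is determined.

```python
import json
from pathlib import Path


def detect_log_format(file_name: str, first_line: str) -> str:
    """Guess a log format label (hypothetical heuristic, not taken from the text)."""
    line = first_line.strip()
    if line.startswith("{"):
        try:
            json.loads(line)
            return "json"
        except json.JSONDecodeError:
            pass  # not valid JSON, fall through to the delimiter checks
    if "," in line:
        return "comma"
    if " " in line:
        return "space"
    # Fall back to the file extension (txt, pos and log are named in the text).
    return Path(file_name).suffix.lstrip(".").lower() or "unknown"
```

The JSON check runs first because a json entry can itself contain commas inside its string values.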
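Step 102: determine a corresponding parsing method according to the log format of the test log, parse the test log by the parsing method, and determine a parsing result for each log data entry in the test log, the parsing result containing one or more sub-data items of the log data entry.
In the embodiments of this application, different test logs may use different log formats and may therefore require different parsing methods. Which parsing method to use can be decided according to the actual situation. Each test log contains one or more log data entries, each log data entry contains the specific information of one test, and the entry can be divided into one or more sub-data items. For example, log data entries in test logs of different log formats may use different delimiters. In that case different parsing methods are needed to recognize the delimiter so that the log data can be parsed correctly.
In general, log data entries in a non-json test log contain delimiters (such as commas or spaces). A non-json log data entry that uses a comma as the delimiter looks as follows:
"M1,M1_Living_Detect_Score:,-0.398854,VID20180628164054_525.jpg,/home/_live/A4Paper_2121/VID20180628164054_525.jpg".
A comma-delimited log data entry in json format may look as follows:
{"errorcode":"5","errormsg":"人脸比对失败,图片内容为空,请重新上传!","file":"bfd65b633536f0e-A.jpeg,bfd65b633536f0e-B.jpg","ref_thres":"42.000000","similarity":"0"}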
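The two sample entries above can be parsed with a short Python sketch. The helper names `parse_delimited` and `parse_json` are hypothetical; the sample strings are copied from the text.

```python
import json


def parse_delimited(line: str, delimiter: str = ",") -> list[str]:
    """Split one non-json log data entry into its sub-data items."""
    return [part.strip() for part in line.strip().split(delimiter)]


def parse_json(line: str) -> dict:
    """Parse one json log data entry into a field-to-value mapping."""
    return json.loads(line)


delimited_line = (
    "M1,M1_Living_Detect_Score:,-0.398854,VID20180628164054_525.jpg,"
    "/home/_live/A4Paper_2121/VID20180628164054_525.jpg"
)
json_line = (
    '{"errorcode":"5","errormsg":"人脸比对失败,图片内容为空,请重新上传!",'
    '"file":"bfd65b633536f0e-A.jpeg,bfd65b633536f0e-B.jpg",'
    '"ref_thres":"42.000000","similarity":"0"}'
)

sub_data = parse_delimited(delimited_line)  # five sub-data items
record = parse_json(json_line)              # keys are already field names
```

A json entry carries its own field names, whereas a delimited entry only yields an ordered list, which is why the delimited case needs a separate step to assign fields.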
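Step 103: determine the data field corresponding to each sub-data item of the log data entry, and store the sub-data item at the corresponding position in the database according to the correspondence between the data field and the table fields of the database.
In the embodiments of this application, each sub-data item of a log data entry corresponds to one field, namely the data field, and the data field is in one-to-one correspondence with a table field in the database. For example, if the log data contains a score field, the database table also has a score field, and the two correspond one to one. In the non-json log data entry above, "-0.398854" corresponds to the score field of the database table, "/home/_live/A4Paper_2121/VID20180628164054_525.jpg" corresponds to the path field of the database table, and so on. Once the correspondence between sub-data items and database table fields is determined, the fields of the log data can be loaded into the database, that is, each sub-data item is stored at its corresponding position in the database.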
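A minimal sketch of this field-to-column storage, using Python's built-in `sqlite3` module. The table name `liveness_log`, the mapping `FIELD_TO_COLUMN` and the helper `store_sub_data` are assumptions for illustration; the text names only the score and path fields and does not name a database engine.

```python
import sqlite3

# Hypothetical one-to-one mapping from data field to table column.
FIELD_TO_COLUMN = {"score": "score", "path": "path"}


def store_sub_data(conn: sqlite3.Connection, table: str, record: dict) -> None:
    """Insert the mapped sub-data items of one log data entry as one row."""
    fields = [name for name in record if name in FIELD_TO_COLUMN]
    columns = ", ".join(FIELD_TO_COLUMN[name] for name in fields)
    placeholders = ", ".join("?" for _ in fields)
    # Table and column names come from trusted configuration, not from the log.
    conn.execute(
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
        [record[name] for name in fields],
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE liveness_log (score REAL, path TEXT)")
store_sub_data(
    conn,
    "liveness_log",
    {"score": -0.398854, "path": "/home/_live/A4Paper_2121/VID20180628164054_525.jpg"},
)
rows = conn.execute("SELECT score, path FROM liveness_log").fetchall()
# Once the data is in one table, statistics reduce to simple queries.
count = conn.execute("SELECT COUNT(*) FROM liveness_log").fetchone()[0]
```

Values are passed as bound parameters, so log content is never interpolated into the SQL text.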
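Step 104: compile statistics on the data stored in the database, and determine the test result of the test log.
In the embodiments of this application, the above process stores the log data of the test logs of different models in a unified database in a formatted form. This makes it easy to compile statistics on the log data in the test logs, so the corresponding test result can be obtained quickly.
The method for parsing liveness detection logs provided by the embodiments of this application handles test logs of different models and different formats by using a corresponding parsing method to determine the field of each sub-data item in each log data entry, and stores the log data in a unified database, so that the test results of the test logs of different models can be determined. The approach supports collection across many samples and many models. By configuring the mapping between data and fields, it parses logs quickly and quickly computes the test result, which greatly improves efficiency and saves time.
Another embodiment of this application provides a method for parsing liveness detection logs that includes steps 101-104 of the above embodiment. For its implementation principle and technical effect, see the embodiment corresponding to Fig. 1. A log data entry contains multiple sub-data items, and different sub-data items correspond to different data fields. Some data fields have a distinctive format, such as the path field, and the data field of such a sub-data item is relatively easy to identify. However, different sub-data items may have the same or a similar form. For example, the sequence number of a log data entry within the test log is numeric, and the sample score (the score field) is also numeric, so traditional identification approaches (such as regular expressions) may have a low recognition rate. In this embodiment, the data field of each sub-data item is determined from the predetermined ordinal positions of the fields in standard log data. Specifically, referring to Fig. 2, the part of step 103 "determining the data field corresponding to each sub-data item of the log data entry" includes: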
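Step 1031: determine standard log data, the standard log data being log data determined according to the log format of the test log, or log data selected from the test log that conforms to the log format.
In the embodiments of this application, the standard log data corresponding to each log format is determined in advance. Specifically, the standard log data only needs to satisfy the log format, or the standard log data is simply one log data entry in the test log. Taking the non-json log data entry above as an example, the entry is one log data entry in the test log and contains five sub-data items, namely "M1", "M1_Living_Detect_Score:", "-0.398854", "VID20180628164054_525.jpg" and "/home/_live/A4Paper_2121/VID20180628164054_525.jpg". This entry can be used directly as the standard log data for the non-json format.
Step 1032: determine the standard sub-data items contained in the standard log data and the ordinal position of each standard sub-data item in the standard log data, and set a corresponding standard field for the standard sub-data item at each ordinal position.
In the embodiments of this application, standard sub-data items are the sub-data items contained in the standard log data. The standard log data contains multiple standard sub-data items, and different standard sub-data items occupy different places in the standard log data, that is, different ordinal positions. In the non-json log data entry above, the sub-data item "M1" is at the first position, the sub-data item "-0.398854" is at the third position, and so on. When the standard sub-data items are determined, the standard field of each standard sub-data item is determined as well, so that each standard sub-data item has a unique ordinal position and a unique standard field, which yields a correspondence between ordinal positions and fields. In the example above, the sub-data item "M1" is the image name field and the sub-data item "-0.398854" is the sample score field, so the first position corresponds to the image name field and the third position corresponds to the sample score field. The other positions are handled similarly and are not described further here. Because only one standard log data entry is selected, more methods can be applied to determine the standard field of each of its standard sub-data items more accurately, and the extra processing for a single entry is negligible.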
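Step 1033: determine the ordinal position of a sub-data item of the log data entry within the log data entry, and take the standard field corresponding to that ordinal position as the data field of the sub-data item.
In this embodiment, the log data entries and the standard log data come from test logs of the same log format, so an ordinary log data entry and the standard log data should have the same field at the same ordinal position, that is, they share the same correspondence between ordinal positions and fields. In the example above, the sub-data item at the third position corresponds to the standard field "sample score field", so in this test log the sub-data item at the third position of every log data entry is the sample score field. By determining the data field of each sub-data item from the predetermined ordinal positions of the fields in the standard log data, the data fields of sub-data items can be determined accurately without much additional work, which simplifies the subsequent statistics.
In addition, when the standard fields are determined, the table fields corresponding to the standard fields can be created in the database at the same time, so that the table fields of the database need not be configured again when sub-data items are later stored in bulk. For example, if the first log data entry read (the standard log data) is {d1,d2,d3} and its three sub-data items correspond to the fields k1, k2 and k3, then k1, k2 and k3 are used as three table fields of the database. When other log data entries are parsed later, the parsed sub-data items are stored under the fields k1, k2 and k3 respectively, and there is no need to work out the field of each sub-data item in each log data entry.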
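The position-based assignment of steps 1031-1033 can be sketched in Python using the {d1,d2,d3} and k1, k2, k3 example from the text. The function names and the choice of TEXT columns are assumptions for illustration.

```python
# Standard fields in ordinal order, taken from the k1/k2/k3 example in the text.
STANDARD_FIELDS = ["k1", "k2", "k3"]


def map_by_position(sub_data: list[str], standard_fields: list[str]) -> dict[str, str]:
    """Assign each sub-data item the standard field at the same ordinal position."""
    if len(sub_data) != len(standard_fields):
        raise ValueError("entry does not match the standard log data layout")
    return dict(zip(standard_fields, sub_data))


def create_table_sql(table: str, standard_fields: list[str]) -> str:
    """Build the statement that creates one table field per standard field."""
    columns = ", ".join(f"{name} TEXT" for name in standard_fields)
    return f"CREATE TABLE IF NOT EXISTS {table} ({columns})"


mapped = map_by_position(["d1", "d2", "d3"], STANDARD_FIELDS)
ddl = create_table_sql("test_log", STANDARD_FIELDS)
```

Because the mapping is a plain positional zip, every later entry costs only one pass over its sub-data items, with no per-item field recognition.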
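On the basis of the above embodiment, the test result is generated from the positive-sample pass rate and/or the negative-sample false acceptance rate. Specifically, referring to Fig. 3, step 104 "determining the test result of the test log" includes:
Step 1041: determine the sample attribute of each log data entry in the test log, the sample attribute being either positive sample or negative sample.
In the embodiments of this application, the sample attribute of a log data entry can be determined in advance, that is, whether the entry is a positive sample or a negative sample can be known beforehand. Alternatively, whether each log data entry corresponds to a positive or a negative sample can be determined when the test log of liveness detection is obtained in step 101. Optionally, during liveness detection, if the image corresponding to a log data entry is a first-generation image, such as a high-definition photo (captured directly by a camera, WeChat H5 scenario) or a standard-definition photo (captured by an SDK, app scenario), the log data entry is a positive sample. In general, a high-definition photo has a resolution greater than 640×480 and a standard-definition photo has a resolution of at most 640×480. If the image corresponding to the log data entry is a recaptured image, such as a recapture from a phone or pad screen, from A4 paper, from coated paper or from a paper mask, the log data entry is a negative sample. A4 paper and coated paper recapture means that the photo is printed on A4 paper or coated paper and then photographed again. Paper mask recapture means that the photo is printed on coated paper, the eyes and mouth of the face are cut out, and the result is then photographed again.
Step 1042: determine the statistical parameters of the test log according to the n positive samples and/or m negative samples in the test log, the statistical parameters including a positive-sample pass rate and/or a negative-sample false acceptance rate. The positive-sample pass rate is the ratio of the number of positive samples whose sample score is greater than a first liveness threshold to n, and the negative-sample false acceptance rate is the ratio of the number of negative samples whose sample score is greater than a second liveness threshold to m. The sample score corresponds to one data field in the log data entry.
In the embodiments of this application, a test log contains multiple log data entries and therefore generally contains multiple positive samples and/or multiple negative samples. In this embodiment the statistical parameters of the test log, namely the positive-sample pass rate and/or the negative-sample false acceptance rate, are determined from the n positive samples and/or the m negative samples. One data field of a log data entry is the sample score field, such as "-0.398854" in the non-json log data entry above. The sample score is the score an image receives from liveness detection. Whether it can be negative depends on the model or algorithm used. For example, the score of the M1 model can be positive or negative. Generally speaking, for positive samples a higher score means a higher pass rate, and for negative samples a higher score means a higher false acceptance rate. The positive-sample pass rate is the ratio of the number of positive samples whose sample score is greater than the first liveness threshold to n, and the negative-sample false acceptance rate is the ratio of the number of negative samples whose sample score is greater than the second liveness threshold to m.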
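The two statistical parameters defined in step 1042 can be written compactly. The symbols below are notation introduced here for illustration and are not used in the original text: the first and second liveness thresholds are written T_1 and T_2, and the scores of the i-th positive sample and the j-th negative sample are written s_i^+ and s_j^-.

```latex
R_t = \frac{\left|\{\, i : s_i^{+} > T_1 \,\}\right|}{n},
\qquad
R_f = \frac{\left|\{\, j : s_j^{-} > T_2 \,\}\right|}{m}
```

Here R_t is the positive-sample pass rate over the n positive samples and R_f is the negative-sample false acceptance rate over the m negative samples. Both comparisons are strict, matching the wording "greater than" in the definition.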
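In step 1042, determining the negative-sample false acceptance rate among the statistical parameters includes:
Step A1: sort the n positive samples in reverse order by sample score, determine the sample score of the n_0-th positive sample in the sorted positive samples, and take the sample score of the n_0-th positive sample as the second liveness threshold, where n_0 = f(n × Rt_0), Rt_0 is a preset standard positive-sample pass rate, and f() denotes a rounding function.
Step A2: take the ratio of the number of negative samples whose sample score is greater than the second liveness threshold to the total number m of negative samples as the negative-sample false acceptance rate.
In the embodiments of this application, the standard positive-sample pass rate Rt_0 is set in advance, which determines the position n_0 that this standard pass rate corresponds to among the n positive samples. The function f() is a rounding function. It may be the function INT(), which returns the largest integer not greater than the target value, or a round-half-up function, a round-half-down function and so on, and this embodiment places no limitation on it. Here n_0 means that, among the n positive samples, n_0 positive samples satisfy the standard positive-sample pass rate Rt_0.
The n positive samples are sorted in reverse order by sample score, which identifies the n_0-th positive sample after sorting, and the sample score of that sample is taken as the second liveness threshold. For example, suppose there are 1000 positive samples, that is, n = 1000, sorted in reverse order by sample score, that is, in order of sample score from smallest to largest. If the preset standard positive-sample pass rate is 90%, then n_0 = 900, and the sample score of the 900th positive sample after sorting (say 0.6) is taken as the second liveness threshold. The negative-sample false acceptance rate corresponding to this standard positive-sample pass rate can then be determined as the ratio of the number of negative samples whose sample score is greater than the second liveness threshold (say 0.6) to the total number m of negative samples.
For different test logs evaluated at the same standard positive-sample pass rate, a higher negative-sample false acceptance rate indicates a worse liveness detection algorithm. A higher negative-sample false acceptance rate also indicates a smaller second liveness threshold and a correspondingly lower positive-sample pass rate, which likewise indicates a worse liveness detection algorithm. Using the standard positive-sample pass rate to determine the negative-sample false acceptance rate therefore allows the performance of the liveness detection algorithm to be evaluated more comprehensively against the dual criteria of positive-sample pass rate and negative-sample false acceptance rate.
Optionally, multiple standard positive-sample pass rates can be set, such as 90%, 95% and 98%. The corresponding second liveness thresholds, such as 0.6, 0.8 and 0.95, are then determined in turn, followed by the corresponding negative-sample false acceptance rates. In general, the higher the standard positive-sample pass rate, the lower the resulting negative-sample false acceptance rate.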
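Steps A1 and A2 can be sketched in Python as follows. The function names are hypothetical, f() is assumed to be floor by default, and the index n_0 is treated as 1-based and clamped to a valid position, which is an assumption. The sort order follows the worked example in the text (smallest to largest). The input lists are assumed to be non-empty.

```python
import math
from typing import Callable, Sequence


def second_liveness_threshold(
    positive_scores: Sequence[float],
    standard_pass_rate: float,
    rounding: Callable[[float], float] = math.floor,
) -> float:
    """Step A1: return the score of the n_0-th positive sample after sorting."""
    ordered = sorted(positive_scores)  # smallest to largest, as in the text's example
    n0 = int(rounding(len(ordered) * standard_pass_rate))  # n_0 = f(n x Rt_0)
    n0 = min(max(n0, 1), len(ordered))  # clamp to a valid 1-based position (assumption)
    return ordered[n0 - 1]


def false_acceptance_rate(negative_scores: Sequence[float], threshold: float) -> float:
    """Step A2: share of negative samples whose score exceeds the threshold."""
    above = sum(1 for score in negative_scores if score > threshold)
    return above / len(negative_scores)


# Toy data for illustration only; these are not measurements from the text.
positives = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
negatives = [5, 50, 95, 99]
threshold = second_liveness_threshold(positives, 0.9)
rate = false_acceptance_rate(negatives, threshold)
```

With these toy inputs n_0 = 9, so the threshold is the ninth score, 90, and two of the four negative samples exceed it. Passing a different `rounding` callable, such as `round`, swaps in another choice of f().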
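Similarly, determining the positive-sample pass rate among the statistical parameters includes:
Step B1: sort the m negative samples in reverse order by sample score, determine the sample score of the m_0-th negative sample in the sorted negative samples, and take the sample score of the m_0-th negative sample as the first liveness threshold, where m_0 = f(m × Rf_0), Rf_0 is a preset standard negative-sample false acceptance rate, and f() denotes a rounding function.
Step B2: take the ratio of the number of positive samples whose sample score is greater than the first liveness threshold to the total number n of positive samples as the positive-sample pass rate.
Likewise, in this embodiment the positive-sample pass rate is determined from the standard negative-sample false acceptance rate. As with steps A1-A2 above, this also allows the performance of the liveness detection algorithm to be evaluated more comprehensively against the dual criteria of positive-sample pass rate and negative-sample false acceptance rate.
Step 1043: generate the test result of the test log according to the statistical parameters.
In the embodiments of this application, once the statistical parameters are determined, the test result of the test log can be generated, that is, the quality of the liveness detection algorithm can be determined. In general, a higher positive-sample pass rate indicates a better liveness detection algorithm, and a lower negative-sample false acceptance rate indicates a better liveness detection algorithm.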
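On the basis of the above embodiment, this embodiment sets a valid data column count in order to remove useless or erroneous log data entries. Specifically, the part of step 102 "parsing the test log and determining a parsing result for each log data entry in the test log" includes:
Step C1: determine the valid data column count of the test log according to the log format of the test log, the valid data column count being the number of sub-data items that each log data entry in the test log should contain.
Step C2: parse the test log and determine the number of sub-data items contained in each parsed log data entry. When the number of sub-data items in a log data entry differs from the valid data column count, remove the log data entry. When the number of sub-data items in a log data entry equals the valid data column count, determine the parsing result of the log data entry.
In the embodiments of this application, each log data entry in the test log contains multiple sub-data items, and each sub-data item corresponds to one data field. Because the sub-data items of different data fields are stored in the database as columns, this embodiment sets a valid data column count for the test log to indicate how many sub-data items each log data entry in the test log contains. If the column count (field count) of a log data entry differs from the valid data column count, the entry may be faulty or may have been recognized incorrectly. The erroneous log data entry can then be recognized again or removed, so that it does not adversely affect the subsequent parsing results or make them inaccurate.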
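Steps C1 and C2 amount to a column-count check on each parsed entry. A minimal Python sketch, assuming comma-delimited entries; the function name `filter_by_column_count` and the sample lines are made up for illustration.

```python
def filter_by_column_count(
    lines: list[str], expected_columns: int, delimiter: str = ","
) -> tuple[list[list[str]], list[list[str]]]:
    """Split entries into those matching the valid data column count and the rest."""
    kept, rejected = [], []
    for line in lines:
        parts = [part.strip() for part in line.strip().split(delimiter)]
        if len(parts) == expected_columns:
            kept.append(parts)      # parsing result is accepted
        else:
            rejected.append(parts)  # candidate for re-recognition or removal
    return kept, rejected


sample_lines = [
    "M1,label,-0.39,a.jpg,/p/a.jpg",
    "broken,line",
    "M1,label,0.7,b.jpg,/p/b.jpg",
]
kept, rejected = filter_by_column_count(sample_lines, expected_columns=5)
```

Returning the rejected entries instead of silently dropping them leaves room for the "recognize again" option mentioned in the text.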
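On the basis of the above embodiment, when log data is stored in the database, all fields can be loaded, that is, the sub-data item of every field in the log data entry is stored in the database. However, some fields in the log data may have no effect on the test result, such as the image name field. In that case the data fields that are not needed for determining the test result of the test log can be removed. Specifically, in this embodiment the part of step 103 "storing the sub-data item at the corresponding position in the database" includes:
marking the data fields that are not needed for determining the test result of the test log as invalid data fields, the remaining data fields being valid data fields, and storing the sub-data items corresponding to the valid data fields at the corresponding positions in the database.
In the embodiments of this application, an "unneeded data field" refers to an entire field that is not needed, such as the "image name" field. An "unneeded data field" may also refer to part of the content under a data field. Specifically, in some cases one data field covers several kinds of data, that is, several kinds of sub-data items. When the test result is determined, some sub-data items under that field are useful and others are not, and the useless sub-data items can also be regarded as data to be removed. For example, a test log in json format has an error code field (error_code), and the statistics should count only the log data entries whose error_code is 0 and that have a sample score. Sub-data items whose error_code is not zero are useless data. Removing useless data allows the test result to be determined from fewer samples, which reduces the amount of processing.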
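Both kinds of removal described above, dropping whole invalid fields and dropping entries whose error_code is not 0 or that lack a score, can be sketched in Python. The set `INVALID_FIELDS`, the field name `image_name` and the helper `keep_valid` are hypothetical; only `error_code` and the score field are named in the text.

```python
from typing import Optional

# Hypothetical set of data fields marked as invalid (not needed for the statistics).
INVALID_FIELDS = {"image_name"}


def keep_valid(record: dict) -> Optional[dict]:
    """Return the record restricted to valid fields, or None if it is useless."""
    has_score = record.get("score") not in (None, "")
    if str(record.get("error_code")) != "0" or not has_score:
        return None  # error_code is not 0, or there is no sample score
    return {name: value for name, value in record.items() if name not in INVALID_FIELDS}


good = keep_valid({"error_code": "0", "score": "0.8", "image_name": "a.jpg"})
failed = keep_valid({"error_code": "5", "score": "0.8"})
no_score = keep_valid({"error_code": 0})
```

Comparing `str(error_code)` with "0" accepts both the string and the integer form, since json logs may encode the code either way.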
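The method for parsing liveness detection logs provided by the embodiments of this application handles test logs of different models and different formats by using a corresponding parsing method to determine the field of each sub-data item in each log data entry, and stores the log data in a unified database, so that the test results of the test logs of different models can be determined. The approach supports collection across many samples and many models. By configuring the mapping between data and fields, it parses logs quickly and quickly computes the test result, which greatly improves efficiency and saves time. Using the standard positive-sample pass rate to determine the negative-sample false acceptance rate allows the performance of the liveness detection algorithm to be evaluated more comprehensively against the dual criteria of positive-sample pass rate and negative-sample false acceptance rate.
The method flow for parsing liveness detection logs has been described in detail above. The method can also be implemented by a corresponding apparatus, whose structure and functions are described in detail below.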
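An apparatus for parsing liveness detection logs provided by an embodiment of this application, referring to Fig. 4, includes:
an acquisition module 41, configured to obtain a test log of liveness detection and determine the log format of the test log;
a parsing module 42, configured to determine a corresponding parsing method according to the log format of the test log, parse the test log by the parsing method, and determine a parsing result for each log data entry in the test log, the parsing result containing one or more sub-data items of the log data entry;
a processing module 43, configured to determine the data field corresponding to each sub-data item of the log data entry, and store the sub-data item at the corresponding position in a database according to the correspondence between the data field and the table fields of the database;
a statistics module 44, configured to compile statistics on the data stored in the database to generate a test result of the test log.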
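On the basis of the above embodiment, the processing module 43 determining the data field corresponding to each sub-data item of the log data entry includes:
determining standard log data, the standard log data being log data determined according to the log format of the test log, or log data selected from the test log that conforms to the log format;
determining the standard sub-data items contained in the standard log data and the ordinal position of each standard sub-data item in the standard log data, and setting a corresponding standard field for the standard sub-data item at each ordinal position;
determining the ordinal position of a sub-data item of the log data entry within the log data entry, and taking the standard field corresponding to that ordinal position as the data field of the sub-data item.
On the basis of the above embodiment, the statistics module 44 includes:
a sample determination unit, configured to determine the sample attribute of each log data entry in the test log, the sample attribute being either positive sample or negative sample;
a parameter determination unit, configured to determine the statistical parameters of the test log according to the n positive samples and/or m negative samples in the test log, the statistical parameters including a positive-sample pass rate and/or a negative-sample false acceptance rate; the positive-sample pass rate being the ratio of the number of positive samples whose sample score is greater than a first liveness threshold to n, and the negative-sample false acceptance rate being the ratio of the number of negative samples whose sample score is greater than a second liveness threshold to m; the sample score corresponding to one data field in the log data entry;
a statistics unit, configured to generate the test result of the test log according to the statistical parameters.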
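On the basis of the above embodiment, the parameter determination unit determining the statistical parameters of the test log includes:
sorting the n positive samples in reverse order by sample score, determining the sample score of the n_0-th positive sample in the sorted positive samples, and taking the sample score of the n_0-th positive sample as the second liveness threshold, where n_0 = f(n × Rt_0), Rt_0 is a preset standard positive-sample pass rate, and f() denotes a rounding function;
taking the ratio of the number of negative samples whose sample score is greater than the second liveness threshold to the total number m of negative samples as the negative-sample false acceptance rate.
On the basis of the above embodiment, the parameter determination unit determining the statistical parameters of the test log includes:
sorting the m negative samples in reverse order by sample score, determining the sample score of the m_0-th negative sample in the sorted negative samples, and taking the sample score of the m_0-th negative sample as the first liveness threshold, where m_0 = f(m × Rf_0), Rf_0 is a preset standard negative-sample false acceptance rate, and f() denotes a rounding function;
taking the ratio of the number of positive samples whose sample score is greater than the first liveness threshold to the total number n of positive samples as the positive-sample pass rate.
On the basis of the above embodiment, the parsing module 42 parsing the test log and determining a parsing result for each log data entry in the test log includes:
determining the valid data column count of the test log according to the log format of the test log, the valid data column count being the number of sub-data items that each log data entry in the test log should contain;
parsing the test log and determining the number of sub-data items contained in each parsed log data entry; when the number of sub-data items in a log data entry differs from the valid data column count, removing the log data entry; when the number of sub-data items in a log data entry equals the valid data column count, determining the parsing result of the log data entry.
On the basis of the above embodiment, the processing module 43 storing the sub-data item at the corresponding position in the database includes:
marking the data fields that are not needed for generating the test result of the test log as invalid data fields, the remaining data fields being valid data fields;
storing the sub-data items corresponding to the valid data fields at the corresponding positions in the database.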
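The apparatus for parsing liveness detection logs provided by the embodiments of this application handles test logs of different models and different formats by using a corresponding parsing method to determine the field of each sub-data item in each log data entry, and stores the log data in a unified database, so that the test results of the test logs of different models can be determined. The approach supports collection across many samples and many models. By configuring the mapping between data and fields, it parses logs quickly and quickly computes the test result, which greatly improves efficiency and saves time. Using the standard positive-sample pass rate to determine the negative-sample false acceptance rate allows the performance of the liveness detection algorithm to be evaluated more comprehensively against the dual criteria of positive-sample pass rate and negative-sample false acceptance rate.
An embodiment of this application further provides a computer storage medium storing computer-executable instructions, which contain a program for executing the above method for parsing liveness detection logs. The computer-executable instructions can execute the method in any of the above method embodiments.
The computer storage medium may be any available medium or data storage device that a computer can access, including but not limited to magnetic storage (such as floppy disk, hard disk, magnetic tape, magneto-optical disk (MO), etc.), optical storage (such as CD, DVD, BD, HVD, etc.), and semiconductor memory (such as ROM, EPROM, EEPROM, non-volatile memory (NANDFLASH), solid state drive (SSD)), etc.
Fig. 5 shows a structural block diagram of a computer device according to another embodiment of this application. The computer device 1100 may be a host server with computing capabilities, a personal computer (PC), or a portable computer or terminal. The specific embodiments of this application do not limit the specific implementation of the computer device.
The computer device 1100 includes at least one processor 1110, a communications interface 1120, a memory (memory array) 1130 and a bus 1140. The processor 1110, the communications interface 1120 and the memory 1130 communicate with one another through the bus 1140.
The communications interface 1120 is used to communicate with network elements, where the network elements include, for example, a virtual machine management center and shared storage.
The processor 1110 is used to execute programs. The processor 1110 may be a central processing unit (CPU), an application specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of this application.
The memory 1130 is used to store executable instructions. The memory 1130 may include high-speed RAM, and may also include non-volatile memory, for example at least one disk memory. The memory 1130 may also be a memory array. The memory 1130 may also be divided into blocks, and the blocks may be combined into a virtual volume according to certain rules. The instructions stored in the memory 1130 can be executed by the processor 1110, so that the processor 1110 can execute the method in any of the above method embodiments.
Obviously, those skilled in the art can make various changes and variations to this application without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of this application and their technical equivalents, this application is intended to include them as well.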

Claims (20)

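  1. A method for parsing liveness detection logs, comprising:
    obtaining a test log of liveness detection, and determining the log format of the test log;
    determining a corresponding parsing method according to the log format of the test log, parsing the test log by the parsing method, and determining a parsing result for each log data entry in the test log, the parsing result containing one or more sub-data items of the log data entry;
    determining the data field corresponding to each sub-data item of the log data entry, and storing the sub-data item at the corresponding position in a database according to the correspondence between the data field and the table fields of the database;
    compiling statistics on the data stored in the database to generate a test result of the test log.
  2. The method according to claim 1, wherein determining the data field corresponding to each sub-data item of the log data entry comprises:
    determining standard log data, the standard log data being log data determined according to the log format of the test log, or log data selected from the test log that conforms to the log format;
    determining the standard sub-data items contained in the standard log data and the ordinal position of each standard sub-data item in the standard log data, and setting a corresponding standard field for the standard sub-data item at each ordinal position;
    determining the ordinal position of a sub-data item of the log data entry within the log data entry, and taking the standard field corresponding to that ordinal position as the data field of the sub-data item.
  3. The method according to claim 1, wherein generating the test result of the test log comprises:
    determining the sample attribute of each log data entry in the test log, the sample attribute being either positive sample or negative sample;
    determining the statistical parameters of the test log according to the n positive samples and/or m negative samples in the test log, the statistical parameters including a positive-sample pass rate and/or a negative-sample false acceptance rate; the positive-sample pass rate being the ratio of the number of positive samples whose sample score is greater than a first liveness threshold to n, and the negative-sample false acceptance rate being the ratio of the number of negative samples whose sample score is greater than a second liveness threshold to m; the sample score corresponding to one data field in the log data entry;
    generating the test result of the test log according to the statistical parameters.
  4. The method according to claim 3, wherein determining the statistical parameters of the test log comprises:
    sorting the n positive samples in reverse order by sample score, determining the sample score of the n_0-th positive sample in the sorted positive samples, and taking the sample score of the n_0-th positive sample as the second liveness threshold, where n_0 = f(n × Rt_0), Rt_0 is a preset standard positive-sample pass rate, and f() denotes a rounding function;
    taking the ratio of the number of negative samples whose sample score is greater than the second liveness threshold to the total number m of negative samples as the negative-sample false acceptance rate.
  5. The method according to claim 3, wherein determining the statistical parameters of the test log comprises:
    sorting the m negative samples in reverse order by sample score, determining the sample score of the m_0-th negative sample in the sorted negative samples, and taking the sample score of the m_0-th negative sample as the first liveness threshold, where m_0 = f(m × Rf_0), Rf_0 is a preset standard negative-sample false acceptance rate, and f() denotes a rounding function;
    taking the ratio of the number of positive samples whose sample score is greater than the first liveness threshold to the total number n of positive samples as the positive-sample pass rate.
  6. The method according to any one of claims 1-5, wherein parsing the test log and determining a parsing result for each log data entry in the test log comprises:
    determining the valid data column count of the test log according to the log format of the test log, the valid data column count being the number of sub-data items that each log data entry in the test log should contain;
    parsing the test log and determining the number of sub-data items contained in each parsed log data entry; when the number of sub-data items in a log data entry differs from the valid data column count, removing the log data entry; when the number of sub-data items in a log data entry equals the valid data column count, determining the parsing result of the log data entry.
  7. The method according to any one of claims 1-5, wherein storing the sub-data item at the corresponding position in the database comprises:
    marking the data fields that are not needed for generating the test result of the test log as invalid data fields, the remaining data fields being valid data fields;
    storing the sub-data items corresponding to the valid data fields at the corresponding positions in the database.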
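  8. An apparatus for parsing liveness detection logs, comprising:
    an acquisition module, configured to obtain a test log of liveness detection and determine the log format of the test log;
    a parsing module, configured to determine a corresponding parsing method according to the log format of the test log, parse the test log by the parsing method, and determine a parsing result for each log data entry in the test log, the parsing result containing one or more sub-data items of the log data entry;
    a processing module, configured to determine the data field corresponding to each sub-data item of the log data entry, and store the sub-data item at the corresponding position in a database according to the correspondence between the data field and the table fields of the database;
    a statistics module, configured to compile statistics on the data stored in the database to generate a test result of the test log.
  9. The apparatus according to claim 8, wherein the processing module is further configured to:
    determine standard log data, the standard log data being log data determined according to the log format of the test log, or log data selected from the test log that conforms to the log format;
    determine the standard sub-data items contained in the standard log data and the ordinal position of each standard sub-data item in the standard log data, and set a corresponding standard field for the standard sub-data item at each ordinal position;
    determine the ordinal position of a sub-data item of the log data entry within the log data entry, and take the standard field corresponding to that ordinal position as the data field of the sub-data item.
  10. The apparatus according to claim 8, wherein the statistics module is further configured to:
    determine the sample attribute of each log data entry in the test log, the sample attribute being either positive sample or negative sample;
    determine the statistical parameters of the test log according to the n positive samples and/or m negative samples in the test log, the statistical parameters including a positive-sample pass rate and/or a negative-sample false acceptance rate; the positive-sample pass rate being the ratio of the number of positive samples whose sample score is greater than a first liveness threshold to n, and the negative-sample false acceptance rate being the ratio of the number of negative samples whose sample score is greater than a second liveness threshold to m; the sample score corresponding to one data field in the log data entry;
    generate the test result of the test log according to the statistical parameters.
  11. The apparatus according to claim 10, wherein the statistics module is further configured to:
    sort the n positive samples in reverse order by sample score, determine the sample score of the n_0-th positive sample in the sorted positive samples, and take the sample score of the n_0-th positive sample as the second liveness threshold, where n_0 = f(n × Rt_0), Rt_0 is a preset standard positive-sample pass rate, and f() denotes a rounding function;
    take the ratio of the number of negative samples whose sample score is greater than the second liveness threshold to the total number m of negative samples as the negative-sample false acceptance rate.
  12. The apparatus according to claim 10, wherein the statistics module is further configured to:
    sort the m negative samples in reverse order by sample score, determine the sample score of the m_0-th negative sample in the sorted negative samples, and take the sample score of the m_0-th negative sample as the first liveness threshold, where m_0 = f(m × Rf_0), Rf_0 is a preset standard negative-sample false acceptance rate, and f() denotes a rounding function;
    take the ratio of the number of positive samples whose sample score is greater than the first liveness threshold to the total number n of positive samples as the positive-sample pass rate.
  13. The apparatus according to any one of claims 8-12, wherein the parsing module is further configured to:
    determine the valid data column count of the test log according to the log format of the test log, the valid data column count being the number of sub-data items that each log data entry in the test log should contain;
    parse the test log and determine the number of sub-data items contained in each parsed log data entry; when the number of sub-data items in a log data entry differs from the valid data column count, remove the log data entry; when the number of sub-data items in a log data entry equals the valid data column count, determine the parsing result of the log data entry.
  14. The apparatus according to any one of claims 8-12, wherein the processing module is further configured to:
    mark the data fields that are not needed for generating the test result of the test log as invalid data fields, the remaining data fields being valid data fields;
    store the sub-data items corresponding to the valid data fields at the corresponding positions in the database.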
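  15. A computer-readable storage medium on which computer-readable instructions are stored, the computer-readable instructions, when executed by a processor, implementing the steps of a method for parsing liveness detection logs, comprising:
    obtaining a test log of liveness detection, and determining the log format of the test log;
    determining a corresponding parsing method according to the log format of the test log, parsing the test log by the parsing method, and determining a parsing result for each log data entry in the test log, the parsing result containing one or more sub-data items of the log data entry;
    determining the data field corresponding to each sub-data item of the log data entry, and storing the sub-data item at the corresponding position in a database according to the correspondence between the data field and the table fields of the database;
    compiling statistics on the data stored in the database to generate a test result of the test log.
  16. The computer-readable storage medium according to claim 15, wherein determining the data field corresponding to each sub-data item of the log data entry comprises:
    determining standard log data, the standard log data being log data determined according to the log format of the test log, or log data selected from the test log that conforms to the log format;
    determining the standard sub-data items contained in the standard log data and the ordinal position of each standard sub-data item in the standard log data, and setting a corresponding standard field for the standard sub-data item at each ordinal position;
    determining the ordinal position of a sub-data item of the log data entry within the log data entry, and taking the standard field corresponding to that ordinal position as the data field of the sub-data item.
  17. The computer-readable storage medium according to claim 15, wherein generating the test result of the test log comprises:
    determining the sample attribute of each log data entry in the test log, the sample attribute being either positive sample or negative sample;
    determining the statistical parameters of the test log according to the n positive samples and/or m negative samples in the test log, the statistical parameters including a positive-sample pass rate and/or a negative-sample false acceptance rate; the positive-sample pass rate being the ratio of the number of positive samples whose sample score is greater than a first liveness threshold to n, and the negative-sample false acceptance rate being the ratio of the number of negative samples whose sample score is greater than a second liveness threshold to m; the sample score corresponding to one data field in the log data entry;
    generating the test result of the test log according to the statistical parameters.
  18. A computer device, comprising a memory and a processor, the memory storing computer-readable instructions, wherein the processor, when executing the computer-readable instructions, implements the steps of the method according to any one of claims 1 to 7.
  19. The computer device according to claim 18, wherein determining the data field corresponding to each sub-data item of the log data entry comprises:
    determining standard log data, the standard log data being log data determined according to the log format of the test log, or log data selected from the test log that conforms to the log format;
    determining the standard sub-data items contained in the standard log data and the ordinal position of each standard sub-data item in the standard log data, and setting a corresponding standard field for the standard sub-data item at each ordinal position;
    determining the ordinal position of a sub-data item of the log data entry within the log data entry, and taking the standard field corresponding to that ordinal position as the data field of the sub-data item.
  20. The computer device according to claim 18, wherein generating the test result of the test log comprises:
    determining the sample attribute of each log data entry in the test log, the sample attribute being either positive sample or negative sample;
    determining the statistical parameters of the test log according to the n positive samples and/or m negative samples in the test log, the statistical parameters including a positive-sample pass rate and/or a negative-sample false acceptance rate; the positive-sample pass rate being the ratio of the number of positive samples whose sample score is greater than a first liveness threshold to n, and the negative-sample false acceptance rate being the ratio of the number of negative samples whose sample score is greater than a second liveness threshold to m; the sample score corresponding to one data field in the log data entry;
    generating the test result of the test log according to the statistical parameters.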
PCT/CN2019/103199 2019-04-19 2019-08-29 Method, apparatus, storage medium and computer device for parsing liveness detection logs WO2020211248A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910319427.0 2019-04-19
CN201910319427.0A CN110188073A (zh) 2019-04-19 2019-04-19 Method, apparatus, storage medium and computer device for parsing liveness detection logs

Publications (1)

Publication Number Publication Date
WO2020211248A1 true WO2020211248A1 (zh) 2020-10-22

Family

ID=67714897

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/103199 WO2020211248A1 (zh) 2019-04-19 2019-08-29 Method, apparatus, storage medium and computer device for parsing liveness detection logs

Country Status (2)

Country Link
CN (1) CN110188073A (zh)
WO (1) WO2020211248A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113810160A (zh) * 2021-09-17 2021-12-17 北京京航计算通讯研究所 Intelligent access system for multiple types of network devices

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559333B (zh) * 2020-12-10 2022-05-20 武汉联影医疗科技有限公司 Log production method, apparatus, computer device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120246179A1 (en) * 2011-03-23 2012-09-27 Bmc Software, Inc. Log-Based DDL Generation
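CN103593442A (zh) * 2013-11-15 2014-02-19 北京国双科技有限公司 Method and apparatus for deduplicating log data
CN108363791A (zh) * 2018-02-13 2018-08-03 沈阳东软医疗系统有限公司 Data synchronization method and apparatus for a database
CN109117440A (zh) * 2017-06-23 2019-01-01 中国移动通信集团公司 Metadata information acquisition method, system and computer-readable storage medium
CN109324996A (zh) * 2018-10-12 2019-02-12 平安科技(深圳)有限公司 Log file processing method, apparatus, computer device and storage medium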

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10366096B2 (en) * 2015-04-03 2019-07-30 Oracle International Corporation Method and system for implementing a log parser in a log analytics system
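CN107404658A (zh) * 2016-05-19 2017-11-28 中兴通讯股份有限公司 Interactive network television system and method for real-time acquisition of user data
CN106682097B (zh) * 2016-12-01 2020-06-05 北京奇虎科技有限公司 Method and apparatus for processing log data
CN109325009B (zh) * 2018-09-19 2021-11-30 亚信科技(成都)有限公司 Log parsing method and apparatus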

Also Published As

Publication number Publication date
CN110188073A (zh) 2019-08-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19925518

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19925518

Country of ref document: EP

Kind code of ref document: A1