WO2020232894A1 - Real-time data verification method, device, server and medium - Google Patents

Real-time data verification method, device, server and medium

Info

Publication number
WO2020232894A1
Authority
WO
WIPO (PCT)
Prior art keywords
real
data
time
word
preset
Prior art date
Application number
PCT/CN2019/103300
Other languages
French (fr)
Chinese (zh)
Inventor
王旭
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020232894A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0815 Network architectures or network communication protocols for network security for authentication of entities providing single-sign-on or federations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0861 Network architectures or network communication protocols for network security for authentication of entities using biometrical features, e.g. fingerprint, retina-scan
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/12 Applying verification of the received information

Definitions

  • This application belongs to the field of data processing technology, and in particular relates to a method, device, server and medium for verifying real-time data.
  • the embodiments of the present application provide a real-time data verification method and server to solve the problem of inaccurate and incomplete real-time data analysis.
  • The first aspect of the embodiments of the present application provides a method for verifying real-time data, including: after receiving user login information, determining user identity data corresponding to the login information, analyzing the key code contained in the login information, and determining multiple verification keywords corresponding to the user based on the key code; receiving real-time data of the user, and dividing the real-time data into a real-time video component and a real-time audio component, the real-time video component including face data; calculating the similarity between the face data contained in the real-time video component and the user identity data, and if the similarity is not less than a preset similarity threshold, calculating the number of verification keywords contained in the real-time audio component within a preset time period; and if the number of verification keywords contained in the real-time audio component is not greater than the first number threshold but greater than the second number threshold, selecting one of multiple asynchronous processing servers as the selected server, and sending the real-time data to the selected server to perform non-real-time verification on the real-time data through the selected server.
  • In the embodiments of the present application, user identity data and verification keywords are obtained through the login information, real-time data of the user is received and divided into a real-time video component and a real-time audio component, and the similarity between the face data contained in the real-time video component and the user identity data is calculated to verify the identity of the user. If the similarity is not less than the preset similarity threshold, the number of verification keywords contained in the real-time audio component within the preset time period is calculated to verify whether the real-time data is legal. If that number is greater than the first number threshold, the real-time data is determined to have passed the verification. In this way the real-time data is analyzed intelligently and the security of remote business processing is improved.
  • FIG. 1 is an implementation flowchart of a method for verifying real-time data provided by an embodiment of the present application.
  • FIG. 2 is a specific implementation flowchart of step S101 of the method for verifying real-time data provided by an embodiment of the present application.
  • FIG. 3 is a specific implementation flowchart of step S103 of the method for verifying real-time data provided by an embodiment of the present application.
  • FIG. 4 is a specific implementation flowchart of step S107 of the method for verifying real-time data provided by an embodiment of the present application.
  • FIG. 5 is a structural block diagram of a real-time data verification device provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a server provided by an embodiment of the present application.
  • Fig. 1 shows an implementation process of a method for verifying real-time data provided by an embodiment of the present application.
  • the process of the method includes steps S101 to S108.
  • the specific implementation principle of each step is as follows.
  • The server will perform artificial intelligence analysis on the collected video of the user and the business personnel (i.e., the real-time data) to determine whether there are hidden safety hazards in the entire business.
  • The embodiments of this application mainly verify the real-time data along two dimensions: automatically verifying whether the user in the real-time data matches the user identity data corresponding to the login information, and automatically verifying whether the real-time data contains sufficient verification keywords, so as to ensure the safety of business processing.
  • the server will verify the login name and password contained in the login information. Once the verification is passed, the user identity data will be determined according to the login name, where the user identity data includes the user's facial photo.
  • The key codes contained in the login information are not the same, so the key codes can be used to distinguish different businesses. Conversely, when different types of users log in on the same service interface, the key codes contained in the login information may also be different. For example, users can be divided into minor users, adult users and elderly users; because their discrimination capabilities differ, the key codes contained in the login information differ even after logging in to the same business.
  • One of the verification dimensions for the subsequent real-time data in the embodiment of this application is to automatically verify whether the real-time data contains sufficient verification keywords. Therefore, it is first necessary to determine, based on the key code, which verification keywords the real-time data needs to contain. Understandably, in online business processing involving elderly users, for example, business personnel need to inform the elderly users of business risks, and the elderly users must confirm these risks. Because the real-time data is a recording of the entire communication process, qualified real-time data must contain a certain number of verification keywords.
  • a plurality of keywords corresponding to the key code contained in the login information can be determined through the correspondence between the preset key code and the verification keyword.
  • the verification keyword corresponding to the key code can be determined by predicting the occurrence probability of multiple words corresponding to a key code in the future.
  • the above S101 includes:
  • S1011 Retrieve a word set corresponding to the key code received in a preset period, and each word in the word set corresponds to a receiving moment.
  • The word set corresponding to a key code is generated as follows: business personnel extract the real-time data recorded during business transactions with users within a preset period (for example, within one month) as specimen data, determine the key code corresponding to the specimen data, and manually screen out from the specimen data the words that can be used to ensure the security of business processing, thereby determining multiple words corresponding to a key code.
  • Each word extracted from a piece of specimen data corresponds to a receiving moment.
  • By extracting multiple pieces of specimen data corresponding to a key code, the business personnel can obtain more words corresponding to the key code, thereby generating the word set corresponding to the key code. It is worth noting that the embodiments of the present application do not merge words, so the word set will contain a large number of repeated words, each corresponding to a receiving moment.
  • The receiving time period includes multiple receiving moments, and the number of occurrences of a word is the number of times that word appears in the word set within one receiving time period.
  • Because a word set contains many repeated words and each word corresponds to a receiving moment, it is possible to count the number of occurrences of a given word in the word set during a receiving time period.
  • the corresponding relationship between the reception time period and the word appearance times of each word will be generated.
  • For example, during the period from January 1 to January 5 (one receiving time period), the word "losing money" occurs 10 times, the word "risk" 8 times, the word "not recommended" 9 times, the word "clear" 20 times, and so on.
  • Through the regression model, a regression equation that characterizes the correspondence between the number of occurrences of a word and the receiving time period is fitted for each word in the word set, where num represents the number of occurrences of the word, time represents the serial number of the receiving time period, pre1 and pre2 are the two coefficients of the nonlinear regression equation, and e is the natural constant.
  • the coefficients of the regression model can be obtained by the existing nonlinear regression equation solving method, so it is not detailed here.
  • The serial number of a receiving time period indicates its position in chronological order. For example, if there are 5 receiving time periods in total, namely January 1 to January 5, January 6 to January 10, January 11 to January 15, January 16 to January 20, and January 21 to January 25, then the serial number of the receiving time period from January 1 to January 5 is 1, the serial number of the receiving time period from January 6 to January 10 is 2, and so on.
  • each word corresponds to a regression equation.
  • S1014 Based on the regression equation corresponding to each word in the word set, calculate the number of occurrences of each word within a preset number of receiving time periods after the current moment as the predicted number of occurrences corresponding to each word.
  • For example, the regression equation corresponding to each word is evaluated with the independent variable set to 11, 12, 13, 14 and 15, and the sum of the corresponding dependent variables (i.e., the sum of the numbers of occurrences) is used as the predicted number of occurrences corresponding to that word.
  • S1015 Select words whose predicted number of occurrences is not less than a preset number threshold as verification keywords corresponding to the user.
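The keyword prediction described in S1011 to S1015 can be sketched as follows. This is only an illustrative sketch: the source names the coefficients pre1 and pre2 and the constant e but does not reproduce the regression equation, so the exponential form `num = pre1 * e**(pre2 * time)` used here is an assumption, as are the function names, the `(word, day)` encoding of receiving moments, and the five-day period length.

```python
import math
from collections import Counter


def count_by_period(word_set, period_len_days=5):
    """Count occurrences of each word per receiving time period.

    word_set: list of (word, day) pairs; `day` is a 1-based day index
    (a hypothetical encoding of the receiving moment).
    Returns {word: Counter({period_number: count})}, periods numbered from 1.
    """
    counts = {}
    for word, day in word_set:
        period = (day - 1) // period_len_days + 1
        counts.setdefault(word, Counter())[period] += 1
    return counts


def fit_exponential(points):
    """Least-squares fit of log(num) = log(pre1) + pre2 * time.

    ASSUMPTION: the exponential model is illustrative only; the source
    does not give the actual regression equation.
    points: list of (time, num) pairs with num > 0.
    """
    xs = [t for t, n in points if n > 0]
    ys = [math.log(n) for t, n in points if n > 0]
    m = len(xs)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    pre2 = sxy / sxx if sxx else 0.0  # a single period gives a flat curve
    pre1 = math.exp(mean_y - pre2 * mean_x)
    return pre1, pre2


def predicted_occurrences(pre1, pre2, future_periods):
    """Sum the fitted curve over the future period serial numbers."""
    return sum(pre1 * math.exp(pre2 * t) for t in future_periods)


def select_keywords(word_set, future_periods, threshold):
    """Keep words whose predicted occurrence count is >= threshold."""
    selected = []
    for word, per_period in count_by_period(word_set).items():
        pre1, pre2 = fit_exponential(sorted(per_period.items()))
        if predicted_occurrences(pre1, pre2, future_periods) >= threshold:
            selected.append(word)
    return selected


# Hypothetical word set: "risk" appears twice in each of two periods.
sample = [("risk", 1), ("risk", 2), ("risk", 6), ("risk", 7), ("clear", 3)]
keywords = select_keywords(sample, future_periods=[3, 4], threshold=3)
```

Fitting on the logarithm keeps the sketch dependency-free; a nonlinear solver could be substituted without changing the selection step.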
  • real-time data of the user is received, and the real-time data is divided into a real-time video component and a real-time audio component, and the real-time video component includes face data.
  • the real-time data of the user in the embodiment of the present application is the video collected when the user communicates with the business personnel.
  • The video can be divided into a real-time video component and a real-time audio component: the real-time video component contains only image information and no voice information, while the real-time audio component contains only voice information and no image information.
  • Because the real-time data is the user-side video collected during the communication between the user and the business personnel, the real-time video component contains the user's face data, and the real-time audio component contains the voice data of both the user and the business personnel.
  • the foregoing S103 includes:
  • S1031 Intercept a frame of image from the real-time video component as a target image.
  • the real-time video component is actually composed of multiple frames of images, and the embodiment of the present application selects one frame of images as the target image for analysis.
  • S1032 Divide the target image into multiple image regions, read pixel point data of each pixel in the image region, and number the image regions according to a preset sequence.
  • The target image can be divided into 100 equal parts along the horizontal direction and 100 equal parts along the vertical direction. The target image is then divided into 10,000 image regions, and each image region contains multiple pixels.
  • the pixel point data may be the RGB value of one pixel point, that is, the RGB value of one pixel point can be represented by one pixel point data.
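The region division and numbering of S1032 can be sketched as below. The row-major numbering starting from 1 and the function name `split_into_regions` are assumptions made for illustration; the source only says that regions are numbered "according to a preset sequence".

```python
def split_into_regions(image, rows, cols):
    """Split a 2-D grid of pixel data into rows x cols numbered regions.

    image: list of pixel rows, each pixel being e.g. an (R, G, B) tuple.
    Returns {region_number: [pixel, ...]}, numbered row-major from 1
    (an assumed ordering).
    """
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the grid size")
    region_h, region_w = height // rows, width // cols
    regions = {}
    for r in range(rows):
        for c in range(cols):
            number = r * cols + c + 1
            regions[number] = [
                image[y][x]
                for y in range(r * region_h, (r + 1) * region_h)
                for x in range(c * region_w, (c + 1) * region_w)
            ]
    return regions


# A tiny 4x4 "image" whose pixels record their own coordinates.
tiny_image = [[(y, x, 0) for x in range(4)] for y in range(4)]
tiny_regions = split_into_regions(tiny_image, rows=2, cols=2)
```

With `rows=100` and `cols=100` the same function would yield the 10,000 regions mentioned in the text.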
  • S1033 Simultaneously input the pixel point data of a preset number of image areas with adjacent numbers into a preset VGG neural network model, and output a segmentation coefficient corresponding to the image area with the middle number, where the preset number is an odd number greater than 1.
  • In the embodiment of the present application, the pixel data of each image area is not input into the preset VGG neural network model individually. Because a face covers the image areas continuously, computing each image area in isolation may cause accidental errors. Therefore, the pixel data of a preset number of image areas with adjacent numbers is input into the neural network model simultaneously (for example, the pixel data of three consecutive image areas is input at the same time), and only the segmentation coefficient corresponding to the image area with the middle number is output. The segmentation coefficient is used to distinguish image areas covered by the face image from image areas covered by non-face content.
  • A preset number of image areas with adjacent numbers are used as a region group, and after the pixel data of each pixel in each image area of the group is organized and combined into a vector, the feature vector of the region group is generated.
  • When the neural network model is trained, the input parameters are set to the pixel data of the preset number of image areas with adjacent numbers, and the output parameter is set to the segmentation coefficient corresponding to the image area with the middle number, so the neural network model trained in this way naturally outputs the segmentation coefficient corresponding to the image area with the middle number, as described above.
  • The training process of the VGG neural network includes: obtaining the training feature vectors of multiple training region groups and the training segmentation coefficients of the image areas at the center of the training region groups; and repeating the following steps until the cross-entropy loss function value of the adjusted VGG neural network is less than a preset threshold: using the training feature vector as the input of the VGG neural network and the training segmentation coefficient as its output, updating the parameters of each layer in the fully connected layers of the VGG neural network by stochastic gradient descent, and calculating the cross-entropy loss function value of the adjusted VGG neural network.
  • The VGG neural network whose cross-entropy loss function value is less than the preset threshold is output as the preset neural network model.
  • After the pixel data of the image regions numbered 1-3 is input into the neural network model, the segmentation coefficient of the image region numbered 2 is output; after the pixel data of the image regions numbered 2-4 is input into the neural network model, the segmentation coefficient of the image region numbered 3 is output; and so on, until the segmentation coefficient of the image region numbered 99 is output.
  • The segmentation coefficients of the image regions numbered 1 and 100 cannot be obtained through the above method. However, the ultimate purpose of obtaining the segmentation coefficients is to segment the face image from the target image, and the probability that face pixels cover the edges of the target image is extremely small. Therefore, although the embodiment of the present application cannot calculate the segmentation coefficients of the image regions with the largest and smallest numbers, this does not affect the segmentation of the face image in practical applications.
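The sliding-window evaluation described for S1033 can be illustrated with a short sketch. The `model` argument is a stand-in callable, not a VGG network, and the names `segmentation_coefficients` and `region_features` are hypothetical; the point is only to show how a window of adjacent regions yields one coefficient for the middle region and why the first and last regions receive none.

```python
def segmentation_coefficients(region_features, model, window=3):
    """Slide a window of adjacent regions and score the middle one.

    region_features: {region_number: [feature values]}.
    model: any callable mapping a flat feature vector to a coefficient
           (a placeholder for the trained network).
    window: odd number greater than 1.
    Returns {region_number: coefficient} for regions that have a full
    window around them; edge regions are skipped.
    """
    if window <= 1 or window % 2 == 0:
        raise ValueError("window must be an odd number greater than 1")
    half = window // 2
    numbers = sorted(region_features)
    coefficients = {}
    for idx in range(half, len(numbers) - half):
        group = numbers[idx - half: idx + half + 1]
        # Concatenate the features of the region group into one vector.
        feature_vector = [v for n in group for v in region_features[n]]
        coefficients[numbers[idx]] = model(feature_vector)
    return coefficients


# 100 regions with a dummy one-value feature and a dummy averaging model.
demo_features = {n: [float(n)] for n in range(1, 101)}
demo_coeffs = segmentation_coefficients(
    demo_features, model=lambda v: sum(v) / len(v), window=3
)
```

With 100 regions and a window of 3, coefficients are produced for regions 2 through 99, matching the behaviour described above.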
  • S1034 Use pixels corresponding to the segmentation coefficients that are less than a preset coefficient threshold as face pixels, and generate face data according to pixel point data of all the face pixels.
  • S1035 Calculate the similarity between the matrix corresponding to the face data and the matrix corresponding to the user identity data by a distance formula, as the similarity between the face data included in the video component and the user identity data.
  • In the distance formula, S is the similarity, X_i is the i-th element of the matrix corresponding to the face data, Y_i is the i-th element of the matrix corresponding to the user identity data, and K is the number of elements included in the matrix corresponding to the face data and in the matrix corresponding to the user identity data.
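A distance-based similarity over two equally sized matrices might look like the sketch below. The source does not reproduce the distance formula itself, so the specific form used here, `S = 1 / (1 + sqrt(sum((X_i - Y_i)**2) / K))`, is purely an assumption chosen so that identical matrices score 1 and larger distances score lower; the function name is hypothetical too.

```python
import math


def similarity(face_matrix, identity_matrix):
    """Similarity between two equally sized matrices.

    ASSUMED formula (not taken from the source):
        S = 1 / (1 + sqrt(sum((X_i - Y_i)^2) / K))
    where K is the number of matrix elements.
    """
    x = [v for row in face_matrix for v in row]
    y = [v for row in identity_matrix for v in row]
    if not x or len(x) != len(y):
        raise ValueError("matrices must be non-empty and the same size")
    k = len(x)
    rms = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / k)
    return 1.0 / (1.0 + rms)


same = similarity([[1, 2], [3, 4]], [[1, 2], [3, 4]])
different = similarity([[0, 0]], [[3, 3]])
```

Any monotonically decreasing function of the distance would serve the same thresholding purpose; only the threshold value would need to change.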
  • the real-time audio component in the preset time period is converted into text data through a speech-to-text conversion algorithm; the text data is word segmented to generate a speech word set, and the speech word set contains multiple Words; calculating the number of the verification keywords included in the voice word set as the number of the verification keywords included in the real-time audio component.
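Once the audio has been converted to text, the counting step can be sketched as follows. Speech-to-text is out of scope here, the regular-expression tokenizer is a simplistic stand-in for a real word segmenter (Chinese text in particular would need a dedicated one), and counting every occurrence rather than distinct keywords is an assumption, since the source does not say which is meant.

```python
import re


def count_verification_keywords(text, keywords):
    """Count how many tokens of `text` are verification keywords.

    Tokenization is a naive regex split (an assumption); each
    occurrence of a keyword is counted.
    """
    words = re.findall(r"[\w']+", text.lower())
    wanted = {k.lower() for k in keywords}
    return sum(1 for w in words if w in wanted)


transcript = "There is a risk here. Is the risk clear? Yes, clear."
keyword_count = count_verification_keywords(transcript, {"risk", "clear", "loss"})
```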
  • In the embodiment of the present application, when the number of verification keywords contained in the real-time audio component is not sufficient, it cannot be directly determined whether the real-time data has passed the verification, and other asynchronous processing servers are needed for further automatic or manual analysis. Since the embodiment of this application describes the real-time data verification method only from the side of the main server, only how the main server selects an asynchronous processing server as the selected server and sends the real-time data to it is described here; how the selected server performs non-real-time verification of the real-time data is not covered.
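The threshold logic on the main server can be summarized in a small decision function. The pass and asynchronous-review branches follow the text; the two `REJECTED` branches (similarity below the threshold, or keyword count not above the second threshold) are assumptions, because the source does not state what happens in those cases. All names are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    PASSED = "passed"
    ASYNC_REVIEW = "async_review"
    REJECTED = "rejected"


def verify(similarity, keyword_count, sim_threshold,
           first_threshold, second_threshold):
    """Decide the outcome for one piece of real-time data."""
    if similarity < sim_threshold:
        return Outcome.REJECTED  # ASSUMPTION: not specified in the source
    if keyword_count > first_threshold:
        return Outcome.PASSED
    if keyword_count > second_threshold:
        # Not enough keywords to pass directly: hand over to an
        # asynchronous processing server for non-real-time verification.
        return Outcome.ASYNC_REVIEW
    return Outcome.REJECTED  # ASSUMPTION: not specified in the source


decision = verify(similarity=0.9, keyword_count=6, sim_threshold=0.8,
                  first_threshold=10, second_threshold=5)
```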
  • the foregoing S107 includes:
  • S1071 Obtain the number of threads contained in each of the multiple asynchronous processing servers, and count the number of abnormal tasks received from each asynchronous processing server within a unit time period before the current moment.
  • S1072 Calculate sending parameters corresponding to each asynchronous processing server through a segmented formula.
  • In the segmented formula, K(i) represents the sending parameter corresponding to asynchronous processing server i, Z(i) represents the number of threads corresponding to asynchronous processing server i, and D(i) represents the number of abnormal tasks corresponding to asynchronous processing server i.
  • S1073 Calculate the sending ratio corresponding to each asynchronous processing server through a ratio calculation formula.
  • In the ratio calculation formula, Par_i represents the sending ratio corresponding to asynchronous processing server i, K(i) represents the sending parameter corresponding to asynchronous processing server i, and n is the number of asynchronous processing servers.
  • S1074 Select one of the asynchronous processing servers corresponding to the highest sending ratio as the selected server.
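The selection flow of S1071 to S1074 can be sketched as below. Neither the segmented formula for K(i) nor the ratio formula for Par_i is reproduced in the source, so both are placeholders here: `sending_parameter` is a hypothetical piecewise rule that favours servers with more threads and fewer abnormal tasks, and the ratio is assumed to be K(i) divided by the sum of all K values. Only the final step, choosing the server with the highest ratio, comes directly from the text.

```python
def sending_parameter(threads, abnormal_tasks):
    """HYPOTHETICAL piecewise rule for K(i); not the source's formula."""
    if abnormal_tasks == 0:
        return float(threads)
    return threads / (1 + abnormal_tasks)


def select_server(servers, parameter_fn=sending_parameter):
    """Pick the asynchronous processing server with the highest ratio.

    servers: {name: (thread_count Z, abnormal_task_count D)}.
    Returns (selected_name, {name: sending_ratio}).
    ASSUMPTION: ratio_i = K(i) / sum of K over all n servers.
    """
    params = {name: parameter_fn(z, d) for name, (z, d) in servers.items()}
    total = sum(params.values())
    if total <= 0:
        raise ValueError("no server has a positive sending parameter")
    ratios = {name: k / total for name, k in params.items()}
    selected = max(ratios, key=ratios.get)
    return selected, ratios


pool = {"a": (8, 0), "b": (16, 3), "c": (4, 1)}
chosen, ratios = select_server(pool)
```

Passing `parameter_fn` as an argument keeps the sketch usable once the actual segmented formula is known.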
  • In the embodiment of the present application, user identity data and verification keywords are obtained through the login information, real-time data of the user is received and divided into a real-time video component and a real-time audio component, and the similarity between the face data contained in the real-time video component and the user identity data is calculated. If the similarity is not less than the preset similarity threshold, the number of verification keywords contained in the real-time audio component within the preset time period is calculated to verify whether the real-time data is legal. If that number is greater than the first number threshold, the real-time data is determined to have passed the verification, so that the real-time data is analyzed intelligently and the security of remote business processing is improved.
  • FIG. 5 shows a structural block diagram of the real-time data verification device provided in the embodiment of the present application. For ease of description, only the parts related to the embodiment of the present application are shown.
  • the device includes:
  • the parsing module 501 is configured to, after receiving the user's login information, determine the user identity data corresponding to the login information, analyze the key code contained in the login information, and determine multiple verification keywords corresponding to the user based on the key code;
  • the decomposition module 502 is configured to receive real-time data of the user, and divide the real-time data into a real-time video component and a real-time audio component, and the real-time video component includes face data;
  • the calculation module 503 is configured to calculate the similarity between the face data contained in the real-time video component and the user identity data, and if the similarity between the face data and the user identity data is not less than a preset similarity threshold, calculate the number of verification keywords included in the real-time audio component within a preset time period;
  • the first execution module 504 is configured to select one of a plurality of asynchronous processing servers as the selected server if the number of the verification keywords contained in the real-time audio component is not greater than the first number threshold but greater than the second number threshold, and send the real-time data to the selected server to perform non-real-time verification on the real-time data through the selected server;
  • the second execution module 505 is configured to determine that the real-time data passes the verification if the number of the verification keywords contained in the real-time audio component is greater than the first number threshold.
  • the parsing module is specifically used for:
  • the corresponding relationship between the receiving time period and the number of occurrences of each word is established, where the receiving time period includes multiple receiving moments, and the number of occurrences of a word is the number of times that word appears in the word set within one receiving time period;
  • the words whose predicted occurrence times are not less than a preset threshold are selected as the verification keywords corresponding to the user.
  • calculation module is specifically used for:
  • the calculating of the number of verification keywords contained in the real-time audio component within a preset time period includes: converting the real-time audio component within the preset time period into text data through a speech-to-text conversion algorithm; performing word segmentation on the text data to generate a voice word set, the voice word set containing multiple words; and calculating the number of the verification keywords included in the voice word set as the number of verification keywords contained in the real-time audio component.
  • the selecting one of multiple asynchronous processing servers as the selected server includes:
  • Fig. 6 is a schematic diagram of a server provided by an embodiment of the present application.
  • The server 6 of this embodiment includes: a processor 60, a memory 61, and computer-readable instructions 62 stored in the memory 61 and executable on the processor 60, such as a real-time data verification program.
  • When the processor 60 executes the computer-readable instructions 62, the steps in the foregoing embodiments of the verification method for real-time data are implemented, such as steps S101 to S108 shown in FIG. 1.
  • Alternatively, when the processor 60 executes the computer-readable instructions 62, the functions of the modules/units in the foregoing device embodiments, such as the functions of the modules 501 to 505 shown in FIG. 5, are implemented.
  • The computer-readable instructions 62 may be divided into one or more modules/units, and the one or more modules/units are stored in the memory 61 and executed by the processor 60 to complete this application.
  • The one or more modules/units may be a series of computer-readable instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer-readable instructions 62 in the server 6.
  • the server 6 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the server may include, but is not limited to, a processor 60 and a memory 61.
  • FIG. 6 is only an example of the server 6 and does not constitute a limitation on the server 6. The server may include more or fewer components than shown, a combination of certain components, or different components; for example, the server may also include input and output devices, network access devices, buses, and the like.
  • The processor 60 may be a central processing unit (Central Processing Unit, CPU), another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • The memory 61 may be an internal storage unit of the server 6, such as a hard disk or a memory of the server 6.
  • The memory 61 may also be an external storage device of the server 6, for example, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the server 6. Further, the memory 61 may include both an internal storage unit of the server 6 and an external storage device.
  • the memory 61 is used to store the computer readable instructions and other programs and data required by the server.
  • the memory 61 can also be used to temporarily store data that has been output or will be output.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the above-mentioned method embodiments of this application can also be completed by instructing relevant hardware through computer-readable instructions.
  • The computer-readable instructions can be stored in a computer-readable storage medium.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Collating Specific Patterns (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application is applicable to the technical field of data processing and provides a real-time data verification method and device, a server and a medium. The method comprises: obtaining user identity data and verification keywords through login information; receiving real-time data of the user; dividing the real-time data into a real-time video component and a real-time audio component; and computing the similarity between the face data contained in the real-time video component and the user identity data to verify the user's identity. If the similarity is not less than a preset similarity threshold, the number of verification keywords contained in the real-time audio component within a preset time period is calculated to verify whether the real-time data is legitimate; if the number of verification keywords contained in the real-time audio component is greater than a first quantity threshold, it is determined that the real-time data has passed verification. The real-time data is thereby analyzed intelligently, which improves the security of remote service handling.

Description

Real-time data verification method, device, server and medium
This application claims priority to Chinese patent application No. 201910424123.0, entitled "Real-time data verification method, device, server and medium" and filed on May 21, 2019, the entire content of which is incorporated into this application by reference.
Technical field
This application belongs to the field of data processing technology, and in particular relates to a real-time data verification method, device, server and medium.
Background
With the development of Internet technology, users can handle various kinds of business remotely online. However, handling business online also carries certain security risks. For example, criminals may impersonate others to handle business and obtain illegal benefits, and elderly users may sign up for unnecessary services because of the negligence or deliberate concealment of business personnel, incurring unnecessary losses. These security risks are largely caused by inaccurate and incomplete analysis of the user's video data recorded in real time.
In summary, the analysis of real-time data is currently inaccurate and incomplete, which is a serious technical problem.
Technical problem
In view of this, the embodiments of the present application provide a real-time data verification method and server to solve the problem of inaccurate and incomplete analysis of real-time data.
Technical solution
A first aspect of the embodiments of the present application provides a method for verifying real-time data, including: after receiving login information of a user, determining user identity data corresponding to the login information, parsing a key code contained in the login information, and determining multiple verification keywords corresponding to the user based on the key code; receiving real-time data of the user and dividing the real-time data into a real-time video component and a real-time audio component, where the real-time video component contains face data; calculating the similarity between the face data contained in the real-time video component and the user identity data, and if the similarity is not less than a preset similarity threshold, calculating the number of verification keywords contained in the real-time audio component within a preset time period; if the number of verification keywords contained in the real-time audio component is not greater than a first number threshold but greater than a second number threshold, selecting one of multiple asynchronous processing servers as the selected server and sending the real-time data to the selected server, so that the selected server performs non-real-time verification on the real-time data; and if the number of verification keywords contained in the real-time audio component is greater than the first number threshold, determining that the real-time data passes verification.
Beneficial effects
In the embodiments of the present application, user identity data and verification keywords are obtained from the login information; the user's real-time data is received and divided into a real-time video component and a real-time audio component; the similarity between the face data contained in the real-time video component and the user identity data is calculated to verify the user's identity; if the similarity is not less than a preset similarity threshold, the number of verification keywords contained in the real-time audio component within a preset time period is calculated to verify whether the real-time data is legitimate; and if that number is greater than the first number threshold, the real-time data is determined to have passed verification. The real-time data is thereby analyzed intelligently, which improves the security of remote business handling.
Description of the drawings
FIG. 1 is an implementation flowchart of the real-time data verification method provided by an embodiment of the present application;
FIG. 2 is a specific implementation flowchart of step S101 of the real-time data verification method provided by an embodiment of the present application;
FIG. 3 is a specific implementation flowchart of step S103 of the real-time data verification method provided by an embodiment of the present application;
FIG. 4 is a specific implementation flowchart of step S107 of the real-time data verification method provided by an embodiment of the present application;
FIG. 5 is a structural block diagram of the real-time data verification device provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of the server provided by an embodiment of the present application.
Embodiments of the invention
FIG. 1 shows the implementation flow of the real-time data verification method provided by an embodiment of the present application. The method flow includes steps S101 to S108. The specific implementation principle of each step is as follows.
In S101, after receiving the login information of the user, the user identity data corresponding to the login information is determined, the key code contained in the login information is parsed, and multiple verification keywords corresponding to the user are determined based on the key code.
In the embodiments of the present application, when a user handles online business remotely, the user communicates with business personnel by video after logging in. After verifying the user's login information, the server performs artificial intelligence analysis on the captured video of the user and the business personnel (that is, the real-time data) to judge whether the handling of the business involves any security risk. In general, the embodiments of the present application verify the real-time data along two dimensions, automatically checking whether the user in the real-time data matches the user data corresponding to the login information and automatically checking whether the real-time data contains enough verification keywords, thereby ensuring the security of business handling.
Understandably, after receiving the login information, the server verifies the login name and password contained in it. Once verification passes, the user identity data is determined from the login name, where the user identity data includes a facial photo of the user. On the other hand, when the same user logs in from the interfaces of different businesses, or selects different business functions when logging in, the key codes contained in the login information differ, so the key code can be used to distinguish different businesses. Conversely, when different types of users log in from the interface of the same business, the key codes contained in the login information may also differ. For example, users can be divided into minor users, adult users, and elderly users; because their powers of judgment differ, the key codes contained in the login information differ even when they log in to handle the same business.
As mentioned above, one of the verification dimensions applied to the subsequent real-time data is to automatically check whether the real-time data contains enough verification keywords, so it is first necessary to determine, based on the key code, which verification keywords the real-time data needs to contain. Understandably, during online business handling, for example with elderly users, business personnel need to inform the user of the risks of the business, and the user must confirm these risks. Because the real-time data is a recording of the entire exchange, qualified real-time data should contain a certain number of verification keywords.
Whether a piece of real-time data contains a certain number of verification keywords therefore indicates whether the user has been given the corresponding risk warnings by the business personnel and whether the user has given affirmative answers.
Optionally, the multiple keywords corresponding to the key code contained in the login information can be determined through a preset correspondence between key codes and verification keywords.
Optionally, the verification keywords corresponding to a key code can be determined by predicting the future occurrence probability of the multiple words corresponding to that key code. As an embodiment of the present application, as shown in FIG. 2, the above S101 includes:
S1011: Retrieve the word set corresponding to the key code received within a preset period, where each word in the word set corresponds to a receiving moment.
In the embodiments of the present application, the word set corresponding to a key code is generated as follows: business personnel extract the real-time data recorded while handling business with users within a preset period (for example, within one month) as specimen data, determine the key code corresponding to the specimen data, and manually screen out from the specimen data the words that can be used to ensure the security of business handling, thereby determining the multiple words corresponding to a key code. In addition, since each piece of specimen data has a receiving time, the words extracted from a piece of specimen data all correspond to that receiving time.
Business personnel can extract multiple pieces of specimen data corresponding to a key code to obtain more words corresponding to that key code, thereby generating the word set corresponding to the key code. Notably, the embodiments of the present application do not merge words, so the word set contains a large number of repeated words, each corresponding to a receiving time.
S1012: According to the correspondence between each word in the word set and its receiving moment, establish the correspondence between receiving time periods and the number of occurrences of each word, where a receiving time period includes multiple receiving moments, and the number of occurrences of a word is the number of times the word appears in the word set within one receiving time period.
Understandably, since a word set contains many repeated words and each word corresponds to a receiving time, the number of occurrences of a given word in the word set within a receiving time period can be counted. After this counting is performed for multiple receiving time periods, the correspondence between receiving time periods and the number of occurrences of each word is obtained. For example, within the receiving time period from January 1 to January 5, the word "losing money" occurs 10 times, the word "risk" occurs 8 times, the word "not recommended" occurs 9 times, the word "clear" occurs 20 times, and so on.
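The counting in S1012 can be illustrated with a minimal Python sketch. It assumes each entry of the word set is a `(word, received_on)` pair and that receiving time periods are fixed five-day windows starting from a known date, as in the January example; the function names and the sample records are hypothetical and are not taken from the application.

```python
from collections import Counter, defaultdict
from datetime import date


def period_index(day, start, period_days=5):
    """Return the 1-based serial number of the receiving period containing `day`."""
    return (day - start).days // period_days + 1


def count_by_period(word_records, start, period_days=5):
    """Map period serial number -> Counter of word occurrences in that period."""
    counts = defaultdict(Counter)
    for word, received_on in word_records:
        counts[period_index(received_on, start, period_days)][word] += 1
    return dict(counts)


# Hypothetical word set: repeated words are kept, each with its receiving date.
records = [
    ("risk", date(2019, 1, 1)),
    ("risk", date(2019, 1, 3)),
    ("clear", date(2019, 1, 5)),
    ("risk", date(2019, 1, 6)),
    ("clear", date(2019, 1, 10)),
]
table = count_by_period(records, start=date(2019, 1, 1))
```

Here `table[1]` holds the counts for January 1 to January 5 and `table[2]` those for January 6 to January 10.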
S1013: Fit a regression equation that characterizes the correspondence between the number of word occurrences and the receiving time period.
Optionally, through the regression model:
Figure PCTCN2019103300-appb-000001
a regression equation characterizing the correspondence between the number of word occurrences and the receiving time period is fitted, generating a regression equation for each word in the word set, where num denotes the number of occurrences of the word, time denotes the serial number of the receiving time period, pre1 and pre2 are the two coefficients of the nonlinear regression equation, and e is the natural constant.
Understandably, taking the receiving time period as the independent variable and the number of occurrences of a given word as the dependent variable, the coefficients of the regression model can be obtained with existing methods for solving nonlinear regression equations, so this is not detailed here.
It should be noted that the serial number of a receiving time period indicates its position in chronological order. For example, suppose there are 5 receiving time periods in total: January 1 to January 5, January 6 to January 10, January 11 to January 15, January 16 to January 20, and January 21 to January 25. Then the serial number of the period from January 1 to January 5 is 1, the serial number of the period from January 6 to January 10 is 2, and so on.
Notably, in the embodiments of the present application, each word corresponds to its own regression equation.
S1014: Based on the regression equation corresponding to each word in the word set, calculate the number of occurrences of each word within a preset number of receiving time periods after the current moment, as the predicted number of occurrences of that word.
For example, assuming that the serial number of the receiving time period containing the current moment is 10 and the preset number is 5, the regression equation of each word is evaluated at the independent-variable values 11, 12, 13, 14, and 15, and the sum of the resulting dependent-variable values (that is, the sum of the occurrence counts) is taken as the predicted number of occurrences of that word.
S1015: Select the words whose predicted number of occurrences is not less than a preset count threshold as the verification keywords corresponding to the user.
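Steps S1014 and S1015 can be sketched as follows. The regression model itself survives only as a figure reference, so the sketch accepts any fitted callable `model(time)`; the `exponential_model` helper, with the form num = pre1 * e^(pre2 * time), is purely an assumed stand-in for illustration and should not be read as the formula of the application. All names and sample coefficients are hypothetical.

```python
import math


def exponential_model(pre1, pre2):
    """Assumed stand-in model (not the application's formula): num = pre1 * e**(pre2 * time)."""
    return lambda time: pre1 * math.exp(pre2 * time)


def predicted_occurrences(model, current_period, horizon):
    """Sum the model output over the `horizon` periods following `current_period`."""
    return sum(model(t) for t in range(current_period + 1, current_period + horizon + 1))


def select_keywords(models, current_period, horizon, min_count):
    """Keep the words whose predicted occurrence count is not less than `min_count`."""
    return sorted(
        word
        for word, model in models.items()
        if predicted_occurrences(model, current_period, horizon) >= min_count
    )


# Hypothetical fitted models, one per word (pre2 = 0 gives a flat count per period).
models = {
    "risk": exponential_model(10.0, 0.0),
    "clear": exponential_model(2.0, 0.0),
}
# Current period 10 and preset number 5: evaluate periods 11 to 15 and sum.
selected = select_keywords(models, current_period=10, horizon=5, min_count=30)
```

With these sample values, "risk" is predicted to occur 50 times over periods 11 to 15 and is kept, while "clear" (10 occurrences) falls below the threshold of 30.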
In S102, the real-time data of the user is received and divided into a real-time video component and a real-time audio component, where the real-time video component contains face data.
As mentioned above, the real-time data of the user in the embodiments of the present application is the video captured while the user communicates with business personnel. This video can be divided into a real-time video component and a real-time audio component, where the real-time video component contains only image information and no voice information, and the real-time audio component contains only voice information and no image information.
Because the real-time data is the user-side video captured during the communication between the user and the business personnel, the real-time video component contains the user's face data, and the real-time audio component contains the voice data of both the user and the business personnel.
In S103, the similarity between the face data contained in the real-time video component and the user identity data is calculated.
In the embodiments of the present application, the similarity between the face data contained in the real-time video component and the user identity data is calculated first, in order to judge whether the user currently handling business remotely is the user corresponding to the login information. This avoids security risks caused by leaked login passwords or account theft.
As an embodiment of the present application, as shown in FIG. 3, the above S103 includes:
S1031: Capture one frame of image from the real-time video component as the target image.
Understandably, the real-time video component is composed of multiple frames of images, and the embodiments of the present application select one of these frames as the target image for analysis.
S1032: Divide the target image into multiple image regions, read the pixel data of each pixel in the image regions, and number the image regions in a preset order.
For example, the target image can be divided into 100 equal parts horizontally and 100 equal parts vertically, so that the target image is divided into 10,000 image regions, each containing multiple pixels.
Optionally, the pixel data can be the RGB value of a pixel, that is, one item of pixel data represents the RGB value of one pixel.
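The region division in S1032 can be sketched in plain Python. The sketch assumes the image is a 2-D list of RGB tuples whose height and width are exact multiples of the grid size, and it assumes row-major order as the "preset order" of numbering; the application does not state which order is used, and the function name is hypothetical.

```python
def split_into_regions(image, rows, cols):
    """Split a 2-D list of pixels into rows*cols regions numbered from 1 in row-major order.

    Assumes len(image) is a multiple of `rows` and len(image[0]) a multiple of `cols`;
    any remainder pixels would be dropped by this simple version.
    """
    height, width = len(image), len(image[0])
    region_h, region_w = height // rows, width // cols
    regions = {}
    number = 1
    for r in range(rows):
        for c in range(cols):
            regions[number] = [
                row[c * region_w:(c + 1) * region_w]
                for row in image[r * region_h:(r + 1) * region_h]
            ]
            number += 1
    return regions


# Tiny 4x4 stand-in image; each pixel stores (row, column, 0) as a fake RGB value.
image = [[(y, x, 0) for x in range(4)] for y in range(4)]
regions = split_into_regions(image, rows=2, cols=2)
```

Calling `split_into_regions(image, rows=100, cols=100)` on a suitably sized frame would give the 10,000 regions of the example above.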
S1033: Simultaneously input the pixel data of a preset number of consecutively numbered image regions into a preset VGG neural network model, and output the segmentation coefficient corresponding to the image region whose number is in the middle, where the preset number is an odd number greater than 1.
In the embodiments of the present application, the pixel data of each image region is not input into the preset VGG neural network model on its own. Because a face covers adjacent image regions continuously, and in order to avoid the incidental errors that isolated per-region calculation may cause, the pixel data of a preset number of consecutively numbered image regions is input into the neural network model at the same time (for example, the pixel data of 3 consecutive image regions). When the result is output, only the segmentation coefficient of the image region whose number is in the middle is output. The segmentation coefficient is used to distinguish image regions covered by the face image from image regions not covered by the face image.
In the embodiments of the present application, a preset number of consecutively numbered image regions is treated as a region group, and the pixel data of each pixel in each image region is arranged and combined into a vector to generate the feature vector of the region group.
Understandably, during training of the VGG neural network model, it is only necessary to set the input to the pixel data of the preset number of consecutively numbered image regions and the output to the segmentation coefficient of the image region whose number is in the middle. A neural network model trained in this way outputs the segmentation coefficient of the middle-numbered image region as described above.
Specifically, the training process of the VGG neural network includes: obtaining the training feature vectors of multiple training region groups and the training segmentation coefficient of the middle image region in each training region group; and repeating the following steps until the cross-entropy loss of the adjusted VGG neural network is less than a preset threshold: taking the training feature vector as the input of the VGG neural network and the training segmentation coefficient as its output, updating the parameters of each layer in the fully connected layers of the VGG neural network by stochastic gradient descent, and calculating the cross-entropy loss of the adjusted VGG neural network. Finally, the VGG neural network whose cross-entropy loss is less than the preset threshold is output as the preset neural network model.
For example, assuming there are 100 image regions in total and the preset number is 3, inputting the pixel data of image regions 1 to 3 into the neural network model outputs the segmentation coefficient of image region 2; inputting the pixel data of image regions 2 to 4 outputs the segmentation coefficient of image region 3; and so on, until the segmentation coefficient of image region 99 is output.
Notably, the segmentation coefficients of image regions 1 and 100 cannot be obtained with this method. However, the ultimate purpose of obtaining the segmentation coefficients is to segment the face image from the target image, and the probability that the face image covers pixels at the edge of the target image is extremely small. Therefore, although the embodiments of the present application cannot calculate the segmentation coefficients of the image regions with the largest and smallest numbers, this does not affect the segmentation of the face image in practice.
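The windowing described for S1033 can be sketched without the neural network: the snippet below only enumerates which region numbers would be fed in together and which region each output belongs to. The function name `center_windows` is hypothetical, and the model call itself is omitted.

```python
def center_windows(num_regions, window):
    """Yield (region_numbers, center_number) for each run of `window` adjacent regions."""
    if window <= 1 or window % 2 == 0:
        raise ValueError("window must be an odd number greater than 1")
    half = window // 2
    for start in range(1, num_regions - window + 2):
        numbers = list(range(start, start + window))
        yield numbers, start + half


# 100 regions and a preset number of 3, as in the example above.
windows = list(center_windows(100, 3))
```

The first window is regions 1 to 3 with center 2 and the last is regions 98 to 100 with center 99, so regions 1 and 100 never appear as a center, which matches the edge case noted above.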
S1034: Take the pixels corresponding to segmentation coefficients that are less than a preset coefficient threshold as face pixels, and generate the face data from the pixel data of all the face pixels.
S1035: Calculate, by a distance formula, the similarity between the matrix corresponding to the face data and the matrix corresponding to the user identity data, as the similarity between the face data contained in the video component and the user identity data.
Optionally, the distance formula is:
Figure PCTCN2019103300-appb-000002
where S is the similarity, X_i is the i-th element of the matrix corresponding to the face data, Y_i is the i-th element of the matrix corresponding to the user identity data, and K is the number of elements contained in the matrix corresponding to the face data and in the matrix corresponding to the user identity data.
In S104, if the similarity between the face data and the user identity data is less than the preset similarity threshold, it is determined that the real-time data fails verification.
In S105, if the similarity between the face data and the user identity data is not less than the preset similarity threshold, the number of verification keywords contained in the real-time audio component within the preset time period is calculated.
Optionally, the real-time audio component within the preset time period is converted into text data by a speech-to-text algorithm; word segmentation is performed on the text data to generate a speech word set containing multiple words; and the number of verification keywords contained in the speech word set is calculated as the number of verification keywords contained in the real-time audio component.
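The final counting step in S105 can be sketched as below. The sketch assumes speech-to-text conversion and word segmentation have already produced a list of tokens (a whitespace split stands in for a real segmenter here), and it counts distinct keywords; the application does not state whether repeated occurrences of the same keyword count more than once, so that choice is an assumption. The names and the sample transcript are hypothetical.

```python
def count_verification_keywords(tokens, keywords):
    """Count how many distinct verification keywords appear among the tokens."""
    return len(set(tokens) & set(keywords))


# Hypothetical transcript, already converted from audio to text.
transcript = "this product carries risk and is it clear to you yes clear"
tokens = transcript.split()  # stand-in for a real word segmentation step
found = count_verification_keywords(tokens, {"risk", "clear", "loss"})
```

If total occurrences were intended instead, `sum(1 for t in tokens if t in keywords)` would be the corresponding one-line variant.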
In S106, if the number of verification keywords contained in the real-time audio component is not greater than the second preset number threshold, it is determined that the real-time data fails verification.
In S107, if the number of verification keywords contained in the real-time audio component is not greater than the first preset number threshold but greater than the second preset number threshold, one of multiple asynchronous processing servers is selected as the selected server, and the real-time data is sent to the selected server, so that the selected server performs non-real-time verification on the real-time data.
In the embodiments of the present application, when the number of keywords contained in the real-time audio component is not large enough, it cannot be directly determined whether the real-time data passes verification, and further automatic or manual analysis by another asynchronous processing server is needed. Because the embodiments of the present application describe the real-time data verification method only from the side of the main server, only how the main server selects an asynchronous processing server as the selected server and sends the real-time data to it is described here; how the selected server performs non-real-time verification on the real-time data is not covered.
As an embodiment of the present application, as shown in FIG. 4, the above S107 includes:
S1071: Retrieve the number of threads of each of the multiple asynchronous processing servers, and count the number of abnormal tasks reported by each asynchronous processing server within the unit time period before the current moment.
S1072: Calculate the sending parameter corresponding to each asynchronous processing server by a piecewise formula.
Optionally, the piecewise formula includes:
Figure PCTCN2019103300-appb-000003
where K(i) denotes the sending parameter corresponding to asynchronous processing server i, Z(i) denotes the number of threads of asynchronous processing server i, and D(i) denotes the number of abnormal tasks of asynchronous processing server i.
S1073: Calculate the sending ratio corresponding to each asynchronous processing server using a ratio calculation formula.
Optionally, the ratio calculation formula includes:
Figure PCTCN2019103300-appb-000004
where Par_i denotes the sending ratio corresponding to asynchronous processing server i, K(i) denotes the sending parameter corresponding to asynchronous processing server i, and n is the number of asynchronous processing servers.
S1074: Select the asynchronous processing server with the highest sending ratio as the selected server.
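Steps S1071 to S1074 can be sketched as follows. The two formulas are available here only as figure placeholders, so this is a sketch under stated assumptions and not the formulas of the application: `placeholder_send_param` is a hypothetical stand-in for the piecewise formula K(i), and the sending ratio is assumed to be K(i) divided by the sum of K over all n servers. The names `AsyncServer` and `select_server` are likewise illustrative.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class AsyncServer:
    name: str
    threads: int         # Z(i): thread count retrieved for server i
    abnormal_tasks: int  # D(i): abnormal tasks fed back in the last unit period


def placeholder_send_param(threads: int, abnormal_tasks: int) -> float:
    """Hypothetical stand-in for the piecewise formula K(i).

    ASSUMPTION: the real formula is not reproduced in the text; this
    placeholder only mimics a two-branch (piecewise) shape.
    """
    if abnormal_tasks >= threads:
        return 0.0
    return float(threads - abnormal_tasks)


def select_server(
    servers: Sequence[AsyncServer],
    send_param: Callable[[int, int], float] = placeholder_send_param,
) -> Tuple[AsyncServer, List[float]]:
    """Return the server with the highest sending ratio and all ratios."""
    if not servers:
        raise ValueError("at least one asynchronous processing server is required")
    # S1072: one sending parameter per server.
    params = [send_param(s.threads, s.abnormal_tasks) for s in servers]
    total = sum(params)
    # S1073: ASSUMED normalisation Par_i = K(i) / sum of K(j).
    if total > 0:
        ratios = [k / total for k in params]
    else:
        # ASSUMPTION: with no positive parameter, fall back to equal ratios.
        ratios = [1.0 / len(servers)] * len(servers)
    # S1074: pick the server with the highest ratio (first one on ties).
    best = max(range(len(servers)), key=lambda i: ratios[i])
    return servers[best], ratios
```

Passing the real piecewise formula as `send_param` would leave the selection step unchanged, which is why it is kept as a parameter.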
In S108, if the number of verification keywords contained in the real-time audio component is greater than the first preset number threshold, it is determined that the real-time data passes verification.
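The branching between S107 and S108 can be sketched as a small routing function. This is a minimal illustration: the names `Outcome` and `route_by_keyword_count` are hypothetical, and because this section does not describe what happens when the count is at or below the second threshold, that case is reported under a neutral label instead of being given an outcome.

```python
from enum import Enum


class Outcome(Enum):
    PASSED = "passed"                                   # S108
    ASYNC_REVIEW = "async_review"                       # S107
    BELOW_SECOND_THRESHOLD = "below_second_threshold"   # not covered in this section


def route_by_keyword_count(count: int, first_threshold: int, second_threshold: int) -> Outcome:
    """Map a verification-keyword count to the step that handles it."""
    # Claim 1 states that the second threshold is less than the first.
    if second_threshold >= first_threshold:
        raise ValueError("second threshold must be less than first threshold")
    if count > first_threshold:
        return Outcome.PASSED
    if count > second_threshold:
        return Outcome.ASYNC_REVIEW
    return Outcome.BELOW_SECOND_THRESHOLD
```

Note that a count exactly equal to the first threshold is "not greater than" it and therefore goes to asynchronous review.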
It can be understood that user identity data and verification keywords are obtained from the login information; the user's real-time data is received and divided into a real-time video component and a real-time audio component; and the similarity between the face data contained in the real-time video component and the user identity data is calculated to verify the user's identity. If the similarity is not less than the preset similarity threshold, the number of verification keywords contained in the real-time audio component within a preset time period is calculated to verify whether the real-time data is legitimate. If that number is greater than the first number threshold, it is determined that the real-time data passes verification. The real-time data is thereby analyzed intelligently, which improves the security of remote business processing.
Corresponding to the real-time data verification method described in the above embodiments, FIG. 5 shows a structural block diagram of the real-time data verification device provided by an embodiment of the present application. For ease of description, only the parts related to the embodiments of the present application are shown.
Referring to FIG. 5, the device includes:
a parsing module 501, configured to, after receiving the user's login information, determine the user identity data corresponding to the login information, parse the key code contained in the login information, and determine multiple verification keywords corresponding to the user based on the key code;
a decomposition module 502, configured to receive the user's real-time data and divide the real-time data into a real-time video component and a real-time audio component, the real-time video component containing face data;
a calculation module 503, configured to calculate the similarity between the face data contained in the real-time video component and the user identity data, and, if the similarity between the face data and the user identity data is not less than a preset similarity threshold, calculate the number of verification keywords contained in the real-time audio component within a preset time period;
a first execution module 504, configured to, if the number of verification keywords contained in the real-time audio component is not greater than the first number threshold but is greater than the second number threshold, select one of multiple asynchronous processing servers as the selected server and send the real-time data to the selected server, so that the selected server performs non-real-time verification of the real-time data;
a second execution module 505, configured to determine that the real-time data passes verification if the number of verification keywords contained in the real-time audio component is greater than the first number threshold.
Optionally, the parsing module is specifically configured to:
retrieve the word set corresponding to the key code received within a preset period, each word in the word set corresponding to one receiving moment;
establish, according to the correspondence between each word in the word set and its receiving moment, a correspondence between receiving time periods and the number of occurrences of each word, where a receiving time period includes multiple receiving moments and the number of occurrences is the number of times the word appears in the word set within one receiving time period;
fit, using the regression model:
Figure PCTCN2019103300-appb-000005
a regression equation characterizing the correspondence between the number of occurrences of a word and the receiving time period, and generate the regression equation corresponding to each word in the word set, where num denotes the number of occurrences of the word, time denotes the sequence number of the receiving time period, pre1 and pre2 are the two coefficients of the nonlinear regression equation, and e is the natural constant;
calculate, based on the regression equation corresponding to each word in the word set, the number of occurrences of each word within a preset number of receiving time periods after the current moment, as the predicted number of occurrences corresponding to each word;
select the words whose predicted number of occurrences is not less than a preset count threshold as the verification keywords corresponding to the user.
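The keyword selection performed by the parsing module can be sketched as follows. The regression model is available here only as a figure placeholder, so the exponential form num = pre1 · e^(pre2 · time) used below is an assumption inferred from the symbols named in the text, not the model of the application. The function names, the log-linear least-squares fit, and the choice to sum the predictions over the future periods are likewise illustrative.

```python
import math
from typing import Dict, List, Sequence, Tuple


def fit_exponential(times: Sequence[int], counts: Sequence[int]) -> Tuple[float, float]:
    """Fit the ASSUMED model num = pre1 * e**(pre2 * time).

    Taking logarithms gives ln(num) = ln(pre1) + pre2 * time, which is fitted
    by ordinary least squares. Periods with a zero count are skipped because
    their logarithm is undefined.
    """
    points = [(t, math.log(c)) for t, c in zip(times, counts) if c > 0]
    if len(points) < 2:
        raise ValueError("need at least two periods with a positive count")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    if sxx == 0:
        raise ValueError("need at least two distinct periods")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    pre2 = sxy / sxx
    pre1 = math.exp(mean_y - pre2 * mean_t)
    return pre1, pre2


def predict_occurrences(pre1: float, pre2: float, last_period: int, horizon: int) -> float:
    """Sum the predicted counts over the next `horizon` receiving time periods."""
    return sum(
        pre1 * math.exp(pre2 * t)
        for t in range(last_period + 1, last_period + horizon + 1)
    )


def select_keywords(history: Dict[str, List[int]], horizon: int, min_count: float) -> List[str]:
    """Keep words whose predicted occurrences reach the preset count threshold.

    `history` maps each word to its counts for receiving periods 1..m.
    """
    selected = []
    for word, counts in history.items():
        times = list(range(1, len(counts) + 1))
        pre1, pre2 = fit_exponential(times, counts)
        if predict_occurrences(pre1, pre2, len(counts), horizon) >= min_count:
            selected.append(word)
    return selected
```

With this assumed model, a word whose count doubles every period is kept, while a word whose count halves every period falls below any reasonable threshold within a few periods.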
Optionally, the calculation module is specifically configured to:
capture one frame from the real-time video component as the target image;
divide the target image into multiple image regions, read the pixel data of each pixel in the image regions, and number the image regions in a preset order;
input the pixel data of a preset number of consecutively numbered image regions into a preset VGG neural network model at the same time, and output the segmentation coefficient corresponding to the image region whose number is in the middle, the preset number being an odd number greater than 1;
take the pixels corresponding to segmentation coefficients smaller than a preset coefficient threshold as face pixels, and generate the face data from the pixel data of all the face pixels;
calculate, using the formula:
Figure PCTCN2019103300-appb-000006
the similarity between the matrix corresponding to the face data and the matrix corresponding to the user identity data, as the similarity between the face data contained in the video component and the user identity data, where S is the similarity, X_i is the i-th element of the matrix corresponding to the face data, Y_i is the i-th element of the matrix corresponding to the user identity data, and K is the number of elements contained in the matrix corresponding to the face data and in the matrix corresponding to the user identity data.
Optionally, calculating the number of verification keywords contained in the real-time audio component within a preset time period includes: converting the real-time audio component within the preset time period into text data using a speech-to-text conversion algorithm; performing word segmentation on the text data to generate a speech word set containing multiple words; and calculating the number of verification keywords contained in the speech word set, as the number of verification keywords contained in the real-time audio component.
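The final counting step can be sketched as follows. Speech-to-text conversion and word segmentation are outside the sketch, so the input is assumed to be a list of already segmented words. The function name is hypothetical, and the reading that each verification keyword is counted at most once (because the text speaks of a word set) is an interpretation and not something the text states explicitly.

```python
from typing import Iterable


def count_verification_keywords(spoken_words: Iterable[str], verification_keywords: Iterable[str]) -> int:
    """Count how many distinct verification keywords occur in the spoken words.

    ASSUMPTION: each keyword counts once, however often it was spoken.
    """
    spoken = set(spoken_words)
    return sum(1 for keyword in set(verification_keywords) if keyword in spoken)
```

If repeated utterances should count separately, the set of spoken words would be replaced by a multiset such as `collections.Counter`.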
Optionally, selecting one of multiple asynchronous processing servers as the selected server includes:
retrieving the number of threads of each of the multiple asynchronous processing servers, and counting the number of abnormal tasks fed back by each asynchronous processing server within the unit time period before the current moment;
calculating, using the formula:
Figure PCTCN2019103300-appb-000007
the sending parameter corresponding to each asynchronous processing server, where K(i) denotes the sending parameter corresponding to asynchronous processing server i, Z(i) denotes the number of threads corresponding to asynchronous processing server i, and D(i) denotes the number of abnormal tasks corresponding to asynchronous processing server i;
calculating, using the formula:
Figure PCTCN2019103300-appb-000008
the sending ratio corresponding to each asynchronous processing server, where Par_i denotes the sending ratio corresponding to asynchronous processing server i, K(i) denotes the sending parameter corresponding to asynchronous processing server i, and n is the number of asynchronous processing servers;
selecting the asynchronous processing server with the highest sending ratio as the selected server.
It can be understood that user identity data and verification keywords are obtained from the login information; the user's real-time data is received and divided into a real-time video component and a real-time audio component; and the similarity between the face data contained in the real-time video component and the user identity data is calculated to verify the user's identity. If the similarity is not less than the preset similarity threshold, the number of verification keywords contained in the real-time audio component within a preset time period is calculated to verify whether the real-time data is legitimate. If that number is greater than the first number threshold, it is determined that the real-time data passes verification. The real-time data is thereby analyzed intelligently, which improves the security of remote business processing.
FIG. 6 is a schematic diagram of a server provided by an embodiment of the present application. As shown in FIG. 6, the server 6 of this embodiment includes a processor 60, a memory 61, and computer-readable instructions 62 stored in the memory 61 and executable on the processor 60, for example a real-time data verification program. When the processor 60 executes the computer-readable instructions 62, it implements the steps of the foregoing embodiments of the real-time data verification method, for example steps 101 to 108 shown in FIG. 1. Alternatively, when the processor 60 executes the computer-readable instructions 62, it implements the functions of the modules/units in the foregoing device embodiments, for example the functions of units 501 to 505 shown in FIG. 5.
Exemplarily, the computer-readable instructions 62 may be divided into one or more modules/units, which are stored in the memory 61 and executed by the processor 60 to carry out the present application. The one or more modules/units may be a series of computer-readable instruction segments capable of performing specific functions, and the instruction segments describe the execution of the computer-readable instructions 62 in the server 6.
The server 6 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The server may include, but is not limited to, the processor 60 and the memory 61. Those skilled in the art will understand that FIG. 6 is only an example of the server 6 and does not limit it; the server may include more or fewer components than shown, combine certain components, or use different components. For example, the server may also include input/output devices, network access devices, buses, and the like.
The processor 60 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor.
The memory 61 may be an internal storage unit of the server 6, such as a hard disk or memory of the server 6. The memory 61 may also be an external storage device of the server 6, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card fitted to the server 6. Further, the memory 61 may include both an internal storage unit of the server 6 and an external storage device. The memory 61 is used to store the computer-readable instructions and other programs and data required by the server. The memory 61 may also be used to temporarily store data that has been output or is about to be output.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the above division into functional units and modules is used only as an example. In practical applications, the above functions may be assigned to different functional units and modules as needed; that is, the internal structure of the device may be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules serve only to distinguish them from one another and do not limit the protection scope of the present application. For the specific working process of the units and modules in the foregoing system, reference may be made to the corresponding process in the foregoing method embodiments, which is not repeated here.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not detailed or recorded in one embodiment, reference may be made to the related descriptions of other embodiments.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes of the above method embodiments of the present application may also be completed by computer-readable instructions instructing the relevant hardware, and the computer-readable instructions may be stored in a computer-readable storage medium.
A person of ordinary skill in the art will understand that all or part of the processes of the above method embodiments may be completed by computer-readable instructions instructing the relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, storage, a database, or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and they shall all fall within the protection scope of the present application.

Claims (20)

  1. A method for verifying real-time data, characterized by comprising:
    after receiving a user's login information, determining the user identity data corresponding to the login information, parsing the key code contained in the login information, and determining multiple verification keywords corresponding to the user based on the key code;
    receiving the user's real-time data and dividing the real-time data into a real-time video component and a real-time audio component, the real-time video component containing face data;
    calculating the similarity between the face data contained in the real-time video component and the user identity data, and, if the similarity between the face data and the user identity data is not less than a preset similarity threshold, calculating the number of verification keywords contained in the real-time audio component within a preset time period;
    if the number of verification keywords contained in the real-time audio component is not greater than a first preset number threshold but is greater than a second preset number threshold, selecting one of multiple asynchronous processing servers as the selected server and sending the real-time data to the selected server, so that the selected server performs non-real-time verification of the real-time data, the second preset number threshold being less than the first preset number threshold;
    if the number of verification keywords contained in the real-time audio component is greater than the first preset number threshold, determining that the real-time data passes verification.
  2. The method for verifying real-time data according to claim 1, characterized in that determining multiple verification keywords corresponding to the user based on the key code comprises:
    retrieving the word set corresponding to the key code received within a preset period, each word in the word set corresponding to one receiving moment;
    establishing, according to the correspondence between each word in the word set and its receiving moment, a correspondence between receiving time periods and the number of occurrences of each word, where a receiving time period includes multiple receiving moments and the number of occurrences is the number of times the word appears in the word set within one receiving time period;
    fitting, using the regression model:
    Figure PCTCN2019103300-appb-100001
    a regression equation characterizing the correspondence between the number of occurrences of a word and the receiving time period, and generating the regression equation corresponding to each word in the word set, where num denotes the number of occurrences of the word, time denotes the sequence number of the receiving time period, pre1 and pre2 are the two coefficients of the nonlinear regression equation, and e is the natural constant;
    calculating, based on the regression equation corresponding to each word in the word set, the number of occurrences of each word within a preset number of receiving time periods after the current moment, as the predicted number of occurrences corresponding to each word;
    selecting the words whose predicted number of occurrences is not less than a preset count threshold as the verification keywords corresponding to the user.
  3. The method for verifying real-time data according to claim 1, characterized in that calculating the similarity between the face data contained in the real-time video component and the user identity data comprises:
    capturing one frame from the real-time video component as the target image;
    dividing the target image into multiple image regions, reading the pixel data of each pixel in the image regions, and numbering the image regions in a preset order;
    inputting the pixel data of a preset number of consecutively numbered image regions into a preset VGG neural network model at the same time, and outputting the segmentation coefficient corresponding to the image region whose number is in the middle, the preset number being an odd number greater than 1;
    taking the pixels corresponding to segmentation coefficients smaller than a preset coefficient threshold as face pixels, and generating the face data from the pixel data of all the face pixels;
    calculating, using the formula:
    Figure PCTCN2019103300-appb-100002
    the similarity between the matrix corresponding to the face data and the matrix corresponding to the user identity data, as the similarity between the face data contained in the video component and the user identity data, where S is the similarity, X_i is the i-th element of the matrix corresponding to the face data, Y_i is the i-th element of the matrix corresponding to the user identity data, and K is the number of elements contained in the matrix corresponding to the face data and in the matrix corresponding to the user identity data.
  4. The method for verifying real-time data according to claim 1, characterized in that calculating the number of verification keywords contained in the real-time audio component within a preset time period comprises:
    converting the real-time audio component within the preset time period into text data using a speech-to-text conversion algorithm;
    performing word segmentation on the text data to generate a speech word set, the speech word set containing multiple words;
    calculating the number of verification keywords contained in the speech word set, as the number of verification keywords contained in the real-time audio component.
  5. The method for verifying real-time data according to claim 1, characterized in that selecting one of multiple asynchronous processing servers as the selected server comprises:
    retrieving the number of threads of each of the multiple asynchronous processing servers, and counting the number of abnormal tasks fed back by each asynchronous processing server within the unit time period before the current moment;
    calculating, using the formula:
    Figure PCTCN2019103300-appb-100003
    the sending parameter corresponding to each asynchronous processing server, where K(i) denotes the sending parameter corresponding to asynchronous processing server i, Z(i) denotes the number of threads corresponding to asynchronous processing server i, and D(i) denotes the number of abnormal tasks corresponding to asynchronous processing server i;
    calculating, using the formula:
    Figure PCTCN2019103300-appb-100004
    the sending ratio corresponding to each asynchronous processing server, where Par_i denotes the sending ratio corresponding to asynchronous processing server i, K(i) denotes the sending parameter corresponding to asynchronous processing server i, and n is the number of asynchronous processing servers;
    selecting the asynchronous processing server with the highest sending ratio as the selected server.
  6. A real-time data verification device, characterized in that the device includes:
    A parsing module, configured to, after receiving login information of a user, determine user identity data corresponding to the login information, parse a key code contained in the login information, and determine a plurality of verification keywords corresponding to the user based on the key code;
    A decomposition module, configured to receive real-time data of the user and divide the real-time data into a real-time video component and a real-time audio component, the real-time video component containing face data;
    A calculation module, configured to calculate the similarity between the face data contained in the real-time video component and the user identity data, and, if the similarity between the face data and the user identity data is not less than a preset similarity threshold, calculate the number of the verification keywords contained in the real-time audio component within a preset time period;
    A first execution module, configured to, if the number of the verification keywords contained in the real-time audio component is not greater than a first number threshold but greater than a second number threshold, select one of a plurality of asynchronous processing servers as a selected server and send the real-time data to the selected server, so that the selected server performs non-real-time verification of the real-time data;
    A second execution module, configured to determine that the real-time data passes verification if the number of the verification keywords contained in the real-time audio component is greater than the first number threshold.
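The threshold logic of the calculation and execution modules can be sketched as below. This is a minimal illustration, not the claimed device: the names are hypothetical, and the NOT_PASSED outcome is an assumption, since this claim does not state what happens when the similarity is below its threshold or when the keyword count does not exceed the second threshold. The check that the second threshold is smaller than the first follows the wording of claim 11.

```python
from enum import Enum


class Outcome(Enum):
    PASSED = "passed"
    ASYNC_REVIEW = "async_review"
    NOT_PASSED = "not_passed"  # ASSUMPTION: this outcome is not spelled out in the claim


def route(similarity: float, keyword_count: int, similarity_threshold: float,
          first_threshold: int, second_threshold: int) -> Outcome:
    """Decide how the real-time data is handled, following the two-threshold rule."""
    if second_threshold >= first_threshold:
        raise ValueError("second threshold must be less than the first threshold")
    # Keywords are only counted when the face similarity is not less than its threshold.
    if similarity < similarity_threshold:
        return Outcome.NOT_PASSED  # ASSUMPTION
    if keyword_count > first_threshold:
        return Outcome.PASSED
    # Not greater than the first threshold but greater than the second: non-real-time check.
    if keyword_count > second_threshold:
        return Outcome.ASYNC_REVIEW
    return Outcome.NOT_PASSED  # ASSUMPTION
```

A count exactly equal to the first threshold falls into the asynchronous branch, because the claim requires the count to be strictly greater than the first threshold for an immediate pass.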
  7. The real-time data verification device according to claim 6, wherein the parsing module is specifically configured to:
    Retrieve the word set corresponding to the key code received within a preset period, each word in the word set corresponding to one receiving moment;
    Establish, according to the correspondence between each word in the word set and its receiving moment, a correspondence between receiving time periods and the number of occurrences of each word, where each receiving time period includes multiple receiving moments, and the number of occurrences of a word is the number of times the word appears in the word set within one receiving time period;
    Through the regression model:
    Figure PCTCN2019103300-appb-100005
    Fit a regression equation that characterizes the correspondence between the number of occurrences of a word and the receiving time period, and generate the regression equation corresponding to each word in the word set, where num represents the number of occurrences of the word, time represents the sequence number of the receiving time period, pre1 and pre2 are the two coefficients of the nonlinear regression equation, and e is the natural constant;
    Based on the regression equation corresponding to each word in the word set, calculate the number of occurrences of each word within a preset number of receiving time periods after the current moment, as the predicted number of occurrences corresponding to each word;
    Select the words whose predicted number of occurrences is not less than a preset count threshold as the verification keywords corresponding to the user.
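The keyword prediction steps above can be sketched as follows. The regression model itself is given as a figure that is not reproduced in this text, so the sketch assumes the exponential form num = pre1 * e^(pre2 * time) and fits it by ordinary least squares on ln(num). The model form, the fitting method, the summing of predictions over the future periods, and all function names are assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def fit_exponential(counts: Sequence[float]) -> Tuple[float, float]:
    """Fit ln(num) = ln(pre1) + pre2 * time for time = 1..len(counts)."""
    # Zero counts cannot be log-transformed, so they are skipped (ASSUMPTION).
    points = [(t, math.log(c)) for t, c in enumerate(counts, start=1) if c > 0]
    if len(points) < 2:
        raise ValueError("need at least two periods with a positive count")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    pre2 = sxy / sxx
    pre1 = math.exp(mean_y - pre2 * mean_t)
    return pre1, pre2


def predicted_occurrences(counts: Sequence[float], horizon: int) -> float:
    """Sum the fitted model over the next `horizon` receiving time periods."""
    pre1, pre2 = fit_exponential(counts)
    last = len(counts)
    return sum(pre1 * math.exp(pre2 * t) for t in range(last + 1, last + horizon + 1))


def select_keywords(history: Dict[str, Sequence[float]], horizon: int,
                    threshold: float) -> List[str]:
    """Keep the words whose predicted count is not less than the threshold."""
    return [w for w, c in history.items() if predicted_occurrences(c, horizon) >= threshold]
```

With per-period counts of 2, 4, 8, 16 the assumed model predicts about 32 occurrences in the next period, so a rising word is retained while a word whose counts are falling is dropped.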
  8. The real-time data verification device according to claim 6, wherein the calculation module is specifically configured to:
    Capture one frame of image from the real-time video component as a target image;
    Divide the target image into a plurality of image regions, read the pixel data of each pixel in the image regions, and number the image regions according to a preset order;
    Simultaneously input the pixel data of a preset number of consecutively numbered image regions into a preset VGG neural network model, and output the segmentation coefficient corresponding to the image region whose number is in the middle, the preset number being an odd number greater than 1;
    Take the pixels corresponding to segmentation coefficients smaller than a preset coefficient threshold as face pixels, and generate the face data according to the pixel data of all the face pixels;
    By the formula:
    Figure PCTCN2019103300-appb-100006
    Calculate the similarity between the matrix corresponding to the face data and the matrix corresponding to the user identity data, as the similarity between the face data contained in the video component and the user identity data, where S is the similarity, X_i is the i-th element of the matrix corresponding to the face data, Y_i is the i-th element of the matrix corresponding to the user identity data, and K is the number of elements contained in the matrix corresponding to the face data and in the matrix corresponding to the user identity data.
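An element-wise matrix comparison of the kind described above can be sketched as follows. The similarity formula is given as a figure that is not reproduced in this text, so the expression used here, S = 1 / (1 + sqrt((1/K) * sum of (X_i - Y_i)^2)), is purely an assumption chosen so that identical matrices score 1 and the score falls as the matrices diverge. The function name is hypothetical.

```python
import math
from typing import Sequence


def similarity(face: Sequence[Sequence[float]],
               identity: Sequence[Sequence[float]]) -> float:
    """Hypothetical S computed from the K paired elements X_i and Y_i."""
    # Flatten both matrices so that element i of one pairs with element i of the other.
    x = [v for row in face for v in row]
    y = [v for row in identity for v in row]
    if not x or len(x) != len(y):
        raise ValueError("both matrices must contain the same non-zero number of elements")
    k = len(x)
    # ASSUMPTION: root-mean-square difference mapped into the range (0, 1].
    rms = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / k)
    return 1.0 / (1.0 + rms)
```

The result would then be compared against the preset similarity threshold before any keyword counting takes place.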
  9. The identification device according to claim 6, wherein the calculation module is specifically configured to:
    Convert the real-time audio component within the preset time period into text data through a speech-to-text conversion algorithm;
    Perform word segmentation on the text data to generate a speech word set, the speech word set containing multiple words;
    Calculate the number of the verification keywords contained in the speech word set, as the number of the verification keywords contained in the real-time audio component.
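The counting step above can be sketched as follows, starting from text that has already been produced by a speech-to-text step. The sketch makes two assumptions: word segmentation is approximated by a simple regular-expression split (Chinese text would need a dedicated segmenter), and the count is the number of distinct verification keywords present rather than the total number of occurrences. The function names are hypothetical.

```python
import re
from typing import Iterable, List


def segment(text: str) -> List[str]:
    """Split transcribed text into lower-case words."""
    # ASSUMPTION: a regular-expression split stands in for real word segmentation.
    return re.findall(r"\w+", text.lower())


def count_verification_keywords(text: str, keywords: Iterable[str]) -> int:
    """Count how many distinct verification keywords appear in the speech word set."""
    speech_words = set(segment(text))
    wanted = {kw.lower() for kw in keywords}
    return len(wanted & speech_words)
```

For the transcript "The premium and the policy term are fixed" and the keywords premium, policy and refund, the sketch counts two keywords.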
  10. The identification device according to any one of claims 6-8, wherein the first execution module is specifically configured to:
    Retrieve the number of threads contained in each of the plurality of asynchronous processing servers, and count the number of abnormal tasks reported by each asynchronous processing server within the unit time period before the current moment;
    By the formula:
    Figure PCTCN2019103300-appb-100007
    Calculate the sending parameter corresponding to each asynchronous processing server, where K(i) represents the sending parameter corresponding to asynchronous processing server i, Z(i) represents the number of threads corresponding to asynchronous processing server i, and D(i) represents the number of abnormal tasks corresponding to asynchronous processing server i;
    By the formula:
    Figure PCTCN2019103300-appb-100008
    Calculate the sending ratio corresponding to each asynchronous processing server, where Par_i represents the sending ratio corresponding to asynchronous processing server i, K(i) represents the sending parameter corresponding to asynchronous processing server i, and n is the number of asynchronous processing servers;
    Select an asynchronous processing server corresponding to the highest sending ratio as the selected server.
  11. A terminal device, characterized by comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
    After receiving login information of a user, determine user identity data corresponding to the login information, parse a key code contained in the login information, and determine a plurality of verification keywords corresponding to the user based on the key code;
    Receive real-time data of the user, and divide the real-time data into a real-time video component and a real-time audio component, the real-time video component containing face data;
    Calculate the similarity between the face data contained in the real-time video component and the user identity data, and, if the similarity between the face data and the user identity data is not less than a preset similarity threshold, calculate the number of the verification keywords contained in the real-time audio component within a preset time period;
    If the number of the verification keywords contained in the real-time audio component is not greater than a first preset number threshold but greater than a second preset number threshold, select one of a plurality of asynchronous processing servers as a selected server and send the real-time data to the selected server, so that the selected server performs non-real-time verification of the real-time data, the second preset number threshold being less than the first preset number threshold;
    If the number of the verification keywords contained in the real-time audio component is greater than the first preset number threshold, determine that the real-time data passes verification.
  12. The terminal device according to claim 11, wherein the determining a plurality of verification keywords corresponding to the user based on the key code comprises:
    Retrieve the word set corresponding to the key code received within a preset period, each word in the word set corresponding to one receiving moment;
    Establish, according to the correspondence between each word in the word set and its receiving moment, a correspondence between receiving time periods and the number of occurrences of each word, where each receiving time period includes multiple receiving moments, and the number of occurrences of a word is the number of times the word appears in the word set within one receiving time period;
    Through the regression model:
    Figure PCTCN2019103300-appb-100009
    Fit a regression equation that characterizes the correspondence between the number of occurrences of a word and the receiving time period, and generate the regression equation corresponding to each word in the word set, where num represents the number of occurrences of the word, time represents the sequence number of the receiving time period, pre1 and pre2 are the two coefficients of the nonlinear regression equation, and e is the natural constant;
    Based on the regression equation corresponding to each word in the word set, calculate the number of occurrences of each word within a preset number of receiving time periods after the current moment, as the predicted number of occurrences corresponding to each word;
    Select the words whose predicted number of occurrences is not less than a preset count threshold as the verification keywords corresponding to the user.
  13. The terminal device according to claim 11, wherein the calculating the similarity between the face data contained in the real-time video component and the user identity data comprises:
    Capture one frame of image from the real-time video component as a target image;
    Divide the target image into a plurality of image regions, read the pixel data of each pixel in the image regions, and number the image regions according to a preset order;
    Simultaneously input the pixel data of a preset number of consecutively numbered image regions into a preset VGG neural network model, and output the segmentation coefficient corresponding to the image region whose number is in the middle, the preset number being an odd number greater than 1;
    Take the pixels corresponding to segmentation coefficients smaller than a preset coefficient threshold as face pixels, and generate the face data according to the pixel data of all the face pixels;
    By the formula:
    Figure PCTCN2019103300-appb-100010
    Calculate the similarity between the matrix corresponding to the face data and the matrix corresponding to the user identity data, as the similarity between the face data contained in the video component and the user identity data, where S is the similarity, X_i is the i-th element of the matrix corresponding to the face data, Y_i is the i-th element of the matrix corresponding to the user identity data, and K is the number of elements contained in the matrix corresponding to the face data and in the matrix corresponding to the user identity data.
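The region windowing in the steps above (an odd number of consecutively numbered regions goes in, the coefficient of the middle region comes out) can be sketched as follows. The VGG model is replaced by an arbitrary callable supplied by the caller, and regions near the ends of the numbering, which never sit in the middle of a full window, are simply skipped. Both choices, and all names, are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence

Region = Sequence[float]  # pixel data of one image region


def segmentation_coefficients(regions: List[Region], window: int,
                              model: Callable[[List[Region]], float]) -> Dict[int, float]:
    """Slide a window of consecutively numbered regions and score the middle one."""
    if window <= 1 or window % 2 == 0:
        raise ValueError("window must be an odd number greater than 1")
    half = window // 2
    coefficients: Dict[int, float] = {}
    # ASSUMPTION: only regions that can be the centre of a full window are scored.
    for centre in range(half, len(regions) - half):
        coefficients[centre] = model(regions[centre - half:centre + half + 1])
    return coefficients


def face_regions(coefficients: Dict[int, float], threshold: float) -> List[int]:
    """Return the numbers of regions whose coefficient is below the threshold."""
    return [i for i, c in sorted(coefficients.items()) if c < threshold]
```

Feeding the neighbouring regions together with the middle one gives the model local context, which is presumably why the preset number must be odd: the window then has a single, unambiguous centre.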
  14. The terminal device according to claim 11, wherein the calculating the number of the verification keywords contained in the real-time audio component within a preset time period comprises:
    Convert the real-time audio component within the preset time period into text data through a speech-to-text conversion algorithm;
    Perform word segmentation on the text data to generate a speech word set, the speech word set containing multiple words;
    Calculate the number of the verification keywords contained in the speech word set, as the number of the verification keywords contained in the real-time audio component.
  15. The terminal device according to claim 11, wherein the selecting one of a plurality of asynchronous processing servers as the selected server comprises:
    Retrieve the number of threads contained in each of the plurality of asynchronous processing servers, and count the number of abnormal tasks reported by each asynchronous processing server within the unit time period before the current moment;
    By the formula:
    Figure PCTCN2019103300-appb-100011
    Calculate the sending parameter corresponding to each asynchronous processing server, where K(i) represents the sending parameter corresponding to asynchronous processing server i, Z(i) represents the number of threads corresponding to asynchronous processing server i, and D(i) represents the number of abnormal tasks corresponding to asynchronous processing server i;
    By the formula:
    Figure PCTCN2019103300-appb-100012
    Calculate the sending ratio corresponding to each asynchronous processing server, where Par_i represents the sending ratio corresponding to asynchronous processing server i, K(i) represents the sending parameter corresponding to asynchronous processing server i, and n is the number of asynchronous processing servers;
    Select an asynchronous processing server corresponding to the highest sending ratio as the selected server.
  16. A computer non-volatile readable storage medium storing computer-readable instructions, characterized in that the computer-readable instructions, when executed by a processor, implement the following steps:
    After receiving login information of a user, determine user identity data corresponding to the login information, parse a key code contained in the login information, and determine a plurality of verification keywords corresponding to the user based on the key code;
    Receive real-time data of the user, and divide the real-time data into a real-time video component and a real-time audio component, the real-time video component containing face data;
    Calculate the similarity between the face data contained in the real-time video component and the user identity data, and, if the similarity between the face data and the user identity data is not less than a preset similarity threshold, calculate the number of the verification keywords contained in the real-time audio component within a preset time period;
    If the number of the verification keywords contained in the real-time audio component is not greater than a first preset number threshold but greater than a second preset number threshold, select one of a plurality of asynchronous processing servers as a selected server and send the real-time data to the selected server, so that the selected server performs non-real-time verification of the real-time data, the second preset number threshold being less than the first preset number threshold;
    If the number of the verification keywords contained in the real-time audio component is greater than the first preset number threshold, determine that the real-time data passes verification.
  17. The computer non-volatile readable storage medium according to claim 16, wherein the determining a plurality of verification keywords corresponding to the user based on the key code comprises:
    Retrieve the word set corresponding to the key code received within a preset period, each word in the word set corresponding to one receiving moment;
    Establish, according to the correspondence between each word in the word set and its receiving moment, a correspondence between receiving time periods and the number of occurrences of each word, where each receiving time period includes multiple receiving moments, and the number of occurrences of a word is the number of times the word appears in the word set within one receiving time period;
    Through the regression model:
    Figure PCTCN2019103300-appb-100013
    Fit a regression equation that characterizes the correspondence between the number of occurrences of a word and the receiving time period, and generate the regression equation corresponding to each word in the word set, where num represents the number of occurrences of the word, time represents the sequence number of the receiving time period, pre1 and pre2 are the two coefficients of the nonlinear regression equation, and e is the natural constant;
    Based on the regression equation corresponding to each word in the word set, calculate the number of occurrences of each word within a preset number of receiving time periods after the current moment, as the predicted number of occurrences corresponding to each word;
    Select the words whose predicted number of occurrences is not less than a preset count threshold as the verification keywords corresponding to the user.
  18. The computer non-volatile readable storage medium according to claim 16, wherein the calculating the similarity between the face data contained in the real-time video component and the user identity data comprises:
    Capture one frame of image from the real-time video component as a target image;
    Divide the target image into a plurality of image regions, read the pixel data of each pixel in the image regions, and number the image regions according to a preset order;
    Simultaneously input the pixel data of a preset number of consecutively numbered image regions into a preset VGG neural network model, and output the segmentation coefficient corresponding to the image region whose number is in the middle, the preset number being an odd number greater than 1;
    Take the pixels corresponding to segmentation coefficients smaller than a preset coefficient threshold as face pixels, and generate the face data according to the pixel data of all the face pixels;
    By the formula:
    Figure PCTCN2019103300-appb-100014
    Calculate the similarity between the matrix corresponding to the face data and the matrix corresponding to the user identity data, as the similarity between the face data contained in the video component and the user identity data, where S is the similarity, X_i is the i-th element of the matrix corresponding to the face data, Y_i is the i-th element of the matrix corresponding to the user identity data, and K is the number of elements contained in the matrix corresponding to the face data and in the matrix corresponding to the user identity data.
  19. The computer non-volatile readable storage medium according to claim 16, wherein the calculating the number of the verification keywords contained in the real-time audio component within a preset time period comprises:
    Convert the real-time audio component within the preset time period into text data through a speech-to-text conversion algorithm;
    Perform word segmentation on the text data to generate a speech word set, the speech word set containing multiple words;
    Calculate the number of the verification keywords contained in the speech word set, as the number of the verification keywords contained in the real-time audio component.
  20. The computer non-volatile readable storage medium according to claim 16, wherein the selecting one of a plurality of asynchronous processing servers as the selected server comprises:
    Retrieve the number of threads contained in each of the plurality of asynchronous processing servers, and count the number of abnormal tasks reported by each asynchronous processing server within the unit time period before the current moment;
    By the formula:
    Figure PCTCN2019103300-appb-100015
    Calculate the sending parameter corresponding to each asynchronous processing server, where K(i) represents the sending parameter corresponding to asynchronous processing server i, Z(i) represents the number of threads corresponding to asynchronous processing server i, and D(i) represents the number of abnormal tasks corresponding to asynchronous processing server i;
    By the formula:
    Figure PCTCN2019103300-appb-100016
    Calculate the sending ratio corresponding to each asynchronous processing server, where Par_i represents the sending ratio corresponding to asynchronous processing server i, K(i) represents the sending parameter corresponding to asynchronous processing server i, and n is the number of asynchronous processing servers;
    Select an asynchronous processing server corresponding to the highest sending ratio as the selected server.
PCT/CN2019/103300 2019-05-21 2019-08-29 Real-time data verification method, device, server and medium WO2020232894A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910424123.0A CN110266645A (en) 2019-05-21 2019-05-21 Verification method, device, server and the medium of real time data
CN201910424123.0 2019-05-21

Publications (1)

Publication Number Publication Date
WO2020232894A1 true WO2020232894A1 (en) 2020-11-26

Family

ID=67914974

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/103300 WO2020232894A1 (en) 2019-05-21 2019-08-29 Real-time data verification method, device, server and medium

Country Status (2)

Country Link
CN (1) CN110266645A (en)
WO (1) WO2020232894A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112036370B (en) * 2020-09-22 2023-05-12 济南博观智能科技有限公司 Face feature comparison method, system, equipment and computer storage medium
US11688106B2 (en) * 2021-03-29 2023-06-27 International Business Machines Corporation Graphical adjustment recommendations for vocalization
CN117726307B (en) * 2024-02-18 2024-04-30 成都汇智捷成科技有限公司 Data management method based on business center

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104580650A (en) * 2014-12-25 2015-04-29 广东欧珀移动通信有限公司 Method for pointing out defrauding call and communication terminal
WO2017194978A1 (en) * 2016-05-13 2017-11-16 Lucozade Ribena Suntory Limited Method of controlling a state of a display of a device
CN109389279A (en) * 2018-08-17 2019-02-26 深圳壹账通智能科技有限公司 Insurance sales link closes rule determination method, device, equipment and storage medium
CN109729383A (en) * 2019-01-04 2019-05-07 深圳壹账通智能科技有限公司 Double record video quality detection methods, device, computer equipment and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9967619B2 (en) * 2014-12-01 2018-05-08 Google Llc System and method for associating search queries with remote content display
CN114464186A (en) * 2016-07-28 2022-05-10 北京小米移动软件有限公司 Keyword determination method and device
CN108512869B (en) * 2017-02-24 2020-02-11 北京数安鑫云信息技术有限公司 Method and system for processing concurrent data in asynchronous mode
CN109376344A (en) * 2018-09-03 2019-02-22 平安普惠企业管理有限公司 The generation method and terminal device of list
CN109190775A (en) * 2018-09-05 2019-01-11 南方电网科学研究院有限责任公司 A kind of intelligence operation management equipment and operation management method
CN109377500B (en) * 2018-09-18 2023-07-25 平安科技(深圳)有限公司 Image segmentation method based on neural network and terminal equipment

Also Published As

Publication number Publication date
CN110266645A (en) 2019-09-20

Similar Documents

Publication Publication Date Title
US11475143B2 (en) Sensitive data classification
US20210257066A1 (en) Machine learning based medical data classification method, computer device, and non-transitory computer-readable storage medium
WO2019119505A1 (en) Face recognition method and device, computer device and storage medium
CN110162593A (en) A kind of processing of search result, similarity model training method and device
CN111680159B (en) Data processing method and device and electronic equipment
WO2020232894A1 (en) Real-time data verification method, device, server and medium
CN111797214A (en) FAQ database-based problem screening method and device, computer equipment and medium
CN110569350B (en) Legal recommendation method, equipment and storage medium
CN111695338A (en) Interview content refining method, device, equipment and medium based on artificial intelligence
US20230032728A1 (en) Method and apparatus for recognizing multimedia content
US11734360B2 (en) Methods and systems for facilitating classification of documents
CN110489747A (en) A kind of image processing method, device, storage medium and electronic equipment
Alshehri et al. Iterative keystroke continuous authentication: A time series based approach
US20230237252A1 (en) Digital posting match recommendation apparatus and methods
CN112468658A (en) Voice quality detection method and device, computer equipment and storage medium
CN111488501A (en) E-commerce statistical system based on cloud platform
US20220156489A1 (en) Machine learning techniques for identifying logical sections in unstructured data
CN111460139B (en) Intelligent management based engineering supervision knowledge service system and method
WO2020253353A1 (en) Resource acquisition qualification generation method for preset user and related device
CN112364136B (en) Keyword generation method, device, equipment and storage medium
CN115146589B (en) Text processing method, device, medium and electronic equipment
CN109961801A (en) Intelligent Service evaluation method, computer readable storage medium and terminal device
CN114724072A (en) Intelligent question pushing method, device, equipment and storage medium
Li et al. A deep learning approach of financial distress recognition combining text
CN112733645A (en) Handwritten signature verification method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 19929289; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 19929289; Country of ref document: EP; Kind code of ref document: A1