WO2020232894A1

WO2020232894A1 - Real-time data verification method, device, server and medium

Info

Publication number: WO2020232894A1
Application number: PCT/CN2019/103300
Authority: WO
Inventors: 王旭
Original assignee: 平安科技（深圳）有限公司
Priority date: 2019-05-21
Filing date: 2019-08-29
Publication date: 2020-11-26
Also published as: CN110266645A

Abstract

The present application is applicable to the technical field of data processing and provides a real-time data verification method, device, server and medium. The method comprises: obtaining user identity data and verification keywords through login information; receiving real-time data of the user; dividing the real-time data into a real-time video component and a real-time audio component; and computing the similarity between the face data contained in the real-time video component and the user identity data to verify the user's identity. If the similarity is not less than a preset similarity threshold, the number of verification keywords contained in the real-time audio component within a preset time period is calculated to verify whether the real-time data is legitimate; if the number of verification keywords contained in the real-time audio component is greater than a first quantity threshold, it is determined that the real-time data has passed verification. The real-time data is thereby analyzed intelligently, and the security of remote business processing is improved.

Description

Real-time data verification method, device, server and medium

This application claims priority to Chinese patent application No. 201910424123.0, entitled "Real-time data verification method, device, server and medium" and filed on May 21, 2019, the entire content of which is incorporated herein by reference.

Technical field

This application belongs to the field of data processing technology, and in particular relates to a method, device, server and medium for verifying real-time data.

Background technique

With the development of Internet technology, users can remotely handle various businesses online. However, the handling of online business also has security risks to a certain extent. For example, some criminals may pretend to be others to handle related businesses in order to obtain illegal benefits; some elderly users may handle unnecessary business due to the negligence or deliberate concealment of business personnel, causing unnecessary losses. These security risks are basically caused by inaccurate and incomplete analysis of the user's video data entered in real time.

In summary, there are currently serious technical problems with inaccurate and incomplete real-time data analysis.

Technical problem

In view of this, the embodiments of the present application provide a real-time data verification method and server to solve the problem of inaccurate and incomplete real-time data analysis.

Technical solutions

The first aspect of the embodiments of the present application provides a method for verifying real-time data, including: after receiving login information of a user, determining user identity data corresponding to the login information, parsing the key code contained in the login information, and determining multiple verification keywords corresponding to the user based on the key code; receiving real-time data of the user, and dividing the real-time data into a real-time video component and a real-time audio component, the real-time video component including face data; calculating the similarity between the face data contained in the real-time video component and the user identity data, and, if the similarity between the face data and the user identity data is not less than a preset similarity threshold, calculating the number of verification keywords contained in the real-time audio component within a preset time period; if the number of verification keywords contained in the real-time audio component is not greater than a first number threshold but greater than a second number threshold, selecting one of multiple asynchronous processing servers as the selected server, and sending the real-time data to the selected server so that the selected server performs non-real-time verification on the real-time data; and if the number of verification keywords contained in the real-time audio component is greater than the first number threshold, determining that the real-time data passes the verification.

Beneficial effect

In the embodiment of this application, user identity data and verification keywords are obtained through login information, real-time data of the user is received, the real-time data is divided into a real-time video component and a real-time audio component, and the similarity between the face data contained in the real-time video component and the user identity data is calculated to verify the identity of the user. If the similarity is not less than the preset similarity threshold, the number of verification keywords contained in the real-time audio component within the preset time period is calculated to verify whether the real-time data is legitimate. If the number of verification keywords contained in the real-time audio component is greater than the first number threshold, it is determined that the real-time data has passed the verification. The real-time data is thereby analyzed intelligently, and the security of remote business processing is improved.

Description of the drawings

FIG. 1 is an implementation flowchart of a method for verifying real-time data provided by an embodiment of the present application;

FIG. 2 is a specific implementation flowchart of step S101 of the method for verifying real-time data provided by an embodiment of the present application;

FIG. 3 is a specific implementation flowchart of step S103 of the method for verifying real-time data provided by an embodiment of the present application;

FIG. 4 is a specific implementation flowchart of step S107 of the method for verifying real-time data provided by an embodiment of the present application;

FIG. 5 is a structural block diagram of a real-time data verification device provided by an embodiment of the present application;

FIG. 6 is a schematic diagram of a server provided by an embodiment of the present application.

Embodiments of the invention

Fig. 1 shows an implementation process of a method for verifying real-time data provided by an embodiment of the present application. The process of the method includes steps S101 to S108. The specific implementation principle of each step is as follows.

In S101, after receiving the login information of the user, determine the user identity data corresponding to the login information, parse the key code contained in the login information, and determine multiple verification keywords corresponding to the user based on the key code.

In the embodiment of this application, when a user handles online business remotely, the user needs to communicate with business personnel by video after logging in. After verifying the user's login information, the server performs artificial intelligence analysis on the collected video of the user and the business personnel (i.e., the real-time data) to determine whether there are security risks in the business process. In general, the embodiments of this application verify the real-time data in two ways: by automatically verifying whether the user in the real-time data matches the user identity data corresponding to the login information, and by automatically verifying whether the real-time data contains enough verification keywords, so as to ensure the security of business processing.

Understandably, after receiving the login information, the server verifies the login name and password contained in the login information. Once the verification is passed, the user identity data is determined according to the login name, where the user identity data includes a facial photo of the user. In addition, when the same user logs in on different business interfaces, or selects different business functions when logging in, the key codes contained in the login information differ, so the key code can be used to distinguish different businesses. Conversely, when different types of users log in on the same business interface, the key codes contained in the login information may also differ. For example, users can be divided into minor users, adult users, and elderly users; because their ability to judge risk differs, the key codes contained in the login information differ even when they log in to the same business.

As mentioned above, one of the verification dimensions applied to the real-time data in the embodiment of this application is to automatically verify whether the real-time data contains enough verification keywords. Therefore, it is first necessary to determine, based on the key code, which verification keywords the real-time data needs to contain. Understandably, during online business processing, for example with elderly users, business personnel need to inform the user of the business risks, and the user must confirm these risks. Because the real-time data is a recording of the entire communication process, qualified real-time data must contain a certain number of verification keywords.

Obviously, whether a certain number of verification keywords are included in a piece of real-time data indicates whether the user has learned the corresponding risk warning from the business personnel, and whether he has made some affirmative answers.

Optionally, a plurality of keywords corresponding to the key code contained in the login information can be determined through the correspondence between the preset key code and the verification keyword.
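As a minimal sketch of such a preset correspondence, a lookup table can map each key code to its verification keywords. The key code names and the grouping of words below are hypothetical and chosen only for illustration; the source does not specify them.

```python
# Hypothetical preset correspondence between key codes and verification keywords.
# The key code names and word groupings are illustrative assumptions.
KEYWORDS_BY_KEY_CODE = {
    "ELDERLY_WEALTH": ["risk", "losing money", "not recommended", "clear"],
    "ADULT_WEALTH": ["risk", "clear"],
}


def keywords_for(key_code):
    """Return the verification keywords preset for a key code (empty if unknown)."""
    return list(KEYWORDS_BY_KEY_CODE.get(key_code, []))
```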

Optionally, the verification keywords corresponding to a key code can be determined by predicting how often the words corresponding to that key code will occur in the future. As an embodiment of the present application, as shown in FIG. 2, the above S101 includes:

S1011: Retrieve a word set corresponding to the key code received in a preset period, and each word in the word set corresponds to a receiving moment.

In the embodiment of this application, the process of generating the word set corresponding to a key code is as follows: business personnel extract the real-time data recorded during business transactions with users within a preset period (for example, within one month) as specimen data, determine the key code corresponding to each piece of specimen data, and manually screen out from the specimen data the words that can be used to ensure the security of business processing, thereby determining multiple words corresponding to a key code. In addition, since each piece of specimen data has a receiving time, all words extracted from that piece of specimen data correspond to that receiving time.

Obviously, by extracting words from multiple pieces of specimen data corresponding to one key code, the business personnel can obtain more words corresponding to the key code, thereby generating the word set corresponding to the key code. It is worth noting that the embodiments of the present application do not merge identical words, so the word set contains a large number of repeated words, each corresponding to a receiving time.

S1012: Establish a correspondence between receiving time periods and the number of occurrences of each word according to the correspondence between each word in the word set and its receiving time. A receiving time period includes multiple receiving times, and the number of occurrences of a word is the number of times the word occurs in the word set within one receiving time period.

Understandably, since a word set contains many repeated words, and each word corresponds to a receiving time, it is possible to count the number of occurrences of a given word in the word set during a receiving time period. After these statistics are compiled for multiple receiving time periods, the correspondence between the receiving time periods and the number of occurrences of each word is obtained. For example, during the period from January 1 to January 5 (one receiving time period), the word "losing money" occurs 10 times, the word "risk" occurs 8 times, the word "not recommended" occurs 9 times, the word "clear" occurs 20 times, and so on.
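The counting in S1012 can be sketched as follows. This is an illustration under stated assumptions: the word set is represented as (word, date) pairs, receiving time periods are fixed 5-day windows numbered from 1 in chronological order, and the year 2019 and all function names are hypothetical.

```python
from collections import Counter
from datetime import date


def period_index(received, start, period_days=5):
    """1-based sequence number of the receiving time period containing `received`."""
    return (received - start).days // period_days + 1


def count_by_period(word_set, start, period_days=5):
    """Map (period number, word) -> occurrences; word_set holds (word, date) pairs."""
    counts = Counter()
    for word, received in word_set:
        counts[(period_index(received, start, period_days), word)] += 1
    return counts


# Repeated words are kept on purpose, as in the word set described above.
word_set = [
    ("risk", date(2019, 1, 1)),
    ("risk", date(2019, 1, 4)),
    ("clear", date(2019, 1, 5)),
    ("risk", date(2019, 1, 6)),
]
counts = count_by_period(word_set, start=date(2019, 1, 1))
```

Here January 1 to January 5 falls into period 1 and January 6 opens period 2, so "risk" is counted twice in period 1 and once in period 2.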

S1013: Fit a regression equation that characterizes the correspondence between the number of occurrences of the word and the receiving time period.

Optionally, through the regression model:

Fit a regression equation that characterizes the correspondence between the number of occurrences of the word and the receiving time period, and generate a regression equation corresponding to each word in the word set, where num represents the number of occurrences of the word, time represents the sequence number of the receiving time period, pre1 and pre2 are the two coefficients of the nonlinear regression equation, and e is the natural constant.

Understandably, taking the sequence number of the receiving time period as the independent variable and the number of occurrences of a given word as the dependent variable, the coefficients of the regression model can be obtained with existing methods for solving nonlinear regression equations, which are not detailed here.

It should be noted that the sequence number of a receiving time period indicates its position in chronological order. For example, if there are 5 receiving time periods in total, namely January 1 to January 5, January 6 to January 10, January 11 to January 15, January 16 to January 20, and January 21 to January 25, then the sequence number of the receiving time period from January 1 to January 5 is 1, the sequence number of the receiving time period from January 6 to January 10 is 2, and so on.

It is worth noting that in the embodiments of the present application, each word corresponds to a regression equation.

S1014: Based on the regression equation corresponding to each word in the word set, calculate the number of occurrences of each word within a preset number of receiving time periods after the current moment as the predicted number of occurrences corresponding to each word.

Illustratively, assuming that the sequence number of the receiving time period containing the current moment is 10 and the preset number is 5, the regression equation corresponding to each word is evaluated with the independent variable set to 11, 12, 13, 14, and 15, and the sum of the corresponding dependent variables (i.e., the sum of the numbers of occurrences) is used as the predicted number of occurrences corresponding to that word.
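The fitting and prediction in S1013 and S1014 can be sketched as below. The regression formula itself is not reproduced in this text, so the sketch assumes the exponential form num = pre1 · e^(pre2 · time), which is only one form consistent with the stated symbols. It is fitted by ordinary least squares on ln(num), a common linearization rather than necessarily the solver intended by the source, and the function names are hypothetical.

```python
import math


def fit_exponential(times, nums):
    """Least-squares fit of ln(num) = ln(pre1) + pre2 * time; requires nums > 0.

    Assumed model: num = pre1 * e**(pre2 * time).
    """
    n = len(times)
    logs = [math.log(v) for v in nums]
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    pre2 = sxy / sxx
    pre1 = math.exp(mean_l - pre2 * mean_t)
    return pre1, pre2


def predicted_occurrences(pre1, pre2, current_period, horizon):
    """Sum the fitted counts over the `horizon` periods after `current_period`."""
    return sum(
        pre1 * math.exp(pre2 * t)
        for t in range(current_period + 1, current_period + horizon + 1)
    )


# Synthetic counts generated from known coefficients, to check the fit.
times = [1, 2, 3, 4, 5]
nums = [2.0 * math.exp(0.1 * t) for t in times]
pre1, pre2 = fit_exponential(times, nums)
total = predicted_occurrences(pre1, pre2, current_period=10, horizon=5)
```

With `current_period=10` and `horizon=5`, the sum runs over sequence numbers 11 to 15, matching the example above. A word that never occurs in some period would need separate handling, since the logarithm of zero is undefined.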

S1015: Select words whose predicted number of occurrences is not less than a preset number threshold as verification keywords corresponding to the user.
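The selection in S1015 reduces to a threshold filter. In the sketch below, the predicted counts and the threshold value are made-up numbers used only for illustration.

```python
def select_verification_keywords(predicted, threshold):
    """Keep words whose predicted occurrence count is not less than the threshold."""
    return [word for word, count in predicted.items() if count >= threshold]


# Hypothetical predicted occurrence counts per word.
predicted = {"risk": 42.0, "clear": 97.5, "weather": 3.2}
keywords = select_verification_keywords(predicted, threshold=40)
```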

In S102, real-time data of the user is received, and the real-time data is divided into a real-time video component and a real-time audio component, and the real-time video component includes face data.

As mentioned above, the real-time data of the user in the embodiment of the present application is the video collected when the user communicates with the business personnel. The video can be divided into a real-time video component and a real-time audio component, where the real-time video component contains only image information and no voice information, and the real-time audio component contains only voice information and no image information.

Obviously, since the real-time data is the user-side video collected during the communication between the user and the business personnel, the real-time video component contains the user's face data, and the real-time audio component contains the voice data of both the user and the business personnel.

In S103, the similarity between the face data contained in the real-time video component and the user identity data is calculated.

In the embodiment of this application, it is necessary to first calculate the similarity between the face data contained in the real-time video component and the user identity data to determine whether the user currently handling the business remotely is the user corresponding to the login information, so as to avoid security risks caused by leaked login passwords or account theft.

As an embodiment of the present application, as shown in FIG. 3, the foregoing S103 includes:

S1031: Extract one frame from the real-time video component as a target image.

Understandably, the real-time video component is composed of multiple frames, and the embodiment of the present application selects one frame as the target image for analysis.

S1032: Divide the target image into multiple image regions, read pixel point data of each pixel in the image region, and number the image regions according to a preset sequence.

Exemplarily, the target image can be divided into 100 equal parts along the horizontal direction and 100 equal parts along the vertical direction, so that the target image is divided into 10,000 image regions, each containing multiple pixels.

Optionally, the pixel point data may be the RGB value of one pixel point, that is, the RGB value of one pixel point can be represented by one pixel point data.
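The division and numbering in S1032 can be sketched as follows. The sketch assumes the image is a two-dimensional grid of RGB tuples whose height and width are divisible by the grid size, and it assumes row-by-row numbering starting at 1 as the "preset sequence"; the source does not fix that order, and the function name is hypothetical.

```python
def split_into_regions(image, rows, cols):
    """Split a 2-D grid of RGB tuples into rows*cols regions, numbered row by row from 1.

    Assumes the image height and width are divisible by rows and cols.
    """
    height, width = len(image), len(image[0])
    rh, cw = height // rows, width // cols
    regions = {}
    number = 1
    for r in range(rows):
        for c in range(cols):
            # Pixel point data of one region, read row by row.
            regions[number] = [
                image[y][x]
                for y in range(r * rh, (r + 1) * rh)
                for x in range(c * cw, (c + 1) * cw)
            ]
            number += 1
    return regions


# 4x4 toy image in which each pixel's RGB value encodes its position.
image = [[(y, x, 0) for x in range(4)] for y in range(4)]
regions = split_into_regions(image, rows=2, cols=2)
```

For the 100 by 100 division in the example above, `rows=100` and `cols=100` would yield regions numbered 1 to 10,000.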

S1033: Simultaneously input the pixel point data of a preset number of image regions with adjacent numbers into a preset VGG neural network model, and output the segmentation coefficient corresponding to the image region with the middle number, where the preset number is an odd number greater than 1.

In the embodiment of this application, the pixel point data of each image region is not input into the preset VGG neural network model individually. Because a face covers the image regions continuously, computing on isolated image regions may cause accidental errors. Therefore, the pixel point data of a preset number of image regions with adjacent numbers is input into the neural network model simultaneously (for example, the pixel point data of three consecutive image regions is input at the same time), and only the segmentation coefficient corresponding to the image region with the middle number is output. The segmentation coefficient is used to distinguish image regions covered by the face image from image regions not covered by the face image.

In the embodiment of the present application, a preset number of image regions with adjacent numbers form a region group, and the pixel point data of every pixel in each image region of the group is arranged and combined into a vector, which yields the feature vector of the region group.

Understandably, as long as the input parameters are set to the pixel point data of a preset number of adjacent image regions and the output parameter is set to the segmentation coefficient corresponding to the image region with the middle number during training of the VGG neural network model, the trained neural network model outputs the segmentation coefficient corresponding to the image region with the middle number as described above.

Specifically, the training process of the VGG neural network includes: obtaining the training feature vectors of multiple training region groups and the training segmentation coefficients of the image regions at the center of the training region groups; and repeating the following steps until the cross-entropy loss function value of the adjusted VGG neural network is less than a preset threshold: using the training feature vector as the input of the VGG neural network and the training segmentation coefficient as its output, updating the parameters of each layer in the fully connected layers of the VGG neural network by stochastic gradient descent, and calculating the cross-entropy loss function value of the adjusted VGG neural network. Finally, the VGG neural network whose cross-entropy loss function value is less than the preset threshold is output as the preset neural network model.

Illustratively, assuming there are 100 image regions in total, if the preset number is 3, the pixel data of the image regions numbered 1-3 are input into the neural network model, and the segmentation coefficient of the image region numbered 2 will be output; After the pixel data of the image areas numbered 2-4 are input to the neural network model, the segmentation coefficient of the image area numbered 3 will be output, and so on, until the segmentation coefficient of the image area numbered 99 is output.

It is worth noting that the segmentation coefficients of the image regions numbered 1 and 100 cannot be obtained by the above method. However, the ultimate purpose of obtaining the segmentation coefficients is to segment the face image from the target image, and the probability that face pixels cover the edges of the target image is extremely small. Therefore, although the embodiment of the present application cannot calculate the segmentation coefficients of the image regions with the largest and smallest numbers, this does not affect the segmentation of the face image in practical applications.
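The sliding-window evaluation described for S1033 can be sketched as below. The trained VGG model is replaced by an arbitrary callable, since the network itself is outside the scope of this illustration; `toy_model` and the function names are hypothetical stand-ins.

```python
def segmentation_coefficients(regions, model, window=3):
    """Slide a window of `window` adjacently numbered regions over `regions`.

    `regions` maps region number (1..N) to a flat feature list; `model` is any
    callable taking the concatenated features and returning one coefficient.
    Returns {middle region number: coefficient}; edge regions get no value.
    """
    if window <= 1 or window % 2 == 0:
        raise ValueError("window must be an odd number greater than 1")
    half = window // 2
    total = len(regions)
    result = {}
    for middle in range(1 + half, total - half + 1):
        group = []  # feature vector of the region group
        for number in range(middle - half, middle + half + 1):
            group.extend(regions[number])
        result[middle] = model(group)
    return result


# Stand-in for the trained VGG model: the mean of the inputs (illustration only).
def toy_model(features):
    return sum(features) / len(features)


regions = {n: [float(n)] for n in range(1, 101)}
coeffs = segmentation_coefficients(regions, toy_model, window=3)
```

With 100 regions and a window of 3, coefficients are produced for regions 2 to 99 only, which matches the observation above that regions 1 and 100 receive none.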

S1034: Use pixels corresponding to the segmentation coefficients that are less than a preset coefficient threshold as face pixels, and generate face data according to pixel point data of all the face pixels.
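A possible reading of S1034 is sketched below: since each segmentation coefficient belongs to one image region, the pixels of every region whose coefficient is below the threshold are treated as face pixels. That per-region interpretation, the threshold value, and the function name are assumptions made for illustration.

```python
def extract_face_data(regions, coefficients, threshold):
    """Collect pixel data from regions whose coefficient is below the threshold."""
    face = []
    for number in sorted(coefficients):
        if coefficients[number] < threshold:
            face.extend(regions[number])
    return face


# Toy regions with one RGB pixel each, and hypothetical coefficients.
regions = {1: [(0, 0, 0)], 2: [(10, 10, 10)], 3: [(20, 20, 20)], 4: [(30, 30, 30)]}
coefficients = {2: 0.2, 3: 0.9}
face_data = extract_face_data(regions, coefficients, threshold=0.5)
```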

S1035: Calculate the similarity between the matrix corresponding to the face data and the matrix corresponding to the user identity data by a distance formula, as the similarity between the face data included in the video component and the user identity data.

Optionally, the distance formula is:

where S is the similarity, X_i is the i-th element of the matrix corresponding to the face data, Y_i is the i-th element of the matrix corresponding to the user identity data, and K is the number of elements contained in the matrix corresponding to the face data and in the matrix corresponding to the user identity data.

In S104, if the similarity between the face data and the user identity data is less than a preset similarity threshold, it is determined that the real-time data fails the verification.

In S105, if the similarity between the face data and the user identity data is not less than a preset similarity threshold, the number of verification keywords included in the real-time audio component within the preset time period is calculated.

Optionally, the real-time audio component within the preset time period is converted into text data through a speech-to-text conversion algorithm; word segmentation is performed on the text data to generate a voice word set containing multiple words; and the number of verification keywords contained in the voice word set is calculated as the number of verification keywords contained in the real-time audio component.
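The keyword counting can be sketched as follows, starting from text that a speech-to-text step is assumed to have already produced. The regular-expression tokenizer is a naive stand-in for a real word segmentation algorithm, it handles single-word keywords only, and it counts every occurrence of a keyword; whether the source intends total occurrences or distinct keywords is not stated, so that choice is an assumption.

```python
import re


def segment_words(text):
    """Naive word segmentation for English text: lowercase alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def count_verification_keywords(text, keywords):
    """Count how many tokens of the transcript are verification keywords."""
    wanted = {k.lower() for k in keywords}
    return sum(1 for token in segment_words(text) if token in wanted)


# Hypothetical transcript of the real-time audio component.
transcript = "There is a risk here. Is the risk clear? Yes, it is clear."
count = count_verification_keywords(transcript, ["risk", "clear"])
```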

In S106, if the number of the verification keywords included in the real-time audio component is not greater than a second preset number threshold, it is determined that the real-time data fails the verification.

In S107, if the number of the verification keywords contained in the real-time audio component is not greater than the first preset number threshold but greater than the second preset number threshold, one of multiple asynchronous processing servers is selected as the selected server, and the real-time data is sent to the selected server so that the selected server performs non-real-time verification on the real-time data.

In the embodiment of the present application, when the real-time audio component does not contain enough keywords, it cannot be directly determined whether the real-time data passes verification, and an asynchronous processing server is needed for further automatic or manual analysis. Since the embodiment of this application describes the real-time data verification method only from the side of the main server, only how the main server selects an asynchronous processing server as the selected server and sends the real-time data to it is described here; how the selected server performs non-real-time verification of the real-time data is not covered.

As an embodiment of the present application, as shown in FIG. 4, the foregoing S107 includes:

S1071: Retrieve the number of threads of each of the multiple asynchronous processing servers, and count the number of abnormal tasks received from each asynchronous processing server within a unit time period before the current moment.

S1072: Calculate sending parameters corresponding to each asynchronous processing server through a segmented formula.

Optionally, the segmentation formula includes:

where K(i) represents the sending parameter corresponding to asynchronous processing server i, Z(i) represents the number of threads corresponding to asynchronous processing server i, and D(i) represents the number of abnormal tasks corresponding to asynchronous processing server i.

S1073: Calculate the sending ratio corresponding to each asynchronous processing server through a ratio calculation formula.

Optionally, the ratio calculation formula includes:

where Par_i represents the sending ratio corresponding to asynchronous processing server i, K(i) represents the sending parameter corresponding to asynchronous processing server i, and n is the number of asynchronous processing servers.

S1074: Select one of the asynchronous processing servers corresponding to the highest sending ratio as the selected server.
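Steps S1071 to S1074 can be sketched as below. Neither the piecewise formula for K(i) nor the ratio formula is reproduced in this text, so the sketch takes the sending parameter as a caller-supplied function and assumes the sending ratio is K(i) divided by the sum of all sending parameters. `toy_parameter` is a hypothetical piecewise rule used only to make the example runnable; it is not the formula of the source.

```python
def select_server(servers, sending_parameter):
    """Pick the server with the highest sending ratio.

    `servers` maps server id -> (thread count Z, abnormal task count D).
    `sending_parameter` is a caller-supplied function K(Z, D).
    The ratio K(i) / sum of all K(j) is an assumed normalization.
    """
    params = {sid: sending_parameter(z, d) for sid, (z, d) in servers.items()}
    total = sum(params.values())
    if total <= 0:
        raise ValueError("no server has a positive sending parameter")
    ratios = {sid: k / total for sid, k in params.items()}
    selected = max(ratios, key=ratios.get)
    return selected, ratios


# Hypothetical piecewise parameter: zero once abnormal tasks reach the thread count.
def toy_parameter(threads, abnormal):
    if abnormal >= threads:
        return 0.0
    return (threads - abnormal) / threads


servers = {"a": (8, 2), "b": (16, 2), "c": (4, 4)}
selected, ratios = select_server(servers, toy_parameter)
```

Under this toy rule, server "b" has the largest share of healthy threads and is selected, while server "c" receives a ratio of zero.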

In S108, if the number of the verification keywords contained in the real-time audio component is greater than the first preset number threshold, it is determined that the real-time data passes verification.
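The branches S104 to S108 can be summarized in one decision function. The sketch assumes the first number threshold is larger than the second, and the return labels and threshold values are hypothetical.

```python
def verify(similarity, keyword_count, similarity_threshold,
           first_threshold, second_threshold):
    """Return 'fail', 'async', or 'pass' following the branches S104 to S108.

    Assumes first_threshold > second_threshold.
    """
    if similarity < similarity_threshold:
        return "fail"   # S104: face does not match the user identity data
    if keyword_count > first_threshold:
        return "pass"   # S108: enough verification keywords
    if keyword_count > second_threshold:
        return "async"  # S107: hand over for non-real-time verification
    return "fail"       # S106: too few verification keywords


outcome = verify(0.9, 10, similarity_threshold=0.8,
                 first_threshold=10, second_threshold=5)
```

A count equal to the first threshold is "not greater than" it, so the example call is routed to an asynchronous processing server rather than passed directly.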

Understandably, user identity data and verification keywords are obtained through login information, real-time data of the user is received, the real-time data is divided into a real-time video component and a real-time audio component, and the similarity between the face data contained in the real-time video component and the user identity data is calculated to verify the user's identity. If the similarity is not less than the preset similarity threshold, the number of verification keywords contained in the real-time audio component within the preset time period is calculated to verify whether the real-time data is legitimate. If the number of verification keywords contained in the real-time audio component is greater than the first number threshold, it is determined that the real-time data has passed the verification. The real-time data is thereby analyzed intelligently, and the security of remote business processing is improved.

Corresponding to the real-time data verification method described in the above embodiment, FIG. 5 shows a structural block diagram of the real-time data verification device provided in the embodiment of the present application. For ease of description, only the parts related to the embodiment of the present application are shown.

Referring to Figure 5, the device includes:

The parsing module 501 is configured to, after receiving the user's login information, determine the user identity data corresponding to the login information, parse the key code contained in the login information, and determine multiple verification keywords corresponding to the user based on the key code;

The decomposition module 502 is configured to receive real-time data of the user, and divide the real-time data into a real-time video component and a real-time audio component, and the real-time video component includes face data;

The calculation module 503 is configured to calculate the similarity between the face data contained in the real-time video component and the user identity data, and, if the similarity between the face data and the user identity data is not less than a preset similarity threshold, calculate the number of verification keywords contained in the real-time audio component within a preset time period;

The first execution module 504 is configured to, if the number of the verification keywords contained in the real-time audio component is not greater than the first number threshold but greater than the second number threshold, select one of a plurality of asynchronous processing servers as the selected server, and send the real-time data to the selected server so that the selected server performs non-real-time verification on the real-time data;

The second execution module 505 is configured to determine that the real-time data passes the verification if the number of the verification keywords contained in the real-time audio component is greater than the first number threshold.

Optionally, the parsing module is specifically used for:

Retrieve the word set corresponding to the key code received in a preset period, and each word in the word set corresponds to a receiving moment;

According to the correspondence between each word in the word set and its receiving time, establish a correspondence between receiving time periods and the number of occurrences of each word, where a receiving time period includes multiple receiving times, and the number of occurrences of a word is the number of times the word occurs in the word set within one receiving time period;

Through the regression model:

Fit a regression equation that characterizes the correspondence between the number of occurrences of the word and the receiving time period, and generate a regression equation corresponding to each word in the word set, where num represents the number of occurrences of the word, time represents the sequence number of the receiving time period, pre1 and pre2 are the two coefficients of the nonlinear regression equation, and e is the natural constant;

Based on the regression equation corresponding to each word in the word set, calculate the number of occurrences of each word within a preset number of receiving time periods after the current moment as the predicted number of occurrences corresponding to each word;

Select the words whose predicted number of occurrences is not less than a preset number threshold as the verification keywords corresponding to the user.

Optionally, the calculation module is specifically used for:

Intercept a frame of image from the real-time video component as a target image;

Dividing the target image into a plurality of image areas, reading the pixel point data of each pixel in the image area, and numbering the image areas according to a preset sequence;

Simultaneously input the pixel point data of a preset number of image areas with adjacent numbers into the preset VGG neural network model, and output the segmentation coefficient corresponding to the image area with the middle number, where the preset number is an odd number greater than 1;

Taking the pixel points corresponding to the segmentation coefficients smaller than the preset coefficient threshold as the face pixels, and generating face data according to the pixel point data of all the face pixels;

By formula:

Calculate the similarity between the matrix corresponding to the face data and the matrix corresponding to the user identity data as the similarity between the face data contained in the video component and the user identity data, where the S is the Similarity, the X _i is the i-th element in the matrix corresponding to the face data, the Y _i is the i-th element in the matrix corresponding to the user identity data, and the K is the person The matrix corresponding to face data and the number of elements contained in the matrix corresponding to user identity data.

Optionally, the calculating the number of the verification keywords contained in the real-time audio component within a preset time period includes: converting the real-time audio component within the preset time period into text data by using a speech-to-text conversion algorithm; performing word segmentation processing on the text data to generate a voice word set containing multiple words; and calculating the number of the verification keywords included in the voice word set as the number of the verification keywords contained in the real-time audio component.
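
A minimal sketch of the counting step, assuming the audio has already been converted to text. The source does not specify the word segmentation algorithm or whether repeated keywords are counted once or every time, so the regular-expression tokenizer and the per-occurrence counting below are assumptions, and the function names are hypothetical.

```python
import re

def segment_words(text):
    """Split text into lowercase words.

    ASSUMPTION: a simple regular-expression split stands in for a real
    word-segmentation algorithm, which the source does not specify.
    """
    return re.findall(r"\w+", text.lower())

def count_verification_keywords(text, keywords):
    """Count the words in the voice word set that are verification keywords."""
    keyword_set = {k.lower() for k in keywords}
    voice_words = segment_words(text)
    # ASSUMPTION: every occurrence counts, not only distinct keywords.
    return sum(1 for word in voice_words if word in keyword_set)
```

The returned count is the value that would then be compared against the first and second number thresholds.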

Optionally, the selecting one of multiple asynchronous processing servers as the selected server includes:

Retrieve the number of threads contained in each of the multiple asynchronous processing servers, and count the number of abnormal tasks received from each asynchronous processing server in the unit time period before the current moment;

By formula:

Calculate the sending parameter corresponding to each asynchronous processing server, where K(i) represents the sending parameter corresponding to asynchronous processing server i, Z(i) represents the number of threads corresponding to asynchronous processing server i, and D(i) represents the number of abnormal tasks corresponding to asynchronous processing server i;

By formula:

Calculate the sending ratio corresponding to each asynchronous processing server, where Par_i represents the sending ratio corresponding to asynchronous processing server i, K(i) represents the sending parameter corresponding to asynchronous processing server i, and n is the number of asynchronous processing servers; and select the asynchronous processing server corresponding to the highest sending ratio as the selected server.
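
The server-selection steps can be sketched as follows. Neither formula is reproduced in the text, so both are assumptions here: the sending parameter K(i) is supplied by the caller, with a purely hypothetical default of Z(i) / (1 + D(i)), and the sending ratio Par_i is assumed to be K(i) divided by the sum of K over all n servers. All function names are illustrative.

```python
def default_sending_parameter(threads, abnormal_tasks):
    # HYPOTHETICAL stand-in for K(i): more threads raise it,
    # more abnormal tasks lower it. Not the formula from the source.
    return threads / (1 + abnormal_tasks)

def select_server(servers, sending_parameter=default_sending_parameter):
    """Return the selected server id and the sending ratio of every server.

    servers maps a server id to (Z, D): thread count and abnormal-task count.
    ASSUMPTION: Par_i = K(i) / sum of K over all servers.
    """
    if not servers:
        raise ValueError("no asynchronous processing servers available")
    k = {sid: sending_parameter(z, d) for sid, (z, d) in servers.items()}
    total = sum(k.values())
    if total <= 0:
        raise ValueError("sending parameters must sum to a positive value")
    ratios = {sid: value / total for sid, value in k.items()}
    # The server with the highest sending ratio becomes the selected server.
    selected = max(ratios, key=ratios.get)
    return selected, ratios
```

Passing the sending parameter as a function keeps the selection logic independent of whichever K(i) formula is actually used.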

FIG. 6 is a schematic diagram of a server provided by an embodiment of the present application. As shown in FIG. 6, the server 6 of this embodiment includes a processor 60, a memory 61, and computer-readable instructions 62, such as a real-time data verification program, stored in the memory 61 and executable on the processor 60. When the processor 60 executes the computer-readable instructions 62, the steps in the foregoing embodiments of the real-time data verification method are implemented, such as steps 101 to 108 shown in FIG. 1. Alternatively, when the processor 60 executes the computer-readable instructions 62, the functions of the modules/units in the foregoing device embodiments, such as the functions of the units 501 to 505 shown in FIG. 5, are implemented.

Exemplarily, the computer-readable instructions 62 may be divided into one or more modules/units, which are stored in the memory 61 and executed by the processor 60 to complete this application. The one or more modules/units may be a series of computer-readable instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer-readable instructions 62 in the server 6.

The server 6 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The server may include, but is not limited to, the processor 60 and the memory 61. Those skilled in the art can understand that FIG. 6 is only an example of the server 6 and does not constitute a limitation on the server 6; the server may include more or fewer components than shown, a combination of certain components, or different components. For example, the server may also include input and output devices, network access devices, buses, and the like.

The processor 60 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.

The memory 61 may be an internal storage unit of the server 6, such as a hard disk or an internal memory of the server 6. The memory 61 may also be an external storage device of the server 6, for example, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the server 6. Further, the memory 61 may also include both an internal storage unit of the server 6 and an external storage device. The memory 61 is used to store the computer-readable instructions and other programs and data required by the server. The memory 61 can also be used to temporarily store data that has been output or will be output.

Those skilled in the art can clearly understand that, for convenience and conciseness of description, only the division of the above-mentioned functional units and modules is used as an example. In practical applications, the above-mentioned functions can be allocated to different functional units and modules as required; that is, the internal structure of the device is divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments can be integrated into one processing unit, or each unit can exist alone physically, or two or more units can be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units. In addition, the specific names of the functional units and modules are only used to distinguish them from each other, and are not used to limit the protection scope of the present application. For the specific working process of the units and modules in the foregoing system, reference may be made to the corresponding process in the foregoing method embodiment, which is not repeated here.

In the above-mentioned embodiments, the description of each embodiment has its own emphasis. For parts that are not described in detail or recorded in an embodiment, reference may be made to related descriptions of other embodiments.

The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above-mentioned embodiments of this application can also be completed by instructing relevant hardware through computer-readable instructions, and the computer-readable instructions can be stored in a computer-readable storage medium.

A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above-mentioned embodiments can be implemented by instructing relevant hardware through computer-readable instructions, which can be stored in a non-volatile computer-readable storage medium; when the computer-readable instructions are executed, they may include the processes of the above-mentioned method embodiments. Any reference to memory, storage, database or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

The above-mentioned embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of the technical features can be equivalently replaced; these modifications or replacements do not make the essence of the corresponding technical solutions deviate from the spirit and scope of the technical solutions of the embodiments of this application, and should be included within the scope of protection of this application.

Claims

A method for verifying real-time data, which is characterized in that it includes:

After receiving the user's login information, determine the user identity data corresponding to the login information, analyze the key code contained in the login information, and determine multiple verification keywords corresponding to the user based on the key code;

Receiving real-time data of the user, dividing the real-time data into a real-time video component and a real-time audio component, and the real-time video component includes face data;

Calculate the similarity between the face data contained in the real-time video component and the user identity data, and if the similarity between the face data and the user identity data is not less than a preset similarity threshold, then calculate the number of the verification keywords contained in the real-time audio component within a preset time period;

If the number of the verification keywords contained in the real-time audio component is not greater than the first preset number threshold but greater than the second preset number threshold, then one of multiple asynchronous processing servers is selected as the selected server, and the real-time data is sent to the selected server to perform non-real-time verification of the real-time data through the selected server, where the second preset number threshold is less than the first preset number threshold;

If the number of the verification keywords contained in the real-time audio component is greater than the first preset number threshold, it is determined that the real-time data passes verification.
The method for verifying real-time data according to claim 1, wherein said determining a plurality of verification keywords corresponding to said user based on said key code comprises:

Retrieve the word set corresponding to the key code received in a preset period, and each word in the word set corresponds to a receiving moment;

According to the correspondence between each word in the word set and its receiving moment, establish the correspondence between the receiving time period and the number of occurrences of each word, where each receiving time period includes multiple receiving moments, and the number of occurrences of a word is the number of times the word appears in the word set within one receiving time period;

Through the regression model:
Fit a regression equation that characterizes the correspondence between the number of occurrences of each word and the receiving time period, and generate a regression equation corresponding to each word in the word set, where num represents the number of occurrences of the word, time represents the number of the receiving time period, pre1 and pre2 are the two coefficients of the nonlinear regression equation, and e is the natural constant;

Based on the regression equation corresponding to each word in the word set, calculate the number of occurrences of each word within a preset number of receiving time periods after the current moment as the predicted number of occurrences corresponding to each word;

The words whose predicted occurrence times are not less than a preset threshold are selected as the verification keywords corresponding to the user.
The method for verifying real-time data according to claim 1, wherein the calculating the similarity between the face data contained in the real-time video component and the user identity data comprises:

Extract one frame from the real-time video component as a target image;

Dividing the target image into a plurality of image areas, reading the pixel point data of each pixel in the image area, and numbering the image areas according to a preset sequence;

Simultaneously input the pixel data of a preset number of consecutively numbered adjacent image areas into a preset VGG neural network model, and output the segmentation coefficient corresponding to the image area with the middle number, where the preset number is an odd number greater than 1;

Taking the pixel points corresponding to the segmentation coefficients smaller than the preset coefficient threshold as the face pixels, and generating face data according to the pixel point data of all the face pixels;

By formula:
Calculate the similarity between the matrix corresponding to the face data and the matrix corresponding to the user identity data as the similarity between the face data contained in the video component and the user identity data, where S is the similarity, X_i is the i-th element in the matrix corresponding to the face data, Y_i is the i-th element in the matrix corresponding to the user identity data, and K is the number of elements contained in the matrix corresponding to the face data and in the matrix corresponding to the user identity data.
The method for verifying real-time data according to claim 1, wherein said calculating the number of said verification keywords contained in said real-time audio component within a preset time period comprises:

Converting the real-time audio component in the preset time period into text data by using a speech-to-text conversion algorithm;

Performing word segmentation processing on the text data to generate a set of voice words, the set of voice words containing multiple words;

The number of the verification keywords included in the voice word set is calculated as the number of the verification keywords included in the real-time audio component.
The method for verifying real-time data according to claim 1, wherein the selecting one of a plurality of asynchronous processing servers as the selected server comprises:

Retrieve the number of threads contained in each of the multiple asynchronous processing servers, and count the number of abnormal tasks received from each asynchronous processing server in the unit time period before the current moment;

By formula:
Calculate the sending parameter corresponding to each asynchronous processing server, where K(i) represents the sending parameter corresponding to asynchronous processing server i, Z(i) represents the number of threads corresponding to asynchronous processing server i, and D(i) represents the number of abnormal tasks corresponding to asynchronous processing server i;

By formula:
Calculate the sending ratio corresponding to each asynchronous processing server, where Par_i represents the sending ratio corresponding to asynchronous processing server i, K(i) represents the sending parameter corresponding to asynchronous processing server i, and n is the number of asynchronous processing servers;

Choose one of the asynchronous processing servers corresponding to the highest sending ratio as the selected server.
A real-time data verification device, characterized in that the device includes:

The parsing module is used to determine the user identity data corresponding to the login information after receiving the login information of the user, analyze the key code contained in the login information, and determine multiple verification keywords corresponding to the user based on the key code;

The decomposition module is configured to receive real-time data of the user, and divide the real-time data into a real-time video component and a real-time audio component, and the real-time video component includes face data;

The calculation module is used to calculate the similarity between the face data contained in the real-time video component and the user identity data, and if the similarity between the face data and the user identity data is not less than a preset similarity threshold, calculate the number of the verification keywords contained in the real-time audio component within a preset time period;

The first execution module is configured to, if the number of the verification keywords contained in the real-time audio component is not greater than the first number threshold but greater than the second number threshold, select one of a plurality of asynchronous processing servers as the selected server and send the real-time data to the selected server to perform non-real-time verification of the real-time data through the selected server;

The second execution module is configured to determine that the real-time data passes the verification if the number of the verification keywords contained in the real-time audio component is greater than a first number threshold.
The real-time data verification device according to claim 6, wherein the parsing module is specifically used for:

Retrieve the word set corresponding to the key code received in a preset period, and each word in the word set corresponds to a receiving moment;

According to the correspondence between each word in the word set and its receiving moment, establish the correspondence between the receiving time period and the number of occurrences of each word, where each receiving time period includes multiple receiving moments, and the number of occurrences of a word is the number of times the word appears in the word set within one receiving time period;

Through the regression model:
Fit a regression equation that characterizes the correspondence between the number of occurrences of each word and the receiving time period, and generate a regression equation corresponding to each word in the word set, where num represents the number of occurrences of the word, time represents the number of the receiving time period, pre1 and pre2 are the two coefficients of the nonlinear regression equation, and e is the natural constant;

Based on the regression equation corresponding to each word in the word set, calculate the number of occurrences of each word within a preset number of receiving time periods after the current moment as the predicted number of occurrences corresponding to each word;

The words whose predicted occurrence times are not less than a preset threshold are selected as the verification keywords corresponding to the user.
The real-time data verification device according to claim 6, wherein the calculation module is specifically configured to:

Extract one frame from the real-time video component as a target image;

Dividing the target image into a plurality of image areas, reading the pixel point data of each pixel in the image area, and numbering the image areas according to a preset sequence;

Simultaneously input the pixel data of a preset number of consecutively numbered adjacent image areas into the preset VGG neural network model, and output the segmentation coefficient corresponding to the image area with the middle number, where the preset number is an odd number greater than 1;

Taking the pixel points corresponding to the segmentation coefficients smaller than the preset coefficient threshold as the face pixels, and generating face data according to the pixel point data of all the face pixels;

By formula:
Calculate the similarity between the matrix corresponding to the face data and the matrix corresponding to the user identity data as the similarity between the face data contained in the video component and the user identity data, where S is the similarity, X_i is the i-th element in the matrix corresponding to the face data, Y_i is the i-th element in the matrix corresponding to the user identity data, and K is the number of elements contained in the matrix corresponding to the face data and in the matrix corresponding to the user identity data.
The real-time data verification device according to claim 6, wherein the calculation module is specifically configured to:

Converting the real-time audio component in the preset time period into text data by using a speech-to-text conversion algorithm;

Performing word segmentation processing on the text data to generate a set of voice words, the set of voice words containing multiple words;

The number of the verification keywords included in the voice word set is calculated as the number of the verification keywords included in the real-time audio component.
The real-time data verification device according to any one of claims 6-8, wherein the first execution module is specifically configured to:

Retrieve the number of threads contained in each of the multiple asynchronous processing servers, and count the number of abnormal tasks received from each asynchronous processing server in the unit time period before the current moment;

By formula:
Calculate the sending parameter corresponding to each asynchronous processing server, where K(i) represents the sending parameter corresponding to asynchronous processing server i, Z(i) represents the number of threads corresponding to asynchronous processing server i, and D(i) represents the number of abnormal tasks corresponding to asynchronous processing server i;

By formula:
Calculate the sending ratio corresponding to each asynchronous processing server, where Par_i represents the sending ratio corresponding to asynchronous processing server i, K(i) represents the sending parameter corresponding to asynchronous processing server i, and n is the number of asynchronous processing servers;

Choose one of the asynchronous processing servers corresponding to the highest sending ratio as the selected server.
A terminal device, characterized by comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:

After receiving the user's login information, determine the user identity data corresponding to the login information, analyze the key code contained in the login information, and determine multiple verification keywords corresponding to the user based on the key code;

Receiving real-time data of the user, dividing the real-time data into a real-time video component and a real-time audio component, and the real-time video component includes face data;

Calculate the similarity between the face data contained in the real-time video component and the user identity data, and if the similarity between the face data and the user identity data is not less than a preset similarity threshold, then calculate the number of the verification keywords contained in the real-time audio component within a preset time period;

If the number of the verification keywords contained in the real-time audio component is not greater than the first preset number threshold but greater than the second preset number threshold, then one of multiple asynchronous processing servers is selected as the selected server, and the real-time data is sent to the selected server to perform non-real-time verification of the real-time data through the selected server, where the second preset number threshold is less than the first preset number threshold;

If the number of the verification keywords contained in the real-time audio component is greater than the first preset number threshold, it is determined that the real-time data passes verification.
The terminal device according to claim 11, wherein the determining a plurality of verification keywords corresponding to the user based on the key code comprises:

Retrieve the word set corresponding to the key code received in a preset period, and each word in the word set corresponds to a receiving moment;

According to the correspondence between each word in the word set and its receiving moment, establish the correspondence between the receiving time period and the number of occurrences of each word, where each receiving time period includes multiple receiving moments, and the number of occurrences of a word is the number of times the word appears in the word set within one receiving time period;

Through the regression model:
Fit a regression equation that characterizes the correspondence between the number of occurrences of each word and the receiving time period, and generate a regression equation corresponding to each word in the word set, where num represents the number of occurrences of the word, time represents the number of the receiving time period, pre1 and pre2 are the two coefficients of the nonlinear regression equation, and e is the natural constant;

Based on the regression equation corresponding to each word in the word set, calculate the number of occurrences of each word within a preset number of receiving time periods after the current moment as the predicted number of occurrences corresponding to each word;

The words whose predicted occurrence times are not less than a preset threshold are selected as the verification keywords corresponding to the user.
The terminal device according to claim 11, wherein the calculating the similarity between the face data contained in the real-time video component and the user identity data comprises:

Extract one frame from the real-time video component as a target image;

Dividing the target image into a plurality of image areas, reading the pixel point data of each pixel in the image area, and numbering the image areas according to a preset sequence;

Simultaneously input the pixel data of a preset number of consecutively numbered adjacent image areas into a preset VGG neural network model, and output the segmentation coefficient corresponding to the image area with the middle number, where the preset number is an odd number greater than 1;

Taking the pixel points corresponding to the segmentation coefficients smaller than the preset coefficient threshold as the face pixels, and generating face data according to the pixel point data of all the face pixels;

By formula:
Calculate the similarity between the matrix corresponding to the face data and the matrix corresponding to the user identity data as the similarity between the face data contained in the video component and the user identity data, where S is the similarity, X_i is the i-th element in the matrix corresponding to the face data, Y_i is the i-th element in the matrix corresponding to the user identity data, and K is the number of elements contained in the matrix corresponding to the face data and in the matrix corresponding to the user identity data.
The terminal device according to claim 11, wherein said calculating the number of said verification keywords contained in said real-time audio component within a preset time period comprises:

Converting the real-time audio component in the preset time period into text data by using a speech-to-text conversion algorithm;

Performing word segmentation processing on the text data to generate a set of voice words, the set of voice words containing multiple words;

The number of the verification keywords included in the voice word set is calculated as the number of the verification keywords included in the real-time audio component.
The terminal device according to claim 11, wherein said selecting one of a plurality of asynchronous processing servers as the selected server comprises:

Retrieve the number of threads contained in each of the multiple asynchronous processing servers, and count the number of abnormal tasks received from each asynchronous processing server in the unit time period before the current moment;

By formula:
Calculate the sending parameter corresponding to each asynchronous processing server, where K(i) represents the sending parameter corresponding to asynchronous processing server i, Z(i) represents the number of threads corresponding to asynchronous processing server i, and D(i) represents the number of abnormal tasks corresponding to asynchronous processing server i;

By formula:
Calculate the sending ratio corresponding to each asynchronous processing server, where Par_i represents the sending ratio corresponding to asynchronous processing server i, K(i) represents the sending parameter corresponding to asynchronous processing server i, and n is the number of asynchronous processing servers;

Choose one of the asynchronous processing servers corresponding to the highest sending ratio as the selected server.
A computer non-volatile readable storage medium, the computer non-volatile readable storage medium storing computer readable instructions, wherein the computer readable instructions are executed by a processor to implement the following steps:

After receiving the user's login information, determine the user identity data corresponding to the login information, analyze the key code contained in the login information, and determine multiple verification keywords corresponding to the user based on the key code;

Receiving real-time data of the user, dividing the real-time data into a real-time video component and a real-time audio component, and the real-time video component includes face data;

Calculate the similarity between the face data contained in the real-time video component and the user identity data, and if the similarity between the face data and the user identity data is not less than a preset similarity threshold, then calculate the number of the verification keywords contained in the real-time audio component within a preset time period;

If the number of the verification keywords contained in the real-time audio component is not greater than the first preset number threshold but greater than the second preset number threshold, then one of multiple asynchronous processing servers is selected as the selected server, and the real-time data is sent to the selected server to perform non-real-time verification of the real-time data through the selected server, where the second preset number threshold is less than the first preset number threshold;

If the number of the verification keywords contained in the real-time audio component is greater than the first preset number threshold, it is determined that the real-time data passes verification.
The computer non-volatile readable storage medium of claim 16, wherein the determining a plurality of verification keywords corresponding to the user based on the key code comprises:

Retrieve the word set corresponding to the key code received in a preset period, and each word in the word set corresponds to a receiving moment;

According to the correspondence between each word in the word set and its receiving moment, establish the correspondence between the receiving time period and the number of occurrences of each word, where each receiving time period includes multiple receiving moments, and the number of occurrences of a word is the number of times the word appears in the word set within one receiving time period;

Through the regression model:
Fit a regression equation that characterizes the correspondence between the number of occurrences of each word and the receiving time period, and generate a regression equation corresponding to each word in the word set, where num represents the number of occurrences of the word, time represents the number of the receiving time period, pre1 and pre2 are the two coefficients of the nonlinear regression equation, and e is the natural constant;

Based on the regression equation corresponding to each word in the word set, calculate the number of occurrences of each word within a preset number of receiving time periods after the current moment as the predicted number of occurrences corresponding to each word;

The words whose predicted occurrence times are not less than a preset threshold are selected as the verification keywords corresponding to the user.
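The keyword-selection steps above can be illustrated with a short sketch. The regression model itself is not reproduced in this text, so the sketch assumes the exponential form num = pre1 · e^(pre2 · time), which is only one model consistent with the stated symbols. It also assumes that the predicted count is summed over the future periods. All function names are hypothetical.

```python
import math
from collections import Counter


def count_by_period(receive_times, period_length):
    """Map each 1-based receiving-period index to the word's occurrence count."""
    return Counter(int(t // period_length) + 1 for t in receive_times)


def fit_exponential(points):
    """Fit num = pre1 * e**(pre2 * time) by least squares on ln(num).

    Assumed model form; points are (time, num) pairs with num > 0.
    """
    pts = [(t, math.log(n)) for t, n in points if n > 0]
    if len(pts) < 2:
        raise ValueError("need at least two periods with occurrences")
    mean_t = sum(t for t, _ in pts) / len(pts)
    mean_y = sum(y for _, y in pts) / len(pts)
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    pre2 = sum((t - mean_t) * (y - mean_y) for t, y in pts) / sxx
    pre1 = math.exp(mean_y - pre2 * mean_t)
    return pre1, pre2


def select_keywords(word_times, period_length, current_period, horizon, threshold):
    """Return words whose predicted occurrences over the next `horizon` periods
    are not less than `threshold`."""
    selected = []
    for word, times in word_times.items():
        counts = count_by_period(times, period_length)
        try:
            pre1, pre2 = fit_exponential(sorted(counts.items()))
        except ValueError:
            continue  # too little history to fit; skipped in this sketch
        predicted = sum(pre1 * math.exp(pre2 * (current_period + k))
                        for k in range(1, horizon + 1))
        if predicted >= threshold:
            selected.append(word)
    return selected
```

For example, a word seen 1, 2 and 4 times in periods 1 to 3 fits pre1 = 0.5 and pre2 = ln 2 under this assumed model, which predicts about 8 occurrences in period 4.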
The computer non-volatile readable storage medium according to claim 16, wherein the calculating the similarity between the face data contained in the real-time video component and the user identity data comprises:

Intercept a frame of image from the real-time video component as a target image;

Dividing the target image into a plurality of image areas, reading the pixel point data of each pixel in the image area, and numbering the image areas according to a preset sequence;

Simultaneously input the pixel data of a preset number of adjacent image regions with consecutive numbers into a preset VGG neural network model, and output a segmentation coefficient corresponding to the image region with the middle number, where the preset number is an odd number greater than one;

Take the pixels corresponding to segmentation coefficients smaller than the preset coefficient threshold as face pixels, and generate the face data from the pixel data of all face pixels;

By formula:
Calculate the similarity between the matrix corresponding to the face data and the matrix corresponding to the user identity data as the similarity between the face data contained in the video component and the user identity data, where S is the similarity, X_i is the i-th element in the matrix corresponding to the face data, Y_i is the i-th element in the matrix corresponding to the user identity data, and K is the number of elements contained in the matrix corresponding to the face data and in the matrix corresponding to the user identity data.
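The similarity formula is not reproduced in this text. As a stand-in only, the sketch below computes cosine similarity over the K corresponding elements X_i and Y_i of two equally sized matrices. Cosine similarity is a swapped-in technique chosen for illustration and may differ from the formula actually claimed. The function name is hypothetical.

```python
import math


def matrix_similarity(face_matrix, identity_matrix):
    """Cosine similarity between two equally sized matrices (stand-in formula)."""
    # Flatten both matrices so that x[i] and y[i] play the roles of X_i and Y_i.
    x = [v for row in face_matrix for v in row]
    y = [v for row in identity_matrix for v in row]
    if len(x) != len(y):
        raise ValueError("both matrices must contain the same number of elements K")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # an all-zero matrix is treated as not similar (assumption)
    return dot / (norm_x * norm_y)
```

The result would then be compared with the preset similarity threshold to decide whether the user's identity is verified.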
The computer non-volatile readable storage medium according to claim 16, wherein said calculating the number of said verification keywords contained in said real-time audio component within a preset time period comprises:

Converting the real-time audio component in the preset time period into text data by using a speech-to-text conversion algorithm;

Performing word segmentation processing on the text data to generate a set of voice words, the set of voice words containing multiple words;

The number of the verification keywords included in the voice word set is calculated as the number of the verification keywords included in the real-time audio component.
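A minimal sketch of the counting steps above, assuming the speech-to-text conversion has already produced a transcript string. Word segmentation is approximated here with a simple regular expression for English text; a real system would use a segmentation algorithm suited to the spoken language. The function names are hypothetical, and the sketch counts every occurrence of a keyword rather than distinct keywords.

```python
import re


def segment_words(text):
    """Split a transcript into lowercase word tokens (simplified segmentation)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def count_verification_keywords(transcript, verification_keywords):
    """Count occurrences of verification keywords among the segmented words."""
    keyword_set = {k.lower() for k in verification_keywords}
    return sum(1 for word in segment_words(transcript) if word in keyword_set)
```

To count distinct keywords instead, the last line could be replaced with `len(keyword_set & set(segment_words(transcript)))`.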
The computer non-volatile readable storage medium of claim 16, wherein the selecting one of a plurality of asynchronous processing servers as the selected server comprises:

Retrieve the number of threads contained in each of the multiple asynchronous processing servers, and count the number of abnormal tasks received from each asynchronous processing server in the unit time period before the current moment;

By formula:
Calculate the sending parameter corresponding to each asynchronous processing server, where K(i) represents the sending parameter corresponding to asynchronous processing server i, Z(i) represents the number of threads corresponding to asynchronous processing server i, and D(i) represents the number of abnormal tasks corresponding to asynchronous processing server i;

By formula:
Calculate the sending ratio corresponding to each asynchronous processing server, where Par_i represents the sending ratio corresponding to asynchronous processing server i, K(i) represents the sending parameter corresponding to asynchronous processing server i, and n is the number of asynchronous processing servers;

Choose one of the asynchronous processing servers corresponding to the highest sending ratio as the selected server.
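The server-selection steps can be sketched as below. Neither formula is reproduced in this text, so both are assumptions made for illustration: the sending parameter is taken as K(i) = Z(i) / (1 + D(i)), so that more threads raise it and more abnormal tasks lower it, and the sending ratio is taken as Par_i = K(i) / (K(1) + ... + K(n)). The function names are hypothetical.

```python
def sending_parameter(thread_count, abnormal_tasks):
    """Assumed form of K(i): more threads raise it, more abnormal tasks lower it."""
    return thread_count / (1 + abnormal_tasks)


def select_server(server_stats):
    """Pick the server with the highest sending ratio.

    server_stats maps a server name to (thread_count Z(i), abnormal_tasks D(i)).
    Returns the selected server name and the ratio of every server.
    """
    if not server_stats:
        raise ValueError("at least one asynchronous processing server is required")
    params = {name: sending_parameter(z, d) for name, (z, d) in server_stats.items()}
    total = sum(params.values())
    if total == 0:
        raise ValueError("no server has available threads")
    # Assumed form of Par_i: each K(i) normalised by the sum over all n servers.
    ratios = {name: k / total for name, k in params.items()}
    selected = max(ratios, key=ratios.get)
    return selected, ratios
```

With servers `a`, `b` and `c` holding (8 threads, 3 abnormal tasks), (4, 0) and (6, 2), the assumed parameters are 2, 4 and 2, so `b` has the highest ratio and is selected.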