WO2019085331A1

WO2019085331A1 - Fraud possibility analysis method, device, and storage medium

Info

Publication number: WO2019085331A1
Application number: PCT/CN2018/076122
Authority: WO
Inventors: 陈林
Original assignee: 平安科技（深圳）有限公司
Priority date: 2017-11-02
Filing date: 2018-02-10
Publication date: 2019-05-09
Also published as: CN108038413A

Abstract

Provided in the present application are a fraud possibility analysis method, a device, and a computer readable storage medium. The method comprises the following steps: collecting sample videos and allocating fraud annotations; extracting an image feature and an audio feature of each sample video, and combining the features to obtain a video feature of each sample video; constructing a neural network according to sequence lengths of the sample videos and dimensions of the video features; training the neural network by using the video feature and the fraud annotation of each sample video, and optimizing training parameters, so as to obtain a fraud possibility analysis model; collecting a face video of a predetermined duration of an object to be analyzed; extracting the image feature and the audio feature of the video, and combining the features to obtain a video feature of the video; and inputting the video feature into the fraud possibility analysis model, outputting a fraud probability and a non-fraud probability of the object to be analyzed, and taking an output result of a greater probability value as an analysis result of whether the object to be analyzed is involved in fraud. According to the present application, whether a person is suspected of fraud can be objectively judged.

Description

Fraud possibility analysis method, device and storage medium

Priority claim

This application claims the priority of the Chinese Patent Application filed on November 2, 2017, the Chinese Patent Office, Application No. 201711061172.X, and the invented name is "Fraud possibility analysis method, device and storage medium", the contents of which are all cited by reference. Combined in this application.

Technical field

The present application relates to the field of information processing technologies, and in particular, to a fraud possibility analysis method, apparatus, and storage medium.

Background technique

At present, the analysis of character fraud is generally realized through face-to-face review, which relies heavily on the experience and judgment of analysts, and consumes a lot of time and manpower. The analysis results are often inaccurate and objective. There are also professional instruments and equipment to detect fraud, pulse, skin and other indicators to determine whether fraud suspects are fraudulent, but such equipment is usually expensive and easy to violate human rights.

Summary of the invention

In view of the above reasons, it is necessary to provide a method, device and storage medium for analyzing the possibility of fraud, and by analyzing the facial video of a person, objectively and accurately determine whether the person is suspected of fraud.

To achieve the above objective, the present application provides a fraud possibility analysis method, the method comprising:

Sample preparation step: collecting facial video of a person's predetermined duration as a sample, and assigning a fraud label to each sample;

Sample feature extraction step: extracting image features and audio features of each sample, and combining to obtain video features of each sample;

Network construction step: setting the number of neural network layers and the number of neurons in each layer network according to the sequence length of each sample and the dimension of the video feature;

Network training step: Define the Softmax loss function, use the fraud labeling and video features of each sample as sample data, train the neural network, output the fraud probability and no fraud probability of each sample, and update the training of the neural network every training. a parameter, the training parameter that minimizes the Softmax loss function is used as a final parameter to obtain a fraud probability analysis model;

The model application step is: collecting a facial video of a predetermined duration of the object to be analyzed, and analyzing the facial video of the object to be analyzed by using the fraud possibility analysis model to obtain an analysis result of the possibility of fraud of the object to be analyzed.

The application also provides a computing device comprising a memory and a processor, the memory including a fraud probability analysis program. The computing device is directly or indirectly connected to the camera device, and the camera device transmits the facial video of the captured person's conversation to the computing device. When the processor of the computing device executes the fraud probability analysis program in memory, the following steps are implemented:

In addition, in order to achieve the above object, the present application further provides a computer readable storage medium including a fraud possibility analysis program, where the fraud possibility analysis program is executed by a processor, implementing the following steps :

The fraud possibility analysis method, device and storage medium provided by the present application trains the neural network through the face video of a large number of people, updates the training parameters of the neural network according to the Softmax loss function, and obtains the last updated training parameter as the final parameter to obtain fraud. Probability analysis model. Then, collecting the facial video of the predetermined duration of the object to be analyzed, extracting the audio feature and the image feature of the video, combining the video features of the video, and inputting the video feature into the training fraud probability analysis model to obtain the The analysis result of the possibility of fraud of the object to be analyzed. By using this application, it is possible to objectively and effectively judge whether a person is suspected of fraud, which also reduces costs and saves time.

DRAWINGS

FIG. 1 is an application environment diagram of a first preferred embodiment of a fraud possibility analysis method according to the present application.

FIG. 2 is an application environment diagram of a second preferred embodiment of the fraud possibility analysis method of the present application.

FIG. 3 is a program block diagram of the fraud possibility analysis program in FIGS. 1 and 2.

4 is a flow chart of a preferred embodiment of a fraud possibility analysis method of the present application.

The implementation, functional features and advantages of the present application will be further described with reference to the accompanying drawings.

Detailed ways

The principles and spirit of the present application are described below with reference to a number of specific embodiments. It is understood that the specific embodiments described herein are merely illustrative of the application and are not intended to be limiting.

Referring to FIG. 1, an application environment diagram of a first preferred embodiment of a fraud possibility analysis method of the present application is shown. In this embodiment, the imaging device 3 is connected to the computing device 1 via the network 2, and the camera device 3 captures the face video of the person's conversation, and transmits it to the computing device 1 via the network 2, and the computing device 1 utilizes the fraud possibility analysis program provided by the present application. 10 Analyze the video to output the person's fraud probability and no fraud probability for reference.

The computing device 1 may be a terminal device having a storage and computing function, such as a server, a smart phone, a tablet computer, a portable computer, a desktop computer, or the like.

The computing device 1 includes a memory 11, a processor 12, a network interface 13, and a communication bus 14.

The image pickup apparatus 3 is installed in a specific place, such as an office place, a monitoring area, and the like, for photographing a face video at the time of a person's conversation, and then transmits the captured video to the memory 11 through the network 2. The network interface 13 may include a standard wired interface, a wireless interface (such as a WI-FI interface). Communication bus 14 is used to implement connection communication between these components.

The memory 11 includes at least one type of readable storage medium. The at least one type of readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, a card type memory, or the like. In some embodiments, the readable storage medium may be an internal storage unit of the computing device 1, such as a hard disk of the computing device 1. In other embodiments, the readable storage medium may also be an external memory 11 of the computing device 1, such as a plug-in hard disk equipped on the computing device 1, a smart memory card (SMC). , Secure Digital (SD) card, Flash Card, etc.

In the present embodiment, the memory 11 stores the program code of the fraud possibility analysis program 10, the dialog video captured by the camera 3, and the data to which the processor 12 executes the program code of the fraud possibility analysis program 10 and The last output of the data, etc.

Processor 12 may be a Central Processing Unit (CPU), microprocessor or other data processing chip in some embodiments.

Figure 1 shows only computing device 1 with components 11-14, but it should be understood that not all illustrated components may be implemented and that more or fewer components may be implemented instead.

Optionally, the computing device 1 may further include a user interface, and the user interface may include an input unit such as a keyboard, a voice input device such as a microphone, a device with a voice recognition function, a voice output device such as an audio, a headphone, and the like. Optionally, the user interface may also include a standard wired interface and a wireless interface.

Optionally, the computing device 1 may also include a display. The display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch sensor, or the like in some embodiments. The display is used to display information processed by the computing device 1 and a visualized user interface.

Optionally, the computing device 1 further comprises a touch sensor. The area provided by the touch sensor for the user to perform a touch operation is referred to as a touch area. Further, the touch sensor described herein may be a resistive touch sensor, a capacitive touch sensor, or the like. Moreover, the touch sensor includes not only a contact type touch sensor but also a proximity type touch sensor or the like. Furthermore, the touch sensor may be a single sensor or a plurality of sensors arranged, for example, in an array. A user, such as a counselor, can initiate the fraud probability analysis program 10 by touch.

The computing device 1 may also include radio frequency (RF) circuits, sensors, audio circuits, and the like, and details are not described herein.

Referring to FIG. 2, it is an application environment diagram of a second preferred embodiment of the fraud possibility analysis method of the present application. The object to be analyzed realizes the analysis process of the possibility of character fraud by the terminal 3, the camera device 30 of the terminal 3 captures the face video of the object to be analyzed, and transmits it to the computing device 1 via the network 2, and the processor 12 of the computing device 1 executes the memory. 11 The program code of the stored fraud possibility analysis program 10 analyzes the audio part and the video frame of the video, and outputs the fraud probability and the non-fraud probability of the object to be analyzed, for reference by the object to be analyzed or the reviewer.

The components of the computing device 1 of FIG. 2, such as the memory 11, the processor 12, the network interface 13, and the communication bus 14, shown in the figures, and the components not shown in the figures, refer to the description of FIG.

The terminal 3 can be a terminal device having a storage and computing function, such as a smart phone, a tablet computer, a portable computer, and a desktop computer.

The fraud possibility analysis program 10 of Figures 1 and 2, when executed by the processor 12, implements the following steps:

For a detailed description of the above steps, please refer to the following description of the program module diagram of the fraud possibility analysis program 10 and the flowchart of the preferred embodiment of the fraud possibility analysis method of FIG.

Referring to Fig. 3, it is a program block diagram of the fraud possibility analysis program 10 in Figs. In the present embodiment, the fraud possibility analysis program 10 is divided into a plurality of modules, which are stored in the memory 11 and executed by the processor 12 to complete the present application. A module as referred to in this application refers to a series of computer program instructions that are capable of performing a particular function.

The fraud possibility analysis program 10 can be divided into: an acquisition module 110, an extraction module 120, a training module 130, and a prediction module 140.

The obtaining module 110 is configured to acquire a facial video of a predetermined duration when the character is in a conversation. The video may be acquired by the camera device 3 of FIG. 1 or the camera device 30 of FIG. 2, or may be a face video and a fraud-free face video that are obviously fraudulently selected from a network information or a video library. Fragment labeling is assigned to the sample video used for neural network training. The fraud label indicates that the person in the sample video has suspected fraud. For example, 1 indicates fraud suspect and 0 indicates no fraud suspect.

The extracting module 120 is configured to extract audio features and image features of the video, and combine the audio features and the image features to obtain video features of each video. Decoding and pre-processing the video acquired by the obtaining module 110 to obtain an audio part and a video frame of each video, respectively performing feature extraction on the audio part and the video frame to obtain audio features and image features of each video, The audio features and image features are combined to obtain the video features of each video.

When the extraction module 120 extracts the image features of the video, the HOG feature, the LBP feature, and the like of the video frame processed by normalization, noise removal, etc. may be used as image features, or the feature of the video frame may be directly extracted by the convolutional neural network. vector.

When the extraction module 120 extracts the audio features of the video, the amplitude value of the audio portion of the video may be used as an audio feature. For example, assuming that the predetermined duration of the video is 3 minutes and the audio sampling rate is 8000 Hz, the audio portion of the 3 minute video extracts 8000*60*3 amplitude values as audio features.

When the extraction module 120 combines the image feature and the audio feature, the dimension of the combined video feature is the sum of the image feature dimension of each frame image and the corresponding audio feature dimension. According to the above example, assuming that the audio sample rate of the face video of the character dialogue is 8000HZ and the video sampling rate is 20HZ, it takes 50ms for each image to be read, and 400 audio amplitude values for 50ms, if the image features of each frame of the video The dimension is k1, and the dimension of the corresponding audio feature is k2=400, and the dimension of the obtained video feature is k=k1+k2.

The training module 130 is configured to optimize the constructed neural network by iterative training to obtain a trained fraud possibility analysis model. The video frames and audio frames of the face video of the character dialogue are arranged in chronological order, so the present application uses a Long Short-Term Memory (LSTM) in a cyclic neural network.

When constructing the LSTM, firstly, according to the sequence length of the predetermined duration face video obtained by the acquisition module 110 and the dimension of the video feature extracted by the extraction module 120, the network shape is defined, and the number of layers of the LSTM and the neurons of each layer of the LSTM are set. Number. With the above example, assuming that the predetermined duration of the video is 3 minutes, the video sampling rate is 20 Hz, and the combined video feature has a dimension of k, the sequence length of each video is recorded as 3*60*20, the shape of the LSTM. The code of the tflearn deep learning library can be expressed as follows:

Net=tflearn.input_data(shape=[None,3*60*20,k])

Then construct two hidden layers, 128 neural units per layer, using the code of the tflearn deep learning library as follows:

Net=tflearn.lstm(net,128)

Next define the Softmax loss function formula as follows:

After the LSTM and Softmax loss functions are constructed, the training parameters are set. Assuming that the number of iterations is 100, the gradient optimization algorithm is adam, and the verification set is 0.1, the code for the LSTM model training using the tflearn deep learning library is as follows:

Net=tflearn.regression(net,optimizer=‘adam’,loss=‘categorical_crossentropy’,name=‘output1’)

Model=tflearn.DNN(net,tersorboard_verbose=2)

Model.fit(X,Y,n_epoch=100,validation_set=0.1,snapshot_step=100)

The training module 130 trains the LSTM by using the fraud feature of each sample and the combined video features, and updates the training parameters of the LSTM each time to minimize the Softmax loss function, and finally uses the last updated training parameter as a final Parameters, get the fraud possibility analysis model.

The analysis module 140 is configured to analyze the fraud possibility of the character. The obtaining module 110 acquires a facial video of a predetermined duration of the object to be analyzed, and the extracting module 120 extracts image features and audio features of the video, and combines the image features and audio features into video features of the video, and the analyzing module 140 inputs the video features. The training module 130 trains the fraud possibility analysis model to output the fraud probability and the no fraud probability of the object to be analyzed.

Referring to FIG. 4, it is a flowchart of a preferred embodiment of the fraud possibility analysis method of the present application. Using the architecture shown in FIG. 1 or FIG. 2, the computing device 1 is started, and the processor 12 executes the fraud possibility analysis program 10 stored in the memory 11 to implement the following steps:

In step S10, the acquisition module 110 is used to collect the predetermined duration video of the person's conversation and assign a fraud label to the video. The video may be acquired by the camera device 3 of FIG. 1 or the camera device 30 of FIG. 2, or may be a face video with obvious fraudulent behavior when a person selects from a network information or a video library, and a normal fraud-free device. video.

In step S20, the audio feature and the image feature of each video are extracted by the extraction module 120, and the audio feature and the image feature are combined to obtain a video feature of each video. The image feature may be an underlying feature such as a HOG feature or an LBP feature of the video frame, or may be a feature vector of a video frame directly extracted by the convolutional neural network. The audio feature may be a set of amplitude values of audio corresponding to each frame of image. The dimension of the video feature is the sum of the image feature dimension of the video frame and the corresponding audio feature dimension.

Step S30, constructing a neural network according to the sequence length of the video of the predetermined duration and the dimension of the video feature. The number of layers of the neural network and the number of neurons in each layer of the network are set according to the sequence length of the face video of the predetermined duration acquired by the obtaining module 110 and the dimension of the video feature extracted and combined by the extraction module 120, because the output result of the neural network The probability of fraud and the probability of no fraud of the character, so the number of neurons as the classifier of the network output layer is 2.

Step S40: training the neural network according to video features and fraud labels of each video to obtain a trained fraud possibility analysis model. The video features extracted and combined by the fraud annotation and extraction module 120 acquired by the acquisition module 110 are sample data, and the neural network is iteratively trained, and the training parameters of the neural network are updated each time to make the Softmax loss. The training parameters that minimize the function are used as the final parameters to obtain a trained fraud possibility analysis model.

In step S50, the acquisition module 110 is used to collect the facial video of the object to be analyzed for a predetermined duration. This face video is acquired by the image pickup device 3 of Fig. 1 or the image pickup device 30 of Fig. 2 .

In step S60, the image feature and the audio feature of the video to be analyzed are extracted by the extraction module 120, and the image feature and the audio feature are combined to obtain a video feature of the video to be analyzed. For the specific process of feature extraction and combination, please refer to the extraction module 120 and the detailed description of step S20.

Step S70: input the video feature into the fraud probability analysis model to obtain a fraud possibility analysis result of the object to be analyzed. The video feature of the object to be analyzed obtained by the extraction module 120 is input into the training fraud probability analysis model, and the fraud probability value and the non-fraud probability value of the object to be analyzed are output, and the output result with the large probability value is taken as the object to be analyzed. No fraud analysis results.

In addition, the embodiment of the present application further provides a computer readable storage medium, which may be a hard disk, a multimedia card, an SD card, a flash memory card, an SMC, a read only memory (ROM), and an erasable programmable Any combination or combination of any one or more of read only memory (EPROM), portable compact disk read only memory (CD-ROM), USB memory, and the like. The computer readable storage medium includes a sample video and fraud possibility analysis program 10, and when the fraud possibility analysis program 10 is executed by the processor, the following operations are performed:

The specific implementation manner of the computer readable storage medium of the present application is substantially the same as the foregoing fraud possibility analysis method and the specific implementation manner of the computing device 1, and details are not described herein again.

It is to be understood that the term "comprises", "comprising", or any other variants thereof, is intended to encompass a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a series of elements includes those elements. It also includes other elements not explicitly listed, or elements that are inherent to such a process, device, item, or method. An element that is defined by the phrase "comprising a ..." does not exclude the presence of additional equivalent elements in the process, the device, the item, or the method that comprises the element.

The serial numbers of the embodiments of the present application are merely for the description, and do not represent the advantages and disadvantages of the embodiments. Through the description of the above embodiments, those skilled in the art can clearly understand that the foregoing embodiment method can be implemented by means of software plus a necessary general hardware platform, and of course, can also be through hardware, but in many cases, the former is better. Implementation. Based on such understanding, the technical solution of the present application, which is essential or contributes to the prior art, may be embodied in the form of a software product stored in a storage medium (such as ROM/RAM as described above). , a disk, an optical disk, including a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the methods described in the various embodiments of the present application.

The above is only a preferred embodiment of the present application, and is not intended to limit the scope of the patent application, and the equivalent structure or equivalent process transformations made by the specification and the drawings of the present application, or directly or indirectly applied to other related technical fields. The same is included in the scope of patent protection of this application.

Claims

A fraud possibility analysis method, characterized in that the method comprises:

Sample preparation step: collecting facial video of a person's predetermined duration as a sample, and assigning a fraud label to each sample;

Sample feature extraction step: extracting image features and audio features of each sample, and combining to obtain video features of each sample;

Network construction step: setting the number of neural network layers and the number of neurons in each layer network according to the sequence length of each sample and the dimension of the video feature;

Network training step: Define the Softmax loss function, use the fraud labeling and video features of each sample as sample data, train the neural network, output the fraud probability and no fraud probability of each sample, and update the training of the neural network every training. a parameter, the training parameter that minimizes the Softmax loss function is used as a final parameter to obtain a fraud probability analysis model;

The model application step is: collecting a facial video of a predetermined duration of the object to be analyzed, and analyzing the facial video of the object to be analyzed by using the fraud possibility analysis model to obtain an analysis result of the possibility of fraud of the object to be analyzed.
The fraud possibility analysis method according to claim 1, wherein the sample feature extraction step comprises:

Decoding and preprocessing each sample to obtain a video frame and an audio portion of each sample;

Perform feature extraction on the video frames of each sample to obtain image features of each sample;

Feature extraction is performed on the audio portion of each sample to obtain audio characteristics for each sample.
The fraud possibility analysis method according to claim 2, wherein the image feature is an HOG feature of a video frame of each sample, an LBP feature, or a feature vector of a video frame extracted directly using a convolutional neural network.
The fraud possibility analysis method according to claim 1, wherein the dimension of the video feature is a sum of a dimension of the image feature and a dimension of a corresponding audio feature.
The fraud possibility analysis method according to claim 1, wherein the Softmax loss function formula is as follows:

Where θ is the training parameter of the neural network, X j represents the jth sample, and y j represents the fraud probability of the jth sample.
The fraud possibility analysis method according to claim 1, wherein the training parameters in the network training step include the number of iterations.
The fraud possibility analysis method according to claim 1, wherein the model application step further comprises:

Decoding and pre-processing the video to be analyzed to obtain an audio portion and a video frame of the video to be analyzed;

Performing feature extraction on the video frame of the video to be analyzed to obtain an image feature of the video to be analyzed;

Performing feature extraction on the audio portion of the video to be analyzed to obtain an audio feature of the video to be analyzed;

Combining the image feature and the audio feature of the video to be analyzed to obtain a video feature of the video to be analyzed;

The video feature is input into the trained fraud possibility analysis model, and the fraud probability and the fraud-free probability of the object to be analyzed are output.
A computing device comprising a memory and a processor, wherein the memory includes a fraud probability analysis program, and the fraud possibility analysis program is executed by the processor to implement the following steps:

Sample preparation step: collecting facial video of a person's predetermined duration as a sample, and assigning a fraud label to each sample;

Sample feature extraction step: extracting image features and audio features of each sample, and combining to obtain video features of each sample;

Network construction step: setting the number of neural network layers and the number of neurons in each layer network according to the sequence length of each sample and the dimension of the video feature;

Network training step: Define the Softmax loss function, use the fraud labeling and video features of each sample as sample data, train the neural network, output the fraud probability and no fraud probability of each sample, and update the training of the neural network every training. a parameter, the training parameter that minimizes the Softmax loss function is used as a final parameter to obtain a fraud probability analysis model;

The model application step is: collecting a facial video of a predetermined duration of the object to be analyzed, and analyzing the facial video of the object to be analyzed by using the fraud possibility analysis model to obtain an analysis result of the possibility of fraud of the object to be analyzed.
The computing device of claim 8 wherein said sample feature extraction step comprises:

Decoding and preprocessing each sample to obtain a video frame and an audio portion of each sample;

Perform feature extraction on the video frames of each sample to obtain image features of each sample;

Feature extraction is performed on the audio portion of each sample to obtain audio characteristics for each sample.
The fraud possibility analysis method according to claim 9, wherein the image feature is an HOG feature of a video frame of each sample, an LBP feature, or a feature vector of a video frame extracted directly using a convolutional neural network.
The computing device of claim 8 wherein the dimension of the video feature is the sum of the dimensions of the image feature and the dimensions of the corresponding audio feature.
The computing device of claim 8 wherein said Softmax loss function formula is as follows:

Where θ is the training parameter of the neural network, X j represents the jth sample, and y j represents the fraud probability of the jth sample.
The computing device of claim 8 wherein the training parameters in the network training step comprise the number of iterations.
The computing device of claim 8 wherein the model application step further comprises:

Decoding and pre-processing the video to be analyzed to obtain an audio portion and a video frame of the video to be analyzed;

Performing feature extraction on the video frame of the video to be analyzed to obtain an image feature of the video to be analyzed;

Performing feature extraction on the audio portion of the video to be analyzed to obtain an audio feature of the video to be analyzed;

Combining the image feature and the audio feature of the video to be analyzed to obtain a video feature of the video to be analyzed;

The video feature is input into the trained fraud possibility analysis model, and the fraud probability and the fraud-free probability of the object to be analyzed are output.
A computer readable storage medium, characterized in that the computer readable storage medium includes a fraud possibility analysis program, and when the fraud possibility analysis program is executed by the processor, the following steps are implemented:

Sample preparation step: collecting facial video of a person's predetermined duration as a sample, and assigning a fraud label to each sample;

Sample feature extraction step: extracting image features and audio features of each sample, and combining to obtain video features of each sample;

Network construction step: setting the number of neural network layers and the number of neurons in each layer network according to the sequence length of each sample and the dimension of the video feature;

Network training step: Define the Softmax loss function, use the fraud labeling and video features of each sample as sample data, train the neural network, output the fraud probability and no fraud probability of each sample, and update the training of the neural network every training. a parameter, the training parameter that minimizes the Softmax loss function is used as a final parameter to obtain a fraud probability analysis model;

The model application step is: collecting a facial video of a predetermined duration of the object to be analyzed, and analyzing the facial video of the object to be analyzed by using the fraud possibility analysis model to obtain an analysis result of the possibility of fraud of the object to be analyzed.
The medium of claim 15 wherein said sample feature extraction step comprises:

Decoding and preprocessing each sample to obtain a video frame and an audio portion of each sample;

Perform feature extraction on the video frames of each sample to obtain image features of each sample;

Feature extraction is performed on the audio portion of each sample to obtain audio characteristics for each sample.
The medium of claim 15 wherein the dimension of the video feature is the sum of the dimensions of the image feature and the dimensions of the corresponding audio feature.
The medium of claim 15 wherein said Softmax loss function formula is as follows:

Where θ is the training parameter of the neural network, X j represents the jth sample, and y j represents the fraud probability of the jth sample.
The medium of claim 15 wherein the training parameters in the network training step comprise an iteration number.
The medium of claim 15 wherein said model application step further comprises:

Decoding and pre-processing the video to be analyzed to obtain an audio portion and a video frame of the video to be analyzed;

Performing feature extraction on the video frame of the video to be analyzed to obtain an image feature of the video to be analyzed;

Performing feature extraction on the audio portion of the video to be analyzed to obtain an audio feature of the video to be analyzed;

Combining the image feature and the audio feature of the video to be analyzed to obtain a video feature of the video to be analyzed;

The video feature is input into the trained fraud possibility analysis model, and the fraud probability and the fraud-free probability of the object to be analyzed are output.