WO2022105169A1 - Fraud identification method and apparatus, computer device, and storage medium - Google Patents

Fraud identification method and apparatus, computer device, and storage medium

Info

Publication number: WO2022105169A1
Authority: WO, WIPO (PCT)
Prior art keywords: autoencoder, encoder, data, training, deep
Application number: PCT/CN2021/096471
Other languages: English (en), French (fr)
Inventor: 李响
Original Assignee: 平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2022105169A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks

Definitions

  • the present application relates to the technical field of speech signal processing, and in particular, to a method, device, computer equipment and storage medium for identifying fraudulent behavior based on a deep autoencoder.
  • the purpose of the embodiments of the present application is to propose a method, device, computer equipment and storage medium for identifying fraudulent behavior based on a deep autoencoder, so as to solve the problems that traditional fraud identification methods require a large data-labeling workload and have relatively low recognition accuracy.
  • the embodiment of the present application provides a method for identifying fraudulent behavior based on a deep self-encoder, which adopts the following technical solutions:
  • the embodiment of the present application also provides a device for identifying fraudulent behavior based on a deep self-encoder, which adopts the following technical solutions:
  • the voice acquisition module is used to receive the original voice data collected by the audio acquisition equipment during the interview;
  • an encoding and decoding module for inputting the original voice data into a pre-trained deep autoencoder for encoding and decoding operations to obtain encoding and decoding results
  • a comparison operation module for performing a comparison operation on the original voice data and the codec result to obtain an error value
  • a threshold judgment module used to judge whether the error value satisfies a preset fraud threshold
  • a first behavior determination module configured to determine that the original voice data has a fraudulent behavior if the error value satisfies the fraud threshold
  • a second behavior determination module configured to determine that the original voice data does not have fraudulent behavior if the error value does not meet the fraud threshold.
  • the embodiment of the present application also provides a computer device, which adopts the following technical solutions:
  • it comprises a memory and a processor; computer-readable instructions are stored in the memory, and when the processor executes the computer-readable instructions, the processor implements the steps of the method for identifying fraudulent behavior based on a deep autoencoder described above.
  • the embodiments of the present application also provide a computer-readable storage medium, which adopts the following technical solutions:
  • the computer-readable storage medium stores computer-readable instructions, and when the computer-readable instructions are executed by a processor, the steps of the method for identifying fraudulent behavior based on a deep autoencoder described above are implemented.
  • the deep self-encoder-based fraud identification method, device, computer equipment and storage medium mainly have the following beneficial effects:
  • the original voice data is encoded and decoded by the trained deep autoencoder to obtain an encoding and decoding result recovered by the deep autoencoder. By comparing the signal error between the original voice data and the encoding and decoding result, it can be confirmed whether the original voice data involves fraudulent behavior. Since most of the training corpus consists of normal, fraud-free samples, the deep autoencoder will not produce a large signal error when encoding and decoding fraud-free speech, thereby avoiding the difficulty traditional speech emotion models have in accurately identifying fraudulent samples. At the same time, this application does not require technicians to label whether the training corpus contains fraud, which greatly saves manpower and material resources.
  • Fig. 1 is the realization flow chart of the method for identifying fraudulent behavior based on the deep autoencoder provided in the first embodiment of the present application;
  • FIG. 2 is a schematic diagram of an autoencoder architecture provided in Embodiment 1 of the present application.
  • Fig. 3 is the realization flow chart of the deep autoencoder acquisition method provided by the first embodiment of the present application.
  • Fig. 4 is the realization flow chart of the deep autoencoder optimization method provided by the first embodiment of the present application.
  • Fig. 5 is the realization flow chart of step S301 in Fig. 4;
  • FIG. 6 is a schematic structural diagram of a device for identifying fraudulent behavior based on a deep autoencoder provided in Embodiment 2 of the present application;
  • FIG. 7 is a schematic structural diagram of an apparatus for obtaining a depth autoencoder provided in Embodiment 2 of the present application.
  • FIG. 8 is a schematic structural diagram of an apparatus for optimizing a deep autoencoder provided in Embodiment 2 of the present application;
  • FIG. 9 is a schematic structural diagram of an embodiment of a computer device according to the present application.
  • referring to FIG. 1, a flow chart of the implementation of the method for identifying fraudulent behavior based on a deep autoencoder provided in Embodiment 1 of the present application is shown. For the convenience of description, only parts related to the present application are shown.
  • step S101 when the interview is performed, the original voice data collected by the audio collection device is received.
  • the face-to-face review refers to the scene where the reviewer and the reviewee conduct face-to-face review and question and answer
  • the application scenario of the face-to-face review may be "school interview”, “civil servant interview”, “loan interview”, etc.
  • the loan interview is a process in which the bank loan officer learns the borrower's loan motive and financial status through face-to-face conversation, and forms an anticipatory judgment of potential credit and fraud risks.
  • the audio collection device is mainly used to collect voice signals, and the audio collection device collects audio data in the interview environment through a microphone.
  • the original voice data refers to the voice information sent by the examinee collected during the face-to-face examination.
  • the voice data will be recognized through a voiceprint recognition network to obtain voiceprint data based on the voice data.
  • step S102 the original speech data is input into the pre-trained deep autoencoder to perform an encoding and decoding operation, and an encoding and decoding result is obtained.
  • the deep auto-encoder is mainly used to identify fraudulent speech.
  • the deep auto-encoder consists of a speech encoder and a speech decoder.
  • the main function of the speech encoder is to encode the PCM (pulse code modulation) samples of the user's speech into a small number of bits (frames). This method makes the speech robust when the link produces bit errors, network jitter and bursty transmission.
  • in the speech decoder, the speech frames are first converted back into PCM speech samples, and then into speech waveforms.
  • the speech decoder converts the encoding result output by the speech encoder into speech output data.
  • under normal circumstances, the voice output data is consistent with the user voice fed into the voice encoder; if they are inconsistent, it means that this segment of user voice involves fraudulent behavior.
  • the pre-trained deep autoencoder refers to an autoencoder that has been trained before use so that the difference between its decoded data and the original input data is small, its training corpus having a distribution with a certain similarity to that of fraud-free speech data.
  • the autoencoder trained with this type of data can better fit the distribution of non-fraud speech data, and the error between the data recovered by decoding and the original data is small.
  • because the distribution of fraudulent voice data differs greatly from that of non-fraudulent voice data, fraudulent speech cannot be recovered well after being input into the autoencoder; therefore, fraudulent speech can be identified using an autoencoder.
  • the codec result refers to the voice output data converted by the above-mentioned voice decoder from the code result output by the voice encoder, and the voice output data is the codec result.
  • FIG. 2 shows a schematic diagram of the architecture of the autoencoder provided in Embodiment 1 of the present application.
  • the input data is passed through the encoder to obtain an encoding result, and then the input data is recovered by the decoder.
  • step S103 a comparison operation is performed on the original voice data and the codec result to obtain an error value.
  • the comparison operation is mainly used to identify whether the sound wave shape of the original voice data and the codec result are similar.
  • the sound wave shapes corresponding to the original speech data and the codec result can be obtained through a Fourier transform processing operation, in which an upward segment of the waveform is represented as 1 and a downward segment as 0; all shapes can thus be represented by "1" and "0", yielding two sound-wave-shape texts of the same length, after which the error value between the original speech data and the codec result is calculated by the Hamming distance method.
  • the Hamming distance calculation method requires that the input texts have the same length; the Hamming distance is the number of positions at which the corresponding symbols of two texts differ.
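A minimal sketch of this comparison, assuming a simple "rising sample = 1, falling sample = 0" binarization and illustrative sample values (the exact binarization rule is not fixed by the application):

```python
def waveform_shape(samples):
    # Represent each step of the waveform as "1" where the curve goes up
    # and "0" where it goes down, yielding a shape string.
    return "".join("1" if b > a else "0" for a, b in zip(samples, samples[1:]))

def hamming_distance(s1, s2):
    # Hamming distance: the number of positions at which two
    # equal-length strings differ.
    if len(s1) != len(s2):
        raise ValueError("Hamming distance requires equal-length inputs")
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))

# Illustrative waveform samples for the original speech and the codec result.
original = [0.0, 0.5, 0.3, 0.7, 0.6, 0.9]
decoded  = [0.0, 0.4, 0.5, 0.8, 0.6, 0.7]

error = hamming_distance(waveform_shape(original), waveform_shape(decoded))
```

Here `waveform_shape` produces "10101" and "11101", so the error value is 1: the two shapes differ at a single position.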
  • step S104 it is determined whether the error value satisfies a preset fraud threshold.
  • the fraud threshold is mainly used to distinguish whether the original voice data has fraudulent behavior, and the user can preset it according to the actual situation.
  • the fraud threshold can be 10, 15, 20, etc. It should be understood that these examples of the fraud threshold are given only for ease of understanding and are not intended to limit the present application.
  • step S105 if the error value satisfies the fraud threshold, it is determined that the original voice data has fraudulent behavior.
  • step S106 if the error value does not meet the fraud threshold, it is determined that there is no fraud in the original voice data.
  • the fraud threshold is set to 0.02
  • the voice data voiceprint 0 input by the user is [1.0, 2.0, 3.0, 4.0, 5.0]
  • the output voiceprint 1 obtained after encoding and decoding by the deep autoencoder is [1.1, 2.1, 3.1, 4.1, 5.1]
  • the voice data is judged to be no fraud
  • the output voiceprint 2 obtained after encoding and decoding by the deep autoencoder is [3.0, 4.0, 5.0, 6.0, 7.0], in which case the error exceeds the threshold and the voice data is judged to be fraudulent
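Assuming mean squared error as the comparison metric (the embodiment does not fix the exact error formula), the voiceprint example above works out as:

```python
def reconstruction_error(original, decoded):
    # Mean squared error between the input voiceprint and the
    # reconstruction produced by the deep autoencoder.
    return sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)

FRAUD_THRESHOLD = 0.02  # the preset fraud threshold from the example

voiceprint_0 = [1.0, 2.0, 3.0, 4.0, 5.0]  # user input
voiceprint_1 = [1.1, 2.1, 3.1, 4.1, 5.1]  # close reconstruction
voiceprint_2 = [3.0, 4.0, 5.0, 6.0, 7.0]  # poor reconstruction

# error of about 0.01, below the threshold: judged as no fraud
no_fraud = reconstruction_error(voiceprint_0, voiceprint_1) <= FRAUD_THRESHOLD
# error of 4.0, above the threshold: judged as fraud
fraud = reconstruction_error(voiceprint_0, voiceprint_2) > FRAUD_THRESHOLD
```

Under this metric the two cases land cleanly on either side of the 0.02 threshold, matching the judgments in the example.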
  • the trained deep autoencoder is used to perform encoding and decoding operations on the original speech data to obtain an encoding and decoding result recovered by the deep autoencoder. Comparing the signal error between the original voice data and the encoding and decoding result confirms whether the original voice data involves fraudulent behavior. Since most of the training corpus consists of fraud-free normal samples, the deep autoencoder does not produce a large signal error when encoding and decoding fraud-free speech, thereby avoiding the difficulty traditional speech emotion models have in accurately identifying fraudulent samples. At the same time, this application does not require technicians to label whether the training corpus contains fraud, which greatly saves manpower and material resources.
  • referring to FIG. 3, an implementation flowchart of the method for obtaining a deep autoencoder provided in Embodiment 1 of the present application is shown. For the convenience of description, only the parts related to the present application are shown.
  • the method for identifying fraudulent behavior based on a deep autoencoder provided by the present application further includes: step S201, step S202, step S203, step S204, step S205, step S206 and step S207.
  • step S201 a local database is read, and training speech data is obtained from the local database.
  • the local database stores pre-analyzed voice data samples, which can be obtained through screening by technicians. Further, in order to avoid the limitations of subjective judgment, emotion fluctuations obtained through voice signal analysis can be used for screening, improving the accuracy of the samples.
  • step S202 a default autoencoder is constructed, and the default autoencoder consists of at least one autoencoder.
  • each autoencoder is a neural network whose learning target equals its input, and its structure is divided into two parts: an encoder and a decoder. Given an input space $\mathcal{X} \ni x$ and a feature space $\mathcal{F} \ni h$, the autoencoder solves the mappings $f$ (encoder) and $g$ (decoder) between the two to minimize the reconstruction error of the input features: $f, g = \arg\min_{f,g} \lVert X - g(f(X)) \rVert^2$
  • the hidden-layer feature $h = f(X)$ output by the encoder, that is, the "encoded feature", can be regarded as the representation of the input data $X$.
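As an illustrative sketch of the mappings f and g (the dimensions, sigmoid activation and parameter shapes are assumptions, not taken from the application), a single autoencoder's forward pass can be written as:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: a 5-dimensional input space X and a
# 3-dimensional feature space F (both are assumptions).
n_in, n_hidden = 5, 3
W, b = rng.normal(scale=0.1, size=(n_hidden, n_in)), np.zeros(n_hidden)  # encoder parameters
W_p, b_p = rng.normal(scale=0.1, size=(n_in, n_hidden)), np.zeros(n_in)  # decoder parameters

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def f(x):
    # Encoder mapping f: input x -> hidden feature h (the "encoded feature").
    return sigmoid(W @ x + b)

def g(h):
    # Decoder mapping g: hidden feature h -> reconstruction of x.
    return sigmoid(W_p @ h + b_p)

x = rng.random(n_in)
h = f(x)                       # representation of the input data
z = g(h)                       # reconstruction
error = np.sum((x - z) ** 2)   # reconstruction error minimized over f and g
```

Training then adjusts the parameters of f and g to drive `error` down over the training corpus.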
  • the number of autoencoders may be selected according to the actual situation of the user.
  • the number of autoencoders may be 4, 6, etc. It should be understood that these examples of the number are given only for ease of understanding and are not intended to limit the present application.
  • step S203 it is determined whether the default autoencoder consists of one autoencoder.
  • the training mode of the default autoencoder is determined by judging the composition number of the default autoencoder.
  • step S204 if the default autoencoder is composed of one autoencoder, the training speech data is input into the autoencoder to perform an autoencoder training operation to obtain a pre-trained deep autoencoder.
  • the above deep autoencoder can be obtained only by training the autoencoder.
  • step S205 if the default autoencoder consists of more than one autoencoder, the training speech data is input into the first autoencoder of the deep autoencoder to perform an autoencoder training operation, obtaining the first training data;
  • step S206 the first training data is input into the second autoencoder to perform the autoencoder training operation, and the remaining autoencoders are trained one by one in turn;
  • step S207 after all autoencoders complete the autoencoder training operation, a pre-trained deep autoencoder is obtained.
  • the user can preset the number of components of the deep autoencoder, and assign the corresponding training mode according to the number of components.
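The greedy layer-wise procedure of steps S205 to S207 can be sketched as follows; the layer sizes, learning rate and plain-numpy gradient descent are illustrative assumptions rather than the application's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_autoencoder(data, n_hidden, lr=0.5, epochs=200):
    """Train one autoencoder by gradient descent on the squared
    reconstruction error and return its encoding function."""
    n_in = data.shape[1]
    W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
    W2 = rng.normal(scale=0.1, size=(n_hidden, n_in))
    for _ in range(epochs):
        h = sigmoid(data @ W1)            # encode
        z = sigmoid(h @ W2)               # decode
        dz = (z - data) * z * (1 - z)     # gradient at the decoder output
        dh = (dz @ W2.T) * h * (1 - h)    # backpropagated to the hidden layer
        W2 -= lr * h.T @ dz / len(data)
        W1 -= lr * data.T @ dh / len(data)
    return lambda x: sigmoid(x @ W1)

# Stand-in for the training speech features read from the local database.
training_data = rng.random((64, 16))

# Greedy layer-wise training: each autoencoder is trained on the
# encoded output of the previous one.
hidden_sizes = [12, 8, 4]        # one entry per autoencoder (assumed)
encoders, layer_input = [], training_data
for n_hidden in hidden_sizes:
    encode = train_autoencoder(layer_input, n_hidden)
    encoders.append(encode)
    layer_input = encode(layer_input)  # training data for the next layer
```

With a single entry in `hidden_sizes`, the loop reduces to the one-autoencoder case of step S204.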
  • referring to FIG. 4, an implementation flowchart of the deep autoencoder optimization method provided in Embodiment 1 of the present application is shown. For the convenience of description, only parts related to the present application are shown.
  • before the foregoing step S207, the method further includes: step S301.
  • step S301 an optimization operation is performed on the deep auto-encoder based on the error back-propagation algorithm, so as to minimize the input and output errors of the deep auto-encoder.
  • the error back propagation algorithm is one of the most important and most widely used effective algorithms in automatic control.
  • the error back-propagation algorithm works by feeding the error data output by the output layer back to each autoencoder; each autoencoder modifies its weights according to the error data, thereby realizing self-optimization.
  • after each round of weight updating, it is judged whether the convergence condition is satisfied; if satisfied, the training ends; if not, the process returns to step 2 to continue iterating.
  • two rounds of tuning training are performed: the first adds Gaussian noise of a specific distribution to the input of the coding layer; the second forces the output of the coding layer to '0' or '1' by rounding, while in backpropagation the gradient is still calculated as a floating-point real number.
  • referring to FIG. 5, a flowchart of the implementation of step S301 in FIG. 4 is shown. For the convenience of description, only the parts related to the present application are shown.
  • step S301 specifically includes: step S401 and step S402.
  • step S401 Gaussian noise is added to the input end of the coding layer of the self-encoder to cause errors in the input data.
  • Gaussian noise is an error conforming to a Gaussian (normal) distribution.
  • the mean of the Gaussian noise is 0, and its variance σ² is predetermined and kept constant during the first tuning training; further, the variance σ² of the Gaussian noise is 0.3.
  • Gaussian noise of a specific distribution is added to the input end of the coding layer, so that the output of the coding layer of the deep autoencoder neural network obtained by training is approximated to a 0-1 Boolean distribution.
  • the decoder network is very sensitive to the output of the encoding layer: a very small change in the encoding-layer output will change the decoder output, while the goal of autoencoder optimization is to reconstruct the input vector as closely as possible, so the decoder output must remain relatively stable.
  • the output of the encoding layer will tend toward a 0-1 Boolean distribution, because under a Boolean distribution the encoding-layer output is least affected by the randomness of the noise, which ensures that the decoder output is stable.
  • step S402 when data is output from the output end of the coding layer of the encoder, a binarization operation is performed on the output data, so as to reduce the influence of the randomness of the input data on the output data.
  • the binarization operation refers to forcibly converting the output data of the coding layer to '0' or '1' by rounding; during optimization, the gradient is still calculated with the output data treated as floating-point real numbers.
  • the second tuning training is based on forcing the output of the coding layer to binary values; in this way, the deep autoencoder neural network achieves its best performance after training.
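The two tuning rounds can be summarized in a short sketch; treating the rounding step as the identity in the backward pass (a straight-through estimator) is one common reading of "the gradient is still calculated as a floating-point real number", and the sample values are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
NOISE_STD = np.sqrt(0.3)  # standard deviation for variance sigma^2 = 0.3

def first_stage(h):
    # First tuning round: add zero-mean Gaussian noise at the coding-layer
    # input so the trained coding-layer output tends toward a 0-1 distribution.
    return h + rng.normal(0.0, NOISE_STD, size=h.shape)

def second_stage_forward(h):
    # Second tuning round: force the coding-layer output to 0 or 1 by rounding.
    return np.round(h)

def second_stage_grad(upstream_grad):
    # In backpropagation the gradient is still calculated as a floating-point
    # real number, i.e. the rounding step is treated as the identity
    # (a straight-through estimator).
    return upstream_grad

h = np.array([0.1, 0.8, 0.4, 0.95])  # illustrative coding-layer outputs
binary = second_stage_forward(h)     # array([0., 1., 0., 1.])
```

The forward pass thus emits hard 0/1 codes while gradients still flow through as if the rounding were absent.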
  • the above training operation is obtained by minimizing with respect to $\theta^*, \theta'^*$, where the minimization is expressed as:

    $\theta^*, \theta'^* = \arg\min_{\theta, \theta'} \frac{1}{n} \sum_{i=1}^{n} E\big(x^{(i)}, z^{(i)}\big)$

  • where $n$ is the number of training data samples; $\theta^*$ and $\theta'^*$ represent the optimized parameter matrices of the encoder and decoder respectively; $x^{(i)}$ is the input of the autoencoder and $z^{(i)}$ is its reconstructed output; $E(x, z)$ is the loss function, expressed as:

    $E(x, z) = \sum_{k=1}^{N} (x_k - z_k)^2$

  • where $N$ is the vector dimension and $k$ is the dimension subscript.
  • the present application provides a method for identifying fraudulent behavior based on a deep autoencoder.
  • the trained deep autoencoder performs encoding and decoding operations on the original speech data to obtain an encoding and decoding result recovered by the deep autoencoder; by comparing the signal error between the original voice data and this result, it can be confirmed whether the original voice data involves fraudulent behavior.
  • since most of the training corpus consists of fraud-free normal samples, no large signal error arises when encoding and decoding fraud-free speech, thereby avoiding the difficulty traditional speech emotion models have in accurately identifying fraudulent samples.
  • this application does not require technicians to label whether the training corpus contains fraud, which greatly saves manpower and material resources.
  • a Gaussian noise of a specific distribution is added to the input end of the encoding layer, so that the output of the encoding layer of the trained deep autoencoder neural network approximates a 0-1 Boolean distribution.
  • the decoder network is very sensitive to the output of the encoding layer: a very small change in the encoding-layer output will change the decoder output, while the goal of autoencoder optimization is to reconstruct the input vector as closely as possible, so the decoder output must remain relatively stable.
  • the output of the encoding layer will tend toward a 0-1 Boolean distribution, because under a Boolean distribution the encoding-layer output is least affected by the randomness of the noise.
  • in the first tuning training, Gaussian noise of a specific distribution is added to the input of the coding layer; the second tuning training forces the output of the coding layer to binary values, so that the deep autoencoder neural network achieves its best performance after training.
  • the above-mentioned raw voice data can also be stored in a node of a blockchain.
  • the blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • Blockchain, essentially a decentralized database, is a chain of data blocks linked by cryptographic methods; each data block contains a batch of network transaction information, used to verify the validity of its information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • the present application may be used in numerous general-purpose or special-purpose computer system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments including any of the above systems or devices, and the like.
  • the application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer.
  • program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
  • the application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network.
  • program modules may be located in both local and remote computer storage media including storage devices.
  • the aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM) or the like.
  • Embodiment 2 of the present application provides a device for identifying fraudulent behavior based on a deep autoencoder, and the embodiment of the device corresponds to the embodiment of the method shown in FIG. 1 .
  • the device can be specifically applied to various electronic devices.
  • the device 100 for identifying fraudulent behavior based on a deep autoencoder includes: a voice acquisition module 101, an encoding and decoding module 102, a comparison operation module 103, a threshold judgment module 104, a first behavior determination module 105 and a second behavior determination module 106. Wherein:
  • the voice collection module 101 is used for receiving the original voice data collected by the audio collection device during the interview;
  • the encoding and decoding module 102 is used for inputting the original voice data into the pre-trained deep autoencoder to perform encoding and decoding operations to obtain encoding and decoding results;
  • the comparison operation module 103 is used to perform a comparison operation on the original voice data and the codec result to obtain an error value
  • Threshold judgment module 104 for judging whether the error value satisfies a preset fraud threshold
  • the first behavior determination module 105 is used to determine that the original voice data has fraudulent behavior if the error value satisfies the fraud threshold;
  • the second behavior determining module 106 is configured to determine that there is no fraudulent behavior in the original voice data if the error value does not meet the fraud threshold.
  • the face-to-face review refers to the scene where the reviewer and the reviewee conduct face-to-face review and question and answer
  • the application scenario of the face-to-face review may be "school interview”, “civil servant interview”, “loan interview”, etc.
  • the loan interview is a process in which the bank loan officer learns the borrower's loan motive and financial status through face-to-face conversation, and forms an anticipatory judgment of potential credit and fraud risks.
  • the audio collection device is mainly used to collect voice signals, and the audio collection device collects audio data in the interview environment through a microphone.
  • the original voice data refers to the voice information sent by the examinee collected during the face-to-face examination.
  • the voice data will be recognized through a voiceprint recognition network to obtain voiceprint data based on the voice data.
  • the deep auto-encoder is mainly used to identify fraudulent speech.
  • the deep auto-encoder consists of a speech encoder and a speech decoder.
  • the main function of the speech encoder is to encode the PCM (pulse code modulation) samples of the user's speech into a small number of bits (frames). This method makes the speech robust when the link produces bit errors, network jitter and bursty transmission.
  • in the speech decoder, the speech frames are first converted back into PCM speech samples, and then into speech waveforms.
  • the speech decoder converts the encoding result output by the speech encoder into speech output data.
  • under normal circumstances, the voice output data is consistent with the user voice fed into the voice encoder; if they are inconsistent, it means that this segment of user voice involves fraudulent behavior.
  • the pre-trained deep autoencoder refers to an autoencoder that has been trained before use so that the difference between its decoded data and the original input data is small, its training corpus having a distribution with a certain similarity to that of fraud-free speech data.
  • the autoencoder trained with this type of data can better fit the distribution of non-fraud speech data, and the error between the data recovered by decoding and the original data is small.
  • because the distribution of fraudulent voice data differs greatly from that of non-fraudulent voice data, fraudulent speech cannot be recovered well after being input into the autoencoder; therefore, fraudulent speech can be identified using an autoencoder.
  • the codec result refers to the voice output data converted by the above-mentioned voice decoder from the code result output by the voice encoder, and the voice output data is the codec result.
  • FIG. 2 shows a schematic diagram of the architecture of the autoencoder provided in Embodiment 1 of the present application.
  • the input data is passed through the encoder to obtain an encoding result, and then the input data is recovered by the decoder.
  • the comparison operation is mainly used to identify whether the sound wave shape of the original voice data and the codec result are similar.
  • the sound wave shapes corresponding to the original speech data and the codec result can be obtained through a Fourier transform processing operation, in which an upward segment of the waveform is represented as 1 and a downward segment as 0; all shapes can thus be represented by "1" and "0", yielding two sound-wave-shape texts of the same length, after which the error value between the original speech data and the codec result is calculated by the Hamming distance method.
  • the Hamming distance calculation method requires that the input texts have the same length; the Hamming distance is the number of positions at which the corresponding symbols of two texts differ.
  • the fraud threshold is mainly used to distinguish whether the original voice data has fraudulent behavior, and the user can preset it according to the actual situation.
  • the fraud threshold can be 10, 15, 20, etc. It should be understood that these examples of the fraud threshold are given only for ease of understanding and are not intended to limit the present application.
  • the fraud threshold is set to 0.02
  • the voice data voiceprint 0 input by the user is [1.0, 2.0, 3.0, 4.0, 5.0]
  • the output voiceprint 1 obtained after encoding and decoding by the deep autoencoder is [1.1, 2.1, 3.1, 4.1, 5.1]
  • the voice data is judged to be no fraud
  • the output voiceprint 2 obtained after encoding and decoding by the deep autoencoder is [3.0, 4.0, 5.0, 6.0, 7.0], in which case the error exceeds the threshold and the voice data is judged to be fraudulent
  • the device for identifying fraudulent behavior based on a deep autoencoder performs encoding and decoding operations on the original speech data using the trained deep autoencoder to obtain an encoding and decoding result recovered by the deep autoencoder. Comparing the signal error between the original voice data and the encoding and decoding result confirms whether the original voice data involves fraudulent behavior. Since most of the training corpus consists of fraud-free normal samples, the deep autoencoder does not produce a large signal error when encoding and decoding fraud-free speech, thereby avoiding the difficulty traditional speech emotion models have in accurately identifying fraudulent samples. At the same time, this application does not require technicians to label whether the training corpus contains fraud, which greatly saves manpower and material resources.
  • FIG. 7 shows a schematic structural diagram of the apparatus for obtaining a deep autoencoder provided by Embodiment 2 of the present application. For convenience of description, only the parts related to the present application are shown.
  • the above-mentioned device 100 for identifying fraudulent behavior based on a deep autoencoder further includes: a training data acquisition module 107, a construction module 108, a composition judgment module 109, a first result module 110, a second result module 111, a training operation module 112, and a deep autoencoder confirmation module 113, wherein:
  • the training data acquisition module 107 is used to read the local database and acquire training voice data in the local database;
  • a building module 108 configured to build a default autoencoder, where the default autoencoder consists of at least one autoencoder;
  • the composition judgment module 109 is used for judging whether the default autoencoder consists of a single autoencoder;
  • the first result module 110 is configured to input the training speech data into the autoencoder to perform an autoencoder training operation if the default autoencoder is composed of one autoencoder to obtain a pre-trained deep autoencoder;
  • the second result module 111 is configured to, if the default autoencoder consists of more than one autoencoder, input the training speech data into the first autoencoder of the deep autoencoder for an autoencoder training operation to obtain first training data;
  • the training operation module 112 is used for inputting the first training data to the second autoencoder to perform the autoencoder training operation, and train the remaining autoencoders one by one in turn;
  • the deep autoencoder confirmation module 113 is configured to obtain a pre-trained deep autoencoder after all autoencoders have completed the autoencoder training operation.
  • the local database pre-stores speech data samples that have already been analyzed; the samples may be obtained through screening by technicians. Further, to avoid the limitations of subjective judgment, emotional fluctuations can be analyzed from the speech signal for screening, improving the accuracy of the samples.
  • each autoencoder is a neural network whose learning target equals its input; its structure is divided into two parts, an encoder and a decoder. Given an input space X and a feature space F, the autoencoder solves for the mappings f and g between them that minimize the reconstruction error of the input features:
  f: X → F, g: F → X, f, g = argmin_{f,g} ‖X − g(f(X))‖²
  • the hidden-layer feature h output by the encoder, i.e. the "encoded feature", can be regarded as a representation of the input data X.
  • the number of self-encoders may be selected according to the actual situation of the user.
  • the number of self-encoders may be 4, 6, etc.; it should be understood that these examples of the number are only for convenience of understanding and are not used to limit the present application.
  • the training mode of the default autoencoder is determined by judging the composition number of the default autoencoder.
  • when the default autoencoder consists of a single autoencoder, the above deep autoencoder can be obtained simply by training that one autoencoder.
  • the user can preset the number of components of the deep autoencoder, and assign the corresponding training mode according to the number of components.
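The layer-by-layer flow of modules 110-113 might be sketched as below. `TinyAutoencoder` is a toy stand-in (not the patent's network), used only to show how each autoencoder is trained on the previous layer's output:

```python
class TinyAutoencoder:
    """Toy identity-like autoencoder; a hypothetical stand-in for the
    real speech autoencoders, used only to show the training flow."""
    def __init__(self):
        self.trained = False

    def train(self, data):
        # In a real system this would fit the weights; here we only
        # record that training happened and pass the "encoded" data on.
        self.trained = True
        return data

def train_deep_autoencoder(autoencoders, training_data):
    """Greedy layer-wise training: each autoencoder is trained on the
    output of the previous one, mirroring modules 110-113."""
    data = training_data
    for ae in autoencoders:
        data = ae.train(data)  # this layer's output feeds the next layer
    return autoencoders
```

With a single autoencoder in the list, the loop degenerates to the single-autoencoder case handled by module 110.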
  • FIG. 8 a schematic structural diagram of the apparatus for optimizing a deep autoencoder provided in Embodiment 2 of the present application is shown. For convenience of description, only parts related to the present application are shown.
  • the above-mentioned apparatus 100 for identifying fraudulent behavior based on a deep autoencoder further includes a tuning operation module 114, wherein:
  • the tuning operation module 114 is configured to perform tuning operations on the deep auto-encoder based on the error back-propagation algorithm, so as to minimize the input and output errors of the deep auto-encoder.
  • the error back propagation algorithm is one of the most important and most widely used effective algorithms in automatic control.
  • the error back-propagation algorithm works by feeding the error data output by the output layer back to each autoencoder; each autoencoder adjusts its weights according to that error data, thereby realizing self-optimization.
  • after checking whether the total network error meets the accuracy requirement: if it is satisfied, training ends; if not, return to step 2.
  • the first tuning pass adds Gaussian noise of a specific distribution at the input of the coding layer; the second forces the output of the coding layer to '0' or '1' by rounding, while in backpropagation the gradient is still calculated with floating-point real numbers.
  • the foregoing tuning operation module 114 specifically includes a first tuning operation sub-module and a second tuning operation sub-module, wherein:
  • the first tuning operation sub-module is used to add Gaussian noise to the input end of the coding layer of the self-encoder, so as to cause errors in the input data;
  • the second tuning operation sub-module is configured to perform a binarization operation on the output data when the output end of the self-encoder's coding layer outputs data, so as to reduce the influence of the randomness of the input data on the output data.
  • Gaussian noise is noise whose error follows a Gaussian (normal) distribution.
  • the mean value of Gaussian noise is 0, and the variance ⁇ 2 is predetermined and kept constant in the first tuning training. Further, the variance ⁇ 2 of the Gaussian noise is 0.3.
  • Gaussian noise of a specific distribution is added to the input end of the coding layer, so that the output of the coding layer of the deep autoencoder neural network obtained by training is approximated to a 0-1 Boolean distribution.
  • the decoder network is very sensitive to the output of the encoding layer: a very small change in the coding-layer output causes a different decoder output, while the goal of autoencoder optimization is an output that reconstructs the input vector as closely as possible; the decoder output is therefore relatively deterministic.
  • the output of the coding layer will tend toward a 0-1 Boolean distribution, because only under a Boolean distribution is the coding-layer output least affected by the randomness, ensuring that the decoder output is stable.
  • the binarization operation forcibly converts the output data of the coding layer to '0' or '1' by rounding, while the gradient is still computed with floating-point real numbers during self-optimization.
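A minimal sketch of the binarization described above, assuming a straight-through treatment in which rounding is applied on the forward pass while gradients pass through unchanged as floating-point real numbers (function names are illustrative, not from the source):

```python
def binarize_forward(x):
    """Forward pass: force each coding-layer output to 0.0 or 1.0
    by round-half-up, as described in the text."""
    return [1.0 if v >= 0.5 else 0.0 for v in x]

def binarize_backward(upstream_grad):
    """Backward pass (straight-through treatment): the rounding step is
    skipped, so gradients flow through unchanged as floating-point reals."""
    return list(upstream_grad)
```

The asymmetry is deliberate: the hard 0/1 values shape the forward signal, while the unrounded gradients keep the error minimization well defined.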
  • the above step S301 specifically includes: when the deep autoencoder propagates forward, forcing the output of the encoding layer to '0' or '1' by rounding; in backpropagation, the gradient is still calculated as a floating-point real number.
  • the second tuning pass builds on the first by forcing the output of the coding layer to binary values; in this way, the deep autoencoder neural network achieves its best performance after training.
  • the above training operation is obtained by minimizing θ*, expressed as:
  θ*, θ'* = argmin_{θ,θ'} (1/n) Σ_{i=1}^{n} E(x^{(i)}, z^{(i)})
  • n is the number of training data samples; θ = {w, b} and θ' = {wᵀ, b'} are the parameter matrices of the encoder and decoder; θ* and θ'* are the optimized parameter matrices
  • x^{(i)} is the input of the autoencoder and z^{(i)} = f'_θ'(f_θ(x^{(i)})) is its output
  • E(x, z) is the loss function, expressed as:
  E(x, z) = (1/N) Σ_{k=1}^{N} (x_k − z_k)²
  • N is the vector dimension and k is the dimension subscript
  • the present application provides a device for identifying fraudulent behavior based on a deep autoencoder, which performs encoding and decoding operations on the original speech data through the trained deep autoencoder and obtains a codec result restored by the deep autoencoder.
  • by comparing the signal error between the original voice data and the codec result, it can be confirmed whether the original voice data involves fraud.
  • no large signal error occurs when encoding and decoding fraud-free speech, thereby avoiding the problem that a traditional speech emotion model has difficulty accurately identifying fraudulent samples.
  • this application requires no technician to screen the training corpus for fraud, which greatly saves manpower and material resources.
  • a Gaussian noise with a specific distribution is added to the input of the encoding layer, so that the output of the encoding layer of the trained deep autoencoder neural network approximates a 0-1 Boolean distribution.
  • the decoder network is very sensitive to the output of the encoding layer: a very small change in the coding-layer output causes a different decoder output, while the goal of autoencoder optimization is an output that reconstructs the input vector as closely as possible; the decoder output is therefore relatively deterministic.
  • the output of the coding layer will tend toward a 0-1 Boolean distribution, because only under a Boolean distribution is the coding-layer output least affected by the randomness.
  • the second tuning pass builds on the first by forcing the output of the coding layer to binary values, so that the deep autoencoder neural network achieves its best performance after training.
  • FIG. 9 is a block diagram of the basic structure of a computer device according to this embodiment.
  • the computer device 200 includes a memory 210 , a processor 220 , and a network interface 230 that communicate with each other through a system bus. It should be noted that only the computer device 200 with components 210-230 is shown in the figure, but it should be understood that implementation of all of the shown components is not required, and more or less components may be implemented instead.
  • the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes but is not limited to microprocessors, application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA), digital signal processors (DSP), embedded devices, etc.
  • the computer equipment may be a desktop computer, a notebook computer, a palmtop computer, a cloud server and other computing equipment.
  • the computer device can perform human-computer interaction with the user through a keyboard, a mouse, a remote control, a touch pad or a voice control device.
  • the memory 210 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (eg, SD or DX memory, etc.), random access memory (RAM), static Random Access Memory (SRAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), Programmable Read Only Memory (PROM), magnetic memory, magnetic disks, optical disks, etc.
  • the computer-readable storage media can be non-volatile or volatile.
  • the memory 210 may be an internal storage unit of the computer device 200, such as a hard disk or a memory of the computer device 200.
  • the memory 210 may also be an external storage device of the computer device 200, such as a plug-in hard disk, a smart memory card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, flash memory card (Flash Card), etc.
  • the memory 210 may also include both the internal storage unit of the computer device 200 and its external storage device.
  • the memory 210 is generally used to store the operating system and various application software installed on the computer device 200 , such as computer-readable instructions for a method for identifying fraudulent behavior based on a deep autoencoder.
  • the memory 210 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 220 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips in some embodiments.
  • the processor 220 is typically used to control the overall operation of the computer device 200 .
  • the processor 220 is configured to execute computer-readable instructions stored in the memory 210 or process data, for example, computer-readable instructions for executing the deep autoencoder-based fraud identification method.
  • the network interface 230 may include a wireless network interface or a wired network interface, and the network interface 230 is generally used to establish a communication connection between the computer device 200 and other electronic devices.
  • the original speech data is encoded and decoded by a trained deep auto-encoder to obtain an encoding and decoding result restored based on the deep auto-encoder.
  • comparing the signal error between the original voice data and the codec result can confirm whether the original voice data involves fraud. Since most of the training corpus consists of normal, fraud-free samples, the deep autoencoder produces no large signal error when encoding and decoding fraud-free speech, thereby avoiding the problem that a traditional speech emotion model has difficulty accurately identifying fraudulent samples. At the same time, this application requires no technician to screen the training corpus for fraud, which greatly saves manpower and material resources.
  • the present application also provides another embodiment, that is, to provide a computer-readable storage medium, where the computer-readable storage medium stores computer-readable instructions, and the computer-readable instructions can be executed by at least one processor to The at least one processor is caused to perform the steps of the method for identifying fraudulent behavior based on a deep autoencoder as described above.
  • the original speech data is encoded and decoded by a trained deep auto-encoder to obtain an encoding and decoding result restored based on the deep auto-encoder.
  • comparing the signal error between the original voice data and the codec result can confirm whether the original voice data involves fraud. Since most of the training corpus consists of normal, fraud-free samples, the deep autoencoder produces no large signal error when encoding and decoding fraud-free speech, thereby avoiding the problem that a traditional speech emotion model has difficulty accurately identifying fraudulent samples. At the same time, this application requires no technician to screen the training corpus for fraud, which greatly saves manpower and material resources.
  • the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present application can be embodied in the form of a software product in essence or in a part that contributes to the prior art, and the computer software product is stored in a storage medium (such as ROM/RAM, magnetic disk, CD-ROM), including several instructions to make a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) execute the methods described in the various embodiments of this application.

Abstract

A deep-autoencoder-based fraud identification method, apparatus, computer device, and storage medium. The method performs encoding and decoding operations on original speech data with a trained deep autoencoder to obtain a codec result restored by the deep autoencoder; by comparing the signal error between the original speech data and the codec result, whether the original speech data involves fraud can be confirmed. Blockchain technology is also involved: the user's original speech data may be stored in a blockchain. Since most of the training corpus consists of normal, fraud-free samples, the deep autoencoder produces no large signal error when encoding and decoding fraud-free speech, which avoids the problem of difficulty in accurately identifying fraudulent samples; at the same time, no technician is needed to screen the training corpus for fraud, greatly saving manpower and material resources.

Description

Fraud identification method and apparatus, computer device, and storage medium
This application is based on, and claims priority from, Chinese invention patent application No. 202011286464.5, filed on November 17, 2020 and entitled "Fraud identification method and apparatus, computer device, and storage medium".
Technical Field
The present application relates to the technical field of speech signal processing, and in particular to a deep-autoencoder-based fraud identification method, apparatus, computer device, and storage medium.
Background
In a face-to-face loan review, a bank loan officer talks with the applicant in person to understand the applicant's loan motives and financial situation and to pre-judge potential credit and fraud risks. Nevertheless, criminals still defraud banks of loans, causing huge losses. Lie detection is important for preventing telephone fraud, assisting criminal investigation, and intelligence analysis, and is therefore a current research hotspot.
In one existing fraud identification method, sufficiently experienced technicians screen the training corpus for fraud, the labeled corpus is then used to train a speech emotion model, and finally the trained speech emotion model processes speech data to confirm whether fraud is present.
However, the applicant realized that traditional fraud identification methods are generally not intelligent: technicians must screen a large amount of training corpus, so the data-labeling workload is enormous; meanwhile, subjective judgment during screening often contains significant omissions, which greatly reduces identification accuracy.
Summary
The purpose of the embodiments of the present application is to provide a deep-autoencoder-based fraud identification method, apparatus, computer device, and storage medium, so as to solve the problems that traditional fraud identification methods require an enormous data-labeling workload and have low identification accuracy.
To solve the above technical problem, an embodiment of the present application provides a deep-autoencoder-based fraud identification method, adopting the following technical solution:
when a face-to-face review is conducted, receiving original speech data collected by an audio collection device;
inputting the original speech data into a pre-trained deep autoencoder for an encoding-decoding operation to obtain a codec result;
performing a comparison operation on the original speech data and the codec result to obtain an error value;
judging whether the error value satisfies a preset fraud threshold;
if the error value satisfies the fraud threshold, determining that the original speech data involves fraud;
if the error value does not satisfy the fraud threshold, determining that the original speech data does not involve fraud.
To solve the above technical problem, an embodiment of the present application further provides a deep-autoencoder-based fraud identification apparatus, adopting the following technical solution:
a speech collection module, configured to receive, when a face-to-face review is conducted, original speech data collected by an audio collection device;
a codec module, configured to input the original speech data into a pre-trained deep autoencoder for an encoding-decoding operation to obtain a codec result;
a comparison operation module, configured to perform a comparison operation on the original speech data and the codec result to obtain an error value;
a threshold judgment module, configured to judge whether the error value satisfies a preset fraud threshold;
a first behavior determination module, configured to determine that the original speech data involves fraud if the error value satisfies the fraud threshold;
a second behavior determination module, configured to determine that the original speech data does not involve fraud if the error value does not satisfy the fraud threshold.
To solve the above technical problem, an embodiment of the present application further provides a computer device, adopting the following technical solution:
comprising a memory and a processor, the memory storing computer-readable instructions which, when executed by the processor, implement the following steps of the deep-autoencoder-based fraud identification method:
when a face-to-face review is conducted, receiving original speech data collected by an audio collection device;
inputting the original speech data into a pre-trained deep autoencoder for an encoding-decoding operation to obtain a codec result;
performing a comparison operation on the original speech data and the codec result to obtain an error value;
judging whether the error value satisfies a preset fraud threshold;
if the error value satisfies the fraud threshold, determining that the original speech data involves fraud;
if the error value does not satisfy the fraud threshold, determining that the original speech data does not involve fraud.
To solve the above technical problem, an embodiment of the present application further provides a computer-readable storage medium, adopting the following technical solution:
the computer-readable storage medium stores computer-readable instructions which, when executed by a processor, implement the following steps of the deep-autoencoder-based fraud identification method:
when a face-to-face review is conducted, receiving original speech data collected by an audio collection device;
inputting the original speech data into a pre-trained deep autoencoder for an encoding-decoding operation to obtain a codec result;
performing a comparison operation on the original speech data and the codec result to obtain an error value;
judging whether the error value satisfies a preset fraud threshold;
if the error value satisfies the fraud threshold, determining that the original speech data involves fraud;
if the error value does not satisfy the fraud threshold, determining that the original speech data does not involve fraud.
Compared with the prior art, the deep-autoencoder-based fraud identification method, apparatus, computer device, and storage medium provided by the embodiments of the present application mainly have the following beneficial effects:
The trained deep autoencoder performs encoding and decoding operations on the original speech data to obtain a codec result restored by the deep autoencoder; by comparing the signal error between the original speech data and the codec result, whether the original speech data involves fraud can be confirmed. Since most of the training corpus consists of normal, fraud-free samples, the deep autoencoder produces no large signal error when encoding and decoding fraud-free speech, thereby avoiding the problem that a traditional speech emotion model has difficulty accurately identifying fraudulent samples. At the same time, the present application requires no technician to screen the training corpus for fraud, which greatly saves manpower and material resources.
Brief Description of the Drawings
To explain the solutions in the present application more clearly, the drawings needed in describing the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from them without creative effort.
FIG. 1 is a flowchart of an implementation of the deep-autoencoder-based fraud identification method provided in Embodiment 1 of the present application;
FIG. 2 is a schematic diagram of the autoencoder architecture provided in Embodiment 1 of the present application;
FIG. 3 is a flowchart of an implementation of the deep autoencoder acquisition method provided in Embodiment 1 of the present application;
FIG. 4 is a flowchart of an implementation of the deep autoencoder optimization method provided in Embodiment 1 of the present application;
FIG. 5 is a flowchart of an implementation of step S301 in FIG. 4;
FIG. 6 is a schematic structural diagram of the deep-autoencoder-based fraud identification apparatus provided in Embodiment 2 of the present application;
FIG. 7 is a schematic structural diagram of the deep autoencoder acquisition apparatus provided in Embodiment 2 of the present application;
FIG. 8 is a schematic structural diagram of the deep autoencoder optimization apparatus provided in Embodiment 2 of the present application;
FIG. 9 is a schematic structural diagram of an embodiment of a computer device according to the present application.
Detailed Description
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field to which the present application belongs; the terms used in the specification of the application herein are only for the purpose of describing specific embodiments and are not intended to limit the present application; the terms "including" and "having" and any variations thereof in the specification, claims, and above description of the drawings of the present application are intended to cover a non-exclusive inclusion. The terms "first", "second", and the like in the specification and claims of the present application or the above drawings are used to distinguish different objects, not to describe a specific order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. The appearance of this phrase in various places in the specification does not necessarily refer to the same embodiment, nor to a separate or alternative embodiment mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
To enable those skilled in the art to better understand the solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings.
Embodiment 1
Referring to FIG. 1, a flowchart of an implementation of the deep-autoencoder-based fraud identification method provided in Embodiment 1 of the present application is shown. For convenience of description, only the parts related to the present application are shown.
In step S101, when a face-to-face review is conducted, original speech data collected by an audio collection device is received.
In the embodiments of the present application, a face-to-face review refers to a scenario in which a reviewer questions the reviewed person in person; application scenarios include "school interviews", "civil-service interviews", "loan interviews", and so on. In the embodiments of the present application, taking a "loan interview" as an example, the review is a face-to-face conversation in which a bank loan officer learns the applicant's loan motives and financial situation and pre-judges potential credit and fraud risks.
In the embodiments of the present application, the audio collection device is mainly used to collect speech signals; it collects audio data in the review environment through a microphone.
In the embodiments of the present application, the original speech data refers to the speech uttered by the reviewed person and collected during the face-to-face review. The speech data is recognized by a voiceprint recognition network to obtain voiceprint data based on the speech data.
In step S102, the original speech data is input into a pre-trained deep autoencoder for an encoding-decoding operation to obtain a codec result.
In the embodiments of the present application, the deep autoencoder is mainly used to identify fraudulent speech. The deep autoencoder consists of a speech encoder and a speech decoder. The main function of the speech encoder is to encode the PCM (pulse-code modulation) samples of the user's speech into a small number of bits (frames). This approach makes the speech robust against link bit errors, network jitter, and burst transmission. At the receiving end, the speech frames are first decoded into PCM speech samples and then converted into a speech waveform; the speech decoder converts the encoding result output by the speech encoder into speech output data. Normally, the converted speech output data is consistent with the user speech fed to the speech encoder; if it is inconsistent, the user speech involves fraud.
In the embodiments of the present application, "pre-trained deep autoencoder" means that, before use, the autoencoder is trained with an objective function equal to the difference between the decoded data and the original input data. The distributions of fraud-free speech data share a certain similarity, so an autoencoder trained on this kind of data fits the distribution of fraud-free speech well, and the decoded data has a small error relative to the original data. Fraudulent speech data has a distribution quite different from that of fraud-free speech data, so it cannot be recovered well after being fed into the autoencoder. The autoencoder can therefore be used to identify fraudulent speech.
In the embodiments of the present application, the codec result refers to the speech output data into which the speech decoder converts the encoding result output by the speech encoder.
In the embodiments of the present application, referring to FIG. 2, a schematic diagram of the autoencoder architecture provided in Embodiment 1 is shown: the input data passes through the encoder to obtain the encoding result, and the decoder then restores the input data. f_θ(x) denotes the mapping function of the deep encoder neural network, characterizing the non-linear mapping from the input vector x to the code-layer representation vector y = f_θ(x), with y output as the encoded data; f'_θ'(y) denotes the mapping function of the deep decoder neural network, characterizing the non-linear mapping from the code-layer representation vector y to the reconstruction vector z = f'_θ'(y), with z output as the decoded data.
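The mappings y = f_θ(x) and z = f'_θ'(y) above can be sketched as a tied-weight encoder/decoder pair. The dimensions, weight values, and sigmoid non-linearity below are illustrative assumptions, not taken from the patent:

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def affine(weights, bias, vec):
    """weights is a list of rows; returns weights @ vec + bias."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

# Hypothetical 4-dim input and 2-dim code; θ = {W, b}, θ' = {Wᵀ, b'} (tied weights).
W = [[0.5, -0.2, 0.1, 0.3],
     [-0.4, 0.6, 0.2, -0.1]]
b = [0.0, 0.0]
W_T = [list(col) for col in zip(*W)]  # transpose of W for the decoder
b_prime = [0.0, 0.0, 0.0, 0.0]

def encode(x):
    """y = f_θ(x): non-linear map from input vector to code layer."""
    return [sigmoid(v) for v in affine(W, b, x)]

def decode(y):
    """z = f'_θ'(y): reconstruction of the input from the code."""
    return [sigmoid(v) for v in affine(W_T, b_prime, y)]
```

Training would adjust W, b, and b' so that decode(encode(x)) approaches x; here the weights are fixed only to show the shape of the two mappings.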
In step S103, a comparison operation is performed on the original speech data and the codec result to obtain an error value.
In the embodiments of the present application, the comparison operation is mainly used to determine whether the waveform shapes of the original speech data and the codec result are similar.
In the embodiments of the present application, the waveform shapes corresponding to the original speech data and the codec result can be obtained through a Fourier-transform processing operation, where an upward waveform segment is represented as 1 and a downward segment as 0; all the shapes can then be represented with "1"s and "0"s to obtain two waveform-shape texts of equal length, and the error value between the original speech data and the codec result is calculated with the Hamming distance.
In the embodiments of the present application, the Hamming distance calculation requires the input texts to have the same length; the Hamming distance is the number of positions at which the two texts differ.
In step S104, it is judged whether the error value satisfies the preset fraud threshold.
In the embodiments of the present application, the fraud threshold is mainly used to distinguish whether the original speech data involves fraud, and the user may preset it according to the actual situation. As examples, the fraud threshold may be 10, 15, 20, and so on; it should be understood that these examples are only for convenience of understanding and do not limit the present application.
In step S105, if the error value satisfies the fraud threshold, it is determined that the original speech data involves fraud.
In step S106, if the error value does not satisfy the fraud threshold, it is determined that the original speech data does not involve fraud.
In practical applications, suppose the fraud threshold is set to 0.02 and voiceprint 0 of the speech data input by the user is [1.0, 2.0, 3.0, 4.0, 5.0]. If the output voiceprint 1 obtained after encoding and decoding by the deep autoencoder is [1.1, 2.1, 3.1, 4.1, 5.1], then the mean squared error between voiceprint 0 and voiceprint 1 = [(1.1-1.0)^2+(2.1-2.0)^2+(3.1-3.0)^2+(4.1-4.0)^2+(5.1-5.0)^2]/5 = 0.01 < 0.02, and the speech data is judged to be fraud-free. If the output voiceprint 2 obtained after encoding and decoding is [3.0, 4.0, 5.0, 6.0, 7.0], then the mean squared error between voiceprint 0 and voiceprint 2 = [(3.0-1.0)^2+(4.0-2.0)^2+(5.0-3.0)^2+(6.0-4.0)^2+(7.0-5.0)^2]/5 = 4.0 > 0.02, and the speech data is judged to involve fraud.
The deep-autoencoder-based fraud identification method provided in Embodiment 1 of the present application performs encoding and decoding operations on the original speech data with a trained deep autoencoder to obtain a codec result restored by the deep autoencoder; by comparing the signal error between the original speech data and the codec result, whether the original speech data involves fraud can be confirmed. Since most of the training corpus consists of normal, fraud-free samples, the deep autoencoder produces no large signal error when encoding and decoding fraud-free speech, thereby avoiding the problem that a traditional speech emotion model has difficulty accurately identifying fraudulent samples. At the same time, the present application requires no technician to screen the training corpus for fraud, which greatly saves manpower and material resources.
Continuing to refer to FIG. 3, a flowchart of an implementation of the deep autoencoder acquisition method provided in Embodiment 1 of the present application is shown. For convenience of description, only the parts related to the present application are shown.
In some optional implementations of Embodiment 1, the deep-autoencoder-based fraud identification method provided by the present application further includes step S201, step S202, step S203, and step S204.
In step S201, a local database is read, and training speech data is acquired from the local database.
In the embodiments of the present application, analyzed speech data samples are pre-stored in the local database. The speech data samples may be obtained through screening by technicians; further, to avoid the limitations of subjective judgment, emotional fluctuations may be analyzed from the speech signal for screening, improving the accuracy of the samples.
In step S202, a default autoencoder is constructed, the default autoencoder consisting of at least one autoencoder.
In the embodiments of the present application, each autoencoder is a neural network whose learning target equals its input; its structure is divided into two parts, an encoder and a decoder. Given an input space X and a feature space F, the autoencoder solves for the mappings f and g between them that minimize the reconstruction error of the input features:
f: X → F
g: F → X
f, g = argmin_{f,g} ‖X − g(f(X))‖²
After the solution is completed, the hidden-layer feature h output by the encoder, i.e. the "encoded feature", can be regarded as a representation of the input data X.
In the embodiments of the present application, the number of autoencoders may be selected according to the user's actual situation. As examples, the number may be 4, 6, and so on; it should be understood that these examples are only for convenience of understanding and do not limit the present application.
In step S203, it is judged whether the default autoencoder consists of a single autoencoder.
In the embodiments of the present application, the training mode of the default autoencoder is determined by judging how many autoencoders it consists of.
In step S204, if the default autoencoder consists of a single autoencoder, the training speech data is input into that autoencoder for an autoencoder training operation to obtain the pre-trained deep autoencoder.
In the embodiments of the present application, when the default autoencoder contains only one autoencoder, training that autoencoder is sufficient to obtain the above deep autoencoder.
In step S205, if the default autoencoder consists of more than one autoencoder, the training speech data is input into the first autoencoder of the deep autoencoder for an autoencoder training operation to obtain first training data.
In step S206, the first training data is input into the second autoencoder for an autoencoder training operation, and the remaining autoencoders are trained one by one in turn;
In step S207, after all autoencoders have completed the autoencoder training operation, the pre-trained deep autoencoder is obtained.
In the embodiments of the present application, the user may preset the number of autoencoders composing the deep autoencoder and assign a corresponding training mode according to that number.
Continuing to refer to FIG. 4, a flowchart of an implementation of the deep autoencoder optimization method provided in Embodiment 1 of the present application is shown. For convenience of description, only the parts related to the present application are shown.
In some optional implementations of Embodiment 1, before the above step S207, the method further includes step S301.
In step S301, a tuning operation is performed on the deep autoencoder based on the error back-propagation algorithm, so as to minimize the input-output error of the deep autoencoder.
In the embodiments of the present application, the error back-propagation algorithm is one of the most important and most widely applied effective algorithms in automatic control. In its implementation, the error data output by the output layer is fed back to each autoencoder, and each autoencoder adjusts its weights according to that error data, thereby realizing self-optimization.
In practical applications, the tuning operation proceeds as follows:
1) Initialize.
2) Input a training sample pair and compute the output of each layer.
3) Compute the network output error.
4) Compute the error signal of each layer.
5) Adjust the weights of each layer.
6) Check whether the total network error meets the accuracy requirement.
If it does, training ends; if not, return to step 2.
There are two specific tuning passes: the first adds Gaussian noise of a specific distribution at the input of the coding layer; the second forces the output of the coding layer to be binarized to '0' or '1' by rounding, while the gradient is still computed with floating-point real numbers during back-propagation.
Continuing to refer to FIG. 5, a flowchart of an implementation of step S301 in FIG. 4 is shown. For convenience of description, only the parts related to the present application are shown.
In some optional implementations of Embodiment 1, the above step S301 specifically includes step S401 and step S402.
In step S401, Gaussian noise is added at the input of the coding layer of the autoencoder, so that the input data contains errors.
In the embodiments of the present application, Gaussian noise is noise whose error follows a Gaussian normal distribution. In some cases, adding suitable Gaussian noise to standard data introduces a certain error that gives the data experimental value. Specifically, the mean of the Gaussian noise is 0, and the variance σ² is predetermined and kept constant during the first tuning pass; further, the variance σ² of the Gaussian noise is 0.3.
In the embodiments of the present application, Gaussian noise of a specific distribution is added at the input of the coding layer, so that the coding-layer output of the trained deep autoencoder neural network approximates a 0-1 Boolean distribution. This is because the decoder network is very sensitive to the coding layer's output: a tiny change in the coding-layer output leads to a different decoder output, while the goal of autoencoder optimization is an output that reconstructs the input vector as closely as possible, so the decoder output is relatively deterministic. When Gaussian noise of a specific distribution is added at the coding layer's input, the coding-layer output tends toward a 0-1 Boolean distribution as the network adapts to this randomness during training, because only under a Boolean distribution is the coding-layer output least affected by the randomness, ensuring a stable decoder output.
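The first tuning pass described above (zero-mean Gaussian noise with variance σ² = 0.3 added at the coding layer's input) might be sketched as follows; the function name and the seeding are illustrative assumptions, not from the source:

```python
import random

SIGMA_SQUARED = 0.3  # variance stated in the text; the mean is 0

def add_gaussian_noise(code_layer_input, variance=SIGMA_SQUARED, seed=None):
    """First tuning pass: corrupt the coding layer's input with zero-mean
    Gaussian noise of fixed variance, which pushes the trained code layer
    toward a 0-1 Boolean distribution."""
    rnd = random.Random(seed)
    std = variance ** 0.5  # random.gauss takes a standard deviation
    return [v + rnd.gauss(0.0, std) for v in code_layer_input]
```

The variance stays constant for the whole pass, matching the text's statement that σ² is predetermined and unchanged during the first tuning training.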
In step S402, when data is output from the coding layer of the autoencoder, a binarization operation is performed on the output data, so as to reduce the influence of the randomness of the input data on the output data.
In the embodiments of the present application, the binarization operation forcibly converts the output data of the coding layer to '0' or '1' by rounding, while the gradient is still computed with floating-point real numbers during self-optimization.
In the embodiments of the present application, tuning with the error back-propagation algorithm always attempts to minimize the error; when training under the mechanism of forced binarization at the coding layer's output, the floating-point real numbers output by the coding layer will also tend toward a 0-1 Boolean distribution, because the error can be minimized only under a 0-1 Boolean distribution.
In the embodiments of the present application, the first tuning pass adds Gaussian noise of a specific distribution at the input of the coding layer, and on that basis the second tuning pass forces the output of the coding layer to binary values; the deep autoencoder neural network trained this way achieves the best performance.
In some optional implementations of Embodiment 1, the above training operation is obtained by minimizing θ*, expressed as:
θ*, θ'* = argmin_{θ,θ'} (1/n) Σ_{i=1}^{n} E(x^{(i)}, z^{(i)})
where n is the number of training data samples; θ = {w, b} and θ' = {wᵀ, b'} are the parameter matrices of the encoder and decoder, respectively; θ* and θ'* are the optimized parameter matrices; x^{(i)} is the input of the autoencoder and z^{(i)} = f'_θ'(f_θ(x^{(i)})) is its output; E(x, z) is the loss function, expressed as:
E(x, z) = (1/N) Σ_{k=1}^{N} (x_k − z_k)²
where N is the vector dimension and k is the dimension subscript.
In summary, the present application provides a deep-autoencoder-based fraud identification method that performs encoding and decoding operations on the original speech data with a trained deep autoencoder to obtain a codec result restored by the deep autoencoder; by comparing the signal error between the original speech data and the codec result, whether the original speech data involves fraud can be confirmed. Since most of the training corpus consists of normal, fraud-free samples, the deep autoencoder produces no large signal error when encoding and decoding fraud-free speech, avoiding the problem that a traditional speech emotion model has difficulty accurately identifying fraudulent samples; at the same time, no technician is needed to screen the training corpus for fraud, greatly saving manpower and material resources. Meanwhile, Gaussian noise of a specific distribution is added at the input of the coding layer so that the coding-layer output of the trained deep autoencoder neural network approximates a 0-1 Boolean distribution. This is because the decoder network is very sensitive to the coding layer's output: a tiny change in the coding-layer output leads to a different decoder output, while the goal of autoencoder optimization is an output that reconstructs the input vector as closely as possible, so the decoder output is relatively deterministic. When such noise is added, the coding-layer output tends toward a 0-1 Boolean distribution as the network adapts to the randomness during training, because only under a Boolean distribution is the coding-layer output least affected by it, ensuring a stable decoder output. On the basis of the first tuning pass, which adds Gaussian noise of a specific distribution at the coding layer's input, the second tuning pass forces the coding layer's output to binary values; the deep autoencoder neural network trained this way achieves the best performance.
It should be emphasized that, to further guarantee the privacy and security of the above original speech data, the original speech data may also be stored in a node of a blockchain.
The blockchain referred to in the present application is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database, a chain of data blocks generated in association using cryptographic methods; each data block contains information on a batch of network transactions, used to verify the validity of its information (anti-counterfeiting) and to generate the next block. A blockchain may comprise a blockchain underlying platform, a platform product service layer, an application service layer, and so on.
The present application may be used in numerous general-purpose or special-purpose computer system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments including any of the above systems or devices, and so on. The present application may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The present application may also be practiced in distributed computing environments in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in local and remote computer storage media, including storage devices.
Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by instructing the relevant hardware through computer-readable instructions, which can be stored in a computer-readable storage medium; when executed, the computer-readable instructions may include the processes of the embodiments of the above methods. The aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, or a read-only memory (ROM), or a random access memory (RAM), and so on.
It should be understood that although the steps in the flowcharts of the drawings are displayed sequentially as indicated by the arrows, these steps are not necessarily executed in that order. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; their execution order is not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
Embodiment 2
With further reference to FIG. 6, as an implementation of the method shown in FIG. 1 above, Embodiment 2 of the present application provides a deep-autoencoder-based fraud identification apparatus. This apparatus embodiment corresponds to the method embodiment shown in FIG. 1, and the apparatus can be applied to various electronic devices.
As shown in FIG. 6, the deep-autoencoder-based fraud identification apparatus 100 provided in Embodiment 2 of the present application includes: a speech collection module 101, an encoding operation module 102, a comparison operation module 103, a threshold judgment module 104, a first behavior determination module 105, and a second behavior determination module 106, wherein:
the speech collection module 101 is configured to receive, when a face-to-face review is conducted, original speech data collected by an audio collection device;
the codec module 102 is configured to input the original speech data into a pre-trained deep autoencoder for an encoding-decoding operation to obtain a codec result;
the comparison operation module 103 is configured to perform a comparison operation on the original speech data and the codec result to obtain an error value;
the threshold judgment module 104 is configured to judge whether the error value satisfies a preset fraud threshold;
the first behavior determination module 105 is configured to determine that the original speech data involves fraud if the error value satisfies the fraud threshold;
the second behavior determination module 106 is configured to determine that the original speech data does not involve fraud if the error value does not satisfy the fraud threshold.
In the embodiments of the present application, a face-to-face review refers to a scenario in which a reviewer questions the reviewed person in person; application scenarios include "school interviews", "civil-service interviews", "loan interviews", and so on. In the embodiments of the present application, taking a "loan interview" as an example, the review is a face-to-face conversation in which a bank loan officer learns the applicant's loan motives and financial situation and pre-judges potential credit and fraud risks.
In the embodiments of the present application, the audio collection device is mainly used to collect speech signals; it collects audio data in the review environment through a microphone.
In the embodiments of the present application, the original speech data refers to the speech uttered by the reviewed person and collected during the face-to-face review. The speech data is recognized by a voiceprint recognition network to obtain voiceprint data based on the speech data.
In the embodiments of the present application, the deep autoencoder is mainly used to identify fraudulent speech. The deep autoencoder consists of a speech encoder and a speech decoder. The main function of the speech encoder is to encode the PCM (pulse-code modulation) samples of the user's speech into a small number of bits (frames). This approach makes the speech robust against link bit errors, network jitter, and burst transmission. At the receiving end, the speech frames are first decoded into PCM speech samples and then converted into a speech waveform; the speech decoder converts the encoding result output by the speech encoder into speech output data. Normally, the converted speech output data is consistent with the user speech fed to the speech encoder; if it is inconsistent, the user speech involves fraud.
In the embodiments of the present application, "pre-trained deep autoencoder" means that, before use, the autoencoder is trained with an objective function equal to the difference between the decoded data and the original input data. The distributions of fraud-free speech data share a certain similarity, so an autoencoder trained on this kind of data fits the distribution of fraud-free speech well, and the decoded data has a small error relative to the original data. Fraudulent speech data has a distribution quite different from that of fraud-free speech data, so it cannot be recovered well after being fed into the autoencoder. The autoencoder can therefore be used to identify fraudulent speech.
In the embodiments of the present application, the codec result refers to the speech output data into which the speech decoder converts the encoding result output by the speech encoder.
In the embodiments of the present application, referring to FIG. 2, a schematic diagram of the autoencoder architecture provided in Embodiment 1 is shown: the input data passes through the encoder to obtain the encoding result, and the decoder then restores the input data. f_θ(x) denotes the mapping function of the deep encoder neural network, characterizing the non-linear mapping from the input vector x to the code-layer representation vector y = f_θ(x), with y output as the encoded data; f'_θ'(y) denotes the mapping function of the deep decoder neural network, characterizing the non-linear mapping from the code-layer representation vector y to the reconstruction vector z = f'_θ'(y), with z output as the decoded data.
In the embodiments of the present application, the comparison operation is mainly used to determine whether the waveform shapes of the original speech data and the codec result are similar.
In the embodiments of the present application, the waveform shapes corresponding to the original speech data and the codec result can be obtained through a Fourier-transform processing operation, where an upward waveform segment is represented as 1 and a downward segment as 0; all the shapes can then be represented with "1"s and "0"s to obtain two waveform-shape texts of equal length, and the error value between the original speech data and the codec result is calculated with the Hamming distance.
In the embodiments of the present application, the Hamming distance calculation requires the input texts to have the same length; the Hamming distance is the number of positions at which the two texts differ.
In the embodiments of the present application, the fraud threshold is mainly used to distinguish whether the original speech data involves fraud, and the user may preset it according to the actual situation. As examples, the fraud threshold may be 10, 15, 20, and so on; it should be understood that these examples are only for convenience of understanding and do not limit the present application.
In practical applications, suppose the fraud threshold is set to 0.02 and voiceprint 0 of the speech data input by the user is [1.0, 2.0, 3.0, 4.0, 5.0]. If the output voiceprint 1 obtained after encoding and decoding by the deep autoencoder is [1.1, 2.1, 3.1, 4.1, 5.1], then the mean squared error between voiceprint 0 and voiceprint 1 = [(1.1-1.0)^2+(2.1-2.0)^2+(3.1-3.0)^2+(4.1-4.0)^2+(5.1-5.0)^2]/5 = 0.01 < 0.02, and the speech data is judged to be fraud-free. If the output voiceprint 2 obtained after encoding and decoding is [3.0, 4.0, 5.0, 6.0, 7.0], then the mean squared error between voiceprint 0 and voiceprint 2 = [(3.0-1.0)^2+(4.0-2.0)^2+(5.0-3.0)^2+(6.0-4.0)^2+(7.0-5.0)^2]/5 = 4.0 > 0.02, and the speech data is judged to involve fraud.
The deep-autoencoder-based fraud identification apparatus provided in Embodiment 2 of the present application performs encoding and decoding operations on the original speech data with a trained deep autoencoder to obtain a codec result restored by the deep autoencoder; by comparing the signal error between the original speech data and the codec result, whether the original speech data involves fraud can be confirmed. Since most of the training corpus consists of normal, fraud-free samples, the deep autoencoder produces no large signal error when encoding and decoding fraud-free speech, thereby avoiding the problem that a traditional speech emotion model has difficulty accurately identifying fraudulent samples. At the same time, the present application requires no technician to screen the training corpus for fraud, which greatly saves manpower and material resources.
Continuing to refer to FIG. 7, a schematic structural diagram of the deep autoencoder acquisition apparatus provided in Embodiment 2 of the present application is shown. For convenience of description, only the parts related to the present application are shown.
In some optional implementations of Embodiment 2, the above deep-autoencoder-based fraud identification apparatus 100 further includes: a training data acquisition module 107, a construction module 108, a composition judgment module 109, a first result module 110, a second result module 111, a training operation module 112, and a deep autoencoder confirmation module 113, wherein:
the training data acquisition module 107 is configured to read a local database and acquire training speech data from the local database;
the construction module 108 is configured to construct a default autoencoder, the default autoencoder consisting of at least one autoencoder;
the composition judgment module 109 is configured to judge whether the default autoencoder consists of a single autoencoder;
the first result module 110 is configured to, if the default autoencoder consists of a single autoencoder, input the training speech data into that autoencoder for an autoencoder training operation to obtain the pre-trained deep autoencoder;
the second result module 111 is configured to, if the default autoencoder consists of more than one autoencoder, input the training speech data into the first autoencoder of the deep autoencoder for an autoencoder training operation to obtain first training data;
the training operation module 112 is configured to input the first training data into the second autoencoder for an autoencoder training operation and to train the remaining autoencoders one by one in turn;
the deep autoencoder confirmation module 113 is configured to obtain the pre-trained deep autoencoder after all autoencoders have completed the autoencoder training operation.
In the embodiments of the present application, analyzed speech data samples are pre-stored in the local database. The speech data samples may be obtained through screening by technicians; further, to avoid the limitations of subjective judgment, emotional fluctuations may be analyzed from the speech signal for screening, improving the accuracy of the samples.
In the embodiments of the present application, each autoencoder is a neural network whose learning target equals its input; its structure is divided into two parts, an encoder and a decoder. Given an input space X and a feature space F, the autoencoder solves for the mappings f and g between them that minimize the reconstruction error of the input features:
f: X → F
g: F → X
f, g = argmin_{f,g} ‖X − g(f(X))‖²
After the solution is completed, the hidden-layer feature h output by the encoder, i.e. the "encoded feature", can be regarded as a representation of the input data X.
In the embodiments of the present application, the number of autoencoders may be selected according to the user's actual situation. As examples, the number may be 4, 6, and so on; it should be understood that these examples are only for convenience of understanding and do not limit the present application.
In the embodiments of the present application, the training mode of the default autoencoder is determined by judging how many autoencoders it consists of.
In the embodiments of the present application, when the default autoencoder contains only one autoencoder, training that autoencoder is sufficient to obtain the above deep autoencoder.
In the embodiments of the present application, the user may preset the number of autoencoders composing the deep autoencoder and assign a corresponding training mode according to that number.
Continuing to refer to FIG. 8, a schematic structural diagram of the deep autoencoder optimization apparatus provided in Embodiment 2 of the present application is shown. For convenience of description, only the parts related to the present application are shown.
In some optional implementations of Embodiment 2, the above deep-autoencoder-based fraud identification apparatus 100 further includes a tuning operation module 114, wherein:
the tuning operation module 114 is configured to perform a tuning operation on the deep autoencoder based on the error back-propagation algorithm, so as to minimize the input-output error of the deep autoencoder.
In the embodiments of the present application, the error back-propagation algorithm is one of the most important and most widely applied effective algorithms in automatic control. In its implementation, the error data output by the output layer is fed back to each autoencoder, and each autoencoder adjusts its weights according to that error data, thereby realizing self-optimization.
In practical applications, the tuning operation proceeds as follows:
1) Initialize.
2) Input a training sample pair and compute the output of each layer.
3) Compute the network output error.
4) Compute the error signal of each layer.
5) Adjust the weights of each layer.
6) Check whether the total network error meets the accuracy requirement.
If it does, training ends; if not, return to step 2.
There are two specific tuning passes: the first adds Gaussian noise of a specific distribution at the input of the coding layer; the second forces the output of the coding layer to be binarized to '0' or '1' by rounding, while the gradient is still computed with floating-point real numbers during back-propagation.
In some optional implementations of Embodiment 2, the above tuning operation module 114 specifically includes a first tuning operation sub-module and a second tuning operation sub-module, wherein:
the first tuning operation sub-module is configured to add Gaussian noise at the input of the coding layer of the autoencoder, so that the input data contains errors;
the second tuning operation sub-module is configured to perform a binarization operation on the output data when the coding layer of the autoencoder outputs data, so as to reduce the influence of the randomness of the input data on the output data.
In the embodiments of the present application, Gaussian noise is noise whose error follows a Gaussian normal distribution. In some cases, adding suitable Gaussian noise to standard data introduces a certain error that gives the data experimental value. Specifically, the mean of the Gaussian noise is 0, and the variance σ² is predetermined and kept constant during the first tuning pass; further, the variance σ² of the Gaussian noise is 0.3.
In the embodiments of the present application, Gaussian noise of a specific distribution is added at the input of the coding layer, so that the coding-layer output of the trained deep autoencoder neural network approximates a 0-1 Boolean distribution. This is because the decoder network is very sensitive to the coding layer's output: a tiny change in the coding-layer output leads to a different decoder output, while the goal of autoencoder optimization is an output that reconstructs the input vector as closely as possible, so the decoder output is relatively deterministic. When such noise is added, the coding-layer output tends toward a 0-1 Boolean distribution as the network adapts to this randomness during training, because only under a Boolean distribution is the coding-layer output least affected by the randomness, ensuring a stable decoder output.
In the embodiments of the present application, the binarization operation forcibly converts the output data of the coding layer to '0' or '1' by rounding, while the gradient is still computed with floating-point real numbers during self-optimization.
When the deep autoencoder propagates forward, Gaussian noise of a specific distribution is added at the input of the coding layer.
In the embodiments of the present application, the mean of the Gaussian noise is 0, and the variance σ² is predetermined and kept constant during the first tuning pass; further, the variance σ² of the Gaussian noise is 0.3.
In the embodiments of the present application, Gaussian noise of a specific distribution is added at the input of the coding layer, so that the coding-layer output of the trained deep autoencoder neural network approximates a 0-1 Boolean distribution, for the reasons given above, ensuring a stable decoder output.
In some optional implementations of Embodiment 2, the above step S301 specifically includes: when the deep autoencoder propagates forward, forcing the output of the coding layer to '0' or '1' by rounding; in back-propagation, the gradient is still computed with floating-point real numbers.
In the embodiments of the present application, tuning with the error back-propagation algorithm always attempts to minimize the error; when training under the mechanism of forced binarization at the coding layer's output, the floating-point real numbers output by the coding layer will also tend toward a 0-1 Boolean distribution, because the error can be minimized only under a 0-1 Boolean distribution.
In the embodiments of the present application, the first tuning pass adds Gaussian noise of a specific distribution at the input of the coding layer, and on that basis the second tuning pass forces the output of the coding layer to binary values; the deep autoencoder neural network trained this way achieves the best performance.
In some optional implementations of Embodiment 2, the above training operation is obtained by minimizing θ*, expressed as:
θ*, θ'* = argmin_{θ,θ'} (1/n) Σ_{i=1}^{n} E(x^{(i)}, z^{(i)})
where n is the number of training data samples; θ = {w, b} and θ' = {wᵀ, b'} are the parameter matrices of the encoder and decoder, respectively; θ* and θ'* are the optimized parameter matrices; x^{(i)} is the input of the autoencoder and z^{(i)} = f'_θ'(f_θ(x^{(i)})) is its output; E(x, z) is the loss function, expressed as:
E(x, z) = (1/N) Σ_{k=1}^{N} (x_k − z_k)²
where N is the vector dimension and k is the dimension subscript.
In summary, the present application provides a deep-autoencoder-based fraud identification apparatus that performs encoding and decoding operations on the original speech data with a trained deep autoencoder to obtain a codec result restored by the deep autoencoder; by comparing the signal error between the original speech data and the codec result, whether the original speech data involves fraud can be confirmed. Since most of the training corpus consists of normal, fraud-free samples, the deep autoencoder produces no large signal error when encoding and decoding fraud-free speech, avoiding the problem that a traditional speech emotion model has difficulty accurately identifying fraudulent samples; at the same time, no technician is needed to screen the training corpus for fraud, greatly saving manpower and material resources. Meanwhile, Gaussian noise of a specific distribution is added at the input of the coding layer so that the coding-layer output of the trained deep autoencoder neural network approximates a 0-1 Boolean distribution: the decoder network is very sensitive to the coding layer's output, a tiny change of which leads to a different decoder output, while the goal of autoencoder optimization is an output that reconstructs the input vector as closely as possible, so the decoder output is relatively deterministic; as the network adapts to the added randomness during training, the coding-layer output tends toward a 0-1 Boolean distribution, under which it is least affected by the randomness, ensuring a stable decoder output. On the basis of the first tuning pass, the second tuning pass forces the coding layer's output to binary values; the deep autoencoder neural network trained this way achieves the best performance.
To solve the above technical problem, an embodiment of the present application further provides a computer device. Referring specifically to FIG. 9, FIG. 9 is a block diagram of the basic structure of the computer device of this embodiment.
The computer device 200 includes a memory 210, a processor 220, and a network interface 230 communicatively connected to one another via a system bus. It should be noted that the figure shows only the computer device 200 with components 210-230, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead. As will be appreciated by those skilled in the art, the computer device here is a device capable of automatically performing numerical computation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), embedded devices, and the like.
The computer device may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The computer device may interact with a user via a keyboard, mouse, remote control, touchpad, voice-controlled device, or the like.
The memory 210 includes at least one type of readable storage medium, including flash memory, hard disks, multimedia cards, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical discs, and the like; the computer-readable storage medium may be non-volatile or volatile. In some embodiments, the memory 210 may be an internal storage unit of the computer device 200, such as a hard disk or internal memory of the computer device 200. In other embodiments, the memory 210 may also be an external storage device of the computer device 200, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the computer device 200. Of course, the memory 210 may also include both an internal storage unit of the computer device 200 and an external storage device thereof. In this embodiment, the memory 210 is generally used to store the operating system and various application software installed on the computer device 200, such as computer-readable instructions of the deep-autoencoder-based fraud identification method. In addition, the memory 210 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 220 may, in some embodiments, be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 220 is generally used to control the overall operation of the computer device 200. In this embodiment, the processor 220 is configured to run the computer-readable instructions or process data stored in the memory 210, for example, to run the computer-readable instructions of the deep-autoencoder-based fraud identification method.
The network interface 230 may include a wireless network interface or a wired network interface, and is generally used to establish a communication connection between the computer device 200 and other electronic devices.
In the deep-autoencoder-based fraud identification method provided by the present application, the trained deep autoencoder encodes and decodes the original speech data to obtain a reconstructed encoding-decoding result; by comparing the signal error between the original speech data and the encoding-decoding result, it can be determined whether the original speech data involves fraud. Since most of the training corpus consists of normal, fraud-free samples, the deep autoencoder produces no large signal error when encoding and decoding fraud-free speech, thereby avoiding the difficulty that conventional speech-emotion models have in accurately identifying fraudulent samples; at the same time, the present application does not require technicians to screen the training corpus for fraud, which greatly saves manpower and material resources.
The present application further provides another embodiment, namely a computer-readable storage medium storing computer-readable instructions executable by at least one processor, so as to cause the at least one processor to perform the steps of the deep-autoencoder-based fraud identification method described above.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and including several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of the present application.
Obviously, the embodiments described above are only some, not all, of the embodiments of the present application; the accompanying drawings show preferred embodiments of the present application but do not limit its patent scope. The present application may be implemented in many different forms; rather, these embodiments are provided so that the disclosure of the present application will be thorough and complete. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing specific embodiments, or make equivalent substitutions for some of the technical features therein. Any equivalent structure made using the contents of the specification and drawings of the present application, applied directly or indirectly in other related technical fields, likewise falls within the patent protection scope of the present application.

Claims (20)

  1. A deep-autoencoder-based fraud identification method, comprising the following steps:
    receiving, during a face-to-face review, original speech data collected by an audio collection device;
    inputting the original speech data into a pre-trained deep autoencoder for an encoding-decoding operation to obtain an encoding-decoding result;
    comparing the original speech data with the encoding-decoding result to obtain an error value;
    determining whether the error value satisfies a preset fraud threshold;
    if the error value satisfies the fraud threshold, determining that the original speech data involves fraud; and
    if the error value does not satisfy the fraud threshold, determining that the original speech data does not involve fraud.
  2. The deep-autoencoder-based fraud identification method according to claim 1, wherein before the step of inputting the original speech data into a pre-trained deep autoencoder for an encoding-decoding operation to obtain an encoding-decoding result, the method further comprises:
    reading a local database, and obtaining training speech data from the local database;
    constructing a default autoencoder, the default autoencoder being composed of at least one autoencoder;
    determining whether the default autoencoder is composed of a single autoencoder;
    if the default autoencoder is composed of a single autoencoder, inputting the training speech data into the autoencoder for an autoencoder training operation to obtain the pre-trained deep autoencoder;
    if the default autoencoder is composed of more than one autoencoder, inputting the training speech data into the first autoencoder of the deep autoencoder for the autoencoder training operation to obtain first training data;
    inputting the first training data into the second autoencoder for the autoencoder training operation, and training the remaining autoencoders one by one in sequence; and
    obtaining the pre-trained deep autoencoder after all of the autoencoders have completed the autoencoder training operation.
  3. The deep-autoencoder-based fraud identification method according to claim 2, wherein before the step of obtaining the pre-trained deep autoencoder after all of the autoencoders have completed the autoencoder training operation, the method further comprises:
    performing a tuning operation on the deep autoencoder based on an error backpropagation algorithm, so as to minimize the error between the input and the output of the deep autoencoder.
  4. The deep-autoencoder-based fraud identification method according to claim 3, wherein the step of performing a tuning operation on the deep autoencoder based on an error backpropagation algorithm so as to minimize the error between the input and the output of the deep autoencoder specifically comprises:
    adding Gaussian noise to the input of the encoding layer of the autoencoder, so as to introduce an error into the input data; and
    when the output of the encoding layer of the autoencoder outputs data, performing a binarization operation on the output data, so as to reduce the influence of the randomness of the input data on the output data.
  5. The deep-autoencoder-based fraud identification method according to claim 2, wherein the training operation is performed by minimization to obtain θ*, the minimization being expressed as:

    $$\theta^{*},\ \theta'^{*} = \arg\min_{\theta,\theta'} \frac{1}{n} \sum_{i=1}^{n} E\!\left(x^{(i)}, z^{(i)}\right)$$

    wherein n denotes the number of training data samples; θ = {w, b} and θ' = {w^T, b'} denote the parameter matrices of the encoder and the decoder, respectively; θ* and θ'* denote the optimized parameter matrices; x^(i) is the input of the autoencoder and z^(i) = f'_{θ'}(f_θ(x^(i))) is the output of the autoencoder; E(x, z) is the loss function, expressed as:

    $$E(x, z) = \frac{1}{N} \sum_{k=1}^{N} \left(x_{k} - z_{k}\right)^{2}$$

    wherein N is the vector dimension and k is the dimension index.
  6. The deep-autoencoder-based fraud identification method according to claim 1, wherein after the step of receiving, during a face-to-face review, original speech data collected by an audio collection device, the method further comprises the following step:
    storing the original speech data in a blockchain.
  7. A deep-autoencoder-based fraud identification apparatus, comprising:
    a speech collection module, configured to receive, during a face-to-face review, original speech data collected by an audio collection device;
    an encoding-decoding module, configured to input the original speech data into a pre-trained deep autoencoder for an encoding-decoding operation to obtain an encoding-decoding result;
    a comparison operation module, configured to compare the original speech data with the encoding-decoding result to obtain an error value;
    a threshold determination module, configured to determine whether the error value satisfies a preset fraud threshold;
    a first behavior determination module, configured to determine that the original speech data involves fraud if the error value satisfies the fraud threshold; and
    a second behavior determination module, configured to determine that the original speech data does not involve fraud if the error value does not satisfy the fraud threshold.
  8. The deep-autoencoder-based fraud identification apparatus according to claim 7, further comprising:
    a training data acquisition module, configured to read a local database and obtain training speech data from the local database;
    a construction module, configured to construct a default autoencoder, the default autoencoder being composed of at least one autoencoder;
    a composition determination module, configured to determine whether the default autoencoder is composed of a single autoencoder;
    a first result module, configured to, if the default autoencoder is composed of a single autoencoder, input the training speech data into the autoencoder for an autoencoder training operation to obtain the pre-trained deep autoencoder;
    a second result module, configured to, if the default autoencoder is composed of more than one autoencoder, input the training speech data into the first autoencoder of the deep autoencoder for the autoencoder training operation to obtain first training data;
    a training operation module, configured to input the first training data into the second autoencoder for the autoencoder training operation, and to train the remaining autoencoders one by one in sequence; and
    a deep autoencoder confirmation module, configured to obtain the pre-trained deep autoencoder after all of the autoencoders have completed the autoencoder training operation.
  9. A computer device, comprising a memory and a processor, the memory storing computer-readable instructions, wherein the processor, when executing the computer-readable instructions, implements the following steps of the deep-autoencoder-based fraud identification method:
    receiving, during a face-to-face review, original speech data collected by an audio collection device;
    inputting the original speech data into a pre-trained deep autoencoder for an encoding-decoding operation to obtain an encoding-decoding result;
    comparing the original speech data with the encoding-decoding result to obtain an error value;
    determining whether the error value satisfies a preset fraud threshold;
    if the error value satisfies the fraud threshold, determining that the original speech data involves fraud; and
    if the error value does not satisfy the fraud threshold, determining that the original speech data does not involve fraud.
  10. The computer device according to claim 9, wherein before the step of inputting the original speech data into a pre-trained deep autoencoder for an encoding-decoding operation to obtain an encoding-decoding result, the steps further comprise:
    reading a local database, and obtaining training speech data from the local database;
    constructing a default autoencoder, the default autoencoder being composed of at least one autoencoder;
    determining whether the default autoencoder is composed of a single autoencoder;
    if the default autoencoder is composed of a single autoencoder, inputting the training speech data into the autoencoder for an autoencoder training operation to obtain the pre-trained deep autoencoder;
    if the default autoencoder is composed of more than one autoencoder, inputting the training speech data into the first autoencoder of the deep autoencoder for the autoencoder training operation to obtain first training data;
    inputting the first training data into the second autoencoder for the autoencoder training operation, and training the remaining autoencoders one by one in sequence; and
    obtaining the pre-trained deep autoencoder after all of the autoencoders have completed the autoencoder training operation.
  11. The computer device according to claim 10, wherein before the step of obtaining the pre-trained deep autoencoder after all of the autoencoders have completed the autoencoder training operation, the steps further comprise:
    performing a tuning operation on the deep autoencoder based on an error backpropagation algorithm, so as to minimize the error between the input and the output of the deep autoencoder.
  12. The computer device according to claim 11, wherein the step of performing a tuning operation on the deep autoencoder based on an error backpropagation algorithm so as to minimize the error between the input and the output of the deep autoencoder specifically comprises:
    adding Gaussian noise to the input of the encoding layer of the autoencoder, so as to introduce an error into the input data; and
    when the output of the encoding layer of the autoencoder outputs data, performing a binarization operation on the output data, so as to reduce the influence of the randomness of the input data on the output data.
  13. The computer device according to claim 10, wherein the training operation is performed by minimization to obtain θ*, the minimization being expressed as:

    $$\theta^{*},\ \theta'^{*} = \arg\min_{\theta,\theta'} \frac{1}{n} \sum_{i=1}^{n} E\!\left(x^{(i)}, z^{(i)}\right)$$

    wherein n denotes the number of training data samples; θ = {w, b} and θ' = {w^T, b'} denote the parameter matrices of the encoder and the decoder, respectively; θ* and θ'* denote the optimized parameter matrices; x^(i) is the input of the autoencoder and z^(i) = f'_{θ'}(f_θ(x^(i))) is the output of the autoencoder; E(x, z) is the loss function, expressed as:

    $$E(x, z) = \frac{1}{N} \sum_{k=1}^{N} \left(x_{k} - z_{k}\right)^{2}$$

    wherein N is the vector dimension and k is the dimension index.
  14. The computer device according to claim 9, wherein after the step of receiving, during a face-to-face review, original speech data collected by an audio collection device, the method further comprises the following step:
    storing the original speech data in a blockchain.
  15. A computer-readable storage medium, wherein computer-readable instructions are stored on the computer-readable storage medium, and the computer-readable instructions, when executed by a processor, implement the following steps of the deep-autoencoder-based fraud identification method:
    receiving, during a face-to-face review, original speech data collected by an audio collection device;
    inputting the original speech data into a pre-trained deep autoencoder for an encoding-decoding operation to obtain an encoding-decoding result;
    comparing the original speech data with the encoding-decoding result to obtain an error value;
    determining whether the error value satisfies a preset fraud threshold;
    if the error value satisfies the fraud threshold, determining that the original speech data involves fraud; and
    if the error value does not satisfy the fraud threshold, determining that the original speech data does not involve fraud.
  16. The computer-readable storage medium according to claim 15, wherein before the step of inputting the original speech data into a pre-trained deep autoencoder for an encoding-decoding operation to obtain an encoding-decoding result, the steps further comprise:
    reading a local database, and obtaining training speech data from the local database;
    constructing a default autoencoder, the default autoencoder being composed of at least one autoencoder;
    determining whether the default autoencoder is composed of a single autoencoder;
    if the default autoencoder is composed of a single autoencoder, inputting the training speech data into the autoencoder for an autoencoder training operation to obtain the pre-trained deep autoencoder;
    if the default autoencoder is composed of more than one autoencoder, inputting the training speech data into the first autoencoder of the deep autoencoder for the autoencoder training operation to obtain first training data;
    inputting the first training data into the second autoencoder for the autoencoder training operation, and training the remaining autoencoders one by one in sequence; and
    obtaining the pre-trained deep autoencoder after all of the autoencoders have completed the autoencoder training operation.
  17. The computer-readable storage medium according to claim 16, wherein before the step of obtaining the pre-trained deep autoencoder after all of the autoencoders have completed the autoencoder training operation, the steps further comprise:
    performing a tuning operation on the deep autoencoder based on an error backpropagation algorithm, so as to minimize the error between the input and the output of the deep autoencoder.
  18. The computer-readable storage medium according to claim 17, wherein the step of performing a tuning operation on the deep autoencoder based on an error backpropagation algorithm so as to minimize the error between the input and the output of the deep autoencoder specifically comprises:
    adding Gaussian noise to the input of the encoding layer of the autoencoder, so as to introduce an error into the input data; and
    when the output of the encoding layer of the autoencoder outputs data, performing a binarization operation on the output data, so as to reduce the influence of the randomness of the input data on the output data.
  19. The computer-readable storage medium according to claim 16, wherein the training operation is performed by minimization to obtain θ*, the minimization being expressed as:

    $$\theta^{*},\ \theta'^{*} = \arg\min_{\theta,\theta'} \frac{1}{n} \sum_{i=1}^{n} E\!\left(x^{(i)}, z^{(i)}\right)$$

    wherein n denotes the number of training data samples; θ = {w, b} and θ' = {w^T, b'} denote the parameter matrices of the encoder and the decoder, respectively; θ* and θ'* denote the optimized parameter matrices; x^(i) is the input of the autoencoder and z^(i) = f'_{θ'}(f_θ(x^(i))) is the output of the autoencoder; E(x, z) is the loss function, expressed as:

    $$E(x, z) = \frac{1}{N} \sum_{k=1}^{N} \left(x_{k} - z_{k}\right)^{2}$$

    wherein N is the vector dimension and k is the dimension index.
  20. The computer-readable storage medium according to claim 15, wherein after the step of receiving, during a face-to-face review, original speech data collected by an audio collection device, the method further comprises the following step:
    storing the original speech data in a blockchain.
PCT/CN2021/096471 2020-11-17 2021-05-27 Fraud identification method and apparatus, computer device and storage medium WO2022105169A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011286464.5A CN112331230A (zh) 2020-11-17 2020-11-17 Fraud identification method and apparatus, computer device and storage medium
CN202011286464.5 2020-11-17

Publications (1)

Publication Number Publication Date
WO2022105169A1 true WO2022105169A1 (zh) 2022-05-27

Family

ID=74321486

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/096471 WO2022105169A1 (zh) 2020-11-17 2021-05-27 Fraud identification method and apparatus, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN112331230A (zh)
WO (1) WO2022105169A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112331230A (zh) 2020-11-17 2021-02-05 平安科技(深圳)有限公司 Fraud identification method and apparatus, computer device and storage medium
CN113243918B (zh) * 2021-06-11 2021-11-30 深圳般若计算机系统股份有限公司 Risk detection method and apparatus based on multimodal concealed-information testing
CN114937455B (zh) * 2022-07-21 2022-10-11 中国科学院自动化研究所 Speech detection method and apparatus, device and storage medium
CN116304762A (zh) * 2023-05-17 2023-06-23 杭州致成电子科技有限公司 Load disaggregation method and apparatus

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105636047A (zh) * 2014-10-29 2016-06-01 中兴通讯股份有限公司 Fraudulent user detection method, apparatus and system
CN107222865A (zh) * 2017-04-28 2017-09-29 北京大学 Real-time telecom-fraud detection method and system based on suspicious-behavior identification
US9837079B2 * 2012-11-09 2017-12-05 Mattersight Corporation Methods and apparatus for identifying fraudulent callers
CN107680602A (zh) * 2017-08-24 2018-02-09 平安科技(深圳)有限公司 Speech fraud identification method, apparatus, terminal device and storage medium
CN107958215A (zh) * 2017-11-23 2018-04-24 深圳市分期乐网络科技有限公司 Anti-fraud identification method, apparatus, server and storage medium
CN109559217A (zh) * 2018-10-25 2019-04-02 平安科技(深圳)有限公司 Blockchain-based loan data processing method, apparatus, device and storage medium
CN110705585A (zh) * 2019-08-22 2020-01-17 深圳壹账通智能科技有限公司 Online fraud identification method, apparatus, computer device and storage medium
CN112331230A (zh) * 2020-11-17 2021-02-05 平安科技(深圳)有限公司 Fraud identification method and apparatus, computer device and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101999902A (zh) * 2009-09-03 2011-04-06 上海天岸电子科技有限公司 Voiceprint lie detector and voiceprint lie-detection method
CN108806695A (zh) * 2018-04-17 2018-11-13 平安科技(深圳)有限公司 Self-updating anti-fraud method, apparatus, computer device and storage medium
CN109636061B (zh) * 2018-12-25 2023-04-18 深圳市南山区人民医院 Training method, apparatus, device and storage medium for a medical-insurance fraud prediction network
CN110222554A (zh) * 2019-04-16 2019-09-10 深圳壹账通智能科技有限公司 Fraud identification method and apparatus, electronic device and storage medium
CN111178523B (zh) * 2019-08-02 2023-06-06 腾讯科技(深圳)有限公司 Behavior detection method and apparatus, electronic device and storage medium
CN110473557B (zh) * 2019-08-22 2021-05-28 浙江树人学院(浙江树人大学) Speech signal encoding and decoding method based on a deep autoencoder


Also Published As

Publication number Publication date
CN112331230A (zh) 2021-02-05


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21893324

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21893324

Country of ref document: EP

Kind code of ref document: A1