CN108566627A - A kind of method and system identifying fraud text message using deep learning - Google Patents
- Publication number
- CN108566627A (application CN201711205293.7A)
- Authority
- CN
- China
- Prior art keywords
- deep learning
- short message
- vector
- word
- sample
- Prior art date
- 2017-11-27
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W12/00—Security arrangements; Authentication; Protecting privacy or anonymity
- H04W12/12—Detection or prevention of fraud
- H04W12/128—Anti-malware arrangements, e.g. protection against SMS fraud or mobile malware
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/12—Messaging; Mailboxes; Announcements
- H04W4/14—Short messaging services, e.g. short message services [SMS] or unstructured supplementary service data [USSD]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W12/00—Security arrangements; Authentication; Protecting privacy or anonymity
- H04W12/12—Detection or prevention of fraud
Abstract
The invention discloses a method and system for identifying fraud short messages using deep learning, to solve the problem that the ever-changing features of fraud short messages lead to low algorithm recognition accuracy. The method includes: acquiring the text data of short message samples and performing word segmentation processing; converting the word-segmented text data into word vectors using Word2Vec; converting the word vectors into sentence vectors using the LSTM algorithm; taking the sentence vectors as the input vectors of a softmax classifier to train a deep learning model; and identifying fraud short messages according to the output result of the trained softmax classifier. The invention provides a method and system for identifying fraud short messages using deep learning, improving the ability to accurately identify fraud short messages.
Description
Technical Field
The invention relates to the technical field of communication security, in particular to a method and a system for recognizing fraud short messages by utilizing deep learning.
Background
Short messages serve as a carrier for transmitting information among massive numbers of customers and establish an effective channel for communication between them. With the wide use of short messages, however, the phenomenon of fraudulent content transmitted through them has become more and more serious, bringing great inconvenience to the normal information exchange of mobile phone users, and more and more users are consequently cheated.
While creating economic benefits, short messages thus also bring serious social costs and reputational losses to operators.
At present, research on fraud short message identification mainly combines feature extraction with traditional machine learning algorithms. Feature extraction, however, requires a large amount of manpower and material resources, and with the rapid updating and iteration of information the features of fraud short messages also change greatly, which seriously reduces the recognition accuracy of these algorithms. In addition, short messages are short texts: their length is short and their features are sparse. In particular, fraud short messages often have non-standard structure and content in order to evade filtering mechanisms, so traditional text feature extraction methods are not fully suitable for short message classification.
Patent publication No. CN102547709A provides a method and a mobile phone for confirming fraud short messages. The method includes: comparing the received short message content with a preset keyword database; if the keywords match or partially match, comparing the short message content with a preset fraud short message database; if it matches, determining that the short message is a fraud short message and filtering it; if it partially matches, determining that the short message is a suspected fraud short message and prompting the user to guard against it. The mobile phone comprises a first comparison unit, a second comparison unit, a filtering unit, and a display unit. This method and mobile phone can confirm whether a short message is a fraud short message and can filter and guard against fraud short messages. However, its recognition rate is very low when faced with varied and irregular fraud short messages.
Disclosure of Invention
The invention aims to provide a method and a system for recognizing fraud short messages by deep learning, to solve the problem of low algorithm recognition accuracy caused by the ever-changing features of fraud short messages.
In order to achieve the purpose, the invention adopts the technical scheme that:
a method for recognizing fraud short messages by deep learning comprises the following steps:
acquiring text data of a short message sample and performing word segmentation processing;
converting the text data after word segmentation into word vectors by adopting Word2Vec;
converting the word vector into a sentence vector by adopting an LSTM algorithm;
taking the sentence vector as an input vector of a softmax classifier to train a deep learning model;
and recognizing fraud short messages according to the output result of the trained softmax classifier.
Further, the step of acquiring the text data of the short message samples and performing word segmentation processing specifically includes:
collecting text data of all short message samples;
removing a non-text part in the text data by adopting a regularization method;
dividing the short message sample into a negative sample and a positive sample, and dividing the short message sample into a training sample and a testing sample according to a preset proportion;
performing word segmentation processing on the short message samples by adopting the jieba word segmentation tool;
stop words are introduced to remove invalid words from the text.
Further, the step of training the deep learning model specifically includes:
selecting the result with the maximum probability value as the output short message category;
calculating a loss function and performing back propagation;
adjusting the weight of the deep learning model to a preset threshold value;
inputting the test sample into the deep learning model, and calculating accuracy, recall rate and F value;
and optimizing the deep learning model by utilizing the self-learning capability of the neural network.
Further, the step of converting the text data after word segmentation into word vectors by using Word2Vec specifically includes:
counting the number m of the keywords in the fraud short message feature library;
converting a word into an n-dimensional word vector x by using a one-hot vector;
establishing an n×m weight matrix w to map the n-dimensional vector to the hidden neurons with dimension 1×m;
obtaining the vector W by back propagation, and obtaining the 1×m word vector W(i) by multiplying it with the word vector x;
and adding the word vectors corresponding to the fraud keywords appearing in each short message to obtain the text vector d of the short message.
Further, the step of converting the word vector into the sentence vector by using the LSTM algorithm specifically includes:
arranging the word vectors in a preset order, assuming the sentence is formed of m word vectors x_t;
initializing the model parameters W_f, U_f, b_f, W_a, U_a, b_a, W_i, U_i, b_i, W_o, U_o, b_o;
passing x_t into the forgetting gate f_t and updating the forgetting gate weights W_f, U_f, b_f; wherein,
f_t = σ(W_f h_{t-1} + U_f x_t + b_f);
wherein W_f, U_f, b_f are the coefficients and bias of the linear relationship, and σ is the sigmoid activation function;
updating the input gate parameters i_t and a_t; wherein,
i_t = σ(W_i h_{t-1} + U_i x_t + b_i);
a_t = tanh(W_a h_{t-1} + U_a x_t + b_a);
wherein W_a, U_a, b_a, W_i, U_i, b_i are the coefficients and biases of the linear relationships, and σ is the sigmoid activation function;
updating the cell state C_t of the model, wherein
C_t = C_{t-1} ⊙ f_t + i_t ⊙ a_t;
wherein ⊙ is the Hadamard product;
updating the output gate parameters o_t and h_t and outputting the predicted value ŷ_t of the current sequence index; wherein,
o_t = σ(W_o h_{t-1} + U_o x_t + b_o);
h_t = o_t ⊙ tanh(C_t);
ŷ_t = σ(V h_t + c), wherein V and c are the weight and bias of the output layer.
Further, the softmax function is formulated as:
a_j^l = e^{z_j^l} / Σ_k e^{z_k^l};
wherein z_j^l is the input of the jth neuron in the lth layer, and a_j^l represents the output of the jth neuron of the current layer.
further, the method also comprises the following steps:
and if the fraud short message is determined, intercepting the short message.
A system for recognizing fraud messages using deep learning, comprising:
the processing module is used for acquiring text data of the short message sample and performing word segmentation processing;
the Word vector module is used for converting the text data after word segmentation into word vectors by adopting Word2Vec;
the sentence vector module is used for converting the word vector into a sentence vector by adopting an LSTM algorithm;
the training module is used for taking the sentence vector as an input vector of a softmax classifier so as to train a deep learning model;
and the recognition module is used for recognizing fraud short messages according to the output result of the softmax classifier of the trained deep learning model.
Further, the processing module specifically includes:
the acquisition unit is used for acquiring text data of all short message samples;
the removing unit is used for removing the non-text part in the text data by adopting a regularization method;
the dividing unit is used for dividing the short message sample into a negative sample and a positive sample and dividing the short message sample into a training sample and a test sample according to a preset proportion;
the word segmentation unit is used for performing word segmentation processing on the short message samples by adopting the jieba word segmentation tool;
and the introducing unit is used for introducing stop words to remove invalid words in the text.
Further, the training module specifically includes:
the selection unit is used for selecting the result with the maximum probability value as the output short message category;
the computing unit is used for computing a loss function and performing back propagation;
the adjusting unit is used for adjusting the weight of the deep learning model to a preset threshold value;
the input unit is used for inputting the test sample into the deep learning model and calculating the accuracy, the recall rate and the F value;
and the tuning unit is used for tuning the deep learning model by utilizing the self-learning capability of the neural network.
Compared with the prior art, the invention has the following advantages:
the invention provides a method and a system for identifying fraud short messages by utilizing deep learning, which improve the capability of accurately identifying the fraud short messages.
Drawings
FIG. 1 is a flowchart of a method for identifying fraud short messages by deep learning according to an embodiment;
FIG. 2 is a diagram of the LSTM algorithm provided by the first embodiment for fraud message identification;
FIG. 3 is a block diagram of a system for recognizing fraud messages by deep learning according to the second embodiment.
Detailed Description
The following are specific embodiments of the present invention and are further described with reference to the drawings, but the present invention is not limited to these embodiments.
Example one
The embodiment provides a method for identifying fraud short messages by deep learning, as shown in fig. 1, comprising the steps of:
s11: acquiring text data of a short message sample and performing word segmentation processing;
s12: converting the text data after word segmentation into word vectors by adopting Word2Vec;
s13: converting the word vector into a sentence vector by adopting an LSTM algorithm;
s14: taking the sentence vector as an input vector of a softmax classifier to train a deep learning model;
s15: recognizing fraud short messages according to the output result of the softmax classifier of the trained deep learning model;
s16: and if the fraud short message is determined, intercepting the short message.
In recent years, deep learning algorithms have been applied to the field of natural language processing, with superior results compared to traditional models. In natural language processing, recurrent neural networks (RNNs) and recursive neural networks are commonly used methods. Their role is to encode an input in matrix form into a one-dimensional vector of lower dimension while retaining most of the useful information. RNNs come in many variants, such as the common RNN as well as the GRU, the LSTM, and the like.
The embodiment provides a method for effectively identifying fraud short messages under mass data by means of a deep learning algorithm. When a short message sender sends a short message to the short message gateway, the deep learning analysis system identifies the content of the short message; if the deep learning model judges it to be a fraud short message, the message is intercepted and does not reach the receiver. The deep learning algorithm greatly improves the accuracy of recognizing fraud short messages and avoids the low recognition rate caused by their diversification.
In this embodiment, step S11 is to obtain text data of the short message sample and perform word segmentation processing.
Wherein, step S11 specifically includes:
collecting text data of all short message samples;
removing a non-text part in the text data by adopting a regularization method;
dividing the short message sample into a negative sample and a positive sample, and dividing the short message sample into a training sample and a test sample according to a preset proportion;
performing word segmentation processing on the short message samples by adopting the jieba word segmentation tool;
stop words are introduced to remove invalid words in the text.
Specifically, regularization here refers to the fact that, in linear algebra theory, an ill-posed problem is usually defined by a set of linear algebraic equations, and this set of equations usually results from an ill-posed inverse problem with a large condition number. A large condition number means that rounding errors or other errors can severely affect the outcome of the problem.
The jieba word segmentation tool realizes efficient word-graph scanning based on a prefix dictionary and generates a directed acyclic graph of all possible word combinations of the Chinese characters in a sentence; dynamic programming is used to search the maximum-probability path and find the maximum segmentation combination based on word frequency; for unknown words, an HMM model based on the word-forming capability of Chinese characters is adopted together with the Viterbi algorithm.
Firstly, sample text data containing fraud short messages and normal short messages is collected. The non-text part of the text data is removed by a regularization method, keeping only the text part. The short message samples are divided into positive and negative samples, and the samples are divided into training samples and test samples at a preset ratio, for example 3:1. Then word segmentation is performed on the short message samples with the jieba word segmentation tool, and finally stop words are introduced to remove invalid words in the text.
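By way of illustration only, this preprocessing flow can be sketched in Python as follows; the sample messages, the minimal stop-word list, and the helper name `preprocess` are assumptions for demonstration, not part of the claimed invention:

```python
import re
import random
import jieba  # the jieba Chinese word segmentation tool

def preprocess(samples, stopwords):
    """samples: list of (text, label) pairs; label 1 = fraud, 0 = normal."""
    processed = []
    for text, label in samples:
        # keep only the text part: strip non-text characters with a regular expression
        text = re.sub(r"[^\u4e00-\u9fa5a-zA-Z0-9]", "", text)
        # word segmentation with jieba, then stop-word removal
        words = [w for w in jieba.lcut(text) if w not in stopwords]
        processed.append((words, label))
    # split into training and test samples at a preset 3:1 ratio
    random.shuffle(processed)
    cut = len(processed) * 3 // 4
    return processed[:cut], processed[cut:]

stopwords = {"的", "了", "您"}  # assumed minimal stop-word list
samples = [("恭喜您中奖了，请点击链接领取", 1), ("明天上午九点开会", 0)]
train, test = preprocess(samples, stopwords)
```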
In this embodiment, step S12 converts the text data after word segmentation into word vectors using Word2Vec.
Wherein, step S12 specifically includes:
counting the number m of the keywords in the fraud short message feature library;
converting a word into an n-dimensional word vector x by using a one-hot vector;
establishing an n×m weight matrix w to map the n-dimensional vector to the hidden neurons with dimension 1×m;
obtaining the vector W by back propagation, and obtaining the 1×m word vector W(i) by multiplying it with the word vector x;
and adding the word vectors corresponding to the fraud keywords appearing in each short message to obtain a text vector d of the short message.
Specifically, Word2Vec is a word vector generation tool for deep learning. It essentially uses a neural network language model and simplifies it, thereby guaranteeing the effect while reducing the computational complexity. Word2Vec represents words with high-dimensional real-valued vectors (not limited to integers) and places words of similar meaning in similar positions. Only a large corpus of a language is needed to train the model and obtain the word vectors. Two algorithms are commonly used for this model: CBOW and Skip-gram. The CBOW model predicts the current word W(t) from the k words before and after it; the Skip-gram model is just the opposite, using the word W(t) to predict the k words before and after it.
Before training the model, the words in the fraud short message feature library need to be quantized and converted into word vectors. The number of fraud short message words in the feature word library is the dimension of the vector; each word is given a code using a one-hot vector, with the position of the word marked '1' and the other positions marked '0'. For example, the word vector for "congratulation" is [0,0,0,0,1,…,0,0], and the word vector for "winning" is [0,1,0,0,0,…,0,0].
Word2Vec is a neural network with one hidden layer. Its input and output are word vectors; after the trained neural network converges, the weights from the input layer to the hidden layer are assigned to each word vector, so that each word obtains a new vector with semantic meaning.
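As a hedged illustration of how such a model is commonly trained in practice (not the patent's own implementation), the gensim library's Word2Vec can be used; the toy corpus and the parameter values below are assumptions:

```python
from gensim.models import Word2Vec

# word-segmented short message corpus: one list of words per message (toy data)
sentences = [["恭喜", "您", "中奖"], ["点击", "链接", "领取", "奖金"], ["明天", "开会"]]

# sg=0 selects CBOW (predict the current word from its context);
# sg=1 would select Skip-gram (predict the context from the current word)
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=0)

vec = model.wv["中奖"]                        # 100-dimensional real-valued word vector
print(model.wv.most_similar("中奖", topn=2))  # words placed in similar positions
```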
The specific implementation process of the Word2Vec algorithm can be further subdivided into:
counting the keywords in the fraud short message feature library, assuming there are m keywords;
firstly, converting a word into an n-dimensional word vector x by using a one-hot vector; taking "winning" as an example:
"winning" → [0,1,0,0,0,…,0,0]
the hidden layer has m neurons, and the input layer, known to be an n-dimensional vector, is fully connected with the hidden layer;
the hidden layer is fully connected with the output layer, and a softmax classifier is added to the output unit; the final vector W can be obtained by back propagation, and the final word vector, the 1×m vector W(i), is obtained by multiplying the initial word vector x with W:
x·w = W(i) = [W_i1, W_i2, …, W_im]
and adding the word vectors corresponding to the fraud keywords appearing in each short message to obtain a text vector d belonging to the short message.
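A minimal numpy sketch of the arithmetic above, assuming n = 6 keywords in the library and m = 3 hidden neurons, with a random stand-in for the trained weight matrix:

```python
import numpy as np

n, m = 6, 3               # n library keywords, m hidden neurons (assumed sizes)
w = np.random.rand(n, m)  # stand-in for the trained n x m weight matrix

x = np.zeros(n)
x[1] = 1.0                # one-hot vector for the keyword "winning"
W_i = x @ w               # x * w selects row i: the 1 x m word vector W(i)
assert np.allclose(W_i, w[1])

# text vector d: sum of the word vectors of the fraud keywords in one message
keyword_ids = [1, 4]      # e.g. "winning" and "congratulation" appear
d = sum(w[i] for i in keyword_ids)
```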
In this embodiment, step S13 is to convert the word vector into the sentence vector by using the LSTM algorithm.
Wherein, step S13 specifically includes:
arranging the word vectors in a preset order, assuming the sentence is formed of m word vectors x_t;
initializing the model parameters W_f, U_f, b_f, W_a, U_a, b_a, W_i, U_i, b_i, W_o, U_o, b_o;
passing x_t into the forgetting gate f_t and updating the forgetting gate weights W_f, U_f, b_f; wherein,
f_t = σ(W_f h_{t-1} + U_f x_t + b_f);
wherein W_f, U_f, b_f are the coefficients and bias of the linear relationship, and σ is the sigmoid activation function;
updating the input gate parameters i_t and a_t; wherein,
i_t = σ(W_i h_{t-1} + U_i x_t + b_i);
a_t = tanh(W_a h_{t-1} + U_a x_t + b_a);
wherein W_a, U_a, b_a, W_i, U_i, b_i are the coefficients and biases of the linear relationships, and σ is the sigmoid activation function;
updating the cell state C_t of the model, wherein
C_t = C_{t-1} ⊙ f_t + i_t ⊙ a_t;
wherein ⊙ is the Hadamard product;
updating the output gate parameters o_t and h_t and outputting the predicted value ŷ_t of the current sequence index; wherein,
o_t = σ(W_o h_{t-1} + U_o x_t + b_o);
h_t = o_t ⊙ tanh(C_t);
ŷ_t = σ(V h_t + c), wherein V and c are the weight and bias of the output layer.
Specifically, a deep learning model for fraud short message recognition is built based on the LSTM (long short-term memory) artificial neural network, as shown in fig. 2.
As a special case of the RNN, the LSTM avoids the gradient vanishing problem of the conventional RNN. An RNN assumes that the samples are sequence-based, e.g., running from sequence index 1 to sequence index t. For any sequence index t, the hidden state h_t is determined jointly by the corresponding input x_t of the sample sequence and the hidden state h_{t-1} at position t-1. At any sequence index t there is also a corresponding model prediction output o_t. By comparing the prediction output o_t with the true output y_t of the training sequence through a loss function L_t, the model can be trained in a manner similar to softmax classification and then used to predict the output at positions of a test sequence. Because of the RNN gradient vanishing problem, the hidden structure at sequence index position t is improved to avoid gradient vanishing; this special RNN is the LSTM.
In addition to the hidden state h_t propagated forward at each sequence index position t as in the RNN, there is another hidden state, generally called the cell state, denoted C_t.
Besides the cell state, the LSTM contains other structures, generally referred to as gate structures. The gates of the LSTM at each sequence index position t include three types: a forgetting gate, an input gate, and an output gate.
The forgetting gate controls whether to forget; in the LSTM it controls, with a certain probability, whether to forget the hidden cell state of the previous layer. The input hidden state h_{t-1} of the previous sequence and the present sequence data x_t are passed through an activation function, generally sigmoid, to obtain the output f_t of the forgetting gate. Since the sigmoid output f_t lies in [0,1], f_t represents the probability of forgetting the previous hidden cell state. The mathematical expression is:
f_t = σ(W_f h_{t-1} + U_f x_t + b_f);
wherein W_f, U_f, b_f are the coefficients and bias of the linear relationship, and σ is the sigmoid activation function.
The input gate handles the input of the current sequence position and consists of two parts: the first part uses a sigmoid activation function and outputs i_t; the second part uses the tanh activation function and outputs a_t. The product of the two results is used to update the cell state. The mathematical expressions are:
i_t = σ(W_i h_{t-1} + U_i x_t + b_i);
a_t = tanh(W_a h_{t-1} + U_a x_t + b_a);
wherein W_a, U_a, b_a, W_i, U_i, b_i are the coefficients and biases of the linear relationships, and σ is the sigmoid activation function.
The results of both the forgetting gate and the input gate contribute to the cell state C_t. C_t consists of two parts: the first is the product of C_{t-1} and the forgetting gate output f_t, and the second is the product of the input gate outputs i_t and a_t, namely:
C_t = C_{t-1} ⊙ f_t + i_t ⊙ a_t;
wherein ⊙ is the Hadamard product.
The update of the hidden state h_t by the output gate consists of two parts: the first part is o_t, obtained from the previous hidden state h_{t-1}, the present sequence data x_t, and the sigmoid activation function; the second part is obtained from the cell state C_t and the tanh activation function, i.e.:
o_t = σ(W_o h_{t-1} + U_o x_t + b_o);
h_t = o_t ⊙ tanh(C_t).
The LSTM model thus has hidden states h_t and C_t, with parameters W_f, U_f, b_f, W_a, U_a, b_a, W_i, U_i, b_i, W_o, U_o, b_o. The parameters are updated in the order of the forgetting gate, the input gate, and the output gate, and the predicted value of the current sequence index is output:
ŷ_t = σ(V h_t + c), wherein V and c are the weight and bias of the output layer.
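To make the gate equations concrete, the following numpy sketch runs one forward pass over the word vectors of a message; the dimensions and the random parameters are assumptions that merely mirror the formulas above, not the trained model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, p):
    """One LSTM forward step implementing the gate equations above."""
    f_t = sigmoid(p["Wf"] @ h_prev + p["Uf"] @ x_t + p["bf"])  # forgetting gate
    i_t = sigmoid(p["Wi"] @ h_prev + p["Ui"] @ x_t + p["bi"])  # input gate, part 1
    a_t = np.tanh(p["Wa"] @ h_prev + p["Ua"] @ x_t + p["ba"])  # input gate, part 2
    C_t = C_prev * f_t + i_t * a_t                             # cell state (Hadamard products)
    o_t = sigmoid(p["Wo"] @ h_prev + p["Uo"] @ x_t + p["bo"])  # output gate
    h_t = o_t * np.tanh(C_t)
    return h_t, C_t

dim_h, dim_x = 4, 3  # assumed hidden-state and word-vector dimensions
p = {k: 0.1 * np.random.randn(dim_h, dim_h) for k in ("Wf", "Wi", "Wa", "Wo")}
p.update({k: 0.1 * np.random.randn(dim_h, dim_x) for k in ("Uf", "Ui", "Ua", "Uo")})
p.update({k: np.zeros(dim_h) for k in ("bf", "bi", "ba", "bo")})

h, C = np.zeros(dim_h), np.zeros(dim_h)
for x_t in np.random.randn(5, dim_x):  # m = 5 word vectors of one short message
    h, C = lstm_step(x_t, h, C, p)
# h now serves as the sentence vector fed to the softmax classifier
```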
In this embodiment, step S14 takes the sentence vector as the input vector of the softmax classifier to train the deep learning model.
Wherein, the softmax function formula is as follows:
a_j^l = e^{z_j^l} / Σ_k e^{z_k^l};
wherein z_j^l is the input of the jth neuron in the lth layer, and a_j^l represents the output of the jth neuron of the current layer.
In particular, one feature of the softmax function is that it takes as the output of each neuron the ratio of the exponential of that neuron's input to the sum of the exponentials of the inputs of all the neurons in the current layer. This makes the output easy to interpret: the larger the output value of a neuron, the higher the probability that the class corresponding to that neuron is the true class.
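A short numeric illustration of this property (the input values are arbitrary):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

z = np.array([2.0, 0.5])     # inputs of the output-layer neurons (fraud, normal)
probs = softmax(z)           # approximately [0.818, 0.182]; sums to 1
print("fraud probability:", probs[0])
```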
Step S14 specifically includes:
selecting the result with the maximum probability value as the output short message category;
calculating a loss function and performing back propagation;
adjusting the weight of the deep learning model to a preset threshold;
inputting the test sample into a deep learning model, and calculating accuracy, recall rate and F value;
and (4) optimizing the deep learning model by utilizing the self-learning capability of the neural network.
Specifically, the idea of the back propagation algorithm is to iteratively update all parameters by the gradient descent method, the key point being to calculate the partial derivatives of all parameters based on the loss function. Since the LSTM has two hidden states h_t and C_t, two deltas are defined and propagated backwards, i.e.:
δ_h^t = ∂L/∂h_t and δ_C^t = ∂L/∂C_t.
In this embodiment, the result with the maximum probability value of the hidden cell state is selected as the output short message category, and then the loss function is calculated and back propagation is performed. Next, the weights of the deep learning model are adjusted until a preset threshold is reached, the test samples are input into the deep learning model, the accuracy rate, recall rate, and F value are calculated, and the deep learning model is then tuned according to the calculated metrics. Training in this way yields the trained deep learning model.
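A sketch of how the accuracy rate, recall rate, and F value on the test samples might be computed, treating the fraud class as positive; the example labels are made up:

```python
def evaluate(y_true, y_pred):
    """Accuracy (precision) rate, recall rate and F value for the positive (fraud) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

print(evaluate([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))  # approximately (0.667, 0.667, 0.667)
```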
In this embodiment, step S15 recognizes fraud short messages according to the output result of the softmax classifier of the trained deep learning model.
Specifically, after the deep learning model is obtained through training, data is input into the deep learning model to identify the fraud short messages.
In this embodiment, step S16 is to intercept the short message if the fraud short message is determined.
Specifically, if the fraud short message is judged through the deep learning model, the short message gateway intercepts the short message, and the short message receiver does not receive the short message.
Example two
The embodiment provides a system for recognizing fraud short messages by deep learning, as shown in fig. 3, including:
the processing module 21 is configured to obtain text data of a short message sample and perform word segmentation processing;
the Word vector module 22 is used for converting the text data after word segmentation into word vectors by Word2Vec;
a sentence vector module 23, configured to convert the word vector into a sentence vector by using an LSTM algorithm;
a training module 24, configured to use the sentence vector as an input vector of the softmax classifier to train the deep learning model;
and the recognition module 25 is used for recognizing fraud short messages according to the output result of the softmax classifier of the trained deep learning model.
The embodiment provides a system for effectively identifying fraud short messages under mass data by means of a deep learning algorithm. When a short message sender sends a short message to the short message gateway, the deep learning analysis system identifies the content of the short message; if the deep learning model judges it to be a fraud short message, the message is intercepted and does not reach the receiver. The deep learning algorithm greatly improves the accuracy of recognizing fraud short messages and avoids the low recognition rate caused by their diversification.
In this embodiment, the processing module 21 is configured to obtain text data of a short message sample and perform word segmentation processing.
Wherein, the processing module 21 specifically includes:
the acquisition unit is used for acquiring text data of all short message samples;
the removing unit is used for removing the non-text part in the text data by adopting a regularization method;
the dividing unit is used for dividing the short message sample into a negative sample and a positive sample and dividing the short message sample into a training sample and a test sample according to a preset proportion;
the word segmentation unit is used for performing word segmentation processing on the short message samples by adopting the jieba word segmentation tool;
and the introducing unit is used for introducing stop words to remove invalid words in the text.
Specifically, the collecting unit collects sample text data containing fraud short messages and normal short messages; the removing unit removes the non-text part of the text data by a regularization method, keeping only the text part. The dividing unit divides the short message samples into positive and negative samples and divides the samples into training samples and test samples at a preset ratio, for example 3:1. The word segmentation unit then performs word segmentation on the short message samples with the jieba word segmentation tool, and finally the introducing unit introduces stop words to remove invalid words in the text.
In this embodiment, the Word vector module 22 converts the text data after word segmentation into word vectors by using Word2Vec.
The word vector module 22 specifically includes:
the statistical unit is used for counting the number m of the keywords in the fraud short message feature library;
the conversion unit is used for converting a word into an n-dimensional word vector x by using a one-hot vector;
the establishing unit is used for establishing an n×m weight matrix w to map the n-dimensional vector to the hidden neurons with dimension 1×m;
the back propagation unit is used for finding the vector W by back propagation and obtaining the 1×m word vector W(i) by multiplying it with the word vector x;
and the adding unit is used for adding the word vectors corresponding to the fraud keywords appearing in each short message to obtain a text vector d of the short message.
Specifically, Word2Vec is a word vector generation tool for deep learning. It essentially uses a neural network language model and simplifies it, thereby guaranteeing the effect while reducing the computational complexity. Word2Vec represents words with high-dimensional real-valued vectors (not limited to integers) and places words of similar meaning in similar positions. Only a large corpus of a language is needed to train the model and obtain the word vectors. Two algorithms are commonly used for this model: CBOW and Skip-gram. The CBOW model predicts the current word W(t) from the k words before and after it; the Skip-gram model is just the opposite, using the word W(t) to predict the k words before and after it.
Before training the model, the words in the fraud short message feature library need to be quantized and converted into word vectors. The number of fraud short message words in the feature word library is the dimension of the vector; each word is given a code using a one-hot vector, with the position of the word marked '1' and the other positions marked '0'. For example, the word vector for "congratulation" is [0,0,0,0,1,…,0,0], and the word vector for "winning" is [0,1,0,0,0,…,0,0].
Word2Vec is a neural network with one hidden layer. Its input and output are word vectors; after the trained neural network converges, the weights from the input layer to the hidden layer are assigned to each word vector, so that each word obtains a new vector with semantic meaning.
The specific implementation process of the Word2Vec algorithm can be further subdivided into:
counting the keywords in the fraud short message feature library, assuming there are m keywords;
firstly, converting a word into an n-dimensional word vector x by using a one-hot vector; taking "winning" as an example:
"winning" → [0,1,0,0,0,…,0,0]
the hidden layer has m neurons, and the input layer, known to be an n-dimensional vector, is fully connected with the hidden layer;
the hidden layer is fully connected with the output layer, and a softmax classifier is added to the output unit; the final vector W can be obtained by back propagation, and the final word vector, the 1×m vector W(i), is obtained by multiplying the initial word vector x with W:
x·w = W(i) = [W_i1, W_i2, …, W_im]
and adding the word vectors corresponding to the fraud keywords appearing in each short message to obtain a text vector d belonging to the short message.
In this embodiment, the sentence vector module 23 is configured to convert the word vector into the sentence vector by using an LSTM algorithm.
The sentence vector module 23 specifically includes:
an arrangement unit for arranging the word vectors in a preset order, assuming the sentence is formed of m word vectors x_t;
an initialization unit for initializing the model parameters W_f, U_f, b_f, W_a, U_a, b_a, W_i, U_i, b_i, W_o, U_o, b_o;
a forgetting gate unit for passing x_t into the forgetting gate f_t and updating the forgetting gate weights W_f, U_f, b_f; wherein,
f_t = σ(W_f h_{t-1} + U_f x_t + b_f);
wherein W_f, U_f, b_f are the coefficients and bias of the linear relationship, and σ is the sigmoid activation function;
an input gate unit for updating the input gate parameters i_t and a_t; wherein,
i_t = σ(W_i h_{t-1} + U_i x_t + b_i);
a_t = tanh(W_a h_{t-1} + U_a x_t + b_a);
wherein W_a, U_a, b_a, W_i, U_i, b_i are the coefficients and biases of the linear relationships, and σ is the sigmoid activation function;
an output state unit for updating the cell state C_t of the model, wherein
C_t = C_{t-1} ⊙ f_t + i_t ⊙ a_t;
wherein ⊙ is the Hadamard product;
an output gate unit for updating the output gate parameters o_t and h_t and outputting the predicted value ŷ_t of the current sequence index; wherein,
o_t = σ(W_o h_{t-1} + U_o x_t + b_o);
h_t = o_t ⊙ tanh(C_t);
ŷ_t = σ(V h_t + c), wherein V and c are the weight and bias of the output layer.
Specifically, a deep learning model for fraud short message recognition is built based on the LSTM (long short-term memory) artificial neural network, as shown in fig. 2.
As a special case of the RNN, the LSTM avoids the gradient vanishing problem of the conventional RNN. An RNN assumes that the samples are sequence-based, e.g., running from sequence index 1 to sequence index t. For any sequence index t, the hidden state h_t is determined jointly by the corresponding input x_t of the sample sequence and the hidden state h_{t-1} at position t-1. At any sequence index t there is also a corresponding model prediction output o_t. By comparing the prediction output o_t with the true output y_t of the training sequence through a loss function L_t, the model can be trained in a manner similar to softmax classification and then used to predict the output at positions of a test sequence. Because of the RNN gradient vanishing problem, the hidden structure at sequence index position t is improved to avoid gradient vanishing; this special RNN is the LSTM.
In addition to the hidden state h_t propagated forward at each sequence index position t as in the RNN, there is another hidden state, generally called the cell state, denoted C_t.
Besides the cell state, the LSTM contains other structures, generally referred to as gate structures. The gates of the LSTM at each sequence index position t include three types: a forgetting gate, an input gate, and an output gate.
The forgetting gate controls whether to forget; in the LSTM it controls, with a certain probability, whether to forget the hidden cell state of the previous layer. The input hidden state h_{t-1} of the previous sequence and the present sequence data x_t are passed through an activation function, generally sigmoid, to obtain the output f_t of the forgetting gate. Since the sigmoid output f_t lies in [0,1], f_t represents the probability of forgetting the previous hidden cell state. The mathematical expression is:
f_t = σ(W_f h_{t-1} + U_f x_t + b_f);
wherein W_f, U_f, b_f are the coefficients and bias of the linear relationship, and σ is the sigmoid activation function.
The input gate handles the input of the current sequence position and consists of two parts: the first part uses a sigmoid activation function and outputs i_t; the second part uses the tanh activation function and outputs a_t. The product of the two results is used to update the cell state. The mathematical expressions are:
i_t = σ(W_i h_{t-1} + U_i x_t + b_i);
a_t = tanh(W_a h_{t-1} + U_a x_t + b_a);
wherein W_a, U_a, b_a, W_i, U_i, b_i are the coefficients and biases of the linear relationships, and σ is the sigmoid activation function.
The results of both the forgetting gate and the input gate contribute to the cell state C_t. C_t consists of two parts: the first is the product of C_{t-1} and the forgetting gate output f_t, and the second is the product of the input gate outputs i_t and a_t, namely:
C_t = C_{t-1} ⊙ f_t + i_t ⊙ a_t;
wherein ⊙ is the Hadamard product.
The update of the hidden state h_t by the output gate consists of two parts: the first part is o_t, obtained from the previous hidden state h_{t-1}, the present sequence data x_t, and the sigmoid activation function; the second part is obtained from the cell state C_t and the tanh activation function, i.e.:
o_t = σ(W_o h_{t-1} + U_o x_t + b_o);
h_t = o_t ⊙ tanh(C_t).
The LSTM model thus has hidden states h_t and C_t, with parameters W_f, U_f, b_f, W_a, U_a, b_a, W_i, U_i, b_i, W_o, U_o, b_o. The parameters are updated in the order of the forgetting gate, the input gate, and the output gate, and the predicted value of the current sequence index is output:
ŷ_t = σ(V h_t + c), wherein V and c are the weight and bias of the output layer.
In this embodiment, the training module 24 is configured to take the sentence vector as the input vector of the softmax classifier to train the deep learning model.
Wherein, the softmax function formula is as follows:
a_j^l = e^{z_j^l} / Σ_k e^{z_k^l};
wherein z_j^l is the input of the jth neuron in the lth layer, and a_j^l represents the output of the jth neuron of the current layer.
In particular, one feature of the softmax function is that it takes as the output of each neuron the ratio of the exponential of that neuron's input to the sum of the exponentials of the inputs of all the neurons in the current layer. This makes the output easy to interpret: the larger the output value of a neuron, the higher the probability that the class corresponding to that neuron is the true class.
The training module 24 specifically includes:
the selection unit is used for selecting the result with the maximum probability value as the output short message category;
the computing unit is used for computing a loss function and performing back propagation;
the adjusting unit is used for adjusting the weight of the deep learning model to a preset threshold value;
the input unit is used for inputting the test sample into the deep learning model and calculating the accuracy, the recall rate and the F value;
and the tuning unit is used for tuning the deep learning model by utilizing the self-learning capability of the neural network.
Specifically, the idea of the back propagation algorithm is to iteratively update all parameters by the gradient descent method, the key point being to calculate the partial derivatives of all parameters based on the loss function. Since the LSTM has two hidden states h_t and C_t, two deltas are defined and propagated backwards, i.e.:
δ_h^t = ∂L/∂h_t and δ_C^t = ∂L/∂C_t.
In this embodiment, the result with the maximum probability value of the hidden cell state is selected as the output short message category, and then the loss function is calculated and back propagation is performed. Next, the weights of the deep learning model are adjusted until a preset threshold is reached, the test samples are input into the deep learning model, the accuracy rate, recall rate, and F value are calculated, and the deep learning model is then tuned according to the calculated metrics. Training in this way yields the trained deep learning model.
In this embodiment, the recognition module 25 is configured to recognize fraud short messages according to the output result of the softmax classifier of the trained deep learning model.
Specifically, after the deep learning model is obtained through training, data is input into the deep learning model to identify the fraud short messages.
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.
Claims (10)
1. A method for recognizing fraud short messages by deep learning is characterized by comprising the following steps:
acquiring text data of a short message sample and performing word segmentation processing;
converting the text data after word segmentation into word vectors by adopting Word2Vec;
converting the word vector into a sentence vector by adopting an LSTM algorithm;
taking the sentence vector as an input vector of a softmax classifier to train a deep learning model;
and recognizing fraud short messages according to the output result of the trained softmax classifier.
2. The method for recognizing fraud short messages by deep learning as claimed in claim 1, wherein the step of acquiring the text data of the short message samples and performing word segmentation processing specifically comprises:
collecting text data of all short message samples;
removing a non-text part in the text data by adopting a regularization method;
dividing the short message sample into a negative sample and a positive sample, and dividing the short message sample into a training sample and a testing sample according to a preset proportion;
performing word segmentation processing on the short message samples by adopting the jieba word segmentation tool;
stop words are introduced to remove invalid words from the text.
3. The method of claim 2, wherein the step of training the deep learning model specifically comprises:
selecting the result with the maximum probability value as the output short message category;
calculating a loss function and performing back propagation;
adjusting the weight of the deep learning model to a preset threshold value;
inputting the test sample into the deep learning model, and calculating accuracy, recall rate and F value;
and optimizing the deep learning model by utilizing the self-learning capability of the neural network.
4. The method for recognizing fraud short messages by deep learning of claim 1, wherein the step of converting the text data after word segmentation into word vectors by Word2Vec specifically comprises:
counting the number m of the keywords in the fraud short message feature library;
converting a word into an n-dimensional word vector x by using a one-hot vector;
establishing an n×m weight matrix w to map the n-dimensional vector to the hidden neurons with dimension 1×m;
obtaining the vector W by back propagation, and obtaining the 1×m word vector W(i) by multiplying it with the word vector x;
and adding the word vectors corresponding to the fraud keywords appearing in each short message to obtain a text vector d of the short message.
5. The method for recognizing fraud short messages with deep learning of claim 1, wherein said step of converting said word vector into sentence vector using LSTM algorithm specifically comprises:
arranging the word vectors in a preset order, assuming the sentence is formed of m word vectors x_t;
initializing the model parameters W_f, U_f, b_f, W_a, U_a, b_a, W_i, U_i, b_i, W_o, U_o, b_o;
passing x_t into the forgetting gate f_t and updating the forgetting gate weights W_f, U_f, b_f; wherein,
f_t = σ(W_f h_{t-1} + U_f x_t + b_f);
wherein W_f, U_f, b_f are the coefficients and bias of the linear relationship, and σ is the sigmoid activation function;
updating the input gate parameters i_t and a_t; wherein,
i_t = σ(W_i h_{t-1} + U_i x_t + b_i);
a_t = tanh(W_a h_{t-1} + U_a x_t + b_a);
wherein W_a, U_a, b_a, W_i, U_i, b_i are the coefficients and biases of the linear relationships, and σ is the sigmoid activation function;
updating the cell state C_t of the model, wherein
C_t = C_{t-1} ⊙ f_t + i_t ⊙ a_t;
wherein ⊙ is the Hadamard product;
updating the output gate parameters o_t and h_t and outputting the predicted value ŷ_t of the current sequence index; wherein,
o_t = σ(W_o h_{t-1} + U_o x_t + b_o);
h_t = o_t ⊙ tanh(C_t);
ŷ_t = σ(V h_t + c), wherein V and c are the weight and bias of the output layer.
6. The method for identifying fraud short messages with deep learning of claim 1, wherein the softmax function formula is:
a_j^l = e^{z_j^l} / Σ_k e^{z_k^l};
wherein z_j^l is the input of the jth neuron in the lth layer, and a_j^l represents the output of the jth neuron of the current layer.
7. the method for recognizing fraud messages with deep learning of claim 1, further comprising the steps of:
and if the fraud short message is determined, intercepting the short message.
8. A system for recognizing fraud messages using deep learning, comprising:
the processing module is used for acquiring text data of the short message sample and performing word segmentation processing;
the Word vector module is used for converting the text data after word segmentation into word vectors by adopting Word2Vec;
the sentence vector module is used for converting the word vector into a sentence vector by adopting an LSTM algorithm;
the training module is used for taking the sentence vector as an input vector of a softmax classifier so as to train a deep learning model;
and the recognition module is used for recognizing fraud short messages according to the output result of the softmax classifier of the trained deep learning model.
9. The system of claim 8, wherein the processing module comprises:
the acquisition unit is used for acquiring text data of all short message samples;
the removing unit is used for removing the non-text part in the text data by adopting a regularization method;
the dividing unit is used for dividing the short message sample into a negative sample and a positive sample and dividing the short message sample into a training sample and a test sample according to a preset proportion;
the word segmentation unit is used for performing word segmentation processing on the short message samples by adopting the jieba word segmentation tool;
and the introducing unit is used for introducing stop words to remove invalid words in the text.
10. The system of claim 8, wherein the training module specifically comprises:
the selection unit is used for selecting the result with the maximum probability value as the output short message category;
the computing unit is used for computing a loss function and performing back propagation;
the adjusting unit is used for adjusting the weight of the deep learning model to a preset threshold value;
the input unit is used for inputting the test sample into the deep learning model and calculating the accuracy, the recall rate and the F value;
and the tuning unit is used for tuning the deep learning model by utilizing the self-learning capability of the neural network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711205293.7A CN108566627A (en) | 2017-11-27 | 2017-11-27 | A kind of method and system identifying fraud text message using deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711205293.7A CN108566627A (en) | 2017-11-27 | 2017-11-27 | A kind of method and system identifying fraud text message using deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108566627A true CN108566627A (en) | 2018-09-21 |
Family
ID=63529513
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711205293.7A Pending CN108566627A (en) | 2017-11-27 | 2017-11-27 | A kind of method and system identifying fraud text message using deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108566627A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9053431B1 (en) * | 2010-10-26 | 2015-06-09 | Michael Lamport Commons | Intelligent control with hierarchical stacked neural networks |
CN106202330A (en) * | 2016-07-01 | 2016-12-07 | 北京小米移动软件有限公司 | The determination methods of junk information and device |
CN106294322A (en) * | 2016-08-04 | 2017-01-04 | 哈尔滨工业大学 | A kind of Chinese based on LSTM zero reference resolution method |
CN106682089A (en) * | 2016-11-26 | 2017-05-17 | 山东大学 | RNNs-based method for automatic safety checking of short message |
Non-Patent Citations (1)
Title |
---|
ZHANG Ying: "Research on Sentiment Analysis of Microblog Short Texts Based on Deep Neural Networks" (基于深度神经网络的微博短文本情感分析研究), dissertation, Zhongyuan University of Technology *
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110913354A (en) * | 2018-09-17 | 2020-03-24 | 阿里巴巴集团控股有限公司 | Short message classification method and device and electronic equipment |
CN109492219A (en) * | 2018-10-25 | 2019-03-19 | 山东省通信管理局 | A kind of swindle website identification method analyzed based on tagsort and emotional semantic |
CN109446528A (en) * | 2018-10-30 | 2019-03-08 | 南京中孚信息技术有限公司 | The recognition methods of new fraudulent gimmick and device |
CN109615377A (en) * | 2018-12-13 | 2019-04-12 | 平安医疗健康管理股份有限公司 | Repetition charge recognition methods, equipment, storage medium and device based on big data |
CN109801691A (en) * | 2018-12-13 | 2019-05-24 | 平安科技(深圳)有限公司 | Assault based on big data is gone to a doctor recognition methods, equipment, storage medium and device |
CN109743732A (en) * | 2018-12-20 | 2019-05-10 | 重庆邮电大学 | Refuse messages method of discrimination based on improved CNN-LSTM |
CN109982272A (en) * | 2019-02-13 | 2019-07-05 | 北京航空航天大学 | A kind of fraud text message recognition methods and device |
CN109982272B (en) * | 2019-02-13 | 2020-08-28 | 北京航空航天大学 | Fraud short message identification method and device |
CN110110585A (en) * | 2019-03-15 | 2019-08-09 | 西安电子科技大学 | Intelligently reading realization method and system based on deep learning, computer program |
CN110177179A (en) * | 2019-05-16 | 2019-08-27 | 国家计算机网络与信息安全管理中心 | A kind of swindle number identification method based on figure insertion |
CN110198453A (en) * | 2019-05-23 | 2019-09-03 | 武汉瓯越网视有限公司 | Live content filter method, storage medium, equipment and system based on barrage |
WO2021051585A1 (en) * | 2019-09-17 | 2021-03-25 | 平安科技(深圳)有限公司 | Method for constructing natural language processing system, electronic apparatus, and computer device |
CN110781306A (en) * | 2019-10-31 | 2020-02-11 | 山东师范大学 | English text aspect layer emotion classification method and system |
CN111669757A (en) * | 2020-06-15 | 2020-09-15 | 国家计算机网络与信息安全管理中心 | Terminal fraud call identification method based on conversation text word vector |
CN114154490A (en) * | 2020-08-18 | 2022-03-08 | 阿里巴巴集团控股有限公司 | Model training method, title extracting method, device, electronic equipment and computer readable medium |
CN112906397A (en) * | 2021-04-06 | 2021-06-04 | 南通大学 | Short text entity disambiguation method |
CN112906397B (en) * | 2021-04-06 | 2021-11-19 | 南通大学 | Short text entity disambiguation method |
CN113923669A (en) * | 2021-11-10 | 2022-01-11 | 恒安嘉新(北京)科技股份公司 | Anti-fraud early warning method, device, equipment and medium for multi-source cross-platform fusion |
CN113923669B (en) * | 2021-11-10 | 2024-05-17 | 恒安嘉新(北京)科技股份公司 | Multi-source cross-platform fusion anti-fraud early warning method, device, equipment and medium |
CN114066490A (en) * | 2022-01-17 | 2022-02-18 | 浙江鹏信信息科技股份有限公司 | GoIP fraud nest point identification method, system and computer readable storage medium |
CN118474682A (en) * | 2024-07-15 | 2024-08-09 | 浙江三子智联科技有限公司 | Service short message monitoring method and system based on big data |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20180921 |