CN107239504A - A deep learning algorithm for identifying fraudulent text messages - Google Patents
A deep learning algorithm for identifying fraudulent text messages
- Publication number
- CN107239504A
- Authority
- CN
- China
- Prior art keywords
- deep learning
- short message
- module
- url
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W12/00—Security arrangements; Authentication; Protecting privacy or anonymity
- H04W12/12—Detection or prevention of fraud
- H04W12/128—Anti-malware arrangements, e.g. protection against SMS fraud or mobile malware
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W12/00—Security arrangements; Authentication; Protecting privacy or anonymity
- H04W12/12—Detection or prevention of fraud
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/12—Messaging; Mailboxes; Announcements
- H04W4/14—Short messaging services, e.g. short message services [SMS] or unstructured supplementary service data [USSD]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention provides a deep learning algorithm for identifying fraudulent text messages and relates to the field of information security. It comprises a deep learning module, an interaction module, a preprocessing module, and a comparison module. The interaction module sends each acquired message to the preprocessing module, which extracts the feature vector of the message. The deep learning module forms a deep learning model from a sample set. The comparison module compares the feature vector of the message with the deep learning model and sends the comparison result to the interaction module, which feeds the result back to the user. A message comprises message text and/or a URL. The invention is highly extensible because the algorithm is modular and each neural network component is handled as a separate module. The algorithm is reliable: it uses deep learning to learn sentence features automatically and can uncover more latent features than shallow algorithms.
Description
Technical field
The present invention relates to the field of information security, and more particularly to a deep learning algorithm for identifying fraudulent text messages.
Background
Mobile phones have become an indispensable tool in daily life. Against this background, cases of fraud carried out by text message are too numerous to count, and the trend is still growing.
To deal with fraudulent messages, well-known domestic mobile security vendors currently rely mostly on simple database lookups or on simple machine learning algorithms. None of them uses deep learning to identify fraudulent messages, and the existing anti-fraud algorithms for mobile phones are too simple to protect users effectively.
Summary of the invention
In view of the above defects of the prior art, the technical problem to be solved by the invention is to provide a deep learning algorithm for identifying fraudulent text messages that is highly extensible, modular, and able to uncover more latent features.
The invention provides a deep learning algorithm for identifying fraudulent text messages, comprising a deep learning module, an interaction module, a preprocessing module, and a comparison module. The interaction module sends each acquired message to the preprocessing module, and the preprocessing module extracts the feature vector of the message. The deep learning module forms a deep learning model from a sample set. The comparison module compares the feature vector of the message with the deep learning model and sends the comparison result to the interaction module, which feeds the result back to the user. A message comprises message text and/or a URL.
Further, the deep learning model comprises a message-text deep learning model and/or a URL deep learning model.
Further, the message-text deep learning model is formed as follows: the preprocessing module extracts the message text of a message as a feature vector, and the deep learning module feeds the message-text feature vector into a DBN (deep belief network) to form the message-text deep learning model.
Further, the URL deep learning model is formed as follows: the preprocessing module extracts the URL of a message as a feature vector, and the deep learning module feeds the URL feature vector into a DBN to form the URL deep learning model.
Further, the feature vector that the preprocessing module extracts from a message comprises a message-text feature vector and/or a URL feature vector.
Further, the message-text feature vector is generated as follows: the preprocessing module separates out the message text and feeds the separated text into Word2vec, whose output is the message-text feature vector.
Further, the URL feature vector is generated as follows: the preprocessing module separates out the URL and applies extraction rules to the separated URL to obtain the URL feature vector.
Further, the comparison of the feature vector of the message with the deep learning model comprises comparing the message-text feature vector with the message-text deep learning model and/or comparing the URL feature vector with the URL deep learning model.
Further, when the message-text feature vector is compared with the message-text deep learning model, the feature vector is fed into the message-text deep learning classifier of the deep learning module, the classification result is compared with the threshold set by the deep learning module, and the result is fed back to the interaction module.
Further, when the URL feature vector is compared with the URL deep learning model, the feature vector is fed into the URL deep learning classifier of the deep learning module, the classification result is compared with the threshold set by the deep learning module, and the result is fed back to the interaction module.
Compared with the prior art, the invention has the following beneficial effects. It is highly extensible, because the algorithm is modular and each neural network component is handled as a separate module. It is reliable, because it uses deep learning to learn sentence features automatically and can uncover more latent features than shallow algorithms.
The concept, specific structure, and technical effects of the invention are further described below with reference to the accompanying drawings, so that the purpose, features, and effects of the invention can be fully understood.
Brief description of the drawings
Fig. 1 is a module diagram of a preferred embodiment of the invention;
Fig. 2 is a flow diagram of message-text deep learning model generation in a preferred embodiment of the invention;
Fig. 3 is a flow diagram of URL deep learning model generation in a preferred embodiment of the invention;
Fig. 4 is a flow diagram of the comparison between the message-text feature vector and the message-text deep learning model in a preferred embodiment of the invention;
Fig. 5 is a flow diagram of the comparison between the URL feature vector and the URL deep learning model in a preferred embodiment of the invention.
Detailed description
A preferred embodiment of the deep learning algorithm for identifying fraudulent text messages is described in detail below with reference to the accompanying drawings, but the invention is not limited to this embodiment. So that the public can understand the invention thoroughly, specific details are set out in the following preferred embodiments.
Embodiment 1:
As shown in Fig. 1, the invention provides a deep learning algorithm for identifying fraudulent text messages, comprising a deep learning module, an interaction module, a preprocessing module, and a comparison module. The interaction module sends each acquired message to the preprocessing module, and the preprocessing module extracts the feature vector of the message. The deep learning module forms a deep learning model from a sample set. The comparison module compares the feature vector of the message with the deep learning model and sends the comparison result to the interaction module, which feeds the result back to the user. A message comprises message text and/or a URL.
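The module flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the URL pattern, the `Message` type, the scorer callables, the default threshold of 0.5, and the rule of taking the larger of the two scores are all hypothetical and are not specified in the patent.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional

# Assumed URL pattern; the patent does not say how URLs are separated out.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)


@dataclass
class Message:
    text: str            # message text with the URL removed
    url: Optional[str]   # first URL found in the message, if any


def split_message(raw: str) -> Message:
    """Preprocessing: separate the message text from the first URL."""
    match = URL_PATTERN.search(raw)
    url = match.group(0) if match else None
    text = " ".join(URL_PATTERN.sub(" ", raw).split())
    return Message(text=text, url=url)


# A scorer stands in for a trained classifier and returns a fraud score in [0, 1].
Scorer = Callable[[str], float]


def check_message(raw: str, text_scorer: Scorer, url_scorer: Scorer,
                  threshold: float = 0.5) -> Optional[str]:
    """Comparison plus interaction: warn on suspected fraud, else stay silent."""
    msg = split_message(raw)
    scores: List[float] = []
    if msg.text:
        scores.append(text_scorer(msg.text))
    if msg.url:
        scores.append(url_scorer(msg.url))
    # Assumption: either model alone may flag the message ("and/or" in the text).
    if scores and max(scores) >= threshold:
        return "fraud"
    return None
```

Passing the scorers in as plain callables mirrors the modular design the patent emphasizes: the text model and the URL model can be replaced independently.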
The deep learning model comprises a message-text deep learning model and/or a URL deep learning model.
The message-text deep learning model is formed as follows: the preprocessing module extracts the message text of a message as a feature vector, and the deep learning module feeds the message-text feature vector into a DBN (deep belief network) to form the message-text deep learning model.
The URL deep learning model is formed as follows: the preprocessing module extracts the URL of a message as a feature vector, and the deep learning module feeds the URL feature vector into a DBN to form the URL deep learning model.
The feature vector that the preprocessing module extracts from a message comprises a message-text feature vector and/or a URL feature vector.
The message-text feature vector is generated as follows: the preprocessing module separates out the message text and feeds the separated text into Word2vec, whose output is the message-text feature vector.
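The patent names Word2vec but does not say how per-word vectors are turned into a single vector for a message. A minimal sketch, assuming the word vectors are simply averaged, might look like this. The three-dimensional `TOY_EMBEDDINGS` table is a hypothetical stand-in for a trained Word2vec model, and whitespace tokenization is also an assumption.

```python
from typing import Dict, List

# Hypothetical toy embeddings standing in for a trained Word2vec model.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "win": [0.9, 0.1, 0.0],
    "prize": [0.8, 0.2, 0.1],
    "meeting": [0.0, 0.9, 0.3],
}


def message_vector(text: str, embeddings: Dict[str, List[float]],
                   dim: int) -> List[float]:
    """Average the vectors of known words; return zeros if none are known."""
    tokens = text.lower().split()
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]
```

Whitespace splitting does not work for Chinese text, so a real pipeline for Chinese messages would need a word segmenter before the embedding lookup.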
The URL feature vector is generated as follows: the preprocessing module separates out the URL and applies extraction rules to the separated URL to obtain the URL feature vector.
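The patent does not list its extraction rules. As an illustration only, the sketch below applies a handful of lexical rules that are commonly used for suspicious-URL detection; every rule here (URL length, dot count, IP-address host, HTTPS scheme, digit count, `@` sign) is an assumption, not the patent's rule set.

```python
import re
from typing import List
from urllib.parse import urlparse

IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def url_features(url: str) -> List[float]:
    """Turn a URL into a fixed-length numeric vector using hypothetical rules."""
    if "://" not in url:
        url = "http://" + url  # let urlparse find the host of scheme-less URLs
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return [
        float(len(url)),                           # overall length
        float(host.count(".")),                    # number of dots in the host
        1.0 if IPV4.match(host) else 0.0,          # host is a raw IPv4 address
        1.0 if parsed.scheme == "https" else 0.0,  # uses HTTPS
        float(sum(ch.isdigit() for ch in host)),   # digits in the host
        1.0 if "@" in url else 0.0,                # "@" can hide the real host
    ]
```

Because each rule yields one number, the output has a fixed length and can be fed directly into a model that expects a fixed-size input.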
The comparison of the feature vector of the message with the deep learning model comprises comparing the message-text feature vector with the message-text deep learning model and/or comparing the URL feature vector with the URL deep learning model.
When the message-text feature vector is compared with the message-text deep learning model, the feature vector is fed into the message-text deep learning classifier of the deep learning module, the classification result is compared with the threshold set by the deep learning module, and the result is fed back to the interaction module.
When the URL feature vector is compared with the URL deep learning model, the feature vector is fed into the URL deep learning classifier of the deep learning module, the classification result is compared with the threshold set by the deep learning module, and the result is fed back to the interaction module.
Embodiment 2:
As shown in Fig. 2, the message-text deep learning model is generated as follows:
Step S11: import the message samples, then go to step S12;
Step S12: the preprocessing module separates the message samples to obtain message-text feature vectors, then go to step S13;
Step S13: the deep learning module feeds the message-text feature vectors into the DBN to form the message-text deep learning model.
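The patent states only that feature vectors are fed into a DBN. A DBN is commonly built by greedily pretraining a stack of restricted Boltzmann machines (RBMs), so the sketch below shows that building block in plain Python. The layer sizes, learning rate, epoch count, and the use of one-step contrastive divergence (CD-1) are all assumptions, inputs are assumed to be scaled to [0, 1], and the supervised fine-tuning that would produce the final classifier is omitted.

```python
import math
import random
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """One restricted Boltzmann machine layer trained with CD-1."""

    def __init__(self, n_visible: int, n_hidden: int, seed: int = 0) -> None:
        self.rng = random.Random(seed)
        self.w = [[self.rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v: List[float]) -> List[float]:
        return [sigmoid(self.b_hid[j] + sum(v[i] * self.w[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h: List[float]) -> List[float]:
        return [sigmoid(self.b_vis[i] + sum(h[j] * self.w[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def train_step(self, v0: List[float], lr: float = 0.1) -> None:
        """One CD-1 update: data statistics minus one-step reconstruction statistics."""
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.w[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        for i in range(len(v0)):
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_hid[j] += lr * (h0[j] - h1[j])


def pretrain_dbn(samples: List[List[float]], layer_sizes: List[int],
                 epochs: int = 5) -> List[RBM]:
    """Greedy layer-wise pretraining: each RBM learns from the layer below it."""
    layers: List[RBM] = []
    data = samples
    n_in = len(samples[0])
    for k, n_hidden in enumerate(layer_sizes):
        rbm = RBM(n_in, n_hidden, seed=k)
        for _ in range(epochs):
            for v in data:
                rbm.train_step(v)
        data = [rbm.hidden_probs(v) for v in data]  # input for the next layer
        layers.append(rbm)
        n_in = n_hidden
    return layers
```

In practice a library implementation with vectorized arithmetic would replace these nested loops; the plain-Python version is meant only to make the update rule visible.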
As shown in Fig. 3, the URL deep learning model is generated as follows:
Step S21: import the message samples, then go to step S22;
Step S22: the preprocessing module separates the message samples to obtain URL feature vectors, then go to step S23;
Step S23: the deep learning module feeds the URL feature vectors into the DBN to form the URL deep learning model.
As shown in Fig. 4, the message-text feature vector is compared with the message-text deep learning model as follows:
Step S31: import the message sample, then go to step S32;
Step S32: feed the message text into Word2vec and take its output as the message-text feature vector, then go to step S33;
Step S33: feed the message-text feature vector into the message-text deep learning classifier of the deep learning module, then go to step S34;
Step S34: compare the classifier's result for the message-text feature vector with the threshold set by the deep learning module, and feed the result back to the interaction module. If the comparison determines that the message is a suspected fraudulent message, the user is told that the message is fraudulent; if it determines that the message is normal, the system stays silent and waits for the next message.
As shown in Fig. 5, the URL feature vector is compared with the URL deep learning model as follows:
Step S41: import the message sample, then go to step S42;
Step S42: the preprocessing module separates out the URL and applies extraction rules to the separated URL to obtain the URL feature vector, then go to step S43;
Step S43: feed the URL feature vector into the URL deep learning classifier of the deep learning module, then go to step S44;
Step S44: compare the classifier's result for the URL feature vector with the threshold set by the deep learning module, and feed the result back to the interaction module. If the comparison determines that the message is a suspected fraudulent message, the user is told that the message is fraudulent; if it determines that the message is normal, the system stays silent and waits for the next message.
In summary, the invention is highly extensible: the algorithm is modular and each neural network component is handled as a separate module. The algorithm is reliable: it uses deep learning to learn sentence features automatically and can uncover more latent features than shallow algorithms.
The preferred embodiments of the invention have been described in detail above. It should be understood that a person of ordinary skill in the art can make many modifications and variations according to the concept of the invention without creative effort. Therefore, any technical solution that a person skilled in the art can obtain on the basis of the prior art through logical analysis, reasoning, or limited experimentation under the concept of the invention shall fall within the scope of protection defined by the claims.
Claims (10)
1. A deep learning algorithm for identifying fraudulent text messages, characterized by comprising a deep learning module, an interaction module, a preprocessing module, and a comparison module, wherein the interaction module sends each acquired message to the preprocessing module, the preprocessing module extracts the feature vector of the message, the deep learning module forms a deep learning model from a sample set, the comparison module compares the feature vector of the message with the deep learning model and sends the comparison result to the interaction module, the interaction module feeds the comparison result back to the user, and the message comprises message text and/or a URL.
2. The deep learning algorithm for identifying fraudulent text messages according to claim 1, characterized in that the deep learning model comprises a message-text deep learning model and/or a URL deep learning model.
3. The deep learning algorithm for identifying fraudulent text messages according to claim 2, characterized in that forming the message-text deep learning model comprises the preprocessing module extracting the message text of a message as a feature vector, and the deep learning module feeding the message-text feature vector into a DBN to form the message-text deep learning model.
4. The deep learning algorithm for identifying fraudulent text messages according to claim 2, characterized in that forming the URL deep learning model comprises the preprocessing module extracting the URL of a message as a feature vector, and the deep learning module feeding the URL feature vector into a DBN to form the URL deep learning model.
5. The deep learning algorithm for identifying fraudulent text messages according to claim 1, characterized in that the feature vector that the preprocessing module extracts from a message comprises a message-text feature vector and/or a URL feature vector.
6. The deep learning algorithm for identifying fraudulent text messages according to claim 5, characterized in that the message-text feature vector is generated by the preprocessing module separating out the message text and feeding the separated message text into Word2vec, whose output is the message-text feature vector.
7. The deep learning algorithm for identifying fraudulent text messages according to claim 6, characterized in that the URL feature vector is generated by the preprocessing module separating out the URL and applying extraction rules to the separated URL to obtain the URL feature vector.
8. The deep learning algorithm for identifying fraudulent text messages according to claim 1, characterized in that the comparison of the feature vector of the message with the deep learning model comprises comparing the message-text feature vector with the message-text deep learning model and/or comparing the URL feature vector with the URL deep learning model.
9. The deep learning algorithm for identifying fraudulent text messages according to claim 8, characterized in that, when the message-text feature vector is compared with the message-text deep learning model, the message-text feature vector is fed into the message-text deep learning classifier of the deep learning module, the classification result is compared with the threshold set by the deep learning module, and the result is fed back to the interaction module.
10. The deep learning algorithm for identifying fraudulent text messages according to claim 8, characterized in that, when the URL feature vector is compared with the URL deep learning model, the URL feature vector is fed into the URL deep learning classifier of the deep learning module, the classification result is compared with the threshold set by the deep learning module, and the result is fed back to the interaction module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710327007.8A CN107239504A (en) | 2017-05-10 | 2017-05-10 | A deep learning algorithm for identifying fraudulent text messages |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710327007.8A CN107239504A (en) | 2017-05-10 | 2017-05-10 | A deep learning algorithm for identifying fraudulent text messages |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107239504A (en) | 2017-10-10 |
Family
ID=59984321
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710327007.8A Pending CN107239504A (en) | A deep learning algorithm for identifying fraudulent text messages | 2017-05-10 | 2017-05-10 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107239504A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108021932A (en) * | 2017-11-22 | 2018-05-11 | 北京奇虎科技有限公司 | Data detection method, device and electronic equipment |
CN109922444A (en) * | 2017-12-13 | 2019-06-21 | 中国移动通信集团公司 | A kind of refuse messages recognition methods and device |
CN109982272A (en) * | 2019-02-13 | 2019-07-05 | 北京航空航天大学 | A kind of fraud text message recognition methods and device |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103607705A (en) * | 2013-12-04 | 2014-02-26 | 北京网秦天下科技有限公司 | Junk message filtering method and engine |
CN106161209A (en) * | 2016-07-21 | 2016-11-23 | 康佳集团股份有限公司 | A kind of method for filtering spam short messages based on degree of depth self study and system |
CN106332024A (en) * | 2016-08-31 | 2017-01-11 | 华为技术有限公司 | Insecure short message recognition method and related equipment |
-
2017
- 2017-05-10 CN CN201710327007.8A patent/CN107239504A/en active Pending
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108021932A (en) * | 2017-11-22 | 2018-05-11 | 北京奇虎科技有限公司 | Data detection method, device and electronic equipment |
CN109922444A (en) * | 2017-12-13 | 2019-06-21 | 中国移动通信集团公司 | A kind of refuse messages recognition methods and device |
CN109922444B (en) * | 2017-12-13 | 2020-11-03 | 中国移动通信集团公司 | Spam message identification method and device |
CN109982272A (en) * | 2019-02-13 | 2019-07-05 | 北京航空航天大学 | A kind of fraud text message recognition methods and device |
CN109982272B (en) * | 2019-02-13 | 2020-08-28 | 北京航空航天大学 | Fraud short message identification method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107885999B (en) | Vulnerability detection method and system based on deep learning | |
CN107239504A (en) | A deep learning algorithm for identifying fraudulent text messages | |
CN109005145B (en) | Malicious URL detection system and method based on automatic feature extraction | |
CN107943941B (en) | Junk text recognition method and system capable of being updated iteratively | |
CN106446195A (en) | News recommending method and device based on artificial intelligence | |
CN105279405A (en) | Keypress behavior pattern construction and analysis system of touch screen user and identity recognition method thereof | |
CN1319331C (en) | Method and system for detecting and discriminating counterfeit web page | |
CN104809069A (en) | Source node loophole detection method based on integrated neural network | |
CN103258535A (en) | Identity recognition method and system based on voiceprint recognition | |
CN105302884B (en) | Webpage mode identification method and visual structure learning method based on deep learning | |
CN107104988B (en) | IPv6 intrusion detection method based on probabilistic neural network | |
CN107644106A (en) | The internuncial method of automatic mining business, terminal device and storage medium | |
CN112766166A (en) | Lip-shaped forged video detection method and system based on polyphone selection | |
Fujii et al. | HumanGAN: generative adversarial network with human-based discriminator and its evaluation in speech perception modeling | |
Alsubaei et al. | Enhancing phishing detection: A novel hybrid deep learning framework for cybercrime forensics | |
Yusoff et al. | Fraud detection in telecommunication industry using Gaussian mixed model | |
CN110049034A (en) | A kind of real-time Sybil attack detection method of complex network based on deep learning | |
Luceri et al. | Unmasking the web of deceit: Uncovering coordinated activity to expose information operations on twitter | |
CN109218721A (en) | A kind of mutation video detecting method compared based on frame | |
CN107193900A (en) | A kind of identifying system and its application method of suspicious SMS | |
CN102891838A (en) | Method and device for detecting promotion content in question and answer club | |
CN110661795A (en) | Vector-level threat information automatic production and distribution system and method | |
CN114155880A (en) | Illegal voice recognition method and system based on GBDT algorithm model | |
Rozhon et al. | Using lstm cells for sip dialogs mapping and security analysis | |
Jesmithaa et al. | Detecting phishing attacks using Convolutional Neural Network and LSTM |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20171010 |