CN104317847A - Method and system for identifying languages in network text information - Google Patents

Info

Publication number
CN104317847A
Authority
CN
China
Prior art keywords
word message
languages
surfing
net
equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410539771.8A
Other languages
Chinese (zh)
Inventor
孙伟力
杨超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shengshi Guangming Software Co., Ltd.
Original Assignee
孙伟力
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 孙伟力
Priority to CN201410539771.8A
Publication of CN104317847A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9566 URL specific, e.g. using aliases, detecting broken or misspelled links
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities

Abstract

The invention provides a method and a system for identifying the language of network text information. The method comprises the following steps: collecting, at a network access position, the Internet data packets produced by an Internet access device while it is online; obtaining the text information contained in the Internet data packets; and identifying, from that text information, the language of the text information produced by the device. With the method and system, the Internet data packets can be obtained and the language of the text information identified without installing a client on the Internet access device. According to the language, the nationality of the user holding the device can be judged, so that a security department can monitor certain particular populations (such as populations within a certain nationality range) in a targeted way; this improves supervision efficiency, helps the security department obtain information related to terrorist activities in time, and maintains social stability.

Description

Method and system for identifying the language of network text information
Technical field
The present invention relates to data acquisition and processing technology, and in particular to a method and system for identifying the language of network text information.
Background art
At present, ethnic conflicts are prominent worldwide, and violent terrorist incidents with ethnic characteristics occur frequently. Meanwhile, with the rapid development of network technology, the volume of Internet information resources has grown sharply and the population of Internet users keeps expanding. The holder of an Internet access device can send e-mail, chat, post to forums, browse web pages and so on over the network, and these operations produce Internet data packets that contain information about them. If those data packets can be analyzed, it becomes possible to obtain some information about the holder of the device, such as the language of the text information the holder inputs, and from that to judge the holder's nationality. A security department can then monitor certain specific populations (for example, people within a certain nationality range) in a targeted way, which improves supervision efficiency, helps the security department obtain information related to terrorist activities in time, and maintains social stability.
In the prior art, however, obtaining such Internet data packets involves many technical difficulties and challenges. Collecting the data generally requires installing a client on the Internet access device, but different devices often run different operating systems, so multiple clients matched to those operating systems must be developed, which is a very large development effort. When the operating system of a device is updated, the client must be upgraded in step, which makes the system expensive to maintain. In addition, device holders tend to be cautious about installing a client and often refuse to install it. Without the client, the Internet data packets of the device cannot be obtained, and the language of the text information they contain cannot be identified.
Summary of the invention
The technical problem to be solved by the present invention is therefore that, in the prior art, a client must be installed on an Internet access device before the Internet data packets that the device holder produces while online can be obtained. The invention accordingly provides a method and system for identifying the language of network text information that can obtain the Internet data packets, and identify the language of the text information in them, without installing a client on the device.
To solve the above technical problem, the technical solution of the present invention is as follows:
The invention provides a method for identifying the language of network text information, comprising the following steps:
collecting, at a network access position, the Internet data packets that an Internet access device produces while online;
obtaining the text information contained in the Internet data packets;
identifying, from the text information, the language of the text information produced by the Internet access device.
In the method of the present invention, the step of obtaining the text information contained in the Internet data packets comprises:
reassembling the Internet data packets into a transport-layer session data stream according to the transport-layer protocol;
parsing out the data contained in the transport-layer session data stream according to the Hypertext Transfer Protocol (HTTP);
extracting from that data the text information it contains.
In the method of the present invention, the step of identifying, from the text information, the language of the text information produced by the Internet access device comprises:
parsing out the character code that each character contained in the text information corresponds to in Unicode;
obtaining the coding range of the text information in Unicode from the character codes;
identifying the language of the text information from the coding range.
The method of the present invention further comprises, after the step of identifying the language of the text information produced by the Internet access device:
storing the text information by category according to its language.
The invention also provides a system for identifying the language of network text information, comprising:
a collection device, for collecting, at a network access position, the Internet data packets that an Internet access device produces while online;
an acquisition device, for obtaining the text information contained in the Internet data packets;
an identification device, for identifying, from the text information, the language of the text information produced by the Internet access device.
In the system of the present invention, the acquisition device comprises:
a reassembly unit, for reassembling the Internet data packets into a transport-layer session data stream according to the transport-layer protocol;
a first parsing unit, for parsing out the data contained in the transport-layer session data stream according to the Hypertext Transfer Protocol (HTTP);
an extraction unit, for extracting from that data the text information it contains.
In the system of the present invention, the identification device comprises:
a second parsing unit, for parsing out the character code that each character contained in the text information corresponds to in Unicode;
a range acquisition unit, for obtaining the coding range of the text information in Unicode from the character codes;
a language identification unit, for identifying the language of the text information from the coding range.
The system of the present invention further comprises:
a classified storage device, for storing the text information by category according to its language.
Compared with the prior art, the above technical solution of the present invention has the following advantages:
The invention provides a method and system for identifying the language of network text information, in which the Internet data packets that an Internet access device produces while online are collected at a network access position, the text information contained in the packets is obtained, and the language of the text information produced by the device is identified from that text information. The method and system can therefore obtain the Internet data packets and identify the language of the text information in them without installing a client on the Internet access device, and the nationality of the device holder can be judged from the language. A security department can then monitor certain specific populations (for example, people within a certain nationality range) in a targeted way, which improves supervision efficiency, helps the security department obtain information related to terrorist activities in time, and maintains social stability.
Brief description of the drawings
To make the content of the present invention easier to understand, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings, in which:
Fig. 1 is a block diagram of the steps of the method for identifying the language of network text information described in Embodiment 1;
Fig. 2 is a block diagram of the steps of the method for obtaining the text information contained in the Internet data packets;
Fig. 3 is a block diagram of the steps of the method for identifying, from the text information, the language of the text information produced by the Internet access device;
Fig. 4 is a structural block diagram of the system for identifying the language of network text information described in Embodiment 2;
Fig. 5 is a structural block diagram of the acquisition device;
Fig. 6 is a structural block diagram of the identification device.
The reference numerals in the figures are: 1 - collection device, 2 - acquisition device, 3 - identification device, 4 - classified storage device, 21 - reassembly unit, 22 - first parsing unit, 23 - extraction unit, 31 - second parsing unit, 32 - range acquisition unit, 33 - language identification unit.
Detailed description of the embodiments
Embodiment 1
This embodiment provides a method for identifying the language of network text information which, as shown in Fig. 1, comprises the following steps:
S1. Collect, at a network access position, the Internet data packets that an Internet access device produces while online.
S2. Obtain the text information contained in the Internet data packets.
S3. Identify, from the text information, the language of the text information produced by the Internet access device.
Preferably, the following step is also included after step S3:
S4. Store the text information by category according to its language.
Specifically, the Internet data packets that an Internet access device produces while online can be collected by a data acquisition node arranged at the network access position. The Internet data packets of each device can be collected by polling.
Specifically, the Internet data packets may be stored first, the operations above then performed on the stored packets to identify the language of the text information produced by the device, and the stored data then labeled by category according to the identified language. Alternatively, the operations above may be performed first to identify the language of the text information, and the text information then stored by category according to its language. In short, the data may be stored either before or after identification, as determined by the system architecture when the system is built.
Preferably, as shown in Fig. 2, the step of obtaining the text information contained in the Internet data packets may comprise:
S21. Reassemble the Internet data packets into a transport-layer session data stream according to the transport-layer protocol.
S22. Parse out the data contained in the transport-layer session data stream according to the Hypertext Transfer Protocol (HTTP).
S23. Extract from that data the text information it contains.
Specifically, when the holder of an Internet access device uses it to send e-mail, chat, post on online forums and so on, text is generally input, so the Internet data packets that the device produces while online contain that text information. After the Internet data packets are collected, they are reassembled into a transport-layer session data stream according to the transport-layer protocol, and the data contained in the stream can be parsed out according to the Hypertext Transfer Protocol (HTTP). That data includes the MAC address of the device, the network access type (sending e-mail, browsing web pages, posting to forums, chatting, etc.) and the online content (e-mail content, URL addresses, forum post content, chat partners, chat content), so the text information it contains, such as e-mail content, chat content and post content, can be extracted from it.
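Steps S21 to S23 can be illustrated with a minimal Python sketch. It is not the patent's implementation: it assumes the payloads of one session are already available as hypothetical `(sequence_number, payload)` pairs, ignores sequence-number wrap-around and chunked transfer encoding, and the function names `reassemble_stream`, `split_http_message` and `extract_text` are invented for illustration.

```python
from typing import Dict, Iterable, Tuple


def reassemble_stream(segments: Iterable[Tuple[int, bytes]]) -> bytes:
    """S21 (sketch): order segments by sequence number and join their payloads."""
    stream = bytearray()
    next_seq = None
    for seq, payload in sorted(segments, key=lambda item: item[0]):
        if next_seq is None:
            next_seq = seq                   # first byte of the session
        if seq + len(payload) <= next_seq:
            continue                         # retransmission, already covered
        if seq > next_seq:
            break                            # gap: later data is not contiguous
        stream += payload[next_seq - seq:]   # drop any overlapping prefix
        next_seq = seq + len(payload)
    return bytes(stream)


def split_http_message(stream: bytes) -> Tuple[Dict[str, str], bytes]:
    """S22 (sketch): split an HTTP message into lower-cased headers and a body."""
    head, _, body = stream.partition(b"\r\n\r\n")
    headers: Dict[str, str] = {}
    for line in head.split(b"\r\n")[1:]:     # skip the request/status line
        name, _, value = line.partition(b":")
        headers[name.strip().lower().decode("latin-1")] = value.strip().decode("latin-1")
    return headers, body


def extract_text(headers: Dict[str, str], body: bytes) -> str:
    """S23 (sketch): decode the body using the declared charset, else UTF-8."""
    charset = "utf-8"                        # assumed default
    for part in headers.get("content-type", "").split(";")[1:]:
        key, _, value = part.partition("=")
        if key.strip().lower() == "charset":
            charset = value.strip().strip('"')
    try:
        return body.decode(charset, errors="replace")
    except LookupError:                      # unknown charset name
        return body.decode("utf-8", errors="replace")
```

A real body is often form-encoded, JSON or HTML rather than plain text, so an actual extraction step would need a further format-specific parser; the sketch only shows where that step sits in the flow.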
Preferably, as shown in Fig. 3, the step of identifying, from the text information, the language of the text information produced by the Internet access device may comprise:
S31. Parse out the character code that each character contained in the text information corresponds to in Unicode.
S32. Obtain the coding range of the text information in Unicode from the character codes.
S33. Identify the language of the text information from the coding range.
Specifically, the character code that each character in the text information corresponds to in Unicode is parsed out, and the coding range of the text information in Unicode is obtained from those character codes. For example, when the coding range falls within (4E00-9FBF), the language of the corresponding text information can be identified as Chinese by lookup and comparison against that range; when the coding range falls within (0600-06FF, 0750-077F, FB50-FDFF, FE70-FEFF), the language can be identified as Arabic; when the coding range falls within (1800-18AF), the language can be identified as Mongolian; and so on. From the language of the text information, the nationality of the device holder can then be judged, that is, whether the holder is Chinese, Arab, Mongolian, or a person of another country or nationality. After the language is confirmed, the text information is stored by category according to its language, for example as Chinese information, English information, Tibetan information, Uyghur information, mixed Chinese-English information, mixed Chinese-Uyghur information and so on, and displayed accordingly, which facilitates later query and monitoring.
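The range lookup in steps S31 to S33 and the classified storage of step S4 can be sketched in Python as follows. The three range tables are the ones quoted in the text; everything else is an assumption made for illustration, including the names `RANGES`, `identify_language` and `store_by_language`, the `"unknown"` label, and the way mixed text is labeled by joining the matched names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional

# Code-point ranges exactly as quoted in the description.
RANGES = {
    "Chinese": [(0x4E00, 0x9FBF)],
    "Arabic": [(0x0600, 0x06FF), (0x0750, 0x077F), (0xFB50, 0xFDFF), (0xFE70, 0xFEFF)],
    "Mongolian": [(0x1800, 0x18AF)],
}


def range_of(ch: str) -> Optional[str]:
    """S31/S32 (sketch): map one character's code point to a named range."""
    code_point = ord(ch)
    for name, spans in RANGES.items():
        if any(low <= code_point <= high for low, high in spans):
            return name
    return None  # punctuation, Latin letters, digits, anything not listed


def identify_language(text: str) -> str:
    """S33 (sketch): label text by the ranges its characters fall into."""
    found = {name for name in map(range_of, text) if name is not None}
    if not found:
        return "unknown"
    return "+".join(sorted(found))  # e.g. "Arabic+Chinese" for mixed text


def store_by_language(texts: Iterable[str]) -> Dict[str, List[str]]:
    """S4 (sketch): group texts in memory under their identified label."""
    store: Dict[str, List[str]] = defaultdict(list)
    for text in texts:
        store[identify_language(text)].append(text)
    return dict(store)
```

A code-point range identifies a writing script, not a language: the Arabic blocks, for instance, are also used to write Persian, Urdu and Uyghur, and Chinese characters also appear in Japanese text. A lookup of this kind therefore yields at best a coarse script label.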
With the method for identifying the language of network text information described in this embodiment, the Internet data packets can be obtained and the language of the text information in them identified without installing a client on the Internet access device, and the nationality of the device holder can be judged from the language. A security department can then monitor certain specific populations (for example, people within a certain nationality range) in a targeted way, which improves supervision efficiency, helps the security department obtain information related to terrorist activities in time, and maintains social stability.
Embodiment 2
This embodiment provides a system for identifying the language of network text information which, as shown in Fig. 4, comprises:
a collection device 1, for collecting, at a network access position, the Internet data packets that an Internet access device produces while online;
an acquisition device 2, for obtaining the text information contained in the Internet data packets;
an identification device 3, for identifying, from the text information, the language of the text information produced by the Internet access device.
Preferably, the system may further comprise a classified storage device 4, for storing the text information by category according to its language.
Preferably, the acquisition device 2 may comprise:
a reassembly unit 21, for reassembling the Internet data packets into a transport-layer session data stream according to the transport-layer protocol;
a first parsing unit 22, for parsing out the data contained in the transport-layer session data stream according to the Hypertext Transfer Protocol (HTTP);
an extraction unit 23, for extracting from that data the text information it contains.
Preferably, the identification device 3 may comprise:
a second parsing unit 31, for parsing out the character code that each character contained in the text information corresponds to in Unicode;
a range acquisition unit 32, for obtaining the coding range of the text information in Unicode from the character codes;
a language identification unit 33, for identifying the language of the text information from the coding range.
With the system for identifying the language of network text information described in this embodiment, no client needs to be installed on the Internet access device: the Internet data packets are obtained by the collection device 1, and the language of the text information in them is identified by the acquisition device 2 and the identification device 3. The nationality of the device holder can be judged from the language, so that a security department can monitor certain specific populations (for example, people within a certain nationality range) in a targeted way, which improves supervision efficiency, helps the security department obtain information related to terrorist activities in time, and maintains social stability.
Those skilled in the art should understand that embodiments of the invention may be provided as a method, a system, or a computer program product. The invention may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The invention is described with reference to flow charts and/or block diagrams of the method, device (system) and computer program product according to embodiments of the invention. It should be understood that each flow and/or block in the flow charts and/or block diagrams, and combinations of flows and/or blocks in them, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a specific way, so that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction device that implements the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the invention have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn the basic inventive concept. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the invention.

Claims (8)

1. A method for identifying the language of network text information, characterized in that it comprises the following steps:
collecting, at a network access position, the Internet data packets that an Internet access device produces while online;
obtaining the text information contained in the Internet data packets;
identifying, from the text information, the language of the text information produced by the Internet access device.
2. The method for identifying the language of network text information according to claim 1, characterized in that the step of obtaining the text information contained in the Internet data packets comprises:
reassembling the Internet data packets into a transport-layer session data stream according to the transport-layer protocol;
parsing out the data contained in the transport-layer session data stream according to the Hypertext Transfer Protocol (HTTP);
extracting from that data the text information it contains.
3. The method for identifying the language of network text information according to claim 1 or 2, characterized in that the step of identifying, from the text information, the language of the text information produced by the Internet access device comprises:
parsing out the character code that each character contained in the text information corresponds to in Unicode;
obtaining the coding range of the text information in Unicode from the character codes;
identifying the language of the text information from the coding range.
4. The method for identifying the language of network text information according to any one of claims 1-3, characterized in that, after the step of identifying the language of the text information produced by the Internet access device, it further comprises:
storing the text information by category according to its language.
5. A system for identifying the language of network text information, characterized in that it comprises:
a collection device (1), for collecting, at a network access position, the Internet data packets that an Internet access device produces while online;
an acquisition device (2), for obtaining the text information contained in the Internet data packets;
an identification device (3), for identifying, from the text information, the language of the text information produced by the Internet access device.
6. The system for identifying the language of network text information according to claim 5, characterized in that the acquisition device (2) comprises:
a reassembly unit (21), for reassembling the Internet data packets into a transport-layer session data stream according to the transport-layer protocol;
a first parsing unit (22), for parsing out the data contained in the transport-layer session data stream according to the Hypertext Transfer Protocol (HTTP);
an extraction unit (23), for extracting from that data the text information it contains.
7. The system for identifying the language of network text information according to claim 5 or 6, characterized in that the identification device (3) comprises:
a second parsing unit (31), for parsing out the character code that each character contained in the text information corresponds to in Unicode;
a range acquisition unit (32), for obtaining the coding range of the text information in Unicode from the character codes;
a language identification unit (33), for identifying the language of the text information from the coding range.
8. The system for identifying the language of network text information according to any one of claims 5-7, characterized in that it further comprises:
a classified storage device (4), for storing the text information by category according to its language.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410539771.8A CN104317847A (en) 2014-10-13 2014-10-13 Method and system for identifying languages in network text information

Publications (1)

Publication Number Publication Date
CN104317847A true CN104317847A (en) 2015-01-28

Family

ID=52373079

Country Status (1)

Country Link
CN (1) CN104317847A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106528535A (en) * 2016-11-14 2017-03-22 北京赛思信安技术股份有限公司 Multi-language identification method based on coding and machine learning
CN107203763A (en) * 2016-03-18 2017-09-26 北大方正集团有限公司 Character recognition method and device
CN107528765A (en) * 2016-06-22 2017-12-29 北京宸瑞国新科技有限公司 A kind of recognition methods of Email language and system
CN108600856A (en) * 2018-03-20 2018-09-28 青岛海信电器股份有限公司 The recognition methods of plug-in subtitle language and device in video file
CN111079408A (en) * 2019-12-26 2020-04-28 北京锐安科技有限公司 Language identification method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1750538A (en) * 2005-09-29 2006-03-22 西安交大捷普网络科技有限公司 Method for discovering and controlling of producing flow based on P2P high speed unloading software
CN101340308A (en) * 2008-08-19 2009-01-07 翁时锋 Network rubbish information filtering architecture, Network rubbish information cleaning system and method thereof
CN201298231Y (en) * 2008-01-30 2009-08-26 张新波 Multilingual communication and application system capable of automatically identifying multilanguage
CN101976231A (en) * 2010-08-25 2011-02-16 孙强国 Network supervision method for multi-language short messages
US20130145023A1 (en) * 2010-08-19 2013-06-06 Dekai Li Personalization of information content by monitoring network traffic

Non-Patent Citations (1)

Title
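Zhang Jian et al.: "Research on encoding and language identification algorithms for multilingual eml files", Journal of Xinjiang University (Natural Science Edition) *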

Cited By (8)

Publication number Priority date Publication date Assignee Title
CN107203763A (en) * 2016-03-18 2017-09-26 北大方正集团有限公司 Character recognition method and device
CN107203763B (en) * 2016-03-18 2020-03-06 北大方正集团有限公司 Character recognition method and device
CN107528765A (en) * 2016-06-22 2017-12-29 北京宸瑞国新科技有限公司 Method and system for identifying email language
CN106528535A (en) * 2016-11-14 2017-03-22 北京赛思信安技术股份有限公司 Multi-language identification method based on coding and machine learning
CN106528535B (en) * 2016-11-14 2019-04-26 北京赛思信安技术股份有限公司 Multi-language identification method based on coding and machine learning
CN108600856A (en) * 2018-03-20 2018-09-28 青岛海信电器股份有限公司 Method and device for identifying the language of external subtitles in a video file
CN111079408A (en) * 2019-12-26 2020-04-28 北京锐安科技有限公司 Language identification method, device, equipment and storage medium
CN111079408B (en) * 2019-12-26 2023-05-30 北京锐安科技有限公司 Language identification method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
US20160019470A1 (en) Event detection through text analysis using trained event template models
CN104317847A (en) Method and system for identifying languages in network text information
US11196758B2 (en) Method and system for enabling automated log analysis with controllable resource requirements
CN104063401B Method and apparatus for merging web page pattern addresses
US20150154249A1 (en) Data ingestion module for event detection and increased situational awareness
CN113645224B (en) Network attack detection method, device, equipment and storage medium
CN105577528B Virtual-machine-based data collection method and device for WeChat public platform
CN105516196A (en) HTTP message data-based parallelization network anomaly detection method and system
CN104615748B Internet of Things Web event-handling method based on Watir
CN110912888B (en) Malicious HTTP (hyper text transport protocol) traffic detection system and method based on deep learning
CN105468737A (en) Web service big data analysis method, cloud computing platform and mining system
CN103810268A (en) Search result recommendation information loading method, device and system and URL detection method, device and system
CN108289093A Method and system for constructing an app feature code library
CN106878397A Web user behavior feedback method and system
CN102984161A (en) Identification method and device for reliable website
US10671686B2 (en) Processing webpage data
CN104750663B Method and device for identifying garbled text in a page
CN108073693A Distributed web crawler system based on Hadoop
CN103036910A (en) Method and device for controlling user web access behaviors
CN102571922B (en) Method and device for processing data stream
CN103577180A (en) Data processing method and data processing device
CN102882988A (en) Method, device and equipment for acquiring address information of resource information
CN103793508A Method, device and system for loading recommendation information and detecting websites
Wang et al. Smart devices information extraction in home wi‐fi networks
CN114900492B (en) Abnormal mail detection method, device and system and computer readable storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
ASS Succession or assignment of patent right

Owner name: BEIJING SHENGSHI GUANGMING SOFTWARE CO., LTD.

Free format text: FORMER OWNER: SUN WEILI

Effective date: 20150204

C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20150204

Address after: 100085, C, block 9, 3rd Street, Haidian District, Beijing, 1210

Applicant after: Beijing Shengshi Guangming Software Co., Ltd.

Address before: 100085, Room 408, block D, Jinyu Ka Wah building, upper third street, Haidian District, Beijing

Applicant before: Sun Weili

RJ01 Rejection of invention patent application after publication

Application publication date: 20150128
