CN106446215A - Internet big data evidence collecting system - Google Patents


Info

Publication number
CN106446215A
Authority
CN
China
Prior art keywords
unit
information
module
internet
interface
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610874934.7A
Other languages
Chinese (zh)
Inventor
晋彤
李永康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Special Road Mdt Infotech Ltd
Original Assignee
Guangzhou Special Road Mdt Infotech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Special Road Mdt Infotech Ltd
Priority to CN201610874934.7A
Publication of CN106446215A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/951: Indexing; Web crawling techniques

Abstract

The invention discloses an Internet big data evidence collecting system comprising a template configuration module, an interface linking module, an information aggregation module and a system application module. The template configuration module configures the template parameters of the system; the interface linking module connects the system to the various kinds of Internet data; the information aggregation module collects the various kinds of harmful sensitive information on the Internet, performs contextual analysis and data mining on that information, automatically collects evidence, and captures the original webpage information and webpage snapshots; the system application module publishes the processed information to users. The system mines harmful sensitive information in depth, filters and blocks junk information, and comprehensively obtains important evidence of harmful sensitive information. It is easy to extend and offers high performance and high processing capacity.

Description

Internet big data evidence collecting system
Technical Field
The present invention relates to the field of Internet technology, and more particularly to an Internet big data evidence collecting system.
Background
With the continuous development of science, technology and society, the Internet has entered a period of rapid growth. However, because supervision by the responsible departments is lacking, the awareness of law enforcement personnel lags behind, and laws and regulations are incomplete, the Internet has also become a breeding ground for moral failings and even illegal acts. Many incidents involving online information have arisen, such as the "uncle" incidents, demolition incidents, indecent photo incidents and violent terrorist incidents. In the online information surrounding these incidents, some of the views expressed are truthful and help the incident develop and the problem get resolved. Much of it, however, is emotional, extreme or inflammatory: it belittles or attacks the work of local governments, or even fabricates rumors, misleads the public and stirs up dissatisfaction and anger toward the government among Internet users. This poses a serious hidden danger to the stability of online public opinion. Maintaining order on the Internet and investigating and restoring the truth of events have become new problems of the Internet era.
However, the openness and uncertainty of the network, and the way it spans time and space, make Internet evidence collection exceptionally difficult. In addition, the Internet carries a large volume of information that is updated quickly, and the authenticity, objectivity, validity and legality of evidence must be judged carefully, which makes Internet evidence collection harder still. A means of collecting data evidence suited to the Internet big data era is therefore urgently needed.
Summary of the Invention
In view of this, the present invention provides an Internet big data evidence collecting system, comprising: a template configuration module, an interface linking module, an information aggregation module and a system application module; wherein,
the template configuration module is used to configure the template parameters of the system;
the interface linking module is used to connect the system to all kinds of Internet data;
the information aggregation module is used to collect the various kinds of harmful sensitive information on the Internet, to perform contextual analysis and data mining on that information, to obtain the harmful sensitive information and automatically collect evidence, and to capture the original webpage information and a webpage snapshot;
the system application module is used to publish the processed information to users.
Further, the template configuration module includes a homepage configuration unit, a framework configuration unit, a page configuration unit and a structure configuration unit.
Further, the template configuration module also includes a contact configuration unit, a contact method configuration unit, a website browsing unit and a website publishing unit.
Further, the template configuration module also includes a personnel permission configuration unit and a system interface configuration unit.
Further, the interface linking module includes a management data interface unit, an application system data interface unit and an index data interface unit.
Further, the information aggregation module includes an information acquisition unit, an information mining unit, an information classification unit and a semantic analysis unit.
Further, the information aggregation module includes an information fusion unit, a data statistics unit and an information processing unit.
Further, the information aggregation module also includes a format conversion unit.
Further, the information aggregation module also includes an information transmission unit.
Further, the information aggregation module also includes a data packaging unit.
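The four-module structure described above can be pictured as a simple pipeline. The following Python sketch is only an illustration under stated assumptions: the patent names the modules but does not define their interfaces, so every class, field and function name here is hypothetical, data sources are stand-in callables, and "sensitive" is reduced to a plain keyword hit.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# All class, field and function names below are hypothetical; the patent
# names the four modules but does not define their interfaces.


@dataclass
class Evidence:
    url: str
    original_html: str   # original webpage information
    snapshot_html: str   # webpage snapshot kept as evidence


@dataclass
class TemplateConfigModule:
    # Assumption: template parameters are modeled as a simple mapping.
    params: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class InterfaceLinkingModule:
    # Each source is a callable returning {url: html}; a stand-in for real connectors.
    sources: List[Callable[[], Dict[str, str]]] = field(default_factory=list)

    def fetch_all(self) -> Dict[str, str]:
        pages: Dict[str, str] = {}
        for source in self.sources:
            pages.update(source())
        return pages


class InformationAggregationModule:
    def collect(self, pages: Dict[str, str], keywords: List[str]) -> List[Evidence]:
        # Assumption: "sensitive" is reduced to a plain keyword hit.
        return [
            Evidence(url, html, snapshot_html=html)
            for url, html in pages.items()
            if any(keyword in html for keyword in keywords)
        ]


class SystemApplicationModule:
    def publish(self, evidence: List[Evidence]) -> List[str]:
        # Stand-in for delivery to users: one summary line per evidence item.
        return [f"{e.url}: {len(e.snapshot_html)} characters captured" for e in evidence]


def run_pipeline(config: TemplateConfigModule,
                 linking: InterfaceLinkingModule,
                 aggregation: InformationAggregationModule,
                 application: SystemApplicationModule) -> List[str]:
    pages = linking.fetch_all()
    keywords = config.params.get("keywords", [])
    return application.publish(aggregation.collect(pages, keywords))
```

In this sketch the configuration module only supplies the keyword list; a fuller model would presumably also carry the page, framework and permission settings handled by the configuration units listed above.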
Implementing the present invention has the following beneficial effects:
The present invention adopts a B/S (browser/server) architecture based on a cloud platform and combines globally leading information collection and retrieval techniques and algorithms. Drawing on the applicant's professional experience and long-term industry experience, it mines harmful sensitive information in depth, filters and blocks junk information, and comprehensively obtains important evidence of harmful sensitive information. Being based on a distributed cloud platform and advanced collection techniques, the invention is easy to extend, performs well and has high processing capacity.
The present invention analyzes the hyperlinks and HTML of massive Internet data. Using a corpus of positive and negative information as its basis, it analyzes the mass of Internet information, locates the page snapshots and original page links that correspond to the feature vectors of sensitive Internet events or topics, and takes snapshots of such information. The snapshot and a screenshot of the original page are saved independently and retained as evidence for later tracing.
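As a rough illustration of the hyperlink and HTML analysis described above, the sketch below extracts the links and visible text of a page with Python's standard `html.parser` and scores the text against two small term sets standing in for the positive and negative corpus. The patent does not specify how its feature vector is built, so the two-count vector, the `negative > positive` rule and all function names are assumptions.

```python
import re
from html.parser import HTMLParser
from typing import Dict, List, Set, Tuple


class LinkAndTextParser(HTMLParser):
    """Collects href targets and text nodes from one HTML page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []
        self.text_parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())


def feature_vector(text: str, negative_terms: Set[str],
                   positive_terms: Set[str]) -> Tuple[int, int]:
    # Assumption: the feature vector is just (negative hits, positive hits).
    words = re.findall(r"\w+", text.lower())
    negative = sum(1 for word in words if word in negative_terms)
    positive = sum(1 for word in words if word in positive_terms)
    return negative, positive


def analyse_page(page_html: str, negative_terms: Set[str],
                 positive_terms: Set[str]) -> Dict[str, object]:
    parser = LinkAndTextParser()
    parser.feed(page_html)
    text = " ".join(parser.text_parts)
    negative, positive = feature_vector(text, negative_terms, positive_terms)
    return {
        "links": parser.links,            # original page links to follow or record
        "negative": negative,
        "positive": positive,
        "sensitive": negative > positive,  # hypothetical decision rule
    }
```

Whitespace-based word matching is used only to keep the sketch self-contained; Chinese text would need a proper word segmenter in place of the regular expression.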
The present invention crawls webpages and saves the captured information in a local database. Each webpage is then analyzed and processed: its HTML code is embedded in a prepared webpage frame to produce an automatic webpage snapshot, which is stored locally and associated with the index. The mapping is implemented by adding the link address to the corresponding field of the index database.
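The snapshot storage and index mapping described above might look roughly like the following, using Python's built-in `sqlite3` as a stand-in for the local database. The table names, the frame markup and the function names are hypothetical; the patent only states that the page HTML is embedded in a prepared frame and that a link field in the index database maps to the stored snapshot.

```python
import html
import sqlite3
from datetime import datetime, timezone
from typing import Optional

# Hypothetical "prepared webpage frame": a banner followed by the captured page HTML.
FRAME = (
    "<html><body>"
    "<div class=\"snapshot-banner\">Snapshot of {url} taken {taken}</div>"
    "{body}"
    "</body></html>"
)


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snapshots ("
        "id INTEGER PRIMARY KEY, html TEXT NOT NULL, taken_at TEXT NOT NULL)"
    )
    # The index table maps a link address to its stored snapshot.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_index ("
        "link TEXT PRIMARY KEY, snapshot_id INTEGER NOT NULL REFERENCES snapshots(id))"
    )
    return conn


def save_snapshot(conn: sqlite3.Connection, url: str, page_html: str) -> int:
    taken = datetime.now(timezone.utc).isoformat()
    framed = FRAME.format(url=html.escape(url), taken=taken, body=page_html)
    cur = conn.execute(
        "INSERT INTO snapshots (html, taken_at) VALUES (?, ?)", (framed, taken)
    )
    conn.execute(
        "INSERT OR REPLACE INTO page_index (link, snapshot_id) VALUES (?, ?)",
        (url, cur.lastrowid),
    )
    conn.commit()
    return cur.lastrowid


def load_snapshot(conn: sqlite3.Connection, url: str) -> Optional[str]:
    row = conn.execute(
        "SELECT s.html FROM page_index i JOIN snapshots s ON s.id = i.snapshot_id "
        "WHERE i.link = ?",
        (url,),
    ).fetchone()
    return row[0] if row else None
```

With this layout, re-saving the same link repoints the index at the newest snapshot while older snapshot rows stay in place, which is one possible way to retain earlier captures.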
The system uses rendering to highlight, in color, the characters in the webpage snapshot that match the keywords entered by the user. The query keywords that the user enters in the Web Searcher interface are segmented into a sequence of tokens; the tokens are then searched for and string-matched in the webpage snapshot, each matching keyword is wrapped in a corresponding HTML tag, and HTML markup is used to display it in color.
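The highlighting step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: whitespace splitting stands in for real word segmentation, the `span` tag and yellow background are assumed choices, and the function names are hypothetical. Matching is applied only to text between tags so that attribute values are left untouched.

```python
import re
from typing import List


def segment(query: str) -> List[str]:
    # Assumption: whitespace splitting stands in for a real word segmenter.
    return [token for token in query.split() if token]


def highlight(snapshot_html: str, query: str, color: str = "yellow") -> str:
    tokens = segment(query)
    if not tokens:
        return snapshot_html
    # Longest tokens first, so a long token wins over one of its prefixes.
    pattern = re.compile(
        "|".join(re.escape(t) for t in sorted(tokens, key=len, reverse=True)),
        re.IGNORECASE,
    )

    def wrap(match: "re.Match[str]") -> str:
        return f'<span style="background-color: {color}">{match.group(0)}</span>'

    # Split into tags and text; the capturing group keeps the tags in the result.
    parts = re.split(r"(<[^>]+>)", snapshot_html)
    out = []
    for part in parts:
        if part.startswith("<") and part.endswith(">"):
            out.append(part)                     # leave tags and attributes as they are
        else:
            out.append(pattern.sub(wrap, part))  # highlight matches in text only
    return "".join(out)
```

For example, `highlight('<p class="rumor">A rumor spread</p>', "rumor")` wraps the word in the paragraph text but leaves the `class` attribute unchanged.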
The system established by the present invention allows users to understand all kinds of important sensitive information quickly and intuitively, and forms a complete Internet big data evidence collection system.
Brief Description of the Drawings
In order to illustrate the embodiments of the present invention or the technical solutions and advantages of the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a structural block diagram of the system of the present invention.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Embodiment:
As shown in Fig. 1, the Internet big data evidence collecting system of this embodiment comprises the template configuration module, the interface linking module, the information aggregation module and the system application module. The modules, their units and the beneficial effects are the same as those already set out above.
The above is a preferred embodiment of the present invention. It should be noted that a person of ordinary skill in the art may make improvements and modifications without departing from the principles of the present invention, and such improvements and modifications are also regarded as falling within the scope of protection of the present invention.

Claims (10)

1. An Internet big data evidence collecting system, characterized by comprising: a template configuration module, an interface linking module, an information aggregation module and a system application module; wherein,
the template configuration module is used to configure the template parameters of the system;
the interface linking module is used to connect the system to all kinds of Internet data;
the information aggregation module is used to collect the various kinds of harmful sensitive information on the Internet, to perform contextual analysis and data mining on that information, to obtain the harmful sensitive information and automatically collect evidence, and to capture the original webpage information and a webpage snapshot;
the system application module is used to publish the processed information to users.
2. The system according to claim 1, characterized in that the template configuration module includes a homepage configuration unit, a framework configuration unit, a page configuration unit and a structure configuration unit.
3. The system according to claim 2, characterized in that the template configuration module also includes a contact configuration unit, a contact method configuration unit, a website browsing unit and a website publishing unit.
4. The system according to claim 3, characterized in that the template configuration module also includes a personnel permission configuration unit and a system interface configuration unit.
5. The system according to claim 1, characterized in that the interface linking module includes a management data interface unit, an application system data interface unit and an index data interface unit.
6. The system according to claim 1, characterized in that the information aggregation module includes an information acquisition unit, an information mining unit, an information classification unit and a semantic analysis unit.
7. The system according to claim 1, characterized in that the information aggregation module includes an information fusion unit, a data statistics unit and an information processing unit.
8. The system according to claim 7, characterized in that the information aggregation module also includes a format conversion unit.
9. The system according to claim 8, characterized in that the information aggregation module also includes an information transmission unit.
10. The system according to claim 9, characterized in that the information aggregation module also includes a data packaging unit.
CN201610874934.7A 2016-09-30 2016-09-30 Internet big data evidence collecting system Pending CN106446215A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610874934.7A CN106446215A (en) 2016-09-30 2016-09-30 Internet big data evidence collecting system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610874934.7A CN106446215A (en) 2016-09-30 2016-09-30 Internet big data evidence collecting system

Publications (1)

Publication Number Publication Date
CN106446215A true CN106446215A (en) 2017-02-22

Family

ID=58172807

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610874934.7A Pending CN106446215A (en) 2016-09-30 2016-09-30 Internet big data evidence collecting system

Country Status (1)

Country Link
CN (1) CN106446215A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090171961A1 (en) * 2007-12-28 2009-07-02 Jason Fredrickson Workflow collaboration in a forensic investigations system
CN102902703A (en) * 2012-07-19 2013-01-30 中国人民解放军国防科学技术大学 Network sensitive information-oriented screenshot discovery and locking callback method
CN102968600A (en) * 2012-10-30 2013-03-13 国网电力科学研究院 Full life-cycle management method for sensitive data file based on fingerprint information implantation
CN105468990A (en) * 2014-09-04 2016-04-06 中国移动通信集团安徽有限公司 Sensitive information management control method and apparatus

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108629012A (en) * 2018-05-07 2018-10-09 厦门市美亚柏科信息股份有限公司 Forensic data parses the intelligent checking method and system of accuracy
CN108629012B (en) * 2018-05-07 2020-08-25 厦门市美亚柏科信息股份有限公司 Intelligent verification method and system for forensic data analysis accuracy

Similar Documents

Publication Publication Date Title
CN110717049B (en) Text data-oriented threat information knowledge graph construction method
KR101911466B1 (en) Analysis system for predicting future risks
CN103870461B (en) Subject recommending method, device and server
CN103763124A (en) Internet user behavior analyzing and early-warning system and method
CN101231661B (en) Method and system for digging object grade knowledge
CN104281702B (en) Data retrieval method and device based on electric power critical word participle
CN106815307A (en) Public Culture knowledge mapping platform and its use method
CN103873601B (en) A kind of method for digging and system addressing class query word
CN103744877A (en) Public opinion monitoring application system deployed in internet and application method
CN106250424A (en) The searching method of a kind of daily record context, Apparatus and system
CN104536956A (en) A Microblog platform based event visualization method and system
CN102902703A (en) Network sensitive information-oriented screenshot discovery and locking callback method
CN102298638A (en) Method and system for extracting news webpage contents by clustering webpage labels
CN104182482B (en) A kind of news list page determination methods and the method for screening news list page
CN104063497A (en) Viewpoint processing method and device and searching method and device
CN103279476B (en) The detection method of a kind of WEB application system sensitive word and system
CN106202563A (en) A kind of real time correlation evental news recommends method and system
US20120117034A1 (en) Context-aware apparatus and method
CN106021351A (en) An aggregation extraction method and device for news events
CN104933171A (en) Method and device for associating data of interest point
CN109857869A (en) A kind of hot topic prediction technique based on Ap increment cluster and network primitive
CN101794308A (en) Method for extracting repeated strings facing meaningful string mining and device
CN104778232B (en) Searching result optimizing method and device based on long query
CN106446215A (en) Internet big data evidence collecting system
CN116521729A (en) Information classification searching method and device based on elastic search

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170222