CN105447202A - Internet information collecting system - Google Patents
- Publication number
- CN105447202A (application CN201511032832.2A)
- Authority
- CN
- China
- Prior art keywords
- information
- unit
- internet
- module
- obtaining
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the field of the internet, and in particular to an internet information collecting system. An information source recognition unit identifies, according to keywords input by a user, an information source associated with the keywords and obtains a path of the information source; an information collecting unit obtains information associated with the information source according to the path; a filtering and analysis unit identifies and analyzes the collected information and filters out information that is not associated with the keywords; a semantic analysis unit performs semantic parsing on the stored information; and a data analysis unit obtains the semantically parsed information, analyzes it, and obtains an analysis result. The advantage of the system is that, by identifying information sources, the sources that meet the user's requirements are screened before any information is obtained, so that the needed information can be obtained systematically and comprehensively through those sources, providing data reference and decision support for users who need the information.
Description
Technical field
The present invention relates to the field of the internet, and in particular to an internet information collecting system.
Background
The spread of the internet has brought an enormous volume of information to every industry, and big data has emerged as a result. Big data (also called mega data or massive data) refers to high-volume, high-growth, and diverse information assets that require new processing models in order to deliver stronger decision-making power, insight, and process optimization.
The internet contains a vast number of sites, large and small, and the information accumulated on them is larger still. This information includes a considerable amount of data on matters such as business opportunities and treatment. The overwhelming majority of it is spread across interactive discussion spaces such as forums, personal spaces, and blogs. The data in these spaces is of considerable value and, to some extent, of considerable reference value. Enterprises, public institutions, government agencies, and similar bodies also need to follow the public opinion expressed in these spaces, so that clients can be given timely, targeted analysis of internet public opinion and data support for public crisis PR, opinion guidance, and similar work. At present, however, there is no reasonably systematic and comprehensive information system that retrieves this information to provide data reference and decision support for industry understanding.
Summary of the invention
In view of the above problem, an internet information collecting system is provided that can obtain internet information relatively systematically and comprehensively.
The specific technical solution is as follows:
An internet information collecting system, comprising:
an information source recognition unit, configured to identify, according to a keyword input by a user, an information source associated with the keyword, and to obtain a path of the information source;
an information collecting unit, connected to the information source recognition unit and configured to obtain, according to the path, information associated with the information source;
a filtering and analysis unit, connected to the information collecting unit and configured to identify and analyze the collected information and to filter out information that is not associated with the keyword;
a semantic analysis unit, connected to the filtering and analysis unit and configured to perform semantic parsing on the stored information;
a data analysis unit, connected to the semantic analysis unit and configured to obtain the semantically parsed information, analyze it, and obtain an analysis result.
Preferably, in the above internet information collecting system, the filtering and analysis unit comprises:
a first identification module, configured to identify the collected information and to classify it into preset categories according to the identification result;
a filtering module, connected to the first identification module and configured to filter out information that is not associated with the keyword.
Preferably, the above internet information collecting system further comprises:
a storage management unit, connected to the filtering and analysis unit and configured to store the filtered information by category and to manage the information.
Preferably, in the above internet information collecting system, the storage management unit comprises:
a plurality of storage modules, each configured to store information of one type;
an information classification module, connected to the storage modules and configured to classify the information according to a preset condition and to store the identified information in the corresponding storage module.
Preferably, in the above internet information collecting system, the storage management unit comprises:
an information integration module, configured to screen out duplicate information from the collected information;
an information retrieval module, connected to the information integration module and configured to retrieve, according to user input, from the information remaining after screening.
Preferably, in the above internet information collecting system, the semantic analysis unit comprises:
a second identification module, configured to identify the content of the stored information and to divide the identified information into language information and emotion information;
a language semantic analysis module, connected to the second identification module and configured to perform semantic parsing on the language information after screening, obtaining first parsed semantics;
an emotion semantic analysis module, connected to the second identification module and configured to perform semantic parsing on the emotion information after screening, obtaining second parsed semantics.
Preferably, the above internet information collecting system further comprises:
a supervision service unit, connected to the data analysis unit and configured to supervise the obtained analysis result.
Preferably, the above internet information collecting system further comprises:
a report generation unit, connected to the data analysis unit and configured to form an analysis report in a preset format according to the analysis result.
The beneficial effect of the invention is that, by identifying information sources, the sources that meet the user's requirements are screened before any information is obtained; the required information can then be obtained relatively systematically and comprehensively through those sources, providing data reference and decision support for users who need it.
Brief description of the drawings
Fig. 1 is a schematic diagram of the overall structure of a preferred embodiment of the internet information collecting system of the present invention;
Figs. 2-5 are schematic diagrams of partial structures of the preferred embodiment of the internet information collecting system of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
It should be noted that, provided they do not conflict, the embodiments of the present invention and the features within them may be combined with one another.
The invention is further described below with reference to the drawings and specific embodiments, which are not to be taken as limiting the invention.
As shown in Fig. 1, an internet information collecting system comprises:
an information source recognition unit 1, configured to identify, according to a keyword input by a user, an information source associated with the keyword, and to obtain a path of the information source;
an information collecting unit 2, connected to the information source recognition unit 1 and configured to obtain, according to the path, information associated with the information source;
a filtering and analysis unit 3, connected to the information collecting unit 2 and configured to identify and analyze the collected information and to filter out information that is not associated with the keyword;
a semantic analysis unit 4, connected to the filtering and analysis unit 3 and configured to perform semantic parsing on the stored information;
a data analysis unit 5, connected to the semantic analysis unit 4 and configured to obtain the semantically parsed information, analyze it, and obtain an analysis result.
By identifying information sources from the keyword entered by the user and obtaining the information associated with those sources, the system acts as a public opinion processing system with internet data crawling, basic semantic analysis, and data analysis capabilities. It provides an internet data crawling capability; a data analysis capability; a data classification capability, which allows data to be analyzed more accurately; and a data mining capability, which allows deeper analysis of internet data.
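The chain of units 1 to 5 described above can be sketched as a sequence of functions. This is only an illustrative sketch under stated assumptions, not the implementation disclosed in the patent: the in-memory `SOURCES` dictionary, the substring matching, and the word-frequency "analysis" are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from collections import Counter

# Hypothetical stand-in for the internet: source path -> posts found there.
# A real system would crawl pages over the network instead.
SOURCES = {
    "forum/food": ["The new milk brand tastes great", "Cheap watches for sale"],
    "blog/travel": ["Hotel prices are rising"],
}


def recognize_sources(keyword, sources):
    """Unit 1: return paths of sources whose content mentions the keyword."""
    kw = keyword.lower()
    return [path for path, posts in sources.items()
            if any(kw in post.lower() for post in posts)]


def collect(paths, sources):
    """Unit 2: gather every post found at the recognized paths."""
    return [post for path in paths for post in sources[path]]


def filter_unrelated(posts, keyword):
    """Unit 3: drop posts that do not mention the keyword."""
    kw = keyword.lower()
    return [post for post in posts if kw in post.lower()]


def parse_semantics(posts):
    """Unit 4: toy semantic parsing, lowercased word tokens per post."""
    return [post.lower().split() for post in posts]


def analyze(parsed_posts):
    """Unit 5: toy analysis, word frequencies over all parsed posts."""
    return Counter(word for tokens in parsed_posts for word in tokens)


def run_pipeline(keyword, sources=SOURCES):
    """Run units 1 to 5 in the order in which they are connected."""
    paths = recognize_sources(keyword, sources)
    posts = collect(paths, sources)
    related = filter_unrelated(posts, keyword)
    return analyze(parse_semantics(related))
```

Calling `run_pipeline("milk")` selects only the food forum as a source, discards the unrelated post collected from it, and counts the words of the remaining post. Selecting sources before collecting is the point the description emphasizes: the travel blog is never read at all.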
The system is aimed at clients that need to follow internet public opinion, such as enterprises, public institutions, and government agencies. It provides them with timely, targeted analysis of internet public opinion, and with data support for public crisis PR, opinion guidance, and similar work.
In a preferred embodiment of the present invention, as shown in Fig. 2, the filtering and analysis unit 3 comprises:
a first identification module 301, configured to identify the collected information and to classify it into preset categories according to the identification result;
a filtering module 302, connected to the first identification module 301 and configured to filter out information that is not associated with the keyword.
With the first identification module 301 and the filtering module 302, the filtering and analysis unit 3 identifies the collected information and classifies it according to whether it is an advertisement; collected information that is advertising is filtered out.
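A minimal sketch of modules 301 and 302 follows, assuming the advertisement check is a simple phrase match. The `AD_MARKERS` list and both function names are hypothetical; the patent does not say how advertisements are recognized.

```python
# Hypothetical phrases that mark an item as advertising.
AD_MARKERS = ("buy now", "discount", "click here")


def identify(item):
    """Module 301: assign one of the preset categories to an item."""
    text = item.lower()
    if any(marker in text for marker in AD_MARKERS):
        return "advertisement"
    return "content"


def filter_items(items, keyword):
    """Module 302: keep items that are not ads and that mention the keyword."""
    kw = keyword.lower()
    kept = []
    for item in items:
        if identify(item) == "advertisement":
            continue  # advertising is filtered out
        if kw not in item.lower():
            continue  # no association with the keyword
        kept.append(item)
    return kept
```

Keeping identification separate from filtering mirrors the two-module structure: the category labels produced by `identify` could be reused by later storage and classification steps.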
As shown in Fig. 3, a preferred embodiment of the present invention further comprises:
a storage management unit, connected to the filtering and analysis unit 3 and configured to store the filtered information by category and to manage the information; this unit makes it easier for the user to manage the collected information.
On the basis of the above technical solution, the storage management unit further comprises:
a plurality of storage modules 501, each configured to store information of one type;
an information classification module 502, connected to the storage modules 501 and configured to classify the information according to a preset condition and to store the identified information in the corresponding storage module 501.
For example, information can be classified by whether it concerns food, brands, complaints, suggestions, and so on, and each category is stored in its own independent storage module 501, so that the filtered information can be analyzed.
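The classify-then-store behavior of modules 501 and 502 might look like the following sketch. The cue words in `CATEGORY_RULES`, the `StorageManager` class, and the fallback category `"uncategorized"` are assumptions made for illustration; the patent only states that classification follows a preset condition.

```python
from collections import defaultdict

# Hypothetical preset conditions: category -> cue words.
CATEGORY_RULES = {
    "complaint": ("refund", "broken", "terrible"),
    "suggestion": ("should", "suggest", "wish"),
}


class StorageManager:
    """Routes each item to the storage module for its category."""

    def __init__(self, rules):
        self.rules = rules
        # One independent list per category plays the role of a module 501.
        self.modules = defaultdict(list)

    def classify(self, item):
        """Module 502: return the first category whose cue words match."""
        text = item.lower()
        for category, cues in self.rules.items():
            if any(cue in text for cue in cues):
                return category
        return "uncategorized"

    def store(self, item):
        """Classify the item and append it to the matching storage module."""
        category = self.classify(item)
        self.modules[category].append(item)
        return category
```

Because each category has its own list, later analysis can read only complaints or only suggestions without scanning everything that was collected.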
In a preferred embodiment of the present invention, as shown in Fig. 4, the storage management unit comprises:
an information integration module 503, configured to screen out duplicate information from the collected information;
an information retrieval module 504, connected to the information integration module 503 and configured to retrieve, according to user input, from the information remaining after screening.
The storage management unit thus also uses the information integration module 503 to consolidate the information and screen out duplicates, so that the user can retrieve it through the information retrieval module 504.
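One simple way to realize modules 503 and 504 is to fingerprint each item after normalizing case and whitespace, then search the deduplicated list. This is a sketch under that assumption; the patent does not specify how duplicates are detected, and the function names are hypothetical.

```python
import hashlib


def fingerprint(text):
    """Hash of the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def integrate(items):
    """Module 503: keep the first occurrence of each distinct item."""
    seen = set()
    unique = []
    for item in items:
        key = fingerprint(item)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


def search(items, query):
    """Module 504: return items that contain the user's query text."""
    q = query.lower()
    return [item for item in items if q in item.lower()]
```

Exact fingerprints only catch copies that differ in case or spacing. Reposts with small edits would need a near-duplicate technique, which is outside what this sketch covers.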
In a preferred embodiment of the present invention, as shown in Fig. 5, the semantic analysis unit 4 comprises:
a second identification module 401, configured to identify the content of the stored information and to divide the identified information into language information and emotion information;
a language semantic analysis module 402, connected to the second identification module 401 and configured to perform semantic parsing on the language information after screening, obtaining first parsed semantics;
an emotion semantic analysis module 403, connected to the second identification module 401 and configured to perform semantic parsing on the emotion information after screening, obtaining second parsed semantics.
The integrated information is semantically analyzed by the semantic analysis unit 4. Specifically, the second identification module 401 identifies the stored information and divides it into language information and emotion information, from which the first parsed semantics and the second parsed semantics are obtained. The user can then mine the data according to the parsed semantics and obtain the commercially valuable information needed.
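The split performed by module 401 and the two parsers 402 and 403 can be illustrated with a toy lexicon approach. Everything here is an assumption made for the sketch: the patent does not define how language and emotion information are separated, and `EMOTION_LEXICON`, `STOPWORDS`, and the scoring rule are hypothetical.

```python
import re

# Hypothetical emotion lexicon: word -> polarity.
EMOTION_LEXICON = {"great": 1, "love": 1, "terrible": -1, "hate": -1}
# Hypothetical function words ignored by the language parser.
STOPWORDS = {"the", "a", "is", "i", "this"}


def split_information(text):
    """Module 401: divide tokens into language and emotion information."""
    tokens = re.findall(r"[a-z']+", text.lower())
    emotion = [t for t in tokens if t in EMOTION_LEXICON]
    language = [t for t in tokens if t not in EMOTION_LEXICON]
    return language, emotion


def parse_language(tokens):
    """Module 402: first parsed semantics, here the content words."""
    return [t for t in tokens if t not in STOPWORDS]


def parse_emotion(tokens):
    """Module 403: second parsed semantics, here a summed polarity score."""
    return sum(EMOTION_LEXICON[t] for t in tokens)
```

For the sentence "I love this milk, the taste is great", the language side reduces to the topic words `milk` and `taste`, while the emotion side yields a positive score of 2. Keeping the two results separate lets later analysis ask what is being discussed and how people feel about it as independent questions.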
A preferred embodiment of the present invention further comprises:
a supervision service unit, connected to the data analysis unit 5 and configured to supervise the obtained analysis result.
A preferred embodiment of the present invention further comprises:
a report generation unit, connected to the data analysis unit 5 and configured to form an analysis report in a preset format according to the analysis result. This lets the user directly obtain a business analysis report associated with the keyword, for use in business decisions and the like.
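Forming a report in a preset format can be sketched with a fixed text template that is filled from the analysis result. The template fields and the shape of the `analysis` dictionary are hypothetical; the patent does not describe the report layout.

```python
from string import Template

# Hypothetical preset report format.
REPORT_TEMPLATE = Template(
    "Analysis report for keyword: $keyword\n"
    "Items analyzed: $total\n"
    "Positive items: $positive\n"
    "Negative items: $negative\n"
)


def generate_report(keyword, analysis):
    """Fill the preset template with values from the analysis result."""
    return REPORT_TEMPLATE.substitute(
        keyword=keyword,
        total=analysis["total"],
        positive=analysis["positive"],
        negative=analysis["negative"],
    )
```

`Template.substitute` raises `KeyError` if the template names a field that is not supplied, so an incomplete analysis result fails loudly instead of producing a report with gaps.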
The above are only preferred embodiments of the present invention and do not thereby limit its embodiments or scope of protection. Those skilled in the art should recognize that any solution obtained through equivalent substitutions and obvious variations made using the description and drawings of the present invention falls within the scope of protection of the present invention.
Claims (8)
1. An internet information collecting system, characterized by comprising:
an information source recognition unit, configured to identify, according to a keyword input by a user, an information source associated with the keyword, and to obtain a path of the information source;
an information collecting unit, connected to the information source recognition unit and configured to obtain, according to the path, information associated with the information source;
a filtering and analysis unit, connected to the information collecting unit and configured to identify and analyze the collected information and to filter out information that is not associated with the keyword;
a semantic analysis unit, connected to the filtering and analysis unit and configured to perform semantic parsing on the stored information;
a data analysis unit, connected to the semantic analysis unit and configured to obtain the semantically parsed information, analyze it, and obtain an analysis result.
2. The internet information collecting system according to claim 1, characterized in that the filtering and analysis unit comprises:
a first identification module, configured to identify the collected information and to classify it into preset categories according to the identification result;
a filtering module, connected to the first identification module and configured to filter out information that is not associated with the keyword.
3. The internet information collecting system according to claim 1, characterized by further comprising:
a storage management unit, connected to the filtering and analysis unit and configured to store the filtered information by category and to manage the information.
4. The internet information collecting system according to claim 3, characterized in that the storage management unit comprises:
a plurality of storage modules, each configured to store information of one type;
an information classification module, connected to the storage modules and configured to classify the information according to a preset condition and to store the identified information in the corresponding storage module.
5. The internet information collecting system according to claim 3, characterized in that the storage management unit comprises:
an information integration module, configured to screen out duplicate information from the collected information;
an information retrieval module, connected to the information integration module and configured to retrieve, according to user input, from the information remaining after screening.
6. The internet information collecting system according to claim 1, characterized in that the semantic analysis unit comprises:
a second identification module, configured to identify the content of the stored information and to divide the identified information into language information and emotion information;
a language semantic analysis module, connected to the second identification module and configured to perform semantic parsing on the language information after screening, obtaining first parsed semantics;
an emotion semantic analysis module, connected to the second identification module and configured to perform semantic parsing on the emotion information after screening, obtaining second parsed semantics.
7. The internet information collecting system according to claim 1, characterized by further comprising:
a supervision service unit, connected to the data analysis unit and configured to supervise the obtained analysis result.
8. The internet information collecting system according to claim 1, characterized by further comprising:
a report generation unit, connected to the data analysis unit and configured to form an analysis report in a preset format according to the analysis result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201511032832.2A CN105447202A (en) | 2015-12-31 | 2015-12-31 | Internet information collecting system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201511032832.2A CN105447202A (en) | 2015-12-31 | 2015-12-31 | Internet information collecting system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105447202A (en) | 2016-03-30 |
Family
ID=55557378
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201511032832.2A Pending CN105447202A (en) | 2015-12-31 | 2015-12-31 | Internet information collecting system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105447202A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108153865A (en) * | 2017-12-22 | 2018-06-12 | 中山市小榄企业服务有限公司 | A kind of network application acquisition system of internet |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030220908A1 (en) * | 2002-05-21 | 2003-11-27 | Bridgewell Inc. | Automatic knowledge management system |
CN102346772A (en) * | 2011-09-23 | 2012-02-08 | 王楠 | Directional acquisition system based on OWL (ontology web language) semantic analysis |
CN103176985A (en) * | 2011-12-20 | 2013-06-26 | 中国科学院计算机网络信息中心 | Timely and high-efficiency crawling method for internet information |
CN103473369A (en) * | 2013-09-27 | 2013-12-25 | 清华大学 | Semantic-based information acquisition method and semantic-based information acquisition system |
CN103544255A (en) * | 2013-10-15 | 2014-01-29 | 常州大学 | Text semantic relativity based network public opinion information analysis method |
CN103744877A (en) * | 2013-12-20 | 2014-04-23 | 潘大庆 | Public opinion monitoring application system deployed in internet and application method |
CN103778200A (en) * | 2014-01-09 | 2014-05-07 | 中国科学院计算技术研究所 | Method for extracting information source of message and system thereof |
CN104009970A (en) * | 2013-09-17 | 2014-08-27 | 宁波公众信息产业有限公司 | Network information acquisition method |
CN104182389A (en) * | 2014-07-21 | 2014-12-03 | 安徽华贞信息科技有限公司 | Semantic-based big data analysis business intelligence service system |
CN104933093A (en) * | 2015-05-19 | 2015-09-23 | 武汉泰迪智慧科技有限公司 | Regional public opinion monitoring and decision-making auxiliary system and method based on big data |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102831220B (en) | Subject-oriented customized news information extraction system | |
CN105468744B (en) | Big data platform for realizing tax public opinion analysis and full text retrieval | |
CN104504150A (en) | News public opinion monitoring system | |
CN104951512A (en) | Public sentiment data collection method and system based on Internet | |
CN102542061B (en) | Intelligent product classification method | |
CN108897778B (en) | Image annotation method based on multi-source big data analysis | |
CN104573016A (en) | System and method for analyzing vertical public opinions based on industry | |
CN103488635A (en) | Method and device for acquiring product information | |
KR20160075971A (en) | Big data management system for public complaints services | |
CN105677802A (en) | Internet information analysis system | |
CN104504151A (en) | Public opinion monitoring system of Wechat | |
CN111414520A (en) | Intelligent mining system for sensitive information in public opinion information | |
Al-Najran et al. | A requirements specification framework for big data collection and capture | |
CN104598561A (en) | Text-based intelligent agricultural video classification method and text-based intelligent agricultural video classification system | |
US20140280150A1 (en) | Multi-source contextual information item grouping for document analysis | |
US20190384812A1 (en) | Portfolio-based text analytics tool | |
CN107315799A (en) | A kind of internet duplicate message screening technique and system | |
Jaiswal et al. | Data Mining Techniques and Knowledge Discovery Database | |
CN106250405A (en) | A kind of magnanimity information processing system | |
US20200073871A1 (en) | A system for managing, analyzing, navigating or searching of data information across one or more sources within a computer or a computer network, without copying, moving or manipulating the source or the data information stored in the source | |
CN105447202A (en) | Internet information collecting system | |
KR101718599B1 (en) | System for analyzing social media data and method for analyzing social media data using the same | |
CN111368550A (en) | Public opinion information management system | |
Kotiyal et al. | Big Data Preprocessing Phase in Engendering Quality Data | |
KR20210045172A (en) | Big Data Management and System for Livestock Disease Outbreak Analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20160330 |