CN110941757A - Big data based policy information query pushing system and method - Google Patents
Big data based policy information query pushing system and method
- Publication number
- CN110941757A (application number CN201911096896.7A)
- Authority
- CN
- China
- Prior art keywords
- module
- policy information
- big data
- database
- big
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention discloses a big data-based policy information query pushing system. An acquisition module is connected with a screening module, the screening module is connected with an entry module, and the entry module is connected with a database through a storage module. A calling module is arranged in the database, the database is connected with a big data platform through a networking module, and the big data platform is also connected with a display module and a pushing module. Through a timing module, an index module and a keyword grabbing module, the invention can periodically index the input keywords, so that policy information is updated in time and the queried and pushed policy information is up to date. The index results can be compared for relevance, which avoids the case where individual words are relevant but the article as a whole is not. Filtering repeated index results reduces the indexing time and the indexing load on the system. Pictures, videos and text in a webpage can be distinguished and classified through the source code, so that the indexed policy information is obtained quickly and effectively and irrelevant content is eliminated.
Description
Technical Field
The invention relates to the technical field of big data-based policy information query and pushing, and in particular to a big data-based policy information query pushing system and method.
Background
Big data refers to data sets that cannot be captured, managed and processed by conventional software tools within a certain time range. It is a massive, fast-growing and diversified information asset that requires new processing modes in order to provide stronger decision-making power, insight discovery and process optimization capability.
An information policy comprises the measures and strategies adopted by a country to develop information resources, develop the information industry and coordinate the use of information.
Most current big data-based policy information query systems rely on manual querying and screening, so new policy information cannot be obtained promptly after a policy is updated. An improved technique is therefore urgently needed to solve this problem in the prior art.
Disclosure of Invention
The invention aims to provide a big data-based policy information query pushing system and method which can periodically index the input keywords through a timing module, an index module and a keyword grabbing module, so that policy information is updated in time and the queried and pushed policy information is up to date, thereby solving the problems described in the background.
To achieve this purpose, the invention provides the following technical solution: a big data-based policy information query pushing system comprising an acquisition module, a screening module, an entry module, a storage module, a database, a calling module, a networking module, a big data platform, a display module and a pushing module. The acquisition module is connected with the screening module, the screening module is connected with the entry module, and the entry module is connected with the database through the storage module. The calling module is arranged in the database, the database is connected with the big data platform through the networking module, and the big data platform is also connected with the display module and the pushing module.
Preferably, the acquisition module comprises an input module, an index module, a keyword grabbing module and a timing module, wherein the input module is used for inputting keywords, the index module performs network indexing on the keywords supplied by the input module, the keyword grabbing module matches the keywords entered through the input module against the network index results, and the timing module is used for setting the start-up cycle time of the index module and the keyword grabbing module.
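The periodic indexing performed by the acquisition module can be illustrated with a minimal Python sketch. The patent does not specify an implementation; the function name `run_acquisition_cycles`, the injected `search` callable (standing in for the index module) and the substring match (standing in for the keyword grabbing module) are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, List


def run_acquisition_cycles(
    keywords: List[str],
    search: Callable[[str], List[str]],
    period_t: float,
    cycles: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, List[str]]:
    """Re-index every keyword once per period T and keep the latest results."""
    latest: Dict[str, List[str]] = {}
    for cycle in range(cycles):
        for keyword in keywords:
            # Index module (hypothetical `search`) returns candidate results;
            # the keyword grabbing step keeps only hits containing the keyword.
            hits = search(keyword)
            latest[keyword] = [h for h in hits if keyword.lower() in h.lower()]
        if cycle < cycles - 1:
            sleep(period_t)  # timing module: wait one period T before re-indexing
    return latest
```

Passing `sleep` as a parameter is a design choice of this sketch: it lets the period T be exercised without real waiting, for example by recording the requested delays in a list.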
Preferably, the screening module comprises a relevance comparison module, a repetition comparison module and a code judgment module, wherein the relevance comparison module is used for keyword relevance comparison, the repetition comparison module is used for comparing the search results for repetition and filtering out redundant repeated information, and the code judgment module is used for judging webpage codes and classifying and screening pictures, videos and text.
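One possible reading of the relevance comparison is a score computed over the whole article rather than over a single matched word. The sketch below is an assumption, not the patented method: it scores a result by the fraction of the input keywords that occur in it, handles single-word keywords only, and uses hypothetical names (`relevance_score`, `filter_relevant`) and a hypothetical default threshold of 0.6.

```python
import re
from typing import List


def relevance_score(text: str, keywords: List[str]) -> float:
    """Fraction of the keywords that occur as whole words in the text."""
    if not keywords:
        return 0.0
    words = set(re.findall(r"\w+", text.lower()))
    matched = sum(1 for k in keywords if k.lower() in words)
    return matched / len(keywords)


def filter_relevant(
    results: List[str], keywords: List[str], threshold: float = 0.6
) -> List[str]:
    """Keep results whose overall score reaches the (assumed) threshold."""
    return [r for r in results if relevance_score(r, keywords) >= threshold]
```

Under this scoring, a page that matches only one of three keywords scores about 0.33 and is dropped, which corresponds to the case where an individual word is relevant but the article as a whole is not.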
Preferably, the entry module and the storage module are used for entering and storing the screened network search results into the database.
Preferably, the calling module is used for calling data in the database and is connected with the big data platform through the networking module.
Preferably, the display module is used for displaying policy information in the big data platform.
Preferably, the pushing module is used for pushing policy information in the big data platform.
Preferably, the method of use comprises the following steps:
step one: first set the time T of the timing module, then input the relevant keywords through the input module, and index the keywords on the network through the index module in conjunction with the keyword grabbing module to obtain search results;
step two: compare the search results for relevance through the relevance comparison module and select the search results with high relevance; through the code judgment module, identify the text characters in the character code strings, the pictures in the main body paragraphs, the videos in the main body paragraphs and the other codes in the non-body paragraphs; compare all text characters in the search results for repetition through the repetition comparison module, and reject the search results with high repetition;
step three: store the search results screened by the screening module, through the entry module and the storage module, into the database connected with the big data platform;
step four: when a query device is used for querying, the display module displays the relevant policy information in the big data platform on the query device, and the pushing module can push the relevant policy information in real time.
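The pushing behaviour in step four can be sketched as follows. The patent does not describe how subscribers or deliveries are tracked, so the `PushModule` class, its keyword subscriptions and the injected `send` callable are hypothetical; the sketch only shows one way to push each item to a user at most once.

```python
from typing import Callable, Dict, List, Set


class PushModule:
    """Pushes each stored policy item to a subscriber at most once (assumed design)."""

    def __init__(self) -> None:
        self._subscriptions: Dict[str, Set[str]] = {}  # user -> subscribed keywords
        self._delivered: Dict[str, Set[str]] = {}      # user -> items already sent

    def subscribe(self, user: str, keyword: str) -> None:
        self._subscriptions.setdefault(user, set()).add(keyword.lower())
        self._delivered.setdefault(user, set())

    def push_new(self, items: List[str], send: Callable[[str, str], None]) -> int:
        """Send every not-yet-delivered matching item; return how many were sent."""
        sent = 0
        for user, keywords in self._subscriptions.items():
            for item in items:
                if item in self._delivered[user]:
                    continue  # already pushed to this user
                if any(k in item.lower() for k in keywords):
                    send(user, item)
                    self._delivered[user].add(item)
                    sent += 1
        return sent
```

Calling `push_new` after each acquisition cycle with the items currently held by the platform would then deliver only the items that appeared since the previous cycle.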
Compared with the prior art, the invention has the beneficial effects that:
(1) The invention can periodically index the input keywords through the timing module, the index module and the keyword grabbing module, so that policy information is updated in time and the queried and pushed policy information is up to date.
(2) The index results can be compared for relevance, which avoids the case where individual words are relevant but the article as a whole is not.
(3) The repetition comparison module can avoid a large number of identical index results, reducing the indexing time and the indexing load on the system.
(4) The code judgment module can distinguish and classify the pictures, videos and text in a webpage through the source code, so that the indexed policy information is obtained quickly and effectively and irrelevant content is removed.
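The classification of webpage source code into text, pictures and videos can be illustrated with Python's standard `html.parser` module. This is a sketch under assumptions: the patent does not say which tags are examined, so the `CodeJudge` class, the list of skipped container tags, and the treatment of every `<source>` element as a video are all choices made here for illustration.

```python
from html.parser import HTMLParser
from typing import Dict, List


class CodeJudge(HTMLParser):
    """Sorts page content into text, pictures and videos; drops other code."""

    # Assumed examples of non-body containers (advertisement slots, scripts, etc.).
    SKIPPED = {"script", "style", "aside", "nav", "footer"}

    def __init__(self) -> None:
        super().__init__()
        self.found: Dict[str, List[str]] = {"text": [], "pictures": [], "videos": []}
        self._skip_depth = 0  # > 0 while inside a skipped container

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
            return
        if self._skip_depth:
            return
        src = dict(attrs).get("src")
        if tag == "img" and src:
            self.found["pictures"].append(src)
        elif tag in ("video", "source") and src:
            # Simplification: every <source> is treated as a video source.
            self.found["videos"].append(src)

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.found["text"].append(data.strip())


def classify_page(html: str) -> Dict[str, List[str]]:
    parser = CodeJudge()
    parser.feed(html)
    parser.close()
    return parser.found
```

A production system would likely need site-specific rules for locating the main body of a page; the fixed tag list above is only a stand-in for that decision.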
Drawings
FIG. 1 is a block diagram of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1, the present invention provides a technical solution: a big data-based policy information query pushing system comprising an acquisition module, a screening module, an entry module, a storage module, a database, a calling module, a networking module, a big data platform, a display module and a pushing module. The acquisition module is connected with the screening module, the screening module is connected with the entry module, and the entry module is connected with the database through the storage module. The entry module and the storage module are used for entering the screened network search results and storing them in the database. The calling module is arranged in the database; it is used for calling the data in the database and is connected with the big data platform through the networking module. The database is connected with the big data platform through the networking module, and the big data platform is also connected with the display module and the pushing module. The display module is used for displaying the policy information in the big data platform, and the pushing module is used for pushing the policy information in the big data platform.
The acquisition module comprises an input module, an index module, a keyword grabbing module and a timing module. The input module is used for inputting keywords, the index module performs network indexing on the keywords supplied by the input module, the keyword grabbing module matches the keywords entered through the input module against the network index results, and the timing module is used for setting the start-up cycle time of the index module and the keyword grabbing module. The screening module comprises a relevance comparison module, a repetition comparison module and a code judgment module. The relevance comparison module is used for keyword relevance comparison, the repetition comparison module is used for comparing the search results for repetition and filtering out redundant repeated information, and the code judgment module is used for judging webpage codes and classifying and screening pictures, videos and text.
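The path from the entry and storage modules into the database, and back out through the calling module, can be sketched with Python's built-in `sqlite3` module. The patent names no database product or schema; the `policy` table, its columns and the function names below are assumptions chosen for illustration.

```python
import sqlite3
from typing import List, Tuple


def open_database(path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) policy table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS policy ("
        "url TEXT PRIMARY KEY, keyword TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return conn


def store_results(conn: sqlite3.Connection, rows: List[Tuple[str, str, str]]) -> None:
    """Entry and storage modules: insert screened results, replacing stale copies."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO policy (url, keyword, body) VALUES (?, ?, ?)",
            rows,
        )


def call_results(conn: sqlite3.Connection, keyword: str) -> List[Tuple[str, str]]:
    """Calling module: fetch the stored results for one keyword."""
    cur = conn.execute(
        "SELECT url, body FROM policy WHERE keyword = ? ORDER BY url", (keyword,)
    )
    return cur.fetchall()
```

Keying the table on the URL means that a later acquisition cycle overwrites the earlier copy of the same page, so a query returns the most recently stored version.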
A use method of a big data-based policy information query pushing system comprises the following steps:
step one: first set the time T of the timing module (the time T is the acquisition cycle period of the acquisition module), then input the relevant keywords through the input module, index the keywords on the network through the index module in conjunction with the keyword grabbing module, and grab the keywords on webpages using search engine spider technology to obtain search results;
step two: compare the search results for relevance through the relevance comparison module and select the search results with high relevance; through the code judgment module, identify the text characters in the character code strings, the pictures in the main body paragraphs, the videos in the main body paragraphs and the other codes in the non-body paragraphs, and filter out and delete the redundant other codes directly, where the other codes include advertisement slot codes and text, pictures or videos in irrelevant codes; compare all text characters in the search results for repetition through the repetition comparison module, where the repetition degree applied to the text can be adjusted, and remove the search results with high repetition;
step three: store the search results screened by the screening module, through the entry module and the storage module, into the database connected with the big data platform;
step four: when a query device is used for querying, the display module displays the relevant policy information in the big data platform on the query device, and the pushing module can push the relevant policy information in real time.
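The repetition comparison in step two, with its adjustable repetition degree, can be sketched as a similarity measure plus a threshold. The patent does not name a measure; the Jaccard similarity over character shingles used below is a substituted, commonly used technique, and the names `shingles`, `repetition` and `drop_repeats` and the default threshold of 0.8 are assumptions.

```python
from typing import List, Set


def shingles(text: str, size: int = 3) -> Set[str]:
    """Set of overlapping character n-grams, ignoring case and whitespace."""
    normalized = "".join(text.lower().split())
    if len(normalized) <= size:
        return {normalized}
    return {normalized[i:i + size] for i in range(len(normalized) - size + 1)}


def repetition(a: str, b: str) -> float:
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


def drop_repeats(results: List[str], max_repetition: float = 0.8) -> List[str]:
    """Keep a result only if it is not too similar to any result already kept."""
    kept: List[str] = []
    for candidate in results:
        if all(repetition(candidate, k) < max_repetition for k in kept):
            kept.append(candidate)
    return kept
```

Raising `max_repetition` keeps more near-duplicates and lowering it removes more, which is one way to realise the adjustable repetition degree mentioned in step two. The pairwise comparison is quadratic in the number of results, so a large result set would call for an approximate method instead.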
The acquisition module of the invention consists of an input module, an index module, a keyword grabbing module and a timing module. Through the timing module, the index module and the keyword grabbing module, the input keywords can be indexed periodically, so that policy information is updated in time and the queried and pushed policy information is up to date. The screening module consists of a relevance comparison module, a repetition comparison module and a code judgment module. The index results can be compared for relevance, which avoids the case where individual words are relevant but the article as a whole is not. The repetition comparison module can avoid a large number of identical index results, reducing the indexing time and the indexing load on the system. The code judgment module can distinguish and classify the pictures, videos and text in a webpage through the source code, so that the indexed policy information is obtained quickly and effectively and irrelevant content is removed.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (8)
1. A big data-based policy information query pushing system, characterized in that: it comprises an acquisition module, a screening module, an entry module, a storage module, a database, a calling module, a networking module, a big data platform, a display module and a pushing module, wherein the acquisition module is connected with the screening module, the screening module is connected with the entry module, the entry module is connected with the database through the storage module, the calling module is arranged in the database, the database is connected with the big data platform through the networking module, and the big data platform is also connected with the display module and the pushing module.
2. The big-data-based policy information query pushing system according to claim 1, wherein: the acquisition module comprises an input module, an index module, a keyword grabbing module and a timing module, wherein the input module is used for inputting keywords, the index module performs network indexing on the keywords supplied by the input module, the keyword grabbing module matches the keywords entered through the input module against the network index results, and the timing module is used for setting the start-up cycle time of the index module and the keyword grabbing module.
3. The big-data-based policy information query pushing system according to claim 1, wherein: the screening module comprises a relevance comparison module, a repetition comparison module and a code judgment module, wherein the relevance comparison module is used for keyword relevance comparison, the repetition comparison module is used for comparing the search results for repetition and filtering out redundant repeated information, and the code judgment module is used for judging webpage codes and classifying and screening pictures, videos and text.
4. The big-data-based policy information query pushing system according to claim 1, wherein: the entry module and the storage module are used for entering the screened network search results and storing them in the database.
5. The big-data-based policy information query pushing system according to claim 1, wherein: the calling module is used for calling data in the database and is connected with the big data platform through the networking module.
6. The big-data-based policy information query pushing system according to claim 1, wherein: the display module is used for displaying policy information in the big data platform.
7. The big-data-based policy information query pushing system according to claim 1, wherein: the pushing module is used for pushing policy information in the big data platform.
8. A method for using the big-data-based policy information query pushing system according to claim 1, wherein the method comprises the following steps:
step one: first set the time T of the timing module, then input the relevant keywords through the input module, and index the keywords on the network through the index module in conjunction with the keyword grabbing module to obtain search results;
step two: compare the search results for relevance through the relevance comparison module and select the search results with high relevance; through the code judgment module, identify the text characters in the character code strings, the pictures in the main body paragraphs, the videos in the main body paragraphs and the other codes in the non-body paragraphs; compare all text characters in the search results for repetition through the repetition comparison module, and reject the search results with high repetition;
step three: store the search results screened by the screening module, through the entry module and the storage module, into the database connected with the big data platform;
step four: when a query device is used for querying, the display module displays the relevant policy information in the big data platform on the query device, and the pushing module can push the relevant policy information in real time.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911096896.7A CN110941757A (en) | 2019-11-11 | 2019-11-11 | Big data based policy information query pushing system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911096896.7A CN110941757A (en) | 2019-11-11 | 2019-11-11 | Big data based policy information query pushing system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110941757A true CN110941757A (en) | 2020-03-31 |
Family
ID=69907630
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911096896.7A Pending CN110941757A (en) | 2019-11-11 | 2019-11-11 | Big data based policy information query pushing system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110941757A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111652485A (en) * | 2020-05-25 | 2020-09-11 | 青海绿能数据有限公司 | New energy data acquisition and analysis system based on big data platform |
CN112258144A (en) * | 2020-09-27 | 2021-01-22 | 重庆生产力促进中心 | Policy file information matching and pushing method based on automatic construction of target entity set |
CN112258144B (en) * | 2020-09-27 | 2022-04-26 | 重庆生产力促进中心 | Policy file information matching and pushing method based on automatic construction of target entity set |
CN112632387A (en) * | 2020-12-30 | 2021-04-09 | 广东富状元科技有限公司 | Big data-based policy information personalized customization pushing system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3819792A2 (en) | Method, apparatus, device, and storage medium for intention recommendation | |
CN111460252B (en) | Automatic search engine method and system based on network public opinion analysis | |
CN110941757A (en) | Big data based policy information query pushing system and method | |
CN111967761B (en) | Knowledge graph-based monitoring and early warning method and device and electronic equipment | |
US20190034498A1 (en) | Determining a presentation format for search results based on a presentation recommendation machine learning model | |
CN105718587A (en) | Network content resource evaluation method and evaluation system | |
CN112307762B (en) | Search result sorting method and device, storage medium and electronic device | |
CN107895008B (en) | Information hotspot discovery method based on big data platform | |
CN111049818B (en) | Abnormal information discovery method based on network traffic big data | |
CN104866471A (en) | Instance matching method based on local sensitive Hash strategy | |
CN101339560B (en) | Method and device for searching series data, and search engine system | |
CN109376142A (en) | Data migration method and terminal device | |
CN111859065A (en) | Big data-based public opinion listening system | |
CN103077216A (en) | Sub-graph matching device and sub-graph matching method | |
CN116032741A (en) | Equipment identification method and device, electronic equipment and computer storage medium | |
CN108228787A (en) | According to the method and apparatus of multistage classification processing information | |
CN112199488A (en) | Incremental knowledge graph entity extraction method and system for power customer service question answering | |
CN115858906A (en) | Enterprise searching method, device, equipment, computer storage medium and program | |
CN103136256A (en) | Method and system for achieving information retrieval in network | |
CN110175197B (en) | Ontology construction method and system based on semantic Internet of things | |
CN113673889A (en) | Intelligent data asset identification method | |
CN113342844A (en) | Industrial intelligent search system | |
CN104951869A (en) | Workflow-based public opinion monitoring method and workflow-based public opinion monitoring device | |
CN110990430A (en) | Large-scale data parallel processing system | |
CN114666145B (en) | Security early warning method and system based on network acquisition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
WD01 | Invention patent application deemed withdrawn after publication | ||
Application publication date: 20200331 |