CN110941757A - Big data based policy information query pushing system and method - Google Patents

Info

Publication number
CN110941757A
Authority
CN
China
Prior art keywords
module
policy information
big data
database
big
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911096896.7A
Other languages
Chinese (zh)
Inventor
李亚萍
侯林勇
王俊
张亮
袁率
杨坤
刘婉莹
方程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Xiaodingdang Information Technology Co Ltd
Original Assignee
Guizhou Xiaodingdang Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guizhou Xiaodingdang Information Technology Co Ltd
Priority to CN201911096896.7A
Publication of CN110941757A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9535: Search customisation based on user profiles and personalisation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/951: Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a big data-based policy information query pushing system. An acquisition module is connected to a screening module; the screening module is connected to an entry module; the entry module is connected to a database through a storage module; a calling module is arranged in the database; the database is connected to a big data platform through a networking module; and the big data platform is also connected to a display module and a pushing module. Through a timing module, an indexing module and a keyword grabbing module, the invention periodically indexes the input keywords, so that policy information is updated promptly and the policy information that is queried and pushed is the latest. The indexing results are compared for relevance, which avoids cases where individual words are relevant but the article as a whole is not. Repeated results are filtered out, which reduces the indexing time and the indexing load on the system. Pictures, videos and text in a webpage are distinguished and classified by their source code, so that the indexed policy information is obtained quickly and effectively and irrelevant content is removed.

Description

Big data based policy information query pushing system and method
Technical Field
The invention relates to the technical field of big data, and in particular to a big data-based policy information query pushing system and method.
Background
Big data refers to data sets that cannot be captured, managed and processed by conventional software tools within a reasonable time. It is a massive, fast-growing and diversified information asset that yields stronger decision-making power, insight and process optimization capability only under new processing modes.
An information policy comprises the measures and strategies a country adopts to develop information resources, develop the information industry and coordinate the use of information.
Most current big data-based policy information query systems rely on manual querying and screening, so new policy information cannot be obtained promptly after a policy is updated. An improved technology is therefore urgently needed to solve this problem in the prior art.
Disclosure of Invention
The invention aims to provide a big data-based policy information query pushing system and method that periodically index the input keywords through a timing module, an indexing module and a keyword grabbing module, so that policy information is updated promptly and the policy information that is queried and pushed is the latest, thereby solving the problems described in the background.
To achieve this purpose, the invention provides the following technical solution: a big data-based policy information query pushing system comprising an acquisition module, a screening module, an entry module, a storage module, a database, a calling module, a networking module, a big data platform, a display module and a pushing module. The acquisition module is connected to the screening module; the screening module is connected to the entry module; the entry module is connected to the database through the storage module; the calling module is arranged in the database; the database is connected to the big data platform through the networking module; and the big data platform is also connected to the display module and the pushing module.
Preferably, the acquisition module comprises an input module, an indexing module, a keyword grabbing module and a timing module. The input module is used for inputting keywords; the indexing module indexes the network using the keywords from the input module; the keyword grabbing module matches the keywords entered through the input module against the network indexing results; and the timing module sets the cycle time at which the indexing module and the keyword grabbing module are started.
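As an illustration only, the interplay of the timing, indexing and keyword grabbing modules might be sketched as follows. The patent does not specify an implementation: the function names, the `index_web` callable standing in for the indexing back end, and the substring-based matching are all assumptions made for this sketch.

```python
import time
from typing import Callable, Iterable, List


def grab_matching(keywords: Iterable[str], results: Iterable[str]) -> List[str]:
    """Keep only the indexed results that mention at least one keyword."""
    lowered = [k.lower() for k in keywords]
    return [r for r in results if any(k in r.lower() for k in lowered)]


def run_acquisition(
    keywords: List[str],
    index_web: Callable[[List[str]], List[str]],  # hypothetical indexing back end
    period_t: float,                              # cycle time T, in seconds
    cycles: int,
    sleep: Callable[[float], None] = time.sleep,  # injectable so the loop can be tested
) -> List[List[str]]:
    """Start the indexing and keyword grabbing steps once per period T."""
    batches = []
    for cycle in range(cycles):
        batches.append(grab_matching(keywords, index_web(keywords)))
        if cycle < cycles - 1:
            sleep(period_t)  # wait one period before the next cycle
    return batches
```

Passing the sleep function as a parameter is a design choice of this sketch: it lets the periodic loop be exercised without actually waiting for T seconds.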
Preferably, the screening module comprises a relevance comparison module, a repetition comparison module and a code judgment module. The relevance comparison module compares keyword relevance; the repetition comparison module compares the search results for repetition and filters out redundant repeated information; and the code judgment module judges webpage code and classifies and screens pictures, videos and text.
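The patent does not say how relevance is measured. One simple possibility, shown here purely as an assumption, is to score each result by the fraction of the query keywords it contains, so that a page matching only one of several keywords is screened out. The function names and the 0.6 threshold are hypothetical, and the tokenizer assumes single-word, whitespace-delimited keywords.

```python
import re
from typing import Iterable, List


def relevance(keywords: Iterable[str], text: str) -> float:
    """Fraction of the keywords that occur anywhere in the text (0.0 to 1.0)."""
    tokens = set(re.findall(r"\w+", text.lower()))
    wanted = {k.lower() for k in keywords}
    if not wanted:
        return 0.0
    return len(wanted & tokens) / len(wanted)


def screen_by_relevance(
    keywords: Iterable[str], documents: Iterable[str], threshold: float = 0.6
) -> List[str]:
    """Keep documents whose keyword coverage reaches the (assumed) threshold."""
    kws = list(keywords)
    return [d for d in documents if relevance(kws, d) >= threshold]
```

Under this measure, a page about a shop's refund policy matches only "policy" out of the keywords "housing", "subsidy" and "policy", scores one third, and is dropped, which is the "individual words relevant, article irrelevant" case the screening is meant to catch.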
Preferably, the entry module and the storage module are used for entering and storing the screened network search results into the database.
Preferably, the calling module is used for calling data in the database and is connected with the big data platform through the networking module.
Preferably, the display module is used for displaying policy information in the big data platform.
Preferably, the pushing module is used for pushing policy information in the big data platform.
Preferably, a method of using the system comprises the following steps:
Step one: first set the time T of the timing module, then input the relevant keywords through the input module; the indexing module, working with the keyword grabbing module, indexes the keywords on the network to obtain search results;
Step two: the search results are compared for relevance by the relevance comparison module, and the search results with high relevance are selected; the code judgment module identifies the text characters in the character code strings, the pictures in body paragraphs, the videos in body paragraphs and the other code outside body paragraphs; all text characters in the search results are compared for repetition by the repetition comparison module, and search results with a high degree of repetition are rejected;
Step three: the search results screened by the screening module are stored, through the entry module and the storage module, in the database connected to the big data platform;
Step four: when a query is made on a query device, the display module displays the relevant policy information from the big data platform on the query device, while the pushing module can push the relevant policy information in real time.
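Steps three and four can be illustrated with a minimal storage-and-query sketch. It uses Python's built-in `sqlite3` module as a stand-in for the database; the table layout, function names and `LIKE`-based lookup are assumptions for illustration, and the big data platform and pushing module are not modelled.

```python
import sqlite3
from typing import Iterable, List, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) policy table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS policy ("
        "url TEXT PRIMARY KEY, title TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return conn


def store_results(conn: sqlite3.Connection,
                  results: Iterable[Tuple[str, str, str]]) -> None:
    """Enter screened results; a newer copy of the same URL replaces the old one."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO policy (url, title, body) VALUES (?, ?, ?)",
            results,
        )


def query_policies(conn: sqlite3.Connection, keyword: str) -> List[str]:
    """Return the titles of stored policies that mention the keyword."""
    pattern = f"%{keyword}%"
    rows = conn.execute(
        "SELECT title FROM policy WHERE title LIKE ? OR body LIKE ? ORDER BY title",
        (pattern, pattern),
    )
    return [title for (title,) in rows]
```

Keying the table on the URL means that a later acquisition cycle overwrites the stored copy of a page, which is one way to keep the queried information current.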
Compared with the prior art, the invention has the beneficial effects that:
(1) The invention periodically indexes the input keywords through the timing module, the indexing module and the keyword grabbing module, so that policy information is updated promptly and the policy information that is queried and pushed is the latest.
(2) The indexing results are compared for relevance, which avoids cases where individual words are relevant but the article as a whole is not.
(3) The repetition comparison module prevents large numbers of identical indexing results, which reduces the indexing time and the indexing load on the system.
(4) The code judgment module distinguishes and classifies the pictures, videos and text in a webpage by their source code, so that the indexed policy information is obtained quickly and effectively and irrelevant content is removed.
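The repetition comparison in item (3) could, for example, be approximated with the standard-library `difflib.SequenceMatcher`. This is a sketch under assumptions: the patent names no similarity measure, and the function name and the `max_similarity` parameter are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Iterable, List


def drop_near_duplicates(texts: Iterable[str], max_similarity: float = 0.9) -> List[str]:
    """Keep a text only if it is less similar than the limit to every text kept so far."""
    kept: List[str] = []
    for text in texts:
        if all(SequenceMatcher(None, text, k).ratio() < max_similarity for k in kept):
            kept.append(text)
    return kept
```

Each new text is compared with every text already kept, so the cost grows quadratically with the number of results. A production system would more likely use hashing or shingling, but the pairwise version shows the idea with the least code.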
Drawings
FIG. 1 is a block diagram of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1, the present invention provides a technical solution: a big data-based policy information query pushing system comprising an acquisition module, a screening module, an entry module, a storage module, a database, a calling module, a networking module, a big data platform, a display module and a pushing module. The acquisition module is connected to the screening module, the screening module is connected to the entry module, and the entry module is connected to the database through the storage module; the entry module and the storage module are used for entering the screened network search results and storing them in the database. The calling module is arranged in the database; it is used for calling data in the database and is connected to the big data platform through the networking module. The database is connected to the big data platform through the networking module, and the big data platform is also connected to the display module and the pushing module. The display module is used for displaying the policy information in the big data platform, and the pushing module is used for pushing the policy information in the big data platform.
The acquisition module comprises an input module, an indexing module, a keyword grabbing module and a timing module. The input module is used for inputting keywords; the indexing module indexes the network using the keywords from the input module; the keyword grabbing module matches the keywords entered through the input module against the network indexing results; and the timing module sets the cycle time at which the indexing module and the keyword grabbing module are started. The screening module comprises a relevance comparison module, a repetition comparison module and a code judgment module. The relevance comparison module compares keyword relevance; the repetition comparison module compares the search results for repetition and filters out redundant repeated information; and the code judgment module judges webpage code and classifies and screens pictures, videos and text.
A method of using the big data-based policy information query pushing system comprises the following steps:
Step one: first set the time T of the timing module (T is the acquisition cycle period of the acquisition module), then input the relevant keywords through the input module; the indexing module, working with the keyword grabbing module, indexes the keywords on the network, and the keywords on webpages are grabbed using search engine spider technology to obtain search results;
Step two: the search results are compared for relevance by the relevance comparison module, and the search results with high relevance are selected; the code judgment module identifies the text characters in the character code strings, the pictures in body paragraphs, the videos in body paragraphs and the other code outside body paragraphs; the redundant other code is filtered out and deleted directly, this other code including advertisement slot code and any text, pictures or videos inside irrelevant code; all text characters in the search results are compared for repetition by the repetition comparison module, the degree of repetition used for this comparison being adjustable, and search results with a high degree of repetition are rejected;
Step three: the search results screened by the screening module are stored, through the entry module and the storage module, in the database connected to the big data platform;
Step four: when a query is made on a query device, the display module displays the relevant policy information from the big data platform on the query device, while the pushing module can push the relevant policy information in real time.
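The code judgment in step two, which separates text, pictures and videos and discards advertisement and other non-content code, might look roughly like the following sketch built on the standard-library `html.parser`. The class name, the choice of `script`, `style` and `aside` as the elements to discard, and the reliance on `src` attributes are all assumptions; the patent does not describe how the source code is analysed.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class CodeJudge(HTMLParser):
    """Sort page source into text, picture URLs and video URLs."""

    SKIPPED = {"script", "style", "aside"}  # assumed carriers of ads and non-content code

    def __init__(self) -> None:
        super().__init__()
        self.text: List[str] = []
        self.pictures: List[str] = []
        self.videos: List[str] = []
        self._skip_depth = 0  # greater than zero while inside a skipped element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif self._skip_depth:
            return  # ignore everything nested in skipped code
        elif tag == "img" and attrs.get("src"):
            self.pictures.append(attrs["src"])
        elif tag == "video" and attrs.get("src"):
            self.videos.append(attrs["src"])

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text.append(data.strip())


def classify_page(source: str) -> Tuple[List[str], List[str], List[str]]:
    """Return (text fragments, picture URLs, video URLs) found in the page source."""
    judge = CodeJudge()
    judge.feed(source)
    judge.close()
    return judge.text, judge.pictures, judge.videos
```

The sketch only recognises videos given as a `src` attribute on the `video` tag itself, and an unclosed skipped element would suppress everything after it, so real pages would need more defensive handling.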
The acquisition module of the invention consists of an input module, an indexing module, a keyword grabbing module and a timing module. Through the timing module, the indexing module and the keyword grabbing module, the input keywords are indexed periodically, so that policy information is updated promptly and the policy information that is queried and pushed is the latest. The screening module consists of a relevance comparison module, a repetition comparison module and a code judgment module. The indexing results are compared for relevance, which avoids cases where individual words are relevant but the article as a whole is not. The repetition comparison module prevents large numbers of identical indexing results, which reduces the indexing time and the indexing load on the system. The code judgment module distinguishes and classifies the pictures, videos and text in a webpage by their source code, so that the indexed policy information is obtained quickly and effectively and irrelevant content is removed.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (8)

1. A big data-based policy information query pushing system, characterized in that: it comprises an acquisition module, a screening module, an entry module, a storage module, a database, a calling module, a networking module, a big data platform, a display module and a pushing module; the acquisition module is connected to the screening module, the screening module is connected to the entry module, the entry module is connected to the database through the storage module, the calling module is arranged in the database, the database is connected to the big data platform through the networking module, and the big data platform is also connected to the display module and the pushing module.
2. The big data-based policy information query pushing system according to claim 1, wherein: the acquisition module comprises an input module, an indexing module, a keyword grabbing module and a timing module; the input module is used for inputting keywords, the indexing module indexes the network using the keywords from the input module, the keyword grabbing module matches the keywords entered through the input module against the network indexing results, and the timing module sets the cycle time at which the indexing module and the keyword grabbing module are started.
3. The big data-based policy information query pushing system according to claim 1, wherein: the screening module comprises a relevance comparison module, a repetition comparison module and a code judgment module; the relevance comparison module compares keyword relevance, the repetition comparison module compares the search results for repetition and filters out redundant repeated information, and the code judgment module judges webpage code and classifies and screens pictures, videos and text.
4. The big data-based policy information query pushing system according to claim 1, wherein: the entry module and the storage module are used for entering the screened network search results and storing them in the database.
5. The big data-based policy information query pushing system according to claim 1, wherein: the calling module is used for calling data in the database and is connected to the big data platform through the networking module.
6. The big data-based policy information query pushing system according to claim 1, wherein: the display module is used for displaying policy information in the big data platform.
7. The big data-based policy information query pushing system according to claim 1, wherein: the pushing module is used for pushing policy information in the big data platform.
8. A method of using the big data-based policy information query pushing system according to claim 1, wherein the method comprises the following steps:
Step one: first set the time T of the timing module, then input the relevant keywords through the input module; the indexing module, working with the keyword grabbing module, indexes the keywords on the network to obtain search results;
Step two: the search results are compared for relevance by the relevance comparison module, and the search results with high relevance are selected; the code judgment module identifies the text characters in the character code strings, the pictures in body paragraphs, the videos in body paragraphs and the other code outside body paragraphs; all text characters in the search results are compared for repetition by the repetition comparison module, and search results with a high degree of repetition are rejected;
Step three: the search results screened by the screening module are stored, through the entry module and the storage module, in the database connected to the big data platform;
Step four: when a query is made on a query device, the display module displays the relevant policy information from the big data platform on the query device, while the pushing module can push the relevant policy information in real time.
CN201911096896.7A 2019-11-11 2019-11-11 Big data based policy information query pushing system and method Pending CN110941757A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911096896.7A CN110941757A (en) 2019-11-11 2019-11-11 Big data based policy information query pushing system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911096896.7A CN110941757A (en) 2019-11-11 2019-11-11 Big data based policy information query pushing system and method

Publications (1)

Publication Number Publication Date
CN110941757A true CN110941757A (en) 2020-03-31

Family

ID=69907630

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911096896.7A Pending CN110941757A (en) 2019-11-11 2019-11-11 Big data based policy information query pushing system and method

Country Status (1)

Country Link
CN (1) CN110941757A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111652485A (en) * 2020-05-25 2020-09-11 青海绿能数据有限公司 New energy data acquisition and analysis system based on big data platform
CN112258144A (en) * 2020-09-27 2021-01-22 重庆生产力促进中心 Policy file information matching and pushing method based on automatic construction of target entity set
CN112258144B (en) * 2020-09-27 2022-04-26 重庆生产力促进中心 Policy file information matching and pushing method based on automatic construction of target entity set
CN112632387A (en) * 2020-12-30 2021-04-09 广东富状元科技有限公司 Big data-based policy information personalized customization pushing system

Similar Documents

Publication Publication Date Title
EP3819792A2 (en) Method, apparatus, device, and storage medium for intention recommendation
CN111460252B (en) Automatic search engine method and system based on network public opinion analysis
CN110941757A (en) Big data based policy information query pushing system and method
CN111967761B (en) Knowledge graph-based monitoring and early warning method and device and electronic equipment
US20190034498A1 (en) Determining a presentation format for search results based on a presentation recommendation machine learning model
CN105718587A (en) Network content resource evaluation method and evaluation system
CN112307762B (en) Search result sorting method and device, storage medium and electronic device
CN107895008B (en) Information hotspot discovery method based on big data platform
CN111049818B (en) Abnormal information discovery method based on network traffic big data
CN104866471A (en) Instance matching method based on local sensitive Hash strategy
CN101339560B (en) Method and device for searching series data, and search engine system
CN109376142A (en) Data migration method and terminal device
CN111859065A (en) Big data-based public opinion listening system
CN103077216A (en) Sub-graph matching device and sub-graph matching method
CN116032741A (en) Equipment identification method and device, electronic equipment and computer storage medium
CN108228787A (en) According to the method and apparatus of multistage classification processing information
CN112199488A (en) Incremental knowledge graph entity extraction method and system for power customer service question answering
CN115858906A (en) Enterprise searching method, device, equipment, computer storage medium and program
CN103136256A (en) Method and system for achieving information retrieval in network
CN110175197B (en) Ontology construction method and system based on semantic Internet of things
CN113673889A (en) Intelligent data asset identification method
CN113342844A (en) Industrial intelligent search system
CN104951869A (en) Workflow-based public opinion monitoring method and workflow-based public opinion monitoring device
CN110990430A (en) Large-scale data parallel processing system
CN114666145B (en) Security early warning method and system based on network acquisition

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200331