CN113254746B - Internet public opinion display system based on Raspberry Pi

Internet public opinion display system based on Raspberry Pi

Info

Publication number
CN113254746B
Authority
CN
China
Prior art keywords
public opinion
website
news
folk
display
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110567772.3A
Other languages
Chinese (zh)
Other versions
CN113254746A (en
Inventor
邓帅杰
王德志
罗琛
王德宇
王凯琳
陈超
李泽荃
李永飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North China Institute of Science and Technology
Original Assignee
North China Institute of Science and Technology
Application filed by North China Institute of Science and Technology
Priority to CN202110567772.3A
Publication of CN113254746A
Application granted
Publication of CN113254746B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9532 Query formulation
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/02 Computing arrangements based on specific mathematical models using fuzzy logic
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Game Theory and Decision Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Fuzzy Systems (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Algebra (AREA)
  • Computational Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a Raspberry Pi-based online public opinion display system. The system comprises a Raspberry Pi and a display. The Raspberry Pi is configured with a folk public opinion and news information crawling program, which uses a website crawler pool constructed by a fuzzy comprehensive evaluation method to crawl news events from news websites and folk events from folk websites, and counts the keywords of the news events and of the folk events separately. The display is communicatively connected to the Raspberry Pi: a first display area of the display scrolls the news events, and a second display area scrolls the folk events. An interactive display shows the keyword statistics of the news events and of the folk events separately; the user determines a hot keyword from these statistics and invokes the crawling program on the Raspberry Pi to crawl and display the online public opinion corresponding to that hot keyword.

Description

Internet public opinion display system based on Raspberry Pi
Technical Field
The application relates to the technical field of data display, and in particular to a Raspberry Pi-based online public opinion display system.
Background
With the rapid growth in the volume of information on the Internet, in the big-data era of information explosion, general public opinion platforms such as Baidu and Weibo report public opinion only on the basis of their own platforms. It is difficult to grasp the content of all these platforms comprehensively in order to screen for target information, and existing public opinion platforms cannot obtain information from the whole network, so a view of the overall state of online public opinion is hard to obtain. Data such as weather, oil prices, water and electricity charges, and national policy are also rarely displayed.
Accordingly, there is a need to provide an improved solution to the above-mentioned deficiencies of the prior art.
Disclosure of Invention
The invention aims to provide a Raspberry Pi-based online public opinion display system that solves or alleviates the above problems in the prior art.
In order to achieve the above object, the present application provides the following technical solutions:
The application provides a Raspberry Pi-based online public opinion display system, which comprises: a Raspberry Pi configured with a folk public opinion and news information crawling program, wherein the crawling program is configured to crawl news events from news websites and folk events from folk websites using a website crawler pool constructed by a fuzzy comprehensive evaluation method, and to count the keywords of the news events and of the folk events separately;
a display communicatively connected to the Raspberry Pi, comprising: a display whose display area is divided into a first display area and a second display area, the first display area scrolling the news events and the second display area scrolling the folk events; and an interactive display that shows the keyword statistics of the news events and of the folk events separately, wherein the user determines a hot keyword from the statistics and invokes the crawling program on the Raspberry Pi to crawl and display the online public opinion corresponding to the hot keyword, the online public opinion comprising the hot news event or hot folk event corresponding to the hot keyword and the public opinion information about that event.
Preferably, the folk public opinion and news information crawling program configured on the Raspberry Pi is further configured to crawl the discussion heat of the news events and of the folk events, obtaining a discussion heat value for each news event and each folk event; correspondingly, the first display area scrolls the discussion heat values of the news events, and the second display area scrolls the discussion heat values of the folk events.
Preferably, after the user determines the hot keyword from the statistics, the interactive display shows an interactive control panel of the folk public opinion and news information crawling program, through which the user interacts to crawl and display the online public opinion corresponding to the hot keyword.
Preferably, the Raspberry Pi-based online public opinion display system further comprises a first lamp set communicatively connected to the Raspberry Pi; when the user selects a crawling data source for the folk public opinion and news information crawling program in the interactive control panel, the first lamp set indicates, by flashing in different colors, whether the crawling program can communicate normally with that data source.
Preferably, the Raspberry Pi-based online public opinion display system further comprises a second lamp set communicatively connected to the Raspberry Pi; the second lamp set indicates, by lighting in different colors, whether the folk public opinion and news information crawling program is running normally while it crawls data.
Preferably, the Raspberry Pi-based online public opinion display system further comprises a buzzer communicatively connected to the Raspberry Pi; when the folk public opinion and news information crawling program finishes running, the buzzer sounds for 2 seconds, and when the crawling program encounters a running error, the buzzer sounds for 5 seconds.
Preferably, the Raspberry Pi-based online public opinion display system further comprises a third lamp set communicatively connected to the Raspberry Pi and connected in parallel with the buzzer; when the folk public opinion and news information crawling program finishes running, the buzzer sounds for 2 seconds and the third lamp set flashes green three times and then goes out, and when the crawling program encounters a running error, the buzzer sounds for 5 seconds and the third lamp set keeps flashing red.
Preferably, the Raspberry Pi-based online public opinion display system further comprises a fourth lamp set communicatively connected to the Raspberry Pi; when the interactive display shows the online public opinion, the fourth lamp set indicates the corresponding discussion heat value by the number of lamps lit.
Preferably, the Raspberry Pi-based online public opinion display system further comprises a voice interaction device communicatively connected to the Raspberry Pi; the voice interaction device can invoke the folk public opinion and news information crawling program through captured speech.
Preferably, the voice interaction device can also read aloud the online public opinion that the user has selected for display on the interactive display.
The beneficial effects are that:
In the technical solution provided by the embodiments of the application, the Raspberry Pi is configured with a folk public opinion and news information crawling program that has a website crawler pool constructed by a fuzzy comprehensive evaluation method. The crawler pool is used to crawl news events from news websites and folk events from folk websites, and the keywords of the news events and of the folk events are counted separately. The news events and folk events are then sent to a display communicatively connected to the Raspberry Pi and scrolled in different areas of that display, while the keyword statistics are sent to an interactive display. After the user determines a hot keyword from the statistics, the crawling program is invoked to crawl and display the online public opinion corresponding to the hot keyword, which includes the hot news event or hot folk event corresponding to the hot keyword and the public opinion comments on that event. In this way the overall state of online public opinion is obtained effectively, and network events are tracked comprehensively, accurately, and promptly.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. Wherein:
Fig. 1 is a schematic structural diagram of a Raspberry Pi-based online public opinion display system according to some embodiments of the present application;
Fig. 2 is a schematic diagram of a configuration of a Raspberry Pi provided according to some embodiments of the present application;
Fig. 3 is a schematic diagram of an interactive panel provided in accordance with some embodiments of the present application;
Fig. 4 is a graph of the change in similarity between news websites and folk websites provided in accordance with some embodiments of the present application;
Fig. 5 is a graph of the change in Internet sentiment regarding a hot event provided in accordance with some embodiments of the present application;
Fig. 6 is a graph of the change in sentiment of news websites and folk websites regarding the same event, provided in accordance with some embodiments of the present application.
Reference numerals illustrate:
101 - Raspberry Pi; 111 - data acquisition module; 121 - data analysis module; 131 - model training module; 102 - display; 112 - display; 122 - interactive display; 103 - first lamp set; 104 - second lamp set; 105 - third lamp set; 106 - fourth lamp set; 107 - buzzer; 108 - voice interaction device.
Detailed Description
The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments. Various examples are provided by way of explanation of the present application and not limitation of the present application. Indeed, it will be apparent to those skilled in the art that modifications and variations can be made in the present application without departing from the scope or spirit of the application. For example, features illustrated or described as part of one embodiment can be used on another embodiment to yield still a further embodiment. Accordingly, it is intended that the present application include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.
In the description of the present application, the terms "longitudinal," "transverse," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," and the like indicate orientations or positional relationships based on the orientation or positional relationships shown in the drawings, merely for convenience in describing the present application and do not require that the present application must be constructed and operated in a particular orientation, and thus are not to be construed as limiting the present application. The terms "coupled," "connected," and "configured" as used herein are to be interpreted broadly, and may be, for example, fixedly connected or detachably connected; can be directly connected or indirectly connected through an intermediate component; either a wired electrical connection, a radio connection or a wireless communication signal connection, the specific meaning of which terms will be understood by those of ordinary skill in the art as the case may be.
Fig. 1 is a schematic structural diagram of a Raspberry Pi-based online public opinion display system according to some embodiments of the present application. As shown in Fig. 1, the system includes a Raspberry Pi 101 and a display 102. The Raspberry Pi 101 is configured with a folk public opinion and news information crawling program, which uses a website crawler pool constructed by a fuzzy comprehensive evaluation method to crawl news events from news websites and folk events from folk websites, and counts the keywords of the news events and of the folk events separately. The display 102 is communicatively connected to the Raspberry Pi 101 and includes a display 112 and an interactive display 122. The display area of the display 112 is divided into a first display area, which scrolls the news events, and a second display area, which scrolls the folk events. The interactive display 122 shows the keyword statistics of the news events and of the folk events separately; the user determines a hot keyword from these statistics and invokes the crawling program on the Raspberry Pi 101 to crawl and display the online public opinion corresponding to the hot keyword. The online public opinion includes the hot news event or hot folk event corresponding to the hot keyword, together with the public opinion information about that event.
In the embodiment of the present application, the Raspberry Pi 101 counts the keywords of the news events and of the folk events separately using a word segmentation statistical method. If the keyword statistics of the news events shown on the interactive display 122 agree with those of the folk events, the hot keyword is determined from the statistics. Specifically, the keywords of the news events are ranked by frequency from high to low as the first keyword, the second keyword, and so on, and the keywords of the folk events are ranked in the same way. If the first keyword of the news events is the same as the first keyword of the folk events, the user takes that keyword as the hot keyword. If the two first keywords differ, the user chooses the hot keyword.
If several keywords occur with the same frequency, the ranking is ambiguous. For example, suppose that in the news events the first keyword appears 10 times and the second keyword appears 9 times, while in the folk events the first keyword and the second keyword both appear 9 times. It then cannot be determined which of the first and second keywords is the hot keyword, so a selection box pops up on the interactive display 122 and the user selects the hot keyword.
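The ranking and tie handling described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the keywords have already been extracted by word segmentation, and the function names (rank_keywords, has_unique_top, pick_hot_keyword) are hypothetical. A return value of None stands for "ask the user through the selection box".

```python
from collections import Counter


def rank_keywords(keywords):
    """Return (keyword, count) pairs sorted from most to least frequent."""
    return Counter(keywords).most_common()


def has_unique_top(ranked):
    """True if exactly one keyword holds the highest count."""
    if not ranked:
        return False
    return len(ranked) == 1 or ranked[0][1] > ranked[1][1]


def pick_hot_keyword(news_keywords, folk_keywords):
    """Return the hot keyword, or None when the user has to choose.

    Assumption: a tie for first place on either side counts as ambiguous.
    """
    news_ranked = rank_keywords(news_keywords)
    folk_ranked = rank_keywords(folk_keywords)
    if not (has_unique_top(news_ranked) and has_unique_top(folk_ranked)):
        return None  # ambiguous ranking: defer to the user
    news_top = news_ranked[0][0]
    folk_top = folk_ranked[0][0]
    # The hot keyword is fixed automatically only when both sources agree.
    return news_top if news_top == folk_top else None
```

With the counts from the example above (10 and 9 in the news events, 9 and 9 in the folk events) the function returns None, which corresponds to the selection box being shown.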
In some optional embodiments, after the user determines the hot keyword from the statistics, the interactive display 122 shows an interactive control panel of the folk public opinion and news information crawling program, through which the user interacts to crawl and display the online public opinion corresponding to the hot keyword.
Specifically, after the user determines the hot keyword, an interactive panel (as shown in Fig. 2) is displayed on the interactive display 122. The panel provides a data source selector, a keyword input box, a search page number and crawling area input box, a crawl result display and selection box, and buttons that act on the crawled data (for example, view hot articles, view article content, generate image, download historical data). By interacting with the panel, for example by selecting a data source and entering the hot keyword, the user invokes the folk public opinion and news information crawling program to crawl the corresponding online public opinion, which is listed in the crawl result display and selection box. The user can then view, display, and download the listed content with the buttons.
In a specific example, the Raspberry Pi-based online public opinion display system further includes a first lamp set 103 communicatively connected to the Raspberry Pi 101. When the user selects a crawling data source for the folk public opinion and news information crawling program in the interactive control panel, the first lamp set 103 indicates, by flashing in different colors, whether the crawling program can communicate normally with that data source. Specifically, when the user selects a data source, the corresponding crawler in the website crawler pool runs once without crawling any data and only tests communication with the website. If communication succeeds, the first lamp set 103 (for example, an LED lamp) flashes green twice and then goes out; if the website requires an account login and no corresponding account can be found in the website crawler pool, the first lamp set 103 flashes yellow twice; if the website cannot be accessed, the first lamp set 103 flashes red twice. Communication with the accessed websites is thereby monitored, which improves the operating reliability of the system.
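The mapping from the connectivity test to the lamp signal can be sketched as below. The sketch covers only the decision logic and leaves out the actual GPIO control; the names Probe, classify_probe and first_lamp_signal, and the three boolean inputs, are assumptions made for illustration.

```python
from enum import Enum


class Probe(Enum):
    """Outcome of the one-off connectivity test (hypothetical names)."""
    CONNECTED = "connected"
    NO_ACCOUNT = "no_account"      # login required, no stored account
    UNREACHABLE = "unreachable"


# (color, number of flashes) per outcome, as described in the text.
SIGNALS = {
    Probe.CONNECTED: ("green", 2),
    Probe.NO_ACCOUNT: ("yellow", 2),
    Probe.UNREACHABLE: ("red", 2),
}


def classify_probe(reachable, needs_login, has_account):
    """Classify a connectivity test result."""
    if not reachable:
        return Probe.UNREACHABLE
    if needs_login and not has_account:
        return Probe.NO_ACCOUNT
    return Probe.CONNECTED


def first_lamp_signal(reachable, needs_login=False, has_account=False):
    """Return the (color, flashes) pair the first lamp set should show."""
    return SIGNALS[classify_probe(reachable, needs_login, has_account)]
```

Keeping the classification separate from the lamp driver means the same result could also be logged or shown on the interactive panel.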
In another specific example, the Raspberry Pi-based online public opinion display system further includes a second lamp set 104 communicatively connected to the Raspberry Pi 101. The second lamp set 104 indicates, by lighting in different colors, whether the folk public opinion and news information crawling program is running normally while it crawls data. Specifically, after selecting the data source, the user enters the other conditions (such as the hot keyword and the number of search pages) on the interactive control panel and invokes the crawling program to collect data. If the crawling program runs normally, the second lamp set 104 (for example, an LED lamp) lights green; if the network connection to the data source is unstable, the second lamp set 104 lights yellow; if the crawling program crashes, the second lamp set 104 lights red. The operation of the crawling program is thereby monitored in real time, which safeguards the operating efficiency of the system and allows running errors to be found promptly.
In some optional embodiments, the Raspberry Pi-based online public opinion display system further includes a buzzer 107 communicatively connected to the Raspberry Pi 101. When the folk public opinion and news information crawling program finishes running, the buzzer 107 sounds for 2 seconds; when the crawling program encounters a running error, the buzzer 107 sounds for 5 seconds. Further, the system includes a third lamp set 105 communicatively connected to the Raspberry Pi 101 and connected in parallel with the buzzer 107. When the crawling program finishes running, the buzzer 107 sounds for 2 seconds and the third lamp set 105 flashes green three times and then goes out; when the crawling program encounters a running error, the buzzer 107 sounds for 5 seconds and the third lamp set 105 keeps flashing red.
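The completion and error alerts can be summarized as a small lookup. This is a sketch of the decision only, under the assumption that a separate hardware layer drives the buzzer and the lamps; the Alert type and completion_alert function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Alert:
    """What the buzzer and the third lamp set should do (hypothetical type)."""
    buzzer_seconds: int
    lamp_color: str
    lamp_flashes: Optional[int]  # None means keep flashing until cleared


def completion_alert(finished_ok: bool) -> Alert:
    """Choose the alert for a finished run or a running error."""
    if finished_ok:
        # Normal completion: 2 s buzz, three green flashes, then off.
        return Alert(buzzer_seconds=2, lamp_color="green", lamp_flashes=3)
    # Running error: 5 s buzz, red keeps flashing.
    return Alert(buzzer_seconds=5, lamp_color="red", lamp_flashes=None)
```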
In this embodiment of the present application, the third lamp set 105 may be an LED lamp. After the folk public opinion and news information crawling program finishes crawling, the buzzer 107 first gives a short sound alert, and the third lamp set 105 then flashes three times and goes out, indicating that the program has finished running. If the crawling program encounters an error, the buzzer 107 sounds an alarm for 5 seconds and the third lamp set 105 stays on. At this point the interactive panel is displayed on the display 112 with the crawled data acquired but not yet processed. The user makes a preliminary judgment of the data and, from the displayed data tables (the crawled data of each website is held in its own table), determines whether any wrongly crawled information is present, such as garbled or repeated entries. In this process the user screens the data by switching between the tables.
In some optional embodiments, the folk public opinion and news information crawling program configured on the Raspberry Pi 101 is further configured to crawl the discussion heat of the news events and of the folk events, obtaining a discussion heat value for each news event and each folk event. Correspondingly, the first display area scrolls the discussion heat values of the news events, and the second display area scrolls the discussion heat values of the folk events.
Specifically, when a news event is scrolled in the first display area, its discussion heat value is scrolled immediately after it; when a folk event is scrolled in the second display area, its discussion heat value is likewise scrolled immediately after it. The degree of attention that each website pays to each event on the network can thus be observed in real time.
In some optional embodiments, the Raspberry Pi-based online public opinion display system further includes a fourth lamp set 106 communicatively connected to the Raspberry Pi 101. When the interactive display 122 shows the online public opinion, the fourth lamp set 106 indicates the corresponding discussion heat value by the number of lamps lit.
In the embodiment of the application, the trend and intensity of the online public opinion are shown intuitively by the number of lamps lit in the fourth lamp set 106 (for example, an LED strip), which provides effective guidance for formulating strategy or steering public opinion.
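One way to turn a discussion heat value into a number of lit lamps is a clamped linear scale, sketched below. The source does not specify the scaling, so the linear mapping, the default of 8 LEDs, and the names leds_to_light and max_heat are all assumptions.

```python
import math


def leds_to_light(heat, max_heat, led_count=8):
    """Map a discussion heat value to a number of lit LEDs.

    Assumptions: linear scaling against a reference maximum heat, clamped
    to [0, led_count]; any non-zero heat lights at least one LED.
    """
    if max_heat <= 0 or led_count <= 0:
        return 0
    ratio = min(max(heat / max_heat, 0.0), 1.0)
    return math.ceil(ratio * led_count)
```

Rounding up keeps low but non-zero heat visible on the strip; a logarithmic scale may be a better fit if heat values span several orders of magnitude.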
In some optional embodiments, the Raspberry Pi-based online public opinion display system further includes a voice interaction device 108 communicatively connected to the Raspberry Pi 101. The voice interaction device 108 can invoke the folk public opinion and news information crawling program through captured speech. Further, the voice interaction device 108 can read aloud the online public opinion that the user has selected for display on the interactive display 122. Voice interaction is thus provided in the system, which reduces manual input and improves the operating efficiency of the system.
In the embodiment of the application, a folk public opinion and news information crawling program is configured on the Raspberry Pi 101. The program is provided with a website crawler pool constructed based on a fuzzy comprehensive evaluation method; it uses the website crawler pool to crawl news events from news websites and folk events from folk websites, and counts the keywords of the news events and of the folk events separately. The news events and folk events are then sent to the display 112, which is communicatively connected to the Raspberry Pi 101, and are scrolled in different areas of the display 112; the keyword statistics of the news events and folk events are sent to the interactive display 122. After the interaction object determines a hot keyword according to the statistics, the folk public opinion and news information crawling program is invoked to crawl and display the network public opinion corresponding to the hot keyword, where the network public opinion includes the hot news events or hot folk events corresponding to the hot keyword and the public opinion comment information corresponding to those hot events. The overall state of public opinion across the network is thereby obtained effectively, and network events are tracked comprehensively, accurately, and in a timely manner.
Fig. 3 is a schematic diagram of the configuration of the Raspberry Pi provided according to some embodiments of the present application. As shown in fig. 3, the Raspberry Pi 101 is configured with a data acquisition module 111, a data analysis module 121, and a model training module 131, which cooperate with the folk public opinion and news information crawling program. The data acquisition module 111 is configured with a website crawler pool constructed based on a fuzzy comprehensive evaluation method; the website crawler pool is configured to crawl news event data from the home page of a news website and folk event data from the home page of a folk website. The data analysis module 121 is configured to determine a hot event from the news event data and the folk event data, based on a word frequency statistical method or a preset text similarity model; to invoke the website crawler pool to crawl the folk websites according to the hot keywords of the hot event, obtaining the public opinion information of the hot event; and to obtain the emotion tendency value of the internet with respect to the hot event from the public opinion information of the hot event, based on a preset emotion analysis model.
In this embodiment, the news website and the civil website may each be a specialized website or a specialized section of a comprehensive website. For example, the news website may be Guancha (the Observer network), a government website, the website of a ministry or commission, the China policy network, etc., and the civil website may be the national water network, the power grid, a regional news network, a weather network, an earthquake network, Taobao, Zhihu, Bilibili, microblog (Weibo), etc.
In this embodiment of the present application, the website crawler pool is an aggregate of website crawlers, and there are multiple website crawlers in the website crawler pool, each website has multiple crawlers correspondingly, and each crawler has different functions, so as to crawl data of different categories (for example, military, science and technology, automobile, emotion, etc.), respectively; because the operation and maintenance of each website are changed along with time, the similarity between the website crawler pools can be calculated regularly, so that the accuracy of the similarity between the website crawler pools can be effectively ensured, and the crawled data is accurate and reliable.
Specifically, a plurality of news website crawlers and a plurality of civil website crawlers, all constructed based on the fuzzy comprehensive evaluation method, are configured in the website crawler pool. A news website crawler crawls the reading count and the click count of each news event on the home page of a news website, and then crawls the home page according to those counts to obtain the news event data. A civil website crawler crawls the reading count and the click count of each civil event on the home page of a civil website, and then crawls the home page according to those counts to obtain the civil event data.
In the embodiment of the application, a news website crawler crawls the news events on the home page of the news website in sequence according to the reading count and the click count of each news event to obtain the news event data; a civil website crawler crawls the civil events on the home page of the civil website in sequence according to the reading count and the click count of each civil event to obtain the civil event data.
In some alternative embodiments, crawling with the website crawler pool constructed based on the fuzzy comprehensive evaluation method proceeds as follows.
In the first step, based on a word vector cosine algorithm, the similarity among the constructed website crawler pools is calculated in a preset period. Specifically, the websites corresponding to the constructed website crawler pools are crawled in the preset period to obtain the website text data corresponding to each website crawler pool; the similarity among the website crawler pools is then obtained by calculating, based on the word vector cosine algorithm, the similarity among the website text data corresponding to the pools.
In the embodiment of the application, when similarity calculation is performed among a plurality of constructed website crawler pools in a preset period, website homepage data are obtained by periodically crawling website homepages of websites corresponding to the website crawler pools, and then all the obtained website homepage data are spliced to obtain corresponding website text data.
In the embodiment of the application, the similarity between the crawler pools of each website is calculated through a word vector cosine algorithm, so that text association analysis between the websites to be crawled corresponding to the crawler pools of each website is realized. Specifically, similarity calculation between the crawler pools of each website is realized through the relevance of sentence composition components in the text data of the websites to be crawled.
In the embodiment of the application, the model for calculating the similarity between crawler pools based on the word vector cosine algorithm is shown in formula (1):

$$\cos\theta = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}}\,\sqrt{\sum_{i=1}^{n} B_i^{2}}} \tag{1}$$

where $\cos\theta$ represents the similarity between website crawler pools; $A$ and $B$ represent the word vectors of the website text data of two websites to be crawled; $A_i$ represents the $i$-th component of the word vector $A$, $i$ being a positive integer; $B_i$ represents the $i$-th component of the word vector $B$; and $n$ represents the dimension of the word vectors, $n$ being a positive integer. For example, the word vector $A = (3, 5, 7, 8)$ is a 4-dimensional vector, i.e., $n = 4$, with $A_1 = 3$, $A_2 = 5$, $A_3 = 7$, $A_4 = 8$, and $1 \le i \le 4$, $i$ being a positive integer.
First, the website text data of the two websites to be crawled are segmented with jieba; next, the segmented text is vectorized with the TfidfVectorizer class in sklearn to obtain TF-IDF (term frequency-inverse document frequency) vectors; finally, the relevance of the two websites to be crawled is calculated by the cosine method using cosine_similarity in sklearn.
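The segmentation, TF-IDF, and cosine steps can be sketched with the Python standard library alone. This is a minimal illustration, not the patented implementation: whitespace splitting stands in for jieba segmentation, the smoothed IDF form is an assumption, and the site names and texts are hypothetical. In practice the helpers below would be replaced by jieba and the sklearn classes named in the text.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build TF-IDF vectors (as dicts) for a list of tokenized documents."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Assumed smoothed IDF: log((1 + N) / (1 + df)) + 1.
            term: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors

def cosine_similarity(a, b):
    """Formula (1): dot(A, B) / (|A| * |B|) for sparse dict vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical, already-segmented home page text of three websites.
site_tokens = {
    "news_a": "flood relief policy flood city".split(),
    "news_b": "flood relief city traffic".split(),
    "video_c": "movie trailer music movie".split(),
}
names = list(site_tokens)
vecs = dict(zip(names, tfidf_vectors([site_tokens[n] for n in names])))
sim_ab = cosine_similarity(vecs["news_a"], vecs["news_b"])
sim_ac = cosine_similarity(vecs["news_a"], vecs["video_c"])
```

Two sites with overlapping vocabulary score above zero, while sites that share no terms score exactly zero, which is the behavior the screening step relies on.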
In the second step, a reference website is selected according to the access request, and the websites to be crawled are screened in sequence according to the similarity between the website crawler pool corresponding to the reference website and the other website crawler pools.
In the embodiment of the application, the similarity between two website crawler pools in a preset period is taken as the similarity between the two corresponding websites in that period, and the similarities between website crawler pools are stored in a database as a similarity table of the websites to be crawled for the preset period. A reference website is selected according to key information (e.g., search keywords) in the access request of the target user; for example, if the target user searches for video, a video website such as iQiyi or Youku is selected as the reference website. The websites to be crawled are then screened according to the similarity table, which improves crawling efficiency and reduces resource consumption.
In this embodiment of the application, the websites to be crawled are sorted from high to low similarity according to the similarity table. The website crawler pool corresponding to the reference website first crawls the websites with high similarity and then those with low similarity, which effectively improves crawling efficiency and reduces resource consumption.
In an application scenario, when screening websites to be crawled, discarding the websites to be crawled in response to the similarity between the website crawler pool corresponding to the websites to be crawled and the website crawler pool corresponding to the reference websites being lower than a preset similarity threshold.
In the embodiment of the application, if the similarity between the website to be crawled and the reference website is lower than the preset similarity threshold, it is indicated that topics between the website to be crawled and the reference website are inconsistent in a preset period, the search information of the target user does not exist in the website to be crawled basically, and corresponding data cannot be acquired when the website to be crawled is crawled, so that crawling of the website to be crawled can be directly abandoned.
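The screening in this second step can be sketched as a filter followed by a sort. The similarity values, site names, and threshold below are hypothetical; the text does not state a concrete threshold.

```python
def screen_sites(similarity_to_reference, threshold):
    """Drop sites whose similarity is below the threshold; return the rest, most similar first."""
    kept = [(site, sim) for site, sim in similarity_to_reference.items() if sim >= threshold]
    return [site for site, _ in sorted(kept, key=lambda item: item[1], reverse=True)]

# Hypothetical row of the similarity table for one reference website.
similarity_table = {"site_a": 0.82, "site_b": 0.35, "site_c": 0.61, "site_d": 0.07}
crawl_order = screen_sites(similarity_table, threshold=0.30)
```

Here `site_d` is discarded because it falls below the threshold, and the remaining sites are visited from most to least similar.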
In the third step, the crawling recommendation values of the website crawler pools corresponding to the screened websites to be crawled are calculated based on a fuzzy comprehensive evaluation method, so that the website crawler pool corresponding to the reference website crawls the screened websites according to those crawling recommendation values.
In the embodiment of the application, the crawling recommendation value of the website crawler pool corresponding to a screened website to be crawled is calculated, based on the fuzzy comprehensive evaluation method, from the crawling weights of the crawling influence factors of that website. The crawling influence factors are the parameters that affect the crawling of the website to be crawled; the crawling weight characterizes how strongly a crawling influence factor affects the crawling recommendation value.
Specifically, the crawling influence factors include: website heat, historical request failure rate, user score, site anti-crawling strength, site tolerance, and site crawling risk. The website heat represents the amount of valuable information covered by the website to be crawled; the historical request failure rate represents the probability that crawling the website fails; the user score represents the user's satisfaction with the results of crawling the website; the site anti-crawling strength represents how difficult the website is to crawl; the site tolerance represents the volume of access that the website can bear; and the site crawling risk represents whether the website permits crawling.
In the embodiment of the application, when the crawling recommendation value of the screened websites to be crawled is calculated based on a fuzzy comprehensive evaluation method, the website heat, the historical request failure rate, the user score, the site crawling prevention strength, the site tolerance and the site crawling risk are respectively scored.
In the embodiment of the application, in the calculation process of the crawling recommendation value of the screened websites to be crawled, the website heat, the user score and the site tolerance are all positive crawling influence factors. The higher the website heat score is, the higher the social general attention to the website is in a preset period, the more active the information flow of the website is, and the content with research value is more, so the crawling value is more. The user score is determined according to the satisfaction degree of the crawling result of the user for performing historical crawling on the website, and the higher the satisfaction degree is, the higher the user score is, the more valuable information can be obtained when the user crawls the website. The higher the site tolerance, the better the architecture of the website, the larger the access amount that the website can bear, and the less likely other users are bothered when crawling the website.
In the calculation of the crawling recommendation value of the screened websites to be crawled, the historical request failure rate, the site anti-crawling strength, and the site crawling risk are negative crawling influence factors. A higher historical request failure rate indicates that the website has been poorly operated and maintained in the preset period, so crawling it is more likely to fail, and more failed crawls waste more resources; the score for the historical request failure rate therefore decreases as the failure rate increases. The stronger the protection of a site, the harder it is to crawl; that is, the score for site anti-crawling strength decreases as the site's protection strengthens. The higher the crawling risk of a site, the less suitable the site is for crawling and the greater the risk of crawling it.
In the embodiment of the application, the crawling weight reflects how much each crawling influence factor contributes to the crawling recommendation value when a website and its corresponding website crawler pool are evaluated. For example, when a student is evaluated for an award, two factors, "score" and "enthusiasm for extracurricular activities", are considered. If "score" is more important than "enthusiasm for extracurricular activities", the weight of "score" may be set to 0.8 and the weight of "enthusiasm for extracurricular activities" to 0.2; the student's final award score then equals 0.8 times the "score" plus 0.2 times the "enthusiasm for extracurricular activities".
In the embodiment of the application, the website heat, the historical request failure rate, the user score, the site anti-crawling strength, the site tolerance, and the site crawling risk are denoted by $u_1, u_2, u_3, u_4, u_5, u_6$ respectively, and their corresponding crawling weights by $a_1, a_2, a_3, a_4, a_5, a_6$.
The factor set of the crawling influence factors is:
$U = \{u_1, u_2, u_3, u_4, u_5, u_6\}$
where the score of website heat is $u_1 = x_1$, $x_1 \in (0, 100]$; the score of historical request failure rate is $u_2 = 100 - x_2$, $x_2 \in (0, 100]$; the score of user score is $u_3 = x_3$, $x_3 \in (0, 100]$; the score of site anti-crawling strength is $u_4 = 100 - x_4$, $x_4 \in (0, 100]$; the score of site tolerance is $u_5 = x_5$, $x_5 \in (0, 100]$; and the score of site crawling risk is $u_6 = 100 - x_6$, $x_6 \in (0, 100]$. Here $x_1$ reflects the ranking of the website's heat: the higher the ranking, the larger $x_1$. $x_2$ reflects the actual historical request failure rate: the higher the failure rate, the larger $x_2$. $x_3$ reflects the actual value of user scoring feedback: the higher the feedback, the larger $x_3$. $x_4$ reflects the site's anti-crawling strength: the stronger it is, the larger $x_4$. $x_5$ reflects the site's tolerance: the higher the tolerance, the larger $x_5$. $x_6$ reflects the site's crawling risk: the higher the risk, the larger $x_6$.
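The mapping from raw values $x$ to factor scores $u$ can be sketched as follows. The factor names are hypothetical labels for $u_1$ to $u_6$, and the raw values are invented for illustration only.

```python
# Hypothetical labels for u1..u6; positive factors keep x, negative factors use 100 - x.
POSITIVE_FACTORS = ("website_heat", "user_score", "site_tolerance")
NEGATIVE_FACTORS = ("failure_rate", "anti_crawl_strength", "crawl_risk")

def factor_scores(raw):
    """Map raw values x in (0, 100] to scores u for each crawling influence factor."""
    scores = {}
    for name, x in raw.items():
        if not 0 < x <= 100:
            raise ValueError(f"{name}: raw value {x} is outside (0, 100]")
        if name in POSITIVE_FACTORS:
            scores[name] = x
        elif name in NEGATIVE_FACTORS:
            scores[name] = 100 - x
        else:
            raise KeyError(f"unknown crawling influence factor: {name}")
    return scores

# Invented raw values for one website to be crawled.
raw_values = {
    "website_heat": 90, "failure_rate": 10, "user_score": 75,
    "anti_crawl_strength": 40, "site_tolerance": 60, "crawl_risk": 20,
}
scores = factor_scores(raw_values)
```

After the mapping, a higher score always means "more suitable for crawling", whichever factor it came from.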
Then, the weight set of the crawling weights of the crawling influence factors is determined based on the analytic hierarchy process (AHP):
$A = \{a_1, a_2, a_3, a_4, a_5, a_6\}$
The judgment matrix of the factor set $U$ is constructed as follows:
where the judgment matrix reflects the relative importance of each pair of factors in the factor set.
The weight set is:
A={0.1638,0.1464,0.3557,0.0752,0.1744,0.0845}
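How a weight set is derived from a pairwise judgment matrix can be sketched with the row geometric mean approximation commonly used with AHP. This is an assumption: the text does not say which AHP solution method was used, and the 3x3 matrix below is a hypothetical example, not the six-factor matrix of the application.

```python
import math

def ahp_weights(judgment_matrix):
    """Approximate AHP weights by the row geometric mean method."""
    n = len(judgment_matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in judgment_matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]

# Hypothetical, perfectly consistent judgment matrix: factor 1 is twice as
# important as factor 2 and four times as important as factor 3.
example_matrix = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
example_weights = ahp_weights(example_matrix)

# The weight set stated in the text; a valid weight set sums to 1.
A = [0.1638, 0.1464, 0.3557, 0.0752, 0.1744, 0.0845]
weights_sum = sum(A)
```

For a perfectly consistent matrix such as the example, the geometric mean method reproduces the underlying ratios (4 : 2 : 1 here); the stated weight set can be checked in the same way to confirm that it is normalized.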
Then, an alternative set is established:
$V = \{\text{very recommended}, \text{recommended}, \text{general}, \text{not recommended}, \text{very not recommended}\}$
And evaluating the websites to be crawled according to crawling influence factors to obtain crawling recommendation values of the websites to be crawled. Specifically, single factor evaluation is carried out on each crawling influence factor to obtain a single factor evaluation result of each crawling influence factor, and then, according to the single factor evaluation result of each crawling influence factor, a crawling recommended value of a website to be crawled is calculated based on a fuzzy comprehensive evaluation method.
Here, the site anti-crawling strength $u_4$ is taken as an example. Suppose $m$ users ($m$ is a positive integer) score the site anti-crawling strength $u_4$ of a website to be crawled, giving $m$ values of $u_4$, of which $s_1$ values fall in the interval $(80, 100]$, $s_2$ values in $(60, 80]$, $s_3$ values in $(40, 60]$, $s_4$ values in $(20, 40]$, and $s_5$ values in $(0, 20]$, where $s_1 + s_2 + s_3 + s_4 + s_5 = m$ and $s_1, s_2, s_3, s_4, s_5$ are all positive integers.
The single-factor evaluation result of the crawling influence factor of the website to be crawled, namely the site anti-crawling strength $u_4$, is:
$Y_{4q} = \{y_{41}, y_{42}, y_{43}, y_{44}, y_{45}\}$
where $y_{4q} = s_q / m$, $q = 1, 2, \dots, 5$.
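Single-factor evaluation can be sketched as a histogram over the five scoring intervals. The sketch assumes that the membership of a factor in each interval is the share of user scores falling in that interval ($s_q / m$); the ten scores are invented for illustration.

```python
import math

def single_factor_evaluation(scores):
    """Return [y1..y5]: share of scores in (80,100], (60,80], (40,60], (20,40], (0,20]."""
    counts = [0] * 5
    for s in scores:
        if not 0 < s <= 100:
            raise ValueError(f"score {s} is outside (0, 100]")
        # ceil(s / 20) is 1..5 for s in (0, 100]; index 0 is the top interval (80, 100].
        counts[5 - math.ceil(s / 20)] += 1
    m = len(scores)
    return [c / m for c in counts]

# Ten invented user scores for the site anti-crawling strength u4.
u4_scores = [85, 90, 70, 65, 50, 30, 95, 75, 10, 55]
y4 = single_factor_evaluation(u4_scores)
```

The resulting vector sums to 1 and forms one row of the single-factor evaluation matrix; interval boundaries such as 80 are assigned to the lower interval $(60, 80]$, matching the half-open intervals in the text.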
After single-factor evaluation has been performed on each of the six crawling influence factors of the website to be crawled, the single-factor evaluation matrix $Y$ of the website is obtained:
$Y = [Y_{1j}, Y_{2j}, Y_{3j}, Y_{4j}, Y_{5j}, Y_{6j}]^{T}$
where $j$ denotes the scoring membership interval of a crawling influence factor,
namely:

$$Y = \begin{bmatrix} y_{11} & y_{12} & y_{13} & y_{14} & y_{15} \\ y_{21} & y_{22} & y_{23} & y_{24} & y_{25} \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ y_{61} & y_{62} & y_{63} & y_{64} & y_{65} \end{bmatrix}$$
An intermediate variable matrix $B = A \cdot Y$ is constructed and solved based on an index model in the fuzzy comprehensive evaluation method; the index model is shown in formula (2):
where $a_w$ represents the weight of the $w$-th crawling influence factor; $y_{wj}$ represents the single-factor evaluation result of the $w$-th crawling influence factor under its weight $a_w$; and $b_j$ represents the value of the intermediate variable in the $j$-th membership interval.
The intermediate variable matrix $B$ is normalized based on a normalization model, shown in formula (3):

$$Q_j = \frac{b_j}{\sum_{k=1}^{5} b_k} \tag{3}$$

A membership degree set $Q$ corresponding to the alternative set $V$ is constructed, letting $Q = \{Q_1, Q_2, Q_3, Q_4, Q_5\}$,
where $Q_1$ corresponds to "very recommended" in the alternative set $V$; $Q_2$ corresponds to "recommended"; $Q_3$ corresponds to "general"; $Q_4$ corresponds to "not recommended"; and $Q_5$ corresponds to "very not recommended".
In the membership degree set $Q$, the membership degree $Q_1$ of the website to be crawled to the element "very recommended" in the alternative set $V$ and its membership degree $Q_2$ to the element "recommended" are added to give the crawling recommendation value $T$ of the website to be crawled:
$T = Q_1 + Q_2$
The larger the recommendation value $T$, the more strongly the website to be crawled is recommended.
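The evaluation chain from weights to recommendation value can be sketched end to end. Two points are assumptions, since formula (2) is not recoverable from the text: the composition $B = A \cdot Y$ is taken as an ordinary weighted sum, and normalization divides each $b_j$ by the sum of all $b_j$. The matrix `Y` is hypothetical; only the weight set comes from the text.

```python
# Weight set A as stated in the text.
WEIGHTS = [0.1638, 0.1464, 0.3557, 0.0752, 0.1744, 0.0845]

def crawl_recommendation(weights, Y):
    """Return (Q, T): normalized membership degrees and T = Q1 + Q2."""
    n_intervals = len(Y[0])
    # Assumed weighted-sum composition: b_j = sum over w of a_w * y_wj.
    b = [sum(a * row[j] for a, row in zip(weights, Y)) for j in range(n_intervals)]
    total = sum(b)
    q = [b_j / total for b_j in b]
    # T adds the memberships of "very recommended" and "recommended".
    return q, q[0] + q[1]

# Hypothetical single-factor evaluation matrix (6 factors x 5 intervals, rows sum to 1).
Y = [
    [0.5, 0.3, 0.1, 0.1, 0.0],
    [0.2, 0.4, 0.2, 0.1, 0.1],
    [0.3, 0.3, 0.2, 0.1, 0.1],
    [0.1, 0.2, 0.4, 0.2, 0.1],
    [0.4, 0.4, 0.1, 0.1, 0.0],
    [0.6, 0.2, 0.1, 0.1, 0.0],
]
Q, T = crawl_recommendation(WEIGHTS, Y)
```

With a different fuzzy operator (for example a min-max composition) only the line computing `b` would change; the normalization and the definition of `T` stay the same.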
In the embodiment of the application, the crawling recommendation value characterizes whether the corresponding website to be crawled is suitable for crawling: the higher the crawling recommendation value, the more suitable the website is for crawling and the better the crawled result satisfies the access request of the target user. Specifically, the crawling recommendation values of the website crawler pools corresponding to the screened websites to be crawled are calculated based on the fuzzy comprehensive evaluation method, and the website crawler pool corresponding to the reference website crawls the screened websites in sequence according to those values. This further improves the crawling efficiency of the website crawler pool, further reduces resource consumption, and makes the crawling result more accurate and more credible.
In the embodiment of the application, after the access request of the target user is acquired, when the crawlers run, the data in the website crawlers are automatically read to crawl, a plurality of crawlers crawl under the control of the queue, and the crawled data are subjected to data cleaning under the control of the queue.
In the embodiment of the application, during crawling the crawlers visit the websites to be crawled in order of recommendation value, crawling websites with higher recommendation values first, rather than in breadth-first or depth-first order. In addition, each website's own filtering can be used before the scheme of the application is applied: a preliminary search is performed through the search box of each website, and crawling is then carried out within the preliminary search results, which improves crawling accuracy and efficiency.
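Crawling by recommendation value rather than by breadth or depth amounts to draining a priority queue. A minimal sketch with the standard `heapq` module follows; the site names and recommendation values are hypothetical, and the actual fetching of pages is omitted.

```python
import heapq

def crawl_in_recommended_order(recommendations):
    """Yield (site, value) pairs, highest crawling recommendation value first."""
    # heapq is a min-heap, so values are negated to pop the largest first.
    heap = [(-value, site) for site, value in recommendations.items()]
    heapq.heapify(heap)
    while heap:
        neg_value, site = heapq.heappop(heap)
        yield site, -neg_value

# Hypothetical crawling recommendation values T for three screened websites.
recommendations = {"site_a": 0.62, "site_b": 0.81, "site_c": 0.47}
order = [site for site, _ in crawl_in_recommended_order(recommendations)]
```

A heap keeps the ordering correct even if newly screened websites are pushed while crawling is already under way.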
In some alternative embodiments, the data analysis module 121 is further configured to count high-frequency keywords in the news event data and the civil event data based on the word frequency statistics method, and determine that the news event corresponding to the same high-frequency keyword in the news event data and the civil event data is a hot event.
In the embodiment of the application, the high-frequency keywords in the news event data and in the civil event data are counted separately based on the word frequency statistics method. If the high-frequency keywords of the two are consistent, the news events and the civil events on the Internet share a common point of attention. The data analysis module 121 then calls the website crawler pool to crawl the civil websites according to the hot keywords of the hot event, obtaining the public opinion information of the hot event. This effectively improves the operating efficiency of the crawlers and the efficiency and accuracy with which public opinion information on hot events is acquired.
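The word-frequency route to a hot event can be sketched with `collections.Counter`. The events are assumed to be already segmented into keywords, the cutoff `top_n` is a hypothetical parameter, and the sample events are invented.

```python
from collections import Counter

def high_frequency_keywords(events, top_n):
    """Return the top_n most frequent keywords across a list of tokenized events."""
    counts = Counter(word for event in events for word in event)
    return {word for word, _ in counts.most_common(top_n)}

def shared_hot_keywords(news_events, folk_events, top_n=2):
    """Keywords that are high-frequency in both the news data and the folk data."""
    return high_frequency_keywords(news_events, top_n) & high_frequency_keywords(folk_events, top_n)

# Invented, already-segmented events.
news_events = [["flood", "relief", "city"], ["flood", "policy"], ["flood", "relief"]]
folk_events = [["flood", "price"], ["price", "flood"], ["school", "price"]]
hot_keywords = shared_hot_keywords(news_events, folk_events)
```

An empty intersection corresponds to the case handled next in the text, where the hot event is determined with the text similarity model instead.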
If the high-frequency keywords in the news event data and the civil event data are inconsistent, the fact that the attention points of the news event and the civil event on the Internet are different is indicated, and at the moment, the hot event needs to be determined based on a preset text similarity model. Specifically, the data analysis module 121 is further configured to calculate the similarity between each news event in the news event data and the civil event data based on a preset text similarity model, and determine that the news event with the highest similarity to the civil event data is a hot event.
In a specific example, the data analysis module 121 is further configured to calculate the similarity between each news event in the news event data and each civil event in the civil event data based on a preset text similarity model, so as to obtain the similarity between each news event in the news event data and the civil event data.
In the embodiment of the application, after the news event data of each news website and the civil event data of each civil website are obtained, the similarity between each news event in the news event data and each civil event in the civil event data is calculated one by one to obtain the similarity between each news event and all the civil events, and a similarity table of the news event data and the civil event data is constructed from these similarities. According to the similarity table, the news event with the highest similarity to the folk event data is determined; this news event is the hot event of folk discussion on the Internet.
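The similarity table and the selection of the hot event can be sketched as below. Two stand-ins are assumptions: Jaccard overlap of keyword sets replaces the preset text similarity model, and a news event's similarity to the folk event data as a whole is taken as the mean over all folk events, since the text does not specify the aggregation. The events are invented.

```python
def jaccard(a, b):
    """Jaccard overlap of two keyword sets (stand-in for the text similarity model)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def hottest_news_event(news_events, folk_events):
    """Build the pairwise similarity table and pick the news event closest to the folk data."""
    table = {
        (n, f): jaccard(n_words, f_words)
        for n, n_words in news_events.items()
        for f, f_words in folk_events.items()
    }
    # Assumed aggregation: mean similarity of a news event to all folk events.
    mean_sim = {
        n: sum(table[(n, f)] for f in folk_events) / len(folk_events)
        for n in news_events
    }
    return max(mean_sim, key=mean_sim.get), table

# Invented keyword sets for two news events and two folk events.
news_events = {"n1": {"flood", "relief", "city"}, "n2": {"stock", "market"}}
folk_events = {"f1": {"flood", "city", "rescue"}, "f2": {"flood", "relief"}}
hot_event, similarity_table = hottest_news_event(news_events, folk_events)
```

The table keeps every pairwise value, so the same data can feed the similarity change chart drawn over time.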
In the embodiment of the present application, the Raspberry Pi 101 can also display the emotion tendency value on the display 102 (the display 112 or the interactive display 122) as a chart or in another form, forming an emotion change graph of the Internet with respect to a hot event. In addition, from the similarity between each news event in the news event data and the civil event data, a similarity change chart of the news websites and the civil websites (shown in fig. 4) is drawn and displayed on the display 102 (the display 112 or the interactive display 122) to determine whether public opinion on the Internet is concentrated. The display 102 thus shows the change in similarity between news events and folk events visually, making it clearer and more accurate to judge whether folk discussion on the Internet is concentrated. As can be seen from fig. 4, the similarity between news websites and civil websites varies over time.
In some alternative embodiments, the data analysis module 121 is further configured to obtain, based on a preset emotion analysis model, emotion intensities of positive emotion and negative emotion included in public opinion information about a hotspot event of the internet according to public opinion information of the hotspot event, and obtain an emotion tendency value. Therefore, the method can more intuitively and accurately see the emotion change of the public on the internet, the tendency, trend and strength of civil discussion of the hot spot event, and is helpful for judging the public opinion trend of the hot spot on the internet in advance, and effectively guiding policy formulation or public opinion guidance.
In some alternative embodiments, the data analysis module 121 is further configured to obtain, based on a preset text classification model, the event type ratio of the news event data and the civil event data, where the event type ratio characterizes the proportion of each type of public opinion event on the news websites and the civil websites. Correspondingly, the Raspberry Pi 101 draws an event type diagram of the news websites and the civil websites according to the event type ratio and sends it to the display 102 for display. By classifying and comparing the news event data and the civil event data separately, it can be seen whether the points of attention of the news websites and the civil websites are consistent, which improves understanding of the focus of folk discussion on the Internet and helps guide policy making or public opinion guidance.
In this embodiment of the present application, a model training module 131 is further provided in the Raspberry Pi 101. The model training module 131 is configured to construct, based on a deep learning method, the emotion analysis model from pre-obtained public opinion information sample data and the text classification model from pre-obtained text classification sample data.
The public opinion information sample data may be public opinion evaluation data from folk websites, such as microblog comment data, divided into two categories, negative evaluation and positive evaluation, according to the evaluation content. This helps train the emotion analysis model to predict positive and negative emotion accurately and improves the predicted probabilities of positive and negative emotion, that is, the prediction accuracy of emotion intensity.
Specifically, the model training module 131 is further configured to, based on a deep learning method, segment the public opinion information sample data with the jieba library and vectorize the segmented sample data with the TF-IDF method, so as to construct the emotion analysis model. After the sample data are segmented with the jieba library, stop words are removed according to a stop-word list to improve the accuracy and training efficiency of the emotion analysis model; the segmented sample data are then converted to text vectors with TF-IDF, and a naive Bayes classifier with a specified prior distribution in sklearn is called to train the model on the vectorized text.
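The training flow (segment, remove stop words, vectorize, fit naive Bayes) can be sketched with the standard library. This is a stand-in under stated assumptions, not the described pipeline itself: whitespace splitting replaces jieba, raw term counts replace TF-IDF weights, the classifier is a multinomial naive Bayes with Laplace smoothing (the text does not name the variant), and the stop-word list and samples are invented. With the libraries named in the text, `TfidfVectorizer` and a naive Bayes estimator from sklearn would take the place of the two helper functions.

```python
import math
from collections import Counter, defaultdict

STOP_WORDS = {"the", "is", "a", "and"}  # hypothetical stop-word list

def tokenize(text):
    """Stand-in for jieba segmentation followed by stop-word removal."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]

def train_naive_bayes(samples):
    """Count labels and per-label words from (text, label) samples."""
    label_counts, word_counts, vocab = Counter(), defaultdict(Counter), set()
    for text, label in samples:
        label_counts[label] += 1
        for w in tokenize(text):
            word_counts[label][w] += 1
            vocab.add(w)
    return label_counts, word_counts, vocab

def predict(model, text):
    """Return the label with the highest log posterior (Laplace-smoothed)."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_logp = None, -math.inf
    for label, n in label_counts.items():
        logp = math.log(n / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in tokenize(text):
            if w in vocab:  # words never seen in training are ignored
                logp += math.log((word_counts[label][w] + 1) / denom)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label

# Invented labeled comments standing in for public opinion sample data.
samples = [
    ("the policy is great and helpful", "positive"),
    ("great news helpful response", "positive"),
    ("terrible response and slow", "negative"),
    ("slow and terrible service", "negative"),
]
model = train_naive_bayes(samples)
```

Calling `predict(model, comment)` on a new comment returns the more probable of the two labels; the class posteriors behind that decision are what the text refers to as the predicted probabilities of positive and negative emotion.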
In the embodiment of the application, the text classification sample data may be taken from open-source natural language processing datasets on the Internet that cover different types of text content, which speeds up training of the text classification model and improves its prediction accuracy. The training process of the text classification model is similar to that of the emotion analysis model and is not described in detail here.
In the embodiment of the application, the data acquisition module 111 acquires news event data and folk event data through the website crawler pool constructed based on the fuzzy comprehensive evaluation method, so that the acquired data are screened and the efficiency and accuracy of data crawling are effectively improved. The data analysis module 121 determines hot events based on a word frequency statistical method or a text similarity model, and calls the website crawler pool to crawl the Internet for public opinion information about the hot events, so that a government, an enterprise or an individual can obtain comprehensive public opinion comments about the hot events from the Internet, which provides guidance for improving policy promotion, enterprise image or personal life. The display 102 presents the emotion change graph of a hot event (shown in Figs. 5 and 6), so that the trend of public opinion on the hot event on the Internet can be judged in advance, which effectively guides policy formulation or public opinion guidance. As can be seen from Figs. 5 and 6, the emotional tendency value of the Internet toward the hot event changes over time as a time series.
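As a non-limiting illustration of the word frequency statistical method mentioned above, hot keywords could be determined by counting how many crawled events each keyword appears in. The function name, the input layout (one keyword list per event) and the sample keywords are assumptions made for this sketch only.

```python
from collections import Counter


def hot_keywords(event_keywords, top_n=3):
    """Return the top_n keywords ranked by the number of events containing them."""
    counts = Counter()
    for keywords in event_keywords:
        # Count each keyword at most once per event.
        counts.update(set(keywords))
    return counts.most_common(top_n)


# Hypothetical keyword lists extracted from three crawled events.
events = [
    ["flood", "rescue"],
    ["flood", "subsidy"],
    ["flood", "rescue", "traffic"],
]
top_two = hot_keywords(events, top_n=2)
```

Keywords that rank highest in such a count would be candidates for the hot events on which public opinion information is then crawled.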
In the embodiment of the application, the emotion analysis model can also be used to monitor news data. When regional news appears (that is, the title or content contains the name of a city, such as Beijing or Wuhan), comment data on the news from each region are first obtained through the folk website crawlers, and emotion values are then predicted using the emotion analysis model corresponding to each region, so that the emotion value of the regional news in each region can be judged. One application is perceiving the emotional response to a released policy, for example how each region reacts to Beijing's college entrance examination scores, or how each region reacts to Shanghai inviting new-energy talent to settle there. In addition, the data of all regions can be merged into one large data set, the region labels removed, and the model retrained to obtain a general emotion analysis model. Similarly, a general text classification model and a general similarity model can be obtained.
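As a non-limiting illustration, the two steps described above, namely recognizing regional news by the city names in its title or content and merging the per-region data into one data set without region labels, can be sketched as follows. The list of region names, the data layout and the function names are assumptions made for this sketch.

```python
REGION_NAMES = ["Beijing", "Wuhan", "Shanghai"]  # hypothetical list of city names


def detect_regions(title, content=""):
    """Return the regions named in the title or content; empty if not regional news."""
    text = f"{title} {content}"
    return [region for region in REGION_NAMES if region in text]


def merge_regional_datasets(datasets):
    """Merge {region: [(comment, sentiment), ...]} into one list, dropping the region label."""
    merged = []
    for _region, region_samples in datasets.items():
        merged.extend(region_samples)
    return merged


# Hypothetical per-region training data.
regional_data = {
    "Beijing": [("a", "positive")],
    "Wuhan": [("b", "negative"), ("c", "positive")],
}
general_training_set = merge_regional_datasets(regional_data)
```

The merged list would then be passed to the same training flow used for the regional models in order to obtain the general emotion analysis model.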
In the embodiment of the application, the crawler obtains a geographic tag by detecting whether the crawled content carries a positioning tag (that is, whether the user has shared his or her location), so as to obtain regional data (news events, folk events, public opinion information and the like of each region). In addition, the parameters of the geographic crawlers (the crawlers for each region) may be adjusted: only data with positioning may be selected, or, when no positioning data is available, the geographic information filled in by the user may be used as the geographic value, which greatly increases the amount of crawled data at the cost of reduced data accuracy.
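As a non-limiting illustration, the adjustable geographic selection described above can be sketched as a single function with one parameter that switches the fallback on or off. The field names `location_tag` and `profile_location` and the parameter name are hypothetical.

```python
def resolve_region(post, allow_profile_fallback=False):
    """Return (region, source) for a crawled post, or (None, None) if the post is skipped."""
    location = post.get("location_tag")
    if location:
        # The user shared a position with the post: most accurate source.
        return location, "positioning"
    if allow_profile_fallback and post.get("profile_location"):
        # Less accurate: the region the user filled in, used only when allowed.
        return post["profile_location"], "profile"
    return None, None


# Hypothetical crawled posts.
tagged_post = {"location_tag": "Wuhan", "profile_location": "Beijing"}
untagged_post = {"profile_location": "Beijing"}
```

With the fallback disabled, `untagged_post` is discarded; with it enabled, the post is kept and attributed to the user-filled region, which trades accuracy for volume as described above.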
In the embodiment of the application, after the data is crawled, data cleaning needs to be performed on the crawled data to delete duplicate information and correct existing errors, so that the data remains consistent, which facilitates subsequent analysis and improves the efficiency of data processing.
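As a non-limiting illustration, a minimal cleaning pass over crawled records might normalize whitespace, drop empty records and delete duplicates. The record layout (a dictionary with a `text` field) is an assumption, and whitespace normalization stands in here for whatever error correction an actual implementation performs.

```python
import re


def clean_records(records):
    """Normalize text, drop empty records, and remove case-insensitive duplicates."""
    seen = set()
    cleaned = []
    for record in records:
        # Collapse runs of whitespace and trim both ends.
        text = re.sub(r"\s+", " ", record.get("text") or "").strip()
        if not text:
            continue
        key = text.lower()
        if key in seen:
            continue  # duplicate of a record already kept
        seen.add(key)
        cleaned.append({**record, "text": text})
    return cleaned


# Hypothetical crawled records.
raw = [
    {"text": "  Great   policy "},
    {"text": "great policy"},
    {"text": ""},
    {"text": "Bad idea"},
]
cleaned = clean_records(raw)
```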
The foregoing describes only preferred embodiments of the present application and is not intended to limit it; those skilled in the art may make various modifications and variations to the present application. Any modification, equivalent replacement or improvement made within the spirit and principles of the present application shall fall within its scope of protection.

Claims (10)

1. An Internet public opinion display system based on Raspberry Pi, characterized by comprising:
a Raspberry Pi on which a folk public opinion and news information crawling program is configured, the folk public opinion and news information crawling program being configured to crawl news events of news websites and folk events of folk websites based on a website crawler pool constructed by a fuzzy comprehensive evaluation method, and to count the keywords of the news events and of the folk events respectively; wherein the website crawler pool is a collection of website crawlers, a plurality of website crawlers are arranged in the website crawler pool, each website is provided with a plurality of corresponding crawlers, and the crawlers have different functions so as to crawl different types of data respectively; during crawling by the website crawler pool, similarity is calculated among the plurality of constructed website crawler pools within a preset period based on a word vector cosine algorithm, a reference website is selected according to an access request, and websites to be crawled are screened in turn based on the similarity between the website crawler pool corresponding to the reference website and the other website crawler pools; and a crawling recommendation value of the website crawler pool corresponding to each screened website to be crawled is calculated based on the fuzzy comprehensive evaluation method, so that the website crawler pool corresponding to the reference website crawls the screened website to be crawled according to that crawling recommendation value;
a display communicatively connected to the Raspberry Pi, comprising a presentation display and an interactive display, wherein
the display area of the presentation display is divided into a first display area and a second display area, the first display area being capable of scrolling display of the news events, and the second display area being capable of scrolling display of the folk events; and
the interactive display is capable of displaying statistical data of the keywords of the news events and of the keywords of the folk events respectively; an interactive object determines hot keywords according to the statistical data and invokes the folk public opinion and news information crawling program on the Raspberry Pi to crawl and display the Internet public opinion corresponding to the hot keywords, wherein the Internet public opinion comprises the hot news events or hot folk events corresponding to the hot keywords and the public opinion information of the hot news events or hot folk events.
2. The Raspberry Pi-based Internet public opinion display system of claim 1, characterized in that the folk public opinion and news information crawling program configured on the Raspberry Pi is further configured to crawl the discussion heat of the news events and of the folk events respectively, and to acquire the discussion heat values corresponding to the news events and the folk events;
correspondingly,
the first display area is capable of scrolling display of the discussion heat values of the news events, and the second display area is capable of scrolling display of the discussion heat values of the folk events.
3. The Raspberry Pi-based Internet public opinion display system of claim 1, wherein, after the interactive object determines the hot keywords according to the statistical data, the interactive display displays an interactive control panel of the folk public opinion and news information crawling program and interacts with the interactive object to crawl and display the Internet public opinion corresponding to the hot keywords.
4. The Raspberry Pi-based Internet public opinion display system of claim 3, further comprising: a first lamp group communicatively connected to the Raspberry Pi;
correspondingly,
when the interactive object interactively selects, in the interactive control panel, a crawling data source for the folk public opinion and news information crawling program, the first lamp group indicates by different flashing colors whether the folk public opinion and news information crawling program and the crawling data source are communicating normally.
5. The Raspberry Pi-based Internet public opinion display system of claim 1, further comprising: a second lamp group communicatively connected to the Raspberry Pi, wherein the second lamp group indicates by different flashing colors whether the folk public opinion and news information crawling program is running normally while crawling data.
6. The Raspberry Pi-based Internet public opinion display system of claim 1, further comprising: a buzzer communicatively connected to the Raspberry Pi;
correspondingly,
in response to the folk public opinion and news information crawling program running, the buzzer sounds for 2 seconds; and in response to a running error of the folk public opinion and news information crawling program, the buzzer sounds for 5 seconds.
7. The Raspberry Pi-based Internet public opinion display system of claim 6, further comprising: a third lamp group communicatively connected to the Raspberry Pi and connected in parallel with the buzzer;
correspondingly,
in response to the folk public opinion and news information crawling program running, after the buzzer sounds for 2 seconds, the third lamp group flashes green three times and then goes out; and
in response to a running error of the folk public opinion and news information crawling program, the buzzer sounds for 5 seconds and the third lamp group stays lit in red.
8. The Raspberry Pi-based Internet public opinion display system of claim 1, further comprising: a fourth lamp group communicatively connected to the Raspberry Pi, wherein
in response to the interactive display displaying the Internet public opinion, the fourth lamp group indicates the discussion heat value corresponding to the Internet public opinion by different numbers of flashes.
9. The Raspberry Pi-based Internet public opinion display system of any one of claims 1 to 8, further comprising: a voice interaction device communicatively connected to the Raspberry Pi, the voice interaction device being capable of invoking the folk public opinion and news information crawling program through voice collection.
10. The Raspberry Pi-based Internet public opinion display system of claim 9, wherein the voice interaction device is further capable of broadcasting the Internet public opinion that is determined by the interactive object and displayed on the interactive display.
CN202110567772.3A 2021-05-24 2021-05-24 Internet public opinion display system based on raspberry group Active CN113254746B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110567772.3A CN113254746B (en) 2021-05-24 2021-05-24 Internet public opinion display system based on raspberry group

Publications (2)

Publication Number Publication Date
CN113254746A CN113254746A (en) 2021-08-13
CN113254746B true CN113254746B (en) 2023-07-18

Family

ID=77184070

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110567772.3A Active CN113254746B (en) 2021-05-24 2021-05-24 Internet public opinion display system based on raspberry group

Country Status (1)

Country Link
CN (1) CN113254746B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108052586A (en) * 2017-12-11 2018-05-18 上海壹账通金融科技有限公司 The analysis of public opinion method, system, computer equipment and storage medium
CN108959383A (en) * 2018-05-31 2018-12-07 平安科技(深圳)有限公司 Analysis method, device and the computer readable storage medium of network public-opinion

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6931397B1 (en) * 2000-02-11 2005-08-16 International Business Machines Corporation System and method for automatic generation of dynamic search abstracts contain metadata by crawler
CN101408883B (en) * 2008-11-24 2010-09-01 电子科技大学 Method for collecting network public feelings viewpoint
US20110270678A1 (en) * 2010-05-03 2011-11-03 Drummond Mark E System and method for using real-time keywords for targeting advertising in web search and social media
WO2019000304A1 (en) * 2017-06-29 2019-01-03 麦格创科技(深圳)有限公司 Public opinion monitoring method and system
US10747833B2 (en) * 2017-10-30 2020-08-18 Nio Usa, Inc. Personalized news recommendation engine
CN110609950B (en) * 2019-08-02 2022-09-16 济南大学 Public opinion system search word recommendation method and system

Also Published As

Publication number Publication date
CN113254746A (en) 2021-08-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant