WO2022052546A1 - Public opinion data processing system and method, computer storage medium, and electronic device - Google Patents

Public opinion data processing system and method, computer storage medium, and electronic device

Info

Publication number
WO2022052546A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
public opinion
opinion data
network
sensitivity level
Prior art date
Application number
PCT/CN2021/100424
Other languages
English (en)
French (fr)
Inventor
陈予郎
Original Assignee
长鑫存储技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 长鑫存储技术有限公司
Priority to US17/455,738 (US20220084051A1)
Publication of WO2022052546A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9532 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Definitions

  • the present disclosure relates to the technical field of big data, and in particular, to a public opinion data processing system, a public opinion data processing method, a computer storage medium, and an electronic device.
  • the purpose of the present disclosure is to provide a public opinion data processing system, a public opinion data processing method, a computer storage medium and an electronic device that overcome, at least to a certain extent, two defects of the related art: relatively weak concurrent processing capability, and the inability to display data of different sensitivity levels to users with different account levels.
  • a public opinion data processing system, including: a network data integration platform, a big data cluster, a business data integration platform, a database server and a data display platform; wherein:
  • the network data integration platform is used to audit and analyze the collected network public opinion data to obtain the sensitivity level of the network public opinion data, and to send the network public opinion data and its sensitivity level to the big data cluster;
  • the big data cluster is used to filter invalid data out of the network public opinion data and to send the filtered network public opinion data to the business data integration platform;
  • the business data integration platform is used to screen enterprise public opinion data from the filtered network public opinion data, and to store the enterprise public opinion data, together with the acquired association between the user's account level and the sensitivity level of the enterprise public opinion data, in the database server;
  • the data display platform is used to request enterprise public opinion data of the target sensitivity level from the database server according to the account level of the authenticated user, and to display the enterprise public opinion data of the target sensitivity level to the authenticated user.
  • a method for processing public opinion data, comprising: auditing and analyzing collected network public opinion data through a network data integration platform to obtain a sensitivity level of the network public opinion data, and sending the network public opinion data and its sensitivity level to a big data cluster; filtering invalid data out of the network public opinion data through the big data cluster, and sending the filtered network public opinion data to a business data integration platform; screening enterprise public opinion data from the filtered network public opinion data through the business data integration platform, and storing the enterprise public opinion data, together with the acquired association between the user's account level and the sensitivity level of the enterprise public opinion data, in a database server; and requesting, through a data display platform, enterprise public opinion data of the target sensitivity level from the database server according to the account level of the authenticated user, and displaying the enterprise public opinion data of the target sensitivity level to the authenticated user.
  • a computer storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the public opinion data processing method described in the second aspect.
  • an electronic device, comprising: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the public opinion data processing method described in the second aspect above by executing the executable instructions.
  • the public opinion data processing system, public opinion data processing method, computer storage medium and electronic device in the exemplary embodiments of the present disclosure at least have the following advantages and positive effects:
  • the network data integration platform audits and analyzes the collected network public opinion data to obtain the sensitivity level of the network public opinion data, and sends the network public opinion data and its sensitivity level to the big data cluster, so that the big data cluster can filter out the invalid data and send the filtered data to the business data integration platform. This not only solves the problem of low efficiency caused by manual review of data and improves auditing efficiency, but also reduces the amount of data to be processed, avoids the influence of invalid data on subsequent data processing, and improves subsequent data processing efficiency.
  • the business data integration platform screens the enterprise public opinion data from the filtered network public opinion data, and stores the enterprise public opinion data, together with the acquired association between the user's account level and the sensitivity level of the enterprise public opinion data, in the database server. This can solve the technical problem of weak concurrency capability caused by deploying only one distributed big data cluster in the related art, and improve the concurrent processing capability of the system.
  • the data display platform requests enterprise public opinion data of the target sensitivity level from the database server according to the account level of the authenticated user, and displays the enterprise public opinion data of the target sensitivity level to the authenticated user. In this way, enterprise public opinion data of different sensitivity levels is displayed according to the account level, providing an intelligent information security management and control mechanism that brings the displayed data closer to the user's needs and improves the user's information acquisition efficiency.
  • FIG. 1 shows a schematic structural diagram of a public opinion data processing system in an exemplary embodiment of the present disclosure
  • FIG. 2 shows a schematic diagram of a sub-flow of a public opinion data processing system in an exemplary embodiment of the present disclosure
  • FIG. 3 shows a schematic diagram of a sub-flow of a public opinion data processing system in an exemplary embodiment of the present disclosure
  • FIG. 4 shows a schematic diagram of a sub-flow of a public opinion data processing system in an exemplary embodiment of the present disclosure
  • FIG. 5 shows a schematic diagram of a sub-flow of a public opinion data processing system in an exemplary embodiment of the present disclosure
  • FIG. 6 shows an overall interaction flow chart of the public opinion data processing system in an exemplary embodiment of the present disclosure
  • FIG. 7 shows a schematic flowchart of a method for processing public opinion data in an exemplary embodiment of the present disclosure
  • FIG. 8 shows a schematic structural diagram of a computer storage medium in an exemplary embodiment of the present disclosure
  • FIG. 9 shows a schematic structural diagram of an electronic device in an exemplary embodiment of the present disclosure.
  • Example embodiments will now be described more fully with reference to the accompanying drawings.
  • Example embodiments can be embodied in various forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
  • the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
  • numerous specific details are provided in order to give a thorough understanding of the embodiments of the present disclosure.
  • those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced without one or more of the specific details, or other methods, components, devices, steps, etc. may be employed.
  • well-known solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
  • a public opinion data processing system is first provided, which overcomes, at least to a certain extent, the defects of the related art that the concurrency capability is weak and that data of different sensitivity levels cannot be displayed to users of different account levels.
  • FIG. 1 shows a schematic structural diagram of a public opinion data processing system in an exemplary embodiment of the present disclosure
  • the executing entity of the public opinion data processing system may be a server that processes public opinion data.
  • a public opinion data processing system 100 may include a network data integration platform 101, a big data cluster 102, a business data integration platform 103, a database server 104 and a data display platform 105, wherein:
  • the network data integration platform is used to audit and analyze the collected network public opinion data to obtain the sensitivity level of the network public opinion data, and to send the network public opinion data and its sensitivity level to the big data cluster;
  • the big data cluster is used to filter invalid data out of the network public opinion data, and to send the filtered network public opinion data to the business data integration platform;
  • the business data integration platform is used to screen the enterprise public opinion data from the filtered network public opinion data, and to store the enterprise public opinion data, together with the acquired association between the user's account level and the sensitivity level of the enterprise public opinion data, in the database server;
  • the data display platform is used to request enterprise public opinion data of the target sensitivity level from the database server according to the account level of the authenticated user, and to display the enterprise public opinion data of the target sensitivity level to the authenticated user.
  • the network data integration platform 101 is used for auditing and analyzing the collected network public opinion data to obtain the sensitivity level of the network public opinion data, and sending the network public opinion data and its sensitivity level to the big data cluster.
  • FIG. 2 is a schematic diagram of a sub-flow of the public opinion data processing system in an exemplary embodiment of the present disclosure, specifically showing the sub-flow by which the network data integration platform collects network public opinion data, including steps S201-S202. The specific implementation is explained below with reference to FIG. 2.
  • In step S201, the data in a preset data repository is collected and/or a web crawler is called to periodically collect the data in a target website.
  • a data repository can be preset. The data in the data repository can be public opinion data collected by public opinion specialists, public opinion data provided by relevant data providers, and so on, and the network data integration platform can then collect the data in the repository. Since the data in the data repository generally has a fixed data format that rarely changes and comes from a stable data source, a large amount of development and operation and maintenance cost can be avoided.
  • a web crawler can also be called to periodically collect data from a target website. The target website can be a website platform such as a domestic or foreign search engine (Baidu, Sogou, 360), Weibo, a WeChat official account, a forum, Tieba, or a blog. It can be set according to the actual situation, and all such settings fall within the protection scope of this disclosure.
  • a timer can be set (for example, with a duration of 10 minutes) so that the data on the target website is collected every 10 minutes and the collected data is converted into a unified format. As a result, massive amounts of data can be collected with no procurement cost, thereby reducing data collection costs.
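  • The timed collection described above can be sketched roughly as follows. This is only an illustrative sketch, not the disclosed implementation: `collect_periodically`, the `fetch` callback (standing in for the web crawler) and the `rounds` parameter are hypothetical names introduced here, and the sleep function is injectable so the loop can be exercised without waiting.

```python
import time
from typing import Callable, Dict, List


def collect_periodically(
    fetch: Callable[[], List[Dict[str, str]]],
    interval_seconds: float = 600.0,  # 10 minutes, matching the example in the text
    rounds: int = 3,                  # hypothetical bound so the sketch terminates
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict[str, str]]:
    """Call `fetch` once per interval and accumulate every record it returns."""
    collected: List[Dict[str, str]] = []
    for i in range(rounds):
        collected.extend(fetch())
        # Wait between rounds, but not after the final one.
        if i < rounds - 1:
            sleep(interval_seconds)
    return collected
```

  • In a real deployment the loop would typically be replaced by a scheduler, and `fetch` would wrap whatever crawler the platform actually uses.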
  • In step S202, the collected data is deduplicated and normalized to obtain network public opinion data.
  • the collected data can be deduplicated to remove redundant data, which prevents a large amount of redundant data from entering the subsequent processing flow, keeps the data concise, saves processor threads, and ensures data processing efficiency.
  • the collected network public opinion data may be massive and may include public opinion data of multiple industries, for example the computer industry, insurance industry, automobile industry, food safety industry, electronic circuit industry, etc.
  • the data can be normalized, for example, into JSON format (JavaScript Object Notation), which is a lightweight data exchange format.
  • JSON-formatted data not only improves readability but also reduces complexity and facilitates data exchange and data processing.
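  • As a rough illustration of step S202, the sketch below normalizes records into a common shape, removes duplicates by content hash, and serializes the result as JSON. The field names (`source`, `title`, `content`) and the choice to normalize before deduplicating (so that whitespace variants collapse into one record) are assumptions made for this example; the disclosure does not specify them.

```python
import hashlib
import json
from typing import Dict, Iterable, List


def normalize(record: Dict[str, str], source: str) -> Dict[str, str]:
    """Map a raw record onto a unified set of fields (hypothetical schema)."""
    return {
        "source": source,
        "title": record.get("title", "").strip(),
        # Collapse runs of whitespace so trivially different copies compare equal.
        "content": " ".join(record.get("content", "").split()),
    }


def deduplicate(records: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first record for each distinct (title, content) pair."""
    seen = set()
    unique: List[Dict[str, str]] = []
    for record in records:
        key = hashlib.sha256(
            (record["title"] + "\n" + record["content"]).encode("utf-8")
        ).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def to_json_lines(records: Iterable[Dict[str, str]]) -> List[str]:
    """Serialize each record as one JSON document."""
    return [json.dumps(r, ensure_ascii=False, sort_keys=True) for r in records]
```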
  • the network data integration platform can also call the relevant automatic audit module to audit and analyze the data, so as to obtain the sensitivity level of the network public opinion data.
  • This process can involve key technologies such as pattern matching algorithms, text semantic analysis, hot topic discovery, and identification of harmful image content, in order to strengthen the security management of data and further ensure the legality, health and security of data.
  • FIG. 3 is a schematic diagram of a sub-flow of the public opinion data processing system in an exemplary embodiment of the present disclosure, specifically showing the sub-flow by which the network data integration platform audits and analyzes the collected network public opinion data to obtain the sensitivity level of the network public opinion data, including steps S301-S304. The specific implementation is explained below with reference to FIG. 3.
  • In step S301, a semantic recognition algorithm is invoked to perform semantic recognition on the network public opinion data, and a semantic recognition result is obtained.
  • the network data integration platform may invoke a semantic recognition algorithm to perform semantic recognition on network public opinion data to obtain a semantic recognition result.
  • a deep convolutional neural network can also be trained to obtain a semantic recognition model, and the network public opinion data is input into the semantic recognition model to obtain its semantic recognition result.
  • In step S302, the target keyword in the network public opinion data is determined according to the pre-stored sensitive keywords.
  • big data analysts may also pre-store some sensitive keywords, for example keywords relating to violence, damage to corporate image, or negative semantics. The network public opinion data can then be compared and matched with these sensitive keywords to determine the target keyword.
  • In step S303, the semantic recognition result and the target keyword are aggregated and integrated to obtain sensitive data contained in the network public opinion data.
  • the semantic recognition result and the target keyword can be aggregated and integrated to obtain the sensitive data contained in the network public opinion data, and the sensitive data can be marked.
  • In step S304, the sensitivity level of the network public opinion data is determined according to the number of sensitive data items.
  • for example, the sensitivity level of network public opinion data A can be determined as level I (high level);
  • the sensitivity level of network public opinion data B can be determined as level II (medium level);
  • and the sensitivity level of network public opinion data C can be determined as level III (low level).
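  • Steps S301-S304 might be sketched as below. Everything specific here is an assumption for illustration: the keyword list, the two count thresholds (the text does not state the counts that separate levels I, II and III), and the `semantic_recognizer` callback, which stands in for whatever semantic recognition algorithm or trained model is actually used.

```python
from typing import Callable, Iterable, List, Set

# Hypothetical pre-stored sensitive keywords (S302).
SENSITIVE_KEYWORDS: List[str] = ["violence", "scandal", "fraud"]


def match_keywords(text: str, keywords: Iterable[str]) -> Set[str]:
    """S302: return the pre-stored keywords that occur in the text."""
    lowered = text.lower()
    return {kw for kw in keywords if kw.lower() in lowered}


def sensitivity_level(
    text: str,
    semantic_recognizer: Callable[[str], Iterable[str]],
    keywords: Iterable[str] = SENSITIVE_KEYWORDS,
    high_threshold: int = 3,    # assumed: this many items or more gives level I
    medium_threshold: int = 1,  # assumed: this many items or more gives level II
) -> str:
    """Return "I", "II" or "III" from the number of sensitive items found."""
    semantic_hits = set(semantic_recognizer(text))       # S301
    keyword_hits = match_keywords(text, keywords)        # S302
    sensitive_items = semantic_hits | keyword_hits       # S303: aggregate both sources
    count = len(sensitive_items)                         # S304: level from the count
    if count >= high_threshold:
        return "I"
    if count >= medium_threshold:
        return "II"
    return "III"
```

  • Taking the union of both result sets means an item flagged by both the recognizer and the keyword match is counted once; a different aggregation rule would change the counts and therefore the levels.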
  • the network data integration platform can send the network public opinion data and its sensitivity level to the big data cluster.
  • Big data clusters allow developers to run written programs in the "cloud", use the services provided in the "cloud", or both.
  • a big data cluster is a platform that integrates data access, data processing, data storage, query retrieval, analysis and mining, and application interfaces.
  • the big data cluster 102 is used for filtering invalid data in the network public opinion data, and sending the filtered network public opinion data to the business data integration platform.
  • the big data cluster can filter invalid data contained in the network public opinion data, and send the filtered network public opinion data to the business data integration platform.
  • the big data cluster in the present disclosure may be a distributed relational database, such as Greenplum. On the one hand, this can provide a larger data storage space, achieve data backup, and avoid the loss of application database data. On the other hand, it can also improve the computing power and query speed for massive data.
  • the invalid data can be gossip, rumors and other public opinion news whose authenticity cannot be determined.
  • some invalid keywords can be preset (for example, keywords such as the titles of known gossip items and rumors), and the network public opinion data can then be compared and matched with these invalid keywords to determine the invalid data in it. The big data cluster can then filter out the determined invalid data and send the filtered network public opinion data to the business data integration platform.
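  • The invalid-data filter can be illustrated with a few lines of keyword matching. The keyword list and the decision to match against the record title only are assumptions for this sketch; in the disclosed system the filtering runs inside the big data cluster rather than in application code like this.

```python
from typing import Dict, Iterable, List

# Hypothetical preset invalid keywords.
INVALID_KEYWORDS: List[str] = ["gossip", "rumor", "unverified"]


def is_invalid(record: Dict[str, str], invalid_keywords: Iterable[str]) -> bool:
    """A record is treated as invalid if its title contains any invalid keyword."""
    title = record.get("title", "").lower()
    return any(kw.lower() in title for kw in invalid_keywords)


def filter_invalid(
    records: Iterable[Dict[str, str]],
    invalid_keywords: Iterable[str] = INVALID_KEYWORDS,
) -> List[Dict[str, str]]:
    """Drop invalid records and return the rest in their original order."""
    keywords = list(invalid_keywords)
    return [r for r in records if not is_invalid(r, keywords)]
```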
  • the big data cluster can also analyze and process the network public opinion data to classify and summarize it. For example, the data can be divided into categories such as sudden natural disaster events, production safety accidents, mass events, public health events, corporate image, judicial events, economic and livelihood events, and overseas emergencies. In subsequent processing, different categories of network public opinion data can then be handled separately, thereby improving the orderliness of the data and the efficiency of data processing.
  • the business data integration platform 103 is used to screen enterprise public opinion data from the filtered network public opinion data, and to store the enterprise public opinion data, together with the association between the user's account level and the sensitivity level of the enterprise public opinion data, in the database server 104.
  • the business data integration platform 103 can screen out enterprise public opinion data from the filtered network public opinion data. This can be, for example, the public opinion data of the target enterprise, or the public opinion data of all enterprises in the industry associated with the target enterprise, which makes it convenient for the target enterprise to understand the development trend of the industry and improve its competitiveness among its peers.
  • the business data integration platform can also provide an audit operation interface, so that an auditor can log in to the system and input first interactive operation information (for example, manually binding the enterprise public opinion data of different sensitivity levels to the corresponding account levels) to associate enterprise public opinion data of different sensitivity levels with different account levels.
  • for example, the enterprise public opinion data with sensitivity level I can be associated with the high-level account level (high-level users, such as senior managers within the enterprise), that is, the enterprise public opinion data with sensitivity level I is displayed to high-level users; the enterprise public opinion data with sensitivity level II can be associated with the middle and high-level account level (middle and high-level users, such as middle-level managers within the enterprise), that is, the enterprise public opinion data with sensitivity level II is displayed to middle and high-level users; and the enterprise public opinion data with sensitivity level III can be associated with the low-level account level (low-level users, such as social personnel outside the enterprise and grassroots personnel inside the enterprise), that is, the enterprise public opinion data with sensitivity level III is displayed to low-level users.
  • alternatively, enterprise public opinion data with sensitivity levels I, II, and III may be associated with the high-level account level, that is, the enterprise public opinion data with sensitivity levels I, II, and III is displayed to high-level users; and enterprise public opinion data with sensitivity levels II and III can be associated with the middle and high-level account level, that is, the enterprise public opinion data with sensitivity levels II and III is displayed to middle and high-level users.
  • the account authority of the big data analyst can also be set through the above-mentioned audit operation interface as follows: adjusting the processing parameters of the system and modifying the algorithm programs related to the system.
  • The account authority of the chief auditor can be set as follows: audit all data in the system, with the authority to set display, query, withdrawal of display, and withdrawal of query for all data.
  • The account authority of a group auditor can be set as follows: audit a certain type of data, with the authority to set display, query, withdrawal of display, withdrawal of query, etc. for this type of data.
  • The account authority of high-level users can be set as follows: receive the classified display of "enterprise public opinion data with sensitivity levels I, II, and III", with permission to read and query "enterprise public opinion data with sensitivity levels I, II, and III".
  • The account authority of middle and high-level users can be set as follows: receive the classified display of "enterprise public opinion data with sensitivity levels II and III", with permission to read and query "enterprise public opinion data with sensitivity levels II and III".
  • The account authority of low-level users can be set as follows: permission to read and query "enterprise public opinion data with sensitivity level III".
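  • The read and query permissions listed above amount to a mapping from account level to the set of visible sensitivity levels. A minimal sketch, assuming hypothetical level names (`"high"`, `"middle"`, `"low"`) and a hypothetical `sensitivity` field on each record:

```python
from typing import Dict, FrozenSet, Iterable, List

# Account level -> sensitivity levels that level may read and query,
# following the permission settings described in the text.
VISIBLE_LEVELS: Dict[str, FrozenSet[str]] = {
    "high": frozenset({"I", "II", "III"}),
    "middle": frozenset({"II", "III"}),
    "low": frozenset({"III"}),
}


def visible_records(
    records: Iterable[Dict[str, str]], account_level: str
) -> List[Dict[str, str]]:
    """Return only the records whose sensitivity level the account may see."""
    allowed = VISIBLE_LEVELS.get(account_level, frozenset())
    return [r for r in records if r["sensitivity"] in allowed]
```

  • An unknown account level maps to the empty set, so the sketch denies access by default rather than falling back to some visible tier.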
  • different account levels can be distinguished by user names.
  • the user names of high-level users can be set to start with the fixed characters "superior-user";
  • the user names of middle and high-level users can be set to start with the fixed characters "junior-user";
  • and the user names of low-level users can be set to start with the fixed characters "user". It should be noted that the above characters can be set according to the actual situation, and all such settings fall within the protection scope of the present disclosure.
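  • Distinguishing account levels by user name prefix could look like the following sketch. The prefixes are the ones given in the text; the level names and the function name `account_level_from_username` are assumptions introduced for illustration.

```python
from typing import List, Optional, Tuple

# (prefix, account level) pairs, using the fixed characters from the text.
PREFIX_TO_LEVEL: List[Tuple[str, str]] = [
    ("superior-user", "high"),
    ("junior-user", "middle"),
    ("user", "low"),
]


def account_level_from_username(username: str) -> Optional[str]:
    """Return the account level implied by the user name prefix, or None."""
    for prefix, level in PREFIX_TO_LEVEL:
        if username.startswith(prefix):
            return level
    return None
```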
  • the auditor may also evaluate the risk value corresponding to the above-mentioned enterprise public opinion data, so that relevant public opinion prevention and control measures can be formulated in advance according to the risk value, so as to stop the relevant public opinion crisis in time and avoid enterprise losses caused by the crisis.
  • the business data integration platform can also analyze and process the enterprise public opinion data to classify and summarize it. For example, it can be divided into categories such as data that has no beneficial impact on corporate image, negative public opinion data (that is, data that is unfavorable for corporate image promotion), and enterprise public opinion data related to core technology.
  • different types of enterprise public opinion data can then be handled separately, thereby improving the orderliness of the data and the pertinence of data display.
  • the business data integration platform can store the enterprise public opinion data, together with the acquired association between the user's account level and the sensitivity level of the enterprise public opinion data, in a database server deployed on a single machine to further improve the reliability of the data.
  • the database server in the present disclosure may be PostgreSQL (an open source relational database, which is characterized by powerful functions and supports many advanced features such as object orientation), thereby reducing equipment costs and ensuring strong concurrent processing capabilities.
  • the data display platform can request data directly from the database server, which solves the technical problem of weak concurrency caused by deploying only one distributed big data cluster in the related art and improves the concurrent processing capability of the system.
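  • To make the stored association concrete, the sketch below keeps the enterprise public opinion data and the account-level-to-sensitivity-level association in two relational tables and answers the display platform's request with a join. It uses Python's built-in `sqlite3` with an in-memory database purely as a stand-in for the PostgreSQL server named in the text; the table names, column names and level names are hypothetical.

```python
import sqlite3
from typing import List, Tuple


def build_store() -> sqlite3.Connection:
    """Create the two tables and load the level association (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE enterprise_opinion (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            sensitivity TEXT NOT NULL
        );
        CREATE TABLE level_access (
            account_level TEXT NOT NULL,
            sensitivity TEXT NOT NULL,
            PRIMARY KEY (account_level, sensitivity)
        );
        """
    )
    conn.executemany(
        "INSERT INTO level_access (account_level, sensitivity) VALUES (?, ?)",
        [
            ("high", "I"), ("high", "II"), ("high", "III"),
            ("middle", "II"), ("middle", "III"),
            ("low", "III"),
        ],
    )
    return conn


def store_opinion(conn: sqlite3.Connection, title: str, sensitivity: str) -> None:
    """Store one item of enterprise public opinion data with its sensitivity level."""
    conn.execute(
        "INSERT INTO enterprise_opinion (title, sensitivity) VALUES (?, ?)",
        (title, sensitivity),
    )


def fetch_for_level(conn: sqlite3.Connection, account_level: str) -> List[Tuple[str, str]]:
    """Return (title, sensitivity) rows visible to the given account level."""
    return conn.execute(
        """
        SELECT o.title, o.sensitivity
        FROM enterprise_opinion AS o
        JOIN level_access AS a ON a.sensitivity = o.sensitivity
        WHERE a.account_level = ?
        ORDER BY o.id
        """,
        (account_level,),
    ).fetchall()
```

  • Keeping the association in its own table means an auditor can rebind sensitivity levels to account levels without touching the stored public opinion data.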
  • the data display platform 105 is used for requesting the database server to obtain enterprise public opinion data of the target sensitivity level according to the account level of the authenticated user, and displaying the enterprise public opinion data of the target sensitivity level to the authenticated user.
  • the data display platform in the present disclosure may be the iNews news platform.
  • FIG. 4 is a schematic diagram of a sub-flow of the public opinion data processing system in an exemplary embodiment of the present disclosure, which specifically shows that the data display platform provides an identity authentication interface.
  • a schematic diagram of a sub-flow for determining a user who has passed the authentication includes steps S401 to S404 , and the specific implementation is explained below with reference to FIG. 4 .
  • step S401 an identity authentication interface is provided.
  • the data display platform can provide an identity authentication interface.
  • the identity authentication interface can display a user name and a login password.
  • it can also include a picture verification code, a digital verification code, etc. The actual situation is set by itself, which belongs to the protection scope of the present disclosure.
  • step S402 the identity information input by the user to be authenticated in the identity authentication interface is obtained.
  • the user to be authenticated can input his identity information in the above-mentioned identity authentication interface.
  • step S403 the identity information is sent to the third-party authentication platform, so that the third-party authentication platform performs legality authentication on the identity information.
  • the data display platform can send the above-mentioned identity information to the third-party authentication platform, so that the third-party authentication platform can verify the legitimacy of the above-mentioned identity information.
  • the third-party certification platform may refer to: a third-party certification body other than Global Business News.
  • the third-party certification is to verify the true and legal identity of the Business News applicant, to eliminate false information to the greatest extent, and to ensure the interests of Business News.
  • the third-party authentication platform can perform LDAP authentication (Lightweight Directory Access Protocol, Lightweight Directory Access Protocol, referred to as: LDAP) for identity information.
  • LDAP authentication is an authentication built by WSS3.0 plus the Lightweight Directory Access Protocol.
  • the method is that the identity information is placed on the LDAP server, and the user to be authenticated is authenticated through the data on the LDAP server. After the authentication is passed, the third-party authentication platform can send an authentication pass message to the data display platform.
  • in step S404, upon receiving the authentication-passed message returned by the third-party authentication platform, the user to be authenticated is determined to be an authenticated user.
  • upon receiving the authentication-passed message returned by the third-party authentication platform, the data display platform can determine that the user to be authenticated is an authenticated user.
  • the data display platform can then request the database server according to the account level of the authenticated user, and display the retrieved enterprise public opinion data to the authenticated user.
  • the corresponding account permissions may be determined from the user name of the authenticated user. For example, when the user name begins with the fixed string "superior-user", the account level of the user may be determined to be high-level user.
  • in that case, the requested enterprise public opinion data of the target sensitivity level may be enterprise public opinion data of sensitivity levels I, II, and III. In this way, enterprise public opinion data of different sensitivity levels can be displayed to users of different account levels, so that the displayed data is closer to users' needs and users obtain information more efficiently.
  • in step S501, an information display interface is provided, and the enterprise public opinion data of the target sensitivity level is displayed in it.
  • the data display platform may provide an information display interface and display in it the enterprise public opinion data of the target sensitivity level that was requested from the database server according to the account level of the authenticated user.
  • in step S502, feedback on the enterprise public opinion data of the target sensitivity level, entered by the authenticated user in the information display interface, is obtained.
  • the authenticated user can browse the enterprise public opinion data of the above target sensitivity level and, after browsing, enter feedback on each piece of enterprise public opinion data in the information display interface. For example: displaying X data helps to improve the corporate image; displaying Y data may damage the corporate image.
  • in step S503, the feedback is written into the database server.
  • the data display platform can write the above feedback into the database server.
  • when the feedback is detected to contain a specified keyword, a prompt message can be sent to the auditor (for example, by showing a pop-up message on the audit operation interface). After receiving the message, the auditor can input second interactive operation information (for example, manually deleting the data) through the audit operation interface to withdraw the display of the relevant enterprise public opinion data.
  • the specified keyword may be a keyword indicating possible losses to the enterprise, such as "damages the corporate image" or "leaks the enterprise's core technology secrets". For example, when the feedback is "the display of Y data may damage the corporate image", the display of Y data by the data display platform can be withdrawn, that is, the data display platform can no longer request the Y data.
  • FIG. 6 shows the overall interaction flow chart of the public opinion data processing system in an exemplary embodiment of the present disclosure, and the specific implementation is explained below with reference to FIG. 6.
  • the network data integration platform 101 is used to collect the data in the preset data repository and/or call the web crawler to periodically collect the data in the target website, so as to obtain the network public opinion data; to audit and analyze the collected network public opinion data according to the semantic recognition algorithm and the pre-stored sensitive keywords (automatic audit operation), so as to obtain the sensitivity level of the network public opinion data; and to send the network public opinion data and its sensitivity level to the big data cluster;
  • the big data cluster 102 is used to filter invalid data and send the filtered network public opinion data to the business data integration platform;
  • the business data integration platform 103 is used to screen enterprise public opinion data from the filtered network public opinion data; to provide an audit operation interface and store, in the database server 104, the enterprise public opinion data and the association set by the auditor between users' account levels and the sensitivity levels of the enterprise public opinion data; and to provide an information display interface for users to read and query data;
  • the data display platform 105 is used to provide an identity authentication interface, send the identity information of the user to be authenticated to the third-party authentication platform 106, and verify the legitimacy of the user's identity through the LDAP service of the third-party authentication platform; to request the database server according to the account level of the authenticated user to obtain the enterprise public opinion data of the target sensitivity level; and to display that data to the authenticated user;
  • the data display platform 105 is further used to provide an information display interface, obtain the feedback of the authenticated user, write the feedback into the database server 104, and, when the feedback on target public opinion data contains a specified keyword, withdraw the display of that target public opinion data.
  • the present disclosure also provides a method for processing public opinion data.
  • FIG. 7 shows a schematic flowchart of the method for processing public opinion data in an exemplary embodiment of the present disclosure.
  • the execution body of the method for processing public opinion data may be a server that processes public opinion data.
  • a method for processing public opinion data includes the following steps:
  • Step S710: audit and analyze the collected network public opinion data through the network data integration platform to obtain the sensitivity level of the network public opinion data, and send the network public opinion data and its sensitivity level to the big data cluster;
  • Step S720: filter invalid data in the network public opinion data through the big data cluster, and send the filtered network public opinion data to the business data integration platform;
  • Step S730: screen the enterprise public opinion data from the filtered network public opinion data through the business data integration platform, and store the enterprise public opinion data and the association between the acquired account levels of users and the sensitivity levels of the enterprise public opinion data in the database server;
  • Step S740: request the database server through the data display platform according to the account level of the authenticated user to obtain enterprise public opinion data of the target sensitivity level, and display the enterprise public opinion data of the target sensitivity level to the authenticated user.
  • the data in the preset data repository is collected through the network data integration platform and/or the web crawler is called to periodically collect the data in the target website; the collected data is deduplicated and normalized to obtain the network public opinion data.
  • a semantic recognition algorithm is invoked through the network data integration platform to perform semantic recognition on the network public opinion data, and a semantic recognition result is obtained; the target keywords in the network public opinion data are determined according to the pre-stored sensitive keywords; the semantic recognition result and the target keywords are aggregated and integrated to obtain the sensitive data contained in the network public opinion data; the sensitivity level of the network public opinion data is determined according to the number of items of sensitive data.
  • an audit operation interface is provided through the business data integration platform; according to the first interactive operation information input by the auditor on the audit operation interface, the association between the user's account level and the sensitivity level of the enterprise public opinion data is determined.
  • an identity authentication interface is provided through the data display platform; the identity information entered by the user to be authenticated in the identity authentication interface is obtained; the identity information is sent to a third-party authentication platform, so that the third-party authentication platform verifies the legitimacy of the identity information; upon receiving the authentication-passed message returned by the third-party authentication platform, the user to be authenticated is determined to be an authenticated user.
  • an information display interface is provided through the data display platform, and enterprise public opinion data of the target sensitivity level is displayed in the information display interface; the feedback entered by the authenticated user in the information display interface on the enterprise public opinion data of the target sensitivity level is obtained; the feedback is written into the database server.
  • when the business data integration platform detects that the feedback contains a specified keyword, a prompt message is sent to the auditor; an audit operation interface is provided; and according to the second interactive operation information input by the auditor on the audit operation interface, the display of the enterprise public opinion data of the target sensitivity level is withdrawn.
  • the present disclosure can solve the low efficiency caused by manual data auditing in the related art, improve the efficiency of data auditing, reduce the amount of data that needs to be processed, prevent invalid data from affecting subsequent data processing, and improve the efficiency of subsequent data processing. Further, the technical problem of weak concurrency caused by deploying only one distributed big data cluster in the related art can be solved, and the concurrent processing capability of the system can be improved. On the other hand, enterprise public opinion data of different sensitivity levels can be displayed according to the user's account level, providing an intelligent information security management and control mechanism, so that the displayed data is closer to the user's needs and the user obtains information more efficiently.
  • although several modules or units of the apparatus for performing actions are mentioned in the above detailed description, this division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided among multiple modules or units.
  • the exemplary embodiments described herein can be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solutions according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or removable hard disk) or on a network, and which includes several instructions that cause a computing device (such as a personal computer, a server, a mobile terminal, or a network device) to execute a method according to an embodiment of the present disclosure.
  • a computer storage medium capable of implementing the above method is also provided.
  • a program product capable of implementing the method described above in this specification is stored thereon.
  • various aspects of the present disclosure may also be implemented in the form of a program product including program code which, when the program product is run on a terminal device, causes the terminal device to perform the steps according to various exemplary embodiments of the present disclosure described in the above-mentioned "Exemplary Methods" section of this specification.
  • a program product 800 for implementing the above method according to an embodiment of the present disclosure is described, which can take the form of a portable compact disc read-only memory (CD-ROM) containing program code, and can be run on a terminal device such as a personal computer.
  • a readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • the program product may employ any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a computer readable signal medium may include a propagated data signal in baseband or as part of a carrier wave with readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a readable signal medium can also be any readable medium, other than a readable storage medium, that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a readable medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Program code for performing the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user computing device, partly on the user device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on the remote computing device or server.
  • where a remote computing device is involved, it may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computing device (for example, through the Internet using an Internet service provider).
  • an electronic device capable of implementing the above method is also provided.
  • aspects of the present disclosure may be implemented as a system, method or program product. Therefore, various aspects of the present disclosure can be embodied in the following forms: a complete hardware implementation, a complete software implementation (including firmware, microcode, etc.), or an implementation combining hardware and software aspects, which may be collectively referred to herein as a "circuit", "module" or "system".
  • an electronic device 900 according to this embodiment of the present disclosure is described below with reference to FIG. 9.
  • the electronic device 900 shown in FIG. 9 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
  • electronic device 900 takes the form of a general-purpose computing device.
  • Components of the electronic device 900 may include, but are not limited to, the above-mentioned at least one processing unit 910 , the above-mentioned at least one storage unit 920 , a bus 930 connecting different system components (including the storage unit 920 and the processing unit 910 ), and a display unit 940 .
  • the storage unit stores program code that can be executed by the processing unit 910, so that the processing unit 910 performs the steps according to various exemplary embodiments of the present disclosure described in the above-mentioned "Exemplary Methods" section of this specification.
  • for example, the processing unit 910 may perform the steps shown in FIG. 7: step S710, audit and analyze the collected network public opinion data through the network data integration platform to obtain the sensitivity level of the network public opinion data, and send the network public opinion data and its sensitivity level to the big data cluster; step S720, filter invalid data in the network public opinion data through the big data cluster, and send the filtered network public opinion data to the business data integration platform; step S730, screen the enterprise public opinion data from the filtered network public opinion data through the business data integration platform, and store the enterprise public opinion data and the association between the acquired account levels of users and the sensitivity levels of the enterprise public opinion data in the database server; step S740, request the database server through the data display platform according to the account level of the authenticated user to obtain the enterprise public opinion data of the target sensitivity level, and display that data to the authenticated user.
  • the storage unit 920 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 9201 and/or a cache storage unit 9202, and may further include a read-only storage unit (ROM) 9203.
  • the storage unit 920 may also include a program/utility 9204 having a set (at least one) of program modules 9205, including, but not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
  • the bus 930 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
  • the electronic device 900 may also communicate with one or more external devices 1000 (e.g., keyboards, pointing devices, Bluetooth devices), with one or more devices that enable a user to interact with the electronic device 900, and/or with any device (e.g., router, modem) that enables the electronic device 900 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 950. Also, the electronic device 900 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 960. As shown, the network adapter 960 communicates with other modules of the electronic device 900 via the bus 930. It should be understood that, although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 900, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
  • the exemplary embodiments described herein may be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solutions according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or removable hard disk) or on a network, and which includes several instructions that cause a computing device (such as a personal computer, a server, a terminal device, or a network device) to execute the method according to an embodiment of the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

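A public opinion data processing system, a public opinion data processing method, a computer storage medium, and an electronic device. The public opinion data processing system (100) includes: a network data integration platform (101), configured to audit and analyze collected network public opinion data to obtain a sensitivity level of the network public opinion data, and to send the network public opinion data and its sensitivity level to a big data cluster (102); the big data cluster (102), configured to send the filtered network public opinion data to a business data integration platform (103); the business data integration platform (103), configured to screen enterprise public opinion data from the filtered network public opinion data, and to store the enterprise public opinion data and the association between the acquired account levels of users and the sensitivity levels of the enterprise public opinion data in a database server (104); and a data display platform (105), configured to display enterprise public opinion data of a target sensitivity level to an authenticated user.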

Description

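Public opinion data processing system and method, computer storage medium, and electronic device
This application claims priority to the patent application filed with the Chinese Patent Office on September 11, 2020, with application number CN202010953021.0 and entitled "Public opinion data processing system and method, computer storage medium, and electronic device", the entire contents of which are incorporated herein by reference.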
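Technical Field
The present disclosure relates to the field of big data technology, and in particular to a public opinion data processing system, a public opinion data processing method, a computer storage medium, and an electronic device.
Background
In the new media era, the continuous development of media technology and the continuous innovation of network technology have driven the development of new media and accelerated the growth of online public opinion. Compared with traditional media, new media have greater influence, stronger cohesion, and more convenient dissemination, and have become an important platform and vehicle for different interest groups to express their demands. On this platform, the public has a strong desire for in-depth searching, pays attention to a wide range of issues, and is enthusiastic about taking part in the supervision of social events, so much information can easily be amplified without limit and trigger public emergencies. For enterprises, how to monitor and handle public opinion effectively is therefore particularly important.
At present, related public opinion data processing systems generally deploy only a distributed relational database, so their concurrent data processing capability is weak; in addition, they cannot display data of different sensitivity levels to users of different account levels.
In view of this, there is an urgent need in the art to develop a new public opinion processing system.
It should be noted that the information disclosed in the Background section above is only intended to enhance understanding of the background of the present disclosure.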
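Summary
The purpose of the present disclosure is to provide a public opinion data processing system, a public opinion data processing method, a computer storage medium, and an electronic device, thereby avoiding, at least to some extent, the defects in the related art of weak concurrent processing capability and the inability to display data of different sensitivity levels to users of different account levels.
Other features and advantages of the present disclosure will become apparent from the following detailed description, or may be learned in part through practice of the present disclosure.
According to a first aspect of the present disclosure, a public opinion data processing system is provided, including: a network data integration platform, a big data cluster, a business data integration platform, a database server, and a data display platform; wherein the network data integration platform is configured to audit and analyze collected network public opinion data to obtain a sensitivity level of the network public opinion data, and to send the network public opinion data and its sensitivity level to the big data cluster; the big data cluster is configured to filter invalid data in the network public opinion data and to send the filtered network public opinion data to the business data integration platform; the business data integration platform is configured to screen enterprise public opinion data from the filtered network public opinion data, and to store the enterprise public opinion data and the association between the acquired account levels of users and the sensitivity levels of the enterprise public opinion data in the database server; and the data display platform is configured to request the database server according to the account level of an authenticated user to obtain enterprise public opinion data of a target sensitivity level, and to display the enterprise public opinion data of the target sensitivity level to the authenticated user.
According to a second aspect of the present disclosure, a public opinion data processing method is provided, including: auditing and analyzing collected network public opinion data through a network data integration platform to obtain a sensitivity level of the network public opinion data, and sending the network public opinion data and its sensitivity level to a big data cluster; filtering invalid data in the network public opinion data through the big data cluster, and sending the filtered network public opinion data to a business data integration platform; screening enterprise public opinion data from the filtered network public opinion data through the business data integration platform, and storing the enterprise public opinion data and the association between the acquired account levels of users and the sensitivity levels of the enterprise public opinion data in a database server; and requesting the database server through a data display platform according to the account level of an authenticated user to obtain enterprise public opinion data of a target sensitivity level, and displaying the enterprise public opinion data of the target sensitivity level to the authenticated user.
According to a third aspect of the present disclosure, a computer storage medium is provided, on which a computer program is stored; when executed by a processor, the computer program implements the public opinion data processing method of the second aspect.
According to a fourth aspect of the present disclosure, an electronic device is provided, including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the public opinion data processing method of the second aspect by executing the executable instructions.
As can be seen from the above technical solutions, the public opinion data processing system, public opinion data processing method, computer storage medium, and electronic device in the exemplary embodiments of the present disclosure have at least the following advantages and positive effects:
In the technical solutions provided by some embodiments of the present disclosure, on the one hand, the network data integration platform audits and analyzes the collected network public opinion data to obtain its sensitivity level, and sends the network public opinion data and its sensitivity level to the big data cluster, so that the big data cluster filters out the invalid data and sends the filtered data to the business data integration platform. This not only solves the low efficiency caused by manual data review in the related art and improves the efficiency of data auditing, but also reduces the amount of data to be processed, prevents invalid data from affecting subsequent data processing, and improves the efficiency of subsequent data processing. Further, the business data integration platform screens enterprise public opinion data from the filtered network public opinion data, and stores the enterprise public opinion data and the association between the acquired account levels of users and the sensitivity levels of the enterprise public opinion data in the database server, which solves the technical problem in the related art of weak concurrency caused by deploying only a single distributed big data cluster and improves the concurrent processing capability of the system. On the other hand, the data display platform requests the database server according to the account level of the authenticated user to obtain enterprise public opinion data of the target sensitivity level and displays it to the authenticated user, so that enterprise public opinion data of different sensitivity levels can be displayed according to the user's account level. This provides an intelligent information security control mechanism, brings the displayed data closer to the user's needs, and improves the user's efficiency in obtaining information.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.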
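Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure. Obviously, the drawings described below are only some embodiments of the present disclosure, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 shows a schematic structural diagram of the public opinion data processing system in an exemplary embodiment of the present disclosure;
FIG. 2 shows a sub-flow diagram of the public opinion data processing system in an exemplary embodiment of the present disclosure;
FIG. 3 shows a sub-flow diagram of the public opinion data processing system in an exemplary embodiment of the present disclosure;
FIG. 4 shows a sub-flow diagram of the public opinion data processing system in an exemplary embodiment of the present disclosure;
FIG. 5 shows a sub-flow diagram of the public opinion data processing system in an exemplary embodiment of the present disclosure;
FIG. 6 shows the overall interaction flow chart of the public opinion data processing system in an exemplary embodiment of the present disclosure;
FIG. 7 shows a schematic flowchart of the public opinion data processing method in an exemplary embodiment of the present disclosure;
FIG. 8 shows a schematic structural diagram of the computer storage medium in an exemplary embodiment of the present disclosure;
FIG. 9 shows a schematic structural diagram of the electronic device in an exemplary embodiment of the present disclosure.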
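Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, example embodiments can be implemented in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that the present disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of the embodiments of the present disclosure. However, those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced while omitting one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known technical solutions are not shown or described in detail, to avoid obscuring aspects of the present disclosure.
The terms "a", "an", "the", and "said" are used in this specification to indicate the presence of one or more elements/components/etc.; the terms "including" and "having" are used in an open-ended, inclusive sense and mean that additional elements/components/etc. may be present besides those listed; the terms "first" and "second" etc. are used only as labels and do not limit the number of their objects.
In addition, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, so repeated descriptions of them are omitted. Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities.
At present, the related art generally deploys only a distributed relational database, so the concurrent data processing capability is weak, and all relevant data must be audited manually, so data processing is inefficient; in addition, data of different sensitivity levels cannot be displayed to users of different account levels, and sensitive information cannot be prevented from flowing into the information display platform.
In the embodiments of the present disclosure, a public opinion data processing system is first provided, which overcomes, at least to some extent, the defects in the related art of weak concurrency and the inability to display data of different sensitivity levels to users of different account levels.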
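FIG. 1 shows a schematic structural diagram of the public opinion data processing system in an exemplary embodiment of the present disclosure; the execution body of the public opinion data processing system may be a server that processes public opinion data.
As shown in FIG. 1, a public opinion data processing system 100 according to an embodiment of the present disclosure may include a network data integration platform 101, a big data cluster 102, a business data integration platform 103, a database server 104, and a data display platform 105, wherein:
the network data integration platform is configured to audit and analyze collected network public opinion data to obtain the sensitivity level of the network public opinion data, and to send the network public opinion data and its sensitivity level to the big data cluster;
the big data cluster is configured to filter invalid data in the network public opinion data, and to send the filtered network public opinion data to the business data integration platform;
the business data integration platform is configured to screen enterprise public opinion data from the filtered network public opinion data, and to store the enterprise public opinion data and the association between the acquired account levels of users and the sensitivity levels of the enterprise public opinion data in the database server;
the data display platform is configured to request the database server according to the account level of an authenticated user to obtain enterprise public opinion data of a target sensitivity level, and to display the enterprise public opinion data of the target sensitivity level to the authenticated user.
In the technical solution provided by the embodiment shown in FIG. 1, on the one hand, the network data integration platform audits and analyzes the collected network public opinion data to obtain its sensitivity level, and sends the network public opinion data and its sensitivity level to the big data cluster, so that the big data cluster filters out the invalid data and sends the filtered data to the business data integration platform. This not only solves the low efficiency caused by manual data review in the related art and improves the efficiency of data auditing, but also reduces the amount of data to be processed, prevents invalid data from affecting subsequent data processing, and improves the efficiency of subsequent data processing. Further, the business data integration platform screens enterprise public opinion data from the filtered network public opinion data, and stores the enterprise public opinion data and the association between the acquired account levels of users and the sensitivity levels of the enterprise public opinion data in the database server, which solves the technical problem in the related art of weak concurrency caused by deploying only a single distributed big data cluster and improves the concurrent processing capability of the system. On the other hand, the data display platform requests the database server according to the account level of the authenticated user to obtain enterprise public opinion data of the target sensitivity level and displays it to the authenticated user, so that enterprise public opinion data of different sensitivity levels can be displayed according to the user's account level. This provides an intelligent information security control mechanism, brings the displayed data closer to the user's needs, and improves the user's efficiency in obtaining information.
The specific implementation of each part in FIG. 1 is described in detail below: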
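The network data integration platform 101 is configured to audit and analyze collected network public opinion data to obtain the sensitivity level of the network public opinion data, and to send the network public opinion data and its sensitivity level to the big data cluster.
For example, refer to FIG. 2, which is a sub-flow diagram of the public opinion data processing system in an exemplary embodiment of the present disclosure. It specifically shows the sub-flow in which the network data integration platform collects network public opinion data, and includes steps S201 to S202. A specific implementation is explained below with reference to FIG. 2.
In step S201, data in a preset data repository is collected and/or a web crawler is invoked to periodically collect data from target websites.
For example, a data repository may be set up in advance. The data in it may be public opinion data collected by public opinion specialists, public opinion data supplied by data providers, and so on. The network data integration platform can then collect the data in this repository. Because such data generally has a fixed format that rarely changes and comes from stable sources, substantial development and operations costs can be avoided.
For example, a web crawler may also be invoked to periodically collect data from target websites. The target websites may be domestic and foreign search engines (Baidu, Sogou, 360), Weibo, WeChat official accounts, forums, Tieba, blogs, and other website platforms; they can be set according to the actual situation, and all such variations fall within the protection scope of the present disclosure. For example, a timer may be set (for example, with an interval of 10 minutes), so that data on the target websites is collected every 10 minutes and the collected data is converted into a unified format. In this way, massive amounts of data can be collected without any procurement budget, which reduces the cost of data collection.
In step S202, the collected data is deduplicated and normalized to obtain the network public opinion data.
After the data is collected, it can be deduplicated to remove redundant data, which prevents large amounts of redundant data from entering subsequent processing, keeps the data compact, saves processor threads, and maintains data processing efficiency.
After deduplication, the data can be normalized to obtain standardized network public opinion data that is convenient for subsequent processing. The collected network public opinion data may be massive and cover multiple industries; for example, it may include public opinion data from the computer, insurance, automotive, food safety, and electronic circuit industries.
For example, once obtained, the network public opinion data may be stored in JSON format (JavaScript Object Notation, a lightweight data interchange format). JSON data not only improves readability but also reduces complexity, which facilitates data exchange and data processing.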
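Step S202 and the JSON storage described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the record fields (`source`, `title`, `content`) and the function names are assumptions introduced here, and "normalization" is reduced to whitespace and case cleanup.

```python
import json


def deduplicate(records):
    """Drop exact repeats, keeping the first occurrence and the original order."""
    seen = set()
    unique = []
    for rec in records:
        key = (rec.get("source"), rec.get("title"), rec.get("content"))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def normalize(record):
    """Map one raw record to a uniform shape (hypothetical field names)."""
    def squash(value):
        # Collapse runs of whitespace and trim both ends.
        return " ".join(str(value or "").split())

    return {
        "source": squash(record.get("source")).lower(),
        "title": squash(record.get("title")),
        "content": squash(record.get("content")),
    }


def to_network_opinion_json(raw_records):
    """Deduplicate, then normalize, then serialize to a JSON string."""
    cleaned = [normalize(r) for r in deduplicate(raw_records)]
    # ensure_ascii=False keeps non-ASCII text readable in the stored JSON.
    return json.dumps(cleaned, ensure_ascii=False)
```

The order follows the text (deduplicate first, then normalize), so records that differ only in whitespace are not merged; a real system might choose a looser duplicate key.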
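After the network public opinion data is collected, the network data integration platform may also invoke an automated audit module to audit and analyze the data and obtain the sensitivity level of the network public opinion data. This process may involve key technologies such as pattern matching algorithms, text semantic analysis, hot topic discovery, and recognition of objectionable image content, so as to strengthen the security management of the data and further ensure that the data is lawful, healthy, and safe. For example, refer to FIG. 3, which is a sub-flow diagram of the public opinion data processing system in an exemplary embodiment of the present disclosure. It specifically shows the sub-flow in which the network data integration platform audits and analyzes the collected network public opinion data to obtain its sensitivity level, and includes steps S301 to S304. A specific implementation is explained below with reference to FIG. 3.
In step S301, a semantic recognition algorithm is invoked to perform semantic recognition on the network public opinion data, and a semantic recognition result is obtained.
For example, the network data integration platform may invoke a semantic recognition algorithm to perform semantic recognition on the network public opinion data and obtain a semantic recognition result. For example, a deep convolutional neural network may also be trained to obtain a semantic recognition model, and the network public opinion data may be input into that model to obtain its semantic recognition result.
In step S302, target keywords in the network public opinion data are determined according to pre-stored sensitive keywords.
For example, a big data analyst (a practitioner who scientifically analyzes, mines, and presents big data using various analysis techniques and uses it for decision support) may also store some sensitive keywords in advance. For example, the sensitive keywords may be keywords such as violence, damage to the corporate image, and words with negative meaning. The network public opinion data can then be matched against these sensitive keywords to determine the target keywords.
In step S303, the semantic recognition result and the target keywords are aggregated and integrated to obtain the sensitive data contained in the network public opinion data.
After the semantic recognition result is obtained and the above target keywords are identified, the two can be aggregated and integrated to obtain the sensitive data contained in the network public opinion data, and the sensitive data can be marked.
In step S304, the sensitivity level of the network public opinion data is determined according to the number of items of sensitive data.
For example, when network public opinion data A contains 8 items of sensitive data, the sensitivity level of A may be determined as level I (high); when network public opinion data B contains 4 items of sensitive data, the sensitivity level of B may be determined as level II (medium); when network public opinion data C contains 0 items of sensitive data, the sensitivity level of C may be determined as level III (low).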
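Steps S302 to S304 can be sketched as below. The text only gives three sample points (8 items gives level I, 4 gives level II, 0 gives level III), so the cut-offs `high_threshold` and `medium_threshold` are assumptions chosen to agree with those points. The semantic recognition result of step S301 is treated as an already-computed set of flagged items, and both function names are hypothetical.

```python
def find_target_keywords(text, sensitive_keywords):
    """S302: return the pre-stored sensitive keywords that occur in the text."""
    return {kw for kw in sensitive_keywords if kw in text}


def sensitivity_level(semantic_hits, keyword_hits,
                      high_threshold=8, medium_threshold=4):
    """S303 + S304: merge both result sets, then grade by the number of items."""
    sensitive_items = set(semantic_hits) | set(keyword_hits)
    count = len(sensitive_items)
    if count >= high_threshold:
        return "I"    # high
    if count >= medium_threshold:
        return "II"   # medium
    return "III"      # low
```

Taking the union means an item flagged by both the semantic model and the keyword match is counted once; whether the original system counts it once or twice is not stated.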
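The network data integration platform can then send the network public opinion data and its sensitivity level to the big data cluster. A big data cluster allows developers to run the programs they have written in the "cloud", to use services provided in the "cloud", or both. It is a platform that integrates data access, data processing, data storage, query and retrieval, analysis and mining, and application interfaces.
Continuing to refer to FIG. 1, the big data cluster 102 is configured to filter invalid data in the network public opinion data, and to send the filtered network public opinion data to the business data integration platform.
For example, the big data cluster may filter out invalid data contained in the network public opinion data and send the filtered network public opinion data to the business data integration platform. It should be noted that the big data cluster in the present disclosure may be a distributed relational database, for example GreenPlum. On the one hand, this provides a large data storage space and a data backup function, avoiding losses caused by data loss in the application database; on the other hand, it improves computing capability and query speed over massive data.
For example, invalid data may be public opinion messages whose authenticity cannot be determined, such as gossip, hearsay, and rumors. For example, some invalid keywords may be set in advance (for example, keywords such as the titles of gossip items, hearsay, or rumors), and the network public opinion data can be matched against these invalid keywords to identify the invalid data. The big data cluster can then filter out the identified invalid data and send the filtered network public opinion data to the business data integration platform.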
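The keyword-based filtering of invalid data described above could look like the following sketch. The field names `title` and `content` and both function names are assumptions made for illustration; the text does not say which fields are matched, so this version checks both.

```python
def is_invalid(record, invalid_keywords):
    """True if any preset invalid keyword occurs in the title or the content."""
    text = f"{record.get('title', '')} {record.get('content', '')}"
    return any(kw in text for kw in invalid_keywords)


def filter_invalid(records, invalid_keywords):
    """Return only the records that match none of the invalid keywords."""
    return [r for r in records if not is_invalid(r, invalid_keywords)]
```

Plain substring matching is the simplest reading of "matched against these invalid keywords"; a production filter would likely need tokenization or fuzzy matching.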
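It should be noted that the big data cluster may also analyze and process the network public opinion data to classify and summarize it. For example, the data may be divided into categories such as sudden natural disasters, production safety accidents, mass incidents, public health events, corporate image, judicial events, economic and livelihood events, and overseas emergencies, so that different categories of network public opinion data can be handled separately in subsequent processing, which makes the data more orderly and the processing more targeted.
Continuing to refer to FIG. 1, the business data integration platform 103 is configured to screen enterprise public opinion data from the filtered network public opinion data, and to store the enterprise public opinion data and the association between the acquired account levels of users and the sensitivity levels of the enterprise public opinion data in the database server 104.
The business data integration platform 103 can then screen out enterprise public opinion data from the filtered network public opinion data. For example, this may be the enterprise public opinion data of a target enterprise, or the public opinion data of all enterprises in the industry associated with the target enterprise, which helps the target enterprise understand industry trends and improve its competitiveness among peers.
The business data integration platform may also provide an audit operation interface, so that an auditor can log in to the system and input first interactive operation information (for example, manually binding enterprise public opinion data of different sensitivity levels to the corresponding account levels), thereby associating enterprise public opinion data of different sensitivity levels with different account levels. For example, following the explanation of step S304 above, enterprise public opinion data of sensitivity level I may be associated with the high-level account level (high-level users, for example senior managers within the enterprise), so that level I data is displayed to high-level users; enterprise public opinion data of sensitivity level II may be associated with the mid-to-high-level account level (mid-to-high-level users, for example middle managers within the enterprise), so that level II data is displayed to mid-to-high-level users; and enterprise public opinion data of sensitivity level III may be associated with the low-level account level (low-level users, for example members of the public outside the enterprise and grassroots staff inside it), so that level III data is displayed to low-level users.
For example, enterprise public opinion data of sensitivity levels I, II, and III may instead be associated with the high-level account level, so that data of levels I, II, and III is displayed to high-level users; enterprise public opinion data of sensitivity levels II and III may be associated with the mid-to-high-level account level, so that data of levels II and III is displayed to mid-to-high-level users; and enterprise public opinion data of sensitivity level III may be associated with the low-level account level, so that level III data is displayed to low-level users.
For example, the audit operation interface may also be used to set the permissions of the big data analyst as: adjusting the processing parameters of the system and modifying the system's algorithm programs.
The account permissions of the chief auditor are set as: auditing all data in the system, with permission to configure the display, query, withdrawal of display, and withdrawal of query of all data. The account permissions of a group auditor are: auditing data of a particular category, with permission to configure the display, query, withdrawal of display, and withdrawal of query of data in that category.
The account permissions of high-level users are set as: categorized display of "enterprise public opinion data of sensitivity levels I, II, and III", with permission to read and query that data. The account permissions of mid-to-high-level users are set as: categorized display of "enterprise public opinion data of sensitivity levels II and III", with permission to read and query that data. The account permissions of low-level users are set as: permission to read and query "enterprise public opinion data of sensitivity level III".
Different account levels can be distinguished by user name. For example, the user names of high-level users may begin with the fixed string "superior-user", those of mid-to-high-level users with the fixed string "junior-user", and those of low-level users with the fixed string "user". It should be noted that all of these strings can be set according to the actual situation, and all such variations fall within the protection scope of the present disclosure.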
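The user-name convention above maps directly onto a prefix lookup. The sketch below uses the three prefixes quoted in the text; the level labels (`"high"`, `"mid-high"`, `"low"`), the table name, and the choice to return `None` for an unrecognized user name are assumptions added for illustration.

```python
from typing import Optional

# Prefixes from the text, paired with hypothetical level labels.
PREFIX_TO_LEVEL = (
    ("superior-user", "high"),
    ("junior-user", "mid-high"),
    ("user", "low"),
)


def account_level(username: str) -> Optional[str]:
    """Return the account level implied by the user-name prefix, if any."""
    for prefix, level in PREFIX_TO_LEVEL:
        if username.startswith(prefix):
            return level
    return None  # unknown naming scheme: no level assigned
```

For example, `account_level("superior-user-zhang")` yields the high level, while a name such as `"guest01"` yields no level at all, which a caller could treat as "show nothing".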
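For example, the auditor may also evaluate the risk value corresponding to the enterprise public opinion data, so that prevention and control measures can be formulated in advance according to the risk value, public opinion crises can be stopped in time, and the enterprise losses they cause can be avoided.
For example, the business data integration platform may also analyze and process the enterprise public opinion data to classify and summarize it. For example, the data may be divided into positive public opinion data (data favorable to the promotion of the corporate image), neutral public opinion data (data with no bearing on the corporate image), and negative public opinion data (data unfavorable to the promotion of the corporate image); or it may be divided into categories such as enterprise public opinion data related to core technology and enterprise public opinion data related to corporate image. Different categories of enterprise public opinion data can then be handled separately in subsequent processing, which makes the data more orderly and its display more targeted.
After the association between users' account levels and the sensitivity levels of the enterprise public opinion data is obtained, the business data integration platform can store the enterprise public opinion data and that association in a database server deployed on a single machine, to further improve data reliability. It should be noted that the database server in the present disclosure may be PostgreSQL (an open-source relational database that is powerful and supports many advanced features such as object orientation), which reduces equipment cost and ensures strong concurrent processing capability.
Meanwhile, because the database server is deployed on a single machine, the data display platform can request data from it directly, which solves the technical problem in the related art of weak concurrency caused by deploying only a single distributed big data cluster and improves the concurrent processing capability of the system.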
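One way to picture the stored association is as two relational tables, one for the enterprise public opinion data and one linking account levels to the sensitivity levels they may see. The sketch below is an assumption-laden illustration: it uses Python's built-in SQLite as a stand-in for the PostgreSQL server named in the text, and the table names, column names, and level labels are all hypothetical.

```python
import sqlite3


def create_store():
    """Create an in-memory store; SQLite stands in for the PostgreSQL server."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE opinion (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            sensitivity TEXT NOT NULL
        );
        CREATE TABLE level_access (
            account_level TEXT NOT NULL,
            sensitivity TEXT NOT NULL,
            PRIMARY KEY (account_level, sensitivity)
        );
        """
    )
    return conn


def store(conn, opinions, associations):
    """Persist (title, sensitivity) rows and (account_level, sensitivity) pairs."""
    conn.executemany(
        "INSERT INTO opinion (title, sensitivity) VALUES (?, ?)", opinions)
    conn.executemany(
        "INSERT INTO level_access (account_level, sensitivity) VALUES (?, ?)",
        associations)
    conn.commit()


def fetch_for_level(conn, account_level):
    """Return the titles visible to one account level, in insertion order."""
    rows = conn.execute(
        """
        SELECT o.title
        FROM opinion AS o
        JOIN level_access AS a ON a.sensitivity = o.sensitivity
        WHERE a.account_level = ?
        ORDER BY o.id
        """,
        (account_level,),
    )
    return [row[0] for row in rows]
```

Keeping the association in its own table means an auditor's binding or unbinding of a level is a single row change, and the display query does not need to know which scheme (one level each, or cumulative) was chosen.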
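Continuing to refer to FIG. 1, the data display platform 105 is configured to request the database server according to the account level of the authenticated user to obtain enterprise public opinion data of the target sensitivity level, and to display the enterprise public opinion data of the target sensitivity level to the authenticated user.
The data display platform in the present disclosure may be the iNews news platform. Specifically, refer to FIG. 4, which is a sub-flow diagram of the public opinion data processing system in an exemplary embodiment of the present disclosure. It specifically shows the sub-flow in which the data display platform provides an identity authentication interface to determine the authenticated user, and includes steps S401 to S404. A specific implementation is explained below with reference to FIG. 4.
In step S401, an identity authentication interface is provided.
For example, the data display platform may provide an identity authentication interface. The interface may display a user name and its login password, and may also include an image verification code, a numeric verification code, and so on; these can be set according to the actual situation, and all such variations fall within the protection scope of the present disclosure.
In step S402, the identity information entered by the user to be authenticated in the identity authentication interface is obtained.
The user to be authenticated can then enter their identity information in the above identity authentication interface.
In step S403, the identity information is sent to a third-party authentication platform, so that the third-party authentication platform verifies the legitimacy of the identity information.
The data display platform can send the above identity information to the third-party authentication platform, so that the third-party authentication platform verifies its legitimacy. For example, the third-party authentication platform may refer to a third-party authentication body other than Global Business News (环球商讯网). The purpose of third-party authentication is to verify the true and lawful identity of the Business News (商讯通) applicant, to eliminate false information as far as possible, and to protect the interests of Business News. Specifically, the third-party authentication platform may perform LDAP (Lightweight Directory Access Protocol) authentication on the identity information. LDAP authentication here is an authentication method built with WSS3.0 together with LDAP: the identity information is stored on an LDAP server, and the user to be authenticated is authenticated against the data on that server. After authentication passes, the third-party authentication platform can send an authentication-passed message to the data display platform.
In step S404, upon receiving the authentication-passed message returned by the third-party authentication platform, the user to be authenticated is determined to be an authenticated user.
Upon receiving the authentication-passed message returned by the third-party authentication platform, the data display platform can determine that the user to be authenticated is an authenticated user.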
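The hand-off in steps S402 to S404 can be sketched without committing to any particular LDAP client. In the illustration below the third-party platform is modelled as a callable that answers pass or fail; `make_stub_authenticator` is a hypothetical in-memory stand-in for the LDAP-backed service, and every name here is an assumption rather than part of the disclosed system.

```python
import hmac
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class Identity:
    """Identity information entered in the authentication interface (S402)."""
    username: str
    password: str


# The third-party platform, reduced to "identity in, pass/fail out".
Authenticator = Callable[[Identity], bool]


def make_stub_authenticator(directory: Dict[str, str]) -> Authenticator:
    """Stand-in for the third-party platform; a real one would query an LDAP server."""
    def authenticate(identity: Identity) -> bool:
        stored = directory.get(identity.username)
        if stored is None:
            return False
        # Constant-time comparison of the stored and supplied secrets.
        return hmac.compare_digest(stored.encode(), identity.password.encode())
    return authenticate


def login(identity: Identity, authenticator: Authenticator) -> Optional[str]:
    """S403/S404: forward the identity; return the user name only if it passes."""
    return identity.username if authenticator(identity) else None
```

Injecting the authenticator keeps the data display platform unaware of how the check is performed, which matches the text's separation between the display platform and the third-party authentication platform. The plaintext directory is for illustration only.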
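The data display platform can then request the database server according to the account level of the authenticated user, and display the retrieved enterprise public opinion data to the authenticated user. For example, the corresponding account permissions may be determined from the user name of the authenticated user. For example, when the user name begins with the fixed string "superior-user", the account level of the user may be determined to be high-level user, so the requested enterprise public opinion data of the target sensitivity level may be enterprise public opinion data of sensitivity levels I, II, and III. In this way, enterprise public opinion data of different sensitivity levels can be displayed to users of different account levels, which brings the displayed data closer to users' needs and improves users' efficiency in obtaining information.
For example, refer to FIG. 5, which is a sub-flow diagram of the public opinion data processing system in an exemplary embodiment of the present disclosure. It specifically shows the flow in which the data display platform automatically listens for user feedback, and includes steps S501 to S503. A specific implementation is explained below with reference to FIG. 5.
In step S501, an information display interface is provided, and the enterprise public opinion data of the target sensitivity level is displayed in the information display interface.
For example, the data display platform may provide an information display interface and display in it the enterprise public opinion data of the target sensitivity level that was requested from the database server according to the account level of the authenticated user.
In step S502, the feedback entered by the authenticated user in the information display interface on the enterprise public opinion data of the target sensitivity level is obtained.
The authenticated user can then browse the enterprise public opinion data of the above target sensitivity level and, after browsing, enter feedback on each piece of enterprise public opinion data in the information display interface. For example: displaying X data helps to improve the corporate image; displaying Y data may damage the corporate image.
In step S503, the feedback is written into the database server.
The data display platform can then write the above feedback into the database server.
Accordingly, when the feedback is detected to contain a specified keyword, a prompt message can be sent to the auditor (for example, by showing a pop-up message on the audit operation interface). After receiving the message, the auditor can input second interactive operation information (for example, manually deleting the data) through the audit operation interface to withdraw the display of the relevant enterprise public opinion data. For example, the specified keyword may be a keyword indicating possible losses to the enterprise, such as "damages the corporate image" or "leaks the enterprise's core technology secrets". For example, when the feedback is "the display of Y data may damage the corporate image", the display of Y data by the data display platform can be withdrawn, that is, the data display platform can no longer request the Y data.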
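The feedback loop of steps S502 and S503, together with the keyword check and the auditor's withdrawal, can be sketched as a small in-memory model. The class name, its attributes, and the English keyword phrases are assumptions for illustration; the lists stand in for the database server and for the pop-up prompts on the audit operation interface.

```python
# Hypothetical English renderings of the specified keywords.
SPECIFIED_KEYWORDS = ("damage the corporate image", "leak core technology secrets")


class DisplayPlatform:
    """Minimal model of feedback capture and auditor-driven withdrawal."""

    def __init__(self, displayed_ids, keywords=SPECIFIED_KEYWORDS):
        self.displayed = set(displayed_ids)
        self.keywords = tuple(k.lower() for k in keywords)
        self.feedback_log = []      # stands in for the database server
        self.auditor_prompts = []   # stands in for pop-up prompts to the auditor

    def submit_feedback(self, item_id, feedback):
        """S502/S503: record the feedback; prompt the auditor if it is flagged."""
        self.feedback_log.append((item_id, feedback))
        flagged = any(k in feedback.lower() for k in self.keywords)
        if flagged:
            self.auditor_prompts.append(item_id)
        return flagged

    def withdraw(self, item_id):
        """The auditor's second interactive operation: stop displaying the item."""
        self.displayed.discard(item_id)
```

Note that flagged feedback only prompts the auditor; the item stays visible until `withdraw` is called, which mirrors the text's requirement that a human confirms the withdrawal.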
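For example, refer to FIG. 6, which shows the overall interaction flow of the public opinion data processing system in an exemplary embodiment of the present disclosure. The specific implementation is explained below with reference to FIG. 6.
The network data integration platform 101 is configured to collect data from a preset data resource library and/or call a web crawler to periodically collect data from target websites, so as to obtain network public opinion data; to audit and analyze the collected network public opinion data according to a semantic recognition algorithm and pre-stored sensitive keywords (an automated audit job), so as to obtain the sensitivity level of the network public opinion data; and to send the network public opinion data and its sensitivity level to the big data cluster;
The big data cluster 102 is configured to filter out invalid data and send the filtered network public opinion data to the business data integration platform;
The business data integration platform 103 is configured to select enterprise public opinion data from the filtered network public opinion data; to provide an audit operation interface, and to store the enterprise public opinion data and the association, set by the auditor, between user account levels and the sensitivity levels of the enterprise public opinion data in the database server 104; and to provide an information display interface for users to read and query data.
The data display platform 105 is configured to provide an identity authentication interface and send the identity information of the user to be authenticated to the third-party authentication platform 106, so that the legitimacy of the user identity is verified through the LDAP service of the third-party authentication platform; and to request the database server according to the account level of the authenticated user, so as to obtain enterprise public opinion data of the target sensitivity level and display it to the authenticated user;
and to provide an information display interface, obtain the feedback of the authenticated user, and write the feedback to the database server 104, where, when the feedback corresponding to target public opinion data contains a specified keyword, the display of the target public opinion data by the data display platform is revoked.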
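The present disclosure further provides a public opinion data processing method. For example, refer to FIG. 7, which shows a flow diagram of the public opinion data processing method in an exemplary embodiment of the present disclosure. The method may be executed by a server that processes public opinion data.
Referring to FIG. 7, the public opinion data processing method according to an embodiment of the present disclosure includes the following steps:
Step S710: auditing and analyzing the collected network public opinion data through the network data integration platform to obtain the sensitivity level of the network public opinion data, and sending the network public opinion data and its sensitivity level to the big data cluster;
Step S720: filtering out invalid data in the network public opinion data through the big data cluster, and sending the filtered network public opinion data to the business data integration platform;
Step S730: selecting enterprise public opinion data from the filtered network public opinion data through the business data integration platform, and storing the enterprise public opinion data and the obtained association between user account levels and the sensitivity levels of the enterprise public opinion data in the database server;
Step S740: requesting the database server through the data display platform according to the account level of the authenticated user to obtain enterprise public opinion data of the target sensitivity level, and displaying the enterprise public opinion data of the target sensitivity level to the authenticated user.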
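The four steps compose into a simple staged pipeline, sketched below as a single function. This is only an outline: the `Record` shape, the `level` field, and the injected callables `grade`, `is_valid`, and `is_enterprise` are hypothetical, since the text does not fix how each platform makes its decision, and the storage step of S730 is reduced to an in-memory list.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]


def run_pipeline(
    raw: Iterable[Record],
    grade: Callable[[Record], str],
    is_valid: Callable[[Record], bool],
    is_enterprise: Callable[[Record], bool],
    allowed_levels: Iterable[str],
) -> List[Record]:
    """Chain S710-S740 over in-memory records and return what the user would see."""
    allowed = set(allowed_levels)
    # S710: the network data integration platform attaches a sensitivity level.
    graded = [{**record, "level": grade(record)} for record in raw]
    # S720: the big data cluster drops invalid data.
    valid = [record for record in graded if is_valid(record)]
    # S730: the business data integration platform keeps enterprise-related data.
    enterprise = [record for record in valid if is_enterprise(record)]
    # S740: the data display platform returns only the levels this user may see.
    return [record for record in enterprise if record["level"] in allowed]
```

Passing the decisions in as callables keeps the stage order visible while leaving each platform's rule open, which matches the later remark that the steps may be merged or split.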
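In an exemplary embodiment of the present disclosure, the network data integration platform collects data from a preset data resource library and/or calls a web crawler to periodically collect data from target websites, and performs deduplication and normalization on the collected data to obtain the network public opinion data.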
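The text does not specify how normalization or deduplication is performed, so the following is one plausible reading rather than the disclosed method: normalization is assumed to mean Unicode NFKC folding plus whitespace collapsing, and duplicates are detected by hashing the normalized text. The function names are hypothetical.

```python
import hashlib
import unicodedata
from typing import Iterable, List


def normalize(text: str) -> str:
    """Fold compatibility characters (NFKC) and collapse runs of whitespace."""
    folded = unicodedata.normalize("NFKC", text)
    return " ".join(folded.split())


def dedupe(texts: Iterable[str]) -> List[str]:
    """Return normalized texts with later duplicates removed, preserving order."""
    seen = set()
    unique: List[str] = []
    for text in texts:
        norm = normalize(text)
        key = hashlib.sha256(norm.encode("utf-8")).hexdigest()
        if key in seen:
            continue
        seen.add(key)
        unique.append(norm)
    return unique
```

Normalizing before hashing means items that differ only in spacing or full-width characters collapse to one record; near-duplicate detection (for example, reposts with small edits) would need a different technique.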
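In an exemplary embodiment of the present disclosure, the network data integration platform calls a semantic recognition algorithm to perform semantic recognition on the network public opinion data and obtain a semantic recognition result; determines target keywords in the network public opinion data according to pre-stored sensitive keywords; aggregates the semantic recognition result and the target keywords to obtain the sensitive data contained in the network public opinion data; and determines the sensitivity level of the network public opinion data according to the number of sensitive data items.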
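A count-based grading of this kind might look like the sketch below. Several points are assumptions and are not stated in the text: the semantic recognition result is represented as an already computed set of phrases, the thresholds `(2, 4)` are invented, and a larger count is assumed to map to a higher numeral among I, II, and III.

```python
import bisect
from typing import Iterable, Sequence, Set

LEVELS = ("I", "II", "III")


def keyword_hits(text: str, keywords: Iterable[str]) -> Set[str]:
    """Target keywords: the pre-stored sensitive keywords that occur in the text."""
    return {k for k in keywords if k in text}


def sensitivity_level(
    text: str,
    keywords: Iterable[str],
    semantic_hits: Iterable[str],
    thresholds: Sequence[int] = (2, 4),  # assumed cut-offs: <2 -> I, 2-3 -> II, >=4 -> III
) -> str:
    """Merge keyword and semantic hits, then grade by the number of distinct items."""
    sensitive = keyword_hits(text, keywords) | set(semantic_hits)
    return LEVELS[bisect.bisect_right(thresholds, len(sensitive))]
```

Taking the union of the two hit sets means an item found by both the keyword match and the semantic recognizer is counted once, which is one way to read the "aggregate" step.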
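In an exemplary embodiment of the present disclosure, the business data integration platform provides an audit operation interface, and the association between user account levels and the sensitivity levels of the enterprise public opinion data is determined according to first interactive operation information entered by the auditor on the audit operation interface.
In an exemplary embodiment of the present disclosure, the data display platform provides an identity authentication interface; obtains the identity information entered by the user to be authenticated in the identity authentication interface; sends the identity information to the third-party authentication platform, so that the third-party authentication platform verifies the legitimacy of the identity information; and, upon receiving the authentication-passed message returned by the third-party authentication platform, determines that the user to be authenticated is an authenticated user.
In an exemplary embodiment of the present disclosure, the data display platform provides an information display interface and displays the enterprise public opinion data of the target sensitivity level in it; obtains the feedback on the enterprise public opinion data of the target sensitivity level entered by the authenticated user in the information display interface; and writes the feedback to the database server.
In an exemplary embodiment of the present disclosure, when the business data integration platform detects that the feedback contains a specified keyword, it sends a prompt message to the auditor; provides an audit operation interface; and revokes the display of the enterprise public opinion data of the target sensitivity level according to second interactive operation information entered by the auditor on the audit operation interface.
The specific details of each module in the above public opinion data processing method have been described in detail in the corresponding public opinion data processing system, and are therefore not repeated here.
Based on the above technical solution, on the one hand, the present disclosure can solve the problem in the related art of low efficiency caused by manual data review and improve audit efficiency; it can also reduce the amount of data to be processed, prevent invalid data from affecting subsequent data processing, and improve the efficiency of subsequent data processing. Further, it can solve the technical problem in the related art of weak concurrency caused by deploying only a distributed big data cluster, and improve the concurrent processing capability of the system. On the other hand, it can display enterprise public opinion data of different sensitivity levels to users according to their account levels, providing an intelligent information security control mechanism, so that the displayed data better matches user needs and users obtain information more efficiently.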
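It should be noted that although several modules or units of a device for performing actions are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided and embodied by multiple modules or units.
In addition, although the steps of the method in the present disclosure are described in a particular order in the drawings, this does not require or imply that the steps must be performed in that particular order, or that all the steps shown must be performed to achieve the desired result. Additionally or alternatively, some steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes several instructions to cause a computing device (such as a personal computer, a server, a mobile terminal, or a network device) to perform the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, a computer storage medium capable of implementing the above method is also provided, on which a program product capable of implementing the above method of this specification is stored. In some possible embodiments, aspects of the present disclosure may also be implemented in the form of a program product that includes program code; when the program product runs on a terminal device, the program code causes the terminal device to perform the steps according to the various exemplary embodiments of the present disclosure described in the "Exemplary Methods" section of this specification.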
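Referring to FIG. 8, a program product 800 for implementing the above method according to an embodiment of the present disclosure is described. It may take the form of a portable compact disc read-only memory (CD-ROM), include program code, and run on a terminal device such as a personal computer. However, the program product of the present disclosure is not limited to this. In this document, a readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
The program product may use any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device.
The program code contained on a readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, and the like, or any suitable combination of the above.
The program code for performing the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).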
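In addition, an electronic device capable of implementing the above method is provided in an exemplary embodiment of the present disclosure.
Those skilled in the art will understand that aspects of the present disclosure may be implemented as a system, a method, or a program product. Therefore, aspects of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, and the like), or an embodiment combining hardware and software aspects, which may be collectively referred to here as a "circuit", "module", or "system".
An electronic device 900 according to this embodiment of the present disclosure is described below with reference to FIG. 9. The electronic device 900 shown in FIG. 9 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.
As shown in FIG. 9, the electronic device 900 takes the form of a general-purpose computing device. The components of the electronic device 900 may include, but are not limited to: at least one processing unit 910, at least one storage unit 920, a bus 930 connecting different system components (including the storage unit 920 and the processing unit 910), and a display unit 940.
The storage unit stores program code that can be executed by the processing unit 910, so that the processing unit 910 performs the steps according to the various exemplary embodiments of the present disclosure described in the "Exemplary Methods" section of this specification. For example, the processing unit 910 may perform the steps shown in FIG. 7: step S710, auditing and analyzing the collected network public opinion data through the network data integration platform to obtain the sensitivity level of the network public opinion data, and sending the network public opinion data and its sensitivity level to the big data cluster; step S720, filtering out invalid data in the network public opinion data through the big data cluster, and sending the filtered network public opinion data to the business data integration platform; step S730, selecting enterprise public opinion data from the filtered network public opinion data through the business data integration platform, and storing the enterprise public opinion data and the obtained association between user account levels and the sensitivity levels of the enterprise public opinion data in the database server; step S740, requesting the database server through the data display platform according to the account level of the authenticated user to obtain enterprise public opinion data of the target sensitivity level, and displaying the enterprise public opinion data of the target sensitivity level to the authenticated user.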
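The storage unit 920 may include readable media in the form of volatile storage units, such as a random access storage unit (RAM) 9201 and/or a cache storage unit 9202, and may further include a read-only storage unit (ROM) 9203.
The storage unit 920 may also include a program/utility 9204 having a set of (at least one) program modules 9205. Such program modules 9205 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 930 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 900 may also communicate with one or more external devices 1000 (such as a keyboard, a pointing device, or a Bluetooth device), with one or more devices that enable a user to interact with the electronic device 900, and/or with any device (such as a router or a modem) that enables the electronic device 900 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 950. The electronic device 900 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 960. As shown in the figure, the network adapter 960 communicates with the other modules of the electronic device 900 through the bus 930. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 900, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.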
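From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes several instructions to cause a computing device (such as a personal computer, a server, a terminal apparatus, or a network device) to perform the method according to the embodiments of the present disclosure.
In addition, the above drawings are only schematic illustrations of the processing included in the method according to the exemplary embodiments of the present disclosure, and are not intended to be limiting. It is easy to understand that the processing shown in the drawings does not indicate or limit the chronological order of the processing. It is also easy to understand that the processing may be performed, for example, synchronously or asynchronously in multiple modules.
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed here. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow the general principles of the present disclosure and include common general knowledge or customary technical means in the art that are not disclosed in the present disclosure. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure being indicated by the claims.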

Claims (10)

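  1. A public opinion data processing system, characterized by comprising:
    a network data integration platform, a big data cluster, a business data integration platform, a database server, and a data display platform; wherein
    the network data integration platform is configured to audit and analyze collected network public opinion data to obtain a sensitivity level of the network public opinion data, and to send the network public opinion data and its sensitivity level to the big data cluster; the big data cluster is configured to filter out invalid data in the network public opinion data and to send the filtered network public opinion data to the business data integration platform;
    the business data integration platform is configured to select enterprise public opinion data from the filtered network public opinion data, and to store the enterprise public opinion data and an obtained association between user account levels and sensitivity levels of the enterprise public opinion data in the database server;
    the data display platform is configured to request the database server according to the account level of an authenticated user to obtain enterprise public opinion data of a target sensitivity level, and to display the enterprise public opinion data of the target sensitivity level to the authenticated user.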
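  2. The system according to claim 1, characterized in that the network data integration platform is further configured to:
    collect data from a preset data resource library and/or call a web crawler to periodically collect data from target websites;
    perform deduplication and normalization on the collected data to obtain the network public opinion data.
  3. The system according to claim 2, characterized in that the network data integration platform is further configured to:
    call a semantic recognition algorithm to perform semantic recognition on the network public opinion data and obtain a semantic recognition result;
    determine target keywords in the network public opinion data according to pre-stored sensitive keywords;
    aggregate the semantic recognition result and the target keywords to obtain sensitive data contained in the network public opinion data;
    determine the sensitivity level of the network public opinion data according to the number of sensitive data items.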
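  4. The system according to claim 1, characterized in that the business data integration platform is further configured to:
    provide an audit operation interface;
    determine the association between user account levels and the sensitivity levels of the enterprise public opinion data according to first interactive operation information entered by an auditor on the audit operation interface.
  5. The system according to claim 1, characterized in that the data display platform is further configured to:
    provide an identity authentication interface;
    obtain identity information entered by a user to be authenticated in the identity authentication interface;
    send the identity information to a third-party authentication platform, so that the third-party authentication platform verifies the legitimacy of the identity information;
    upon receiving an authentication-passed message returned by the third-party authentication platform, determine that the user to be authenticated is the authenticated user.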
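  6. The system according to claim 1, characterized in that the data display platform is further configured to:
    provide an information display interface, and display the enterprise public opinion data of the target sensitivity level in the information display interface;
    obtain feedback on the enterprise public opinion data of the target sensitivity level entered by the authenticated user in the information display interface;
    write the feedback to the database server.
  7. The system according to claim 6, characterized in that the business data integration platform is further configured to:
    send a prompt message to the auditor when it is detected that the feedback contains a specified keyword;
    provide an audit operation interface;
    revoke the display of the enterprise public opinion data of the target sensitivity level according to second interactive operation information entered by the auditor on the audit operation interface.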
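  8. A public opinion data processing method, characterized by comprising:
    auditing and analyzing collected network public opinion data through a network data integration platform to obtain a sensitivity level of the network public opinion data, and sending the network public opinion data and its sensitivity level to a big data cluster;
    filtering out invalid data in the network public opinion data through the big data cluster, and sending the filtered network public opinion data to a business data integration platform;
    selecting enterprise public opinion data from the filtered network public opinion data through the business data integration platform, and storing the enterprise public opinion data and an obtained association between user account levels and sensitivity levels of the enterprise public opinion data in a database server;
    requesting the database server through a data display platform according to the account level of an authenticated user to obtain enterprise public opinion data of a target sensitivity level, and displaying the enterprise public opinion data of the target sensitivity level to the authenticated user.
  9. A computer storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the public opinion data processing method according to claim 8.
  10. An electronic device, characterized by comprising:
    a processor; and
    a memory for storing executable instructions of the processor;
    wherein the processor is configured to perform the public opinion data processing method according to claim 8 by executing the executable instructions.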
PCT/CN2021/100424 2020-09-11 2021-06-16 Public opinion data processing system and method, computer storage medium, and electronic device WO2022052546A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/455,738 US20220084051A1 (en) 2020-09-11 2021-11-19 System and method for processing public sentiment, computer storage medium and electronic device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010953021.0A CN114168830A (zh) 2020-09-11 2020-09-11 Public opinion data processing system and method, computer storage medium, and electronic device
CN202010953021.0 2020-09-11

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/455,738 Continuation US20220084051A1 (en) 2020-09-11 2021-11-19 System and method for processing public sentiment, computer storage medium and electronic device

Publications (1)

Publication Number Publication Date
WO2022052546A1 true WO2022052546A1 (zh) 2022-03-17

Family

ID=80475981

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/100424 WO2022052546A1 (zh) 2020-09-11 2021-06-16 Public opinion data processing system and method, computer storage medium, and electronic device

Country Status (2)

Country Link
CN (1) CN114168830A (zh)
WO (1) WO2022052546A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080320010A1 (en) * 2007-05-14 2008-12-25 Microsoft Corporation Sensitive webpage content detection
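CN107908619A (zh) * 2017-11-15 2018-04-13 中国平安人寿保险股份有限公司 Processing method, apparatus, terminal, and computer storage medium based on public opinion monitoring
CN109523118A (zh) * 2018-10-11 2019-03-26 平安科技(深圳)有限公司 Risk data screening method and apparatus, computer device, and storage medium
CN109919777A (zh) * 2019-03-05 2019-06-21 上海金大师网络科技有限公司 Data presentation method, apparatus, device, and storage medium
CN110287313A (zh) * 2019-05-20 2019-09-27 阿里巴巴集团控股有限公司 Method and server for determining a risk subject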

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
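CN114862171A (zh) * 2022-04-24 2022-08-05 支付宝(杭州)信息技术有限公司 Risk assessment method and apparatus for emergency events
CN116995816A (zh) * 2023-09-25 2023-11-03 国网山东省电力公司淄博供电公司 Artificial-intelligence-based power supply data processing platform and method
CN116995816B (zh) * 2023-09-25 2024-02-23 国网山东省电力公司淄博供电公司 Artificial-intelligence-based power supply data processing platform and method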

Also Published As

Publication number Publication date
CN114168830A (zh) 2022-03-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21865605

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21865605

Country of ref document: EP

Kind code of ref document: A1