CN112732781A - Network situation dynamic drawing system and method fusing data quality multi-dimensional evaluation - Google Patents

Publication number
CN112732781A
Authority
CN
China
Prior art keywords
data
network
network situation
analysis
situation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011628462.XA
Other languages
Chinese (zh)
Inventor
曾曦
陈天莹
万力
李霄
曾平
黄金龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Wanglian Anrui Network Technology Co ltd
Original Assignee
Shenzhen Wanglian Anrui Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Wanglian Anrui Network Technology Co ltd
Priority to CN202011628462.XA
Publication of CN112732781A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465Query processing support for facilitating data mining operations in structured databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/215Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/254Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • G06F16/285Clustering or classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/353Clustering; Classification into predefined classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Business, Economics & Management (AREA)
  • Primary Health Care (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a network situation dynamic drawing system and method fusing multi-dimensional data quality evaluation, and relates to the technical field of network space cognition. In the method, a data acquisition range delineating unit analyzes regional characteristics of economy, people's livelihood and politics; a data aggregation unit acquires data by different acquisition means according to the delineated data acquisition range; a data governance unit constructs a data resource catalog to form a data resource pool; a data association analysis mining unit constructs a knowledge graph model and forms a holographic association library of persons, organizations, events and the like; and a network situation sensing unit senses the network situation, dynamically draws a network situation map and monitors the network situation in real time. By combining data quality evaluation with data acquisition, data analysis and mining, intelligence analysis and other dimensions, the invention provides a network situation monitoring system that analyzes the network situation of a specific area, expands the breadth of horizontal analysis and the depth of vertical analysis, and completes dynamic drawing of the network situation.

Description

Network situation dynamic drawing system and method fusing data quality multi-dimensional evaluation
Technical Field
The invention relates to the technical field of network space cognition, and in particular to a network situation dynamic drawing system and method fusing multi-dimensional data quality evaluation.
Background
At present there is no complete system or technical architecture for analyzing the overall network situation. The most closely related technology is network public opinion monitoring. Existing network public opinion monitoring collects massive open source information from the internet in real time by means of internet open source information collection, data processing, data analysis, natural language processing and similar technologies; automatically performs topic classification, content clustering and sentiment analysis according to data characteristics; automatically discovers hot events and topics; and supports users in monitoring and tracking the relevant network public opinion.
The prior art mainly has the following problems:
(1) Existing public opinion monitoring methods mainly target some social networking sites, news media and similar objects: they automatically collect and clean data, then directly perform sentiment analysis, topic detection and intelligence analysis and assessment. Although some of these technologies are relatively mature, the analysis dimensions are incomplete, and a complete network situation analysis and monitoring system that could guide overall network situation analysis and dynamic drawing is lacking.
(2) Existing public opinion monitoring and analysis usually serves target clients with a small scope, so the monitored coverage is incomplete. When large-scale network public opinion monitoring is required, the delineated acquisition range is inconsistent with the range of network activity of netizens in a specific area.
(3) The accuracy of network situation data analysis depends entirely on the quality of the data sources, but existing network public opinion monitoring systems neither evaluate the quality of public opinion data sources nor control data quality, which affects the results of intelligence analysis and assessment.
The difficulty in solving the above technical problems is as follows: existing public opinion analysis technology lacks comprehensive coverage, so a network situation analysis system is difficult to construct; the data quality system for mass data is incomplete, so data quality is difficult to evaluate; and data quality is not combined with the network situation analysis system, so data accuracy cannot be guaranteed and intelligence analysis and assessment cannot be strongly supported.
The significance of solving the above technical problems is as follows: a complete network situation analysis system is constructed, so that the network situation is monitored completely and accurately and intelligence decision-making and assessment are strongly supported; a data quality evaluation system is constructed and fused into the network situation analysis system, so that data quality is evaluated efficiently in real time, the accuracy of the overall network situation data is improved, and inaccurate intelligence assessment results caused by data problems are effectively avoided.
Disclosure of Invention
The invention is based on the cognitive domain and starts from network situation analysis. It addresses the problems that existing network situation analysis systems are incomplete, that the coverage of monitored and analyzed objects is inconsistent with the network activity range of netizens in a specific area, and that the accuracy of network situation data analysis cannot be effectively evaluated. The embodiment of the invention provides a network situation dynamic drawing system and method fusing data quality multi-dimensional evaluation. The technical scheme is as follows:
The network situation dynamic drawing system fusing data quality multi-dimensional evaluation comprises:
a data acquisition range delineating unit, which analyzes regional characteristics of economy, people's livelihood and politics and, in combination with the network behavior of netizens, delineates the data acquisition range covering the main social platforms, news media and opinion-polling institutions used by netizens;
a data aggregation unit, which acquires data by different acquisition means according to the delineated data acquisition range and aggregates the data by data type, the data being obtained through manual editing, open source data and non-cooperative modes;
a data governance unit, which performs basic cleaning and field standardization on the aggregated data, automatically adds data labels, constructs a data resource catalog to form a data resource pool, and forms a high-value database across data access, storage, analysis and use;
a data association analysis mining unit, which, based on the high-value database, forms different topic classifications along the dimensions of persons, organizations and activities, constructs a knowledge graph model, and forms a holographic association library of persons and organizations;
and a network situation sensing unit, which senses the network situation on the basis of data association analysis and mining, dynamically draws a network situation map, and monitors the network situation in real time.
In one embodiment, the data aggregation mode includes file import, database extraction, FTP file access, and streaming data access.
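As an illustration of how several aggregation modes could feed one pool, the following Python sketch treats file import, database extraction and streaming access as interchangeable record iterators. The function names, the CSV input format, the use of SQLite and the _mode tag are assumptions made for this example and are not specified in the patent; FTP file access is left out because it needs a live server.

```python
import csv
import io
import sqlite3
from typing import Iterable, Iterator


def from_file(handle) -> Iterator[dict]:
    """File import: read rows from a CSV file-like object (assumed format)."""
    yield from csv.DictReader(handle)


def from_database(conn: sqlite3.Connection, query: str) -> Iterator[dict]:
    """Database extraction: run a query and yield each row as a dict."""
    cursor = conn.execute(query)
    columns = [description[0] for description in cursor.description]
    for row in cursor:
        yield dict(zip(columns, row))


def from_stream(messages: Iterable[dict]) -> Iterator[dict]:
    """Streaming access: pass messages through as they arrive."""
    yield from messages


def converge(*sources) -> list:
    """Merge (mode, iterator) pairs into one pool, tagging each record
    with the access mode it arrived through."""
    pool = []
    for mode, rows in sources:
        for row in rows:
            pool.append({**row, "_mode": mode})
    return pool
```

Keeping every mode behind the same iterator shape means later stages, such as quality evaluation, do not need to know how a record was obtained.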
In one embodiment, the data governance unit automatically discovers quality problems along nine dimensions (data timeliness, validity, volatility, relevance, consistency, correctness, normalization, uniqueness and integrity) and continuously resolves them to improve the value of data resources.
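The patent names the quality dimensions but gives no formulas. The Python sketch below shows one possible way to score a batch of records on three of the nine dimensions (integrity, timeliness and uniqueness). The record schema, the linear freshness decay and the seven-day window are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

# Assumed record schema; the patent does not define field names.
REQUIRED_FIELDS = ("id", "source", "text", "published_at")


def completeness(record: dict) -> float:
    """Integrity: share of required fields that are present and non-empty."""
    filled = sum(1 for name in REQUIRED_FIELDS
                 if record.get(name) not in (None, ""))
    return filled / len(REQUIRED_FIELDS)


def timeliness(record: dict, now: datetime, max_age: timedelta) -> float:
    """Timeliness: 1.0 for a fresh record, falling linearly to 0.0 at max_age."""
    published = record.get("published_at")
    if not isinstance(published, datetime):
        return 0.0
    age = now - published
    if age < timedelta(0):
        return 0.0  # a timestamp in the future is treated as unusable
    return max(0.0, 1.0 - age / max_age)


def uniqueness(records: list) -> float:
    """Uniqueness: share of distinct ids in the batch."""
    ids = [r.get("id") for r in records]
    return len(set(ids)) / len(ids) if ids else 1.0


def evaluate_batch(records: list, now: datetime,
                   max_age: timedelta = timedelta(days=7)) -> dict:
    """Average the per-record scores into one score per dimension."""
    if not records:
        return {"integrity": 1.0, "timeliness": 1.0, "uniqueness": 1.0}
    n = len(records)
    return {
        "integrity": sum(completeness(r) for r in records) / n,
        "timeliness": sum(timeliness(r, now, max_age) for r in records) / n,
        "uniqueness": uniqueness(records),
    }
```

A dimension whose score drops below a configured threshold could then trigger the logging, alarming and automatic cleaning that the governance steps describe.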
In one embodiment, the network situation sensing unit senses the network situation from the perspectives of comprehensive data analysis, regional public opinion analysis, overseas public opinion analysis, key person and organization analysis, major event analysis and intelligence analysis reports.
Another objective of the present invention is to provide a method implemented with the above network situation dynamic drawing system fusing data quality multi-dimensional evaluation, the method comprising the following steps:
Step one, data acquisition and aggregation: the preset data sources are analyzed in combination with regional characteristics, data are acquired by various acquisition means, and the multi-source data are aggregated onto a data platform, with real-time data quality evaluation carried out during aggregation. This module mainly supports the entire business by providing the underlying data.
Step two, data governance: the timeliness, accuracy and other properties of the data are guaranteed through multi-dimensional data quality evaluation; data validity is ensured through data cleaning, conversion, reduction and similar operations; and the usability and security of the data are ensured through the data resource catalog, data labels and data permission management.
Step three, data service: the data are modeled in layers, different basic libraries and association libraries are constructed, and data services are published to support different data and business requirements.
Step four, dynamic drawing of the network situation: based on a network situation analysis system, multi-dimensional, all-round analysis of the network situation of a specific area is carried out, and the overall network situation is dynamically drawn.
In one embodiment, the data collection and aggregation comprises the steps of:
step one, based on regional characteristics of politics, economy and people's livelihood, delineating authoritative or representative websites as one of the important sources of open source data;
step two, based on analysis of the network behavior of netizens, delineating the main social platforms and news media where netizens are active as the basis of the network situation data sources;
step three, forming network acquisition targets;
step four, acquiring data in real time or on a schedule through web crawlers, system log collection and manual editing, and transmitting the data to a data aggregation platform through a secure data channel;
step five, during data aggregation, making a preliminary judgment of data quality along multiple dimensions such as timeliness, validity and integrity;
step six, based on FTP, streaming and file import modes, classifying the preliminarily judged data by data source and storing them in the data aggregation system;
and step seven, monitoring the data aggregation state in real time to ensure stable data aggregation.
In one embodiment, data governance comprises the steps of:
step one, after the system receives the aggregated data, evaluating them in real time along the dimensions of integrity, accuracy, validity, uniqueness, correctness and timeliness;
step two, logging the data quality judgment results and raising alarms based on the real-time data quality detection results;
step three, automatically cleaning and converting the data according to the problems found by data quality detection;
step four, formulating data standards, and constructing a data resource pool based on the data standards and data assessment;
step five, establishing a data label system in which data labels are classified by level, so that labels are defined at the table level, field level and data level;
step six, constructing a data resource catalog by data type, from the perspectives of data source and data classification;
and step seven, enabling data resource queries over the data resource catalog and data label system, and applying fine-grained control to access permissions for data resources.
In one embodiment, the data service comprises the steps of:
step one, dividing data resources into an original layer, a standard layer, a basic layer and a subject layer, and modeling each layer separately, wherein the original layer mainly stores the originally aggregated data; the standard layer stores the data obtained by cleaning the original layer data; the basic layer holds basic libraries of persons, organizations, events, behaviors and the like formed by fusing and associating standard layer data; and the subject layer holds data extracted and fused for different business applications;
step two, deeply fusing and associating the data in the data resource pool, and mining association relations among the data to form a knowledge graph;
step three, configuring the access permissions and degree of openness of data resources based on the resulting data association libraries and service libraries, to form an open data directory;
and step four, the user applies for a data service based on the open data directory; the system receives the application, generates the data service content based on the data service requirements using data statistics, analysis and mining methods, and publishes the service.
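The four-layer division described in the first of these steps can be pictured as a chain of small transformations. The Python sketch below is a minimal illustration under assumed field names (author, text, platform) and an assumed "active persons" subject. The patent does not prescribe cleaning rules or library schemas.

```python
def to_standard(raw_rows: list) -> list:
    """Standard layer: clean original-layer rows (trim text, drop empties)."""
    cleaned = []
    for row in raw_rows:
        author = (row.get("author") or "").strip()
        text = (row.get("text") or "").strip()
        if author and text:
            cleaned.append({
                "author": author,
                "text": text,
                "platform": row.get("platform", "unknown"),
            })
    return cleaned


def to_basic(standard_rows: list) -> dict:
    """Basic layer: fuse standard rows into a simple person library."""
    people = {}
    for row in standard_rows:
        person = people.setdefault(row["author"],
                                   {"platforms": set(), "posts": 0})
        person["platforms"].add(row["platform"])
        person["posts"] += 1
    return people


def to_subject(people: dict, min_posts: int) -> list:
    """Subject layer: a business-specific extract, here 'active persons'."""
    return sorted(name for name, person in people.items()
                  if person["posts"] >= min_posts)
```

Because each layer is derived only from the one beneath it, the original layer stays untouched and any higher layer can be rebuilt when cleaning rules change.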
In one embodiment, basic resource libraries oriented to different topics are constructed based on the attributes of the data resources and the overall business, including a person library, an organization library, an event library and a behavior library.
In one embodiment, the dynamic drawing of the network situation comprises the following steps:
step one, performing basic statistics on the overall data based on the multi-source collected data, analyzing data changes across different social platforms and news media, and plotting the data change trends;
step two, calculating the overall network situation trend based on a network situation evaluation index system;
step three, predicting the change trend of the overall network situation based on the change patterns of historical data and the overall network situation trend;
step four, performing in-depth analysis of persons based on the person basic library and the knowledge graph, to achieve holographic association of persons and monitor their network behavior in real time;
step five, dynamically tracking persons' network activity, mention volume, support and the like based on data collected from their social networks and news media;
step six, forming basic person files based on the persons' holographic files and network behavior, tracking the persons' related activities and events in real time, and dynamically sensing information about them;
step seven, based on the organization basic library and the knowledge graph, analyzing in depth the basic situation of organizations, mining the degree of association between organizations and persons, and analyzing associations between organizations;
step eight, detecting organizations' related activities and events on news media and social platforms in real time to form situation awareness of the organizations;
step nine, analyzing the overall public opinion trend using machine learning and natural language processing, automatically discovering hot topics, and tracking the topics in real time;
step ten, analyzing the persons, organizations, institutions and the like related to a topic, mining the topic's propagation paths, key accounts and the like, and predicting the topic's future trend;
step eleven, performing in-depth mining of special events, and monitoring and tracking major activities and topics in real time;
step twelve, analyzing and tracking public opinion outside the region based on the person and organization libraries, and mining associations among persons and organizations inside and outside the region, fund flows, network behaviors and the like, to form the public opinion situation outside the region;
step thirteen, combining historical events to form analysis indexes for major events, and constructing an event prediction model from dimensions such as the time and place of occurrence, topic sensitivity and activity scale;
step fourteen, predicting planned events and unknown dangerous events based on the event prediction model;
and step fifteen, dynamically drawing the network situation by combining network situation awareness, person situation awareness, organization situation awareness, regional public opinion situation awareness, event situation awareness and the like.
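Steps one to three above (data change trends, an index-based overall trend, and prediction from historical data) can be illustrated with very simple statistics. The Python sketch below uses a moving average, a weighted composite index and a least-squares straight-line extrapolation. These particular techniques, the indicator names and the weights are assumptions chosen for illustration; the patent does not specify the evaluation index system or the prediction model.

```python
def moving_average(series: list, window: int) -> list:
    """Smooth a daily count series to expose the data change trend."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]


def situation_index(indicators: dict, weights: dict) -> float:
    """Weighted average of normalized indicators (hypothetical index system)."""
    total_weight = sum(weights[name] for name in indicators)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(value * weights[name]
               for name, value in indicators.items()) / total_weight


def linear_forecast(series: list, steps_ahead: int = 1) -> float:
    """Extrapolate a least-squares straight line fitted to the history."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(series))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + steps_ahead)
```

For example, moving_average([1, 2, 3, 4], 2) yields [1.5, 2.5, 3.5]. A production system would likely replace the straight-line fit with a model that handles seasonality and sudden bursts.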
The technical scheme provided by the embodiment of the invention has the following beneficial effects:
1. By combining data quality evaluation with data acquisition, data analysis and mining, intelligence analysis and other dimensions, the invention provides a network situation monitoring system that analyzes the network situation of a specific area, expands the breadth of horizontal analysis and the depth of vertical analysis, and completes dynamic drawing of the network situation.
2. The invention introduces data quality evaluation into the network situation analysis system, which greatly improves the accuracy, validity and usability of the data and thereby ensures the accuracy of the overall network situation analysis results.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a diagram of a network situation sensing system and architecture for specific areas according to the present invention.
FIG. 2 is a flow chart of data collection and aggregation provided by the present invention.
FIG. 3 is a flow chart of data governance provided by the present invention.
FIG. 4 is a flow chart of data services provided by the present invention.
FIG. 5 is a flow chart of dynamic network situation mapping according to the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein, but rather should be construed as broadly as the present invention is capable of modification in various respects, all without departing from the spirit and scope of the present invention.
It will be understood that when an element is referred to as being "secured to" another element, it can be directly on the other element or intervening elements may also be present. When an element is referred to as being "connected" to another element, it can be directly connected to the other element or intervening elements may also be present. As used herein, the terms "vertical," "horizontal," "left," "right," and the like are for purposes of illustration only and are not intended to represent the only embodiments.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
By combining data quality evaluation with data acquisition, data analysis and mining, intelligence analysis and other dimensions, the invention provides a network situation monitoring system that analyzes the network situation of a specific area, expands the breadth of horizontal analysis and the depth of vertical analysis, and completes dynamic drawing of the network situation.
In view of the shortcomings of existing network situation analysis systems and technologies for specific areas, the invention provides a network situation dynamic drawing system and method oriented to a specific area.
A network situation dynamic drawing system fusing data quality multi-dimensional evaluation is disclosed. The system is implemented in terms of data acquisition range delineation, data aggregation, data governance, data association analysis and mining, network situation sensing, and intelligence-assisted decision making.
The data acquisition range delineating unit analyzes regional characteristics of a specific area, such as economy, people's livelihood and politics, and, in combination with the network behavior of netizens, delineates the data acquisition range covering the main social platforms, news media, opinion-polling institutions and the like where netizens are active.
The data aggregation unit acquires data by different acquisition means according to the delineated data acquisition range and aggregates the data by data type. The data are obtained through manual editing, open source data and non-cooperative modes, and the aggregation modes include file import, database extraction, FTP file access, streaming data access and the like.
The data governance unit performs basic cleaning and field standardization on the aggregated data, automatically adds data labels, constructs a data resource catalog and forms a data resource pool. Across the links of data access, storage, analysis and use, it automatically discovers quality problems along nine dimensions (data timeliness, validity, volatility, relevance, consistency, correctness, normalization, uniqueness and integrity) and continuously resolves them, improving the value of data resources and forming a high-value database for the specific area.
Based on the high-value database for the specific area, the data association analysis mining unit forms different topic classifications along dimensions such as persons, organizations and activities, and constructs a knowledge graph model to form a holographic association library of persons, organizations and the like.
On the basis of data association analysis and mining, the network situation sensing unit senses the network situation from the perspectives of comprehensive data analysis, regional public opinion analysis, overseas public opinion analysis, key person and organization analysis, major event analysis and intelligence analysis reports, dynamically draws a network situation map, and monitors the network situation in real time.
A network situation dynamic drawing method fusing data quality multi-dimensional evaluation comprises the following steps: data acquisition and aggregation, data governance, data service, and dynamic drawing of the network situation.
The specific flow of data acquisition and aggregation is as follows:
1. based on regional characteristics of a specific area, such as politics, economy and people's livelihood, delineating authoritative or representative websites for the specific area as one of the important sources of open source data;
2. based on analysis of the network behavior of netizens in the specific area, delineating the main social platforms and news media where those netizens are active as the basis of the network situation data sources;
3. forming network acquisition targets for the specific area;
4. acquiring data in real time or on a schedule through web crawlers, system log collection, manual editing and other means, and transmitting the data to the data aggregation platform through a secure data channel;
5. during data aggregation, making a preliminary judgment of data quality along multiple dimensions such as timeliness, validity and integrity;
6. based on FTP, streaming, file import and other modes, classifying the preliminarily judged data by data source and storing them in the data aggregation system;
7. monitoring the data aggregation state in real time to ensure stable data aggregation.
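Items 5 and 6 of this flow (a preliminary quality judgment during aggregation, followed by classified storage by data source) might look like the following Python sketch. The record fields, the one-day freshness limit and the set of accepted channels are hypothetical; the patent states only the dimensions and the classification by source.

```python
from collections import defaultdict
from datetime import datetime, timedelta

ACCEPTED_CHANNELS = {"ftp", "stream", "file"}  # assumed channel names


def preliminary_check(record: dict, now: datetime, max_age: timedelta) -> list:
    """Return the names of the quality dimensions this record fails."""
    problems = []
    if not record.get("text") or not record.get("source"):
        problems.append("integrity")
    collected = record.get("collected_at")
    if not isinstance(collected, datetime) or now - collected > max_age:
        problems.append("timeliness")
    if record.get("channel") not in ACCEPTED_CHANNELS:
        problems.append("validity")
    return problems


def aggregate(records: list, now: datetime,
              max_age: timedelta = timedelta(days=1)):
    """Store passing records grouped by source; set failing ones aside."""
    store = defaultdict(list)
    rejected = []
    for record in records:
        problems = preliminary_check(record, now, max_age)
        if problems:
            rejected.append((record, problems))
        else:
            store[record["source"]].append(record)
    return dict(store), rejected
```

Keeping the rejected records together with their failed dimensions gives the later governance stage something concrete to log, alarm on and clean.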
The main flow of data governance is as follows:
1. after the system receives the aggregated data, evaluating them in real time along dimensions such as integrity, accuracy, validity, uniqueness, correctness and timeliness;
2. logging the data quality judgment results and raising alarms based on the real-time data quality detection results;
3. automatically cleaning and converting the data according to the problems found by data quality detection;
4. formulating data standards, and constructing a data resource pool based on the data standards and data assessment;
5. establishing a data label system in which data labels are classified by level, so that labels are defined at the table level, field level and data level;
6. constructing a data resource catalog by data type, from the perspectives of data source and data classification;
7. enabling data resource queries over the data resource catalog and data label system, and applying fine-grained control to access permissions for data resources.
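Items 5 to 7 of this flow combine a leveled label system, a catalog organized by source and classification, and permission-controlled queries. The Python sketch below shows one way these pieces could fit together. The class names, the role-based permission model and the omission of data-level (per-row) labels are simplifying assumptions, not details from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    source: str                     # catalog axis: data source
    category: str                   # catalog axis: data classification
    table_labels: set = field(default_factory=set)    # table-level labels
    field_labels: dict = field(default_factory=dict)  # field name -> labels
    allowed_roles: set = field(default_factory=set)   # who may see it


class Catalog:
    def __init__(self):
        self._resources = []

    def register(self, resource: Resource) -> None:
        self._resources.append(resource)

    def query(self, role: str, label=None, source=None) -> list:
        """Names of resources visible to the role that match the filters."""
        hits = []
        for res in self._resources:
            if role not in res.allowed_roles:
                continue  # permission check is applied before any filter
            if source is not None and res.source != source:
                continue
            if label is not None:
                labels = set(res.table_labels)
                for field_set in res.field_labels.values():
                    labels |= set(field_set)
                if label not in labels:
                    continue
            hits.append(res.name)
        return hits
```

Checking permissions before filtering means a caller cannot infer that a restricted resource exists from the labels it carries.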
The data service flow is specifically as follows:
1. dividing the data resources into an original layer, a standard layer, a basic layer, and a subject layer, and performing layered modeling for each layer; and constructing basic resource libraries oriented to different subjects based on the attributes of the data resources and the overall business, the basic resource libraries including a character library, an organization library, an event library, a behavior library, and the like.
2. performing deep fusion and association of the data in the data resource pool, and mining association relations among the data to form a knowledge graph.
3. configuring the access rights and degree of openness of the data resources based on the resulting data association library and the various service libraries, to form an open data catalog.
4. the user applies for a data service based on the open data catalog; the system receives the application, generates the data service content according to the data service requirements using data statistics, analysis, and mining methods, and publishes the service.
5. the user obtains the data service through interface calls, file downloads, and similar means.
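As a rough illustration of the association mining and the open catalog in the data service flow, the sketch below links entities that appear in the same record and filters catalog entries by an access level. Co-occurrence counting is a simple stand-in chosen here; the patent does not say how associations are mined. The record layout, the min_level field, and both function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_association_graph(records):
    """Link entities that co-occur in a record; edge weight is the co-occurrence count."""
    graph = defaultdict(lambda: defaultdict(int))
    for rec in records:
        # Each entity is a (library, name) pair, e.g. ("character", "P1").
        entities = sorted(set(rec["entities"]))
        for a, b in combinations(entities, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def open_catalog(catalog, user_level):
    """Return the catalog entries whose minimum access level the user meets."""
    return [name for name, entry in catalog.items()
            if entry["min_level"] <= user_level]
```

A caller would build the graph once from the resource pool and then expose only the catalog entries returned by open_catalog for a given user.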
The dynamic drawing process of the network situation is as follows:
1. performing basic statistics on the overall data based on multi-source collected data, analyzing data changes of different social platforms and news media, and drawing data change trends;
2. calculating the overall network situation trend based on a network situation evaluation index system;
3. predicting how the overall network situation will change based on the patterns in the historical data and the overall network situation trend;
4. performing in-depth analysis of characters based on the character base library and the knowledge graph, to achieve holographic association of characters and monitor their network behavior in real time;
5. dynamically tracking characters' network activity, voice volume, support, and the like, based on data collected from their social networks and from news media;
6. forming a basic character profile from the character's holographic profile and network behavior, tracking the character's related activities and events in real time, and dynamically perceiving information about the character;
7. based on the organization base library and the knowledge graph, analyzing the basic situation of each organization in depth, mining the degree of association between organizations and characters, and analyzing the associations between organizations;
8. detecting the organization's related activities and events on news media and social platforms in real time, to form situation perception of the organization;
9. analyzing the overall public opinion trend using machine learning and natural language processing, automatically discovering hot topics, and tracking the topics in real time;
10. analyzing the characters, organizations, institutions, and the like related to a topic, mining the topic's propagation path, key accounts, and the like, and predicting the topic's future trend;
11. carrying out in-depth mining of special events, and monitoring and tracking major activities and topics in real time;
12. based on the character and organization libraries, analyzing and tracking overseas public opinion concerning the specific region, and mining the associations between characters and organizations inside and outside the region, fund flows, network behaviors, and the like, to form the overseas public opinion situation;
13. combining historical events to form analysis indexes for major events, and constructing an event prediction model;
14. predicting planned events and unknown dangerous events based on the event prediction model;
15. dynamically drawing the network situation by combining network situation perception, character situation perception, organization situation perception, regional public opinion situation perception, event situation perception, and the like.
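The statistics, trend prediction, and hot-topic steps above can be sketched in a few lines. The patent names machine learning and natural language processing but no concrete algorithm, so this example swaps in two deliberately simple techniques: a least-squares line for extrapolating a daily count series, and a burst-ratio heuristic for flagging hot topics. All function names, the post layout, and the min_ratio and min_count defaults are assumptions.

```python
from collections import Counter


def daily_counts(posts):
    """Count posts per (platform, day) as the basic statistics of step 1."""
    return Counter((p["platform"], p["day"]) for p in posts)


def linear_forecast(series, steps=1):
    """Fit a least-squares line to an evenly spaced series and extrapolate it."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(series))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [intercept + slope * (n - 1 + k) for k in range(1, steps + 1)]


def hot_topics(today, baseline, min_ratio=3.0, min_count=10):
    """Flag topics whose count today is at least min_ratio times their baseline."""
    hot = []
    for topic, count in today.items():
        base = baseline.get(topic, 0)
        # max(base, 1) keeps brand-new topics from dividing the threshold to zero.
        if count >= min_count and count >= min_ratio * max(base, 1):
            hot.append(topic)
    return sorted(hot)
```

A real deployment would likely replace both heuristics with trained models; the point here is only to show how per-platform counts feed a trend estimate and a topic alert.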
When dynamically drawing the network situation for a specific area, the overall system of the invention serves as guidance for building a complete network situation analysis framework. Implementing each technical step within that framework according to the steps described in the invention can effectively improve overall data quality, broaden the breadth and depth of network situation analysis, and provide strong technical support for information analysis and assessment.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure should be limited only by the attached claims.

Claims (10)

1. A network situation dynamic drawing system fusing data quality multidimensional evaluation is characterized by comprising:
a data acquisition range defining unit, which analyzes regional economic, livelihood, and political characteristics in combination with the network behavior of netizens, and defines the data acquisition range covering the netizens' main social platforms, news media, and polling institutions;
a data aggregation unit, which acquires data by different data acquisition means according to the defined data acquisition range and aggregates the data by data type, the data being obtained through manual editing, open source data, and non-cooperative means;
a data governance unit, which performs basic cleaning and field standardization on the aggregated data, automatically adds data labels, constructs a data resource catalog, forms a data resource pool, and forms a high-value database through data access, storage, analysis, and use;
a data association analysis and mining unit, which forms different theme classifications along the dimensions of characters, organizations, and activities based on the high-value database, constructs a knowledge graph model, and forms a holographic association library of characters and organizations;
and a network situation perception unit, which perceives the network situation on the basis of the data association analysis and mining, dynamically draws a network situation map, and monitors the network situation in real time.
2. The dynamic network situation drawing system fusing the data quality multi-dimensional evaluation as claimed in claim 1, wherein the data aggregation mode includes file import, database extraction, FTP file access and stream data access.
3. The dynamic network situation drawing system fusing the data quality multi-dimensional evaluation according to claim 1, wherein the data governance unit automatically discovers quality problems along nine dimensions of the data (timeliness, validity, volatility, relevance, consistency, correctness, normalization, uniqueness, and integrity), continuously resolves those problems, and thereby improves the value of the data resources.
4. The dynamic network situation drawing system fusing the data quality multi-dimensional evaluation according to claim 1, wherein the network situation perception unit perceives the network situation from the perspectives of comprehensive data analysis, regional public opinion analysis, overseas public opinion analysis, important character and organization analysis, major event analysis, and information analysis reports.
5. A method for implementing the dynamic network situation drawing system fusing the data quality multi-dimensional evaluation according to any one of claims 1 to 4, wherein the dynamic network situation drawing method fusing the data quality multi-dimensional evaluation comprises the following steps:
step one, data acquisition and aggregation; analyzing preset data sources in combination with regional characteristics, acquiring data by various data acquisition means, and aggregating the multi-source data to a data platform; carrying out real-time data quality evaluation during aggregation;
step two, data governance; guaranteeing the timeliness and accuracy of the data through multi-dimensional evaluation of data quality; ensuring data validity through data cleaning, conversion, and reduction; ensuring the usability and security of the data through the data resource catalog, data labels, and data permission management;
step three, data service; carrying out layered modeling of the data, constructing different basic libraries and association libraries, and publishing data services to support different data requirements and business requirements;
and step four, dynamically drawing the network situation; realizing multi-dimensional, all-round analysis of the network situation for the specific area based on a network situation analysis system, and dynamically drawing the overall network situation.
6. The dynamic network situation drawing method fusing the data quality multi-dimensional evaluation as claimed in claim 5, wherein the data collection and aggregation comprises the following steps:
step one, based on the regional characteristics of politics, economy and civil life, an authoritative website or a representative website is defined as one of important sources of open source data;
step two, based on the network behavior analysis of the netizens, a main social platform and news media of the netizens network activities are defined to be used as the basis of the network situation basic data source;
step three, forming the network acquisition targets;
step four, acquiring data in real time or on a schedule through web crawlers, system log collection, and manual editing, and transmitting the data to a data aggregation platform through a secure data channel;
step five, during data aggregation, making a preliminary judgment of data quality along the dimensions of timeliness, validity, and integrity of the data;
step six, classifying the preliminarily judged data by data source and storing it in a data aggregation system via FTP, streaming, and file import modes;
and step seven, monitoring the data aggregation state in real time to ensure the stability of data aggregation.
7. The dynamic network situation drawing method fusing the data quality multi-dimensional evaluation as claimed in claim 5, wherein the data governance comprises the following steps:
step one, after the system receives the data, evaluating the aggregated data in real time along the dimensions of integrity, accuracy, validity, uniqueness, correctness, and timeliness;
step two, logging the data quality judgment results and raising alarms based on the real-time data quality detection results;
step three, automatically cleaning and converting the data based on the problems found during data quality detection;
step four, formulating data standards, and constructing a data resource pool based on the data standards and on data analysis and assessment;
step five, establishing a data label system that classifies labels by level, with label definitions at the table level, field level, and data level;
step six, constructing a data resource catalog organized by data source and data classification according to the types of the data;
and step seven, supporting data resource queries through the data resource catalog and the data label system, with fine-grained control over access rights to the data resources.
8. The dynamic network situation drawing method fusing the data quality multi-dimensional evaluation according to claim 5, wherein the data service comprises the following steps:
step one, dividing the data resources into an original layer, a standard layer, a basic layer, and a subject layer, and performing layered modeling for each layer;
step two, performing deep fusion and association of the data in the data resource pool, and mining association relations among the data to form a knowledge graph;
step three, configuring the access rights and degree of openness of the data resources based on the resulting data association library and the service libraries, to form an open data catalog;
and step four, the user applies for a data service based on the open data catalog, and the system receives the application, generates the data service content according to the data service requirements using data statistics, analysis, and mining methods, and publishes the service.
9. The dynamic network situation drawing method fusing the data quality multi-dimensional evaluation according to claim 8, wherein basic resource libraries oriented to different subjects are constructed based on the attributes of the data resources and the overall business, the basic resource libraries including a character library, an organization library, an event library, and a behavior library.
10. The dynamic network situation drawing method fusing the data quality multi-dimensional evaluation according to claim 5, characterized in that the dynamic network situation drawing method comprises the following steps:
step one, performing basic statistics on the overall data based on the multi-source collected data, analyzing data changes across different social platforms and news media, and plotting data change trends;
step two, calculating the overall network situation trend based on a network situation evaluation index system;
step three, predicting how the overall network situation will change based on the patterns in the historical data and the overall network situation trend;
step four, performing in-depth analysis of characters based on the character base library and the knowledge graph, to achieve holographic association of characters and monitor their network behavior in real time;
step five, dynamically tracking characters' network activity, voice volume, and support based on data collected from their social networks and from news media;
step six, forming a basic character profile from the character's holographic profile and network behavior, tracking the character's related activities and events in real time, and dynamically perceiving information about the character;
step seven, based on the organization base library and the knowledge graph, analyzing the basic situation of each organization in depth, mining the degree of association between organizations and characters, and analyzing the associations between organizations;
step eight, detecting the organization's related activities and events on news media and social platforms in real time, to form situation perception of the organization;
step nine, analyzing the overall public opinion trend using machine learning and natural language processing, automatically discovering hot topics, and tracking the topics in real time;
step ten, analyzing the characters, organizations, and institutions related to a topic, mining the topic's propagation path and key accounts, and predicting the topic's future trend;
step eleven, carrying out in-depth mining of special events, and monitoring and tracking major activities and topics in real time;
step twelve, based on the character and organization libraries, analyzing and tracking overseas public opinion, and mining the associations between characters and organizations inside and outside the region, fund flows, and network behaviors, to form the overseas public opinion situation;
step thirteen, combining historical events to form analysis indexes for major events, and constructing an event prediction model from the dimensions of the event's occurrence time, occurrence place, topic sensitivity, and activity scale;
step fourteen, predicting planned events and unknown dangerous events based on the event prediction model;
and step fifteen, dynamically drawing the network situation by combining network situation perception, character situation perception, organization situation perception, regional public opinion situation perception, and event situation perception.
CN202011628462.XA 2020-12-30 2020-12-30 Network situation dynamic drawing system and method fusing data quality multi-dimensional evaluation Pending CN112732781A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011628462.XA CN112732781A (en) 2020-12-30 2020-12-30 Network situation dynamic drawing system and method fusing data quality multi-dimensional evaluation

Publications (1)

Publication Number Publication Date
CN112732781A true CN112732781A (en) 2021-04-30

Family

ID=75609904

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011628462.XA Pending CN112732781A (en) 2020-12-30 2020-12-30 Network situation dynamic drawing system and method fusing data quality multi-dimensional evaluation

Country Status (1)

Country Link
CN (1) CN112732781A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113220837A (en) * 2021-05-12 2021-08-06 深圳市网联安瑞网络科技有限公司 Network space behavior monitoring and analyzing method and system of entity activity participator
CN115098784A (en) * 2022-07-18 2022-09-23 李圣刚 Data mining method and data mining system
CN115688044A (en) * 2022-08-25 2023-02-03 航天神舟智慧系统技术有限公司 Multi-dimensional fusion method and system for holographic archive
CN115907144A (en) * 2022-11-21 2023-04-04 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Event prediction method and device, terminal equipment and storage medium
CN117540936A (en) * 2024-01-09 2024-02-09 天津市大数据管理中心 Multi-source data processing method and system for primary social management
CN117991108A (en) * 2024-04-07 2024-05-07 邦邦汽车销售服务(北京)有限公司 Data-based power battery damage detection method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110866126A (en) * 2019-11-22 2020-03-06 福建工程学院 College online public opinion risk assessment method
CN111461553A (en) * 2020-04-02 2020-07-28 上饶市中科院云计算中心大数据研究院 System and method for monitoring and analyzing public sentiment in scenic spot
US20200334777A1 (en) * 2018-11-21 2020-10-22 Beijing Yutian Technology Co. Ltd Intelligent emergency decision support system for emergency communication

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination