CN114398442B - Information processing system based on data driving - Google Patents


Publication number
CN114398442B
Authority
CN
China
Prior art keywords
data
module
management
analysis
storage
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210083509.1A
Other languages
Chinese (zh)
Other versions
CN114398442A (en)
Inventor
成磊峰
薛丽惠
王平
胡辉
刘刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 10 Research Institute
Original Assignee
CETC 10 Research Institute
Application filed by CETC 10 Research Institute
Priority to CN202210083509.1A
Publication of CN114398442A
Application granted
Publication of CN114398442B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/282 Hierarchical databases, e.g. IMS, LDAP data stores or Lotus Notes
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides an information processing system based on data driving, comprising: a data acquisition module, which is connected with a plurality of data sources and performs data extraction and data cleaning; a data storage module, which stores the cleaned data in a corresponding repository; a data management module, which interfaces with the data storage module and performs full life cycle management and data management on the stored data; a data service module, which provides data search, subscription distribution, and general target services based on the management results of the data management module; a data calculation module, which comprises a data message bus, a real-time processing engine, and an offline analysis engine; a business application module, which calls the other modules to process and analyze the acquired data; and a knowledge base module, which constructs corresponding knowledge bases from the analysis results. The information processing system improves data flow and sharing capability, knowledge construction and service capability, and the capability to add value to historical data, and forms a data application system that conforms to the service mode.

Description

Information processing system based on data driving
Technical Field
The application relates to the field of information reconnaissance, in particular to an information processing system based on data driving.
Background
Big data is becoming a new focal point for improving application capability. The rapid expansion of sensor deployments has caused a surge in reconnaissance data, so capability and efficiency urgently need to be strengthened through data aggregation, fusion, analysis, and mining. The existing information operating mode, equipment system, and basic support, which were established around the content-information production mode, can hardly meet current information production requirements. The equipment system therefore needs to be planned by innovatively applying leading-edge technologies such as big data, cloud computing, and artificial intelligence, so that a data-driven technical reconnaissance system is gradually built, information processing technology is renewed, and information support capability is improved.
Disclosure of Invention
To address the shortcomings of information processing practice under big data conditions, the application studies hierarchical and classified data flow, knowledge base management, and a unified information processing framework, and optimizes the existing information processing flow. By integrating the processing capability of active information systems with the analysis capability of big data technology, it provides an information processing system based on data driving. The system adopts key technologies such as a unified computing framework, a data flow processing strategy, automatic knowledge updating, and a data-driven processing mode, which raises the level of intelligence and automation of information processing and provides technical support for improving the efficiency of actual information production.
The technical scheme adopted by the application is as follows: an information processing system based on data driving includes:
the data acquisition module, which is connected with a plurality of data sources and performs data extraction and data cleaning;
the data storage module, which stores the cleaned data in a corresponding repository;
the data management module, which interfaces with the data storage module and performs full life cycle management and data management on the stored data;
the data service module, which provides data search, subscription distribution, and general target services based on the management results of the data management module;
the data calculation module, which comprises a data message bus, a real-time processing engine, and an offline analysis engine, wherein the data message bus provides a high-speed data transmission mechanism for the whole system, the real-time processing engine provides a real-time data stream processing and analysis environment for large-scale real-time data stream applications, and the offline analysis engine provides a processing and analysis environment for PB-level offline cross-domain data, large-scale graph computation, and an artificial intelligence processing engine;
the business application module, which comprises an information analysis module, a signal analysis module, and a password decoding module, and which processes and analyzes the acquired data through the data storage module, the data management module, the data service module, and the data calculation module;
and the knowledge base module, which constructs an information knowledge base, a signal target knowledge base, and a password knowledge base from the respective analysis results of the information analysis module, the signal analysis module, and the password decoding module.
Further, the data cleaning comprises data deduplication, outlier rejection and format conversion.
Further, the data storage module comprises a primary data warehouse, a secondary data mart, and a data integration function;
the primary data warehouse stores data and provides unified storage management and cross-region access for the accessed data resources;
the secondary data mart provides a data fusion storage space for general-target-oriented data and thematic data;
and the data integration function realizes unified data exchange among a plurality of data sources and the integration of heterogeneous data sources.
Further, the data management module comprises a metadata management module, a standard management module, a data management sub-module, and a data management and control module;
the metadata management module models, catalogs, registers, and publishes global metadata to form a unified metadata view;
the standard management module is used for entering and revising data standards;
the data management sub-module is used for quality management of data resources;
and the data management and control module is used for operation and maintenance management of databases and file systems and for monitoring data ingestion, storage, and access information.
Further, the data service module comprises a data search module, a subscription distribution module, and a general target service module;
the data search module queries data in the database according to the behavior and query statements of the user;
the subscription distribution module establishes a data resource catalog organized by service direction, typical combat task, and professional field, and provides data query, browsing, subscription, publishing, and downloading according to the catalog;
and the general target service module, taking the target as the center, performs dynamic modeling of general targets, stores the all-element data, association relations, and labels of general targets, and provides access to general target data.
Further, the data calculation module interfaces with the data storage module, performs computation push-down optimization and hybrid computation across heterogeneous data stores, provides unified computation dispatch and a unified data service interface, and pushes computation results to the data message bus.
Further, the information analysis module, based on data resources, big data technology, and intelligent algorithms, constructs a unified situation data processing and exchange environment in which multiple units share a common view and make joint judgments, and realizes mining, utilization, and value-added fusion of multi-source situation data.
Further, the signal analysis module constructs a target knowledge base after labeling, cleaning, converting, and deduplicating the acquired signal data; the target knowledge base comprises a reconnaissance acquisition base, a signal feature base, a target feature base, an electromagnetic environment base, a signal sample base, a signal analysis knowledge base, and a signal analysis software base.
Further, the password decoding module performs big data cleaning, mapping, warehousing, and association analysis on the password resource data to construct a dynamically changing password target indication base, which comprises a password equipment information base, a password user information base, and a specific password information base.
Compared with the prior art, the beneficial effects of adopting the technical scheme are as follows:
1) Data flow and sharing capability is greatly improved. A data bus running through the whole information processing flow is established and supports subscription and distribution by various applications. This replaces the traditional mode in which data flows in one direction, step by step, and improves data exchange efficiency and data sharing capability.
2) Knowledge construction and service capability is effectively improved. A target knowledge base is built through crowd contribution, efficient machine processing, expert review and refinement, and evaluation-driven application. This promotes the joint construction, interconnection, and sharing of knowledge resources, supports actual information production and assessment to the greatest possible extent, and supports target reconnaissance, information research, and the like.
3) The capability to add value to historical data is fully improved. Based on a unified programming framework for big data processing, technologies such as big data analysis and artificial intelligence are applied to cross-match incremental data against the full historical data ("collision" analysis), which uncovers the information value of the historical data and resolves both the contradiction between information overload and information scarcity and the contradiction between manpower investment and benefit growth.
4) A data application system conforming to the service mode is formed. Data construction, data flow, data application, and data management improve the capability and efficiency of information processing, enable rapid flow processing of small data and efficient analysis of big data, and meet the data processing requirements of the actual duty mode.
Drawings
Fig. 1 is a schematic diagram of an information processing system based on data driving according to the present application.
FIG. 2 is a flow chart of data processing in an embodiment of the application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar modules or modules having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the application. On the contrary, the embodiments of the application include all alternatives, modifications and equivalents as may be included within the spirit and scope of the appended claims.
As shown in FIG. 1, this embodiment proposes an information processing system based on data driving. Building on data-driven information processing technology, the embodiment studies a unified information processing framework, knowledge management, and a data flow strategy under big data conditions; constructs a unified information processing framework based on a knowledge base; addresses the key technologies of automatic iterative knowledge updating, target-based information processing, and activity-based information processing; and improves information processing efficiency by taking data as the driver. The system specifically comprises:
the data acquisition module, which is connected with a plurality of data sources and performs data extraction and data cleaning;
the data storage module, which stores the cleaned data in a corresponding repository;
the data management module, which interfaces with the data storage module and performs full life cycle management and data management on the stored data;
the data service module, which provides data search, subscription distribution, and general target services based on the management results of the data management module;
the data calculation module, which comprises a data message bus, a real-time processing engine, and an offline analysis engine, wherein the data message bus provides a high-speed data transmission mechanism for the whole system, the real-time processing engine provides a real-time data stream processing and analysis environment for large-scale real-time data stream applications, and the offline analysis engine provides a processing and analysis environment for PB-level offline cross-domain data, large-scale graph computation, and an artificial intelligence processing engine;
the business application module, which comprises an information analysis module, a signal analysis module, and a password decoding module, and which processes and analyzes the acquired data through the data storage module, the data management module, the data service module, and the data calculation module;
and the knowledge base module, which constructs an information knowledge base, a signal target knowledge base, and a password knowledge base from the respective analysis results of the information analysis module, the signal analysis module, and the password decoding module.
Specifically, the data acquisition module supports multi-source data access based on the computing resources of the distributed environment. After the various data sources are connected, data is extracted in incremental or full mode and cleaned, and the cleaned data is transmitted to the data storage module.
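By way of non-limiting illustration, the difference between full and incremental extraction may be sketched as follows. The application does not specify how increments are detected; the watermark on a hypothetical `updated_at` field, and the names `Record`, `extract`, and `next_watermark`, are assumptions made only for this sketch.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Record:
    record_id: str
    updated_at: int  # hypothetical, monotonically increasing timestamp
    payload: str


def extract(source: Iterable[Record], watermark: Optional[int] = None) -> list:
    """Full extraction when no watermark is given; otherwise only newer records."""
    if watermark is None:
        return list(source)
    return [r for r in source if r.updated_at > watermark]


def next_watermark(records: list, previous: Optional[int]) -> Optional[int]:
    """Advance the watermark to the newest timestamp seen so far."""
    if not records:
        return previous
    newest = max(r.updated_at for r in records)
    return newest if previous is None else max(previous, newest)


source = [Record("a", 10, "x"), Record("b", 20, "y"), Record("c", 30, "z")]
full = extract(source)                # first run: full mode
wm = next_watermark(full, None)
source.append(Record("d", 40, "w"))   # new data arrives at the source
delta = extract(source, wm)           # later run: incremental mode
```

In this sketch the second run transfers only the record that arrived after the first run, which is the practical benefit of incremental mode for large sources.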
In this embodiment, the data sources include a sensor aggregation system, an active system, and third-party data. The data cleaning comprises processing such as data deduplication, outlier rejection, and format conversion.
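As a minimal sketch only, the three cleaning steps named above may be chained as shown below. The application does not state which outlier criterion is used; the median-absolute-deviation test, the field names `id` and `freq_mhz`, and the target unit are assumptions chosen for illustration.

```python
import statistics


def deduplicate(rows, key):
    """Keep the first row seen for each key value."""
    seen, unique = set(), []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def reject_outliers(rows, field, max_dev=5.0):
    """Drop rows lying more than max_dev median absolute deviations from the median."""
    values = [row[field] for row in rows]
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return list(rows)
    return [row for row in rows if abs(row[field] - med) / mad <= max_dev]


def convert_format(rows):
    """Normalize to a common schema (assumed here: frequency in Hz)."""
    return [{"id": r["id"], "freq_hz": float(r["freq_mhz"]) * 1e6} for r in rows]


raw = [
    {"id": "s1", "freq_mhz": 100.0},
    {"id": "s1", "freq_mhz": 100.0},   # duplicate
    {"id": "s2", "freq_mhz": 101.0},
    {"id": "s3", "freq_mhz": 99.0},
    {"id": "s4", "freq_mhz": 100.5},
    {"id": "s5", "freq_mhz": 5000.0},  # outlier
]
cleaned = convert_format(reject_outliers(deduplicate(raw, "id"), "freq_mhz"))
```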
In the data storage module, a suitable repository is selected for the data obtained by the data acquisition module according to information such as data characteristics, computation mode, access mode, and data scale. Multi-level, multi-type storage satisfies diverse data organization tasks and addresses specific business application problems; the choice among several different data stores allows flexible adaptation to different business requirements and facilitates data analysis.
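The selection of a repository from data characteristics may be illustrated, purely as an assumption-laden sketch, by a small routing function. The store names are taken from the storage types listed in this description; the profile fields and the decision order are hypothetical and are not disclosed by the application.

```python
from dataclasses import dataclass
from enum import Enum


class Store(Enum):
    STRUCTURED_ONLINE = "structured online storage"
    STRUCTURED_OFFLINE = "structured offline storage"
    UNSTRUCTURED = "unstructured data storage"
    GRAPH = "graph database"


@dataclass
class DatasetProfile:
    structured: bool          # data characteristics
    relationship_heavy: bool  # computation mode (relationship analysis)
    low_latency_access: bool  # access mode


def choose_store(profile: DatasetProfile) -> Store:
    """Hypothetical routing rules; a real deployment would also weigh data scale."""
    if not profile.structured:
        return Store.UNSTRUCTURED
    if profile.relationship_heavy:
        return Store.GRAPH
    if profile.low_latency_access:
        return Store.STRUCTURED_ONLINE
    return Store.STRUCTURED_OFFLINE


selected = choose_store(
    DatasetProfile(structured=True, relationship_heavy=False, low_latency_access=True)
)
```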
Specifically, a primary data warehouse, a secondary data mart and a data integration function are arranged in the data storage module.
The primary data warehouse defines and complies with the primary-warehouse storage requirement specification, provides all-element, large-scale unified storage management and cross-region access for the accessed data resources, and realizes global sharing of heterogeneous data across centers and units based on metadata management. It mainly comprises structured online storage, structured offline storage, unstructured data storage, unified storage management, and cross-domain data management.
The secondary data mart provides a data fusion storage space for the production of general-target-oriented and thematic data, and provides rich, specialized, and efficient storage management and permission-based access for upper-layer general target analysis and thematic retrieval. Based on metadata management, it realizes the aggregation and fusion of heterogeneous data across industries, means, and systems. It mainly comprises an offline data development and management platform, an offline database, an MPP database, a full library, a graph database, and a cloud database module.
The data integration function supports format conversion for various data sources to realize data exchange; supports unified data exchange among various structured and unstructured data sources; and supports the integration of various heterogeneous data sources, including database data, data files, and real-time data streams. It mainly comprises an execution framework, operation monitoring, scheduling, configuration management, and an intelligent interface module.
Built on the data storage module, the data management module is oriented to full life cycle management of data and provides big data management and data management capabilities. It comprises a metadata management module, a standard management module, a data management sub-module, and a data management and control module.
Specifically, the metadata management module models, catalogs, registers, and publishes global metadata to form a unified metadata view, and comprises basic metadata management, modeling management, and resource sharing management functions. Basic metadata management, modeling management, and resource sharing management are mature technologies in the industry and are not described here.
The standard management module provides a basic environment for entering and revising data standards, and provides a standard knowledge base and standard compliance checking, including standard creation, standard maintenance, standard management, and standard detection functions. Standard creation, standard maintenance, standard management, and standard detection are mature technologies in the industry and are not described here.
The data management sub-module implements quality management of data resources, guarantees the integrity, accuracy, consistency, and timeliness of the data, and evaluates the data assets in terms of ingestion, storage, quality, exchange, and the like. It comprises audit rule management, audit task management, audit problem management, quality analysis reporting, data asset statistics, and data asset query functions. These functions are mature technologies in the industry and are not described in detail here.
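Although audit rule management is treated above as mature technology, a brief hypothetical sketch may help fix ideas: each audit rule is a predicate over a record, and an audit task applies every rule and collects the failures for a quality analysis report. The rule names, field names, and report layout below are assumptions.

```python
from typing import Callable, Dict, List

Rule = Callable[[dict], bool]


def completeness(required: List[str]) -> Rule:
    """Integrity rule: every required field is present and non-empty."""
    return lambda rec: all(rec.get(f) not in (None, "") for f in required)


def in_range(field: str, lo: float, hi: float) -> Rule:
    """Accuracy rule: the field value lies in a plausible range."""
    return lambda rec: rec.get(field) is not None and lo <= rec[field] <= hi


def audit(records: List[dict], rules: Dict[str, Rule]) -> dict:
    """Run every rule on every record and summarize passes and failures."""
    report = {name: {"passed": 0, "failed_ids": []} for name in rules}
    for rec in records:
        for name, rule in rules.items():
            if rule(rec):
                report[name]["passed"] += 1
            else:
                report[name]["failed_ids"].append(rec.get("id"))
    return report


records = [
    {"id": "r1", "source": "sensor", "bearing_deg": 45.0},
    {"id": "r2", "source": "", "bearing_deg": 10.0},
    {"id": "r3", "source": "third-party", "bearing_deg": 400.0},
]
rules = {
    "completeness": completeness(["id", "source"]),
    "accuracy": in_range("bearing_deg", 0.0, 360.0),
}
report = audit(records, rules)
```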
The data management and control module provides operation and maintenance management of, for example, database state and file system state, and provides management, maintenance, and monitoring displays for data ingestion, storage, access, and the like. It mainly comprises an operation and maintenance information reporting module and a storage management monitoring module. Operation and maintenance information reporting and storage management monitoring are mature technologies in the industry and are not described here.
The data service module is built mainly on the data management module. Based on the data management results, it makes data visible, accessible, understandable, interactive, and open to feedback, and directly provides comprehensive data services to users of all types and at all levels, so that users can call data efficiently, obtain data accurately, mine data in depth, and maintain data in a timely manner. It also provides a data search service that supports big data collaborative analysis, deep mining, and various applications. The module mainly comprises a data search module, a subscription distribution module, and a general target service module.
The data search module mainly provides secondary search and unified data query and retrieval according to the behavior and query statements of the user. It supports queries over multiple heterogeneous databases, realizes multi-table conditional retrieval across those databases, supports browsing and permission-based downloading of retrieval results, recommends data of interest to users in a timely manner according to their behavior, and supports user feedback on the recommended data.
The subscription distribution module organizes and establishes a data resource catalog according to service direction, typical combat task, and professional field, and provides catalog management, updating, and publishing functions. Based on the catalog, it also provides permission-based query, browsing, subscription, publishing, and downloading. The subscriber defines data screening rules, and the data provider pushes data resources accurately according to those rules.
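The interplay of catalog entries, subscriber-defined screening rules, and accurate pushing may be sketched as below. This is an in-memory illustration under stated assumptions: the class `SubscriptionHub`, the catalog entry name, and the item fields are hypothetical, and permission checks are omitted.

```python
from collections import defaultdict


class SubscriptionHub:
    """Pushes each published item only to subscribers whose rule accepts it."""

    def __init__(self):
        self._subs = defaultdict(list)     # catalog entry -> [(subscriber, rule)]
        self.inboxes = defaultdict(list)   # subscriber -> delivered items

    def subscribe(self, catalog_entry, subscriber, rule):
        """Register a subscriber together with its data screening rule."""
        self._subs[catalog_entry].append((subscriber, rule))

    def publish(self, catalog_entry, item):
        """Deliver the item to matching subscribers and return their names."""
        delivered = []
        for subscriber, rule in self._subs[catalog_entry]:
            if rule(item):
                self.inboxes[subscriber].append(item)
                delivered.append(subscriber)
        return delivered


hub = SubscriptionHub()
hub.subscribe("maritime", "analyst_a", lambda item: item["priority"] >= 2)
hub.subscribe("maritime", "analyst_b", lambda item: item["region"] == "north")
delivered = hub.publish("maritime", {"priority": 3, "region": "south"})
```

Here only `analyst_a` receives the item, because the screening rule of `analyst_b` rejects the region.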
The general target service module performs dynamic ontology modeling, unified cataloging, storage, and access for general targets, based on target-centered data organization requirements. It is responsible for storing the all-element data, association relations, and labels of general targets, supports analysis of general target data, and provides general target data access, full-dimensional display of general targets, and general target service management to meet service requirements.
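A target-centered data organization of this kind may be pictured, as a hypothetical sketch only, as a record per target holding open-ended attributes, labels, and association relations. The classes `Target` and `TargetCatalog` and all identifiers below are illustrative assumptions; the open attribute dictionary stands in for the dynamic modeling, since new attributes can be added without changing a fixed schema.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    target_id: str
    attributes: dict = field(default_factory=dict)  # all-element data
    labels: set = field(default_factory=set)
    relations: list = field(default_factory=list)   # (relation, other_target_id)


class TargetCatalog:
    def __init__(self):
        self._targets = {}

    def upsert(self, target_id, **attributes):
        """Create the target if needed and merge in new attributes."""
        target = self._targets.setdefault(target_id, Target(target_id))
        target.attributes.update(attributes)
        return target

    def relate(self, source_id, relation, dest_id):
        """Record an association relation between two targets."""
        self.upsert(source_id).relations.append((relation, dest_id))
        self.upsert(dest_id)

    def get(self, target_id):
        return self._targets[target_id]

    def by_label(self, label):
        return sorted(t.target_id for t in self._targets.values() if label in t.labels)


catalog = TargetCatalog()
catalog.upsert("T-001", kind="vessel").labels.add("priority")
catalog.upsert("T-001", length_m=120)
catalog.relate("T-001", "escorted_by", "T-002")
```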
The data calculation module interfaces with the data storage module and is responsible for computation push-down optimization and hybrid computation across heterogeneous data stores. It supports conventional processing, big data analysis, and artificial intelligence processing, for both time-sensitive business processing and non-time-sensitive business analysis and research. It also interfaces with the information duty system, provides unified computation dispatch and a unified data service interface, pushes processing results to the data message bus, and participates in the subsequent iterative computation of incremental data. The module comprises three subsystems: a data message bus, a real-time processing engine, and an offline analysis engine.
The data message bus provides a high-speed data transmission mechanism to meet the efficient data flow requirements of all systems, realizes flexible and reliable data interaction, decouples data exchange between services and between systems, and supports data flow control.
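The decoupling and flow control attributed to the message bus may be illustrated with a single-process sketch built on bounded queues from the Python standard library. This is not the bus of the application, which would be a distributed component; the class `MessageBus`, the topic name, and the refuse-when-full policy are assumptions.

```python
import queue


class MessageBus:
    """Topic-based bus; a full topic refuses new messages (simple flow control)."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._topics = {}

    def _topic(self, name):
        if name not in self._topics:
            self._topics[name] = queue.Queue(maxsize=self._capacity)
        return self._topics[name]

    def publish(self, topic, message):
        """Return True if accepted, False if the producer must back off."""
        try:
            self._topic(topic).put_nowait(message)
            return True
        except queue.Full:
            return False

    def poll(self, topic):
        """Return the oldest pending message, or None if the topic is empty."""
        try:
            return self._topic(topic).get_nowait()
        except queue.Empty:
            return None


bus = MessageBus(capacity=2)
results = [bus.publish("tracks", n) for n in (1, 2, 3)]  # third message is refused
first = bus.poll("tracks")                               # consumer frees one slot
retry = bus.publish("tracks", 3)                         # producer retries
```

Producers and consumers never reference each other, only the topic, which is the decoupling the paragraph above describes.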
The real-time processing engine serves applications on large-scale real-time data streams, such as statistical analysis, monitoring and early warning, and real-time detection of key targets. It provides a low-latency, high-throughput environment for processing and analyzing real-time stream data, and supports horizontal scaling of computing capacity, on-demand online expansion and contraction of resources, SQL-like queries over streams, multi-stream association, complex event processing, and recovery of stream tasks after failures.
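Monitoring and early warning on a stream typically reduces to evaluating a condition over a moving time window. The following single-machine sketch raises an alert when the number of events inside a sliding window exceeds a threshold; the class name, window length, and threshold are hypothetical, and a production engine would distribute this state across nodes.

```python
from collections import deque


class SlidingWindowMonitor:
    """Alerts when more than `threshold` events fall within the last `window` seconds."""

    def __init__(self, window, threshold):
        self.window = window
        self.threshold = threshold
        self._events = deque()

    def observe(self, timestamp):
        """Record one event and report whether the alert condition holds."""
        self._events.append(timestamp)
        # Evict events that have slid out of the window.
        while self._events and self._events[0] <= timestamp - self.window:
            self._events.popleft()
        return len(self._events) > self.threshold


monitor = SlidingWindowMonitor(window=10, threshold=2)
alerts = [monitor.observe(t) for t in (0, 1, 2, 20)]
```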
The offline analysis engine addresses the complex processing and analysis requirements of massive offline cross-domain data. It:
provides a processing and analysis environment for PB-level offline cross-domain data, supporting horizontal scaling of computing capacity, on-demand online expansion and contraction of resources, iterative computation of complex processing logic, batch task management, and scheduling that is transparent to the user;
provides large-scale graph computation, namely a graph computation framework for large-scale graph-structured data, which addresses the large data volumes and highly complex relationships encountered when analyzing target relationships; and
provides an artificial intelligence processing engine, which integrates mainstream machine learning and deep learning frameworks to meet the needs of intelligent analysis and mining of massive, multi-source, heterogeneous big data, offers users a graphical business modeling tool, and helps data analysts discover implicit knowledge and rules in massive data, thereby improving intelligent data support and decision-support capability.
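To make the graph computation item concrete, one elementary target-relationship analysis is grouping targets that are linked, directly or indirectly, by observed relations. The sketch below finds such groups with a breadth-first search on a single machine; it illustrates the kind of question asked of the graph, not the distributed framework itself, and the target identifiers are hypothetical.

```python
from collections import defaultdict, deque


def connected_groups(edges):
    """Return sorted groups of targets connected by any chain of relations."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        pending, group = deque([start]), []
        while pending:
            node = pending.popleft()
            group.append(node)
            for neighbour in sorted(adjacency[node]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    pending.append(neighbour)
        groups.append(sorted(group))
    return groups


groups = connected_groups([("T1", "T2"), ("T2", "T3"), ("T4", "T5")])
```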
The business application module obtains data from the various data sources through the data acquisition module, performs data preprocessing and metadata extraction, and thereby provides data input for information processing. Through the data storage module, the data management module, the data service module, and the data calculation module, it realizes full-flow service capability covering data storage, fusion processing, and application, and supports the information analysis module, the signal analysis module, and the password decoding module.
The information analysis module takes the organization of situation data and value-added mining as its main line. It makes full use of various data resources, big data technology, and intelligent algorithms to improve information processing and analysis capability; forms a unified situation data processing and exchange environment in which multiple units share a common view and make joint judgments; supports multiple users in carrying out both rough and fine processing of the data; realizes mining, utilization, and value-added fusion of multi-source situation data; and improves reconnaissance and monitoring capability, with sea and air targets as the main objects.
The signal analysis module applies data preprocessing such as labeling, cleaning, conversion, and deduplication to the acquired signal data and constructs a signal target knowledge base comprising, for example, a reconnaissance acquisition base, a signal feature base, a target feature base, an electromagnetic environment base, a signal sample base, a signal analysis knowledge base, and a signal analysis software base.
The password decoding module performs near-real-time big data cleaning, mapping, warehousing, association analysis, and similar processing on the returned and aggregated password resource data to form a dynamically changing password target knowledge base comprising, for example, a password equipment information base, a password user information base, and a specific password information base, thereby realizing comprehensive processing of password resources and the generation of situation support data.
It should be noted that the information processing system of this embodiment further includes a knowledge base. Existing business summaries, expert experience, and rule features are organized into business specifications and stored in the knowledge base, which supports rapid collection, verification, and management of information knowledge data so that users can conveniently search, obtain, associate, apply, and give feedback in a timely manner. The knowledge base also supports quality inspection and evaluation of data, the formulation of rules, and the review and approval of target data and related knowledge submitted by users, thereby ensuring that the content is current, authoritative, and accurate.
Specifically, the knowledge base provides an automatic translation function based on a proprietary word bank, a rule-based automatic target element extraction function, a target parameter error correction function, and a data update, maintenance, and evaluation feedback function.
Automatic translation function based on a proprietary word bank: in the early stage, a word bank of frequently used proper nouns is entered and maintained manually. A translation model is trained by supervised learning on the already translated corpus related to target parameters, and the trained model is then used to translate untranslated documents automatically.
Rule-based automatic extraction of target elements: some text data of the same type expresses target information in a deterministic way, with a format and order that are relatively fixed or that repeat regularly, so heuristic rules are used to extract the important words and phrases from the document. The heuristic rules are derived mainly from data sources, formats, special symbols, structural characteristics, and the like.
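As an illustration of heuristic rules keyed on fixed labels, special symbols, and structure, the following Python sketch applies a small table of regular expressions to a report line. The rule table `RULES`, the field names, and the sample report are all hypothetical; the patent does not disclose its actual rules.

```python
import re

# Hypothetical rule table: each rule pairs a field name with a regex keyed on
# a fixed label, a special symbol, or a structural cue in the report text.
RULES = {
    "name": re.compile(r"Name:\s*(?P<value>[^;\n]+)"),
    "position": re.compile(
        r"(?P<value>\d{1,2}\.\d+[NS]\s*,\s*\d{1,3}\.\d+[EW])"
    ),
    "speed_kn": re.compile(r"Speed:\s*(?P<value>\d+(?:\.\d+)?)\s*kn"),
}


def extract_elements(text):
    """Return the first match of every rule that fires on the text."""
    found = {}
    for field_name, pattern in RULES.items():
        match = pattern.search(text)
        if match:
            found[field_name] = match.group("value").strip()
    return found


report = "Name: Vessel A; Position: 21.50N, 118.25E; Speed: 12.5 kn"
print(extract_elements(report))
```

Rules of this kind only work where the format is stable, which matches the condition stated above; free-form text would need a different technique.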
Target parameter error correction function: target parameters from different sources are statistically analyzed, and the process by which values are accepted as credible is checked manually. Rules are formed using data visualization, analogy, deduction, and authority assignment, and these rules guide automatic error correction and revision of the target parameters. This addresses the problems of scattered and mutually contradictory target information, in particular performance indexes that differ widely enough to affect application, that have not been strictly reviewed, and that have low reliability.
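One simple way to picture authority assignment is weighted voting across sources. The Python sketch below is an assumption-laden illustration, not the patented method: the source names, the weights in `AUTHORITY`, and the function `resolve_parameter` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical authority weights per source; higher means more trusted.
AUTHORITY = {"official_registry": 3.0, "field_report": 2.0, "open_web": 1.0}


def resolve_parameter(observations, authority=AUTHORITY, default_weight=0.5):
    """Pick the value with the largest summed authority weight.

    observations: list of (source, value) pairs for one target parameter.
    Returns (best_value, confidence), where confidence is the winning
    value's share of the total weight.
    """
    scores = defaultdict(float)
    for source, value in observations:
        scores[value] += authority.get(source, default_weight)
    if not scores:
        raise ValueError("no observations to resolve")
    best = max(scores, key=scores.get)
    return best, scores[best] / sum(scores.values())


# Conflicting reports of one parameter (for example, a maximum speed in knots).
reports = [
    ("official_registry", 32),
    ("open_web", 30),
    ("field_report", 32),
    ("open_web", 30),
]
value, confidence = resolve_parameter(reports)
print(value, round(confidence, 3))
```

A low confidence share could be used to route the parameter to an expert for manual review instead of correcting it automatically.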
Data update, maintenance, and evaluation feedback function: the basic workflow for managing and maintaining knowledge data is as follows. The knowledge base administrator issues data ingestion and maintenance tasks to the relevant user nodes (alternatively, each user node submits a task application to the administrator based on the actual value of sharing the knowledge data or on actual changes to the data). After receiving a task instruction, each user node organizes, ingests, and maintains the data. Domain experts review the newly entered data and decide whether to accept it. Knowledge data can be tailored and exported according to the processing needs of users or front ends, and evaluation feedback is provided after the knowledge has been applied.
In the information processing system provided in this embodiment, as shown in fig. 2, after data is received from a data source it is first reported for shared use and then mined and processed, depending on the information elements actually obtained, which ensures that time-sensitive data circulates and is used quickly. For example, when a signal is intercepted, the time-sensitive data can be reported directly so that time-sensitive intelligence is shared promptly; the signal then undergoes a series of specific analyses, after which the processed intelligence is compiled and reported as a whole. With unified support from the basic data, the data flow is optimized around processing nodes such as discovery, positioning, identification, and evaluation. With the support of the data message bus, information services such as early warning, target indication, target situation, notification, and query services are pushed by level and by category. This enables mechanisms for verification, guidance, and common viewing; for sharing data, situation, resources, and knowledge; and for hierarchical, classified, and layered support.
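The report-first, analyze-later flow described above can be sketched in a few lines of Python. The `MessageBus` class is an in-memory stand-in invented for this example, and the `time_sensitive` flag and `analyse` function are likewise hypothetical; only the topic names echo the service categories listed in the text.

```python
from dataclasses import dataclass, field


@dataclass
class MessageBus:
    """In-memory stand-in for the data message bus (hypothetical)."""
    published: list = field(default_factory=list)

    def publish(self, topic, payload):
        self.published.append((topic, payload))


def analyse(record):
    """Placeholder for the later, slower analysis of a record."""
    return {**record, "analysed": True}


def handle(record, bus):
    # Time-sensitive records are shared immediately, before any analysis.
    if record.get("time_sensitive"):
        bus.publish("early_warning", record)
    # Every record is then analysed and reported as a consolidated product.
    bus.publish("target_situation", analyse(record))


bus = MessageBus()
handle({"id": 1, "time_sensitive": True}, bus)
handle({"id": 2, "time_sensitive": False}, bus)
print([topic for topic, _ in bus.published])
```

In a real deployment the immediate report and the analysis would normally run on separate paths so that slow analysis never delays the time-sensitive push; the sequential calls here only show the ordering.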
At present, intelligence duty personnel can follow only a limited number of targets, which falls well short of the requirement to keep close and continuous watch on ships. To address this, the information processing system provided by this embodiment applies target-based and event-based big data intelligence processing technology, which helps perceive data changes quickly and improves target reconnaissance efficiency. Under the guidance of historical data and the knowledge base, it studies multi-source association and collision of incremental data, supplements the elements used in intelligence analysis, and verifies capabilities such as target identification, discovery of important targets, detection of changes in ship deployment, detection of situation anomalies, and event prediction, forming a data-driven reconnaissance mode and intelligence processing mode.
It should be noted that, in the description of the embodiments of the present application, unless explicitly specified and limited otherwise, the terms "disposed," "coupled," and "connected" should be interpreted broadly. A connection may be, for example, fixed, detachable, or integral, and it may be direct or made indirectly through an intermediate medium. Those skilled in the art will understand the specific meaning of these terms in the present application according to the context. The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the principles of the application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
While embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the application, and that changes, modifications, substitutions, and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.

Claims (6)

1. An information processing system based on data driving, comprising:
the data acquisition module is connected with a plurality of data sources and used for data extraction and data cleaning;
the data storage module stores the data after data cleaning into a corresponding storage library;
the data management module interfaces with the data storage module and is used for executing full life cycle management and data management on the stored data;
the data service module is used for providing data searching, subscription distribution and general target service for the management result of the data management module;
the data computing module comprises a data message bus, a real-time processing engine and an offline analysis engine, wherein the data message bus runs through every module to provide a high-speed data transmission mechanism for the whole system, the real-time processing engine provides a real-time data stream processing and analysis environment for applications facing large-scale real-time data streams, and the offline analysis engine provides a processing and analysis environment for PB-level offline cross-domain data and includes large-scale graph computation and artificial intelligence processing engines;
the business application module comprises an information analysis module, a signal analysis module and a password decoding module, and the collected data is processed and analyzed through the data storage module, the data management module, the data service module and the data calculation module;
the knowledge base module correspondingly constructs an information knowledge base, a signal target knowledge base and a password knowledge base according to the analysis results of the information analysis module, the signal analysis module and the password decoding module on the data;
the data storage module comprises a primary data warehouse, a secondary data mart and data integration;
the primary data warehouse stores data in accordance with a storage requirement specification formulated for the primary data warehouse, provides full-element, large-scale unified storage management and cross-region access for the accessed data resources, and realizes cross-center and cross-unit global sharing of heterogeneous data based on metadata management;
the secondary data mart provides a fused data storage space for general-target-oriented and thematic data production, provides rich, professional and efficient storage management and authorized access for upper-layer general target analysis and thematic retrieval, and realizes cross-industry, cross-means and cross-system convergence and fusion of heterogeneous data based on metadata management;
data integration, supporting format conversion of various data sources, and realizing data exchange; the unified data exchange of various structured and unstructured data sources is supported; supporting the integration of various heterogeneous data sources, including database data, data files and real-time data streams;
the information analysis module takes the organization of situation data and the mining of added value from it as a main line, makes full use of various data resources, big data technology and intelligent algorithms, improves information processing and analysis capability, forms a unified situation data processing and exchange environment for multi-unit common viewing and common assessment, supports multiple users in performing coarse processing and fine processing of data, and realizes mining, utilization and fused enrichment of multi-source situation data;
the signal analysis module is used for marking, cleaning, converting and deduplicating the acquired signal data, and then constructing a target knowledge base, wherein the target knowledge base comprises a reconnaissance acquisition base, a signal characteristic base, a target characteristic base, an electromagnetic environment base, a signal sample base, a signal analysis knowledge base and a signal analysis software base.
2. The data-driven information processing system of claim 1, wherein the data cleaning includes data deduplication, outlier rejection, and format conversion.
3. The data-driven information processing system of claim 1, wherein the data management module comprises a metadata management module, a standard management module, a data governance module, and a data management and control module;
the metadata management module is used for modeling, cataloging, registering and publishing global metadata to form a unified metadata view;
the standard management module is used for entering and revising data standards;
the data governance module is used for quality management of the data resources;
and the data management and control module is used for operation and maintenance management of the database and the file system, and for monitoring data ingestion, storage and access information.
4. The data-driven information processing system of claim 1, wherein the data service module comprises a data search module, a subscription distribution module, and a general target service module;
the data search module is used for querying the data in the database according to the behavior of the user and the query statement;
the subscription distribution module is used for establishing a data resource catalog organized by service direction, typical combat task and professional field, and for providing data query, browsing, subscription, publishing and downloading according to the catalog;
and the general target service module takes the target as the center, performs dynamic modeling of general targets and storage of their all-element data, association relations and labels, and provides access to the general target data.
5. The data-driven information processing system of claim 1, wherein the data computing module interfaces with the data storage module to perform computation push-down optimization and heterogeneous data storage misclassification, and at the same time provides unified data computation and publishing and a unified data service interface, pushing the computation results to the data message bus.
6. The data-driven information processing system of claim 1, wherein the password decoding module performs big data cleaning, mapping, warehousing and association analysis on the password resource data to construct a dynamically changing password target indication library, including a password equipment information library, a password user information library and a specific password information library.
CN202210083509.1A 2022-01-25 2022-01-25 Information processing system based on data driving Active CN114398442B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210083509.1A CN114398442B (en) 2022-01-25 2022-01-25 Information processing system based on data driving

Publications (2)

Publication Number Publication Date
CN114398442A CN114398442A (en) 2022-04-26
CN114398442B true CN114398442B (en) 2023-09-19

Family

ID=81232117

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210083509.1A Active CN114398442B (en) 2022-01-25 2022-01-25 Information processing system based on data driving

Country Status (1)

Country Link
CN (1) CN114398442B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115134421B (en) * 2022-05-10 2024-02-20 北京市遥感信息研究所 Multi-source heterogeneous data cross-system collaborative management system and method
CN116450620B (en) * 2023-06-12 2023-09-12 中国科学院空天信息创新研究院 Database design method and system for multi-source multi-domain space-time reference data
CN116775665B (en) * 2023-08-24 2023-10-27 云南省交通投资建设集团有限公司 Full-automatic task release system based on daily operation and maintenance management of expressway

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1853180A (en) * 2003-02-14 2006-10-25 尼维纳公司 System and method for semantic knowledge retrieval, management, capture, sharing, discovery, delivery and presentation
CN102916803A (en) * 2012-10-30 2013-02-06 山东省计算中心 File implicit transfer method based on public switched telephone network
CN106982188A (en) * 2016-01-15 2017-07-25 阿里巴巴集团控股有限公司 The detection method and device in malicious dissemination source
CN108009300A (en) * 2017-12-28 2018-05-08 中译语通科技(青岛)有限公司 A kind of novel maintenance system based on big data technology
CN111707993A (en) * 2020-06-11 2020-09-25 北京理工大学 Radar anti-interference quick decision-making system and method sharing migratable multi-scene characteristics
CN112199433A (en) * 2020-10-28 2021-01-08 云赛智联股份有限公司 Data management system for city-level data middling station
CN112269173A (en) * 2020-12-21 2021-01-26 中国电子科技集团公司第二十八研究所 Method for fusing one-dimensional image signals of multi-platform radar
CN112738016A (en) * 2020-11-16 2021-04-30 中国南方电网有限责任公司 Intelligent security event correlation analysis system for threat scene
CN113360599A (en) * 2021-05-18 2021-09-07 苏州海赛人工智能有限公司 Multi-source heterogeneous information convergence cooperative processing platform based on content identification
CN113361933A (en) * 2021-06-08 2021-09-07 南京联成科技发展股份有限公司 Centralized management and control center for cross-enterprise collaboration

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3220367A1 (en) * 2016-03-14 2017-09-20 Tata Consultancy Services Limited System and method for sound based surveillance


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
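Madmom: A new Python audio and music signal processing library; Jan Schluter et al.; Proceedings of the 24th ACM International Conference on Multimedia; 1174-1178 *
Wang Hua et al. Research on the overall design scheme of the PetroChina data warehouse (中国石油数据仓库总体设计方案研究). 《石油规划设计》. 2005, (5), pp. 2-3. *
Wang Zhen et al. Research on battlefield network intelligence reconnaissance (战场网络情报侦察研究). 《舰船电子工程》. 2009, Vol. 29 (6), p. 2. *
Guo Jing et al. Research on the framework of an intelligent intelligence acquisition system (智能情报获取系统框架研究). 《军民两用技术与产品》. 2020, (8), pp. 2-5. *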

Also Published As

Publication number Publication date
CN114398442A (en) 2022-04-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant