CN107193858A - Intelligent service application platform and method for multi-source heterogeneous data fusion - Google Patents

Intelligent service application platform and method for multi-source heterogeneous data fusion

Info

Publication number
CN107193858A
Authority
CN
China
Prior art keywords
data
semantic
source heterogeneous
source
heterogeneous data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710193071.1A
Other languages
Chinese (zh)
Other versions
CN107193858B (en)
Inventor
郭志伟
张志祥
余尔坚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
FUZHOU JINRUIDI SOFTWARE TECHNOLOGY Co Ltd
Original Assignee
FUZHOU JINRUIDI SOFTWARE TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by FUZHOU JINRUIDI SOFTWARE TECHNOLOGY Co Ltd
Priority to CN201710193071.1A
Publication of CN107193858A
Application granted
Publication of CN107193858B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Stored Programmes (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an intelligent service application platform and method for multi-source heterogeneous data fusion, and relates to the field of data fusion applications. Visualization technology makes multi-source heterogeneous data collection and data transactions definable, and automation technology allows multi-source heterogeneous data to be collected and cleaned automatically in real time while data transactions are completed in real time. A dynamically extensible semantic template library provides semantic mapping between multi-source heterogeneous data, and a generated semantic dictionary makes semantic cleaning rules definable, so that text data can be converted semantically between heterogeneous data sources automatically and can be cleaned and transacted intelligently. The technical solution can fill the current gap, in China and abroad, in the automatic semantic conversion and transaction of unstructured multi-source heterogeneous data, and it can be widely applied in domestic government bodies, enterprises, and public institutions to solve their problems with multi-source heterogeneous data conversion and transaction.

Description

Intelligent service application platform and method for multi-source heterogeneous data fusion
Technical field
The present invention relates to the field of data fusion applications, and in particular to an intelligent service application platform and method for multi-source heterogeneous data fusion.
Background
At present, enterprises and public institutions apply data fusion only to the integration and consolidation of standard metadata. Many data fusion products in China and abroad likewise handle only standard metadata and cannot process "source data" into "metadata", even though this processing accounts for at least 60% of the workload of a data fusion application. This is the "bottleneck" that many data fusion products run into, and it prevents current data fusion products from providing the intelligent service applications that enterprises and public institutions need.
Summary of the invention
The object of the present invention is to provide an intelligent service application platform and method for multi-source heterogeneous data fusion, so as to solve the foregoing problems in the prior art.
To achieve this object, the present invention adopts the following technical solution:
An intelligent service application platform for multi-source heterogeneous data fusion, comprising:
A multi-source heterogeneous data collection and cleaning component, used to define collection interfaces for different types of data sources, to clean multi-source heterogeneous data based on semantic cleaning rules, and to manage multi-source heterogeneous data collection and task scheduling;
A heterogeneous data structured description component, used to automatically initialize semantic templates, to define structured descriptions of multi-source heterogeneous data, and to create semantics-based visual models of heterogeneous data;
A multi-source heterogeneous data online exchange component, used to define real-time exchange interfaces for multi-source heterogeneous data, to exchange heterogeneous data online, and to track and display the online exchange process.
Preferably, the multi-source heterogeneous data collection and cleaning component defines collection interfaces for different types of data sources as follows: using the standard specifications of the different types of data sources, it provides a visual interface through which standard data source collection interfaces are defined.
Preferably, the multi-source heterogeneous data collection and cleaning component cleans multi-source heterogeneous data based on semantic cleaning rules through the following steps:
Obtain a semantics-based similarity calculation method from the semantic dictionary WordNet, the metadata description method RDF Schema, and the classification model algorithm SOM;
Apply a multi-strategy cleaning method to obtain multiple semantic matching result values, which form a semantic matching result set;
Apply a bidirectional-correction fusion method to the semantic matching result set to obtain judgment results with high data attribute similarity, ensuring the correctness of the data cleaning results.
Preferably, the multi-source heterogeneous data collection and cleaning component manages multi-source heterogeneous data collection and task scheduling as follows: based on the collection interface definition standards for the different types of data sources and on the semantic cleaning rules, and according to the collection frequency of each multi-source heterogeneous data source, it makes the scheduling and running of collection tasks visual and automated.
Preferably, the heterogeneous data structured description component automatically initializes semantic templates as follows: with the help of natural language processing and ontology matching technology, it extracts semantic information from different types of corpora and generates semantic template instances, so that the semantic template instances are extended dynamically.
Preferably, the heterogeneous data structured description component defines structured descriptions of multi-source heterogeneous data as follows: the automatically initialized semantic templates generated by the system are refined manually to produce the structured description of the multi-source heterogeneous data.
Preferably, the heterogeneous data structured description component creates semantics-based visual models of heterogeneous data through the following steps:
Based on the semantic template instance library that describes the multi-source heterogeneous data, and drawing on semantic learning theory, intelligently discover the associations between instances and build a business service model by combining instances, so that heterogeneous data features are extracted automatically; specifically, visualization technology is used to show in the system interface the complete, automatically executed modeling process of model construction, model evaluation, model selection, and optimal model determination;
Use the constructed business service model to correct the semantic template instances, ensuring the consistency and extensibility of the semantic information across semantic template instances.
Preferably, the multi-source heterogeneous data online exchange component defines real-time exchange interfaces for multi-source heterogeneous data as follows: it builds, at the logical view layer, a data model that describes the structural and semantic information of data resources, and this model is used to define the real-time exchange interfaces through a visual interface.
Preferably, the multi-source heterogeneous data online exchange component exchanges heterogeneous data online as follows: based on the OGSA-DAI heterogeneous data exchange model and combining XML, Web Services, Grid Service, and ontology technology, it introduces an ontology into data exchange to describe the heterogeneous data, uses the ontology to semantically annotate the XML Schema that has undergone preliminary semantic conversion, and thereby achieves heterogeneous data exchange and semantic matching during the exchange process.
An intelligent service application method for multi-source heterogeneous data fusion, using the above intelligent service application platform for multi-source heterogeneous data fusion, comprising the following steps:
S1: using the data collection interface of the heterogeneous data collection and cleaning component, collect data from different types of source data, including structured data and unstructured data;
S2: set the semantic cleaning rules and the semantic dictionary;
S3: clean the data collected in S1 according to the semantic cleaning rules and semantic dictionary set in S2, obtaining cleaned source data;
S4: correct the cleaned source data according to correction rules, obtaining corrected data;
S5: put the corrected data into the indicator subject database to form system metadata;
S6: through the heterogeneous data structured description component, initialize semantic template instances and give the system metadata a structured description;
S7: exchange heterogeneous data online based on the OGSA-DAI heterogeneous data exchange model.
The beneficial effects of the invention are as follows. In the intelligent service application platform and method for multi-source heterogeneous data fusion provided by the embodiments of the present invention, visualization technology makes multi-source heterogeneous data collection and data transactions definable, and automation technology allows multi-source heterogeneous data to be collected and cleaned automatically in real time while data transactions are completed in real time. A dynamically extensible semantic template library provides semantic mapping between multi-source heterogeneous data, and a generated semantic dictionary makes semantic cleaning rules definable, so that text data can be converted semantically between heterogeneous data sources automatically and can be cleaned and transacted intelligently. The technical solution can fill the current gap, in China and abroad, in the automatic semantic conversion and transaction of unstructured multi-source heterogeneous data, and it can be widely applied in domestic government bodies, enterprises, and public institutions to solve their problems with multi-source heterogeneous data conversion and transaction.
Brief description of the drawings
Fig. 1 is a structural diagram of the intelligent service application platform for multi-source heterogeneous data fusion;
Fig. 2 is a diagram of the implementation logic of the intelligent service application method for multi-source heterogeneous data fusion;
Fig. 3 is a flow chart of the intelligent service application method for multi-source heterogeneous data fusion.
Detailed description of the embodiments
To make the purpose, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings. It should be understood that the embodiments described here only explain the present invention and are not intended to limit it.
Embodiment one
As shown in Fig. 1, an intelligent service application platform for multi-source heterogeneous data fusion comprises:
A multi-source heterogeneous data collection and cleaning component, used to define collection interfaces for different types of data sources, to clean multi-source heterogeneous data based on semantic cleaning rules, and to manage multi-source heterogeneous data collection and task scheduling;
A heterogeneous data structured description component, used to automatically initialize semantic templates, to define structured descriptions of multi-source heterogeneous data, and to create semantics-based visual models of heterogeneous data;
A multi-source heterogeneous data online exchange component, used to define real-time exchange interfaces for multi-source heterogeneous data, to exchange heterogeneous data online, and to track and display the online exchange process.
By using the multi-source heterogeneous data collection and cleaning component, the platform provided by the present invention collects data from different types of data sources through standard collection interfaces and cleans the data according to data cleaning rules at a standard semantic layer. This makes the unified collection, task scheduling, and cleaning of multi-source heterogeneous data intelligent and visual, and overcomes the shortcoming of the prior art that multi-source heterogeneous data cannot be fused effectively in use because of its multiple origins, heterogeneity, incompleteness, span across time and space, and semantic conflicts;
Meanwhile, by using the heterogeneous data structured description component, the platform provides structured descriptions of heterogeneous data and automates the semantics-based business modeling process, overcoming the structural and descriptive differences between multi-source heterogeneous data in the prior art, so that intelligent services are provided more efficiently;
In addition, by using the multi-source heterogeneous data online exchange component, the platform performs automatic semantic conversion between multi-source heterogeneous data and forms a unified data exchange interface standard, so that data held in different structures and different data management systems can be transacted programmatically, meeting the requirements of data transactions for data validity, timeliness, and reasonableness.
The multi-source heterogeneous data collection and cleaning component defines collection interfaces for different types of data sources as follows: using the standard specifications of the different types of data sources, it provides a visual interface through which standard data source collection interfaces are defined.
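The idea of one standard collection interface shared by different source types can be illustrated with a minimal Python sketch. The patent does not specify the interface itself, so the names `DataSource`, `CsvSource`, `JsonLinesSource`, and `collect_all` are hypothetical, and the two source formats are chosen only for illustration.

```python
import csv
import io
import json
from typing import Dict, Iterable, Iterator, List, Protocol


class DataSource(Protocol):
    """Hypothetical standard collection interface shared by every source type."""

    def collect(self) -> Iterator[Dict[str, str]]:
        ...


class CsvSource:
    """Structured source: each CSV row becomes one record."""

    def __init__(self, text: str) -> None:
        self.text = text

    def collect(self) -> Iterator[Dict[str, str]]:
        for row in csv.DictReader(io.StringIO(self.text)):
            yield dict(row)


class JsonLinesSource:
    """Semi-structured source: one JSON object per non-empty line."""

    def __init__(self, text: str) -> None:
        self.text = text

    def collect(self) -> Iterator[Dict[str, str]]:
        for line in self.text.splitlines():
            if line.strip():
                yield {key: str(value) for key, value in json.loads(line).items()}


def collect_all(sources: Iterable[DataSource]) -> List[Dict[str, str]]:
    """Gather records from every source into one list of raw source data."""
    records: List[Dict[str, str]] = []
    for source in sources:
        records.extend(source.collect())
    return records


records = collect_all([
    CsvSource("name,city\nAlice,Fuzhou\n"),
    JsonLinesSource('{"name": "Bob", "town": "Xiamen"}\n'),
])
```

Because callers depend only on `collect()`, a new source type can be added without changing `collect_all`. Note that the two sources still disagree on field names (`city` versus `town`), which is the kind of difference the semantic cleaning step is meant to resolve.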
In the present invention, the multi-source heterogeneous data collection and cleaning component cleans multi-source heterogeneous data based on semantic cleaning rules through the following steps:
Obtain a semantics-based similarity calculation method from the semantic dictionary WordNet, the metadata description method RDF Schema, and the classification model algorithm SOM;
Apply a multi-strategy cleaning method to obtain multiple semantic matching result values, which form a semantic matching result set;
Apply a bidirectional-correction fusion method to the semantic matching result set to obtain judgment results with high data attribute similarity, ensuring the correctness of the data cleaning results.
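The three cleaning steps above can be sketched in Python under heavy simplifying assumptions. The patent does not give the similarity formula or the fusion rule, so this sketch swaps in three simple strategies (character-level similarity from `difflib`, token overlap, and a tiny hand-written synonym table standing in for a semantic dictionary such as WordNet) and interprets "bidirectional correction" as keeping only pairs that are each other's best match in both directions. Every name and the 0.6 threshold are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List

# Hypothetical stand-in for a semantic dictionary such as WordNet.
SYNONYM_GROUPS = [{"city", "town"}, {"phone", "tel", "telephone"}]


def edit_similarity(a: str, b: str) -> float:
    """Strategy 1: character-level similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_similarity(a: str, b: str) -> float:
    """Strategy 2: Jaccard overlap of underscore-separated tokens."""
    ta, tb = set(a.lower().split("_")), set(b.lower().split("_"))
    return len(ta & tb) / len(ta | tb)


def synonym_similarity(a: str, b: str) -> float:
    """Strategy 3: 1.0 if the names are equal or listed as synonyms."""
    a, b = a.lower(), b.lower()
    if a == b:
        return 1.0
    return 1.0 if any(a in g and b in g for g in SYNONYM_GROUPS) else 0.0


STRATEGIES: List[Callable[[str, str], float]] = [
    edit_similarity, token_similarity, synonym_similarity,
]


def combined_score(a: str, b: str) -> float:
    """Reduce the result set of all strategies to its strongest value."""
    return max(strategy(a, b) for strategy in STRATEGIES)


def best_match(name: str, candidates: List[str]) -> str:
    return max(candidates, key=lambda c: combined_score(name, c))


def fuse_bidirectional(source: List[str], target: List[str],
                       threshold: float = 0.6) -> Dict[str, str]:
    """Keep a pair only if each side is the other's best match (assumed rule)."""
    matches: Dict[str, str] = {}
    for s in source:
        t = best_match(s, target)
        if best_match(t, source) == s and combined_score(s, t) >= threshold:
            matches[s] = t
    return matches


matches = fuse_bidirectional(
    ["name", "city", "phone", "age"], ["full_name", "town", "tel"]
)
```

Here `age` is left unmatched because no target attribute selects it in the reverse direction, which shows how the two-way check filters weak one-sided matches.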
In the embodiment of the present invention, the multi-source heterogeneous data collection and cleaning component manages multi-source heterogeneous data collection and task scheduling as follows:
Based on the collection interface definition standards for the different types of data sources and on the semantic cleaning rules, and according to the collection frequency of each multi-source heterogeneous data source, it makes the scheduling and running of collection tasks visual and automated.
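Scheduling collection tasks by per-source collection frequency can be sketched with a priority queue. This is only an illustration: the source names, the intervals in seconds, and the function `build_schedule` are hypothetical, and a real scheduler would run tasks rather than just list their due times.

```python
import heapq
from typing import Dict, List, Tuple


def build_schedule(intervals: Dict[str, int], horizon: int) -> List[Tuple[int, str]]:
    """Return (due_time, source) collection runs up to `horizon`, in time order."""
    # Each source is first due after one full interval.
    queue = [(interval, name) for name, interval in intervals.items()]
    heapq.heapify(queue)
    runs: List[Tuple[int, str]] = []
    while queue and queue[0][0] <= horizon:
        due, name = heapq.heappop(queue)
        runs.append((due, name))
        # Re-queue the source for its next collection.
        heapq.heappush(queue, (due + intervals[name], name))
    return runs


schedule = build_schedule({"crm_db": 30, "news_feed": 20}, horizon=60)
```

The resulting list is the kind of data a visual scheduling view could render as a timeline of collection runs per source.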
In the present invention, the heterogeneous data structured description component automatically initializes semantic templates as follows: with the help of natural language processing and ontology matching technology, it extracts semantic information from different types of corpora and generates semantic template instances, so that the semantic template instances are extended dynamically.
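A toy Python sketch can show the shape of this step, with the caveat that it replaces real natural language processing and ontology matching with a single regular expression and a four-entry lookup table. The `ONTOLOGY` mapping, the pattern, the sample sentences, and `extend_templates` are all hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Set

# Hypothetical ontology: surface terms mapped to canonical concepts.
ONTOLOGY = {
    "company": "Organization",
    "firm": "Organization",
    "city": "Location",
    "town": "Location",
}

# Very rough stand-in for semantic extraction: "<term> is|named|called <value>".
PATTERN = re.compile(
    r"\b(company|firm|city|town)\s+(?:is|named|called)\s+(\w+)", re.IGNORECASE
)


def extend_templates(templates: Dict[str, Set[str]],
                     corpus: Iterable[str]) -> Dict[str, Set[str]]:
    """Add the instances found in `corpus` to the template library in place."""
    for text in corpus:
        for term, value in PATTERN.findall(text):
            templates[ONTOLOGY[term.lower()]].add(value)
    return templates


# Initialize from one corpus, then extend dynamically with another.
templates = extend_templates(
    defaultdict(set), ["The company is Acme and the city is Fuzhou."]
)
templates = extend_templates(
    templates, ["A firm named Initech opened in a town called Xiamen."]
)
```

The second call illustrates dynamic extension: new corpora add instances to existing concepts without rebuilding the library.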
In the present invention, the heterogeneous data structured description component defines structured descriptions of multi-source heterogeneous data as follows: the automatically initialized semantic templates generated by the system are refined manually to produce the structured description of the multi-source heterogeneous data.
In the present invention, the heterogeneous data structured description component creates semantics-based visual models of heterogeneous data through the following steps:
Based on the semantic template instance library that describes the multi-source heterogeneous data, and drawing on semantic learning theory, intelligently discover the associations between instances and build a business service model by combining instances, so that heterogeneous data features are extracted automatically; specifically, visualization technology may be used to show in the system interface the complete, automatically executed modeling process of model construction, model evaluation, model selection, and optimal model determination;
Use the constructed business service model to correct the semantic template instances, ensuring the consistency and extensibility of the semantic information across semantic template instances.
In the present invention, the multi-source heterogeneous data online exchange component defines real-time exchange interfaces for multi-source heterogeneous data as follows: it builds, at the logical view layer, a data model that describes the structural and semantic information of data resources, and this model is used to define the real-time exchange interfaces through a visual interface.
In the embodiment of the present invention, the multi-source heterogeneous data online exchange component exchanges heterogeneous data online as follows: based on the OGSA-DAI heterogeneous data exchange model and combining XML, Web Services, Grid Service, and ontology technology, it introduces an ontology into data exchange to describe the heterogeneous data, uses the ontology to semantically annotate the XML Schema that has undergone preliminary semantic conversion, and thereby achieves heterogeneous data exchange and semantic matching during the exchange process.
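The semantic annotation of an XML Schema can be sketched with Python's standard `xml.etree.ElementTree`. This sketch does not use OGSA-DAI, Web Services, or a real ontology language. It assumes a hypothetical annotation namespace (`http://example.org/semantics`), a hypothetical `CONCEPTS` table in place of an ontology, and matches two schemas simply by equal concept labels.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

XS = "http://www.w3.org/2001/XMLSchema"
SEM = "http://example.org/semantics"  # hypothetical annotation namespace
ELEMENT_TAG = f"{{{XS}}}element"
CONCEPT_ATTR = f"{{{SEM}}}concept"

# Hypothetical ontology lookup: element name -> concept.
CONCEPTS = {"city": "onto:Location", "town": "onto:Location", "name": "onto:PersonName"}


def annotate(xsd_text: str) -> ET.Element:
    """Parse a schema and mark each known element with its ontology concept."""
    root = ET.fromstring(xsd_text)
    for element in root.iter(ELEMENT_TAG):
        concept = CONCEPTS.get(element.get("name", "").lower())
        if concept:
            element.set(CONCEPT_ATTR, concept)
    return root


def concept_index(root: ET.Element) -> Dict[str, str]:
    """Map element names to the concepts they were annotated with."""
    return {
        el.get("name"): el.get(CONCEPT_ATTR)
        for el in root.iter(ELEMENT_TAG)
        if el.get(CONCEPT_ATTR)
    }


def semantic_matches(a: Dict[str, str], b: Dict[str, str]) -> List[Tuple[str, str]]:
    """Pair elements from two schemas that share the same concept."""
    return sorted((x, y) for x, cx in a.items() for y, cy in b.items() if cx == cy)


SCHEMA_A = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="name" type="xs:string"/>
  <xs:element name="city" type="xs:string"/>
</xs:schema>"""

SCHEMA_B = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="town" type="xs:string"/>
  <xs:element name="zip" type="xs:string"/>
</xs:schema>"""

pairs = semantic_matches(
    concept_index(annotate(SCHEMA_A)), concept_index(annotate(SCHEMA_B))
)
```

Elements with different names (`city`, `town`) are paired because they carry the same concept, while `zip` stays unannotated and `name` finds no counterpart.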
In the present invention, the multi-source heterogeneous data online exchange component tracks and displays the online exchange process as follows: the component provides a dynamically extensible, service-oriented function for tracking and displaying online heterogeneous data exchange, so that users gain a clear understanding of the data exchange process without needing to know the technical details of how the exchange is implemented.
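One simple way to picture exchange tracking is an event log that records each stage of an exchange and renders a readable timeline. The class `ExchangeTracker`, the stage names, and the exchange identifiers below are hypothetical; the patent does not describe the tracking data model.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ExchangeTracker:
    """Records the stages each exchange passes through, in arrival order."""

    events: List[Tuple[str, str, str]] = field(default_factory=list)

    def record(self, exchange_id: str, stage: str, detail: str = "") -> None:
        self.events.append((exchange_id, stage, detail))

    def timeline(self, exchange_id: str) -> List[str]:
        return [stage for eid, stage, _ in self.events if eid == exchange_id]

    def render(self, exchange_id: str) -> str:
        """Plain-text view that hides implementation details from the user."""
        return " -> ".join(self.timeline(exchange_id))


tracker = ExchangeTracker()
tracker.record("ex-1", "received", "3 records from source A")
tracker.record("ex-2", "received")
tracker.record("ex-1", "matched", "city <-> town")
tracker.record("ex-1", "delivered")
summary = tracker.render("ex-1")
```

The `detail` field keeps technical information available for diagnostics, while `render` shows the user only the sequence of stages.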
Embodiment two
As shown in Fig. 2, the embodiment of the present invention provides an intelligent service application method for multi-source heterogeneous data fusion, which uses the intelligent service application platform for multi-source heterogeneous data fusion described in Embodiment one and comprises the following steps:
S1: using the data collection interface of the heterogeneous data collection and cleaning component, collect data from different types of source data, including structured data and unstructured data;
S2: set the semantic cleaning rules and the semantic dictionary;
S3: clean the data collected in S1 according to the semantic cleaning rules and semantic dictionary set in S2, obtaining cleaned source data;
S4: correct the cleaned source data according to correction rules, obtaining corrected data;
S5: put the corrected data into the indicator subject database to form system metadata;
S6: through the heterogeneous data structured description component, initialize semantic template instances and give the system metadata a structured description;
S7: exchange heterogeneous data online based on the OGSA-DAI heterogeneous data exchange model.
Specific embodiment:
The intelligent service application method for multi-source heterogeneous data fusion provided by the embodiment of the present invention can be implemented in the following steps:
1. Define a collection rule library through the heterogeneous data collection and cleaning component, and collect different types of data sources (including structured and unstructured data) according to the rules to form the "source data resource pool";
2. Define a cleaning rule library and a semantic dictionary through the heterogeneous data collection and cleaning component, and clean and correct the data in the "source data resource pool" to form the "metadata resource pool";
3. Define an indicator algorithm library through the heterogeneous data structured description component, and calculate, aggregate, and compile statistics on the system metadata in the "metadata resource pool" to form the "indicator data resource pool";
4. Define semantic template instances through the heterogeneous data structured description component, and give the indicator data in the "indicator data resource pool" a structured description to form the "semantic instance resource library";
5. Define semantic matching rules through the heterogeneous data online exchange component, and semantically annotate and match the semantic instances in the "semantic instance resource library" to exchange heterogeneous data online;
6. Use the visual tracking function of the heterogeneous data online exchange component to track the entire online heterogeneous data exchange process.
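The flow of data through the resource pools in steps 1 to 4 can be sketched as a chain of small Python functions. Everything here is an illustrative assumption: the sample records, the one-entry semantic dictionary, the rule that drops records with empty values, the use of a simple count as the "indicator", and the concept label `Location`.

```python
from collections import Counter
from typing import Dict, List

Record = Dict[str, str]


def collect(sources: List[List[Record]]) -> List[Record]:
    """Step 1: merge all sources into the source data resource pool."""
    return [record for source in sources for record in source]


def clean(records: List[Record], dictionary: Dict[str, str]) -> List[Record]:
    """Step 2: unify field names via the semantic dictionary, drop empty values."""
    cleaned = []
    for record in records:
        row = {dictionary.get(k.lower(), k.lower()): v.strip() for k, v in record.items()}
        if all(row.values()):
            cleaned.append(row)
    return cleaned


def compute_indicators(metadata: List[Record], key: str) -> Dict[str, int]:
    """Step 3: a trivial indicator, the number of records per value of `key`."""
    return dict(Counter(row[key] for row in metadata if key in row))


def describe(indicators: Dict[str, int], concept: str) -> List[Dict[str, object]]:
    """Step 4: wrap each indicator in a structured semantic instance."""
    return [
        {"concept": concept, "value": value, "count": count}
        for value, count in sorted(indicators.items())
    ]


source_pool = collect([
    [{"City": " Fuzhou "}, {"City": "Xiamen"}],
    [{"town": "Fuzhou"}, {"town": ""}],
])
metadata_pool = clean(source_pool, {"town": "city"})
indicator_pool = compute_indicators(metadata_pool, "city")
semantic_instances = describe(indicator_pool, "Location")
```

Each pool is the output of one function and the input of the next, so a stage can be replaced (for example, a richer cleaning rule library) without touching the others.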
The technical solution disclosed above achieves the following beneficial effects. In the intelligent service application platform for multi-source heterogeneous data fusion provided by the embodiment of the present invention, visualization technology makes multi-source heterogeneous data collection and data transactions definable, and automation technology allows multi-source heterogeneous data to be collected and cleaned automatically in real time while data transactions are completed in real time. A dynamically extensible semantic template library provides semantic mapping between multi-source heterogeneous data, and a generated semantic dictionary makes semantic cleaning rules definable, so that text data can be converted semantically between heterogeneous data sources automatically and can be cleaned and transacted intelligently. The technical solution can fill the current gap, in China and abroad, in the automatic semantic conversion and transaction of unstructured multi-source heterogeneous data, and it can be widely applied in domestic government bodies, enterprises, and public institutions to solve their problems with multi-source heterogeneous data conversion and transaction.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the other embodiments, and identical or similar parts of the embodiments may be understood by cross-reference.
Those skilled in the art should understand that the order of the method steps provided in the above embodiments may be adjusted according to actual conditions, and the steps may also be carried out concurrently according to actual conditions.
All or some of the steps in the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a storage medium readable by a computer device and is used to perform all or some of the steps described in the methods of the above embodiments. Examples of the computer device include a personal computer, a server, a network device, an intelligent mobile terminal, a smart home device, a wearable smart device, and an in-vehicle smart device. Examples of the storage medium include RAM, ROM, a magnetic disk, a magnetic tape, an optical disc, flash memory, a USB flash drive, a removable hard disk, a memory card, a memory stick, network server storage, and network cloud storage.
Finally, it should also be noted that relational terms such as "first" and "second" are used here only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, article, or device. In the absence of further restrictions, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or device that comprises the element.
The above is only the preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and modifications without departing from the principles of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. a kind of intelligent Service application platform towards multi-source heterogeneous data fusion, it is characterised in that including:
Multi-source heterogeneous data acquisition and cleaning assembly, for defining different types of data source acquisition interface, based on semanteme cleaning rule Then carry out multi-source heterogeneous data cleansing and multi-source heterogeneous data acquisition and Portable Batch System;
Isomeric data structural description component, for auto-initiation semantic template, defines multi-source heterogeneous data structured description And create based on semantic isomeric data Visualization Model;
Multi-source heterogeneous online data exchanges component, is handed over online for defining multi-source heterogeneous data real-time exchange interface, isomeric data Change and online exchange process is tracked and showed.
2. the intelligent Service application platform according to claim 1 towards multi-source heterogeneous data fusion, it is characterised in that institute Multi-source heterogeneous data acquisition and cleaning assembly are stated, for defining different types of data source acquisition interface, is specially:Using inhomogeneity The standard criterion of type data source defines data source collection standard interface there is provided visualization interface.
3. the intelligent Service application platform according to claim 1 towards multi-source heterogeneous data fusion, it is characterised in that institute Multi-source heterogeneous data acquisition and cleaning assembly are stated, for carrying out multi-source heterogeneous data cleansing, specific bag based on semantic cleaning rule Include following steps:
According to semantic dictionary WordNet, description method for metadata RDF Schema and disaggregated model algorithm SOM, obtain being based on language The similarity calculating method of justice;
Using how tactful cleaning method, a variety of semantic matches end values are obtained, semantic matches result set is formed;
Using the fusion method of two-way amendment, the semantic matches result set is handled, the high judgement of data attribute similarity is obtained As a result, it is ensured that the correctness of data cleansing result.
4. the intelligent Service application platform according to claim 1 towards multi-source heterogeneous data fusion, it is characterised in that institute Multi-source heterogeneous data acquisition and cleaning assembly are stated, for multi-source heterogeneous data acquisition and Portable Batch System, is specially:Based on not Same type data source acquisition interface defines standard and the cleaning rule based on semanteme, according to the collection of multiple and distributing sources frequency Rate, realizes the visualization and automation of acquisition tasks management and running.
5. the intelligent Service application platform according to claim 1 towards multi-source heterogeneous data fusion, it is characterised in that institute Isomeric data structural description component is stated, for auto-initiation semantic template, is specially:By natural language processing and body Matching technique extracts semantic information from different types of corpus, and generative semantics template instances obtain semantic template example Dynamic expansion.
6. the intelligent Service application platform according to claim 1 towards multi-source heterogeneous data fusion, it is characterised in that institute Stating isomeric data structural description component is used to define multi-source heterogeneous data structured description, is specially:The institute generated to system State the progress of auto-initiation semantic template artificial perfect, realize the structural description of multi-source heterogeneous data.
7. The intelligent service application platform for multi-source heterogeneous data fusion according to claim 1, characterized in that the heterogeneous data structured description component is used to create a semantics-based visual model of heterogeneous data, specifically comprising the following steps:
Based on the semantic template instance library describing the multi-source heterogeneous data, intelligently discovering the association relations between instances through semantic learning theory, building a business service model by combining instances, and realizing automatic extraction of heterogeneous data features; specifically, using visualization techniques, the complete modeling process of model construction, model evaluation, model selection, and optimal model determination is completed automatically and displayed through the system interface;
Correcting the semantic template instances using the built business service model, so as to ensure the consistency and extensibility of the semantic information among the semantic template instances.
8. The intelligent service application platform for multi-source heterogeneous data fusion according to claim 1, characterized in that the multi-source heterogeneous data online exchange component is used to define real-time exchange interfaces for multi-source heterogeneous data; specifically, it is used to build, on the logical view layer, a data model describing the structure and semantic information of data resources, and to define the real-time exchange interfaces for multi-source heterogeneous data through a visual interface.
9. The intelligent service application platform for multi-source heterogeneous data fusion according to claim 1, characterized in that the multi-source heterogeneous data online exchange component is used for online exchange of heterogeneous data; specifically, a heterogeneous data exchange model based on OGSA-DAI, combined with XML, Web Services, Grid Service, and ontology, introduces ontologies into data exchange to describe heterogeneous data, uses the ontologies to semantically annotate the XML Schema that has undergone preliminary semantic conversion, and realizes heterogeneous data exchange and semantic matching during the exchange process.
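As an illustration only, semantic annotation of an XML Schema with ontology concepts can be sketched by attaching a concept URI to each schema element. The sketch below uses the W3C SAWSDL `modelReference` attribute as the annotation mechanism, which is a swapped-in choice: the patent does not name an annotation vocabulary. The function name `annotate_schema`, the mapping argument, and the example concept URIs are likewise assumptions.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
SAWSDL = "http://www.w3.org/ns/sawsdl"  # assumed annotation vocabulary


def annotate_schema(xsd_text, concept_by_element):
    """Attach an ontology concept URI to each named xs:element.

    xsd_text:           XML Schema document as a string.
    concept_by_element: dict mapping element names to ontology concept URIs
                        (hypothetical mapping produced by ontology matching).
    """
    # Keep readable prefixes when the tree is serialized again.
    ET.register_namespace("xs", XS)
    ET.register_namespace("sawsdl", SAWSDL)

    root = ET.fromstring(xsd_text)
    for element in root.iter(f"{{{XS}}}element"):
        concept = concept_by_element.get(element.get("name"))
        if concept is not None:
            element.set(f"{{{SAWSDL}}}modelReference", concept)
    return ET.tostring(root, encoding="unicode")
```

Two schemas annotated against the same ontology can then be matched by comparing concept URIs rather than element names, which is the kind of semantic matching the claim refers to.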
10. An intelligent service application method for multi-source heterogeneous data fusion, characterized in that it uses the intelligent service application platform for multi-source heterogeneous data fusion according to any one of claims 1-9 and comprises the following steps:
S1, collecting data from different types of source data, including structured data and unstructured data, using the data acquisition interfaces of the heterogeneous data acquisition and cleaning component;
S2, setting semantic cleaning rules and a semantic dictionary;
S3, performing data cleaning on the data collected in S1 according to the semantic cleaning rules and semantic dictionary set in S2, to obtain cleaned source data;
S4, correcting the cleaned source data according to correction rules, to obtain corrected data;
S5, putting the corrected data into the index subject database to form system metadata;
S6, initializing semantic template instances through the heterogeneous data structured description component, and giving the system metadata a structured description;
S7, realizing online exchange of heterogeneous data based on the OGSA-DAI heterogeneous database exchange model.
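As an illustration only, steps S1 to S5 can be sketched as a small in-memory pipeline. Everything named here is an assumption for the example: the semantic dictionary is reduced to a field-renaming map, the cleaning rule to a required-field check, the correction rules to per-field functions, and the index subject database to a plain dictionary. Steps S6 and S7 are not covered.

```python
def clean(records, dictionary, required_fields):
    """S3: rename fields via the semantic dictionary, drop incomplete records."""
    cleaned = []
    for record in records:
        normalized = {dictionary.get(key, key): value for key, value in record.items()}
        if all(normalized.get(field) not in (None, "") for field in required_fields):
            cleaned.append(normalized)
    return cleaned


def correct(records, correction_rules):
    """S4: apply a per-field correction function where one is defined."""
    return [
        {key: correction_rules.get(key, lambda v: v)(value) for key, value in record.items()}
        for record in records
    ]


def run_pipeline(sources, dictionary, required_fields, correction_rules):
    """Run S1, S3, S4, and S5 with the S2 settings passed in as arguments."""
    # S1: each source is a callable returning a list of records.
    collected = [record for source in sources for record in source()]
    cleaned = clean(collected, dictionary, required_fields)
    corrected = correct(cleaned, correction_rules)
    # S5: in-memory stand-in for the index subject database and its metadata.
    fields = sorted({key for record in corrected for key in record})
    return {"records": corrected, "fields": fields}
```

Passing the dictionary and rules in as arguments mirrors the claim's ordering, where S2 configures the cleaning that S3 then applies.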
CN201710193071.1A 2017-03-28 2017-03-28 Intelligent Service application platform and method towards multi-source heterogeneous data fusion Active CN107193858B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710193071.1A CN107193858B (en) 2017-03-28 2017-03-28 Intelligent Service application platform and method towards multi-source heterogeneous data fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710193071.1A CN107193858B (en) 2017-03-28 2017-03-28 Intelligent Service application platform and method towards multi-source heterogeneous data fusion

Publications (2)

Publication Number Publication Date
CN107193858A (en) 2017-09-22
CN107193858B (en) 2018-09-11

Family

ID=59871013

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710193071.1A Active CN107193858B (en) 2017-03-28 2017-03-28 Intelligent Service application platform and method towards multi-source heterogeneous data fusion

Country Status (1)

Country Link
CN (1) CN107193858B (en)

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107679141A (en) * 2017-09-25 2018-02-09 上海壹账通金融科技有限公司 Data storage method, device, equipment and computer-readable recording medium
CN107766556A (en) * 2017-11-03 2018-03-06 福建工程学院 A kind of interactive Ontology Matching method and computer equipment based on evolution algorithm
CN107992510A (en) * 2017-10-17 2018-05-04 广州智聚行科技有限公司 Wisdom study computational methods based on multi-source heterogeneous data analysis
CN108198595A (en) * 2018-01-18 2018-06-22 北京化工大学 A kind of multi-source heterogeneous unstructured medical record data fusion method
CN108449407A (en) * 2018-03-14 2018-08-24 中煤科工集团重庆研究院有限公司 Multi-source heterogeneous coal mine safety monitoring data acquisition method
CN108536796A (en) * 2018-04-02 2018-09-14 北京大学 A kind of isomery Ontology Matching method and system based on figure
CN109492059A (en) * 2019-01-03 2019-03-19 北京理工大学 A kind of multi-source heterogeneous data fusion and Modifying model process management and control method
CN109919469A (en) * 2019-02-27 2019-06-21 浪潮软件集团有限公司 A kind of holography science data processing method
CN110019228A (en) * 2017-12-25 2019-07-16 北京金风科创风电设备有限公司 Multi-source data integration method and device based on fan data
CN110377598A (en) * 2018-04-11 2019-10-25 西安邮电大学 A kind of multi-source heterogeneous date storage method based on intelligence manufacture process
CN110515926A (en) * 2019-08-28 2019-11-29 国网天津市电力公司 Heterogeneous data source mass data carding method based on participle and semantic dependency analysis
CN110765166A (en) * 2019-10-23 2020-02-07 山东浪潮通软信息科技有限公司 Method, device and medium for managing data
CN110781202A (en) * 2020-01-02 2020-02-11 广州欧赛斯信息科技有限公司 Intelligent data collection method and system for textbook teaching quality information
CN110990391A (en) * 2019-12-04 2020-04-10 中山市凯能集团有限公司 Integration method and system of multi-source heterogeneous data, computer equipment and storage medium
CN111291029A (en) * 2020-01-17 2020-06-16 深圳市华傲数据技术有限公司 Data cleaning method and device
CN111552685A (en) * 2019-12-27 2020-08-18 广东电网有限责任公司电力科学研究院 Spark-based electric energy quality data cleaning method and device
CN111695000A (en) * 2020-06-16 2020-09-22 山东蓝海领航大数据发展有限公司 Multi-source big data loading method and system
CN111752723A (en) * 2020-06-06 2020-10-09 中国科学院电子学研究所苏州研究院 Visual multi-source service management system and implementation method thereof
CN108959395B (en) * 2018-06-04 2020-11-06 广西大学 Multi-source heterogeneous big data oriented hierarchical reduction combined cleaning method
CN112100457A (en) * 2020-09-22 2020-12-18 国网辽宁省电力有限公司电力科学研究院 Multi-source heterogeneous data integration method based on metadata
CN112528083A (en) * 2020-12-10 2021-03-19 天津(滨海)人工智能军民融合创新中心 Message customization method based on distributed semantic template distribution
CN112650745A (en) * 2020-12-30 2021-04-13 中科环森智慧科技(苏州)有限公司 Data management system based on unified data resource pool
CN113987131A (en) * 2021-11-11 2022-01-28 江苏天汇空间信息研究院有限公司 Heterogeneous multi-source data correlation analysis system and method
CN114154572A (en) * 2021-12-02 2022-03-08 辽宁铭钉科技有限公司 Heterogeneous data centralized access analysis method based on heterogeneous platform
CN114661810A (en) * 2022-05-24 2022-06-24 国网浙江省电力有限公司杭州供电公司 Lightweight multi-source heterogeneous data fusion method and system
CN116894152A (en) * 2023-09-11 2023-10-17 山东唐和智能科技有限公司 Multisource data investigation and real-time analysis method
CN117056867A (en) * 2023-10-12 2023-11-14 中交第四航务工程勘察设计院有限公司 Multi-source heterogeneous data fusion method and system for digital twin
CN117688308A (en) * 2024-01-26 2024-03-12 中国人民解放军军事科学院系统工程研究院 Intelligent cleaning method and system for heterogeneous data
CN118229474A (en) * 2024-05-24 2024-06-21 中山大学 College data application management method and system based on data center

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101382942A (en) * 2008-10-27 2009-03-11 浙江大学 Information system data integration method orienting service cooperation based on noumenon
US20110066590A1 (en) * 2009-09-14 2011-03-17 International Business Machines Corporation Analytics integration workbench within a comprehensive framework for composing and executing analytics applications in business level languages
CN102073725A (en) * 2011-01-11 2011-05-25 百度在线网络技术(北京)有限公司 Method for searching structured data and search engine system for implementing same
US20110145286A1 (en) * 2009-12-15 2011-06-16 Chalklabs, Llc Distributed platform for network analysis
CN103838826A (en) * 2014-01-23 2014-06-04 北京东方泰坦科技股份有限公司 Integration method of dynamic heterogeneous space information plotting data
CN104572626A (en) * 2015-01-23 2015-04-29 北京云知声信息技术有限公司 Automatic semantic template generation method and device and semantic analysis method and system
CN104933095A (en) * 2015-05-22 2015-09-23 中国电子科技集团公司第十研究所 Heterogeneous information universality correlation analysis system and analysis method thereof
CN105701181A (en) * 2016-01-06 2016-06-22 中电科华云信息技术有限公司 Dynamic heterogeneous metadata acquisition method and system
CN105893612A (en) * 2016-04-26 2016-08-24 中国科学院信息工程研究所 Consistency expression method for multi-source heterogeneous big data

Cited By (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107679141A (en) * 2017-09-25 2018-02-09 上海壹账通金融科技有限公司 Data storage method, device, equipment and computer-readable recording medium
CN107992510A (en) * 2017-10-17 2018-05-04 广州智聚行科技有限公司 Wisdom study computational methods based on multi-source heterogeneous data analysis
CN107766556A (en) * 2017-11-03 2018-03-06 福建工程学院 A kind of interactive Ontology Matching method and computer equipment based on evolution algorithm
CN107766556B (en) * 2017-11-03 2021-07-30 福建工程学院 Interactive ontology matching method based on evolutionary algorithm and computer equipment
CN110019228B (en) * 2017-12-25 2022-08-09 北京金风科创风电设备有限公司 Multi-source data integration method and device based on fan data
CN110019228A (en) * 2017-12-25 2019-07-16 北京金风科创风电设备有限公司 Multi-source data integration method and device based on fan data
CN108198595B (en) * 2018-01-18 2022-05-03 北京化工大学 Multi-source heterogeneous unstructured medical record data fusion method
CN108198595A (en) * 2018-01-18 2018-06-22 北京化工大学 A kind of multi-source heterogeneous unstructured medical record data fusion method
CN108449407A (en) * 2018-03-14 2018-08-24 中煤科工集团重庆研究院有限公司 Multi-source heterogeneous coal mine safety monitoring data acquisition method
CN108449407B (en) * 2018-03-14 2021-03-23 中煤科工集团重庆研究院有限公司 Multi-source heterogeneous coal mine safety monitoring data acquisition method
CN108536796A (en) * 2018-04-02 2018-09-14 北京大学 A kind of isomery Ontology Matching method and system based on figure
CN110377598A (en) * 2018-04-11 2019-10-25 西安邮电大学 A kind of multi-source heterogeneous date storage method based on intelligence manufacture process
CN110377598B (en) * 2018-04-11 2023-04-07 西安邮电大学 Multi-source heterogeneous data storage method based on intelligent manufacturing process
CN108959395B (en) * 2018-06-04 2020-11-06 广西大学 Multi-source heterogeneous big data oriented hierarchical reduction combined cleaning method
CN109492059A (en) * 2019-01-03 2019-03-19 北京理工大学 A kind of multi-source heterogeneous data fusion and Modifying model process management and control method
CN109492059B (en) * 2019-01-03 2020-10-27 北京理工大学 Multi-source heterogeneous data fusion and model correction process control method
CN109919469A (en) * 2019-02-27 2019-06-21 浪潮软件集团有限公司 A kind of holography science data processing method
CN110515926A (en) * 2019-08-28 2019-11-29 国网天津市电力公司 Heterogeneous data source mass data carding method based on participle and semantic dependency analysis
CN110765166A (en) * 2019-10-23 2020-02-07 山东浪潮通软信息科技有限公司 Method, device and medium for managing data
CN110990391A (en) * 2019-12-04 2020-04-10 中山市凯能集团有限公司 Integration method and system of multi-source heterogeneous data, computer equipment and storage medium
CN111552685A (en) * 2019-12-27 2020-08-18 广东电网有限责任公司电力科学研究院 Spark-based electric energy quality data cleaning method and device
CN111552685B (en) * 2019-12-27 2022-02-15 广东电网有限责任公司电力科学研究院 Spark-based electric energy quality data cleaning method and device
CN110781202B (en) * 2020-01-02 2020-04-21 广州欧赛斯信息科技有限公司 Intelligent data collection method and system for textbook teaching quality information
CN110781202A (en) * 2020-01-02 2020-02-11 广州欧赛斯信息科技有限公司 Intelligent data collection method and system for textbook teaching quality information
CN111291029B (en) * 2020-01-17 2024-03-08 深圳市华傲数据技术有限公司 Data cleaning method and device
CN111291029A (en) * 2020-01-17 2020-06-16 深圳市华傲数据技术有限公司 Data cleaning method and device
CN111752723B (en) * 2020-06-06 2021-05-04 中国科学院电子学研究所苏州研究院 Visual multi-source service management system and implementation method thereof
CN111752723A (en) * 2020-06-06 2020-10-09 中国科学院电子学研究所苏州研究院 Visual multi-source service management system and implementation method thereof
CN111695000A (en) * 2020-06-16 2020-09-22 山东蓝海领航大数据发展有限公司 Multi-source big data loading method and system
CN112100457A (en) * 2020-09-22 2020-12-18 国网辽宁省电力有限公司电力科学研究院 Multi-source heterogeneous data integration method based on metadata
CN112528083A (en) * 2020-12-10 2021-03-19 天津(滨海)人工智能军民融合创新中心 Message customization method based on distributed semantic template distribution
CN112650745A (en) * 2020-12-30 2021-04-13 中科环森智慧科技(苏州)有限公司 Data management system based on unified data resource pool
CN113987131B (en) * 2021-11-11 2022-08-23 江苏天汇空间信息研究院有限公司 Heterogeneous multi-source data correlation analysis system and method
CN113987131A (en) * 2021-11-11 2022-01-28 江苏天汇空间信息研究院有限公司 Heterogeneous multi-source data correlation analysis system and method
CN114154572A (en) * 2021-12-02 2022-03-08 辽宁铭钉科技有限公司 Heterogeneous data centralized access analysis method based on heterogeneous platform
CN114661810B (en) * 2022-05-24 2022-08-16 国网浙江省电力有限公司杭州供电公司 Lightweight multi-source heterogeneous data fusion method and system
CN114661810A (en) * 2022-05-24 2022-06-24 国网浙江省电力有限公司杭州供电公司 Lightweight multi-source heterogeneous data fusion method and system
CN116894152A (en) * 2023-09-11 2023-10-17 山东唐和智能科技有限公司 Multisource data investigation and real-time analysis method
CN116894152B (en) * 2023-09-11 2023-12-12 山东唐和智能科技有限公司 Multisource data investigation and real-time analysis method
CN117056867A (en) * 2023-10-12 2023-11-14 中交第四航务工程勘察设计院有限公司 Multi-source heterogeneous data fusion method and system for digital twin
CN117056867B (en) * 2023-10-12 2024-01-23 中交第四航务工程勘察设计院有限公司 Multi-source heterogeneous data fusion method and system for digital twin
CN117688308A (en) * 2024-01-26 2024-03-12 中国人民解放军军事科学院系统工程研究院 Intelligent cleaning method and system for heterogeneous data
CN118229474A (en) * 2024-05-24 2024-06-21 中山大学 College data application management method and system based on data center
CN118229474B (en) * 2024-05-24 2024-08-09 中山大学 College data application management method and system based on data center

Also Published As

Publication number Publication date
CN107193858B (en) 2018-09-11

Similar Documents

Publication Publication Date Title
CN107193858B (en) Intelligent Service application platform and method towards multi-source heterogeneous data fusion
Gaines et al. Knowledge acquisition tools based on personal construct psychology
CN107609052A (en) A kind of generation method and device of the domain knowledge collection of illustrative plates based on semantic triangle
CN108959431A (en) Label automatic generation method, system, computer readable storage medium and equipment
CN112463980A (en) Intelligent plan recommendation method based on knowledge graph
Chang et al. Product concept evaluation and selection using data mining and domain ontology in a crowdsourcing environment
CN104239513A (en) Semantic retrieval method oriented to field data
CN108874783A (en) Power information O&M knowledge model construction method
Khoo et al. An investigation on a prototype customer-oriented information system for product concept development
CN113946686A (en) Electric power marketing knowledge map construction method and system
Xu et al. Novel model of e-commerce marketing based on big data analysis and processing
CN104317853B (en) A kind of service cluster construction method based on Semantic Web
Ke et al. Discovering e-commerce user groups from online comments: An emotional correlation analysis-based clustering method
Kim et al. Customer preference analysis based on SNS data
Jyothi et al. A study on big data modelling techniques
Zhou [Retracted] Sports Economic Mining Algorithm Based on Association Analysis and Big Data Model
CN102426578B (en) Method for measuring fuzzy similarity of ontology concept in intelligent semantic web
Ye et al. Neo-Chinese style furniture design based on semantic analysis and connection
Xu Packaging design method of modern cultural and creative products based on rough set theory
CN107480241A (en) Method is recommended by a kind of similar enterprise based on potential theme
Liang Allocation of multi-dimensional distance learning resource based on MOOC data
Diao et al. Optimization of Management Mode of Small‐and Medium‐Sized Enterprises Based on Decision Tree Model
Qingjie et al. Research on domain knowledge graph based on the large scale online knowledge fragment
CN113868322A (en) Semantic structure analysis method, device and equipment, virtualization system and medium
CN104102654B (en) A kind of method and device of words clustering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant