CN110297946A - A storage method for massive uncertain XML data - Google Patents

A storage method for massive uncertain XML data

Info

Publication number
CN110297946A
Authority
CN
China
Prior art keywords
uncertain
xml
data
massive
database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910644221.5A
Other languages
Chinese (zh)
Inventor
刘健
龚蕾蕾
张蕾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-07-17
Filing date
2019-07-17
Publication date
2019-10-01
Application filed by Harbin Institute of Technology
Priority to CN201910644221.5A
Publication of CN110297946A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/80 Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
    • G06F16/81 Indexing, e.g. XML tags; Data structures therefor; Storage structures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a storage method for massive uncertain XML data, comprising the following steps: A, extracting the massive uncertain XML data; B, constructing a representation model of massive uncertain XML data based on a column database; C, storing the massive uncertain XML data based on an uncertain column database. By identifying the multi-granularity uncertainty of data in uncertain XML documents, the invention studies a representation method for uncertain XML data based on a column database. The invention also studies storage models for massive uncertain XML data, covering both the case where the schema (XML DTD/Schema) is known and the case where the schema is unknown. The invention thus develops representation and mapping storage methods for massive uncertain XML data, in order to solve the problem that massive uncertain XML data is difficult to manage.

Description

A storage method for massive uncertain XML data
Technical field
The present invention relates to the technical field of XML data storage, and specifically to a storage method for massive uncertain XML data.
Background art
Possible-world theory and possibility theory are methods for the quantitative representation of imprecise and uncertain information. Since they were proposed, they have been widely used to extend database models so that imprecise and uncertain information can be represented and processed. In recent years, with the development of computer technology and the deepening of its applications, research on using possible-world theory and possibility theory to extend conceptual data models has attracted the attention of researchers, and shows a trend toward combination with XML databases. To represent uncertain information in the XML data model, research at home and abroad on extending the semantic expressiveness of the XML data model has gradually increased, and a variety of extended XML data models and corresponding mapping storage methods have been proposed, including probabilistic XML data models based on possible-world theory and fuzzy XML data models based on possibility theory. For example, one approach uses a probabilistic tree to represent the integration of multiple classical XML documents, introduces a data fusion strategy based on user feedback, and on this basis gives a probabilistic XML integration scheme.
Data storage is the foundation of database system research and an important factor affecting system efficiency. Existing literature on uncertain XML storage falls into two main categories: uncertain XML data storage based on traditional databases (such as relational databases), and native XML data storage. For example, one work addresses the forms in which multi-granularity fuzziness of data appears in XML documents, proposes a formal definition of fuzzy XML DTD, and studies a fuzzy XML storage method based on relational databases for the case where the XML DTD schema is known; another work, based on an undirected graph model, proposes an unordered edge mapping method that realizes relational database storage of probabilistic XML data. With the continuous development of Internet technology, the scale of Web data has grown rapidly from the GB and TB level to the PB level. Faced with such rapidly growing big data, academia and industry have proposed cloud computing technology to support the storage and management of big data. Since cloud computing was proposed, open-source big data management platforms implemented on the basis of the GFS (Google File System) distributed file system model, the MapReduce distributed computing platform and the BigTable distributed database model have become among the most widely used data computing and storage models in cloud computing research and application, owing to technical characteristics such as high-performance and scalable mass data storage and computing capability, high fault tolerance, support for heterogeneous environments, and low cost of use. Using column databases to store large-scale data has consequently become a focus of big data research.
Judging from the existing literature, research at home and abroad on uncertain XML representation models and storage methods still mainly concerns traditional lightweight data, and research on storage and representation models for massive uncertain XML data is basically still at an early stage. Because traditional data storage methods struggle to cope with explosively growing mass data, how to effectively store and represent massive uncertain XML data remains one of the problems facing current research.
Summary of the invention
The purpose of the present invention is to provide a storage method for massive uncertain XML data, so as to solve the problems raised in the background art above.
To achieve the above purpose, the present invention provides the following technical solution: a storage method for massive uncertain XML data, comprising the following steps:
A. extracting massive uncertain XML data;
B. constructing a representation model of massive uncertain XML data based on a column database;
C. storing the massive uncertain XML data based on an uncertain column database.
Preferably, the detailed process of step A is as follows:
a. obtaining the data to be stored;
b. parsing the data, wherein the parsed data contains at least one key-value pair, the key denoting a field in the data to be stored and the value denoting the data value of that field;
c. checking the validity, legitimacy and integrity of the fields against a preset Extensible Markup Language (XML) file, wherein the preset XML file contains the definitions of all fields that can be stored and the fields that the parsed data must contain;
d. storing the data that passes the validity, legitimacy and integrity checks in the database.
Preferably, the construction process in step B is as follows:
a. identifying the multi-granularity uncertainty of uncertain XML data;
b. giving the representation method of uncertain XML data in a column database and the definition of the uncertain column database model;
c. establishing the representation model of massive uncertain XML data based on the uncertain column database.
Preferably, the implementation process in step C is as follows:
i) for massive uncertain XML data whose schema is known, designing its storage model in the uncertain column database;
ii) for massive uncertain XML data whose schema is unknown, designing its storage model in the uncertain column database;
iii) for the massive uncertain data in the uncertain column database, realizing the storage transformation model from the uncertain column database to the uncertain XML database.
Preferably, the detailed process of step i) is as follows:
(a) obtaining the leaf elements, non-leaf elements and attribute information in the uncertain XML database schema;
(b) designing the corresponding column database tables with the non-leaf element as the basic unit of division, and designing the corresponding columns according to the leaf elements and attribute information nested in each non-leaf element unit.
Preferably, the detailed process of step ii) is as follows:
(a) identifying the data entities in the uncertain XML database and extracting the path information of the nodes of the uncertain XML data tree;
(b) designing the corresponding column database tables with the data entity as the basic unit of division, designing the corresponding columns of the column database according to the data tree path information in each data entity unit, and giving the mapping rules between the data entities and path information of the massive uncertain XML database and the columns of the column database;
(c) establishing the mapping model between the uncertain XML database and the column database for the case where the schema is unknown.
Preferably, the detailed process of step iii) is as follows:
(a) constructing, according to the primary key and columns of each table in the uncertain column database, multiple uncertain XML trees each rooted at a non-leaf node;
(b) splicing the uncertain XML trees according to the data association information in the uncertain column database, thereby generating the complete uncertain XML tree.
Compared with the prior art, the beneficial effects of the present invention are as follows. By identifying the multi-granularity uncertainty of data in uncertain XML documents, the invention studies a representation method for uncertain XML data based on a column database. The invention also studies storage models for massive uncertain XML data, covering both the case where the schema (XML DTD/Schema) is known and the case where the schema is unknown. The invention thus develops representation and mapping storage methods for massive uncertain XML data, in order to solve the problem that massive uncertain XML data is difficult to manage.
Brief description of the drawings
Fig. 1 is a flow chart of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, the present invention provides a technical solution: a storage method for massive uncertain XML data, comprising the following steps:
A. extracting massive uncertain XML data;
B. constructing a representation model of massive uncertain XML data based on a column database;
C. storing the massive uncertain XML data based on an uncertain column database.
In the present invention, the detailed process of step A is as follows:
a. obtaining the data to be stored;
b. parsing the data, wherein the parsed data contains at least one key-value pair, the key denoting a field in the data to be stored and the value denoting the data value of that field;
c. checking the validity, legitimacy and integrity of the fields against a preset Extensible Markup Language (XML) file, wherein the preset XML file contains the definitions of all fields that can be stored and the fields that the parsed data must contain;
d. storing the data that passes the validity, legitimacy and integrity checks in the database.
The data to be stored is parsed, the validity, legitimacy and integrity of its fields are checked against the preset XML file, and the data that passes all three checks is stored in the database. As a result, if the data model needs to change while data is being stored in the database, no code has to be rewritten; configuring the XML file is sufficient to store the data. This is simple and convenient, and it reduces the cost of maintaining and modifying the project.
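The checking part of step A can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the patent does not define the format of the preset XML file or the exact meaning of the three checks, so the `<field>` layout, the `name`/`type`/`required` attributes and the reading of validity (the field is defined), legitimacy (the value fits the declared type) and integrity (every required field is present) are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical preset XML file: lists every storable field and marks required ones.
PRESET_XML = """
<fields>
  <field name="id" type="int" required="true"/>
  <field name="title" type="str" required="true"/>
  <field name="prob" type="float" required="false"/>
</fields>
"""

# Assumed mapping from declared type names to Python converters.
CASTS = {"int": int, "float": float, "str": str}


def load_field_defs(xml_text):
    """Return {field name: (type name, required flag)} from the preset XML."""
    root = ET.fromstring(xml_text)
    return {
        f.get("name"): (f.get("type"), f.get("required") == "true")
        for f in root.findall("field")
    }


def check_record(record, defs):
    """Return a list of problems; an empty list means the record may be stored."""
    problems = []
    for key, value in record.items():
        if key not in defs:  # validity (assumed meaning): the field must be defined
            problems.append(f"unknown field: {key}")
            continue
        type_name, _ = defs[key]
        try:  # legitimacy (assumed meaning): the value must fit the declared type
            CASTS[type_name](value)
        except (TypeError, ValueError):
            problems.append(f"bad value for {key}: {value!r}")
    for name, (_, required) in defs.items():  # integrity: required fields present
        if required and name not in record:
            problems.append(f"missing required field: {name}")
    return problems
```

With this layout, adding a storable field means adding one `<field>` line to the preset file rather than changing code, which is the maintenance benefit the description claims.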
In the present invention, the construction process in step B is as follows:
a. identifying the multi-granularity uncertainty of uncertain XML data;
b. giving the representation method of uncertain XML data in a column database and the definition of the uncertain column database model;
c. establishing the representation model of massive uncertain XML data based on the uncertain column database.
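The patent does not spell out the uncertain column database model, so the following Python sketch only illustrates one way uncertainty at two granularities could be attached to BigTable-style rows and cells. The class names, the `family:qualifier` column naming, the choice of probabilities as the uncertainty measure and the independence assumption in `value_prob` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UncertainCell:
    """One cell of a row, carrying value-level uncertainty (assumed model)."""
    row_key: str       # identifies one element instance (one row)
    column: str        # "family:qualifier" naming, assumed for illustration
    value: str
    prob: float = 1.0  # probability that this value holds; 1.0 means certain


@dataclass
class UncertainRow:
    """One row, carrying element-level uncertainty (assumed model)."""
    row_key: str
    element_prob: float  # probability that the element itself exists
    cells: list

    def value_prob(self, column, value):
        """Probability that the element exists and `column` holds `value`,
        assuming the two events are independent."""
        for cell in self.cells:
            if cell.column == column and cell.value == value:
                return self.element_prob * cell.prob
        return 0.0
```

Here the row-level probability models uncertainty about whether an element exists at all, while the cell-level probability models uncertainty about a particular leaf or attribute value; a fuzzy (possibility-based) variant would replace the product with a different combination rule.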
In the present invention, the implementation process in step C is as follows:
i) for massive uncertain XML data whose schema is known, designing its storage model in the uncertain column database;
ii) for massive uncertain XML data whose schema is unknown, designing its storage model in the uncertain column database;
iii) for the massive uncertain data in the uncertain column database, realizing the storage transformation model from the uncertain column database to the uncertain XML database.
In the present invention, the detailed process of step i) is as follows:
(a) obtaining the leaf elements, non-leaf elements and attribute information in the uncertain XML database schema;
(b) designing the corresponding column database tables with the non-leaf element as the basic unit of division, and designing the corresponding columns according to the leaf elements and attribute information nested in each non-leaf element unit.
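The mapping rule of step i) can be illustrated with a short Python sketch. Since the Python standard library has no DTD parser, the known schema is represented here by a hypothetical skeleton document (`SCHEMA_SKELETON`) that shows each element and attribute once; the `attr:` and `leaf:` column prefixes are likewise naming assumptions, not part of the patent.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a known DTD/Schema: each element and attribute appears once.
SCHEMA_SKELETON = """
<library>
  <book id="" prob="">
    <title/>
    <author>
      <name/>
      <email/>
    </author>
  </book>
</library>
"""


def design_tables(skeleton_xml):
    """Map each non-leaf element to a table, and its attributes and
    leaf child elements to the columns of that table."""
    tables = {}

    def visit(elem):
        children = list(elem)
        if not children:  # leaf elements become columns of their parent's table
            return
        columns = [f"attr:{name}" for name in sorted(elem.attrib)]
        columns += [f"leaf:{child.tag}" for child in children if len(child) == 0]
        tables[elem.tag] = columns
        for child in children:
            visit(child)

    visit(ET.fromstring(skeleton_xml))
    return tables
```

For the skeleton above this yields three tables, `library`, `book` and `author`; the `book` table gets the columns `attr:id`, `attr:prob` and `leaf:title`, while the nested `author` element becomes its own table. How rows of a child table refer back to their parent row is not covered by this sketch.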
In the present invention, the detailed process of step ii) is as follows:
(a) identifying the data entities in the uncertain XML database and extracting the path information of the nodes of the uncertain XML data tree;
(b) designing the corresponding column database tables with the data entity as the basic unit of division, designing the corresponding columns of the column database according to the data tree path information in each data entity unit, and giving the mapping rules between the data entities and path information of the massive uncertain XML database and the columns of the column database;
(c) establishing the mapping model between the uncertain XML database and the column database for the case where the schema is unknown.
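When no schema is available, the tables have to be derived from the documents themselves. The Python sketch below shows one possible reading of step ii): the patent does not say how data entities are identified, so treating every child of the document root as an entity is an assumption, as are the sample document, the `tag#n` row keys and the `@name` notation for attribute paths.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document with no schema attached.
DOC = """
<catalog>
  <book prob="0.8"><title>XML</title><author><name>Liu</name></author></book>
  <book><title>HBase</title></book>
  <journal><title>TKDE</title></journal>
</catalog>
"""


def leaf_paths(elem, prefix=""):
    """Yield (path, value) for every attribute and leaf element under `elem`.
    Repeated sibling tags share one path, so later values overwrite earlier
    ones when the pairs are collected into a dict (a known simplification)."""
    for name, value in sorted(elem.attrib.items()):
        yield f"{prefix}@{name}", value
    for child in elem:
        path = f"{prefix}{child.tag}"
        if len(child) == 0:
            yield path, (child.text or "").strip()
        # Attributes and descendants of the child, if it has any.
        yield from leaf_paths(child, path + "/")


def map_entities(xml_text):
    """Return {table: {row key: {column path: value}}}.
    Assumption: each child of the root is one data entity, and entities
    with the same tag share one table."""
    tables = defaultdict(dict)
    counters = defaultdict(int)
    for entity in ET.fromstring(xml_text):
        counters[entity.tag] += 1
        row_key = f"{entity.tag}#{counters[entity.tag]}"
        tables[entity.tag][row_key] = dict(leaf_paths(entity))
    return dict(tables)
```

Because the columns are just path strings, two `book` rows may carry different column sets, which suits a column database where rows in one table need not share the same columns.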
In the present invention, the detailed process of step iii) is as follows:
(a) constructing, according to the primary key and columns of each table in the uncertain column database, multiple uncertain XML trees each rooted at a non-leaf node;
(b) splicing the uncertain XML trees according to the data association information in the uncertain column database, thereby generating the complete uncertain XML tree.
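The two sub-steps of step iii) can be sketched in Python as follows. The table contents, the `attr:`/`leaf:` column prefixes and the `parent` column used as the data association information are hypothetical; the patent does not specify how associations between tables are recorded.

```python
import xml.etree.ElementTree as ET

# Hypothetical table contents: row key -> columns.
# The "parent" column stands in for the data association information.
TABLES = {
    "book": {
        "book#1": {"attr:prob": "0.8", "leaf:title": "XML", "parent": None},
    },
    "author": {
        "author#1": {"leaf:name": "Liu", "parent": "book#1"},
    },
}


def build_subtrees(tables):
    """Sub-step (a): one XML subtree per row, rooted at the row's non-leaf element."""
    subtrees, parents = {}, {}
    for table, rows in tables.items():
        for key, cols in rows.items():
            elem = ET.Element(table)
            for col, value in cols.items():
                kind, _, name = col.partition(":")
                if kind == "attr":
                    elem.set(name, value)
                elif kind == "leaf":
                    ET.SubElement(elem, name).text = value
            subtrees[key] = elem
            parents[key] = cols.get("parent")
    return subtrees, parents


def splice(subtrees, parents):
    """Sub-step (b): attach each subtree to its parent; return the remaining roots."""
    roots = []
    for key, elem in subtrees.items():
        parent = parents[key]
        if parent is None:
            roots.append(elem)
        else:
            subtrees[parent].append(elem)
    return roots
```

Calling `splice(*build_subtrees(TABLES))` returns a single `book` element whose `author` child has been attached from the second table. The sketch restores structure only; carrying probabilities back into the tree would follow whatever uncertainty representation the column model uses.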
In conclusion, by identifying the multi-granularity uncertainty of data in uncertain XML documents, the present invention studies a representation method for uncertain XML data based on a column database. The invention also studies storage models for massive uncertain XML data, covering both the case where the schema (XML DTD/Schema) is known and the case where the schema is unknown. The invention thus develops representation and mapping storage methods for massive uncertain XML data, in order to solve the problem that massive uncertain XML data is difficult to manage.
It will be apparent to those skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the invention can be realized in other specific forms without departing from its spirit or essential characteristics. The embodiments are therefore to be regarded in all respects as illustrative and not restrictive, the scope of the invention being defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalency of the claims are intended to be embraced within the invention. No reference sign in the claims should be construed as limiting the claim concerned.

Claims (7)

1. A storage method for massive uncertain XML data, characterized by comprising the following steps:
A. extracting massive uncertain XML data;
B. constructing a representation model of massive uncertain XML data based on a column database;
C. storing the massive uncertain XML data based on an uncertain column database.
2. The storage method for massive uncertain XML data according to claim 1, characterized in that the detailed process of step A is as follows:
a. obtaining the data to be stored;
b. parsing the data, wherein the parsed data contains at least one key-value pair, the key denoting a field in the data to be stored and the value denoting the data value of that field;
c. checking the validity, legitimacy and integrity of the fields against a preset Extensible Markup Language (XML) file, wherein the preset XML file contains the definitions of all fields that can be stored and the fields that the parsed data must contain;
d. storing the data that passes the validity, legitimacy and integrity checks in the database.
3. The storage method for massive uncertain XML data according to claim 1, characterized in that the construction process in step B is as follows:
a. identifying the multi-granularity uncertainty of uncertain XML data;
b. giving the representation method of uncertain XML data in a column database and the definition of the uncertain column database model;
c. establishing the representation model of massive uncertain XML data based on the uncertain column database.
4. The storage method for massive uncertain XML data according to claim 1, characterized in that the implementation process in step C is as follows:
i) for massive uncertain XML data whose schema is known, designing its storage model in the uncertain column database;
ii) for massive uncertain XML data whose schema is unknown, designing its storage model in the uncertain column database;
iii) for the massive uncertain data in the uncertain column database, realizing the storage transformation model from the uncertain column database to the uncertain XML database.
5. The storage method for massive uncertain XML data according to claim 4, characterized in that the detailed process of step i) is as follows:
(a) obtaining the leaf elements, non-leaf elements and attribute information in the uncertain XML database schema;
(b) designing the corresponding column database tables with the non-leaf element as the basic unit of division, and designing the corresponding columns according to the leaf elements and attribute information nested in each non-leaf element unit.
6. The storage method for massive uncertain XML data according to claim 4, characterized in that the detailed process of step ii) is as follows:
(a) identifying the data entities in the uncertain XML database and extracting the path information of the nodes of the uncertain XML data tree;
(b) designing the corresponding column database tables with the data entity as the basic unit of division, designing the corresponding columns of the column database according to the data tree path information in each data entity unit, and giving the mapping rules between the data entities and path information of the massive uncertain XML database and the columns of the column database;
(c) establishing the mapping model between the uncertain XML database and the column database for the case where the schema is unknown.
7. The storage method for massive uncertain XML data according to claim 4, characterized in that the detailed process of step iii) is as follows:
(a) constructing, according to the primary key and columns of each table in the uncertain column database, multiple uncertain XML trees each rooted at a non-leaf node;
(b) splicing the uncertain XML trees according to the data association information in the uncertain column database, thereby generating the complete uncertain XML tree.
CN201910644221.5A 2019-07-17 2019-07-17 A storage method for massive uncertain XML data Pending CN110297946A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910644221.5A CN110297946A (en) 2019-07-17 2019-07-17 A storage method for massive uncertain XML data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910644221.5A CN110297946A (en) 2019-07-17 2019-07-17 A storage method for massive uncertain XML data

Publications (1)

Publication Number Publication Date
CN110297946A true CN110297946A (en) 2019-10-01

Family

ID=68031330

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910644221.5A Pending CN110297946A (en) 2019-07-17 2019-07-17 A storage method for massive uncertain XML data

Country Status (1)

Country Link
CN (1) CN110297946A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521325A (en) * 2011-12-02 2012-06-27 西北工业大学 XML (Extensive Makeup Language) structural similarity measuring method based on frequency-associated tag sequence
CN103020262A (en) * 2012-12-24 2013-04-03 Tcl集团股份有限公司 Data storage method, system and data storage equipment
KR20160139693A (en) * 2015-05-28 2016-12-07 목포대학교산학협력단 Shipdex Document Modeling Based on HBase Store Structure for Ship Materials

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
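JIAN LIU et al.: "Enabling Massive XML-Based Biological Data Management in HBase", IEEE *
Wang Yucao (王玉操): "Research and Implementation of a Massive XML Document Storage and Retrieval Platform", China Master's Theses Full-text Database, Information Science and Technology *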

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191001