CN109981698A - Metadata-based standardized system and method for cross-domain data access in the Internet of Data - Google Patents

Metadata-based standardized system and method for cross-domain data access in the Internet of Data

Info

Publication number
CN109981698A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711448756.2A
Other languages
Chinese (zh)
Other versions
CN109981698B (en)
Inventor
鄂海红
宋美娜
段云峰
刘庆
王赟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bo Motomori Wo Information Technology (beijing) Co Ltd
Original Assignee
Bo Motomori Wo Information Technology (beijing) Co Ltd
Application filed by Bo Motomori Wo Information Technology (beijing) Co Ltd
Priority to CN201711448756.2A
Publication of CN109981698A
Application granted
Publication of CN109981698B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/561 Adding application-functional data or data for application control, e.g. adding metadata

Abstract

The invention discloses a metadata-based standardized system and method for cross-domain data access in the Internet of Data. In the system, the resource management layer external interworking protocol includes standardized Hive access based on HiveMetaDataProtocol; the data storage layer external interworking protocol includes standardized HBase access based on HBaseMetaDataProtocol; and the file storage layer external interworking protocol includes standardized access to HDFS data based on HdfsMetaDataProtocol. These external interworking protocols are integrated to obtain a metadata-based data access protocol, thereby standardizing data access. The system performs data access based on metadata, enables digital resources to be circulated and shared, and, within the Internet of Data technical architecture, makes data access transparent at every layer of cross-domain big data centers and provides access protocols between layers.

Description

Metadata-based standardized system and method for cross-domain data access in the Internet of Data
Technical field
The present invention relates to the field of big data technology, and in particular to a metadata-based standardized system and method for cross-domain data access in the Internet of Data.
Background art
With the rapid development of the Internet of Things, the mobile Internet, and social networks, enterprise data is growing quickly, semi-structured and unstructured data is growing geometrically, and business requirements are becoming correspondingly more complex, all of which poses greater challenges to the Internet industry. The "Internet of Data" emerged in response. "Internet of Data" is the general name for the overall technical architecture and concrete implementation of China's next-generation big data infrastructure. It uses the Internet as the bearer network and exchanges and interconnects data through standard formats, interfaces, and protocols, so that once data is connected, business applications can draw on richer data dimensions. The Internet of Data is built on the Internet, and the TCP/IP protocol at the bottom of the Internet remains the underlying transport protocol, but the Internet of Data adds a data interconnection layer that addresses the standardized representation of data, standard protocols for data transmission, standards for data exchange, standard interfaces for data applications, and the standardization of data access terminals. Data is transmitted transparently, and applications on the Internet of Data can obtain data directly and build data-driven business applications on it. The Internet of Data aims to promote an Internet in which data is connected to data.
In the related art, Hadoop big data clusters are widely used to store and mine massive data in order to stay competitive and develop sustainably. Effectively accessing data from multiple sources across multiple big data clusters, such as the HDFS file system (Hadoop Distributed File System), HBase and other NoSQL (Not Only SQL, non-relational) databases, and the Hive database, and achieving transparent, efficient interoperation across the layers of multiple Hadoop clusters, has become an urgent need in the communications industry. The related art, however, cannot make data access transparent across the layers (data storage layer, computing engine layer, and so on) of cross-domain big data centers, and this problem remains to be solved.
Summary of the invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, one object of the present invention is to provide a metadata-based standardized system for cross-domain data access in the Internet of Data. The system standardizes data access and solves the problem of accessing data across domains in big data environments.
Another object of the present invention is to provide a metadata-based standardized method for cross-domain data access in the Internet of Data.
To achieve the above objects, an embodiment of one aspect of the present invention provides a metadata-based standardized system for cross-domain data access in the Internet of Data, comprising: a resource management layer external interworking protocol for uniform access to Hive databases, which includes standardized Hive access based on HiveMetaDataProtocol; a data storage layer external interworking protocol for uniform access to HBase databases, which includes standardized HBase access based on HBaseMetaDataProtocol; and a file storage layer external interworking protocol for uniform access to the HDFS file system, which includes standardized access to HDFS data based on HdfsMetaDataProtocol; wherein the resource management layer, data storage layer, and file storage layer external interworking protocols are integrated to obtain a metadata-based data access protocol, thereby standardizing data access.
The system of the embodiment of the present invention integrates the resource management layer, data storage layer, and file storage layer external interworking protocols into a metadata-based data access protocol. This standardizes data access, enables digital resources to be circulated and shared, and, within the Internet of Data technical architecture, makes data access transparent at every layer of cross-domain big data centers and provides access protocols between layers.
In addition, the metadata-based standardized system for cross-domain data access according to the above embodiment of the present invention may also have the following additional technical features:
Further, in one embodiment of the present invention, the file storage layer external interworking protocol is further used to: send a fetch instruction to obtain the relevant file; obtain the file's current storage location in each provincial cluster by accessing the metadata warehouse; splice the queried information for each province onto the HdfsMetaDataProtocol protocol header, so that the spliced result locates the file in the HDFS of any province's cluster; and distribute the file request command to each province using HdfsMetaDataProtocol, thereby accessing each province's data.
Further, in one embodiment of the present invention, the resource management layer external interworking protocol is further used to: issue a command to access Hive data, using the HiveMetaDataProtocol protocol to operate on any table of any central database; obtain, by accessing the metadata warehouse, the metadata describing where that table currently resides in each province; use the JDBC access mode; splice the queried information for each province onto the HiveMetaDataProtocol protocol header, so that the spliced result locates the table in the Hive of any province's cluster; and distribute the Hive request command to each province using HiveMetaDataProtocol, thereby accessing each province's data.
Further, in one embodiment of the present invention, the data storage layer external interworking protocol is further used to: issue a command to access HBase data, using the HBaseMetaDataProtocol protocol to operate on any central table; query the central metadata database to obtain the metadata describing where that table currently resides in each province; access the table using the Scan class; splice the queried information for each province onto the HBaseMetaDataProtocol protocol header, so that the spliced result locates the table in the HBase of any province's cluster; and distribute the HBase request command to each province using HBaseMetaDataProtocol, thereby accessing each province's data.
Further, in one embodiment of the present invention, the system further comprises: an upper application layer external interworking protocol, a data analysis layer external interworking protocol, and a computing engine layer external interworking protocol.
To achieve the above objects, an embodiment of another aspect of the present invention provides a metadata-based standardized method for cross-domain data access in the Internet of Data, comprising the following steps: uniformly accessing Hive databases through the resource management layer external interworking protocol, which includes standardized Hive access based on HiveMetaDataProtocol; uniformly accessing HBase databases through the data storage layer external interworking protocol, which includes standardized HBase access based on HBaseMetaDataProtocol; uniformly accessing the HDFS file system through the file storage layer external interworking protocol, which includes standardized access to HDFS data based on HdfsMetaDataProtocol; and integrating the resource management layer, data storage layer, and file storage layer external interworking protocols to generate a metadata-based data access protocol, thereby standardizing data access.
The method of the embodiment of the present invention integrates the resource management layer, data storage layer, and file storage layer external interworking protocols into a metadata-based data access protocol. This standardizes data access, enables digital resources to be circulated and shared, and, within the Internet of Data technical architecture, makes data access transparent at every layer of cross-domain big data centers and provides access protocols between layers.
In addition, the metadata-based standardized method for cross-domain data access according to the above embodiment of the present invention may also have the following additional technical features:
Further, in one embodiment of the present invention, uniformly accessing the HDFS file system through the file storage layer external interworking protocol further comprises: sending a fetch instruction to obtain the relevant file; obtaining the file's current storage location in each provincial cluster by accessing the metadata warehouse; splicing the queried information for each province onto the HdfsMetaDataProtocol protocol header, so that the spliced result locates the file in the HDFS of any province's cluster; and distributing the file request command to each province using HdfsMetaDataProtocol, thereby accessing each province's data.
Further, in one embodiment of the present invention, uniformly accessing Hive databases through the resource management layer external interworking protocol further comprises: issuing a command to access Hive data, using the HiveMetaDataProtocol protocol to operate on any table of any central database; obtaining, by accessing the metadata warehouse, the metadata describing where that table currently resides in each province; using the JDBC access mode; splicing the queried information for each province onto the HiveMetaDataProtocol protocol header, so that the spliced result locates the table in the Hive of any province's cluster; and distributing the Hive request command to each province using HiveMetaDataProtocol, thereby accessing each province's data.
Further, in one embodiment of the present invention, uniformly accessing HBase databases through the data storage layer external interworking protocol further comprises: issuing a command to access HBase data, using the HBaseMetaDataProtocol protocol to operate on any central table; querying the central metadata database to obtain the metadata describing where that table currently resides in each province; accessing the table using the Scan class; splicing the queried information for each province onto the HBaseMetaDataProtocol protocol header, so that the spliced result locates the table in the HBase of any province's cluster; and distributing the HBase request command to each province using HBaseMetaDataProtocol, thereby accessing each province's data.
Further, in one embodiment of the present invention, the method further comprises: generating the metadata-based data access protocol through an upper application layer external interworking protocol, a data analysis layer external interworking protocol, and a computing engine layer external interworking protocol.
Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from that description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken together with the accompanying drawings, in which:
Fig. 1 is a structural schematic diagram of the metadata-based standardized system for cross-domain data access in the Internet of Data according to an embodiment of the present invention;
Fig. 2 is a structural schematic diagram of the overall architecture for metadata-based standardized cross-domain data access in the Internet of Data according to one embodiment of the present invention;
Fig. 3 is a flowchart of accessing the HDFS files of each big data cluster through the HdfsMetaDataProtocol protocol according to one embodiment of the present invention;
Fig. 4 is a flowchart of accessing the Hive database of each big data cluster through the HiveMetaDataProtocol protocol according to one embodiment of the present invention;
Fig. 5 is a flowchart of accessing the HBase database of each big data cluster through the HBaseMetaDataProtocol protocol according to one embodiment of the present invention;
Fig. 6 is a flowchart of the metadata-based standardized method for cross-domain data access in the Internet of Data according to an embodiment of the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present invention and are not to be construed as limiting it.
The metadata-based standardized system and method for cross-domain data access in the Internet of Data proposed according to embodiments of the present invention are described below with reference to the drawings. The system is described first.
Fig. 1 is a structural schematic diagram of the metadata-based standardized system for cross-domain data access in the Internet of Data according to an embodiment of the present invention.
As shown in Fig. 1, the metadata-based standardized system 10 for cross-domain data access includes: a resource management layer external interworking protocol 100, a data storage layer external interworking protocol 200, and a file storage layer external interworking protocol 300.
The resource management layer external interworking protocol 100 is used for uniform access to Hive databases and includes standardized Hive access based on HiveMetaDataProtocol. The data storage layer external interworking protocol 200 is used for uniform access to HBase databases and includes standardized HBase access based on HBaseMetaDataProtocol. The file storage layer external interworking protocol 300 is used for uniform access to the HDFS file system and includes standardized access to HDFS data based on HdfsMetaDataProtocol. The resource management layer external interworking protocol 100, the data storage layer external interworking protocol 200, and the file storage layer external interworking protocol 300 are integrated to obtain a metadata-based data access protocol, thereby standardizing data access. The system 10 of the embodiment of the present invention performs data access based on metadata, enables digital resources to be circulated and shared, and, within the Internet of Data technical architecture, makes data access transparent at every layer of cross-domain big data centers and provides access protocols between layers.
It can be understood that metadata is data that describes the structure of the data in a big data platform and how that data was built. By purpose, metadata falls into two classes: technical metadata (Technical Metadata), which stores the technical details of the big data system, and business metadata (Business Metadata), which forms the semantic layer between users and the actual system. A metadata-based data access standard between cross-domain big data clusters avoids ambiguity in data access. As shown in Fig. 2, for the storage layer of the Internet of Data, the metadata-based data access protocol is as follows: the resource management layer external interworking protocol 100 includes standardized Hive access based on HiveMetaDataProtocol; the data storage layer external interworking protocol 200 includes standardized HBase access based on HBaseMetaDataProtocol; and the file storage layer external interworking protocol 300 includes standardized access to HDFS data based on HdfsMetaDataProtocol.
In addition, the data storage layer includes the HDFS file system, the Hive database, and the HBase NoSQL database. Client access to data is abstracted as access through a specific protocol: HiveMetaDataProtocol for uniform access to Hive databases, HBaseMetaDataProtocol for uniform access to HBase databases, and HdfsMetaDataProtocol for uniform access to the HDFS file system. These protocols are finally integrated into a unified metadata-based data access protocol.
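The integration of the three protocols into one access protocol can be pictured as a dispatch on the protocol header. The following is a minimal Python sketch under stated assumptions, not the patented implementation: the `BACKENDS` registry, the `route` function, and the example URIs are hypothetical names introduced only for illustration.

```python
from typing import Tuple

# Hypothetical registry: protocol header -> storage backend it standardizes.
BACKENDS = {
    "HiveMetaDataProtocol": "hive",
    "HBaseMetaDataProtocol": "hbase",
    "HdfsMetaDataProtocol": "hdfs",
}


def route(uri: str) -> Tuple[str, str]:
    """Split a request URI into (backend, logical resource)."""
    header, sep, resource = uri.partition("://")
    if not sep or header not in BACKENDS:
        raise ValueError(f"unsupported request: {uri!r}")
    return BACKENDS[header], resource


if __name__ == "__main__":
    for uri in (
        "HiveMetaDataProtocol://sales/orders",
        "HBaseMetaDataProtocol://user_profile",
        "HdfsMetaDataProtocol://logs/usage.csv",
    ):
        print(route(uri))
```

A client written against `route` only needs the logical resource name; which storage system serves the request is decided by the header alone.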
Further, in one embodiment of the present invention, the file storage layer external interworking protocol 300 is further used to: send a fetch instruction to obtain the relevant file; obtain the file's current storage location in each provincial cluster by accessing the metadata warehouse; splice the queried information for each province onto the HdfsMetaDataProtocol protocol header, so that the spliced result locates the file in the HDFS of any province's cluster; and distribute the file request command to each province using HdfsMetaDataProtocol, thereby accessing each province's data.
It can be understood that, as shown in Fig. 3, the embodiment of the present invention accesses the HDFS files of each provincial cluster through the HdfsMetaDataProtocol protocol in the following steps:
(1) Send an instruction to fetch the relevant file: HdfsMetaDataProtocol://file_path;
(2) Access the metadata warehouse to obtain the actual storage location of the file in each provincial cluster (the provincial clusters do not share a unified file system, so the location of the file differs from province to province);
(3) Splice the queried information for each province (such as the province's ID number) onto the HdfsMetaDataProtocol protocol header;
(4) The spliced result locates the file in the HDFS of a given province's cluster;
(5) Distribute the file request command to each province using HdfsMetaDataProtocol, thereby accessing each province's data.
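The steps above can be sketched in Python as follows. This is an illustration under assumptions rather than the patented implementation: the in-memory `METADATA_WAREHOUSE`, the province IDs `bj` and `gd`, the file paths, and the choice to place the province ID directly after the protocol header are all hypothetical, since the source does not specify the spliced format.

```python
from typing import Callable, Dict, List

PROTOCOL = "HdfsMetaDataProtocol"

# Hypothetical metadata warehouse: logical path -> {province ID: actual HDFS path}.
METADATA_WAREHOUSE: Dict[str, Dict[str, str]] = {
    "/logs/2017/usage.csv": {
        "bj": "/warehouse/logs/usage_2017.csv",
        "gd": "/data/raw/usage/2017.csv",
    },
}


def resolve(uri: str) -> List[str]:
    """Steps (1)-(4): map a logical URI to one province-qualified URI per cluster."""
    prefix = PROTOCOL + "://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an {PROTOCOL} request: {uri!r}")
    logical_path = "/" + uri[len(prefix):].lstrip("/")
    locations = METADATA_WAREHOUSE[logical_path]  # step (2): metadata lookup
    # Step (3): splice the province ID onto the protocol header (assumed layout).
    return [f"{prefix}{province}{path}" for province, path in sorted(locations.items())]


def distribute(uris: List[str], send: Callable[[str], str]) -> List[str]:
    """Step (5): hand each province-qualified request to a transport callback."""
    return [send(uri) for uri in uris]


if __name__ == "__main__":
    requests = resolve("HdfsMetaDataProtocol://logs/2017/usage.csv")
    print(distribute(requests, lambda uri: "sent " + uri))
```

The caller names the file once; the per-province differences in file location stay inside the metadata lookup.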
Further, in one embodiment of the present invention, the resource management layer external interworking protocol 100 is further used to: issue a command to access Hive data, using the HiveMetaDataProtocol protocol to operate on any table of any central database; obtain, by accessing the metadata warehouse, the metadata describing where that table currently resides in each province; use the JDBC access mode; splice the queried information for each province onto the HiveMetaDataProtocol protocol header, so that the spliced result locates the table in the Hive of any province's cluster; and distribute the Hive request command to each province using HiveMetaDataProtocol, thereby accessing each province's data.
It can be understood that, as shown in Fig. 4, the embodiment of the present invention accesses the Hive database of each provincial cluster through JDBC, based on the HiveMetaDataProtocol protocol, in the following steps:
(1) Issue a command to access Hive data. The embodiment of the present invention uses the HiveMetaDataProtocol protocol to operate on a table of a central database; the access path can be: HiveMetaDataProtocol://databaseName/TableName;
(2) Access the metadata warehouse to obtain the metadata describing the table's actual location in each province (database IP, database name, table name, and so on);
(3) Use the JDBC access mode;
(4) Splice the queried information for each province (such as the province's ID number) onto the HiveMetaDataProtocol protocol header;
(5) The spliced result locates the table in the Hive of a given province's cluster;
(6) Distribute the Hive request command to each province using HiveMetaDataProtocol, thereby accessing each province's data.
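As an illustration of steps (1) to (5), the sketch below turns one central request into a per-province JDBC connection string and query. It is a hedged example, not the patented implementation: `HiveLocation`, `METADATA_WAREHOUSE`, `plan_requests`, the IP addresses, and the table names are hypothetical, and the port 10000 is assumed as the usual HiveServer2 default. Only the `jdbc:hive2://host:port/database` URL shape comes from Hive itself; actually opening the connections (step 6) is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

PROTOCOL = "HiveMetaDataProtocol"


@dataclass(frozen=True)
class HiveLocation:
    host: str      # database IP of the province's HiveServer2 (hypothetical)
    database: str  # provincial database name
    table: str     # provincial table name


# Hypothetical metadata warehouse: (central database, central table) -> provinces.
METADATA_WAREHOUSE: Dict[Tuple[str, str], Dict[str, HiveLocation]] = {
    ("sales", "orders"): {
        "bj": HiveLocation("10.0.1.10", "bj_sales", "orders"),
        "gd": HiveLocation("10.0.2.10", "gd_dw", "t_orders"),
    },
}


def plan_requests(uri: str, sql_template: str) -> List[Tuple[str, str, str]]:
    """Return (province, JDBC URL, SQL) for every province holding the table."""
    prefix = PROTOCOL + "://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a {PROTOCOL} request: {uri!r}")
    database, _, table = uri[len(prefix):].partition("/")
    locations = METADATA_WAREHOUSE[(database, table)]  # step (2)
    return [
        (
            province,
            f"jdbc:hive2://{loc.host}:10000/{loc.database}",  # step (3), assumed port
            sql_template.format(table=loc.table),
        )
        for province, loc in sorted(locations.items())
    ]


if __name__ == "__main__":
    for request in plan_requests(
        "HiveMetaDataProtocol://sales/orders", "SELECT COUNT(*) FROM {table}"
    ):
        print(request)
```

Each tuple could then be passed to whatever JDBC client the deployment uses.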
Further, in one embodiment of the present invention, the data storage layer external interworking protocol 200 is further used to: issue a command to access HBase data, using the HBaseMetaDataProtocol protocol to operate on any central table; query the central metadata database to obtain the metadata describing where that table currently resides in each province; access the table using the Scan class; splice the queried information for each province onto the HBaseMetaDataProtocol protocol header, so that the spliced result locates the table in the HBase of any province's cluster; and distribute the HBase request command to each province using HBaseMetaDataProtocol, thereby accessing each province's data.
It can be understood that, as shown in Fig. 5, the embodiment of the present invention accesses the HBase of each provincial cluster through the Scan class, based on the HBaseMetaDataProtocol protocol, in the following steps:
(1) Issue a command to access HBase data, using the HBaseMetaDataProtocol protocol to operate on a central table; the access path is: HBaseMetaDataProtocol://TableName;
(2) Query the central metadata database to obtain the metadata describing the table's actual location in each province (table name, and so on);
(3) Access the table using the Scan class;
(4) Splice the queried information for each province (such as the province's ID number) onto the HBaseMetaDataProtocol protocol header;
(5) The spliced result locates the table in the HBase of a given province's cluster;
(6) Distribute the HBase request command to each province using HBaseMetaDataProtocol, thereby accessing each province's data.
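The HBase flow can be sketched the same way. Because this example is in Python, the Java `Scan` class is replaced by a small stand-in function over in-memory dictionaries; it mimics only the default range semantics of a scan (start row inclusive, stop row exclusive). `METADATA`, `CLUSTERS`, `scan`, `scan_all`, and all table names and row keys are hypothetical and serve only to illustrate the steps.

```python
from typing import Dict, List, Tuple

PROTOCOL = "HBaseMetaDataProtocol"

# Hypothetical central metadata database: central table -> {province: provincial table}.
METADATA: Dict[str, Dict[str, str]] = {
    "user_profile": {"bj": "bj_user_profile", "gd": "gd_users"},
}

# In-memory stand-in for the provincial HBase clusters: (province, table) -> rows.
CLUSTERS: Dict[Tuple[str, str], Dict[str, str]] = {
    ("bj", "bj_user_profile"): {"u001": "alice", "u005": "bob"},
    ("gd", "gd_users"): {"u003": "carol", "u009": "dave"},
}


def scan(province: str, table: str, start_row: str, stop_row: str) -> List[Tuple[str, str]]:
    """Stand-in for a range scan: rows with start_row <= key < stop_row, in key order."""
    rows = CLUSTERS[(province, table)]
    return [(key, rows[key]) for key in sorted(rows) if start_row <= key < stop_row]


def scan_all(uri: str, start_row: str, stop_row: str) -> Dict[str, List[Tuple[str, str]]]:
    """Resolve a central table to every province and scan each provincial table."""
    prefix = PROTOCOL + "://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an {PROTOCOL} request: {uri!r}")
    central_table = uri[len(prefix):]
    results = {}
    for province, table in sorted(METADATA[central_table].items()):  # step (2)
        qualified = f"{prefix}{province}/{table}"  # steps (4)-(5), assumed layout
        results[qualified] = scan(province, table, start_row, stop_row)  # steps (3), (6)
    return results


if __name__ == "__main__":
    print(scan_all("HBaseMetaDataProtocol://user_profile", "u000", "u006"))
```

In a real deployment the `scan` stand-in would be replaced by a call into each province's HBase client.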
Further, in one embodiment of the present invention, the system 10 of the embodiment of the present invention further includes: an upper application layer external interworking protocol, a data analysis layer external interworking protocol, and a computing engine layer external interworking protocol.
The metadata-based standardized system for cross-domain data access in the Internet of Data proposed according to embodiments of the present invention integrates the resource management layer, data storage layer, and file storage layer external interworking protocols into a metadata-based data access protocol. This standardizes data access, enables digital resources to be circulated and shared, and, within the Internet of Data technical architecture, makes data access transparent at every layer of cross-domain big data centers and provides access protocols between layers.
The metadata-based standardized method for cross-domain data access in the Internet of Data proposed according to embodiments of the present invention is described next with reference to the drawings.
Fig. 6 is a flowchart of the metadata-based standardized method for cross-domain data access in the Internet of Data according to one embodiment of the present invention.
As shown in Fig. 6, the metadata-based standardized method for cross-domain data access includes the following steps:
In step S601, Hive databases are uniformly accessed through the resource management layer external interworking protocol, which includes standardized Hive access based on HiveMetaDataProtocol.
In step S602, HBase databases are uniformly accessed through the data storage layer external interworking protocol, which includes standardized HBase access based on HBaseMetaDataProtocol.
In step S603, the HDFS file system is uniformly accessed through the file storage layer external interworking protocol, which includes standardized access to HDFS data based on HdfsMetaDataProtocol.
In step S604, the resource management layer, data storage layer, and file storage layer external interworking protocols are integrated to generate a metadata-based data access protocol, thereby standardizing data access.
Further, in one embodiment of the invention, it is accessed by interworking protocol consistency outside File Store layer HDFS file system further comprises: sending acquisition instruction to obtain associated documents;It is obtained by access metadata warehouse related Current storage location of the file in each province's cluster;HdfsMeatDataProtocol association is spliced to according to each province's information inquired Head is discussed, so that the position in the HDFS for the cluster that any province can be navigated to after splicing;Utilize HdfsMeatDataProtocol File request order is distributed to each province, to realize the data of access each province.
Further, in one embodiment of the invention, it is accessed by interworking protocol consistency outside resource management layer Hive database further comprises: publication command access hive data, using in the operation of HiveMetaDataProtocol agreement Entreat any table in arbitrary data library;Any epitope in central arbitrary data library is obtained in each province current location by access metadata warehouse Metadata information;Using jdbc access mode;HiveMetaDataProtocol is spliced to according to each province's information inquired Protocol header, so that the position in the Hive for the cluster that any province can be navigated to after splicing;It utilizes Hive request command is distributed to each province by HiveMetaDataProtocol, to realize the data of access each province.
Further, in one embodiment of the invention, it is accessed by interworking protocol consistency outside data storage layer Hbase database further comprises: publication command access HBase data use HBaseMetaDataProtocol agreement, behaviour Make any table in center;It inquires central metadatabase and obtains any epitope in center in the metadata information of each province current location;Using The access of Scan class;It is spliced to HBaseMetaDataProtocol protocol header according to each province's information inquired, so that can after splicing The position in HBase to navigate to the cluster of any province;Using HBaseMetaDataProtocol by HBase request command It is distributed to each province, to realize the data of access each province.
Further, in one embodiment of the invention, the method further comprises: generating the metadata-based data access protocol through an upper application layer external interworking protocol, a data analysis layer external interworking protocol, and a computing engine layer external interworking protocol.
It should be noted that the foregoing explanation of the embodiment of the metadata-based data-networking cross-domain data access standardization system also applies to the metadata-based data-networking cross-domain data access standardization method of this embodiment, and is not repeated here.
According to the metadata-based data-networking cross-domain data access standardization method proposed by the embodiments of the present invention, the metadata-based data access protocol can be obtained by integrating the resource management layer external interworking protocol, the data storage layer external interworking protocol, and the file storage layer external interworking protocol. This standardizes data access and enables digital resources to circulate and be shared. Within the data-networking technical architecture, it makes data access at each layer of cross-domain big data centers transparent and provides access protocols between the layers.
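One way to picture the integration of the three layer protocols into a single metadata-based access protocol is a dispatcher keyed on the protocol name. This is a sketch under assumptions: the `://` separator, the `LAYER_PROTOCOLS` table, and the `route` function are hypothetical, and only the three protocol names and their layers come from the text.

```python
# Protocol name -> (layer, backend), as paired in the description.
LAYER_PROTOCOLS = {
    "HdfsMetaDataProtocol": ("file storage layer", "HDFS"),
    "HiveMetaDataProtocol": ("resource management layer", "Hive"),
    "HBaseMetaDataProtocol": ("data storage layer", "HBase"),
}


def route(request_uri):
    """Split a request into (layer, backend, target) by its protocol name.

    Raises ValueError when the separator is missing or the protocol name is
    not one of the three integrated layer protocols.
    """
    scheme, sep, target = request_uri.partition("://")
    if not sep or scheme not in LAYER_PROTOCOLS:
        raise ValueError(f"unsupported request: {request_uri!r}")
    layer, backend = LAYER_PROTOCOLS[scheme]
    return layer, backend, target
```

With a table like this, a caller addresses every layer through the same entry point, and adding a further layer protocol amounts to adding one row.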
In the description of the present invention, it is to be understood that the orientations or positional relationships indicated by terms such as "center", "longitudinal", "transverse", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", "axial", "radial", and "circumferential" are based on the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description of the present invention, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they are therefore not to be construed as limiting the invention.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly indicating the number of technical features referred to. A feature defined as "first" or "second" may therefore explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality of" means at least two, for example two or three, unless specifically defined otherwise.
In the present invention, unless otherwise expressly specified and limited, terms such as "mounted", "connected", "coupled", and "fixed" are to be understood broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediary; and it may be an internal communication between two elements or an interaction between two elements, unless expressly limited otherwise. Those of ordinary skill in the art can understand the specific meanings of these terms in the present invention according to the specific circumstances.
In the present invention, unless otherwise expressly specified and limited, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or that they are in indirect contact through an intermediary. Moreover, a first feature being "on", "above", or "over" a second feature may mean that the first feature is directly above or obliquely above the second feature, or merely that the first feature is at a higher level than the second feature. A first feature being "under", "below", or "beneath" a second feature may mean that the first feature is directly below or obliquely below the second feature, or merely that the first feature is at a lower level than the second feature.
In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict with one another, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of those different embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is to be understood that the above embodiments are exemplary and are not to be construed as limiting the invention; those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the invention.

Claims (10)

1. A metadata-based data-networking cross-domain data access standardization system, characterized by comprising:
a resource management layer external interworking protocol for consistently accessing a Hive database, wherein the resource management layer external interworking protocol includes standardized access to Hive based on HiveMetaDataProtocol;
a data storage layer external interworking protocol for consistently accessing an HBase database, wherein the data storage layer external interworking protocol includes standardized access to HBase based on HBaseMetaDataProtocol;
a file storage layer external interworking protocol for consistently accessing an HDFS file system, wherein the file storage layer external interworking protocol includes standardized access to HDFS data based on HdfsMetaDataProtocol;
wherein the resource management layer external interworking protocol, the data storage layer external interworking protocol, and the file storage layer external interworking protocol are integrated to obtain a metadata-based data access protocol, so as to standardize data access.
2. The metadata-based data-networking cross-domain data access standardization system according to claim 1, characterized in that the file storage layer external interworking protocol is further used for:
sending an acquisition instruction to obtain a relevant file;
obtaining the current storage location of the relevant file in each province's cluster by accessing a metadata warehouse;
splicing the queried province information onto the HdfsMetaDataProtocol protocol header, so that after splicing the location within the HDFS of any province's cluster can be resolved;
distributing the file request command to each province using HdfsMetaDataProtocol, so as to access the data of each province.
3. The metadata-based data-networking cross-domain data access standardization system according to claim 1, characterized in that the resource management layer external interworking protocol is further used for:
issuing a command to access Hive data, using the HiveMetaDataProtocol protocol to operate on any table of any database at the center;
obtaining, by accessing a metadata warehouse, metadata information on where the table of the central database is currently located in each province;
using the JDBC access mode;
splicing the queried province information onto the HiveMetaDataProtocol protocol header, so that after splicing the location within the Hive of any province's cluster can be resolved;
distributing the Hive request command to each province using HiveMetaDataProtocol, so as to access the data of each province.
4. The metadata-based data-networking cross-domain data access standardization system according to claim 1, characterized in that the data storage layer external interworking protocol is further used for:
issuing a command to access HBase data, using the HBaseMetaDataProtocol protocol to operate on any table at the center;
querying a central metadata database to obtain metadata information on where the central table is currently located in each province;
accessing the data using the Scan class;
splicing the queried province information onto the HBaseMetaDataProtocol protocol header, so that after splicing the location within the HBase of any province's cluster can be resolved;
distributing the HBase request command to each province using HBaseMetaDataProtocol, so as to access the data of each province.
5. The metadata-based data-networking cross-domain data access standardization system according to any one of claims 1-4, characterized by further comprising: an upper application layer external interworking protocol, a data analysis layer external interworking protocol, and a computing engine layer external interworking protocol.
6. A metadata-based data-networking cross-domain data access standardization method, characterized by comprising the following steps:
consistently accessing a Hive database through a resource management layer external interworking protocol, wherein the resource management layer external interworking protocol includes standardized access to Hive based on HiveMetaDataProtocol;
consistently accessing an HBase database through a data storage layer external interworking protocol, wherein the data storage layer external interworking protocol includes standardized access to HBase based on HBaseMetaDataProtocol;
consistently accessing an HDFS file system through a file storage layer external interworking protocol, wherein the file storage layer external interworking protocol includes standardized access to HDFS data based on HdfsMetaDataProtocol; and
integrating the resource management layer external interworking protocol, the data storage layer external interworking protocol, and the file storage layer external interworking protocol to generate a metadata-based data access protocol, so as to standardize data access.
7. The metadata-based data-networking cross-domain data access standardization method according to claim 6, characterized in that consistently accessing the HDFS file system through the file storage layer external interworking protocol further comprises:
sending an acquisition instruction to obtain a relevant file;
obtaining the current storage location of the relevant file in each province's cluster by accessing a metadata warehouse;
splicing the queried province information onto the HdfsMetaDataProtocol protocol header, so that after splicing the location within the HDFS of any province's cluster can be resolved;
distributing the file request command to each province using HdfsMetaDataProtocol, so as to access the data of each province.
8. The metadata-based data-networking cross-domain data access standardization method according to claim 6, characterized in that consistently accessing the Hive database through the resource management layer external interworking protocol further comprises:
issuing a command to access Hive data, using the HiveMetaDataProtocol protocol to operate on any table of any database at the center;
obtaining, by accessing a metadata warehouse, metadata information on where the table of the central database is currently located in each province;
using the JDBC access mode;
splicing the queried province information onto the HiveMetaDataProtocol protocol header, so that after splicing the location within the Hive of any province's cluster can be resolved;
distributing the Hive request command to each province using HiveMetaDataProtocol, so as to access the data of each province.
9. The metadata-based data-networking cross-domain data access standardization method according to claim 6, characterized in that consistently accessing the HBase database through the data storage layer external interworking protocol further comprises:
issuing a command to access HBase data, using the HBaseMetaDataProtocol protocol to operate on any table at the center;
querying a central metadata database to obtain metadata information on where the central table is currently located in each province;
accessing the data using the Scan class;
splicing the queried province information onto the HBaseMetaDataProtocol protocol header, so that after splicing the location within the HBase of any province's cluster can be resolved;
distributing the HBase request command to each province using HBaseMetaDataProtocol, so as to access the data of each province.
10. The metadata-based data-networking cross-domain data access standardization method according to any one of claims 6-9, characterized by further comprising:
generating the metadata-based data access protocol through an upper application layer external interworking protocol, a data analysis layer external interworking protocol, and a computing engine layer external interworking protocol.
CN201711448756.2A 2017-12-27 2017-12-27 Metadata-based data networking cross-domain data access standardization system and method Active CN109981698B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711448756.2A CN109981698B (en) 2017-12-27 2017-12-27 Metadata-based data networking cross-domain data access standardization system and method

Publications (2)

Publication Number Publication Date
CN109981698A true CN109981698A (en) 2019-07-05
CN109981698B CN109981698B (en) 2022-03-04

Family

ID=67072169

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711448756.2A Active CN109981698B (en) 2017-12-27 2017-12-27 Metadata-based data networking cross-domain data access standardization system and method

Country Status (1)

Country Link
CN (1) CN109981698B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102982075A (en) * 2012-10-30 2013-03-20 北京京东世纪贸易有限公司 Heterogeneous data source access supporting system and method thereof
US20130124483A1 (en) * 2011-11-10 2013-05-16 Treasure Data, Inc. System and method for operating a big-data platform
CN106202452A (en) * 2016-07-15 2016-12-07 复旦大学 The uniform data resource management system of big data platform and method
CN106339509A (en) * 2016-10-26 2017-01-18 国网山东省电力公司临沂供电公司 Power grid operation data sharing system based on large data technology

Also Published As

Publication number Publication date
CN109981698B (en) 2022-03-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant