CN106202452A - Unified data resource management system and method for a big data platform - Google Patents


Info

Publication number
CN106202452A
Authority
CN
China
Prior art keywords
data
packet
user
module
capsule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610555871.9A
Other languages
Chinese (zh)
Other versions
CN106202452B (en
Inventor
谢志鹏
胡俊峰
王鹏飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fudan University
Priority to CN201610555871.9A
Publication of CN106202452A
Application granted
Publication of CN106202452B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/907 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2117 User registration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21 Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2141 Access rights, e.g. capability lists, access control lists, access tables, access matrices

Abstract

The invention belongs to the technical field of big data platforms and specifically relates to a unified data resource management system and method for a big data platform. Addressing the problem that data resources in a big data platform are hard to manage because they come in many types, the invention proposes a unified metadata description for different types of data resources and a unified adapter interface specification for different types of data management components. On this basis it designs a unified data resource management method and system that supports unified data package upload and download, unified data discovery, and unified data access application and authorization. The invention makes the system dynamically extensible and enables multi-tenant management and unified access control, so that users can conveniently manage and use the different types of data resources in a big data platform.

Description

Unified data resource management system and method for a big data platform
Technical field
The invention belongs to the technical field of big data platforms and specifically relates to a unified data resource management system and method for a big data platform.
Background
With the development of information technology, data has penetrated every industry and business function area and has become an important factor of production. Managing massive data effectively, and mining and applying it further, has become key for enterprises to improve their core competitiveness and seize market opportunities first. Against this background, big data technology emerged; its distinguishing features are large data volume, many data types, low value density, and fast processing speed.
On existing big data platforms (with Hadoop as the typical representative), different types of data resources are usually managed separately by different data management components in the big data ecosystem. For example, a distributed file system (typified by HDFS) manages file-type data objects such as directories and files; a distributed column-oriented database system (typified by HBase) manages data objects such as namespaces, tables, column families, and columns; and a distributed table system (typified by Hive) manages data objects such as databases, tables, and fields. At the same time, the different data management components use different access control policies to protect the data objects they manage from unauthorized access and operations.
However, under the technical conditions of existing Hadoop big data platforms, different types of data resources have completely different logical data models and therefore lack unified metadata management. As a result, platform users, and even platform administrators, cannot obtain a unified view of the data resources in the platform. It is hard for them to find and locate all the data resources that a particular analysis task may need to access, and hard to apply simple and effective access control to the data resources they own. Users have to become familiar with the different operation interfaces of the various data management components before they can complete their tasks.
In view of the above, how to provide a unified view and operation interface for the different types of data resources in a big data platform, covering metadata description, discovery, access control, and so on, is a problem in this field that urgently needs to be solved and has practical application value.
Summary of the invention
The object of the invention is to provide a unified data resource management method and system for a big data platform.
Addressing the problem that data resources in a big data platform are hard to manage because they come in many types, the invention proposes a unified metadata description for different types of data resources and a unified adapter interface specification for different types of data management components. On this basis it designs a unified data resource management method and system that supports unified data package upload and download, unified data discovery, and unified data access application and authorization.
The invention first proposes a unified layered meta-model to describe the different types of data resources in a big data platform. This layered meta-model provides a unified representation model for the metadata of different types of data resources; that is, it gives different types of data resources the same metadata model, which lays the foundation for managing them in a unified way.
The unified meta-model consists of four layers: the first layer is the data container layer, the second is the data package layer, the third is the data asset layer, and the fourth is the data field layer. A data container corresponds to one running instance of a data management component (usually deployed on a cluster); a data package is a set of related data assets; a data asset is a relatively independent data object managed by a data container; and a data field is a data field of a relational data asset. Physically, data packages are stored and managed by data containers. The data container layer is transparent to ordinary users: they do not perceive its existence, and the system is responsible for storing and maintaining the mapping from data packages to data containers.
From top to bottom, adjacent layers of the meta-model have a one-to-many mapping: each data container stores and manages one or more data packages, each data package is a set of one or more interrelated data assets, and each data asset consists of zero or more data fields. The system stores and maintains these mappings in both directions: given a data object, one can obtain both the data object one layer above it and the list of data objects one layer below it.
In addition to the mappings between layers, the meta-model defines several kinds of attributes for the data objects at each layer: basic description attributes, access control attributes, auxiliary relationship attributes, and user access information. Basic description attributes record descriptive information about a data object, such as its related industry, related discipline, keywords, owner, and creation date. Access control attributes describe which users may access the data object and in what way. Auxiliary relationship attributes describe auxiliary relationships between data objects. User access information records which users accessed the data object, when, and in what way.
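As an illustration only, the four-layer meta-model with its two-way mappings could be sketched in Python as follows. The class name `MetaObject`, the layer labels, and the attribute names `description` and `acl` are assumptions made for this sketch; the patent does not specify any concrete data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Layer names follow the four-layer meta-model, ordered top to bottom.
LAYERS = ("container", "package", "asset", "field")


@dataclass(eq=False)
class MetaObject:
    """One data object in the meta-model (hypothetical class name)."""
    name: str
    layer: str
    description: Dict[str, str] = field(default_factory=dict)  # industry, keywords, owner, ...
    acl: Dict[str, set] = field(default_factory=dict)          # user -> allowed operations
    parent: Optional["MetaObject"] = field(default=None, repr=False)
    children: List["MetaObject"] = field(default_factory=list, repr=False)

    def add_child(self, child: "MetaObject") -> "MetaObject":
        """Attach `child` one layer below this object, recording both directions."""
        expected = LAYERS.index(self.layer) + 1
        if expected >= len(LAYERS) or child.layer != LAYERS[expected]:
            raise ValueError(f"{child.layer!r} cannot sit directly under {self.layer!r}")
        child.parent = self
        self.children.append(child)
        return child
```

Because each object keeps both `parent` and `children`, the object one layer up and the list of objects one layer down are both available from any given data object, which is the two-way mapping described above.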
On the basis of the layered unified meta-model, the invention further defines a unified adapter interface specification. Any type of data management component can be plugged into the system in a dynamically extensible way by customizing an adapter plug-in that satisfies the interface specification. The functions covered by the adapter interface include (but are not limited to): (1) translating the name of an abstract data object into the concrete data object in the corresponding data container; (2) uploading and downloading data packages; (3) granting and revoking data permissions. Through the abstraction of the unified adapter interface, the system achieves dynamic extensibility and can easily support different types of data management components: it is only necessary to develop a custom adapter plug-in for the data management component and add the adapter to the system.
A data management component of any type may have multiple deployed instances (that is, multiple containers of the same type). Once the adapter for a type of data management component has been added to the system, each cluster on which an instance is deployed can also be added to the system as a data container. Data packages in the big data platform are handed over to data containers for physical storage and management. The adapter hides the differences in implementation details among different types of data management components and presents a unified interface. Its role is two-way: on the one hand, it converts abstract operation instructions on data objects into concrete instruction sequences for a specific data container and executes them; on the other hand, it obtains descriptive information about data packages from the concrete container and converts it into the unified description format.
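A minimal sketch of what such an adapter interface might look like is given below, assuming Python and an abstract base class. The method names (`resolve`, `upload`, `download`, `grant`, `revoke`) and their parameters are hypothetical; the patent lists the functions the interface covers but not their signatures. `RecordingAdapter` is a toy implementation that only logs calls and talks to no real component.

```python
from abc import ABC, abstractmethod


class Adapter(ABC):
    """Hypothetical rendering of the unified adapter interface."""

    @abstractmethod
    def resolve(self, package: str, asset: str = "", field: str = "") -> str:
        """Translate an abstract object name into a container-specific identifier."""

    @abstractmethod
    def upload(self, container: str, package: str, archive_path: str) -> None:
        """Load a packed data package into the given container."""

    @abstractmethod
    def download(self, container: str, package: str, target_path: str) -> None:
        """Fetch a data package from the given container."""

    @abstractmethod
    def grant(self, container: str, target: str, user: str, op: str) -> None:
        """Grant `op` ("read" or "write") on `target` to `user`."""

    @abstractmethod
    def revoke(self, container: str, target: str, user: str, op: str) -> None:
        """Take back a previously granted permission."""


class RecordingAdapter(Adapter):
    """Toy adapter that records each call instead of touching a real system."""

    def __init__(self):
        self.log = []

    def resolve(self, package, asset="", field=""):
        return "/".join(part for part in (package, asset, field) if part)

    def upload(self, container, package, archive_path):
        self.log.append(("upload", container, package, archive_path))

    def download(self, container, package, target_path):
        self.log.append(("download", container, package, target_path))

    def grant(self, container, target, user, op):
        self.log.append(("grant", container, target, user, op))

    def revoke(self, container, target, user, op):
        self.log.append(("revoke", container, target, user, op))
```

A real adapter for HDFS, HBase, or Hive would replace the logging with calls into that component; the rest of the system would only ever see the abstract methods.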
On the basis of the layered unified meta-model and the unified adapter interface specification, the invention proposes a unified management system for different types of data resources, namely the unified data resource management system for a big data platform. The system supports unified data package upload and download, unified data discovery, and unified data access application and authorization.
The unified data access application and authorization uses a fine-grained access control model: two abstract operations, "read" and "write", are defined on each of the three layers of abstract data objects (data package, data asset, data field). The owner of a data package can grant a user the "read" or "write" permission on the whole data package, on certain data assets in the package, or on certain data fields in a data asset. A user expresses a data resource access request on the abstract model; after the data owner approves it, the authorization function in the corresponding adapter converts the authorization information for the abstract object into a concrete instruction sequence for the corresponding data container and executes it.
In summary, in this system the unified meta-model provides a unified description language for the unified management of data resources, and an adapter implementation that satisfies the adapter interface specification acts as the translation layer (or conversion layer). The adapter's role is two-way: on the one hand, it obtains descriptive information about data resources from the concrete data management component and converts it into the unified description language; on the other hand, it converts abstract operations on data resources into concrete instruction sequences that the data management component executes on the corresponding data objects.
The structure of the unified data resource management system for a big data platform provided by the invention is shown in Fig. 2. It includes a data package loading module, a data resource discovery module, a user registration and login module, a data access application and authorization module, a metadata management module, a data container management module, an adapter management module, and so on.
The "adapter management module" manages adapter-related information. It receives the adapter plug-in (a program library that implements the adapter interface specification) and the type name of the corresponding data management component that the system administrator submits from the access end, stores the adapter plug-in in the "/Adapters" subdirectory under the system root directory, and records the mapping between the adapter plug-in and the type name through the "metadata management module". At system startup, it also loads all existing adapter plug-ins in the system and builds the mapping between adapter plug-ins and type names in memory. While the system is running, it provides other modules with a service that maps a type name to an adapter plug-in object.
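The in-memory mapping service described above might be sketched as a small registry. The class name `AdapterRegistry` and its methods are hypothetical, and the sketch leaves out loading plug-in libraries from the "/Adapters" directory and persisting the mapping through the metadata management module.

```python
class AdapterRegistry:
    """In-memory map from component type name to adapter object (hypothetical)."""

    def __init__(self):
        self._adapters = {}

    def register(self, type_name, adapter):
        """Record the adapter for a data management component type."""
        if type_name in self._adapters:
            raise ValueError(f"type {type_name!r} already has an adapter")
        self._adapters[type_name] = adapter

    def lookup(self, type_name):
        """Return the adapter object registered for `type_name`."""
        try:
            return self._adapters[type_name]
        except KeyError:
            raise LookupError(f"no adapter registered for type {type_name!r}") from None
```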
The "data container management module" manages information related to data containers. It receives the data container name, the type name of the corresponding data management component, and the related configuration information that the system administrator enters at the access end, and records this information through the "metadata management module". At system startup, it builds the mapping between data containers and data management component type names in memory. While the system is running, it provides other modules with a service that maps a data container name to a data management component type name, and it also monitors and maintains the online working status of each data container.
The "user registration and login module" handles registration and login for ordinary users. When a new user registers, the module receives the account name and password the user submits at the access end, creates a Kerberos principal and corresponding key for the account, and writes the principal together with the key into a keytab file that only the system service programs can access. When a user logs in, the system receives the account name and password entered at the access end and checks whether the password is correct: if it is wrong, a login failure message is returned; if it is correct, the system performs Kerberos authentication once using the Kerberos principal and key that correspond to the account name, and caches the user's Configuration object, thereby avoiding frequent repeated authentication.
The "metadata management module" stores, maintains, and manages the metadata of four classes of objects: data containers, data packages, data assets, and data fields. It also provides a query interface to the other functional modules.
The "data package loading module" receives the compressed data package file that an ordinary user submits from the access end. According to the data management component type name specified by the user, it uses a load-balancing mechanism to select one data container of that type, and calls the package loading function provided by the adapter plug-in of that type to physically store and manage the data package in that data container. After the data package has been loaded into the selected data container, the module also records the data package description entered by the user through the "metadata management module".
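The patent does not say which load-balancing policy is used to pick a container. The sketch below assumes one plausible policy, choosing the online container of the requested type with the lowest storage utilisation; the names `ContainerStatus` and `select_container` and all field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ContainerStatus:
    """Monitoring snapshot of one data container (hypothetical fields)."""
    name: str
    type_name: str
    online: bool
    used_bytes: int
    capacity_bytes: int


def select_container(containers, type_name):
    """Pick the least-utilised online container of the requested type."""
    candidates = [
        c for c in containers
        if c.type_name == type_name and c.online and c.capacity_bytes > 0
    ]
    if not candidates:
        raise LookupError(f"no online container of type {type_name!r}")
    # Assumed policy: lowest used/capacity ratio wins.
    return min(candidates, key=lambda c: c.used_bytes / c.capacity_bytes)
```

Other policies (round-robin, fewest packages, and so on) would fit the description equally well; only the selection function would change.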
The "data resource discovery module" provides query services to users. It receives the query request that a user enters at the access end, first verifies the user's identity and checks the validity of the query content, then retrieves from the metadata store the list of data packages that match the query, filters the list according to the working status of the data container in which each data package resides, and finally returns the filtered data package list to the access end as the query result.
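The retrieve-then-filter flow can be sketched as follows, assuming package metadata is available as a list of dictionaries and container status as a name-to-boolean map. The function name `discover`, the dictionary keys, and the substring matching rule are assumptions for illustration; identity verification is omitted, and the empty-query check merely stands in for the validity check.

```python
def discover(query, packages, container_online):
    """Return names of packages matching `query` whose container is online."""
    needle = query.strip().lower()
    if not needle:
        raise ValueError("query must not be empty")  # stand-in for the validity check
    results = []
    for pkg in packages:
        # Search the text description and the keywords of each package.
        haystack = [pkg["description"].lower()] + [k.lower() for k in pkg["keywords"]]
        hit = any(needle in text for text in haystack)
        # Drop hits whose data container is currently offline or unknown.
        if hit and container_online.get(pkg["container"], False):
            results.append(pkg["name"])
    return results
```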
The "data access application and authorization module" receives an ordinary user's application to access a data package, obtains the owner of the data package through the metadata management module, and passes the application to that owner. If the owner approves the application, the "data access application and authorization module" authenticates the owner's identity; if authentication succeeds, it calls the "metadata management module" to obtain the data container in which the relevant data resource resides, and calls the authorization routine in the corresponding adapter plug-in to carry out the concrete authorization action in that data container.
Compared with the prior art, the invention has the following advantages:
(1) The system is dynamically extensible. A new data management component can be plugged into the system dynamically by developing a custom adapter that satisfies the interface specification, without recompiling and redeploying the entire system.
(2) Multi-tenant management is supported. Users can manage the different types of data resources they own, authorize other users to access those resources, and apply for access to the data resources of other users.
(3) Unified data access application and authorization is supported. Access applications for different types of data resources share a unified interface. With the adapter acting as a translation layer, the abstract authorization interface is mapped to the native access control of each component, which makes it convenient for users to apply for and grant data resource access permissions.
Description of the drawings
Fig. 1 shows the correspondence between the abstract four-layer meta-model and concrete data management components.
Fig. 2 is a schematic diagram of the unified data resource management system.
Detailed description of the embodiments
The technical implementation of the invention is described in detail below with reference to the accompanying drawings, but the scope of protection of the invention is not limited to the described embodiment.
Fig. 1 shows the correspondence between the HDFS and HBase data management components and the four-layer meta-model. In the embodiment, each HDFS cluster or HBase cluster corresponds to one data container. Each HDFS cluster maintains a three-level directory structure "/user name/directory name/file name", where "/user name/directory name" corresponds to a data package, "user name" is the user name of the package owner, and "file name" corresponds to a data asset. Each namespace in an HBase cluster corresponds to a data package, each table in the namespace corresponds to a data asset in that package, and each column family in a table corresponds to a data field in that data asset.
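The two correspondences in Fig. 1 can be written down as small mapping functions. This is a sketch only: the function names and the dictionary keys are hypothetical, and no HDFS or HBase client is involved.

```python
def hdfs_path_to_meta(path):
    """Map "/user name/directory name/file name" onto package, owner, and asset."""
    parts = path.strip("/").split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected a path of the form /user/directory/file")
    user, directory, filename = parts
    return {"owner": user, "package": f"/{user}/{directory}", "asset": filename}


def hbase_to_meta(namespace, table=None, column_family=None):
    """Map an HBase namespace, table, and column family onto package, asset, and field."""
    return {"package": namespace, "asset": table, "field": column_family}
```

For example, the path "/alice/sales/q1.csv" would map to the package "/alice/sales" owned by "alice", with "q1.csv" as the data asset.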
Fig. 2 shows the overall structure of the system. Embodiment: the invention proposes a unified management approach for different types of data resources and a system that uses this method. The system includes a data package loading module, a data resource discovery module, a user registration and login module, a data access application and authorization module, a metadata management module, a data container management module, an adapter management module, and so on.
Step 1: The system administrator adds adapter plug-ins through the "adapter management module". An adapter plug-in is a program library that satisfies the adapter interface specification; each adapter plug-in is custom-built for one data management component (such as HDFS, HBase, or Hive).
Step 2: The system administrator adds or deletes data containers through the "data container management module". Each data container corresponds to one adapter plug-in, which indicates the type of the container, and all operations on the container are carried out by routines in that adapter plug-in. At system startup, the data container management module automatically obtains the type and physical address of every data container and establishes connections to the data containers. In addition, the data container management module monitors and maintains the online working status of each data container, for example whether a data container is currently online or offline and how much of its storage space is in use. Based on this monitoring information, the system migrates data between data containers and performs tasks such as load balancing between data containers.
Step 3: Ordinary users register accounts and log in to the system through the "user registration and login module". Because the big data platform on which the system is deployed runs in secure mode and uses Kerberos for user authentication, when a user registers an account the system creates a Kerberos principal and corresponding key for that user and writes the principal together with the key into a keytab file that only the system service programs can access. When a user logs in, the system only needs to perform Kerberos authentication once; after authentication succeeds, the user's Configuration object is cached, which avoids frequent repeated authentication.
Step 4: After logging in, a registered ordinary user can upload data packages to the platform through the "data package loading module"; the platform then loads and manages them. Uploading a data package takes two steps. In the first step, the data package is packed into a compressed file and uploaded through the file upload page provided by the system. The user can view all the files they have uploaded and the corresponding file IDs (assigned by the system). In the second step, the user specifies the type of the data package in a form, enters the basic description of the data package, and selects the file ID of the previously uploaded package; a data package object is then created on the platform. According to the type of the data package object, the system uses the load-balancing mechanism to dynamically select one container from the data containers of that type to manage the package, loads the package into that container through the adapter, adds the package's metadata to the metadata store through the "metadata management module", and records the physical address of the package.
Step 5: After logging in, a registered ordinary user can search for data packages of interest through the "data resource discovery module". The user submits a query at the access end. The "data resource discovery module" first verifies the user's identity and checks the validity of the query content, then retrieves from the metadata store the list of data packages that match the query, filters the list according to the working status of the data container in which each data resource resides, and returns the filtered list to the access end as the query result. From the query result, the user can select a data package of interest at the access end to view further details. The descriptive metadata of a data package contains information such as a text description, industry, discipline, keywords, and data package type, and unified data resource discovery searches this descriptive metadata store for the data packages that interest the user, based on the query the user enters. In the embodiment, a data package uploaded by a user can by default be read and written only by its owner; other users who need access must apply as described in Step 6 and can access the package only after the owner approves and grants authorization.
Step 6: A user can view the details of a data package in the "data resource discovery module". To use the package, the user submits a data access request to the resource owner from the access end: in the data access request form, the user fills in the name of the data resource to be accessed, the type of operation required ("read" or "write"), and the required access period, and then submits the application. The application is sent to the owner of the data package for approval. After logging in, the data owner can view the access applications and decide whether to approve or reject each one. If an application is approved, the corresponding authorization work is handed over to the "data access application and authorization module". This module parses the authorization request record to obtain the list of requested data objects and operations, locates the data container in which the package resides from the package's type and name, and calls the authorization routine in the corresponding adapter plug-in, which converts the authorization information into a series of function calls on that data container to grant the corresponding access permissions to the applicant. After the authorization succeeds, the system notifies the user who submitted the application.
The system processes a data resource authorization as follows: (1) the system receives the authorization command that the user sends from the access end and parses it to extract information such as <data resource, authorization type, authorization object, time limit>; (2) the identity of the user who issued the command is verified, and the system checks whether this user is the owner of the data resource concerned; if the identity check fails or the user is not the owner, the system returns a failure message to the client and ends the process, otherwise it continues to the next step; (3) the system queries the metadata store to obtain the data container in which the data resource resides, and calls the authorization routine in the adapter plug-in to carry out the concrete authorization action in that data container.
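The three steps above might be sketched as a single function. Everything named here is hypothetical: `GrantCommand` stands for the already parsed tuple <data resource, authorization type, authorization object, time limit>, `owners` and `locations` stand in for metadata store lookups, and `adapters` maps a component type name to an authorization callable. The identity check is reduced to a boolean supplied by the caller.

```python
from typing import NamedTuple


class GrantCommand(NamedTuple):
    resource: str   # "package", "package/asset" or "package/asset/field"
    op: str         # authorization type: "read" or "write"
    grantee: str    # authorization object: the user receiving the permission
    days: int       # time limit of the authorization


def process_grant(cmd, caller, authenticated, owners, locations, adapters):
    """Return (ok, message) after running the ownership check and the grant."""
    # (1) The command arrives already parsed; reject obviously invalid values.
    if cmd.op not in ("read", "write") or cmd.days <= 0:
        return False, "invalid authorization command"
    # (2) Verify the caller's identity and that the caller owns the resource.
    package = cmd.resource.split("/", 1)[0]
    if not authenticated or owners.get(package) != caller:
        return False, "caller is not the verified owner"
    # (3) Locate the container and delegate to the adapter's authorization routine.
    container, type_name = locations[package]
    adapters[type_name](container, cmd.resource, cmd.grantee, cmd.op)
    return True, f"granted {cmd.op} on {cmd.resource} to {cmd.grantee}"
```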
Every grant of data access permission carries a time limit, and once the authorization period has passed the grant is revoked. A user who wants to keep using the resource must submit a new application and wait for the owner's approval. In the implementation, the system creates a background thread at startup that periodically scans the list of currently valid grants; when a grant expires, the thread calls the permission revocation function in the corresponding adapter to revoke it.
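One pass of such a scan could look like the sketch below. The function name `revoke_expired`, the grant dictionaries, and the `expires_at` key (a Unix timestamp) are assumptions; `revoke` stands in for the revocation function of the corresponding adapter. The background thread itself is omitted and would simply call this function at a fixed interval.

```python
import time


def revoke_expired(grants, revoke, now=None):
    """Revoke every expired grant and return the grants that are still valid."""
    now = time.time() if now is None else now
    still_valid = []
    for grant in grants:
        if grant["expires_at"] <= now:
            revoke(grant)              # delegate to the adapter's revocation routine
        else:
            still_valid.append(grant)
    return still_valid
```

Passing `now` explicitly keeps the function easy to test; in production use the default of the current time would apply.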

Claims (4)

1. A unified meta-model for describing the different types of data resources in a big data platform, characterized in that it consists of the following four layers: the first layer is the data container layer, the second layer is the data package layer, the third layer is the data asset layer, and the fourth layer is the data field layer; wherein a data container corresponds to one running instance of a data management component, a data package is a set of related data assets, a data asset is a relatively independent data object managed by a data container, and a data field is a data field of a relational data asset; physically, data packages are stored and managed by data containers; the data container layer is transparent to ordinary users, who do not perceive its existence, and the system is responsible for storing and maintaining the mapping from data packages to data containers.
2. The unified meta-model according to claim 1, characterized in that, from top to bottom, adjacent layers of the meta-model have a one-to-many mapping: each data container stores and manages one or more data packages, each data package is a set of one or more interrelated data assets, and each data asset consists of zero or more data fields; the system stores and maintains these mappings in both directions: given a data object, both the data object one layer above it and the list of data objects one layer below it can be obtained;
In addition to the mappings between layers, the meta-model defines several kinds of attributes for the data objects at each layer, including basic description attributes, access-control attributes, secondary relationship attributes, and user access information; wherein the basic description attributes record a data object's related industry, related discipline, keywords, owner, and creation date; the access-control attributes describe which users may access the data object and in which ways; the secondary relationship attributes describe the secondary relationships between data objects; and the user access information records which users accessed the data object, when, and in which way.
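The four-layer structure and its two-way, one-to-many mappings can be illustrated with a small sketch. The class name `DataObject`, the level labels, and the free-form `attributes` dictionary are assumptions made for illustration; the claims do not prescribe a storage layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Layer names from top to bottom (labels chosen for this sketch).
LEVELS = ("container", "package", "asset", "field")


@dataclass(eq=False)
class DataObject:
    """A node at one layer of the meta-model, linked in both directions."""
    name: str
    level: str
    parent: Optional["DataObject"] = None                         # upper-layer object
    children: List["DataObject"] = field(default_factory=list)    # lower-layer objects
    attributes: Dict[str, str] = field(default_factory=dict)      # e.g. owner, keywords

    def add_child(self, name: str) -> "DataObject":
        """Create an object one layer down and record the mapping both ways."""
        idx = LEVELS.index(self.level)
        if idx == len(LEVELS) - 1:
            raise ValueError("a data field has no lower layer")
        child = DataObject(name, LEVELS[idx + 1], parent=self)
        self.children.append(child)
        return child


# Usage: one container holding one package with one relational asset.
container = DataObject("hive-1", "container")
package = container.add_child("sales")
asset = package.add_child("orders")
order_id = asset.add_child("order_id")
```

Because each node holds both `parent` and `children`, the upper-layer object and the list of lower-layer objects are each a single lookup, which mirrors the two-way maintenance described in claim 2.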
3. A unified data resource management system for a big data platform, based on the unified meta-model according to claim 2, characterized in that it comprises: a data package loading module, a data resource discovery module, a user registration and login module, a data access application and authorization module, a metadata management module, a data container management module, and an adapter management module; wherein,
the adapter management module is used to manage adapter-related information; this module receives, from the access end, the adapter plug-in uploaded by the system administrator and the type name of the corresponding data management component, stores the adapter plug-in in the "/adapters" subdirectory under the system root directory, and records the mapping between the adapter plug-in and the type name through the metadata management module; at system startup, it is also responsible for loading all existing adapter plug-ins in the system and building, in memory, the mapping between adapter plug-ins and type names; while the system is running, it provides other modules with a mapping service from type name to adapter plug-in object;
the data container management module is used to manage data-container-related information; this module receives the data container name, the type name of the corresponding data management component, and the related configuration information entered by the system administrator at the access end, and records this information through the metadata management module; at system startup, it is responsible for building, in memory, the mapping between data containers and data management component type names; while the system is running, it provides other modules with a mapping service from data container name to data management component type name, and it is also responsible for monitoring and maintaining the online working state of each data container;
the user registration and login module is responsible for the registration and login of ordinary users; for the registration of a new user, this module receives the account name and password that the user registers at the access end, creates a Kerberos Principal and a corresponding key for the account, and writes the Principal together with the key into a keytab file that only the system service program can access; for a user login, the system receives the account name and password that the user enters at the access end and judges whether the password is correct: if the password is wrong, it returns a login-failure message; if it is correct, it passes Kerberos authentication once using the Kerberos Principal and key corresponding to the account name, and caches the user's Configuration object to avoid frequent repeated authentication;
the metadata management module is used to store, maintain, and manage the metadata of the four classes of objects (data containers, data packages, data assets, and data fields), and provides a query interface to the other functional modules;
the data package loading module is used to receive the compressed data package file uploaded by an ordinary user from the access end; according to the data management component type name specified by the user, this module uses a load-balancing mechanism to select a data container of that type, and calls the package-loading function provided by the adapter plug-in of that type to physically store and manage the data package in that data container; after the data package has been successfully loaded into the selected data container, this module is also responsible for recording, through the metadata management module, the data package description information entered by the user;
the data resource discovery module is used to provide a query service to users; this module receives the query request that the user enters at the access end, first verifies the user's identity and the validity of the query content, then retrieves from the metadata store the list of data packages hit by the query, filters it according to the working state of the data containers in which the data packages reside, and finally returns the filtered data package list to the access end as the query result;
the data access application and authorization module is used to receive an ordinary user's application for access to a data package, obtain the owner of the data package through the metadata management module, and pass the application to that owner; if the owner approves the application, the data access application and authorization module authenticates the owner's identity and, if authentication succeeds, calls the metadata management module to obtain the data container in which the relevant data resource resides, and calls the authorization routine in the corresponding adapter plug-in to complete the actual authorization action in that data container.
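How the data package loading module might combine container selection with the adapter call can be sketched as below. The claim names only "a load-balancing mechanism"; choosing the online container with the fewest packages is an assumption of this sketch, as are the names `Container`, `select_container`, `load_package`, the `catalog` dictionary standing in for the metadata store, and the two-argument adapter load routine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Container:
    """A registered data container (hypothetical record layout)."""
    name: str
    type_name: str          # type name of its data management component
    online: bool = True     # working state tracked by container management
    package_count: int = 0  # load metric assumed for this sketch


# Hypothetical adapter load routine: (target container, archive path) -> None.
LoadFn = Callable[[Container, str], None]


def select_container(containers: List[Container], type_name: str) -> Container:
    """Pick the least-loaded online container of the requested type."""
    candidates = [c for c in containers if c.type_name == type_name and c.online]
    if not candidates:
        raise LookupError(f"no online container of type {type_name!r}")
    return min(candidates, key=lambda c: c.package_count)


def load_package(containers: List[Container], adapters: Dict[str, LoadFn],
                 catalog: Dict[str, str], type_name: str,
                 package_name: str, archive_path: str) -> Container:
    """Load an uploaded archive into a selected container and record the mapping."""
    target = select_container(containers, type_name)
    adapters[type_name](target, archive_path)   # adapter does the physical load
    target.package_count += 1
    catalog[package_name] = target.name         # package -> container mapping
    return target
```

Looking up the adapter by type name keeps the loading module independent of any particular storage component: supporting a new component type only requires registering another entry in `adapters`.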
4. A unified data resource management method based on the unified data resource management system according to claim 3, characterized in that the specific steps are as follows:
Step 1: a system administrator user adds different adapter plug-ins through the adapter management module;
Step 2: a system administrator user adds or deletes data containers through the data container management module; each data container corresponds to one adapter plug-in, which indicates the type of the container, and all operations on the container are completed through the routines in that adapter plug-in; at system startup, the data container management module automatically obtains the types and physical addresses of all data containers and establishes connections to them; in addition, the data container management module is responsible for monitoring and maintaining the online working state of each data container, and the system uses this monitoring information to migrate data between data containers and thereby balance the load among them;
Step 3: an ordinary user registers an account and logs in to the system through the user registration and login module; because the big data platform on which this system is deployed runs in secure mode and uses Kerberos for user authentication, when a user registers an account the system creates a Kerberos Principal and a corresponding key for that user, and writes the Principal together with the key into a keytab file that only the system service program can access; when the user logs in, the system passes Kerberos authentication only once and, after authentication succeeds, caches the user's Configuration object to avoid frequent repeated authentication;
Step 4: after logging in, a registered ordinary user uploads a data package to the platform through the data package loading module, and the platform loads and manages it; the upload is completed in two steps: in the first step, the data package is packed into a single compressed file and uploaded through the file upload page provided by the system; in the second step, the user specifies the type of the data package in a form, enters its basic description information, and selects the file ID of the previously uploaded file, whereupon a data package object is created on the platform; according to the type of the data package object, the system uses the load-balancing mechanism to dynamically select one container from the data containers of that type to manage the data package, loads the data package into that container through the adapter, and at the same time adds the metadata related to the data package to the metadata store through the metadata management module, recording the physical address of the data package;
Step 5: after logging in, a registered ordinary user searches for data packages of interest through the data resource discovery module; the user submits a query at the access end; the data resource discovery module first verifies the user's identity and the validity of the query content, retrieves from the metadata store the list of data packages hit by the query, filters it according to the working state of the data containers in which the data resources reside, and returns the filtered data package list to the access end as the query result; from the query result, the user can select a data package of interest at the access end and view its details; the descriptive metadata of a data package includes a text description, industry, discipline, keywords, data package type, and other information, and unified data resource discovery uses the query entered by the user to search the descriptive metadata store for the data packages the user is interested in;
Step 6: the user views the details of a data package in the data resource discovery module and, if the data package is needed, submits a data access request to the resource owner through the access end; in the data access request form, the user fills in the name of the data resource to be accessed, the types of access operation required, and the required access term, and submits the application, which is sent to the owner of the data package for approval; after logging in to the system, the data owner can view the corresponding access application and decide whether to approve or reject it; if the application is approved, the corresponding authorization work is handed over to the data access authorization module; the unified authorization module parses the authorization request record to obtain the list of requested data objects and operations, locates the data container in which they reside according to the type and name of the data package, and then calls the authorization routine in the corresponding adapter plug-in, which converts the authorization information into a series of function calls on that data container and grants the corresponding access permissions to the applicant; after the authorization operation succeeds, the system notifies the applicant.
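The approval flow of Step 6 can be sketched as a small state transition followed by adapter calls. Everything named here is illustrative: `AccessRequest`, `decide`, the `catalog` mapping from package name to (type name, container name), and the four-argument grant routine are assumptions, and owner authentication and applicant notification are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class AccessRequest:
    """One access application filled in by a user (hypothetical form fields)."""
    applicant: str
    package: str
    operations: List[str]   # e.g. ["read"]
    days: int               # requested access term
    status: str = "pending"


# Hypothetical adapter authorization routine:
# (container name, package name, user, operation) -> None.
GrantFn = Callable[[str, str, str, str], None]


def decide(request: AccessRequest, approve: bool,
           catalog: Dict[str, Tuple[str, str]],
           adapters: Dict[str, GrantFn]) -> str:
    """Apply the owner's decision; on approval, issue one grant call per operation."""
    if request.status != "pending":
        raise ValueError("request already decided")
    if not approve:
        request.status = "rejected"
        return request.status
    # Locate the container that holds the package, then use its type's adapter.
    type_name, container = catalog[request.package]
    for operation in request.operations:
        adapters[type_name](container, request.package, request.applicant, operation)
    request.status = "granted"
    return request.status
```

A granted request would then be recorded with its expiry time (derived from `days`) so that the periodic scan of valid grants can reclaim it when the term ends.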
CN201610555871.9A 2016-07-15 2016-07-15 Unified data resource management system and method for big data platform Active CN106202452B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610555871.9A CN106202452B (en) 2016-07-15 2016-07-15 Unified data resource management system and method for big data platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610555871.9A CN106202452B (en) 2016-07-15 2016-07-15 Unified data resource management system and method for big data platform

Publications (2)

Publication Number Publication Date
CN106202452A true CN106202452A (en) 2016-12-07
CN106202452B CN106202452B (en) 2020-05-26

Family

ID=57474356

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610555871.9A Active CN106202452B (en) 2016-07-15 2016-07-15 Unified data resource management system and method for big data platform

Country Status (1)

Country Link
CN (1) CN106202452B (en)

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106533967A (en) * 2016-12-08 2017-03-22 北京中安智达科技有限公司 Data transmission method capable of customizing load balance strategy
CN106940867A (en) * 2017-02-24 2017-07-11 深圳国泰安教育技术股份有限公司 A kind of financial trade method and device
CN107798457A (en) * 2017-07-24 2018-03-13 上海壹账通金融科技有限公司 Investment combination proposal recommending method, device, computer equipment and storage medium
CN107832440A (en) * 2017-11-17 2018-03-23 北京锐安科技有限公司 A kind of data digging method, device, server and computer-readable recording medium
CN108037919A (en) * 2017-12-01 2018-05-15 北京博宇通达科技有限公司 A kind of visualization big data workflow configuration method and system based on WEB
CN108052618A (en) * 2017-12-15 2018-05-18 北京搜狐新媒体信息技术有限公司 Data managing method and device
CN108243145A (en) * 2016-12-23 2018-07-03 航天星图科技(北京)有限公司 A kind of multi-source identity identifying method
CN108268798A (en) * 2017-06-30 2018-07-10 勤智数码科技股份有限公司 A kind of data item authority distributing method and system
CN108809900A (en) * 2017-05-02 2018-11-13 武汉斗鱼网络科技有限公司 A kind of frame and method of unified resource access
CN108874971A (en) * 2018-06-07 2018-11-23 北京赛思信安技术股份有限公司 A kind of tool and method applied to the storage of magnanimity labeling solid data
CN109150908A (en) * 2018-10-08 2019-01-04 四川大学 A kind of big data platform protective device and its guard method being deployed in gateway
CN109522090A (en) * 2018-11-09 2019-03-26 中国联合网络通信集团有限公司 Resource regulating method and device
CN109614167A (en) * 2018-12-07 2019-04-12 杭州数澜科技有限公司 A kind of method and system managing plug-in unit
CN109981698A (en) * 2017-12-27 2019-07-05 博元森禾信息科技(北京)有限公司 Number networking cross-domain data access standardized system and method based on metadata
CN110188887A (en) * 2018-09-26 2019-08-30 第四范式(北京)技术有限公司 The data managing method and device of Machine oriented study
CN110223185A (en) * 2019-05-20 2019-09-10 中国平安财产保险股份有限公司 A kind of information benefit transmission method and relevant device based on data processing
CN110651254A (en) * 2017-05-05 2020-01-03 布伦瑞克工业大学 Method, computer system and computer program for coordinating access to resources of a distributed computer system
CN110678845A (en) * 2017-06-29 2020-01-10 国际商业机器公司 Multi-tenant data services in a distributed file system for big data analytics
CN110750218A (en) * 2019-10-18 2020-02-04 北京浪潮数据技术有限公司 Storage resource management method, device, equipment and readable storage medium
CN111143449A (en) * 2019-12-12 2020-05-12 北京中电普华信息技术有限公司 Data service method and device based on unified data model
CN112395340A (en) * 2020-11-16 2021-02-23 青岛海信网络科技股份有限公司 Data asset management method and device
CN112685425A (en) * 2021-01-08 2021-04-20 东云睿连(武汉)计算技术有限公司 Data asset meta-information processing system and method
CN112966036A (en) * 2021-03-10 2021-06-15 浪潮云信息技术股份公司 Method for constructing main data service based on logic model
CN115510121A (en) * 2022-10-08 2022-12-23 上海数禾信息科技有限公司 Method, device and equipment for managing business form data and readable storage medium
CN110223185B (en) * 2019-05-20 2024-05-14 中国平安财产保险股份有限公司 Information complementary transmission method based on data processing and related equipment

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112131289A (en) * 2020-08-17 2020-12-25 武汉旷视金智科技有限公司 Data processing method and device, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102724291A (en) * 2012-05-23 2012-10-10 北京经纬恒润科技有限公司 Vehicle network data acquisition method, unit and system
CN104657214A (en) * 2015-03-13 2015-05-27 华存数据信息技术有限公司 Multi-queue multi-priority big data task management system and method for achieving big data task management by utilizing system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨臻等: "面向海量数据的在线流数据服务框架", 《计算机应用与软件》 *

Cited By (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106533967B (en) * 2016-12-08 2019-04-12 北京中安智达科技有限公司 A kind of data transmission method can customize load balancing
CN106533967A (en) * 2016-12-08 2017-03-22 北京中安智达科技有限公司 Data transmission method capable of customizing load balance strategy
CN108243145A (en) * 2016-12-23 2018-07-03 航天星图科技(北京)有限公司 A kind of multi-source identity identifying method
CN108243145B (en) * 2016-12-23 2019-04-26 中科星图股份有限公司 A kind of multi-source identity identifying method
CN106940867A (en) * 2017-02-24 2017-07-11 深圳国泰安教育技术股份有限公司 A kind of financial trade method and device
CN108809900B (en) * 2017-05-02 2021-09-07 武汉斗鱼网络科技有限公司 Framework and method for unified resource access
CN108809900A (en) * 2017-05-02 2018-11-13 武汉斗鱼网络科技有限公司 A kind of frame and method of unified resource access
CN110651254B (en) * 2017-05-05 2023-11-10 TTZ片上网络与资源管理(NoC)创新学会布伦瑞克工业大学 Method, computer system and computer program for coordinating access to resources of a distributed computer system
CN110651254A (en) * 2017-05-05 2020-01-03 布伦瑞克工业大学 Method, computer system and computer program for coordinating access to resources of a distributed computer system
CN110678845B (en) * 2017-06-29 2023-05-12 国际商业机器公司 Multi-tenant data service in a distributed file system for big data analysis
CN110678845A (en) * 2017-06-29 2020-01-10 国际商业机器公司 Multi-tenant data services in a distributed file system for big data analytics
CN108268798A (en) * 2017-06-30 2018-07-10 勤智数码科技股份有限公司 A kind of data item authority distributing method and system
CN108268798B (en) * 2017-06-30 2023-09-05 勤智数码科技股份有限公司 Data item authority allocation method and system
CN107798457A (en) * 2017-07-24 2018-03-13 上海壹账通金融科技有限公司 Investment combination proposal recommending method, device, computer equipment and storage medium
CN107798457B (en) * 2017-07-24 2021-08-03 深圳壹账通智能科技有限公司 Investment portfolio scheme recommending method, device, computer equipment and storage medium
CN107832440A (en) * 2017-11-17 2018-03-23 北京锐安科技有限公司 A kind of data digging method, device, server and computer-readable recording medium
CN107832440B (en) * 2017-11-17 2020-10-13 北京锐安科技有限公司 Data mining method, device, server and computer readable storage medium
CN108037919A (en) * 2017-12-01 2018-05-15 北京博宇通达科技有限公司 A kind of visualization big data workflow configuration method and system based on WEB
CN108052618A (en) * 2017-12-15 2018-05-18 北京搜狐新媒体信息技术有限公司 Data managing method and device
CN109981698B (en) * 2017-12-27 2022-03-04 博元森禾信息科技(北京)有限公司 Metadata-based data networking cross-domain data access standardization system and method
CN109981698A (en) * 2017-12-27 2019-07-05 博元森禾信息科技(北京)有限公司 Number networking cross-domain data access standardized system and method based on metadata
CN108874971B (en) * 2018-06-07 2021-09-24 北京赛思信安技术股份有限公司 Tool and method applied to mass tagged entity data storage
CN108874971A (en) * 2018-06-07 2018-11-23 北京赛思信安技术股份有限公司 A kind of tool and method applied to the storage of magnanimity labeling solid data
CN110188887A (en) * 2018-09-26 2019-08-30 第四范式(北京)技术有限公司 The data managing method and device of Machine oriented study
CN110188887B (en) * 2018-09-26 2022-11-08 第四范式(北京)技术有限公司 Data management method and device for machine learning
CN109150908A (en) * 2018-10-08 2019-01-04 四川大学 A kind of big data platform protective device and its guard method being deployed in gateway
CN109522090A (en) * 2018-11-09 2019-03-26 中国联合网络通信集团有限公司 Resource regulating method and device
CN109614167B (en) * 2018-12-07 2023-10-20 杭州数澜科技有限公司 Method and system for managing plug-ins
CN109614167A (en) * 2018-12-07 2019-04-12 杭州数澜科技有限公司 A kind of method and system managing plug-in unit
CN110223185B (en) * 2019-05-20 2024-05-14 中国平安财产保险股份有限公司 Information complementary transmission method based on data processing and related equipment
CN110223185A (en) * 2019-05-20 2019-09-10 中国平安财产保险股份有限公司 A kind of information benefit transmission method and relevant device based on data processing
CN110750218A (en) * 2019-10-18 2020-02-04 北京浪潮数据技术有限公司 Storage resource management method, device, equipment and readable storage medium
CN111143449A (en) * 2019-12-12 2020-05-12 北京中电普华信息技术有限公司 Data service method and device based on unified data model
CN112395340B (en) * 2020-11-16 2023-07-28 青岛海信网络科技股份有限公司 Data asset management method and device
CN112395340A (en) * 2020-11-16 2021-02-23 青岛海信网络科技股份有限公司 Data asset management method and device
CN112685425B (en) * 2021-01-08 2022-06-17 东云睿连(武汉)计算技术有限公司 Data asset meta-information processing system and method
CN112685425A (en) * 2021-01-08 2021-04-20 东云睿连(武汉)计算技术有限公司 Data asset meta-information processing system and method
CN112966036B (en) * 2021-03-10 2023-02-21 浪潮云信息技术股份公司 Method for constructing main data service based on logic model
CN112966036A (en) * 2021-03-10 2021-06-15 浪潮云信息技术股份公司 Method for constructing main data service based on logic model
CN115510121A (en) * 2022-10-08 2022-12-23 上海数禾信息科技有限公司 Method, device and equipment for managing business form data and readable storage medium
CN115510121B (en) * 2022-10-08 2024-01-05 上海数禾信息科技有限公司 List data management method, device, equipment and readable storage medium

Also Published As

Publication number Publication date
CN106202452B (en) 2020-05-26

Similar Documents

Publication Publication Date Title
CN106202452A (en) The uniform data resource management system of big data platform and method
AU2018374912B2 (en) Model training system and method, and storage medium
US8782096B2 (en) Virtual repository management
US8726351B2 (en) Systems and methods for controlling access to electronic records in an archives system
CN102236701B (en) Dependency graphs for multiple domains
US9524306B2 (en) Global information management system and method
US7299171B2 (en) Method and system for processing grammar-based legality expressions
CA2508928C (en) Method, system, and apparatus for discovering and connecting to data sources
US7720863B2 (en) Security view-based, external enforcement of business application security rules
RU2446456C2 (en) Integration of corporate search engines with access control application programming special interfaces
CN103812939B (en) Big data storage system
CN101226573B (en) Method for controlling access authority of electric document
US10438008B2 (en) Row level security
US20060230044A1 (en) Records management federation
US20070192374A1 (en) Virtual repository management to provide functionality
WO2022011144A1 (en) Pipeline systems and methods for use in data analytics platforms
US20170185661A1 (en) Extensible extract, transform and load (etl) framework
WO2018108423A1 (en) System and method for user authorization
CN111737216A (en) Data user environment, data governance method, and computer-readable storage medium
CN109063061A (en) Across distributed system data processing method, device, equipment and storage medium
US20140358921A1 (en) Delegating resembling data of an organization to a linked device
KR101109425B1 (en) System of managing documents
Caniou et al. Data management API within the GridRPC
Islam et al. E-Government Project Using Oracle Database
Liu et al. Digital Library Infrastructure--A Case Study on Sharing Information Resources in China

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant