CN110457018A - Hadoop-based data management system and management method thereof - Google Patents

Hadoop-based data management system and management method thereof

Info

Publication number
CN110457018A
Authority
CN
China
Prior art keywords
data
user
oozie
component
data management
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910757780.7A
Other languages
Chinese (zh)
Inventor
丁瑞
高传集
于昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Cloud Information Technology Co Ltd
Original Assignee
Inspur Cloud Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Cloud Information Technology Co Ltd
Priority to CN201910757780.7A
Publication of CN110457018A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/20 Software design
    • G06F8/24 Object-oriented

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a Hadoop-based data management system and a management method thereof. The system comprises a front-end component, a Hadoop framework, an HDFS distributed file system, an Oozie component, a Hive data warehouse tool and a Sqoop processing tool. The HDFS distributed file system, the Oozie component, the Hive data warehouse tool and the Sqoop processing tool are deployed in the Hadoop framework. The front-end component is configured to send a data management operation to the Oozie component upon receiving the data management operation input by a user. The Oozie component is configured to, upon receiving the data management operation, call the Hive data warehouse tool to determine the metadata corresponding to the data management operation, and call the Sqoop processing tool to manage, according to the data management operation, the data to be managed that corresponds to that metadata in the HDFS distributed file system, wherein the metadata corresponding to the data management operation and the data to be managed are in a mapping relationship. This solution can reduce the difficulty of data management.

Description

Hadoop-based data management system and management method thereof
Technical field
The present invention relates to the field of computer technology, and in particular to a Hadoop-based data management system and a management method thereof.
Background
In recent years, the wave of big data has swept across the industry. Within it, one ecosystem occupies a particularly important position: the Hadoop Distributed File System (HDFS). HDFS has developed rapidly from a fringe technology into a mainstream big data technology and has almost become synonymous with big data.
At present, managing data through HDFS requires the user to enter the corresponding parameters on the command line.
As can be seen from the above, managing data through the command line in the prior art requires the user to be familiar with the corresponding parameters in advance, which makes data management difficult.
Summary of the invention
Embodiments of the present invention provide a Hadoop-based data management system and a management method thereof, which can reduce the difficulty of data management.
In a first aspect, an embodiment of the present invention provides a Hadoop-based data management system, comprising:
A front-end component, a Hadoop framework, an HDFS distributed file system, an Oozie component, a Hive data warehouse tool and a Sqoop processing tool;
The HDFS distributed file system, the Oozie component, the Hive data warehouse tool and the Sqoop processing tool are deployed in the Hadoop framework;
The front-end component is configured to send a data management operation to the Oozie component upon receiving the data management operation input by a user;
The Oozie component is configured to, upon receiving the data management operation, call the Hive data warehouse tool to determine the metadata corresponding to the data management operation, and call the Sqoop processing tool to manage, according to the data management operation, the data to be managed that corresponds to that metadata in the HDFS distributed file system, wherein the metadata corresponding to the data management operation and the data to be managed are in a mapping relationship.
Preferably,
The front-end component is further configured to send a data request command to the Oozie component upon receiving the data request command, input by the user, for requesting data to be queried;
The Oozie component is further configured to, upon receiving the data request command, call the Hive data warehouse tool to determine whether metadata corresponding to the data request command exists, and, when it is determined that such metadata exists, call the Hive data warehouse tool to read from the HDFS distributed file system the data to be queried that corresponds to that metadata and send the data to be queried to the front-end component, wherein the metadata corresponding to the data request command and the data to be queried are in a mapping relationship;
The front-end component is configured to display the data to be queried upon receiving it.
Preferably,
The front-end component is further configured to send user registration information to the Oozie component upon receiving the user registration information, input by the user, for verifying identity;
The Oozie component is further configured to, upon receiving the user registration information, call the Hive data warehouse tool to determine whether metadata corresponding to the user registration information exists, and, when no such metadata exists, send prompt information to the front-end component so that the front-end component displays the prompt information, wherein the prompt information indicates that the user has no permission to access the HDFS distributed file system, and the metadata corresponding to the user registration information is in a mapping relationship with the corresponding target registration information in the HDFS distributed file system.
Preferably,
The system further comprises an Nginx server;
The front-end component is further configured to send an interactive instruction to the Nginx server upon receiving the interactive instruction from the user, wherein the interactive instruction comprises the data management operation;
The Nginx server is configured to send the interactive instruction to the Oozie component upon receiving the interactive instruction.
Preferably,
The Oozie component is further configured to record the workflow information of the Hive data warehouse tool and the HDFS distributed file system.
In a second aspect, an embodiment of the present invention provides a management method for the Hadoop-based data management system, comprising:
Deploying the HDFS distributed file system, the Oozie component, the Hive data warehouse tool and the Sqoop processing tool in the Hadoop framework;
By the front-end component, upon receiving a data management operation input by a user, sending the data management operation to the Oozie component;
By the Oozie component, upon receiving the data management operation, calling the Hive data warehouse tool to determine the metadata corresponding to the data management operation;
By the Oozie component, calling the Sqoop processing tool to manage, according to the data management operation, the data to be managed that corresponds to that metadata in the HDFS distributed file system, wherein the metadata corresponding to the data management operation and the data to be managed are in a mapping relationship.
Preferably,
The method further comprises:
By the front-end component, upon receiving a data request command, input by the user, for requesting data to be queried, sending the data request command to the Oozie component;
By the Oozie component, upon receiving the data request command, calling the Hive data warehouse tool to determine whether metadata corresponding to the data request command exists, and, when it is determined that such metadata exists, calling the Hive data warehouse tool to read from the HDFS distributed file system the data to be queried that corresponds to that metadata and sending the data to be queried to the front-end component, wherein the metadata corresponding to the data request command and the data to be queried are in a mapping relationship;
By the front-end component, upon receiving the data to be queried, displaying the data to be queried.
Preferably,
The method further comprises:
By the front-end component, upon receiving user registration information, input by the user, for verifying identity, sending the user registration information to the Oozie component;
By the Oozie component, upon receiving the user registration information, calling the Hive data warehouse tool to determine whether metadata corresponding to the user registration information exists, and, when no such metadata exists, sending prompt information to the front-end component so that the front-end component displays the prompt information;
Wherein the prompt information indicates that the user has no permission to access the HDFS distributed file system, and the metadata corresponding to the user registration information is in a mapping relationship with the corresponding target registration information in the HDFS distributed file system.
Preferably,
The method further comprises:
By the front-end component, upon receiving an interactive instruction from the user, sending the interactive instruction to an Nginx server, wherein the interactive instruction comprises the data management operation;
By the Nginx server, upon receiving the interactive instruction, sending the interactive instruction to the Oozie component.
Preferably,
The method further comprises:
By the Oozie component, recording the workflow information of the Hive data warehouse tool and the HDFS distributed file system.
Embodiments of the present invention provide a Hadoop-based data management system and a management method thereof. The HDFS distributed file system, the Oozie component, the Hive data warehouse tool and the Sqoop processing tool are deployed in the Hadoop framework, and the front-end component provides the user with a visual interactive interface in which the user can enter data management operations. The Oozie component then calls the Hive data warehouse tool to determine the metadata that is mapped to the data to be managed in the HDFS distributed file system. Using the mapping relationship between that metadata and the data to be managed, the Oozie component calls the Sqoop processing tool to locate the data to be managed and to perform management operations on it, such as import, export, deletion and modification, and also manages the metadata in the Hive data warehouse tool. As a result, Hadoop-related operations become much easier for the user, who no longer needs to manage data by entering command lines, and the difficulty of data management is reduced.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of a Hadoop-based data management system provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of another Hadoop-based data management system provided by an embodiment of the present invention;
Fig. 3 is a flowchart of a management method for a Hadoop-based data management system provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
As shown in Fig. 1, an embodiment of the present invention provides a Hadoop-based data management system, comprising:
A front-end component 101, a Hadoop framework 102, an HDFS distributed file system 103, an Oozie component 104, a Hive data warehouse tool 105 and a Sqoop processing tool 106;
The HDFS distributed file system 103, the Oozie component 104, the Hive data warehouse tool 105 and the Sqoop processing tool 106 are deployed in the Hadoop framework 102;
The front-end component 101 is configured to send a data management operation to the Oozie component 104 upon receiving the data management operation input by a user;
The Oozie component 104 is configured to, upon receiving the data management operation, call the Hive data warehouse tool 105 to determine the metadata corresponding to the data management operation, and call the Sqoop processing tool 106 to manage, according to the data management operation, the data to be managed that corresponds to that metadata in the HDFS distributed file system 103, wherein the metadata corresponding to the data management operation and the data to be managed are in a mapping relationship.
In this embodiment of the present invention, the HDFS distributed file system, the Oozie component, the Hive data warehouse tool and the Sqoop processing tool are deployed in the Hadoop framework, and the front-end component provides the user with a visual interactive interface in which the user can enter data management operations. The Oozie component then calls the Hive data warehouse tool to determine the metadata that is mapped to the data to be managed in the HDFS distributed file system. Using the mapping relationship between that metadata and the data to be managed, the Oozie component calls the Sqoop processing tool to locate the data to be managed and to perform management operations on it, such as import, export, deletion and modification, and also manages the metadata in the Hive data warehouse tool. As a result, Hadoop-related operations become much easier for the user, who no longer needs to manage data by entering command lines, and the difficulty of data management is reduced.
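As a non-limiting illustration, the control flow described above can be sketched in Python. This is a minimal in-memory model, not the implementation of the embodiment: the class names `MetadataCatalog`, `TransferTool` and `Scheduler`, the table name `orders` and the path `/warehouse/orders` are all hypothetical stand-ins for the Hive data warehouse tool, the Sqoop processing tool, the Oozie component and their data.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataCatalog:
    """Hypothetical stand-in for the Hive data warehouse tool:
    maps a table name (metadata) to the HDFS path of the data."""
    tables: dict = field(default_factory=dict)

    def lookup(self, table):
        return self.tables.get(table)


@dataclass
class TransferTool:
    """Hypothetical stand-in for the Sqoop processing tool:
    it only records the requested action instead of moving data."""
    executed: list = field(default_factory=list)

    def run(self, action, hdfs_path):
        self.executed.append((action, hdfs_path))
        return f"{action} completed for {hdfs_path}"


class Scheduler:
    """Hypothetical stand-in for the Oozie component:
    resolves metadata first, then delegates the operation."""
    SUPPORTED = {"import", "export", "delete", "modify"}

    def __init__(self, catalog, tool):
        self.catalog = catalog
        self.tool = tool

    def handle(self, operation):
        action, table = operation["action"], operation["table"]
        if action not in self.SUPPORTED:
            raise ValueError(f"unsupported action: {action}")
        # Metadata -> data mapping: find where the managed data lives.
        hdfs_path = self.catalog.lookup(table)
        if hdfs_path is None:
            raise KeyError(f"no metadata for table: {table}")
        return self.tool.run(action, hdfs_path)


catalog = MetadataCatalog({"orders": "/warehouse/orders"})
tool = TransferTool()
scheduler = Scheduler(catalog, tool)

# The front-end component would submit an operation like this one.
result = scheduler.handle({"action": "export", "table": "orders"})
print(result)
```

The user supplies only an action and a table name; the path resolution and the tool invocation stay behind the scheduler, which is the point of replacing command-line parameters with a front-end operation.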
To facilitate data management, in an embodiment of the present invention, the front-end component is further configured to send a data request command to the Oozie component upon receiving the data request command, input by the user, for requesting data to be queried;
The Oozie component is further configured to, upon receiving the data request command, call the Hive data warehouse tool to determine whether metadata corresponding to the data request command exists, and, when it is determined that such metadata exists, call the Hive data warehouse tool to read from the HDFS distributed file system the data to be queried that corresponds to that metadata and send the data to be queried to the front-end component, wherein the metadata corresponding to the data request command and the data to be queried are in a mapping relationship;
The front-end component is configured to display the data to be queried upon receiving it.
In this embodiment of the present invention, the user can, as needed, enter into the front-end component a data request command for querying data stored in the HDFS distributed file system. The Oozie component calls the Hive data warehouse tool to determine whether corresponding metadata exists, and thereby whether the data the user wants to query exists. When that data exists, it is read from the HDFS distributed file system through the Hive data warehouse tool and sent to the front-end component, so that it is presented to the user and the user's query need is met.
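The query path above can be sketched in the same spirit. The following is only an assumed model: `handle_query`, the dictionary-based `metadata` and `storage`, and the sample rows are hypothetical, and plain dictionaries stand in for the Hive metadata and the HDFS contents.

```python
def handle_query(request, metadata, storage):
    """Return the queried data if metadata for the request exists.

    metadata: table name -> storage path (stand-in for Hive metadata)
    storage:  storage path -> list of rows (stand-in for HDFS contents)
    """
    path = metadata.get(request["table"])
    if path is None:
        # No metadata means the requested data does not exist.
        return {"found": False, "rows": []}
    # Metadata exists: read the mapped data and hand it back for display.
    return {"found": True, "rows": storage.get(path, [])}


metadata = {"orders": "/warehouse/orders"}
storage = {"/warehouse/orders": [{"id": 1}, {"id": 2}]}

hit = handle_query({"table": "orders"}, metadata, storage)
miss = handle_query({"table": "invoices"}, metadata, storage)
print(hit["found"], len(hit["rows"]), miss["found"])
```

Checking metadata before touching storage mirrors the order of operations in the embodiment: the existence test is cheap, and the read happens only when it succeeds.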
To improve the security of data management, in an embodiment of the present invention, the front-end component is further configured to send user registration information to the Oozie component upon receiving the user registration information, input by the user, for verifying identity;
The Oozie component is further configured to, upon receiving the user registration information, call the Hive data warehouse tool to determine whether metadata corresponding to the user registration information exists, and, when no such metadata exists, send prompt information to the front-end component so that the front-end component displays the prompt information, wherein the prompt information indicates that the user has no permission to access the HDFS distributed file system.
In this embodiment of the present invention, the user must be authenticated before performing operations such as data management and query. That is, the user enters user registration information through the interactive interface provided by the front-end component, and the Oozie component calls the Hive data warehouse tool to determine whether there is metadata that has a mapping relationship with target registration information, stored in the HDFS distributed file system, corresponding to the user registration information. If no such metadata exists, the HDFS distributed file system holds no target registration information corresponding to the user registration information, so the user can be determined to be an unauthorized user. The front-end component can then output a prompt indicating that the user has no permission to access the HDFS distributed file system, so that the user understands why the storage system cannot be accessed.
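A minimal sketch of this check follows, assuming that registration information consists of a user name and a password and that the registered metadata is a set of (user name, password digest) pairs. None of these details come from the embodiment; `verify_user`, `digest` and the sample accounts are hypothetical, and an unsalted SHA-256 digest is used only to keep the sketch short.

```python
import hashlib

NO_PERMISSION = "No permission to access the HDFS distributed file system"


def digest(password):
    # Simplification for illustration; a real system would use a
    # salted, deliberately slow password hash.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def verify_user(registration, registered_metadata):
    """Look up the registration information in the registered metadata.

    registered_metadata: set of (username, password digest) pairs,
    standing in for the metadata mapped to target registration info.
    """
    key = (registration["username"], digest(registration["password"]))
    if key not in registered_metadata:
        # No matching metadata: the front end should show this prompt.
        return {"allowed": False, "prompt": NO_PERMISSION}
    return {"allowed": True, "prompt": None}


registered = {("alice", digest("s3cret"))}

ok = verify_user({"username": "alice", "password": "s3cret"}, registered)
denied = verify_user({"username": "bob", "password": "guess"}, registered)
print(ok["allowed"], denied["allowed"], denied["prompt"])
```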
To improve the security of data management, in an embodiment of the present invention, as shown in Fig. 2, the Hadoop-based data management system further comprises an Nginx server 201;
The front-end component 101 is further configured to send an interactive instruction to the Nginx server 201 upon receiving the interactive instruction from the user, wherein the interactive instruction comprises the data management operation;
The Nginx server 201 is configured to send the interactive instruction to the Oozie component 104 upon receiving the interactive instruction.
In this embodiment of the present invention, the Nginx server enables the user to interact with the back-end HDFS distributed file system and can also provide load balancing. Tasks are executed by the Oozie component, the Hive data warehouse tool and the Sqoop processing tool to meet the user's management needs.
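The forwarding and load-balancing role can be illustrated with a small round-robin model. The embodiment does not say which balancing method is used; round-robin is assumed here because it is the default upstream method in Nginx. The class `RoundRobinProxy` and the back-end names `oozie-a` and `oozie-b` are hypothetical.

```python
from itertools import cycle


class RoundRobinProxy:
    """Toy model of a reverse proxy that spreads interactive
    instructions over several back-end schedulers in turn."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = cycle(backends)

    def forward(self, instruction):
        # Pick the next back end in rotation and pass the instruction on.
        backend = next(self._backends)
        return backend, instruction


proxy = RoundRobinProxy(["oozie-a", "oozie-b"])
targets = [
    proxy.forward({"action": "import", "table": "orders"})[0]
    for _ in range(3)
]
print(targets)
```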
To help the user understand the status of data management, in an embodiment of the present invention, the Oozie component is further configured to record the workflow information of the Hive data warehouse tool and the HDFS distributed file system.
In this embodiment of the present invention, when scheduling tasks, the Oozie component records the workflow information of the Hive data warehouse tool and the HDFS distributed file system during task scheduling to form a log, so that the outcome of task scheduling can be determined from the recorded information. For example, if task scheduling fails, the cause of the error can be checked in the Oozie component log; if the detailed cause cannot be found there, the error information of the specific task can be checked through the port of the HDFS distributed file system.
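The logging behavior can be sketched as follows, under the assumption that each workflow step is recorded with a task name, the component involved, a status and an optional detail message. The record layout, the class names and the sample entries are hypothetical and do not reflect the actual Oozie log format.

```python
from dataclasses import dataclass


@dataclass
class WorkflowRecord:
    task: str
    component: str   # e.g. "hive" or "hdfs" (hypothetical labels)
    status: str      # "SUCCEEDED" or "FAILED"
    detail: str = ""


class WorkflowLog:
    """Collects workflow records so failures can be inspected later."""

    def __init__(self):
        self.records = []

    def record(self, task, component, status, detail=""):
        self.records.append(WorkflowRecord(task, component, status, detail))

    def failures(self):
        # Failed steps are the ones a user would check for error causes.
        return [r for r in self.records if r.status == "FAILED"]


log = WorkflowLog()
log.record("export-orders", "hive", "SUCCEEDED")
log.record("export-orders", "hdfs", "FAILED", "target path not found")

for failure in log.failures():
    print(failure.task, failure.component, failure.detail)
```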
As shown in Fig. 3, an embodiment of the present invention provides a management method for a Hadoop-based data management system, comprising:
Step 301: deploying the HDFS distributed file system, the Oozie component, the Hive data warehouse tool and the Sqoop processing tool in the Hadoop framework;
Step 302: by the front-end component, upon receiving a data management operation input by a user, sending the data management operation to the Oozie component;
Step 303: by the Oozie component, upon receiving the data management operation, calling the Hive data warehouse tool to determine the metadata corresponding to the data management operation;
Step 304: by the Oozie component, calling the Sqoop processing tool to manage, according to the data management operation, the data to be managed that corresponds to that metadata in the HDFS distributed file system, wherein the metadata corresponding to the data management operation and the data to be managed are in a mapping relationship.
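One way step 304 might translate a front-end operation into a Sqoop invocation is sketched below. It assumes that the import and export operations correspond to the `sqoop import` and `sqoop export` commands with their `--connect`, `--username`, `--table`, `--target-dir` and `--export-dir` options; the function `build_sqoop_command`, the JDBC URL, the user name and the paths are hypothetical, and the sketch only builds the argument list without running anything.

```python
def build_sqoop_command(action, jdbc_url, table, hdfs_dir, username):
    """Build an argument list for a Sqoop import or export.

    Only import and export are covered; how the embodiment handles
    deletion and modification is not specified and is not modeled.
    """
    # The HDFS-side directory option differs between the two commands.
    dir_flag = {"import": "--target-dir", "export": "--export-dir"}
    if action not in dir_flag:
        raise ValueError(f"unsupported action: {action}")
    return [
        "sqoop", action,
        "--connect", jdbc_url,
        "--username", username,
        "--table", table,
        dir_flag[action], hdfs_dir,
    ]


command = build_sqoop_command(
    action="import",
    jdbc_url="jdbc:mysql://db.example.com/sales",  # hypothetical source
    table="orders",
    hdfs_dir="/warehouse/orders",                  # path resolved in step 303
    username="etl_user",
)
print(" ".join(command))
```

Keeping the command as a list rather than a shell string avoids quoting problems if it is later passed to a process launcher.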
In this embodiment of the present invention, the HDFS distributed file system, the Oozie component, the Hive data warehouse tool and the Sqoop processing tool are deployed in the Hadoop framework, and the front-end component provides the user with a visual interactive interface in which the user can enter data management operations. The Oozie component then calls the Hive data warehouse tool to determine the metadata that is mapped to the data to be managed in the HDFS distributed file system. Using the mapping relationship between that metadata and the data to be managed, the Oozie component calls the Sqoop processing tool to locate the data to be managed and to perform management operations on it, such as import, export, deletion and modification. As a result, Hadoop-related operations become much easier for the user, who no longer needs to manage data by entering command lines, and the difficulty of data management is reduced.
In an embodiment of the present invention, the method further comprises:
By the front-end component, upon receiving a data request command, input by the user, for requesting data to be queried, sending the data request command to the Oozie component;
By the Oozie component, upon receiving the data request command, calling the Hive data warehouse tool to determine whether metadata corresponding to the data request command exists, and, when it is determined that such metadata exists, calling the Hive data warehouse tool to read from the HDFS distributed file system the data to be queried that corresponds to that metadata and sending the data to be queried to the front-end component, wherein the metadata corresponding to the data request command and the data to be queried are in a mapping relationship;
By the front-end component, upon receiving the data to be queried, displaying the data to be queried.
In an embodiment of the present invention, the method further comprises:
By the front-end component, upon receiving user registration information, input by the user, for verifying identity, sending the user registration information to the Oozie component;
By the Oozie component, upon receiving the user registration information, calling the Hive data warehouse tool to determine whether metadata corresponding to the user registration information exists, and, when no such metadata exists, sending prompt information to the front-end component so that the front-end component displays the prompt information;
Wherein the prompt information indicates that the user has no permission to access the HDFS distributed file system, and the metadata corresponding to the user registration information is in a mapping relationship with the corresponding target registration information in the HDFS distributed file system.
In an embodiment of the present invention, the method further comprises:
By the front-end component, upon receiving an interactive instruction from the user, sending the interactive instruction to an Nginx server, wherein the interactive instruction comprises the data management operation;
By the Nginx server, upon receiving the interactive instruction, sending the interactive instruction to the Oozie component.
In an embodiment of the present invention, the method further comprises:
By the Oozie component, recording the workflow information of the Hive data warehouse tool and the HDFS distributed file system.
The embodiments of the present invention have at least the following beneficial effects:
1. In an embodiment of the present invention, the HDFS distributed file system, the Oozie component, the Hive data warehouse tool and the Sqoop processing tool are deployed in the Hadoop framework, and the front-end component provides the user with a visual interactive interface in which the user can enter data management operations. The Oozie component then calls the Hive data warehouse tool to determine the metadata that is mapped to the data to be managed in the HDFS distributed file system. Using the mapping relationship between that metadata and the data to be managed, the Oozie component calls the Sqoop processing tool to locate the data to be managed and to perform management operations on it, such as import, export, deletion and modification. As a result, Hadoop-related operations become much easier for the user, who no longer needs to manage data by entering command lines, and the difficulty of data management is reduced.
2. In an embodiment of the present invention, the user can, as needed, enter into the front-end component a data request command for querying data stored in the HDFS distributed file system. The Oozie component calls the Hive data warehouse tool to determine whether corresponding metadata exists, and thereby whether the data the user wants to query exists. When that data exists, it is read from the HDFS distributed file system through the Hive data warehouse tool and sent to the front-end component, so that it is presented to the user and the user's query need is met.
3. In an embodiment of the present invention, the user must be authenticated before performing operations such as data management and query. That is, the user enters user registration information through the interactive interface provided by the front-end component, and the Oozie component calls the Hive data warehouse tool to determine whether there is metadata that has a mapping relationship with target registration information, stored in the HDFS distributed file system, corresponding to the user registration information. If no such metadata exists, the HDFS distributed file system holds no target registration information corresponding to the user registration information, so the user can be determined to be an unauthorized user. The front-end component can then output a prompt indicating that the user has no permission to access the HDFS distributed file system, so that the user understands why the storage system cannot be accessed.
4. In an embodiment of the present invention, the Nginx server enables the user to interact with the back-end HDFS distributed file system and can also provide load balancing. Tasks are executed by the Oozie component, the Hive data warehouse tool and the Sqoop processing tool to meet the user's management needs.
It should be noted that, in this document, such as first and second etc relational terms are used merely to an entity Or operation is distinguished with another entity or operation, is existed without necessarily requiring or implying between these entities or operation Any actual relationship or order.Moreover, the terms "include", "comprise" or its any other variant be intended to it is non- It is exclusive to include, so that the process, method, article or equipment for including a series of elements not only includes those elements, It but also including other elements that are not explicitly listed, or further include solid by this process, method, article or equipment Some elements.In the absence of more restrictions, the element limited by sentence " including one ", is not arranged Except there is also other identical factors in the process, method, article or apparatus that includes the element.
Finally, it should be noted that the foregoing describes only preferred embodiments of the present invention. It is intended merely to illustrate the technical solution of the invention and not to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention falls within the scope of protection of the present invention.

Claims (10)

1. A data management system based on Hadoop, characterized by comprising:
a front-end component, a Hadoop framework, an HDFS distributed file system, an Oozie component, a Hive data warehouse tool, and a Sqoop processing tool;
wherein the HDFS distributed file system, the Oozie component, the Hive data warehouse tool, and the Sqoop processing tool are built on the Hadoop framework;
the front-end component is configured to, upon receiving a data management operation input by a user, send the data management operation to the Oozie component;
the Oozie component is configured to, upon receiving the data management operation, call the Hive data warehouse tool to determine the metadata corresponding to the data management operation, and call the Sqoop processing tool to manage, according to the data management operation, the data to be managed in the HDFS distributed file system that corresponds to that metadata, wherein the metadata corresponding to the data management operation and the data to be managed are in a mapping relationship.
2. The data management system based on Hadoop according to claim 1, characterized in that:
the front-end component is further configured to, upon receiving a data request command input by the user for requesting data to be queried, send the data request command to the Oozie component;
the Oozie component is further configured to, upon receiving the data request command, call the Hive data warehouse tool to determine whether metadata corresponding to the data request command exists and, when it is determined that such metadata exists, call the Hive data warehouse tool to read from the HDFS distributed file system the data to be queried that corresponds to that metadata, and send the data to be queried to the front-end component, wherein the metadata corresponding to the data request command and the data to be queried are in a mapping relationship;
the front-end component is configured to display the data to be queried upon receiving the data to be queried.
3. The data management system based on Hadoop according to claim 1, characterized in that:
the front-end component is further configured to, upon receiving user registration information input by the user for identity verification, send the user registration information to the Oozie component;
the Oozie component is further configured to, upon receiving the user registration information, call the Hive data warehouse tool to determine whether metadata corresponding to the user registration information exists and, when no such metadata exists, send prompt information to the front-end component so that the front-end component displays the prompt information, wherein the prompt information indicates that the user has no permission to access the HDFS distributed file system, and the metadata corresponding to the user registration information is in a mapping relationship with the corresponding target registration information in the HDFS distributed file system.
4. The data management system based on Hadoop according to claim 1, characterized by further comprising: an Nginx server;
the front-end component is further configured to, upon receiving an interactive instruction from the user, send the interactive instruction to the Nginx server, wherein the interactive instruction comprises the data management operation;
the Nginx server is configured to, upon receiving the interactive instruction, send the interactive instruction to the Oozie component.
5. The data management system based on Hadoop according to any one of claims 1 to 4, characterized in that:
the Oozie component is further configured to record workflow information of the Hive data warehouse tool and the HDFS distributed file system.
6. A management method for a data management system based on Hadoop, characterized by comprising:
building an HDFS distributed file system, an Oozie component, a Hive data warehouse tool, and a Sqoop processing tool on a Hadoop framework;
sending, by the front-end component, upon receiving a data management operation input by a user, the data management operation to the Oozie component;
calling, by the Oozie component, upon receiving the data management operation, the Hive data warehouse tool to determine the metadata corresponding to the data management operation;
calling, by the Oozie component, the Sqoop processing tool to manage, according to the data management operation, the data to be managed in the HDFS distributed file system that corresponds to that metadata, wherein the metadata corresponding to the data management operation and the data to be managed are in a mapping relationship.
7. The management method for a data management system based on Hadoop according to claim 6, characterized by
further comprising:
sending, by the front-end component, upon receiving a data request command input by the user for requesting data to be queried, the data request command to the Oozie component;
calling, by the Oozie component, upon receiving the data request command, the Hive data warehouse tool to determine whether metadata corresponding to the data request command exists and, when it is determined that such metadata exists, calling the Hive data warehouse tool to read from the HDFS distributed file system the data to be queried that corresponds to that metadata, and sending the data to be queried to the front-end component, wherein the metadata corresponding to the data request command and the data to be queried are in a mapping relationship;
displaying, by the front-end component, the data to be queried upon receiving the data to be queried.
8. The management method for a data management system based on Hadoop according to claim 6, characterized by
further comprising:
sending, by the front-end component, upon receiving user registration information input by the user for identity verification, the user registration information to the Oozie component;
calling, by the Oozie component, upon receiving the user registration information, the Hive data warehouse tool to determine whether metadata corresponding to the user registration information exists and, when no such metadata exists, sending prompt information to the front-end component so that the front-end component displays the prompt information;
wherein the prompt information indicates that the user has no permission to access the HDFS distributed file system, and the metadata corresponding to the user registration information is in a mapping relationship with the corresponding target registration information in the HDFS distributed file system.
9. The management method for a data management system based on Hadoop according to claim 6, characterized by
further comprising:
sending, by the front-end component, upon receiving an interactive instruction from the user, the interactive instruction to an Nginx server, wherein the interactive instruction comprises the data management operation;
sending, by the Nginx server, upon receiving the interactive instruction, the interactive instruction to the Oozie component.
10. The management method for a data management system based on Hadoop according to any one of claims 6 to 9, characterized by
further comprising:
recording, by the Oozie component, workflow information of the Hive data warehouse tool and the HDFS distributed file system.
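The flow recited in the method claims above (resolve metadata, act on the mapped data, record workflow information) can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the claimed implementation: the class `Coordinator`, its method names, and the `"write"` and `"delete"` actions are hypothetical, and plain dictionaries stand in for the Hive metadata lookup and for HDFS storage. Returning `False` when no metadata is found is also an assumption, since the claims do not specify that case for management operations.

```python
class Coordinator:
    """Hypothetical stand-in for the role the claims assign to the Oozie component."""

    def __init__(self, metadata, storage):
        self.metadata = metadata   # key -> storage path (role of the Hive lookup)
        self.storage = storage     # storage path -> data (role of HDFS)
        self.workflow_log = []     # workflow information (claims 5 and 10)

    def manage(self, key, action, value=None):
        """Apply a data management operation to the data mapped from `key`."""
        self.workflow_log.append(("manage", key, action))
        path = self.metadata.get(key)
        if path is None:
            return False           # assumption: unknown metadata is a no-op
        # Role of the Sqoop processing tool: carry out the operation.
        if action == "write":
            self.storage[path] = value
        elif action == "delete":
            self.storage.pop(path, None)
        else:
            return False           # assumption: unsupported actions are rejected
        return True

    def query(self, key):
        """Return the data mapped from `key`, or None if no metadata exists."""
        self.workflow_log.append(("query", key))
        path = self.metadata.get(key)
        if path is None:
            return None
        return self.storage.get(path)


oozie = Coordinator({"sales": "/warehouse/sales"}, {"/warehouse/sales": [1, 2, 3]})
print(oozie.query("sales"))
oozie.manage("sales", "write", [4, 5])
print(oozie.query("sales"))
print(oozie.workflow_log)
```

Keeping the metadata map separate from the storage map reflects the mapping relationship the claims describe: callers address data by a metadata key and never by its storage path.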

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910757780.7A CN110457018A (en) 2019-08-16 2019-08-16 A kind of data management system and its management method based on Hadoop

Publications (1)

Publication Number Publication Date
CN110457018A (en) 2019-11-15

Family

ID=68487091

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191115