CN110457018A - Hadoop-based data management system and management method thereof - Google Patents
Hadoop-based data management system and management method thereof
- Publication number
- CN110457018A (CN201910757780.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- user
- oozie
- component
- data management
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/11—File system administration, e.g. details of archiving or snapshots
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/20—Software design
- G06F8/24—Object-oriented
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention provides a Hadoop-based data management system and a management method thereof. The system comprises a front-end component, a Hadoop framework, an HDFS distributed file system, an Oozie component, a Hive data warehouse tool, and a Sqoop processing tool. The HDFS distributed file system, the Oozie component, the Hive data warehouse tool, and the Sqoop processing tool are built in the Hadoop framework. The front-end component is configured to send a data management operation to the Oozie component upon receiving the data management operation input by a user. The Oozie component is configured, upon receiving the data management operation, to call the Hive data warehouse tool to determine the metadata corresponding to the data management operation, and to call the Sqoop processing tool to manage, according to the data management operation, the data to be managed in the HDFS distributed file system that corresponds to that metadata, wherein the metadata corresponding to the data management operation and the data to be managed are in a mapping relationship. This scheme can reduce the difficulty of data management.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a Hadoop-based data management system and a management method thereof.
Background
In recent years the wave of big data has swept in, and within it one ecosystem has come to occupy a particularly important position: the Hadoop Distributed File System (HDFS). HDFS has developed rapidly from a fringe technology into the mainstream technology of big data today, and has become almost synonymous with big data.
At present, managing data through HDFS requires the user to enter the corresponding parameters on the command line.
As the above description shows, because the prior art manages data through the command line, the user must be familiar with the corresponding parameters in advance, which makes data management difficult.
Summary of the invention
Embodiments of the present invention provide a Hadoop-based data management system and a management method thereof, which can reduce the difficulty of data management.
In a first aspect, an embodiment of the present invention provides a Hadoop-based data management system, comprising:
a front-end component, a Hadoop framework, an HDFS distributed file system, an Oozie component, a Hive data warehouse tool, and a Sqoop processing tool;
the HDFS distributed file system, the Oozie component, the Hive data warehouse tool, and the Sqoop processing tool are built in the Hadoop framework;
the front-end component is configured to send a data management operation to the Oozie component upon receiving the data management operation input by a user;
the Oozie component is configured, upon receiving the data management operation, to call the Hive data warehouse tool to determine the metadata corresponding to the data management operation, and to call the Sqoop processing tool to manage, according to the data management operation, the data to be managed in the HDFS distributed file system that corresponds to that metadata, wherein the metadata corresponding to the data management operation and the data to be managed are in a mapping relationship.
Preferably,
the front-end component is further configured to send a data request command to the Oozie component upon receiving the data request command, input by the user, for requesting data to be queried;
the Oozie component is further configured, upon receiving the data request command, to call the Hive data warehouse tool to determine whether metadata corresponding to the data request command exists and, when such metadata is determined to exist, to call the Hive data warehouse tool to read from the HDFS distributed file system the data to be queried that corresponds to that metadata, and to send the data to be queried to the front-end component, wherein the metadata corresponding to the data request command and the data to be queried are in a mapping relationship;
the front-end component is configured to display the data to be queried upon receiving it.
Preferably,
the front-end component is further configured to send user registration information to the Oozie component upon receiving the user registration information, input by the user, for verifying identity;
the Oozie component is further configured, upon receiving the user registration information, to call the Hive data warehouse tool to determine whether metadata corresponding to the user registration information exists and, when no such metadata exists, to send a prompt message to the front-end component so that the front-end component displays the prompt message, wherein the prompt message indicates that the user has no permission to access the HDFS distributed file system, and the metadata corresponding to the user registration information is in a mapping relationship with the corresponding target registration information in the HDFS distributed file system.
Preferably,
the system further comprises an Nginx server;
the front-end component is further configured to send an interactive instruction to the Nginx server upon receiving the interactive instruction from the user, wherein the interactive instruction includes the data management operation;
the Nginx server is configured to send the interactive instruction to the Oozie component upon receiving the interactive instruction.
Preferably,
the Oozie component is further configured to record workflow information of the Hive data warehouse tool and the HDFS distributed file system.
In a second aspect, an embodiment of the present invention provides a management method of a Hadoop-based data management system, comprising:
building the HDFS distributed file system, the Oozie component, the Hive data warehouse tool, and the Sqoop processing tool in the Hadoop framework;
sending, by the front-end component, a data management operation to the Oozie component upon receiving the data management operation input by a user;
calling, by the Oozie component upon receiving the data management operation, the Hive data warehouse tool to determine the metadata corresponding to the data management operation;
calling, by the Oozie component, the Sqoop processing tool to manage, according to the data management operation, the data to be managed in the HDFS distributed file system that corresponds to that metadata, wherein the metadata corresponding to the data management operation and the data to be managed are in a mapping relationship.
Preferably,
the method further comprises:
sending, by the front-end component, a data request command to the Oozie component upon receiving the data request command, input by the user, for requesting data to be queried;
calling, by the Oozie component upon receiving the data request command, the Hive data warehouse tool to determine whether metadata corresponding to the data request command exists and, when such metadata is determined to exist, calling the Hive data warehouse tool to read from the HDFS distributed file system the data to be queried that corresponds to that metadata, and sending the data to be queried to the front-end component, wherein the metadata corresponding to the data request command and the data to be queried are in a mapping relationship;
displaying, by the front-end component, the data to be queried upon receiving it.
Preferably,
the method further comprises:
sending, by the front-end component, user registration information to the Oozie component upon receiving the user registration information, input by the user, for verifying identity;
calling, by the Oozie component upon receiving the user registration information, the Hive data warehouse tool to determine whether metadata corresponding to the user registration information exists and, when no such metadata exists, sending a prompt message to the front-end component so that the front-end component displays the prompt message;
wherein the prompt message indicates that the user has no permission to access the HDFS distributed file system, and the metadata corresponding to the user registration information is in a mapping relationship with the corresponding target registration information in the HDFS distributed file system.
Preferably,
the method further comprises:
sending, by the front-end component, an interactive instruction to the Nginx server upon receiving the interactive instruction from the user, wherein the interactive instruction includes the data management operation;
sending, by the Nginx server, the interactive instruction to the Oozie component upon receiving the interactive instruction.
Preferably,
the method further comprises:
recording, by the Oozie component, workflow information of the Hive data warehouse tool and the HDFS distributed file system.
The embodiment of the invention provides a kind of data management system and its management method based on Hadoop, by by HDFS
Distributed file system, Oozie component, Hive Tool for Data Warehouse and Sqoop handling implement are built in Hadoop frame,
Visual interactive interface can be provided for user by front end assemblies, user in interactive interface by that can input to data
The data management operations being managed, then by Oozie component call Hive Tool for Data Warehouse, that is, can determine that data management is grasped
Make the corresponding data mapped metadata to be managed in HDFS distributed file system, it is corresponding by data management operations
Mapping relations between metadata and pending data, Oozie component call Sqoop handling implement can be grasped according to data management
Make corresponding metadata lookup data to be managed, so as to which data to be managed imported, exported, deleted, modified etc. with management behaviour
Make, and the metadata in Hive Tool for Data Warehouse is managed, so that relevant operation difficulty of the user for Hadoop
It is greatly lowered, without being managed by way of inputting order line, so as to reduce the difficulty of data management.
Brief description of the drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of a Hadoop-based data management system provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of another Hadoop-based data management system provided by an embodiment of the present invention;
Fig. 3 is a flowchart of a management method of a Hadoop-based data management system provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort shall fall within the scope of protection of the present invention.
As shown in Fig. 1, an embodiment of the present invention provides a Hadoop-based data management system, comprising:
a front-end component 101, a Hadoop framework 102, an HDFS distributed file system 103, an Oozie component 104, a Hive data warehouse tool 105, and a Sqoop processing tool 106;
the HDFS distributed file system 103, the Oozie component 104, the Hive data warehouse tool 105, and the Sqoop processing tool 106 are built in the Hadoop framework 102;
the front-end component 101 is configured to send a data management operation to the Oozie component 104 upon receiving the data management operation input by a user;
the Oozie component 104 is configured, upon receiving the data management operation, to call the Hive data warehouse tool 105 to determine the metadata corresponding to the data management operation, and to call the Sqoop processing tool 106 to manage, according to the data management operation, the data to be managed in the HDFS distributed file system 103 that corresponds to that metadata, wherein the metadata corresponding to the data management operation and the data to be managed are in a mapping relationship.
In this embodiment of the present invention, with the HDFS distributed file system, the Oozie component, the Hive data warehouse tool, and the Sqoop processing tool built in the Hadoop framework, the front-end component can provide the user with a visual interactive interface in which the user inputs the data management operation to be applied to the data. The Oozie component then calls the Hive data warehouse tool to determine the metadata to which the corresponding data to be managed in the HDFS distributed file system is mapped. Using the mapping relationship between that metadata and the data to be managed, the Oozie component calls the Sqoop processing tool to locate the data to be managed from the metadata, so that management operations such as import, export, deletion, and modification can be performed on the data to be managed, and the metadata in the Hive data warehouse tool can be managed as well. The difficulty of Hadoop-related operations for the user is thereby greatly reduced: no command-line input is needed, which reduces the difficulty of data management.
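The flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the classes `HiveCatalog`, `SqoopTool`, and `OozieComponent`, the shape of the operation dictionary, and the example path are all hypothetical in-memory stand-ins, and none of them calls a real Hive, Sqoop, or Oozie API.

```python
from dataclasses import dataclass, field


@dataclass
class HiveCatalog:
    """Hypothetical stand-in for the Hive data warehouse tool: maps a table
    name (metadata) to the HDFS path of the data it describes."""
    table_to_path: dict = field(default_factory=dict)

    def metadata_for(self, operation: dict):
        # Return the metadata entry mapped to the operation's target, if any.
        table = operation["table"]
        path = self.table_to_path.get(table)
        return None if path is None else {"table": table, "hdfs_path": path}


@dataclass
class SqoopTool:
    """Hypothetical stand-in for the Sqoop processing tool; it only records
    what it was asked to do instead of moving real data."""
    performed: list = field(default_factory=list)

    def manage(self, action: str, hdfs_path: str) -> str:
        self.performed.append((action, hdfs_path))
        return f"{action} on {hdfs_path}"


@dataclass
class OozieComponent:
    """Hypothetical coordinator: resolves metadata first, then delegates."""
    hive: HiveCatalog
    sqoop: SqoopTool

    def handle(self, operation: dict) -> str:
        metadata = self.hive.metadata_for(operation)
        if metadata is None:
            return "no metadata found"
        return self.sqoop.manage(operation["action"], metadata["hdfs_path"])


hive = HiveCatalog({"orders": "/warehouse/orders"})
oozie = OozieComponent(hive, SqoopTool())
# The front-end component would send a structured operation like this one.
result = oozie.handle({"action": "export", "table": "orders"})
print(result)
```

The point of the sketch is the ordering: the metadata lookup happens before any data is touched, and the mapping relationship is what turns a user-level name into a storage location.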
To facilitate data management, in an embodiment of the present invention, the front-end component is further configured to send a data request command to the Oozie component upon receiving the data request command, input by the user, for requesting data to be queried;
the Oozie component is further configured, upon receiving the data request command, to call the Hive data warehouse tool to determine whether metadata corresponding to the data request command exists and, when such metadata is determined to exist, to call the Hive data warehouse tool to read from the HDFS distributed file system the data to be queried that corresponds to that metadata, and to send the data to be queried to the front-end component, wherein the metadata corresponding to the data request command and the data to be queried are in a mapping relationship;
the front-end component is configured to display the data to be queried upon receiving it.
In this embodiment of the present invention, the user can, as needed, input into the front-end component a data request command for querying data stored in the HDFS distributed file system. The Oozie component calls the Hive data warehouse tool to determine whether corresponding metadata exists, and thus whether the data the user wants to query exists. When that data exists, it is read from the HDFS distributed file system through the Hive data warehouse tool and sent to the front-end component, which presents it to the user and so meets the user's query needs.
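The query path just described has two steps: check for metadata, then read the mapped data. A minimal Python sketch follows, assuming hypothetical in-memory dictionaries in place of the Hive data warehouse tool and the HDFS distributed file system; the table name, path, and rows are invented for illustration only.

```python
from typing import Optional

# Hypothetical in-memory stand-ins: metadata kept by the Hive data warehouse
# tool, and file contents stored in the HDFS distributed file system.
HIVE_METADATA = {"sales_2019": "/data/sales/2019.csv"}
HDFS_FILES = {"/data/sales/2019.csv": ["id,amount", "1,100", "2,250"]}


def handle_data_request(table: str) -> Optional[list]:
    """Return the rows to be queried, or None when no metadata exists."""
    path = HIVE_METADATA.get(table)  # step 1: does matching metadata exist?
    if path is None:
        return None                  # nothing to read, nothing to display
    return HDFS_FILES[path]          # step 2: read the data mapped to it


rows = handle_data_request("sales_2019")     # metadata exists
missing = handle_data_request("sales_2020")  # no metadata registered
```

Returning `None` for the missing case is a design choice of this sketch; the source only states what happens when the metadata exists.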
To improve the security of data management, in an embodiment of the present invention, the front-end component is further configured to send user registration information to the Oozie component upon receiving the user registration information, input by the user, for verifying identity;
the Oozie component is further configured, upon receiving the user registration information, to call the Hive data warehouse tool to determine whether metadata corresponding to the user registration information exists and, when no such metadata exists, to send a prompt message to the front-end component so that the front-end component displays the prompt message, wherein the prompt message indicates that the user has no permission to access the HDFS distributed file system.
In this embodiment of the present invention, the user must pass identity verification before performing operations such as data management or querying. That is, the user inputs user registration information through the interactive interface provided by the front-end component, and the Oozie component calls the Hive data warehouse tool to determine whether there is metadata that has a mapping relationship with target registration information, stored in the HDFS distributed file system, corresponding to the user registration information. If no metadata corresponding to the user registration information exists, then no matching target registration information exists in the HDFS distributed file system, so the user can be determined to be an unauthorized user. The front-end component then outputs a prompt message stating that the user has no permission to access the HDFS distributed file system, so that the user understands why the storage system cannot be accessed.
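The identity check above reduces to a lookup followed by a prompt on failure. The following Python sketch assumes a hypothetical `REGISTERED` mapping from user name to the location of the target registration information; the field name `username`, the path, and the wording of the prompt are all illustrative assumptions, not details from the source.

```python
# Hypothetical registry: user names for which metadata exists, each mapped to
# the location of the target registration information in HDFS.
REGISTERED = {"alice": "/users/alice/registration"}

# Illustrative wording only; the source does not give the prompt text.
NO_PERMISSION = "No permission to access the HDFS distributed file system."


def verify_user(registration_info: dict):
    """Return (allowed, prompt_message) for the submitted registration info."""
    username = registration_info.get("username")
    if username not in REGISTERED:
        # No metadata maps to this user, so the front end shows the prompt.
        return False, NO_PERMISSION
    return True, None


allowed, message = verify_user({"username": "alice"})
denied, denial_message = verify_user({"username": "mallory"})
```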
To improve the security of data management, in an embodiment of the present invention, as shown in Fig. 2, the Hadoop-based data management system further comprises an Nginx server 201;
the front-end component 101 is further configured to send an interactive instruction to the Nginx server 201 upon receiving the interactive instruction from the user, wherein the interactive instruction includes the data management operation;
the Nginx server 201 is configured to send the interactive instruction to the Oozie component 104 upon receiving the interactive instruction.
In this embodiment of the present invention, the Nginx server lets the user interact with the HDFS distributed file system in the back end and can also provide load balancing; tasks are executed by the Oozie component, the Hive data warehouse tool, and the Sqoop processing tool to meet the user's management needs.
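To make the forwarding and load-balancing role concrete, here is a small Python sketch of round-robin forwarding. It is an illustration of the idea only and not Nginx itself: the class `ForwardingProxy` and the backend names are hypothetical, and the presence of more than one Oozie endpoint is an assumption, since the source describes a single Oozie component.

```python
from itertools import cycle


class ForwardingProxy:
    """Hypothetical stand-in for the Nginx server: passes each interactive
    instruction to the next backend in round-robin order."""

    def __init__(self, backends):
        self._backends = cycle(backends)

    def forward(self, instruction: dict):
        backend = next(self._backends)
        # A real proxy would send the request over the network; this sketch
        # simply reports which backend would receive the instruction.
        return backend, instruction


# Two Oozie endpoints are assumed here purely to show the balancing effect.
proxy = ForwardingProxy(["oozie-1", "oozie-2"])
routed = [proxy.forward({"action": "import"})[0] for _ in range(4)]
```

Round-robin is only one possible balancing strategy; the source does not say which one is used.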
To help the user understand the state of data management, in an embodiment of the present invention, the Oozie component is further configured to record workflow information of the Hive data warehouse tool and the HDFS distributed file system.
In this embodiment of the present invention, when scheduling tasks the Oozie component records the workflow information of the Hive data warehouse tool and the HDFS distributed file system during task scheduling to form a log, so that the result of task scheduling can be determined from the recorded information. For example, if task scheduling fails, the cause of the error can be looked up in the Oozie component log; if the detailed cause cannot be found there, the error message of the specific task can be checked through the port of the HDFS distributed file system.
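The two-stage diagnosis described above (check the Oozie log first, fall back to the HDFS side when the log has no detail) can be sketched as follows. The log structure, the field names, and the status strings are assumptions made for illustration; they do not reproduce the format of real Oozie logs.

```python
# Hypothetical in-memory workflow log kept by the Oozie component.
workflow_log = []


def record(task_id: str, component: str, status: str, detail: str = "") -> None:
    """Append one workflow record for a scheduled task."""
    workflow_log.append(
        {"task_id": task_id, "component": component,
         "status": status, "detail": detail}
    )


def diagnose(task_id: str) -> str:
    """Explain a failure from the log, or point to the HDFS side if the
    log entry carries no detailed cause."""
    failures = [entry for entry in workflow_log
                if entry["task_id"] == task_id and entry["status"] == "FAILED"]
    if not failures:
        return "no failure recorded"
    detail = failures[-1]["detail"]
    return detail or "no detail in the Oozie log; check the task's error message in HDFS"


record("job-1", "hive", "SUCCEEDED")
record("job-2", "hive", "FAILED", "table not found")
record("job-3", "hdfs", "FAILED")
```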
As shown in Fig. 3, an embodiment of the present invention provides a management method of a Hadoop-based data management system, comprising:
Step 301: building the HDFS distributed file system, the Oozie component, the Hive data warehouse tool, and the Sqoop processing tool in the Hadoop framework;
Step 302: sending, by the front-end component, a data management operation to the Oozie component upon receiving the data management operation input by a user;
Step 303: calling, by the Oozie component upon receiving the data management operation, the Hive data warehouse tool to determine the metadata corresponding to the data management operation;
Step 304: calling, by the Oozie component, the Sqoop processing tool to manage, according to the data management operation, the data to be managed in the HDFS distributed file system that corresponds to that metadata, wherein the metadata corresponding to the data management operation and the data to be managed are in a mapping relationship.
In this embodiment of the present invention, with the HDFS distributed file system, the Oozie component, the Hive data warehouse tool, and the Sqoop processing tool built in the Hadoop framework, the front-end component can provide the user with a visual interactive interface in which the user inputs the data management operation to be applied to the data. The Oozie component then calls the Hive data warehouse tool to determine the metadata to which the corresponding data to be managed in the HDFS distributed file system is mapped. Using the mapping relationship between that metadata and the data to be managed, the Oozie component calls the Sqoop processing tool to locate the data to be managed from the metadata, so that management operations such as import, export, deletion, and modification can be performed on it. The difficulty of Hadoop-related operations for the user is thereby greatly reduced: no command-line input is needed, which reduces the difficulty of data management.
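The four management operations named above (import, export, delete, modify) lend themselves to a dispatch table, so that the interface only has to pass an operation name rather than a command line. The Python sketch below assumes a hypothetical in-memory `store` in place of the HDFS distributed file system; the handler names, the path, and the record values are illustrative and do not reflect how Sqoop actually performs these operations.

```python
# Hypothetical in-memory stand-in for HDFS: path -> list of records.
store = {"/warehouse/orders": ["o1", "o2"]}


def do_import(path, records):
    store.setdefault(path, []).extend(records)


def do_export(path, records=None):
    return list(store[path])


def do_delete(path, records=None):
    store.pop(path, None)


def do_modify(path, records):
    store[path] = list(records)


# One handler per management operation named in the text.
HANDLERS = {"import": do_import, "export": do_export,
            "delete": do_delete, "modify": do_modify}


def manage(action: str, path: str, records=None):
    """Dispatch a data management operation to its handler."""
    handler = HANDLERS.get(action)
    if handler is None:
        raise ValueError(f"unsupported operation: {action}")
    return handler(path, records)


manage("import", "/warehouse/orders", ["o3"])
exported = manage("export", "/warehouse/orders")
```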
In an embodiment of the present invention, the method further comprises:
sending, by the front-end component, a data request command to the Oozie component upon receiving the data request command, input by the user, for requesting data to be queried;
calling, by the Oozie component upon receiving the data request command, the Hive data warehouse tool to determine whether metadata corresponding to the data request command exists and, when such metadata is determined to exist, calling the Hive data warehouse tool to read from the HDFS distributed file system the data to be queried that corresponds to that metadata, and sending the data to be queried to the front-end component, wherein the metadata corresponding to the data request command and the data to be queried are in a mapping relationship;
displaying, by the front-end component, the data to be queried upon receiving it.
In an embodiment of the present invention, the method further comprises:
sending, by the front-end component, user registration information to the Oozie component upon receiving the user registration information, input by the user, for verifying identity;
calling, by the Oozie component upon receiving the user registration information, the Hive data warehouse tool to determine whether metadata corresponding to the user registration information exists and, when no such metadata exists, sending a prompt message to the front-end component so that the front-end component displays the prompt message;
wherein the prompt message indicates that the user has no permission to access the HDFS distributed file system, and the metadata corresponding to the user registration information is in a mapping relationship with the corresponding target registration information in the HDFS distributed file system.
In an embodiment of the present invention, the method further comprises:
sending, by the front-end component, an interactive instruction to the Nginx server upon receiving the interactive instruction from the user, wherein the interactive instruction includes the data management operation;
sending, by the Nginx server, the interactive instruction to the Oozie component upon receiving the interactive instruction.
In an embodiment of the present invention, the method further comprises:
recording, by the Oozie component, workflow information of the Hive data warehouse tool and the HDFS distributed file system.
The embodiments of the present invention have at least the following beneficial effects:
1. In an embodiment of the present invention, with the HDFS distributed file system, the Oozie component, the Hive data warehouse tool, and the Sqoop processing tool built in the Hadoop framework, the front-end component can provide the user with a visual interactive interface in which the user inputs the data management operation to be applied to the data. The Oozie component then calls the Hive data warehouse tool to determine the metadata to which the corresponding data to be managed in the HDFS distributed file system is mapped. Using the mapping relationship between that metadata and the data to be managed, the Oozie component calls the Sqoop processing tool to locate the data to be managed from the metadata, so that management operations such as import, export, deletion, and modification can be performed on it. The difficulty of Hadoop-related operations for the user is thereby greatly reduced: no command-line input is needed, which reduces the difficulty of data management.
2. In an embodiment of the present invention, the user can, as needed, input into the front-end component a data request command for querying data stored in the HDFS distributed file system. The Oozie component calls the Hive data warehouse tool to determine whether corresponding metadata exists, and thus whether the data the user wants to query exists. When that data exists, it is read from the HDFS distributed file system through the Hive data warehouse tool and sent to the front-end component, which presents it to the user and so meets the user's query needs.
3. In an embodiment of the present invention, the user must pass identity verification before performing operations such as data management or querying. That is, the user inputs user registration information through the interactive interface provided by the front-end component, and the Oozie component calls the Hive data warehouse tool to determine whether there is metadata that has a mapping relationship with target registration information, stored in the HDFS distributed file system, corresponding to the user registration information. If no metadata corresponding to the user registration information exists, then no matching target registration information exists in the HDFS distributed file system, so the user can be determined to be an unauthorized user. The front-end component then outputs a prompt message stating that the user has no permission to access the HDFS distributed file system, so that the user understands why the storage system cannot be accessed.
4. In an embodiment of the present invention, the Nginx server lets the user interact with the HDFS distributed file system in the back end and can also provide load balancing; tasks are executed by the Oozie component, the Hive data warehouse tool, and the Sqoop processing tool to meet the user's management needs.
It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a" does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
Finally, it should be noted that the foregoing describes only preferred embodiments of the present invention and serves merely to illustrate its technical solution; it is not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention falls within the scope of protection of the present invention.
Claims (10)
1. A data management system based on Hadoop, characterized by comprising:
a front-end component, a Hadoop framework, an HDFS distributed file system, an Oozie component, a Hive data warehouse tool and a Sqoop processing tool;
wherein the HDFS distributed file system, the Oozie component, the Hive data warehouse tool and the Sqoop processing tool are built in the Hadoop framework;
the front-end component is configured to, upon receiving a data management operation input by a user, send the data management operation to the Oozie component;
the Oozie component is configured to, upon receiving the data management operation, call the Hive data warehouse tool to determine the metadata corresponding to the data management operation, and call the Sqoop processing tool to manage, according to the data management operation, the data to be managed in the HDFS distributed file system that corresponds to the metadata corresponding to the data management operation, wherein the metadata corresponding to the data management operation and the data to be managed have a mapping relationship.
2. The data management system based on Hadoop according to claim 1, characterized in that
the front-end component is further configured to, upon receiving a data request command input by the user for requesting data to be queried, send the data request command to the Oozie component;
the Oozie component is further configured to, upon receiving the data request command, call the Hive data warehouse tool to determine whether metadata corresponding to the data request command exists and, when it is determined that the metadata corresponding to the data request command exists, call the Hive data warehouse tool to read from the HDFS distributed file system the data to be queried that corresponds to the metadata corresponding to the data request command, and send the data to be queried to the front-end component, wherein the metadata corresponding to the data request command and the data to be queried have a mapping relationship;
the front-end component is configured to display the data to be queried upon receiving the data to be queried.
3. The data management system based on Hadoop according to claim 1, characterized in that
the front-end component is further configured to, upon receiving user registration information input by the user for identity verification, send the user registration information to the Oozie component;
the Oozie component is further configured to, upon receiving the user registration information, call the Hive data warehouse tool to determine whether metadata corresponding to the user registration information exists and, when no metadata corresponding to the user registration information exists, send prompt information to the front-end component so that the front-end component displays the prompt information, wherein the prompt information indicates that the user has no permission to access the HDFS distributed file system, and the metadata corresponding to the user registration information has a mapping relationship with the corresponding target registration information in the HDFS distributed file system.
4. The data management system based on Hadoop according to claim 1, characterized by further comprising: an Nginx server;
the front-end component is further configured to, upon receiving an interaction instruction from the user, send the interaction instruction to the Nginx server, wherein the interaction instruction comprises the data management operation;
the Nginx server is configured to, upon receiving the interaction instruction, send the interaction instruction to the Oozie component.
5. The data management system based on Hadoop according to any one of claims 1 to 4, characterized in that
the Oozie component is further configured to record workflow information of the Hive data warehouse tool and the HDFS distributed file system.
6. A management method of a data management system based on Hadoop, characterized by comprising:
building an HDFS distributed file system, an Oozie component, a Hive data warehouse tool and a Sqoop processing tool in a Hadoop framework;
sending, by the front-end component, upon receiving a data management operation input by a user, the data management operation to the Oozie component;
calling, by the Oozie component, upon receiving the data management operation, the Hive data warehouse tool to determine the metadata corresponding to the data management operation;
calling, by the Oozie component, the Sqoop processing tool to manage, according to the data management operation, the data to be managed in the HDFS distributed file system that corresponds to the metadata corresponding to the data management operation, wherein the metadata corresponding to the data management operation and the data to be managed have a mapping relationship.
7. The management method of the data management system based on Hadoop according to claim 6, characterized by further comprising:
sending, by the front-end component, upon receiving a data request command input by the user for requesting data to be queried, the data request command to the Oozie component;
calling, by the Oozie component, upon receiving the data request command, the Hive data warehouse tool to determine whether metadata corresponding to the data request command exists and, when it is determined that the metadata corresponding to the data request command exists, calling the Hive data warehouse tool to read from the HDFS distributed file system the data to be queried that corresponds to the metadata corresponding to the data request command, and sending the data to be queried to the front-end component, wherein the metadata corresponding to the data request command and the data to be queried have a mapping relationship;
displaying, by the front-end component, the data to be queried upon receiving the data to be queried.
8. The management method of the data management system based on Hadoop according to claim 6, characterized by further comprising:
sending, by the front-end component, upon receiving user registration information input by the user for identity verification, the user registration information to the Oozie component;
calling, by the Oozie component, upon receiving the user registration information, the Hive data warehouse tool to determine whether metadata corresponding to the user registration information exists and, when no metadata corresponding to the user registration information exists, sending prompt information to the front-end component so that the front-end component displays the prompt information;
wherein the prompt information indicates that the user has no permission to access the HDFS distributed file system, and the metadata corresponding to the user registration information has a mapping relationship with the corresponding target registration information in the HDFS distributed file system.
9. The management method of the data management system based on Hadoop according to claim 6, characterized by further comprising:
sending, by the front-end component, upon receiving an interaction instruction from the user, the interaction instruction to an Nginx server, wherein the interaction instruction comprises the data management operation;
sending, by the Nginx server, upon receiving the interaction instruction, the interaction instruction to the Oozie component.
10. The management method of the data management system based on Hadoop according to any one of claims 6 to 9, characterized by further comprising:
recording, by the Oozie component, workflow information of the Hive data warehouse tool and the HDFS distributed file system.
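For orientation only, the method steps of claims 6 and 10 (metadata lookup, management of the mapped data, workflow recording) can be sketched in Python. This is an illustrative sketch under stated assumptions, not the claimed implementation: `WorkflowCoordinator`, the operation names `"import"` and `"delete"`, and the log tuple layout are hypothetical, and dictionaries stand in for the Hive metadata and the HDFS storage.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowCoordinator:
    """Hypothetical stand-in for the Oozie component."""
    metadata: dict  # stand-in for Hive metadata: target name -> storage path
    storage: dict   # stand-in for HDFS: storage path -> data
    workflow_log: list = field(default_factory=list)

    def manage(self, operation, target, payload=None):
        # Step 1: determine the metadata corresponding to the operation.
        path = self.metadata.get(target)
        self.workflow_log.append(("hive.lookup", target, path))
        if path is None:
            return False  # nothing is mapped to this target
        # Step 2: manage the mapped data (the role given to Sqoop).
        # "import" and "delete" are assumed example operations.
        if operation == "import":
            self.storage[path] = payload
        elif operation == "delete":
            self.storage.pop(path, None)
        else:
            raise ValueError(f"unsupported operation: {operation}")
        # Step 3: record workflow information for the completed step.
        self.workflow_log.append(("sqoop." + operation, target, path))
        return True
```

Keeping the log inside the coordinator mirrors the claim that the Oozie component itself records the workflow information; a real deployment would rely on Oozie's own job history rather than a Python list.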
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910757780.7A CN110457018A (en) | 2019-08-16 | 2019-08-16 | A kind of data management system and its management method based on Hadoop |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910757780.7A CN110457018A (en) | 2019-08-16 | 2019-08-16 | A kind of data management system and its management method based on Hadoop |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110457018A true CN110457018A (en) | 2019-11-15 |
Family
ID=68487091
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910757780.7A Pending CN110457018A (en) | 2019-08-16 | 2019-08-16 | A kind of data management system and its management method based on Hadoop |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110457018A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111625596A (en) * | 2020-05-14 | 2020-09-04 | 国网辽宁省电力有限公司 | Multi-source data synchronous sharing method and system for real-time consumption scheduling of new energy |
CN112883025A (en) * | 2021-01-25 | 2021-06-01 | 北京云思畅想科技有限公司 | System and method for visualizing mapping relation of ceph internal data structure |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030163570A1 (en) * | 2002-02-26 | 2003-08-28 | Sun Microsystems, Inc. | Command line interface session tool |
CN103856567A (en) * | 2014-03-26 | 2014-06-11 | 西安电子科技大学 | Small file storage method based on Hadoop distributed file system |
CN109885620A (en) * | 2018-12-25 | 2019-06-14 | 航天信息股份有限公司 | Metadata read method and device based on Hive data warehouse |
-
2019
- 2019-08-16 CN CN201910757780.7A patent/CN110457018A/en active Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030163570A1 (en) * | 2002-02-26 | 2003-08-28 | Sun Microsystems, Inc. | Command line interface session tool |
CN103856567A (en) * | 2014-03-26 | 2014-06-11 | 西安电子科技大学 | Small file storage method based on Hadoop distributed file system |
CN109885620A (en) * | 2018-12-25 | 2019-06-14 | 航天信息股份有限公司 | Metadata read method and device based on Hive data warehouse |
Non-Patent Citations (1)
Title |
---|
罗白莲: "CDH4安装实践HDFS、HBase、Zookeeper、Hive、Oozie、Sqoop", 《HTTPS://BLOG.CSDN.NET/LUOBAILIAN/ARTICLE/DETAILS/50412146》 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111625596A (en) * | 2020-05-14 | 2020-09-04 | 国网辽宁省电力有限公司 | Multi-source data synchronous sharing method and system for real-time consumption scheduling of new energy |
CN111625596B (en) * | 2020-05-14 | 2023-12-26 | 国网辽宁省电力有限公司 | Multi-source data synchronous sharing method and system for real-time new energy consumption scheduling |
CN112883025A (en) * | 2021-01-25 | 2021-06-01 | 北京云思畅想科技有限公司 | System and method for visualizing mapping relation of ceph internal data structure |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Astuti et al. | Risks assessment of information technology processes based on COBIT 5 framework: A case study of ITS service desk | |
US10579803B1 (en) | System and method for management of application vulnerabilities | |
US7610512B2 (en) | System and method for automated and assisted resolution of it incidents | |
US8756301B2 (en) | Systems and methods for organic knowledge base runbook automation | |
CN106775713B (en) | File auditing method and device and file submitting control system | |
CN108491254A (en) | A kind of dispatching method and device of data warehouse | |
CN111199379A (en) | Examination and approval method, examination and approval device and storage medium of workflow engine | |
CN110188103A (en) | Data account checking method, device, equipment and storage medium | |
CN107463839A (en) | A kind of system and method for managing application program | |
CN110457018A (en) | A kind of data management system and its management method based on Hadoop | |
CN112184172B (en) | Electronic file on-line transfer and receiving method and system | |
CN105653322A (en) | Operation and maintenance server and server event processing method | |
US9082085B2 (en) | Computing environment climate dependent policy management | |
CN101504756A (en) | System and method for implementing capital allocation based on network | |
US8042158B2 (en) | Management of user authorizations | |
CN113836237A (en) | Method and device for auditing data operation of database | |
CN112965986A (en) | Service consistency processing method, device, equipment and storage medium | |
CN110704196B (en) | Resource data transfer method, device and block chain system | |
CN110109790A (en) | Server hard disc management method, device, equipment and computer readable storage medium | |
KR101415528B1 (en) | Apparatus and Method for processing data error for distributed system | |
KR100542383B1 (en) | System for controlling database access based on 3-Tier structure and Method thereof | |
El Amin et al. | Blockchain-based multi-organizational cyber risk management framework for collaborative environments | |
US11206268B2 (en) | Account lifecycle management | |
JP2016110169A (en) | Work application processing device, work application processing method, and program | |
JP5969668B1 (en) | License management system, terminal, license control server, and license management method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20191115 |