CN108256019A - Database key generation method, device, equipment and its storage medium - Google Patents

Database key generation method, device, equipment and its storage medium

Info

Publication number
CN108256019A
Authority
CN
China
Prior art keywords
data object
file
data
field name
storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810021713.4A
Other languages
Chinese (zh)
Inventor
向荣辉
陈�峰
巫可
孙冬冬
严琴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SF Technology Co Ltd
SF Tech Co Ltd
Original Assignee
SF Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SF Technology Co Ltd
Priority to CN201810021713.4A
Publication of CN108256019A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application discloses a database primary key generation method, apparatus, and device, and a storage medium thereof. The method includes: selecting a data object of a data table stored in the Hadoop Distributed File System (HDFS) and determining the field name of the data object; searching HDFS, based on the field name, for a storage file corresponding to the field name; if the storage file is found, detecting, based on the field name, whether the data object is currently in use; if the data object is not in use, locking the data object of the data table; and generating a primary key based on the content of the storage file. The technical solution provided by the embodiments of this application distinguishes different application scenarios by determining whether a storage file corresponding to the data object of the data table exists, and locks the data object by means of a lock file. This prevents multiple users from sharing the same resource at the same time and ensures that the integrity and consistency of system data are not compromised.

Description

Database key generation method, device, equipment and its storage medium
Technical field
This application relates generally to the field of computer technology, specifically to the technical field of data processing, and more particularly to a database primary key generation method, apparatus, and device, and a storage medium thereof.
Background
The Hadoop Distributed File System (HDFS) is highly fault tolerant and provides high-throughput access to application data, which makes it suitable for applications with very large data sets.
In a database system, a unique identifier is used as the primary key of a data table to distinguish each data object record in the table, and the way the unique identifier is generated directly affects the write efficiency and retrieval efficiency of the data in the table. When HDFS is applied in big data fields such as order processing and financial settlement, the problem of database primary key design cannot be avoided. Demand for uniquely constrained fields in these areas keeps growing, and existing data processing platforms, such as Hive, Spark-SQL, and Impala, fall short when it comes to creating primary keys.
A new technical solution is therefore urgently needed to overcome the above problems.
Summary of the invention
In view of the above defects or deficiencies in the prior art, it is desirable to provide a scheme that creates a primary key by using a data object of a data table.
In a first aspect, an embodiment of this application provides a database primary key generation method, the method including:
selecting a data object of a data table stored in the Hadoop Distributed File System and determining the field name of the data object;
searching the Hadoop Distributed File System, based on the field name, for a storage file corresponding to the field name;
if the storage file is found, detecting, based on the field name, whether the data object is currently in use;
if the data object is not in use, locking the data object of the data table; and
generating a primary key based on the content of the storage file.
In a second aspect, an embodiment of this application provides a database primary key generation apparatus, the apparatus including:
a selecting unit, configured to select a data object of a data table stored in the Hadoop Distributed File System and determine the field name of the data object;
a searching unit, configured to search the Hadoop Distributed File System, based on the field name, for a storage file corresponding to the field name;
a detection unit, configured to detect, based on the field name, whether the data object is currently in use if the storage file is found;
a locking unit, configured to lock the data object of the data table if the data object is not in use; and
a generation unit, configured to generate a primary key based on the content of the storage file.
In a third aspect, an embodiment of this application provides a device including a processor and a storage apparatus;
the storage apparatus is configured to store one or more programs;
when the one or more programs are executed by the processor, the processor implements the method described in the embodiments of this application.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the method described in the embodiments of this application is implemented.
The database primary key generation scheme provided by the embodiments of this application distinguishes different application scenarios by determining whether a storage file exists for the data object of the data table that is used to generate the primary key. It also uses a lock file to prevent multiple users from sharing the same resource at the same time, which ensures that the integrity and consistency of system data are not compromised.
Brief description of the drawings
Other features, objects, and advantages of this application will become more apparent upon reading the detailed description of non-limiting embodiments made with reference to the following drawings:
Fig. 1 shows a schematic flowchart of a database primary key generation method provided by an embodiment of this application;
Fig. 2 shows a schematic flowchart of a database primary key generation method provided by another embodiment of this application;
Fig. 3 shows a schematic flowchart of a database primary key generation method provided by another embodiment of this application;
Fig. 4 shows a schematic structural diagram of a database primary key generation apparatus provided by an embodiment of this application;
Fig. 5 shows a schematic structural diagram of a database primary key generation apparatus provided by another embodiment of this application;
Fig. 6 shows a schematic structural diagram of a database primary key generation apparatus provided by another embodiment of this application;
Fig. 7 shows a schematic structural diagram of a computer system suitable for implementing a terminal device of the embodiments of this application.
Detailed description of the embodiments
This application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are used only to explain the relevant invention and do not limit it. It should also be noted that, for ease of description, only the parts relevant to the invention are shown in the drawings.
It should be noted that, provided there is no conflict, the embodiments of this application and the features of the embodiments may be combined with one another. This application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Referring to Fig. 1, Fig. 1 shows a schematic flowchart of a database primary key generation method provided by an embodiment of this application.
As shown in Fig. 1, the method includes:
Step 101: select a data object of a data table stored in the Hadoop Distributed File System and determine the field name of the data object.
The Hadoop Distributed File System (HDFS) supports a traditional hierarchical file organization. A user or an application can create directories and store files in them. The hierarchy of the file system namespace is similar to that of most existing file systems: users can create, delete, move, or rename files.
Many data processing platforms run on HDFS, such as Hive, Spark-SQL, and Impala. Hive is built on the Hadoop Distributed File System, and its data is stored in HDFS. Hive has no dedicated data storage format of its own and does not build indexes for the data; the user only needs to tell Hive the column separator and the row separator of the data when creating a table, and Hive can then parse the data.
SparkSQL offers advantages such as in-memory columnar storage (In-Memory Columnar Storage) and Hive compatibility. It no longer depends on Hive and gains great convenience in data compatibility, performance optimization, and component extension. SparkSQL does not store table data in memory as native JVM objects; it uses in-memory columnar storage instead. With in-memory columnar storage, columns of all native data types are stored in native arrays, and the complex data types supported by Hive (such as array and map) are first serialized and then concatenated into a byte array for storage. In this way, one JVM object is created per column, which results in compact data storage. In addition, efficient compression methods with low CPU overhead (such as dictionary encoding and run-length encoding) can be used to reduce memory overhead.
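The two compression methods named above can be illustrated on a single in-memory column. The following is a minimal sketch in Python; it is not SparkSQL's implementation, and the function names run_length_encode and dictionary_encode are hypothetical names chosen for illustration.

```python
def run_length_encode(column):
    """Collapse consecutive repeated values into (value, count) pairs."""
    runs = []
    for value in column:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [tuple(run) for run in runs]


def dictionary_encode(column):
    """Replace each value with a small integer code.

    Returns (dictionary, codes), where dictionary maps each distinct
    value to its code in order of first appearance.
    """
    dictionary = {}
    codes = []
    for value in column:
        code = dictionary.setdefault(value, len(dictionary))
        codes.append(code)
    return dictionary, codes
```

Both methods work well on columnar data because values within one column tend to repeat, which is one reason storing a table column by column makes such low-cost compression practical.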
Impala is an MPP (massively parallel processing) SQL query engine for processing large volumes of data stored in Hadoop clusters. It is open-source software written in C++ and Java. Compared with other SQL engines for Hadoop, it provides high performance and low latency. In other words, Impala is the highest-performing SQL engine (providing an RDBMS-like experience) and offers the fastest way to access data stored in the Hadoop Distributed File System.
In this embodiment, the field name of a data object is obtained from the data object of a data table stored in HDFS. For example, during data maintenance, data table A is obtained, in which the field name of the first data object is the employee number; as another example, data table B is obtained, in which the field name of a certain data object is the order number.
Step 102: search the Hadoop Distributed File System, based on the field name, for a storage file corresponding to the field name.
In this embodiment, the field name of the data object used to generate the primary key is determined from the data table, and HDFS is then searched, based on the field name, for a file named after the field name. This file stores the current value of the data object identified by the field name.
For example, in data table A, the field name of the first data object is the employee number, and the data object corresponding to the employee number stores numbering information, for example codes of the form HW00010.
In this embodiment, the field name (employee number) is used as the search keyword to determine whether HDFS contains a file named after the employee number, or to find the storage path of such a file. This file may be called a storage file and is mainly used to store the state value of the employee number. For example, if the values of the first data object in data table A are HW00001 to HW00100, the state value stored in the storage file is HW00100; that is, the storage file holds the final state value of the employee number in the data table. When the table is updated, or when another user or program operates on data table A further, the final state value stored in the storage file is obtained so that consecutive primary keys can continue to be generated consistently, which improves the efficiency of creating primary keys.
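The keyword lookup in step 102 can be sketched as follows. This is a minimal illustration in Python that uses a local directory tree as a stand-in for HDFS (an assumption made so the example runs without a cluster); the function name find_storage_file and the rule of skipping .lock files are hypothetical choices, not details taken from the patent.

```python
import os


def find_storage_file(root, field_name):
    """Search root recursively for a file named after field_name.

    Returns the path of the first file whose base name (without its
    extension) equals field_name, or None if there is no such file.
    Files ending in .lock are skipped, because they mark a lock
    rather than a stored state value.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            stem, extension = os.path.splitext(filename)
            if stem == field_name and extension != ".lock":
                return os.path.join(dirpath, filename)
    return None
```

A result of None corresponds to the case in which the data object has not yet been used to create a primary key.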
Step 103: if the storage file is found, detect, based on the field name, whether the data object is currently in use.
If it is determined that a storage file exists in HDFS, the data object of the data table is probably not being used for the first time. The storage file holds the state value of the data object; before the state value is updated further, it is necessary to detect whether the data object used to create the primary key is currently in use. This check prevents operation errors that could arise when multiple users, programs, or machines operate on the data object at the same time.
For example, if a lock file corresponding to the data object is detected, the data object is identified as being in use; if no such lock file is detected, the data object is not in use. The lock file is a specific file named after the field name of the data object, for example a file whose name is the field name followed by .lock.
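The in-use check described above reduces to testing whether the lock file exists. The following is a minimal Python sketch, again assuming a local directory in place of HDFS; lock_path and is_in_use are hypothetical helper names.

```python
import os


def lock_path(directory, field_name):
    """Build the lock file path: the field name followed by .lock."""
    return os.path.join(directory, field_name + ".lock")


def is_in_use(directory, field_name):
    """Return True if a lock file exists for the data object."""
    return os.path.exists(lock_path(directory, field_name))
```

On its own, this check only reports the state at the moment of the call; another process could create the lock file immediately afterwards, so the check is meaningful only in combination with the locking step that follows.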
Step 104: if the data object is not in use, lock the data object of the data table.
In this embodiment, detecting whether the data object is in use prevents multiple users from operating on it at the same time, which ensures data consistency and improves data accuracy.
Optionally, a lock file with the same name as the field name of the data object of the data table is generated in HDFS. The lock file prevents other users, programs, or machines from writing to the data object at the same time; that is, it restricts the permission of other users to write to the data object.
In this embodiment, the presence or absence of the lock file indicates whether the current user blocks or allows access by other users to the same resource, which ensures that the integrity, consistency, and concurrency of system data are not compromised.
For example, before a transaction operates on a data object (such as a database table object), it first sends a request to the system to lock the object. Once the lock is granted, the transaction has a certain degree of control over the data object, and until the transaction releases the lock, other transactions cannot update the data object.
The lock file in this embodiment can be understood as an exclusive lock (Exclusive Lock), also called an X lock, which prevents the same resource from being shared at the same time. A database object under an exclusive lock cannot be read or modified by other transactions.
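A lock file behaves as an exclusive lock only if two callers cannot both succeed in creating it. The sketch below, in Python on a local directory standing in for HDFS, uses exclusive file creation so that the existence check and the creation happen in one operation. This is a substituted technique chosen for the illustration; the text itself describes detecting the lock file and then creating it. The names try_lock and unlock are hypothetical.

```python
import os


def try_lock(directory, field_name):
    """Try to take the exclusive lock for a data object.

    Returns True if the lock file was created by this call, or False
    if it already existed (the data object is in use).
    """
    path = os.path.join(directory, field_name + ".lock")
    try:
        # Mode "x" fails with FileExistsError if the file already exists.
        with open(path, "x", encoding="utf-8") as handle:
            handle.write(str(os.getpid()))  # record the owner for debugging
    except FileExistsError:
        return False
    return True


def unlock(directory, field_name):
    """Release the lock by deleting the lock file."""
    os.remove(os.path.join(directory, field_name + ".lock"))
```

Writing the process ID into the lock file is optional; it only makes it easier to see which process holds a lock that was never released.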
Step 105: generate a primary key based on the content of the storage file.
In this embodiment, the storage file stores the state value of the data object. The state value is read from the storage file and used as the initial value for generating primary key values, which are then generated by auto-increment according to the defined starting increment. The primary key, also called a unique identifier, increases automatically each time a value identifying a record is generated, which guarantees the order of the generated unique identifiers.
In this embodiment, the state value read from the storage file serves as the initial value of the auto-increment data object, and the primary key data object is then generated by auto-increment to uniquely identify the data object information of the data table.
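The text gives state values both as plain numbers and as codes such as HW00100, but it does not say how a prefixed code is incremented. The following Python sketch shows one possible reading, under the assumption that the state value is an optional non-digit prefix followed by a zero-padded number; the function name next_key is hypothetical.

```python
import re


def next_key(state_value, increment=1):
    """Return the key that follows state_value.

    Assumes state_value is an optional non-digit prefix followed by
    digits, for example "HW00100" or "0". The zero padding of the
    numeric part is preserved.
    """
    match = re.fullmatch(r"(\D*)(\d+)", state_value)
    if match is None:
        raise ValueError(f"unsupported state value: {state_value!r}")
    prefix, digits = match.groups()
    number = int(digits) + increment
    return f"{prefix}{number:0{len(digits)}d}"
```

For example, next_key("HW00100") returns "HW00101", and next_key("0") returns "1".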
Referring to Fig. 2, Fig. 2 shows a schematic flowchart of a database primary key generation method provided by another embodiment of this application.
As shown in Fig. 2, the method includes:
Step 201: select a data object of a data table stored in the Hadoop Distributed File System and determine the field name of the data object;
Step 202: search the Hadoop Distributed File System, based on the field name, for a storage file corresponding to the field name;
Step 203: if the storage file is not found, create a storage file named after the field name and write an initial state value into it.
In this embodiment, if no storage file corresponding to the field name is found in the Hadoop Distributed File System, the data object corresponding to the field name is being used to create a primary key for the first time. To keep the primary keys coherent and continuous, a storage file is created to store the state value of the data object, and the primary key is created from this state value. For example, if the values of the first data object in data table A are HW00001 to HW00100, the state value stored in the storage file is HW00100.
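Step 203 can be sketched as a helper that creates the storage file only when it is missing and otherwise leaves the stored state untouched. This Python example uses a local directory in place of HDFS; the name ensure_storage_file, the .txt suffix, and the default initial state of "0" are assumptions made for illustration.

```python
import os


def ensure_storage_file(directory, field_name, initial_state="0"):
    """Create the storage file with an initial state value if missing.

    Returns (path, state), where state is the value currently stored.
    An existing file is never overwritten, so an earlier state value
    such as "HW00100" is preserved.
    """
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, field_name + ".txt")
    if not os.path.exists(path):
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(initial_state)
    with open(path, encoding="utf-8") as handle:
        return path, handle.read().strip()
```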
Step 204: determine, based on the field name, whether the data object is currently in use.
After the storage file for storing the state value of the data object has been created in HDFS, it is still necessary to detect whether the data object used to create the primary key is currently in use. This check prevents multiple users, programs, or machines from operating on the data object at the same time, and so avoids errors caused by simultaneous operations.
For example, if a lock file corresponding to the data object is detected, the data object is identified as being in use; if no such lock file is detected, the data object is not in use. The lock file is a specific file named after the field name of the data object, for example a file whose name is the field name followed by .lock.
In this embodiment, the lock file prevents the same resource from being shared at the same time. A database object under an exclusive lock cannot be read or modified by other transactions, which ensures that the integrity and consistency of system data are not compromised.
Step 205: if the data object is not in use, lock the data object of the data table;
In this embodiment, detecting whether the data object is in use prevents multiple users from operating on it at the same time, which ensures data consistency and improves data accuracy.
Optionally, a lock file with the same name as the field name of the data object of the data table is generated in HDFS. The lock file prevents other users, programs, or machines from writing to the data object at the same time; that is, it restricts the permission of other users to write to the data object.
Step 206: generate a primary key based on the content of the storage file.
In this embodiment, the storage file stores the state value of the data object. The state value is read from the storage file and used as the initial value for generating primary key values, which are then generated by auto-increment according to the defined starting increment. The primary key, also called a unique identifier, increases automatically each time a value identifying a record is generated, which guarantees the order of the generated unique identifiers.
In this embodiment, the state value read from the storage file serves as the initial value of the auto-increment data object, and the primary key data object is then generated by auto-increment to uniquely identify the data object information of the data table.
Referring to Fig. 3, Fig. 3 shows a schematic flowchart of a database primary key generation method provided by another embodiment of this application.
As shown in Fig. 3, the method includes:
Step 301: select a data object of a data table stored in the Hadoop Distributed File System and determine the field name of the data object;
In this embodiment, the field name of a data object is obtained from the data object of a data table stored in HDFS. For example, during data maintenance, data table A is obtained, in which the field name of the first data object is the employee number; as another example, data table B is obtained, in which the field name of a certain data object is the order number.
Step 302: search the Hadoop Distributed File System, based on the field name, for a storage file corresponding to the field name;
Taking the case in which the field name of the first data object in data table A is the employee number as an example, the field name of the first data object in the data table is obtained, for example Employee_number. Using Employee_number as the keyword, the Hadoop Distributed File System is searched for the storage file corresponding to the field name; for example, Employee_number.txt is found, or the search is performed under a specified directory, for example /temp/sequence/Employee_number.txt.
Alternatively, the specified directory may be /temp/sequence/ or the like. Optionally, the storage file found for the field name is not limited to a file with the .txt suffix and may be a file in another format. In this embodiment, the file name is preferably located by keyword search.
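When a specified directory is used, the storage file path follows directly from the field name. The following is a minimal Python sketch of that naming convention, using the /temp/sequence/ directory and .txt suffix from the example above; the function name storage_file_path is hypothetical, and posixpath is used because HDFS paths are slash-separated on every platform.

```python
import posixpath

# Example directory taken from the text; any other directory could be used.
SEQUENCE_DIR = "/temp/sequence/"


def storage_file_path(field_name, directory=SEQUENCE_DIR, suffix=".txt"):
    """Build the storage file path for a field name."""
    return posixpath.join(directory, field_name + suffix)
```

For example, storage_file_path("Employee_number") returns "/temp/sequence/Employee_number.txt". The suffix parameter reflects the statement that the storage file is not limited to the .txt format.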
Step 303: if the storage file is not found, create a storage file named after the field name and write an initial state value into it; then proceed to step 304.
If no storage file corresponding to the field name is found, a file is created in the Hadoop Distributed File System and an initial value is written into it. For example, if Employee_number.txt is not found, an Employee_number.txt file is created and an initial value, usually set to 0, is written into it.
Step 304: detect, based on the field name, whether a lock file named after the field name exists in the Hadoop Distributed File System; if the lock file exists, the data object is currently in use.
The step of searching for the storage file ensures the consistency and continuity of primary key operations across multiple users or programs, and thus the correctness of the primary key.
After the storage file for storing the state value of the data object has been created in HDFS, it is still necessary to detect whether the data object used to create the primary key is currently in use. Detecting whether the data object is in use prevents multiple users, programs, or machines from operating on it at the same time, and so prevents the resource from being shared simultaneously.
Taking the case in which the field name of the first data object in data table A is the employee number as an example, the field name of the first data object in the data table is obtained, for example Employee_number. Using Employee_number as the keyword, HDFS is searched for the file Employee_number.lock. If the file exists, the Employee_number data object is in use; if it does not exist, the Employee_number data object is not in use. The search can also be performed under a specified directory, for example /temp/sequence/Employee_number.lock.
Alternatively, the specified directory may be /temp/sequence/ or the like. Optionally, the file found for the field name is not limited to a file with the .lock suffix and may be a file in another format. In this embodiment, the file name is preferably located by keyword search.
Step 305: if the data object is in use, wait for a period of time until it is detected that the specified data object is no longer in use, and then proceed to step 306.
While the data object is in use, to avoid discontinuous primary key values caused by simultaneous operations of multiple users, a wait timer can be set, or detection messages can be sent at regular intervals, to check whether the lock file still exists. Once the lock file can no longer be detected, the process continues with step 306.
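The waiting behaviour in step 305 can be sketched as a polling loop. This Python example checks a local path in place of HDFS; the function name wait_until_free and the interval and timeout parameters are hypothetical, and the timeout in particular is an addition not mentioned in the text, included so that a lock that is never released does not block the caller forever.

```python
import os
import time


def wait_until_free(lock_file, interval=0.5, timeout=30.0):
    """Poll until lock_file disappears.

    Returns True once the lock file no longer exists, or False if it
    is still present after timeout seconds.
    """
    deadline = time.monotonic() + timeout
    while os.path.exists(lock_file):
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)  # wait before checking again
    return True
```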
Step 306: if the data object is not in use, create a lock file named after the field name in the Hadoop Distributed File System to lock the data object of the data table.
In this embodiment, a lock file is created to prevent the same resource from being shared at the same time. The lock file here can be understood as an exclusive lock (Exclusive Lock), also called an X lock. The exclusive lock prevents a database object (such as a database table object) from being read or modified by other transactions. If no lock file exists, the data object is not in use, and the data object of the data table is locked by creating a lock file named after the field name.
Taking the case in which the field name of the first data object in data table A is the employee number as an example, the field name of the first data object in the data table is obtained, for example Employee_number.
Creating Employee_number.lock locks the data object for the transaction: as long as the lock file exists, no other transaction can update the data object, which prevents problems such as lost updates.
Optionally, the created Employee_number.lock is placed under a specified directory or path, such as /temp/sequence/.
Step 307: generate a primary key based on the content of the storage file.
Optionally, step 307 includes:
Step 3071: read the state value of the data object saved in the storage file.
After the data object is locked, the state value of the specified data object of the data table is read from the storage file. The state value may be an initial state value or a current state value. The initial state value is the value given to the specified data object when it is first created, namely 0. The current state value means that the specified data object already had a state value before the operation. For example, if another user has already performed an update operation on the data object of data table A and the last data value of the first data object has been updated to HW00100, the current state value is HW00100.
Step 3072: generate primary keys incrementally from the state value according to a specified increment.
Starting from the state value, the primary key data object is obtained by incrementing according to the specified increment. Taking the case in which the field name of the first data object in data table A is the employee number as an example, the field name of the first data object in the data table is obtained, for example Employee_number. On first creation, the initial state value of the Employee_number data object is 0. The specified increment is an integer value, such as 1. The first primary key value is obtained by adding the specified increment to the initial state value, the second primary key value is obtained by adding the specified increment to the first primary key value, and so on, until the Employee_number data object has been fully processed and the Nth primary key value is obtained.
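The incremental generation in step 3072 amounts to repeated addition starting from the state value. The following is a minimal Python sketch for integer state values; the function name generate_keys and the count parameter (the number of records that need a key) are hypothetical.

```python
def generate_keys(state_value, count, increment=1):
    """Generate count primary key values after state_value.

    The first key is state_value + increment, the second is the first
    key + increment, and so on. The last key in the returned list is
    the new state value to store.
    """
    keys = []
    current = state_value
    for _ in range(count):
        current += increment
        keys.append(current)
    return keys
```

For example, generate_keys(0, 3) returns [1, 2, 3]: with an initial state value of 0 and an increment of 1, the third and last primary key value is 3.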
Optionally, the method further includes:
Step 308: save the last state value of the primary key to the storage file.
The resulting Nth primary key value is saved to the storage file, for example to Employee_number.txt. N is a natural number and here denotes the position of the last value.
Step 309: delete the lock file to release the data object.
The lock mechanism is mainly used to manage concurrent access to shared resources and to guarantee the integrity and consistency of the database in a multi-user environment. After the operation on the data object is complete, the lock must be deleted to release the data object.
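Steps 302 to 309 can be combined into one routine. The following Python sketch runs against a local directory standing in for HDFS and makes several assumptions that are not in the text: integer state values, a .txt storage file, a timeout while waiting for the lock, and exclusive file creation to take the lock. It also takes the lock before reading the storage file, which differs from the order shown in Fig. 3, so that no other process can change the state value between the read and the write. The function name assign_primary_keys is hypothetical.

```python
import os
import time


def assign_primary_keys(directory, field_name, row_count, increment=1,
                        poll_interval=0.1, timeout=10.0):
    """Generate row_count primary keys for the data object field_name."""
    os.makedirs(directory, exist_ok=True)
    storage_file = os.path.join(directory, field_name + ".txt")
    lock_file = os.path.join(directory, field_name + ".lock")

    # Steps 304 to 306: wait while the lock file exists, then create it.
    deadline = time.monotonic() + timeout
    while True:
        try:
            open(lock_file, "x").close()  # fails if the file exists
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"{field_name} is still in use")
            time.sleep(poll_interval)

    try:
        # Steps 302, 303 and 3071: read the state value, or start at 0.
        if os.path.exists(storage_file):
            with open(storage_file, encoding="utf-8") as handle:
                state = int(handle.read().strip())
        else:
            state = 0
        # Step 3072: increment from the state value.
        keys = [state + increment * i for i in range(1, row_count + 1)]
        # Step 308: save the last key as the new state value.
        if keys:
            state = keys[-1]
        with open(storage_file, "w", encoding="utf-8") as handle:
            handle.write(str(state))
        return keys
    finally:
        # Step 309: delete the lock file to release the data object.
        os.remove(lock_file)
```

Releasing the lock in a finally clause means the data object is freed even if reading or writing the storage file fails, so a failed run does not leave other users waiting on a stale lock file.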
It should be noted that although the operations of the method of the present invention are described in a particular order in the drawings, this does not require or imply that these operations must be performed in that particular order, or that all of the operations shown must be performed, to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
Referring to Fig. 4, Fig. 4 shows a schematic structural diagram of a database primary key generation apparatus provided by an embodiment of this application.
As shown in Fig. 4, the apparatus 400 includes:
A selecting unit 401, configured to select a data object of a data table stored in the Hadoop Distributed File System and determine the field name of the data object.
The Hadoop Distributed File System (HDFS) supports a traditional hierarchical file organization. A user or an application can create directories and store files in them. The hierarchy of the file system namespace is similar to that of most existing file systems: users can create, delete, move, or rename files.
Many data processing platforms run on HDFS, such as Hive, Spark-SQL, and Impala. Hive is built on the Hadoop Distributed File System, and its data is stored in HDFS. Hive has no dedicated data storage format of its own and does not build indexes for the data; the user only needs to tell Hive the column separator and the row separator of the data when creating a table, and Hive can then parse the data.
SparkSQL offers advantages such as in-memory columnar storage (In-Memory Columnar Storage) and Hive compatibility. It no longer depends on Hive and gains great convenience in data compatibility, performance optimization, and component extension. SparkSQL does not store table data in memory as native JVM objects; it uses in-memory columnar storage instead. With in-memory columnar storage, columns of all native data types are stored in native arrays, and the complex data types supported by Hive (such as array and map) are first serialized and then concatenated into a byte array for storage. In this way, one JVM object is created per column, which results in compact data storage. In addition, efficient compression methods with low CPU overhead (such as dictionary encoding and run-length encoding) can be used to reduce memory overhead.
Impala is an MPP (massively parallel processing) SQL query engine for processing large volumes of data stored in Hadoop clusters. It is open-source software written in C++ and Java. Compared with other SQL engines for Hadoop, it provides high performance and low latency.
In other words, Impala is the highest-performing SQL engine (providing an RDBMS-like experience) and offers the fastest way to access data stored in the Hadoop Distributed File System.
In this embodiment, the field name of a data object is obtained from the data object of a data table stored in HDFS. For example, during data maintenance, data table A is obtained, in which the field name of the first data object is the employee number; as another example, data table B is obtained, in which the field name of a certain data object is the order number.
Searching unit 402, configured to search the Hadoop distributed file system, based on the field name, for the storage file corresponding to the field name.
In this embodiment of the application, the field name of the data object used to generate the primary key is determined from the data table, and HDFS is then searched, based on the field name, for a file named after that field name. This file stores the current value of the data object defined by the field name.
For example, in data table A, the field name of the first data object is employee number, and the data object corresponding to employee number stores numbering information, for example codes of the form HW00010.
In this embodiment of the application, the field name employee number is used as the search keyword, and HDFS is searched either for a file named after employee number or for the storage path of such a file. This file may be called the storage file and is mainly used to store the state value of employee number. For example, if the values of the first data object in data table A run from HW00001 to HW00100, the state value stored in the storage file is HW00100; that is, the storage file holds the final state value of employee number in the data table. When data table A is updated, or when another user or program operates on it further, the final state value stored in the storage file is read so that continuous primary keys can be generated consistently, which improves the efficiency of creating primary keys.
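The lookup described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the application: a local temporary directory stands in for HDFS, and the function name `find_storage_file`, the `.txt` suffix, and the field names `Employee_number` and `Order_number` are chosen for the example.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_storage_file(base_dir: Path, field_name: str) -> Optional[Path]:
    """Return the storage file named after the field, or None if it is absent."""
    candidate = base_dir / f"{field_name}.txt"  # assumed naming convention
    return candidate if candidate.is_file() else None


# A local temporary directory stands in for an HDFS directory (assumption).
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "Employee_number.txt").write_text("HW00100")

    found = find_storage_file(base, "Employee_number")
    missing = find_storage_file(base, "Order_number")
    # The storage file holds the last state value of the data object.
    state_value = found.read_text() if found is not None else None
```

Returning `None` for a missing file keeps the two outcomes (file found, file not found) explicit for the caller.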
Detection unit 403, configured to detect, based on the field name, whether the data object is currently being used if the storage file is found.
If it is determined that the storage file exists in HDFS, the data object of the data table may not be in use for the first time. The storage file holds the state value of the data object. Before the state value is updated further, it is necessary to detect whether the data object used to create the primary key is currently being used. This check prevents the operation errors that could arise when multiple users, programs, or machines operate on the data object at the same time.
For example, if a lock file corresponding to the data object is detected, the data object is identified as currently being used. If no lock file corresponding to the data object is detected, the data object is not being used. The lock file is a specific file named after the field name of the data object, for example a file whose name is the field name followed by the .lock suffix.
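A lock-file check of this kind might look like the sketch below. It assumes a local directory in place of HDFS; the helper name `is_in_use` and the field name `Employee_number` are hypothetical.

```python
import tempfile
from pathlib import Path


def is_in_use(base_dir: Path, field_name: str) -> bool:
    """A data object counts as in use while its .lock file exists."""
    return (base_dir / f"{field_name}.lock").exists()


# A local temporary directory stands in for an HDFS directory (assumption).
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    before = is_in_use(base, "Employee_number")
    (base / "Employee_number.lock").touch()   # another user takes the lock
    during = is_in_use(base, "Employee_number")
    (base / "Employee_number.lock").unlink()  # the lock is released
    after = is_in_use(base, "Employee_number")
```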
Locking unit 404, configured to lock the data object of the data table if the data object is not currently being used.
In this embodiment of the application, detecting whether the data object is currently being used prevents simultaneous operation by multiple users, which ensures data consistency and improves data accuracy.
Optionally, a lock file with the same name as the field name of the data object of the data table is generated in HDFS. The lock file prevents other users, programs, or machines from writing to the data object at the same time; that is, the lock file restricts the permission of other users to write to the data object.
In this embodiment of the application, the presence or absence of the lock file indicates whether the current user blocks or allows access by other users to the same resource, which ensures that the integrity, consistency, and concurrency of system data are not compromised.
For example, before a transaction operates on a data object (such as a database table object), it first sends a request to the system to lock the object. After locking, the transaction has a certain degree of control over the data object, and other transactions cannot perform update operations on the data object until the transaction releases the lock.
The lock file in this embodiment of the application can be understood as an exclusive lock (Exclusive Lock), also called an X lock, which is a lock that prevents the same resource from being shared simultaneously. A database object on which an exclusive lock has been placed cannot be read or modified by other transactions.
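One way to obtain exclusive-lock behaviour from a lock file is to create the file in exclusive mode, so that the existence check and the creation happen in one step. The sketch below uses the local file system and Python's `open(..., "x")`; this is a swapped-in variant for illustration (the text describes detection and locking as separate steps), and the helper name `try_lock` is hypothetical. Whether a particular HDFS client gives the same guarantee is not stated in the source.

```python
import tempfile
from pathlib import Path


def try_lock(base_dir: Path, field_name: str) -> bool:
    """Try to create <field_name>.lock; return True only if this call created it."""
    lock_path = base_dir / f"{field_name}.lock"
    try:
        # Mode "x" creates the file and raises if it already exists.
        with open(lock_path, "x"):
            return True
    except FileExistsError:
        return False


# A local temporary directory stands in for an HDFS directory (assumption).
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    first_attempt = try_lock(base, "Employee_number")   # lock acquired
    second_attempt = try_lock(base, "Employee_number")  # already locked
```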
Generation unit 405, configured to generate the primary key based on the content of the storage file.
In this embodiment of the application, the storage file stores the state value of the data object. The state value is read from the storage file and used as the initial value for generating primary key values, and primary key values are then generated by auto-increment from that value according to the defined increment. The primary key is also called a unique identifier; it increases automatically each time a value identifying a record is generated, which ensures that the generated unique identifiers are ordered.
In this embodiment of the application, the state value read from the storage file serves as the initial value of the auto-increment data object, and the primary key data object is then generated by auto-increment to uniquely identify the data object information of the data table.
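As an illustration of auto-incrementing from a stored state value such as HW00100, the sketch below splits the value into a non-numeric prefix and a zero-padded number. The source does not say how prefixed codes are incremented, so the parsing rule and the function name `next_key` are assumptions.

```python
import re


def next_key(state_value: str, increment: int = 1) -> str:
    """Increment the numeric tail of a state value, keeping prefix and padding."""
    match = re.fullmatch(r"(\D*)(\d+)", state_value)
    if match is None:
        raise ValueError(f"unsupported state value: {state_value!r}")
    prefix, digits = match.groups()
    number = int(digits) + increment
    # Pad to the original width; wider numbers simply use more digits.
    return f"{prefix}{number:0{len(digits)}d}"


key_after_hw00100 = next_key("HW00100")
key_after_zero = next_key("0")
key_with_step = next_key("HW00100", increment=5)
```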
Referring to Fig. 5, Fig. 5 shows a schematic structural diagram of a database primary key generating device provided by another embodiment of the application.
As shown in Fig. 5, the device 500 includes:
Selecting unit 501, configured to select a data object of a data table stored in the Hadoop distributed file system and to determine the field name of the data object;
Searching unit 502, configured to search the Hadoop distributed file system, based on the field name, for the storage file corresponding to the field name;
Creating unit 503, configured to create a storage file if the storage file is not found and to write an initial state value into it, the storage file being named after the field name.
In this embodiment of the application, if no storage file corresponding to the field name is found in the Hadoop distributed file system, the data object corresponding to the field name is being used to create a primary key for the first time. To keep the primary keys coherent and continuous, a storage file is created to store the state value of the data object, and primary keys are created from that state value. For example, if the values of the first data object in data table A run from HW00001 to HW00100, the state value stored in the storage file is HW00100.
Detection unit 504, configured to judge, based on the field name, whether the data object is currently being used.
After the storage file for storing the state value of the data object has been created in HDFS, it is necessary to further detect whether the data object used to create the primary key is currently being used. This check prevents multiple users, programs, or machines from operating on the data object at the same time, and so avoids the errors that simultaneous operation would cause.
For example, if a lock file corresponding to the data object is detected, the data object is identified as currently being used. If no lock file corresponding to the data object is detected, the data object is not being used. The lock file is a specific file named after the field name of the data object, for example a file whose name is the field name followed by the .lock suffix.
In this embodiment of the application, the lock file prevents the same resource from being shared simultaneously. A database object on which an exclusive lock has been placed cannot be read or modified by other transactions, which ensures that the integrity and consistency of system data are not compromised.
Locking unit 505, configured to lock the data object of the data table if the data object is not currently being used.
In this embodiment of the application, detecting whether the data object is currently being used prevents simultaneous operation by multiple users, which ensures data consistency and improves data accuracy.
Optionally, a lock file with the same name as the field name of the data object of the data table is generated in HDFS. The lock file prevents other users, programs, or machines from writing to the data object at the same time; that is, the lock file restricts the permission of other users to write to the data object.
Generation unit 506, configured to generate the primary key based on the content of the storage file.
In this embodiment of the application, the storage file stores the state value of the data object. The state value is read from the storage file and used as the initial value for generating primary key values, and primary key values are then generated by auto-increment from that value according to the defined increment. The primary key is also called a unique identifier; it increases automatically each time a value identifying a record is generated, which ensures that the generated unique identifiers are ordered.
In this embodiment of the application, the state value read from the storage file serves as the initial value of the auto-increment data object, and the primary key data object is then generated by auto-increment to uniquely identify the data object information of the data table.
Referring to Fig. 6, Fig. 6 shows a schematic structural diagram of a database primary key generating device provided by another embodiment of the application.
As shown in Fig. 6, the device 600 includes:
Selecting unit 601, configured to select a data object of a data table stored in the Hadoop distributed file system and to determine the field name of the data object.
In this embodiment of the application, the data object of a data table stored in HDFS is obtained, and the field name of that data object is determined. For example, during data maintenance, data table A is obtained, in which the field name of the first data object is employee number; as another example, data table B is obtained, in which the field name of a certain data object is order number.
Searching unit 602, configured to search the Hadoop distributed file system, based on the field name, for the storage file corresponding to the field name.
Taking as an example the case where the field name of the first data object in data table A is employee number, the field name of the first data object in the data table is obtained, for example Employee_number. With Employee_number as the keyword, the Hadoop distributed file system is searched for the storage file corresponding to the field name, for example Employee_number.txt, or the search is made under a specified directory, for example /temp/sequence/Employee_number.txt.
Alternatively, the specified directory is /temp/sequence/ or the like. Optionally, the storage file corresponding to the field name is not limited to a file with the .txt suffix and may be a file in another format. In this embodiment of the application, preferably, the file name is looked up by keyword search.
Creating unit 603, configured to create a storage file if the storage file is not found and to write an initial state value into it, the storage file being named after the field name; control then passes to lock file detection subunit 604.
If no storage file corresponding to the field name is found, a file is created in the Hadoop distributed file system and an initial value is written into it. For example, if Employee_number.txt is not found, an Employee_number.txt is created and an initial value, usually set to 0, is written into it.
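The create-if-missing step can be sketched as below, with a local directory standing in for the specified HDFS directory. The helper name `ensure_storage_file` is hypothetical; the `.txt` suffix and the initial value 0 follow the example in the text.

```python
import tempfile
from pathlib import Path


def ensure_storage_file(base_dir: Path, field_name: str, initial: str = "0") -> Path:
    """Create <field_name>.txt with an initial state value unless it already exists."""
    base_dir.mkdir(parents=True, exist_ok=True)
    path = base_dir / f"{field_name}.txt"
    if not path.exists():
        path.write_text(initial)
    return path


# A local temporary directory stands in for /temp/sequence/ on HDFS (assumption).
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / "sequence"
    created = ensure_storage_file(base, "Employee_number")
    initial_content = created.read_text()

    created.write_text("100")                     # a later state value is saved
    ensure_storage_file(base, "Employee_number")  # must not reset the file
    content_after_second_call = created.read_text()
```

The second call leaves the saved state value untouched, which is what keeps later primary keys continuous with earlier ones.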
The step of looking up the storage file ensures that operations on the primary key by multiple users or programs remain consistent and continuous, which ensures the correctness of the primary key.
Lock file detection subunit 604, configured to detect, based on the field name, whether a lock file named after the field name exists in the Hadoop distributed file system; if the lock file exists, the data object is currently being used.
After the storage file for storing the state value of the data object has been created in HDFS, it is necessary to further detect whether the data object used to create the primary key is currently being used. Detecting whether the data object is currently being used prevents multiple users, programs, or machines from operating on the data object at the same time, and so prevents the resource from being shared simultaneously.
Taking as an example the case where the field name of the first data object in data table A is employee number, the field name of the first data object in the data table is obtained, for example Employee_number. With Employee_number as the keyword, HDFS is searched for the file Employee_number.lock. If this file exists, the Employee_number data object is being used; if it does not exist, the Employee_number data object is not being used. The search can also be made under a specified directory, for example /temp/sequence/Employee_number.lock.
Alternatively, the specified directory is /temp/sequence/ or the like. Optionally, the lock file corresponding to the field name is not limited to a file with the .lock suffix and may be a file in another format. In this embodiment of the application, preferably, the file name is looked up by keyword search.
Delay unit 605, configured to wait for a period of time if the data object is currently being used, until it is detected that the data object is no longer being used, and then to enter lock file creation subunit 606.
While the data object is being used, to avoid discontinuous primary key values caused by simultaneous operation by multiple users, a waiting timer can be set, or detection messages can be sent at regular intervals, to check whether the lock file still exists. Once the lock file can no longer be detected, control passes to lock file creation subunit 606.
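The periodic check described above can be sketched as a polling loop. The polling interval, the timeout (the text itself waits until the lock disappears and does not mention a timeout), and the helper name `wait_until_free` are assumptions; a local temporary directory stands in for HDFS, and a timer thread plays the role of the other user releasing the lock.

```python
import tempfile
import threading
import time
from pathlib import Path


def wait_until_free(lock_path: Path, poll_interval: float = 0.05,
                    timeout: float = 2.0) -> bool:
    """Poll until the lock file disappears; return False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while lock_path.exists():
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True


with tempfile.TemporaryDirectory() as tmp:
    lock = Path(tmp) / "Employee_number.lock"

    # Case 1: another user holds the lock and releases it after 0.2 seconds.
    lock.touch()
    releaser = threading.Timer(0.2, lock.unlink)
    releaser.start()
    released = wait_until_free(lock)
    releaser.join()

    # Case 2: the lock is never released, so the wait gives up.
    lock.touch()
    timed_out = not wait_until_free(lock, timeout=0.2)
```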
Lock file creation subunit 606, configured to create, if the data object is not currently being used, a lock file named after the field name in the Hadoop distributed file system to lock the data object of the data table.
In this embodiment of the application, a lock file is created to prevent the same resource from being shared simultaneously. The lock file here can be understood as an exclusive lock (Exclusive Lock), also called an X lock. This exclusive lock prevents a database object (such as a database table object) from being read or modified by other transactions. If no lock file exists, the data object is not currently being used, and the data object of the data table is then locked by creating a lock file named after the field name.
Taking as an example the case where the field name of the first data object in data table A is employee number, the field name of the first data object in the data table is obtained, for example Employee_number.
Creating Employee_number.lock locks the transaction. While the lock file exists, no other transaction can perform update operations on the data object, which prevents anomalies such as lost updates. Optionally, the created Employee_number.lock is placed under a specified directory or path, for example /temp/sequence/.
Generation unit 607, configured to generate the primary key based on the content of the storage file.
Optionally, generation unit 607 includes:
Reading subunit 6071, configured to read the state value of the data object stored in the storage file.
After the data object is locked, the state value of the specified data object of the data table is read from the storage file. The state value may be the initial state value or the current state value. The initial state value means that, at first creation, the initial state value of the specified data object is 0. The current state value means that the specified data object already had a state value before this operation; this state value is called the current value. For example, if another user has already performed an update operation on the data object of data table A and the last data value of the first data object has been updated to HW00100, the current state value is HW00100.
Auto-increment subunit 6072, configured to generate the primary key incrementally, based on the state value, according to a specified increment.
Based on the state value, the primary key data object is obtained by incrementing according to the specified increment. Taking as an example the case where the field name of the first data object in data table A is employee number, the field name of the first data object in the data table is obtained, for example Employee_number. At first creation, the initial state value of the specified Employee_number data object is 0. The specified increment is an integer value, for example 1. The first primary key value is obtained by adding the specified increment to the initial state value; the second primary key value is obtained by adding the specified increment to the first primary key value; and so on, until the Employee_number data object has been fully processed and the Nth primary key value is obtained.
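The incremental generation above amounts to repeated addition of the specified increment. A minimal sketch, assuming integer state values and a hypothetical helper name `generate_keys`:

```python
from typing import List


def generate_keys(state_value: int, count: int, increment: int = 1) -> List[int]:
    """Return `count` primary key values, each one `increment` above the previous."""
    keys: List[int] = []
    current = state_value
    for _ in range(count):
        current += increment  # first key = state value + increment, and so on
        keys.append(current)
    return keys


first_creation = generate_keys(state_value=0, count=3)            # starts from 0
continued = generate_keys(state_value=100, count=2, increment=5)  # resumes from 100
```

The last element of the returned list is the Nth primary key value, which is the value a later step would save as the new state.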
Optionally, the device further includes:
Storage unit 608, configured to save the last state value of the primary key into the storage file.
The obtained Nth primary key value is saved into the storage file, for example into Employee_number.txt. N is a natural number and here denotes the position of the last value.
Deleting unit 609, configured to delete the lock file to release the data object.
The lock mechanism is mainly used to manage concurrent access to shared resources and, in a multi-user environment, to ensure the integrity and consistency of the database. After the operation on the data object is completed, the lock needs to be deleted to release the data object.
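Putting the steps together (create the storage file if missing, lock, read the state value, increment, save the last value, delete the lock), a compact sketch might look like this. It runs against a local temporary directory rather than HDFS, uses integer state values, and raises instead of waiting when the lock is held; the function name `allocate_keys` and these simplifications are assumptions made for illustration.

```python
import tempfile
from pathlib import Path
from typing import List


def allocate_keys(base_dir: Path, field_name: str, count: int,
                  increment: int = 1) -> List[int]:
    """Generate `count` keys for a field, persisting the last one as the new state."""
    storage = base_dir / f"{field_name}.txt"
    lock = base_dir / f"{field_name}.lock"
    if not storage.exists():
        storage.write_text("0")  # first use: initial state value
    with open(lock, "x"):        # raises FileExistsError if the object is in use
        pass
    try:
        state = int(storage.read_text())
        keys = [state + increment * i for i in range(1, count + 1)]
        if keys:
            storage.write_text(str(keys[-1]))  # save the last state value
        return keys
    finally:
        lock.unlink()  # delete the lock file to release the data object


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    first_batch = allocate_keys(base, "Employee_number", 3)
    second_batch = allocate_keys(base, "Employee_number", 2)
    saved_state = (base / "Employee_number.txt").read_text()
    lock_left_behind = (base / "Employee_number.lock").exists()
```

Deleting the lock in a `finally` block is a design choice of this sketch: it releases the data object even if reading or writing the storage file fails.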
It should be noted that although the operations of the method of the invention are described in a particular order in the accompanying drawings, this does not require or imply that the operations must be performed in that particular order, or that all of the operations shown must be performed to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be divided into multiple steps.
It should be appreciated that the units or modules described in devices 300-600 correspond to the steps of the method described with reference to Figs. 1-3. The operations and features described above for the method therefore apply equally to devices 300-600 and the units they contain, and are not repeated here. Device 400 can be implemented in advance in a browser or other security application of an electronic device, or can be loaded into the browser or security application of the electronic device by downloading or the like. The corresponding units in devices 300-600 can cooperate with units in the electronic device to implement the solution of the embodiments of the application.
Referring now to Fig. 7, it shows a schematic structural diagram of a computer system 700 suitable for implementing the terminal device or server of the embodiments of the application.
As shown in Fig. 7, computer system 700 includes a central processing unit (CPU) 701, which can perform various appropriate actions and processing according to a program stored in read-only memory (ROM) 702 or a program loaded from storage section 708 into random access memory (RAM) 703. RAM 703 also stores the various programs and data required for the operation of system 700. CPU 701, ROM 702, and RAM 703 are connected to one another through bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a cathode ray tube (CRT), a liquid crystal display (LCD), and the like, as well as a speaker and the like; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card or a modem. Communication section 709 performs communication processing via a network such as the Internet. A drive 710 is also connected to I/O interface 705 as needed. A removable medium 711, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on drive 710 as needed, so that a computer program read from it can be installed into storage section 708 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to any of Figs. 1-3 may be implemented as a computer software program. For example, embodiments of the disclosure include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program containing program code for performing the method of any of Figs. 1-3. In such embodiments, the computer program can be downloaded and installed from a network through communication section 709 and/or installed from removable medium 711.
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logic function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units or modules described in the embodiments of the application can be implemented in software or in hardware. The described units or modules can also be provided in a processor; for example, it can be described as: a processor including a selecting unit, a searching unit, a detection unit, a locking unit, and a generation unit. The names of these units or modules do not, under certain conditions, limit the units or modules themselves; for example, the selecting unit can also be described as "a unit for selecting".
As another aspect, the application also provides a computer-readable storage medium, which may be the computer-readable storage medium included in the device of the above embodiments, or may be a computer-readable storage medium that exists separately and is not assembled into a device. The computer-readable storage medium stores one or more programs, which are used by one or more processors to execute the database primary key generation method described in the application.
The above description covers only the preferred embodiments of the application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the application.

Claims (15)

1. A database primary key generation method, characterized in that the method includes:
Selecting a data object of a data table stored in a Hadoop distributed file system, and determining the field name of the data object;
Searching the Hadoop distributed file system, based on the field name, for a storage file corresponding to the field name;
If the storage file is found, detecting, based on the field name, whether the data object is currently being used;
If the data object is not currently being used, locking the data object of the data table;
Generating a primary key based on the content of the storage file.
2. The method according to claim 1, characterized in that the method further includes:
If the storage file is not found, creating a storage file and writing an initial state value into the storage file, the storage file being named after the field name;
Then entering the step of judging, based on the field name, whether the data object is currently being used.
3. The method according to claim 1 or 2, characterized in that the detecting, based on the field name, whether the data object is currently being used includes:
Detecting, based on the field name, whether a lock file named after the field name exists in the Hadoop distributed file system; if the lock file exists, the data object is currently being used.
4. The method according to claim 3, characterized in that the locking the data object of the data table if the data object is not currently being used includes:
If the data object is not currently being used, creating, in the Hadoop distributed file system, a lock file named after the field name to lock the data object of the data table.
5. The method according to any one of claims 1-3, characterized in that the method further includes:
If the data object is currently being used, waiting for a period of time until the data object is no longer being used, and then entering the step of generating the primary key based on the content of the storage file.
6. The method according to any one of claims 1-5, characterized in that the generating a primary key based on the content of the storage file includes:
Reading the state value of the data object stored in the storage file;
Generating the primary key incrementally, based on the state value, according to a specified increment.
7. The method according to claim 6, characterized in that, after the generating a primary key based on the content of the storage file, the method further includes:
Saving the last state value of the primary key into the storage file;
Deleting the lock file to release the data object.
8. A database primary key generating device, characterized in that the device includes:
A selecting unit, configured to select a data object of a data table stored in a Hadoop distributed file system and to determine the field name of the data object;
A searching unit, configured to search the Hadoop distributed file system, based on the field name, for a storage file corresponding to the field name;
A detection unit, configured to detect, based on the field name, whether the data object is currently being used if the storage file is found;
A locking unit, configured to lock the data object of the data table if the data object is not currently being used;
A generation unit, configured to generate a primary key based on the content of the storage file.
9. The device according to claim 8, characterized in that the device further includes:
A creating unit, configured to create a storage file if the storage file is not found and to write an initial state value into the storage file, the storage file being named after the field name;
Then entering the detection unit.
10. The device according to claim 8 or 9, characterized in that the detection unit includes:
A lock file detection subunit, configured to detect, based on the field name, whether a lock file named after the field name exists in the Hadoop distributed file system; if the lock file exists, the data object is currently being used.
11. The device according to claim 10, characterized in that the locking unit includes:
A lock file creation subunit, configured to create, if the data object is not currently being used, a lock file named after the field name in the Hadoop distributed file system to lock the data object of the data table.
12. The device according to any one of claims 8-10, characterized in that the device further includes:
A delay unit, configured to wait for a period of time if the data object is currently being used, until the data object is no longer being used, and then to enter the generation unit.
13. The device according to any one of claims 8-12, characterized in that the generation unit includes:
A reading subunit, configured to read the state value of the data object stored in the storage file;
An auto-increment subunit, configured to generate the primary key incrementally, based on the state value, according to a specified increment.
14. A piece of equipment, including a processor and a storage device, characterized in that:
The storage device is configured to store one or more programs;
When the one or more programs are executed by the processor, the processor implements the method according to any one of claims 1-7.
15. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the method according to any one of claims 1-7 is implemented.
CN201810021713.4A 2018-01-09 2018-01-09 Database key generation method, device, equipment and its storage medium Pending CN108256019A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810021713.4A CN108256019A (en) 2018-01-09 2018-01-09 Database key generation method, device, equipment and its storage medium

Publications (1)

Publication Number Publication Date
CN108256019A true CN108256019A (en) 2018-07-06

Family

ID=62724981

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810021713.4A Pending CN108256019A (en) 2018-01-09 2018-01-09 Database key generation method, device, equipment and its storage medium

Country Status (1)

Country Link
CN (1) CN108256019A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107329998A (en) * 2017-06-09 2017-11-07 广州虎牙信息科技有限公司 User's increment class data capture method, device and equipment
CN109165216A (en) * 2018-08-02 2019-01-08 杭州启博科技有限公司 Method, system and storage medium for generating primary key IDs in a Redis distributed database
CN110351384A (en) * 2019-07-19 2019-10-18 深圳前海微众银行股份有限公司 Big data platform resource management method, device, equipment and readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101661509A (en) * 2009-09-29 2010-03-03 金蝶软件(中国)有限公司 Method for generating primary key field of database table and device thereof
US20110179082A1 (en) * 2004-02-06 2011-07-21 Vmware, Inc. Managing concurrent file system accesses by multiple servers using locks
CN102880705A (en) * 2012-09-28 2013-01-16 用友软件股份有限公司 Database primary key generating device and database primary key generating method
CN102999525A (en) * 2011-09-16 2013-03-27 深圳市金蝶中间件有限公司 Data-table processing method and system
CN105608165A (en) * 2015-12-21 2016-05-25 用友网络科技股份有限公司 Distributed database primary key generation method and system

Similar Documents

Publication Publication Date Title
JP7113040B2 (en) Versioned hierarchical data structure for distributed data stores
JP7044879B2 (en) Local tree update for client synchronization service
US11182356B2 (en) Indexing for evolving large-scale datasets in multi-master hybrid transactional and analytical processing systems
JP6553822B2 (en) Dividing and moving ranges in distributed systems
US10338917B2 (en) Method, apparatus, and system for reading and writing files
US10169368B2 (en) Indexing of linked data
US20060059204A1 (en) System and method for selectively indexing file system content
US10931748B2 (en) Optimistic concurrency utilizing distributed constraint enforcement
WO2009017534A1 (en) Persistent query system for automatic on-demand data subscriptions from mobile devices
US20040015486A1 (en) System and method for storing and retrieving data
JP2005346717A (en) Method, system and device for detecting and connecting data source
WO2010048531A1 (en) System and methods for metadata management in content addressable storage
US11151081B1 (en) Data tiering service with cold tier indexing
US8527480B1 (en) Method and system for managing versioned structured documents in a database
US10747749B2 (en) Methods and systems for managing distributed concurrent data updates of business objects
Third et al. LinkChains: Exploring the space of decentralised trustworthy Linked Data
CN108256019A (en) Database key generation method, device, equipment and its storage medium
JP2006146615A (en) Object-related information management program, management method and management apparatus
Ruldeviyani et al. Enhancing query performance of library information systems using NoSQL DBMS: Case study on library information systems of Universitas Indonesia
US20240111751A1 (en) Record-level locks with constant space complexity
US11550760B1 (en) Time-based partitioning to avoid in-place updates for data set copies
JP2007156844A (en) Data registration/retrieval system and data registration/retrieval method
Savaliya et al. A Comparative Study of Andrew File System and Hadoop Distributed File System Framework to Manage Big Data
CN116860700A (en) Method, device, equipment and medium for processing metadata in distributed file system
US8918379B1 (en) Method and system for managing versioned structured documents in a database

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180706