CN115757412A

CN115757412A - Unique key generation method for distributed database

Info

Publication number: CN115757412A
Application number: CN202211476314.XA
Authority: CN
Inventors: 张亮
Original assignee: Beijing Sifei Software Technology Co ltd
Current assignee: Beijing Sifei Software Technology Co ltd
Priority date: 2022-11-23
Filing date: 2022-11-23
Publication date: 2023-03-07

Abstract

The invention discloses a unique key generation method for a distributed database, in the technical field of distributed databases. The method comprises the following steps. Step S1: dividing the bits of the ID into n intervals, the n intervals comprising at least a timestamp interval, a worker process interval and a sequence number interval. Step S2: obtaining the values of the n intervals through a preset algorithm, and controlling the length of the distributed unique ID through a compression algorithm. Step S3: assigning a clock rollback prevention flag bit to the timestamp interval. Step S4: comparing the current time with the time at which an identifier was last generated, and adding a clock rollback prevention flag bit whose value can be flipped, so that different IDs are generated for the same timestamp when a clock rollback occurs. Because the flippable clock rollback prevention flag bit yields different IDs for the same timestamp after a clock rollback, the problem of duplicate IDs is solved; and because the compression algorithm produces shorter IDs, storage space is saved.

Description

Unique key generation method for distributed database

Technical Field

The invention relates to the technical field of distributed databases, in particular to a unique key generation method of a distributed database.

Background

In traditional database software development, automatic primary key generation is a basic requirement, and each database provides corresponding support for it, such as MySQL auto-increment keys and Oracle auto-increment sequences. After data sharding, however, generating a globally unique primary key across different data nodes becomes a very difficult problem. The auto-increment keys of the different actual tables behind the same logical table are not aware of one another, so duplicate primary keys are generated. Although collisions can be avoided by constraining the initial value and step size of each auto-increment key, doing so requires additional operation and maintenance rules, so the solution lacks completeness and scalability.

At present, many third-party solutions can solve this problem, either by relying on a specific algorithm to self-generate non-repeating keys, such as UUID, or by introducing a primary key generation service. To make things convenient for users and to meet the requirements of different usage scenarios, Apache ShardingSphere not only provides built-in distributed primary key generators, such as UUID and SNOWFLAKE, but also abstracts the distributed primary key generator into an interface, so that users can implement their own custom auto-increment key generators.

UUID has the following disadvantages: it cannot guarantee a trend of increasing values, and because a UUID is long it is usually represented as a character string, and queries on an index built with a string primary key are inefficient. The snowflake algorithm, shown in fig. 1, has the following defects: its bit allocation is fixed, so in extreme cases, for example when more than 1024 workers are required, it cannot meet the requirement; and when a clock rollback occurs on the server, it simply throws an error, so the service is unavailable until the clock catches up again, which causes transactions to fail.
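For reference, the fixed layout criticised above can be sketched as follows. The 41/10/12 bit split is the commonly documented snowflake allocation; the epoch value, the exception class and the function name are assumptions chosen for illustration, not taken from this document.

```python
# Commonly documented snowflake layout: 1 unused sign bit, then these fields.
TIMESTAMP_BITS = 41
WORKER_BITS = 10      # fixed: at most 2**10 = 1024 workers
SEQUENCE_BITS = 12    # fixed: at most 2**12 = 4096 IDs per worker per millisecond

EPOCH_MS = 1_288_834_974_657  # assumed custom epoch, illustration only


class ClockMovedBackwards(Exception):
    """Raised when the clock reports a time earlier than the last ID's time."""


def classic_snowflake_id(now_ms, last_ms, worker_id, sequence):
    """Compose one ID; reject rollback by raising instead of recovering."""
    if not 0 <= worker_id < (1 << WORKER_BITS):
        raise ValueError("worker_id does not fit in the fixed 10-bit field")
    if now_ms < last_ms:
        # Clock rollback (callback): no ID can be issued until the clock catches up.
        raise ClockMovedBackwards(f"clock is {last_ms - now_ms} ms behind")
    timestamp = now_ms - EPOCH_MS
    return (
        (timestamp << (WORKER_BITS + SEQUENCE_BITS))
        | (worker_id << SEQUENCE_BITS)
        | sequence
    )
```

Both limitations named in the text are visible here: a worker number of 1024 or above is rejected because the field width cannot change, and any call made while the clock is behind fails outright.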

Disclosure of Invention

This section is for the purpose of summarizing some aspects of embodiments of the invention and to briefly introduce some preferred embodiments. In this section, as well as in the abstract and the title of the invention, simplifications or omissions may be made in order to avoid obscuring the purpose of the section, the abstract and the title of the invention, and such simplifications or omissions are not intended to limit the scope of the invention.

The present invention is proposed in view of the above-mentioned problem of the unique key generation method of the existing distributed database. Therefore, the invention aims to provide a unique key generation method of a distributed database, which can solve the problem of ID repetition; the ID with shorter length is generated through a compression algorithm, so that the storage space is saved, and the requirement of an application scene on the ID length is met.

In order to solve the technical problem, the invention provides a unique key generation method of a distributed database, which adopts the following technical scheme: the method specifically comprises the following steps:

Step S1: dividing the bits of the ID into n intervals, the n intervals comprising at least a timestamp interval, a worker process interval and a sequence number interval;

Step S2: obtaining the values of the n intervals through a preset algorithm, and controlling the length of the distributed unique ID through a compression algorithm;

Step S3: assigning a clock rollback prevention flag bit to the timestamp interval;

Step S4: comparing the current time with the time at which the identifier was last generated, and adding a clock rollback prevention flag bit whose value can be flipped, so that different IDs are generated for the same timestamp when a clock rollback occurs.

Optionally, obtaining the values of the n intervals through a preset algorithm includes:

obtaining the value of the timestamp interval, which is the binary representation of (Tn - Ts), where Ts is the start time and Tn is the current time;

obtaining the value of the worker process interval, which is the unique value assigned by the database to each worker process on each worker server;

and obtaining the value of the sequence number interval.
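The interval values described above can be combined into one integer ID as in the following minimal sketch. The source does not fix the widths of the intervals, so the `LAYOUT` widths and the helper names `interval_values` and `pack` are assumptions for illustration.

```python
# Assumed layout: (interval name, width in bits), most significant first.
LAYOUT = [("timestamp", 41), ("worker", 10), ("sequence", 12)]


def interval_values(tn_ms, ts_ms, worker_id, sequence):
    """Compute the value of each interval; the timestamp value is Tn - Ts."""
    return {"timestamp": tn_ms - ts_ms, "worker": worker_id, "sequence": sequence}


def pack(values, layout=LAYOUT):
    """Concatenate the interval values into a single integer, checking widths."""
    id_ = 0
    for name, width in layout:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        id_ = (id_ << width) | value
    return id_


example_id = pack(interval_values(tn_ms=1005, ts_ms=1000, worker_id=3, sequence=7))
```

Because the layout is data rather than hard-coded shifts, a different number of intervals or different widths only requires a different `LAYOUT` list.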

Optionally, the controlling the length of the distributed unique ID by the compression algorithm includes:

setting a mapping relation between numbers and characters, and mapping the numeric ID into a base-M character string, the base-M character string being the distributed unique ID.
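A minimal sketch of such a number-to-character mapping is shown below. The source does not state the value of M or the character set; M = 62 with digits, upper-case and lower-case letters is an assumption, as are the function names `encode` and `decode`.

```python
import string

# Assumed character table: position in the string is the digit value (M = 62).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def encode(number, alphabet=ALPHABET):
    """Map a non-negative integer ID to a base-M string, M = len(alphabet)."""
    if number < 0:
        raise ValueError("ID must be non-negative")
    base = len(alphabet)
    if number == 0:
        return alphabet[0]
    chars = []
    while number:
        number, remainder = divmod(number, base)
        chars.append(alphabet[remainder])
    return "".join(reversed(chars))


def decode(text, alphabet=ALPHABET):
    """Inverse mapping: recover the integer ID from its base-M string."""
    base = len(alphabet)
    value_of = {ch: i for i, ch in enumerate(alphabet)}
    number = 0
    for ch in text:
        number = number * base + value_of[ch]
    return number
```

The mapping is reversible, so the information carried in the numeric ID (timestamp, worker, sequence) can still be recovered from the string form.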

Optionally, the n intervals into which the bits of the ID are divided further include an idle interval, and the method further includes:

removing the idle bits from the corresponding identification interval; and adding one or more of the idle bits to other identification intervals to expand the number of bits of those intervals.

Optionally, the identifier includes a sign bit, a clock rollback prevention flag bit, timestamp bits, worker process bits and sequence number bits.
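The following sketch shows how an identifier with these five fields could be split back into its parts. The field order follows the sentence above; the widths (1, 1, 40, 10 and 12 bits, 64 in total) and the name `unpack` are assumptions, since the source gives no widths.

```python
# Assumed field widths, most significant first; they sum to 64 bits.
FIELDS = [("sign", 1), ("flag", 1), ("timestamp", 40), ("worker", 10), ("sequence", 12)]


def unpack(id_, fields=FIELDS):
    """Split an integer identifier into its named fields."""
    parts = {}
    # Walk from the least significant field upwards, masking off each width.
    for name, width in reversed(fields):
        parts[name] = id_ & ((1 << width) - 1)
        id_ >>= width
    return parts


sample = (1 << 62) | (5 << 22) | (3 << 12) | 7
decoded = unpack(sample)
```

Here `decoded` reports the flag as 1, the timestamp as 5, the worker as 3 and the sequence number as 7, with the sign bit left at 0.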

Optionally, comparing the current time with the time at which the identifier was last generated, and adding a clock rollback prevention flag bit whose value can be flipped so that different IDs are generated for the same timestamp when a clock rollback occurs, includes:

determining whether the current time is less than the time at which the identifier was last generated;

if the current time is greater than or equal to that time, setting the clock rollback prevention flag bit to a first value, determining the difference between the current time and the base time, converting the difference to obtain a first timestamp, and generating an identifier based on the sign bit, the first value, the first timestamp, the worker process bits and the sequence number;

and if the current time is less than that time, determining that a clock rollback has occurred, and obtaining the first identifier corresponding to the second timestamp after the rollback, so as to flip the clock rollback prevention flag bit in the first identifier from the first value to a second value and thereby obtain the identifier.
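One possible reading of this branch logic is sketched below; it is an interpretation under stated assumptions, not the patented implementation. The field widths, the class name `FlipFlagIdGenerator`, the injectable `clock_ms` callable and the choice to raise on sequence overflow are all assumptions made for illustration.

```python
import threading

# Assumed widths; the sign bit stays 0 above the flag bit.
TS_BITS, WORKER_BITS, SEQ_BITS = 40, 10, 12
FLAG_SHIFT = TS_BITS + WORKER_BITS + SEQ_BITS
MAX_SEQUENCE = (1 << SEQ_BITS) - 1


class FlipFlagIdGenerator:
    def __init__(self, worker_id, clock_ms, base_ms=0):
        self._worker_id = worker_id
        self._clock_ms = clock_ms      # callable returning the current time in ms
        self._base_ms = base_ms        # base time subtracted from the current time
        self._last_ms = -1
        self._flag = 0                 # first value; flipped on clock rollback
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock_ms()
            if now < self._last_ms:
                # Clock rollback (callback): flip the flag so that timestamps
                # seen before the rollback yield different IDs this time.
                self._flag ^= 1
                self._sequence = 0
            elif now == self._last_ms:
                if self._sequence == MAX_SEQUENCE:
                    # Simplification: a real generator would wait for the next ms.
                    raise OverflowError("sequence exhausted for this millisecond")
                self._sequence += 1
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                (self._flag << FLAG_SHIFT)
                | ((now - self._base_ms) << (WORKER_BITS + SEQ_BITS))
                | (self._worker_id << SEQ_BITS)
                | self._sequence
            )
```

With a clock that reports 1000, 1001 and then 1000 again, the first and third IDs share the same timestamp, worker and sequence fields but differ in the flag bit. In this sketch a second rollback over the same timestamps would flip the flag back to its earlier value; the source does not describe that case.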

In summary, the invention includes at least one of the following advantages:

the clock rollback scenario on the server is fully considered, and a clock rollback prevention flag bit whose value can be flipped is added, so that different IDs are generated for the same timestamp when a clock rollback occurs, which solves the problem of duplicate IDs; and shorter IDs are generated through a compression algorithm, which saves storage space and meets the ID length requirements of the application scenario.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.

FIG. 1 is a schematic diagram of bit distribution of the snowflake algorithm provided by Twitter;

FIG. 2 is a flow chart of the method of the present invention.

Detailed Description

The present invention is described in further detail below with reference to fig. 2.

Example one

Referring to fig. 2, the invention discloses a unique key generation method for a distributed database, which specifically comprises the following steps:

Step S1: dividing the bits of the ID into n intervals, the n intervals comprising at least a timestamp interval, a worker process interval and a sequence number interval.

Step S2: obtaining the values of the n intervals through a preset algorithm. This includes obtaining the value of the timestamp interval, which is the binary representation of (Tn - Ts), where Ts is the start time and Tn is the current time; obtaining the value of the worker process interval, which is the unique value assigned by the database to each worker process on each worker server; and obtaining the value of the sequence number interval.

The length of the distributed unique ID is then controlled through a compression algorithm: a mapping relation between numbers and characters is set, and the numeric ID is mapped into a base-M character string, which serves as the distributed unique ID and shortens its length. For example, after a generated 128-bit numeric ID is mapped to a distributed unique character ID, the ID is partitioned into intervals and each interval is compressed, so that the length of the ID is reduced from 128 bits to 64 bits, that is, shortened by 64 bits. The length of the ID is thus effectively shortened while the distributed uniqueness of the ID and the data information it contains are preserved.
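The effect of the base M on the length of the character ID can be checked with a small calculation. This sketch is only an illustration of the arithmetic: the source does not specify M, and the function name `chars_needed` is hypothetical.

```python
def chars_needed(bits, base):
    """Characters required to write the largest `bits`-bit value in `base`."""
    largest = (1 << bits) - 1
    count = 1
    # Integer division avoids floating-point rounding on large values.
    while largest >= base:
        largest //= base
        count += 1
    return count


# Length of a 64-bit ID as text, for a few candidate bases.
lengths_64 = {base: chars_needed(64, base) for base in (10, 16, 36, 62)}
```

For a 64-bit ID this gives 20 characters in base 10, 16 in base 16, 13 in base 36 and 11 in base 62, so a larger character set yields a shorter string for the same numeric ID.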

Step S3: assigning a clock rollback prevention flag bit to the timestamp interval. Flipping this bit prevents duplicate IDs from being generated when a clock rollback occurs. The value can be flipped from 0 to 1 or from 1 to 0; the two values are the first value and the second value respectively, and the specific assignment is set by the operator.

Step S4: comparing the current time with the time at which an identifier was last generated, where the identifier comprises a sign bit, a clock rollback prevention flag bit, timestamp bits, worker process bits and sequence number bits. Adding a clock rollback prevention flag bit whose value can be flipped, so that different IDs are generated for the same timestamp when a clock rollback occurs, includes:

When the current time of the server is greater than or equal to the time at which the identifier was last generated, it is determined that no clock rollback has occurred. The clock rollback prevention flag bit is set to a first value, for example 0, and an ID is generated comprising, in order, the sign bit, the first value, the difference between the current time and the base time (that is, the first timestamp) and an auto-incrementing sequence number. The base time may be the start time of the server.

If, however, the current time is less than the time at which the identifier was last generated, a clock rollback has occurred. In that case the clock rollback prevention flag bit in the ID (that is, the first identifier) corresponding to the second timestamp after the rollback can be flipped directly, for example from 0 to 1 or from 1 to 0, without considering information such as the current time or the sequence number. The resulting ID differs from the first identifier in the clock rollback prevention flag bit.

It should be noted that clock rollback is generally caused by leap seconds or by NTP (Network Time Protocol) synchronization. The server generates many IDs per millisecond. If the rollback is shorter than one millisecond (that is, the case where the current time equals the last generation time), the sequence field of the ID simply continues to increment, no duplicate ID is generated, the rollback needs no special handling, and the ID can be generated by the normal flow.

Example two

Based on the same concept as Example one, in this example the n intervals into which the bits of the ID are divided further include an idle interval. That is, when the length of the distributed unique ID is controlled by the compression algorithm, an ID may not actually use all 64 bits in operation and can be reduced appropriately: idle bits are identified and removed from the corresponding identification interval, and one or more of the idle bits are added to other identification intervals to expand the number of bits of those intervals.
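Moving idle bits between intervals can be sketched as a change to a layout table. The starting widths and the function name `reallocate` are assumptions for illustration; the source does not say which intervals hold idle bits or how many.

```python
def reallocate(layout, idle_from, give_to, bits):
    """Move `bits` idle bits from one interval to another; total width is unchanged."""
    if not 0 <= bits <= layout[idle_from]:
        raise ValueError("cannot remove more bits than the interval has")
    new_layout = dict(layout)  # leave the caller's layout untouched
    new_layout[idle_from] -= bits
    new_layout[give_to] += bits
    return new_layout


# Assumed starting widths in bits.
layout = {"timestamp": 41, "worker": 10, "sequence": 12}

# Example: a deployment that never needs 4096 IDs per millisecond gives
# two sequence bits to the worker interval.
wider_workers = reallocate(layout, idle_from="sequence", give_to="worker", bits=2)
max_workers = 1 << wider_workers["worker"]
```

In this example the worker interval grows from 10 to 12 bits, raising the worker limit from 1024 to 4096, while the sequence interval shrinks to 10 bits (1024 IDs per millisecond) and the total ID length stays the same.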

If the method of the present disclosure is implemented in the form of software functional units and sold or used as a separate product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in whole or in part, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a processor to execute all or part of the steps of the method of the embodiments of the present application. The storage medium includes a data server, a cloud server, a read-only memory (ROM), a random access memory (RAM), a mobile communication device, or any other medium capable of storing code, such as an optical disc or a USB flash drive.

The above are all preferred embodiments of the present invention, and the protection scope of the present invention is not limited thereby, so: equivalent changes made according to the structure, shape and principle of the invention shall be covered by the protection scope of the invention.

Claims

1. A unique key generation method for a distributed database, characterized in that the method specifically comprises the following steps:

2. The method according to claim 1, characterized in that obtaining the values of the n intervals through the preset algorithm includes:

and obtaining the value of the sequence number interval.

3. The method according to claim 1, characterized in that controlling the length of the distributed unique ID through the compression algorithm includes:

setting a mapping relation between numbers and characters, and mapping the numeric ID into a base-M character string, the base-M character string being the distributed unique ID.

4. The method according to claim 1, characterized in that the n intervals into which the bits of the ID are divided further include an idle interval,

5. The method according to claim 1, characterized in that the identifier comprises a sign bit, a clock rollback prevention flag bit, timestamp bits, worker process bits and sequence number bits.

6. The method according to claim 1, characterized in that comparing the current time with the time at which the identifier was last generated, and adding a clock rollback prevention flag bit whose value can be flipped so that different IDs are generated for the same timestamp when a clock rollback occurs, includes: