CN110895531A - Data writing method of data storage table, partition server and electronic equipment - Google Patents


Info

Publication number: CN110895531A
Application number: CN201811058603.1A
Authority: CN (China)
Other languages: Chinese (zh)
Inventor: 国浩
Current Assignee: Beijing Qihoo Technology Co Ltd
Original Assignee: Beijing Qihoo Technology Co Ltd
Legal status: Pending (listed as an assumption, not a legal conclusion)
Prior art keywords: data, index, writing, storage table, source
Classification: Information Retrieval, Db Structures And Fs Structures Therefor

Abstract

The application provides a data writing method for a data storage table, a partition server, and an electronic device. The method comprises the following steps: intercepting a data writing request sent by a client, wherein the data writing request carries a row key and at least one column value corresponding to the row key; rewriting the data writing request, and generating source data and at least one index data based on the rewritten data writing request; and writing the at least one index data and the source data into a data storage table, wherein the data storage table comprises at least one index data sub-table sorted by column value and a source data sub-table sorted by row key. Because the source data and the index data are stored in one table, both the primary index function and the secondary index function can be realized from them, data management during reading, writing, and migration is simplified, data consistency is protected from damage, and the reliability of data storage is effectively improved.

Description

Data writing method of data storage table, partition server and electronic equipment
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data writing method for a data storage table, a partition server, and an electronic device.
Background
With the rapid development and widespread application of internet information technology, the volume of stored data has grown explosively, and cloud computing and distributed systems have become the main approach to processing large volumes of data. Distributed databases provide high-performance, highly reliable, and easily scalable read and write functions for structured big data and are widely used by large internet companies; one example is HBase, which is commonly used for storing massive data.
In practical applications, as shown in fig. 1, an HBase data table is sorted by rowkey (row key), and data can be located quickly by rowkey; this is called the "primary index". In the prior art, the open-source community project Phoenix can also be used to implement a "secondary index" on a column. As shown in fig. 2 (which takes col2 as an example), this scheme builds a second HBase table alongside the HBase data table and inserts each value (column value) of the indexed column, together with the original rowkey, into the new table in inverted form, so that the new table is sorted by the indexed column. The rowkey can then be found quickly from the indexed column, and the data can in turn be located quickly by that rowkey in the table of fig. 1.
However, implementing the secondary index in this way has an obvious disadvantage: an index HBase table must be added alongside the original HBase data table, which makes reads and writes harder because both tables must be operated on, and makes it difficult to guarantee consistency between the index data and the original data. Even if both tables are operated on correctly during reads and writes, it is hard to prevent external operations such as migration from breaking the correspondence between the two tables.
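The two-table scheme and its weakness can be illustrated with a minimal sketch. The dictionaries below are hypothetical in-memory stand-ins for the two HBase tables, and the function names and the `fail_index_write` flag are assumptions introduced only for illustration; this is not Phoenix code.

```python
# Hypothetical in-memory stand-ins for two separate HBase tables.
data_table = {}    # rowkey -> {column: value}, the "primary index"
index_table = {}   # (column value, rowkey) -> None, keyed by column value


def put_two_tables(rowkey, columns, index_column, fail_index_write=False):
    """Write one row to the data table, then its index entry to a second table."""
    data_table[rowkey] = dict(columns)
    if fail_index_write:
        # Simulates the second write failing after the first one succeeded.
        return
    index_table[(columns[index_column], rowkey)] = None


def query_by_value(value):
    """Secondary-index lookup: column value -> rowkeys -> full rows."""
    rowkeys = sorted(rk for (v, rk) in index_table if v == value)
    return [(rk, data_table[rk]) for rk in rowkeys]


put_two_tables("RK1", {"col1": "c11", "col2": "c21"}, "col2")
put_two_tables("RK2", {"col1": "c12", "col2": "c22"}, "col2", fail_index_write=True)

found = query_by_value("c21")    # row RK1 is reachable through the index
missing = query_by_value("c22")  # row RK2 exists but the index cannot find it
```

In this sketch the row RK2 is present in the data table, yet a query by its col2 value returns nothing, which is the kind of inconsistency between index data and original data described above.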
Disclosure of Invention
In order to overcome the above technical problems or at least partially solve the above technical problems, the following technical solutions are proposed:
in a first aspect, the present application provides a data writing method for a data storage table, including:
intercepting a data writing request sent by a client, wherein the data writing request carries a row key and at least one column value corresponding to the row key;
rewriting the data writing request, and generating source data and at least one index data based on the rewritten data writing request;
and writing the at least one index data and the source data into a data storage table, wherein the data storage table comprises at least one index data sub-table sorted according to column values and a source data sub-table sorted according to row keys.
In one possible implementation manner, the intercepting a data write request sent by a client includes:
and intercepting a data writing request sent by a client when the data storage table is judged to contain the index architecture.
In one possible implementation, the rewriting the data write request includes:
invoking a pre-registered data write function to rewrite the data write request, the data write function defining that the source data and the at least one index data are generated in a predetermined format.
In a possible implementation manner, the data write request carries a target partition into which the data is to be written, and generating any index data based on the rewritten data write request includes:
determining a starting row identifier of the target partition, and determining a column family name corresponding to a column value;
and generating the index data by splicing the starting row identifier, the column family name, the column value and the row key in a predetermined splicing order.
In a possible implementation manner, the data write request carries a target partition into which the data is to be written, and the writing the at least one index data and the source data into the data storage table includes:
recording a write operation on the at least one index data and the source data to a log file;
and merging and writing the at least one index data and the source data into a sorted memory buffer of the target partition.
In a possible implementation manner, after writing the at least one index data and the source data into the data storage table, the method further includes:
and generating deserialization information of the storage position of the index data according to the stored serialization information of the index data, and writing the deserialization information into a data storage table.
In one possible implementation, the method further includes:
when the column value carried in the data writing request is text information, rewriting the data writing request, and generating full-text retrieval data based on the rewritten data writing request;
and writing the full-text retrieval data into a data storage table, wherein the data storage table also comprises a full-text retrieval data sub-table which is sorted according to texts.
In a second aspect, the present application provides a partition server, comprising:
an interception module, configured to intercept a data writing request sent by a client, wherein the data writing request carries a row key and at least one column value corresponding to the row key;
a generating module, configured to rewrite the data writing request and generate source data and at least one index data based on the rewritten data writing request;
and a writing module, configured to write the at least one index data and the source data into a data storage table, wherein the data storage table comprises at least one index data sub-table sorted by column value and a source data sub-table sorted by row key.
In a possible implementation manner, the intercepting module is specifically configured to intercept a data write request sent by a client when it is determined that the data storage table includes an index architecture.
In a possible implementation manner, the generating module is specifically configured to call a pre-registered data writing function to rewrite the data writing request, where the data writing function defines that the source data and the at least one index data are generated in a predetermined format.
In a possible implementation manner, the data write request carries a target partition into which the data is to be written, and the generating module is specifically configured to determine a starting row identifier of the target partition and determine a column family name corresponding to a column value; and to generate any index data by splicing the starting row identifier, the column family name, the column value and the row key in a predetermined splicing order.
In a possible implementation manner, the data write request carries a target partition into which the data is to be written, and the writing module is specifically configured to record a write operation on the at least one index data and the source data to a log file; and to merge and write the at least one index data and the source data into a sorted memory buffer of the target partition.
In a possible implementation manner, the writing module is further specifically configured to generate deserialization information of a storage location of the index data according to the stored serialization information of the index data, and write the deserialization information into a data storage table.
In a possible implementation manner, the generating module is further specifically configured to, when the column value carried in the data writing request is text information, rewrite the data writing request, and generate full-text retrieval data based on the rewritten data writing request;
the writing module is further specifically configured to write the full-text search data into a data storage table, where the data storage table further includes a full-text search data sub-table sorted according to text.
In a third aspect, the present application provides an electronic device comprising:
a processor and a memory, the memory storing at least one instruction, at least one program, set of codes, or set of instructions, the at least one instruction, the at least one program, set of codes, or set of instructions being loaded and executed by the processor to implement the data writing method as shown in the first aspect of the present application.
In a fourth aspect, the present application provides a computer-readable storage medium for storing a computer instruction, a program, a set of codes, or a set of instructions that, when run on a computer, causes the computer to perform the data writing method as shown in the first aspect of the present application.
The technical solution provided by the present application brings the following beneficial effects:
By intercepting and rewriting the data writing request sent by the client, source data and at least one index data are generated based on the rewritten data writing request and then written into the data storage table, so that the source data and the index data are stored in one table. The primary index function and the secondary index function can both be realized from the source data and the index data; at the same time, data management during reading, writing, and migration is simplified, data consistency is protected from damage, and the reliability of data storage is effectively improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.
FIG. 1 is a diagram illustrating an example of a data table for a primary index according to an embodiment of the present application;
FIG. 2 is a diagram illustrating an example of a data table for a secondary index according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of a data writing method according to an embodiment of the present application;
FIG. 4 is an exemplary diagram of a data storage table provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a data writing method based on HBase according to an embodiment of the present application;
FIG. 6 is a schematic flowchart of another data writing method according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a partition server according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
In the prior art, the secondary index requires two tables to be maintained, so the following problems inevitably occur:
(1) The data writing process must keep the write operations on the two tables consistent; if the write to table 1 succeeds and the write to table 2 fails, a query cannot obtain the required data.
(2) When file 1 generated from table 1 and file 2 generated from table 2 are migrated, a user who does not know that file 2 exists copies only file 1, and the migrated data can no longer serve a secondary index.
Based on the above, the application provides a data writing method of a data storage table, a partition server and an electronic device, which are used for solving the problem of how to maintain the consistency of data available for secondary indexing.
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
Example one
An embodiment of the present application provides a data writing method for a data storage table, as shown in fig. 3, the method includes:
step S301: intercepting a data writing request sent by a client, wherein the data writing request carries a row key and at least one column value corresponding to the row key;
When a user needs to store data, a data writing request can be sent to the server through the client, and the request carries at least one group of data to be stored. Each group of data includes a row key and at least one column value corresponding to the row key; for example, as shown in fig. 1, a group of data may include the rowkey RK1 and the values c11 and c21 corresponding to RK1. In practical applications, a user may initiate a data write request to store one group of data, or to store two or more groups of data at the same time; for example, the request may carry RK1 with its values c11 and c21, RK2 with its values c12 and c22, and so on. The data above are only examples and should not be construed as limiting the embodiments of the present application.
In order to optimize data storage, in the embodiment of the present application, after receiving a data write request sent by a client, a server intercepts and executes step S302, instead of directly storing data carried in the data write request.
Step S302: rewriting the data writing request, and generating source data and at least one index data based on the rewritten data writing request;
In fact, if the data write request sent by the client were executed directly, only the source data would be generated. In the embodiment of the present application, the data write request sent by the client is rewritten so that both the source data and the at least one index data can be generated based on the rewritten request.
Specifically, based on the rewritten data write request, source data is generated from the mapping relationship between the row key and each column value.
Likewise, based on the rewritten data writing request, each index data is generated from the mapping relationship between one column value and the row key; generating index data in this way for each column value yields the at least one index data.
It can be seen that the number of generated index data is related to the number of values corresponding to rowkey, and in the above example, if the data write request carries the row key RK1 and the column values c11 and c21 corresponding to RK1, one index data may be established according to the mapping relationship between c11 and RK1, or another index data may be established according to the mapping relationship between c21 and RK 1. That is, the corresponding index data is generated according to the mapping relationship between each column value carried in the data write request and the corresponding row key. In practical applications, some column values may not need to generate corresponding index data in consideration of practicality. Those skilled in the art can determine which column values need to generate corresponding index data according to actual situations, which is not limited in the embodiment of the present application.
Step S303: and writing at least one index data and source data into a data storage table, wherein the data storage table comprises at least one index data sub-table sorted according to column values and a source data sub-table sorted according to row keys.
In the embodiment of the application, at least one index data and source data are written into the same data storage table, so that the index data become a part of the source data, and data management under the conditions of data reading and writing, data migration and the like is facilitated. As shown in fig. 4, the data storage table includes at least one INDEX data sub-table (corresponding to the INDEX table in fig. 4) sorted by column value and a source data sub-table (corresponding to the cf1 table in fig. 4) sorted by row key, and can implement the primary INDEX and the secondary INDEX functions.
In the embodiment of the present application, the data writing request sent by the client is intercepted and rewritten, so that source data and at least one index data are generated based on the rewritten data writing request and then written into the data storage table. Because the source data and the index data are stored in the same table, the primary index and secondary index functions can both be realized from them, data management during reading, writing, and migration is simplified, data consistency is protected from damage, and the reliability of data storage is effectively improved.
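Steps S301 to S303 can be sketched roughly as follows. This is a minimal in-memory illustration rather than the patented implementation: the request layout, the `indexed_columns` parameter, and the function names are assumptions made for the example, while the sub-table names cf1 and INDEX follow fig. 4.

```python
def rewrite_write_request(request, indexed_columns):
    """Turn {rowkey: {column: value}} into source entries and index entries."""
    source, index = [], []
    for rowkey, columns in request.items():
        source.append((rowkey, dict(columns)))
        for column in indexed_columns:
            if column in columns:
                # One index entry per indexed column value: value -> rowkey.
                index.append((columns[column], rowkey))
    return source, index


def write_to_table(table, source, index):
    """Write both kinds of entries into one table with two sorted sub-tables."""
    table["cf1"].extend(source)
    table["INDEX"].extend(index)
    table["cf1"].sort(key=lambda entry: entry[0])  # sorted by row key
    table["INDEX"].sort()                          # sorted by column value


table = {"cf1": [], "INDEX": []}
request = {
    "RK2": {"col1": "c12", "col2": "c22"},
    "RK1": {"col1": "c11", "col2": "c21"},
}
source, index = rewrite_write_request(request, indexed_columns=["col2"])
write_to_table(table, source, index)
```

Here `indexed_columns` reflects the point made above that only some column values may need index data; which columns to index is left to the implementer.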
Example two
Based on the technical solutions provided by the above embodiments, a possible implementation manner is provided below, where the embodiments of the present application are implemented based on HBase.
HBase is a key-value database that provides efficient read and write services for data on a distributed file system. With the HBase Coprocessor mechanism, code that runs on the HBase server side can be written to extend its functionality.
One type of HBase coprocessor provides functions similar to triggers in a traditional database: the server side calls the coprocessor when certain events occur.
For the embodiment of the present application, in step S301, that is, when a data write request sent by a client is received, the server invokes the coprocessor to intercept the data write request sent by the client.
In a feasible scheme of the embodiment of the application, when the server receives a data writing request sent by the client, it determines whether the data storage table into which the user's data is to be written contains an index architecture. Data are diverse: if the stored data only needs to support the primary index, it can be stored directly as source data, and the target data table need not contain an index architecture; if the stored data needs to support index modes such as the secondary index, the technical solution provided by the embodiment of the application is needed.
Therefore, in step S301, a data write request sent by the client is intercepted when the data storage table is determined to contain the index architecture. That is, after the server receives a data write request sent by the client and determines that the data storage table corresponding to the request contains the index architecture, it calls the coprocessor to intercept and rewrite the request.
And when the data storage table corresponding to the data writing request does not contain the index architecture, directly performing data storage operation according to the data writing request sent by the client.
As noted above, the HBase Coprocessor mechanism allows code that runs on the HBase server side to be written to extend its functionality. In the embodiment of the application, code that registers a data writing function is written on the server side in advance; after a data writing request sent by the client is intercepted, this function is called to rewrite the request.
That is, in step S302, a pre-registered data writing function is called to rewrite the data writing request, the data writing function defining generation of source data and at least one index data by a predetermined format.
In practical applications, after receiving a data write request, a server needs to construct a put (write) object as shown in fig. 5. If the put object is constructed directly according to the received data write request, the source data is generated. In the embodiment of the application, the data writing request is modified through the data writing function, so that the rewritten put object can be constructed according to the rewritten data writing request, and the rewritten put object can generate the source data and the at least one index data.
Because the data writing function defines that the source data and the at least one index data are generated through a preset format, the generated source data and the at least one index data can be written into one data storage table in the step S303, so that the functions of primary indexing and secondary indexing can be realized, data management under the conditions of data reading and writing, data migration and the like is facilitated, the consistency of the data is ensured not to be damaged, and the reliability of data storage is effectively improved.
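The interception and rewriting flow of this embodiment can be sketched as a small registry of write functions. The sketch is plain Python and does not use the HBase coprocessor API; the registry, the decorator, the table names, and the dictionary layout of the put object are all hypothetical. As an assumption of the sketch, a table is treated as containing an index architecture when a write function has been registered for it.

```python
WRITE_FUNCTIONS = {}  # hypothetical registry: table name -> rewrite function


def register_write_function(table_name):
    """Decorator that pre-registers a data write function for one table."""
    def decorator(func):
        WRITE_FUNCTIONS[table_name] = func
        return func
    return decorator


@register_write_function("indexed_table")
def build_source_and_index(put):
    """Rewrite a put so that it yields the source data plus one index entry."""
    rowkey, columns = put["rowkey"], put["columns"]
    return {
        "source": [(rowkey, dict(columns))],
        "index": [(columns["col2"], rowkey)],
    }


def handle_write(table_name, put):
    """Intercept the put only when the table declares an index architecture."""
    rewrite = WRITE_FUNCTIONS.get(table_name)
    if rewrite is None:
        # No index architecture: store the source data as requested.
        return {"source": [(put["rowkey"], dict(put["columns"]))], "index": []}
    return rewrite(put)


put = {"rowkey": "RK1", "columns": {"col1": "c11", "col2": "c21"}}
indexed = handle_write("indexed_table", put)
plain = handle_write("plain_table", put)
```

The same put produces index data only for the table that has a registered write function; for the other table it passes through unchanged as source data.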
Example three
Based on the technical solutions provided by the above embodiments, a possible implementation manner is provided below for generating the source data and the at least one index data by a predetermined format.
Continuing with fig. 4, in practical applications an HBase-based server may have multiple partitions (Regions), and different groups of data to be stored by the user may fall into different partitions. For example, in fig. 4, the data corresponding to row keys RK1 and RK2 are stored in Region1, and the data corresponding to row keys RK3 and RK4 are stored in Region2.
In the embodiment of the application, the data writing request received from the client carries the target partition into which the data is to be written. Accordingly, for any index data, step S302 includes step S3021 and step S3022, wherein,
step S3021: determining a starting row identifier of a target partition, and determining a column family name corresponding to a column value;
Taking the generation of index data based on column cf1:col2 in fig. 4 as an example:
for column value c21, the starting row identifier of the target partition Region1 is Region1.startkey, and the column family name is INDEX, as illustrated in fig. 4;
for column value c22, the starting row identifier of the target partition Region1 is Region1.startkey, and the column family name is INDEX, as illustrated in fig. 4;
for column value c23, the starting row identifier of the target partition Region2 is Region2.startkey, and the column family name is INDEX, as illustrated in fig. 4;
for column value c24, the starting row identifier of the target partition Region2 is Region2.startkey, and the column family name is INDEX, as illustrated in fig. 4.
It will be appreciated that the column family name INDEX corresponds to the name of the index data sub-table in the data storage table, which makes it easier to write the data into the correct sub-table.
Step S3022: and generating any index data according to the preset splicing sequence according to the initial row identification, the column family name, the column value and the row key.
In the above example, the index data corresponding to RK1 is generated as Region1.startkey + INDEX + c21 + RK1 according to the predetermined splicing order. The index data corresponding to RK2 through RK4 are generated in the same way and are not described again here. The predetermined splicing order may be the order in which the parameters are listed above or another order; those skilled in the art may set it according to actual needs, and the embodiment of the present application does not limit it.
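The splicing in step S3022 can be sketched as a simple byte concatenation. The separator byte, the function name, and the `order` parameter are assumptions of this sketch; the source specifies only the four parts and that their order is predetermined.

```python
SEPARATOR = b"\x00"  # hypothetical delimiter; the source does not specify one


def build_index_key(region_start_key, family, column_value, rowkey,
                    order=("start", "family", "value", "rowkey")):
    """Concatenate the four parts in a predetermined splicing order."""
    parts = {
        "start": region_start_key,
        "family": family,
        "value": column_value,
        "rowkey": rowkey,
    }
    return SEPARATOR.join(parts[name] for name in order)


key_rk1 = build_index_key(b"Region1.startkey", b"INDEX", b"c21", b"RK1")
key_rk2 = build_index_key(b"Region1.startkey", b"INDEX", b"c22", b"RK2")
```

With the default order, keys that share a starting row identifier and column family name sort by column value, so `key_rk1` sorts before `key_rk2`, and the row key can be recovered from the last part of the key.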
It should be noted that, in combination with the first embodiment, if a data write request carries one group of data, for example the row key RK1 and the column values c11 and c21 corresponding to RK1, the index data generated based on column cf1:col2 includes the index data corresponding to RK1. If the data write request carries two or more groups of data, for example the row keys RK1 through RK4 and their corresponding column values, the index data generated based on column cf1:col2 includes the index data corresponding to each of RK1 through RK4.
In the embodiment of the present application, the generated source data is as shown in the cf1 box in fig. 4. Similarly, in combination with the first embodiment, if a data write request carries one group of data, for example the row key RK1 and the column values c11 and c21 corresponding to RK1, the generated source data includes the source data corresponding to RK1. If the data write request carries two or more groups of data, for example the row keys RK1 through RK4 and their corresponding column values, the generated source data includes the source data corresponding to each of RK1 through RK4.
For step S303, at least one generated index data and source data corresponding to the column family need to be written into the data storage table of the target partition according to the corresponding target partition, and the index data and source data corresponding to one group of data fall into the same table of the same Region, thereby avoiding the inconvenience of maintaining two tables in the prior art.
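One plausible reading of why the starting row identifier leads the index data is that it keeps each index row inside the key range of the Region that holds its source row. The sketch below illustrates that reading only; the region layout, the start keys, and the zero-byte separator are hypothetical and are not taken from the source.

```python
from bisect import bisect_right

# Hypothetical region layout: sorted start keys, first region starts at b"".
REGION_START_KEYS = [b"", b"RK3"]
REGION_NAMES = ["Region1", "Region2"]


def locate_region(key):
    """Return the region whose key range contains the given row."""
    return REGION_NAMES[bisect_right(REGION_START_KEYS, key) - 1]


def index_key(rowkey, column_value):
    """Prefix the index row with the start key of the source row's region."""
    start = REGION_START_KEYS[bisect_right(REGION_START_KEYS, rowkey) - 1]
    return start + b"\x00INDEX\x00" + column_value + b"\x00" + rowkey


# For each source row: (region of the source row, region of its index row).
placements = {
    rowkey: (locate_region(rowkey), locate_region(index_key(rowkey, value)))
    for rowkey, value in [(b"RK1", b"c21"), (b"RK2", b"c22"),
                          (b"RK3", b"c23"), (b"RK4", b"c24")]
}
```

Under these assumptions every index row lands in the same Region as its source row, matching the layout of fig. 4, where RK1 and RK2 belong to Region1 and RK3 and RK4 belong to Region2.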
Example four
Based on the technical solutions provided by the above embodiments, a possible implementation manner is also provided, after step S301 and step S302 are implemented by any of the above embodiments, in step S303, step S3031 and step S3032 are included, wherein,
step S3031: and recording the write operation of the at least one index data and the source data to a log file.
The log file HLog is an important mechanism by which HBase ensures data reliability. Internally, HLog is a simple sequential log, and all Regions on a server share one HLog. The write operations on the at least one index data and the source data are recorded in the HLog so that the data can be recovered as far as possible if the server crashes unexpectedly.
Step S3032: and merging and writing at least one index data and the source data into a sequencing memory buffer area of the target partition.
The sorted memory buffer, MemStore, is a very important component of HBase and a key link in achieving high-performance reads and writes. As shown in fig. 5, the at least one index data and the source data are merged and written into the MemStore of the corresponding target partition; once the buffered data reaches a specified size, it is written to disk in a batch, that is, written into the data storage table, which can greatly improve the write performance of HBase. Before writing the data to disk, the MemStore sorts the at least one index data and the source data, each separately.
The MemStore and the HLog are used for processing at least one index data and source data together, so that the index data become a part of the source data, the at least one index data and the source data are written into the same data storage table, the functions of primary index and secondary index can be realized, data management under the conditions of data reading and writing, data migration and the like is facilitated, the consistency of the data is guaranteed not to be damaged, and the reliability of data storage is effectively improved.
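Steps S3031 and S3032 can be sketched with a toy region that logs each merged batch before buffering it. The class, its attribute names, and the entry-count flush threshold are assumptions of this sketch; the real HLog and MemStore are considerably more involved.

```python
class Region:
    """Toy region with a write-ahead log and a sorted in-memory buffer."""

    def __init__(self, flush_threshold):
        self.log = []            # stands in for HLog: append-only write record
        self.memstore = {}       # stands in for MemStore: key -> value
        self.flushed_files = []  # each flush produces one sorted "file"
        self.flush_threshold = flush_threshold

    def write(self, source_entries, index_entries):
        """Log the merged batch first, then buffer it, then flush if full."""
        batch = list(source_entries) + list(index_entries)
        self.log.append(batch)
        self.memstore.update(batch)
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Sort buffered entries by key and write them out as one file."""
        self.flushed_files.append(sorted(self.memstore.items()))
        self.memstore.clear()


region = Region(flush_threshold=4)
region.write([(("cf1", "RK2"), "c12,c22")], [(("INDEX", "c22"), "RK2")])
region.write([(("cf1", "RK1"), "c11,c21")], [(("INDEX", "c21"), "RK1")])
```

Because each key starts with its sub-table name, sorting at flush time orders the index entries and the source entries separately within one file, so both kinds of data travel together when that file is migrated.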
EXAMPLE five
Based on the technical solutions provided in the foregoing embodiments, this embodiment further provides a possible implementation in which, after step S301, step S302, and step S303 are carried out as in any of the foregoing embodiments, the method further includes, for any index data:
generating deserialization information for the storage location of the index data according to the stored serialization information of the index data, and writing the deserialization information into the data storage table.
As shown in fig. 4, corresponding deserialization information is stored for the index data of each group of data, where the deserialization information may be represented as the length of the index data corresponding to each group of data. From the deserialization information, the offset position of the index data corresponding to each group of data can therefore be determined. When a secondary-index query is performed on the data storage table, the index data of each group can be scanned separately according to its offset position, so that the query result is obtained quickly.
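As a rough illustration of how stored lengths yield offsets, the sketch below accumulates per-group lengths into (start, end) ranges and uses a range to scan only one group. It assumes, for simplicity, that each length counts index entries in a flat list; the function names are hypothetical and do not come from the application.

```python
from itertools import accumulate
from typing import List, Tuple


def offsets_from_lengths(lengths: List[int]) -> List[Tuple[int, int]]:
    """Turn the stored length of each group's index data into (start, end) offsets."""
    ends = list(accumulate(lengths))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))


def scan_group(index_rows: List[bytes], lengths: List[int], group: int) -> List[bytes]:
    """Scan only the index rows belonging to one group, located by its offsets."""
    start, end = offsets_from_lengths(lengths)[group]
    return index_rows[start:end]
```

With lengths `[3, 2, 4]`, the groups occupy the ranges (0, 3), (3, 5), and (5, 9), so a query against the second group reads two entries instead of all nine.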
EXAMPLE six
Based on the technical solutions provided by the above embodiments, the embodiment of the present application further provides a possible implementation which, as shown in fig. 6, includes the following steps:
Step S601: when a column value carried in the data writing request is text information, rewrite the data writing request, and generate full-text retrieval data based on the rewritten data writing request.
In the embodiment of the present application, this step may be performed after step S301 shown in fig. 3, and may be performed together with step S302.
Specifically, some column values are of a text type rather than a type such as int, long, float, or string, for example the column values in column families such as address or title. After intercepting a data write request sent by a client, the server determines the type of each column value and rewrites the data write request according to the result.
When it is determined that a certain column value is not text information, index data may be generated from the non-text column value and the row key based on the rewritten data write request.
When it is determined that a certain column value is text information, full-text retrieval data can be generated from the text information and the row key based on the rewritten data write request.
Generating the full-text retrieval data requires word segmentation of the text information: the text is divided into a series of independent vocabulary units, and each vocabulary unit is associated with the row key. In practical applications, this process can be implemented with Lucene.
Further, based on the rewritten data write request, source data is generated from each column value and the row key.
A rewritten put object is constructed according to the rewritten data writing request, and the rewritten put object can generate the source data, the at least one index data, and the at least one full-text retrieval data.
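The word-segmentation step can be sketched as follows. This is only an illustration under stated assumptions: a simple regular-expression tokenizer stands in for Lucene's analyzers, and the function name `full_text_entries` and the (vocabulary unit, row key) pair layout are hypothetical.

```python
import re
from typing import List, Tuple


def full_text_entries(row_key: str, text: str) -> List[Tuple[str, str]]:
    """Split text into vocabulary units and associate each unit with the row key."""
    # Assumed tokenizer: lower-case the text and keep runs of word characters.
    tokens = re.findall(r"\w+", text.lower())
    # Keep one entry per distinct unit, sorted by text as the sub-table would be.
    return [(token, row_key) for token in sorted(set(tokens))]
```

For the input `full_text_entries("row1", "Data storage, data table")`, the result is `[("data", "row1"), ("storage", "row1"), ("table", "row1")]`. A real analyzer would typically also handle stop words, stemming, and languages without whitespace word boundaries, none of which this sketch attempts.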
Step S602: write the full-text retrieval data into the data storage table, wherein the data storage table further comprises a full-text retrieval data sub-table sorted by text.
In the embodiment of the present application, this step may be performed together with step S303.
Specifically, writing the full-text retrieval data into the data storage table enables fuzzy indexing of text-type values during secondary indexing and improves data query efficiency.
In practical applications, the source data, the at least one index data, and the at least one full-text retrieval data are written into the same data storage table. This provides the primary index, secondary index, and fuzzy index functions, facilitates data management during data reads and writes, data migration, and similar operations, ensures that data consistency is not damaged, and effectively improves the reliability of data storage.
EXAMPLE seven
The embodiment of the present application provides a partition server, and as shown in fig. 7, the partition server 70 may include: an interception module 701, a generation module 702, and a writing module 703, wherein,
the intercepting module 701 is configured to intercept a data write request sent by a client, where the data write request carries a row key and at least one column value corresponding to the row key;
the generating module 702 is configured to rewrite the data write request, and generate source data and at least one index data based on the rewritten data write request;
the writing module 703 is configured to write the at least one index data and the source data into a data storage table, where the data storage table includes at least one index data sub-table sorted by column value and a source data sub-table sorted by row key.
Optionally, the intercepting module 701 is specifically configured to intercept a data write request sent by a client when it is determined that the data storage table includes the index architecture.
Optionally, the generating module 702 is specifically configured to invoke a pre-registered data writing function to rewrite the data writing request, where the data writing function defines that the source data and the at least one index data are generated in a predetermined format.
Optionally, the data write request carries a target partition to which data is to be written, and the generating module 702 is specifically configured to determine a start row identifier of the target partition and determine the column family name corresponding to a column value, and to generate any index data by splicing the start row identifier, the column family name, the column value, and the row key in a preset order.
Optionally, the data write request carries a target partition to which data is to be written, and the writing module 703 is specifically configured to record the write operation of the at least one index data and the source data in a log file, and to merge and write the at least one index data and the source data into the sorted memory buffer of the target partition.
Optionally, the writing module 703 is further specifically configured to generate deserialization information of the storage location of the index data according to the stored serialization information of the index data, and write the deserialization information into the data storage table.
Optionally, the generating module 702 is further specifically configured to, when the column value carried in the data writing request is text information, rewrite the data writing request, and generate full-text retrieval data based on the rewritten data writing request;
the writing module 703 is further specifically configured to write the full-text search data into a data storage table, where the data storage table further includes a full-text search data sub-table sorted according to text.
The partition server provided in the embodiment of the present application has the same implementation principle and technical effect as the foregoing method embodiments. For brevity, anything not mentioned in this embodiment can be found in the corresponding content of the foregoing method embodiments and is not repeated here.
The partition server provided by the embodiment of the application intercepts the data writing request sent by the client and rewrites the data writing request so as to generate the source data and at least one index data based on the rewritten data writing request, and writes the source data and the index data into the data storage table, so that the source data and the index data are stored in one table, and while the primary index function and the secondary index function can be realized through the source data and the index data, the data management under the conditions of data reading and writing, data migration and the like is facilitated, the consistency of the data is ensured not to be damaged, and the reliability of data storage is effectively improved.
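The division of work between the generating module and the writing module can be sketched as below. This is a simplified illustration, not the partition server's implementation: the `WriteRequest` type, the `|` separator, the one-value-per-column-family simplification, and the dictionary-based table are assumptions introduced here. Only the splicing order (start row identifier, column family name, column value, row key) follows the description above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Source = Tuple[str, Dict[str, str]]


@dataclass
class WriteRequest:
    row_key: str
    columns: Dict[str, str]   # column family name -> column value (simplified)
    region_start_row: str     # start row identifier of the target partition


def generate(request: WriteRequest, sep: str = "|") -> Tuple[Source, List[str]]:
    """Generating module: derive source data and one index key per column value."""
    source = (request.row_key, dict(request.columns))
    index_keys = [
        sep.join([request.region_start_row, family, value, request.row_key])
        for family, value in request.columns.items()
    ]
    return source, index_keys


def write(table: Dict[str, list], source: Source, index_keys: List[str]) -> None:
    """Writing module: store both kinds of data in the same table, each kept sorted."""
    table.setdefault("source", []).append(source)
    table["source"].sort(key=lambda item: item[0])  # source sub-table: by row key
    table.setdefault("index", []).extend(index_keys)
    table["index"].sort()  # index sub-table: by spliced key, hence by column value
```

Because every index key in a partition shares the same start row identifier, sorting the spliced keys orders entries of one column family by column value, which is what allows a secondary-index lookup to be served by a range scan over the index sub-table.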
EXAMPLE eight
An embodiment of the present application further provides an electronic device. As shown in fig. 8, the electronic device 80 includes a processor 801 and a memory 802, the memory 802 storing at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by the processor 801 to implement the corresponding content in the aforementioned method embodiments. Optionally, the electronic device 80 may further comprise a transceiver 803. The processor 801 is coupled to the transceiver 803, for example via a bus 804. It should be noted that, in practical applications, the transceiver 803 is not limited to one, and the structure of the electronic device 80 does not constitute a limitation on the embodiment of the present application.
The processor 801 may be a CPU, a general purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor 801 may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
Bus 804 may include a path that transfers information between the above components. The bus 804 may be a PCI bus or an EISA bus, etc. The bus 804 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 8, but this is not intended to represent only one bus or type of bus.
The memory 802 may be, but is not limited to, a ROM or other type of static storage device that can store static information and instructions, a RAM or other type of dynamic storage device that can store information and instructions, an EEPROM, a CD-ROM or other optical disk storage (including compact disk, laser disk, digital versatile disk, blu-ray disk, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
The embodiment of the present application also provides a computer-readable storage medium for storing computer instructions, which when run on a computer, enable the computer to execute the corresponding content in the foregoing method embodiments.
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily completed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The foregoing describes only some embodiments of the present application. It should be noted that those skilled in the art can make several improvements and refinements without departing from the principle of the present application, and these improvements and refinements should also be regarded as falling within the protection scope of the present application.

Claims (10)

1. A method for writing data to a data storage table, the method comprising:
intercepting a data writing request sent by a client, wherein the data writing request carries a row key and at least one column value corresponding to the row key;
rewriting the data writing request, and generating source data and at least one index data based on the rewritten data writing request;
and writing the at least one index data and the source data into a data storage table, wherein the data storage table comprises at least one index data sub-table sorted according to column values and a source data sub-table sorted according to row keys.
2. The data writing method according to claim 1, wherein the intercepting a data writing request sent by a client comprises:
and intercepting a data writing request sent by a client when the data storage table is judged to contain the index architecture.
3. The data writing method according to claim 1, wherein the rewriting the data writing request includes:
invoking a pre-registered data write function to rewrite the data write request, the data write function defining generation of the source data and the at least one index data in a predetermined format.
4. The data writing method according to claim 1, wherein the data writing request carries a target partition to which data is to be written, and generating any index data based on the rewritten data writing request comprises:
determining a start row identifier of the target partition, and determining a column family name corresponding to a column value;
and generating the index data by splicing the start row identifier, the column family name, the column value and the row key in a preset order.
5. The data writing method according to claim 1, wherein the data writing request carries a target partition to which data is to be written, and the writing the at least one index data and the source data into a data storage table includes:
recording a write operation of the at least one index data and the source data to a log file;
and merging and writing the at least one index data and the source data into a sorted memory buffer of the target partition.
6. The data writing method according to claim 1, wherein after writing the at least one index data and the source data into a data storage table, further comprising:
and generating deserialization information of the storage position of the index data according to the stored serialization information of the index data, and writing the deserialization information into a data storage table.
7. The data writing method according to any one of claims 1 to 6, characterized in that the method further comprises:
when the column value carried in the data writing request is text information, rewriting the data writing request, and generating full-text retrieval data based on the rewritten data writing request;
and writing the full-text retrieval data into a data storage table, wherein the data storage table also comprises a full-text retrieval data sub-table which is sorted according to texts.
8. A partition server, comprising:
an interception module, configured to intercept a data writing request sent by a client, wherein the data writing request carries a row key and at least one column value corresponding to the row key;
a generating module, configured to rewrite the data writing request and generate source data and at least one index data based on the rewritten data writing request;
and a writing module, configured to write the at least one index data and the source data into a data storage table, wherein the data storage table comprises at least one index data sub-table sorted according to column values and a source data sub-table sorted according to row keys.
9. An electronic device, comprising:
a processor and a memory, the memory storing at least one instruction, at least one program, a set of codes, or a set of instructions, the at least one instruction, the at least one program, the set of codes, or the set of instructions being loaded and executed by the processor to implement the data writing method of any of claims 1-7.
10. A computer-readable storage medium for storing a computer instruction, a program, a set of codes, or a set of instructions which, when run on a computer, causes the computer to perform the data writing method of any one of claims 1-7.
CN201811058603.1A 2018-09-11 2018-09-11 Data writing method of data storage table, partition server and electronic equipment Pending CN110895531A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811058603.1A CN110895531A (en) 2018-09-11 2018-09-11 Data writing method of data storage table, partition server and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811058603.1A CN110895531A (en) 2018-09-11 2018-09-11 Data writing method of data storage table, partition server and electronic equipment

Publications (1)

Publication Number Publication Date
CN110895531A true CN110895531A (en) 2020-03-20

Family

ID=69784969

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811058603.1A Pending CN110895531A (en) 2018-09-11 2018-09-11 Data writing method of data storage table, partition server and electronic equipment

Country Status (1)

Country Link
CN (1) CN110895531A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130132397A1 (en) * 2011-11-18 2013-05-23 Nokia Corporation Methods, apparatuses and computer program products for generating indexes using a journal in a key value memory device
CN106202207A (en) * 2016-06-28 2016-12-07 中国电子科技集团公司第二十八研究所 A kind of index based on HBase ORM and searching system
CN106294814A (en) * 2016-08-16 2017-01-04 上海欣方软件有限公司 HBase secondary index based on memory database builds and the device and method of inquiry
CN106383860A (en) * 2016-08-31 2017-02-08 无锡雅座在线科技发展有限公司 Data query method and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张威: "环境空气质量监测大数据非侵入式二级索引的研究", 《中国优秀硕士论文全文数据库信息科技辑》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200320
