CN108984719B - Data deleting method and device based on column storage, server and storage medium - Google Patents

Publication number
CN108984719B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810749906.1A
Other languages
Chinese (zh)
Other versions
CN108984719A (en
Inventor
郭琰
王攀
周智伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Dameng Database Co Ltd
Original Assignee
Shanghai Dameng Database Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Dameng Database Co Ltd
Priority to CN201810749906.1A
Publication of CN108984719A
Application granted
Publication of CN108984719B

Abstract

The invention discloses a data deleting method, device, server and storage medium based on column storage, relating to the field of databases. The method comprises the following steps: acquiring data to be deleted of a column storage table; determining the data area in the column storage table where the data to be deleted is located; according to the data area, recording the row number of the data to be deleted in a deletion auxiliary table corresponding to the column storage table; and determining statistical information of the data area after the data to be deleted is deleted, and modifying the records corresponding to the statistical information in the column storage auxiliary table corresponding to the column storage table. By adopting this technical scheme, the invention improves the efficiency of deleting data held in column storage.

Description

Data deleting method and device based on column storage, server and storage medium
Technical Field
The embodiment of the invention relates to the technical field of databases, in particular to a data deleting method and device based on column storage, a server and a storage medium.
Background
With the continuous development of big data technology, the amount of data held in databases has increased sharply, and query performance under the traditional row storage mode is challenged.
Currently, to improve the performance of database query, column storage, which is a different storage method from the conventional row storage, is considered. The column storage technique is to store a data table in units of columns, and store data of the same column in one data file or in a plurality of files according to the data size.
The column storage method can improve the data query performance, but when deleting data, the data deletion performance is lower than that of the row storage method because the deletion of a row of data requires the positioning and deletion of each column of data which is stored separately.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method, an apparatus, a server, and a storage medium for deleting data based on a column storage, so as to improve the efficiency of deleting data from a column storage table.
In a first aspect, an embodiment of the present invention provides a data deletion method based on column storage, where the method includes:
acquiring data to be deleted of a column storage table;
determining the data area in the column storage table where the data to be deleted is located;
according to the data area, recording the row number of the data to be deleted in a deletion auxiliary table corresponding to the column storage table;
and determining statistical information of the data area after the data to be deleted is deleted, and modifying records corresponding to the statistical information in the column storage auxiliary table corresponding to the column storage table.
In a second aspect, an embodiment of the present invention further provides a data deleting device based on column storage, where the device includes:
the data acquisition module is used for acquiring data to be deleted of the column storage table;
the data area determining module is used for determining the data area in the column storage table where the data to be deleted is located;
a row number recording module, configured to record, according to the data area, a row number of the data to be deleted in a deletion auxiliary table corresponding to the column storage table;
and the record modification module is used for determining the statistical information of the data area after the data to be deleted is deleted, and modifying the record corresponding to the statistical information in the column storage auxiliary table corresponding to the column storage table.
In a third aspect, an embodiment of the present invention further provides a server, including:
one or more processors;
storage means for storing one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors implement the data deletion method based on the column storage according to any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the data deletion method based on column storage according to any embodiment of the present invention.
According to the invention, the deletion auxiliary table is used for recording the data to be deleted in the column storage table, so that the deletion of the column storage table is converted into the insertion of the deletion auxiliary table, the data file is prevented from being read and written, the problem of low data deletion efficiency is solved, and the deletion efficiency of the data based on column storage is improved.
Drawings
Fig. 1 is a flowchart of a data deletion method based on column storage according to a first embodiment of the present invention;
Fig. 2 is a flowchart of a data deletion method based on column storage according to a second embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a data deleting device based on column storage according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a server according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some but not all of the relevant aspects of the present invention are shown in the drawings.
Example one
Fig. 1 is a flowchart of a data deletion method based on column storage according to a first embodiment of the present invention. This embodiment is applicable to deleting data held in column storage. The method may be executed by a data deletion apparatus based on column storage; the apparatus may be implemented in software and/or hardware and may generally be integrated in a server. As shown in Fig. 1, the technical solution provided in this embodiment is as follows:
Step 110, acquiring data to be deleted of the column storage table.
Optionally, the data to be deleted in the column storage table may be obtained by first obtaining a deletion statement in the column storage table, and determining the data to be deleted according to the deletion statement.
Illustratively, the deletion statement "delete from t where school number = '110628'" identifies the student record in table t whose school number is '110628'; that record is the data to be deleted.
Column storage stores data by column, and its specific storage rules and management implementation directly determine the operating efficiency of the column storage table. The data to be deleted is the data that a user wants to delete, and it is stored in the corresponding column storage table.
The data to be deleted in the embodiment of the present application may be any data object that needs to be deleted, and may be, for example, a student data object, where the student data object generally includes fields of name, gender, school number, age, grade, and the like of a student, or a teacher data object, where the teacher data object generally includes fields of name, age, teaching age, salary, and the like.
Step 120, determining a data area in the column storage table where the data to be deleted is located.
Each column of data is stored in partitions of a predetermined number of rows. Such a partition is referred to as a data area, and the predetermined number of rows is referred to as the area size. The data in the same data area is stored in the same data file, and one data file can store one or more data areas.
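The partitioning described above can be illustrated with a minimal sketch. It assumes that row numbers start at 0 and that every data area holds exactly `area_size` rows; both are illustrative assumptions, since the embodiment locates areas through the column storage auxiliary table. The function name `locate_area` is hypothetical.

```python
def locate_area(row_id: int, area_size: int) -> tuple[int, int]:
    """Return (area number, position of the row inside that area).

    Assumes 0-based row numbers and fixed-size areas; both are
    illustrative assumptions, not details taken from the embodiment.
    """
    if area_size <= 0:
        raise ValueError("area_size must be positive")
    if row_id < 0:
        raise ValueError("row_id must be non-negative")
    # divmod gives the quotient (area number) and remainder (position).
    return divmod(row_id, area_size)
```

With an area size of 1000, for instance, row 2500 would fall in area 2 at position 500 under these assumptions.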
In this embodiment, a column storage rule for storing data in a partitioned manner is provided, where the same column of data is directly stored in one data file or stored in a plurality of data files according to the data size, and is supplemented with a column storage auxiliary table.
The column storage auxiliary table is used for recording control information such as offset addresses and data lengths of each area of each column in the data file, and statistical information such as maximum values and minimum values of column values stored in each area.
Optionally, the column storage auxiliary table has the following structure:
Table 1: Column storage auxiliary table structure
[Table 1 is an image in the original publication and is not reproduced here.]
The fields of the column storage auxiliary table are explained as follows:
1) column number: the sequence number of the column in the table definition given when the table was created;
2) area number: different data areas have different numbers, and the number corresponding to a data area is its area number;
3) file number: the number of the data file in which the data area is stored;
4) offset in file: the position at which the data area starts within its data file. For example, if three data areas are stored in the same data file, the offset in the file of the first data area is 0, the offset of the second data area is the data space occupied by the first data area, and the offset of the third data area is the data space occupied by the first and second data areas;
5) area size: the total number of rows of data that the data area can store, preset by the user;
6) number of valid data rows in the area: the number of rows remaining in the data area once deleted rows are excluded;
7) size of the space occupied by the data: the number of bytes occupied by the stored data;
8) number of rows containing NULL values: the number of rows in the data area whose value is NULL;
9) number of rows with mutually distinct values: the number of rows in the data area whose values differ from one another;
10) maximum value in the area: the maximum data value in the data area;
11) minimum value in the area: the minimum data value in the data area;
12) sum of all values in the area: the sum of all data values in the data area.
Wherein, the column number, the area number, the file number, the size of the occupied space of the data and the offset in the file in the column storage auxiliary table are control information; the maximum value in the area, the minimum value in the area, the sum of all values in the area, the area size, the number of lines of effective data in the area, the number of lines of included null values and the number of lines of all data which are different from each other are statistical information.
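As a rough illustration of the record layout just described, the following sketch models one record of a column storage auxiliary table and reproduces the offset rule from item 4. All field names, the integer value type, and the helper `offsets_in_file` are hypothetical; the actual column names and types appear only in the Table 1 image.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AreaRecord:
    """One record of a column storage auxiliary table (hypothetical names)."""
    # Control information
    col_id: int
    area_no: int
    file_no: int
    file_offset: int
    data_bytes: int
    # Statistical information
    area_size: int
    valid_rows: int
    null_rows: int
    distinct_rows: int
    max_value: Optional[int]
    min_value: Optional[int]
    sum_value: int


def offsets_in_file(area_bytes: list[int]) -> list[int]:
    """Offset of each area when areas are stored back to back in one file."""
    offsets, position = [], 0
    for size in area_bytes:
        offsets.append(position)  # an area starts where the previous one ended
        position += size
    return offsets
```

For three areas occupying 100, 250 and 80 bytes, `offsets_in_file([100, 250, 80])` gives 0, 100 and 350, matching the rule that each offset equals the space occupied by the preceding areas.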
Because data in the column storage table is stored by area, once the data to be deleted has been obtained, the data file and the data areas in which it is located are determined from the row numbers of the data to be deleted together with the column number, area number, file number, and number of valid data rows per area recorded in the column storage auxiliary table.
Step 130, recording the row number of the data to be deleted in the deletion auxiliary table corresponding to the column storage table according to the data area.
The deletion auxiliary table is created at the same time as the column storage table. Because each data area holds data with different row numbers, once the data area containing the data to be deleted has been determined, the row numbers of that data are recorded, area by area, in the deletion auxiliary table corresponding to the column storage table. When the data to be deleted spans several data areas, the areas are processed in sequence.
In this embodiment, the deletion auxiliary table is used to record the deleted data in each data area according to the data area, and specifically, the deletion auxiliary table may record the row number of the data to be deleted.
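The idea of turning a deletion into an insertion can be sketched as follows. The class `DeletionAuxTable` is a hypothetical in-memory stand-in, and it assumes fixed-size areas with 0-based row numbers; a real implementation would persist these records in a table.

```python
from collections import defaultdict


class DeletionAuxTable:
    """In-memory stand-in for a deletion auxiliary table (illustrative only)."""

    def __init__(self, area_size: int) -> None:
        self.area_size = area_size
        # area number -> set of deleted row numbers in that area
        self.deleted_by_area: dict[int, set[int]] = defaultdict(set)

    def record_delete(self, row_ids: list[int]) -> None:
        """Record row numbers area by area; no data file is touched."""
        for row_id in sorted(row_ids):
            area_no = row_id // self.area_size  # assumes fixed-size areas
            self.deleted_by_area[area_no].add(row_id)

    def is_deleted(self, row_id: int) -> bool:
        """Report whether a row number has been recorded as deleted."""
        area_no = row_id // self.area_size
        return row_id in self.deleted_by_area.get(area_no, set())
```

A reader of the column data could consult `is_deleted` to skip rows that are still physically present in the data file.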
Step 140, determining the statistical information of the data area after deleting the data to be deleted, and modifying the record corresponding to the statistical information in the column storage auxiliary table corresponding to the column storage table.
After the data to be deleted is deleted, the statistical information of the data area in which it was located changes accordingly, for example the number of rows with mutually distinct values and the sum of all values in the area. The corresponding statistical information in the column storage auxiliary table therefore needs to be modified. After the data area containing the data to be deleted is determined, the data of that area is read, the statistical information of the area after the deletion is calculated, and the corresponding record in the column storage auxiliary table is modified. For example, if the sum of all values in the area changes after the deletion, the corresponding record in the column storage auxiliary table is modified accordingly.
It should be noted that after many insert, delete, and update operations on the column storage table, the auxiliary tables grow. Therefore, optionally, when the system is idle or the column storage table is not in use, data reforming is performed on the column storage table: the data in the insertion auxiliary table, the deletion auxiliary table, and the update auxiliary table is written into the data files, and those three auxiliary tables are then emptied. This keeps the deletion auxiliary table from growing too large and improves query efficiency.
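A minimal sketch of recomputing an area's statistics after a deletion is shown below. It assumes integer column values, represents NULL as `None`, and excludes NULL from the distinct count, maximum, minimum and sum; these choices and the function name `area_statistics` are assumptions made for illustration.

```python
from typing import Optional


def area_statistics(values: list[Optional[int]], deleted: set[int]) -> dict:
    """Recompute statistics for one data area, skipping deleted positions.

    `values` holds the area's column values (None stands for NULL) and
    `deleted` holds positions inside the area that are marked deleted.
    """
    live = [v for i, v in enumerate(values) if i not in deleted]
    non_null = [v for v in live if v is not None]
    return {
        "valid_rows": len(live),
        "null_rows": len(live) - len(non_null),
        "distinct_rows": len(set(non_null)),
        "max_value": max(non_null) if non_null else None,
        "min_value": min(non_null) if non_null else None,
        "sum_value": sum(non_null),
    }
```

The returned dictionary could then be written back to the area's record in the column storage auxiliary table.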
The update auxiliary table is used for recording the update data in each data area in the column storage table, wherein one update record comprises a row number, a column number and an updated value. The table structure for updating the auxiliary table in this embodiment is as follows:
table 2: updating auxiliary table structure
Column name Type (B) Description of the invention
COLID SMALLINT Updated column number
DTA_ROWID BIGINT Updated row number
VALUE VARBINARY(8188) Updated value
The update auxiliary table is used for updating the column storage table data. When an update operation is performed on the column storage table, the row number, column number, and updated value of the updated data are recorded in the update auxiliary table.
When the data at certain row numbers in a data area is determined to be data to be deleted, any update records for those row numbers in the update auxiliary table are deleted, which reduces the space occupied by the data.
According to the technical scheme provided by the embodiment of the invention, when the data is deleted, the deletion auxiliary table is arranged, so that the data deletion operation in the column storage table is converted into the insertion operation of the deletion auxiliary table, and the data in the data file does not need to be frequently deleted, so that the problem of frequent reading and writing of the data file stored based on the column is avoided, and the data deletion efficiency is improved.
On the basis of the technical scheme, the method can also optionally comprise the following steps:
and at preset time, merging the corresponding data in the deletion auxiliary table into the data file corresponding to the column storage table.
The preset time can be set by the user as needed, for example to a time when the system is idle, such as 3 a.m. every day or 3 a.m. every weekday.
When the preset time is up, the corresponding data in the deletion auxiliary table is merged into the data file corresponding to the column storage table, namely the data in the data area corresponding to the data file is deleted, so that the problem of expansion of the data in the deletion auxiliary table can be avoided, the query result is prevented from being frequently corrected by using the deletion auxiliary table when the data is queried, and the query efficiency of the data is improved.
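The merge at the preset time can be sketched as below. The sketch keeps each area as a Python list and the deletion auxiliary table as a mapping from area number to deleted positions; both structures and the function names are hypothetical. How row numbers are maintained after rows are physically removed is not stated in this embodiment, so the sketch simply compacts each area.

```python
def merge_deletions(area_values: list, deleted_positions: set[int]) -> list:
    """Return the area's values with deleted positions physically removed."""
    return [v for i, v in enumerate(area_values) if i not in deleted_positions]


def reform(areas: dict[int, list], deletion_aux: dict[int, set[int]]) -> None:
    """Apply every pending deletion to its area, then empty the auxiliary table."""
    for area_no, positions in deletion_aux.items():
        if area_no in areas:
            areas[area_no] = merge_deletions(areas[area_no], positions)
    # Once merged, the pending records are no longer needed.
    deletion_aux.clear()
```

After `reform` runs, queries no longer need to consult the deletion records for the merged areas.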
On the basis of the technical scheme, the method can also optionally comprise the following steps:
and if the update auxiliary table has the update record of the line number, deleting the update record.
Because the data at that row number has been deleted, the update no longer needs to be applied to it, and deleting the update record reduces the space occupied by the data.
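As an illustration of this optional step, the sketch below drops update records whose row number has been deleted. Each record is modelled as a `(COLID, DTA_ROWID, VALUE)` tuple following Table 2; the list representation and the function name `purge_updates` are assumptions.

```python
def purge_updates(update_aux: list[tuple[int, int, bytes]],
                  deleted_rows: set[int]) -> list[tuple[int, int, bytes]]:
    """Drop update records whose row number has been deleted.

    Each record is (COLID, DTA_ROWID, VALUE), mirroring Table 2.
    """
    return [rec for rec in update_aux if rec[1] not in deleted_rows]
```

For example, deleting row 7 would remove every pending update for row 7 regardless of which column it touched.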
Example two
Fig. 2 is a flowchart of a method for deleting data based on column storage according to a second embodiment of the present invention. The present embodiment provides a preferred embodiment based on the above embodiments, and reference is made to the first embodiment for details that are not described in detail in the present embodiment. As shown in fig. 2, the method for deleting data based on column storage according to this embodiment includes the following steps:
Step 210: acquiring data to be deleted of the column storage table.
Step 220: determining the data area in the column storage table where the data to be deleted is located.
Step 230: acquiring the record of the column storage auxiliary table corresponding to the data area.
In this embodiment, after determining the data area of the column storage table where the data to be deleted is located, the record of the data area is found in the column storage auxiliary table. Illustratively, the record may be an area number, a data footprint size, a number of valid data lines in the area, a line number, and the like.
Step 240: if the data to be deleted comprises multiple consecutive rows of data, determining the starting row number and the corresponding number of rows of the data to be deleted according to the data area.
Since the deletion auxiliary table needs to record the line number of the data to be deleted, when the data to be deleted is continuous data of multiple lines, only the start line number of the data to be deleted and the corresponding number of deletion lines may be recorded. Therefore, when the data to be deleted are continuous multiple lines, the line number of the first line of the data to be deleted and the total line number of the data to be deleted in the data area are determined, so that the recording is more convenient, and the occupied space of the data is reduced.
The structure of the deletion auxiliary table can be predefined, that is, the content to be recorded and the specific type of the content are defined, and the table structure of the deletion auxiliary table in this embodiment is as follows:
Table 3: Deletion auxiliary table structure
Column name    Type      Description
START_ID       BIGINT    Starting row number
COUNT          INT       Number of rows deleted
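The merging of consecutive row numbers into a starting row number and a count can be sketched as follows for the row numbers of a single data area. The function name `to_runs` and the tuple representation of a (START_ID, COUNT) record are hypothetical.

```python
def to_runs(row_ids: list[int]) -> list[tuple[int, int]]:
    """Collapse row numbers into (START_ID, COUNT) pairs of consecutive rows."""
    runs: list[tuple[int, int]] = []
    for row_id in sorted(set(row_ids)):
        if runs and row_id == runs[-1][0] + runs[-1][1]:
            # The row directly follows the current run, so extend it.
            start, count = runs[-1]
            runs[-1] = (start, count + 1)
        else:
            # Otherwise start a new run of length one.
            runs.append((row_id, 1))
    return runs
```

Rows 5, 6, 7, 10, 12 and 13 collapse into three records, (5, 3), (10, 1) and (12, 2), instead of six single-row records.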
Step 250: judging whether the number of rows of data to be deleted in a data area is equal to the number of valid data rows recorded for that area in the column storage auxiliary table; if so, executing step 260, otherwise executing step 270.
Not all data in a data area is necessarily valid. When a deletion is performed, the data in the data area of the data file is not actually removed; instead, the row numbers of the deleted records are recorded in the deletion auxiliary table and the number of valid data rows in the column storage auxiliary table is updated. The number of valid data rows in the column storage auxiliary table is therefore the number of rows in the area that have not been deleted.
Because different deletion rules apply depending on how the number of rows to be deleted in a data area compares with the number of valid data rows recorded for that area in the column storage auxiliary table, this judgment must be made before the deletion is performed.
Step 260: deleting the corresponding records in the column storage auxiliary table, the deletion auxiliary table and the update auxiliary table, and releasing the storage space of the corresponding data area.
When the number of rows of the data to be deleted in one data area is equal to the number of rows of the effective data in the area in the column storage auxiliary table, it indicates that all records in the data area are to be deleted, and at this time, all corresponding records in the column storage auxiliary table, the deletion auxiliary table and the update auxiliary table are directly deleted. In addition, since the column storage auxiliary table records not only the statistical information of the data but also the control information of the data, deleting all the corresponding records in the column storage auxiliary table, the deletion auxiliary table and the update auxiliary table represents that the data area has been deleted, deleting all the data in the corresponding data area, and thus releasing the storage space of the data area. Optionally, the storage space of the data area may be reused when a new data area is subsequently inserted.
Step 270: inserting the starting row number and the corresponding number of rows into the deletion auxiliary table, then executing step 280.
When the number of rows of the data to be deleted in a data area is not equal to the number of valid data rows recorded for that area in the column storage auxiliary table, other data remains in the data area after the deletion, and the starting row number and corresponding number of rows of the data to be deleted need to be recorded in the deletion auxiliary table.
Step 280: determining statistical information of the data area after the data to be deleted is deleted, and modifying the records corresponding to the statistical information in the column storage auxiliary table corresponding to the column storage table.
According to the technical scheme provided by this embodiment of the invention, when the data to be deleted consists of multiple consecutive rows, the records can be merged, which reduces the storage space occupied. Preferably, the deletion rule is determined by comparing the number of rows to be deleted with the number of valid data rows recorded for the data area in the column storage auxiliary table. If the number of rows to be deleted in a data area is not equal to the number of valid data rows recorded for that area, the starting row number and the corresponding number of rows are inserted into the deletion auxiliary table. Otherwise, the corresponding records in the column storage auxiliary table, the deletion auxiliary table and the update auxiliary table are deleted, and the storage space of the corresponding data area is released. This avoids frequent reading and writing of the data file, reduces IO, and improves data deletion efficiency.
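The branch between steps 260 and 270 can be sketched as follows. The dictionaries standing in for the column storage auxiliary table and the deletion auxiliary table, and the function name `delete_in_area`, are hypothetical; of the statistics updated in step 280, only the valid row count is modelled here, and the update auxiliary table is omitted for brevity.

```python
def delete_in_area(area_no: int, start_id: int, count: int,
                   column_aux: dict[int, dict[str, int]],
                   deletion_aux: dict[int, list[tuple[int, int]]]) -> str:
    """Apply steps 250 to 280 to one data area; return the branch taken."""
    record = column_aux[area_no]
    if count > record["valid_rows"]:
        raise ValueError("cannot delete more rows than the area holds")
    if count == record["valid_rows"]:
        # Step 260: every remaining row is deleted, so drop the area's records.
        del column_aux[area_no]
        deletion_aux.pop(area_no, None)
        return "area released"
    # Step 270: record the run in the deletion auxiliary table.
    deletion_aux.setdefault(area_no, []).append((start_id, count))
    # Step 280: update the statistics (only the valid row count in this sketch).
    record["valid_rows"] -= count
    return "rows recorded"
```

Deleting part of an area therefore costs one inserted record and one statistics update, while deleting all remaining rows of an area removes its bookkeeping entirely.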
Example three
Fig. 3 is a schematic structural diagram of a data deleting device based on column storage according to a third embodiment of the present invention, where the device is used to execute a data deleting method based on column storage. As shown in fig. 3, the apparatus includes a data acquisition module 310, a data area determination module 320, a row number recording module 330, and a record modification module 340.
The data obtaining module 310 is configured to obtain data to be deleted in the column storage table;
a data area determining module 320, configured to determine a data area in a column storage table where the data to be deleted is located;
a row number recording module 330, configured to record, according to the data area, a row number of the data to be deleted in a deletion auxiliary table corresponding to the column storage table;
the record modification module 340 is configured to determine statistical information of the data area after the data to be deleted is deleted, and modify a record corresponding to the statistical information in the column storage auxiliary table corresponding to the column storage table.
Further, the row number recording module includes:
a record obtaining unit, configured to obtain a record of the column storage auxiliary table corresponding to the data area;
a determining unit, configured to determine the starting row number and the corresponding number of rows of the data to be deleted according to the data area if the data to be deleted comprises multiple consecutive rows of data;
and an inserting unit, configured to insert the starting row number and the corresponding number of rows into the deletion auxiliary table if the number of rows of the data to be deleted in a data area is not equal to the number of valid data rows recorded for that area in the column storage auxiliary table.
Optionally, the apparatus further comprises:
a releasing module, configured to delete the corresponding records in the column storage auxiliary table, the deletion auxiliary table and the update auxiliary table and release the storage space of the corresponding data area if the number of rows of the data to be deleted is equal to the number of valid data rows recorded for the data area in the column storage auxiliary table.
Further, the data acquisition module is specifically configured to:
and acquiring a deletion statement of the column storage table, and determining data to be deleted according to the deletion statement.
Optionally, the apparatus further includes:
an update record deleting module, configured to delete the update record if an update record for the row number exists in the update auxiliary table.
Further, the apparatus also includes:
a data reforming module, configured to merge the corresponding data in the deletion auxiliary table into the data file corresponding to the column storage table at a preset time.
The data deleting device based on the column storage can execute the data deleting method based on the column storage provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the executing method. For details of the technology that are not described in detail in this embodiment, reference may be made to a data deletion method based on column storage according to any embodiment of the present invention.
Example four
The fourth embodiment of the invention provides a server, and integrates the data deleting device based on the column storage provided by any embodiment of the invention. Specifically, as shown in fig. 4, an embodiment of the present invention provides a server, where the server includes:
one or more processors 410, one processor 410 being exemplified in fig. 4;
a memory 420; and one or more modules.
The server may further include: an input device 430 and an output device 440. The processor 410, the memory 420, the input device 430 and the output device 440 in the server may be connected by a bus or other means, and fig. 4 illustrates the connection by a bus as an example.
The memory 420 serves as a computer-readable storage medium and may be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the data deletion method based on column storage in the embodiment of the present invention (for example, the data acquisition module 310, the data area determination module 320, the row number recording module 330, and the record modification module 340 shown in fig. 3). The processor 410 executes the various functional applications and data processing of the server by running the software programs, instructions, and modules stored in the memory 420, thereby implementing the data deletion method based on column storage in the above method embodiments.
The memory 420 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the server, and the like. Further, the memory 420 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, memory 420 may further include memory located remotely from processor 410, which may be connected to a server over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 430 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the server. The output device 440 may include a display device such as a display screen.
The server can execute the data deleting method based on the column storage provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the executing method.
Example five
The fifth embodiment of the present invention further provides a storage medium on which a computer program is stored. When executed by a processor, the computer program implements the data deletion method based on column storage according to any embodiment of the present invention, that is:
acquiring data to be deleted of a column storage table;
determining the data area in the column storage table where the data to be deleted is located;
according to the data area, recording the row number of the data to be deleted in a deletion auxiliary table corresponding to the column storage table;
and determining statistical information of the data area after the data to be deleted is deleted, and modifying records corresponding to the statistical information in the column storage auxiliary table corresponding to the column storage table.
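As a rough illustration of the four steps above, the following Python sketch keeps a column storage auxiliary table and a deletion auxiliary table in memory. The names AreaRecord, ColumnTable and delete_range, the in-memory layout, and the restriction to a single integer column are assumptions made for this example and are not taken from the embodiments. The sketch also assumes that every row in the requested run is still valid, and it treats the case in which all valid rows of an area are deleted by simply dropping that area's records.

```python
from dataclasses import dataclass, field


@dataclass
class AreaRecord:
    """Hypothetical record of the column storage auxiliary table for one data area."""
    start_row: int    # row number of the first row stored in the area
    values: list      # column values of the area (held in memory for this sketch)
    valid_rows: int   # number of rows of the area that are still valid
    min_value: int    # statistics over the valid rows
    max_value: int


@dataclass
class ColumnTable:
    areas: list                                     # column storage auxiliary table
    delete_aux: list = field(default_factory=list)  # (start_row, row_count) entries


def delete_range(table, start_row, row_count):
    """Mark a contiguous run of still-valid rows as deleted."""
    end_row = start_row + row_count
    for area in list(table.areas):
        area_end = area.start_row + len(area.values)
        lo, hi = max(start_row, area.start_row), min(end_row, area_end)
        if lo >= hi:
            continue  # the run does not touch this data area
        if hi - lo == area.valid_rows:
            # All valid rows of the area are deleted: drop the area record and
            # its deletion entries instead of logging the run.
            table.delete_aux = [(s, n) for s, n in table.delete_aux
                                if not (area.start_row <= s < area_end)]
            table.areas.remove(area)
            continue
        # Partial deletion: log the starting row number and the number of rows.
        table.delete_aux.append((lo, hi - lo))
        # Recompute the statistics of the area over the rows that remain valid.
        deleted = {r for s, n in table.delete_aux for r in range(s, s + n)}
        remaining = [v for i, v in enumerate(area.values)
                     if area.start_row + i not in deleted]
        area.valid_rows = len(remaining)
        area.min_value, area.max_value = min(remaining), max(remaining)


table = ColumnTable(areas=[AreaRecord(0, [5, 1, 9, 3], 4, 1, 9)])
delete_range(table, start_row=1, row_count=2)
print(table.delete_aux, table.areas[0].min_value, table.areas[0].max_value)
# prints [(1, 2)] 3 5
```

In this sketch the column values themselves are never rewritten; only the two auxiliary structures change, which is the reason a deletion of this kind can be cheaper than rewriting the data area.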
Any combination of one or more computer-readable media may be employed. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (8)

1. A method for deleting data based on column storage, the method comprising:
acquiring data to be deleted of a column storage table;
determining the data area of the column storage table in which the data to be deleted is located;
according to the data area, recording the row number of the data to be deleted in a deletion auxiliary table corresponding to the column storage table;
the recording, according to the data area, the row number of the data to be deleted in the deletion auxiliary table corresponding to the column storage table includes:
acquiring records of the column storage auxiliary table corresponding to the data area;
if the data to be deleted comprises multiple consecutive rows of data, determining, according to the data area, the starting row number and the corresponding number of rows of the data to be deleted;
if the number of rows of the data to be deleted in a data area is not equal to the number of rows of valid data of that area recorded in the column storage auxiliary table, inserting the starting row number and the corresponding number of rows into the deletion auxiliary table;
and determining statistical information of the data area after the data to be deleted is deleted, and modifying records corresponding to the statistical information in a column storage auxiliary table corresponding to the column storage table, wherein the column storage auxiliary table is used for recording, for each area of each column in the data file, control information such as the offset address and the data length, and statistical information on the maximum value and the minimum value of the column values stored in each area.
2. The method according to claim 1, wherein the recording of the row number of the data to be deleted in the deletion auxiliary table corresponding to the column storage table according to the data area further comprises:
and if the number of rows of the data to be deleted is equal to the number of rows of valid data of the data area recorded in the column storage auxiliary table, deleting the corresponding records in the column storage auxiliary table, the deletion auxiliary table and the update auxiliary table, and releasing the storage space of the corresponding data area.
3. The method of claim 1, wherein acquiring the data to be deleted of the column storage table comprises:
and acquiring a deletion statement of the column storage table, and determining data to be deleted according to the deletion statement.
4. The method of claim 1, further comprising:
and if the update auxiliary table contains an update record for the row number, deleting the update record.
5. The method of claim 1, further comprising:
and at a preset time, merging the corresponding data in the deletion auxiliary table into the data file corresponding to the column storage table.
6. An apparatus for column storage based data deletion, the apparatus comprising:
the data acquisition module is used for acquiring data to be deleted of the column storage table;
the data area determining module is used for determining the data area of the column storage table in which the data to be deleted is located;
a row number recording module, configured to record, according to the data area, a row number of the data to be deleted in a deletion auxiliary table corresponding to the column storage table;
the row number recording module comprises:
a record obtaining unit, configured to obtain a record of the column storage auxiliary table corresponding to the data area;
a determining unit, configured to determine, according to the data area, the starting row number and the corresponding number of rows of the data to be deleted if the data to be deleted comprises multiple consecutive rows of data;
an inserting unit, configured to insert the starting row number and the corresponding number of rows into the deletion auxiliary table if the number of rows of the data to be deleted in one data area is not equal to the number of rows of valid data of that area recorded in the column storage auxiliary table;
and the record modification module is used for determining the statistical information of the data area after the data to be deleted is deleted, and modifying the record corresponding to the statistical information in the column storage auxiliary table corresponding to the column storage table, wherein the column storage auxiliary table is used for recording, for each area of each column in the data file, control information such as the offset address and the data length, and statistical information such as the maximum value and the minimum value of the column values stored in each area.
7. A server, characterized in that the server comprises:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the column storage based data deletion method of any of claims 1-5.
8. A computer storage medium on which a computer program is stored, the program, when executed by a processor, implementing a column storage based data deletion method according to any one of claims 1 to 5.
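The additional steps of claims 4 and 5 can be illustrated with a short Python sketch. It assumes, purely for illustration, that the update auxiliary table is a dictionary mapping row numbers to pending values, that the data file of one column is a list indexed by row number, and that the deletion auxiliary table holds (starting row number, number of rows) pairs. The function names drop_update_records and merge_deletions are hypothetical, and the choice of the preset time is left to the caller.

```python
def drop_update_records(update_aux, start_row, row_count):
    """Discard pending update records for rows that are being deleted (claim 4)."""
    doomed = range(start_row, start_row + row_count)
    return {row: value for row, value in update_aux.items() if row not in doomed}


def merge_deletions(data_file, delete_aux):
    """Apply the logged deletions to the data file (claim 5).

    Returns the compacted data and an empty deletion auxiliary table.
    Renumbering the surviving rows is an assumption of this sketch.
    """
    deleted = {r for s, n in delete_aux for r in range(s, s + n)}
    compacted = [v for row, v in enumerate(data_file) if row not in deleted]
    return compacted, []


pending = drop_update_records({2: "x", 7: "y"}, start_row=1, row_count=3)
data, delete_aux = merge_deletions([10, 11, 12, 13, 14], [(1, 2)])
print(pending, data, delete_aux)
# prints {7: 'y'} [10, 13, 14] []
```

Deferring the merge keeps each delete statement cheap, at the cost of reads having to consult the deletion auxiliary table until the merge has run.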
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810749906.1A CN108984719B (en) 2018-07-10 2018-07-10 Data deleting method and device based on column storage, server and storage medium

Publications (2)

Publication Number Publication Date
CN108984719A CN108984719A (en) 2018-12-11
CN108984719B true CN108984719B (en) 2021-08-03

Family

ID=64537547

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810749906.1A Active CN108984719B (en) 2018-07-10 2018-07-10 Data deleting method and device based on column storage, server and storage medium

Country Status (1)

Country Link
CN (1) CN108984719B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant