CN114328565A - Large-field data release method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN114328565A
Authority
CN
China
Prior art keywords
page
data
information
released
field
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111681593.9A
Other languages
Chinese (zh)
Inventor
刘静
王家贤
韩朱忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Dameng Database Co Ltd
Original Assignee
Shanghai Dameng Database Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Dameng Database Co Ltd filed Critical Shanghai Dameng Database Co Ltd
Priority to CN202111681593.9A priority Critical patent/CN114328565A/en
Publication of CN114328565A publication Critical patent/CN114328565A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for releasing large-field data, an electronic device, and a storage medium. The method comprises: collecting data page information of the large-field data to be released according to field release information; and releasing the corresponding data pages according to the data page information. By collecting the data page information corresponding to the large-field data, the embodiment of the invention releases the data pages as a whole, which can improve data release efficiency, shorten the time the segment header is occupied, and reduce concurrency conflicts among large-field operations during database processing.

Description

Large-field data release method and device, electronic equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of databases, in particular to a method and a device for releasing large-field data, electronic equipment and a storage medium.
Background
Large fields are commonly used database data types that require a large amount of storage space; common large-field data types include BLOB, CLOB, TEXT, and IMAGE. Data of a large-field type is usually stored independently: the table stores only summary information, such as the storage location of the data, rather than the large-field data itself. Large-field data is generally implemented on data pages: the data is written sequentially into a series of data pages, and the order of the pages is maintained by the previous-page and next-page addresses in the page header information. Referring to fig. 1, a data page may include page header control information, data, and a row offset group. The page header control information includes information such as the page type and the page address; the data is stored in the middle of the page; and a portion of space reserved at the tail of the page stores the row offset group, which identifies space usage on the page so that the page can manage its own space.
Data storage in a database is organized by segments. Each segment consists of a group of clusters, and each cluster consists of several contiguous data pages on disk. The data of multiple large-field columns of the same table is stored together in these data pages, all within the same table segment. Usually, when records in large-field columns of a table are deleted, the space occupied by the large-field data, that is, the data pages of the table segment, needs to be released.
The conventional way to release the data pages of a table segment is page by page. Referring to fig. 2, all data pages to be released are first loaded into the cache one by one according to the segment header information of the table segment, and the data pages are then released one by one; each time a data page is released, the segment header information of the table segment is grabbed again to obtain the next data page to be released. This approach has two drawbacks: 1. the database performs a large amount of input/output, so the cost of releasing data is high; 2. the header information of the first page of the table segment is modified frequently and the segment header is occupied for too long, which increases concurrency conflicts on large-field columns in the database.
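The cost of the conventional approach can be illustrated with a small sketch. It is not taken from the patent: `SegmentHeader`, `next_page`, and `release_page_by_page` are hypothetical names, and the model only counts how often the segment header is grabbed and how many page loads occur.

```python
from dataclasses import dataclass, field


@dataclass
class SegmentHeader:
    """Hypothetical stand-in for the first page of a table segment."""
    pending: list = field(default_factory=list)  # page ids waiting to be released
    lock_count: int = 0                          # times the header was grabbed

    def next_page(self):
        # Every lookup grabs (locks) the segment header once.
        self.lock_count += 1
        return self.pending.pop(0) if self.pending else None


def release_page_by_page(header):
    """Conventional approach: one header access and one page load per page."""
    io_count = 0
    while (page := header.next_page()) is not None:
        io_count += 1  # load the page into the cache, then free it
    return io_count


header = SegmentHeader(pending=list(range(100)))
ios = release_page_by_page(header)
print(ios, header.lock_count)
```

In this toy model both counters grow linearly with the number of pages, which is the behavior the two drawbacks above describe.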
Disclosure of Invention
The invention provides a method and a device for releasing large-field data, an electronic device, and a storage medium, so as to improve the efficiency of data release, shorten the time the segment header is occupied, and reduce concurrency conflicts among operations on large-field data in a database.
In a first aspect, an embodiment of the present invention provides a method for releasing large-field data, where the method includes:
collecting data page information of large field data to be released according to the field release information;
and releasing the corresponding data page according to the data page information.
In a second aspect, an embodiment of the present invention further provides a large-field data releasing apparatus, where the apparatus includes:
the information collection module is used for collecting data page information of large field data to be released according to the field release information;
and the data release module is used for releasing the corresponding data page according to the data page information.
In a third aspect, an embodiment of the present invention further provides a computer device, where the computer device includes:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of the embodiments of the invention.
In a fourth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method according to any one of the embodiments of the present invention.
According to the embodiment of the invention, the data page information of the large-field data to be released is collected according to the received field release information, and the data pages corresponding to the large field to be released are released as a whole according to the data page information. The large-field data is thereby released quickly, the amount of input/output and the release cost are reduced, the time the segment header is occupied is shortened, and concurrency conflicts among operations on large-field data in the database can be reduced.
Drawings
FIG. 1 is a diagram of a prior art data page structure;
FIG. 2 is an exemplary diagram of a large field data release in the prior art;
FIG. 3 is a flowchart of a large field data release according to an embodiment of the present invention;
FIG. 4 is a flowchart of another large field data release according to the second embodiment of the present invention;
FIG. 5 is an exemplary diagram of a control page provided in the second embodiment of the present invention;
FIG. 6 is a flowchart of another large field data release provided by the third embodiment of the present invention;
FIG. 7 is a diagram illustrating an organization format of a control page according to a third embodiment of the present invention;
FIG. 8 is a diagram illustrating an organization format of a total control item according to a third embodiment of the present invention;
FIG. 9 is an exemplary diagram of an organization format of a control unit according to a third embodiment of the present invention;
FIG. 10 is a diagram illustrating an example of releasing large field data according to a third embodiment of the present invention;
FIG. 11a is a schematic structural diagram of an FPO according to a third embodiment of the present invention;
FIG. 11b is a schematic structural diagram of another FPO provided in the third embodiment of the present invention;
FIG. 11c is a schematic structural diagram of another FPO provided in the third embodiment of the present invention;
FIG. 11d is a schematic structural diagram of another FPO provided in the third embodiment of the present invention;
FIG. 12 is a schematic structural diagram of a large-field data releasing apparatus according to a fourth embodiment of the present invention;
FIG. 13 is a schematic structural diagram of an electronic device according to a fifth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be noted that, for convenience of description, only a part of the structures related to the present invention, not all of the structures, are shown in the drawings, and furthermore, embodiments of the present invention and features of the embodiments may be combined with each other without conflict.
In the embodiments of the present invention, large-field data in the database may be data of a type that uses independent storage, which may also be referred to as out-of-row storage. Common large-field data types in a database include BLOB, CLOB, TEXT, and IMAGE. Large-field data is generally implemented on data pages: the data is written sequentially into consecutive data pages, the order of the pages is maintained by the previous-page and next-page addresses in the page header information, and the address of the first of these data pages may be stored in the row record. The data pages may be organized in the database as table segments; a table segment consists of a group of clusters, and each cluster consists of a group of contiguous data pages. The large-field column data of all records of the same table may be located in the same table segment. For example, referring to the following table, when table T1 is stored, all data of columns C1, C2, and C3 is stored together in one table segment. The whole table has only one segment home page, which is a management page and stores no data. Columns C1 and C3 are large-field columns, and the page preceding the first data page of each large-field data is the control page of that large field. Each of the large-field data "ABCD1", "ABCD2", "CDEF1", and "CDEF2" is therefore preceded by a control page.
Data of Table T1
C1       C2    C3
ABCD1    10    CDEF1
ABCD2    20    CDEF2
Example one
Fig. 3 is a flowchart of a large-field data release method according to the first embodiment of the present invention. This embodiment is applicable to releasing large-field data. The method may be executed by a large-field data releasing apparatus, which may be implemented in hardware and/or software. Referring to fig. 3, the method specifically includes the following steps:
Step 110, collecting data page information of large field data to be released according to the field release information.
The field release information may be information that triggers release of the storage space occupied by large-field data. It may include identification information or address information of the large-field data, and it may be triggered when large-field data is deleted or a transaction is rolled back. The large-field data to be released is the large-field data corresponding to the field release information, for example the large-field data identified by the identification information or address information it contains. The data page information may be information describing the data pages occupied by a large field, and may include the first page address, the last page address, the number of pages, and the like.
Step 120, releasing the corresponding data page according to the data page information.
In the embodiment of the invention, a plurality of data pages occupied by large-field data to be released can be released as a whole according to the data page information, so that the data page release efficiency can be improved, and the resource overhead when the data pages are locked is reduced.
According to the embodiment of the invention, the data page information of the large-field data to be released is collected according to the received field release information, and the data pages corresponding to the large field to be released are released as a whole according to the data page information, so that the large-field data is released quickly, the time the segment header is occupied is shortened, and concurrency conflicts among operations on large-field data in the database can be reduced.
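Steps 110 and 120 can be sketched as two small functions. This is an illustration under stated assumptions, not the patented implementation: `DataPageInfo`, `collect_page_info`, `release_pages`, the `lob_id` key, and the `catalog` dictionary are all hypothetical, and page addresses are simplified to contiguous integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPageInfo:
    """First page address, last page address, and number of pages."""
    first_page: int
    last_page: int
    page_count: int


def collect_page_info(release_request, catalog):
    """Step 110: look up the page info of the large field named in the request."""
    return catalog[release_request["lob_id"]]


def release_pages(info, free_pages):
    """Step 120: release the whole run of pages in one operation.

    Assumption: the pages form one contiguous integer range.
    """
    free_pages.update(range(info.first_page, info.last_page + 1))
    return info.page_count


catalog = {7: DataPageInfo(first_page=100, last_page=104, page_count=5)}
free_pages = set()
released = release_pages(collect_page_info({"lob_id": 7}, catalog), free_pages)
print(released, sorted(free_pages))
```

The point of the split is that the release step works from a compact summary (first page, last page, count) rather than revisiting the segment header for every page.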
Example two
Fig. 4 is a flowchart of another large-field data release method according to the second embodiment of the present invention. This embodiment is a refinement of the foregoing embodiment. Referring to fig. 4, the method provided in the second embodiment specifically includes the following steps:
Step 210, acquiring a segment home page of the large-field data to be released and a data page home page of the large-field data to be released according to the field release information, and determining the type information of the data page home page as a large-field type.
The segment home page and the data home page of the large-field data are locked in sequence, which prevents other operations from modifying them in the meantime. The type information may be used to determine whether the data page is of a large field type.
In the embodiment of the present invention, a data home page of the large-field data to be released may be read according to the field release information, and whether the large-field data to be released is of the large-field type may be determined according to the large-field type stored in the data home page, if yes, step 220 is continuously performed, and if not, the data page of the non-large-field type may be processed according to the processing method in the prior art.
Step 220, extracting data page information from a control page of a data head page of the large-field data to be released, wherein the data page information at least comprises a head page address, a tail page address and a page number.
The control page may be a management page that is set up when the number of data pages occupied by large-field data is greater than M pages; it may store information about each data page to facilitate managing them. Referring to fig. 5, the control page may be the page preceding the data pages occupied by the large-field data. It is used to manage those data pages and stores no data itself, and is therefore referred to as a management page. The first page address may be the address of the first of the data pages occupied by the large-field data to be released, the last page address may be the address of the last of those data pages, and the page number may be the number of data pages occupied by the large-field data to be released.
In the embodiment of the invention, the control page corresponding to the data head page of the large-field data to be released can be determined, and information such as the head page address, the tail page address, the page number and the like of the data page of the large field to be released can be extracted from the control page as the data page information.
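A minimal sketch of steps 210 and 220 follows, assuming pages are modeled as dictionaries keyed by address. The field names (`type`, `prev`, `first_page`, `last_page`, `page_count`) and the function `extract_page_info` are hypothetical and chosen only for illustration; locking is omitted.

```python
LARGE_FIELD = "large_field"


def extract_page_info(pages, first_data_page_addr):
    """Return (first, last, count) read from the control page, or None.

    Assumption: the control page is the page preceding the first data
    page, reachable through the hypothetical 'prev' field.
    """
    first = pages[first_data_page_addr]
    if first["type"] != LARGE_FIELD:
        return None  # step 210 type check failed: not a large-field page
    control = pages[first["prev"]]
    return (control["first_page"], control["last_page"], control["page_count"])


pages = {
    9: {"type": "control", "first_page": 10, "last_page": 12, "page_count": 3},
    10: {"type": LARGE_FIELD, "prev": 9},
}
print(extract_page_info(pages, 10))
```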
Step 230, temporarily storing the data page information in the information collection address space.
The information collection address space may be a space for temporarily storing data page information, and there may be a respective information collection address space for each large field data to be released.
Specifically, the corresponding data page information of the large field data to be released may be stored in the corresponding information collection address space, which may be located on the data top page of the large field data to be released or other designated location.
Step 240, merging the data page information of at least two large-field data to be released, and storing the merged data page information in the address space to be released of the segment first page.
The address space to be released may be an address space that stores the data page information of all large fields to be released, and it may hold the merged data page information, for example in the form of a queue or a sequence. Merging the data page information can reduce the time for which large-field release locks the segment first page.
In the embodiment of the present invention, the obtained data page information of two or more pieces of data of large fields to be released may be merged, so that each large field to be released is released integrally by using the merged data page information in the release process. It will be appreciated that the merging may include storing the data page information in the same address space or sequentially concatenating the first page address and the last page address of the data page information, etc.
Step 250, extracting the data page information stored in the address space to be released of the segment first page.
Specifically, the data page information stored in the address space to be released of the segment first page may be extracted; it may include the first page address, the last page address, and the page number of the large field to be released.
Step 260, sequentially releasing the data pages according to the first page address and the page number in the data page information.
In the embodiment of the invention, the first page of the data page of the large-field data to be released can be obtained according to the first page address in the data page information, and the data pages with the corresponding page number quantity are sequentially released from the first page data according to the address sequence of the data pages, so that the whole release of the data pages occupied by the large-field data can be realized.
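The sequential release in step 260 can be sketched as a walk along the next-page pointers, starting at the first page address and stopping after the recorded number of pages. The dictionary-based page store and the function name `release_chain` are assumptions made for this example.

```python
def release_chain(pages, first_addr, page_count):
    """Release `page_count` pages starting at `first_addr`.

    `pages` maps a page address to a dict holding a hypothetical 'next'
    pointer; deleting the entry stands in for freeing the page.
    """
    released = []
    addr = first_addr
    for _ in range(page_count):
        if addr is None:
            break  # chain ended earlier than the recorded count
        next_addr = pages[addr]["next"]  # read the pointer before freeing
        del pages[addr]
        released.append(addr)
        addr = next_addr
    return released


pages = {1: {"next": 2}, 2: {"next": 5}, 5: {"next": None}, 9: {"next": None}}
print(release_chain(pages, 1, 3), sorted(pages))
```

Note that the segment first page is not consulted inside the loop; only the first page address and the page count are needed.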
The embodiment of the invention acquires the segment home page and the data home page of the large-field data to be released through the field release information and locks them in sequence. It extracts the data page information of the large-field data to be released from the control page of the data home page, stores the extracted data page information in the information collection address space, and stores or merges the data page information of at least one large-field data to be released in the address space to be released of the segment home page. It then acquires the data page information from the address space to be released and sequentially releases the data pages according to the first page address and the page number in the data page information. The data pages of the large-field data are thereby released as a whole, which improves data release efficiency, shortens the locking time of the segment home page, and reduces conflicts among operations on large-field data.
Example three
Fig. 6 is a flowchart of another large-field data release method according to the third embodiment of the present invention. This embodiment is a refinement of the foregoing embodiments. Referring to fig. 6, the method provided in the third embodiment specifically includes the following steps:
Step 310, acquiring a segment home page of the large-field data to be released and a data page home page of the large-field data to be released according to the field release information, and determining the type information of the data page home page as a large-field type.
Step 320, sequentially scanning at least one data page occupied by the large-field data to be released to acquire data page information under the condition that the control page pointer of the control page is invalid.
The control page pointer may be a pointer from the data home page of the large-field data to be released to its control page, and may be used to determine whether the data pages have a control page. A control page is considered to exist when the previous-page address of the first data page occupied by the large-field data is not Null and the page at that address is of the control page type.
In the embodiment of the invention, a control page pointer corresponding to a data home page of large-field data to be released can be acquired, whether the control page pointer is empty or not is judged, if yes, it is indicated that a data page occupied by the large-field data to be released does not meet the requirement of starting the control page, the control page cannot be directly used to acquire data page information of all data pages of the large-field data to be released, and all data pages occupied by the large-field data to be released need to be scanned to acquire the address of each data page and the total page number of the data pages as the data page information. In this case, the control page is absent.
Step 330, reading the control page corresponding to the control page pointer under the condition that the control page pointer of the control page is valid and the control page identifier is valid, and extracting the statistical information in the total control item and the tail control item in the control page to acquire the data page information.
The control page may be an information page that stores statistical information about the large-field data, fragmentation units, and the like, where a fragmentation unit may store out-of-row large-field allocation information. Referring to fig. 7, the control page may include the number of fragments, a flag indicating whether tail page information exists, the file number of the tail page, the page number of the tail page, and a flag indicating whether the control information is valid. The number of fragments indicates how many fragments the large field is divided into, the tail page information flag is used to determine whether the large field has a tail page, and the tail page number and tail page file number are used to locate the tail data page. Referring to figs. 8 and 9, the control page may be composed of a total control item and a plurality of control items (or control units), where the last control item may be the tail control item.
Specifically, under the condition that the pointer of the control page of the data home page of the large field data to be released is not null, the large field data to be released is a data page managed by using the control page, and the control page identifier in the control page can be judged to determine whether the control page of the large field data to be released is valid.
It can be understood that, when the control page pointer of the data head page of the large-field data to be released is valid and the control page identifier is valid, the information in the total control item and the tail control item of the control page may be extracted to obtain the data page information. The obtained data page information comprises the start page address to be released, the tail page address, and the total number of data pages to be released. The control page is the page preceding the first data page, so the control page address serves as the start page address to be released; the tail page address can be obtained directly from the total control item. Because the tail control item may be invalid, its validity must be checked before the total number of pages is calculated, and the total can be calculated only once a valid tail control item has been obtained. The total number of data pages to be released equals the number of control pages (1 page) plus the accumulated number of pages in all control items.
The number of pages in the tail control item is the number of pages of the tail control unit. If the tail control item is valid, its page count can be used directly as the page count of the last control item in the calculation. If the tail control item is invalid, the current tail page has already been released by another session, and valid tail control item information must be obtained again by sequentially scanning the large-field data pages of the last control unit: start at the first page of the tail control item, scan the data pages in order by following the next-page address in each page header, and finally obtain the page count and tail page address of the valid tail control item. This page count and tail page address are then used for the last control item in the calculation.
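The page-count calculation described above can be sketched as follows. The layout is an assumption: each control item is a dictionary with hypothetical `first_page`, `page_count`, and (for the tail item) `last_page` keys, and `pages` maps an address to a dictionary with a `next` pointer. The rescan branch models the case in which the tail control item is invalid.

```python
def total_pages_to_release(control_items, tail_valid, pages):
    """Return (total page count including the control page, tail page address)."""
    *body, tail = control_items
    tail_count = tail["page_count"]
    tail_last = tail.get("last_page")
    if not tail_valid:
        # Tail item is stale: rescan the last control unit from its first
        # page, following next-page pointers, to recount and relocate the tail.
        addr, tail_count = tail["first_page"], 0
        while addr is not None:
            tail_last, tail_count = addr, tail_count + 1
            addr = pages[addr]["next"]
    # 1 control page + pages of all earlier control items + tail item pages
    total = 1 + sum(item["page_count"] for item in body) + tail_count
    return total, tail_last


items = [
    {"first_page": 10, "page_count": 4},
    {"first_page": 20, "page_count": 3, "last_page": 22},
]
pages = {20: {"next": 21}, 21: {"next": None}}  # page 22 already released
print(total_pages_to_release(items, True, pages))
print(total_pages_to_release(items, False, pages))
```

With a valid tail item the recorded count is trusted; with an invalid one the count and tail address come from the rescan, so pages already released by another session are not counted again.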
Step 340, sequentially scanning the data pages of the large-field data to be released to acquire data page information under the condition that the control page pointer of the control page is valid and the control page identifier is invalid.
In the embodiment of the present invention, when the control page pointer of the first data page of the large-field data to be released is valid and the control page identifier is invalid, it indicates that the data page of the large-field data to be released has a control page but the control page is invalid due to other reasons, the control page cannot be directly used to obtain the data page information of all the data pages of the large-field data to be released, and all the data pages occupied by the large-field data to be released need to be scanned to obtain the address of the last page and the total number of the data pages as the data page information.
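The three cases of steps 320 to 340 amount to a small decision procedure, sketched below. The page model (dictionaries with hypothetical `control`, `valid`, `next`, `last_page`, and `page_count` fields) and the function names are illustrative assumptions; in the scan branches the sketch counts data pages only.

```python
def scan_chain(pages, first_addr):
    """Walk next-page pointers and return (first, last, count)."""
    addr, last, count = first_addr, first_addr, 0
    while addr is not None:
        last, count = addr, count + 1
        addr = pages[addr]["next"]
    return first_addr, last, count


def collect(pages, first_addr):
    """Choose how to obtain the data page information for one large field."""
    ctrl_addr = pages[first_addr].get("control")
    if ctrl_addr is None:
        # Step 320: no control page, so scan every data page.
        return "scan", scan_chain(pages, first_addr)
    ctrl = pages[ctrl_addr]
    if ctrl["valid"]:
        # Step 330: read the statistics stored in the control page; the
        # control page address is the start page address to be released.
        return "control", (ctrl_addr, ctrl["last_page"], ctrl["page_count"])
    # Step 340: a control page exists but is flagged invalid, so scan.
    return "scan", scan_chain(pages, first_addr)


pages = {
    0: {"valid": True, "last_page": 3, "page_count": 4},
    1: {"next": 2, "control": 0},
    2: {"next": 3},
    3: {"next": None},
}
print(collect(pages, 1))
```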
Step 350, extracting the data page information of each large-field data to be released that is stored in the information collection address space of the data home page of the large-field data to be released.
Specifically, the stored data page information of each large-field data to be released may be extracted from the information collection address space of the data home page of the large-field data, where the information stored in the information collection address space may be the data page information of the large-field data to be released that is collected at each release.
Step 360, storing the first page address and the last page address of each piece of data page information end to end, in address order, in the address space to be released of the segment first page.
In the embodiment of the invention, the first page address and the last page address in the data page information of each large-field data to be released can be sequentially connected end to end according to the address sequence, the data page information of a plurality of large-field data to be released is merged, and the merged data page information can be stored in the address space to be released.
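One possible reading of the end-to-end merge is to coalesce runs whose addresses are adjacent, so that several large fields become a single entry in the address space to be released. The sketch below implements that reading only; the tuple layout `(first, last, count)`, integer addresses, and the function name `merge_page_infos` are assumptions.

```python
def merge_page_infos(infos):
    """Merge (first, last, count) runs that are adjacent in address order."""
    merged = []
    for first, last, count in sorted(infos):
        if merged and merged[-1][1] + 1 == first:
            # The previous run ends right before this one: join end to end.
            prev_first, _prev_last, prev_count = merged[-1]
            merged[-1] = (prev_first, last, prev_count + count)
        else:
            merged.append((first, last, count))
    return merged


print(merge_page_infos([(20, 24, 5), (10, 19, 10), (40, 41, 2)]))
```

Fewer, larger entries mean fewer updates to the segment first page, which is consistent with the stated goal of shortening its lock time.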
Step 370, marking the identification number of the corresponding large field data to be released as an invalid identification number in the segment first page.
Specifically, after the data page information of the large field data to be released is collected, the identification number of the large field data to be released can be marked as an invalid identification number in the first page of the segment, so that the large field data to be released is prevented from being released repeatedly.
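The guard against repeated release can be sketched with a set of invalidated identification numbers kept alongside the pending release list. The dictionary keys `invalid_ids` and `to_release` and the function `register_release` are hypothetical.

```python
def register_release(segment_first_page, lob_id, info):
    """Queue `info` for release once; refuse a second request for the same id."""
    if lob_id in segment_first_page["invalid_ids"]:
        return False  # already collected, do not release again
    segment_first_page["to_release"].append(info)
    segment_first_page["invalid_ids"].add(lob_id)  # mark the id as invalid
    return True


seg = {"invalid_ids": set(), "to_release": []}
print(register_release(seg, 7, (100, 104, 5)))
print(register_release(seg, 7, (100, 104, 5)))
```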
Step 380, releasing the corresponding data page according to the data page information.
According to the embodiment of the invention, the segment home page and the data home page of the large-field data to be released are acquired through the field release information and are locked in sequence. When the control page pointer of the data home page is invalid, the data pages of the large-field data to be released are scanned to acquire the data page information. When the control page pointer of the data home page is valid and the control page identifier is valid, the statistical information of the total control item and the tail control item in the control page is read, and the start page address to be released, the tail page address, and the total number of data pages to be released are obtained from it as the data page information. When the control page pointer of the data home page is valid and the control page identifier is invalid, the data pages of the large-field data to be released are scanned sequentially to acquire the data page information. The first page address and the last page address in the data page information are stored end to end, in order, in the address space to be released, the identification number of each large-field data to be released is marked as an invalid identification number, and the corresponding data pages are released as a whole according to the data page information. The large-field data is thereby released quickly, the time the segment header is occupied is shortened, and concurrency conflicts among operations on large-field data in the database can be reduced.
Further, on the basis of the above embodiment of the present invention, extracting the statistical information in the total control item and the tail control item in the control page to obtain the data page information includes: obtaining, according to the information of the total control item and the tail control item, the start page address to be released, the tail page address, and the total number of data pages to be released as the data page information.
In an exemplary embodiment, referring to fig. 10, since a large field data page of the same row in the same column in the large field data page is often composed of a plurality of continuous pages, several continuous pages related to the large field can be released together when the space is released. Therefore, when the large field is created for the first time, an address space FPO (SEG _ FPO) for subsequently registering the large field data (address unit of the first page, address unit of the last page, number of pages N of the large field) can be newly added. SEG _ FPO already exists and need not be created. When a large field of data is deleted and the space is released, the continuous N data pages from the first page address recorded in the SEG _ FPO space to be released to the tail page address are directly taken as a whole to be released at one time. Thereby, the operation on the page unit at the time of releasing the large field is changed to the operation as corresponding to one whole unit, and the temporary LOB _ FPO is registered to the SEG _ FPO of the segment top page, that is, the number of pages N of the top page, the end page address unit, and the large field are recorded in the to-be-released address space area of the segment top page. When deleting the large field record release space or inserting rollback, the record address to be processed may be the first page address of the large field data page, and only the first page address may be transmitted in the processing of the external database. When a large field is created, a structural body FPO is newly added in a segment head page and used for storing head and tail address information of the large field to be released and the number of data pages, wherein the page address can be composed of a file number and a page number. The process may include the steps of:
1. A large-field delete or rollback transaction starts.
2. The segment head page of the table segment is obtained according to the in-memory object of the large field to be released, and the segment head page is locked. The in-memory structure of the large-field object to be released is obtained from the table.
3. The data head page of the large-field data is read and is locked when it is loaded.
4. Whether a control page exists is determined from the data head page of the large-field data. If the pointer to the previous page in the data head page is not null, the control page identifier in the data head page is checked to decide whether that previous page is a usable control page. If the control page identifier is valid, a control page exists, which indicates that the large-field data occupies more than M pages; go to (1). If the control page identifier is invalid, which indicates that the large-field data pages have been released by another session, the first page address and tail page address are obtained by scanning; go to (2). If the pointer to the previous page in the data head page is null, the large-field data occupies M pages or fewer and no control page was created; go to (3).
(1) If the control page identifier is valid, the control page is used for positioning. The control page of the large-field object is read into memory; after the control page is verified successfully, the tail page address and the number of data pages of the last control item are obtained through the master control item. The tail control item in the control information (that is, the file_id and page_no of the last unit of the control block) also records the tail page address, so the tail control item can be located and the page count of its data pages obtained.
If VALID FLAG of the tail control item in the control page is TRUE, a tail page exists and the information of the last unit is reliable, so the page count information TOTAL BYTES of the last unit is used directly. If VALID FLAG of the tail control item is FALSE, the pages of the last control item are counted by scanning: the current page address is first set to the head page of the tail control item, and the data pages are then scanned one by one, each next page address being found through the page header, until the page count of the last control unit and a valid tail page address are obtained. The total number of data pages is the control page (1 page) plus the accumulated page counts of all control items. The first page of the large field's FPO to be released is adjusted to start from the control page, so that the control page is released as well.
(2) If the control page identifier is invalid, the control page (lob_ctl page) is invalid, and the head page and tail page addresses required by the FPO are obtained by scanning the large-field data pages (that is, positioning in the conventional way). An invalid control page identifier indicates that the large-field data to be released has a control page but that the control page has become invalid for some other reason; the control page cannot be used directly to obtain the data page information of all the data pages to be released, so all data pages occupied by the large-field data to be released must be scanned to obtain the tail page address and the total number of data pages as the data page information.
(3) If the previous-page address in the first page of the large-field data pages is invalid (null), that is, there is no control page, the total number of data pages does not exceed M (for example, 4 pages), so only a small amount of IO is needed and positioning is again done in the conventional way. Because the address of the next page can be learned only from the page before it, and the data head page has already been locked, the following data pages are located one by one starting from the head page of the large field's data pages until the tail page address is obtained; the number of pages traversed is the total number of data pages of the large field.
5. After the FPO has been collected, since the FPO records the information of the data pages to be deleted, it can be merged into the FPO of the segment head page, and the large-field data identifier BLOBId in the segment head page is set to an invalid ID to avoid repeated release. As the following table shows, the page header control information of a segment head page may include these BLOB fields:
No.  Field
1    BLOBid
2    N_bytes
3    N_chars
6. The large-field data is released and the FPO information in the segment head page is updated. The following table shows the control information of a segment head page:
[Table image BDA0003444241630000151: control information of the segment head page]
FPO start offset: to locate quickly the information of the large-field data page units to be released in the segment head page, when the head and tail address information of the large-field data pages to be released is stored, the first 5 offsets are skipped and the FPO start offset is located directly. The FPO start offset, at the 6th offset, is used as the start address for writing in the segment head page. Writing the FPO to be released into the segment head page indicates that the data pages of that FPO are no longer in use.
After the position is confirmed, the segment head lock is taken once, that is, the segment head page is locked, and the information of the large-field data pages to be released is stored in the address space. When the lock is released, the release of the large field is complete.
Address space: position 6 in the segment head page records the address units and page counts of multiple large fields to be released, which allows them to be physically released together later. It is therefore called the address space.
The FPO address space in the segment head page records the number of pages, the first page address unit, and the tail page address unit most recently registered for release in the segment head page.
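The collection logic of steps 2 to 4 can be sketched as follows. This is only an illustrative model, not the patented implementation: the names `Page`, `PageInfo`, `choose_strategy`, and `scan_chain` are hypothetical, pages are kept in an in-memory dictionary rather than on disk, and locking is omitted.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Addr = Tuple[int, int]  # (file number, page number), as the text describes page addresses


@dataclass
class Page:  # hypothetical in-memory stand-in for a data page header
    next_addr: Optional[Addr] = None   # next data page of the same large field
    prev_addr: Optional[Addr] = None   # on the data head page: control page pointer
    control_valid: bool = False        # control page identifier kept on the head page


@dataclass
class PageInfo:  # the collected "data page information"
    first: Addr
    last: Addr
    count: int


def choose_strategy(head: Page) -> str:
    """Pick how to collect page info, following branches (1)-(3) of step 4."""
    if head.prev_addr is None:
        return "scan"           # (3) no control page: at most M pages, a scan is cheap
    if head.control_valid:
        return "control_page"   # (1) read the totals from the control page
    return "scan"               # (2) a control page exists but is invalid


def scan_chain(pages: Dict[Addr, Page], head_addr: Addr) -> PageInfo:
    """Follow next-page pointers from the head page to the tail page."""
    addr, count = head_addr, 1
    while pages[addr].next_addr is not None:
        addr = pages[addr].next_addr
        count += 1
    return PageInfo(first=head_addr, last=addr, count=count)


# Example: a three-page large field with no control page.
pages = {
    (1, 10): Page(next_addr=(1, 11)),
    (1, 11): Page(next_addr=(1, 12)),
    (1, 12): Page(),
}
info = scan_chain(pages, (1, 10))
print(choose_strategy(pages[(1, 10)]), info)
```

The scan costs one page read per data page, which is presumably why the text reserves it for small fields (at most M pages) or for the case where the control page cannot be trusted.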
In an embodiment of the invention, the FPO may exist in two forms, LOB_FPO and SEG_FPO. LOB_FPO is a temporary space, and SEG_FPO is the information collection address space of the segment head page. LOB_FPO is the FPO of a single large field to be released, collected before release. SEG_FPO is the sum of the FPOs that have been collected in the segment head page. All LOB_FPOs to be released are connected end to end and added to SEG_FPO. When the large-field data pages to be released are physically released, all of them can then be located at once through SEG_FPO. After the physical release is finished, SEG_FPO is cleared.
Before the first release, the SEG_FPO information is constructed in the segment head page. On the second release, SEG_FPO already exists. SEG_FPO can be reused, and it is set to NULL immediately after each release. If SEG_FPO holds an FPO (which is the case at a second release but not at the first), it is valid (not NULL); otherwise it is NULL. When SEG_FPO is used again, if its address space is NULL, the release of the large fields previously recorded in it has completed, and the LOB_FPO information of the large-field data currently to be released is set into the address space SEG_FPO. If SEG_FPO records a valid address space, LOB_FPO and SEG_FPO are merged:
1. The addresses of LOB_FPO and of SEG_FPO in the segment head page address space are merged end to end, ensuring seamless connection between the data page addresses. The head page address, tail page address, and page count currently to be released are registered in the address space. First, the existing tail page of SEG_FPO in the segment head page address space is connected to the head page currently to be released; the next time a new LOB_FPO to be released is added to SEG_FPO, it only needs to be connected after the new tail page in the address space, so registration is fast.
The merging method is as follows: the data page information of the large field to be released is stored in or merged into this address space. If the first page address in SEG_FPO is invalid (already physically released), SEG_FPO is NULL; the LOB_FPO of the current large field is set directly into the address space and the procedure returns. If SEG_FPO is not NULL, the next-page address of the tail data page in SEG_FPO is modified to be the first page address of the current large field's LOB_FPO, and the tail page address in SEG_FPO is set to the tail page address of the current large field's LOB_FPO.
2. The update of the segment head page FPO is finished, the transaction ends, and the locks on the data head page of the large-field data and on the segment head page are released. Therefore, when this embodiment is used to release one or more large-field records, even if the number of data pages is huge, neither the segment head page nor the control information of the table segment is modified frequently.
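The merging rule described above can be sketched as a small function. This is a minimal model under stated assumptions: `FPO`, `merge_fpo`, and the `next_ptr` dictionary (standing in for the next-page pointer stored in each page header) are hypothetical names, and summing the page counts on merge is an assumption inferred from the later reference to a total page number.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Addr = Tuple[int, int]  # (file number, page number)


@dataclass
class FPO:  # hypothetical record: head page, tail page, page count
    first: Addr
    last: Addr
    count: int


def merge_fpo(seg_fpo: Optional[FPO], lob_fpo: FPO,
              next_ptr: Dict[Addr, Optional[Addr]]) -> FPO:
    """Merge one large field's LOB_FPO into the segment-level SEG_FPO."""
    if seg_fpo is None:
        # SEG_FPO is NULL: the LOB_FPO simply becomes the SEG_FPO.
        return FPO(lob_fpo.first, lob_fpo.last, lob_fpo.count)
    # Link the current tail page to the head page of the new large field,
    # then move the tail to the new field's tail page.
    next_ptr[seg_fpo.last] = lob_fpo.first
    return FPO(seg_fpo.first, lob_fpo.last, seg_fpo.count + lob_fpo.count)


# Example: two large fields of three pages each are registered for release.
next_ptr: Dict[Addr, Optional[Addr]] = {(1, 12): None, (2, 7): None}
seg = merge_fpo(None, FPO((1, 10), (1, 12), 3), next_ptr)
seg = merge_fpo(seg, FPO((2, 5), (2, 7), 3), next_ptr)
print(seg, next_ptr[(1, 12)])
```

Each merge touches only the current tail page and the three fields of SEG_FPO, regardless of how many pages the large field occupies, which matches the claim that the segment head page is not modified frequently.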
Merging data pages into the segment for the first time:
Collect the FPO, then determine the position of the address space, and set the head page address, tail page address, and page count in the address space to those of LOB_FPO; that is, LOB_FPO is assigned to SEG_FPO, and the merge into SEG_FPO ends. FIG. 11a shows the FPO structure.
Merging data pages into the segment for the second time:
Collect the FPO, then determine the position of SEG_FPO, which likewise consists of a first page, a tail page, and a page count. As shown in FIG. 11b, the addresses in the FPO are in order after the first merge; after a further merge, as shown in FIG. 11c, they are still in order. Finally, as shown in FIG. 11d, when the large-field data pages to be released are physically released together, only the first page, the tail page of LOB_FPO2, and the total page count need to be taken for a one-time batch physical release. After the physical release, SEG_FPO is set to NULL: specifically, the content of SEG_FPO is deleted, the FPO information of the segment head page is cleared, the page count is set to 0, and the file numbers and page numbers of the first page and tail page are set to NULL.
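The one-time batch release and the reset of SEG_FPO can be sketched as follows. The names `FPO`, `release_all`, `next_ptr`, and the `free_page` callback are hypothetical; an actual engine would return pages to its own free-space manager, and returning `None` stands in for setting SEG_FPO to NULL.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Addr = Tuple[int, int]  # (file number, page number)


@dataclass
class FPO:  # hypothetical record: head page, tail page, total page count
    first: Addr
    last: Addr
    count: int


def release_all(seg_fpo: Optional[FPO],
                next_ptr: Dict[Addr, Optional[Addr]],
                free_page: Callable[[Addr], None]) -> Optional[FPO]:
    """Free every page registered in SEG_FPO, then return the reset (NULL) SEG_FPO."""
    if seg_fpo is None:
        return None  # nothing is registered for release
    addr: Optional[Addr] = seg_fpo.first
    for _ in range(seg_fpo.count):
        if addr is None:
            raise ValueError("page chain is shorter than the recorded page count")
        following = next_ptr.get(addr)  # read the link before the page is freed
        free_page(addr)
        addr = following
    return None  # SEG_FPO is NULL again and can be reused


# Example: two merged large fields of two pages each.
next_ptr = {(1, 10): (1, 11), (1, 11): (2, 5), (2, 5): (2, 6), (2, 6): None}
freed = []
seg = release_all(FPO((1, 10), (2, 6), 4), next_ptr, freed.append)
print(freed, seg)
```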
Example four
FIG. 12 is a schematic structural diagram of a large-field data release apparatus provided in the fourth embodiment of the invention. The apparatus can execute the large-field data release method provided in any embodiment of the invention and has the functional modules and beneficial effects corresponding to that method. The apparatus may be implemented in software and/or hardware and specifically comprises an information collection module 401 and a data release module 402.
The information collection module 401 is configured to collect data page information of the large-field data to be released according to the field release information.
The data release module 402 is configured to release the corresponding data pages as a whole according to the data page information.
According to this embodiment of the invention, the information collection module collects the data page information of the large-field data to be released according to the received field release information, and the data release module releases, as a whole, the data pages corresponding to the field to be released according to the data page information. Large-field data is thereby released quickly, the time for which the segment head is held is reduced, and concurrency conflicts between operations on large-field data in the database are reduced.
Further, on the basis of the above embodiment of the invention, the information collection module 401 in the apparatus includes:
a data head page unit, configured to obtain the segment head page and the data head page of the large-field data to be released according to the field release information, and to determine, in the data head page of the large-field data, that the type information of the data head page is the large-field type; and
an information extraction unit, configured to extract the data page information from the control page of the data head page of the large-field data to be released, where the data page information includes at least a head page address, a tail page address, and a page count.
Further, on the basis of the above embodiment of the invention, the data head page unit is specifically configured to:
when the control page pointer of the control page is invalid, scan in sequence the at least one data page occupied by the large-field data to be released to obtain the data page information; when the control page pointer is valid and the control page identifier is valid, read the control page corresponding to the control page pointer and extract the statistical information in the master control item and the tail control item of the control page to obtain the data page information; and when the control page pointer is valid and the control page identifier is invalid, scan in sequence the data pages of the large-field data to be released to obtain the data page information.
Further, on the basis of the above embodiment of the invention, the information extraction unit is specifically configured to: obtain, from the information of the master control item and the tail control item, the first page address to be released, the tail page address to be released, and the total number of data pages to be released as the data page information.
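How the information extraction unit might derive the three values from a control page can be sketched as follows. The layout is an assumption: `ControlItem`, `info_from_control_page`, and the `rescan` callback are hypothetical, and the total is computed here by summing per-item page counts, whereas the described master control item may already store aggregated statistics.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Addr = Tuple[int, int]  # (file number, page number)


@dataclass
class ControlItem:  # hypothetical control item covering a run of data pages
    first: Addr        # first data page covered by the item
    last: Addr         # last data page covered (trusted only when valid is True)
    page_count: int    # number of data pages covered
    valid: bool = True


def info_from_control_page(control_addr: Addr,
                           items: List[ControlItem],
                           rescan: Callable[[Addr], Tuple[Addr, int]]
                           ) -> Tuple[Addr, Addr, int]:
    """Return (first page to release, tail page, total pages to release)."""
    tail = items[-1]
    if tail.valid:
        tail_last, tail_count = tail.last, tail.page_count
    else:
        # Tail item is unreliable: count its pages by scanning from its head page.
        tail_last, tail_count = rescan(tail.first)
    # One control page plus the pages of every control item.
    total = 1 + sum(item.page_count for item in items[:-1]) + tail_count
    # Release starts at the control page so that it is freed as well.
    return control_addr, tail_last, total


items = [ControlItem((1, 11), (1, 14), 4), ControlItem((1, 15), (1, 16), 2)]
result = info_from_control_page((1, 10), items, rescan=lambda a: (a, 1))
print(result)
```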
Further, on the basis of the above embodiment of the invention, the apparatus further includes:
an information merging module, configured to merge the data page information of at least two pieces of large-field data to be released and to store the merged data page information in the to-be-released address space of the segment head page.
Further, on the basis of the above embodiment of the invention, the information merging module includes:
an information extraction unit, configured to extract the data page information of each piece of large-field data to be released that is temporarily stored in the information collection address space;
a merging storage unit, configured to store the head page address and the tail page address of each piece of data page information end to end, in address order, in the to-be-released address space of the segment head page; and
an invalid marking unit, configured to mark, in the segment head page, the identification number of the corresponding large-field data to be released as an invalid identification number.
Further, on the basis of the above embodiment of the invention, the data release module 402 includes:
an address acquisition unit, configured to extract the data page information stored in the to-be-released address space of the segment head page; and
a space release unit, configured to release the data pages in sequence according to the first page address and the page count of the data page information.
Further, on the basis of the above embodiment of the invention, the apparatus further includes: an information storage unit, configured to temporarily store the data page information in an information collection address space.
Example five
FIG. 13 is a schematic structural diagram of an electronic device according to the fifth embodiment of the invention. As shown in FIG. 13, the electronic device includes a processor 70, a memory 71, an input device 72, and an output device 73. The electronic device may have one or more processors 70; one processor 70 is taken as an example in FIG. 13. The processor 70, the memory 71, the input device 72, and the output device 73 in the electronic device may be connected by a bus or by other means; a bus connection is taken as an example in FIG. 13.
The memory 71, as a computer-readable storage medium, may be used to store software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the large-field data releasing method in the embodiment of the present invention (for example, the information collecting module 401 and the data releasing module 402 in the large-field data releasing apparatus). The processor 70 executes various functional applications and data processing of the electronic device by running software programs, instructions and modules stored in the memory 71, that is, implements the above-described large-field data release method.
The memory 71 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the memory 71 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the memory 71 may further include memory located remotely from the processor 70, which may be connected to the electronic device through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 72 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function controls of the electronic apparatus. The output device 73 may include a display device such as a display screen.
Example six
An embodiment of the present invention further provides a storage medium containing computer-executable instructions, which when executed by a computer processor, are configured to perform a large-field data release method, where the method includes:
collecting data page information of large field data to be released according to the field release information;
and releasing the corresponding data pages as a whole according to the data page information.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the method operations described above, and may also perform related operations in the large-field data release method provided by any embodiment of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the large-field data releasing apparatus, the included units and modules are only divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A large-field data release method, the method comprising:
collecting data page information of large field data to be released according to the field release information;
and releasing the corresponding data page as a whole according to the data page information.
2. The method of claim 1, wherein the collecting data page information of large field data to be released according to the field release information comprises:
acquiring a segment home page of the large-field data to be released and a data home page of the large-field data to be released according to the field release information, and determining the type information of the data page home page as a large-field type;
and extracting the data page information from the control page of the data head page of the large-field data to be released, wherein the data page information at least comprises a head page address, a tail page address and a page number.
3. The method of claim 2, wherein the extracting the data page information at the control page of the data head page of the large field data to be released comprises:
under the condition that a control page pointer of the control page is invalid, sequentially scanning at least one data page occupied by the large-field data to be released to acquire data page information;
reading a control page corresponding to a control page pointer under the condition that the control page pointer of the control page is effective and a control page identifier is effective, and extracting statistical information in a total control item and a tail control item in the control page to acquire data page information;
and sequentially scanning the data pages of the large-field data to be released to acquire data page information under the condition that the control page pointer of the control page is valid and the control page identification is invalid.
4. The method of claim 1, further comprising:
and merging at least two data page information of the large field data to be released, and storing the merged data page information in the address space to be released of the first page of the field.
5. The method according to claim 4, wherein said merging the data page information of at least two of the large field data to be released comprises:
extracting the data page information of each piece of large-field data to be released that is temporarily stored in an information collection address space;
storing the head page address and the tail page address of each data page information in the address space to be released of the head page of the segment in an end-to-end manner according to the address sequence;
and marking the identification number of the corresponding large field data to be released as an invalid identification number in the field home page.
6. The method of claim 1, wherein said releasing the corresponding data page according to the data page information as a whole comprises:
extracting the stored data page information in the address space to be released of the segment home page;
and sequentially releasing the data pages according to the address and the page number of the first page of the data page information.
7. The method according to any one of claims 1-6, further comprising: and temporarily storing the data page information in an information collection address space.
8. A large-field data release apparatus, the apparatus comprising:
the information collection module is used for collecting data page information of large field data to be released according to the field release information;
and the data release module is used for releasing the corresponding data page as a whole according to the data page information.
9. An electronic device, characterized in that the electronic device comprises:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.
CN202111681593.9A 2021-12-29 2021-12-29 Large-field data release method and device, electronic equipment and storage medium Pending CN114328565A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111681593.9A CN114328565A (en) 2021-12-29 2021-12-29 Large-field data release method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111681593.9A CN114328565A (en) 2021-12-29 2021-12-29 Large-field data release method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114328565A true CN114328565A (en) 2022-04-12

Family

ID=81022949

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111681593.9A Pending CN114328565A (en) 2021-12-29 2021-12-29 Large-field data release method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114328565A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination