CN111414362A - Data reading method, device, equipment and storage medium - Google Patents

Data reading method, device, equipment and storage medium

Info

Publication number
CN111414362A
Authority
CN
China
Prior art keywords
data
rowid
target data
reading
data table
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010128291.8A
Other languages
Chinese (zh)
Other versions
CN111414362B (en)
Inventor
帅宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202010128291.8A priority Critical patent/CN111414362B/en
Publication of CN111414362A publication Critical patent/CN111414362A/en
Priority to PCT/CN2020/136303 priority patent/WO2021169496A1/en
Application granted granted Critical
Publication of CN111414362B publication Critical patent/CN111414362B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2282Tablespace storage structures; Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/254Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data reading method, a device, equipment and a storage medium, wherein the method comprises the following steps: acquiring a target data table, extracting physical address information of the target data table, and generating a task table to be processed according to the physical address information; determining single data processing capacity according to preset service requirements, and dividing the task table to be processed into one or more data blocks according to the single data processing capacity; and generating a pseudo column rowid of one or more data blocks, and reading the target data table according to the rowid. According to the invention, based on the big data, the target data table is divided into a plurality of data blocks and then data reading is carried out, so that the data processing efficiency is improved.

Description

Data reading method, device, equipment and storage medium
Technical Field
The present invention relates to the field of big data technologies, and in particular, to a data reading method, apparatus, device, and storage medium.
Background
An Oracle database stores a large number of data tables. Because the data volume is large, a large-scale modification causes the database to lock entire tables, and occupying space for a long time can trigger Oracle's "snapshot too old" exception. The existing cursor-based segmented processing scheme occupies a large amount of space, which causes runtime errors; it cannot be executed concurrently, consumes excessive system resources, and easily blocks system services, so data processing efficiency is greatly reduced. How to improve data processing efficiency is therefore a technical problem that urgently needs to be solved.
Disclosure of Invention
The invention provides a data reading method, a data reading device, data reading equipment and a storage medium, and aims to improve data processing efficiency.
To achieve the above object, the present invention provides a data reading method, including:
acquiring a target data table, extracting physical address information of the target data table, and generating a task table to be processed according to the physical address information;
determining single data processing capacity according to preset service requirements, and dividing the task table to be processed into one or more data blocks according to the single data processing capacity;
and generating a pseudo column rowid of one or more data blocks, and reading the target data table according to the rowid.
Preferably, the step of acquiring the target data table, extracting physical address information of the target data table, and generating the to-be-processed task table according to the physical address information further includes:
judging whether the data quantity in the target data table exceeds a first threshold value or not;
if the data volume is larger than the first threshold value, setting the process number of the concurrent process according to the data volume;
and dividing the target data table into sub target data tables with corresponding quantity according to the process number.
Preferably, the step of generating a pseudo-column rowid of one or more data blocks and reading the target data table according to the rowid further includes:
acquiring a state log, and acquiring an abnormal data block through the state log;
and acquiring the rowid of the abnormal data block, marking the rowid as the abnormal rowid, and re-reading the data block at the abnormal rowid and one or more data blocks after the abnormal rowid.
Preferably, the step of acquiring the target data table, extracting physical address information of the target data table, and generating the to-be-processed task table according to the physical address information includes:
acquiring the target data table from a system database, and extracting physical address information of the target data table by a system, wherein the physical address information comprises the extents (range extensions) and attribute information of the target data table;
and taking each extent in the physical address information as an independent task, and generating a task table to be processed according to attribute information corresponding to each independent task.
Preferably, the step of generating a pseudo column rowid of one or more data blocks and reading the target data table according to the rowid comprises:
generating a rowid of each data block according to the data table number, the file number, the block number and the row number, wherein the rowid comprises a starting rowid and an ending rowid;
and locating a target data block according to the starting rowid and the ending rowid, and sequentially reading one or more corresponding target data blocks until the target data table has been read.
Preferably, the step of locating the target data block according to the starting rowid and the ending rowid further comprises:
and performing a locking operation on the target data in the target data block, and releasing the lock after the target data has been read.
Preferably, the step of generating a pseudo-column rowid of one or more data blocks and reading the target data table according to the rowid further includes:
and performing data editing operation according to the data of the rowid in the target data table.
In addition, to achieve the above object, an embodiment of the present invention further provides a data reading apparatus, including:
the acquisition module is used for acquiring a target data table, extracting physical address information of the target data table and generating a task table to be processed according to the physical address information;
the segmentation module is used for determining single data processing capacity according to preset service requirements and segmenting the task table to be processed into one or more data blocks according to the single data processing capacity;
and the reading module is used for generating a pseudo column rowid of one or more data blocks and reading the target data table according to the rowid.
In addition, in order to achieve the above object, an embodiment of the present invention further provides a data reading device, where the data reading device includes a processor, a memory, and a data reading program stored in the memory, and when the data reading program is executed by the processor, the data reading device implements the steps of the data reading method described above.
In addition, to achieve the above object, an embodiment of the present invention further provides a computer storage medium, where a data reading program is stored, and the data reading program, when executed by a processor, implements the steps of the data reading method as described above.
Compared with the prior art, the invention discloses a data reading method, a device, equipment and a storage medium, wherein the method comprises the following steps: acquiring a target data table, extracting physical address information of the target data table, and generating a task table to be processed according to the physical address information; determining single data processing capacity according to preset service requirements, and dividing the task table to be processed into one or more data blocks according to the single data processing capacity; and generating a pseudo column rowid of one or more data blocks, and reading the target data table according to the rowid. According to the invention, based on the big data, the target data table is divided into a plurality of data blocks and then data reading is carried out, so that the data processing efficiency is improved.
Drawings
FIG. 1 is a schematic hardware configuration diagram of a data reading device according to embodiments of the present invention;
FIG. 2 is a flow chart illustrating a first embodiment of a data reading method according to the present invention;
FIG. 3 is a flow chart illustrating a second embodiment of a data reading method according to the present invention;
FIG. 4 is a flow chart illustrating a third embodiment of a data reading method according to the present invention;
FIG. 5 is a functional block diagram of a data reading apparatus according to a first embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The data reading device mainly involved in the embodiments of the invention is a device capable of network connection, such as a server or a cloud platform. In addition, the mobile terminal involved in the embodiments of the invention can be a mobile network device such as a mobile phone or a tablet computer.
Referring to FIG. 1, FIG. 1 is a schematic diagram of a hardware structure of a data reading device according to embodiments of the present invention. In this embodiment of the present invention, the data reading device may include a processor 1001 (e.g., a central processing unit, CPU), a communication bus 1002, an input port 1003, an output port 1004, and a memory 1005. The communication bus 1002 is used for connection and communication among the components; the input port 1003 is used for data input; the output port 1004 is used for data output. The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as a magnetic disk memory, and the memory 1005 may optionally be a storage device independent of the processor 1001. Those skilled in the art will appreciate that the hardware configuration depicted in FIG. 1 does not limit the present invention, and the device may include more or fewer components than those shown, combine some components, or arrange the components differently.
With continued reference to fig. 1, the memory 1005 of fig. 1, which is one type of readable storage medium, may include an operating system, a network communication module, an application program module, and a data reading program. In fig. 1, the network communication module is mainly used for connecting to a server and performing data communication with the server; the processor 1001 may call a data reading program stored in the memory 1005 and execute the data reading method provided by the embodiment of the present invention.
The embodiment of the invention provides a data reading method.
Oracle Database, also known as Oracle RDBMS or simply Oracle, is a relational database management system from Oracle Corporation. It has long held a leading position in the database field. The Oracle database system is a widely used relational database management system; it is portable, convenient to use and powerful, and it suits large, medium, small and microcomputer environments. It is an efficient and reliable database solution suited to high throughput.
Massive data is often stored in an Oracle database, so a large-scale modification causes the database to lock entire tables, and occupying space for a long time can trigger Oracle's "snapshot too old" exception. The existing cursor-based segmented processing scheme occupies a large amount of space, which causes runtime errors; it cannot be executed concurrently, consumes excessive system resources, and easily blocks system services, so data processing efficiency is greatly reduced. How to improve data processing efficiency is therefore a technical problem that urgently needs to be solved.
Currently, data in an Oracle database is mainly fetched in batches through a cursor and committed in segments. However, using a cursor requires a consistent read to be constructed; if processing takes several hours, a large amount of undo space is consumed, which easily causes runtime errors and forces the task to terminate. In addition, the cursor-based method cannot be executed concurrently, and resuming from a breakpoint requires the data to be read again, so data reading is slow.
Referring to fig. 2, fig. 2 is a flowchart illustrating a data reading method according to a first embodiment of the present invention.
In this embodiment, the data reading method is applied to a data reading device, and the method includes:
Step S101, acquiring a target data table, extracting physical address information of the target data table, and generating a task table to be processed according to the physical address information;
The technical scheme of this embodiment is mainly applied to the Oracle database.
Specifically, the step of acquiring the target data table, extracting physical address information of the target data table, and generating the to-be-processed task table according to the physical address information includes:
Step S101a: acquiring the target data table from a system database, and extracting physical address information of the target data table by a system, wherein the physical address information comprises the extents (range extensions) and attribute information of the target data table, and the attribute information includes the physical name, path and size of the data table space.
Step S101b: taking each extent in the physical address information as an independent task, and generating the task table to be processed according to the attribute information corresponding to each independent task. That is, all the extents are integrated into the task table to be processed, and the attribute information corresponding to each extent is written into the task table to be processed.
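Steps S101a and S101b can be sketched as follows. This is a minimal illustration only: the `Extent` and `Task` record layouts, the field names and the sample values are hypothetical and are not taken from the embodiment, which does not specify how the task table is stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """One extent of the target data table (hypothetical record layout)."""
    file_no: int      # number of the data file holding the extent
    first_block: int  # first block number of the extent
    block_count: int  # number of contiguous blocks in the extent


@dataclass(frozen=True)
class Task:
    """One row of the task table to be processed (hypothetical layout)."""
    task_id: int
    tablespace: str
    file_no: int
    first_block: int
    last_block: int


def build_task_table(extents, tablespace):
    """Turn every extent into one independent task, in physical order."""
    ordered = sorted(extents, key=lambda e: (e.file_no, e.first_block))
    return [
        Task(i, tablespace, e.file_no, e.first_block,
             e.first_block + e.block_count - 1)
        for i, e in enumerate(ordered, start=1)
    ]


# Two sample extents, deliberately given out of order.
tasks = build_task_table([Extent(4, 136, 8), Extent(4, 128, 8)], "USERS")
```

Each task carries enough physical address information to be processed on its own, which is what later allows the tasks to be split and read independently.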
Step S102, determining single data processing capacity according to preset service requirements, and dividing the task table to be processed into one or more data blocks according to the single data processing capacity;
The preset service requirement is the actual adjustment requirement for the target data table. If the data volume that the preset service requirement needs to process at one time is, for example, 30M, and the hardware configuration can support it, the preset service requirement is converted into a single data processing capacity of 30M. The task table to be processed is then divided into one or more data blocks according to the single data processing capacity, and the number of data blocks is recorded. For example, if the single data processing capacity is 30M and the data volume corresponding to the task table to be processed is 30M × n, where n is greater than or equal to 1, the data corresponding to the task table to be processed can be divided into n data blocks.
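The division in step S102 can be sketched as a ceiling division. The function name and the handling of a data volume that is not an exact multiple of the single data processing capacity (the last block is simply smaller) are assumptions made for illustration; the embodiment only describes the exact-multiple case.

```python
import math


def split_into_blocks(total_mb, capacity_mb):
    """Return the (start_mb, end_mb) offsets of each data block.

    total_mb    -- data volume of the task table to be processed
    capacity_mb -- single data processing capacity
    """
    if total_mb <= 0 or capacity_mb <= 0:
        raise ValueError("both sizes must be positive")
    n = math.ceil(total_mb / capacity_mb)  # number of data blocks
    return [
        (i * capacity_mb, min((i + 1) * capacity_mb, total_mb))
        for i in range(n)
    ]


exact = split_into_blocks(90, 30)    # 30M x 3, so three full blocks
uneven = split_into_blocks(100, 30)  # the fourth block holds the remainder
```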
Step S103, generating a pseudo-column rowid of one or more data blocks, and reading the target data table according to the rowid.
The rowid is a pseudo column that uniquely identifies a row in a table. It is the internal address of the row data in the physical table and contains two addresses: one points to the data file that stores the block containing the row, and the other locates the row itself within that data block. Generally, the rowid comprises a data table number, a file number, a block number and a row number.
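To make the four components concrete, the sketch below packs them into an 18-character string following the commonly documented layout of Oracle's extended rowid: 6 characters for the data object number, 3 for the file number, 6 for the block number and 3 for the row number, in a base-64 alphabet of A-Z, a-z, 0-9, + and /. It is a stand-alone illustration written for this description, not the database's own routine, and the function names are hypothetical.

```python
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)
WIDTHS = (6, 3, 6, 3)  # object, file, block, row


def _encode(value, width):
    """Encode an integer as `width` base-64 characters, most significant first."""
    chars = []
    for _ in range(width):
        chars.append(ALPHABET[value & 0x3F])
        value >>= 6
    return "".join(reversed(chars))


def make_rowid(object_no, file_no, block_no, row_no):
    """Concatenate the four encoded components into one rowid string."""
    parts = (object_no, file_no, block_no, row_no)
    return "".join(_encode(v, w) for v, w in zip(parts, WIDTHS))


def parse_rowid(rowid):
    """Split a rowid string back into (object_no, file_no, block_no, row_no)."""
    values, pos = [], 0
    for width in WIDTHS:
        number = 0
        for ch in rowid[pos:pos + width]:
            number = number * 64 + ALPHABET.index(ch)
        values.append(number)
        pos += width
    return tuple(values)


sample = make_rowid(73196, 4, 151, 0)
```

Because the components are ordered from the table down to the row, a range of rowids corresponds to a contiguous physical region, which is what the starting and ending rowids of a data block rely on.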
The step of generating a pseudo-column rowid of one or more of the data blocks and reading the target data table according to the rowid comprises:
Step S103a, generating a rowid of each data block according to the data table number, the file number, the block number and the row number, wherein the rowid comprises a starting rowid and an ending rowid;
it can be understood that, for a task table to be processed with multiple data blocks, the ending rowid of a certain data block is the starting rowid of the next data block, and similarly, the starting rowid of a certain data block is the ending rowid of the previous data block.
Step S103b, locating the target data block according to the starting rowid and the ending rowid, and sequentially reading the one or more corresponding target data blocks until the target data table has been read. Specifically, if there is only one target data block, it is located according to its starting rowid and ending rowid, and the corresponding data is read. If there are multiple target data blocks, they are read in sequence according to their serial numbers and/or order until all the data blocks have been read successfully.
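Reading by starting and ending rowid can be sketched as below. Python's built-in `sqlite3` module is used purely as a stand-in so that the example runs without an Oracle instance: SQLite's integer `rowid` is much simpler than Oracle's, and the table name `target`, the column `val` and the range values are hypothetical. Each range is treated as half-open, so the ending rowid of one block is the starting rowid of the next and no row is read twice; that interpretation is an assumption.

```python
import sqlite3


def read_in_chunks(conn, ranges):
    """Yield the rows of each half-open rowid range [start, end), in order."""
    query = (
        "SELECT rowid, val FROM target "
        "WHERE rowid >= ? AND rowid < ? ORDER BY rowid"
    )
    for start, end in ranges:
        yield conn.execute(query, (start, end)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (val TEXT)")
conn.executemany(
    "INSERT INTO target (val) VALUES (?)",
    [(f"row{i}",) for i in range(1, 11)],  # rowids 1..10
)

# The end of each range is the start of the next one.
ranges = [(1, 5), (5, 9), (9, 11)]
chunks = list(read_in_chunks(conn, ranges))
```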
Further, the step of locating the target data block according to the starting rowid and the ending rowid further comprises:
performing a locking operation on the target data in the target data block, and releasing the lock after the target data has been read.
When reading the target data block, locking is required to block other operations during reading. Generally, the shorter the locking time, the smaller the impact on the overall service. The smaller the data block, the shorter the processing time; because the lock is released immediately after processing is completed, the corresponding locking time is also shorter.
A row lock can be used to prevent two services from modifying the same row of data. When a service modifies a row, the database always places an exclusive lock on the modified row so that other services cannot modify it, and the database releases the lock only after the service performs a commit (Commit) or rollback (Roll Back) operation. A row lock is a small-granularity lock, which allows applications to obtain data in parallel to the greatest extent.
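The effect of committing after every data block, so that locks are held only for the duration of one small block, can be sketched as follows. SQLite is again only a stand-in: it locks at database level rather than row level, so the example shows the commit-per-chunk pattern and not Oracle's row-lock behaviour. The table `target`, the column `val` and the upper-casing edit are hypothetical.

```python
import sqlite3


def edit_in_chunks(conn, ranges):
    """Upper-case `val` one rowid range at a time, committing after each range."""
    edited = []
    for start, end in ranges:
        try:
            cur = conn.execute(
                "UPDATE target SET val = upper(val) "
                "WHERE rowid >= ? AND rowid < ?",
                (start, end),
            )
            conn.commit()    # ends the transaction and releases its locks
            edited.append(cur.rowcount)
        except sqlite3.Error:
            conn.rollback()  # undo only the chunk that was in progress
            raise
    return edited


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (val TEXT)")
conn.executemany(
    "INSERT INTO target (val) VALUES (?)",
    [("a",), ("b",), ("c",), ("d",), ("e",)],
)
conn.commit()

edited = edit_in_chunks(conn, [(1, 4), (4, 6)])
values = [row[0] for row in conn.execute("SELECT val FROM target ORDER BY rowid")]
```

If a chunk fails, only that chunk is rolled back; the chunks committed before it keep their changes.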
Further, the step of generating a pseudo column rowid of one or more data blocks and reading the target data table according to the rowid further includes:
performing a data editing operation according to the data of the rowid in the target data table.
After the rowid is used for positioning, a data editing operation can be performed at the located position. The editing operations include Insert, Update, Delete, Merge and the like. For example, the command: insert into the website (1, null) can insert the relevant data. As another example, data may be deleted by creating a temporary table.
According to the scheme, the target data table is obtained, the physical address information of the target data table is extracted, and the task table to be processed is generated according to the physical address information; determining single data processing capacity according to preset service requirements, and dividing the task table to be processed into one or more data blocks according to the single data processing capacity; and generating a pseudo column rowid of one or more data blocks, and reading the target data table according to the rowid. According to the invention, based on the big data, the target data table is divided into a plurality of data blocks and then data reading is carried out, so that the data processing efficiency is improved.
As shown in FIG. 3, a second embodiment of the present invention provides a data reading method. Based on the first embodiment shown in FIG. 2, before the step of acquiring a target data table, extracting physical address information of the target data table, and generating a task table to be processed according to the physical address information, the method further includes:
Step S1001, judging whether the data volume in the target data table exceeds a first threshold;
The attributes of the data corresponding to the target data table are checked to obtain the data volume, where the data volume refers to the space occupied by the data.
It is understood that when the amount of data is too large, a single read takes a longer time, so a concurrency mechanism may be provided to save time and improve efficiency. The first threshold may be specifically set according to hardware devices, preset time requirements, system concurrency performance, and the like, for example, the first threshold is set to 300M.
Step S1002, if the data volume is larger than the first threshold, setting the number of concurrent processes according to the data volume;
If the data volume exceeds the first threshold, the concurrency mechanism needs to be activated. Under the concurrency mechanism, the larger the data volume, the more concurrent processes are used. For example, when the data volume is greater than the first threshold and less than a second threshold, the number of concurrent processes is set to a first process number; when the data volume is greater than or equal to the second threshold and less than a third threshold, the number of concurrent processes is set to a second process number, wherein the first threshold is less than the second threshold, the second threshold is less than the third threshold, and the first process number is less than the second process number.
It is understood that if the data volume is less than or equal to the first threshold, the data can be read by a single process without setting concurrent processes.
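The tiered rule of steps S1001 and S1002 can be sketched as a lookup over ascending thresholds. The threshold values, the process numbers and the behaviour at or above the third threshold (the last tier's process number is used) are hypothetical choices for illustration; the embodiment leaves them to hardware, time requirements and system concurrency performance.

```python
# (lower bound in MB, number of concurrent processes), in ascending order.
# The first entry's lower bound is the first threshold. All values are
# hypothetical.
TIERS = [(300, 2), (600, 4), (1200, 8)]


def process_count(data_mb, tiers=TIERS):
    """Return the number of concurrent processes for a given data volume."""
    first_threshold, count = tiers[0]
    if data_mb <= first_threshold:
        return 1  # a single process is enough
    for lower_bound, processes in tiers[1:]:
        if data_mb >= lower_bound:
            count = processes
    return count


counts = [process_count(mb) for mb in (300, 301, 599, 600, 5000)]
```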
Step S1003, dividing the target data table into a corresponding number of sub target data tables according to the number of processes.
Specifically, the target data table is split into sub-target data tables of which the number corresponds to the number of the processes. Whereby data in the target data table may be read simultaneously by concurrent processes.
For example, the first threshold is set to 100M, and if the data amount is 1000M, a concurrent process needs to be set. And then, setting the number of concurrent processes according to a concurrent mechanism, for example, if the number of concurrent processes is 10 according to the system concurrent processing capacity, splitting the target data table into 10 sub-target data tables. Therefore, the 1000M data is read by 10 processes simultaneously, the data reading time is greatly shortened, and the data processing efficiency is improved.
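The 1000M example can be sketched as below. The sketch makes several assumptions: the target data table is represented as a list of data block sizes, `read_group` is a placeholder that only sums sizes instead of reading data, and a thread pool stands in for the concurrent processes so that the example stays short and portable.

```python
from concurrent.futures import ThreadPoolExecutor


def split_evenly(items, parts):
    """Split items into `parts` contiguous groups whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    groups, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        groups.append(items[start:start + size])
        start += size
    return groups


def read_group(group):
    """Placeholder for reading one sub target data table; returns MB 'read'."""
    return sum(group)


def read_concurrently(block_sizes_mb, workers):
    """Split the blocks into one group per worker and read the groups in parallel."""
    groups = split_evenly(block_sizes_mb, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(read_group, groups))


# 100 blocks of 10M each (1000M in total), read by 10 workers.
results = read_concurrently([10] * 100, 10)
```

`pool.map` returns the results in the order of the groups, so each entry of `results` is the volume handled by the corresponding sub target data table.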
According to the scheme, whether the data volume in the target data table exceeds a first threshold value is judged; if the data volume is larger than the first threshold value, setting the process number of the concurrent process according to the data volume; and dividing the target data table into sub target data tables with corresponding quantity according to the process number. The method and the device split the target data table based on the big data and then read the data, and improve the data processing efficiency based on a concurrent processing mechanism.
As shown in FIG. 4, a third embodiment of the present invention provides a data reading method. Based on the first embodiment shown in FIG. 2, after the step of generating a pseudo column rowid of one or more data blocks and reading the target data table according to the rowid, the method further includes:
Step S104, acquiring a state log, and acquiring an abnormal data block through the state log;
When the system reads data, a status log is generated, which records information such as the data reading object, the database, the reading time and the reading completion progress. After the status log is obtained, any data block that has not been completely read is identified from the reading completion progress and marked as an abnormal data block.
It is understood that the abnormal data block also includes data blocks that are unreadable due to data corruption.
Step S105, acquiring the rowid of the abnormal data block, marking the rowid as the abnormal rowid, and re-reading the data block at the abnormal rowid and one or more data blocks after the abnormal rowid.
The rowid of the abnormal data block is acquired and marked as the abnormal rowid, and the abnormal data block is read with the starting rowid of the abnormal rowid as the starting point.
Generally, if a data block is not read successfully, the system automatically skips it and ends the corresponding reading task, so the data blocks after the abnormal data block are not read and data is omitted. If the data is read through a cursor, the whole database needs to be scanned again after an exception occurs, so resuming from a breakpoint is costly. In this embodiment, after the abnormal data block has been read successfully, the other unread data blocks after it continue to be read. Moreover, if another exception terminates data reading during processing, only the data block currently being read and the data blocks not yet read are affected; the data that was read successfully has already been committed and is not affected.
If the abnormal data block has been read multiple times and its data still cannot be read completely, an alarm prompt is output so that the abnormal data block can be checked.
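Steps S104 and S105 can be sketched as follows. The status log format (a list of dictionaries with a `completed` flag), the retry limit of three attempts and the choice to stop at the first block that still fails are all assumptions for illustration; the embodiment only states that an alarm is output after multiple unsuccessful reads.

```python
def find_resume_index(status_log):
    """Return the index of the first block whose read did not complete, or None."""
    for index, entry in enumerate(status_log):
        if not entry["completed"]:
            return index
    return None


def resume_reading(blocks, status_log, read_block, max_attempts=3):
    """Re-read the abnormal block and the blocks after it.

    Returns (reread, alarm): the blocks read successfully during the resume,
    and the block that still failed after max_attempts (or None).
    """
    start = find_resume_index(status_log)
    reread, alarm = [], None
    if start is None:
        return reread, alarm  # nothing was left unfinished
    for block in blocks[start:]:
        if any(read_block(block) for _ in range(max_attempts)):
            reread.append(block)
        else:
            alarm = block     # report this block for manual checking
            break
    return reread, alarm


blocks = ["b1", "b2", "b3", "b4"]
log = [
    {"block": "b1", "completed": True},
    {"block": "b2", "completed": False},
    {"block": "b3", "completed": False},
    {"block": "b4", "completed": False},
]

attempts = {}


def flaky_read(block):
    """Simulated reader: the first attempt on b2 fails, everything else succeeds."""
    attempts[block] = attempts.get(block, 0) + 1
    return not (block == "b2" and attempts[block] == 1)


reread, alarm = resume_reading(blocks, log, flaky_read)
```

Blocks that the log marks as completed are never touched again, which is the saving over a cursor-based restart.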
According to the scheme, the state log is obtained and the abnormal data block is identified from it; the rowid of the abnormal data block is acquired and marked as the abnormal rowid, and the data block at the abnormal rowid and the one or more data blocks after it are re-read. Therefore, when data reading is abnormal, the data that has already been read does not need to be read again, and data processing efficiency is improved.
In addition, the embodiment also provides a data reading apparatus. Referring to FIG. 5, FIG. 5 is a functional block diagram of a data reading apparatus according to a first embodiment of the present invention.
In this embodiment, the data reading apparatus is a virtual apparatus stored in the memory 1005 of the data reading device shown in FIG. 1, and it implements all functions of the data reading program: acquiring a target data table, extracting physical address information of the target data table, and generating a task table to be processed according to the physical address information; determining single data processing capacity according to preset service requirements, and dividing the task table to be processed into one or more data blocks according to the single data processing capacity; and generating a pseudo column rowid of one or more data blocks, and reading the target data table according to the rowid.
Specifically, the data reading apparatus includes:
the acquisition module 10 is configured to acquire a target data table, extract physical address information of the target data table, and generate a to-be-processed task table according to the physical address information;
the segmentation module 20 is configured to determine a single data processing capacity according to a preset service requirement, and segment the to-be-processed task table into one or more data blocks according to the single data processing capacity;
and the reading module 30 is configured to generate a pseudo-column rowid of one or more data blocks, and read the target data table according to the rowid.
Further, the obtaining module is further configured to:
judging whether the data quantity in the target data table exceeds a first threshold value or not;
if the data volume is larger than the first threshold value, setting the process number of the concurrent process according to the data volume;
and dividing the target data table into sub target data tables with corresponding quantity according to the process number.
Further, the reading module is further configured to:
acquiring a state log, and acquiring an abnormal data block through the state log;
and acquiring the rowid of the abnormal data block, marking the rowid as the abnormal rowid, and re-reading the data block at the abnormal rowid and one or more data blocks after the abnormal rowid.
Further, the acquisition module 10 is further configured to:
acquire the target data table from a system database, and extract, by the system, physical address information of the target data table, wherein the physical address information comprises the extents and attribute information of the target data table;
and take each extent in the physical address information as an independent task, and generate a to-be-processed task table according to the attribute information corresponding to each independent task.
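The generation of the to-be-processed task table can be sketched as follows, with one task per extent. The `Extent` fields and the task dictionary keys are assumptions chosen for illustration; the embodiment only states that each extent becomes an independent task described by its attribute information.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Extent:
    """Hypothetical attribute information for one extent of the target table."""
    object_id: int
    file_id: int
    first_block: int
    block_count: int


def build_task_table(extents: Iterable[Extent]) -> List[Dict[str, object]]:
    """Turn each extent into one independent pending task, in physical order."""
    ordered = sorted(extents, key=lambda e: (e.file_id, e.first_block))
    tasks = []
    for task_id, extent in enumerate(ordered):
        tasks.append({
            "task_id": task_id,
            "object_id": extent.object_id,
            "file_id": extent.file_id,
            "first_block": extent.first_block,
            "last_block": extent.first_block + extent.block_count - 1,
            "status": "pending",
        })
    return tasks
```

Sorting by file and block keeps the tasks in physical order, which is a design choice of this sketch rather than a requirement stated in the text.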
Further, the reading module 30 is further configured to:
generate a rowid of each data block according to the data table number, the file number, the block number and the row number, wherein the rowid comprises a starting rowid and an ending rowid;
and locate a target data block according to the starting rowid and the ending rowid, and sequentially read the one or more corresponding target data blocks until the target data table has been read.
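The starting and ending rowid of a data block can be illustrated with the sketch below. It models a rowid as a plain tuple of the four components named above instead of a database-specific encoded string, and the upper bound `MAX_ROW_NO` is an assumed value; both are simplifications for illustration.

```python
from typing import NamedTuple, Tuple


class RowId(NamedTuple):
    """Simplified rowid: tuples compare component by component, left to right."""
    object_no: int   # data table number
    file_no: int     # file number
    block_no: int    # block number
    row_no: int      # row number


MAX_ROW_NO = 32767  # assumed upper bound on the row number within a block


def rowid_range(object_no: int, file_no: int,
                first_block: int, block_count: int) -> Tuple[RowId, RowId]:
    """Return the (starting rowid, ending rowid) covering a run of blocks."""
    start = RowId(object_no, file_no, first_block, 0)
    end = RowId(object_no, file_no, first_block + block_count - 1, MAX_ROW_NO)
    return start, end


def in_range(rowid: RowId, start: RowId, end: RowId) -> bool:
    """Check whether a row belongs to the data block bounded by start and end."""
    return start <= rowid <= end
```

Because every row in the block sorts between the two bounds, a reader can locate the whole data block with a single range comparison on the rowid.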
Further, the reading module 30 is further configured to:
lock the target data in the target data block, and release the lock after the target data has been read.
Further, the reading module 30 is further configured to:
perform a data editing operation on the data identified by the rowid in the target data table.
In addition, an embodiment of the present invention further provides a computer storage medium, where a data reading program is stored on the computer storage medium, and when the data reading program is executed by a processor, the steps of the data reading method are implemented, which are not described herein again.
Compared with the prior art, the data reading method, device, equipment and storage medium provided by the invention comprise the following steps: acquiring a target data table, extracting physical address information of the target data table, and generating a to-be-processed task table according to the physical address information; determining a single data processing capacity according to preset service requirements, and dividing the to-be-processed task table into one or more data blocks according to the single data processing capacity; and generating a pseudo-column rowid of the one or more data blocks, and reading the target data table according to the rowid. For large data volumes, the invention divides the target data table into a plurality of data blocks before reading the data, which improves data processing efficiency.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for causing a terminal device to execute the method according to the embodiments of the present invention.
The above description is only for the preferred embodiment of the present invention and is not intended to limit the scope of the present invention, and all equivalent structures or flow transformations made by the present specification and drawings, or applied directly or indirectly to other related arts, are included in the scope of the present invention.

Claims (10)

1. A method of reading data, the method comprising:
acquiring a target data table, extracting physical address information of the target data table, and generating a task table to be processed according to the physical address information;
determining single data processing capacity according to preset service requirements, and dividing the task table to be processed into one or more data blocks according to the single data processing capacity;
and generating a pseudo column rowid of one or more data blocks, and reading the target data table according to the rowid.
2. The method according to claim 1, wherein the step of obtaining the target data table, extracting physical address information of the target data table, and generating the to-be-processed task table according to the physical address information further comprises:
determining whether the data volume of the target data table exceeds a first threshold;
if the data volume is larger than the first threshold, setting the number of concurrent processes according to the data volume;
and dividing the target data table into a corresponding number of sub target data tables according to the number of processes.
3. The method of claim 1, wherein after the step of generating a pseudo-column rowid of one or more data blocks and reading the target data table according to the rowid, the method further comprises:
acquiring a state log, and identifying an abnormal data block through the state log;
and acquiring the rowid of the abnormal data block, marking the rowid as an abnormal rowid, and re-reading the data block at the abnormal rowid and the one or more data blocks after the abnormal rowid.
4. The method according to claim 1, wherein the step of obtaining a target data table, extracting physical address information of the target data table, and generating a to-be-processed task table according to the physical address information comprises:
acquiring the target data table from a system database, and extracting, by the system, physical address information of the target data table, wherein the physical address information comprises the extents and attribute information of the target data table;
and taking each extent in the physical address information as an independent task, and generating a to-be-processed task table according to the attribute information corresponding to each independent task.
5. The method of claim 1, wherein the step of generating a pseudo-column rowid of one or more data blocks and reading the target data table according to the rowid comprises:
generating a rowid of each data block according to the data table number, the file number, the block number and the row number, wherein the rowid comprises a starting rowid and an ending rowid;
and locating a target data block according to the starting rowid and the ending rowid, and sequentially reading the one or more corresponding target data blocks until the target data table has been read.
6. The method of claim 5, wherein after the step of locating a target data block according to the starting rowid and the ending rowid, the method further comprises:
locking the target data in the target data block, and releasing the lock after the target data has been read.
7. The method of claim 1, wherein after the step of generating a pseudo-column rowid of one or more data blocks and reading the target data table according to the rowid, the method further comprises:
performing a data editing operation on the data identified by the rowid in the target data table.
8. A data reading apparatus, characterized in that the data reading apparatus comprises:
the acquisition module is used for acquiring a target data table, extracting physical address information of the target data table and generating a task table to be processed according to the physical address information;
the segmentation module is used for determining single data processing capacity according to preset service requirements and segmenting the task table to be processed into one or more data blocks according to the single data processing capacity;
and the reading module is used for generating a pseudo-column rowid of one or more data blocks and reading the target data table according to the rowid.
9. A data reading device, characterized in that the data reading device comprises a processor, a memory and a data reading program stored in the memory, which data reading program, when executed by the processor, carries out the steps of the data reading method according to any one of claims 1 to 7.
10. A computer storage medium, having a data reading program stored thereon, the data reading program, when executed by a processor, implementing the steps of the data reading method according to any one of claims 1-7.
CN202010128291.8A 2020-02-28 2020-02-28 Data reading method, device, equipment and storage medium Active CN111414362B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010128291.8A CN111414362B (en) 2020-02-28 2020-02-28 Data reading method, device, equipment and storage medium
PCT/CN2020/136303 WO2021169496A1 (en) 2020-02-28 2020-12-15 Data reading method, apparatus, and device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010128291.8A CN111414362B (en) 2020-02-28 2020-02-28 Data reading method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111414362A (en) 2020-07-14
CN111414362B (en) 2023-11-10

Family

ID=71491095

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010128291.8A Active CN111414362B (en) 2020-02-28 2020-02-28 Data reading method, device, equipment and storage medium

Country Status (2)

Country Link
CN (1) CN111414362B (en)
WO (1) WO2021169496A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112035481A (en) * 2020-08-31 2020-12-04 中国平安财产保险股份有限公司 Data processing method, data processing device, computer equipment and storage medium
CN113190281A (en) * 2021-04-08 2021-07-30 武汉达梦数据库股份有限公司 ROWID interval-based initialization loading method and device
WO2021169496A1 (en) * 2020-02-28 2021-09-02 平安科技(深圳)有限公司 Data reading method, apparatus, and device, and storage medium
CN114385260A (en) * 2021-12-15 2022-04-22 武汉达梦数据库股份有限公司 ROWID interval-based initialization loading method and equipment
CN114546942A (en) * 2022-01-28 2022-05-27 苏州浪潮智能科技有限公司 Database data reading method and device, terminal and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116088771B (en) * 2023-04-06 2023-06-16 国网浙江省电力有限公司营销服务中心 Multi-level storage method and system for worksheet data based on energy Internet

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103246549A (en) * 2012-02-07 2013-08-14 阿里巴巴集团控股有限公司 Method and system for data transfer
CN107608773A (en) * 2017-08-24 2018-01-19 阿里巴巴集团控股有限公司 task concurrent processing method, device and computing device
CN110019004A (en) * 2017-09-08 2019-07-16 华为技术有限公司 A kind of data processing method, apparatus and system
CN110263057A (en) * 2019-06-12 2019-09-20 上海英方软件股份有限公司 A kind of storage/the querying method and device of ROWID mapping table
US20190294602A1 (en) * 2017-01-09 2019-09-26 Tencent Technology (Shenzhen) Company Limited Data scrubbing method and apparatus, and computer readable storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10353634B1 (en) * 2016-03-28 2019-07-16 Amazon Technologies, Inc. Storage tier-based volume placement
GB2566514B (en) * 2017-09-15 2020-01-08 Imagination Tech Ltd Resource allocation
CN110532799B (en) * 2019-07-31 2023-03-24 平安科技(深圳)有限公司 Data desensitization control method, electronic device and computer readable storage medium
CN111414362B (en) * 2020-02-28 2023-11-10 平安科技(深圳)有限公司 Data reading method, device, equipment and storage medium

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021169496A1 (en) * 2020-02-28 2021-09-02 平安科技(深圳)有限公司 Data reading method, apparatus, and device, and storage medium
CN112035481A (en) * 2020-08-31 2020-12-04 中国平安财产保险股份有限公司 Data processing method, data processing device, computer equipment and storage medium
CN112035481B (en) * 2020-08-31 2023-10-27 中国平安财产保险股份有限公司 Data processing method, device, computer equipment and storage medium
CN113190281A (en) * 2021-04-08 2021-07-30 武汉达梦数据库股份有限公司 ROWID interval-based initialization loading method and device
CN113190281B (en) * 2021-04-08 2022-05-17 武汉达梦数据库股份有限公司 ROWID interval-based initialization loading method and device
CN114385260A (en) * 2021-12-15 2022-04-22 武汉达梦数据库股份有限公司 ROWID interval-based initialization loading method and equipment
CN114546942A (en) * 2022-01-28 2022-05-27 苏州浪潮智能科技有限公司 Database data reading method and device, terminal and storage medium
CN114546942B (en) * 2022-01-28 2024-01-19 苏州浪潮智能科技有限公司 Database data reading method, device, terminal and storage medium

Also Published As

Publication number Publication date
WO2021169496A1 (en) 2021-09-02
CN111414362B (en) 2023-11-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant