CN113419824A - Data processing method, device, system and computer storage medium - Google Patents


Info

Publication number
CN113419824A
CN113419824A (application CN202110097003.1A)
Authority
CN
China
Prior art keywords
data
target object
task
reading
blocks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110097003.1A
Other languages
Chinese (zh)
Inventor
张天雨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN202110097003.1A
Publication of CN113419824A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471Distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/5018Thread allocation

Abstract

The embodiments of the present application provide a data processing method, apparatus, system, and computer storage medium. The data processing method comprises the following steps: acquiring request information for a target object; generating at least two task blocks corresponding to the target object according to the request information; adding the at least two task blocks to a plurality of preset pre-read queues; and processing the pre-read queues with a plurality of independent threads, respectively, to acquire the data blocks of the target object in parallel. Because the task blocks are distributed across the preset pre-read queues, a plurality of data blocks of the target object can be read in parallel according to the task blocks when the corresponding data blocks are read, which improves read efficiency.

Description

Data processing method, device, system and computer storage medium
Technical Field
The embodiments of the present application relate to the field of electronic information technology, and in particular to a data processing method, a data processing apparatus, a data processing system, and a computer storage medium.
Background
In a distributed database, data is usually stored on remote node devices such as an Object Storage Service (OSS), and when a user's query request is received, a large amount of data needs to be read from those remote node devices. However, the data is looked up and read serially, which takes a long time, reduces overall query efficiency, and degrades the user's query experience.
Disclosure of Invention
In view of the above, embodiments of the present application provide a data processing method, apparatus, system, and computer storage medium to solve some or all of the above problems.
According to a first aspect of the embodiments of the present application, there is provided a data processing method, including: acquiring request information for a target object; generating at least two task blocks corresponding to the target object according to the request information, where the at least two task blocks indicate which data blocks of the target object to acquire; adding the at least two task blocks to a plurality of preset pre-read queues; and processing the pre-read queues with a plurality of independent threads, respectively, to acquire the data blocks of the target object in parallel, where the pre-read queues correspond one-to-one to the independent threads.
According to a second aspect of the embodiments of the present application, there is provided a data processing apparatus, including: an acquisition module for acquiring request information for a target object; a task block adaptation module for generating at least two task blocks corresponding to the target object according to the request information, the at least two task blocks indicating which data blocks of the target object to acquire; a pre-read module for adding the at least two task blocks to a plurality of preset pre-read queues; and a data block loading module for processing the pre-read queues with a plurality of independent threads, respectively, to acquire the data blocks of the target object in parallel, where the pre-read queues correspond one-to-one to the independent threads.
According to a third aspect of the embodiments of the present application, there is provided an electronic device, including a processor, a memory, a communication interface, and a communication bus, where the processor, the memory, and the communication interface communicate with one another through the communication bus; the memory stores at least one executable instruction, which causes the processor to perform the operations of the data processing method of the first aspect.
According to a fourth aspect of the embodiments of the present application, there is provided a database system, including an electronic device and at least two node devices, where the electronic device is the data processing apparatus of the second aspect or the electronic device of the third aspect.
According to a fifth aspect of embodiments of the present application, there is provided a computer storage medium having stored thereon a computer program which, when executed by a processor, implements the data processing method of the first aspect.
The data processing method, apparatus, system, and computer storage medium provided by the embodiments of the present application acquire request information for a target object; generate at least two task blocks corresponding to the target object according to the request information; add the at least two task blocks to a plurality of preset pre-read queues; and process the pre-read queues with a plurality of independent threads, respectively, acquiring the data blocks of the target object in parallel. Because the task blocks are distributed across the preset pre-read queues, a plurality of data blocks of the target object can be read in parallel according to the task blocks, which improves read efficiency.
Drawings
To illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some of the embodiments of the present application, and those skilled in the art can derive other drawings from them.
Fig. 1 is a schematic view of a data processing method according to an embodiment of the present application;
fig. 1A is a schematic diagram of an object storage according to an embodiment of the present application;
fig. 2 is a flowchart of a data processing method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an asynchronous multi-thread parallel read effect according to an embodiment of the present disclosure;
FIG. 4 is a block diagram of an object store architecture according to an embodiment of the present disclosure;
fig. 5 is a block diagram of a data processing apparatus according to a second embodiment of the present application;
fig. 6 is a schematic structural diagram of an electronic device according to a third embodiment of the present application;
fig. 7 is a schematic structural diagram of a database system according to a fourth embodiment of the present application.
Detailed Description
To help those skilled in the art better understand the technical solutions in the embodiments of the present application, the technical solutions are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application fall within the scope of protection of the embodiments of the present application.
The following further describes specific implementations of embodiments of the present application with reference to the drawings of the embodiments of the present application.
Example one
The embodiment of the present application provides a data processing method applied to a data processing apparatus; the data processing apparatus may be a network device such as a server. For ease of understanding, an application scenario of the data processing method provided in the first embodiment is described first. Fig. 1 is a scenario diagram of the data processing method provided in the first embodiment of the present application. The scenario shown in fig. 1 comprises a data processing apparatus 101, at least one data storage apparatus 102, and a client device 103.
The data processing apparatus 101 and the data storage apparatus 102 may be cloud devices such as servers, relay devices, or end devices, and they may be integrated in one device or deployed as two separate devices. For example, the data processing apparatus 101 and the data storage apparatus 102 may be two processors in one device; as another example, they may be two servers in one cabinet; as another example, they may be two separate servers. Of course, the server is used here only as an example. It should be noted that the data processing apparatus 101 and the at least one data storage apparatus 102 may form a distributed system, for example a distributed database, in which the data processing apparatus 101 acts as a coordinator: when a transaction is started, the data processing apparatus 101 sends a message to the at least one data storage apparatus 102 participating in the transaction, so that the at least one data storage apparatus 102 processes the transaction. Taking an Object Storage Service (OSS) as an example, fig. 1A is a schematic diagram of object storage according to an embodiment of the present disclosure.
The data storage apparatus 102 stores a plurality of log blocks, and each log block stores a plurality of data blocks. One data block may hold part of the data of one object, and one log block may hold parts of several objects. An object may include metadata, index data, and raw data. When the client device 103 requests to read a target object, the data processing apparatus 101 needs to load the metadata and the index data of the target object from the data storage apparatus 102 according to the request information, perform a lookup according to the metadata and the index data, and then load the raw data of the target object from the data storage apparatus 102. Of course, this is merely illustrative.
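The storage layout just described can be sketched as a minimal, hypothetical Python model (the class and field names here are illustrative only and are not part of the application):

```python
from dataclasses import dataclass, field

@dataclass
class DataBlock:
    object_id: str
    kind: str          # "meta" | "index" | "raw"
    payload: bytes

@dataclass
class LogBlock:
    blocks: list = field(default_factory=list)

    def add(self, block: DataBlock) -> None:
        self.blocks.append(block)

    def blocks_for(self, object_id: str) -> list:
        # One log block may contain parts of several different objects.
        return [b for b in self.blocks if b.object_id == object_id]

log = LogBlock()
log.add(DataBlock("obj-1", "meta", b"size=42"))
log.add(DataBlock("obj-2", "raw", b"..."))
log.add(DataBlock("obj-1", "index", b"blk@0"))

print(len(log.blocks_for("obj-1")))  # prints 2: two parts of obj-1 in this log block
```

The point of the model is only that reading one object may touch several blocks of different kinds, possibly interleaved with blocks of other objects.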
The client device 103 may be a terminal device such as a smart phone, a tablet computer, or a notebook computer, or a cloud device such as a server. It should be noted that the client device 103 may also access a network, connect to the cloud data processing apparatus 101 through the network, and exchange data with it. In the present application, the network includes local area networks (LAN), wide area networks (WAN), and mobile communication networks, such as the World Wide Web (WWW), Long Term Evolution (LTE) networks, 2G networks (2nd generation mobile network), 3G networks (3rd generation mobile network), and 5G networks (5th generation mobile network). Of course, this is merely an example and does not limit the present application. The cloud may include various devices connected over a network, such as servers, relay devices, and Device-to-Device (D2D) devices.
As shown in fig. 1, the data processing apparatus 101 receives request information sent by the client device 103, the request information including request information for a target object; the data processing apparatus 101 then reads the data blocks of the target object from the plurality of data storage apparatuses 102 according to the request information. Specifically, at least two task blocks are generated according to the request information for the target object, and the at least two task blocks are added to a plurality of preset pre-read queues; each pre-read queue can independently perform data reading according to the task blocks it holds, so the data processing apparatus 101 can execute the task blocks of multiple pre-read queues at a time and read multiple data blocks of the target object in parallel.
With reference to the scenario shown in fig. 1, the data processing method provided in the first embodiment of the present application is described in detail below. It should be noted that fig. 1 shows only one application scenario of this method; the method need not be applied to the scenario shown in fig. 1. Referring to fig. 2, fig. 2 is a flowchart of the data processing method provided in the first embodiment of the present application, and the method includes the following steps:
Step 201: acquire request information for the target object.
It should be noted that an object may be a storage object in an object storage (OSS) system and may be a complete file; the target object is any such object. An object may include metadata (Meta), index data (Index), and raw data (Data). The metadata may indicate the file size, modification time, storage path, and so on of the object; the index data may indicate where the data blocks of the object are stored in the OSS system; the raw data contains the data content of the object. The request information may include description information of the target object. The description information of one object may include the name or identifier (ID) of its metadata, the name or ID of its index data, and the name or ID of its raw data; the description information of multiple objects may form lists: a list of metadata, a list of index data, and a list of raw data. Of course, this is merely an example.
Step 202: generate at least two task blocks corresponding to the target object according to the request information.
It should be noted that the at least two task blocks indicate which data blocks of the target object to acquire, and one task block can execute one read task. Following the description in step 201, and taking a target object that includes metadata, index data, and raw data as an example, the request information for the target object may be split and merged to obtain the at least two task blocks.
Optionally, generating at least two task blocks corresponding to the target object according to the request information includes: splitting and merging the request information for the target object to obtain at least two task blocks of a preset size. The size of a task block may be set in advance, i.e., a preset size. For example, the request information for the target object may include description information of the metadata, description information of the index data, and description information of the raw data. Splitting and merging this request information into at least two task blocks of a preset size then includes: splitting the description information of the metadata and the description information of the index data to obtain task blocks of the metadata and the index data of the preset size; and merging the description information of the raw data to obtain task blocks of the raw data of the preset size. The preset size may be 1 KB, 128 KB, 1024 KB, and so on, and task blocks may be managed in multiple tiers of different sizes. Because the description information of the metadata, the index data, and the raw data is often irregular in size, the task blocks can be aligned to the preset sizes of 1 KB, 128 KB, and 1024 KB. Setting task blocks to a preset size facilitates management and the subsequent concurrent execution of read tasks.
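The split step above can be illustrated with a small sketch that cuts an irregular byte range into task blocks aligned to one preset tier. This is an assumption-laden illustration, not the patented implementation: the 128 KB tier is chosen arbitrarily from the tiers the text mentions, and the function name is made up.

```python
PRESET_SIZE = 128 * 1024  # one of the preset tiers mentioned in the text

def split_into_task_blocks(offset: int, length: int, size: int = PRESET_SIZE):
    """Return aligned (start, end) task ranges covering [offset, offset + length)."""
    end = offset + length
    start = (offset // size) * size   # align down to a block boundary
    tasks = []
    while start < end:
        # Full aligned blocks are emitted; a slight over-read at the edges
        # is acceptable because the extra bytes land in the block cache.
        tasks.append((start, start + size))
        start += size
    return tasks

tasks = split_into_task_blocks(offset=100_000, length=300_000)
print(len(tasks))  # prints 4: four aligned 128 KB task blocks
```

Aligning every task block to a fixed tier is what makes the blocks uniform enough to queue and execute concurrently.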
Optionally, in another embodiment, the target object may first be looked up in a cache; if found, the result is returned directly, and if not found, it is read from the OSS system. Specifically, generating at least two task blocks corresponding to the target object according to the request information includes: looking up the target object in the cache according to the request information; and when the target object is not found in the cache, generating the at least two task blocks according to the request information for the target object.
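The cache-first path can be sketched as follows; `generate_task_blocks` is a hypothetical stand-in for the split/merge step above, and every name here is illustrative rather than taken from the application.

```python
cache = {}

def generate_task_blocks(request):
    # Placeholder: one task per declared part of the object (meta/index/raw).
    return [(request["object"], part) for part in ("meta", "index", "raw")]

def handle_request(request):
    obj = cache.get(request["object"])
    if obj is not None:
        return obj                         # cache hit: return directly
    return generate_task_blocks(request)   # miss: schedule reads from OSS

cache["hot-object"] = b"cached bytes"
print(handle_request({"object": "hot-object"}))       # prints b'cached bytes'
print(len(handle_request({"object": "cold-object"})))  # prints 3
```

Task blocks are generated only on a cache miss, so repeated reads of a hot object never touch the pre-read machinery.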
Step 203: add the at least two task blocks to a plurality of preset pre-read queues.
It should be noted that the pre-read queues cache task blocks so that multiple task blocks can be processed in parallel, improving the efficiency of data acquisition. Two specific implementations are given here as examples:
Optionally, in one implementation, adding the at least two task blocks to the plurality of preset pre-read queues includes: adding at least one task block of metadata to the pre-read queues; after the metadata task blocks have been processed, adding at least one task block of index data to the pre-read queues; and after the index data task blocks have been processed, adding at least one task block of raw data to the pre-read queues. The pre-read queues process data of the same type in parallel: the metadata is read in parallel first; after the metadata has been read, the index data is read in parallel; and after the index data has been read, the raw data is read in parallel, which improves data reading efficiency.
Optionally, in another implementation, the pre-read queues include a metadata pre-read queue, an index data pre-read queue, and a raw data pre-read queue. Adding the at least two task blocks to the plurality of preset pre-read queues then includes: adding at least one task block of metadata to the metadata pre-read queue; adding at least one task block of index data to the index data pre-read queue; and adding at least one task block of raw data to the raw data pre-read queue. Because data must be read in the order metadata, index data, raw data, separate pre-read queues are set for the three types so that each type is managed independently. Each type may have one or more queues: there may be one or more metadata pre-read queues, one or more index data pre-read queues, and one or more raw data pre-read queues. The three kinds of pre-read queues execute read tasks asynchronously and in parallel, so the three different types of data are read asynchronously and in parallel, which improves reading efficiency.
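The per-type queueing scheme just described can be sketched with standard thread-safe queues; the queue names are illustrative.

```python
import queue

# One pre-read queue per data type, so metadata, index data, and raw data
# can later be consumed by independent threads.
pre_read_queues = {
    "meta": queue.Queue(),
    "index": queue.Queue(),
    "raw": queue.Queue(),
}

def enqueue(task_kind: str, task) -> None:
    pre_read_queues[task_kind].put(task)

enqueue("meta", "read meta of obj-1")
enqueue("index", "read index of obj-1")
enqueue("raw", "read raw block 0 of obj-1")
enqueue("raw", "read raw block 1 of obj-1")

print(pre_read_queues["raw"].qsize())  # prints 2
```

Keeping the types in separate queues is what lets a raw-data read start as soon as its index entry is known, without waiting behind unrelated metadata tasks.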
Optionally, in an implementation, the method further includes: when the number of task blocks in a pre-read queue reaches a preset threshold, discarding newly generated task blocks. The preset threshold keeps the number of task blocks in the pre-read queue at or below the threshold, preventing query wait times from growing too long.
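The threshold rule maps naturally onto a bounded queue: a minimal sketch, assuming a small illustrative threshold (the real value is not specified in the text).

```python
import queue

THRESHOLD = 3  # illustrative preset threshold
pre_read_queue = queue.Queue(maxsize=THRESHOLD)

def submit(task) -> bool:
    """Queue a newly generated task block, or discard it when the queue is full."""
    try:
        pre_read_queue.put_nowait(task)
        return True
    except queue.Full:
        return False  # discard the newly generated task block

results = [submit(f"task-{i}") for i in range(5)]
print(results)  # prints [True, True, True, False, False]
```

Dropping rather than blocking bounds how long a query can wait on the pre-read path; a dropped task simply falls back to an on-demand read.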
Step 204: process the plurality of pre-read queues with a plurality of independent threads, respectively, and acquire the data blocks of the target object in parallel.
The plurality of pre-read queues correspond one-to-one to the plurality of independent threads; that is, one independent thread processes one pre-read queue, and the independent threads process their corresponding pre-read queues in parallel, thereby acquiring the data blocks of the target object in parallel. A node device in the present application is a data storage apparatus. A data block is the unit of data storage; generally, the data of one object (including metadata, index data, and raw data) is divided into a plurality of data blocks stored on a plurality of node devices. Three specific implementations are given here as examples:
Optionally, in a first implementation, processing the plurality of pre-read queues with a plurality of independent threads and acquiring the data blocks of the target object in parallel includes: sending the task blocks in the pre-read queues to a plurality of node devices using the independent threads; and receiving the data blocks of the target object transmitted by the node devices in parallel. The number of pre-read queues may be a preset number, and the number of independent threads is the same preset number, so that the preset number of task blocks can be sent to the node devices and processed at the same time, improving data reading efficiency.
Optionally, in a second implementation, taking a target object that includes metadata, index data, and raw data as an example, processing the plurality of pre-read queues with a plurality of independent threads and acquiring the data blocks of the target object in parallel includes: acquiring the data blocks of the metadata of the target object in parallel, using the independent threads corresponding to the pre-read queues; after the metadata has been acquired, acquiring the data blocks of the index data of the target object in parallel; and after the index data has been acquired, acquiring the data blocks of the raw data of the target object in parallel, where the number of task blocks processed in parallel at any one time does not exceed the preset number.
Optionally, in a third implementation, again taking a target object that includes metadata, index data, and raw data as an example, processing the plurality of pre-read queues with a plurality of independent threads and acquiring the data blocks of the target object in parallel includes: acquiring the metadata data blocks corresponding to the task blocks, using a first thread corresponding to the metadata pre-read queue; acquiring the index data blocks corresponding to the task blocks, using a second thread corresponding to the index data pre-read queue; and acquiring the raw data blocks corresponding to the task blocks, using a third thread corresponding to the raw data pre-read queue. While metadata is being read, the index data corresponding to already-read metadata can start being read; while index data is being read, the raw data corresponding to already-read index data can start being read. The three pre-read queues execute read tasks asynchronously and in parallel, so the three different types of data are read asynchronously and in parallel, which improves reading efficiency.
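The third implementation (one dedicated thread per queue type) can be sketched as follows. The "read" is simulated with a list append, and the sentinel-based shutdown is an assumption of this sketch, not part of the application.

```python
import queue
import threading

queues = {k: queue.Queue() for k in ("meta", "index", "raw")}
loaded = {k: [] for k in queues}

def worker(kind: str) -> None:
    """Consume one queue type independently of the other two."""
    while True:
        task = queues[kind].get()
        if task is None:               # sentinel: stop the thread
            return
        loaded[kind].append(task)      # stand-in for a read from a node device

threads = [threading.Thread(target=worker, args=(k,)) for k in queues]
for t in threads:
    t.start()

queues["meta"].put("A1")
queues["index"].put("B1")
queues["raw"].put("C1")
for q in queues.values():
    q.put(None)                        # shut the workers down
for t in threads:
    t.join()

print(sorted(loaded["meta"] + loaded["index"] + loaded["raw"]))  # prints ['A1', 'B1', 'C1']
```

Because each type has its own thread, an index-data read never has to wait behind metadata reads in the same queue.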
In conjunction with the description of steps 201 to 204, a specific example is presented here to illustrate asynchronous multi-threaded parallel reading. When reading an object, the metadata is pre-read and then loaded; the index data is pre-read and then loaded; and the raw data is pre-read according to the index data and then loaded. A data block of raw data can be loaded only after the metadata and index data corresponding to it have been loaded. In the present application, raw data blocks can be loaded asynchronously and in parallel while metadata and index data are still being loaded, so there is no need to wait for all metadata and index data before loading raw data, which improves query efficiency. As shown in fig. 3, fig. 3 is a schematic diagram of the asynchronous multi-threaded parallel read effect according to an embodiment of the present application. Data blocks of metadata are denoted A, data blocks of index data are denoted B, data blocks of raw data are denoted C, and the associations among metadata, index data, and raw data are denoted by the numerals 1, 2, and 3. For example, data block B1 can be found from the metadata in data block A1, and data block C1 can be found from the index data in data block B1. Referring to fig. 3: data block A1 is acquired first; because B1 depends on A1, once A1 has been acquired, data block A2 and data block B1 can be acquired in parallel; and because C1 depends on B1, once B1 and A2 have been acquired, data blocks C1, B2, and A3 can be acquired in parallel.
The acquisition of data blocks A (metadata), B (index data), and C (raw data) is handled asynchronously and in parallel by three independent threads; there is no need to wait for all A blocks before acquiring B blocks, or for all B blocks before acquiring C blocks, which greatly improves data reading efficiency. It should be noted that this is merely an illustrative description for ease of understanding: data block B1 only needs to wait until data block A1 has been acquired, and similarly data block C1 is acquired after data block B1. There is not necessarily a fixed order among the A blocks themselves; data block A2 may be acquired before data block A1, or A1 first.
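The dependency chain in fig. 3 can be reproduced with a tiny topological walk. This sketch assumes each type's blocks are consumed in order by its own thread (so A2 follows A1 on the metadata thread), which is one possible schedule consistent with the figure, not the only one; a real implementation would dispatch each ready task to the thread pool instead of computing waves.

```python
# Dependencies per fig. 3: B1 needs A1, C1 needs B1, and each thread
# processes its own blocks in sequence (A2 after A1, B2 after B1, ...).
deps = {"A1": [], "A2": ["A1"], "A3": ["A2"],
        "B1": ["A1"], "B2": ["B1", "A2"],
        "C1": ["B1"], "C2": ["C1", "B2"]}

done, waves = set(), []
while len(done) < len(deps):
    # Everything whose dependencies are satisfied can be read in parallel now.
    ready = sorted(t for t in deps
                   if t not in done and all(d in done for d in deps[t]))
    waves.append(ready)
    done.update(ready)

print(waves)  # prints [['A1'], ['A2', 'B1'], ['A3', 'B2', 'C1'], ['C2']]
```

The waves match the text: after A1, A2 and B1 proceed in parallel; then A3, B2, and C1 proceed in parallel, so raw data is already loading while metadata is still being read.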
With reference to the scenario shown in fig. 1 and the data processing method described in steps 201 to 204, a specific application scenario is described in detail here, taking an object storage (OSS) system as an example. As shown in fig. 4, fig. 4 is an architecture diagram of object storage provided in an embodiment of the present application. The left side of fig. 4 shows the data reading process: metadata pre-read, metadata load, index data pre-read, index data load, raw data pre-read, and raw data load. For the details of asynchronous multi-threaded parallel reading, refer to the corresponding description of fig. 3.
The middle part of fig. 4 shows the internal architecture of the data processing apparatus, which includes a cache module, a block alignment adaptation module, and a pre-read module.
The cache module comprises an object cache and a block cache. The object cache caches objects, whose life cycle is managed by an object management unit. The block cache caches data blocks and mainly includes a memory block cache and a solid state disk (SSD) block cache; the memory block cache may be up to 8 GB and the SSD block cache up to 200 GB. A block management unit manages the blocks in the two caches, for example discarding blocks in memory and erasing and reusing the data blocks in the local file backing the SSD block cache. When request information for the target object is received, the target object is looked up in the object cache; if it is not found, the data of the target object is obtained from the data storage apparatus according to the request information.
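The two-tier block cache can be sketched as a small in-memory LRU tier in front of a larger "SSD" tier. The tiny capacities (3 and 10 blocks) stand in for the 8 GB / 200 GB limits mentioned above, and the eviction policy here (LRU demotion) is an assumption of this sketch.

```python
from collections import OrderedDict

class TwoTierBlockCache:
    def __init__(self, mem_cap: int = 3, ssd_cap: int = 10):
        self.mem = OrderedDict()   # small, fast memory tier
        self.ssd = OrderedDict()   # larger, slower SSD tier
        self.mem_cap, self.ssd_cap = mem_cap, ssd_cap

    def put(self, key, block) -> None:
        self.mem[key] = block
        self.mem.move_to_end(key)
        if len(self.mem) > self.mem_cap:
            old_key, old_block = self.mem.popitem(last=False)
            self.ssd[old_key] = old_block      # demote to the SSD tier
            if len(self.ssd) > self.ssd_cap:
                self.ssd.popitem(last=False)   # erase-and-reuse on SSD

    def get(self, key):
        if key in self.mem:
            self.mem.move_to_end(key)
            return self.mem[key]
        return self.ssd.get(key)               # None means a cache miss

cache = TwoTierBlockCache()
for i in range(5):
    cache.put(f"blk-{i}", f"data-{i}")

print(sorted(cache.mem), sorted(cache.ssd))
# prints ['blk-2', 'blk-3', 'blk-4'] ['blk-0', 'blk-1']
```

Demoting cold blocks to the SSD tier instead of dropping them keeps pre-read results available far longer than memory alone would allow.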
Specifically, the request information includes description information of the metadata, of the index data, and of the original data of the target object. The block alignment adaptation module divides and merges the data ranges in the request information according to a preset size, which may for example be 1 KB, 128 KB, or 1024 KB. Accordingly, in fig. 4 the block alignment adaptation module includes two parts, division and merging: because the metadata and the index data are irregular, their description information is divided into task blocks of the preset size, while the description information of the original data, which may also be referred to as column blocks, is merged into task blocks of the preset size.
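The divide-and-merge behaviour can be illustrated as below. This is a hedged sketch: `split_irregular` and `merge_column_blocks` are hypothetical helper names, byte ranges are represented as `(offset, length)` pairs, and 128 KB is used as the preset task-block size (one of the sizes mentioned above).

```python
TASK_SIZE = 128 * 1024  # one of the preset sizes: 1 KB, 128 KB, or 1024 KB

def split_irregular(offset, length, task_size=TASK_SIZE):
    """Divide an irregular metadata/index range into fixed-size task blocks."""
    tasks = []
    end = offset + length
    while offset < end:
        tasks.append((offset, min(task_size, end - offset)))
        offset += task_size
    return tasks

def merge_column_blocks(column_blocks, task_size=TASK_SIZE):
    """Merge small contiguous column blocks of original data into task blocks."""
    tasks, start, size = [], None, 0
    for off, length in column_blocks:
        if start is None:
            start, size = off, length
        elif off == start + size and size + length <= task_size:
            size += length                      # contiguous and still fits: merge
        else:
            tasks.append((start, size))         # gap or overflow: close the task
            start, size = off, length
        if size >= task_size:                   # flush a full task block
            tasks.append((start, size))
            start, size = None, 0
    if start is not None and size:
        tasks.append((start, size))
    return tasks
```

Splitting bounds the cost of any single read, while merging avoids issuing many tiny requests for adjacent column blocks.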
The pre-reading module comprises three parts: task submission, a pre-reading queue, and a pre-reading thread pool. The task submission part receives the required task blocks and puts them into the pre-reading queue; the threads in the pre-reading thread pool consume task blocks from the queue, read the data blocks indicated by the task blocks from the OSS, and write them into the SSD block cache.
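The submit/consume structure of the pre-reading module might look like the following minimal Python sketch; `read_from_oss` and `ssd_cache` are hypothetical stand-ins for the OSS read and the SSD block cache write, and a bounded `queue.Queue` plus worker threads play the roles of the pre-reading queue and thread pool.

```python
import queue
import threading

def start_preread_pool(read_from_oss, ssd_cache, workers=4, capacity=1024):
    """Start a pre-read thread pool consuming (offset, length) task blocks."""
    tasks = queue.Queue(maxsize=capacity)       # the pre-reading queue

    def worker():
        while True:
            task = tasks.get()
            if task is None:                    # shutdown sentinel
                tasks.task_done()
                return
            offset, length = task
            # Read the block indicated by the task and write it to the SSD cache.
            ssd_cache[(offset, length)] = read_from_oss(offset, length)
            tasks.task_done()

    threads = [threading.Thread(target=worker, daemon=True) for _ in range(workers)]
    for t in threads:
        t.start()
    return tasks, threads
```

Task blocks are submitted with `tasks.put(...)` and `tasks.join()` waits for the pool to drain; to shut the pool down, one `None` sentinel per worker can be enqueued.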
According to the data processing method provided by the embodiments of the present application, request information for a target object is acquired; at least two task blocks corresponding to the target object are generated according to the request information; the at least two task blocks are added into a plurality of preset pre-reading queues; and the plurality of pre-reading queues are respectively processed by a plurality of independent threads, acquiring the data blocks of the target object in parallel. Because the task blocks are added into the preset pre-reading queues, a plurality of data blocks of the target object can be read in parallel according to the plurality of task blocks when the corresponding data blocks are read, which improves reading efficiency.
Example two
Based on the method described in the first embodiment, a second embodiment of the present application provides a data processing apparatus for executing the method described in the first embodiment, and referring to fig. 5, the data processing apparatus 50 includes:
an obtaining module 501, configured to obtain request information for a target object;
a task block adaptation module 502, configured to generate at least two task blocks of a corresponding target object according to the request information, where the at least two task blocks are used to instruct to obtain a data block of the target object;
a pre-reading module 503, configured to add at least two task blocks to a plurality of pre-reading queues that are preset;
the data block loading module 504 is configured to process a plurality of pre-read queues by a plurality of independent threads, respectively, and obtain data blocks of a target object in parallel, where the plurality of pre-read queues correspond to the plurality of independent threads one to one.
Optionally, the task block adapting module 502 is configured to segment and combine the request information of the target object to obtain at least two task blocks with preset sizes.
Optionally, the request information of the target object includes description information of metadata, description information of index data, and description information of original data; a task block adaptation module 502, configured to segment the description information of the metadata and the description information of the index data to obtain a task block of metadata with a preset size and a task block of the index data; and merging the description information of the original data to obtain a task block of the original data with a preset size.
Optionally, the pre-read queues comprise: a metadata pre-reading queue, an index data pre-reading queue, and an original data pre-reading queue; the pre-reading module 503 is configured to add at least one task block of metadata into the metadata pre-reading queue, add at least one task block of index data into the index data pre-reading queue, and add at least one task block of original data into the original data pre-reading queue.
Optionally, the target object comprises metadata, index data, and raw data; the data block loading module 504 is configured to acquire the data blocks of metadata corresponding to the task blocks by using a first thread corresponding to the metadata pre-reading queue; acquire the data blocks of index data corresponding to the task blocks by using a second thread corresponding to the index data pre-reading queue; and acquire the data blocks of original data corresponding to the task blocks by using a third thread corresponding to the original data pre-reading queue.
Optionally, the task block adaptation module 502 is configured to search for a target object in the cache according to the request information; and when the target object is not found in the cache, generating at least two corresponding task blocks according to the request information of the target object.
Optionally, the data block loading module 504 is configured to send task blocks in a plurality of pre-read queues to a plurality of node devices by using a plurality of independent threads, respectively; and receiving the data blocks of the target object transmitted by the plurality of node devices in parallel.
Optionally, the pre-reading module 503 is further configured to discard the newly generated task block when the number of task blocks in the pre-reading queue reaches a preset threshold.
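The drop-when-full behaviour above can be sketched in a few lines of Python; `submit_task` is a hypothetical helper, and the capacity of a bounded `queue.Queue` plays the role of the preset threshold.

```python
import queue

def submit_task(preread_queue, task_block):
    """Enqueue a newly generated task block, dropping it if the queue is full."""
    try:
        preread_queue.put_nowait(task_block)
        return True
    except queue.Full:
        return False   # queue reached its threshold: discard the new task block
```

Dropping rather than blocking keeps the submitter responsive; a discarded pre-read task costs only a later cache miss, not correctness.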
The data processing device provided by the embodiment of the application acquires request information for a target object; generates at least two task blocks corresponding to the target object according to the request information; adds the at least two task blocks into a plurality of preset pre-reading queues; and respectively processes the plurality of pre-reading queues with a plurality of independent threads, acquiring the data blocks of the target object in parallel. Because the task blocks are added into the preset pre-reading queues, a plurality of data blocks of the target object can be read in parallel according to the plurality of task blocks when the corresponding data blocks are read, which improves reading efficiency.
EXAMPLE III
Based on the method described in the first embodiment, a third embodiment of the present application provides an electronic device for executing that method. Referring to fig. 6, fig. 6 is a schematic structural diagram of the electronic device provided in the third embodiment of the present application; the specific implementation of the electronic device is not limited by the specific embodiments of the present application.
As shown in fig. 6, the electronic device may include: a processor (processor)602, a communication Interface 604, a memory 606, and a communication bus 608.
Wherein:
the processor 602, communication interface 604, and memory 606 communicate with one another via a communication bus 608.
A communication interface 604 for communicating with other electronic devices, such as a terminal device or a server.
The processor 602 is configured to execute the program 610, and may specifically perform relevant steps in the foregoing method embodiments.
In particular, program 610 may include program code comprising computer operating instructions.
The processor 602 may be a Central Processing Unit (CPU), an Application-Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application. The electronic device comprises one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
And a memory 606 for storing a program 610. Memory 606 may comprise high-speed RAM memory, and may also include non-volatile memory (non-volatile memory), such as at least one disk memory.
The program 610 may specifically be configured to cause the processor 602 to execute any one of the methods of the first embodiment.
For specific implementation of each step in the program 610, reference may be made to corresponding steps and corresponding descriptions in units in the foregoing data processing method embodiments, which are not described herein again. It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described devices and modules may refer to the corresponding process descriptions in the foregoing method embodiments, and are not described herein again.
The electronic device provided by the embodiment of the application acquires request information for a target object; generates at least two task blocks corresponding to the target object according to the request information; adds the at least two task blocks into a plurality of preset pre-reading queues; and respectively processes the plurality of pre-reading queues with a plurality of independent threads, acquiring the data blocks of the target object in parallel. Because the task blocks are added into the preset pre-reading queues, a plurality of data blocks of the target object can be read in parallel according to the plurality of task blocks when the corresponding data blocks are read, which improves reading efficiency.
Example four
Based on the method described in the first embodiment and the apparatuses described in the second and third embodiments, a fourth embodiment of the present application provides a database system for executing the method described in the first embodiment, and as shown in fig. 7, the database system 70 includes: an electronic device 701 and at least two node devices 702;
the electronic device 701 may be the data processing apparatus 50 described in the second embodiment, or the electronic device 701 may be the electronic device 60 described in the third embodiment;
the node device 702, may be the data storage device 102. The functions of the electronic device 701 and the power saving device 702 are described in detail in the above embodiments, and are not described herein again.
EXAMPLE five
Based on the method described in the first embodiment, a fifth embodiment of the present application provides a computer storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the method described in the first embodiment.
With the computer storage medium provided by the embodiment of the application, request information for a target object is acquired; at least two task blocks corresponding to the target object are generated according to the request information; the at least two task blocks are added into a plurality of preset pre-reading queues; and the plurality of pre-reading queues are respectively processed by a plurality of independent threads, acquiring the data blocks of the target object in parallel. Because the task blocks are added into the preset pre-reading queues, a plurality of data blocks of the target object can be read in parallel according to the plurality of task blocks when the corresponding data blocks are read, which improves reading efficiency.
It should be noted that, according to the implementation requirement, each component/step described in the embodiment of the present application may be divided into more components/steps, and two or more components/steps or partial operations of the components/steps may also be combined into a new component/step to achieve the purpose of the embodiment of the present application.
The above-described methods according to embodiments of the present application may be implemented in hardware or firmware, or as software or computer code that can be stored in a recording medium such as a CD-ROM, a RAM, a floppy disk, a hard disk, or a magneto-optical disk, or as computer code originally stored in a remote recording medium or a non-transitory machine-readable medium, downloaded over a network, and stored in a local recording medium. The methods described herein can thus be processed by such software on a recording medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware such as an ASIC or FPGA. It will be appreciated that the computer, processor, microprocessor controller, or programmable hardware includes memory components (e.g., RAM, ROM, flash memory, etc.) that can store or receive software or computer code which, when accessed and executed by the computer, processor, or hardware, implements the data processing methods described herein. Further, when a general-purpose computer accesses code for implementing the data processing methods shown herein, the execution of that code converts the general-purpose computer into a special-purpose computer for executing those methods.
Those of ordinary skill in the art will appreciate that the various illustrative elements and method steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the embodiments of the present application.
The above embodiments are only used for illustrating the embodiments of the present application, and not for limiting the embodiments of the present application, and those skilled in the relevant art can make various changes and modifications without departing from the spirit and scope of the embodiments of the present application, so that all equivalent technical solutions also belong to the scope of the embodiments of the present application, and the scope of patent protection of the embodiments of the present application should be defined by the claims.

Claims (12)

1. A method of data processing, comprising:
acquiring request information for a target object;
generating at least two corresponding task blocks of the target object according to the request information, wherein the at least two task blocks are used for indicating to acquire a data block of the target object;
adding the at least two task blocks into a plurality of pre-reading queues which are preset;
and processing the plurality of pre-reading queues by using a plurality of independent threads respectively, and acquiring the data blocks of the target object in parallel, wherein the plurality of pre-reading queues correspond to the plurality of independent threads one-to-one.
2. The method of claim 1, wherein the generating at least two task blocks of the corresponding target object according to the request information comprises:
and dividing and combining the request information of the target object to obtain the at least two task blocks with preset sizes.
3. The method of claim 2, wherein the request information of the target object includes description information of metadata, description information of index data, and description information of original data;
the dividing and combining the request information of the target object to obtain the at least two task blocks with preset sizes includes:
segmenting the description information of the metadata and the description information of the index data to obtain task blocks of metadata and index data with preset sizes;
and merging the description information of the original data to obtain the task block of the original data with the preset size.
4. The method of claim 3, wherein the pre-read queues comprise: a metadata pre-reading queue, an index data pre-reading queue, and an original data pre-reading queue;
the adding the at least two task blocks into a plurality of preset pre-reading queues comprises:
adding at least one task block of the metadata into the metadata pre-reading queue; adding at least one task block of the index data into the index data pre-reading queue; and adding at least one task block of the original data into the original data pre-reading queue.
5. The method of claim 4, wherein the target object comprises metadata, index data, and raw data;
the parallel obtaining of the data blocks of the target object by using the plurality of independent threads respectively corresponding to the plurality of pre-read queues according to the preset number of the task blocks in the pre-read queues includes:
acquiring a data block of metadata corresponding to the task block by using a first thread corresponding to the metadata pre-reading queue; acquiring a data block of the index data corresponding to the task block by using a second thread corresponding to the index data; and acquiring a data block of the original data corresponding to the task block by using a third thread corresponding to the original data.
6. The method of claim 1, wherein the generating at least two task blocks of the corresponding target object according to the request information comprises:
searching the target object in a cache according to the request information;
and when the target object is not found in the cache, generating the at least two corresponding task blocks according to the request information of the target object.
7. The method of claim 1, wherein the concurrently retrieving the data block of the target object by processing the pre-read queues with independent threads, respectively, comprises:
respectively sending the task blocks in the pre-reading queues to a plurality of node devices by utilizing the independent threads;
and receiving the data blocks of the target object transmitted by the plurality of node devices in parallel.
8. The method of any of claims 1-7, wherein the method further comprises:
and when the number of the task blocks in the pre-reading queue reaches a preset threshold value, discarding the newly generated task blocks.
9. A data processing apparatus comprising:
the acquisition module is used for acquiring request information of the target object;
the task block adaptation module is used for generating at least two corresponding task blocks of the target object according to the request information, wherein the at least two task blocks are used for indicating to acquire a data block of the target object;
the pre-reading module is used for adding the at least two task blocks into a plurality of pre-reading queues which are preset;
and the data block loading module is used for respectively processing the pre-reading queues by utilizing a plurality of independent threads and parallelly acquiring the data blocks of the target object, wherein the pre-reading queues correspond to the independent threads one to one.
10. An electronic device, comprising: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete mutual communication through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to execute the corresponding operation of the data processing method according to any one of claims 1-8.
11. A database system, comprising: an electronic device and at least two node devices, the electronic device being the data processing apparatus of claim 9, or the electronic device being the electronic device of claim 10.
12. A computer storage medium, on which a computer program is stored which, when being executed by a processor, carries out the data processing method of any one of claims 1 to 8.
CN202110097003.1A 2021-01-25 2021-01-25 Data processing method, device, system and computer storage medium Pending CN113419824A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110097003.1A CN113419824A (en) 2021-01-25 2021-01-25 Data processing method, device, system and computer storage medium


Publications (1)

Publication Number Publication Date
CN113419824A true CN113419824A (en) 2021-09-21

Family

ID=77711750

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110097003.1A Pending CN113419824A (en) 2021-01-25 2021-01-25 Data processing method, device, system and computer storage medium

Country Status (1)

Country Link
CN (1) CN113419824A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114090130A (en) * 2021-11-26 2022-02-25 上海星融汽车科技有限公司 Method and system for preloading execution logic
CN114327299A (en) * 2022-03-01 2022-04-12 苏州浪潮智能科技有限公司 Sequential reading and pre-reading method, device, equipment and medium
WO2023165188A1 (en) * 2022-03-01 2023-09-07 苏州浪潮智能科技有限公司 Sequential read prefetching method and apparatus, device, and medium
CN114945023A (en) * 2022-05-20 2022-08-26 济南浪潮数据技术有限公司 Network connection multiplexing method, device, equipment and medium
WO2024001413A1 (en) * 2022-06-28 2024-01-04 华为技术有限公司 Data reading method, data loading apparatus, and communication system
CN117453422A (en) * 2023-12-22 2024-01-26 南京研利科技有限公司 Data processing method, device, electronic equipment and computer readable storage medium
CN117453422B (en) * 2023-12-22 2024-03-01 南京研利科技有限公司 Data processing method, device, electronic equipment and computer readable storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code: ref country code HK, ref legal event code DE, ref document number 40058777