CN116225314A - Data writing method, device, computer equipment and storage medium - Google Patents

Data writing method, device, computer equipment and storage medium

Info

Publication number
CN116225314A
Authority
CN
China
Prior art keywords
target
writing process
data block
writing
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211624203.9A
Other languages
Chinese (zh)
Inventor
张同彦
宫凤明
杨鹏
马照云
郭照斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Zhongke Shuguang Storage Technology Co ltd
Original Assignee
Tianjin Zhongke Shuguang Storage Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Zhongke Shuguang Storage Technology Co ltd
Priority to CN202211624203.9A
Publication of CN116225314A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/062 Securing storage systems
    • G06F3/0622 Securing storage systems in relation to access
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Read Only Memory (AREA)
  • Stored Programmes (AREA)

Abstract

The present application relates to a data writing method, apparatus, computer device, storage medium and computer program product. The method comprises the following steps: acquiring a plurality of writing processes corresponding to a data writing request; for a target writing process among the plurality of writing processes, determining a target data block of the target writing process; dividing the target data block according to a preset stripe size to obtain a first data block that meets the preset stripe size and a residual data block that does not, and writing the data in the first data block through the target writing process; and, when a next writing process of the target writing process exists, transmitting the residual data block to that next writing process through a preset data transmission network, so that the next writing process determines the residual data block together with its own original data block as its target data block and, as the new target writing process, executes the step of dividing the target data block according to the preset stripe size.

Description

Data writing method, device, computer equipment and storage medium
Technical Field
The present application relates to the field of storage, and in particular, to a data writing method, apparatus, computer device, storage medium, and computer program product.
Background
In order to ensure consistency of data in a distributed file storage system, protection is required by a distributed interval lock. Specifically, data in the distributed file storage system may be split into stripes, each of which needs to be protected by a distributed interval lock.
In the related art, when data from different processes must be written into the same stripe, a distributed interval lock is used to prevent different writing processes from operating on the storage space of the same stripe. The lock must be invoked each time a stripe is read, and when a plurality of stripes are read, contention for the distributed interval lock arises, which lowers data writing efficiency.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a data writing method, apparatus, computer device, computer readable storage medium, and computer program product that can avoid distributed lock contention.
In a first aspect, the present application provides a data writing method. The method comprises the following steps:
responding to a data writing request, and acquiring a plurality of writing processes corresponding to the data writing request;
determining a target data block of a target writing process in a plurality of writing processes;
dividing the target data block according to the size of a preset stripe to obtain a first data block meeting the size of the preset stripe and a residual data block not meeting the size of the preset stripe, and writing data in the first data block corresponding to the target writing process through the target writing process;
and under the condition that the next writing process of the target writing process exists, transmitting the residual data block to the next writing process of the target writing process through a preset data transmission network, so that the next writing process determines the residual data block and the original data block of the next writing process as target data blocks, and executing the step of dividing the target data blocks according to the size of a preset stripe as a new target writing process.
Based on this scheme, the data to be written can be divided into full stripes at the initial stage of data writing, and data that does not fill a stripe, and would otherwise require a cross-process write, is exchanged to another writing process for writing, which avoids the use of a distributed interval lock. Because the data is transmitted at high speed through the preset data transmission network, the data to be written can be exchanged and spliced quickly, strict stripe alignment is guaranteed, and data writing efficiency is thereby ensured.
In one embodiment, after the step of dividing the target data block according to a preset stripe size to obtain a first data block that meets the preset stripe size and a remaining data block that does not meet the preset stripe size, the method further includes:
and when the target writing process has no next writing process, caching the residual data block corresponding to the target writing process into a preset cache space.
Based on the scheme, the integrity of the data and the timeliness of the writing of the data can be ensured by caching the residual data blocks.
In one embodiment, the method further comprises:
after detecting a closing instruction of a file to be written corresponding to the data writing request, acquiring the residual data blocks in the preset cache space, and splicing and writing the residual data blocks into the file to be written.
Based on the scheme, the residual data blocks can be written in time, so that the loss of data is avoided, and the data writing efficiency is improved.
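The caching and close-time flush described in the two embodiments above can be sketched as follows. This is a minimal illustration only: the class name `ResidualCache`, its methods, and the use of an in-memory `io.BytesIO` as the "file to be written" are assumptions made for the example and do not come from the application.

```python
import io
from typing import BinaryIO, List


class ResidualCache:
    """Hypothetical stand-in for the 'preset cache space'."""

    def __init__(self) -> None:
        self._chunks: List[bytes] = []

    def put(self, residual: bytes) -> None:
        # Cache a residual block left by a writing process that has no successor.
        if residual:
            self._chunks.append(residual)

    def flush_on_close(self, target: BinaryIO) -> int:
        # On the file's closing instruction, splice the cached residual
        # blocks in arrival order and write them into the file.
        data = b"".join(self._chunks)
        target.write(data)
        self._chunks.clear()
        return len(data)


# Usage: the full stripes were already written; the tail is flushed at close.
cache = ResidualCache()
cache.put(b"uv")
out_file = io.BytesIO()
out_file.write(b"abcdefgh")          # previously written full stripes
flushed = cache.flush_on_close(out_file)
```

Keeping the tail in a cache until close means the final, partial stripe is written once, after all stripe-aligned writes have been issued.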
In one embodiment, the determining, for each target write process of the plurality of write processes, a target data block of the target write process includes:
for the target writing process, when a previous writing process of the target writing process exists, acquiring the residual data block of the previous writing process and acquiring the original data block of the target writing process;
and splicing the residual data block of the previous writing process and the original data block of the target writing process to obtain the target data block of the target writing process.
Based on the scheme, the residual data block of the previous writing process can be exchanged to the target writing process, so that data writing without distributed lock contention is realized, and the data writing efficiency is improved.
In one embodiment, the method further comprises:
for the target writing process, when no previous writing process of the target writing process exists, determining that the original data block of the target writing process is the target data block of the target writing process.
Based on the scheme, the original data block of the writing process can be used as a target data block, so that data writing with strict stripe alignment is realized.
In one embodiment, the method further comprises:
for the target writing process, when a previous writing process of the target writing process exists and the previous writing process has no residual data block, determining that the original data block of the target writing process is the target data block of the target writing process.
Based on the scheme, the original data block of the writing process can be used as a target data block, so that data writing with strict stripe alignment is realized.
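The three cases for determining the target data block (previous process with a residual block, no previous process, previous process without a residual block) can be summarized in a short sketch. The function name and the convention that `None` means "no previous writing process" are assumptions chosen for illustration.

```python
from typing import Optional


def determine_target_block(original: bytes, prev_residual: Optional[bytes]) -> bytes:
    """Return the target data block of a writing process.

    prev_residual is None when no previous writing process exists, and
    b"" when a previous process exists but left no residual block
    (a convention assumed for this sketch).
    """
    if prev_residual is None:
        return original               # no previous writing process
    if len(prev_residual) == 0:
        return original               # previous process left nothing over
    return prev_residual + original   # splice residual in front of the original


first = determine_target_block(b"abcd", None)
aligned = determine_target_block(b"abcd", b"")
spliced = determine_target_block(b"cdef", b"ab")
```

The residual block is placed in front of the original block so that byte order in the file is preserved across the process boundary.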
In a second aspect, the present application further provides a data writing apparatus. The device comprises:
the first acquisition module is used for responding to a data writing request and acquiring a plurality of writing processes corresponding to the data writing request;
the first determining module is used for determining a target data block of a target writing process aiming at the target writing process in the plurality of writing processes;
the writing module is used for dividing the target data block according to the size of a preset stripe to obtain a first data block meeting the size of the preset stripe and a residual data block not meeting the size of the preset stripe, and writing data in the first data block corresponding to the target writing process through the target writing process;
and the transmission module is used for transmitting the residual data block to the next writing process of the target writing process through a preset data transmission network when the next writing process of the target writing process exists, so that the next writing process determines the residual data block and the original data block of the next writing process as its target data block and, as a new target writing process, executes the step of dividing the target data block according to the size of a preset stripe.
In one embodiment, the data writing apparatus further includes:
and the caching module is used for caching the residual data block corresponding to the target writing process into a preset cache space when the target writing process has no next writing process.
In one embodiment, the data writing apparatus further includes:
and the second acquisition module is used for acquiring the residual data blocks in the preset cache space after detecting the closing instruction of the file to be written corresponding to the data writing request, and splicing and writing the residual data blocks into the file to be written.
In one embodiment, the first determining module is specifically configured to:
for the target writing process, when a previous writing process of the target writing process exists, acquiring the residual data block of the previous writing process and acquiring the original data block of the target writing process;
and splicing the residual data block of the previous writing process and the original data block of the target writing process to obtain the target data block of the target writing process.
In one embodiment, the data writing apparatus further includes:
and the second determining module is used for determining, for the target writing process, that the original data block of the target writing process is the target data block of the target writing process when no previous writing process of the target writing process exists.
In one embodiment, the data writing apparatus further includes:
and the third determining module is used for determining, for the target writing process, that the original data block of the target writing process is the target data block of the target writing process when a previous writing process of the target writing process exists and the previous writing process has no residual data block.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor which when executing the computer program performs the steps of:
responding to a data writing request, and acquiring a plurality of writing processes corresponding to the data writing request;
determining a target data block of a target writing process in a plurality of writing processes;
dividing the target data block according to the size of a preset stripe to obtain a first data block meeting the size of the preset stripe and a residual data block not meeting the size of the preset stripe, and writing data in the first data block corresponding to the target writing process through the target writing process;
and under the condition that the next writing process of the target writing process exists, transmitting the residual data block to the next writing process of the target writing process through a preset data transmission network, so that the next writing process determines the residual data block and the original data block of the next writing process as target data blocks, and executing the step of dividing the target data blocks according to the size of a preset stripe as a new target writing process.
In a fourth aspect, the present application also provides a computer-readable storage medium. The computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
responding to a data writing request, and acquiring a plurality of writing processes corresponding to the data writing request;
determining a target data block of a target writing process in a plurality of writing processes;
dividing the target data block according to the size of a preset stripe to obtain a first data block meeting the size of the preset stripe and a residual data block not meeting the size of the preset stripe, and writing data in the first data block corresponding to the target writing process through the target writing process;
and under the condition that the next writing process of the target writing process exists, transmitting the residual data block to the next writing process of the target writing process through a preset data transmission network, so that the next writing process determines the residual data block and the original data block of the next writing process as target data blocks, and executing the step of dividing the target data blocks according to the size of a preset stripe as a new target writing process.
In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of:
responding to a data writing request, and acquiring a plurality of writing processes corresponding to the data writing request;
determining a target data block of a target writing process in a plurality of writing processes;
dividing the target data block according to the size of a preset stripe to obtain a first data block meeting the size of the preset stripe and a residual data block not meeting the size of the preset stripe, and writing data in the first data block corresponding to the target writing process through the target writing process;
and under the condition that the next writing process of the target writing process exists, transmitting the residual data block to the next writing process of the target writing process through a preset data transmission network, so that the next writing process determines the residual data block and the original data block of the next writing process as target data blocks, and executing the step of dividing the target data blocks according to the size of a preset stripe as a new target writing process.
The above data writing method, apparatus, computer device, storage medium and computer program product, wherein the method comprises: responding to a data writing request, and acquiring a plurality of writing processes corresponding to the data writing request; determining a target data block of a target writing process in a plurality of writing processes; dividing the target data block according to the size of a preset stripe to obtain a first data block meeting the size of the preset stripe and a residual data block not meeting the size of the preset stripe, and writing data in the first data block corresponding to the target writing process through the target writing process; and under the condition that the next writing process of the target writing process exists, transmitting the residual data block to the next writing process of the target writing process through a preset data transmission network, so that the next writing process determines the residual data block and the original data block of the next writing process as target data blocks, and executing the step of dividing the target data blocks according to the size of a preset stripe as a new target writing process. By adopting the method, the data to be written can be divided into full stripes at the initial stage of data writing, the data which is written across the processes due to the partial stripes is exchanged to other writing processes for writing, the use of a distributed interval lock is avoided, the data is transmitted at a high speed through a preset data transmission network, the data is written through exchange and splicing, strict stripe alignment is ensured, and the data writing efficiency is further improved.
Drawings
FIG. 1 is a flow chart of a data writing method in one embodiment;
FIG. 2a is a schematic diagram illustrating the partitioning of data blocks corresponding to a writing process in a data writing method according to an embodiment;
FIG. 2b is a schematic diagram illustrating the partitioning of data blocks corresponding to a writing process in a data writing method according to an embodiment;
FIG. 3 is a flow chart illustrating the steps for determining a target data block in one embodiment;
FIG. 4 is a schematic diagram of distributed lock contention in one embodiment;
FIG. 5 is a block diagram of a data writing device in one embodiment;
FIG. 6 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
In one embodiment, as shown in FIG. 1, a data writing method is provided. The method is described here as applied to a terminal; it is to be understood that the method may also be applied to a server, or to a system including a terminal and a server, in which case it is implemented through interaction between the terminal and the server. The terminal may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer, or the like, and the server may be implemented as a standalone server or as a server cluster formed by a plurality of servers. In this embodiment, the data writing method includes the following steps:
step 102, responding to the data writing request, and acquiring a plurality of writing processes corresponding to the data writing request.
The data writing request may be a write request received by the terminal, that is, a request instructing the terminal to perform a data writing operation on a distributed file system. The distributed file system may be a file system whose managed physical storage resources are connected to the nodes through a computer network, or a file system formed by a plurality of different logical disk partitions. The data writing request may include at least one writing process.
Alternatively, the terminal may also acquire a plurality of write processes for performing data write operations to the distributed file system within a preset period of time.
In implementation, other terminal devices may generate a plurality of writing processes based on requirements of an actual application scenario, and generate a data writing request based on the writing processes, so that the data writing request may be sent to a terminal corresponding to the distributed file system. After the terminal receives the data writing request, the data writing request can be analyzed to obtain one writing process or a plurality of writing processes contained in the data writing request. The terminal may also receive a plurality of data writing requests in a preset time period, and acquire writing processes included in the data writing requests respectively.
Step 104, determining a target data block of the target write process aiming at the target write process in the plurality of write processes.
The target write process may be any one of a plurality of write processes. The target data block of the target writing process may be an original data block carried by the target writing process, or may be a data block generated by combining the original data block with data blocks of other writing processes.
In an implementation, after the terminal obtains a plurality of write processes corresponding to the distributed file system, each write process may be ordered according to a time sequence corresponding to each write process, and each write process may be sequentially processed. Thus, the specific processing procedure of the terminal aiming at the target writing process in the plurality of writing processes can be as follows: after determining the target writing process, the terminal needs to determine the target data block corresponding to the target writing process.
Step 106, dividing the target data block according to the size of the preset stripe to obtain a first data block meeting the size of the preset stripe and a residual data block not meeting the size of the preset stripe, and writing the data in the first data block corresponding to the target writing process through the target writing process.
The preset stripe size may be a length of data contained in a preset stripe in the distributed file system, for example, may be a. The data length of the first data block satisfying the size of the preset stripe may be the same as the length of data that the preset stripe may contain, and the remaining data blocks not satisfying the size of the preset stripe may be data blocks having a data length smaller than the length of data that the preset stripe may contain.
In implementation, for a target writing process, after acquiring a target data block corresponding to the target writing process, the terminal may segment the target data block according to a data length of a preset stripe to obtain one or more first data blocks (stripe alignment data) and remaining data blocks. After the terminal obtains the first data block and the remaining data block corresponding to the target writing process, the terminal can write the data in the first data block into the distributed file system respectively, so that timeliness of data writing is ensured.
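The division in step 106 can be illustrated with a small sketch: cut the target data block into as many full stripes as fit and keep the tail as the remaining data block. The function name `split_by_stripe` and the 4-byte stripe size used below are hypothetical; the application does not specify a concrete stripe size.

```python
from typing import List, Tuple


def split_by_stripe(target: bytes, stripe_size: int) -> Tuple[List[bytes], bytes]:
    """Split a target data block into full stripes plus a remaining tail."""
    if stripe_size <= 0:
        raise ValueError("stripe_size must be positive")
    # Length of the prefix that is an exact multiple of the stripe size.
    aligned_len = len(target) - len(target) % stripe_size
    stripes = [target[i:i + stripe_size] for i in range(0, aligned_len, stripe_size)]
    remaining = target[aligned_len:]  # shorter than one stripe, possibly empty
    return stripes, remaining


stripes, remaining = split_by_stripe(b"abcdefghij", 4)
```

With a 10-byte block and a 4-byte stripe, two full stripes can be written immediately and 2 bytes are left over for the next writing process or the cache.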
Step 108, transmitting the residual data block to the next writing process of the target writing process through the preset data transmission network under the condition that the next writing process of the target writing process exists, so that the next writing process determines the residual data block and the original data block of the next writing process as the target data block, and the step of dividing the target data block according to the size of the preset stripe is executed as a new target writing process.
The preset data transmission network may be an OpenMPI network (high performance messaging library) for performing inter-process data transmission.
In implementation, after the terminal determines the first data block and the remaining data blocks corresponding to the target writing process, the terminal may write the first data block into the distributed file system corresponding to the data writing request, and at the same time, the terminal may determine whether there is a next writing process after the target writing process.
After the terminal determines that the target writing process has a next writing process, the terminal can exchange the remaining data block corresponding to the target writing process to that next writing process through the preset data transmission network, that is, transmit the remaining data block from the target writing process to the next writing process. After determining that the next writing process has received the remaining data block, the terminal may determine the remaining data block together with the original data block of the next writing process as the target data block, take the next writing process as the new target writing process, and perform the steps of the above embodiment: dividing the target data block according to the preset stripe size to obtain a first data block that meets the preset stripe size and a remaining data block that does not, and writing the data in the first data block through the new target writing process.
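The hand-off between consecutive writing processes can be sketched with threads and in-memory queues. This is only a stand-in: the application names an OpenMPI network for inter-process transfer, whereas the sketch below uses Python's `threading` and `queue` modules so that it runs without an MPI installation. The names `run_rank`, `run_all` and the stripe size of 4 bytes are assumptions for illustration.

```python
import queue
import threading
from typing import Dict, List, Optional, Tuple

STRIPE_SIZE = 4  # hypothetical stripe size for the sketch


def run_rank(rank: int, original: bytes,
             inbox: Optional[queue.Queue], outbox: Optional[queue.Queue],
             written: Dict[int, List[bytes]], cached: List[bytes]) -> None:
    # Wait for the remaining block of the previous rank, if there is one.
    carried = inbox.get() if inbox is not None else b""
    target = carried + original
    aligned_len = len(target) - len(target) % STRIPE_SIZE
    # Only stripe-aligned data is "written" by this rank.
    written[rank] = [target[i:i + STRIPE_SIZE]
                     for i in range(0, aligned_len, STRIPE_SIZE)]
    remaining = target[aligned_len:]
    if outbox is not None:
        outbox.put(remaining)      # hand the tail to the next rank
    else:
        cached.append(remaining)   # last rank: keep the tail in a cache


def run_all(blocks: List[bytes]) -> Tuple[Dict[int, List[bytes]], bytes]:
    links = [queue.Queue(maxsize=1) for _ in range(len(blocks) - 1)]
    written: Dict[int, List[bytes]] = {}
    cached: List[bytes] = []
    threads = []
    for rank, original in enumerate(blocks):
        inbox = links[rank - 1] if rank > 0 else None
        outbox = links[rank] if rank < len(blocks) - 1 else None
        t = threading.Thread(target=run_rank,
                             args=(rank, original, inbox, outbox, written, cached))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    return written, b"".join(cached)


written, tail = run_all([b"abcdefghij", b"klmnopqrstuv"])
```

Each rank touches only whole stripes of its own, so no two ranks write into the same stripe and no lock is needed in this sketch; the only coordination is the one-way hand-off of the tail.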
In one example, as shown in FIG. 2a, the target writing process may be rank0, and the target data block corresponding to rank0 may be data block (1). The terminal divides data block (1) according to the preset stripe size; the resulting first data blocks, which meet the preset stripe size, and remaining data block, which does not, may be as shown in FIG. 2b, where the first data blocks may be data block (3) and data block (4) and the remaining data block may be data block (5). That is, the terminal may divide data block (1) into data block (3), data block (4) and data block (5). The terminal can write data blocks (3) and (4) into the distributed file system.
As shown in FIG. 2b, the terminal may determine that a next writing process, namely rank1, exists after the target writing process (rank0). On this basis, the terminal may exchange the remaining data block of rank0 (data block (5)) from rank0 to rank1 through the preset data transmission network (a local-area high-speed network). That is, the next writing process of the target writing process may be rank1, the original data block corresponding to rank1 may be data block (2), and the target data block of rank1 may be the data block obtained by combining the remaining data block (data block (5)) with the original data block (data block (2)). After determining the target data block of rank1, the terminal may execute the steps of the above embodiment with rank1 as the new target writing process, that is, divide the target data block of rank1 according to the preset stripe size to obtain new first data blocks and a new remaining data block. The new first data blocks may be the data block composed of data block (5) + data block (6), data block (7) and data block (8), and the new remaining data block may be data block (9).
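The rank0/rank1 example can be traced with concrete bytes. The sizes are invented for illustration (a 4-byte stripe, a 10-byte block for rank0 and a 12-byte block for rank1); the figures do not state actual sizes. Comments map each value to the data block numbers used in the example.

```python
from typing import List, Tuple

STRIPE = 4  # hypothetical stripe size


def split(target: bytes, stripe: int) -> Tuple[List[bytes], bytes]:
    # Full stripes first, then whatever does not fill a stripe.
    aligned_len = len(target) - len(target) % stripe
    return ([target[i:i + stripe] for i in range(0, aligned_len, stripe)],
            target[aligned_len:])


rank0_original = b"AAAABBBBCC"      # data block (1)
rank1_original = b"DDEEEEFFFFGG"    # data block (2)

# rank0: blocks (3) and (4) are full stripes, block (5) is the remainder.
rank0_stripes, block5 = split(rank0_original, STRIPE)

# rank1: block (5) is spliced in front of block (2) before dividing.
rank1_stripes, block9 = split(block5 + rank1_original, STRIPE)

print(rank0_stripes, block5)   # [b'AAAA', b'BBBB'] b'CC'
print(rank1_stripes, block9)   # [b'CCDD', b'EEEE', b'FFFF'] b'GG'
```

Here `b'CCDD'` plays the role of data block (5) + data block (6), `b'EEEE'` and `b'FFFF'` correspond to data blocks (7) and (8), and `b'GG'` is data block (9).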
In the data writing method, a plurality of writing processes corresponding to the data writing request are acquired in response to the data writing request. For a target write process of the plurality of write processes, a target data block of the target write process is determined. The target data block is divided according to the size of the preset stripe to obtain a first data block meeting the size of the preset stripe and a remaining data block not meeting the size of the preset stripe, and the data in the first data block corresponding to the target writing process is written through the target writing process. In the case that a next writing process of the target writing process exists, the remaining data block is transmitted to the next writing process through a preset data transmission network, so that the next writing process determines the remaining data block and its own original data block as its target data block and, as a new target writing process, executes the step of dividing the target data block according to the size of the preset stripe. With this method, the data to be written is divided into full stripes at the initial stage of data writing, and data that does not fill a stripe, which would otherwise be written across processes, is exchanged to other writing processes, so the use of a distributed interval lock is avoided. Because the data is transmitted at high speed through the preset data transmission network, the written data can be communicated, exchanged and spliced at high speed, strict stripe alignment is ensured, and data writing efficiency is improved.
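The overall flow can be sketched as a single-process Python simulation. This is only an illustration under stated assumptions: the writing processes are modelled as a list of byte strings in rank order, the hand-over of the remainder is a local variable rather than a transfer over the preset data transmission network, and the name `write_with_exchange` is hypothetical.

```python
def write_with_exchange(original_blocks, stripe_size):
    """Simulate the rank-by-rank flow.

    Returns (bytes written by each rank, remainder left after the last rank).
    """
    writes = []
    carried = b""  # remainder received from the previous writing process
    for original in original_blocks:
        # Target data block = previous remainder spliced with the original block.
        target = carried + original
        full_len = (len(target) // stripe_size) * stripe_size
        writes.append(target[:full_len])   # full stripes, written by this rank
        carried = target[full_len:]        # remainder, passed to the next rank
    return writes, carried


# rank0 holds 10 bytes, rank1 holds 13 bytes; the stripe size is assumed to be 4.
writes, leftover = write_with_exchange([b"ABCDEFGHIJ", b"KLMNOPQRSTUVW"], 4)
print(writes)    # [b'ABCDEFGH', b'IJKLMNOPQRST']
print(leftover)  # b'UVW'
```

In this run rank1 writes three full stripes, the first of which begins with the two bytes received from rank0, which mirrors the "data block (5) + data block (6)" stripe in the example of fig. 2b.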
In one embodiment, after the step of dividing the target data block according to the size of the preset stripe to obtain a first data block satisfying the size of the preset stripe and a remaining data block not satisfying the size of the preset stripe, the data writing method further includes:
and under the condition that the next writing process does not exist in the target writing process, caching the residual data blocks corresponding to the target writing process into a preset caching space.
In implementation, the terminal may store the remaining data block to a preset remaining buffer space, i.e. buffer the remaining data block, when the terminal determines that the target writing process does not have a next writing process.
In this embodiment, the integrity of the data and the timeliness of writing the data can be ensured by caching the remaining data blocks.
In one embodiment, the data writing method further includes:
after detecting a closing instruction of a file to be written corresponding to a data writing request, acquiring the residual data blocks in a preset cache space, and splicing and writing the residual data blocks into the file to be written.
The file to be written corresponding to the data writing request may be a file located in a distributed file system, and the address of the file to be written may be the destination storage address of the data carried by a writing process. The closing instruction of the file to be written may be issued after writing to the file has been completed, and is used for instructing the terminal to close the file to be written.
In implementation, if the terminal detects a closing instruction of the file to be written corresponding to the data writing request, the terminal can splice and write the remaining data blocks cached in the preset cache space, that is, splice and write the remaining data blocks into the file to be written in the distributed file system.
In one example, after storing the remaining data blocks in the preset cache space, if it is determined that no other writing process exists within a preset time period, the terminal may splice and write the remaining data blocks in the preset cache space into the file to be written, that is, once the earlier writes have completed and before any later write begins.
In this embodiment, the remaining data blocks can be written in time, so that data loss is avoided and data writing efficiency is improved.
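The caching and close-time write of the last remainder might look like the following sketch. The class name `TailCachingWriter`, the use of a `bytearray` as the "preset cache space" and the use of `io.BytesIO` as a stand-in for the file in the distributed file system are all assumptions made for illustration.

```python
import io


class TailCachingWriter:
    """Hypothetical writer: full stripes go straight to the file,
    the final remainder waits in a cache until close() is called."""

    def __init__(self, file_obj, stripe_size):
        self.file_obj = file_obj
        self.stripe_size = stripe_size
        self.cached_tail = bytearray()  # stands in for the preset cache space

    def write_last_process(self, target_block: bytes) -> None:
        # No next writing process exists, so the remainder is cached locally.
        full_len = (len(target_block) // self.stripe_size) * self.stripe_size
        self.file_obj.write(target_block[:full_len])
        self.cached_tail += target_block[full_len:]

    def close(self) -> None:
        # On the closing instruction, splice the cached remainder into the file.
        if self.cached_tail:
            self.file_obj.write(bytes(self.cached_tail))
            self.cached_tail.clear()


sink = io.BytesIO()
writer = TailCachingWriter(sink, stripe_size=4)
writer.write_last_process(b"IJKLMNOPQRSTUVW")
print(sink.getvalue())  # b'IJKLMNOPQRST'
writer.close()
print(sink.getvalue())  # b'IJKLMNOPQRSTUVW'
```

The sketch leaves the underlying file object open after `close()` so that its contents can be inspected; a real writer would also release the file handle.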
In one embodiment, as shown in fig. 3, the specific process of step "determining a target data block of a target write process for each target write process of a plurality of write processes" includes:
Step 302, for the target write process, in the case that a previous write process of the target write process exists, a remaining data block of the previous write process is obtained, and an original data block of the target write process is obtained.
And step 304, splicing the residual data block of the previous writing process and the original data block of the target writing process to obtain the target data block of the target writing process.
The previous write process of the target write process may be the process that immediately precedes the target write process in the arrangement order of the processes.
In implementation, for a target write process, the terminal may determine whether there are other processes before the target write process, that is, determine whether the target write process is the initial write process, where the initial write process may be the write process that is located first in the process arrangement order. In the case that the terminal determines that a previous write process of the target write process exists, the terminal can acquire the remaining data block of the previous write process through the preset data transmission network. The terminal can also acquire the original data block carried by the target write process, and splice the remaining data block of the previous write process with the original data block of the target write process to obtain the target data block of the target write process.
In this embodiment, the data block of the previous writing process may be exchanged to the target writing process, so as to implement data writing without distributed lock contention, and improve data writing efficiency.
In one embodiment, the data writing method further includes:
for the target writing process, determining the original data block of the target writing process as the target data block of the target writing process in the case that no previous writing process of the target writing process exists.
In implementation, for a target write process, the terminal may determine whether there are other processes before the target write process, that is, determine whether the target write process is the initial write process, where the initial write process may be the write process that is located first in the process arrangement order. When the terminal determines that no previous writing process of the target writing process exists, that is, the target writing process is the initial writing process, the terminal can acquire the original data block carried by the target writing process and take the original data block as the target data block of the target writing process.
In this embodiment, the original data block of the writing process may be used as the target data block, so as to implement data writing with strict stripe alignment.
In one embodiment, the data writing method further includes:
for the target writing process, in the case that a previous writing process of the target writing process exists and the previous writing process has no remaining data block, determining the original data block of the target writing process as the target data block of the target writing process.
In implementation, when the terminal determines that a previous writing process of the target writing process exists but that the previous writing process has no remaining data block, the terminal does not need to exchange data through the preset data transmission network, so the terminal can determine the original data block of the target writing process as the target data block of the target writing process.
In this embodiment, the original data block of the writing process may be used as the target data block, so as to implement data writing with strict stripe alignment.
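The three cases for determining the target data block (previous process with a remainder, no previous process, previous process without a remainder) can be summarised in one small function. This is a sketch only; the name `determine_target_block` and the convention of passing `None` when no previous writing process exists are assumptions.

```python
from typing import Optional


def determine_target_block(original: bytes,
                           previous_remainder: Optional[bytes]) -> bytes:
    """Return the target data block of a writing process.

    previous_remainder is None when no previous writing process exists,
    and b"" when a previous process exists but left no remaining data block.
    """
    if previous_remainder is None:
        # Initial writing process: the original block is the target block.
        return original
    if not previous_remainder:
        # Previous process was already stripe-aligned: nothing to exchange.
        return original
    # Splice the previous remainder in front of the original block.
    return previous_remainder + original


print(determine_target_block(b"KLMN", None))   # b'KLMN'
print(determine_target_block(b"KLMN", b""))    # b'KLMN'
print(determine_target_block(b"KLMN", b"IJ"))  # b'IJKLMN'
```

The first two branches return the same value but are kept separate so that each maps onto one embodiment described above.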
In a distributed system, in order to ensure consistency of data information, the data information needs to be protected by a distributed lock (distributed interval lock). When a distributed application operates on the same file synchronously through multiple access points, contention for the distributed lock arises, together with the resulting distributed lock requests and recalls. The contention for the distributed lock is therefore a performance bottleneck when an OpenMPI application writes files. OpenMPI is a multi-process application, and a timing chart of its interaction with a distributed file system with multiple access points is shown in fig. 4: when a write process (rank0 or rank1) requests the distributed lock and the lock is in use by another access point, a recall operation is performed first, the local cache of the original access point is cleared, and the distributed lock server then caches the distributed lock to the requesting node.
According to the data writing method, the stripe alignment of the written data can be strictly ensured through the exchange and splicing of data, the contention for the distributed lock and the occurrence of supplementary reads are fundamentally avoided, and writing efficiency is greatly improved. For OpenMPI application scenarios that only write and do not read, the distributed interval lock can be turned off through the ioctl interface provided by the distributed file system, so that the logic of the distributed interval lock is skipped entirely and writing efficiency is further improved. The data writing method provided by the application can use the OpenMPI processes to communicate at high speed, exchange and splice the written data, ensure strict stripe alignment, and avoid the overhead of the distributed lock in the distributed file system. For non-OpenMPI applications, the ioctl interface is not called during writing and the writing logic is unchanged, so compatibility is ensured.
It should be understood that, although the steps in the flowcharts related to the above embodiments are shown sequentially as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the steps are not strictly limited to that order of execution and may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with at least some of the other steps or stages.
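Why the exchange removes lock contention can be illustrated by counting the stripes that more than one process touches. The sketch below assumes that the processes write contiguous byte ranges in rank order starting at offset 0 and that contention arises on any stripe touched by two processes; the function names and the 4-byte stripe size are hypothetical.

```python
def stripes_touched(offset, length, stripe_size):
    """Indices of the stripes covered by the byte range [offset, offset + length)."""
    if length == 0:
        return set()
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    return set(range(first, last + 1))


def shared_stripes(ranges, stripe_size):
    """Stripes written by more than one process; ranges is a list of (offset, length)."""
    seen = set()
    shared = set()
    for offset, length in ranges:
        touched = stripes_touched(offset, length, stripe_size)
        shared |= touched & seen
        seen |= touched
    return shared


# Without exchange: rank0 writes bytes [0, 10), rank1 writes bytes [10, 23).
naive = [(0, 10), (10, 13)]
# With exchange: rank0 writes bytes [0, 8), rank1 writes bytes [8, 20).
exchanged = [(0, 8), (8, 12)]

print(shared_stripes(naive, 4))      # {2}
print(shared_stripes(exchanged, 4))  # set()
```

In the unaligned layout both processes write into stripe 2, which is where the two access points would compete for the same lock range; after the exchange every stripe belongs to exactly one process.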
Based on the same inventive concept, the embodiment of the application also provides a data writing device for realizing the above related data writing method. The implementation of the solution provided by the device is similar to the implementation described in the above method, so the specific limitation of one or more embodiments of the data writing device provided below may refer to the limitation of the data writing method hereinabove, and will not be repeated here.
In one embodiment, as shown in FIG. 5, there is provided a data writing apparatus 500 comprising:
a first obtaining module 502, configured to respond to a data writing request, and obtain a plurality of writing processes corresponding to the data writing request;
a first determining module 504, configured to determine, for a target write process of the plurality of write processes, a target data block of the target write process;
the writing module 506 is configured to divide the target data block according to the size of the preset stripe, obtain a first data block that meets the size of the preset stripe and a remaining data block that does not meet the size of the preset stripe, and write data in the first data block corresponding to the target writing process through the target writing process;
and the transmission module 508 is configured to transmit, when there is a next write process of the target write process, the remaining data block to the next write process of the target write process through the preset data transmission network, so that the next write process determines that the remaining data block and an original data block of the next write process are the target data block, and perform, as a new target write process, a step of dividing the target data block according to a size of a preset stripe.
In one embodiment, the data writing device 500 further includes:
and the caching module is used for caching the residual data blocks corresponding to the target writing process into a preset caching space under the condition that the next writing process does not exist in the target writing process.
In one embodiment, the data writing device 500 further includes:
and the second acquisition module is used for acquiring the residual data blocks in the preset cache space after detecting the closing instruction of the file to be written corresponding to the data writing request, and splicing and writing the residual data blocks into the file to be written.
In one embodiment, the first determining module 504 is specifically configured to:
for the target writing process, in the case that a previous writing process of the target writing process exists, acquiring a remaining data block of the previous writing process, and acquiring an original data block of the target writing process;
and splicing the residual data block of the previous writing process and the original data block of the target writing process to obtain the target data block of the target writing process.
In one embodiment, the data writing device 500 further includes:
and the second determining module is used for determining, for the target writing process, that the original data block of the target writing process is the target data block of the target writing process in the case that no previous writing process of the target writing process exists.
In one embodiment, the data writing device 500 further includes:
and the third determining module is used for determining, for the target writing process, that the original data block of the target writing process is the target data block of the target writing process in the case that a previous writing process of the target writing process exists and the previous writing process has no remaining data block.
The respective modules in the above-described data writing apparatus 500 may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or be independent of, a processor in the computer device in the form of hardware, or may be stored in a memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 6. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is for storing the write data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a data writing method.
It will be appreciated by those skilled in the art that the structure shown in fig. 6 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In an embodiment, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.
In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the various embodiments provided herein may include at least one of non-volatile and volatile memory. The nonvolatile Memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash Memory, optical Memory, high density embedded nonvolatile Memory, resistive random access Memory (ReRAM), magnetic random access Memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric Memory (Ferroelectric Random Access Memory, FRAM), phase change Memory (Phase Change Memory, PCM), graphene Memory, and the like. Volatile memory can include random access memory (Random Access Memory, RAM) or external cache memory, and the like. By way of illustration, and not limitation, RAM can be in the form of a variety of forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM), and the like. The databases referred to in the various embodiments provided herein may include at least one of relational databases and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processors referred to in the embodiments provided herein may be general purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, quantum computing-based data processing logic units, etc., without being limited thereto.
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction between the combinations of the technical features, they should be considered to be within the scope of this description.
The above examples represent only a few embodiments of the present application. They are described in specific detail, but are not to be construed as limiting the scope of the present application. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, all of which would fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.

Claims (10)

1. A method of writing data, the method comprising:
responding to a data writing request, and acquiring a plurality of writing processes corresponding to the data writing request;
determining a target data block of a target writing process in a plurality of writing processes;
dividing the target data block according to the size of a preset stripe to obtain a first data block meeting the size of the preset stripe and a residual data block not meeting the size of the preset stripe, and writing data in the first data block corresponding to the target writing process through the target writing process;
and under the condition that the next writing process of the target writing process exists, transmitting the residual data block to the next writing process of the target writing process through a preset data transmission network, so that the next writing process determines the residual data block and the original data block of the next writing process as target data blocks, and executing the step of dividing the target data blocks according to the size of a preset stripe as a new target writing process.
2. The method of claim 1, wherein after the step of dividing the target data block by a predetermined stripe size to obtain a first data block satisfying the predetermined stripe size and a remaining data block not satisfying the predetermined stripe size, the method further comprises:
and under the condition that the next writing process does not exist in the target writing process, caching the residual data blocks corresponding to the target writing process into a preset caching space.
3. The method according to claim 2, wherein the method further comprises:
after detecting a closing instruction of a file to be written corresponding to the data writing request, acquiring the residual data blocks in the preset cache space, and splicing and writing the residual data blocks into the file to be written.
4. The method of claim 1, wherein the determining, for each target write process of the plurality of write processes, a target data block for the target write process comprises:
for the target writing process, in the case that a previous writing process of the target writing process exists, acquiring a remaining data block of the previous writing process, and acquiring an original data block of the target writing process;
and splicing the residual data block of the previous writing process and the original data block of the target writing process to obtain the target data block of the target writing process.
5. The method according to claim 4, wherein the method further comprises:
for the target writing process, determining that the original data block of the target writing process is the target data block of the target writing process in the case that no previous writing process of the target writing process exists.
6. The method according to claim 4, wherein the method further comprises:
for the target writing process, determining that the original data block of the target writing process is the target data block of the target writing process in the case that a previous writing process of the target writing process exists and the previous writing process has no remaining data block.
7. A data writing apparatus, the apparatus comprising:
the first acquisition module is used for responding to a data writing request and acquiring a plurality of writing processes corresponding to the data writing request;
the first determining module is used for determining a target data block of a target writing process aiming at the target writing process in the plurality of writing processes;
the writing module is used for dividing the target data block according to the size of a preset stripe to obtain a first data block meeting the size of the preset stripe and a residual data block not meeting the size of the preset stripe, and writing data in the first data block corresponding to the target writing process through the target writing process;
and the transmission module is used for transmitting the residual data block to the next writing process of the target writing process through a preset data transmission network under the condition that the next writing process of the target writing process exists, so that the next writing process determines the residual data block and the original data block of the next writing process as target data blocks and serves as a new target writing process, and the step of dividing the target data blocks according to the size of a preset strip is executed.
8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 6 when the computer program is executed.
9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 6.
10. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method of any of claims 1 to 6.
CN202211624203.9A 2022-12-15 2022-12-15 Data writing method, device, computer equipment and storage medium Pending CN116225314A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211624203.9A CN116225314A (en) 2022-12-15 2022-12-15 Data writing method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211624203.9A CN116225314A (en) 2022-12-15 2022-12-15 Data writing method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116225314A true CN116225314A (en) 2023-06-06

Family

ID=86589990

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211624203.9A Pending CN116225314A (en) 2022-12-15 2022-12-15 Data writing method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116225314A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117854553A (en) * 2024-03-06 2024-04-09 北京云豹创芯智能科技有限公司 Data shaping circuit, method and chip

Similar Documents

Publication Publication Date Title
US10606806B2 (en) Method and apparatus for storing time series data
US10552936B2 (en) Solid state storage local image processing system and method
US9542122B2 (en) Logical block addresses used for executing host commands
WO2017050064A1 (en) Memory management method and device for shared memory database
CN111708738A (en) Method and system for realizing data inter-access between hdfs of hadoop file system and s3 of object storage
CN115470156A (en) RDMA-based memory use method, system, electronic device and storage medium
CN111309805B (en) Data reading and writing method and device for database
CN116225314A (en) Data writing method, device, computer equipment and storage medium
CN115686881A (en) Data processing method and device and computer equipment
WO2024021470A1 (en) Cross-region data scheduling method and apparatus, device, and storage medium
CN115686932A (en) Backup set file recovery method and device and computer equipment
CN107451070B (en) Data processing method and server
US8543722B2 (en) Message passing with queues and channels
CN105808451B (en) Data caching method and related device
CN113127438B (en) Method, apparatus, server and medium for storing data
CN116578410A (en) Resource management method, device, computer equipment and storage medium
US10678453B2 (en) Method and device for checking false sharing in data block deletion using a mapping pointer and weight bits
CN116048878A (en) Business service recovery method, device and computer equipment
CN112764897B (en) Task request processing method, device and system and computer readable storage medium
CN111274176B (en) Information processing method, electronic equipment, system and storage medium
CN111625502A (en) Data reading method and device, storage medium and electronic device
US10430081B2 (en) Methods for minimizing fragmentation in SSD within a storage system and devices thereof
CN114710441B (en) Link aggregation method, system, computer equipment and storage medium
CN116775510B (en) Data access method, device, server and computer readable storage medium
CN115456858B (en) Image processing method, device, computer equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination