CN107179929B - WRITE_SAME function optimization implementation method and device

Info

Publication number: CN107179929B
Application number: CN201710338288.7A
Authority: CN
Inventors: 刘斌
Original assignee: Suzhou Wave Intelligent Technology Co Ltd
Current assignee: Suzhou Wave Intelligent Technology Co Ltd
Priority date: 2017-05-15
Filing date: 2017-05-15
Publication date: 2020-02-07
Anticipated expiration: 2037-05-15
Also published as: CN107179929A

Abstract

The invention discloses a method and a device for implementing the SCSI WRITE_SAME command in a Ceph + tgt architecture. The method comprises: allocating a predetermined buffer for the WRITE_SAME function; repeatedly copying the data to be written into the predetermined buffer until the buffer is filled with that data; and writing the filled buffer to back-end storage through the rbd_write() interface. Compared with the existing scheme, which writes the data by calling the rbd_write() interface many times, this scheme first assembles the data to be written in the predetermined buffer and then writes it to back-end storage with a single rbd_write() call. This greatly optimizes the implementation of write_same and resolves the compatibility problem between the Ceph + tgt architecture and VMware systems.

Description

WRITE_SAME function optimization implementation method and device

Technical Field

The invention relates to the technical field of function optimization, and in particular to a method and a device for implementing the SCSI WRITE_SAME command in a Ceph + tgt architecture.

Background

In a Ceph + tgt architecture, tgt exposes the distributed storage implemented by Ceph through a standard iSCSI protocol interface, so that a client can access that storage with a standard iSCSI initiator and needs no other, more complex mechanism. Compatibility with VMware is a problem that most storage products have to face. When a Ceph + tgt architecture serves a VMware system as the client, loading a virtual machine in a mapped LUN is found to be very slow. Testing shows that copying data to the mapped LUN is also very slow, with a back-end throughput of about 70 KB/s, which is why loading the virtual machine takes so long.

Therefore, how to solve the above problem and improve the compatibility of the Ceph + tgt storage architecture with VMware systems is a problem to be solved by those skilled in the art.

Disclosure of Invention

The invention aims to provide a method and a device for implementing the SCSI WRITE_SAME command in a Ceph + tgt architecture, so as to improve the compatibility of the Ceph + tgt storage architecture with VMware systems.

In order to achieve the above purpose, the embodiment of the present invention provides the following technical solutions:

A method for implementing the SCSI WRITE_SAME command in a Ceph + tgt architecture comprises the following steps:

allocating a predetermined buffer for the WRITE_SAME function;

repeatedly copying data to be written into the predetermined buffer, so as to fill the predetermined buffer with the data to be written;

writing the filled predetermined buffer to back-end storage through the rbd_write() interface.

Wherein the allocating a predetermined buffer for the WRITE_SAME function includes:

determining a first byte number of the data to be written;

determining a second byte number of the predetermined buffer according to the first byte number;

allocating a predetermined buffer of the second byte number for the WRITE_SAME function.

Wherein the allocating the predetermined buffer of the second byte number for the WRITE_SAME function includes: allocating the predetermined buffer of the second byte number for the WRITE_SAME function by using the malloc() function.

Wherein the repeatedly copying the data to be written into the predetermined buffer to fill the predetermined buffer with the data to be written includes:

obtaining the data to be written by using the scsi_get_out_buffer(cmd) function;

repeatedly copying the data to be written of the first byte number into the predetermined buffer of the second byte number, so as to fill the predetermined buffer of the second byte number with the data to be written of the first byte number.

Wherein the repeatedly copying the data to be written of the first byte number into the predetermined buffer of the second byte number includes:

repeatedly copying the data to be written of the first byte number into the predetermined buffer of the second byte number through a memcpy function interface or a strncpy function interface.

An apparatus for implementing the SCSI WRITE_SAME command in a Ceph + tgt architecture comprises:

an allocation module, configured to allocate a predetermined buffer for the WRITE_SAME function;

a copying module, configured to repeatedly copy data to be written into the predetermined buffer, so as to fill the predetermined buffer with the data to be written;

a data writing module, configured to write the filled predetermined buffer to back-end storage through the rbd_write() interface.

Wherein the allocation module comprises:

a first determining unit, configured to determine a first byte number of the data to be written;

a second determining unit, configured to determine a second byte number of the predetermined buffer according to the first byte number;

an allocation unit, configured to allocate a predetermined buffer of the second byte number for the WRITE_SAME function.

Wherein the allocation unit allocates the predetermined buffer of the second byte number for the WRITE_SAME function by using the malloc() function.

Wherein the copying module comprises:

an obtaining unit, configured to obtain the data to be written by using the scsi_get_out_buffer(cmd) function;

a copying unit, configured to repeatedly copy the data to be written of the first byte number into the predetermined buffer of the second byte number, so as to fill the predetermined buffer of the second byte number with the data to be written of the first byte number.

Wherein the copying unit repeatedly copies the data to be written of the first byte number into the predetermined buffer of the second byte number through a memcpy function interface or a strncpy function interface.

As can be seen from the above solutions, the method for implementing the SCSI WRITE_SAME command in a Ceph + tgt architecture provided in the embodiments of the present invention includes: allocating a predetermined buffer for the WRITE_SAME function; repeatedly copying data to be written into the predetermined buffer, so as to fill the predetermined buffer with the data to be written; and writing the filled predetermined buffer to back-end storage through the rbd_write() interface.

Compared with the existing scheme, which writes the data by calling the rbd_write() interface many times, this scheme first assembles the data to be written in the predetermined buffer and then writes it to back-end storage with a single rbd_write() call. This greatly optimizes the implementation of write_same and resolves the compatibility problem between the Ceph + tgt architecture and VMware systems. The invention also discloses a device for implementing the SCSI WRITE_SAME command in a Ceph + tgt architecture, which achieves the same technical effect.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of a method for implementing the SCSI WRITE_SAME command in a Ceph + tgt architecture according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of data to be written being copied into a predetermined buffer according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of an apparatus for implementing the SCSI WRITE_SAME command in a Ceph + tgt architecture according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that when a Ceph + tgt architecture serves a VMware system as the client, copying data and loading a virtual machine in a mapped LUN are found to be very slow, which causes the compatibility problem between the architecture and the VMware system. A VMware system copies data differently from a common system (Linux, Windows, or the like): a common system needs only the write command to copy a file, whereas a VMware system needs the write_same command in addition to the ordinary write command. The write_same command is briefly described as follows:

write_same mainly implements two functions:

1. the unmap function: unmap is implemented by directly calling the rbd_discard() function interface in Ceph;

2. the data of one LBA is read from the write_buf and then written to consecutive LBAs starting from a given address.

It should be noted that function 2 mainly wraps the rbd_write() function interface of Ceph. The main related source code is at line 271 and line 308 of the bs_rbd.c file under the usr directory in tgt.

By analyzing the source code, the main process of the existing write_same implementation is found to be as follows:

1. obtaining the block size, blocksize, of 512 B through 1 << cmd->dev->blk_shift;

2. obtaining the data to be written through tmpbuf = scsi_get_out_buffer(cmd), where tmpbuf is 512 B in size;

3. calling the rbd_write() interface tl/blocksize times in a while loop, where tl is the total number of bytes to be written.

In this process, each rbd_write() call has to pass through a number of stages at the back end, which is complex and time-consuming. The existing write_same implementation writes small blocks of data and calls the rbd_write() interface many times, so it consumes a great deal of time.
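The per-block process described above can be sketched as follows. This is only an illustration, not the tgt source code: `mock_image`, `mock_write`, and `write_same_per_block` are hypothetical names, and `mock_write` is an in-memory stand-in for rbd_write() that counts how often it is called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory stand-in for an RBD image; not the librbd API. */
struct mock_image {
    uint8_t data[4096];
    unsigned write_calls;   /* how many times the back-end write was invoked */
};

/* Stand-in for rbd_write(): copies len bytes into the image at offset. */
static long mock_write(struct mock_image *img, uint64_t offset,
                       size_t len, const void *buf)
{
    if (offset > sizeof(img->data) || len > sizeof(img->data) - offset)
        return -1;
    memcpy(img->data + offset, buf, len);
    img->write_calls++;
    return (long)len;
}

/* Per-block write_same: one back-end call for every block in the range,
 * that is, tl / blocksize calls in total. */
static int write_same_per_block(struct mock_image *img, uint64_t offset,
                                size_t tl, const void *block, size_t blocksize)
{
    assert(blocksize > 0 && tl % blocksize == 0);
    while (tl > 0) {
        if (mock_write(img, offset, blocksize, block) < 0)
            return -1;
        offset += blocksize;
        tl -= blocksize;
    }
    return 0;
}
```

With a 512-byte block and tl = 2048, this sketch issues four back-end writes; the number of calls grows linearly with the length of the range.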

Therefore, the embodiments of the invention disclose a method and a device for implementing the SCSI WRITE_SAME command in a Ceph + tgt architecture. By adjusting the logic of the WRITE_SAME implementation, they greatly reduce the number of calls to the rbd_write() function interface and save time, thereby improving WRITE_SAME performance and the compatibility of the Ceph + tgt storage architecture with VMware systems.

Referring to Fig. 1, a method for implementing the SCSI WRITE_SAME command in a Ceph + tgt architecture provided by an embodiment of the present invention includes:

s101, distributing a preset buffer to a WRITE _ SAME function;

determining a first byte number of data to be written;

Specifically, the WRITE _ SAME function may be allocated with a predetermined buffer of the second number of bytes by using a malloc () function.

In this embodiment, a buffer of the second byte number tl may be allocated in the memory, specifically, may be allocated using a malloc () function, and the second byte number t1 may be determined according to the first byte number blocksize of the data to be written, for example: the second byte count may be set to be an integer multiple of the first byte count so as to repeatedly write the data to be written into the predetermined buffer.
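As a minimal sketch of this sizing rule, assuming the second byte number is simply the block size multiplied by a block count, the allocation could look like the following. The helper names `write_same_buffer_size` and `alloc_write_same_buffer` are hypothetical and do not come from tgt; the overflow check is an added precaution that the text does not mention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns blocksize * nblocks, or 0 if either input is zero or the
 * product would overflow size_t. The result is always an integer
 * multiple of blocksize. */
static size_t write_same_buffer_size(size_t blocksize, size_t nblocks)
{
    if (blocksize == 0 || nblocks == 0)
        return 0;
    if (nblocks > SIZE_MAX / blocksize)
        return 0;
    return blocksize * nblocks;
}

/* Allocates the buffer with malloc() and reports its size through tl_out.
 * Returns NULL if the size is invalid or the allocation fails. */
static void *alloc_write_same_buffer(size_t blocksize, size_t nblocks,
                                     size_t *tl_out)
{
    size_t tl = write_same_buffer_size(blocksize, nblocks);
    assert(tl_out != NULL);
    *tl_out = tl;
    return tl ? malloc(tl) : NULL;
}
```

The caller owns the returned buffer and releases it with free() once the data has been written.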

S102, repeatedly copying data to be written into the predetermined buffer, so as to fill the predetermined buffer with the data to be written;

Referring to Fig. 2, in this scheme the data to be written is obtained from the command, that is, tmpbuf = scsi_get_out_buffer(cmd). The data to be written, tmpbuf, is then copied repeatedly into the allocated memory until the buffer of tl bytes is filled, which may be implemented with a function interface such as memcpy or strcpy.
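The fill step can be sketched as follows, assuming tl is an integer multiple of the block size. The function name `fill_with_block` is hypothetical. The sketch uses memcpy because the C string functions stop copying source data at the first zero byte, which matters when a block contains binary data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fills dst (tl bytes) with back-to-back copies of one block of
 * blocksize bytes. */
static void fill_with_block(void *dst, size_t tl,
                            const void *block, size_t blocksize)
{
    unsigned char *p = dst;

    assert(blocksize > 0 && tl % blocksize == 0);
    for (size_t off = 0; off < tl; off += blocksize)
        memcpy(p + off, block, blocksize);
}
```

After the call, every blocksize-aligned slice of dst is byte-for-byte identical to the source block.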

S103, writing the filled predetermined buffer to back-end storage through the rbd_write() interface.

It can be understood that, in this scheme, the filled buffer is written to back-end storage in one operation through the rbd_write() interface, specifically by calling rbd_write(rbd->rbd_image, offset, tl, buffer). With this implementation of write_same, a single rbd_write() call suffices to write the data to back-end storage, which greatly optimizes the implementation of write_same and resolves the compatibility problem between the Ceph + tgt architecture and VMware systems.
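Steps S101 to S103 can be put together in one sketch. This is an illustration under stated assumptions rather than the patented code: `mock_image` and `mock_write` are hypothetical in-memory stand-ins for the RBD image and rbd_write(), with the arguments in the same order as in the call quoted above, and `write_same_buffered` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stand-in for an RBD image; not the librbd API. */
struct mock_image {
    uint8_t data[4096];
    unsigned write_calls;   /* how many times the back-end write was invoked */
};

/* Stand-in for rbd_write(): copies len bytes into the image at offset. */
static long mock_write(struct mock_image *img, uint64_t offset,
                       size_t len, const void *buf)
{
    if (offset > sizeof(img->data) || len > sizeof(img->data) - offset)
        return -1;
    memcpy(img->data + offset, buf, len);
    img->write_calls++;
    return (long)len;
}

/* Buffered write_same: allocate tl bytes (S101), fill them with copies of
 * the block (S102), then issue a single back-end write (S103). */
static int write_same_buffered(struct mock_image *img, uint64_t offset,
                               size_t tl, const void *block, size_t blocksize)
{
    assert(blocksize > 0 && tl % blocksize == 0);
    if (tl == 0)
        return 0;

    uint8_t *buffer = malloc(tl);
    if (buffer == NULL)
        return -1;

    for (size_t off = 0; off < tl; off += blocksize)
        memcpy(buffer + off, block, blocksize);

    long written = mock_write(img, offset, tl, buffer);
    free(buffer);
    return written == (long)tl ? 0 : -1;
}
```

For the same 2048-byte range that needs four calls in the per-block approach, this sketch issues one back-end write, at the cost of a temporary allocation of tl bytes.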

The following describes an implementation apparatus provided in an embodiment of the present invention, and the implementation apparatus described below and the implementation method described above may be referred to each other.

Referring to Fig. 3, an apparatus for implementing the SCSI WRITE_SAME command in a Ceph + tgt architecture according to an embodiment of the present invention includes:

an allocation module 100, configured to allocate a predetermined buffer for the WRITE_SAME function;

a copying module 200, configured to repeatedly copy data to be written into the predetermined buffer, so as to fill the predetermined buffer with the data to be written;

a data writing module 300, configured to write the filled predetermined buffer to back-end storage through the rbd_write() interface.

Based on the above embodiment, the allocation module includes:

Based on the above embodiment, the allocation unit allocates the predetermined buffer of the second byte number for the WRITE_SAME function by using the malloc() function.

Based on the above embodiment, the copy module includes:

Based on the above embodiment, the copying unit repeatedly copies the data to be written of the first byte number into the predetermined buffer of the second byte number through a memcpy function interface or a strncpy function interface.

In summary, under the Ceph + tgt architecture, the existing implementation of write_same in tgt has a problem: it calls the rbd_write() write interface frequently, and each call to that interface is complex to execute and time-consuming.

The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A method for implementing the SCSI WRITE_SAME command in a Ceph + tgt architecture, characterized by comprising the following steps:

allocating a predetermined buffer for the WRITE_SAME function;

and writing the data to be written in the filled predetermined buffer to back-end storage through the rbd_write() interface.

2. The method of claim 1, wherein said allocating a predetermined buffer for the WRITE_SAME function comprises:

determining a first byte number of the data to be written;

determining a second byte number of the predetermined buffer according to the first byte number;

3. The method of claim 2, wherein said allocating a predetermined buffer of said second byte number for the WRITE_SAME function comprises:

allocating the predetermined buffer of the second byte number for the WRITE_SAME function by using the malloc() function.

4. The method according to claim 3, wherein the repeatedly copying the data to be written to the predetermined buffer to fill the predetermined buffer with the data to be written comprises:

5. The method of claim 4, wherein the repeatedly copying the first byte number of data to be written to the predetermined buffer of the second byte number comprises:

6. An apparatus for implementing the SCSI WRITE_SAME command in a Ceph + tgt architecture, comprising:

a data writing module, configured to write the data to be written in the filled predetermined buffer to back-end storage through the rbd_write() interface.

7. The apparatus of claim 6, wherein the allocation module comprises:

a second determining unit, configured to determine a second byte number of the predetermined buffer according to the first byte number;

8. The implementation device of claim 7,

the allocation unit allocates the predetermined buffer of the second byte number for the WRITE_SAME function by using the malloc() function.

9. The apparatus of claim 8, wherein the replication module comprises:

10. The implementation device of claim 9,