CN108345431B - Data reading method and device - Google Patents


Publication number
CN108345431B
Authority
CN
China
Prior art keywords
data
read
data set
reading
last
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711490140.1A
Other languages
Chinese (zh)
Other versions
CN108345431A (en
Inventor
高杨东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Huawei Cloud Computing Technology Co ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201711490140.1A
Publication of CN108345431A
Application granted
Publication of CN108345431B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G06F3/0607 Improving or facilitating administration, e.g. storage management by facilitating the process of upgrading existing storage systems, e.g. for improving compatibility between host and storage device
    • G06F3/061 Improving I/O performance
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A data reading method is provided. During a data read, when an access service layer receives a notification message indicating that the data set a client has requested to read is starting to be relocated, the access service layer suspends reading of the remaining data in the data set and records position information identifying the data in the data set that was last read successfully. When a notification that relocation of the data set is complete is received, a new storage location of the data set is determined according to the identifier of the data set, and the remaining data located after the last successfully read data continues to be read at the new storage location. Because reading resumes from the new storage location according to the recorded position information once the data set has been relocated, all of the requested data can be returned within a single read data request, and the client does not need to re-initiate the request because of a read failure.

Description

Data reading method and device
Technical Field
The embodiment of the invention relates to the technical field of storage, in particular to a data reading method and device.
Background
Currently, in an object storage service system, metadata is used to describe attributes of stored data, such as its storage location and size. The stored data and the metadata are generally stored separately, so that storage media and management modes suited to their different characteristics can be chosen for each, which speeds up operations such as reading and enumeration. For example, the size of the data is determined by the user and, apart from the system limiting it to a certain range, cannot be controlled; the metadata, by contrast, is generated by the system, so its size and structure can be customized, and it can even be stored in a structured, relational database.
Writing an object (which comprises the data to be stored and metadata determined from that data) into an object storage system is generally divided into two steps: the data to be stored is stored first, and a piece of metadata is then generated from characteristics of that data and stored.
Reading a stored object follows the opposite flow: generally, the metadata of the stored object is read first; once basic information such as the storage location of the data has been obtained from the metadata, the data is read from the corresponding storage location in the underlying storage system and returned to the user.
To improve utilization of the underlying storage system, the multiple pieces of data that make up one file are generally stored interspersed across different storage spaces; that is, the pieces of data stored in one storage space may belong to different files. Consequently, when a user requests deletion of a file, a large number of discrete blank areas may appear in a storage space, and because these blank areas are scattered rather than contiguous, they can no longer accommodate a large storage object.
As shown in FIG. 1, a storage space stores data 1 to data 5, and a user then initiates a request to delete data 2 and data 4, which leaves discrete blank areas in the storage space. These discrete blank areas are generally difficult to reuse: when a new storage object is written, rescanning earlier storage spaces for a suitable blank area is not feasible, because it would take too long. Data relocation is therefore generally required; that is, the valid data 1, data 3, and data 5 are relocated to a new storage space, and after relocation a contiguous blank space is left in which new data can be stored.
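The FIG. 1 scenario can be sketched as follows. This is a minimal illustration only: the slot-list model of a storage space and the names `Slot`, `delete`, and `relocate_valid` are assumptions made for the example, not part of the described system.

```python
from typing import Optional

# Hypothetical model: a storage space is a list of slots; each slot holds
# (data identifier, payload) or None when it has been blanked by a delete.
Slot = Optional[tuple[int, bytes]]


def delete(space: list[Slot], data_ids: set[int]) -> None:
    """Blank out the slots holding the given data, leaving scattered gaps."""
    for i, slot in enumerate(space):
        if slot is not None and slot[0] in data_ids:
            space[i] = None


def relocate_valid(space: list[Slot]) -> list[Slot]:
    """Copy the surviving data, in order, into a new contiguous space."""
    return [slot for slot in space if slot is not None]


space: list[Slot] = [(i, f"data{i}".encode()) for i in range(1, 6)]
delete(space, {2, 4})               # gaps appear at the positions of data 2 and data 4
new_space = relocate_valid(space)   # data 1, data 3, data 5 packed together
```

After relocation the old space can be reclaimed as a whole, which is why any reader still holding the old location loses access to the data.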
After data relocation in the underlying storage system is completed, the metadata corresponding to the relocated data needs to be updated accordingly to record the new storage location. If a user initiates a read data request before or while the metadata is updated, the read may fail because the data cannot be found. FIG. 2 shows an example in which the data relocation task is indicated by a dotted arrow and the object to be read consists of three pieces of data. The user initiates a read data request; after the storage locations of data 1, data 2, and data 3 are obtained from the old metadata, data 1 is read and returned to the user. By the time data 2 is to be read, data 1, data 2, and data 3 have been moved from the old storage space to the new storage space and the old storage space has been reclaimed by the system. The system treats this as abnormal data loss, keeps retrying at the original storage location, and after many retries finally returns a read failure to the user.
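The failure mode of FIG. 2 can be reproduced with a small sketch. The dictionary-based storage, the location strings, and the names `ReadFailure` and `read_pieces` are hypothetical; the sketch only illustrates a reader that retries at locations taken from stale metadata.

```python
class ReadFailure(Exception):
    """Raised when a piece cannot be found at its recorded location."""


def read_pieces(storage, locations, max_retries=3):
    """Prior-art style read: retry each piece only at its original location.

    storage   -- hypothetical mapping of location -> payload
    locations -- (data identifier, location) pairs taken from the OLD metadata
    """
    for data_id, location in locations:
        for _attempt in range(max_retries):
            if location in storage:
                yield storage[location]
                break
        else:
            # Every retry looked at the same stale location.
            raise ReadFailure(f"data {data_id} not found at {location}")


storage = {"old/1": b"data1", "old/2": b"data2", "old/3": b"data3"}
reader = read_pieces(storage, [(1, "old/1"), (2, "old/2"), (3, "old/3")])
first = next(reader)                      # data 1 is returned to the user

# Relocation happens while the read is in progress; the old space is reclaimed.
for i in (1, 2, 3):
    storage[f"new/{i}"] = storage.pop(f"old/{i}")

try:
    next(reader)                          # data 2 is no longer at "old/2"
except ReadFailure as exc:
    failure = str(exc)
```

Retrying cannot help here, because every attempt uses the same location from the old metadata.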
After the read fails, the user has to initiate a brand-new read data request so that the new metadata is read; only then can data 1, data 2, and data 3 be read again from the new storage space according to the new metadata.
Disclosure of Invention
The embodiment of the invention provides a data reading method and device, which can solve the problem of reading failure in a scene of concurrent data relocation and data reading by a user.
In a first aspect, a data reading method is provided. During a data read, when an access service layer receives a notification message indicating that a data set a client has requested to read is starting to be relocated, the access service layer suspends reading of the remaining data in the data set and records location information identifying the data in the data set that was last read successfully. The data set includes at least one piece of data, each piece of data corresponds to an identifier, and relocating the data set means moving it from one storage space to another storage space. When a notification that relocation of the data set is complete is received, a new storage location of the data set is determined according to the identifier of the data set. Then, according to the recorded location information, the remaining data located after the last successfully read data continues to be read from the relocated data set stored at the new storage location.
Thus, if the access service layer receives, during a data read, a notification message that the data set requested by the client is starting to be relocated, it suspends reading of the remaining data in the data set and records an identifier of the last successfully read data. After receiving the notification that relocation of the data set is complete, it continues reading, from the new storage location of the data set, the remaining data located after the last successfully read data, according to the recorded identifier. All of the requested data can therefore be returned within a single read data request, and re-initiating a read data request because of a read failure is avoided.
In one possible design, the location information for identifying the last successfully read data in the data set may be at least one of: the data identification corresponding to the data which is successfully read last in the data set, the data identification corresponding to each data which is successfully read in the data set, and the data identification corresponding to the next adjacent data of the data which is successfully read last in the data set.
The location of the last successfully read data can be obtained by the location information for identifying the last successfully read data in the data set, so that it can be ensured that the remaining data after the last successfully read data is continuously read from the new storage location of the data set.
In a possible design, the access service layer may read new metadata corresponding to the identifier of the data set according to the identifier of the data set, and determine a new storage location of the data set according to the read new metadata.
The new metadata corresponding to the identifier of the data set is read by means of that identifier, so that the new storage location of the data set is obtained and the remaining unread data can continue to be read there. This avoids the data layer retrying the unread data indefinitely at the old storage location.
In one possible design, before suspending reading of the remaining data in the data set and recording the identifier of the last successfully read data, the access service layer may obtain a read data request sent by the client, the read data request including an identifier of the data set requested to be read; read the metadata of the requested data set according to that identifier and determine the storage location of the requested data set; read at least one piece of data in the requested data set according to that storage location; and establish an input/output (IO) stream with the client to send the read data to the client in the form of a data stream.
In one possible design, after suspending reading of the remaining data in the data set and recording the identifier of the last successfully read data, the access service layer may keep the IO stream established with the client open. After it continues to read, from the relocated data set stored at the new storage location, the remaining data located after the last successfully read data, the access service layer may feed that remaining data into the IO stream maintained with the client and so send it to the client.
By maintaining the IO stream established with the client and feeding the remaining data into the maintained IO stream after the data set has been relocated, the data transmission state is preserved and interruption is avoided.
In a second aspect, a data reading apparatus is provided, comprising an IO interface and a processor. The IO interface is configured to receive and send data and data identifiers. The processor is configured to: during a data read, upon determining that the IO interface has received a notification message indicating that a data set a client has requested to read is starting to be relocated, suspend reading of the remaining data in the data set and record location information identifying the data in the data set that was last read successfully, where the data set includes at least one piece of data, each piece of data corresponds to an identifier, and relocating the data set means moving it from one storage space to another storage space; upon determining that the IO interface has received the notification that relocation of the data set is complete, determine a new storage location of the data set according to the identifier of the data set; and, according to the recorded location information, continue to read the remaining data located after the last successfully read data from the relocated data set stored at the new storage location.
In one possible design, the location information for identifying the last successfully read data in the data set is at least one of:
data identification corresponding to the data which is successfully read last in the data set;
data identification corresponding to each data which is successfully read in the data set;
and the data identification corresponding to the next adjacent data of the data which is successfully read last in the data set.
In one possible design, when the processor determines a new storage location of the data set according to the identifier of the data set, the processor may read new metadata corresponding to the identifier of the data set according to the identifier of the data set; and determining a new storage position of the data set according to the read new metadata.
In one possible design, before suspending reading of remaining data in the data set and recording location information for identifying data that is successfully read last in the data set, the processor may further obtain, through the IO interface, a read data request sent by the client, where the read data request includes an identifier of the data set requested to be read; reading metadata of the data set requested to be read according to the identification of the data set requested to be read, and determining the storage position of the data set requested to be read; reading at least one data in the data set requested to be read according to the storage position of the data set requested to be read; and establishing an IO stream between the IO interface and the client to send the read at least one data to the client in a data stream form.
In one possible design, after suspending reading of the remaining data in the data set and recording location information identifying the last successfully read data in the data set, the processor may further continue to maintain the IO stream established between the IO interface and the client; after the processor continues to read the remaining data after the last successfully read data in the migrated data set stored in the new storage location, the processor may access the remaining data that is continuously read to the IO stream maintained between the IO interface and the client, so as to send the remaining data that is continuously read to the client.
In a third aspect, a data reading apparatus is provided, including a processor and a memory. The memory stores computer-executable instructions, and the processor is connected to the memory via a bus. When the apparatus runs, the processor executes the computer-executable instructions stored in the memory, so that the apparatus performs the method according to the first aspect or any possible design of the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, comprising computer-readable instructions which, when read and executed by a computer, cause the computer to perform the method according to the first aspect or any possible design of the first aspect.
In a fifth aspect, a computer program product is provided, comprising computer-readable instructions which, when read and executed by a computer, cause the computer to perform the method according to the first aspect or any possible design of the first aspect.
Drawings
FIG. 1 is a diagram illustrating data relocation in the prior art;
FIG. 2 is a diagram illustrating a data read in the prior art;
FIG. 3 is a schematic structural diagram of a system architecture according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating a method for reading data according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating a method for reading data according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a data reading apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a data reading apparatus according to an embodiment of the present invention.
Detailed Description
FIG. 3 shows a system architecture to which an embodiment of the present invention is applicable, which includes a client 301 and a storage server 302. The client 301 communicates with the storage server 302 to request that data be read from or written to the storage server 302.
The storage server 302 may include an access service layer, a metadata layer, and a data layer; the access service layer is configured to interact with the client 301, receive a request message sent by the client 301, and perform an operation corresponding to the request message. The metadata layer is used for storing metadata corresponding to data stored in the data layer, and the metadata is mainly information describing data attributes and is used for supporting functions such as indicating storage positions and storage sizes. The data layer is used for storing data requested to be stored by the client.
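As a rough illustration of how the three layers relate, the following sketch models each layer as a small Python class. All class names, field names, and the location strings are assumptions chosen for the example; the text does not prescribe any particular data structures.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataLayer:
    # Hypothetical: data set identifier -> storage location in the data layer.
    locations: dict[int, str] = field(default_factory=dict)


@dataclass
class DataLayer:
    # Hypothetical: storage location -> ordered (data identifier, payload) pairs.
    spaces: dict[str, list[tuple[int, bytes]]] = field(default_factory=dict)


@dataclass
class AccessServiceLayer:
    metadata: MetadataLayer
    data: DataLayer

    def read(self, data_set_id: int) -> list[bytes]:
        """Look up the location in the metadata layer, then read the data layer."""
        location = self.metadata.locations[data_set_id]
        return [payload for _data_id, payload in self.data.spaces[location]]


metadata = MetadataLayer({1: "space-A"})
data = DataLayer({"space-A": [(1, b"data1"), (2, b"data2")]})
service = AccessServiceLayer(metadata, data)
payloads = service.read(1)
```

The client only ever talks to the access service layer; the indirection through the metadata layer is what allows the data layer to move a data set without changing its identifier.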
Data reading is currently one of the most basic services. To guarantee data reliability, each layer of the system has its own reliability safeguards. For example, after the access service layer detects an abnormality, it retries in order to cope with problems such as network instability, transient disconnection, or a busy server; the underlying data layer likewise has a retry mechanism so that the failure of a single disk does not cause the whole request to fail.
However, these reliability safeguards are essentially limited to simple retries within each layer. For example, the retry mechanism of the access service layer is effective only when the service is first established, and the data layer simply keeps trying to read the data from the same storage space and the same position. Consequently, when data relocation is frequent and runs concurrently with data reads, the read failure rate rises significantly. If the data being read is large, the failure to read one segment interrupts the IO stream between the server and the user, and the data already transmitted successfully is discarded entirely, which wastes considerable network resources and time.
To solve the above problem, FIG. 4 shows an example flow of a data reading method provided by an embodiment of the present invention. The flow may be performed by an access service layer, which may be located in a storage server.
Step 401, during a data read, when the access service layer receives a notification message indicating that the data set the client has requested to read is starting to be relocated, the access service layer suspends reading of the remaining data in the data set and records location information identifying the last successfully read data in the data set.
In the embodiment of the present invention, the data set requested by the client may include at least one piece of data, and each piece of data corresponds to an identifier. For example, data set 1 includes data 1, data 2, ..., and data n, where n is a positive integer and "data n" is the data identifier of the nth piece of data. The numeric data set identifiers and data identifiers used here are only an example; in a specific application, any other identifiers that distinguish different data and different data sets may be used, which is not limited here. Data set relocation may be understood as the process of moving a data set from one storage space to another.
While reading data, the access service layer may encounter a data relocation task performed by the data layer. In that case it receives, from the server background, a notification message that relocation of a data set has started; that is, a relocation task for data stored in the data layer is running at the same time as the read task, or in other words the data set being read is being relocated. The access service layer then suspends reading of the data remaining in the data set currently being read and records location information identifying the last successfully read data in the data set. This location information may be at least one of: the data identifier of the last successfully read data in the data set; the data identifiers of all the data in the data set that have been read successfully; and the data identifier of the data immediately following the last successfully read data in the data set.
That is, what is recorded may be the data identifier of the last successfully read data, the data identifier of the first piece of remaining data still to be read, or the data identifiers of every piece of data already read successfully. For example, suppose the data set contains four pieces of data: data 1, data 2, data 3, and data 4. If, after data 1 and data 2 have been read and as reading of data 3 is about to start, a notification is received that the whole data set is being relocated to another storage space, the access service layer may record the data identifier of data 2, the data identifiers of data 1 and data 2, or the data identifier of data 3, the unread data immediately following data 2.
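The three alternative forms of location information can be sketched as a small record. The names `PositionInfo` and `record_position`, and the form labels "last", "all", and "next", are hypothetical and serve only to mirror the example with data 1 to data 4.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PositionInfo:
    """Exactly one field is filled in, depending on the chosen form."""
    last_read_id: Optional[int] = None            # identifier of the last successfully read data
    read_ids: Optional[tuple[int, ...]] = None    # identifiers of all successfully read data
    next_unread_id: Optional[int] = None          # identifier of the next adjacent unread data


def record_position(data_ids: list[int], read_count: int, form: str) -> PositionInfo:
    """Record where a suspended read stopped.

    data_ids   -- identifiers of the data set, in read order
    read_count -- how many pieces were read successfully before the pause
    form       -- "last", "all", or "next" (hypothetical labels)
    """
    read = data_ids[:read_count]
    if form == "last":
        return PositionInfo(last_read_id=read[-1] if read else None)
    if form == "all":
        return PositionInfo(read_ids=tuple(read))
    if form == "next":
        nxt = data_ids[read_count] if read_count < len(data_ids) else None
        return PositionInfo(next_unread_id=nxt)
    raise ValueError(f"unknown form: {form}")


# Data 1 and data 2 were read before the relocation notification arrived.
by_last = record_position([1, 2, 3, 4], 2, "last")   # records data 2
by_all = record_position([1, 2, 3, 4], 2, "all")     # records data 1 and data 2
by_next = record_position([1, 2, 3, 4], 2, "next")   # records data 3
```

Any one of the three forms is enough to locate the resume point later, as long as the order of the data within the data set is known.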
In this case, the access service layer may suspend the client's task of reading the data in the data set and resume reading once the relocation of the data set is complete.
Before suspending the read task, the access service layer generally obtains a read data request sent by the client, where the read data request includes an identifier of the data set requested to be read. The requested data set includes at least one piece of data. Using the identifier of the requested data set, the access service layer reads the corresponding metadata from the metadata layer and determines from it the storage location of the requested data set, that is, the old storage location of the data set in the data layer before relocation. The metadata is information that the server generates after storing the data of the data set in the data layer and that describes the attributes of the data set. After determining the storage location, the access service layer may read at least one piece of data in the requested data set from that storage location in the data layer. The access service layer then establishes an IO stream with the client to send the read data to the client in the form of a data stream.
While suspending reading of the remaining data in the data set currently being read, the access service layer may also keep the IO stream established with the client open so that the IO stream is not interrupted. Because the IO stream is kept open, the user at the client sees neither a read failure nor a pause in reading; from the user's point of view, data transmission is continuous and uninterrupted, which improves the user experience.
Step 402, when receiving the notification of the completion of the data set relocation, the access service layer determines a new storage location of the data set according to the identifier of the data set.
When relocation of the data set requested by the client is completed in the data layer, the server background may send a notification that the relocation is complete. After the relocation is completed, new metadata is generated and stored in the metadata layer. Because the metadata is updated after the data in the data set is relocated, the data still to be read may no longer be at the previous storage location when the read task resumes, and the remaining data in the data set cannot be read at the old storage location. Upon receiving the notification, the access service layer therefore reads from the metadata layer, according to the identifier of the data set, the new metadata generated after the relocation, which indicates the new storage location of the data set. Once the new metadata corresponding to the identifier of the data set has been read, the new storage location of the data set is obtained.
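The point of step 402 is that the location cached when the read began is stale after relocation and must be refreshed by data set identifier. A minimal sketch, assuming the metadata layer can be modeled as a plain dictionary and using the hypothetical names `ReadContext` and `refresh_location`:

```python
from dataclasses import dataclass


@dataclass
class ReadContext:
    data_set_id: int
    location: str   # storage location cached from the metadata when the read began


def refresh_location(ctx: ReadContext, metadata: dict[int, str]) -> str:
    """Re-read the metadata by data set identifier after relocation completes."""
    ctx.location = metadata[ctx.data_set_id]
    return ctx.location


metadata = {1: "space-A"}                      # hypothetical metadata layer
ctx = ReadContext(data_set_id=1, location=metadata[1])

metadata[1] = "space-B"                        # relocation finished, metadata updated
stale = ctx.location                           # still "space-A" until refreshed
fresh = refresh_location(ctx, metadata)        # now "space-B"
```

The data set identifier is the stable key here; only the location it maps to changes.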
Step 403, according to the recorded location information identifying the last successfully read data in the data set, the access service layer continues to read, from the relocated data set stored at the new storage location, the remaining data located after the last successfully read data.
After obtaining the new storage location of the data set, the access service layer may, according to the location information recorded in step 401, continue to read the remaining data located after the last successfully read data in the relocated data set stored at the new storage location. When the location information is the data identifier of the last successfully read data, the access service layer uses that identifier to find the corresponding data at the new storage location and then reads the remaining data after it. When the location information is the data identifiers of all successfully read data, the access service layer first determines which of these identifiers belongs to the last successfully read data, then finds the corresponding data at the new storage location, and finally reads the remaining data after it. When the location information is the data identifier of the data immediately following the last successfully read data, the access service layer can directly read that data and the remaining data after it. For example, a data set includes five pieces of data: data 1, data 2, data 3, data 4, and data 5. When the access service layer suspends the read task, the last successfully read data recorded is data 2; that is, data 1 and data 2 have been read successfully, and data 3, data 4, and data 5 are the remaining data in the data set.
After obtaining the new storage location from the new metadata, the access service layer finds data 2 at the new storage location and then continues reading from data 3, which follows data 2, until data 5 has been read.
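The three cases above reduce to finding one index in the relocated data set. The sketch below assumes that relocation preserves the order of the data within the data set and that the identifiers of successfully read data are recorded in read order; the function name `resume_index` and its keyword parameters are hypothetical.

```python
def resume_index(moved_ids, *, last_read_id=None, read_ids=None, next_unread_id=None):
    """Return the index in the relocated data set at which reading resumes.

    moved_ids -- data identifiers at the new storage location, in order
    Exactly one of the keyword arguments is expected to be given; with none,
    nothing has been read yet and reading starts from the beginning.
    """
    if next_unread_id is not None:
        # Form 3: the next adjacent unread data is read directly.
        return moved_ids.index(next_unread_id)
    if read_ids:
        # Form 2: the last entry is the last successfully read data
        # (assumes identifiers were recorded in read order).
        last_read_id = read_ids[-1]
    if last_read_id is None:
        return 0
    # Form 1: find the last successfully read data, then step past it.
    return moved_ids.index(last_read_id) + 1


moved_ids = [1, 2, 3, 4, 5]
start = resume_index(moved_ids, last_read_id=2)
remaining = moved_ids[start:]     # data 3, data 4, data 5
```

All three forms yield the same resume point for the example with data 1 to data 5, so the choice between them is a matter of what is most convenient to record.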
After the access service layer resumes reading, it feeds the remaining data into the IO stream maintained with the client and sends it to the client. From the user's side, the read is slightly slower than a read data request issued when no data is being relocated, but the entire read task completes successfully within a single read data request and no read failure occurs.
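Steps 401 to 403 can be put together in one sketch, in which a Python generator stands in for the IO stream maintained with the client. This is an illustration under several assumptions: the class and method names are hypothetical, the metadata and data layers are plain dictionaries, a single relocation is in flight at a time, the start notification arrives before the old space is reclaimed, and relocation preserves the order of the data.

```python
import threading
from typing import Iterator


class AccessServiceLayer:
    def __init__(self, metadata: dict, data_layer: dict) -> None:
        self.metadata = metadata        # data set identifier -> storage location
        self.data_layer = data_layer    # storage location -> [(data identifier, payload), ...]
        self._idle = threading.Event()  # set while no relocation is running
        self._idle.set()
        self._generation = 0            # incremented on every completed relocation

    def notify_relocation_started(self) -> None:
        self._idle.clear()

    def notify_relocation_completed(self) -> None:
        self._generation += 1
        self._idle.set()

    def read_stream(self, data_set_id: int) -> Iterator[bytes]:
        """Yield every piece of the data set within one request."""
        location = self.metadata[data_set_id]
        seen = self._generation
        last_read_id = None             # recorded position information
        while True:
            self._idle.wait()           # step 401: pause while relocating; the stream stays open
            if self._generation != seen:
                seen = self._generation
                location = self.metadata[data_set_id]   # step 402: read the new metadata
            pieces = self.data_layer[location]
            ids = [data_id for data_id, _payload in pieces]
            # Step 403: continue after the last successfully read data.
            index = 0 if last_read_id is None else ids.index(last_read_id) + 1
            if index == len(pieces):
                return
            last_read_id, payload = pieces[index]
            yield payload


metadata = {1: "old"}
data_layer = {"old": [(i, f"data{i}".encode()) for i in range(1, 6)]}
service = AccessServiceLayer(metadata, data_layer)

stream = service.read_stream(1)
received = [next(stream), next(stream)]     # data 1 and data 2 reach the client

service.notify_relocation_started()


def relocate() -> None:
    data_layer["new"] = data_layer.pop("old")   # old space is reclaimed
    metadata[1] = "new"
    service.notify_relocation_completed()


worker = threading.Thread(target=relocate)
worker.start()
received.extend(stream)                     # same stream delivers data 3 to data 5
worker.join()
```

From the consumer's point of view the iterator simply takes longer to produce the third piece; it never raises and never has to be recreated, which corresponds to the single read data request described above.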
In order to clearly explain the data reading process provided by the embodiment of the present invention, the data reading process will be described below in a specific implementation scenario.
As shown in fig. 5, the process specifically includes:
Step 501, a client initiates a read data request.
The client initiates a data reading request that includes the identifier of the data set. In this example the identifier of the data set is 1, that is, data set 1. The data set includes five pieces of data: data 1, data 2, data 3, data 4, and data 5, where the numbers 1, 2, 3, 4, and 5 are the data identifiers of the respective pieces of data.
Step 502, reading the metadata information of the data set.
After receiving the data reading request sent by the client, the access service layer reads the metadata information corresponding to the data set with the identifier 1.
Step 503, obtaining the position information of the data set on the data layer.
After the metadata information corresponding to the data set with the identifier 1 is read, the access service layer may obtain a storage location of the data set on the data layer.
Step 504, the storage location on the data layer is located and reading of data begins.
After obtaining the storage location of the data set with the identifier 1, the access service layer locates that storage location on the data layer and starts reading the data in the data set there, in sequence beginning with data 1. Each time a piece of data is read successfully, the access service layer records the data identifier of that data.
Step 505, the read data is returned.
After reading data in the data set, the data layer returns the data that has been read to the access service layer. For example, after data 1 is read, data 1 is returned to the access service layer, and reading continues with data 2.
Step 506, data is returned to the client in the form of an IO stream.
The access service layer returns the received, successfully read data 1 to the client in the form of an IO stream.
Step 507, the migration task starts, and a notification is sent that the data of the data set is starting to migrate.
While the access service layer is reading the data, the server background starts to execute the relocation task for the data set with the identifier 1 on the data layer, and notifies the access service layer that the data of the data set with the identifier 1, which is currently being read, is starting to be relocated.
Step 508, suspending the current data reading task flow, recording the identifier of the current last successfully read data, and maintaining the IO stream established with the client.
The access service layer suspends the data reading task currently being executed, that is, suspends the reading of the remaining unread data in the data set. It records the data identifier of the last successfully read data, for example 2, meaning that data 2 has been read and data 3 has not, and it keeps the IO stream established with the client.
Step 509, the metadata is updated, and completion of the data migration task is notified.
After all the data in the data set has been migrated, the server background updates the metadata corresponding to the data set and, once the metadata update is complete, notifies the access service layer that the data migration task of the data set is complete.
Step 510, the metadata information of the data set is re-read.
After receiving the notification that the data relocation task of the data set is complete, the access service layer returns to the metadata layer and re-reads the metadata information corresponding to the data set with the identifier 1.
Step 511, the storage location of the data set on the data layer is obtained.
After obtaining the metadata of the data set with the identifier 1 and performing the necessary checks on it, the access service layer can retrieve the new storage location of that data set on the data layer.
Step 512, the new storage location on the data layer is located, and reading of the remaining data in the data set continues.
According to the retrieved new storage location of the data set with the identifier 1, the access service layer relocates to that location on the data layer and then reads the remaining data located after data 2 there. Before the actual reading begins, the read position is moved forward from data 1 to the position of data 3; the remaining data, namely data 3, data 4, and data 5, is then actually read from the data set with the identifier 1 stored on the data layer.
Step 513, return the read data.
After the remaining data is read from the new storage location on the data layer, the data layer returns the read remaining data to the access service layer.
Step 514, the remaining data continues to be transmitted over the IO stream kept from before the pause.
After receiving the read remaining data, the access service layer attaches it to the IO stream that was suspended earlier and kept open with the client, and continues transmitting it over that stream. From the user's side, the read is slightly slower than a read without relocation, but the entire data reading task completes successfully within a single read request, without failure.
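The flow of steps 501 to 514 can be sketched end to end as a small single-process simulation. This is only an illustration under stated assumptions: the class `Store`, the function `read_stream`, the location names `loc-A` and `loc-B`, and the payload strings are hypothetical, a Python generator stands in for the IO stream kept open with the client, and the relocation is run synchronously inside the reader, whereas in the embodiment it is carried out by the server background and signalled by notifications.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Record = Tuple[int, str]  # (data identifier, payload)


class Store:
    """Hypothetical stand-in for the metadata layer plus the data layer."""

    def __init__(self) -> None:
        self.metadata: Dict[int, str] = {}       # data set id -> storage location
        self.data: Dict[str, List[Record]] = {}  # storage location -> records

    def migrate(self, dataset_id: int, new_location: str) -> None:
        old_location = self.metadata[dataset_id]
        self.data[new_location] = self.data.pop(old_location)  # move the data
        self.metadata[dataset_id] = new_location               # then update metadata


def read_stream(store: Store, dataset_id: int,
                migration_starts_after: Optional[int] = None,
                new_location: str = "loc-B") -> Iterator[Record]:
    location = store.metadata[dataset_id]               # steps 502-503
    last_ok: Optional[int] = None
    for data_id, payload in list(store.data[location]):  # step 504
        yield data_id, payload                           # steps 505-506
        last_ok = data_id                                # record last successful read
        if data_id == migration_starts_after:            # step 507: relocation begins
            break                                        # step 508: pause, keep stream
    else:
        return                                           # no relocation: read finished
    store.migrate(dataset_id, new_location)              # step 509 (simulated inline)
    location = store.metadata[dataset_id]                # steps 510-511
    records = store.data[location]
    start = [rec[0] for rec in records].index(last_ok) + 1  # step 512: skip read data
    for record in records[start:]:
        yield record                                     # steps 513-514: same stream


store = Store()
store.metadata[1] = "loc-A"
store.data["loc-A"] = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")]
received = list(read_stream(store, 1, migration_starts_after=2))
```

Because the generator is never closed during the pause, the consumer receives data 1 to data 5 through one uninterrupted iteration, which mirrors the single read request described above.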
Based on the same technical concept, fig. 6 illustrates a structure of a data reading apparatus 600 according to an embodiment of the present invention, where the apparatus 600 may be an access service layer and may perform the above-mentioned data reading procedure.
As shown in fig. 6, the apparatus 600 specifically includes: an IO interface 601 and a processor 602;
the IO interface 601 is configured to receive and send data and a data identifier;
the processor 602 is configured to: in a data reading process, when determining that the IO interface 601 has received a notification message indicating that a data set that a client has requested to read is starting to be moved, suspend reading of the remaining data in the data set and record location information used to identify the last successfully read data in the data set, where the data set includes at least one piece of data, each piece of data corresponds to an identifier, and moving the data set means moving the data set from one storage space to another storage space; when determining that the IO interface 601 has received a notification that the move of the data set is complete, determine a new storage location of the data set according to the identifier of the data set; and, according to the recorded location information used to identify the last successfully read data in the data set, continue to read, from the moved data set stored at the new storage location, the remaining data located after the last successfully read data.
In one possible design, the processor 602, when determining the new storage location of the data set according to the identification of the data set, is specifically configured to:
reading new metadata corresponding to the identification of the data set according to the identification of the data set;
and determining a new storage position of the data set according to the read new metadata.
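The reason the new metadata can be trusted once the completion notification arrives is the ordering described in the method flow: the metadata is updated first and the notification is sent afterwards. The following is a minimal sketch of that ordering using Python's standard `threading.Event`; the names `metadata`, `migration_done`, `background_migration`, and `wait_and_relocate`, the location strings, and the timeout are hypothetical and only illustrate the idea.

```python
import threading
from typing import Dict, Optional

metadata: Dict[int, str] = {1: "loc-A"}  # data set id -> storage location
migration_done = threading.Event()       # stands in for the completion notification


def background_migration(dataset_id: int, new_location: str) -> None:
    # Moving the data itself is omitted in this sketch.
    metadata[dataset_id] = new_location  # update the metadata first
    migration_done.set()                 # only then send the completion notification


def wait_and_relocate(dataset_id: int, timeout: float = 5.0) -> Optional[str]:
    # The paused reader waits for the notification before re-reading the metadata.
    if not migration_done.wait(timeout):
        return None                      # no notification arrived in time
    return metadata[dataset_id]          # new metadata gives the new storage location


worker = threading.Thread(target=background_migration, args=(1, "loc-B"))
worker.start()
new_location = wait_and_relocate(1)
worker.join()
```

If the notification were sent before the metadata update, the reader could re-read stale metadata and resume at the old storage location, which is the failure the ordering avoids.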
In one possible design, the processor 602, before suspending reading of the data remaining in the data set and recording location information identifying the last successfully read data in the data set, is further configured to:
acquiring a data reading request sent by the client through the IO interface 601, where the data reading request includes an identifier of a data set requested to be read;
reading metadata of the data set requested to be read according to the identification of the data set requested to be read, and determining the storage position of the data set requested to be read;
reading at least one data in the data set requested to be read according to the storage position of the data set requested to be read;
establishing an IO stream between the IO interface 601 and the client, and sending the read at least one data to the client in a data stream form.
In one possible design, the processor 602, after suspending reading of the data remaining in the data set and recording location information identifying the last successfully read data in the data set, is further configured to:
maintaining the IO stream established between the IO interface 601 and the client;
after continuing to read the remaining data after the last successfully read data in the migrated data set stored in the new storage location, the processor 602 is further configured to:
attaching the remaining data that continues to be read to the IO stream maintained between the IO interface 601 and the client, and sending that remaining data to the client.
Based on the same technical concept, an embodiment of the present invention further provides a data reading apparatus 700. As shown in fig. 7, the apparatus 700 may include: I/O interface 701, processor 702, and memory 703. The processor 702 is used to control the operation of the apparatus 700; the memory 703 may include both read-only memory and random-access memory, and stores instructions and data that may be executed by the processor 702. A portion of the memory 703 may also include non-volatile random access memory (NVRAM). The I/O interface 701, the processor 702, and the memory 703 are connected by a bus 709, wherein the bus 709 may include a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are all labeled as bus 709 in the figure.
The data reading method disclosed by the embodiment of the invention can be applied to the processor 702, or implemented by the processor 702. In implementation, the steps of the process flow may be performed by integrated logic circuits of hardware in the processor 702 or by instructions in the form of software. The processor 702 may be a general-purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, discrete gate or transistor logic, or discrete hardware components, and may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present invention. A general-purpose processor may be a microprocessor or any conventional processor. The steps of a method disclosed in connection with the embodiments of the present invention may be directly implemented by a hardware processor, or may be implemented by a combination of hardware and software modules in the processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 703, and the processor 702 reads the information stored in the memory 703 and completes the data reading steps in combination with its hardware.
The processor 702 is configured to read codes in the memory 703 for performing the flow of data reading in the above method embodiments.
Based on the same technical concept, embodiments of the present invention also provide a computer-readable storage medium that includes computer-readable instructions which, when read and executed by a computer, cause the computer to execute the above data reading method.
Based on the same technical concept, embodiments of the present invention further provide a computer program product that includes computer-readable instructions which, when read and executed by a computer, cause the computer to execute the above data reading method.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a logical division, and other divisions are possible in an actual implementation; for instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present application, or the part of it that contributes over the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above description covers only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. A method of data reading, comprising:
when an access service layer, in a data reading process, receives a notification message indicating that a data set that a client has requested to read is starting to be moved, the access service layer suspends reading of the remaining data in the data set and records position information used for identifying the last successfully read data in the data set, wherein the data set comprises at least one piece of data, each piece of data corresponds to one identifier, and moving the data set means moving the data set from one storage space to another storage space;
when the access service layer receives a notification that the move of the data set is complete, the access service layer determines a new storage position of the data set according to the identifier of the data set;
the access service layer continues to read, according to the recorded position information used for identifying the last successfully read data in the data set, the remaining data located after the last successfully read data in the moved data set stored at the new storage position;
wherein before the access service layer suspends reading of the remaining data in the data set and records the position information for identifying the last successfully read data in the data set, the method further comprises:
the access service layer acquires a data reading request sent by the client, wherein the data reading request comprises an identifier of a data set requested to be read;
the access service layer reads the metadata of the data set requested to be read according to the identifier of the data set requested to be read, and determines the storage position of the data set requested to be read;
the access service layer reads at least one data in the data set requested to be read according to the storage position of the data set requested to be read;
and the access service layer establishes an input/output (IO) stream with the client so as to send the read at least one data to the client in the form of a data stream.
2. The method of claim 1, wherein the location information for identifying the last successfully read data in the data set is at least one of:
data identification corresponding to the data which is successfully read last in the data set;
data identification corresponding to each data which is successfully read in the data set;
and the data identification corresponding to the next adjacent data of the data which is successfully read last in the data set.
3. The method of claim 1, wherein the access service layer determining a new storage location for the data set based on the identification of the data set comprises:
the access service layer reads new metadata corresponding to the identifier of the data set according to the identifier of the data set;
and the access service layer determines a new storage position of the data set according to the read new metadata.
4. The method of claim 1, wherein after the access service layer suspends reading of the remaining data in the data set and records the location information identifying the last successfully read data in the data set, the method further comprises:
the access service layer maintains the IO stream established with the client;
after the access service layer continues to read the remaining data after the last successfully read data in the migrated data set stored in the new storage location, the method further includes:
the access service layer attaches the remaining data that continues to be read to the IO stream maintained with the client and sends that remaining data to the client.
5. A data reading apparatus, comprising: an input/output (IO) interface and a processor;
the IO interface is used for receiving and sending data and data identification;
the processor is configured to: in a data reading process, when determining that the IO interface has received a notification message indicating that a data set that a client has requested to read is starting to be moved, suspend reading of the remaining data in the data set and record location information for identifying the last successfully read data in the data set, where the data set includes at least one piece of data, each piece of data corresponds to one identifier, and moving the data set means moving the data set from one storage space to another storage space; when determining that the IO interface has received a notification that the move of the data set is complete, determine a new storage position of the data set according to the identifier of the data set; and, according to the recorded location information used for identifying the last successfully read data in the data set, continue to read, from the moved data set stored at the new storage position, the remaining data located after the last successfully read data;
the processor, prior to suspending reading of data remaining in the data set and recording an identification of a last successfully read data in the data set, is further configured to:
acquiring a data reading request sent by the client through the IO interface, wherein the data reading request comprises an identifier of a data set requested to be read;
reading metadata of the data set requested to be read according to the identification of the data set requested to be read, and determining the storage position of the data set requested to be read;
reading at least one data in the data set requested to be read according to the storage position of the data set requested to be read;
and establishing an IO stream between the IO interface and the client to send the read at least one data to the client in a data stream mode.
6. The apparatus of claim 5, wherein the location information for identifying the last successfully read data in the data set is at least one of:
data identification corresponding to the data which is successfully read last in the data set;
data identification corresponding to each data which is successfully read in the data set;
and the data identification corresponding to the next adjacent data of the data which is successfully read last in the data set.
7. The apparatus of claim 5, wherein the processor, when determining the new storage location for the data set based on the identification of the data set, is specifically configured to:
reading new metadata corresponding to the identification of the data set according to the identification of the data set;
and determining a new storage position of the data set according to the read new metadata.
8. The apparatus of claim 5, wherein the processor, after suspending reading of data remaining in the data set and recording an identification of last successfully read data in the data set, is further to:
maintaining an IO stream established between the IO interface and the client;
after continuing to read the remaining data after the last successfully read data in the migrated data set stored in the new storage location, the processor is further configured to:
attaching the remaining data that continues to be read to the IO stream maintained between the IO interface and the client, and sending that remaining data to the client.
9. A computer-readable storage medium comprising computer-readable instructions which, when read and executed by a computer, cause the computer to perform the method of any one of claims 1-4.
CN201711490140.1A 2017-12-29 2017-12-29 Data reading method and device Active CN108345431B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711490140.1A CN108345431B (en) 2017-12-29 2017-12-29 Data reading method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711490140.1A CN108345431B (en) 2017-12-29 2017-12-29 Data reading method and device

Publications (2)

Publication Number Publication Date
CN108345431A CN108345431A (en) 2018-07-31
CN108345431B true CN108345431B (en) 2021-06-22

Family

ID=62963402

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711490140.1A Active CN108345431B (en) 2017-12-29 2017-12-29 Data reading method and device

Country Status (1)

Country Link
CN (1) CN108345431B (en)

Also Published As

Publication number Publication date
CN108345431A (en) 2018-07-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 20200415
Address after: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen
Applicant after: HUAWEI TECHNOLOGIES Co.,Ltd.
Address before: 301, A building, room 3, building 301, foreshore Road, No. 310052, Binjiang District, Zhejiang, Hangzhou
Applicant before: Hangzhou Huawei Digital Technology Co.,Ltd.
GR01 Patent grant
TR01 Transfer of patent right
Effective date of registration: 20220216
Address after: 550025 Huawei cloud data center, jiaoxinggong Road, Qianzhong Avenue, Gui'an New District, Guiyang City, Guizhou Province
Patentee after: Huawei Cloud Computing Technology Co.,Ltd.
Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen
Patentee before: HUAWEI TECHNOLOGIES Co.,Ltd.
TR01 Transfer of patent right
Effective date of registration: 20221214
Address after: 518129 Huawei Headquarters Office Building 101, Wankecheng Community, Bantian Street, Gangqu District, Shenzhen, Guangdong
Patentee after: Shenzhen Huawei Cloud Computing Technology Co.,Ltd.
Address before: 550025 Huawei cloud data center, jiaoxinggong Road, Qianzhong Avenue, Gui'an New District, Guiyang City, Guizhou Province
Patentee before: Huawei Cloud Computing Technology Co.,Ltd.