CN111338985A - Data reading and writing method and device of service system - Google Patents
- Publication number
- CN111338985A (application CN202010132995.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- cache
- read
- layer
- hot spot
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING; G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/0873—Mapping of cache memory to specific storage devices or parts thereof
- G06F12/0871—Allocation or management of cache space
- G06F3/061—Improving I/O performance
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/0689—Disk arrays, e.g. RAID, JBOD
Abstract
The invention discloses a data read-write method and device for a service system. When the service system attempts to write data, one controller of an upper-layer cache receives the write data and generates cache data from it using a latent semantic analysis algorithm; that controller mirrors the cache data to all controllers in the upper-layer cache and feeds write-completion information back to the service system; the cache data is then transmitted from the upper-layer cache to a lower-layer cache and compressed into striped data by a compression algorithm; finally, the striped data is transferred from the lower-layer cache to a storage array to be destaged, and metadata of the striped data is cached in the lower-layer cache. The invention optimizes big-data reads and writes of the service system, improves read-write performance and response speed, and offers good availability and overall stability.
Description
Technical Field
The present invention relates to the field of storage, and in particular, to a method and an apparatus for reading and writing data of a service system.
Background
With the development of new technologies such as cloud computing and big data, requirements on storage keep rising, and storage performance, latency, and reliability have become focal points of attention. Solid-state array (SSA) storage in particular faces strict latency requirements; to support more service systems, storage must offer more optimization methods and better overall performance. The prior art provides various methods for optimizing a storage system's internals to keep latency within requirements, but these methods generally have poor effect, low efficiency, and low availability, and they affect the overall stability of the service system.
No effective solution is currently available for the prior-art problems of poor storage optimization effect, low efficiency, low availability, and adverse impact on the overall stability of the service system.
Disclosure of Invention
In view of this, an object of the embodiments of the present invention is to provide a method and an apparatus for reading and writing data of a service system, which can optimize big-data reads and writes of the service system, improve read-write performance and response speed, and offer good availability and overall stability.
Based on the above object, a first aspect of the embodiments of the present invention provides a data reading and writing method for a service system, including:
when the service system attempts to write data, the following steps are executed:
receiving the write data with one controller of an upper-layer cache, and generating cache data from the write data using a latent semantic analysis algorithm;
mirroring, by that controller, the cache data to all controllers in the upper-layer cache, and feeding write-completion information back to the service system;
transmitting the cache data from the upper-layer cache to a lower-layer cache, and compressing the cache data into striped data using a compression algorithm;
transferring the striped data from the lower-layer cache to a storage array to be destaged, and caching metadata of the striped data in the lower-layer cache.
In some embodiments, the following steps are performed when the service system attempts to read data:
determining whether the read data is first hot-spot data, second hot-spot data, or cold data;
in response to the read data being first hot-spot data, extracting it from the hot-spot pre-read information of the upper-layer cache and feeding it back to the service system directly;
in response to the read data being second hot-spot data, extracting it from the hot-spot pre-read information of the lower-layer cache and feeding it back to the service system through the upper-layer cache;
in response to the read data being cold data, extracting it from the storage array according to the metadata cache of the lower-layer cache and feeding it back to the service system through the lower-layer and upper-layer caches.
In some embodiments, the upper-layer cache and the lower-layer cache are each allocated independently operating computing and storage resources, so that if one of them fails because of a computing- or storage-resource problem, the other continues to work unaffected.
In some embodiments, the method further comprises adding a performance monitoring module that monitors the working states of the upper-layer cache, the lower-layer cache, and the storage array, the working states comprising at least one of: upper-layer cache operation, lower-layer cache operation, cache pre-read hit rate, metadata caching, disk access efficiency, data isolation, overall system stability, performance improvement, and latency reduction.
In some embodiments, determining whether the read data is first hot-spot data, second hot-spot data, or cold data comprises: making the determination based on the number of times, or the frequency with which, the service system has read the data within a preset period.
In some embodiments, the storage array is a redundant array of independent disks (RAID) or a solid-state disk array that uses solid-state disks as its primary storage units.
A second aspect of the embodiments of the present invention provides a data reading and writing apparatus for a service system, including:
a processor; and
a memory storing program code executable by the processor, the program code performing the following steps when the service system attempts to write data:
receiving the write data with one controller of an upper-layer cache, and generating cache data from the write data using a latent semantic analysis algorithm;
mirroring, by that controller, the cache data to all controllers in the upper-layer cache, and feeding write-completion information back to the service system;
transmitting the cache data from the upper-layer cache to a lower-layer cache, and compressing the cache data into striped data using a compression algorithm;
transferring the striped data from the lower-layer cache to a storage array to be destaged, and caching metadata of the striped data in the lower-layer cache.
In some embodiments, the program code further performs the following steps when the service system attempts to read data:
determining whether the read data is first hot-spot data, second hot-spot data, or cold data;
in response to the read data being first hot-spot data, extracting it from the hot-spot pre-read information of the upper-layer cache and feeding it back to the service system directly;
in response to the read data being second hot-spot data, extracting it from the hot-spot pre-read information of the lower-layer cache and feeding it back to the service system through the upper-layer cache;
in response to the read data being cold data, extracting it from the storage array according to the metadata cache of the lower-layer cache and feeding it back to the service system through the lower-layer and upper-layer caches.
In some embodiments, the upper-layer cache and the lower-layer cache are each allocated independently operating computing and storage resources, so that if one of them fails because of a computing- or storage-resource problem, the other continues to work unaffected.
In some embodiments, the steps further comprise adding a performance monitoring module that monitors the working states of the upper-layer cache, the lower-layer cache, and the storage array, the working states comprising at least one of: upper-layer cache operation, lower-layer cache operation, cache pre-read hit rate, metadata caching, disk access efficiency, data isolation, overall system stability, performance improvement, and latency reduction.
In some embodiments, determining whether the read data is first hot-spot data, second hot-spot data, or cold data comprises: making the determination based on the number of times, or the frequency with which, the service system has read the data within a preset period.
In some embodiments, the storage array is a redundant array of independent disks (RAID) or a solid-state disk array that uses solid-state disks as its primary storage units.
The invention has the following beneficial technical effects: in the data reading and writing method and device of the service system, when the service system attempts to write data, one controller of an upper-layer cache receives the write data and generates cache data from it using a latent semantic analysis algorithm; the controller mirrors the cache data to all controllers in the upper-layer cache and feeds write-completion information back to the service system; the cache data is transmitted from the upper-layer cache to a lower-layer cache and compressed into striped data by a compression algorithm; and the striped data is transferred from the lower-layer cache to a storage array to be destaged, with its metadata cached in the lower-layer cache. This optimizes big-data reads and writes of the service system, improves read-write performance and response speed, and provides good availability and overall stability.
Drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed for describing them are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a data writing method of a data reading and writing method of a service system provided by the present invention;
fig. 2 is a schematic flow chart of a data reading method of a data reading and writing method of a service system provided by the present invention;
fig. 3 is a schematic overall structure diagram of a data read-write method of a service system provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.
It should be noted that the expressions "first" and "second" in the embodiments of the present invention distinguish two entities or parameters that share the same name but are not identical. "First" and "second" are used merely for convenience of description and should not be construed as limiting the embodiments; this note is not repeated in the embodiments below.
In view of the foregoing, a first aspect of the embodiments of the present invention provides an embodiment of a data reading and writing method for a service system that can optimize its big-data reads and writes. Fig. 1 is a schematic flow chart of the data writing portion of the method.
The data reading and writing method of the service system comprises the following steps:
As shown in fig. 1, when the service system attempts to write data, the following steps are performed:
Step S101: receiving the write data with one controller of an upper-layer cache, and generating cache data from the write data using a latent semantic analysis algorithm;
Step S103: mirroring, by that controller, the cache data to all controllers in the upper-layer cache, and feeding write-completion information back to the service system;
Step S105: transmitting the cache data from the upper-layer cache to a lower-layer cache, and compressing the cache data into striped data using a compression algorithm;
Step S107: transferring the striped data from the lower-layer cache to a storage array to be destaged, and caching metadata of the striped data in the lower-layer cache.
In some further embodiments of the invention, as shown in fig. 2, the following steps are performed when the service system attempts to read data:
Step S201: determining whether the read data is first hot-spot data, second hot-spot data, or cold data;
Step S203: in response to the read data being first hot-spot data, extracting it from the hot-spot pre-read information of the upper-layer cache and feeding it back to the service system directly;
Step S205: in response to the read data being second hot-spot data, extracting it from the hot-spot pre-read information of the lower-layer cache and feeding it back to the service system through the upper-layer cache;
Step S207: in response to the read data being cold data, extracting it from the storage array according to the metadata cache of the lower-layer cache and feeding it back to the service system through the lower-layer and upper-layer caches.
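The three-way lookup in steps S201-S207 can be sketched as a tiered read: upper-layer cache first, then lower-layer cache, then the storage array located via the lower layer's metadata cache. The plain-dict stand-ins for the caches and array are assumptions made for illustration.

```python
# Hypothetical sketch of the tiered read path: first hot-spot data hits the
# upper-layer cache, second hot-spot data hits the lower-layer cache, and
# cold data is fetched from the array using the cached metadata.

def read(key, upper, lower, metadata, array):
    if key in upper:              # first hot-spot tier: fed back directly
        return upper[key], "upper"
    if key in lower:              # second hot-spot tier: back via the upper layer
        return lower[key], "lower"
    addr = metadata[key]          # cold data: metadata cache locates it on the array
    return array[addr], "array"

upper = {"hot1": "a"}
lower = {"hot2": "b"}
metadata = {"cold": 7}            # maps a key to its array address
array = {7: "c"}

assert read("hot1", upper, lower, metadata, array) == ("a", "upper")
assert read("hot2", upper, lower, metadata, array) == ("b", "lower")
assert read("cold", upper, lower, metadata, array) == ("c", "array")
```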
The invention provides a latency optimization method for a storage double-layer cache. With the double-layer design, the upper-layer cache mirrors each IO after receiving it and immediately returns an ACK (acknowledgement) to the host, reducing host-side latency. The caching strategy of the upper layer is optimized according to an LSA (latent semantic analysis) algorithm to guarantee highly concurrent processing. The storage platform's cache pre-read strategy is optimized to reduce CPU overhead, and an efficient metadata caching strategy is added to guarantee efficient metadata access to the disks. Cache resources are also isolated to keep the service stable. This cache optimization method preserves storage performance, saves optimization time, and can be applied to different storage products and tuned to different requirements, making it convenient and efficient. It improves storage performance and system availability, is widely applicable, and can greatly reduce total maintenance cost.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a Random Access Memory (RAM), or the like. Embodiments of the computer program may achieve the same or similar effects as any of the preceding method embodiments to which it corresponds.
In some embodiments, the upper-layer cache and the lower-layer cache are each allocated independently operating computing and storage resources, so that if one of them fails because of a computing- or storage-resource problem, the other continues to work unaffected.
In some embodiments, the method further comprises adding a performance monitoring module that monitors the working states of the upper-layer cache, the lower-layer cache, and the storage array, the working states comprising at least one of: upper-layer cache operation, lower-layer cache operation, cache pre-read hit rate, metadata caching, disk access efficiency, data isolation, overall system stability, performance improvement, and latency reduction.
In some embodiments, determining whether the read data is first hot-spot data, second hot-spot data, or cold data comprises: making the determination based on the number of times, or the frequency with which, the service system has read the data within a preset period.
In some embodiments, the storage array is a redundant array of independent disks (RAID) or a solid-state disk array that uses solid-state disks as its primary storage units.
The method disclosed according to an embodiment of the present invention may also be implemented as a computer program executed by a CPU (central processing unit), and the computer program may be stored in a computer-readable storage medium. The computer program, when executed by the CPU, performs the above-described functions defined in the method disclosed in the embodiments of the present invention. The above-described method steps and system elements may also be implemented using a controller and a computer-readable storage medium for storing a computer program for causing the controller to implement the functions of the above-described steps or elements.
The double-layer cache latency optimization method provided by the invention improves storage performance, reduces latency, improves storage availability, and simplifies operation and maintenance, which is of particular value for solid-state array (SSA) storage. The latency-optimization effect is achieved through the double-layer cache design, the algorithm-optimized caching strategy, the optimized cache pre-read strategy, and cache isolation. After receiving an IO, the upper-layer cache performs mirroring and, once mirroring completes, feeds the write completion directly back to the host side; all other operations are handed to the lower-layer cache, greatly reducing storage response time. After the lower-layer cache receives the service data from the upper layer, it performs in-line compression, copying, and other features without affecting the upper layer's execution efficiency, improving overall storage latency.
The upper-layer caching strategy is optimized according to the LSA algorithm, increasing the execution efficiency of concurrent services and improving retrieval efficiency. The cache pre-read strategy is also optimized: by computing and classifying hot-spot data, data read more than 80% of the time per unit time is placed in the upper-layer cache, and data read 30%-80% of the time per unit time is placed in the lower-layer cache, achieving the purpose of pre-reading. This greatly shortens host read time, saves upper-layer cache space, improves read efficiency, and reduces latency. Pre-reading can also bring compressed data on the SSDs into the cache in advance, shortening decompression time and reducing CPU overhead; this raises CPU utilization, further improves response time, and reduces latency.
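The 80% and 30%-80% placement rule above can be sketched as a small classifier. The patent does not say what the percentages are measured against; interpreting them as fractions of the hottest block's read count per unit time is an assumption made purely for illustration, as are the function and variable names.

```python
# Hypothetical hot-spot classification: place each block in the upper cache,
# the lower cache, or cold storage by its read count relative to the hottest
# block. The 0.8 / 0.3 thresholds follow the text; the "fraction of peak"
# interpretation is an assumption.

def classify(read_counts, hi=0.8, lo=0.3):
    peak = max(read_counts.values())
    tiers = {}
    for key, n in read_counts.items():
        frac = n / peak
        if frac > hi:
            tiers[key] = "upper"   # first hot-spot data: pre-read to upper cache
        elif frac >= lo:
            tiers[key] = "lower"   # second hot-spot data: pre-read to lower cache
        else:
            tiers[key] = "cold"    # left on the storage array
    return tiers

tiers = classify({"a": 100, "b": 50, "c": 10})
assert tiers == {"a": "upper", "b": "lower", "c": "cold"}
```

Run periodically over per-unit-time counters, such a classifier would decide which blocks each cache layer pre-reads.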
On top of pre-reading, a metadata caching strategy is added. With metadata cached, data on the SSDs can be accessed more efficiently when a read is needed, improving the efficiency of reading data from the SSDs into the lower-layer cache and reducing latency. The invention also adds an isolation mechanism between the upper-layer cache, the lower-layer cache, and the solid-state disks, isolating both resources (CPU resources, cache space, bus bandwidth, etc.) and data (metadata, read-write data, monitoring data, etc.). With this isolation, the upper-layer cache cannot occupy the lower-layer cache's resources, and the lower-layer cache cannot occupy the resources of the upper-layer cache or the solid-state disks. This greatly improves the stability of the whole system, avoids mutual interference, and allows overall resources to be used reasonably.
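The resource-isolation idea can be sketched as fixed, non-overlapping budgets per cache layer. The `IsolatedPool` class, the core lists, and the byte budgets are invented for illustration; the patent only specifies that each layer's CPU cores and cache space are fixed and disjoint.

```python
# Hypothetical sketch of cache resource isolation: each layer owns a fixed
# set of CPU cores and a private cache-space budget, so exhausting one
# layer's budget cannot consume the other layer's resources.

class IsolatedPool:
    def __init__(self, name, cores, cache_bytes):
        self.name, self.cores = name, cores
        self.free = cache_bytes           # private budget, never shared

    def allocate(self, nbytes):
        if nbytes > self.free:
            raise MemoryError(f"{self.name} budget exhausted")
        self.free -= nbytes
        return nbytes

upper = IsolatedPool("upper-cache", cores=[0, 1], cache_bytes=1024)
lower = IsolatedPool("lower-cache", cores=[2, 3], cache_bytes=2048)

upper.allocate(1024)                      # upper layer fills its own budget...
assert lower.free == 2048                 # ...without touching the lower layer's
assert set(upper.cores).isdisjoint(lower.cores)
```

In a real system the core sets would be enforced with CPU affinity and the budgets with a memory allocator, but the invariant is the same: a single-sided overload stays single-sided.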
The following further illustrates an embodiment of the present invention in accordance with the embodiment shown in fig. 3.
First, the storage and application hardware environment is built, the storage device is configured, and space is carved out for the service system; if concurrency is needed, multiple service systems can be attached. Each system is then deployed and the service systems are verified to run normally. While the service systems run, the storage cache status, upper- and lower-layer cache operation, cache pre-read hit rate, metadata caching, disk access efficiency, data isolation, overall system stability, performance improvement, and latency reduction are all checked, verifying the feasibility of the whole optimization method.
As shown in fig. 3, when the service system writes data to storage, the IO data first reaches the upper-layer cache. After receiving the data, the upper-layer cache caches it according to the LSA algorithm and then mirrors it between controllers; once mirroring completes, an ACK is fed back to the host side and the write is complete. At this point the service system considers the write finished, so the service time and latency of the service system can be measured. Meanwhile, the upper-layer cache writes the cached data to the lower-layer cache, which compresses it with a compression algorithm and arranges the result into striped data; striping reduces the number of reads and writes to the back-end SSDs and prolongs their service life. The striped data is then stored through the underlying RAID mechanism, after which its metadata is cached in the lower-layer cache, making the data easy to read back from the RAID. When this process completes, the whole write is finished and the IO data has landed on the disks. The process is transparent and imperceptible to the front-end service system, which interacts only with the upper-layer cache; this greatly improves system performance and reduces latency.
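The lower-layer half of this write path (compress, stripe, land on the array, cache the metadata) can be sketched as follows. The patent names no specific compression algorithm or stripe geometry, so `zlib` and a 4-byte stripe size are stand-ins chosen only to make the sketch runnable.

```python
# Hypothetical sketch of the lower-layer cache path: compress the incoming
# cache data, split it into fixed-size stripes for the array, and record in
# the metadata cache where each stripe landed so reads avoid scanning.

import zlib

def destage(key, data, stripe_size, array, metadata):
    packed = zlib.compress(data)                       # in-line compression
    stripes = [packed[i:i + stripe_size]
               for i in range(0, len(packed), stripe_size)]
    base = len(array)
    array.extend(stripes)                              # "land" stripes on the array
    metadata[key] = (base, len(stripes), len(data))    # cached in the lower layer

def fetch(key, stripe_size, array, metadata):
    base, n, _ = metadata[key]                         # metadata locates the stripes
    return zlib.decompress(b"".join(array[base:base + n]))

array, metadata = [], {}
destage("blk", b"hello world" * 4, 4, array, metadata)
assert fetch("blk", 4, array, metadata) == b"hello world" * 4
```

Compressing before striping means fewer stripes reach the SSDs, which matches the text's point about reducing back-end read-write counts and extending SSD life.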
When the service system reads data, the read likewise goes first to the upper-layer cache. Because the upper layer pre-reads hot-spot data, if the requested data is hot it is fed back to the service system directly and the read is complete. If the data has not been pre-read into the upper-layer cache, it can be read from the lower-layer cache, which also performs pre-reading, greatly increasing the chance of a lower-layer hit. If the data is cold and found in neither cache, the copy stored on the SSDs can be located quickly through the lower-layer cache's metadata. This multi-level pre-read strategy greatly improves read efficiency and reduces latency.
The embodiment of the invention also isolates the resources and data of the upper- and lower-layer caches, reducing mutual interference and improving the stability of the whole system. Each cache layer is allocated a fixed number of CPU cores and a fixed cache space, and read-write data is stored in its own dedicated area, so a single-sided fault cannot affect the operation of the whole system, effectively avoiding whole-system failures. To monitor storage performance in real time, a performance monitoring module is also added; it tracks the running state and performance of the two cache layers and of each module in real time, making it easy for an implementer to inspect the system and make appropriate adjustments later.
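A minimal sketch of such a performance monitoring module is given below. The counter scheme and metric names (e.g. `upper.hits`) are illustrative assumptions; the patent only requires that cache operation, pre-read hit rate, and related states be observable in real time.

```python
# Hypothetical performance monitoring module: each layer reports counters,
# from which per-layer cache pre-read hit rates can be derived on demand.

class PerfMonitor:
    def __init__(self):
        self.counters = {}

    def record(self, metric, value=1):
        self.counters[metric] = self.counters.get(metric, 0) + value

    def hit_rate(self, layer):
        hits = self.counters.get(f"{layer}.hits", 0)
        misses = self.counters.get(f"{layer}.misses", 0)
        total = hits + misses
        return hits / total if total else 0.0

mon = PerfMonitor()
for _ in range(8):
    mon.record("upper.hits")
for _ in range(2):
    mon.record("upper.misses")
assert mon.hit_rate("upper") == 0.8
```

The same counters could carry disk-access and latency figures, giving the implementer the real-time view of each module that the text describes.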
The embodiment of the invention aims to reduce the implementer's workload: it improves the cache model as far as possible, optimizes the operation flow, and reduces the amount of computation, while leaving the original compatibility requirements and the original storage operation mode unchanged, so that it remains transparent and imperceptible to upper-layer applications. Using this method greatly saves configuration time, improves system stability, and reduces storage latency. Because applications place high demands on the latency of solid state array (SSA) storage, the cache is optimized according to the characteristics of the SSA; in practice this simplifies the work, raises resource utilization, lowers the error rate, reduces total operating cost, and improves the reliability and performance of the whole system.
It can be seen from the foregoing embodiments that the data reading and writing method of the service system provided in the embodiments of the present invention adopts the following technical scheme: when the service system attempts to write data, one controller of the upper-layer cache receives the write data and generates cache data for it using a log-structured array (LSA) algorithm; that controller mirrors the cache data to all controllers in the upper-layer cache and feeds write-completion information back to the service system; the cache data is transmitted from the upper-layer cache to the lower-layer cache and compressed into striped data using a compression algorithm; and the striped data is transferred from the lower-layer cache to the storage array for destaging, while its metadata is cached in the lower-layer cache. This scheme can optimize big-data reading and writing of the business system, improve read-write performance and response speed, and offers good usability and overall stability.
It should be particularly noted that the steps in the embodiments of the data reading and writing method described above may be interleaved, replaced, added, or deleted; data reading and writing methods obtained through such reasonable permutations and combinations therefore also belong to the protection scope of the present invention, and the protection scope of the present invention should not be limited to the described embodiments.
In view of the foregoing, a second aspect of the embodiments of the present invention provides an embodiment of a data reading and writing apparatus for a business system, which is capable of optimizing reading and writing of big data of the business system. The data read-write device of the service system comprises:
a processor; and
a memory storing program code executable by the processor, the program code performing the following steps when the business system attempts to write data:
receiving the write data by using a controller of an upper-layer cache, and generating cache data for the write data by using a log-structured array (LSA) algorithm;
mirroring, by the controller, the cache data to all controllers in the upper-layer cache, and feeding back write-completion information to the service system;
transmitting the cache data from an upper-layer cache to a lower-layer cache, and compressing the cache data into striped data by using a compression algorithm;
the striped data is transferred from the underlying cache to the storage array to be destaged, and metadata for the striped data is cached in the underlying cache.
In some further embodiments of the invention, the program code further performs the following steps when the business system attempts to read the data:
determining the read data to be first hot spot data, second hot spot data, or cold data;
in response to the read data being the first hot spot data, extracting the read data from the hot spot data pre-read information of the upper-layer cache and feeding it back directly to the service system;
in response to the read data being the second hot spot data, extracting the read data from the hot spot data pre-read information of the lower-layer cache and feeding it back to the service system through the upper-layer cache;
and in response to the read data being cold data, extracting the read data from the storage array according to the metadata cached in the lower-layer cache and feeding it back to the business system through the lower-layer and upper-layer caches.
In some embodiments, the upper-layer cache and the lower-layer cache are each allocated independently operating computation resources and storage resources, so that a failure of one layer caused by a problem in its computation or storage resources does not affect the operation of the other.
In some embodiments, the steps further comprise: using an additional performance monitoring module to monitor the working states of the upper-layer cache, the lower-layer cache, and the storage array, wherein the working states comprise at least one of the following: upper-layer cache operation, lower-layer cache operation, cache pre-read hit rate, metadata caching, disk access efficiency, data isolation, overall system stability, performance improvement, and latency reduction.
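A sketch of what the performance monitoring module might track. The patent lists the monitored states but no mechanism, so the counter-based design below is an assumption, and the counter names are invented:

```python
class PerformanceMonitor:
    """Counts where each read was served and derives the pre-read hit rate."""

    def __init__(self):
        self.counters = {"upper_hit": 0, "lower_hit": 0, "disk_read": 0}

    def record(self, outcome):
        # outcome is one of the counter keys above
        self.counters[outcome] += 1

    def pre_read_hit_rate(self):
        """Fraction of reads served by either cache layer's pre-read."""
        total = sum(self.counters.values())
        hits = self.counters["upper_hit"] + self.counters["lower_hit"]
        return hits / total if total else 0.0
```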
In some embodiments, determining the read data to be the first hot spot data, the second hot spot data, or the cold data comprises: determining the read data to be the first hot spot data, the second hot spot data, or the cold data based on the number of times or the frequency with which the business system reads the read data within a preset time.
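One way to realize the count-within-a-window rule is a sliding-window classifier. The window length and the two thresholds below are illustrative; the embodiment only says the decision is based on reading times or frequency within a preset time:

```python
from collections import defaultdict, deque


def make_classifier(window=60.0, first_hot_at=10, second_hot_at=3):
    """Return a classifier mapping a key to its temperature class.

    A key read at least `first_hot_at` times inside the window counts as
    first hot spot data (upper-layer pre-read); at least `second_hot_at`
    times, second hot spot data (lower-layer pre-read); otherwise cold.
    """
    history = defaultdict(deque)          # key -> timestamps of recent reads

    def classify(key, now):
        q = history[key]
        q.append(now)
        while q and now - q[0] > window:  # drop reads outside the window
            q.popleft()
        if len(q) >= first_hot_at:
            return "first_hot"
        if len(q) >= second_hot_at:
            return "second_hot"
        return "cold"

    return classify
```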
In some embodiments, the storage array is a redundant array of independent disks, or a solid-state disk array using solid-state disks as the primary storage units.
It can be seen from the foregoing embodiments that the data reading and writing apparatus of the service system provided in the embodiments of the present invention adopts the following technical scheme: when the service system attempts to write data, one controller of the upper-layer cache receives the write data and generates cache data for it using a log-structured array (LSA) algorithm; that controller mirrors the cache data to all controllers in the upper-layer cache and feeds write-completion information back to the service system; the cache data is transmitted from the upper-layer cache to the lower-layer cache and compressed into striped data using a compression algorithm; and the striped data is transferred from the lower-layer cache to the storage array for destaging, while its metadata is cached in the lower-layer cache. This scheme can optimize big-data reading and writing of the business system, improve read-write performance and response speed, and offers good usability and overall stability.
It should be particularly noted that, in the embodiment of the data reading and writing apparatus of the business system, the working process of each module is described in detail in the embodiments of the data reading and writing method, and those skilled in the art will readily appreciate how to apply these modules to other embodiments of that method. Of course, since the steps in the method embodiments may be interleaved, replaced, added, or deleted, data reading and writing apparatuses obtained through such reasonable permutations and combinations also belong to the protection scope of the present invention, and the protection scope of the present invention should not be limited to the described embodiments.
The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the present disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly supports the exception. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items. The numbers of the embodiments disclosed in the embodiments of the present invention are merely for description, and do not represent the merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
Those of ordinary skill in the art will understand that the discussion of any embodiment above is meant to be exemplary only and is not intended to imply that the scope of the disclosure, including the claims, is limited to these examples. Within the idea of the embodiments of the invention, technical features in the above embodiments or in different embodiments may also be combined, and many other variations of the different aspects of the embodiments exist as described above, which are not detailed for the sake of brevity. Therefore, any omissions, modifications, substitutions, improvements, and the like made without departing from the spirit and principles of the embodiments of the present invention are intended to be included within the scope of the embodiments of the present invention.
Claims (10)
1. A data read-write method of a service system is characterized by comprising the following steps:
when the business system tries to write data, the following steps are executed:
receiving write data using a controller of an upper-layer cache, and generating cache data for the write data using a log-structured array (LSA) algorithm;
mirroring, by the controller, the cache data to all controllers in the upper-layer cache, and feeding back write-completion information to a service system;
transmitting the cache data from the upper-layer cache to a lower-layer cache, and compressing the cache data into striped data by using a compression algorithm;
transferring the striped data from the underlying cache to a storage array to be destaged, and caching metadata of the striped data in the underlying cache.
2. The method of claim 1, further comprising:
when the business system tries to read data, the following steps are executed:
determining the read data to be first hot spot data, second hot spot data, or cold data;
in response to the read data being first hot spot data, extracting the read data from hot spot data pre-read information of the upper-layer cache and feeding it back directly to a service system;
in response to the read data being second hot spot data, extracting the read data from hot spot data pre-read information of the lower-layer cache and feeding it back to a service system through the upper-layer cache;
and in response to the read data being cold data, extracting the read data from the storage array according to the metadata cached in the lower-layer cache, and feeding it back to a service system through the lower-layer and upper-layer caches.
3. The method according to claim 1, wherein the upper-layer cache and the lower-layer cache are each allocated independently operating computation resources and storage resources, so that a failure of one caused by a problem in its computation or storage resources does not affect the operation of the other.
4. The method of claim 1, further comprising: using an additional performance monitoring module to monitor working states of the upper-layer cache, the lower-layer cache, and the storage array, the working states comprising at least one of: upper-layer cache operation, lower-layer cache operation, cache pre-read hit rate, metadata caching, disk access efficiency, data isolation, overall system stability, performance improvement, and latency reduction.
5. The method of claim 2, wherein determining the read data to be first hot spot data, second hot spot data, or cold data comprises: determining the read data to be first hot spot data, second hot spot data, or cold data based on the number of times or the frequency with which the business system reads the read data within a preset time.
6. A data read/write apparatus of a service system, comprising:
a processor; and
a memory storing program code executable by the processor, the program code performing the following steps when the business system attempts to write data:
receiving write data using a controller of an upper-layer cache, and generating cache data for the write data using a log-structured array (LSA) algorithm;
mirroring, by the controller, the cache data to all controllers in the upper-layer cache, and feeding back write-completion information to a service system;
transmitting the cache data from the upper-layer cache to a lower-layer cache, and compressing the cache data into striped data by using a compression algorithm;
transferring the striped data from the underlying cache to a storage array to be destaged, and caching metadata of the striped data in the underlying cache.
7. The apparatus of claim 6, wherein the program code further performs the following steps when the business system attempts to read the data:
determining the read data to be first hot spot data, second hot spot data, or cold data;
in response to the read data being first hot spot data, extracting the read data from hot spot data pre-read information of the upper-layer cache and feeding it back directly to a service system;
in response to the read data being second hot spot data, extracting the read data from hot spot data pre-read information of the lower-layer cache and feeding it back to a service system through the upper-layer cache;
and in response to the read data being cold data, extracting the read data from the storage array according to the metadata cached in the lower-layer cache, and feeding it back to a service system through the lower-layer and upper-layer caches.
8. The apparatus of claim 6, wherein the upper-layer cache and the lower-layer cache are each allocated independently operating computation resources and storage resources, so that a failure of one caused by a problem in its computation or storage resources does not affect the operation of the other.
9. The apparatus of claim 6, wherein the steps further comprise: using an additional performance monitoring module to monitor working states of the upper-layer cache, the lower-layer cache, and the storage array, the working states comprising at least one of: upper-layer cache operation, lower-layer cache operation, cache pre-read hit rate, metadata caching, disk access efficiency, data isolation, overall system stability, performance improvement, and latency reduction.
10. The apparatus of claim 7, wherein determining the read data to be first hot spot data, second hot spot data, or cold data comprises: determining the read data to be first hot spot data, second hot spot data, or cold data based on the number of times or the frequency with which the business system reads the read data within a preset time.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010132995.2A CN111338985A (en) | 2020-02-29 | 2020-02-29 | Data reading and writing method and device of service system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010132995.2A CN111338985A (en) | 2020-02-29 | 2020-02-29 | Data reading and writing method and device of service system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111338985A true CN111338985A (en) | 2020-06-26 |
Family
ID=71183970
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010132995.2A Withdrawn CN111338985A (en) | 2020-02-29 | 2020-02-29 | Data reading and writing method and device of service system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111338985A (en) |
2020-02-29 — CN CN202010132995.2A patent/CN111338985A/en not_active Withdrawn
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112328513A (en) * | 2020-10-14 | 2021-02-05 | 合肥芯碁微电子装备股份有限公司 | Scanning type exposure system and data caching and scheduling method and device thereof |
CN112328513B (en) * | 2020-10-14 | 2024-02-02 | 合肥芯碁微电子装备股份有限公司 | Scanning exposure system and data caching and scheduling method and device thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20200371692A1 (en) | Memory disaggregation for compute nodes | |
US20220137849A1 (en) | Fragment Management Method and Fragment Management Apparatus | |
CN101291347B (en) | Network storage system | |
US8370571B2 (en) | Transfer control of a storage volume between storage controllers in a cluster | |
CN101436149B (en) | Method for rebuilding data of magnetic disk array | |
US9247003B2 (en) | Determining server write activity levels to use to adjust write cache size | |
US11379326B2 (en) | Data access method, apparatus and computer program product | |
CN101567211A (en) | Method for improving usability of disk and disk array controller | |
WO2011002438A1 (en) | Organizing and managing a memory blade with super pages and buffers | |
CN108733326B (en) | Disk processing method and device | |
CN103049220A (en) | Storage control method, storage control device and solid-state storage system | |
CN103534688A (en) | Data recovery method, storage equipment and storage system | |
US20240264762A1 (en) | Data Write Method and Related Device | |
CN103403667A (en) | Data processing method and device | |
US7725654B2 (en) | Affecting a caching algorithm used by a cache of storage system | |
Xie et al. | MICRO: A multilevel caching-based reconstruction optimization for mobile storage systems | |
US20240104014A1 (en) | Data management method, and storage space management method and apparatus | |
CN116774911A (en) | Memory management method and device | |
CN112379825A (en) | Distributed data storage method and device based on data feature sub-pools | |
CN115904795A (en) | Data storage method and device in storage system | |
CN116893789B (en) | Data management method, system, device, equipment and computer storage medium | |
CN111338985A (en) | Data reading and writing method and device of service system | |
CN115826882A (en) | Storage method, device, equipment and storage medium | |
CN116339630A (en) | Method, system, equipment and storage medium for rapidly dropping RAID cache data | |
CN115599592A (en) | Memory mirroring method and computing device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | ||
Application publication date: 20200626 |