CN111124299A - Data storage management method, device, equipment, system and storage medium

Info

Publication number
CN111124299A
Authority
CN
China
Prior art keywords
data
storage
query
request
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911301508.4A
Other languages
Chinese (zh)
Inventor
王怡然
任红超
曹磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Netease Media Technology Beijing Co Ltd
Original Assignee
Netease Media Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Netease Media Technology Beijing Co Ltd
Priority to CN201911301508.4A
Publication of CN111124299A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Abstract

The application discloses a data storage management method, apparatus, device, system and storage medium that help development teams use and maintain clusters conveniently, reduce usage cost, and coordinate cluster resources reasonably. The method comprises: receiving a storage request sent by a client corresponding to an application, wherein the storage request comprises data to be stored and an index identifier, and the application comprises at least one index identifier; querying, according to a preconfigured routing relation, a first cluster corresponding to the index identifier in the storage request, wherein the routing relation is the correspondence between index identifiers and clusters in a storage system, and the storage system comprises a plurality of clusters; and storing the data to be stored in the first cluster.

Description

Data storage management method, device, equipment, system and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data storage management method, apparatus, device, system, and storage medium.
Background
This section is intended to provide a background or context to the embodiments of the application that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
A cluster is a group of mutually independent computers interconnected by a high-speed network, which form a group and are managed in a single system mode. By means of clustering techniques, relatively high gains in performance, reliability, flexibility can be achieved at low cost. At present, a cluster is mostly adopted to manage mass data in various applications so as to realize operations such as high-speed storage, query, deletion and the like of the data.
However, in practice a development team usually has to deploy an independent cluster for each application and then configure the cluster's IP address and port in the application to access it. When the application needs to store only a small amount of data, cluster resources are wasted. In addition, every development team has to learn how to deploy and maintain clusters, so the learning cost is high.
Disclosure of Invention
In view of the above technical problems, there is a great need for an improved method to assist a development team in conveniently using and maintaining clusters, reducing the use cost, and simultaneously coordinating the cluster resources reasonably.
In one aspect, an embodiment of the present application provides a data storage management method, including:
receiving a storage request sent by a client corresponding to an application, wherein the storage request comprises data to be stored and an index identifier, and the application comprises at least one index identifier;
inquiring a first cluster corresponding to the index identifier in the storage request according to a preconfigured routing relation, wherein the routing relation is the corresponding relation between the index identifier and the clusters in a storage system, and the storage system comprises a plurality of clusters;
and storing the data to be stored in the first cluster.
Optionally, the method further comprises:
and stopping responding to query requests for any index identifier of an application if it is detected that the data volume of a single query for the application exceeds the single-query traffic upper limit configured for the application, or that the total data query volume for the application within a unit time exceeds the total traffic upper limit configured for the application.
Optionally, the method further comprises:
receiving an index establishing request sent by a client corresponding to an application, wherein the index establishing request comprises at least one index identifier and expected storage capacity included by the application;
selecting clusters from the storage system having a remaining storage capacity that meets the desired storage capacity;
and establishing a routing relation for the index identification in the index establishing request and the selected cluster.
Optionally, the index establishing request further includes a service level;
the selecting, from the storage system, a cluster whose remaining storage capacity satisfies the desired storage capacity specifically includes:
if the service level is higher than a preset level, selecting, from the clusters of the storage system for which no routing relation has been established, a cluster whose storage capacity satisfies the expected storage capacity, and marking the selected cluster as an exclusive cluster;
and if the service level is not higher than the preset level, selecting, from the non-exclusive clusters of the storage system, a cluster whose remaining storage capacity satisfies the expected storage capacity.
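The selection rule above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the `Cluster` fields, the `PRESET_LEVEL` value, the gigabyte units and the first-fit choice among candidates are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    capacity_gb: int          # total storage capacity (assumed unit)
    used_gb: int = 0          # capacity already promised to indexes
    exclusive: bool = False   # set when the cluster is reserved for one application
    routed: bool = False      # True once any routing relation points at the cluster

    @property
    def remaining_gb(self):
        return self.capacity_gb - self.used_gb


PRESET_LEVEL = 2  # assumed threshold; the text only calls it "the preset level"


def establish_routing(clusters, routing, index_ids, expected_gb, service_level):
    """Select a cluster and record index identifier -> cluster name in `routing`."""
    high = service_level > PRESET_LEVEL
    if high:
        # High service level: only clusters that have no routing relation yet.
        candidates = [c for c in clusters
                      if not c.routed and c.capacity_gb >= expected_gb]
    else:
        # Otherwise: any non-exclusive cluster with enough remaining capacity.
        candidates = [c for c in clusters
                      if not c.exclusive and c.remaining_gb >= expected_gb]
    if not candidates:
        raise RuntimeError("no cluster satisfies the expected storage capacity")
    chosen = candidates[0]            # first fit, an assumption of this sketch
    chosen.exclusive = chosen.exclusive or high
    chosen.routed = True
    chosen.used_gb += expected_gb
    for index_id in index_ids:
        routing[index_id] = chosen.name
    return chosen


clusters = [Cluster("shared-1", 100, used_gb=40, routed=True), Cluster("spare-1", 200)]
routing = {}
low = establish_routing(clusters, routing, ["blog_posts"], 30, service_level=1)
high = establish_routing(clusters, routing, ["shop_orders", "shop_users"], 150, service_level=3)
```

In this usage the low-level request lands on the shared cluster, while the high-level request takes the unused cluster and marks it exclusive, so later low-level requests skip it.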
Optionally, the cluster in the storage system is an ElasticSearch cluster.
In one aspect, an embodiment of the present application provides a data storage management apparatus, including:
a write gateway, configured to receive a storage request sent by a client corresponding to an application, wherein the storage request comprises data to be stored and an index identifier, and the application comprises at least one index identifier;
a message processing unit, configured to query, according to a preconfigured routing relationship, a first cluster corresponding to the index identifier in the storage request, wherein the routing relationship is the correspondence between index identifiers and clusters in a storage system, and the storage system comprises a plurality of clusters; and to store the data to be stored in the first cluster.
Optionally, the write gateway is further configured to write the data to be stored and the index identifier in the storage request into a cache queue, where the cache queue is a queue created based on a Kafka message system;
the message processing unit is specifically configured to sequentially obtain data to be stored and an index identifier from the cache queue, and store the obtained data to be stored in a first cluster corresponding to the index identifier.
Optionally, the write gateway is further configured to write the data to be stored in the storage request into a cache queue corresponding to the index identifier in the storage request, where the cache queue is a queue created based on a Kafka message system;
the message processing unit is specifically configured to sequentially obtain data to be stored in the storage request from the cache queue corresponding to the index identifier, and store the obtained data to be stored in the first cluster.
Optionally, the write gateway is specifically configured to: if the type of the received storage request is an http request, obtaining the data to be stored and the index identifier in the storage request, writing the obtained data to be stored into a cache queue corresponding to the obtained index identifier in a Kafka message system according to a Kafka communication protocol, and otherwise, directly writing the data to be stored in the storage request into the cache queue corresponding to the index identifier in the storage request.
Optionally, the apparatus further comprises a query gateway configured to:
receiving a first query request sent by a client corresponding to an application, wherein the first query request comprises an index identifier and a query condition, and the type of the first query request is an http request;
determining a second cluster corresponding to the index identifier in the first query request according to the routing relation;
querying data meeting the query condition in the first query request from the second cluster;
and sending the data meeting the query conditions to the client side sending the first query request.
Optionally, the query gateway is further configured to:
receiving a second query request sent by a client corresponding to an application through a preset query interface, wherein the second query request comprises a joint query identifier and query conditions;
finding a joint query mode corresponding to the joint query identifier from a preset joint query list, wherein the joint query mode comprises a plurality of index identifiers;
querying, according to the plurality of index identifiers in the found joint query mode, data meeting the query condition in the second query request from the clusters respectively corresponding to those index identifiers;
and sending the data meeting the query conditions to the client side sending the second query request.
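The joint query steps above can be sketched as follows. The dictionaries stand in for the routing relation, the preset joint query list and the clusters; the field names `joint_query_id` and `condition`, and the equality-only matching, are assumptions of this example rather than anything specified in the application.

```python
# Hypothetical routing relation: index identifier -> cluster name.
ROUTING = {"shop_orders": "cluster-a", "shop_refunds": "cluster-b"}

# In-memory stand-in for the clusters: cluster name -> index identifier -> documents.
CLUSTERS = {
    "cluster-a": {"shop_orders": [{"user": "u1", "order_id": 1},
                                  {"user": "u2", "order_id": 2}]},
    "cluster-b": {"shop_refunds": [{"user": "u1", "refund_id": 7}]},
}

# Hypothetical preset joint query list: joint query identifier -> index identifiers.
JOINT_QUERIES = {"orders_with_refunds": ["shop_orders", "shop_refunds"]}


def query_index(index_id, condition):
    """Query one index in the cluster that the routing relation points to."""
    cluster = ROUTING[index_id]
    docs = CLUSTERS[cluster].get(index_id, [])
    return [d for d in docs if all(d.get(k) == v for k, v in condition.items())]


def joint_query(request):
    """Resolve the joint query identifier, then query every listed index."""
    index_ids = JOINT_QUERIES[request["joint_query_id"]]
    return {index_id: query_index(index_id, request["condition"])
            for index_id in index_ids}


result = joint_query({"joint_query_id": "orders_with_refunds",
                      "condition": {"user": "u1"}})
```

Returning the results keyed by index identifier is a design choice of the sketch; how results from several clusters are merged for the client is not described in the text.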
Optionally, the query gateway is further configured to stop responding to query requests for any index identifier of an application if it is detected that the data volume of a single query for the application exceeds the single-query traffic upper limit configured for the application, or that the total data query volume for the application within a unit time exceeds the total traffic upper limit configured for the application.
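A minimal sketch of this traffic check for one application is shown below. The class name, the fixed-window accounting and the injectable clock are assumptions of the example. The text does not say when responses resume, so the sketch simply stays blocked once a limit has been exceeded.

```python
import time


class FlowLimiter:
    """Per-application query limiter (hypothetical helper, one instance per application)."""

    def __init__(self, single_limit, total_limit, window_seconds=1.0, clock=time.monotonic):
        self.single_limit = single_limit      # upper limit for one query
        self.total_limit = total_limit        # upper limit per unit time
        self.window_seconds = window_seconds  # length of the "unit time" (assumed)
        self.clock = clock
        self.window_start = clock()
        self.window_total = 0
        self.blocked = False

    def allow(self, query_volume):
        """Return True if the query may be answered, False if responses are stopped."""
        now = self.clock()
        if now - self.window_start >= self.window_seconds:
            # A new unit of time starts: reset the running total.
            self.window_start = now
            self.window_total = 0
        if self.blocked:
            return False
        self.window_total += query_volume
        if query_volume > self.single_limit or self.window_total > self.total_limit:
            # Assumption: once tripped, the application stays blocked.
            self.blocked = True
            return False
        return True
```

The gateway would call `allow()` with the data volume of each incoming query before forwarding it to a cluster, for example `FlowLimiter(single_limit=100, total_limit=250).allow(80)`.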
Optionally, the apparatus further includes a routing relation establishing module, configured to:
receiving an index establishing request sent by a client corresponding to an application, wherein the index establishing request comprises at least one index identifier and expected storage capacity included by the application;
selecting clusters from the storage system having a remaining storage capacity that meets the desired storage capacity;
and establishing a routing relation for the index identification in the index establishing request and the selected cluster.
Optionally, the index establishing request further includes a service level;
the routing relationship establishing module is specifically configured to:
if the service level is higher than a preset level, selecting, from the clusters of the storage system for which no routing relation has been established, a cluster whose storage capacity satisfies the expected storage capacity, and marking the selected cluster as an exclusive cluster;
and if the service level is not higher than the preset level, selecting, from the non-exclusive clusters of the storage system, a cluster whose remaining storage capacity satisfies the expected storage capacity.
Optionally, the cluster in the storage system is an ElasticSearch cluster.
In one aspect, an embodiment of the present application provides a data storage management device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of any one of the methods when executing the computer program.
In one aspect, an embodiment of the present application provides a data storage management system, including: a storage system comprising a plurality of clusters, and any one of the above data storage management devices.
In one aspect, an embodiment of the present application provides a computer-readable storage medium having stored thereon computer program instructions, which, when executed by a processor, implement the steps of any of the above-described methods.
The data storage management method, apparatus, device, system and storage medium provided by the embodiments of the application configure in advance, for each application, a cluster for storing that application's data, and establish a routing relationship between the index identifiers of each application and the cluster configured for it. When data of an application needs to be stored, it is stored in the corresponding cluster according to the established routing relationship. The user can therefore store data simply by initiating a storage request, without attending to the internal deployment of the storage system or learning how to use and maintain clusters, which reduces the user's cost. In addition, multiple users can share one storage system; in particular, workloads with small data volumes and low read-write concurrency can be consolidated into one common cluster, which saves server resources and achieves reasonable allocation and sharing of the storage resources in the storage system.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present application will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present application are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
FIG. 1 is a schematic view of an application scenario of a data storage management method according to an embodiment of the present application;
FIG. 2 is a schematic flow chart illustrating a data storage management method according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating a data storage management method according to an embodiment of the present application;
FIG. 4 is a flowchart illustrating a data storage management method according to an embodiment of the present application;
FIG. 5 is a schematic diagram illustrating operation of a data storage management device according to an embodiment of the present application;
FIG. 6 is a schematic flowchart of querying data in a data storage management method according to an embodiment of the present application;
FIG. 7 is a schematic diagram illustrating a flow of querying data in a data storage management method according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a data storage management apparatus according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a data storage management device according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a data storage management system according to an embodiment of the present application.
Detailed Description
The principles and spirit of the present application will be described with reference to a number of exemplary embodiments. It should be understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the present application, and are not intended to limit the scope of the present application in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As will be appreciated by one skilled in the art, embodiments of the present application may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
In this document, it is to be understood that any number of elements in the figures are provided by way of illustration and not limitation, and any nomenclature is used for differentiation only and not in any limiting sense.
The principles and spirit of the present application are explained in detail below with reference to several representative embodiments of the present application.
Summary of the Invention
The inventors of the application found that, at present, a development team developing an application has to deploy an independent cluster for each application and then configure the cluster's IP address and port in the application to access it. When the application needs to store only a small amount of data, cluster resources are wasted; moreover, every development team has to learn cluster deployment and maintenance, so the learning cost is high.
To solve the above problem, the present application provides a data storage management method, which specifically includes: receiving a storage request sent by a client corresponding to an application, wherein the storage request comprises data to be stored and an index identifier, and the application comprises at least one index identifier; querying, according to a preconfigured routing relation, a first cluster corresponding to the index identifier in the storage request, wherein the routing relation is the correspondence between index identifiers and clusters in a storage system, and the storage system comprises a plurality of clusters; and storing the data to be stored in the first cluster. In this method, a cluster for storing an application's data is configured for each application in advance, and a routing relation is established between the index identifiers of each application and the cluster configured for it. When data of an application needs to be stored, it is stored in the corresponding cluster according to the established routing relation. The user can therefore store data simply by initiating a storage request, without attending to the internal deployment of the storage system or learning how to use and maintain clusters, which reduces the user's cost. In addition, multiple users can share one storage system; in particular, workloads with small data volumes and low read-write concurrency can be consolidated into one common cluster, which saves server resources and achieves reasonable allocation and sharing of the storage resources in the storage system.
Having described the basic principles of the present application, various non-limiting embodiments of the present application are described in detail below.
Application Scenario Overview
Fig. 1 is a schematic view of an application scenario of a data storage management method according to an embodiment of the present application. The application scenario comprises a terminal device 101, a data storage management device 102 and a storage system 103, wherein the terminal device 101 and the data storage management device 102 are connected through a communication network, and the data storage management device 102 and the storage system 103 are connected through the communication network. The terminal device 101 includes, but is not limited to, a desktop computer, a mobile phone, a mobile computer, a tablet computer, a media player, a smart wearable device, a smart television, and other electronic devices. The data storage management device 102 may be a server, a server cluster composed of several servers, or a cloud computing center. Storage system 103 includes a plurality of clusters 1031. The terminal device 101 is installed with a client developed for an application, and can be connected to the data storage management device 102 through an access interface provided by the client, and access the storage system 103 through the data storage management device 102, so as to obtain data management services corresponding to the access interface, including but not limited to data management services such as data storage, query, deletion, and the like.
Exemplary method
In the following, a data storage management method according to an exemplary embodiment of the present application is described in conjunction with the above application scenarios. It should be noted that the above application scenarios are only presented to facilitate understanding of the spirit and principles of the present application, and the embodiments of the present application are not limited in this respect. Rather, embodiments of the present application may be applied to any scenario where applicable.
Referring to fig. 2, a data storage management method provided in an embodiment of the present application may be applied to the data storage management device 102 shown in fig. 1, and specifically may include the following steps:
S201, receiving a storage request sent by a client corresponding to an application, wherein the storage request comprises data to be stored and an index identifier, and the application comprises at least one index identifier.
In specific implementation, generally, the same type of data in the same application corresponds to one index identifier, which can be determined according to the storage requirement of the user. The storage system or a background operation and maintenance person of the storage system can allocate the index identifier and the corresponding cluster to each application according to the storage requirement of the user, establish a routing relationship between the index identifier and the cluster, and store the routing relationship into a routing relationship list, and the specific process will be described in detail later.
When the amount of data that an application needs to store is small or the kind of data is single, only one index identifier may be configured for one application. When the application needs to store a large amount of data or a large number of types of data, different index identifiers may be configured for different types of data in one application, for example, in a shopping application, one index identifier may be configured for order data separately, one index identifier may be configured for user data separately, and one index identifier may be configured for commodity data separately.
S202, inquiring a first cluster corresponding to the index identifier in the storage request according to a preconfigured routing relationship, wherein the routing relationship is the corresponding relationship between the index identifier and the clusters in the storage system, and the storage system comprises a plurality of clusters.
S203, storing the data to be stored into the first cluster.
In the data storage management method provided by the embodiments of the application, a cluster for storing an application's data is configured for each application in advance, and a routing relationship is established between the index identifiers of each application and the cluster configured for it. When data of an application needs to be stored, it is stored in the corresponding cluster according to the established routing relationship. The user can therefore store data simply by initiating a storage request, without attending to the internal deployment of the storage system or learning how to use and maintain clusters, which reduces the user's cost. In addition, multiple users can share one storage system; in particular, workloads with small data volumes and low read-write concurrency can be consolidated into one common cluster, which saves server resources and achieves reasonable allocation and sharing of the storage resources in the storage system.
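Steps S201 to S203 can be sketched as a lookup followed by a write. The sketch below is only an illustration under stated assumptions: the routing table contents, the request field names `index_id` and `data`, and the in-memory dictionaries standing in for real clusters are all hypothetical.

```python
from collections import defaultdict

# Hypothetical routing relation: index identifier -> cluster name.
# Two index identifiers of one shopping application share a cluster.
ROUTING = {
    "shop_orders": "cluster-a",
    "shop_users": "cluster-a",
    "news_articles": "cluster-b",
}

# In-memory stand-in for the clusters: cluster name -> index identifier -> documents.
CLUSTERS = defaultdict(lambda: defaultdict(list))


def handle_storage_request(request):
    """Store the request's data in the cluster routed to its index identifier."""
    index_id = request["index_id"]
    cluster = ROUTING.get(index_id)                      # S202: look up the first cluster
    if cluster is None:
        raise KeyError(f"no routing relation for index identifier {index_id!r}")
    CLUSTERS[cluster][index_id].append(request["data"])  # S203: store the data
    return cluster


first_cluster = handle_storage_request(
    {"index_id": "shop_orders", "data": {"order_id": 1, "total": 9.5}}
)
```

The client never names a cluster, IP address or port; it only supplies the index identifier, which is the point of the routing relation.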
The clusters in the storage system of the embodiments of the present application may be clusters of any type, for example ElasticSearch clusters. ElasticSearch is a search server based on Lucene. It provides a distributed, multi-user full-text search engine with a RESTful web interface. ElasticSearch is developed in Java and was released as open source under the Apache license terms, and it is a popular enterprise-level search engine. ElasticSearch is used in cloud computing, supports real-time search, and is stable, reliable, fast, and easy to install and use.
Further, to avoid data loss during storage, a cache queue may be established in the data storage management device 102 to cache received storage requests. The cache queue may be built on an existing data caching technology, for example a queue created based on the Kafka message system, which is not limited in the embodiments of the present application.
As one possible implementation, a single cache queue may be established within the data storage management device 102. Accordingly, referring to fig. 3, another data storage management method provided in the embodiment of the present application may specifically include the following steps:
S301, receiving a storage request sent by a client corresponding to an application, wherein the storage request comprises data to be stored and an index identifier.
Wherein each application comprises at least one index identifier.
S302, writing the data to be stored and the index identifier in the storage request into a cache queue.
S303, inquiring a first cluster corresponding to each index identifier in the cache queue according to the preconfigured routing relation.
The routing relationship is a corresponding relationship between the index identifier and a cluster in the storage system, and the storage system comprises a plurality of clusters.
S304, sequentially obtaining the data to be stored and the index identifier from the cache queue, and storing the obtained data to be stored into the first cluster corresponding to the index identifier.
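The single-queue flow can be sketched with an in-process queue. Python's standard `queue.Queue` stands in for the Kafka-backed cache queue here; the function names, request fields and routing table are assumptions of the example.

```python
import queue

# Hypothetical routing relation and in-memory stand-ins for two clusters.
ROUTING = {"shop_orders": "cluster-a", "news_articles": "cluster-b"}
CLUSTERS = {"cluster-a": [], "cluster-b": []}

# One cache queue shared by every index identifier.
cache_queue = queue.Queue()


def write_gateway(request):
    """Receive a storage request and put (index identifier, data) on the queue."""
    cache_queue.put((request["index_id"], request["data"]))


def process_messages():
    """Drain the queue in order, routing each record to its cluster."""
    stored = 0
    while True:
        try:
            index_id, data = cache_queue.get_nowait()
        except queue.Empty:
            return stored
        CLUSTERS[ROUTING[index_id]].append((index_id, data))
        stored += 1


write_gateway({"index_id": "shop_orders", "data": {"order_id": 1}})
write_gateway({"index_id": "news_articles", "data": {"title": "hello"}})
write_gateway({"index_id": "shop_orders", "data": {"order_id": 2}})
stored_count = process_messages()
```

Because the index identifier travels with the data, one consumer can serve all applications, at the cost of processing every record through the same queue.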
As another possible implementation manner, a separate cache queue may be established for each index identifier to hold the data to be stored for that index identifier. Accordingly, referring to fig. 4, another data storage management method provided in the embodiment of the present application may specifically include the following steps:
S401, receiving a storage request sent by a client corresponding to an application, wherein the storage request comprises data to be stored and an index identifier.
Wherein each application comprises at least one index identifier.
S402, writing the data to be stored in the storage request into a cache queue corresponding to the index identifier in the storage request.
S403, inquiring a first cluster corresponding to each index identifier in each cache queue according to the preconfigured routing relation.
The routing relationship is a corresponding relationship between the index identifier and a cluster in the storage system, and the storage system comprises a plurality of clusters.
S404, sequentially obtaining the data to be stored from the cache queue corresponding to the index identifier, and storing the obtained data to be stored into the first cluster.
In specific implementation, the cache queues are independent of one another, so the lookup and storage logic for different queues can run in parallel, which increases the data storage speed.
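The per-index variant can be sketched with one queue and one worker thread per index identifier. As before, `queue.Queue` is only a stand-in for Kafka-backed queues, and the `STOP` sentinel used to end the workers is an assumption of this example.

```python
import queue
import threading

# Hypothetical routing relation and in-memory stand-ins for two clusters.
ROUTING = {"shop_orders": "cluster-a", "shop_users": "cluster-b"}
CLUSTERS = {"cluster-a": [], "cluster-b": []}

# One cache queue per index identifier.
QUEUES = {index_id: queue.Queue() for index_id in ROUTING}
STOP = object()  # sentinel telling a worker to finish


def write_gateway(request):
    """Put the data on the queue that belongs to the request's index identifier."""
    QUEUES[request["index_id"]].put(request["data"])


def drain(index_id):
    """Worker: resolve the cluster once, then store records from its own queue."""
    cluster = ROUTING[index_id]
    q = QUEUES[index_id]
    while True:
        data = q.get()
        if data is STOP:
            break
        CLUSTERS[cluster].append(data)


# Each queue gets its own worker, so the queues are processed in parallel.
workers = [threading.Thread(target=drain, args=(index_id,)) for index_id in QUEUES]
for worker in workers:
    worker.start()
for n in range(3):
    write_gateway({"index_id": "shop_orders", "data": {"order_id": n}})
    write_gateway({"index_id": "shop_users", "data": {"user_id": n}})
for q in QUEUES.values():
    q.put(STOP)
for worker in workers:
    worker.join()
```

Order is preserved within each index identifier because a single worker reads each queue, while a slow cluster for one index does not hold up the others.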
The methods shown in fig. 3 and fig. 4 use a cache queue to decouple the client from the storage system and allow historical data to be replayed. If the storage system fails, data whose storage failed can be automatically resent from the cache queue after the storage system recovers, so no data is lost and the user does not have to handle re-storing data after a storage failure.
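The resend behaviour can be illustrated with an append-only log and a read offset, which is the general idea behind replay in log-based message systems such as Kafka. The sketch below models the log as a plain list and uses no Kafka API; `FlakyCluster`, `drain_with_retry` and the `max_attempts` parameter are hypothetical names introduced for the example.

```python
class FlakyCluster:
    """Stand-in for a cluster that fails a fixed number of times, then recovers."""

    def __init__(self, failures):
        self.failures = failures
        self.docs = []

    def store(self, doc):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("cluster unavailable")
        self.docs.append(doc)


def drain_with_retry(log, cluster, offset=0, max_attempts=5):
    """Store log[offset:] in order; the offset advances only after a successful store."""
    attempts = 0
    while offset < len(log):
        try:
            cluster.store(log[offset])
        except ConnectionError:
            attempts += 1
            if attempts >= max_attempts:
                break      # give up for now; the offset still points at the unsent record
            continue       # resend the same record
        offset += 1
        attempts = 0
    return offset


log = [{"id": 1}, {"id": 2}, {"id": 3}]
recovering = FlakyCluster(failures=2)
final_offset = drain_with_retry(log, recovering)
```

Because a record is only considered consumed after it has been stored, a later call with the returned offset picks up exactly where the failed attempt stopped.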
On the basis of any one of the above embodiments, to make it easier for users to access the storage system, a corresponding SDK (software development kit) is provided, and the SDK specifies the format of the storage request. The user only needs to pass the index identifier and the data to be stored to the SDK, and the SDK automatically sends a storage request in the preset format so that the data to be stored is written into the cache queue.
In specific implementation, the SDK is written in a development language supported by the data caching technology that implements the cache queue. For example, since the Kafka message system supports access from Java applications, the SDK may be written in Java. Referring to fig. 5, a Java application can use the provided SDK to send a storage request to the write gateway, and the write gateway directly writes the data to be stored in the storage request into the corresponding cache queue in the Kafka message system.
To support applications written in more development languages and to reduce the cost of accessing the storage system, the method provided by the embodiments of the application also allows a storage request to be initiated over the http protocol. Correspondingly, a storage request initiated over the http protocol may be processed as follows: the write gateway, using the development language and communication protocol supported by the data caching technology, writes the data to be stored in the storage request into the cache queue, or into the cache queue corresponding to the index identifier in the storage request.
Taking the Kafka message system as an example, if the type of the storage request received by the write gateway is an http request, the write gateway obtains the data to be stored and the index identifier from the storage request, and then writes the obtained data to be stored into the cache queue corresponding to the obtained index identifier in the Kafka message system according to the Kafka communication protocol; otherwise, the write gateway directly writes the data to be stored in the storage request into the cache queue corresponding to the index identifier in the storage request.
Referring to fig. 5, the Java application sends a storage request to the write gateway in the data storage management device through the provided SDK, and the write gateway directly stores the data to be stored in the storage request into the cache queue corresponding to the index identifier in the storage request in the Kafka message system. A non-Java application can initiate an http-type storage request to the data storage management device; the write gateway obtains the data to be stored and the index identifier from the storage request, and writes the obtained data to be stored into the cache queue corresponding to the obtained index identifier in the Kafka message system according to the Kafka communication protocol. The management unit is used for storing the routing relationship. The message processing unit sequentially obtains the data to be stored from the cache queue, determines the cluster corresponding to the data to be stored by querying the routing relationship in the management unit, and stores the data to be stored into the corresponding cluster.
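The write path of fig. 5 can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of the embodiment: the request layout (`type`, `body`, `index_id`, `data`), the class names `WriteGateway` and `MessageProcessor`, and the use of in-memory deques in place of Kafka queues are all hypothetical.

```python
import json
from collections import defaultdict, deque


class WriteGateway:
    """Accepts storage requests and appends the payload to a per-index cache queue."""

    def __init__(self):
        self.queues = defaultdict(deque)  # index identifier -> cache queue (stand-in for Kafka)

    def handle(self, request):
        if request["type"] == "http":
            # http request: the gateway extracts the index identifier and the data itself
            body = json.loads(request["body"])
            index_id, data = body["index_id"], body["data"]
        else:
            # SDK request: already in the preset format, so it is written directly
            index_id, data = request["index_id"], request["data"]
        self.queues[index_id].append(data)


class MessageProcessor:
    """Drains the cache queues in order and stores each record in the routed cluster."""

    def __init__(self, routes, clusters):
        self.routes = routes      # management unit: index identifier -> cluster name
        self.clusters = clusters  # cluster name -> list of stored (index_id, data) pairs

    def process(self, queues):
        for index_id, queue in queues.items():
            target = self.clusters[self.routes[index_id]]
            while queue:
                target.append((index_id, queue.popleft()))


routes = {"Idx1": "cluster-a", "Idx2": "cluster-b"}
clusters = {"cluster-a": [], "cluster-b": []}

gateway = WriteGateway()
gateway.handle({"type": "sdk", "index_id": "Idx1", "data": {"title": "t1"}})
gateway.handle({"type": "http",
                "body": json.dumps({"index_id": "Idx2", "data": {"title": "t2"}})})

MessageProcessor(routes, clusters).process(gateway.queues)
```

Because the client names only an index identifier, the routing table can be changed without touching any client code.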
On the basis of any one of the foregoing embodiments, referring to fig. 6, the data storage management method provided in the embodiment of the present application further includes the following steps:
S601, receiving a first query request sent by a client corresponding to an application, wherein the first query request comprises an index identifier and a query condition, and the type of the first query request is an http request.
S602, determining, according to the routing relationship, a second cluster corresponding to the index identifier in the first query request.
S603, querying data meeting the query condition in the first query request from the second cluster.
S604, sending the data meeting the query condition to the client that sent the first query request.
In specific implementation, in order to support applications written in more development languages and reduce the cost of accessing the storage system, a query gateway is provided in the data storage management device to process all query requests. The user may initiate a query request using the http protocol.
Referring to fig. 5, a user initiates a query request to the data storage management device through a client; the query gateway determines the cluster corresponding to the index identifier in the query request by querying the routing relationship in the management unit, queries that cluster for data meeting the query condition in the first query request, and returns the data meeting the query condition to the client that sent the first query request.
Through the method shown in fig. 6, the user can conveniently query the required data through the query gateway without concerning the deployment inside the storage system.
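Steps S602 to S604 can be sketched as a single lookup-then-filter function. The function name `handle_first_query`, the dictionary layout of the clusters, and the representation of the query condition as a Python predicate are assumptions made for illustration only; a real query gateway would forward the condition to the cluster in that cluster's own query syntax.

```python
def handle_first_query(request, routes, clusters):
    """Resolve the cluster for the index identifier and return the matching rows."""
    index_id = request["index_id"]
    cluster_name = routes[index_id]                   # S602: routing lookup
    rows = clusters[cluster_name].get(index_id, [])   # data held for this index identifier
    # S603: keep only rows meeting the query condition; the caller returns them (S604)
    return [row for row in rows if request["condition"](row)]


routes = {"Idx1": "cluster-a"}
clusters = {"cluster-a": {"Idx1": [{"id": 1, "views": 10}, {"id": 2, "views": 250}]}}

request = {"index_id": "Idx1", "condition": lambda row: row["views"] > 100}
matches = handle_first_query(request, routes, clusters)
```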
In order to provide users with more convenient and diversified query modes, a plurality of index identifiers can be combined into a joint query mode according to the query requirements of the application, a unique joint query identifier is configured for the joint query mode, and the joint query identifier and the joint query mode are stored in association in a joint query list. The developer can provide in the application a preset query interface corresponding to each joint query mode, and a user can initiate, through the preset query interface, a query request that queries a plurality of clusters simultaneously. Based on this, referring to fig. 7, the method of the embodiment of the present application further includes the following steps:
S701, receiving a second query request sent by a client corresponding to the application through a preset query interface, wherein the second query request comprises a joint query identifier and a query condition.
S702, finding the joint query mode corresponding to the joint query identifier from a preset joint query list, wherein each joint query mode comprises a plurality of index identifiers.
S703, according to the plurality of index identifiers in the found joint query mode, querying data meeting the query condition in the second query request from the clusters corresponding to the plurality of index identifiers respectively.
S704, sending the data meeting the query condition to the client that sent the second query request.
For example, if the joint query mode corresponding to a certain joint query identifier includes index identifiers Idx1, Idx2, and Idx3, data meeting the query condition in the second query request is queried from the clusters corresponding to Idx1, Idx2, and Idx3, respectively, and the queried data is integrated and then sent to the client sending the second query request.
In this way, a user can conveniently realize cross-application query through the preset query interface, namely, related data is queried from a cluster corresponding to a plurality of applications at the same time, and the storage position of each application in the storage system does not need to be concerned.
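The joint query of steps S701 to S704 can be sketched as below, reusing the Idx1, Idx2 and Idx3 identifiers from the example. The function name `joint_query`, the joint query identifier `J1`, the cluster names and the predicate-style query condition are hypothetical, and "integrating" the results is assumed here to mean simple concatenation in the order of the index identifiers.

```python
def joint_query(request, joint_query_list, routes, clusters):
    """Query every index identifier of a joint query mode and merge the results."""
    index_ids = joint_query_list[request["joint_id"]]      # S702: look up the joint query mode
    merged = []
    for index_id in index_ids:                             # S703: query each routed cluster
        cluster = clusters[routes[index_id]]
        merged.extend(row for row in cluster.get(index_id, [])
                      if request["condition"](row))
    return merged                                          # S704: returned to the client


joint_query_list = {"J1": ["Idx1", "Idx2", "Idx3"]}
routes = {"Idx1": "cluster-a", "Idx2": "cluster-b", "Idx3": "cluster-a"}
clusters = {
    "cluster-a": {"Idx1": [{"user": "u1", "src": "Idx1"}],
                  "Idx3": [{"user": "u2", "src": "Idx3"}]},
    "cluster-b": {"Idx2": [{"user": "u1", "src": "Idx2"}]},
}

request = {"joint_id": "J1", "condition": lambda row: row["user"] == "u1"}
result = joint_query(request, joint_query_list, routes, clusters)
```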
On the basis of any of the above embodiments, a traffic upper limit for queries may be configured for each application in order to implement flow control and prevent any one application from monopolizing all the traffic, where the traffic upper limit may include a single-query traffic upper limit and a total traffic upper limit per unit time. In specific implementation, a uniform traffic upper limit may be configured for all applications, or the traffic upper limit may be determined according to the service level of the application, where a higher service level corresponds to a higher traffic upper limit. When the data volume queried by an application exceeds the configured traffic upper limit, the rate of data transmission to the application can be limited.
Based on this, the method of the embodiment of the present application further includes the following step: if it is detected that the data volume of a single query for any application exceeds the single-query traffic upper limit configured for the application, or that the total data volume queried for any application within a unit time exceeds the total traffic upper limit configured for the application, stopping responding to query requests for any index identifier of the application.
The unit time may be determined according to factors such as actual application requirements, frequency of query requests, query data amount, and the like, and the embodiment of the present application is not limited. For example, the unit time may be 10 seconds, 1 minute, or the like.
In specific implementation, flow control can be conveniently implemented in the query gateway, and other functions such as circuit breaking can also be implemented there, so that the user does not need to be concerned with how these functions are implemented.
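The flow control described above can be sketched with a fixed-window counter per application. Several points are assumptions that the embodiment leaves open: the class name `QueryFlowLimiter`, measuring volume in abstract units, using a fixed window as the "unit time", and lifting the block when the next window starts. The current time is passed in explicitly so that the behaviour is deterministic.

```python
class QueryFlowLimiter:
    """Tracks per-application query volume against single and total upper limits."""

    def __init__(self, single_limit, total_limit, window_seconds):
        self.single_limit = single_limit      # upper limit for one query
        self.total_limit = total_limit        # upper limit per unit time
        self.window_seconds = window_seconds  # length of the unit time
        self._windows = {}                    # app_id -> [window_start, volume_used, blocked]

    def _window(self, app_id, now):
        window = self._windows.get(app_id)
        if window is None or now - window[0] >= self.window_seconds:
            window = [now, 0, False]          # assumption: a new window clears the block
            self._windows[app_id] = window
        return window

    def is_blocked(self, app_id, now):
        """True if query requests of this application should not be answered."""
        return self._window(app_id, now)[2]

    def record(self, app_id, volume, now):
        """Account for one query; return True if the application is now blocked."""
        window = self._window(app_id, now)
        window[1] += volume
        if volume > self.single_limit or window[1] > self.total_limit:
            window[2] = True
        return window[2]


limiter = QueryFlowLimiter(single_limit=100, total_limit=250, window_seconds=60)

after_first = limiter.record("news", 90, now=0)     # total 90
after_second = limiter.record("news", 90, now=10)   # total 180
after_third = limiter.record("news", 90, now=20)    # total 270, above the total limit
blocked_in_window = limiter.is_blocked("news", now=30)
blocked_next_window = limiter.is_blocked("news", now=61)
oversized = limiter.record("video", 150, now=0)     # a single query above the single limit
```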
On the basis of any of the above embodiments, a user may send an index establishment request to the data storage management device through a client in order to obtain a cluster that meets the user's storage requirement, where the index establishment request includes at least one index identifier included in the application and an expected storage capacity. The data storage management device receives the index establishment request sent by the client corresponding to the application, selects from the storage system a cluster whose remaining storage capacity meets the expected storage capacity, establishes a routing relationship between the index identifier in the index establishment request and the selected cluster, and stores the established routing relationship in the management unit.
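Handling of an index establishment request can be sketched as follows. The function name `establish_index`, the first-fit selection order, and the reservation of the expected capacity on the selected cluster are assumptions; the embodiment only requires that the remaining storage capacity of the selected cluster meets the expected storage capacity.

```python
def establish_index(index_ids, expected_capacity, remaining, routes):
    """Pick the first cluster with enough remaining capacity and route the identifiers to it."""
    for name, free in remaining.items():
        if free >= expected_capacity:
            remaining[name] = free - expected_capacity  # assumption: the capacity is reserved
            for index_id in index_ids:
                routes[index_id] = name                 # routing relationship kept by the management unit
            return name
    return None  # no cluster can satisfy the request


remaining = {"cluster-a": 50, "cluster-b": 400}  # remaining capacity per cluster, arbitrary units
routes = {}

first = establish_index(["Idx1", "Idx2"], 300, remaining, routes)
second = establish_index(["Idx3"], 200, remaining, routes)
```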
In specific implementation, a user can send an index acquisition request to the data storage management device through the client, where the index acquisition request includes the application identifier of the application and an index field. The data storage management device receives the index acquisition request sent by the client corresponding to the application, allocates a unique index identifier according to the application identifier and the index field, and returns the unique index identifier to the client. The user can then establish a routing relationship, store data, query data, and perform other operations based on the index identifier.
Further, the storage system may be divided into an exclusive cluster and a shared cluster, where the exclusive cluster stores only data corresponding to one index identifier, and the shared cluster may store data corresponding to a plurality of index identifiers simultaneously. Meanwhile, the storage system provides a plurality of selectable service levels for users, the users can select corresponding service levels according to the importance requirements of data in the application, and the data storage management equipment can automatically allocate proper clusters to store the data of the application according to the service levels selected by the users.
To this end, the index establishment request also includes a service level. Accordingly, a cluster whose remaining storage capacity meets the expected storage capacity may be selected from the storage system in the following manner: if the service level is higher than the preset level, selecting a cluster whose storage capacity meets the expected storage capacity from the clusters of the storage system for which no routing relationship has been established, and marking the selected cluster as an exclusive cluster; and if the service level is not higher than the preset level, selecting a cluster whose remaining storage capacity meets the expected storage capacity from the non-exclusive clusters of the storage system.
If no routing relationship has been established for a cluster, the cluster is not yet in use and can be allocated as an exclusive cluster to an index identifier whose service level is higher than the preset level. It can be seen that the exclusive cluster and the shared cluster are only logical concepts, and any cluster in the storage system can serve as either an exclusive cluster or a shared cluster.
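The service-level rule can be sketched as a single selection function. The `Cluster` dataclass, the numeric service levels, and the first-fit order are hypothetical; for simplicity the sketch compares the `remaining` field in both branches, which for an unused cluster is assumed to equal its storage capacity.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    remaining: int                                   # remaining storage capacity, arbitrary units
    exclusive: bool = False                          # logical marker only
    index_ids: list = field(default_factory=list)    # identifiers routed to this cluster


def select_cluster(clusters, index_id, expected_capacity, service_level, preset_level):
    """Return the first cluster allowed by the service level, or None if there is none."""
    wants_exclusive = service_level > preset_level
    for cluster in clusters:
        if cluster.remaining < expected_capacity:
            continue
        if wants_exclusive and cluster.index_ids:
            continue  # a routing relationship already exists, so it cannot become exclusive
        if not wants_exclusive and cluster.exclusive:
            continue  # exclusive clusters are not shared
        if wants_exclusive:
            cluster.exclusive = True
        cluster.index_ids.append(index_id)
        return cluster
    return None


clusters = [Cluster("c1", 100), Cluster("c2", 500), Cluster("c3", 500)]

high = select_cluster(clusters, "Idx1", 200, service_level=3, preset_level=2)
low = select_cluster(clusters, "Idx2", 50, service_level=1, preset_level=2)
low_large = select_cluster(clusters, "Idx3", 300, service_level=1, preset_level=2)
high_late = select_cluster(clusters, "Idx4", 50, service_level=3, preset_level=2)
```

The last call returns no cluster because, by that point, every cluster already has a routing relationship, which shows why the exclusive marker is a matter of allocation state rather than of hardware.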
In specific implementation, if the service level of a certain application is changed, the data storage management device may automatically adjust the allocated cluster for the application.
For example, when the original service level of a certain application is higher than the preset level and the changed service level is not higher than the preset level, the cluster corresponding to the application may be marked as a non-exclusive cluster, so that other applications may use the cluster.
For example, when the original service level of an application is not higher than the preset level and the changed service level is higher than the preset level, the data of the application may be migrated from the cluster currently corresponding to the application to any cluster for which no routing relationship has been established, the cluster migrated to is marked as an exclusive cluster, and the routing relationship corresponding to the application is modified; alternatively, the data of the other applications in the cluster currently corresponding to the application may be migrated to any non-exclusive cluster, the routing relationships corresponding to the migrated applications are modified, and the cluster corresponding to the application is marked as an exclusive cluster.
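The two service-level changes can be sketched as follows. Only the first upgrade option (migrating the application's own data to an unused cluster) is shown, and the sketch handles a single index identifier per call; the function name `change_service_level`, the dictionary layout and the cluster names are hypothetical.

```python
def change_service_level(index_id, old_level, new_level, preset_level, routes, clusters):
    """Adjust cluster allocation after a service-level change; return the routed cluster."""
    current = routes[index_id]
    if old_level > preset_level and new_level <= preset_level:
        # Downgrade: the cluster stays in place but may now be shared with other applications
        clusters[current]["exclusive"] = False
    elif old_level <= preset_level and new_level > preset_level:
        # Upgrade: move the data to a cluster that has no routing relationship yet
        in_use = set(routes.values())
        target = next((name for name in clusters if name not in in_use), None)
        if target is None:
            raise LookupError("no cluster without a routing relationship is available")
        clusters[target]["data"][index_id] = clusters[current]["data"].pop(index_id, [])
        clusters[target]["exclusive"] = True
        routes[index_id] = target
    return routes[index_id]


routes = {"Idx1": "shared", "Idx2": "shared"}
clusters = {
    "shared": {"exclusive": False, "data": {"Idx1": ["r1"], "Idx2": ["r2"]}},
    "spare": {"exclusive": False, "data": {}},
}

upgraded = change_service_level("Idx1", 1, 3, 2, routes, clusters)
exclusive_after_upgrade = clusters["spare"]["exclusive"]

downgraded = change_service_level("Idx1", 3, 1, 2, routes, clusters)
exclusive_after_downgrade = clusters["spare"]["exclusive"]
```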
Exemplary device
Having described the method of the exemplary embodiments of the present application, a data storage management apparatus of the exemplary embodiments of the present application is described next.
Fig. 8 is a schematic structural diagram of a data storage management apparatus 80 according to an embodiment of the present application. In one embodiment, the data storage management apparatus 80 includes: a write gateway 801 and a message processing unit 802.
The write gateway 801 is configured to receive a storage request sent by a client corresponding to an application, where the storage request includes data to be stored and an index identifier, and the application includes at least one index identifier.
The message processing unit 802 is configured to query a first cluster corresponding to the index identifier in the storage request according to a preconfigured routing relationship, where the routing relationship is a correspondence relationship between the index identifier and a cluster in a storage system, where the storage system includes a plurality of clusters, and store the data to be stored in the first cluster.
Optionally, the write gateway 801 is further configured to write the data to be stored and the index identifier in the storage request into the cache queue, where the cache queue may be a queue created based on the Kafka message system.
Correspondingly, the message processing unit 802 is specifically configured to sequentially obtain the data to be stored and the index identifier from the cache queue, and store the obtained data to be stored in the first cluster corresponding to the index identifier.
Optionally, the write gateway 801 is further configured to write the data to be stored in the storage request into a cache queue corresponding to the index identifier in the storage request, where the cache queue is a queue created based on the Kafka message system.
Correspondingly, the message processing unit 802 is specifically configured to sequentially obtain the data to be stored in the storage request from the cache queue corresponding to the index identifier, and store the obtained data to be stored in the first cluster.
Optionally, the write gateway 801 is specifically configured to: if the type of the received storage request is an http request, data to be stored and an index identifier in the storage request are obtained, the obtained data to be stored are written into a cache queue corresponding to the obtained index identifier in a Kafka message system according to a Kafka communication protocol, and otherwise, the data to be stored in the storage request are directly written into the cache queue corresponding to the index identifier in the storage request.
Optionally, the data storage management apparatus 80 provided in this embodiment of the present application further includes a query gateway, configured to:
receiving a first query request sent by a client corresponding to an application, wherein the first query request comprises an index identifier and a query condition, and the type of the first query request is an http request;
determining a second cluster corresponding to the index identifier in the first query request according to the routing relation;
querying data meeting the query condition in the first query request from the second cluster;
and sending the data meeting the query conditions to the client side sending the first query request.
Optionally, the query gateway is further configured to:
receiving a second query request sent by a client corresponding to an application through a preset query interface, wherein the second query request comprises a joint query identifier and query conditions;
finding a joint query mode corresponding to the joint query identifier from a preset joint query list, wherein the joint query mode comprises a plurality of index identifiers;
according to the plurality of index identifiers in the found joint query mode, querying data meeting the query condition in the second query request from the clusters corresponding to the plurality of index identifiers respectively;
and sending the data meeting the query conditions to the client side sending the second query request.
Optionally, the query gateway is further configured to stop responding to the query request for any index identifier of the application if it is detected that the single data query volume for any application exceeds the single upper limit value of the traffic volume configured for the application, or the total data query volume for any application within a unit time exceeds the total upper limit value of the traffic volume configured for the application.
Optionally, the data storage management apparatus 80 provided in the embodiment of the present application further includes a routing relationship establishing module, configured to:
receiving an index establishing request sent by a client corresponding to an application, wherein the index establishing request comprises at least one index identifier and expected storage capacity included by the application;
selecting clusters with the residual storage capacity meeting the expected storage capacity from the storage system;
and establishing a routing relation for the index identification in the index establishing request and the selected cluster.
Optionally, the request to establish an index further comprises a service level.
Correspondingly, the routing relationship establishing module is specifically configured to:
if the service level is higher than the preset level, selecting a cluster whose storage capacity meets the expected storage capacity from the clusters of the storage system for which no routing relationship has been established, and marking the selected cluster as an exclusive cluster;
and if the service level is not higher than the preset level, selecting a cluster whose remaining storage capacity meets the expected storage capacity from the non-exclusive clusters of the storage system.
Optionally, the cluster in the storage system is an ElasticSearch cluster.
The data storage management device and the data storage management method provided by the embodiment of the application adopt the same inventive concept, can obtain the same beneficial effects, and are not repeated herein.
Based on the same inventive concept as the data storage management method, the embodiment of the present application further provides a data storage management device, which may specifically be a server, a server cluster composed of a plurality of servers, or a cloud computing center. As shown in fig. 9, the data storage management device 90 may include a processor 901 and a memory 902.
The Processor 901 may be a general-purpose Processor, such as a Central Processing Unit (CPU), a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, a discrete Gate or transistor logic device, or a discrete hardware component, which may implement or execute the methods, steps, and logic blocks disclosed in the embodiments of the present Application. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware processor, or may be implemented by a combination of hardware and software modules in a processor.
Memory 902, as a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The memory may include at least one type of storage medium, for example, a flash memory, a hard disk, a multimedia card, a card-type memory, a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Programmable Read Only Memory (PROM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a magnetic memory, a magnetic disk, an optical disk, and so on. The memory may also be, without being limited to, any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 902 of the embodiments of the present application may also be circuitry or any other device capable of performing a storage function, used for storing program instructions and/or data.
Referring to fig. 10, an embodiment of the present application further provides a data storage management system 100, including: a storage system 1001 including a plurality of clusters, and the above-described data storage management apparatus 90.
A user may connect to the data storage management device through an access interface provided by a client in the terminal device, and access the storage system through the data storage management device, thereby obtaining data management services corresponding to the access interface, including but not limited to data management services such as data storage, query, and deletion.
Exemplary program product
Embodiments of the present application provide a computer-readable storage medium for storing computer program instructions for the data storage management device, which includes a program for executing the data storage management method.
The computer storage media may be any available media or data storage device that can be accessed by a computer, including but not limited to magnetic memory (e.g., floppy disks, hard disks, magnetic tape, magneto-optical disks (MOs), etc.), optical memory (e.g., CDs, DVDs, BDs, HVDs, etc.), and semiconductor memory (e.g., ROMs, EPROMs, EEPROMs, non-volatile memory (NAND FLASH), Solid State Disks (SSDs)), etc.
In some possible embodiments, the various aspects of the present application may also be implemented as a computer program product comprising program code for causing a server device to perform the steps of the data storage management method according to various exemplary embodiments of the present application described in the "exemplary methods" section above of this specification, when the computer program product is run on the server device.
The computer program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer program product for data storage management according to an embodiment of the present application may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a server device. However, the program product of the present application is not limited thereto, and in this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user computing device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device over any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., over the internet using an internet service provider).
It should be noted that although several units or sub-units of the apparatus are mentioned in the above detailed description, such division is merely exemplary and not mandatory. Indeed, according to embodiments of the application, the features and functions of two or more units described above may be embodied in one unit. Conversely, the features and functions of one unit described above may be further divided so as to be embodied by a plurality of units.
Further, while the operations of the methods of the present application are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
While the spirit and principles of the application have been described with reference to several particular embodiments, it is to be understood that the application is not limited to the disclosed embodiments, nor does the division into aspects, which is for convenience of description only, mean that the features in these aspects cannot be combined to advantage. The application is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (10)

1. A data storage management method, comprising:
receiving a storage request sent by a client corresponding to an application, wherein the storage request comprises data to be stored and an index identifier, and the application comprises at least one index identifier;
inquiring a first cluster corresponding to the index identifier in the storage request according to a preconfigured routing relation, wherein the routing relation is the corresponding relation between the index identifier and the clusters in a storage system, and the storage system comprises a plurality of clusters;
and storing the data to be stored in the first cluster.
2. The method according to claim 1, wherein after receiving the storage request sent by the client corresponding to the application, the method further comprises:
writing the data to be stored and the index identification in the storage request into a cache queue, wherein the cache queue is a queue created based on a Kafka message system;
the storing the data to be stored in the first cluster specifically includes:
and sequentially acquiring data to be stored and an index identifier from the cache queue, and storing the acquired data to be stored into a first cluster corresponding to the index identifier.
3. The method according to claim 1, wherein after receiving the storage request sent by the client corresponding to the application, the method further comprises:
writing data to be stored in the storage request into a cache queue corresponding to the index identifier in the storage request, wherein the cache queue is a queue created based on a Kafka message system;
storing the data to be stored in the first cluster, specifically comprising:
and sequentially acquiring the data to be stored in the storage request from the cache queue corresponding to the index identifier, and storing the acquired data to be stored in the first cluster.
4. The method according to claim 3, wherein writing the data to be stored in the storage request into a buffer queue corresponding to the index identifier in the storage request specifically includes:
if the type of the received storage request is an http request, obtaining the data to be stored and the index identifier in the storage request, writing the obtained data to be stored into a cache queue corresponding to the obtained index identifier in a Kafka message system according to a Kafka communication protocol, and otherwise, directly writing the data to be stored in the storage request into the cache queue corresponding to the index identifier in the storage request.
5. The method of any of claims 1 to 4, further comprising:
receiving a first query request sent by a client corresponding to an application, wherein the first query request comprises an index identifier and a query condition, and the type of the first query request is an http request;
determining a second cluster corresponding to the index identifier in the first query request according to the routing relation;
querying data meeting the query condition in the first query request from the second cluster;
and sending the data meeting the query conditions to the client side sending the first query request.
6. The method of claim 5, further comprising:
receiving a second query request sent by a client corresponding to an application through a preset query interface, wherein the second query request comprises a joint query identifier and query conditions;
finding a joint query mode corresponding to the joint query identifier from a preset joint query list, wherein the joint query mode comprises a plurality of index identifiers;
according to the plurality of index identifiers in the found joint query mode, querying data meeting the query condition in the second query request from the clusters corresponding to the plurality of index identifiers respectively;
and sending the data meeting the query conditions to the client side sending the second query request.
7. A data storage management apparatus, comprising:
the write-in gateway is used for receiving a storage request sent by a client corresponding to an application, wherein the storage request comprises data to be stored and an index identifier, and the application comprises at least one index identifier;
the message processing unit is used for inquiring a first cluster corresponding to the index identifier in the storage request according to a preconfigured routing relationship, wherein the routing relationship is the corresponding relationship between the index identifier and the clusters in a storage system, and the storage system comprises a plurality of clusters; and storing the data to be stored in the first cluster.
8. A data storage management device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 6 are implemented when the computer program is executed by the processor.
9. A data storage management system, comprising: the data storage management device of claim 8 and a storage system, the storage system comprising a plurality of clusters.
10. A computer-readable storage medium having computer program instructions stored thereon, which, when executed by a processor, implement the steps of the method of any one of claims 1 to 6.
CN201911301508.4A 2019-12-17 2019-12-17 Data storage management method, device, equipment, system and storage medium Pending CN111124299A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911301508.4A CN111124299A (en) 2019-12-17 2019-12-17 Data storage management method, device, equipment, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911301508.4A CN111124299A (en) 2019-12-17 2019-12-17 Data storage management method, device, equipment, system and storage medium

Publications (1)

Publication Number Publication Date
CN111124299A (en) 2020-05-08

Family

ID=70498312

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911301508.4A Pending CN111124299A (en) 2019-12-17 2019-12-17 Data storage management method, device, equipment, system and storage medium

Country Status (1)

Country Link
CN (1) CN111124299A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112543222A (en) * 2020-11-11 2021-03-23 苏宁云计算有限公司 Data processing method and device, computer equipment and storage medium
CN113315845A (en) * 2021-07-28 2021-08-27 阿里云计算有限公司 Data transmission method and device and distributed storage system
CN113553306A (en) * 2021-07-27 2021-10-26 重庆紫光华山智安科技有限公司 Data processing method and data storage management system
CN114237508A (en) * 2021-12-16 2022-03-25 中国农业银行股份有限公司 Data storage method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103761141A (en) * 2013-12-13 2014-04-30 北京奇虎科技有限公司 Method and device for realizing message queue
CN110083627A (en) * 2019-04-28 2019-08-02 江苏满运软件科技有限公司 Data processing method, system, computer equipment and storage medium
CN110147398A (en) * 2019-04-25 2019-08-20 北京字节跳动网络技术有限公司 A kind of data processing method, device, medium and electronic equipment
CN110162522A (en) * 2019-05-22 2019-08-23 武汉市公安局 A kind of distributed data search system and method
CN110457281A (en) * 2019-08-14 2019-11-15 北京博睿宏远数据科技股份有限公司 Data processing method, device, equipment and medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103761141A (en) * 2013-12-13 2014-04-30 北京奇虎科技有限公司 Method and device for realizing message queue
CN110147398A (en) * 2019-04-25 2019-08-20 北京字节跳动网络技术有限公司 A kind of data processing method, device, medium and electronic equipment
CN110083627A (en) * 2019-04-28 2019-08-02 江苏满运软件科技有限公司 Data processing method, system, computer equipment and storage medium
CN110162522A (en) * 2019-05-22 2019-08-23 武汉市公安局 A kind of distributed data search system and method
CN110457281A (en) * 2019-08-14 2019-11-15 北京博睿宏远数据科技股份有限公司 Data processing method, device, equipment and medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112543222A (en) * 2020-11-11 2021-03-23 苏宁云计算有限公司 Data processing method and device, computer equipment and storage medium
CN112543222B (en) * 2020-11-11 2022-07-05 苏宁云计算有限公司 Data processing method and device, computer equipment and storage medium
CN113553306A (en) * 2021-07-27 2021-10-26 重庆紫光华山智安科技有限公司 Data processing method and data storage management system
CN113315845A (en) * 2021-07-28 2021-08-27 阿里云计算有限公司 Data transmission method and device and distributed storage system
CN113315845B (en) * 2021-07-28 2022-01-04 阿里云计算有限公司 Data transmission method and device and distributed storage system
CN114237508A (en) * 2021-12-16 2022-03-25 中国农业银行股份有限公司 Data storage method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
US11128707B2 (en) Omnichannel approach to application sharing across different devices
US10257115B2 (en) Cloud-based service resource provisioning based on network characteristics
US11625281B2 (en) Serverless platform request routing
CN111614738B (en) Service access method, device, equipment and storage medium based on Kubernetes cluster
US10515058B2 (en) Unified file and object data storage
US9734026B2 (en) In-memory data store replication through remote memory sharing
CN111124299A (en) Data storage management method, device, equipment, system and storage medium
CN101930449B (en) Client, brokerage server and method for providing cloud storage
US11385930B2 (en) Automatic workflow-based device switching
US20130198472A1 (en) Performing volume expansion in storage management system
CN113783922A (en) Load balancing method, system and device
US9712612B2 (en) Method for improving mobile network performance via ad-hoc peer-to-peer request partitioning
US20140123142A1 (en) System and method for providing data analysis service in cloud environment
US10963324B2 (en) Predictive microservice systems and methods
US11546431B2 (en) Efficient and extensive function groups with multi-instance function support for cloud based processing
US20230216895A1 (en) Network-based media processing (nbmp) workflow management through 5g framework for live uplink streaming (flus) control
US10425475B2 (en) Distributed data management
US10986065B1 (en) Cell-based distributed service architecture with dynamic cell assignment
US20190129743A1 (en) Method and apparatus for managing virtual machine
US20230164210A1 (en) Asynchronous workflow and task api for cloud based processing
US10200301B1 (en) Logical control groups for distributed system resources
CN112839071B (en) Training system, training data access method and device, electronic equipment and medium
CN110347473B (en) Method and device for distributing virtual machines of virtualized network elements distributed across data centers
US10708343B2 (en) Data repository for a distributed processing environment
CN116820354B (en) Data storage method, data storage device and data storage system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination