WO2022247316A1 - Storage object processing system, request processing method, gateway and storage medium - Google Patents

Storage object processing system, request processing method, gateway and storage medium

Info

Publication number
WO2022247316A1
Authority
WO
WIPO (PCT)
Prior art keywords
request
data
query
cluster
gateway
Prior art date
Application number
PCT/CN2022/072292
Other languages
English (en)
Chinese (zh)
Inventor
杨吴同
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2022247316A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Definitions

  • This application relates to the field of data processing and blockchain technology, in particular to a storage object processing system, a request processing method, a gateway and a storage medium.
  • An object storage cluster is a cluster system that provides functions such as data query and storage.
  • Files in the object storage cluster are stored as objects in the object storage.
  • Conventionally, if a user needs to retrieve object content in the object storage cluster, the user must first download the object from the object storage system by its object name and load it into a big data query engine. The big data query engine then parses and retrieves the object locally to complete the query operation.
  • The present application provides a storage object processing system, a request processing method, a gateway and a storage medium to solve the technical problem in the prior art that objects need to be downloaded from an object storage cluster before they can be queried, which is costly and performs poorly.
  • In a first aspect, a request processing method based on a storage object processing system is provided. The storage object processing system includes a client, a gateway, a Presto cluster, a Hive data warehouse tool, and an object storage cluster; the Presto cluster is connected to the Hive data warehouse tool, the Hive data warehouse tool is connected to the object storage cluster, and the method includes:
  • the gateway receives an execution request sent by the client, where the execution request is a request for performing a corresponding operation on the object storage cluster;
  • the gateway determines a request type of the execution request;
  • when the request type of the execution request is a data query request, the gateway forwards the execution request to the Presto cluster;
  • the Presto cluster reads corresponding query data from the corresponding object of the object storage cluster according to the execution request;
  • the Presto cluster feeds back the query data to the client according to the size of the query data.
  • In a second aspect, a storage object processing system is provided, which includes a client, a gateway, a Presto cluster, a Hive data warehouse tool and an object storage cluster; the Presto cluster is connected to the Hive data warehouse tool, and the Hive data warehouse tool is connected to the object storage cluster:
  • the gateway is configured to receive an execution request sent by the client and determine a request type of the execution request, where the execution request is a request for performing a corresponding operation on the object storage cluster;
  • the gateway is further configured to forward the execution request to the Presto cluster;
  • the Presto cluster is configured to read corresponding query data from corresponding objects in the object storage cluster according to the execution request, and to feed back the query data to the client in a corresponding manner according to the size of the query data.
  • In a third aspect, a gateway is provided, which includes:
  • a receiving module configured to receive an execution request sent by the client, where the execution request is used to perform a corresponding operation on the object storage cluster;
  • a processing module configured to determine the request type of the execution request;
  • a sending module configured to forward the execution request to the Presto cluster when the type of the execution request is a data query request, so that the Presto cluster reads the corresponding query data from the corresponding object of the object storage cluster according to the execution request and feeds back the query data to the client in a corresponding manner according to the size of the query data.
  • one or more readable storage media storing computer-readable instructions are provided; when the computer-readable instructions are executed by one or more processors, the one or more processors perform the following steps:
  • the execution request is forwarded to the Presto cluster, so that the Presto cluster reads the corresponding query data from the corresponding object of the object storage cluster according to the execution request and feeds back the query data to the client according to the size of the query data.
  • a gateway is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the steps of the request processing method based on the storage object processing system described in the foregoing first aspect.
  • The connection relationship between the Presto cluster, the Hive data warehouse tool and the object storage cluster reduces query time and improves query efficiency. There is no need to download objects separately from the object storage cluster in order to query them, which reduces bandwidth consumption as well as query time. Moreover, there is no need to write separate business-specific queries, since queries can be run through the Presto cluster, which is more convenient. The overall solution has lower cost, better cost-effectiveness and stronger performance.
  • FIG. 1 is a schematic diagram of a system structure of a storage object processing system in an embodiment of the present application;
  • FIG. 2 is an interaction flowchart of a request processing method based on a storage object processing system in an embodiment of the present application;
  • FIG. 3 is a schematic diagram of an authentication process of a request processing method based on a storage object processing system in an embodiment of the present application;
  • FIG. 4 is a specific flowchart of step S30 in FIG. 2;
  • FIG. 5 is a schematic diagram of the bill generation process of the request processing method based on the storage object processing system in an embodiment;
  • FIG. 6 is a schematic diagram of the process of feeding back query data in the request processing method based on the storage object processing system in an embodiment;
  • FIG. 7 is a schematic structural diagram of a gateway in an embodiment of the present application;
  • FIG. 8 is a schematic structural diagram of a computer device in an embodiment of the present application.
  • The request processing method based on the storage object processing system provided in this application can be applied to the storage object processing system shown in FIG. 1.
  • The storage object processing system provided by this application includes a client, a gateway, a Presto cluster, a Hive data warehouse tool and an object storage cluster. The object storage cluster is a cluster used to provide an object storage service (Object Storage Service, OBS). The Presto cluster is connected to the Hive data warehouse tool, and the Hive data warehouse tool is connected to the object storage cluster through its own connector.
  • the above configuration mainly involves some Presto service parameters, such as the port of the Presto cluster, memory, whether it is a coordinator, etc.
  • the Hive data warehouse tool is a Hadoop-based data warehouse tool for data extraction, transformation, and loading. This is a mechanism that can store, query, and analyze large-scale data stored in Hadoop.
  • the Hive data warehouse tool can map structured data files into a database table and provide SQL query functions, which can convert SQL statements into tasks for execution. Details about the Hive data warehouse tool and Hadoop will not be described in this application.
  • The Presto cluster can automatically connect to the Hive data warehouse tool, and the Hive data warehouse tool connects to the object storage cluster through the connector. The more detailed configuration process and cluster establishment process will not be described here.
  • the client can communicate with the gateway through the network.
  • Clients can be, but are not limited to, various personal computers, laptops, smartphones, tablets, and portable wearable devices.
  • Presto clusters and object storage clusters can be implemented using independent servers or server clusters composed of multiple servers, which are not limited in this application.
  • the gateway can receive the execution request from the client and forward it according to the type of execution request.
  • the gateway can route the execution request sent by the client to the Presto cluster or the Hive data warehouse tool to complete the specific request purpose. The description will be made in detail below.
  • a request processing method is provided, and the method is applied to the gateway in FIG. 1 as an example for illustration, including the following steps:
  • the gateway receives the execution request sent by the client, where the execution request is a request for performing a corresponding operation on the object storage cluster.
  • The client can send an execution request to the gateway to perform the corresponding operation.
  • The gateway can receive the execution request sent by the client.
  • The execution request may specifically be a data query request for performing a query operation, a table creation request for creating a table, a database creation request for creating a database, a data upload request for uploading a data file, etc., which are not limited in this application.
  • When the type of the execution request is a data query request, it may specifically be an SQL request.
  • S20 The gateway determines the request type of the execution request.
  • the Presto cluster reads the corresponding query data from the corresponding object of the object storage cluster according to the execution request.
  • S50 The Presto cluster feeds back the query data to the client according to the size of the query data.
  • After the gateway receives the execution request sent by the client, it needs to determine the type of the execution request. In a specific implementation, the gateway can learn the request type by parsing the request header information of the execution request. It should be noted that in this application the gateway has corresponding control logic for execution requests of different request types. In this embodiment, when the type of the execution request is a data query request, the execution request is used to query object data from the object storage cluster.
  • In that case the gateway forwards the execution request to the Presto cluster. The Presto cluster parses the execution request to determine the scope of the query and then, according to that scope, uses the Hive data warehouse tool to read the data content of the corresponding object in the object storage cluster into its own memory as query data. The query data is returned to the client in a corresponding manner according to its size. Since the Hive data warehouse tool connects the Presto cluster and the object storage cluster through the connector, the Presto cluster can directly query the object data in the object storage cluster.
  • In this embodiment, a request processing method based on an object storage processing system is provided. The system includes a Presto cluster, a Hive data warehouse tool, a client, an object storage cluster, and a gateway. The Presto cluster is connected to Hive, the Hive data warehouse tool is connected to the object storage cluster through the Hive connector, and a gateway is arranged on the object storage cluster.
  • When data needs to be queried, the client sends a query request such as an SQL statement to the object storage cluster. Because the gateway is arranged on the object storage cluster, the query request is intercepted by the gateway first, and the gateway then routes it to the Presto cluster. The Presto cluster parses the SQL query request and, through the Hive data warehouse tool and its connector, directly queries the data content of the object from the object storage cluster, instead of the object being downloaded separately and placed in a big data query engine for querying.
  • The technical effect brought by this application is as follows. Using the connection relationship between the Presto cluster, the Hive data warehouse tool and the object storage cluster reduces query time and improves query efficiency. There is no need to download objects in order to query them, which reduces bandwidth consumption as well as query time. Moreover, there is no need to write separate business-specific queries, since queries can be run through the Presto cluster, which is more convenient. The overall solution has low cost, good cost-effectiveness and stronger performance.
  • The request header of the execution request includes a request signature.
  • The request signature is generated from the request URL, the request time, and the user key.
  • S101 The gateway parses the execution request to obtain the request header, and extracts the request signature from the request header.
  • S102 The gateway verifies the request signature.
  • If the request signature passes verification, the gateway triggers the step of determining the request type of the execution request.
  • When the client sends an execution request, it generates a request signature from parameters such as the request URL, the request time, and the user key. The user key is the user's unique identity information; it can be the user's ID card information or unique identification information obtained at initial registration, which is not limited here.
  • The request signature can be generated with a preset signature algorithm from parameters such as the request URL, request time, and user key. The details are not limited here and will not be described in detail.
  • The request signature is provided to the gateway for authentication processing.
  • After the gateway receives the execution request, it performs authentication on the execution request sent by the client. Specifically, the gateway parses the execution request to obtain the request header, extracts the request signature from the request header, and verifies the request signature. The verification may use various preset authentication algorithms, which are not limited in this application. For example, an authentication center can be used to verify the request signature. For another example, the gateway itself generates a request authentication signature from parameters such as the Uniform Resource Locator (URL) of the execution request, the request time, and the user key, and compares it with the request signature carried in the request header.
  • If the two signatures match, verification passes and the gateway triggers the step of determining the request type of the execution request, that is, forwarding of the execution request is allowed. If the two signatures do not match, the execution request sent by the client may have been tampered with in transit; verification fails and the gateway refuses to process the execution request.
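The signature check described above can be sketched as follows. This is only an illustration under stated assumptions: the application does not name a signature algorithm, so HMAC-SHA256 is swapped in here, and the header names `X-Signature` and `X-Request-Time` as well as the message layout are hypothetical.

```python
import hashlib
import hmac


def sign_request(url: str, request_time: str, user_key: str) -> str:
    """Build a request signature from the request URL, request time and user key."""
    # Assumption: the signed message is "URL\ntime"; the source fixes no format.
    message = f"{url}\n{request_time}".encode("utf-8")
    return hmac.new(user_key.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify_request(headers: dict, url: str, user_key: str) -> bool:
    """Recompute the signature at the gateway and compare it with the header value."""
    received = headers.get("X-Signature", "")          # hypothetical header name
    request_time = headers.get("X-Request-Time", "")   # hypothetical header name
    expected = sign_request(url, request_time, user_key)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(received, expected)
```

A request whose URL is altered after signing no longer matches the recomputed signature, which corresponds to the tampering case in which the gateway refuses to process the request.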
  • step S30 that is, the gateway forwards the execution request to the Presto cluster, which specifically includes the following steps:
  • S31 The gateway parses the execution request to determine the requesting user.
  • the gateway acquires a preset user operation resource list, and the preset user operation resource list records data resources of the object storage cluster that can be queried by different users.
  • the gateway retrieves a preset user-operated resource list to determine the data resources that the requesting user can query.
  • S34 The gateway judges whether the query data to be queried by executing the request belongs to data resources that can be queried by the requesting user.
  • After the gateway receives the execution request, it parses the execution request to determine the requesting user and obtains the preset user operation resource list, which records the data resources of the object storage cluster that different users can query.
  • For example, the preset user operation resource list records that user 1 can query data resource 1, user 2 can query data resource 2, ..., and user N can query data resource N.
  • Data resources 1 to N refer to the data ranges or data objects that can be operated on.
  • When the gateway receives user 1's execution request for querying data resource 2, the gateway can determine, by searching the preset user operation resource list, that data resource 2 does not belong to the data resources that user 1 can query. The gateway then refuses to forward the execution request and feeds back a rejection prompt to user 1's client to remind user 1 that he or she has no right to perform the query.
  • When the gateway receives an execution request from user 2 for querying data resource 2, the gateway can determine, by searching the preset user operation resource list, that data resource 2 belongs to the data resources that user 2 can query. The gateway then triggers the step of forwarding the execution request to the Presto cluster, so that user 2 can successfully query data resource 2.
  • the isolation of user resources can be realized, and user operations can be restricted.
  • the resources that users can operate are isolated from each other, which has higher security.
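The permission check against the preset user operation resource list can be sketched as a simple lookup. The user names, resource names and return strings below are hypothetical placeholders, not values from the application.

```python
from typing import Dict, Set

# Hypothetical preset user operation resource list: user -> queryable data resources.
USER_RESOURCES: Dict[str, Set[str]] = {
    "user1": {"resource1"},
    "user2": {"resource2"},
}


def may_query(user: str, resource: str, acl: Dict[str, Set[str]] = USER_RESOURCES) -> bool:
    """Return True if the resource is among those the user is allowed to query."""
    return resource in acl.get(user, set())


def handle_query(user: str, resource: str) -> str:
    """Reject the request, or let it proceed to the forwarding step."""
    if not may_query(user, resource):
        return "rejected: no permission to query " + resource
    return "forwarded to Presto cluster"
```

With this table, `handle_query("user1", "resource2")` is rejected while `handle_query("user2", "resource2")` proceeds, mirroring the two cases in the text.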
  • step S30 that is, after the gateway forwards the execution request to the Presto cluster, as shown in Figure 5, the method further includes the following steps:
  • S201 The gateway pulls traffic of different buckets from the object storage cluster.
  • S202 The gateway filters out the traffic of the Presto cluster from the traffic of different buckets.
  • the gateway determines the query traffic of the requesting user within the preset bill calculation period according to the visit volume of the Presto cluster.
  • S204 The gateway generates bill information of the requesting user within a preset bill calculation period according to the query traffic.
  • S205 The gateway feeds back the billing information to the client corresponding to the requesting user.
  • a bucket is a container for storing objects in an object storage cluster.
  • Object storage clusters provide flat storage based on buckets and objects. All objects in a bucket are at the same logical level, eliminating the multi-level tree directory structure in general file systems. Therefore, after the gateway pulls the traffic of different buckets from the object storage cluster, it can filter out the traffic of the Presto cluster from the traffic of different buckets. For example, for a SQL query request, the Presto cluster will actively pull data files from the object storage cluster, and the traffic generated during this process is the basis for query charges.
  • the gateway can determine the query traffic of the requesting user within the preset bill calculation period according to the visit volume of the Presto cluster.
  • The preset bill calculation period may be a period agreed between the user and the storage provider, such as one month or one quarter, which is not limited here. Taking one month as an example: after filtering out the visit volume of the Presto cluster within the month, the gateway determines the query traffic of requesting user 1 within that period, generates the bill information of requesting user 1 for the month based on the query traffic, and feeds back that bill information to the client corresponding to requesting user 1.
  • the gateway will filter out the query traffic of each user from the traffic of different buckets of the object storage cluster, so as to facilitate billing to each query user and facilitate business promotion.
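The billing steps above (pull bucket traffic, keep only Presto traffic, sum per user over the billing period, price it) can be sketched as below. The record fields, the `"presto"` source tag and the per-GB price model are assumptions made for illustration; the application does not specify how traffic is recorded or priced.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable


@dataclass(frozen=True)
class TrafficRecord:
    bucket: str
    source: str      # who generated the traffic, e.g. "presto" or "client" (assumed tag)
    user: str        # requesting user the traffic is attributed to
    day: date
    bytes_read: int


def bill_for_period(records: Iterable[TrafficRecord], start: date, end: date,
                    price_per_gb: float) -> Dict[str, float]:
    """Sum Presto-generated traffic per user within [start, end] and price it."""
    usage: Dict[str, int] = defaultdict(int)
    for r in records:
        # Keep only traffic caused by the Presto cluster inside the billing period.
        if r.source == "presto" and start <= r.day <= end:
            usage[r.user] += r.bytes_read
    return {user: round(b / 1024**3 * price_per_gb, 4) for user, b in usage.items()}
```

Traffic from direct client downloads or from outside the period is ignored, so only query traffic contributes to the bill.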
  • the Presto cluster is configured to select a corresponding method to return to the client according to the size of the query data.
  • the gateway is configured into two different implementation processes, which are described below.
  • step S30 that is, after the gateway forwards the execution request to the Presto cluster, the following steps are further included:
  • S60 The gateway determines the amount of data to be queried for executing the request.
  • S70 The gateway judges the relationship between the query data volume required to execute the request and the preset data volume.
  • The amount of data to be queried may be very large.
  • For example, when the data that the user needs to query is stored in a large table with a data volume at the TB level, retrieving and querying the table may take a relatively long time. In such a case the gateway feeds back a query result waiting indication to the user.
  • The gateway therefore first needs to determine the amount of data to be queried by the execution request. It should be noted that the gateway can learn the query location from the query address of the execution request, and thereby learn the amount of data to be queried.
  • If the amount of data to be queried is greater than the preset data volume, the gateway feeds back a query result waiting indication to the client immediately after forwarding the execution request, so that the user is informed and does not wait for an indefinite time.
  • the aforementioned preset data volume may be pre-configured, and the specific data volume is not limited in this application.
  • S90 The gateway waits for the query result write notification of the Presto cluster, which indicates that the Presto cluster has read the corresponding query data from the object storage cluster according to the execution request and has written the query data to a preset location of the object storage cluster in the form of a non-object file.
  • After the gateway receives the query result write notification, the gateway feeds back a download instruction to the client, so that the client downloads the query data from the preset location of the object storage cluster according to the download instruction.
  • On the Presto cluster side, after receiving the execution request forwarded by the gateway, the Presto cluster reads the corresponding query data from the object storage cluster according to the execution request. Because the query data exceeds the preset data volume, the query takes somewhat longer. After all the query data has been obtained, the Presto cluster does not feed it back to the gateway directly. Instead, it writes the query data to the preset location of the object storage cluster in the form of a non-object file, generates a query result write notification containing the preset location information, and feeds this notification back to the gateway.
  • On the gateway side, after feeding back the query result waiting indication to the client, the gateway waits for the query result write notification of the Presto cluster. Receiving the notification means that the Presto cluster has read the corresponding query data from the object storage cluster according to the execution request and has written it to the preset location of the object storage cluster as a non-object file. The gateway then feeds back a download instruction containing the preset location information to the client, so that the client can download the query data from the preset location of the object storage cluster according to the download instruction.
  • the preset location refers to the storage location of the query data, which is carried in the query result writing notification and download instruction.
  • The query data can be written to the preset location of the object storage cluster in the form of a comma-separated values (CSV) file, or in another file form that the client can download directly and use without parsing.
  • A comma-separated values file stores tabular data (numbers and text) in plain text.
  • Plain text means that the file is a sequence of characters and contains no data that must be interpreted like binary numbers.
  • A CSV file consists of any number of records separated by some kind of newline; each record consists of fields separated by other characters or strings, most commonly commas or tabs. Typically, all records have the exact same sequence of fields.
  • In this way, the client can directly download usable data without having to parse stored objects, and the query of the data content is achieved.
  • In this embodiment, the Presto cluster does not feed back the query data to the gateway directly; an asynchronous query method is provided instead.
  • After the gateway routes the execution request to the Presto cluster, it immediately feeds back the query result waiting indication to the client. The gateway then waits for the Presto cluster to produce the query result and write the query result file (the query data) to the object storage cluster, after which the gateway notifies the client to download the query data directly from the object storage cluster. This effectively avoids the problem of the user waiting for an indefinite time.
  • step 50 that is, after the gateway determines the amount of data required to perform the query, the method further includes the following steps:
  • S120 The gateway feeds back the query data sent by the Presto cluster to the client.
  • On the Presto cluster side, after receiving the execution request forwarded by the gateway, the Presto cluster reads the corresponding query data from the object storage cluster according to the execution request. Because the query data is less than or equal to the preset data volume, the query is relatively fast. After all the query data has been obtained, the Presto cluster feeds it back to the gateway directly, without writing it to the preset location of the object storage cluster. On the gateway side, the gateway waits for the Presto cluster to feed back the query data for this query request. Receiving the query data means that the Presto cluster has read the corresponding query data from the object storage cluster according to the execution request, and the gateway then feeds the query data back to the client directly.
  • When the query data is small, the query data can be fed back to the client directly through the gateway, which is simpler and faster than the prior art.
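The two feedback paths described above (direct return for small results, CSV file plus download instruction for large ones) can be sketched as a single size check. This is a simplified illustration: the threshold is counted in rows rather than bytes, a plain dictionary stands in for the object storage cluster, and the function and key names are hypothetical.

```python
import csv
import io
from typing import Dict, Sequence

PRESET_ROWS = 1000  # hypothetical threshold; the source leaves the preset volume open


def deliver(rows: Sequence[Sequence], object_store: Dict[str, str],
            location: str, threshold: int = PRESET_ROWS) -> dict:
    """Return small results inline; write large ones as CSV and return a download hint."""
    if len(rows) <= threshold:
        return {"mode": "inline", "data": list(rows)}
    # Large result: serialise to CSV text and place it at the preset location.
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    object_store[location] = buf.getvalue()
    return {"mode": "download", "location": location}
```

In the large-result case the returned dictionary plays the role of the download instruction that carries the preset location.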
  • When the gateway determines that the execution request is a request to create a table or create a database, the gateway routes the execution request to the Hive data warehouse tool, and the Hive data warehouse tool performs the table creation or database creation operation.
  • When the gateway determines that the execution request is a request to upload a data file to the object storage cluster, the gateway routes the execution request to the object storage cluster, and the object storage cluster writes the data file into the bucket in the form of an object.
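The type-based routing performed by the gateway can be sketched as a lookup table keyed by request type. The header name `X-Request-Type`, the type strings and the backend labels are hypothetical; the application only states that the type is obtained by parsing the request header.

```python
from enum import Enum


class RequestType(Enum):
    DATA_QUERY = "data_query"
    CREATE_TABLE = "create_table"
    CREATE_DATABASE = "create_database"
    UPLOAD = "upload"


# Backend that handles each request type, following the routing described in the text.
ROUTES = {
    RequestType.DATA_QUERY: "presto",
    RequestType.CREATE_TABLE: "hive",
    RequestType.CREATE_DATABASE: "hive",
    RequestType.UPLOAD: "object_storage",
}


def route(headers: dict) -> str:
    """Read the request type from the header and return the target backend."""
    raw = headers.get("X-Request-Type", "")  # hypothetical header name
    try:
        return ROUTES[RequestType(raw)]
    except ValueError:
        raise ValueError(f"unsupported request type: {raw!r}") from None
```

Keeping the mapping in one table makes it easy to add control logic for further request types without touching the dispatch function.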
  • a query record table can also be added to the gateway, and the gateway records the query and other operation request records of different users into the query record table according to the received execution request, so as to facilitate the management of the user's data operation records.
  • the query record table and billing information mentioned in the above embodiments can be stored in the blockchain network.
  • the database in this embodiment is stored in the block chain network, and is used to store the data used and generated in the semantic recall method based on the graph neural network, such as the image to be processed, the quality score of each text region, and the target of the image to be processed related data such as Quality Score.
  • the blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • A blockchain is essentially a decentralized database: a series of data blocks linked to one another by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer. Deploying the database on the blockchain can improve the security of data storage.
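The idea of blocks linked by cryptographic hashes can be illustrated with a minimal hash chain over query or billing records. This is a toy sketch only: it has no consensus mechanism, networking or signatures, and the block layout is an assumption rather than anything specified in the application.

```python
import hashlib
import json


def make_block(prev_hash: str, records: list) -> dict:
    """Create a block whose hash covers the previous hash and its own records."""
    body = json.dumps({"prev": prev_hash, "records": records}, sort_keys=True)
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return {"prev": prev_hash, "records": records, "hash": digest}


def chain_is_valid(chain: list) -> bool:
    """Check that every block links to its predecessor and matches its own hash."""
    prev = "0" * 64  # conventional all-zero predecessor for the first block
    for block in chain:
        expected = make_block(block["prev"], block["records"])["hash"]
        if block["prev"] != prev or block["hash"] != expected:
            return False
        prev = block["hash"]
    return True
```

Changing any stored record after the fact changes the recomputed hash, so the tampering is detected when the chain is validated.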
  • when the execution request is a request such as creating a table, creating a database, or uploading a data file, the operations such as authentication and calculation mentioned in the above method embodiments are also applicable, and will not be described again here.
  • a storage object processing system includes a client, a gateway, a Presto cluster, a Hive data warehouse tool, and an object storage cluster.
  • the Presto cluster is connected to the Hive data warehouse tool, and the Hive data warehouse tool is connected to the object storage cluster:
  • the gateway is configured to receive an execution request sent by the client and determine a request type of the execution request, where the execution request is a request for performing a corresponding operation on the object storage cluster;
  • the gateway is further configured to forward the execution request to the Presto cluster;
  • the Presto cluster is configured to read corresponding query data from corresponding objects in the object storage cluster according to the execution request, and to feed back the query data to the client in a corresponding manner according to the size of the query data.
  • the gateway is further configured to: when the amount of query data required by the execution request is less than or equal to the preset data amount, wait to receive the query data sent by the Presto cluster and, after receiving it, feed the query data back to the client.
  • the non-object format file is a comma-separated value file.
  • the gateway is further configured to: after forwarding the execution request to the Presto cluster, pull the traffic of different buckets from the object storage cluster; filter the traffic of the different buckets to obtain the access volume of the Presto cluster; determine, according to the access volume of the Presto cluster, the query traffic of the requesting user in the preset billing cycle; generate the billing information of the requesting user for the preset billing cycle according to the query traffic; and feed the billing information back to the client corresponding to the requesting user, the billing information being stored in the blockchain network.
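The billing steps can be sketched as a filter followed by an aggregation. Everything concrete here is an assumption made for illustration: the shape of a traffic entry (`accessor`, `user`, `time`, `bytes`), the way a Presto access is attributed to a requesting user, and the per-GiB price are not specified in the source.

```python
from collections import defaultdict
from datetime import datetime


def query_traffic_by_user(bucket_traffic, cycle_start, cycle_end, presto_accessor="presto"):
    """Sum the bytes read by the Presto cluster per user within the billing cycle.

    Each entry is a dict with hypothetical keys: accessor, user, time, bytes.
    """
    totals = defaultdict(int)
    for entry in bucket_traffic:
        if entry["accessor"] != presto_accessor:
            continue  # keep only accesses made by the Presto cluster
        if not (cycle_start <= entry["time"] < cycle_end):
            continue  # outside the preset billing cycle
        totals[entry["user"]] += entry["bytes"]
    return dict(totals)


def bill(traffic_bytes, price_per_gib):
    """Compute a charge from query traffic (pricing model is an assumption)."""
    return round(traffic_bytes / 2**30 * price_per_gib, 2)
```

The half-open interval `[cycle_start, cycle_end)` avoids counting an access in two adjacent billing cycles.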
  • the request header of the execution request includes a request signature
  • the request signature is generated from the request uniform resource locator (URL), the request time, and the user key
  • the gateway is also used for:
  • the gateway is further specifically configured to: parse the execution request to determine the requesting user; obtain a preset user operation resource list, which records the data resources of the object storage cluster that can be queried by different users; search the preset user operation resource list to determine the data resources that the requesting user can query; when the query data requested by the execution request belongs to the data resources that the requesting user can query, trigger the step of forwarding the execution request to the Presto cluster; and when the query data requested by the execution request does not belong to the data resources that the requesting user can query, refuse to forward the execution request and feed back a query rejection prompt to the client.
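The permission check against the preset user operation resource list might look like the following sketch. The list is modeled as a plain mapping from user to a set of resource names, and the return labels `forward_to_presto` and `reject_query` are hypothetical; the source does not define the list's format or the gateway's internal API.

```python
def can_query(user, requested_resources, user_resource_list):
    """Return True only if every requested resource is in the user's allowed set."""
    allowed = user_resource_list.get(user, set())
    return set(requested_resources) <= allowed


def handle_query(user, requested_resources, user_resource_list):
    """Decide whether the gateway forwards the request or rejects it."""
    if can_query(user, requested_resources, user_resource_list):
        return "forward_to_presto"
    return "reject_query"
```

An unknown user maps to an empty allowed set, so the sketch denies by default.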
  • a new object storage processing system is provided.
  • the connection relationship between the Presto cluster, the Hive data warehouse tool and the object storage cluster can be used to reduce query time and improve query efficiency.
  • a gateway is provided, which is applied to a storage object processing system; the storage object processing system also includes a client, a Presto cluster, a Hive data warehouse tool, and an object storage cluster, the Presto cluster is connected to the Hive data warehouse tool, the Hive data warehouse tool is connected to the object storage cluster, and the gateway corresponds one-to-one to the request processing method in the foregoing embodiments.
  • the gateway includes a receiving module 101, a processing module 102, and a sending module 103. The detailed description of each functional module is as follows:
  • the receiving module 101 is configured to receive an execution request sent by the client, where the execution request is used to perform a corresponding operation on the object storage cluster;
  • a processing module 102 configured to determine the request type of the execution request
  • the sending module 103 is configured to forward the execution request to the Presto cluster when the type of the execution request is a data query request, so that the Presto cluster reads the corresponding query data from the corresponding object of the object storage cluster according to the execution request and feeds the query data back to the client in a manner that depends on the size of the query data.
  • the processing module 102 is further configured to determine the amount of query data required by the execution request;
  • the sending module 103 is further configured to feed back a query result waiting indication to the client when the amount of query data required by the execution request is greater than the preset data amount;
  • the receiving module 101 is further configured to wait for a query result write notification from the Presto cluster, where the query result write notification indicates that the Presto cluster has read the corresponding query data from the object storage cluster according to the execution request and has written the query data to a preset location of the object storage cluster as a non-object format file;
  • the sending module 103 is further configured to feed back a download instruction to the client after the query result write notification is received, so that the client can download the query data from the preset location of the object storage cluster according to the download instruction.
  • the sending module 103 is further configured to: when the amount of query data required by the execution request is less than or equal to the preset data amount, wait to receive the query data sent by the Presto cluster and, after receiving it, feed the query data back to the client.
  • the non-object format file is a comma-separated value file.
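The two delivery paths described for the modules can be summarized in a small decision sketch. The threshold value and the step labels are hypothetical; the source only states that a preset data amount separates direct feedback from the write-then-download path.

```python
# Hypothetical threshold; the source does not give a value for the preset data amount.
PRESET_DATA_AMOUNT = 10 * 1024 * 1024  # bytes


def plan_delivery(query_data_amount, preset=PRESET_DATA_AMOUNT):
    """Return the ordered gateway steps for a query of the given data amount."""
    if query_data_amount <= preset:
        # Small result: the gateway relays the data from the Presto cluster.
        return ["wait_for_query_data", "feed_back_query_data"]
    # Large result: the Presto cluster writes a CSV file to the preset
    # location, and the client downloads it from the object storage cluster.
    return [
        "feed_back_waiting_indication",
        "wait_for_write_notification",
        "feed_back_download_instruction",
    ]
```

In this sketch a result exactly equal to the preset amount takes the direct path, matching the "less than or equal to" wording in the description.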
  • the processing module 102 is further configured to:
  • determine, according to the access volume of the Presto cluster, the query traffic of the requesting user in the preset billing cycle
  • the sending module 103 is further configured to feed back the billing information to the client corresponding to the requesting user.
  • the request header of the execution request includes a request signature
  • the request signature is generated from the request URL, the request time, and the user key
  • the processing module 102 is further configured to:
  • the gateway refuses to process the execution request.
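A request signature built from the request URL, the request time, and the user key could be generated and checked as sketched below. The choice of HMAC-SHA256, the message layout, and the function names are assumptions; the source names the three inputs but not the algorithm.

```python
import hashlib
import hmac


def sign_request(url, request_time, user_key):
    """Derive a signature from the URL and request time, keyed by the user key.

    HMAC-SHA256 is an assumed algorithm, not one stated in the source.
    """
    message = f"{url}\n{request_time}".encode("utf-8")
    return hmac.new(user_key.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify_request(url, request_time, signature, user_key):
    """Recompute the signature on the gateway side and compare in constant time."""
    expected = sign_request(url, request_time, user_key)
    return hmac.compare_digest(expected, signature)
```

If verification fails, the gateway in this sketch would refuse to process the request; including the request time in the signed message also lets a gateway reject stale requests, although the source does not describe such a check.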
  • the processing module 102 is further configured to:
  • the preset user operation resource list records data resources of the object storage cluster that can be queried by different users;
  • when the query data to be queried by the execution request belongs to the data resources that the requesting user can query, the gateway triggers the step of forwarding the execution request to the Presto cluster;
  • the gateway refuses to forward the execution request
  • the sending module 103 is further configured to feed back a query rejection prompt to the client when the query data to be queried by the execution request does not belong to the data resources that the requesting user can query.
  • Each module in the above-mentioned gateway can be fully or partially realized by software, hardware, or a combination thereof.
  • the above-mentioned modules can be embedded in or independent of the processor in the computer device in the form of hardware, and can also be stored in the memory of the computer device in the form of software, so that the processor can invoke and execute the corresponding operations of the above-mentioned modules.
  • a computer device is provided.
  • the computer device may be a gateway, and its internal structure may be as shown in FIG. 8 .
  • the gateway includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the gateway provides computing and control capabilities.
  • the memory of the gateway includes a non-volatile storage medium or a volatile storage medium, and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions and a database.
  • the internal memory provides an environment for the execution of the operating system and computer readable instructions in the non-volatile storage medium.
  • the database of the gateway is used to temporarily store some data or tables involved in this application.
  • the network interface of the gateway is used to communicate with external Presto clusters and user terminals through network connections.
  • a computer device is also correspondingly provided; the computer device may be a user terminal, which implements the user-terminal side of the request processing method based on the storage object processing system
  • a Presto cluster and a Hive data warehouse tool are also provided correspondingly.
  • the Presto cluster and the Hive data warehouse tool are used to implement the Presto cluster,
  • a computer device including a memory, a processor, and computer-readable instructions stored on the memory and operable on the processor.
  • when the processor executes the computer-readable instructions, the following steps are implemented:
  • the execution request is forwarded to the Presto cluster, so that the Presto cluster reads the corresponding query data from the corresponding object of the object storage cluster according to the execution request and feeds the query data back to the client in a manner that depends on the size of the query data.
  • One or more readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
  • the execution request is forwarded to the Presto cluster, so that the Presto cluster reads the corresponding query data from the corresponding object of the object storage cluster according to the execution request and feeds the query data back to the client according to the size of the query data.
  • when the computer-readable instructions are executed by one or more processors, the one or more processors further perform the following steps:
  • the query result write notification indicates that the Presto cluster has read the corresponding query data from the object storage cluster according to the execution request and has written the query data to a preset location of the object storage cluster as a non-object format file;
  • when the computer-readable instructions are executed by one or more processors, the one or more processors further perform the following steps:
  • when the computer-readable instructions are executed by one or more processors, the one or more processors further perform the following steps:
  • determine, according to the access volume of the Presto cluster, the query traffic of the requesting user in the preset billing cycle
  • the request header of the execution request includes a request signature
  • the request signature is generated from the request uniform resource locator (URL)
  • when the computer-readable instructions are executed by one or more processors, the one or more processors further perform the following steps:
  • An embodiment of the present application also provides a computer-readable storage medium storing computer-readable instructions which, when executed by a processor, implement the functions or steps on the side of the client, the Presto cluster, and so on; the details are not described here.
  • the computer-readable instructions can be stored in a non-volatile computer
  • the computer-readable storage medium may be non-volatile or volatile
  • the computer-readable instructions may include the processes of the embodiments of the above-mentioned methods when executed.
  • any references to memory, storage, database or other media used in the various embodiments provided in the present application may include non-volatile and/or volatile memory.
  • Nonvolatile memory can include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application relates to the fields of data processing technology, blockchain technology, and the like, and discloses a request processing method based on a storage object processing system. The storage object processing system comprises a user terminal, a gateway, a Presto cluster, a Hive data warehouse tool, and an object storage cluster; the Presto cluster is connected to the Hive data warehouse tool, and the Hive data warehouse tool is connected to the object storage cluster. The method comprises the following steps: the gateway receives an execution request sent by the user terminal, the execution request being a request to perform a corresponding operation on the object storage cluster; the gateway determines a request type of the execution request; when the type of the execution request is a data query request, the gateway forwards the execution request to the Presto cluster; and the Presto cluster reads corresponding query data from a corresponding object of the object storage cluster according to the execution request and provides the query data to the user terminal according to the size of the query data.
PCT/CN2022/072292 2021-05-28 2022-01-17 Storage object processing system, request processing method, gateway, and storage medium WO2022247316A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110589727.8 2021-05-28
CN202110589727.8A CN113204589A (zh) 2021-05-28 2021-05-28 Storage object processing system, request processing method, gateway, and storage medium

Publications (1)

Publication Number Publication Date
WO2022247316A1 true WO2022247316A1 (fr) 2022-12-01

Family

ID=77023648

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/072292 WO2022247316A1 (fr) 2021-05-28 2022-01-17 Storage object processing system, request processing method, gateway, and storage medium

Country Status (2)

Country Link
CN (1) CN113204589A (fr)
WO (1) WO2022247316A1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
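CN113204589A (zh) * 2021-05-28 2021-08-03 平安科技(深圳)有限公司 Storage object processing system, request processing method, gateway, and storage medium
CN115378958A (zh) * 2022-06-29 2022-11-22 马上消费金融股份有限公司 Data processing method, system, electronic device, and computer-readable storage medium
CN115729477A (zh) * 2023-01-09 2023-03-03 苏州浪潮智能科技有限公司 Distributed storage IO path data writing and reading method, apparatus, and device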

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105787119A (zh) * 2016-03-25 2016-07-20 盛趣信息技术(上海)有限公司 Big data processing method and system based on a hybrid engine
CN107491553A (zh) * 2017-08-31 2017-12-19 武汉光谷信息技术股份有限公司 Data mining method and system
US20180196850A1 (en) * 2017-01-11 2018-07-12 Facebook, Inc. Systems and methods for optimizing queries
CN109033123A (zh) * 2018-05-31 2018-12-18 康键信息技术(深圳)有限公司 Big-data-based query method and apparatus, computer device, and storage medium
CN112749190A (zh) * 2019-10-31 2021-05-04 中国移动通信集团重庆有限公司 Data query method and apparatus, computing device, and computer storage medium
CN113204589A (zh) * 2021-05-28 2021-08-03 平安科技(深圳)有限公司 Storage object processing system, request processing method, gateway, and storage medium


Also Published As

Publication number Publication date
CN113204589A (zh) 2021-08-03

Similar Documents

Publication Publication Date Title
WO2022247316A1 (fr) Storage object processing system, request processing method, gateway, and storage medium
US20200301887A1 (en) Sync as a service for cloud-based applications
US10187463B2 (en) Using a shared data store for peer discovery
US10515058B2 (en) Unified file and object data storage
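CN103036956B (zh) Distributed configurable mass data archiving system and implementation method
CN111736775B (zh) Multi-source storage method and apparatus, computer system, and storage medium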
US9910895B2 (en) Push subscriptions
US10498777B2 (en) Real-time push notifications for cloud-based applications
WO2020168692A1 (fr) Mass data sharing method, open sharing platform, and electronic device
CN114586011B (zh) Inserting an owner-specified data processing pipeline into the input/output path of an object storage service
CN108959385B (zh) Database deployment method and apparatus, computer device, and storage medium
WO2018058998A1 (fr) Data loading method, terminal, and computing cluster
US20230289782A1 (en) Smart contract-based data processing
US10169348B2 (en) Using a file path to determine file locality for applications
WO2022218227A1 (fr) Blockchain-based deposit method and apparatus, and electronic device
US20120078946A1 (en) Systems and methods for monitoring files in cloud-based networks
US20230102617A1 (en) Repeat transaction verification method, apparatus, and device, and medium
WO2017092384A1 (fr) Distributed storage method and device for a clustered database
US8930518B2 (en) Processing of write requests in application server clusters
WO2023103341A1 (fr) Blockchain-based smart contract invocation method, apparatus, and device
CN111563083A (zh) Report data query method, apparatus, and system
CN111585897A (zh) Request routing management method and system, computer system, and readable storage medium
WO2024098862A1 (fr) Blockchain-based data processing method and apparatus, device, and medium
CN117539962B (zh) Data processing method and apparatus, computer device, and storage medium
EP4390720A1 (fr) Blockchain-based data processing method and apparatus, device, and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22810054

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22810054

Country of ref document: EP

Kind code of ref document: A1