CN115408431A - Data access method and device, electronic equipment and storage medium - Google Patents

Data access method and device, electronic equipment and storage medium

Info

Publication number
CN115408431A
Authority
CN
China
Prior art keywords
data
request
distributed database
cache
data cache
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110606750.3A
Other languages
Chinese (zh)
Inventor
郭志伟
武智晖
李晓根
刘辉
尚晶
徐海勇
陶涛
刘虹
谢帆
魏瑗珍
冯凯
何庆
陈卓
张伟芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Information Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Information Technology Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN202110606750.3A priority Critical patent/CN115408431A/en
Publication of CN115408431A publication Critical patent/CN115408431A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a data access method, a data access apparatus, an electronic device and a storage medium. The method comprises: receiving a read data request; querying a data cache in a distributed database based on the read data request, wherein the data cache is stored in the master executor of the distributed database that is currently processing the read data request; if the data cache is hit, returning the data in the data cache that corresponds to the read data request; otherwise, reading the storage subsystem of the distributed database based on the read data request and returning the read result. Because the data cache is placed in the master executor and is queried for each read data request being processed, a hit is served directly from the data cache and the storage subsystem is accessed only on a miss. On a hit, the cross-network read operation and the disk I/O operation are avoided, and with them the corresponding network overhead and disk I/O overhead.

Description

Data access method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of distributed database technologies, and in particular, to a data access method and apparatus, an electronic device, and a storage medium.
Background
With the development of emerging technologies such as the Internet, the Internet of Things, 5G and cloud computing, data volumes are growing explosively. A single-machine database is constrained by the configuration of a single physical machine, so its scalability is limited, and distributed databases have therefore become the main development direction for relational databases. A distributed database can process mixed transactional and analytical workloads, that is, act as an HTAP (Hybrid Transactional/Analytical Processing) database, and mainly addresses two categories of workload: online transaction processing (OLTP) and online analytical processing (OLAP).
Distributed databases have certain shortcomings when processing OLTP workloads. First, a bottleneck tends to form between the CPU and the disk subsystem. In an OLTP environment, a physical disk read is generally a single-block read of a file, but such reads are very frequent. If they are so frequent that the disk subsystem cannot sustain the required IOPS, serious performance problems occur. Second, OLTP requests each involve a small amount of data, and to a large extent that data has to be accessed across the network; the problem is especially pronounced in distributed database designs that truly separate computation from storage. Because the compute nodes and storage nodes are typically not on the same physical server, as in Oracle RAC, technologies such as Fibre Channel (FC) and high-end SAN storage devices are needed, which is expensive and uneconomical. Distributed databases based on a middleware proxy scheme, such as TDSQL, have the same problem: the proxy server is separate from the back-end database servers, each of which is generally a standalone MySQL instance, and data must be forwarded through the proxy, so cross-network data access cannot be avoided.
Some databases provide result set caching to alleviate part of the problems described above. The result set is cached in memory, which avoids frequent disk reads and writes and addresses the first problem, but cross-network data access still cannot be avoided. MySQL's result set caching is one example: MySQL supports a feature called "The MySQL Query Cache". Typically, a database hands off the caching of disk accesses to the operating system or the underlying storage engine and provides no caching at the SQL level; MySQL, however, provides this technique for caching result sets. When the feature is enabled and an SQL statement is sent to MySQL, the system caches the statement text together with the final query result. When MySQL receives the same SQL statement again, it returns the cached result set directly. Result-set-based caching of this kind is very limited: it occupies a large amount of memory, and any change to the query makes the cached result set unusable, so its practical value is low.
Disclosure of Invention
The invention provides a data access method, a data access apparatus, an electronic device and a storage medium, to solve the disk I/O bottleneck in processing OLTP workloads and the overhead caused by cross-network transmission in the prior art.
The invention provides a data access method, which comprises the following steps:
receiving a read data request;
querying a data cache in a distributed database based on the read data request, wherein the data cache is stored in the master executor of the distributed database that is currently processing the read data request;
if the data cache is hit, returning the data in the data cache that corresponds to the read data request;
otherwise, reading the storage subsystem of the distributed database based on the read data request, and returning the read result.
According to the data access method provided by the present invention, reading the storage subsystem of the distributed database based on the read data request, and returning a read result, specifically comprising:
reading the storage subsystem based on the data reading request to obtain a reading result;
and filling the reading result into the data cache, and returning the reading result.
According to the data access method provided by the invention, data is written to the storage subsystem of the distributed database by the following steps:
receiving a write data request;
querying a data cache in the distributed database based on the write data request, wherein the data cache is stored in the master executor of the distributed database that is currently processing the write data request;
if the data cache is hit, setting the hit entry in the data cache to an invalid state, and performing a write operation on the storage subsystem based on the write data request;
otherwise, directly performing the write operation on the storage subsystem based on the write data request.
According to the data access method provided by the invention, setting the hit entry in the data cache to an invalid state specifically includes:
setting the entry corresponding to the hit entry in the data cache of each node in the distributed database to an invalid state.
According to the data access method provided by the invention, setting the entry corresponding to the hit entry in the data cache of each node in the distributed database to an invalid state specifically includes:
sending, by the master executor currently processing the write data request, a data invalidation request to the runtime management service process of the local node;
broadcasting, by the runtime management service process of the local node, the data invalidation request to the other nodes of the distributed database;
setting, by the runtime management service process of each node of the distributed database, the entry corresponding to the hit entry in the data cache of that node to an invalid state.
According to the data access method provided by the invention, querying the data cache in the distributed database specifically includes:
calculating a hash key value of the data address to be accessed in the corresponding request;
querying the data cache in the distributed database based on the hash key value to obtain a hit result of the data cache.
According to the data access method provided by the invention, calculating the hash key value of the data address to be accessed in the corresponding request specifically includes:
if the corresponding request is a single-row read request or a single-row write request, calculating a hash key value of the single-row data address in the corresponding request;
if the corresponding request is a multi-row read request or a multi-row write request, calculating a start hash key value and a stop hash key value of the multi-row data addresses in the corresponding request.
The present invention also provides a data access apparatus, comprising:
a request receiving unit, configured to receive a read data request;
a cache query unit, configured to query a data cache in the distributed database based on the read data request, wherein the data cache is stored in the master executor of the distributed database that is currently processing the read data request;
a data returning unit, configured to return the data in the data cache that corresponds to the read data request if the data cache is hit, and otherwise to read the storage subsystem of the distributed database based on the read data request and return the read result.
The present invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of any of the above data access methods when executing the program.
The invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the data access method as described in any one of the above.
In the data access method, apparatus, electronic device and storage medium provided by the invention, the data cache is placed in the master executor of the distributed database and is queried based on the read data request currently being processed. On a hit, the corresponding data is read directly from the data cache; otherwise the storage subsystem is accessed. A hit thus avoids the cross-network read operation and the disk I/O operation along with their overhead and reduces SQL query processing latency, which improves the processing capacity of the OLTP system and yields higher system throughput and better OLTP performance.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a data access method provided by the present invention;
FIG. 2 is a schematic diagram of a distributed database architecture provided by the present invention;
FIG. 3 is a schematic diagram of a data invalidation method according to the present invention;
FIG. 4 is a flowchart illustrating a method for reading data according to the present invention;
FIG. 5 is a schematic flow chart of a data writing method according to the present invention;
FIG. 6 is a schematic structural diagram of a data access device provided in the present invention;
FIG. 7 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The problems that distributed databases mainly address fall into two categories: online transaction processing (OLTP) and online analytical processing (OLAP). OLTP denotes a highly transactional system, typically a highly available online system dominated by small transactions and small queries. In such systems, a single database often processes hundreds or thousands of transactions per second and executes thousands or even tens of thousands of Select statements per second. Typical OLTP systems include e-commerce, banking and securities systems. OLAP systems are also known as decision support systems (DSS), i.e., data warehouses. In such a system, the number of statements executed is not the evaluation criterion, because a single statement may run for a very long time and read a very large amount of data. The criterion examined is therefore usually the throughput (bandwidth) of the disk subsystem, for example how many MB/s it can sustain.
Distributed database storage has inherent disadvantages when dealing with OLTP workloads. These mainly include the overhead of data access in a distributed system and the overhead of transactional consistency processing, in addition to some problems also faced by traditional OLTP systems. First, a bottleneck tends to form between the CPU and the disk subsystem. In an OLTP environment, a physical disk read is generally a single-block read of a file, but such reads are very frequent. If they are so frequent that the disk subsystem cannot sustain the required IOPS, serious performance problems occur. Second, OLTP requests each involve a small amount of data, and to a large extent that data has to be accessed across the network; the problem is especially pronounced in distributed database designs that truly separate computation from storage. Because the compute nodes and storage nodes are typically not on the same physical server, as in Oracle RAC, technologies such as Fibre Channel (FC) and high-end SAN storage devices are needed, which is expensive and uneconomical. Distributed databases based on a middleware proxy scheme, such as TDSQL, have the same problem: the proxy server is separate from the back-end database servers, each of which is generally a standalone MySQL instance, and data must be forwarded through the proxy, so cross-network data access cannot be avoided.
Some databases provide result set caching to alleviate part of the problems described above. The result set is cached in memory, which avoids frequent disk reads and writes and addresses the first problem, but cross-network data access still cannot be avoided. MySQL's result set caching is one example: MySQL supports a feature called "The MySQL Query Cache". Typically, a database hands off the caching of disk accesses to the operating system or the underlying storage engine and provides no caching at the SQL level; MySQL, however, provides this technique for caching result sets. When the feature is enabled and an SQL statement is sent to MySQL, the system caches the statement text together with the final query result. When MySQL receives the same SQL statement again, it returns the cached result set directly. Result-set-based caching of this kind is very limited: it occupies a large amount of memory, and any change to the query makes the cached result set unusable, so its practical value is low.
Therefore, an embodiment of the invention provides a data access method applied to a distributed database. FIG. 1 is a schematic flowchart of the data access method according to an embodiment of the present invention. As shown in FIG. 1, the method includes:
step 110, receiving a read data request;
step 120, querying a data cache in the distributed database based on the read data request, wherein the data cache is stored in the master executor of the distributed database that is currently processing the read data request;
step 130, if the data cache is hit, returning the data in the data cache that corresponds to the read data request; otherwise, reading the storage subsystem of the distributed database based on the read data request and returning the read result.
Specifically, a user may connect to the distributed database through a JDBC/ODBC interface and submit a read data request. The access layer of the distributed database receives the user's read data request, a master executor of the distributed database processes it, and a compiler generates the execution plan corresponding to the request. FIG. 2 is a schematic diagram of the architecture of a distributed database according to an embodiment of the present invention; it shows the overall architecture of the Trafodion distributed database. The client application accesses Trafodion through JDBC/ODBC, and the connection is handled by Trafodion's access layer. The access layer assigns a master executor to each client connection; the master executor executes all data access requests of that user connection and returns the results.
The data cache in the distributed database is then queried based on the read data request. The data cache is stored in the master executor that is currently processing the read data request. That is, when the master executor starts processing a read data request, it first queries the data cache held within itself to determine whether the cache is hit. In a traditional database, the SQL layer caches only execution plans and data caching is usually left to the storage engine; in a distributed deployment, the separation of computation and storage reduces the performance of such a storage-side cache. Here, the data cache may be placed at the interface through which the master executor accesses the storage subsystem.
If the data cache is hit, the data that the read data request wants to read is stored in the data cache, so the Scan operator of the master executor can read the corresponding data directly from the cache. There is no need to access the storage subsystems on other nodes across the network, and disk I/O is also avoided. If the data cache is not hit, the requested data is not in the data cache, so the Scan operator of the master executor initiates an RPC call to the storage subsystem of the distributed database, such as HBase, based on the read data request, and reads the result returned by the storage subsystem over RPC. Thus, when the data to be read is held in the data cache inside the master executor, it can be read directly from there. This short-circuit operation avoids fetching data from other processes or other nodes; that is, it avoids cross-process reads, cross-network reads and disk I/O together with their overhead, and reduces SQL query processing latency, which improves the processing capacity of the OLTP system and yields higher system throughput and better OLTP performance.
In the method provided by this embodiment of the invention, the data cache is placed in the master executor of the distributed database and is queried based on the read data request currently being processed. On a hit, the corresponding data is read directly from the data cache; otherwise the storage subsystem is accessed. A hit thus avoids the cross-network read operation and the disk I/O operation along with their overhead and reduces SQL query processing latency, which improves the processing capacity of the OLTP system and yields higher system throughput and better OLTP performance.
Based on the above embodiment, reading the storage subsystem of the distributed database based on the read data request and returning the read result specifically includes:
reading the storage subsystem based on the read data request to obtain a read result;
filling the read result into the data cache, and returning the read result.
Specifically, when the read data request misses the data cache, the Scan operator may be used to access the storage subsystem and obtain the read result corresponding to the request. After the read result is obtained, it may be filled into the data cache before being returned to the user, so that reading the same data again incurs no disk I/O overhead or network overhead.
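The read path described above (query the cache, fall back to the storage subsystem on a miss, fill the cache, return the result) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the class `MasterExecutorCache`, the callback `read_from_storage` and the counter `storage_reads` are hypothetical names, and a plain dictionary stands in for the storage subsystem that the Scan operator would reach over RPC.

```python
from typing import Callable, Dict, Optional


class MasterExecutorCache:
    """Hypothetical in-process data cache held by one master executor."""

    def __init__(self, read_from_storage: Callable[[str], Optional[bytes]]) -> None:
        # read_from_storage is an assumed callback standing in for the Scan
        # operator's RPC call to the storage subsystem (for example HBase).
        self._read_from_storage = read_from_storage
        self._entries: Dict[str, bytes] = {}
        self.storage_reads = 0  # counts how often the storage subsystem is read

    def read(self, row_key: str) -> Optional[bytes]:
        # Cache hit: answer from local memory, with no RPC and no disk I/O.
        if row_key in self._entries:
            return self._entries[row_key]
        # Cache miss: read the storage subsystem, then fill the cache.
        self.storage_reads += 1
        value = self._read_from_storage(row_key)
        if value is not None:
            self._entries[row_key] = value
        return value


# Usage: a dictionary plays the role of the storage subsystem.
storage = {"row-1": b"alice"}
cache = MasterExecutorCache(storage.get)
first = cache.read("row-1")   # miss: reads storage and fills the cache
second = cache.read("row-1")  # hit: served from the cache
```

In this sketch the second read of the same row never reaches the storage callback, which is the effect the embodiment attributes to filling the cache after a miss.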
Based on any one of the above embodiments, data is written to the storage subsystem of the distributed database by the following steps:
receiving a write data request;
querying the data cache in the distributed database based on the write data request, wherein the data cache is stored in the master executor of the distributed database that is currently processing the write data request;
if the data cache is hit, setting the hit entry in the data cache to an invalid state, and performing a write operation on the storage subsystem based on the write data request;
otherwise, directly performing the write operation on the storage subsystem based on the write data request.
Specifically, a user may connect to the distributed database through a JDBC/ODBC interface and submit a write data request. A write data request may be a request to add, change or delete data. After the access layer of the distributed database receives the user's write data request, a master executor of the distributed database processes it, and a compiler generates the execution plan corresponding to the request.
The data cache in the distributed database is then queried based on the write data request. The data cache is stored in the master executor that is currently processing the write data request. That is, when the master executor starts processing a write data request, it first queries the data cache held in its own memory to determine whether the cache is hit.
If the data cache is hit, the data that the write data request wants to add, delete or change is stored in the data cache. Because that data is about to change, the copy currently held in the data cache can no longer be used, so the hit entry in the data cache is set to an invalid state before the write operation is performed on the storage subsystem. If the data cache is missed, the write operation can be performed on the storage subsystem directly. The write operation can be carried out by the IDU operator of the master executor. Because the data cache in the master executor is a read-only cache whose contents are invalidated, not updated, by write operations, the design and development of the distributed cache are greatly simplified, the code is simpler to implement, and performance improves.
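The write path above can be sketched as follows: the cache is never updated by a write, only invalidated, and the write itself always goes to the storage subsystem. This is a minimal single-node illustration; `ReadOnlyDataCache`, `handle_write` and the dictionary used as the storage subsystem are hypothetical names introduced for the example, and `value=None` is an assumed convention for a delete request.

```python
from typing import Dict, Optional


class ReadOnlyDataCache:
    """Hypothetical read-only cache: writes invalidate entries, never update them."""

    def __init__(self) -> None:
        self._entries: Dict[str, bytes] = {}
        self._valid: Dict[str, bool] = {}

    def fill(self, key: str, value: bytes) -> None:
        # Called on the read path after a miss.
        self._entries[key] = value
        self._valid[key] = True

    def lookup(self, key: str) -> Optional[bytes]:
        # Entries in the invalid state behave like misses.
        return self._entries[key] if self._valid.get(key, False) else None

    def invalidate(self, key: str) -> bool:
        # Returns True when a valid entry was hit and is now invalid.
        if self._valid.get(key, False):
            self._valid[key] = False
            return True
        return False


def handle_write(cache: ReadOnlyDataCache, storage: Dict[str, bytes],
                 key: str, value: Optional[bytes]) -> bool:
    """Invalidate on hit, then write to storage. Returns whether the cache was hit."""
    hit = cache.invalidate(key)
    if value is None:
        storage.pop(key, None)   # assumed convention: None models a delete
    else:
        storage[key] = value     # add or change
    return hit


# Usage: the cached copy of row-1 becomes unusable once row-1 is rewritten.
storage = {"row-1": b"old"}
cache = ReadOnlyDataCache()
cache.fill("row-1", storage["row-1"])
hit = handle_write(cache, storage, "row-1", b"new")
```

After the write, a lookup of `row-1` misses, so the next read would go back to the storage subsystem and pick up the new value.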
Based on any of the above embodiments, setting the hit entry in the data cache to an invalid state specifically includes:
setting the entry corresponding to the hit entry in the data cache of each node in the distributed database to an invalid state.
Specifically, in the distributed database a master executor runs on each of a plurality of nodes, so the data caches are distributed across those nodes, and the hit entry may also exist in the data caches on other nodes. The entry corresponding to the hit entry is therefore set to an invalid state in the data cache on every node of the distributed database, to avoid read/write errors.
Based on any of the above embodiments, setting the entry corresponding to the hit entry in the data cache on each node in the distributed database to an invalid state specifically includes:
sending, by the master executor currently processing the write data request, a data invalidation request to the runtime management service process of the local node;
broadcasting, by the runtime management service process of the local node, the data invalidation request to the other nodes of the distributed database;
setting, by the runtime management service process of each node of the distributed database, the entry corresponding to the hit entry in the data cache on that node to an invalid state.
Specifically, setting the entry corresponding to the hit entry to an invalid state in the data cache on each node may be done with the help of the Runtime Management Service (RMS) of the distributed database. FIG. 3 is a schematic diagram of the data invalidation method according to an embodiment of the present invention. As shown in FIG. 3, after write path processing finds a hit entry for the write data request in the data cache, the master executor currently processing the write data request sends a data invalidation request to the runtime management service process (RMS process) of the local node. After receiving the request, the local RMS process broadcasts it to all RMS processes in the cluster, that is, to the other nodes of the distributed database. The RMS process on each node then marks the cache entry corresponding to the hit entry in that node's data cache as invalid.
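The three-step invalidation flow above (master executor to local RMS process, local RMS process to all other nodes, each RMS process marks its node's entry invalid) can be simulated in a single process. The sketch below is an illustration under assumptions: `RmsProcess`, `handle_local_request`, `mark_invalid` and `build_cluster` are hypothetical names, direct method calls stand in for inter-node messaging, and for brevity each simulated RMS process holds its node's cache directly, whereas in the embodiment the cache lives in the master executor.

```python
from typing import Dict, List, Optional, Set


class RmsProcess:
    """Hypothetical stand-in for one node's runtime management service process."""

    def __init__(self, node_id: int) -> None:
        self.node_id = node_id
        self.peers: List["RmsProcess"] = []
        self.cache: Dict[str, bytes] = {}   # simplified: this node's data cache
        self.invalid_keys: Set[str] = set()

    def mark_invalid(self, key: str) -> None:
        # Only entries actually present on this node need to be marked.
        if key in self.cache:
            self.invalid_keys.add(key)

    def handle_local_request(self, key: str) -> None:
        # Request from the local master executor: invalidate locally,
        # then broadcast to the RMS processes on all other nodes.
        self.mark_invalid(key)
        for peer in self.peers:
            peer.mark_invalid(key)

    def lookup(self, key: str) -> Optional[bytes]:
        # Entries in the invalid state behave like misses.
        if key in self.invalid_keys:
            return None
        return self.cache.get(key)


def build_cluster(size: int) -> List[RmsProcess]:
    nodes = [RmsProcess(i) for i in range(size)]
    for node in nodes:
        node.peers = [p for p in nodes if p is not node]
    return nodes


# Usage: row-1 is cached on nodes 0 and 2; a write handled on node 0
# invalidates both copies.
nodes = build_cluster(3)
nodes[0].cache["row-1"] = b"v1"
nodes[2].cache["row-1"] = b"v1"
nodes[0].handle_local_request("row-1")
```

A real deployment would also have to decide how invalid entries are refilled and how a broadcast that fails to reach a node is handled; the text above does not specify either, so the sketch leaves them out.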
Based on any of the above embodiments, querying the data cache in the distributed database specifically includes:
calculating a hash key value of the data address to be accessed in the corresponding request;
querying the data cache in the distributed database based on the hash key value to obtain a hit result of the data cache.
Specifically, the hash key value of the data address to be accessed in the read data request or write data request may be calculated with a hash algorithm. The data cache in the distributed database is then queried with the hash key value as the query condition to obtain the hit result of the data cache.
Based on any of the above embodiments, calculating the hash key value of the data address to be accessed in the corresponding request specifically includes:
if the corresponding request is a single-row read request or a single-row write request, calculating a hash key value of the single-row data address in the corresponding request;
if the corresponding request is a multi-row read request or a multi-row write request, calculating a start hash key value and a stop hash key value of the multi-row data addresses in the corresponding request.
Specifically, a read data request or write data request may be a single-row or a multi-row read/write, and the two cases are treated differently when the hash key value is calculated. For a single-row read or write request, the hash key value of the single-row data address in the request is calculated. For a multi-row read or write request, the start hash key value and the stop hash key value of the multi-row data addresses in the request are calculated and used together as the query condition for the data cache.
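The distinction between single-row and multi-row requests can be sketched as a function that returns one hash key for a single row and a (start, stop) pair for a row range. The text does not name a hash algorithm or an address format, so the choices here are assumptions made for illustration: SHA-256 truncated to 16 hexadecimal characters, and data addresses written as strings such as `orders/row-0001`. The names `hash_address` and `cache_key` are hypothetical.

```python
import hashlib
from typing import Optional, Tuple, Union

# A cache key is one hash for a single row, or a (start, stop) pair for a range.
CacheKey = Union[str, Tuple[str, str]]


def hash_address(address: str) -> str:
    # Assumed hash: SHA-256 truncated to 16 hex characters.
    return hashlib.sha256(address.encode("utf-8")).hexdigest()[:16]


def cache_key(start_address: str, stop_address: Optional[str] = None) -> CacheKey:
    """Build the query condition for the data cache.

    Single-row request: the hash key of the row's data address.
    Multi-row request: the start and stop hash keys of the address range.
    """
    if stop_address is None:
        return hash_address(start_address)
    return (hash_address(start_address), hash_address(stop_address))


# Usage: one key for a single-row request, a pair for a multi-row request.
single = cache_key("orders/row-0001")
multi = cache_key("orders/row-0001", "orders/row-0100")
```

Note that hashing the two endpoints does not preserve row order, so in this sketch a multi-row entry is only hit by a request with exactly the same start and stop addresses.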
Based on any of the above embodiments, FIG. 4 and FIG. 5 are schematic flow diagrams of the data reading method and the data writing method provided by an embodiment of the present invention, respectively. As shown in FIG. 4, after a read data request is received, it is determined whether the request is a single-row or a multi-row read request. For a single-row read request, the hash key value of the single-row data address in the request is calculated; otherwise, the start and stop hash key values of the multi-row data addresses in the request are calculated. The data cache inside the master executor is then queried. If the cache is hit, the requested data is returned directly; otherwise, the data is requested from the storage subsystem HBase, the cache is filled, and the requested data is returned. As shown in FIG. 5, after a write data request is received, it is determined whether the request is a single-row or a multi-row write request. For a single-row write request, the hash key value of the single-row data address in the request is calculated; otherwise, the start and stop hash key values of the multi-row data addresses in the request are calculated. The data cache inside the master executor is then queried. If the cache is hit, the data invalidation process is executed and the write request is then sent to the storage engine; otherwise, the write request is sent to the storage engine directly.
Based on any of the foregoing embodiments, Fig. 6 is a schematic structural diagram of a data access apparatus according to an embodiment of the present invention. As shown in Fig. 6, the apparatus includes a request receiving unit 610, a cache query unit 620 and a data returning unit 630.
The request receiving unit 610 is configured to receive a read data request;
the cache query unit 620 is configured to query a data cache in the distributed database based on the read data request; the data cache is stored in the master executor that currently processes the read data request in the distributed database;
the data returning unit 630 is configured to, if the data cache is hit, return the data in the data cache that corresponds to the read data request; otherwise, to read the storage subsystem of the distributed database based on the read data request and return the read result.
In the device provided by this embodiment of the invention, a data cache is arranged in the master executor of the distributed database, and the data cache is queried based on the read data request currently being processed. When the data cache is hit, the corresponding data is read directly from the data cache; the storage subsystem is accessed only on a miss. On a cache hit, this avoids the cross-network read operation and the disk I/O operation together with their network and disk I/O overhead, which reduces SQL query processing latency, improves the processing capacity of the OLTP system, and yields higher system throughput and better OLTP performance.
Based on any of the embodiments described above, reading the storage subsystem of the distributed database based on the read data request and returning a read result specifically includes:
reading the storage subsystem based on the read data request to obtain a read result;
and filling the read result into the data cache, and returning the read result.
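As a non-limiting illustration, the read path (query the data cache, read the storage subsystem on a miss, fill the cache, return the result) might be sketched as follows. The class name `ReadThroughCache`, the in-memory dictionary, and the `read_storage` callable standing in for the storage subsystem are assumptions of this sketch, not part of the embodiment.

```python
from typing import Callable, Dict, Hashable, Tuple


class ReadThroughCache:
    """Data cache held inside one master executor (hypothetical structure)."""

    def __init__(self, read_storage: Callable[[Hashable], bytes]) -> None:
        self._entries: Dict[Hashable, bytes] = {}
        # Stand-in for the storage subsystem; a real system would go
        # across the network and to disk here.
        self._read_storage = read_storage

    def read(self, key: Hashable) -> Tuple[bytes, bool]:
        """Return (data, hit). On a miss, read storage and fill the cache."""
        if key in self._entries:
            return self._entries[key], True
        data = self._read_storage(key)
        self._entries[key] = data  # fill the cache before returning
        return data, False
```

A second read of the same key is served from the dictionary without calling `read_storage` again, which is the saving the embodiment attributes to a cache hit.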
Based on any one of the above embodiments, the data in the storage subsystem of the distributed database is written based on the following steps:
receiving a write data request;
querying a data cache in the distributed database based on the write data request; the data cache is stored in the master executor that currently processes the write data request in the distributed database;
if the data cache is hit, setting the hit entry in the data cache to an invalid state, and performing a write operation on the storage subsystem based on the write data request;
otherwise, directly performing the write operation on the storage subsystem based on the write data request.
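As a non-limiting illustration, the write steps above might be sketched as follows. The `Entry` type with a `valid` flag, the `handle_write` name, and the use of plain dictionaries for the data cache and the storage subsystem are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Hashable


@dataclass
class Entry:
    """Hypothetical cache entry: cached bytes plus a validity flag."""
    data: bytes
    valid: bool = True


def handle_write(cache: Dict[Hashable, Entry],
                 storage: Dict[Hashable, bytes],
                 key: Hashable,
                 data: bytes) -> bool:
    """Invalidate a hit entry, then write to storage. Returns True on a hit."""
    entry = cache.get(key)
    hit = entry is not None and entry.valid
    if hit:
        entry.valid = False  # the hit entry is set to the invalid state
    storage[key] = data      # the write always goes to the storage subsystem
    return hit
```

Note that the write does not update the cached data in place; the entry is only marked invalid, so a later read would have to go back to the storage subsystem.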
Based on any of the above embodiments, setting the hit entry in the data cache to an invalid state specifically includes:
setting the entry corresponding to the hit entry in the data cache on each node of the distributed database to an invalid state.
Based on any of the above embodiments, setting the entry corresponding to the hit entry in the data cache on each node of the distributed database to an invalid state specifically includes:
sending, by the master executor that currently processes the write data request, a data invalidation request to the runtime management service process of the local node;
broadcasting, by the runtime management service process of the local node, the data invalidation request to the other nodes of the distributed database;
and setting, by the runtime management service process of each node of the distributed database, the entry corresponding to the hit entry in the data cache on that node to an invalid state.
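As a non-limiting illustration, the three invalidation steps above might be sketched in a single process as follows. The class and method names are hypothetical, each node's cache is reduced to a map from hash key to validity flag, and direct method calls stand in for the inter-node messages that a real deployment would send over the network.

```python
from typing import Dict, Hashable, List


class RuntimeManagementService:
    """Per-node service process that invalidates entries in the local cache."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.cache_valid: Dict[Hashable, bool] = {}  # hash key -> entry valid?
        self.peers: List["RuntimeManagementService"] = []

    def invalidate_local(self, key: Hashable) -> None:
        # Step 3: mark the matching entry invalid if this node caches it.
        if key in self.cache_valid:
            self.cache_valid[key] = False

    def handle_invalidation_request(self, key: Hashable) -> None:
        # Step 2: broadcast to the other nodes, then apply locally.
        for peer in self.peers:
            peer.invalidate_local(key)
        self.invalidate_local(key)


class MasterExecutor:
    """Executor processing the write request on the local node."""

    def __init__(self, local_service: RuntimeManagementService) -> None:
        self.local_service = local_service

    def on_write_hit(self, key: Hashable) -> None:
        # Step 1: hand the invalidation request to the local node's service.
        self.local_service.handle_invalidation_request(key)
```

Nodes that do not cache the key are left unchanged, so the broadcast only affects nodes that actually hold a matching entry.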
Based on any of the above embodiments, querying the data cache in the distributed database specifically includes:
calculating the hash key value of the data address to be accessed in the corresponding request;
and querying the data cache in the distributed database based on the hash key value to obtain a hit result of the data cache.
Based on any of the above embodiments, calculating the hash key value of the data address to be accessed in the corresponding request specifically includes:
if the corresponding request is a single-row read request or a single-row write request, calculating the hash key value of the single-row data address in the corresponding request;
if the corresponding request is a multi-row read request or a multi-row write request, calculating a starting hash key value and an ending hash key value of the multi-row data addresses in the corresponding request.
Fig. 7 illustrates the physical structure of an electronic device. As shown in Fig. 7, the electronic device may include a processor 710, a communications interface 720, a memory 730, and a communication bus 740, wherein the processor 710, the communications interface 720, and the memory 730 communicate with each other via the communication bus 740. The processor 710 may call logic instructions in the memory 730 to perform a data access method comprising: receiving a read data request; querying a data cache in a distributed database based on the read data request, the data cache being stored in the master executor that currently processes the read data request in the distributed database; if the data cache is hit, returning the data in the data cache that corresponds to the read data request; otherwise, reading the storage subsystem of the distributed database based on the read data request, and returning a read result.
In addition, the logic instructions in the memory 730 may be implemented in the form of software functional units and, when sold or used as an independent product, stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the data access method provided by the above method embodiments, the method comprising: receiving a read data request; querying a data cache in a distributed database based on the read data request, the data cache being stored in the master executor that currently processes the read data request in the distributed database; if the data cache is hit, returning the data in the data cache that corresponds to the read data request; otherwise, reading the storage subsystem of the distributed database based on the read data request, and returning a read result.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the data access method provided by the above embodiments, the method comprising: receiving a read data request; querying a data cache in a distributed database based on the read data request, the data cache being stored in the master executor that currently processes the read data request in the distributed database; if the data cache is hit, returning the data in the data cache that corresponds to the read data request; otherwise, reading the storage subsystem of the distributed database based on the read data request, and returning a read result.
The above-described apparatus embodiments are merely illustrative. The units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, and can of course also be implemented by hardware. Based on this understanding, the above technical solutions in essence, or the parts of them that contribute over the prior art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods of the various embodiments or of some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A data access method, comprising:
receiving a read data request;
querying a data cache in a distributed database based on the read data request; the data cache is stored in the master executor that currently processes the read data request in the distributed database;
if the data cache is hit, returning the data in the data cache that corresponds to the read data request;
otherwise, reading the storage subsystem of the distributed database based on the read data request, and returning a read result.
2. The data access method according to claim 1, wherein reading the storage subsystem of the distributed database based on the read data request and returning a read result specifically comprises:
reading the storage subsystem based on the read data request to obtain a read result;
and filling the read result into the data cache, and returning the read result.
3. The data access method according to claim 1, wherein the data in the storage subsystem of the distributed database is written based on the following steps:
receiving a write data request;
querying a data cache in the distributed database based on the write data request; the data cache is stored in the master executor that currently processes the write data request in the distributed database;
if the data cache is hit, setting the hit entry in the data cache to an invalid state, and performing a write operation on the storage subsystem based on the write data request;
otherwise, directly performing the write operation on the storage subsystem based on the write data request.
4. The data access method according to claim 3, wherein setting the hit entry in the data cache to an invalid state specifically comprises:
setting the entry corresponding to the hit entry in the data cache on each node of the distributed database to an invalid state.
5. The data access method according to claim 4, wherein setting the entry corresponding to the hit entry in the data cache on each node of the distributed database to an invalid state specifically comprises:
sending, by the master executor that currently processes the write data request, a data invalidation request to the runtime management service process of the local node;
broadcasting, by the runtime management service process of the local node, the data invalidation request to the other nodes of the distributed database;
and setting, by the runtime management service process of each node of the distributed database, the entry corresponding to the hit entry in the data cache on that node to an invalid state.
6. The data access method according to any one of claims 1 to 5, wherein querying the data cache in the distributed database specifically comprises:
calculating the hash key value of the data address to be accessed in the corresponding request;
and querying the data cache in the distributed database based on the hash key value to obtain a hit result of the data cache.
7. The data access method according to claim 6, wherein calculating the hash key value of the data address to be accessed in the corresponding request specifically comprises:
if the corresponding request is a single-row read request or a single-row write request, calculating the hash key value of the single-row data address in the corresponding request;
and if the corresponding request is a multi-row read request or a multi-row write request, calculating a starting hash key value and an ending hash key value of the multi-row data addresses in the corresponding request.
8. A data access device, comprising:
a request receiving unit, configured to receive a read data request;
a cache query unit, configured to query a data cache in a distributed database based on the read data request; the data cache is stored in the master executor that currently processes the read data request in the distributed database;
a data returning unit, configured to, if the data cache is hit, return the data in the data cache that corresponds to the read data request; otherwise, to read the storage subsystem of the distributed database based on the read data request and return a read result.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the data access method according to any of claims 1 to 7 are implemented when the program is executed by the processor.
10. A non-transitory computer readable storage medium, having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the steps of the data access method according to any one of claims 1 to 7.
CN202110606750.3A 2021-05-27 2021-05-27 Data access method and device, electronic equipment and storage medium Pending CN115408431A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110606750.3A CN115408431A (en) 2021-05-27 2021-05-27 Data access method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115408431A true CN115408431A (en) 2022-11-29

Family

ID=84156544

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110606750.3A Pending CN115408431A (en) 2021-05-27 2021-05-27 Data access method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115408431A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115905306A (en) * 2022-12-26 2023-04-04 北京滴普科技有限公司 Local caching method, equipment and medium for OLAP analysis database

Similar Documents

Publication Publication Date Title
US10803047B2 (en) Accessing data entities
US10303646B2 (en) Memory sharing for working data using RDMA
US11175832B2 (en) Thread groups for pluggable database connection consolidation in NUMA environment
US8548945B2 (en) Database caching utilizing asynchronous log-based replication
US10990533B2 (en) Data caching using local and remote memory
US7818309B2 (en) Method for managing data access requests utilizing storage meta data processing
EP3134821B1 (en) System and method for parallel optimization of database query using cluster cache
US7765196B2 (en) Method and apparatus for web cache using database triggers
US10509807B2 (en) Localized data affinity system and hybrid method
US10990571B1 (en) Online reordering of database table columns
US20130290636A1 (en) Managing memory
US11288237B2 (en) Distributed file system with thin arbiter node
CN115408431A (en) Data access method and device, electronic equipment and storage medium
US11341163B1 (en) Multi-level replication filtering for a distributed database
CN113626463A (en) Web performance optimization method under high concurrent access
US11822482B2 (en) Maintaining an active track data structure to determine active tracks in cache to process
Yu et al. Effect of system dynamics on coupling architectures for transaction processing
CN117914944A (en) Distributed three-level caching method and device based on Internet of things
CN117724994A (en) Data operation method, server and CXL controller
CN115374131A (en) Request processing method and system, storage medium and electronic equipment
CN116932563A (en) Database processing method, apparatus, device, storage medium, and program product
CN117827968A (en) Metadata storage separation cloud distributed data warehouse data sharing method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination