WO2019161908A1 - Dynamic determination of consistency levels for distributed database transactions - Google Patents

Dynamic determination of consistency levels for distributed database transactions

Info

Publication number
WO2019161908A1
WO2019161908A1 PCT/EP2018/054505 EP2018054505W WO2019161908A1 WO 2019161908 A1 WO2019161908 A1 WO 2019161908A1 EP 2018054505 W EP2018054505 W EP 2018054505W WO 2019161908 A1 WO2019161908 A1 WO 2019161908A1
Authority
WO
WIPO (PCT)
Prior art keywords
access
data element
transaction
consistency level
consistency
Prior art date
Application number
PCT/EP2018/054505
Other languages
French (fr)
Inventor
Joerg AELKEN
Arturo Martin De Nicolas
Peter Woerndle
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ) filed Critical Telefonaktiebolaget Lm Ericsson (Publ)
Priority to PCT/EP2018/054505 priority Critical patent/WO2019161908A1/en
Publication of WO2019161908A1 publication Critical patent/WO2019161908A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Definitions

  • The present application relates to a method for determining a consistency level for access transactions of a data element stored in a distributed database comprising a plurality of database instances. Furthermore, the corresponding entity configured to determine the consistency level is provided. Additionally, a method for accessing a data element stored in the distributed database by a client is provided, together with the corresponding client. In addition, a system comprising the entity configured to determine the consistency level and the client is provided. The application further relates to a computer program comprising program code and a carrier comprising the computer program.
  • Distributed databases are data storage systems which are formed by clusters of (geo-) distributed database instances.
  • the database instances can be distributed within or across multiple data centers.
  • the individual database instances may store data persistently on their local accessible storage media and may synchronize their state and content over the network. All instances of such a distributed database are meant to provide a consistent view on the data stored in the distributed database eventually, dependent on the processing speed of the instance hosts, the Read/Write speed of the used storage media and the speed of the data network connecting the instances. Therefore, all transactions towards a distributed database (usually Read or Write data) are executed with a certain consistency level present in a request for an access to a data element stored in the distributed database.
  • the transaction consistency level reflects a grade of how many and which database instances need to be involved and confirm the consistent state for this transaction.
  • the higher the consistency level the more instances are involved and need to be synchronized, hence the higher the consistency level the higher the latency for the transaction.
  • Modern IT and telecommunication applications, such as Virtualized Network Functions (VNFs), are increasingly designed to be stateless.
  • Stateless in this context means that the application business logic does not hold any persistent state information any longer and instead stores this data in external data storage systems.
  • One solution for redundant and resilient data storage systems are distributed databases.
  • the consistency level for each database transaction needs to be specified. Since usually these client application operations are time critical, the latency for database transactions needs to be optimized and therefore the optimal consistency level for each database transaction needs to be specified.
  • the optimal consistency level may differ between different database transactions or type of database transactions.
  • the problem with existing solutions is that the consistency level for each distributed database transaction needs to be specified during design time of the client application, based on the analysis of the client application business logic, the type of data to be stored and predicted data access patterns. For example, it may not be known at design time if the application will be connected to a database that is geographically distributed or a database that has several instances in the same data center or only a single instance. Consequently, non-optimal consistency levels for the distributed database transactions might be specified which may lead to performance degradation of the client applications, if a too high consistency level is specified, or data corruption and malfunctioning of the business logic, if a too low consistency level is specified.
  • a method for determining a consistency level for access transactions of a data element stored in a distributed database comprising a plurality of database instances. The method is carried out at an entity determining the consistency level and the data element is stored in at least some of the plurality of database instances and is accessed by a plurality of clients.
  • the entity receives information about access transactions carried out on the data element in the distributed database by the plurality of clients including receiving different access parameters used by the plurality of clients when accessing the data element. Based on the different access parameters used by the plurality of clients an access pattern is determined for the data element.
  • a consistency level is determined for the access transactions of the data element based on the determined access pattern for the data element and based on preconfigured configuration data which relates the access patterns to the consistency levels. Furthermore, the determined consistency level for the access transaction of the data element is transmitted to the plurality of clients for a future use by the client when accessing the data element.
  • A dynamic determination of the optimal consistency level is provided by monitoring and analyzing the access traffic to a data element stored in the distributed database. Based on this analysis of the access pattern and on preconfigured configuration data, an updated consistency level can be determined for the data element, and the different clients accessing the data element can be informed about the new consistency level. The individual clients do not know how many clients access a data element or how many of them carry out a Read or Write operation on it. By determining the access pattern used by the different clients, an optimum consistency level can be determined.
  • the corresponding entity configured to determine the consistency level for the access transaction of a data element stored in the distributed database comprising the plurality of database instances
  • the entity comprises at least one processing unit and a memory containing instructions executable by the at least one processing unit.
  • the entity is operative to function as discussed above or as discussed in further detail below.
  • an entity configured to determine the consistency level for access transactions of a data element stored in the distributed database which comprises a plurality of database instances and wherein the data element is stored in at least some of the plurality of database instances and is accessed by a plurality of clients.
  • the entity comprises a first module configured to receive information about access transactions carried out on the data element in the distributed database by the plurality of clients including receiving different access parameters used by the plurality of clients when accessing the data element.
  • a second module of the entity is configured to determine an access pattern for the data element based on the different access parameters used by the plurality of clients.
  • the entity furthermore comprises a third module configured to determine a consistency level for the access transactions of the data element based on the determined access pattern for the data element and based on preconfigured configuration data which relates the access patterns to the consistency levels.
  • a fourth module is configured to transmit the determined consistency level for the access transaction of the data element to the plurality of clients for a future use by the client when accessing the data element.
  • the information about the access parameters can contain a client parameter indicating whether the client requesting access to the data element comprises several client instances. Furthermore a database parameter may be monitored indicating whether the access to the data element comprises multiple database instances receiving the requests. Furthermore, a type parameter can be monitored indicating whether the access to the data element is a Read access, a Write access or a Read and Write access.
  • a size parameter can indicate a data size of the transaction carried out when accessing the data element, a transaction latency parameter can indicate how long the access to the data element and the transmission of the requested access to the client takes.
  • a synchronization parameter can indicate how long it takes to synchronize the data element in between the different database instances.
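  • The access parameters listed above can be pictured as one record per monitored transaction. The following is a minimal sketch only: the class names, field names and units are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass
from enum import Enum


class AccessType(Enum):
    """Type parameter: kind of access carried out on the data element."""
    READ = "Read"
    WRITE = "Write"
    READ_WRITE = "Read/Write"


@dataclass(frozen=True)
class AccessRecord:
    """One monitored access transaction (all field names are hypothetical)."""
    target_id: str           # identifies the accessed data element
    client_id: str           # client instance issuing the request (client parameter)
    client_datacenter: str
    db_instance_id: str      # database instance serving the request (database parameter)
    db_datacenter: str
    access_type: AccessType  # type parameter
    size_bytes: int          # size parameter
    latency_ms: float        # transaction latency parameter
    sync_latency_ms: float   # synchronization parameter


record = AccessRecord("subscribers/42", "client-a", "dc1", "db-311", "dc1",
                      AccessType.READ, 512, 3.2, 18.0)
```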
  • a method for accessing a data element stored in the distributed database with a plurality of database instances wherein the data element is stored in at least some of the plurality of database instances.
  • an access request is transmitted to the distributed database for accessing the data element wherein the access request comprises a first consistency level to be met by the data element to be received by the client in response to the request.
  • an updated consistency level is received for the access transaction of the data element from the entity configured to determine the consistency levels for the access transactions of a data element stored in the distributed database.
  • the client then stores the updated consistency level for the access transaction of the data element and when the data element has to be accessed in the distributed database for a second time, a second access request is transmitted to the distributed database for accessing the data element, wherein the second access request comprises the updated consistency level.
  • the client can during use adapt the consistency level when requesting access to a data element stored in the distributed database, so that always the best consistency level is used.
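  • The client-side behaviour described above (send a first request with a first consistency level, store an updated level, use it for the second request) can be sketched as follows. The class, the `send_request` callable and the default level are assumptions made for illustration; the application does not prescribe a client API.

```python
class ConsistencyAwareClient:
    """Keeps the latest consistency level per data element (hypothetical API)."""

    def __init__(self, send_request, default_level="QUORUM"):
        # send_request(target_id, level) is a hypothetical transport callable.
        self._send = send_request
        self._default = default_level   # first consistency level (assumed value)
        self._levels = {}               # target_id -> updated consistency level

    def on_level_update(self, target_id, level):
        """Store the updated level received from the determining entity."""
        self._levels[target_id] = level

    def access(self, target_id):
        """Send an access request carrying the currently stored level."""
        level = self._levels.get(target_id, self._default)
        return self._send(target_id, level)
```

  • A first call to `access` uses the default level; after `on_level_update` has been called for the same data element, later calls carry the updated level.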
  • the corresponding client accesses the data element stored in the distributed database and which comprises at least one processing unit and a memory containing instructions executable by the at least one processing unit wherein the client is operative to function as discussed above or as discussed in further detail below.
  • a client configured to access the data element stored in the distributed database
  • the client comprises a first module configured to transmit an access request to the distributed database for accessing the data element wherein the access request comprises a first consistency level to be met by the data element to be received by the client in response to the request.
  • the client comprises a second module configured to receive an updated consistency level for the access transaction of the data element and comprises a third module configured to store the updated consistency level for the access transaction.
  • the module configured to transmit the access request can then transmit a second access request to the distributed database when the data element has to be accessed wherein the second access request comprises the updated consistency level.
  • a computer program comprising program code to be executed by at least one processing unit of the client or of the entity configured to determine the consistency level is provided wherein execution of the program code causes the at least one processing unit to execute a method as discussed above or as discussed in further detail below.
  • a carrier comprising the computer program wherein the carrier is one of an electronic signal, optical signal, radio signal, or computer readable storage medium.
  • Figure 1 shows an example schematic representation of a system in which many clients access a distributed database and in which a consistency level for an access transaction is adapted.
  • Figure 2 shows an example flowchart of a method carried out by an entity determining the consistency level shown in Figure 1 which dynamically determines the optimal consistency level for access transactions of a data element stored in a database shown in Figure 1.
  • Figure 3 shows a further flowchart of a method carried out at the entity determining the consistency level.
  • Figure 4 shows an example representation of a table with a consistency level policy showing different consistency levels for different access parameters used by a client when accessing a data element of the distributed database.
  • Figure 5 shows an example flowchart of a method carried out at the entity determining the consistency level based on the access patterns used by the plurality of clients.
  • Figure 6 shows an example flowchart of a method carried out by a client using an updated consistency level when accessing a data element.
  • Figure 7 shows an example schematic representation of an entity determining the consistency level used for accessing a data element in the distributed database.
  • Figure 8 shows another example schematic representation of an entity configured to determine the consistency level.
  • Figure 9 shows a schematic representation of a client accessing the distributed database and using the determined consistency level in an access transaction.
  • Figure 10 shows another example schematic representation of a client accessing the distributed database with an adapted consistency level.
  • the solution discussed below aims for a dynamic determination of optimal consistency levels for distributed database transactions by recognizing data access patterns and transaction performance.
  • Figure 1 shows a schematic architectural view in which different database clients 200a, 200b and 200c access a distributed database 300 which has two different data centers 300a and 300b and wherein each data center has different hosts 310, 320, 330, 340, 350, 360, 370, and 380.
  • The different hosts 310 to 380 comprise the different database instances 311, 321, 331, 341, wherein the hosts 350 to 380 also comprise corresponding database instances which are not referenced by reference signs for the sake of clarity.
  • The database instances store a data element 80 in different database instances. In the example shown in Figure 1, the data element 80 is stored three times as data elements 80a, 80b and 80c, and when one of these data elements is accessed and amended, the other corresponding data elements 80 have to be amended accordingly.
  • Each of the database instances furthermore comprises a monitoring probe 315 which collects the statistical data from the database instance and reports the collected data to a transaction monitor 70.
  • a registry 60 is provided storing intermediate transaction analysis results per transaction target which contain the aggregated statistical data and the latest determined access pattern vectors as will be discussed below.
  • an analyzing entity 100 is provided which determines or calculates the consistency level based on the information received from the transaction monitor 70, the registry 60 and using preconfigured configuration data 50.
  • the transaction monitor 70 collects statistical data for all transactions between the different clients 200a - 200c and the distributed database 300 received from the different probes 315.
  • Entity 100 aggregates and analyzes the transaction statistics and calculates an access pattern vector for each transaction target identifier taking into account the identity and location of the involved client and the database instances and the content and performance of the transaction.
  • the entity 100 detects the optimal consistency level for each transaction target identifier, i. e. for each transaction by mapping the calculated access pattern vectors to the preconfigured configuration data, the consistency level policies.
  • The entity 100 then provides the determined consistency level to the different client applications 200.
  • The client can then adapt its consistency level for the distributed database transactions.
  • The client furthermore observes the database transactions and sends feedback to the entity 100 in case the transactions degrade in terms of faults or performance.
  • the client could be any VNF externalizing its state to a distributed database. Examples could be mobile core control plane applications such as Mobility Management Entity, MME, Diameter Routing Agent, DRA, etc.
  • a method is provided which can dynamically adapt the consistency level for distributed database transactions based on runtime data access patterns and transaction performance.
  • Figure 2 shows a first overview of the steps carried out by the system shown in Figure 1.
  • the monitor 70 observes all the transactions to all instances of the distributed database and collects and stores statistical data for these transactions received from the different monitoring probes 315.
  • the monitor may receive or collect information about an identity and address of the client application instance initiating the transaction.
  • a further parameter is the identity and the address of the database instance serving the transaction.
  • a further monitored parameter is the type of the transaction, be it a Read transaction, a Write transaction or Read and Write transaction, the type of transaction having further information about the constraints of the transaction. Constraints included in the access transactions could be setting a time-to-live for a certain data element value, locking a data element so that it is not allowed to be modified etc.
  • Further information monitored by the probes is the target data of the transaction, such as the database name, the database table or keys.
  • the target data information is also used as an identifier for grouping transactions and assigning an access pattern vector as discussed in further detail below.
  • a further monitored parameter can be the size of the transferred transaction data.
  • It is possible to suspend the monitoring by monitor 70 after a first adaptation of the consistency level has been performed. The monitoring could then be resumed either after a dedicated time provided at the monitor has expired, or when an indication from the database client has been received via a feedback loop, where the feedback is provided to the entity 100 or directly to monitor 70.
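  • The suspend and resume behaviour of the monitoring can be sketched as a small gate object. This is an illustrative assumption about how such a gate might look; the class name, method names and the injectable clock are hypothetical.

```python
import time


class MonitorGate:
    """Decides whether monitoring is active (hypothetical helper)."""

    def __init__(self, resume_after_s, clock=time.monotonic):
        self._resume_after_s = resume_after_s  # dedicated time before resuming
        self._clock = clock                    # injectable for testing
        self._suspended_at = None              # None means monitoring is active

    def suspend(self):
        """Call after a first adaptation of the consistency level."""
        self._suspended_at = self._clock()

    def on_client_feedback(self):
        """An indication from a database client resumes monitoring at once."""
        self._suspended_at = None

    def is_active(self):
        if self._suspended_at is None:
            return True
        if self._clock() - self._suspended_at >= self._resume_after_s:
            self._suspended_at = None          # dedicated time has expired
            return True
        return False
```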
  • the monitor 70 can be implemented as part of the database, as part of the client or as a proxy function between the client and the database.
  • the monitoring function carried out by monitor 70 may also only sample the traffic patterns periodically to improve the efficiency.
  • In step S21, the entity 100 then calculates the transaction access patterns in the form of access pattern vectors for the different transaction identifiers.
  • the entity 100 uses the registry 60 which stores the intermediate analyzed results such as statistical data and access pattern vectors which are continuously adapted per incoming transaction statistics.
  • The access pattern vector can comprise, by way of example, the following vector components or parameters:
  • a) a transaction type, indicating whether it is a Read, Write, or Read and Write transaction;
  • b) a database instance pattern: here the information is provided whether the database is a single database instance, a local database instance in the same datacenter or a global database instance in a different datacenter. This database instance pattern describes the data center locality of the database instances and can also consider the number of hops between the instances or other restrictions such as firewall restrictions;
  • c) a database client pattern: here, information is provided whether a single database client, a local or a global database client is accessing the distributed database;
  • d) a transaction data size: here, information may be provided whether the access transaction data size is small, medium or large;
  • e) a transaction round-trip latency: this latency is measured from the transaction request sent by the client until the response is received by the client and can have values such as very fast, fast, medium, slow or very slow;
  • f) a synchronization latency, describing how fast the different data elements in the database are synchronized, which can again comprise parameter values such as very fast, fast, medium, slow or very slow.
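  • The six vector components listed above can be represented as a simple immutable tuple. The following sketch uses hypothetical field names; only the categorical values come from the description.

```python
from typing import NamedTuple


class AccessPatternVector(NamedTuple):
    """Access pattern vector with six categorical components (names assumed)."""
    transaction_type: str     # "Read", "Write" or "Read/Write"
    db_instance_pattern: str  # "single", "local" or "global"
    db_client_pattern: str    # "single", "local" or "global"
    data_size: str            # "small", "medium" or "large"
    round_trip_latency: str   # "very fast" ... "very slow"
    sync_latency: str         # "very fast" ... "very slow"


vector = AccessPatternVector("Read/Write", "global", "local",
                             "small", "fast", "slow")
```

  • Because the tuple is hashable and comparable, two vectors computed at different times can be compared directly, which is convenient when checking whether the pattern has changed.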
  • In step S22, entity 100 then detects the optimal consistency level for the identified access pattern vectors by means of the consistency level policies 50.
  • the consistency level policies provide the configurable mapping between the access pattern vector parameters and the desired consistency level.
  • the consistency levels can be grouped for different optimization targets such as performance or availability.
  • For each access pattern vector parameter, a consistency level per optimization target exists as shown in Figure 4 and as discussed in further detail below.
  • Examples of optimization targets are performance and availability, wherein the performance corresponds to the latency needed to access the data element by clients, and the availability describes the probability that the access to the data element by the client returns the value of the most recent Write of the data element.
  • a consistency level per optimization target is detected by consulting the policies 50.
  • the optimal consistency level for the data element is then determined by selecting either the highest consistency level or the consistency level having the majority within the consistency levels per access pattern vector parameters.
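  • The two selection rules named above (highest level, or majority level) can be sketched as follows. The level ordering is the one the description gives for Figure 4; the function names and the tie-break rule for the majority case are assumptions.

```python
from collections import Counter

# Ordering from lowest to highest, as stated in the description of Figure 4.
LEVEL_ORDER = ["ANY", "LOCAL_ONE", "EACH_QUORUM", "QUORUM", "ALL"]


def highest(levels):
    """Return the highest consistency level among the per-parameter levels."""
    return max(levels, key=LEVEL_ORDER.index)


def majority(levels):
    """Return the most frequent level; ties go to the higher level (assumption)."""
    counts = Counter(levels)
    top = max(counts.values())
    tied = [level for level, n in counts.items() if n == top]
    return highest(tied)


per_parameter = ["ANY", "QUORUM", "ANY", "LOCAL_ONE", "ANY", "QUORUM"]
chosen_by_highest = highest(per_parameter)    # "QUORUM"
chosen_by_majority = majority(per_parameter)  # "ANY"
```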
  • the entity 100 then also forwards the detected optimal consistency level per target identifier to the different client application instances 200a, 200b and 200c.
  • the clients can then use the adapted consistency level for the next access to the data element identified by the identifier.
  • In step S23, the different client applications receiving the optimal consistency level per target identifier adapt the consistency levels for future transactions for the affected types and target data.
  • In step S24, the clients can provide feedback about the processing results used in connection with the updated consistency levels. If the processing results degrade with the data element received in response to the request with the updated consistency level, entity 100 can be informed accordingly so that this input can be used when determining a new consistency level.
  • the transaction data are used as identifier, but as an alternative the database clients 200 could associate a dedicated identifier value to its transaction data which could be interpreted by the monitor 70 or the entity 100 and the access pattern vectors could be assigned to the identifiers set by the clients 200.
  • This association of an individual transaction with a dedicated identifier, to which a consistency level could be applied later, can be passed to the monitor 70 or the entity 100 as part of the database protocol or via a separate channel or API (Application Programming Interface).
  • Figure 3 shows in further detail the calculation of the access pattern vector.
  • In step S30, entity 100 receives the statistics on the access transactions from monitor 70.
  • The transaction target, i.e. the accessed data element, is identified per transaction, where the target is specified by information such as the database name, database table, keys etc.
  • The transaction is assigned a target identifier; alternatively, the target identifier is received as part of the transaction statistics in case it has been specified by the client 200.
  • In step S32, it is determined whether the registry 60 contains an entry for the transaction identifier.
  • In step S33, in case no entry is found in the registry, a new entry is created.
  • Otherwise, the entry is fetched from the registry in step S34.
  • In step S35, the newly received measurements are merged with the historical statistical data contained in the entry.
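  • Steps S32 to S35 amount to a fetch-or-create followed by a merge. The sketch below is an in-memory stand-in for registry 60; the entry layout and the measurement keys are hypothetical.

```python
class Registry:
    """In-memory stand-in for registry 60 (structure is hypothetical)."""

    def __init__(self):
        self._entries = {}  # target identifier -> aggregated statistics

    def merge(self, target_id, measurements):
        entry = self._entries.get(target_id)   # S32: is there an entry?
        if entry is None:                      # S33: no entry, create one
            entry = {"sizes": [], "latencies_ms": []}
            self._entries[target_id] = entry
        # S34 corresponds to the path where the existing entry was fetched.
        for m in measurements:                 # S35: merge new measurements
            entry["sizes"].append(m["size"])
            entry["latencies_ms"].append(m["latency_ms"])
        return entry
```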
  • The parameters of the access pattern vector are specified in steps S36 to S41.
  • In step S36, the transaction type is identified. In case all transactions have been Read transactions only, the value can be set to "Read". In case all transactions have been Write accesses only, the value is set to "Write". In case the operations on the target have been both Read and Write, the value can be set to "Read/Write".
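  • The transaction type classification can be sketched in a few lines. The function name and the lowercase operation labels used as input are assumptions; the three output values are those named in the description.

```python
def classify_transaction_type(operations):
    """Map the observed operations on one target to Read, Write or Read/Write.

    `operations` is an iterable of "read" / "write" labels (assumed encoding).
    """
    kinds = set(operations)
    if not kinds:
        raise ValueError("no transactions observed for this target")
    if kinds == {"read"}:
        return "Read"
    if kinds == {"write"}:
        return "Write"
    return "Read/Write"
```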
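  • In step S37, the database instance pattern is determined.
  • In case only a single database instance is involved, the value may be set to "single".
  • In case the involved database instances are located in the same datacenter, the value can be set to "local".
  • In case the involved database instances are distributed over different datacenters, the value can be set to "global".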
  • In step S38, the DDB client pattern is classified. In case all transaction requests have been sent by the same DDB client 200, the value is set to "Single". In case all transaction requests have been sent by DDB clients located in the same datacenter, the value is set to "Local". In case the transaction requests have been sent by DDB clients distributed over multiple datacenters, the value is set to "Global".
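  • The Single / Local / Global classification applies in the same way to database instances and to DDB clients, so one helper can serve both. The sketch assumes each endpoint is known as an (identifier, datacenter) pair; that representation is hypothetical.

```python
def classify_locality(endpoints):
    """Classify a set of endpoints as "Single", "Local" or "Global".

    `endpoints` is an iterable of (endpoint_id, datacenter) tuples, one per
    observed transaction (assumed representation).
    """
    distinct = set(endpoints)
    if not distinct:
        raise ValueError("no endpoints observed")
    if len(distinct) == 1:
        return "Single"                    # always the same client or instance
    datacenters = {dc for _, dc in distinct}
    return "Local" if len(datacenters) == 1 else "Global"
```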
  • In step S39, the size of the transferred transaction data is classified based on a median value calculated from the statistical data compared with threshold values for the size categories.
  • In step S40, the transaction round-trip latency is classified based on a median value calculated from the statistical data compared with threshold values for the latency categories.
  • In step S41, the synchronization latency is classified based on a median value calculated from the statistical data compared with threshold values for the latency categories.
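  • Steps S39 to S41 share one pattern: take the median of the collected samples and compare it with thresholds. The sketch below shows this with Python's standard library; the threshold values are invented for illustration and are not taken from the application.

```python
from bisect import bisect_right
from statistics import median

LATENCY_LABELS = ["very fast", "fast", "medium", "slow", "very slow"]
LATENCY_BOUNDS_MS = [1.0, 5.0, 20.0, 100.0]  # hypothetical thresholds

SIZE_LABELS = ["small", "medium", "large"]
SIZE_BOUNDS_BYTES = [1_024, 65_536]          # hypothetical thresholds


def classify(samples, bounds, labels):
    """Return the label of the bucket that the median of `samples` falls into.

    `bounds` must be sorted and hold one element fewer than `labels`;
    a median equal to a bound falls into the upper bucket.
    """
    return labels[bisect_right(bounds, median(samples))]


latency_class = classify([2.0, 3.0, 50.0], LATENCY_BOUNDS_MS, LATENCY_LABELS)
size_class = classify([100, 200, 300_000], SIZE_BOUNDS_BYTES, SIZE_LABELS)
```

  • Using the median rather than the mean keeps a single outlier, such as the 50 ms sample above, from shifting the category.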
  • Step S42 comprises specifying the new access pattern vector by aggregating the previously determined parameter values.
  • The volatility of the access pattern vector is calculated to identify in step S44 whether it is converging to a stable state or is changing constantly and is therefore considered unstable. The calculation should also consider a minimum number of statistical data to ensure that the determined access pattern vector is based on a reasonable amount of data. In the "no" branch of step S44, the method goes back to step S30.
  • In step S45, the access pattern vector, being considered stable, i.e. having a low volatility, is forwarded to be further processed and mapped against the consistency level policies.
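  • The application does not specify how volatility is computed. One possible reading, shown here purely as an assumption, is to treat the vector as stable once it has stayed unchanged over a window of recent updates and a minimum number of samples has been collected. All names and default values are hypothetical.

```python
from collections import deque


class StabilityTracker:
    """Tracks whether an access pattern vector has settled (assumed rule)."""

    def __init__(self, window=5, min_samples=100):
        self._window = window
        self._min_samples = min_samples
        self._recent = deque(maxlen=window)  # last `window` computed vectors
        self._samples = 0                    # statistical data seen so far

    def update(self, vector, new_samples):
        """Record a newly computed vector; return True if it is stable."""
        self._recent.append(vector)
        self._samples += new_samples
        return self.is_stable()

    def is_stable(self):
        return (self._samples >= self._min_samples
                and len(self._recent) == self._window
                and len(set(self._recent)) == 1)
```

  • A vector that changes within the window makes `is_stable` return False, which corresponds to the "no" branch of step S44.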
  • Figure 4 shows an example of a consistency level policy 50.
  • the different vector parameters are indicated.
  • Each of the parameters is optimized either for performance, such as the latency needed to access the data element, or for availability, corresponding to the probability that the access of the data element by the client returns the value of the most recent write of the data element.
  • All the values indicated in the database, such as "ANY" or "ALL", are consistency level values, wherein the consistency levels from the highest to the lowest are as follows: ALL, QUORUM, EACH_QUORUM, LOCAL_ONE, and ANY. Accordingly, ALL is the highest consistency value and ANY is the lowest consistency value.
  • numbers in a certain value range could be used instead.
  • the consistency level optimized for availability is determined as majority from ⁇ QUORUM
  • the consistency level optimized for performance is determined as highest from ⁇ ANY
  • the consistency level optimized for availability is determined as highest from ⁇ QUORUM
  • Figure 5 summarizes the steps carried out at entity 100 for determining the consistency level for access transactions of a data element 80.
  • entity 100 receives the information about the access transactions as monitored by monitor 70.
  • The access patterns are determined using the different access parameters as discussed above in connection with Figures 3 and 4. Based on the access parameters, a consistency level is determined for the access transactions of the data element in step S53, taking into account the determined access patterns and the consistency level policies as shown in Figure 4.
  • Finally, the determined consistency level for the access transaction of the data element is transmitted in step S54 to the different clients 200 so that the clients can use the adapted consistency level for future access requests.
  • FIG. 6 summarizes the steps carried out at a client.
  • one of the clients 200 accessing a data element 80 transmits an access request for accessing the data element 80 the access request being accompanied by a first consistency level stored in the client for the corresponding data element.
  • the clients receive an updated consistency level for the data element. This updated consistency level is stored in step S63 and it is used for a new access request sent by the client to the distributed database.
  • the client can provide feedback to the entity 100 determining the consistency level as mentioned above.
  • the client application can inform entity 100 accordingly so that the entity 100 can use this feedback for the further determination of the consistency level.
  • the previously used consistency level could be transmitted or the next higher or next lower consistency level could be transmitted to the clients dependent on the fact if the performance or availability degraded.
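  • The feedback handling can be sketched as a step up or down the ordered list of consistency levels. The direction chosen here is an assumption: a lower level is tried when performance degraded, since higher levels increase latency, and a higher level is tried when availability degraded. The function name and the string labels are hypothetical.

```python
# Ordering from lowest to highest, as stated in the description of Figure 4.
LEVEL_ORDER = ["ANY", "LOCAL_ONE", "EACH_QUORUM", "QUORUM", "ALL"]


def adjust_level(current, degraded):
    """Return the next lower or next higher level after client feedback.

    `degraded` is "performance" or "availability" (assumed labels).
    The result is clamped at the ends of LEVEL_ORDER.
    """
    index = LEVEL_ORDER.index(current)
    if degraded == "performance":
        index = max(index - 1, 0)                     # next lower level
    elif degraded == "availability":
        index = min(index + 1, len(LEVEL_ORDER) - 1)  # next higher level
    else:
        raise ValueError(f"unknown feedback: {degraded!r}")
    return LEVEL_ORDER[index]
```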
  • FIG. 7 shows a schematic architectural view of the entity 100 which can determine an updated consistency level as discussed above.
  • The entity 100 comprises a transceiver 110 which is provided for transmitting user data or control messages and which is provided for receiving user data or control messages from other entities.
  • The transceiver 110 is used to receive the monitored access transactions and may be used to transmit the updated consistency level to the clients 200.
  • the entity 100 furthermore comprises a processing unit 120 which is responsible for the operation of the entity 100.
  • the processing unit 120 can comprise one or more processors and can carry out instructions stored on a memory 130, wherein the memory can include a read-only memory, a random access memory, mass storage, hard disk or the like.
  • the memory can furthermore include suitable program code to be executed by the processing unit 120 so as to implement the above described functionalities of entity 100.
  • Figure 8 shows an alternative architectural view of an entity configured to determine a consistency level, here entity 500.
  • entity 500 comprises a module 510 for determining the access transactions which can be provided by the monitor 70 as discussed above in Figure 1.
  • the entity 500 comprises a module 520 determining the access patterns for the data element based on the different access parameters.
  • a module 540 is provided for transmitting the consistency level to the different clients.
  • the entity 100 or entity 500 can directly transmit the updated consistency level to the clients or entity 100/500 transmits the determined consistency levels in direction of the clients wherein additional nodes of a network such as a mobile communications network may be involved in between.
  • FIG. 9 shows a schematic architectural view of a client 200 which requests access to a data element and which uses an updated consistency level for accessing the data element.
  • the client 200 comprises a transceiver 210 configured to receive user data or control messages from other entities and configured to transmit user data or control messages to other entities. Transceiver 210 can be used to receive the updated consistency level from entity 100.
  • the client can comprise a processing unit 220 which is responsible for the operation of the client 200.
  • the processing unit 220 comprises one or more processors and can carry out instructions stored on a memory 230, wherein the memory may include a read-only memory, a random access memory, a mass storage, hard disk or the like.
  • the processing unit 220 is furthermore configured to carry out the application such as a virtual network function provided by the client.
  • the memory can include suitable program code to be executed by the processing unit 220 so as to implement the above described functionalities in which the client is involved.
  • Figure 10 shows a further schematic view of a client 600 wherein the client 600 comprises a module 610 for transmitting the access request with a first consistency level to the distributed database.
  • a module 620 is provided which is configured to receive the updated consistency level from entity 100 or 500.
  • a module 630 is provided for storing and using the updated consistency level for a second access request to the same data element.
  • the entity when the access pattern is determined, the entity may carry out the steps of determining an access pattern vector with a plurality of vector components. Each of the vector components corresponds to one of the determined access parameters.
  • the preconfigured configuration data then comprise at least one component consistency level for each of the vector components as shown in Figure 4 and the consistency level for the access transaction of the data element can then be determined based on the component consistency levels determined for the vector components of the access pattern vector.
  • the access pattern vector is calculated based on the different consistency levels provided for the vector components.
  • the determination of the consistency level for the access transaction of the data element can comprise the step of determining a highest component consistency level present for the different vector components of the access pattern vector. This highest component consistency level may then be used as consistency level for the access transaction of the data element.
  • the component consistency levels for each of the access vector components and to determine the most frequent or a mean component consistency level occurring for the different access vector components. The most frequent component consistency level or the mean component consistency level is then used for the consistency level of the access transaction of the data element.
  • the preconfigured configuration data comprises for each of the vector components two different component consistency levels optimized for different optimization criteria.
  • One optimization criterion can be the latency needed by the clients to access the data element, indicated as performance in Figure 4.
  • Another optimization criterion can be the probability that the access of the data element by the client returns the value of the most recent write of the data element. This is indicated in the right column of Figure 4 by availability.
  • the access pattern can be monitored over time, and then the consistency level for the access transaction of the data element is only determined when a volatility of the access pattern over time is smaller than a threshold.
  • the monitoring can be restarted only when a predefined criterion is met, such as when a timer has expired or when feedback is received from the clients 200 that a new adaptation might be necessary.
  • the monitoring of the access transactions can comprise the step of determining an access to the data element, wherein a target identifier is assigned to the determined access, identifying the data element as the target of the access.
  • the target identifier can be stored in a registry 60 in which the different accesses to the data elements are stored for the different target identifiers with the corresponding access parameters.
  • the different access parameters used for each of the accesses are stored in connection with the different target identifiers and a common access parameter is determined for each of the access parameters for each target identifier.
  • the access transactions can be received from different monitoring probes such as probes 315 provided on each of the database instances.
  • One possible access parameter is a client parameter indicating whether the client requesting access to the data element comprises several client instances.
  • a further parameter is a database parameter indicating whether the access to the data element comprises different database instances receiving the requests.
  • a type parameter can indicate whether the access to the data element is a Read access, a Write access or Read and Write access.
  • a size parameter can indicate the data size of the transaction carried out when accessing the data element.
  • a transaction latency parameter can be used indicating how long the access to the data element and a transmission of the requested access to the client takes.
  • a synchronization parameter can indicate how long it takes to synchronize the data element in between the different database instances.
  • entity 100 determines the consistency level receiving a transaction feedback from at least one of the clients 200.
  • the transaction feedback indicates a degradation of the performance or the availability of the database transactions.
  • the consistency level can then be adapted taking into account the received transaction feedback.
  • the client can furthermore determine a quality of a transaction carried out with the data element received in response to the second access request, in which the new and updated consistency level was used.
  • the determined quality of the transaction is lower than a quality threshold an indication about the determined quality can be transmitted from the corresponding client to the entity 100 configured to determine the consistency level.
  • the above described invention allows a dynamic adaptation of consistency level for distributed database transactions based on runtime data access patterns and transaction performance.
  • the performance is improved for distributed database transactions as the minimum viable consistency level can be determined and used.
  • This also provides a higher capacity for client applications storing data in the distributed database and decreases the database load due to the relaxed replication requirements in view of the lower consistency level.
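By way of illustration only, the three ways of combining component consistency levels mentioned above (highest, most frequent, mean) can be sketched as follows. The level names `ONE`, `QUORUM` and `ALL`, their ordering, the tie-breaking rule and the rounding rule are assumptions made for this sketch and are not taken from the application.

```python
from collections import Counter
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical ordered consistency levels (names are assumptions)."""
    ONE = 1
    QUORUM = 2
    ALL = 3


def highest(levels):
    # The strictest component consistency level is used.
    return max(levels)


def most_frequent(levels):
    # The most common component level is used; ties resolve to the
    # stricter level (an assumption, the text does not specify this).
    counts = Counter(levels)
    top = max(counts.values())
    return max(lvl for lvl, n in counts.items() if n == top)


def mean_level(levels):
    # Mean of the ordinal values, rounded up to the next defined level
    # (the rounding direction is an assumption).
    avg = sum(levels) / len(levels)
    return min(lvl for lvl in Level if lvl >= avg)


# One hypothetical component level per access pattern vector component.
components = [Level.ONE, Level.QUORUM, Level.QUORUM,
              Level.ALL, Level.ONE, Level.ONE]
```

Using the highest level favours correctness of the returned data, whereas the most frequent or mean level trades some of that for lower latency.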

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a method for determining a consistency level for access transactions of a data element (80) stored in a distributed database (300) comprising a plurality of database instances (310-380), wherein the data element (80) is stored in at least some of the plurality of database instances and is accessed by a plurality of clients (200), the method comprising at an entity (100) determining the consistency level: - receiving information about access transactions carried out on the data element (80) in the distributed database (300) by the plurality of clients (200) including receiving different access parameters used by the plurality of clients (200) when accessing the data element (80), - determining an access pattern for the data element (80) based on the different access parameters used by the plurality of clients (200), - determining a consistency level for the access transactions of a data element (80) based on the determined access pattern for the data element (80) and based on preconfigured configuration data (50) which relates the access patterns to the consistency levels, - transmitting the determined consistency level for the access transaction of the data element (80) to the plurality of clients (200) for a future use by the client when accessing the data element (80).

Description

Dynamic determination of consistency levels for distributed database transactions
Technical field
The present application relates to a method for determining a consistency level for access transactions of a data element stored in a distributed database comprising a plurality of database instances. Furthermore the corresponding entity configured to determine the consistency level is provided. Additionally, a method for accessing a data element stored in the distributed database by a client accessing the data element is provided and the corresponding client. In addition a system comprising the entity configured to determine the consistency level and the client is provided. The application further relates to a computer program comprising program code and a carrier comprising the computer program.
Background
Distributed databases are data storage systems which are formed by clusters of (geo-) distributed database instances. The database instances can be distributed within or across multiple data centers. The individual database instances may store data persistently on their local accessible storage media and may synchronize their state and content over the network. All instances of such a distributed database are meant to provide a consistent view on the data stored in the distributed database eventually, dependent on the processing speed of the instance hosts, the Read/Write speed of the used storage media and the speed of the data network connecting the instances. Therefore, all transactions towards a distributed database (usually Read or Write data) are executed with a certain consistency level present in a request for an access to a data element stored in the distributed database. The transaction consistency level reflects a grade of how many and which database instances need to be involved and confirm the consistent state for this transaction. There exists a trade-off between transaction speed and consistency level. The higher the consistency level, the more instances are involved and need to be synchronized, hence the higher the consistency level the higher the latency for the transaction. Following the so-called “cloud native application” design paradigms, modern IT and telecommunication applications, such as Virtualized Network Functions (VNFs), aim for becoming “stateless”. Stateless in this context means that the application business logic does not hold any persistent state information any longer and instead stores this data in external data storage systems. One solution for redundant and resilient data storage systems are distributed databases.
In case a client application (i.e. the VNF) utilizes a distributed database as external storage system, the consistency level for each database transaction needs to be specified. Since usually these client application operations are time critical, the latency for database transactions needs to be optimized and therefore the optimal consistency level for each database transaction needs to be specified. The optimal consistency level may differ between different database transactions or type of database transactions.
The problem with existing solutions is that the consistency level for each distributed database transaction needs to be specified during design time of the client application, based on the analysis of the client application business logic, the type of data to be stored and predicted data access patterns. For example, it may not be known at design time if the application will be connected to a database that is geographically distributed or a database that has several instances in the same data center or only a single instance. Consequently, non-optimal consistency levels for the distributed database transactions might be specified which may lead to performance degradation of the client applications, if a too high consistency level is specified, or data corruption and malfunctioning of the business logic, if a too low consistency level is specified.
Summary
Accordingly, a need exists to find the best consistency level for an access to a data element in a distributed database while avoiding that a too high consistency level leads to a performance degradation and while avoiding that a too low consistency level leads to data corruption or malfunctioning of the application provided by the client.
This need is met by the features of the independent claims. Further aspects are described in the dependent claims. According to a first aspect a method is provided for determining a consistency level for access transactions of a data element stored in a distributed database comprising a plurality of database instances. The method is carried out at an entity determining the consistency level and the data element is stored in at least some of the plurality of database instances and is accessed by a plurality of clients. The entity receives information about access transactions carried out on the data element in the distributed database by the plurality of clients including receiving different access parameters used by the plurality of clients when accessing the data element. Based on the different access parameters used by the plurality of clients an access pattern is determined for the data element. Furthermore, a consistency level is determined for the access transactions of the data element based on the determined access pattern for the data element and based on preconfigured configuration data which relates the access patterns to the consistency levels. Furthermore, the determined consistency level for the access transaction of the data element is transmitted to the plurality of clients for a future use by the client when accessing the data element.
Accordingly, a dynamic determination of the optimal consistency level is provided by monitoring and analyzing the access traffic to a data element stored in the distributed database. Based on this analysis of the access pattern and based on preconfigured configuration data, an updated consistency level can be determined for the data element, and different clients accessing the data element can be informed about the new consistency level. The different clients do not know by how many clients a data element is accessed and how many clients carry out a Read or Write operation on the data element. By determining the access pattern used by the different clients an optimum consistency level can be determined. By way of example if one client always accesses one database instance for accessing a data element and if all clients access the same instance when writing on the data element it can be determined that the current value of the data element stored in that one instance is accurate. With this knowledge a lower consistency level can be used in a request by a client when requesting access to the data element. Furthermore, the situation may occur that the number of clients accessing a data element is increasing or decreasing. As a single client is not aware of the access patterns for a data element the above method can improve the consistency level used for accessing a data element. It can be avoided that a too high consistency level is used which degrades the performance as many instances of the database instances have to be checked in order to obtain a high consistency level. Furthermore, the corresponding entity configured to determine the consistency level for the access transaction of a data element stored in the distributed database comprising the plurality of database instances is provided, wherein the entity comprises at least one processing unit and a memory containing instructions executable by the at least one processing unit.
The entity is operative to function as discussed above or as discussed in further detail below.
As an alternative an entity is provided configured to determine the consistency level for access transactions of a data element stored in the distributed database which comprises a plurality of database instances and wherein the data element is stored in at least some of the plurality of database instances and is accessed by a plurality of clients. The entity comprises a first module configured to receive information about access transactions carried out on the data element in the distributed database by the plurality of clients including receiving different access parameters used by the plurality of clients when accessing the data element. A second module of the entity is configured to determine an access pattern for the data element based on the different access parameters used by the plurality of clients. The entity furthermore comprises a third module configured to determine a consistency level for the access transactions of the data element based on the determined access pattern for the data element and based on preconfigured configuration data which relates the access patterns to the consistency levels. A fourth module is configured to transmit the determined consistency level for the access transaction of the data element to the plurality of clients for a future use by the client when accessing the data element.
The information about the access parameters can contain a client parameter indicating whether the client requesting access to the data element comprises several client instances. Furthermore a database parameter may be monitored indicating whether the access to the data element comprises multiple database instances receiving the requests. Furthermore, a type parameter can be monitored indicating whether the access to the data element is a Read access, a Write access or a Read and Write access. A size parameter can indicate a data size of the transaction carried out when accessing the data element, a transaction latency parameter can indicate how long the access to the data element and the transmission of the requested access to the client takes. A synchronization parameter can indicate how long it takes to synchronize the data element in between the different database instances.
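By way of illustration only, the client parameter, the database parameter and the type parameter described above could be derived from a list of observed accesses as sketched below. The record layout, the field names and the returned dictionary keys are hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    client_instance: str  # hypothetical identifier of the requesting client instance
    db_instance: str      # hypothetical identifier of the serving database instance
    operation: str        # "read" or "write"


def derive_parameters(accesses):
    """Derive three of the access parameters for one data element."""
    if not accesses:
        raise ValueError("at least one observed access is required")
    clients = {a.client_instance for a in accesses}
    instances = {a.db_instance for a in accesses}
    operations = {a.operation for a in accesses}
    if operations == {"read"}:
        access_type = "read"
    elif operations == {"write"}:
        access_type = "write"
    else:
        access_type = "read_write"
    return {
        # Client parameter: does the access involve several client instances?
        "client_multi_instance": len(clients) > 1,
        # Database parameter: do several database instances receive the requests?
        "database_multi_instance": len(instances) > 1,
        # Type parameter: Read, Write, or Read and Write access.
        "type": access_type,
    }
```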
Furthermore, a method for accessing a data element stored in the distributed database with a plurality of database instances is provided wherein the data element is stored in at least some of the plurality of database instances. At the client an access request is transmitted to the distributed database for accessing the data element wherein the access request comprises a first consistency level to be met by the data element to be received by the client in response to the request. Furthermore, an updated consistency level is received for the access transaction of the data element from the entity configured to determine the consistency levels for the access transactions of a data element stored in the distributed database. The client then stores the updated consistency level for the access transaction of the data element and when the data element has to be accessed in the distributed database for a second time, a second access request is transmitted to the distributed database for accessing the data element, wherein the second access request comprises the updated consistency level.
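By way of illustration only, the client-side behaviour described above (first request with a stored first consistency level, storing of an updated level, second request with the updated level) can be sketched as follows. The class name, the `send_request` callable and the level strings are assumptions for this sketch; no particular database driver is implied.

```python
class Client:
    """Minimal sketch of a client keeping one consistency level per data element."""

    def __init__(self, send_request, default_level="ALL"):
        # send_request is a hypothetical callable(key, level) -> value that
        # stands in for the access request sent to the distributed database.
        self._send = send_request
        self._default = default_level
        self._levels = {}

    def on_updated_level(self, key, level):
        # Store the updated consistency level received from the entity
        # determining the consistency level.
        self._levels[key] = level

    def read(self, key):
        # Use the stored level for this data element, else the first
        # (default) consistency level.
        level = self._levels.get(key, self._default)
        return self._send(key, level)
```

Starting from a strict default and relaxing it only after an update is one possible design choice; the application leaves the initial level open.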
With this method the client can, during use, adapt the consistency level when requesting access to a data element stored in the distributed database, so that the best consistency level is always used.
Furthermore, the corresponding client is provided which accesses the data element stored in the distributed database and which comprises at least one processing unit and a memory containing instructions executable by the at least one processing unit wherein the client is operative to function as discussed above or as discussed in further detail below.
As an alternative a client is provided configured to access the data element stored in the distributed database wherein the client comprises a first module configured to transmit an access request to the distributed database for accessing the data element wherein the access request comprises a first consistency level to be met by the data element to be received by the client in response to the request. The client comprises a second module configured to receive an updated consistency level for the access transaction of the data element and comprises a third module configured to store the updated consistency level for the access transaction. The module configured to transmit the access request can then transmit a second access request to the distributed database when the data element has to be accessed wherein the second access request comprises the updated consistency level.
Furthermore, a system comprising the entity configured to determine the consistency level as mentioned above and the client accessing the data element is provided.
Furthermore, a computer program comprising program code to be executed by at least one processing unit of the client or of the entity configured to determine the consistency level is provided wherein execution of the program code causes the at least one processing unit to execute a method as discussed above or as discussed in further detail below.
Additionally, a carrier comprising the computer program is provided wherein the carrier is one of an electronic signal, optical signal, radio signal, or computer readable storage medium.
It should be understood that the features mentioned above and features yet to be explained below can be used not only in the respective combinations indicated, but also in other combinations or in isolation without departing from the scope of the present invention. Features of the above described aspects and embodiments described below may be combined with each other in other embodiments unless explicitly mentioned otherwise.
Brief description of the drawings
The foregoing and additional features and effects of the application will become apparent from the following detailed description when read in conjunction with the accompanying drawings in which like reference numerals refer to like elements.
Figure 1 shows an example schematic representation of a system in which many clients access a distributed database and in which a consistency level for an access transaction is adapted.
Figure 2 shows an example flowchart of a method carried out by an entity determining the consistency level shown in Figure 1 which dynamically determines the optimal consistency level for access transactions of a data element stored in a database shown in Figure 1.
Figure 3 shows a further flowchart of a method carried out at the entity determining the consistency level.
Figure 4 shows an example representation of a table with a consistency level policy showing different consistency levels for different access parameters used by a client when accessing a data element of the distributed database.
Figure 5 shows an example flowchart of a method carried out at the entity determining the consistency level based on the access patterns used by the plurality of clients.
Figure 6 shows an example flowchart of a method carried out by a client using an updated consistency level when accessing a data element.
Figure 7 shows an example schematic representation of an entity determining the consistency level used for accessing a data element in the distributed database.
Figure 8 shows another example schematic representation of an entity configured to determine the consistency level.
Figure 9 shows a schematic representation of a client accessing the distributed database and using the determined consistency level in an access transaction.
Figure 10 shows another example schematic representation of a client accessing the distributed database with an adapted consistency level.
Detailed description
In the following, embodiments of the invention will be described in detail with reference to the accompanying drawings. It is to be understood that the following description of embodiments is not to be taken in a limiting sense. The scope of the invention is not intended to be limited by the embodiments described hereinafter or by the drawings, which are to be illustrative only. The drawings are to be regarded as being schematic representations, and elements illustrated in the drawings are not necessarily shown to scale. Rather, the various elements are represented such that their function and general purpose become apparent to a person skilled in the art. Any connection or coupling between functional blocks, devices, components of physical or functional units shown in the drawings and described hereinafter may also be implemented by an indirect connection or coupling. A coupling between components may be established over a wired or wireless connection. Functional blocks may be implemented in hardware, software, firmware, or a combination thereof.
The solution discussed below aims for a dynamic determination of optimal consistency levels for distributed database transactions by recognizing data access patterns and transaction performance.
Figure 1 shows a schematic architectural view in which different database clients 200a, 200b and 200c access a distributed database 300 which has two different data centers 300a and 300b and wherein each data center has different hosts 310, 320, 330, 340, 350, 360, 370, and 380. The different hosts 310 to 380 comprise the different database instances 311, 321, 331, 341 wherein the hosts 350 to 380 also comprise corresponding database instances which are not provided with reference signs for the sake of clarity. The database instances store a data element 80 in different database instances. In the example shown in Figure 1 the data element 80 is stored three times as data element 80a, 80b and 80c, and when one of these data elements is accessed and amended the other corresponding data elements 80 have to be amended accordingly. Each of the database instances furthermore comprises a monitoring probe 315 which collects the statistical data from the database instance and reports the collected data to a transaction monitor 70. Furthermore, a registry 60 is provided storing intermediate transaction analysis results per transaction target which contain the aggregated statistical data and the latest determined access pattern vectors as will be discussed below. Furthermore, an analyzing entity 100 is provided which determines or calculates the consistency level based on the information received from the transaction monitor 70, the registry 60 and using preconfigured configuration data 50.
The transaction monitor 70 collects statistical data for all transactions between the different clients 200a - 200c and the distributed database 300 received from the different probes 315. Entity 100 aggregates and analyzes the transaction statistics and calculates an access pattern vector for each transaction target identifier taking into account the identity and location of the involved client and the database instances and the content and performance of the transaction. The entity 100 then detects the optimal consistency level for each transaction target identifier, i.e. for each transaction, by mapping the calculated access pattern vectors to the preconfigured configuration data, the consistency level policies. The entity 100 then provides the determined consistency level to the different client applications 200. The client can then adapt its consistency level for the distributed database transactions. The client furthermore observes the database transactions and sends feedback to the entity 100 in case the transactions degrade in terms of faults or performance. The client could be any VNF externalizing its state to a distributed database. Examples could be mobile core control plane applications such as Mobility Management Entity, MME, Diameter Routing Agent, DRA, etc.
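By way of illustration only, the mapping of an access pattern vector to the consistency level policies could be sketched as a table lookup that yields one component consistency level per vector component, for either of two optimization criteria. All table entries, component names and level names below are invented for this sketch and do not reproduce the policy table of Figure 4.

```python
# Hypothetical policy table: (vector component, value) -> component
# consistency level per optimization criterion. Entries are illustrative.
POLICY = {
    ("transaction_type", "read"): {"performance": "ONE", "availability": "QUORUM"},
    ("transaction_type", "write"): {"performance": "QUORUM", "availability": "ALL"},
    ("database_instance_pattern", "single"): {"performance": "ONE", "availability": "ONE"},
    ("database_instance_pattern", "global"): {"performance": "QUORUM", "availability": "ALL"},
}


def component_levels(vector, criterion):
    """Look up one component consistency level per vector component."""
    if criterion not in ("performance", "availability"):
        raise ValueError(f"unknown optimization criterion: {criterion}")
    return [POLICY[(name, value)][criterion] for name, value in vector.items()]


# Hypothetical access pattern vector for one transaction target identifier.
vector = {"transaction_type": "read", "database_instance_pattern": "global"}
```

The resulting component levels would then still have to be combined into a single consistency level for the data element.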
As will be discussed in more detail below a method is provided which can dynamically adapt the consistency level for distributed database transactions based on runtime data access patterns and transaction performance.
Figure 2 shows a first overview of the steps carried out by the system shown in Figure 1. In step S20 the monitor 70 observes all the transactions to all instances of the distributed database and collects and stores statistical data for these transactions received from the different monitoring probes 315. By way of example the monitor may receive or collect information about an identity and address of the client application instance initiating the transaction. A further parameter is the identity and the address of the database instance serving the transaction. A further monitored parameter is the type of the transaction, be it a Read transaction, a Write transaction or a Read and Write transaction, the type of transaction having further information about the constraints of the transaction. Constraints included in the access transactions could be setting a time-to-live for a certain data element value, locking a data element so that it is not allowed to be modified etc. A further piece of information monitored by the probes is the target data of the transaction such as the database name, the database table or keys. The target data information is also used as an identifier for grouping transactions and assigning an access pattern vector as discussed in further detail below. A further monitored parameter can be the size of the transferred transaction data. Additionally, a round-trip latency of the transaction measured from the transaction request sent by the client until the response received by the client is determined and transmitted to monitor 70. Furthermore, the round-trip latency of the synchronization between the different database instances is determined.
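By way of illustration only, the statistical data collected in step S20 and the grouping of transactions by their target data could be sketched as follows. The record fields, the string form of the target identifier and the chosen summary statistics are assumptions for this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class TransactionRecord:
    client_id: str        # identity of the client application instance
    db_instance_id: str   # identity of the serving database instance
    tx_type: str          # "read", "write" or "read_write"
    target: str           # hypothetical target identifier, e.g. "db/table/key"
    size_bytes: int       # size of the transferred transaction data
    round_trip_ms: float  # request-to-response latency seen by the client
    sync_ms: float        # synchronization latency between database instances


def group_by_target(records):
    """Group transaction records by their target identifier."""
    groups = defaultdict(list)
    for record in records:
        groups[record.target].append(record)
    return dict(groups)


def summarize(records):
    """Aggregate the statistics of all records for one target identifier."""
    return {
        "clients": len({r.client_id for r in records}),
        "db_instances": len({r.db_instance_id for r in records}),
        "mean_size_bytes": mean(r.size_bytes for r in records),
        "mean_round_trip_ms": mean(r.round_trip_ms for r in records),
        "mean_sync_ms": mean(r.sync_ms for r in records),
    }
```

A registry such as registry 60 could hold one such summary per target identifier and update it as new transaction statistics arrive.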
It is possible to suspend the monitoring by monitor 70 after a first adaptation of the consistency level has been performed. The monitoring could then be resumed either after a dedicated time has expired provided at the monitor or when an indication from the database client has been received via a feedback loop where the feedback is provided to the entity 100 or directly to monitor 70.
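By way of illustration only, suspending the monitoring after an adaptation and resuming it either after a timer has expired or upon client feedback could be sketched as follows. The class name, the method names and the injectable clock are assumptions for this sketch.

```python
import time


class MonitoringSwitch:
    """Minimal sketch of suspend and resume logic for the monitor."""

    def __init__(self, resume_after_s, clock=time.monotonic):
        self._resume_after = resume_after_s
        self._clock = clock          # injectable so the timer can be tested
        self._suspended_at = None    # None means monitoring is active

    def suspend(self):
        # Called after a consistency level adaptation has been performed.
        self._suspended_at = self._clock()

    def on_client_feedback(self):
        # Feedback from a database client resumes monitoring immediately.
        self._suspended_at = None

    def is_active(self):
        if self._suspended_at is None:
            return True
        if self._clock() - self._suspended_at >= self._resume_after:
            # The dedicated time has expired, so monitoring resumes.
            self._suspended_at = None
            return True
        return False
```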
The monitor 70 can be implemented as part of the database, as part of the client or as a proxy function between the client and the database. The monitoring function carried out by monitor 70 may also only sample the traffic patterns periodically to improve the efficiency.
In a step S21 the entity 100 then calculates the transaction access patterns in the form of access pattern vectors for the different transaction identifiers. The entity 100 uses the registry 60 which stores the intermediate analyzed results such as statistical data and access pattern vectors which are continuously adapted per incoming transaction statistics. The access pattern vector can comprise, by way of example, the following vector components or parameters:
a) a transaction type: whether it is a Read, Write, or Read and Write transaction. Furthermore, more complex types could be used which consider more fine-grained values for Write such as add, delete or atomic transactions operating on multiple items,
b) a database instance pattern: here the information is provided whether the database is a single database instance, is a local database instance in the same datacenter or a global database instance in a different datacenter. This database instance pattern describes the data center locality of the database instances and can also consider the number of hops between the instances or other restrictions such as firewall restrictions,
c) a database client pattern: here, information is provided whether a single database client, a local or a global database client is accessing the distributed database,
d) a transaction data size: here information may be provided whether the access transaction data size is small, medium or large,
e) a transaction round-trip latency: this latency is measured from the transaction request sent by the client until the response received by the client and can have values such as very fast, fast, medium, slow or very slow,
f) a synchronization latency describing how fast the different data elements in the database are synchronized, which can again comprise parameter values such as very fast, fast, medium, slow or very slow.
This list of vector components or parameters for the access pattern vector is not exhaustive and can be amended upon availability of the statistical data types. When an access pattern vector is detected to be stable, it is processed further. Accordingly, a stable access pattern can also mean that the change in the access pattern becomes stable so that volatility of the change is determined. A detailed mechanism to calculate the access pattern vector will be described in further detail below with reference to Figure 3.
In step S22, entity 100 then detects the optimal consistency level for the identified access pattern vectors by means of the consistency level policies 50. The consistency level policies provide the configurable mapping between the access pattern vector parameters and the desired consistency level. As will be discussed below in connection with Figure 4, the consistency levels can be grouped for different optimization targets such as performance or availability. For each access pattern vector parameter a consistency level per optimization target exists as shown in Figure 4 and as discussed in further detail below. Examples of an optimization target are the performance and the availability, wherein the performance corresponds to the latency needed to access the data element by clients, and wherein the availability describes the probability that the access to the data element by the clients returns the value of the most recent Write of the data element.
For each of the access pattern vector parameters a consistency level per optimization target is detected by consulting the policies 50. The optimal consistency level for the data element is then determined by selecting either the highest consistency level or the consistency level having the majority within the consistency levels per access pattern vector parameters.
The entity 100 then also forwards the detected optimal consistency level per target identifier to the different client application instances 200a, 200b and 200c. The clients can then use the adapted consistency level for the next access to the data element identified by the identifier.
As shown in step S23, the different client applications receiving the optimal consistency level per target identifier adapt the consistency levels for future transactions for the affected types and target data.
In step S24 the clients can provide feedback about the processing results used in connection with the updated consistency levels. If the processing results degrade with the data element received in response to the request with the updated consistency level, entity 100 can be informed accordingly so that this input can be used when determining a new consistency level.
Different options might be used to uniquely identify transaction targets and groups. In the examples described above the transaction data are used as identifier, but as an alternative the database clients 200 could associate a dedicated identifier value with their transaction data which could be interpreted by the monitor 70 or the entity 100, and the access pattern vectors could be assigned to the identifiers set by the clients 200. This association of an individual transaction with a dedicated identifier, to which a consistency level could be applied later, can be passed to the monitor 70 or the entity 100 as part of the database protocol or via a separate channel or API (Application Programming Interface).
Figure 3 shows in further detail the calculation of the access pattern vector.
In step S30, entity 100 receives the statistics on the access transactions from monitor 70. In step S31 the transaction target, i.e. the accessed data element, is identified per transaction, where the target is specified by information such as the database name, database table, keys etc. The transaction is assigned a target identifier; alternatively, that target identifier is received as part of the transaction statistics in case it has been specified by the client 200. In step S32 it is determined whether the registry 60 contains an entry for the transaction identifier or not. In step S33, in case no entry is found in the registry, a new entry is created. In case an entry is found in the registry 60, the entry is fetched from the registry in step S34. In step S35 the newly received measurements are merged with the historical statistical data contained in the entry. Based on the merged statistical transaction data, the parameters of the access pattern vector are specified in steps S36 to S41. By way of example, in step S36 the transaction type is identified. In case all transactions have been Read transactions only, the value can be set to "Read". In case all transactions have been Write accesses only, the value is set to "Write". In case the operations on the target have been both read and write, the value can be set to "Read/Write".
Furthermore, in step S37 the database instance pattern is determined. In case all transaction requests have been received by the same database instance, the value may be set to "single". In case all transaction requests have been received by database instances located in the same data center, always via the same, low number of network hops, the value can be set to "local". In case the transaction requests have been received by database instances distributed over multiple data centers, or received via a higher number of network hops, or via firewall instances or network translations, the value can be set to "global".
In step S38 the DDB client pattern is classified. In case all transaction requests have been sent by the same DDB client 200, the value is set to “Single”. In case all transaction requests have been sent by DDB clients located in the same datacenter, the value is set to “Local”. In case the transaction requests have been sent by DDB clients distributed over multiple datacenters, the value is set to “Global”.
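Steps S32 to S36 can be illustrated with a minimal Python sketch. The in-memory dictionary standing in for registry 60, the function name and the lower-case operation labels are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for registry 60:
# target identifier -> list of observed operations ("read" or "write").
registry: dict[str, list[str]] = defaultdict(list)


def merge_and_classify_type(target_id: str, new_ops: list[str]) -> str:
    """Merge new observations into the registry entry and derive the transaction type."""
    history = registry[target_id]   # S32-S34: fetch the entry, creating it if missing
    history.extend(new_ops)         # S35: merge with the historical data
    if not history:
        raise ValueError("no observations available for this target")
    kinds = set(history)            # S36: classify from the merged data
    if kinds == {"read"}:
        return "Read"
    if kinds == {"write"}:
        return "Write"
    return "Read/Write"
```

Because the classification uses the merged history, a target that was read-only so far becomes "Read/Write" as soon as one write is observed.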
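The single/local/global classification of steps S37 and S38 could be sketched as follows. This simplified version only looks at identities and data centers; the hop count, firewall and network translation criteria mentioned for step S37 are deliberately left out, and the function name and input format are assumptions.

```python
def classify_locality(endpoints: list[tuple[str, str]]) -> str:
    """Classify a set of endpoints as Single, Local or Global.

    endpoints holds one (endpoint_id, datacenter_id) pair per observed
    transaction, where the endpoint is either a database instance (S37)
    or a DDB client (S38).
    """
    if not endpoints:
        raise ValueError("no observations available")
    ids = {endpoint_id for endpoint_id, _ in endpoints}
    datacenters = {datacenter_id for _, datacenter_id in endpoints}
    if len(ids) == 1:
        return "Single"    # always the same instance or client
    if len(datacenters) == 1:
        return "Local"     # several endpoints, all in one data center
    return "Global"        # endpoints spread over multiple data centers
```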
In step S39 the size of the transferred transaction data is classified based on a median value calculated from the statistical data compared with threshold values for the size categories.
In step S40 the transaction roundtrip latency is classified based on a median value calculated from the statistical data compared with threshold values for the latency categories.
In step S41 the synchronization latency is classified based on a median value calculated from the statistical data compared with threshold values for the latency categories.
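Steps S39 to S41 share one mechanism: a median is compared with category thresholds. A minimal Python sketch is given below; the threshold values and their units are invented for illustration, since no concrete thresholds are specified.

```python
from bisect import bisect_right
from statistics import median

# Assumed category boundaries; each bound is the exclusive upper limit of a category.
LATENCY_LABELS = ["very fast", "fast", "medium", "slow", "very slow"]
LATENCY_BOUNDS_MS = [1.0, 5.0, 20.0, 100.0]
SIZE_LABELS = ["small", "medium", "large"]
SIZE_BOUNDS_BYTES = [1024, 1024 * 1024]


def classify_by_median(samples: list[float], bounds: list[float], labels: list[str]) -> str:
    """Return the label of the category into which the median of the samples falls."""
    if len(labels) != len(bounds) + 1:
        raise ValueError("need exactly one more label than bounds")
    return labels[bisect_right(bounds, median(samples))]
```

For instance, `classify_by_median([2, 30, 8, 12], LATENCY_BOUNDS_MS, LATENCY_LABELS)` has a median of 10 and therefore falls into the "medium" category under these assumed thresholds.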
Step S42 comprises specifying the new access pattern vector by aggregating the previously determined parameter values. In step S43 the volatility of the access pattern vector is calculated in order to identify in step S44 whether it is converging to a stable state or whether it is changing constantly and is therefore considered to be unstable. The calculation should also consider a minimum number of statistical data to ensure that the determined access pattern vector is based on a reasonable amount of data. In the “no” branch of step S44, the method goes back to step S30.
In step S45, the access pattern vector, which is considered to be stable, i.e. has a low volatility, is forwarded to be further processed and mapped against the consistency level policies.
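The text does not define how the volatility is calculated. One possible reading, shown here only as an assumption, is the fraction of consecutive vector changes within a sliding window, combined with a minimum sample count. The class name and the default parameter values are hypothetical.

```python
from collections import deque


class StabilityTracker:
    """Tracks successive access pattern vectors for one target identifier."""

    def __init__(self, window: int = 20, min_samples: int = 10, max_volatility: float = 0.1):
        if min_samples < 2:
            raise ValueError("min_samples must be at least 2")
        self.history: deque = deque(maxlen=window)
        self.min_samples = min_samples
        self.max_volatility = max_volatility

    def observe(self, vector: tuple) -> bool:
        """Record a newly calculated vector; return True if the pattern counts as stable."""
        self.history.append(vector)
        if len(self.history) < self.min_samples:
            return False   # not yet based on a reasonable amount of data
        items = list(self.history)
        changes = sum(a != b for a, b in zip(items, items[1:]))
        volatility = changes / (len(items) - 1)   # assumed volatility measure (S43)
        return volatility < self.max_volatility   # decision of step S44
```

A return value of False corresponds to the “no” branch of step S44, in which further statistics are awaited.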
Figure 4 shows an example of a consistency level policy 50. In the left column of the table the different vector parameters are indicated. Each of the parameters is optimized for either performance, such as the latency needed to access the data element, or availability, corresponding to the probability that the access of the data element by the client returns the value of the most recent write of the data element. All the values indicated in the database, such as "ANY" or "ALL", are consistency level values, wherein the consistency levels from the highest to the lowest value are as follows: ALL, QUORUM, EACH_QUORUM, LOCAL_ONE, and ANY. Accordingly, ALL is the highest consistency value and ANY is the lowest consistency value. As an alternative to the indicated values, numbers in a certain value range could be used instead.
In order to explain the determination of a consistency level in more detail, the following examples are given with reference to Figure 4.
a) An access pattern vector is calculated as:
{Transaction type Read | DDB instance distribution: Local | DDB client distribution: Single | [...] }
The consistency level optimized for performance is determined as majority from {ANY | LOCAL_ONE | ANY} as “ANY”.
b) An access pattern vector is calculated as:
{Transaction type Read/Write | DDB instance distribution: Global | DDB client distribution: Global | [...] }
The consistency level optimized for availability is determined as majority from { QUORUM | ALL | ALL } as “ALL”.
Examples of determining the optimal consistency level as the highest of individual consistency levels, utilizing the example consistency level policy from Figure 4:
c) An access pattern vector is calculated as:
{Transaction type Read | DDB instance distribution: Local | DDB client distribution: Single | [...] }
The consistency level optimized for performance is determined as highest from {ANY | LOCAL_ONE | ANY} as “LOCAL_ONE”.
d) An access pattern vector is calculated as:
{Transaction type Read/Write | DDB instance distribution: Global | DDB client distribution: Global | [...] }
The consistency level optimized for availability is determined as highest from { QUORUM | ALL | ALL } as “ALL”.
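Examples a) to d) can be reproduced with a short Python sketch. The policy table below contains only the entries that can be read off these examples, assuming that the listed component levels correspond positionally to the vector components; all other entries of Figure 4, the key names and the tie-break rule for the majority are assumptions.

```python
from collections import Counter
from enum import IntEnum


class Level(IntEnum):
    """Consistency levels ordered from lowest to highest, as stated for Figure 4."""
    ANY = 1
    LOCAL_ONE = 2
    EACH_QUORUM = 3
    QUORUM = 4
    ALL = 5


# Partial stand-in for consistency level policy 50 (hypothetical key names).
POLICY = {
    "performance": {
        ("transaction_type", "Read"): Level.ANY,
        ("db_instance", "Local"): Level.LOCAL_ONE,
        ("db_client", "Single"): Level.ANY,
    },
    "availability": {
        ("transaction_type", "Read/Write"): Level.QUORUM,
        ("db_instance", "Global"): Level.ALL,
        ("db_client", "Global"): Level.ALL,
    },
}


def component_levels(vector: dict[str, str], target: str) -> list[Level]:
    """Look up one component consistency level per vector component."""
    return [POLICY[target][(name, value)] for name, value in vector.items()]


def highest(levels: list[Level]) -> Level:
    return max(levels)


def majority(levels: list[Level]) -> Level:
    counts = Counter(levels)
    top = max(counts.values())
    # Assumed tie-break: prefer the higher level among equally frequent ones.
    return max(level for level, count in counts.items() if count == top)
```

With the vector of examples a) and c), `majority` yields ANY while `highest` yields LOCAL_ONE, which shows that the two selection rules can lead to different results for the same access pattern.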
Figure 5 summarizes the steps carried out at entity 100 for determining the consistency level for access transactions of a data element 80. In step S51 entity 100 receives the information about the access transactions as monitored by monitor 70. In step S52 the access patterns are determined using the different access parameters as discussed above in connection with Figures 3 and 4. Based on the access parameters, a consistency level is determined for the access transactions of the data element in step S53, taking into account the determined access patterns and the consistency level policies as shown in Figure 4.
Last but not least the determined consistency level for the access transaction of the data element is transmitted in S54 to the different clients 200 so that the clients can use the adapted consistency level for future access requests.
Figure 6 summarizes the steps carried out at a client. In step S61, one of the clients 200 accessing a data element 80 transmits an access request for accessing the data element 80, the access request being accompanied by a first consistency level stored in the client for the corresponding data element. In step S62 the client receives an updated consistency level for the data element. This updated consistency level is stored in step S63 and is used for a new access request sent by the client to the distributed database. Optionally, in a further step not shown in Figure 6, the client can provide feedback to the entity 100 determining the consistency level as mentioned above.
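The client-side handling of steps S61 to S63 could look like the following Python sketch. The class, the request dictionary layout and the default level are hypothetical and do not reflect any particular database protocol.

```python
class ConsistencyCache:
    """Per-client store of consistency levels, keyed by target identifier."""

    def __init__(self, default_level: str = "QUORUM"):
        self._levels: dict[str, str] = {}
        self._default = default_level   # assumed first consistency level

    def level_for(self, target_id: str) -> str:
        return self._levels.get(target_id, self._default)

    def on_update(self, target_id: str, level: str) -> None:
        """S62/S63: store an updated consistency level received from the entity."""
        self._levels[target_id] = level

    def build_request(self, target_id: str, operation: str) -> dict[str, str]:
        """S61: build an access request carrying the currently stored level."""
        return {
            "target": target_id,
            "op": operation,
            "consistency": self.level_for(target_id),
        }
```

A first request for a target carries the default level; after `on_update` has been called, the second request for the same target carries the updated level.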
By way of example when the client notices that the performance of the client application degrades with the used consistency level or when the availability of the database transactions degrade, the client application can inform entity 100 accordingly so that the entity 100 can use this feedback for the further determination of the consistency level. Different options exist at the entity to adapt the consistency level based on the feedback. The previously used consistency level could be transmitted or the next higher or next lower consistency level could be transmitted to the clients dependent on the fact if the performance or availability degraded.
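One of the feedback options named above, moving to the next higher or next lower consistency level, is sketched below in Python. The mapping of degraded performance to a lower level and degraded availability to a higher level is an assumption, as is the function name.

```python
# Consistency levels from lowest to highest, as stated for Figure 4.
LEVELS = ["ANY", "LOCAL_ONE", "EACH_QUORUM", "QUORUM", "ALL"]


def adapt_level(current: str, degraded: str) -> str:
    """Return an adapted consistency level in response to client feedback."""
    index = LEVELS.index(current)
    if degraded == "performance":
        index = max(index - 1, 0)                 # assumed: relax consistency
    elif degraded == "availability":
        index = min(index + 1, len(LEVELS) - 1)   # assumed: tighten consistency
    else:
        raise ValueError(f"unknown feedback kind: {degraded}")
    return LEVELS[index]
```

At the ends of the scale the level is left unchanged, so feedback about degraded performance at level ANY returns ANY again.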
Figure 7 shows a schematic architectural view of the entity 100 which can determine an updated consistency level as discussed above. The entity 100 comprises a transceiver 110 which is provided for transmitting user data or control messages and which is provided for receiving user data or control messages from other entities. By way of example, the transceiver 110 is used to receive the monitored access transactions and may be used to transmit the updated consistency level to the clients 200. The entity 100 furthermore comprises a processing unit 120 which is responsible for the operation of the entity 100. The processing unit 120 can comprise one or more processors and can carry out instructions stored on a memory 130, wherein the memory can include a read-only memory, a random access memory, mass storage, hard disk or the like. The memory can furthermore include suitable program code to be executed by the processing unit 120 so as to implement the above described functionalities of entity 100.
Figure 8 shows an alternative architectural view of an entity configured to determine a consistency level, here entity 500. The entity comprises a module 510 for determining the access transactions which can be provided by the monitor 70 as discussed above in Figure 1. The entity 500 comprises a module 520 determining the access patterns for the data element based on the different access parameters. With a module 530 the consistency level is determined for the access transactions of the data element as discussed above in connection with Figures 3 and 4. A module 540 is provided for transmitting the consistency level to the different clients. In general, the entity 100 or entity 500 can directly transmit the updated consistency level to the clients, or entity 100/500 transmits the determined consistency levels in the direction of the clients, wherein additional nodes of a network such as a mobile communications network may be involved in between.
Figure 9 shows a schematic architectural view of a client 200 which requests access to a data element and which uses an updated consistency level for accessing the data element. The client 200 comprises a transceiver 210 configured to receive user data or control messages from other entities and configured to transmit user data or control messages to other entities. Transceiver 210 can be used to receive the updated consistency level from entity 100. The client can comprise a processing unit 220 which is responsible for the operation of the client 200. The processing unit 220 comprises one or more processors and can carry out instructions stored on a memory 230, wherein the memory may include a read-only memory, a random access memory, mass storage, hard disk or the like. The processing unit 220 is furthermore configured to carry out the application, such as a virtual network function, provided by the client. The memory can include suitable program code to be executed by the processing unit 220 so as to implement the above described functionalities in which the client is involved.
Figure 10 shows a further schematic view of a client 600 wherein the client 600 comprises a module 610 for transmitting the access request with a first consistency level to the distributed database. A module 620 is provided which is configured to receive the updated consistency level from entity 100 or 300. Furthermore, a module 630 is provided for storing and using the updated consistency level for a second access request to the same data element.
From the above, some general conclusions can be drawn. As far as entity 100 is concerned, when the access pattern is determined, the entity may carry out the steps of determining an access pattern vector with a plurality of vector components. Each of the vector components corresponds to one of the determined access parameters. The preconfigured configuration data then comprise at least one component consistency level for each of the vector components as shown in Figure 4, and the consistency level for the access transaction of the data element can then be determined based on the component consistency levels determined for the vector components of the access pattern vector. As discussed above in connection with Figure 4 the access pattern vector is calculated based on the different consistency levels provided for the vector components.
In this context the determination of the consistency level for the access transaction of the data element can comprise the step of determining a highest component consistency level present for the different vector components of the access pattern vector. This highest component consistency level may then be used as consistency level for the access transaction of the data element. In an alternative it is possible to determine the component consistency levels for each of the access vector components and to determine the most frequent or a mean component consistency level occurring for the different access vector components. The most frequent component consistency level or the mean component consistency level is then used for the consistency level of the access transaction of the data element.
Furthermore, it is possible that the preconfigured configuration data comprises for each of the vector components two different component consistency levels optimized for different optimization criteria. One optimization criterion can be the latency needed to access the data element by the clients, indicated as performance in Figure 4. Another optimization criterion can be the probability that the access of the data element by the client returns the value of the most recent write of the data element. This is indicated in the right column of Figure 4 by availability.
The access pattern can be monitored over time, and then the consistency level for the access transaction of the data element is only determined when a volatility of the access pattern over time is smaller than a threshold.
Furthermore, it is possible to suspend the monitoring of the access transactions after the consistency level for the access transaction of the data element has been determined and transmitted to the client once. The monitoring can be restarted only when a predefined criterion is met, such as when a timer has expired or when a feedback is received from the clients 200 that a new adaptation might be necessary.
The monitoring of the access transactions can comprise the step of determining an access to the data element, wherein a target identifier is assigned to the determined access identifying the data element as target of the access. Furthermore, the target identifier can be stored in a registry 60 in which the different accesses to the data elements are stored for the different target identifiers with the corresponding access parameters. The different access parameters used for each of the accesses are stored in connection with the different target identifiers and a common access parameter is determined for each of the access parameters for each target identifier.
The access transactions can be received from different monitoring probes such as probes 315 provided on each of the database instances. Different examples of the access parameters were discussed above. One possible access parameter is a client parameter indicating whether the client requesting access to the data element comprises several client instances. A further parameter is a database parameter indicating whether the access to the data element comprises different database instances receiving the requests. A type parameter can indicate whether the access to the data element is a Read access, a Write access or a Read and Write access. A size parameter can indicate the data size of the transaction carried out when accessing the data element. Furthermore, a transaction latency parameter can be used indicating how long the access to the data element and a transmission of the requested access to the client takes. A synchronization parameter can indicate how long it takes to synchronize the data element in between the different database instances.
Furthermore, it is possible that entity 100 determining the consistency level receives a transaction feedback from at least one of the clients 200. The transaction feedback indicates a degradation of the performance or the availability of the database transactions. The consistency level can then be adapted taking into account the received transaction feedback.
As far as the client accessing the data element is concerned the client can furthermore determine a quality of a transaction carried out with the data element received in response to the second access request, in which the new and updated consistency level was used. When the determined quality of the transaction is lower than a quality threshold an indication about the determined quality can be transmitted from the corresponding client to the entity 100 configured to determine the consistency level.
The above described invention allows a dynamic adaptation of the consistency level for distributed database transactions based on runtime data access patterns and transaction performance. The performance is improved for distributed database transactions as the minimum viable consistency level can be determined and used. This also provides a higher capacity for client applications storing data in the distributed database and decreases the database load due to the relaxed replication requirements in view of the lower consistency level.

Claims

1. A method for determining a consistency level for access transactions of a data element (80) stored in a distributed database (300) comprising a plurality of database instances (310- 380), wherein the data element (80) is stored in at least some of the plurality of database instances and is accessed by a plurality of clients (200), the method comprising at an entity (100) determining the consistency level:
- receiving information about access transactions carried out on the data element (80) in the distributed database (300) by the plurality of clients (200) including receiving different access parameters used by the plurality of clients (200) when accessing the data element (80),
- determining an access pattern for the data element (80) based on the different access parameters used by the plurality of clients (200),
- determining a consistency level for the access transactions of a data element (80) based on the determined access pattern for the data element (80) and based on preconfigured configuration data (50) which relates the access patterns to the consistency levels,
- transmitting the determined consistency level for the access transaction of the data element (80) to the plurality of clients (200) for a future use by the client when accessing the data element (80).
2. The method according to claim 1, wherein determining the access pattern comprises determining an access pattern vector with a plurality of vector components, each vector component corresponding to one of the determined access parameters, wherein the preconfigured configuration data comprises at least one component consistency level for each of the vector components, wherein the consistency level for the access transaction of the data element is determined based on the component consistency levels determined for the vector components of the access pattern vector.
3. The method according to claim 2, wherein determining the consistency level of the access transaction of the data element comprises at least one of the following:
- determining a highest component consistency level present for the different vector components of the access pattern vector, wherein the highest component consistency level is used for the consistency level of the access transaction of the data element (80),
- determining the component consistency levels for each of the access vector components and determining the most frequent component consistency level occurring for the different access vector components, wherein the most frequent component consistency level is used for the consistency level of the access transaction of the data element (80).
4. The method according to claim 2 or 3, wherein the preconfigured configuration data comprises for each of the vector components at least two component consistency levels optimized for different optimization criteria, one of the optimization criteria being a latency needed to access the data element by the clients, another optimization criteria being the probability that the access of the data element by the clients returns the value of the most recent write of the data element.
5. The method according to any of the preceding claims, wherein the access pattern is monitored over time, wherein the consistency level for the access transaction of the data element (80) is only determined when a volatility of the access pattern over time is smaller than a threshold.
6. The method according to any of the preceding claims, wherein the monitoring of the access transactions is suspended after the consistency level for the access transaction of the data element (80) is determined and is transmitted to the plurality of clients one time, wherein the monitoring is only restarted when a predefined criterion is met.
7. The method according to any of the preceding claims, wherein monitoring the access transactions comprises
- determining an access to the data element (80),
- assigning a target identifier to the determined access identifying the data element as target of the access,
- storing the target identifier in a registry (60) in which the different accesses to data elements (80) of the distributed database (300) are stored for the different target identifiers with the corresponding access parameters, wherein the different access parameters used for each of the accesses are stored in connection with the different target identifiers, wherein a common access parameter is determined for each of the access parameters for each target identifier.
8. The method according to any of the preceding claims, wherein the access transactions are received from different monitoring probes (315) provided on each of the database instances (310 - 380).
9. The method according to any of the preceding claims, wherein the different access parameters comprise at least one of the following parameters: a client parameter indicating whether the client requesting access to the data element comprises several client instances, a database parameter indicating whether the access to the data element comprises multiple database instances receiving the requests, a type parameter indicating whether the access to the data element is read access, a write access or a read and write access, a size parameter indicating a data size of a transaction carried out when accessing the data element, a transaction latency parameter indicating how long the access to the data element and a transmission of the requested access to the client takes, a synchronization parameter indicating how long it takes to synchronize the data element in between the different database instances.
10. The method according to any of the preceding claims, further the entity (100) determining the consistency level receiving a transaction feedback from at least one of the clients (200), the transaction feedback indicating a degradation of performance or availability of the database transactions, wherein the consistency level is adapted taking into account the received transaction feedback.
11. A method for accessing a data element (80) stored in a distributed database (300) with a plurality of database instances (310 - 380), wherein the data element (80) is stored in at least some of the plurality of database instances, the method comprising at a client (200):
- transmitting an access request to the distributed database (300) for accessing the data element (80), the access request comprising a first consistency level to be met by the data element to be received by the client in response to the request,
- receiving an updated consistency level for the access transaction of the data element from an entity configured to determine consistency levels (100) for the access transaction of the data elements stored in the distributed database,
- storing the updated consistency level for the access transaction of the data element, wherein when the data element (80) has to be accessed in the distributed database (300) for a second time, a second access request is transmitted to the distributed database for accessing the data element, the second access request comprising the updated consistency level.
12. The method according to claim 11, further determining a quality of a transaction carried out with the data element received in response to the second access request, wherein when the determined quality of the transaction is lower than a defined quality threshold, an indication about the determined quality is transmitted from the client (200) to the entity (100) configured to determine the consistency levels.
13. An entity (100) configured to determine a consistency level for the access transaction of a data element stored in a distributed database comprising a plurality of database instances, wherein the data element is stored in at least some of the plurality of database instances and is accessible by a plurality of clients, the entity comprising at least one processing unit (120) and a memory (130) containing instructions executable by the at least one processing unit, wherein the entity is operative to:
- receive information about access transactions carried out on the data element in the distributed database by the plurality of clients including receiving different access parameters used by the plurality of clients when accessing the database element,
- determine an access pattern for the data element based on the different access parameters used by the plurality of clients,
- determine a consistency level for the access transaction of the data element based on the determined access pattern for the data element and based on preconfigured configuration data which relates the access patterns to the consistency levels,
- transmit the determined consistency level for the access transaction of the data element to the plurality of clients for a future use by the client when accessing the data element.
14. The entity according to claim 13, further being operative, for determining the access pattern, to determine an access pattern vector with a plurality of vector components, each vector component corresponding to one of the determined access parameters, wherein the preconfigured configuration data comprises at least one component consistency level for each of the vector components, and to determine the consistency level for the access transaction of the data element based on the component consistency levels determined for the vector components of the access pattern vector.
15. The entity according to claim 14, further being operative, for determining the consistency level of the access transaction of the data element, to carry out at least one of the following steps:
- determining a highest component consistency level present for the different vector components of the access pattern vector, wherein the highest component consistency level is used for the consistency level of the access transaction of the data element (80),
- determining the component consistency levels for each of the access vector components and determining the most frequent component consistency level occurring for the different access vector components, wherein the most frequent component consistency level is used for the consistency level of the access transaction of the data element (80).
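By way of illustration only, the two selection rules of claims 14 and 15 can be sketched as follows. The consistency level names, their ordering, the access parameter names and the contents of `CONFIG` are assumptions made for this sketch; the claims do not specify them. The tie-break in `most_frequent_level` is likewise an assumption.

```python
from collections import Counter
from enum import IntEnum


class Consistency(IntEnum):
    # Hypothetical levels and ordering; not named in the claims.
    ONE = 1
    QUORUM = 2
    ALL = 3


# Hypothetical preconfigured configuration data: for each vector component
# (access parameter), an observed value maps to a component consistency level.
CONFIG = {
    "operation": {"read": Consistency.ONE, "write": Consistency.QUORUM},
    "client_count": {"single": Consistency.ONE, "many": Consistency.ALL},
    "locality": {"local": Consistency.ONE, "remote": Consistency.QUORUM},
}


def component_levels(pattern_vector):
    """Look up one component consistency level per vector component."""
    return [CONFIG[name][value] for name, value in pattern_vector.items()]


def highest_level(pattern_vector):
    """First rule: use the highest component consistency level present."""
    return max(component_levels(pattern_vector))


def most_frequent_level(pattern_vector):
    """Second rule: use the most frequent component consistency level."""
    counts = Counter(component_levels(pattern_vector))
    top = max(counts.values())
    # Assumption: on a tie, prefer the stricter level.
    return max(level for level, n in counts.items() if n == top)
```

For an access pattern vector such as `{"operation": "write", "client_count": "many", "locality": "remote"}`, the first rule yields the strictest level required by any component, whereas the second rule yields the level that most components agree on.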
16. The entity according to any of claims 13 to 15, further being operative to monitor the access pattern over time, and to determine the consistency level for the access transaction of the data element (80) only when a volatility of the access pattern over time is smaller than a threshold.
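A minimal sketch of the volatility check of claim 16 is given below. The claim does not define how volatility is measured; this sketch assumes it is the fraction of consecutive access pattern snapshots that differ from one another. The function names and the `determine` callback are hypothetical.

```python
def volatility(history):
    """Fraction of consecutive snapshots whose access pattern differs.

    `history` is a chronological list of access pattern snapshots.
    This metric is an assumption made for the sketch.
    """
    if len(history) < 2:
        return 1.0  # too little data: treat the pattern as maximally volatile
    changes = sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)
    return changes / (len(history) - 1)


def determine_if_stable(history, threshold, determine):
    """Determine a consistency level only while the pattern is stable.

    `determine` is a hypothetical callback that maps the latest access
    pattern to a consistency level. Returns None when the pattern is
    too volatile to decide.
    """
    if volatility(history) < threshold:
        return determine(history[-1])
    return None
```

Returning `None` for a volatile pattern leaves the previously distributed consistency level in force, which avoids repeatedly pushing changing levels to the clients.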
17. The entity according to any of claims 13 to 16, further being operative to suspend the monitoring of the access transactions after the consistency level for the access transaction of the data element (80) is determined and is transmitted to the plurality of clients one time, and to restart the monitoring only when a predefined criterion is met.
18. The entity according to any of claims 13 to 17, further being operative, for monitoring the access transactions, to:
- determine an access to the data element (80),
- assign a target identifier to the determined access identifying the data element as target of the access,
- store the target identifier in a registry (60) in which the different accesses to data elements (80) of the distributed database (300) are stored for the different target identifiers with the corresponding access parameters, wherein the different access parameters used for each of the accesses are stored in connection with the different target identifiers, and to determine a common access parameter for each of the access parameters for each target identifier.
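The registry of claim 18 can be sketched, under stated assumptions, as a mapping from target identifiers to the recorded accesses. The class name `Registry`, the parameter names and the choice of the most frequently observed value as the "common access parameter" are assumptions for illustration; the claim does not fix how the common parameter is derived.

```python
from collections import Counter, defaultdict


class Registry:
    """Stores accesses per target identifier with their access parameters."""

    def __init__(self):
        # target identifier -> list of access parameter dicts
        self._accesses = defaultdict(list)

    def record(self, target_id, **access_params):
        """Store one determined access under its target identifier."""
        self._accesses[target_id].append(access_params)

    def common_parameters(self, target_id):
        """Return, per access parameter, the most frequently observed value.

        Assumption: "common" is interpreted as the statistical mode.
        """
        per_param = defaultdict(Counter)
        for access in self._accesses.get(target_id, []):
            for name, value in access.items():
                per_param[name][value] += 1
        return {
            name: counts.most_common(1)[0][0]
            for name, counts in per_param.items()
        }
```

A possible use: call `record("element-80", operation="read", client="c1")` for every access reported by a monitoring probe, then feed the result of `common_parameters("element-80")` into the access pattern determination.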
19. The entity according to any of claims 13 to 18, further being operative to receive the access transactions from different monitoring probes (315) provided on each of the database instances (310 - 380).
20. The entity according to any of claims 13 to 19, further being operative to receive a transaction feedback from at least one of the clients (200), the transaction feedback indicating a degradation of performance or availability of the database transactions, and to adapt the consistency level taking into account the received transaction feedback.
21. A client (200) configured to access a data element (80) stored in a distributed database (300) with a plurality of database instances (310 - 380), wherein the data element (80) is stored in at least some of the plurality of database instances, the client comprising at least one processing unit (220) and a memory (230) containing instructions executable by the at least one processing unit, wherein the client is operative to:
- transmit an access request to the distributed database (300) for accessing the data element (80), the access request comprising a first consistency level to be met by the data element to be received by the client in response to the request,
- receive an updated consistency level for the access transaction of the data element from an entity configured to determine consistency levels (100) for the access transaction of the data elements stored in the distributed database,
- store the updated consistency level for the access transaction of the data element, wherein when the data element (80) has to be accessed in the distributed database (300) for a second time, a second access request is transmitted to the distributed database for accessing the data element, the second access request comprising the updated consistency level.
22. The client according to claim 21, further being operative to determine a quality of a transaction carried out with the data element received in response to the second access request, wherein, when the determined quality of the transaction is lower than a defined quality threshold, the client is operative to transmit an indication about the determined quality from the client (200) to the entity (100) configured to determine the consistency levels.
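The client behaviour of claims 21 and 22 may be sketched as follows. The class `Client`, the callbacks `send_request` and `send_feedback`, and the numeric quality scale are hypothetical and stand in for the transport towards the distributed database (300) and the entity (100); they are not taken from the claims.

```python
class Client:
    """Client that attaches a per-element consistency level to requests."""

    def __init__(self, send_request, send_feedback, first_level, quality_threshold):
        self._send_request = send_request      # hypothetical: (element_id, level) -> result
        self._send_feedback = send_feedback    # hypothetical: (element_id, quality) -> None
        self._first_level = first_level        # level used before any update arrives
        self._quality_threshold = quality_threshold
        self._levels = {}                      # element_id -> updated consistency level

    def on_level_update(self, element_id, level):
        """Store an updated consistency level received from the entity."""
        self._levels[element_id] = level

    def access(self, element_id):
        """Send an access request carrying the stored level, if any."""
        level = self._levels.get(element_id, self._first_level)
        return self._send_request(element_id, level)

    def report_quality(self, element_id, quality):
        """Notify the entity when quality falls below the threshold."""
        if quality < self._quality_threshold:
            self._send_feedback(element_id, quality)
            return True
        return False
```

In this sketch the first request for a data element carries `first_level`, and every later request carries whatever level the entity last pushed for that element.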
23. A system comprising an entity as mentioned in any of claims 13 to 20, and a client as mentioned in claims 21 or 22.
24. A computer program comprising program code to be executed by at least one processing unit of an entity (100) determining a consistency level or of a client, wherein execution of the program code causes the at least one processing unit to execute a method according to any of claims 1 to 12.
25. A carrier comprising the computer program of claim 24, wherein the carrier is one of an electronic signal, optical signal, radio signal, or computer readable storage medium.
PCT/EP2018/054505 2018-02-23 2018-02-23 Dynamic determination of consistency levels for distributed database transactions WO2019161908A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP2018/054505 WO2019161908A1 (en) 2018-02-23 2018-02-23 Dynamic determination of consistency levels for distributed database transactions

Publications (1)

Publication Number Publication Date
WO2019161908A1 (en) 2019-08-29

Family

ID=61569231

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2018/054505 WO2019161908A1 (en) 2018-02-23 2018-02-23 Dynamic determination of consistency levels for distributed database transactions

Country Status (1)

Country Link
WO (1) WO2019161908A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140279855A1 (en) * 2013-03-15 2014-09-18 International Business Machines Corporation Differentiated secondary index maintenance in log structured nosql data stores
US20160179827A1 (en) * 2014-12-19 2016-06-23 International Business Machines Corporation Isolation anomaly quantification through heuristical pattern detection

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HOUSSEM-EDDINE CHIHOUB ET AL: "Harmony: Towards Automated Self-Adaptive Consistency in Cloud Storage", CLUSTER COMPUTING (CLUSTER), 2012 IEEE INTERNATIONAL CONFERENCE ON, IEEE, 24 September 2012 (2012-09-24), pages 293 - 301, XP032266174, ISBN: 978-1-4673-2422-9, DOI: 10.1109/CLUSTER.2012.56 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113220235A (en) * 2021-05-17 2021-08-06 北京青云科技股份有限公司 Read-write request processing method, device, equipment and storage medium
CN113220235B (en) * 2021-05-17 2024-02-06 北京青云科技股份有限公司 Read-write request processing method, device, equipment and storage medium
CN116226153A (en) * 2023-05-05 2023-06-06 中国工商银行股份有限公司 Data updating method and device, processor and electronic equipment
CN116226153B (en) * 2023-05-05 2023-08-11 中国工商银行股份有限公司 Data updating method and device, processor and electronic equipment

Similar Documents

Publication Publication Date Title
US11687555B2 (en) Conditional master election in distributed databases
US11886731B2 (en) Hot data migration method, apparatus, and system
US9971823B2 (en) Dynamic replica failure detection and healing
US10255148B2 (en) Primary role reporting service for resource groups
US10534776B2 (en) Proximity grids for an in-memory data grid
EP3352433B1 (en) Node connection method and distributed computing system
US8612413B2 (en) Distributed data cache for on-demand application acceleration
US8463788B2 (en) Balancing caching load in a peer-to-peer based network file system
JP4679584B2 (en) Routing service queries in overlay networks
KR20150132859A (en) Automatic tuning of virtual data center resource utilization policies
JP2004246852A (en) Method and apparatus for adjusting performance of logical volume copy destination
JP2015095149A (en) Management program, management method, and management device
CN110196860B (en) Unique identifier allocation method and device, electronic equipment and storage medium
US11068461B1 (en) Monitoring key access patterns for nonrelational databases
US8914582B1 (en) Systems and methods for pinning content in cache
US20030014507A1 (en) Method and system for providing performance analysis for clusters
CN112307119A (en) Data synchronization method, device, equipment and storage medium
US8447730B1 (en) Probe system for replication monitoring
CN114238518A (en) Data processing method, device, equipment and storage medium
EP2568386A1 (en) Method for accessing cache and fictitious cache agent
WO2019161908A1 (en) Dynamic determination of consistency levels for distributed database transactions
CN110381136A (en) A kind of method for reading data, terminal, server and storage medium
US9015371B1 (en) Method to discover multiple paths to disk devices cluster wide
EP3685567B1 (en) Load shedding of traffic based on current load state of target capacity
JP4375121B2 (en) Processing agent method in database management system

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18708939

Country of ref document: EP

Kind code of ref document: A1