WO2021184761A1 - Data access method and apparatus, and data storage method and apparatus - Google Patents

Data access method and apparatus, and data storage method and apparatus

Info

Publication number
WO2021184761A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
relational database
query
database
written
Prior art date
Application number
PCT/CN2020/124253
Inventor
欧霄
黄东庆
刘骏健
李建东
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2021184761A1
Priority to US17/696,576 (US20220207036A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2379 Updates performed during online database operations; commit processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/275 Synchronous replication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This application relates to the field of database technology, in particular to a data access method and device, and a data storage method and device.
  • Massive amounts of data are generated on the Internet every day.
  • Relational databases (DB, database) such as MySQL and Oracle are mainly used to store this data.
  • Relational databases use Structured Query Language for data queries and support adding, deleting, modifying, and querying data in the database, as well as cross-table queries. They are convenient to use and easy to understand, and are increasingly widely used.
  • Relational databases store data in the form of a single data table.
  • The performance limits of single-table reads and writes cannot keep up with the growing volume of data access, which affects normal data access and gradually becomes a bottleneck for business development.
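  • A data access method includes:
  • when the query condition meets a preset high-frequency access condition, querying data that meets the query condition in the non-relational database;
  • when the query condition meets a preset complex access condition, querying data that meets the query condition in the relational database;
  • where the data stored in the non-relational database is consistent with the data stored in the relational database.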
  • A data access device includes:
  • a data read request module, used to parse a data read request to obtain the query condition when the data read request is received;
  • a data query module, used to query data that meets the query condition in the non-relational database when the query condition meets the preset high-frequency access condition, and to query data that meets the query condition in the relational database when the query condition meets the preset complex access condition, where the data stored in the non-relational database and the relational database is consistent; and
  • a query result feedback module, used to respond to the data read request based on the queried data.
  • A computer device includes a memory and one or more processors.
  • The memory stores computer-readable instructions.
  • When the computer-readable instructions are executed, the one or more processors perform the steps of the data access method.
  • One or more non-volatile computer-readable storage media store computer-readable instructions.
  • When executed by one or more processors, the computer-readable instructions cause the one or more processors to perform the steps of the data access method.
  • A data storage method includes:
  • parsing the data write request to obtain the data to be written;
  • where the types of database include relational databases and non-relational databases; the non-relational database is used to respond to data read requests that meet a preset high-frequency access condition, and the relational database is used to respond to data read requests that meet a preset complex access condition.
  • A data storage device includes:
  • a data write request module, used to parse a data write request to obtain the data to be written when the data write request is received;
  • a quasi-real-time data synchronization module, used to write the data to be written into one type of database and to synchronize that data from the one type of database to the other type of database, where the types of database include relational databases and non-relational databases; the non-relational database is used to respond to data read requests that meet a preset high-frequency access condition, and the relational database is used to respond to data read requests that meet a preset complex access condition; and
  • a write result feedback module, configured to respond to the data write request based on the write result of the data to be written.
  • A computer device includes a memory and one or more processors.
  • The memory stores computer-readable instructions.
  • When the computer-readable instructions are executed, the one or more processors perform the steps of the data storage method.
  • One or more non-volatile computer-readable storage media store computer-readable instructions.
  • When executed by one or more processors, the computer-readable instructions cause the one or more processors to perform the steps of the data storage method.
  • FIG. 1 is an application environment diagram of a data access and storage method in an embodiment.
  • FIG. 2 is a schematic flowchart of a data access method in an embodiment.
  • FIG. 3 is a schematic diagram of the principle of a data access method in an embodiment.
  • FIG. 4 is a schematic diagram of a relational database based on a master-slave protection mechanism in an embodiment.
  • FIG. 5 is a schematic flowchart of a data access method in a specific embodiment.
  • FIG. 6 is a schematic flowchart of a data storage method in an embodiment.
  • FIG. 7 is a schematic diagram of the principle of a data storage method in an embodiment.
  • FIG. 8 is a flow diagram of quasi-real-time synchronization between a relational database and a non-relational database in response to a data update request in an embodiment.
  • FIG. 9 is a flow diagram of quasi-real-time synchronization between a relational database and a non-relational database in response to a data insertion request in an embodiment.
  • FIG. 10 is a schematic flowchart of reconciliation between a relational database and a non-relational database in an embodiment.
  • FIG. 11 is a schematic flowchart of a data storage method in a specific embodiment.
  • FIG. 12 is a schematic structural diagram of a data access device in an embodiment.
  • FIG. 13 is a schematic structural diagram of a data access device in another embodiment.
  • FIG. 14 is a schematic structural diagram of a data storage device in an embodiment.
  • FIG. 15 is a schematic structural diagram of a data storage device in another embodiment.
  • FIG. 16 is an internal structure diagram of a data service device implemented as a server in an embodiment.
  • The data access and data storage methods provided in this application can be applied to the application environment shown in FIG. 1.
  • The data access device 110 and the data service device 120 are connected directly or indirectly through wired or wireless communication.
  • The data service device 120 is deployed with a corresponding database 130.
  • The database 130 includes a relational database 130a and a non-relational database 130b.
  • One or more non-relational databases 130b may be deployed.
  • The data access party can initiate a data read request or a data write request to the data service device 120 through the data access device 110.
  • The data service device 120 provides data scheduling services for data access parties: it queries data from the relational database 130a or the non-relational database 130b according to a data read request, or writes data to the relational database 130a and the non-relational database 130b according to a data write request, and it ensures data consistency between the relational database 130a and the non-relational database 130b.
  • The data access device 110 may specifically be a terminal, a server, or a combination of a terminal and a server.
  • For example, the data access party initiates a data read request to the server through the terminal, and the server forwards the data read request to the data service device 120.
  • The data service device 120 may be a server.
  • The terminal may specifically be a smartphone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc., but is not limited to these.
  • The server can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.
  • A data access method is provided. Taking the method as applied to the data service device 120 in FIG. 1 as an example, the method includes the following steps:
  • Step 202: When a data read request is received, the data read request is parsed to obtain the query condition.
  • The data access device runs a target application that can execute the data access and data storage methods provided in this application, such as a social application, a payment application, or a game application.
  • The target application can be a parent application or a sub-application (Mini Program).
  • The parent application may be a native app (application) or a web page accessed through a browser.
  • The data read request may be a data request that the data access party initiates through the target application on the data access device, instructing the data service device to perform a read operation on the data in the database.
  • The query condition is the query field entered in the target application when the data access party initiates a data read request.
  • FIG. 3 shows a schematic diagram of the principle of a data access method in an embodiment.
  • When data needs to be queried, the data access party can set the query condition in the target application on the data access device.
  • The query condition can include one query field or multiple query fields.
  • The target application generates a data read request according to the query fields and sends the data read request to the data service device.
  • The data service device is deployed with a corresponding database. After receiving the data read request, the data service device parses it, obtains the query condition, and provides data scheduling services for the data access party according to the query condition.
  • Step 204: When the query condition meets the preset high-frequency access condition, query data that meets the query condition in the non-relational database.
  • A database (DB), also known as a data management system, can be regarded in short as an electronic file cabinet: a place where electronic files are stored, in which users can add, retrieve, update, and delete data.
  • A database is a collection of data that is stored together in a certain way, can be shared by multiple users, has as little redundancy as possible, and is independent of the application.
  • The database in the usual sense refers to the data storage carrier in information systems. According to their data storage structures, databases can be divided into relational databases and non-relational databases.
  • A non-relational database (Not Only SQL, NoSQL) drops the relational characteristics of a relational database: there are no relationships between the data. Because of this, a non-relational database has a simple structure and very high read and write performance, and can meet high-frequency access requirements.
  • Data is handled using an aggregation model. Aggregation models mainly include key-value pairs (Key-Value), BSON, column families, documents, graphs, and so on.
  • The non-relational database used in the embodiments of the present application is mainly a key-value storage database.
  • A key-value storage database stores data as a collection of key-value pairs, in which the key serves as a unique identifier that points to the value.
  • The value is unstructured and is usually treated only as a string or binary data; how to interpret it is up to the user.
  • The key-value storage database may specifically be KV (Ktable, a distributed storage system), Redis (Remote Dictionary Server), DynamoDB, Apache Cassandra, etc.
  • KV is a distributed storage system developed in-house by WeChat, comprising a KV server and a KV client. It also provides a set of structured data semantic libraries, such as SQL-like (Structured Query Language) data manipulation interfaces: Select, Update, Insert, Delete, etc.
  • Redis is an open-source, log-type key-value storage database written in ANSI C. It supports network access, can be memory-based or persistent, and provides APIs in multiple languages. Redis supports many value types, including string, list (linked list), set, zset (sorted set), and hash.
  • Data access scenarios can be divided into simple access scenarios and complex access scenarios according to the complexity of the query condition.
  • Simple access scenarios perform high-frequency data read and write operations based on the primary identifier of a data record; complex access scenarios use multiple attributes of the data records as conditions, filter out the data set that meets all of them, and perform read and write operations on it.
  • Both scenarios appear in the business scenarios of a large number of target applications, but they differ in how often they occur and in what they require of the data layer.
  • Simple access scenarios are often high-frequency and only need to filter data records by the unique identifier of the data.
  • Complex access scenarios are often not high-frequency, but require the data layer to support range filtering based on conditions over multiple fields.
  • The preset high-frequency access condition is a preset indicator used to identify the current data read request as coming from a high-frequency simple access scenario.
  • Specifically, it can be that the query condition includes only one query field and that field is the unique identifier of a data record (that is, the query primary key), or that the data channel used when the data access party initiates the data read request is the first channel, and so on.
  • The preset complex access condition is a preset indicator used to identify the current data read request as coming from a low-frequency complex access scenario. Specifically, it can be that the query condition contains more than one query field, or that the data channel used when the data access party initiates the data read request is the second channel, and so on.
  • The target application provides multiple access channels on the data access page, such as the first channel and the second channel.
  • The data service device determines, according to the first channel identifier carried in the data read request, that the request meets the preset high-frequency access condition.
  • When the query request meets the preset high-frequency access condition, the data service device generates the query statement corresponding to the query condition according to preset non-relational database access syntax rules, such as XML (Extensible Markup Language), and uses that query statement to query the non-relational database for data that meets the query condition.
  • Step 206: When the query condition meets the preset complex access condition, query data that meets the query condition in the relational database; the data stored in the non-relational database is consistent with the data stored in the relational database.
  • A relational database is a database that uses a relational model to organize data. It stores data in rows and columns, which makes it easy to understand.
  • A series of rows and columns in a relational database is called a table, and a group of tables constitutes a database. Each row of data in a table constitutes a record, and the data within each record is related.
  • Users can retrieve data in the database by defining one or more query fields. For example, by defining the "score" and "gender" fields, you can query the number of girls with a score of 80-90 in the student information table.
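  • The multi-field query in the example above can be sketched with Python's built-in `sqlite3` module. The table name `student`, its columns, and the sample rows are hypothetical and chosen only for illustration; the source does not specify a schema.

```python
import sqlite3

# In-memory database with a hypothetical student information table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "id INTEGER PRIMARY KEY, name TEXT, gender TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO student (name, gender, score) VALUES (?, ?, ?)",
    [("Ann", "F", 85), ("Bea", "F", 92), ("Cal", "M", 88), ("Dee", "F", 80)],
)

# Complex query condition: two query fields ("gender" and "score").
# BETWEEN is inclusive on both ends in SQL.
(girls_80_to_90,) = conn.execute(
    "SELECT COUNT(*) FROM student WHERE gender = ? AND score BETWEEN ? AND ?",
    ("F", 80, 90),
).fetchone()
```

Filtering on several attributes at once is the kind of request the text assigns to the relational database.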
  • The relational model can be understood simply as a two-dimensional table model, and a relational database is a data organization composed of two-dimensional tables and the relationships between them.
  • The mainstream relational databases include Oracle, DB2, MySQL, Microsoft SQL Server, Microsoft Access, etc.
  • Relational databases support low-concurrency data access operations and can filter data based on complex query conditions containing multiple query fields; non-relational databases support high-concurrency data access operations and can filter data by the unique query primary key (Key) on which their distributed storage is based.
  • The data service device determines, according to the second channel identifier carried in the data read request, that the request meets the preset complex access condition.
  • As shown in FIG. 3, when the query request meets the preset complex access condition, the data service device generates the query statement corresponding to the query condition according to preset relational database access syntax rules, such as SQL (Structured Query Language), and uses that query statement to query the relational database for data that meets the query condition.
  • The above data access method further includes: when the query condition is a preset routing field, determining that the query condition meets the preset high-frequency access condition; when the query condition includes query fields other than the routing field, determining that the query condition meets the preset complex access condition.
  • Both the key and the value can be anything from simple objects to complex composite objects, but the key is the only primary key by which the piece of data can be queried.
  • A social application may store all session-related data in a non-relational database.
  • Session data may include user profile information, messages, personalized data and topics, suggestions, targeted promotions and discounts, etc.
  • Each user session has a unique identifier, which is the unique primary key by which the session data can be queried in the non-relational database.
  • The routing field is the data identifier on which the non-relational database bases its distributed storage, that is, the unique query primary key by which data can be queried in the non-relational database.
  • The data service device determines whether the query field included in the query condition belongs to the routing field. If so, the data service device determines that the query condition meets the preset high-frequency access condition. Otherwise, the data service device determines that the query condition meets the preset complex access condition.
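  • The routing-field check described above can be sketched as a small classifier. This is a minimal illustration, not the patent's implementation: the function name, the return labels, and the routing field `user_id` are all hypothetical.

```python
from typing import Any, Dict

# Hypothetical routing field; the source does not name a concrete one.
ROUTING_FIELD = "user_id"


def classify_query(query_condition: Dict[str, Any],
                   routing_field: str = ROUTING_FIELD) -> str:
    """Decide which kind of database should serve a parsed query condition.

    Returns "non_relational" when the condition consists only of the routing
    field (high-frequency access condition), otherwise "relational"
    (complex access condition).
    """
    if not query_condition:
        raise ValueError("query condition must contain at least one field")
    if set(query_condition) == {routing_field}:
        return "non_relational"
    return "relational"
```

For instance, `classify_query({"user_id": 42})` selects the non-relational database, while any condition that mentions another field is sent to the relational database.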
  • The content of the data stored in the relational database and the non-relational database is kept consistent, so that the same data is returned whichever database is queried.
  • The consistency of the data in the relational database and the non-relational database can be checked regularly or irregularly, and data synchronization is performed when inconsistencies are found, to keep the data in the two databases consistent.
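  • A consistency check of this kind might look like the sketch below. Both stores are modeled as plain dictionaries keyed by record identifier, and the relational side is treated as the source of truth; that direction is an assumption made here for illustration, since this passage does not say which database wins when they disagree.

```python
from typing import Any, Dict, List


def reconcile(relational: Dict[Any, Any],
              non_relational: Dict[Any, Any]) -> List[Any]:
    """Make non_relational match relational; return the keys that were repaired.

    Assumption: the relational copy is authoritative.
    """
    repaired = []
    for key in sorted(set(relational) | set(non_relational)):
        if key not in relational:
            # Record exists only in the non-relational store: drop it.
            del non_relational[key]
            repaired.append(key)
        elif key not in non_relational or non_relational[key] != relational[key]:
            # Record is missing or stale in the non-relational store: copy it.
            non_relational[key] = relational[key]
            repaired.append(key)
    return repaired
```

Running such a check on a schedule corresponds to the "regular" checking mentioned above; triggering it on demand corresponds to the "irregular" case.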
  • Step 208: Respond to the data read request based on the queried data.
  • The data service device sends the data queried from the relational database or the non-relational database to the data access device in response to the data read request.
  • The data access party performs data access operations directly through the data scheduling service provided by the data service device and does not need to be concerned with its implementation details.
  • Relational databases support low-concurrency data access operations and can filter data based on complex query conditions; non-relational databases support high-concurrency data access operations and can filter data based on simple query conditions. The source of the queried data is chosen according to the data access scenario: high-concurrency data access scenarios query data from the non-relational database, and scenarios with low concurrency but complex query conditions query data from the relational database. This satisfies both high-concurrency and complex data access requests and improves data access performance.
  • The non-relational database includes a plurality of distributed sub-databases; when the query condition meets the preset high-frequency access condition, querying data that meets the query condition in the non-relational database includes: when the query condition is a routing field, determining the distributed interval to which the query field belongs; and querying data that meets the query condition in the distributed sub-database used to store data in that distributed interval.
  • Non-relational databases are highly partitionable and allow horizontal scaling to an extent that other types of databases cannot achieve. If existing partitions fill up and more storage space is required, the non-relational database allocates additional partitions to the table, thereby realizing distributed storage. In summary, non-relational databases can support distributed storage of massive amounts of data.
  • A non-relational database includes multiple distributed sub-databases.
  • The distributed interval is the value range of the routing field of the data stored in each distributed sub-database when the non-relational database performs distributed storage. For example, assuming that the data identifier (i.e. the routing field) of each piece of data is the user ID, the data of every 100 users can be stored in one distributed sub-database, such as storing the data of users [ID0, ID100] in distributed sub-database A1, the data of users [ID101, ID200] in distributed sub-database A2, and so on.
  • When it is determined that the query condition meets the preset high-frequency access condition, the data service device queries the data in the non-relational database. Specifically, the data service device determines the distributed interval to which the routing field in the query condition belongs, then determines the distributed sub-database used to store data in that interval, and queries that sub-database for data that meets the query condition.
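  • The interval lookup can be sketched with a sorted list of interval upper bounds and a binary search. The class name `ShardedKVStore`, its methods, and the use of integer user IDs are assumptions for illustration; each distributed sub-database is modeled as an in-memory dictionary.

```python
import bisect
from typing import Any, Dict, List, Sequence


class ShardedKVStore:
    """Key-value store split into sub-databases by routing-field interval."""

    def __init__(self, upper_bounds: Sequence[int]) -> None:
        # upper_bounds[i] is the largest routing value (inclusive) held by
        # sub-database i; the list must be sorted in ascending order.
        self.upper_bounds: List[int] = list(upper_bounds)
        self.shards: List[Dict[int, Any]] = [{} for _ in self.upper_bounds]

    def shard_index(self, routing_value: int) -> int:
        """Return the index of the sub-database whose interval contains the value."""
        index = bisect.bisect_left(self.upper_bounds, routing_value)
        if index == len(self.upper_bounds):
            raise KeyError(f"no distributed interval contains {routing_value}")
        return index

    def put(self, routing_value: int, record: Any) -> None:
        self.shards[self.shard_index(routing_value)][routing_value] = record

    def get(self, routing_value: int) -> Any:
        # Only one sub-database is searched, not the whole data set.
        return self.shards[self.shard_index(routing_value)][routing_value]
```

With `ShardedKVStore([100, 200])`, user IDs 0 to 100 map to the first sub-database and 101 to 200 to the second, mirroring sub-databases A1 and A2 in the example.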
  • Distributed storage can improve data security.
  • Querying data in distributed sub-databases reduces the amount of data processed per query, thereby improving query efficiency.
  • There are one or more non-relational databases; when the query condition meets the preset high-frequency access condition, querying the non-relational database for data that meets the query condition includes: when the query condition is a routing field, querying data that meets the query condition in the non-relational database that uses that routing field as the basis for distributed storage.
  • Each non-relational database has a corresponding routing field, and high-concurrency access requests can be made to a non-relational database based on its routing field. To achieve high-concurrency access on multiple fields, a separate non-relational database can be deployed for each field.
  • For example, non-relational databases A and B can use fields A and B as routing fields, respectively. It can be understood that the content of the data stored in each non-relational database is the same, but different fields are used as routing fields.
  • Each non-relational database can include multiple distributed sub-databases.
  • The data service device further determines which non-relational database's routing field the query field belongs to, and queries data in that non-relational database. In the example above, when a data read request containing only query field A is received, data can be queried in non-relational database A; when a data read request containing only query field B is received, data can be queried in non-relational database B.
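  • Selecting among several non-relational databases by routing field can be sketched as a dictionary lookup. The function name and the `field_a` / `field_b` keys are hypothetical; returning `None` here stands for "no non-relational database applies", in which case a caller would fall back to the relational database.

```python
from typing import Any, Dict, Optional


def pick_non_relational_store(query_condition: Dict[str, Any],
                              stores_by_routing_field: Dict[str, Any]) -> Optional[Any]:
    """Return the non-relational database whose routing field is the sole query field.

    Returns None when the condition has several fields or when no deployed
    non-relational database uses the query field as its routing field.
    """
    if len(query_condition) != 1:
        return None
    (field,) = query_condition  # the single query field name
    return stores_by_routing_field.get(field)
```

Each entry in `stores_by_routing_field` holds the same data content; only the field used for distributed storage differs.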
  • Deploying a corresponding non-relational database for each subdivided high-frequency access scenario makes it possible to satisfy the data access requests of multiple high-frequency access scenarios at the same time.
  • The relational database includes a master database and at least one slave database; querying data that meets the query condition in the relational database includes: initiating a query request to the master database based on the query condition; when no query response to the query request is received from the master database, determining one of the slave databases as the current master database; and querying data that meets the query condition in the current master database.
  • A highly available database is an overall system composed of a series of databases. At any time, at least one database node must be able to respond to users' data access requests and provide data services. Relational databases store data in single data tables, which places high demands on data security. Therefore, a master-slave high-availability mechanism is adopted for relational databases.
  • The highly available relational database system includes a master database and at least one slave database.
  • The master database responds to the data access requests initiated by the data access party and synchronizes the data access party's write operations to the slave databases.
  • The slave databases back up the data in the master database. It can be understood that both the master database and the slave databases are relational databases.
  • FIG. 4 shows a schematic diagram of a relational database based on a master-slave protection mechanism in an embodiment. As shown in FIG. 4, when the master database fails, the data service device determines a slave database as the new master database, either at random or by choosing the currently best-performing slave database, and then queries data that meets the query condition in the current master database.
  • When no query response from the master database is received within a preset time period, the data service device sends the query request to the master database again, retrying a preset number of times. If none of the requests is answered, the master database is determined to be faulty.
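  • The retry-then-failover behavior can be sketched as follows. Databases are modeled as plain callables that either return rows or raise a hypothetical `QueryTimeout`; `max_attempts` counts the total number of tries against the master, and the first slave is promoted, which is only one of the selection policies the text allows (random or best-performing).

```python
from typing import Callable, List, Sequence, Tuple


class QueryTimeout(Exception):
    """Raised when a database does not answer within the preset time period."""


Database = Callable[[str], List[str]]


def query_with_failover(master: Database,
                        slaves: Sequence[Database],
                        query: str,
                        max_attempts: int = 3) -> Tuple[List[str], Database]:
    """Query the master; after max_attempts timeouts, promote a slave and query it.

    Returns the rows together with the database now acting as master.
    """
    for _ in range(max_attempts):
        try:
            return master(query), master
        except QueryTimeout:
            continue  # no response in time: try the master again

    # Every attempt went unanswered: the master is considered faulty.
    if not slaves:
        raise QueryTimeout("master database is faulty and no slave database is available")
    current_master = slaves[0]  # assumption: promote the first slave
    return current_master(query), current_master
```

The caller keeps the returned database as the current master for subsequent requests.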
  • The non-relational database can also improve data security using the master-slave backup mechanism described above. Since the non-relational database is stored in a distributed manner, a corresponding standby database can be deployed for each distributed sub-database. It can be understood that both the standby database and the distributed sub-database are non-relational databases.
  • There is a master database that handles the main data access requests, and a number of backup slave databases for disaster-recovery switchover.
  • When the master database fails, a backup slave database is automatically promoted to master database and continues to provide services.
  • The above data access method further includes: obtaining a log generated when a write operation is performed on the master database; and sending the log to one or more target slave databases, so that each target slave database performs the write operation synchronously according to the log.
  • The embodiment of the present application uses a replication-log data synchronization mode to ensure data consistency between the master database and the slave databases.
  • In the replication-log data synchronization mode, the data operations of the master database are sent to each slave database in the form of a log, and each slave database performs the same data operations after receiving the log, completing the data backup.
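  • A replication log of this kind can be sketched with two small classes. The names `Replica` and `Master`, the `(operation, key, value)` log entry layout, and the in-process delivery of entries are all assumptions for illustration; a real deployment would ship log entries over the network.

```python
from typing import Any, Dict, List, Optional, Sequence, Tuple

LogEntry = Tuple[str, Any, Optional[Any]]  # (operation, key, value)


class Replica:
    """A database node that replays log entries against its local data."""

    def __init__(self) -> None:
        self.data: Dict[Any, Any] = {}

    def apply(self, entry: LogEntry) -> None:
        operation, key, value = entry
        if operation in ("insert", "update"):
            self.data[key] = value
        elif operation == "delete":
            self.data.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {operation}")


class Master(Replica):
    """Master node: logs each write, applies it, and forwards it to the slaves."""

    def __init__(self, slaves: Sequence[Replica]) -> None:
        super().__init__()
        self.slaves: List[Replica] = list(slaves)
        self.log: List[LogEntry] = []

    def write(self, operation: str, key: Any, value: Optional[Any] = None) -> None:
        entry: LogEntry = (operation, key, value)
        self.log.append(entry)
        self.apply(entry)
        for slave in self.slaves:
            slave.apply(entry)  # each slave replays the same operation
```

Because every slave replays the same entries in the same order, its data ends up identical to the master's.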
  • Data operations include read operations and write operations.
  • Write operations include update operations, insert operations, and so on.
  • The master library is connected with at least one slave library, which makes it easy to separate reads from writes. At the same time, because each slave library is running, the data in the slave libraries is hot data, so disaster-recovery switching can be performed quickly.
  • the data synchronization mode of the replication log may specifically adopt any of asynchronous replication (Asynchronous replication), semisynchronous replication (Semisynchronous replication), and Fully synchronous replication (Fully synchronous replication).
  • Asynchronous replication means that after the master library sends the newly generated log to each slave library, the synchronization is considered completed without waiting for the confirmation reply from the slave library, and the information that the data operation has been successful is fed back to the data access party.
  • the default replication of databases such as MySQL is asynchronous.
  • Although asynchronous replication can increase the speed of data operations and reduce time consumption, it reduces data reliability. In an extreme case, the main library fails just after it has submitted the log and before any slave library has received that log. Because the main library has already returned the information that the data operation succeeded to the data access party, all content of that data operation is lost.
  • the full synchronous replication mode means that after the master library sends the newly generated log to each slave library, it needs to wait for the confirmation reply information of all the slave libraries before it is considered complete synchronization. Although the fully synchronous replication method can ensure data reliability, the data operation rate is seriously affected.
  • Semi-synchronous replication is a data synchronization method between asynchronous replication and full synchronous replication. It means that, before sending a log newly generated by the main library to every slave library, the data service device first sends the log to a preset number of target slave libraries. After receiving the confirmation messages from the target slave libraries, the data service device sends the log to the other slave libraries and directly regards the synchronization as complete.
  • the target slave library can be a random slave library or a pre-specified slave library.
  • the preset number can be set freely according to needs, such as 1 and so on. It can be understood that the larger the preset number, the longer the time required for data synchronization and the higher the data reliability.
  • When the preset number is 1, the information that the data operation has succeeded is returned to the data accessing party only after at least one slave library has truly completed synchronization, which ensures data reliability.
  • the rest of the slave libraries follow the asynchronous replication method and send the log to complete the synchronization of the rest of the slave libraries, reducing time-consuming.
  • Because the semi-synchronous replication method does not return the information that the data operation has been completed to the data access party until the confirmation reply message from the target slave library is received, it can still ensure data reliability, with no data loss during disaster-recovery switching, even when the main library has just submitted the log and the other slave libraries have not yet received it.
  • The log-based master-slave semi-synchronous replication mode synchronizes data between the master library and the slave libraries and ensures data consistency between the master library and the different slave libraries. Compared with the asynchronous replication mode, the semi-synchronous replication mode improves data reliability; compared with the full synchronous replication mode, it effectively reduces time consumption and improves database operation efficiency.
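  • The semi-synchronous replication flow described above can be sketched as follows. This is a simplified in-memory model under stated assumptions: the `SlaveLibrary` class, the `semi_sync_replicate` and `flush_pending` functions, and the representation of the asynchronous step as a returned list of pending deliveries are all hypothetical names introduced for illustration.

```python
from typing import List, Tuple

class SlaveLibrary:
    """In-memory stand-in for a slave library (hypothetical)."""

    def __init__(self, reachable: bool = True) -> None:
        self.reachable = reachable
        self.applied: List[str] = []

    def apply(self, log_entry: str) -> bool:
        """Replay one log entry; the return value models the confirmation reply."""
        if not self.reachable:
            return False
        self.applied.append(log_entry)
        return True

def semi_sync_replicate(log_entry: str, slaves: List[SlaveLibrary],
                        preset_number: int = 1
                        ) -> Tuple[bool, List[Tuple[SlaveLibrary, str]]]:
    """Wait only for the first `preset_number` target slaves to confirm.

    Returns (success, pending), where `pending` lists the deliveries to the
    remaining slaves that would be carried out asynchronously.
    """
    targets = slaves[:preset_number]
    remaining = slaves[preset_number:]
    confirmed = sum(1 for s in targets if s.apply(log_entry))
    if confirmed < preset_number:
        return False, []  # success is not reported to the data access party
    return True, [(s, log_entry) for s in remaining]

def flush_pending(pending: List[Tuple[SlaveLibrary, str]]) -> None:
    """Deliver the log to the remaining slaves (the asynchronous part)."""
    for slave, log_entry in pending:
        slave.apply(log_entry)
```

  • With `preset_number=1`, success is reported as soon as one target slave library has applied the log, while the other slave libraries catch up when `flush_pending` runs.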
  • the above-mentioned data access method includes:
  • Relational databases support low-concurrency data access operations and can filter data based on complex query conditions; non-relational databases support high-concurrency data access operations and can filter data based on simple query conditions.
  • The source of the queried data is distinguished according to the data access scenario: high-concurrency data access scenarios query data from the non-relational database, while data access scenarios with low concurrency but complex query conditions query data from the relational database. This meets both high-concurrency data access requests and complex data access requests, and improves data access performance.
  • a data storage method is provided. Taking the method applied to the data service device 120 in FIG. 1 as an example for description, the method includes the following steps:
  • Step 602 When the data write request is received, the data write request is parsed to obtain the data to be written.
  • the write data request is a data request initiated by the data accessing party through the target application on the data access device and used to instruct the data service device to write the data in the database.
  • Write operations include update operations, insert operations, and so on.
  • Update operations refer to operations that modify or delete data. The delete operation usually does not actually delete the data from the database, but uses an update method to configure the validity field of the data as invalid.
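  • The delete-as-update behavior mentioned above can be illustrated with a short sketch. The dictionary-based table and the field name `valid` are assumptions for the example; the text only states that a validity field is configured as invalid.

```python
from typing import Dict, Optional

Table = Dict[str, dict]  # hypothetical table: data identifier -> row

def soft_delete(table: Table, data_id: str) -> bool:
    """Mark a row as invalid instead of physically removing it."""
    row = table.get(data_id)
    if row is None:
        return False
    row["valid"] = False  # "valid" is a hypothetical name for the validity field
    return True

def read_valid(table: Table, data_id: str) -> Optional[dict]:
    """Return the row only while its validity field is still set."""
    row = table.get(data_id)
    if row is not None and row.get("valid", True):
        return row
    return None
```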
  • the data access party may input the data to be written based on the target application on the data access device.
  • the target application generates a write data request according to the data to be written, and sends the write data request to the data service device.
  • the data service device parses the data write request, obtains the data to be written, and provides data scheduling services for the data access party according to the data to be written.
  • Step 604 Write the data to be written into a database.
  • Step 606 Synchronize the data to be written from one type of database to the other type; wherein, the types of databases include relational databases and non-relational databases; the non-relational database is used to respond to read data requests that meet preset high-frequency access conditions, and the relational database is used to respond to read data requests that meet preset complex access conditions.
  • Operations on data in the database include read operations and write operations. Reading data does not cause data changes, so there is no need to synchronize data between relational databases and non-relational databases.
  • the write operation will cause data changes, so data synchronization operations between relational databases and non-relational databases are required to ensure data consistency.
  • the specific synchronization strategy may be to write the data to be written into a database, and then synchronize the data to be written from the database to another database. For example, during the update operation, the data to be written can be updated to the non-relational database first, and then the relational database is updated according to the non-relational database. For another example, when performing an insert operation, you can insert the data to be written into the relational database first, and then update the non-relational database according to the relational database.
  • FIG. 7 shows a schematic diagram of the principle of a data storage method in an embodiment.
  • data can be synchronized between relational databases and non-relational databases based on a quasi-real-time synchronization strategy.
  • The data synchronization operation between the relational database and the non-relational database is driven by write-operation events. In other words, once a data write request occurs, the data to be written is immediately written to one type of database and then synchronized to the other type of database.
  • This event-driven data synchronization strategy keeps the relational and non-relational databases synchronized with only a small time interval. Although it cannot achieve fully real-time synchronization, it can achieve quasi-real-time synchronization.
  • This event-driven synchronization strategy is called a quasi-real-time synchronization strategy.
  • Step 608 Respond to the write data request based on the write result of the data to be written.
  • the writing result refers to the result information of whether the data to be written has been successfully written into the database, and specifically includes writing success and writing failure.
  • the write results can be further divided into update results, insert results, and so on.
  • The judgment rule for the write result may be that the write result is determined to be successful as soon as the data to be written is successfully written into one database, or that the write result is determined to be successful only when the data to be written is successfully written into both databases.
  • After determining the writing result of the data to be written according to the judgment rule, the data service device returns the writing result to the data access party. For example, when performing an update operation, the information that the data update succeeded may be returned to the data accessing party once the data to be written has been successfully updated in the non-relational database. For another example, when performing an insert operation, the data to be written can be inserted into the relational database first, and the non-relational database can then be updated according to the relational database; when the data to be written has been successfully inserted into both the relational database and the non-relational database, the information that the data insertion succeeded is returned to the data access party.
  • The source of the queried data is distinguished according to the data access scenario: high-concurrency data access scenarios query data from the non-relational database, while data access scenarios with low concurrency but complex query conditions query data from the relational database. This meets both high-concurrency data access requests and complex data access requests, and improves data access performance.
  • the data update request is a data request used to instruct the data service device to perform an update operation on the data in the database.
  • Message queuing is an application-to-application communication method. Applications communicate by writing and retrieving application-specific data (messages) entering and leaving the queue, without the need for a dedicated connection to link them. Messaging refers to the communication between programs by sending data in messages, rather than directly calling each other to communicate. Direct calling is usually used for technologies such as remote procedure calls. Queuing refers to applications that communicate through queues. The use of queues eliminates the requirement for simultaneous execution of receiving and sending applications.
  • FIG. 8 shows a flow diagram of quasi-real-time synchronization between a relational database and a non-relational database in response to a data update request in an embodiment.
  • the steps of responding to the data update request include:
  • When receiving the data update request, the data service device first determines the data to be updated (hereinafter referred to as target data) in the non-relational database according to the data identifier of the data to be written carried in the data update request, and replaces the original data content of that piece of data with the target data content in the data to be written. Due to conditions such as downtime, the target data may fail to update in the non-relational database. In that case, the data update failure information is fed back to the data access party.
  • After completing the update of the target data in the non-relational database, the data service device generates an update task according to the data to be written or its data identifier, and adds the update task to the message queue (MQ).
  • the data service device performs an update task to update the target data of the relational database. For example, the data to be written is added to the MQ, and the data service device extracts the data to be written from the MQ for consumption as needed, that is, updates the target data in the relational database according to the extracted data to be written.
  • the non-relational database is updated first, and then the relational database is updated asynchronously with the help of the message queue.
  • This event-driven data synchronization strategy for update operations keeps the relational and non-relational databases synchronized with only a small time interval, achieving quasi-real-time synchronization.
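  • The update flow above (update the non-relational database first, then let the message queue carry the change to the relational database) can be sketched as follows. The two databases are modeled as plain dictionaries and the message queue as a `collections.deque`; the function names, and modeling an update failure as a missing data identifier, are assumptions made for illustration only.

```python
from collections import deque
from typing import Deque, Dict

def handle_update(nosql: Dict[str, str], mq: Deque[str],
                  data_id: str, content: str) -> bool:
    """Update the non-relational store first, then enqueue an update task."""
    if data_id not in nosql:
        return False  # assumption: a missing target models an update failure
    nosql[data_id] = content
    mq.append(data_id)  # the task carries only the data identifier
    return True

def consume(nosql: Dict[str, str], sql: Dict[str, str], mq: Deque[str]) -> None:
    """Process queued tasks one by one, copying content into the relational store."""
    while mq:
        data_id = mq.popleft()
        sql[data_id] = nosql[data_id]
```

  • Between `handle_update` and `consume` the relational store still holds the old content, which corresponds to the small synchronization interval of the quasi-real-time strategy.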
  • Asynchronously updating the target data content from the non-relational database to the relational database through the message queue includes: when the enqueue identifier corresponding to the data to be written is "to be enqueued", adding the data identifier of the data to be written to the message queue, and updating the enqueue identifier to "queuing" in the non-relational database; traversing each data identifier in the message queue in queuing order; querying the non-relational database for the target data content corresponding to the data identifier currently traversed; updating, based on the queried target data content, the original data content corresponding to that data identifier stored in the relational database; and updating the enqueue identifier of the data identifier currently traversed to "to be enqueued".
  • the enqueue identifier is a newly added field in the database by the data service device to execute the data access and storage method provided by this application. Each piece of data has a corresponding enqueue identifier.
  • the embodiment of the present application introduces the enqueue identification field and makes a mark on the side of the non-relational database.
  • the enqueue identification field only needs to be added in the data table of the non-relational database.
  • the enqueue identifier is used to characterize the status information of whether a piece of data update task is in the message queue, and specifically includes "to be enqueued" and "queuing".
  • the above step S804 of adding an update task to the message queue includes: the data service device determines the enqueue flag corresponding to the data identifier of the data to be written in the non-relational database.
  • The enqueue identifier can also be represented by other characters; for example, "to be enqueued" can be represented by 0, and "queuing" can be represented by 1.
  • After updating the non-relational database, the data service device checks whether the enqueue identification field of the data to be written is 1. If the enqueue flag is 1, the update task corresponding to the data to be written is already in the MQ, and there is no need to enqueue it again.
  • If the enqueue flag is 0, there is no update task for this piece of data in the MQ, and the update task needs to be added to the MQ.
  • the data service device only needs to add the data identifier of the data to be written into the message queue, and determine whether the addition is successful. When the data identifier is successfully added to the message queue, the data service device determines the update result as a successful update, and feeds back information that the data update is successful to the data access party.
  • The step S806 of consuming the update tasks in the message queue to update the data in the relational database includes: after adding the update task corresponding to the data to be written to the message queue, the data service device updates the enqueue identifier to "queuing" in the non-relational database.
  • the data service device traverses and executes each update task in the message queue according to the queuing sequence.
  • The data service device first queries the relational database, based on the data identifier of the data to be written, to check whether target data corresponding to the data identifier exists. If it does not exist, the query is retried.
  • If the target data still does not exist after retrying, the update task is skipped, an alarm is issued, and the next update task in sequence is processed.
  • the data service device queries the corresponding target data content in the non-relational database according to the data identifier in the update task.
  • When the data version of the queried target data content is higher than the storage version of the original data content in the relational database, the original data content corresponding to the data identifier stored in the relational database is updated based on the queried target data content, and the enqueue identifier of the data identifier currently traversed is updated to "to be enqueued".
  • The relational database is updated asynchronously with the help of the message queue. Because the message queue continuously processes the queued update tasks, the relational database can in practice achieve quasi-real-time data update synchronization; and because the update tasks in the message queue are processed one by one, the relational database is not put under excessive pressure by a large volume of requests.
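  • The role of the enqueue identifier and the data version in this flow can be sketched as follows. The sketch assumes dictionary rows with hypothetical keys `content`, `version` and `enqueue`, uses 0 and 1 for the two enqueue states as suggested above, and omits the retry-and-alarm path for data that is missing from the relational database. Resetting the enqueue identifier even when the version is not higher is an assumption of the sketch.

```python
from collections import deque
from typing import Deque, Dict

TO_BE_ENQUEUED, QUEUING = 0, 1

def enqueue_update(nosql: Dict[str, dict], mq: Deque[str], data_id: str) -> bool:
    """Add an update task unless one for the same data is already queued."""
    row = nosql[data_id]
    if row["enqueue"] == QUEUING:
        return False  # already in the queue, no need to enqueue again
    mq.append(data_id)
    row["enqueue"] = QUEUING  # the flag lives in the non-relational store
    return True

def consume_all(nosql: Dict[str, dict], sql: Dict[str, dict],
                mq: Deque[str]) -> None:
    """Traverse queued identifiers in order and sync newer content to SQL."""
    while mq:
        data_id = mq.popleft()
        source = nosql[data_id]
        target = sql[data_id]
        if source["version"] > target["version"]:
            target["content"] = source["content"]
            target["version"] = source["version"]
        # Assumption: the flag is reset after the task is processed.
        source["enqueue"] = TO_BE_ENQUEUED
```

  • Because several updates to the same data identifier collapse into a single queued task, the consumer always reads the latest content from the non-relational store when it processes that task.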
  • The target data content includes the target version of the data to be written; updating the original data content stored in the non-relational database corresponding to the data identifier according to the target data content includes: determining the storage version of the original data content stored in the non-relational database corresponding to the data identifier; when the target version is equal to the storage version, updating the original data content stored in the non-relational database corresponding to the data identifier according to the target data content; and after the update succeeds, updating the storage version.
  • the data version is a new field added to the database by the data service device to execute the data access and storage method provided by this application.
  • Each piece of data has a corresponding storage version.
  • a data version field is added to each data in non-relational databases and relational databases to record the data version information of each data.
  • the data version field needs to be added in the data tables of both non-relational databases and relational databases.
  • the data version field may specifically be a version number value, such as 0, 1, etc.
  • The data service device constructs an optimistic lock mechanism based on the data version field, and completes the write operation in the non-relational database based on the optimistic lock mechanism. Specifically, when the target data in the non-relational database is updated according to the data update request, the data service device compares the data version of the stored target data (hereinafter referred to as the storage version) with the data version of the target data content in the data to be written (hereinafter referred to as the data version to be written). The update is performed only when the data version to be written is newer, and the storage version is increased by one after the update or insertion is completed. When the data version to be written is lower than the storage version, the write result is determined to be a write failure, and the data write failure information is returned to the data access party.
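  • The version-based optimistic lock can be sketched as follows. The text states the condition for applying an update both as the target version being equal to the storage version and as the version to be written being newer; this sketch assumes the reading that a write is rejected only when its version is lower than the storage version. The row layout and function name are hypothetical.

```python
from typing import Dict

def optimistic_update(store: Dict[str, dict], data_id: str,
                      content: str, version_to_write: int) -> bool:
    """Apply the write only if it is not based on a stale version."""
    row = store[data_id]
    if version_to_write < row["version"]:
        return False  # stale write: report a write failure
    row["content"] = content
    row["version"] += 1  # the storage version is increased by one
    return True
```

  • A second writer that still holds the old version number is rejected after the first writer succeeds, so concurrent updates cannot silently overwrite each other.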
  • the data insertion request is a data request used to instruct the data service device to perform an insertion operation on the data in the database.
  • FIG. 9 shows a flow diagram of quasi-real-time synchronization between a relational database and a non-relational database in response to a data insertion request in an embodiment.
  • the steps of responding to the data insertion request include:
  • When receiving the data insertion request, the data service device first queries the relational database, according to the data identifier of the data to be written carried in the data insertion request, to check whether the data identifier already exists.
  • The data service device inserts the data to be written into the non-relational database, and initializes the data version number of the data to be written in the non-relational database to 0.
  • If the data to be written fails to be inserted into the relational database or into the non-relational database, the data service device returns the insertion failure information to the data access party.
  • If the data identifier already exists in the non-relational database, the insertion into the non-relational database is also a repeated insertion, and the data service device returns an insertion-error prompt message to the data accessing party.
  • An insertion failure in either database may cause the data accessing party to initiate the insertion repeatedly. In this embodiment, the insertion is not repeated when the data already exists in the relational database, and an error message is returned for a repeated insertion, which ensures that data is not inserted repeatedly and keeps the data consistent.
  • Inserting the data to be written into the non-relational database includes: if the data identifier has been stored in the relational database but has not been stored in the non-relational database, inserting the data to be written into the non-relational database; if the data identifier has been stored in both the relational database and the non-relational database, determining the writing result to be a writing failure.
  • The data service device inserts the data to be written into the non-relational database and initializes the version number of the data to be written in the non-relational database to 0.
  • The data service device adds an update task for the data to be written to the message queue to update the corresponding target data stored in the relational database.
  • the specific update logic can refer to the above-mentioned embodiment, which will not be repeated here.
  • The above data storage method further includes: when the write data request is an insert data request and the data identifier has not been stored in the relational database, after inserting the data to be written into the non-relational database, initializing the enqueue identifier of the data to be written as "to be enqueued". That is, when a piece of data to be written is inserted for the first time, the update task corresponding to the data to be written waits to be enqueued according to the normal update logic.
  • The above data storage method further includes: when the data identifier has been stored in the relational database but not yet stored in the non-relational database, after inserting the data to be written into the non-relational database, adding the data identifier of the data to be written to the message queue, and initializing the enqueue identifier of the data to be written as "queuing". That is, when a piece of data to be written is repeatedly inserted into the relational database, the update task corresponding to the data to be written is immediately added to the message queue to improve the real-time performance of data synchronization.
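  • The insertion cases described above can be summarized in a short sketch. The dictionary-based stores, the row keys, and the function name are hypothetical. The case in which the data identifier exists only in the non-relational database is not described in the text; the sketch assumes it is also treated as a failure.

```python
from collections import deque
from typing import Deque, Dict

TO_BE_ENQUEUED, QUEUING = 0, 1

def handle_insert(sql: Dict[str, dict], nosql: Dict[str, dict],
                  mq: Deque[str], data_id: str, content: str) -> bool:
    """Insert into both stores, handling a repeated insert after a partial failure."""
    if data_id in nosql:
        # Present in both stores: repeated insertion, report a failure.
        # (Present only in nosql is undescribed; assumed to fail as well.)
        return False
    if data_id not in sql:
        # First insertion: write both stores, version 0, task waits to be enqueued.
        sql[data_id] = {"content": content, "version": 0}
        nosql[data_id] = {"content": content, "version": 0,
                          "enqueue": TO_BE_ENQUEUED}
        return True
    # Already in the relational store only: complete the non-relational insert
    # and enqueue an update task immediately.
    nosql[data_id] = {"content": content, "version": 0, "enqueue": QUEUING}
    mq.append(data_id)
    return True
```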
  • Asynchronous insertion into a non-relational database, such non-relational data does not need to bear the risk of synchronization failure in asynchronous operations.
  • the above data storage method further includes: traversing each piece of data in the relational database; when the data identifier of the current traversal sequence data is not stored in the non-relational database, inserting the data in the current traversal sequence into the non-relational database Relational database; when the data identifier of the current traversal sequence data is stored in a non-relational database, but the corresponding data content is inconsistent, the data content in the relational database is updated according to the data content in the non-relational database.
  • the consistency of the data in the two databases can be basically guaranteed.
  • The data service device relies on a preset fallback reconciliation mechanism to ensure the eventual consistency of the data in the two databases.
  • FIG. 10 shows a schematic flow diagram of fallback reconciliation between a relational database and a non-relational database in an embodiment.
  • the steps for reconciling the data in the relational database and the non-relational database include:
  • the data service device traverses each piece of data in the relational database according to the target time frequency, and checks whether the data identifier of each piece of data traversed is stored in the non-relational database at the same time.
  • the above data storage method further includes: counting the total data volume of all data stored in the relational database according to a preset time frequency; determining the target time frequency for traversal according to the total data volume; Data traversal includes: traversing each piece of data in the relational database according to the target time frequency.
  • This fallback reconciliation mechanism is run by a script at regular intervals (for example, every 30 minutes).
  • the execution frequency of the reconciliation script ie, the target time frequency
  • the target time frequency can be dynamically determined according to the total data volume of the data stored in the relational database. It can be understood that the target time frequency is positively correlated with the total data volume.
  • the data service device compares whether the data content of the data identifier in the relational database is consistent with the data content in the non-relational database. If they are consistent, continue to traverse the next sequential data. If they are inconsistent, the data service device compares whether the data version of the data content in the non-relational database identified by the data is higher than the data version of the data content in the relational database.
  • The data service device replaces the data content in the relational database with the data content of the data identifier in the non-relational database, and configures the enqueue identifier of the data corresponding to the data identifier in the non-relational database as "to be enqueued".
  • the data update during the reconciliation process is completely based on the data in the non-relational database, and does not refer to the data version.
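  • A single reconciliation pass as described above can be sketched as follows. Both databases are modeled as dictionaries keyed by data identifier, rows carry a hypothetical `content` key, and the function name and return value are illustrative assumptions. The sketch treats the non-relational database as the reference for content, without comparing data versions.

```python
from typing import Dict, List, Tuple

def reconcile(sql: Dict[str, dict],
              nosql: Dict[str, dict]) -> Tuple[List[str], List[str]]:
    """Traverse the relational store and repair differences between the stores.

    Returns (inserted, repaired): identifiers copied into the non-relational
    store, and identifiers whose relational content was overwritten.
    """
    inserted: List[str] = []
    repaired: List[str] = []
    for data_id, row in sql.items():
        if data_id not in nosql:
            nosql[data_id] = dict(row)  # missing there: insert a copy
            inserted.append(data_id)
        elif nosql[data_id]["content"] != row["content"]:
            row["content"] = nosql[data_id]["content"]  # nosql is the reference
            repaired.append(data_id)
    return inserted, repaired
```

  • Running such a pass periodically covers update tasks that were lost or never enqueued, which is why a failed enqueue does not have to fail the original update request.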
  • In step S804, even when the data identifier is not successfully added to the message queue, the update result can still be determined as a successful update.
  • the data storage method provided by this application includes:
  • S1102 When a data write request is received, the data write request is parsed to obtain the data to be written; the data to be written includes the data identifier and target data content; the target data content includes the target version of the data to be written.
  • S1104 When the data write request is a data insert request, it is determined whether the data identifier has been stored in a relational database used to respond to a data read request meeting a preset complex access condition.
  • S1106 If the data identifier has not been stored in the relational database, insert the data to be written into the relational database and the non-relational database, initialize the enqueue identifier of the data to be written as "to be enqueued", and initialize the storage version of the data to be written as 0.
  • the non-relational database is used to respond to read data requests that meet preset high-frequency access conditions.
  • S1120 Respond to the data insertion request based on the insertion result of the data to be written.
  • S1122 Obtain a log generated when a write operation is performed on the main library.
  • S1124 Send the log to one or more target slave libraries, so that the target slave libraries perform a write operation synchronously according to the log.
  • S1126 When receiving the synchronization confirmation message returned by the target slave library after performing the write operation, respond to the write data request for triggering the write operation based on the result of the successful write operation, and send logs to the remaining slave libraries so that the remaining slave libraries Write operations are performed synchronously according to the log.
  • Relational databases support low-concurrency data access operations and can provide the ability to filter data based on complex query conditions; non-relational databases support high-concurrency data access operations and can filter data based on simple query conditions.
  • The source of the queried data is distinguished according to the data access scenario: high-concurrency data access scenarios query data from the non-relational database, while data access scenarios with low concurrency but complex query conditions query data from the relational database. This meets both high-concurrency data access requests and complex data access requests, and improves data access performance.
  • a data access device 1200 is provided.
  • the device can adopt software modules or hardware modules, or a combination of the two to form a part of computer equipment.
  • The device specifically includes a data read request module 1202, a data query module 1204, and a query result feedback module 1206, where:
  • the data read request module 1202 is used to parse the data read request to obtain the query condition when the data read request is received.
  • The data query module 1204 is used to query data that meets the query conditions in the non-relational database when the query conditions meet the preset high-frequency access conditions, and to query data that meets the query conditions in the relational database when the query conditions meet the preset complex access conditions; the data stored in the non-relational database is consistent with the data stored in the relational database.
  • the query result feedback module 1206 is used for responding to the data read request based on the query data.
  • the above-mentioned data access device 1200 further includes an access scene identification module 1208, which is used to determine that the query condition meets the preset high-frequency access condition when the query condition is a preset routing field; When the query conditions include query fields other than the routing field, it is determined that the query conditions meet the preset complex access conditions.
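  • The scene identification rule above can be sketched in a few lines. The routing field name `user_id` and the representation of query conditions as a dictionary of field names to values are assumptions for illustration; the text does not name a concrete routing field.

```python
from typing import Dict

ROUTING_FIELD = "user_id"  # hypothetical preset routing field

def choose_database(query_conditions: Dict[str, object]) -> str:
    """Pick the data source according to the fields used in the query."""
    if set(query_conditions) == {ROUTING_FIELD}:
        return "non-relational"  # preset high-frequency access condition
    return "relational"          # other query fields: complex access condition
```

  • For example, a lookup by `user_id` alone goes to the non-relational database, while a query that also filters on another field is served by the relational database.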
  • The non-relational database includes multiple distributed sub-libraries; as shown in FIG. 13, the data query module 1204 includes a high-frequency access query module 12042, which is used to determine, when the query condition is a routing field, the distributed interval to which the query field belongs, and to query data that meets the query conditions in the distributed sub-library used to store data of that distributed interval.
  • The high-frequency access query module 12042 is also used to query, when the query condition is a routing field, data that meets the query conditions in the non-relational database that uses the routing field as the basis for distributed storage.
  • the relational database includes a master database and at least one slave database; as shown in FIG. 13, the data query module 1204 also includes a complex access query module 12044, which is used to initiate a query request to the master database based on query conditions; When the query response from the master library to the query request is not received within the preset time, one of the slave libraries is determined as the current master library; data that meets the query conditions is queried in the current master library.
  • the above data access apparatus 1200 further includes a semi-synchronous replication module 1210, which is used to obtain the log generated when a write operation is performed on the master database; send the log to one or more target slave databases so that the target slave databases perform the write operation synchronously according to the log; and, upon receiving the synchronization confirmation returned by a target slave database after it performs the write operation, respond with a write-success result to the write data request that triggered the write operation and send the log to the remaining slave databases.
  • the relational database supports low-concurrency data access operations and can filter data on complex query conditions; the non-relational database supports high-concurrency data access operations and can filter data on simple query conditions.
  • the source of the queried data is chosen according to the data access scenario: high-concurrency data access scenarios query data from the non-relational database, while scenarios with low concurrency but complex query conditions query data from the relational database. This satisfies both high-concurrency and complex data access requests and improves data access performance.
  • Each module in the above data access apparatus can be implemented in whole or in part by software, hardware, or a combination thereof.
  • the above modules may be embedded in, or independent of, the processor of the computer device in hardware form, or may be stored in the memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.
  • a data storage apparatus 1400 is provided.
  • the apparatus may be implemented as software modules, hardware modules, or a combination of the two, forming part of a computer device.
  • the apparatus specifically includes a data write request module 1402, a quasi-real-time data synchronization module 1404, and a write result feedback module 1406, where:
  • the data write request module 1402 is used to parse the data write request to obtain the data to be written when the data write request is received.
  • the quasi-real-time data synchronization module 1404 is used to write the data to be written into one type of database and to synchronize the data to be written from that database to the other type of database; the types of database include a relational database and a non-relational database; the non-relational database is used to respond to read data requests that meet the preset high-frequency access condition, and the relational database is used to respond to read data requests that meet the preset complex access condition.
  • the write result feedback module 1406 is configured to respond to the write data request based on the write result of the data to be written.
  • the data to be written includes a data identifier and target data content; as shown in FIG. 15, the quasi-real-time data synchronization module 1404 includes a data update synchronization module 14042, which is used, when the write data request is a data update request, to update the original data content stored in the non-relational database for the data identifier according to the target data content; synchronizing the data to be written from one database to the other includes: once the non-relational database has been updated, asynchronously updating the target data content from the non-relational database to the relational database through a message queue.
  • the target data content includes the target version of the data to be written; the data update synchronization module 14042 is also used to determine the stored version of the original data content stored in the non-relational database for the data identifier; when the target version equals the stored version, update that original data content according to the target data content; and, after the update succeeds, update the stored version to the target version.
  • when the version of the data to be written is lower than the stored version, the write result is determined to be a write failure.
  • the data to be written includes a data identifier and target data content; as shown in FIG. 15, the quasi-real-time data synchronization module 1404 further includes a data insertion synchronization module 14044, which is used, when the write data request is a data insertion request, to determine whether the data identifier is already stored in the relational database; if the data identifier is not yet stored in the relational database, insert the data to be written into both the relational database and the non-relational database; if the data identifier is already stored in the relational database, insert the data to be written into the non-relational database and, after the insertion succeeds, asynchronously update the target data content from the non-relational database to the relational database through the message queue.
  • the data insertion synchronization module 14044 is further configured to insert the data to be written into the non-relational database if the data identifier is already stored in the relational database but not yet in the non-relational database, and to determine the write result to be a write failure if the data identifier is already stored in both the relational database and the non-relational database.
  • the data update synchronization module 14042 is further configured to add the data identifier of the data to be written to the message queue when the enqueue flag of the data to be written is "pending enqueue", and to update the enqueue flag to "queued"; traverse the data identifiers in the message queue in queue order; query the non-relational database for the target data content of the currently traversed data identifier and, based on that content, update the original data content stored in the relational database for the data identifier; and reset the enqueue flag of the currently traversed item to "pending enqueue".
  • the above data storage apparatus 1400 further includes a data asynchronous update module 1408, which is used, when the write data request is a data insertion request and the data identifier is not yet stored in the relational database, to initialize the enqueue flag of the data to be written to "pending enqueue" after the data to be written has been inserted into the non-relational database; and, when the data identifier is already stored in the relational database but not yet in the non-relational database, to add the data identifier of the data to be written to the message queue after the data to be written has been inserted into the non-relational database, and to initialize the enqueue flag of the data to be written to "queued".
  • the above data storage apparatus 1400 further includes a fallback reconciliation module 1410, which is used to traverse each record in the relational database; when the data identifier of the currently traversed record is not stored in the non-relational database, insert the record into the non-relational database; and when the data identifier of the currently traversed record is stored in the non-relational database but the corresponding data content is inconsistent, update the data content in the relational database according to the data content in the non-relational database.
  • Relational databases support low-concurrency data access operations and can filter data on complex query conditions; non-relational databases support high-concurrency data access operations and can filter data on simple query conditions.
  • the source of the queried data is chosen according to the data access scenario: high-concurrency data access scenarios query data from the non-relational database, while scenarios with low concurrency but complex query conditions query data from the relational database. This satisfies both high-concurrency and complex data access requests and improves data access performance.
  • Each module in the above data storage apparatus may be implemented in whole or in part by software, hardware, or a combination thereof.
  • the above modules may be embedded in, or independent of, the processor of the computer device in hardware form, or may be stored in the memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 16.
  • the computer equipment includes a processor, a memory, and a network interface connected through a system bus. Among them, the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the database of the computer equipment includes a relational database and a non-relational database, and the stored data is consistent.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instructions are executed by the processor to realize a data access and storage method.
  • FIG. 16 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine some components, or arrange the components differently.
  • a computer device including a memory and a processor, where computer-readable instructions are stored in the memory, and the processor implements the steps in the foregoing method embodiments when executing the computer-readable instructions.
  • a computer-readable storage medium is provided, and computer-readable instructions are stored, and when the computer-readable instructions are executed by a processor, the steps in the foregoing method embodiments are implemented.
  • a computer program product is provided, which includes computer-readable instructions stored in a computer-readable storage medium.
  • the processor of the computer device reads the computer-readable instruction from the computer-readable storage medium, and the processor executes the computer-readable instruction, so that the computer device executes the steps in the foregoing method embodiments.
  • Non-volatile memory may include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory, or optical storage.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM can be in various forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

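A data access method, comprising: upon receiving a read data request, parsing the read data request to obtain a query condition; when the query condition meets a preset high-frequency access condition, querying a non-relational database for data that meets the query condition; when the query condition meets a preset complex access condition, querying a relational database for data that meets the query condition, the data stored in the non-relational database being consistent with the data stored in the relational database; and responding to the read data request based on the queried data.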

Description

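Data access method and apparatus, and data storage method and apparatus
This application claims priority to Chinese patent application No. 202010199062.5, entitled "Data access method and apparatus, and data storage method and apparatus", filed with the China Patent Office on March 20, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of database technology, and in particular to a data access method and apparatus and a data storage method and apparatus.
Background
Massive amounts of data are generated on the Internet every day. To manage this data effectively, most Internet service providers store it in databases (DataBase, DB). At present, data is mainly stored in relational databases such as MySQL and Oracle. Relational databases query data using Structured Query Language, support insert, delete, update, and query operations as well as cross-table queries, and are convenient to use and easy to understand, so they are increasingly widely used.
However, a relational database stores data in single data tables, and the read/write performance limits of a single table cannot keep up with ever-growing data access volumes. This affects normal data access and gradually becomes a bottleneck for business growth.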
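Summary
A data access method, the method comprising:
upon receiving a read data request, parsing the read data request to obtain a query condition;
when the query condition meets a preset high-frequency access condition, querying a non-relational database for data that meets the query condition;
when the query condition meets a preset complex access condition, querying a relational database for data that meets the query condition, the data stored in the non-relational database being consistent with the data stored in the relational database; and
responding to the read data request based on the queried data.
A data access apparatus, the apparatus comprising:
a data read request module, configured to parse a read data request to obtain a query condition upon receiving the read data request;
a data query module, configured to query a non-relational database for data that meets the query condition when the query condition meets a preset high-frequency access condition, and to query a relational database for data that meets the query condition when the query condition meets a preset complex access condition, the data stored in the non-relational database being consistent with the data stored in the relational database; and
a query result feedback module, configured to respond to the read data request based on the queried data.
A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the steps of the data access method.
One or more non-volatile computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the data access method.
A data storage method, the method comprising:
upon receiving a write data request, parsing the write data request to obtain data to be written;
writing the data to be written into one type of database;
synchronizing the data to be written from the one type of database to another type of database, where the types of database include a relational database and a non-relational database, the non-relational database is used to respond to read data requests that meet a preset high-frequency access condition, and the relational database is used to respond to read data requests that meet a preset complex access condition; and
responding to the write data request based on the write result of the data to be written.
A data storage apparatus, the apparatus comprising:
a data write request module, configured to parse a write data request to obtain data to be written upon receiving the write data request;
a quasi-real-time data synchronization module, configured to write the data to be written into one type of database and to synchronize the data to be written from the one type of database to another type of database, where the types of database include a relational database and a non-relational database, the non-relational database is used to respond to read data requests that meet a preset high-frequency access condition, and the relational database is used to respond to read data requests that meet a preset complex access condition; and
a write result feedback module, configured to respond to the write data request based on the write result of the data to be written.
A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the steps of the data storage method.
One or more non-volatile computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the data storage method.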
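Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a diagram of the application environment of the data access and storage methods in one embodiment;
FIG. 2 is a schematic flowchart of the data access method in one embodiment;
FIG. 3 is a schematic diagram of the principle of the data access method in one embodiment;
FIG. 4 is a schematic diagram of a relational database based on a master-slave protection mechanism in one embodiment;
FIG. 5 is a schematic flowchart of the data access method in a specific embodiment;
FIG. 6 is a schematic flowchart of the data storage method in one embodiment;
FIG. 7 is a schematic diagram of the principle of the data storage method in one embodiment;
FIG. 8 is a schematic flowchart of quasi-real-time synchronization between the relational database and the non-relational database in response to a data update request in one embodiment;
FIG. 9 is a schematic flowchart of quasi-real-time synchronization between the relational database and the non-relational database in response to a data insertion request in one embodiment;
FIG. 10 is a schematic flowchart of fallback reconciliation between the relational database and the non-relational database in one embodiment;
FIG. 11 is a schematic flowchart of the data storage method in a specific embodiment;
FIG. 12 is a schematic structural diagram of the data access apparatus in one embodiment;
FIG. 13 is a schematic structural diagram of the data access apparatus in another embodiment;
FIG. 14 is a schematic structural diagram of the data storage apparatus in one embodiment;
FIG. 15 is a schematic structural diagram of the data storage apparatus in another embodiment;
FIG. 16 is a diagram of the internal structure of the data service device when implemented as a server in one embodiment.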
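Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain this application and not to limit it.
The data access and data storage methods provided in this application can be applied in the application environment shown in FIG. 1. A data access device 110 and a data service device 120 are connected directly or indirectly through wired or wireless communication. The data service device 120 is deployed with a corresponding database 130. The database 130 includes a relational database 130a and a non-relational database 130b. One or more non-relational databases 130b may be deployed. A data accessor can initiate a read data request or a write data request to the data service device 120 through the data access device 110. The data service device 120 provides a data scheduling service for the data accessor: it queries data from the relational database 130a or the non-relational database 130b according to a read data request, or writes data to the relational database 130a and the non-relational database 130b according to a write data request, and ensures data consistency between the relational database 130a and the non-relational database 130b.
The data access device 110 may be a terminal, a server, or a combination of a terminal and a server. For example, the data accessor initiates a read data request to a server through a terminal, and the server forwards the read data request to the data service device 120. The data service device 120 may be a server. The terminal may be a smartphone, tablet computer, laptop computer, desktop computer, smart speaker, smart watch, or the like, but is not limited to these. The server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.
In one embodiment, as shown in FIG. 2, a data access method is provided. Taking application of the method to the data service device 120 in FIG. 1 as an example, the method includes the following steps:
Step 202: upon receiving a read data request, parse the read data request to obtain a query condition.
The data access device runs a target application that can execute the data access method and data storage method provided in this application, such as a social application, payment application, or game application. The target application may be a parent application or a sub-application (Mini Program). The parent application may be a native APP (Application) or a web page accessed through a page browsing application. A read data request may be a data request initiated by the data accessor through the target application on the data access device, instructing the data service device to perform a read operation on data in the database. The query condition consists of the query fields entered in the target application by the data accessor when initiating the read data request.
Specifically, refer to FIG. 3, which shows a schematic diagram of the principle of the data access method in one embodiment.
As shown in FIG. 3, when data needs to be queried, the data accessor can set a query condition through the target application on the data access device. The query condition may contain one query field or multiple query fields. The target application generates a read data request according to the query fields and sends it to the data service device. The data service device is deployed with a corresponding database. After receiving the read data request, the data service device parses it to obtain the query condition and provides the data scheduling service to the data accessor according to the query condition.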
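Step 204: when the query condition meets the preset high-frequency access condition, query the non-relational database for data that meets the query condition.
A database (DB), also called a data management system, can be viewed simply as an electronic filing cabinet, a place where electronic files are stored, in which users can add, retrieve, update, and delete data. A database is a collection of data that is stored together in a certain way, can be shared by multiple users, has as little redundancy as possible, and is independent of applications. In the usual sense, a database is the data storage carrier in an information system. According to their data storage structures, databases can be divided into relational databases and non-relational databases.
A non-relational database (Not Only SQL, NoSQL) removes the relational characteristics of a relational database: there are no relationships between data items. Thanks to this, a non-relational database has a simple structure and very high read/write performance and can meet high-frequency access demands. Data in a non-relational database is handled using aggregate models, mainly key-value pairs (Key-Value), BSON, column families, documents, and graphs. The non-relational database used in the embodiments of this application is mainly a key-value store. A key-value store stores data as a collection of key-value pairs, where the key serves as a unique identifier pointing to the value. The value is unstructured and is usually treated only as a string or binary data; how it is interpreted is defined by the user.
The key-value store may be KV (Ktable, a distributed storage system), Redis (Remote Dictionary Server), DynamoDB, Apache Cassandra, or the like. KV is a distributed storage system developed in-house by WeChat. It includes a KV server side and a KV client side and provides a structured data semantics library, such as SQL-like (Structured Query Language) data manipulation interfaces Select, Update, Insert, and Delete. Redis is an open-source, log-type key-value store written in ANSI C that supports networking, can run in memory or persist data, and provides APIs for multiple languages. Redis supports many value types, including string, list (linked list), set, zset (sorted set), and hash.
In practice, data access scenarios can be divided by the complexity of the query condition into simple access scenarios and complex access scenarios. A simple access scenario requires high-frequency data reads and writes based on the primary identifier of a data record. A complex access scenario requires filtering the data set that satisfies conditions on multiple attributes of the data records and then reading or writing it. Both scenarios occur frequently in the business scenarios of many target applications, but they differ in frequency and in what they require from the data layer. Simple access scenarios are usually high-frequency and only need to filter records by the unique identifier of the data. Complex access scenarios are usually not high-frequency, but require the data layer to support range filtering on multiple field conditions.
The preset high-frequency access condition is a preset criterion for determining that the current read data request comes from a high-frequency simple access scenario. For example, the query condition contains only one query field and that field is the unique identifier of a data record (that is, the query primary key), or the data channel used by the data accessor to initiate the read data request is a first channel. The preset complex access condition is a preset criterion for determining that the current read data request comes from a low-frequency complex access scenario. For example, the query condition contains more than one query field, or the data channel used by the data accessor to initiate the read data request is a second channel.
Specifically, the target application provides multiple access channels on the data access page, such as a first channel and a second channel. When the data accessor enters query fields and initiates a read data request through the button of the first channel, the data service device determines, from the first channel identifier carried in the request, that the request meets the preset high-frequency access condition. As shown in FIG. 3, when the query request meets the preset high-frequency access condition, the data service device generates a query statement corresponding to the query condition according to the preset access syntax rules of the non-relational database, such as XML (Extensible Markup Language), and queries the non-relational database for data that meets the query condition based on that statement.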
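Step 206: when the query condition meets the preset complex access condition, query the relational database for data that meets the query condition; the data stored in the non-relational database is consistent with the data stored in the relational database.
A relational database is a database that organizes data using the relational model. It stores data in rows and columns so that users can understand it easily. A set of rows and columns is called a table, and a group of tables makes up a database. The data in each row of a table forms a record, and the items within a record are related to one another. Users can retrieve data by constraining one or more query fields. For example, by constraining the "score" and "gender" fields, one can query a student information table for the number of female students with scores between 80 and 90. The relational model can be understood simply as a two-dimensional table model, and a relational database is a data organization made up of two-dimensional tables and the relationships between them. Mainstream relational databases include Oracle, DB2, MySQL, Microsoft SQL Server, and Microsoft Access.
In summary, a relational database supports low-concurrency data access and can filter data on complex query conditions containing multiple query fields; a non-relational database supports high-concurrency data access and can filter data on the unique query primary key (Key) used for distributed storage.
Specifically, when the data accessor enters query fields and initiates a read data request through the button of the second channel, the data service device determines, from the second channel identifier carried in the request, that the request meets the preset complex access condition. As shown in FIG. 3, when the query request meets the preset complex access condition, the data service device generates a query statement corresponding to the query condition according to the preset access syntax rules of the relational database, such as SQL (Structured Query Language), and queries the relational database for data that meets the query condition based on that statement.
In one embodiment, the above data access method further includes: when the query condition is a preset routing field, determining that the query condition meets the preset high-frequency access condition; when the query condition includes query fields other than the routing field, determining that the query condition meets the preset complex access condition.
In a non-relational database, both the key and the value can be anything from a simple object to a complex compound object, but the key is the only primary key through which the record can be found. For example, while a user is logged in to a social application, the application can store all session-related data in the non-relational database. Session data may include user profile information, messages, personalization data and themes, suggestions, targeted promotions and discounts, and so on. Each user session has a unique identifier, which is the only primary key through which the session data can be found in the non-relational database. The routing field is the data identifier on which the non-relational database bases its distributed storage, that is, the unique query primary key through which data in the non-relational database can be found.
The data service device determines whether the query field contained in the query condition is a routing field. If so, the data service device determines that the query condition meets the preset high-frequency access condition. When the query condition contains query fields other than the routing field, that is, when it contains multiple query fields, or contains only one query field that is not a routing field, the data service device determines that the query condition meets the preset complex access condition.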
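The routing-field rule described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the routing field name `user_id`, the function names, and the representation of a query condition as a dictionary of field names to values are all assumptions made for the example.

```python
from typing import Any, Dict

# Hypothetical routing field; the text only says it is the unique key
# on which the non-relational database bases its distributed storage.
ROUTING_FIELD = "user_id"


def classify_query(conditions: Dict[str, Any],
                   routing_field: str = ROUTING_FIELD) -> str:
    """Classify a parsed query condition as 'high_frequency' or 'complex'."""
    if not conditions:
        raise ValueError("query condition must contain at least one field")
    # Only the routing field is present: high-frequency simple access.
    if set(conditions) == {routing_field}:
        return "high_frequency"
    # Several fields, or a single non-routing field: complex access.
    return "complex"


def choose_store(conditions: Dict[str, Any]) -> str:
    """Pick which kind of database should serve the read request."""
    if classify_query(conditions) == "high_frequency":
        return "non_relational"
    return "relational"
```

A query on the routing field alone is sent to the non-relational store, while any query that adds or substitutes another field falls through to the relational store.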
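As shown in Table 1 below, in this embodiment, in high-frequency simple access scenarios data is queried from the non-relational database based on the unique data identifier; in complex access scenarios data is queried from the relational database based on multiple query fields.
Table 1
Figure PCTCN2020124253-appb-000001
It is worth noting that, in the embodiments of this application, the data content stored in the relational database and in the non-relational database is consistent, which ensures that data queried from either database is the same. In one embodiment, consistency checks may be performed on the data in the relational and non-relational databases at regular or irregular intervals, and data synchronization is performed when inconsistencies are found, so as to keep the data in the two databases consistent.
Step 208: respond to the read data request based on the queried data.
Specifically, the data service device sends the data queried from the relational database or the non-relational database to the data access device in response to the read data request. The data accessor performs data access operations directly through the data scheduling service provided by the data service device and does not need to care about its implementation details.
In the above data access method, the relational database supports low-concurrency data access and can filter data on complex query conditions, while the non-relational database supports high-concurrency data access and can filter data on simple query conditions. By combining the strengths of the two storage technologies and keeping the data in the two databases consistent, the source of the queried data is chosen according to the data access scenario: high-concurrency scenarios query the non-relational database, while scenarios with low concurrency but complex query conditions query the relational database. This satisfies both high-concurrency and complex data access requests and improves data access performance.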
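In one embodiment, the non-relational database includes multiple distributed sub-databases; querying the non-relational database for data that meets the query condition when the query condition meets the preset high-frequency access condition includes: when the query condition is a routing field, determining the distribution interval to which the query field belongs; and querying data that meets the query condition in the distributed sub-database that stores data within that distribution interval.
A non-relational database is highly partitionable and can scale horizontally to a degree that other types of database cannot. If the existing partitions are full and more storage space is needed, the non-relational database allocates additional partitions to the table, thereby achieving distributed storage. In summary, a non-relational database can support distributed storage of massive data.
The non-relational database includes multiple distributed sub-databases. A distribution interval is the range of routing-field values of the data stored in each distributed sub-database when the non-relational database performs distributed storage. For example, assuming the data identifier (that is, the routing field) of each record is a user ID, the data of every 100 users can be stored in one distributed sub-database: the data of users [ID0, ID100] is stored in distributed sub-database A1, the data of users [ID101, ID200] is stored in distributed sub-database A2, and so on.
When the query condition is determined to meet the preset high-frequency access condition, the data service device queries data in the non-relational database. Specifically, the data service device determines the distribution interval to which the routing field in the query condition belongs, then determines the distributed sub-database that stores data within that interval, and queries that sub-database for data that meets the query condition.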
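The interval lookup described above can be sketched with a sorted list of interval upper bounds and a binary search. The intervals follow the user-ID example in the text ([ID0, ID100] in A1, [ID101, ID200] in A2); the third interval, the integer user IDs, and the names `SHARD_BOUNDS` and `locate_shard` are assumptions added for illustration.

```python
import bisect
from typing import List, Tuple

# (inclusive upper bound of the routing-field interval, sub-database name).
# A1 and A2 mirror the example in the text; A3 is a hypothetical extension.
SHARD_BOUNDS: List[Tuple[int, str]] = [(100, "A1"), (200, "A2"), (300, "A3")]


def locate_shard(user_id: int) -> str:
    """Return the distributed sub-database whose interval contains user_id."""
    if user_id < 0:
        raise KeyError(f"user id {user_id} is below every interval")
    uppers = [upper for upper, _ in SHARD_BOUNDS]
    # bisect_left finds the first interval whose upper bound is >= user_id.
    index = bisect.bisect_left(uppers, user_id)
    if index == len(SHARD_BOUNDS):
        raise KeyError(f"user id {user_id} is above every interval")
    return SHARD_BOUNDS[index][1]
```

Only the sub-database returned by `locate_shard` needs to be queried, which is the reduction in per-query work that the embodiment relies on.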
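In this embodiment, compared with centralized storage, distributed storage can improve data security. In addition, querying data within a distributed sub-database reduces the amount of data processed per query and thus improves query efficiency.
In one embodiment, there are one or more non-relational databases; querying the non-relational database for data that meets the query condition when the query condition meets the preset high-frequency access condition includes: when the query condition is a routing field, querying data that meets the query condition in the non-relational database that uses that routing field as its basis for distributed storage.
As described above, each non-relational database has a corresponding routing field, and high-concurrency access requests can be sent to that database based on the routing field. If high-concurrency access is desired on several different fields, a separate non-relational database can be deployed for each field. For example, non-relational databases A and B can use fields A and B as their routing fields respectively. It can be understood that the data content stored in each non-relational database is the same; only the field used as the routing field differs. Each non-relational database may include multiple distributed sub-databases.
Specifically, after determining that the read data request comes from a high-frequency access scenario, if there are multiple non-relational databases, the data service device determines which non-relational database to query according to which database's routing field the query field corresponds to. In the example above, when a read data request containing only query field A is received, data can be queried in non-relational database A; when a read data request containing only query field B is received, data can be queried in non-relational database B.
In this embodiment, deploying a corresponding non-relational database for each of the different high-frequency access scenarios makes it possible to serve data access requests from multiple high-frequency access scenarios at the same time.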
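In one embodiment, the relational database includes a master database and at least one slave database; querying the relational database for data that meets the query condition includes: initiating a query request to the master database based on the query condition; when no query response to the query request is received from the master database within a preset time, determining one of the slave databases as the current master database; and querying the current master database for data that meets the query condition.
Besides data access performance, data availability is also critical. A highly available database is an overall system composed of a series of databases, which requires that at any moment at least one database node can respond to a user's data access request and provide data service. A relational database stores data centrally in single data tables and has high data security requirements, so a master-slave standby high-availability mechanism is used for the relational database. A highly available relational database system includes a master database and at least one slave database. The master database responds to data access requests initiated by data accessors and synchronizes information about their write operations to the slave databases. The slave databases serve as backups of the data in the master database. It can be understood that the master and slave databases are all relational databases.
Specifically, after determining that the read data request comes from a complex access scenario, the data service device queries data from the master database of the relational database system, that is, it sends a query request to the master database according to the query condition. When no query response to the query request has been received from the master database within the preset time, the data service device determines that the master database has failed. Refer to FIG. 4, which shows a schematic diagram of a relational database based on a master-slave protection mechanism in one embodiment. As shown in FIG. 4, when the master database fails, the data service device randomly selects a slave database as the new master database, or selects the slave database with the best current performance as the new master database, and then queries the current master database for data that meets the query condition.
In one embodiment, when no query response to the query request has been received from the master database within the preset time, the data service device sends the query request to the master database again, retrying up to a preset number of times; if none of these query requests receives a response, it determines that the master database has failed.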
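The timeout, retry, and promotion logic described above can be sketched as follows. This is a simplified in-process model under stated assumptions: each database node is represented by a callable that raises a hypothetical `QueryTimeout` when no response arrives within the preset time, and the sketch promotes the first slave in the list, whereas the text allows a random or best-performing slave to be chosen.

```python
from typing import Any, Callable, Dict, List

Node = Callable[[Dict[str, Any]], List[dict]]


class QueryTimeout(Exception):
    """Hypothetical signal that a node did not answer within the preset time."""


class RelationalCluster:
    def __init__(self, master: Node, slaves: List[Node],
                 max_attempts: int = 3) -> None:
        self.master = master
        self.slaves = list(slaves)
        self.max_attempts = max_attempts  # the "preset number" of attempts

    def query(self, conditions: Dict[str, Any]) -> List[dict]:
        # Try the current master up to max_attempts times.
        for _ in range(self.max_attempts):
            try:
                return self.master(conditions)
            except QueryTimeout:
                continue
        # Every attempt timed out: treat the master as failed and promote a slave.
        if not self.slaves:
            raise RuntimeError("master failed and no slave is available")
        self.master = self.slaves.pop(0)
        return self.master(conditions)
```

After a promotion, later calls to `query` go directly to the new master, so the failed node is not contacted again.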
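In one embodiment, the non-relational database can also use the master-slave standby mechanism described above to improve data security. Because the non-relational database uses distributed storage, a corresponding standby database can be deployed for each distributed sub-database. It can be understood that the standby databases and the distributed sub-databases are all non-relational databases.
In this embodiment, one master database handles the main data access requests, and several standby slave databases are available for disaster-recovery switchover. When the master database cannot provide service, a standby slave database automatically becomes the master database and continues to provide service, which keeps the entire relational database system available and stable.
In one embodiment, the above data access method further includes: obtaining the log generated when a write operation is performed on the master database; sending the log to one or more target slave databases so that the target slave databases perform the write operation synchronously according to the log; and, upon receiving the synchronization confirmation returned by a target slave database after it performs the write operation, responding with a write-success result to the write data request that triggered the write operation and sending the log to the remaining slave databases.
Besides continuous and stable data service, data consistency between the master database and the slave databases must also be guaranteed. The embodiments of this application use a log-replication data synchronization mode to ensure consistency between the master and slave databases. In this mode, data operations on the master database are sent to each slave database in the form of logs, and each slave database performs the same data operations after receiving the log, completing the data backup. Data operations include read operations and write operations. Write operations include update operations, insert operations, and so on. In the log-replication mode, the master database is connected to at least one slave database, which makes read/write separation easy to implement. In addition, because every slave database is running, the data in the slave databases is hot data, so disaster-recovery switchover can be done quickly.
The log-replication mode may use any of asynchronous replication, semisynchronous replication, or fully synchronous replication. In asynchronous replication, after the master database sends a newly generated log to the slave databases, synchronization is considered complete without waiting for confirmation replies from the slave databases, and the data accessor is told that the data operation succeeded. The default replication of databases such as MySQL is asynchronous. Asynchronous replication increases the speed of data operations and reduces latency, but data reliability is lower. In an extreme case, if the master database fails just after committing a log and before any slave database has received it, the master database has already told the data accessor that the operation succeeded, so all of the data operations in that log are lost.
In fully synchronous replication, after the master database sends a newly generated log to the slave databases, synchronization is considered complete only after confirmation replies have been received from all slave databases. Fully synchronous replication guarantees data reliability, but the speed of data operations is severely affected.
Semisynchronous replication is a data synchronization mode between asynchronous and fully synchronous replication. Before sending a log newly generated by the master database to all of the slave databases, the data service device first sends the log to a preset number of target slave databases and waits for their confirmation; it then submits the log to the remaining slave databases and treats synchronization as complete. The target slave databases may be chosen at random or designated in advance. The preset number can be set freely as required, for example 1. It can be understood that the larger the preset number, the longer synchronization takes and the higher the data reliability. When the preset number is 1, the data accessor is told that the data operation succeeded only after at least one slave database has actually completed synchronization, which ensures data reliability. The remaining slave databases are handled as in asynchronous replication: synchronization is considered complete once the log has been sent, which reduces latency.
Because semisynchronous replication does not tell the data accessor that the data operation is complete until a confirmation reply has been received from the target slave database, in the extreme case where the master database fails just after committing a log and before any slave database has received it, the data accessor has to repeat the data operation, which triggers another replication of the log. Semisynchronous replication can therefore guarantee data reliability even in this extreme case and ensure that no data is lost during disaster-recovery switchover.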
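The semisynchronous scheme described above can be sketched with in-memory stand-ins for the databases. The `Replica` class, its `healthy` flag, and the `semi_sync_write` function are hypothetical names for illustration; in a real deployment the remaining slaves would receive the log asynchronously over the network, whereas here they are applied inline for simplicity.

```python
from typing import List


class Replica:
    """In-memory stand-in for a slave database that replays log entries."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy
        self.log: List[str] = []

    def apply(self, entry: str) -> bool:
        """Replay one log entry; return True as the synchronization confirmation."""
        if not self.healthy:
            return False
        self.log.append(entry)
        return True


def semi_sync_write(master_log: List[str], entry: str,
                    replicas: List[Replica], ack_count: int = 1) -> bool:
    """Report success only after ack_count target replicas confirm the entry."""
    master_log.append(entry)
    targets, rest = replicas[:ack_count], replicas[ack_count:]
    for replica in targets:
        if not replica.apply(entry):
            # No confirmation: do not report success, so the caller retries.
            return False
    # Remaining replicas: sending the log counts as done (asynchronous style).
    for replica in rest:
        replica.apply(entry)
    return True
```

Raising `ack_count` moves the behaviour toward fully synchronous replication, trading latency for reliability as the text describes.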
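In this embodiment, data is synchronized between the master database and the slave databases using log-based master-slave semisynchronous replication, which also ensures data consistency between the master database and the different slave databases. Compared with asynchronous replication, semisynchronous replication improves data reliability; compared with fully synchronous replication, it effectively reduces latency and improves database operation efficiency.
In a specific embodiment, as shown in FIG. 5, the above data access method includes:
S502: upon receiving a read data request, parse the read data request to obtain a query condition.
S504: when the query condition is a routing field, query data that meets the query condition in the non-relational database that uses that routing field as its basis for distributed storage.
S506: when the query condition includes query fields other than the routing field, initiate a query request to the master relational database based on the query condition.
S508: when no query response to the query request is received from the master relational database within the preset time, determine one of the slave relational databases as the current master relational database.
S510: query the current master relational database for data that meets the query condition.
S512: respond to the read data request based on the queried data.
In the above data access method, the relational database supports low-concurrency data access and can filter data on complex query conditions, while the non-relational database supports high-concurrency data access and can filter data on simple query conditions. By combining the strengths of the two storage technologies and keeping the data in the two databases consistent, the source of the queried data is chosen according to the data access scenario: high-concurrency scenarios query the non-relational database, while scenarios with low concurrency but complex query conditions query the relational database. This satisfies both high-concurrency and complex data access requests and improves data access performance.
It should be understood that although the steps in the flowcharts of FIGS. 2 and 5 are shown in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in FIGS. 2 and 5 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times; nor are they necessarily executed sequentially, but may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.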
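In one embodiment, as shown in FIG. 6, a data storage method is provided. Taking application of the method to the data service device 120 in FIG. 1 as an example, the method includes the following steps:
Step 602: upon receiving a write data request, parse the write data request to obtain the data to be written.
A write data request is a data request initiated by the data accessor through the target application on the data access device, instructing the data service device to perform a write operation on data in the database. Write operations include update operations, insert operations, and so on. An update operation modifies or deletes data. A delete operation usually does not actually remove the data from the database; instead, it uses an update to set the data's validity field to invalid.
Specifically, when data needs to be written to the database, the data accessor can enter the data to be written through the target application on the data access device. The target application generates a write data request from the data to be written and sends it to the data service device. The data service device parses the write data request to obtain the data to be written and provides the data scheduling service to the data accessor accordingly.
Step 604: write the data to be written into one type of database.
Step 606: synchronize the data to be written from the one type of database to the other type of database, where the types of database include a relational database and a non-relational database; the non-relational database is used to respond to read data requests that meet the preset high-frequency access condition, and the relational database is used to respond to read data requests that meet the preset complex access condition.
Operations on data in the database include read operations and write operations. Reading data does not change it, so no synchronization between the relational and non-relational databases is needed. Write operations change the data, so synchronization between the relational and non-relational databases is needed to keep the data consistent. The synchronization strategy may be to write the data to be written into one type of database and then synchronize it from that database to the other type. For example, for an update operation, the data to be written can first be updated in the non-relational database, and the relational database is then updated from the non-relational database. As another example, for an insert operation, the data to be written can first be inserted into the relational database, and the non-relational database is then updated from the relational database.
Refer to FIG. 7, which shows a schematic diagram of the principle of the data storage method in one embodiment. As shown in FIG. 7, data can be synchronized between the relational and non-relational databases based on a quasi-real-time synchronization strategy. Data synchronization between the two databases is driven by write operation events: as soon as a write data request occurs, the data to be written is immediately written into one type of database and then synchronized to the other. With this event-driven strategy there is only a tiny time gap between the relational and non-relational databases; although fully real-time synchronization is not achieved, quasi-real-time synchronization is. This event-driven synchronization strategy is the quasi-real-time synchronization strategy.
Step 608: respond to the write data request based on the write result of the data to be written.
The write result indicates whether the data to be written was successfully written to the database, and is either write success or write failure. For different write operations, the write result can be further distinguished as an update result, an insert result, and so on. The rule for determining the write result may be to treat the write as successful once the data has been written into one type of database, or to treat it as successful only when the data has been written into both types of database.
Specifically, the data service device determines the write result of the data to be written according to this rule and returns it to the data accessor. For example, for an update operation, the data accessor can be told that the update succeeded as soon as the data to be written has been updated in the non-relational database. As another example, for an insert operation, the data to be written can first be inserted into the relational database and the non-relational database then updated from the relational database; the data accessor is told that the insert succeeded once the data has been inserted into both the relational and the non-relational database.
In the above data storage method, when a write operation occurs, the data is first written into one type of database and then synchronized from that database to the other. Because the data in the second database is obtained by synchronization, the data in the two databases is kept consistent; because writing and synchronization are driven by write operation events, synchronization is timely. The relational database supports low-concurrency data access and can filter data on complex query conditions, while the non-relational database supports high-concurrency data access and can filter data on simple query conditions. With the data in the two databases kept consistent, and by combining the strengths of the two storage technologies, the source of the queried data is chosen according to the data access scenario: high-concurrency scenarios query the non-relational database, while scenarios with low concurrency but complex query conditions query the relational database. This satisfies both high-concurrency and complex data access requests and improves data access performance.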
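In one embodiment, the data to be written includes a data identifier and target data content; writing the data to be written into one type of database includes: when the write data request is a data update request, updating the original data content stored in the non-relational database for the data identifier according to the target data content; synchronizing the data to be written from one type of database to the other includes: once the non-relational database has been updated, asynchronously updating the target data content from the non-relational database to the relational database through a message queue.
A data update request is a data request instructing the data service device to perform an update operation on data in the database. A message queue (message queuing, MQ) is an application-to-application communication method. Applications communicate by writing and retrieving application-specific data (messages) to and from queues, without needing a dedicated connection to link them. Messaging means that programs communicate by sending data in messages rather than by calling each other directly; direct calls are typically used for techniques such as remote procedure calls. Queuing means that applications communicate through queues. Using queues removes the requirement that the sending and receiving applications execute at the same time.
Specifically, refer to FIG. 8, which shows a schematic flowchart of quasi-real-time synchronization between the relational and non-relational databases in response to a data update request in one embodiment. As shown in FIG. 8, the steps for responding to a data update request include:
S802: upon receiving a data update request, the data service device first uses the data identifier of the data to be written, carried in the request, to locate the record to be updated in the non-relational database (hereinafter the target data), and replaces the original data content of that record with the target data content in the data to be written. The update of the target data in the non-relational database may fail, for example because of a crash; in that case the data accessor is told that the update failed.
S804: after the target data in the non-relational database has been updated, the data service device generates an update task from the data to be written or its data identifier and adds the update task to the MQ.
S806: the data service device executes the update task and updates the target data in the relational database. For example, the data to be written is added to the MQ, and the data service device takes it from the MQ as needed and consumes it, that is, updates the target data in the relational database according to the retrieved data to be written.
In this embodiment, the non-relational database is updated first, and the relational database is then updated asynchronously through the message queue. With this data synchronization strategy driven by update operation events, there is only a tiny time gap between the relational and non-relational databases, so quasi-real-time synchronization is achieved.
In one embodiment, asynchronously updating the target data content from the non-relational database to the relational database through the message queue includes: when the enqueue flag of the data to be written is "pending enqueue", adding the data identifier of the data to be written to the message queue and updating the enqueue flag in the non-relational database to "queued"; traversing the data identifiers in the message queue in queue order; querying the non-relational database for the target data content of the currently traversed data identifier and, based on that content, updating the original data content stored in the relational database for the data identifier; and resetting the enqueue flag of the currently traversed item to "pending enqueue".
The enqueue flag is a field added to the database by the data service device in order to carry out the data access and storage methods provided in this application. Each record has a corresponding enqueue flag. To prevent high-frequency updates from producing so many enqueue operations that the MQ length limit is exceeded and update tasks are lost, the embodiments of this application introduce the enqueue flag field to mark records on the non-relational database side. The enqueue flag field only needs to be added to the data table of the non-relational database. The enqueue flag indicates whether an update task for a record is currently in the message queue and takes the values "pending enqueue" and "queued".
Specifically, as shown in FIG. 8, step S804 of adding an update task to the message queue includes: the data service device determines the enqueue flag in the non-relational database that corresponds to the data identifier of the data to be written. In one embodiment, the enqueue flag may also be represented by other characters; for example, "pending enqueue" may be represented by 0 and "queued" by 1. After updating the non-relational database, the data service device checks whether the enqueue flag field of the data to be written is 1. If the flag is 1, the update task for the data to be written is already in the MQ and does not need to be enqueued again. If the flag is 0, there is no update task for this record in the MQ, and one needs to be added. The data service device only needs to add the data identifier of the data to be written to the message queue and determine whether it was added successfully. When the data identifier has been added to the message queue, the data service device determines the update result to be update success and tells the data accessor that the update succeeded.
As shown in FIG. 8, step S806 of consuming update tasks in the message queue and updating data in the relational database includes: after adding the update task for the data to be written to the message queue, the data service device updates the enqueue flag in the non-relational database to "queued". The data service device traverses and executes the update tasks in the message queue in queue order. When it reaches the update task for the data to be written, the data service device first uses the data identifier of the data to be written to check whether target data with that data identifier exists in the relational database. If not, it retries; if the target data still cannot be found after a preset number of retries, it skips the update task, raises an alert, and continues with the next update task. If the target data exists, the data service device uses the data identifier in the update task to query the non-relational database for the corresponding target data content. When the data version of the queried target data content is higher than the stored version of the original data content in the relational database, it updates the original data content stored in the relational database for the data identifier based on the queried target data content, and resets the enqueue flag of the currently traversed item to "pending enqueue".
In this embodiment, the relational database is updated asynchronously through the message queue. Because the message queue continuously processes its update tasks, the relational database can in practice achieve quasi-real-time synchronization of updates; and because the update tasks in the message queue are processed one by one, a large request volume does not put excessive pressure on the relational database.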
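The enqueue-flag mechanism described above can be sketched with dictionaries standing in for the two databases and a `deque` standing in for the MQ. The flag values 0 and 1 follow the text; the record layout (`content`, `version`, `enqueue_flag`) and the function names are assumptions for illustration, and the retry-and-alert handling for identifiers missing from the relational database is reduced to a skip.

```python
from collections import deque
from typing import Deque, Dict

PENDING, QUEUED = 0, 1  # "pending enqueue" = 0, "queued" = 1, as in the text


def enqueue_update(nosql: Dict[str, dict], queue: Deque[str],
                   data_id: str) -> bool:
    """Add data_id to the queue unless an update task for it is already queued."""
    record = nosql[data_id]
    if record["enqueue_flag"] == QUEUED:
        return False  # a task is already in the queue; avoid flooding the MQ
    queue.append(data_id)
    record["enqueue_flag"] = QUEUED
    return True


def drain(nosql: Dict[str, dict], sql: Dict[str, dict],
          queue: Deque[str]) -> None:
    """Consume tasks in queue order, copying the latest content to the SQL side."""
    while queue:
        data_id = queue.popleft()
        if data_id not in sql:
            continue  # retry and alert handling omitted in this sketch
        latest = nosql[data_id]
        # Apply only if the non-relational copy is newer than the stored one.
        if latest["version"] > sql[data_id]["version"]:
            sql[data_id] = {"content": latest["content"],
                            "version": latest["version"]}
        latest["enqueue_flag"] = PENDING
```

Because the queue carries only identifiers and the consumer reads the current content at consumption time, several rapid updates to the same record collapse into a single relational write.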
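In one embodiment, the target data content includes the target version of the data to be written; updating the original data content stored in the non-relational database for the data identifier according to the target data content includes: determining the stored version of the original data content stored in the non-relational database for the data identifier; when the target version equals the stored version, updating that original data content according to the target data content; and, after the update succeeds, updating the stored version.
The data version is a field added to the database by the data service device in order to carry out the data access and storage methods provided in this application. Each record has a corresponding stored version. To keep the update order correct and avoid ordering problems when the MQ updates the relational database, a data version field is added to every record in both the non-relational and the relational database to record its version information. The data version field must be added to the data tables of both databases. The data version field may be a version number such as 0 or 1. When data is read, its version number is read with it; when the data is later updated, this version number is incremented by one.
The data service device builds an optimistic locking mechanism on the data version field and uses it to complete write operations in the non-relational database. Specifically, when updating the target data in the non-relational database according to a data update request, the data service device compares against the data version of the original data content currently stored in the non-relational database for the data identifier in the data to be written (hereinafter the stored version). The update is performed only if the data version of the target data content in the data to be written (hereinafter the version of the data to be written) is not stale, and the stored version is incremented by one after the update or insert completes. When the version of the data to be written is lower than the stored version, the write result is determined to be a write failure and the data accessor is told that the write failed.
In this embodiment, if the submitted data version number equals the current version number in the database table, the update is applied; otherwise the data is considered stale. This optimistic locking mechanism ensures the correctness of write operations.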
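The version check described above can be sketched as a compare-then-increment update on an in-memory store. The `Record` class and `update_with_version` function are hypothetical names; the sketch follows the rule stated in the text that an update is applied only when the submitted version equals the stored version, after which the stored version is incremented by one. A real store would need to perform the comparison and the write atomically.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Record:
    content: str
    version: int = 0  # initialised to 0 on first insert, as in the text


def update_with_version(store: Dict[str, Record], data_id: str,
                        content: str, read_version: int) -> bool:
    """Apply the update only if read_version matches the stored version."""
    record = store.get(data_id)
    if record is None or read_version != record.version:
        return False  # stale (or unknown) data: report a write failure
    record.content = content
    record.version += 1
    return True
```

Two writers that both read version 0 cannot both succeed: the first update moves the stored version to 1, so the second writer's version no longer matches and its write is rejected.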
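In one embodiment, the data to be written includes a data identifier and target data content; writing the data to be written into one type of database includes: when the write data request is a data insertion request, determining whether the data identifier is already stored in the relational database; if the data identifier is not yet stored in the relational database, inserting the data to be written into the relational database and the non-relational database; if the data identifier is already stored in the relational database, inserting the data to be written into the non-relational database and, after the insertion succeeds, asynchronously updating the target data content from the non-relational database to the relational database through the message queue.
A data insertion request is a data request instructing the data service device to perform an insert operation on data in the database.
Specifically, refer to FIG. 9, which shows a schematic flowchart of quasi-real-time synchronization between the relational and non-relational databases in response to a data insertion request in one embodiment. As shown in FIG. 9, the steps for responding to a data insertion request include:
S902: upon receiving a data insertion request, the data service device first uses the data identifier of the data to be written, carried in the request, to check whether that data identifier already exists in the relational database.
S904: if the data identifier does not yet exist in the relational database, this is a first-time insert. The data service device inserts the data to be written into the relational database and initializes its data version number in the relational database to 0.
If the data to be written is inserted into the relational database successfully, the data service device then inserts it into the non-relational database and initializes its data version number in the non-relational database to 0.
If the insertion into the relational database fails, or the insertion into the non-relational database fails, the data service device tells the data accessor that the insertion failed.
S906: if the data identifier already exists in the relational database, this is a repeated insert into the relational database. The data service device uses the data identifier to check whether it already exists in the non-relational database.
If the data identifier also already exists in the non-relational database, this is also a repeated insert into the non-relational database, and the data service device returns an insertion error message to the data accessor. Because an insert operation has to write to both databases, a failed insert into either one may cause the data accessor to repeat the insertion. In this embodiment, a record that already exists in the relational database is not inserted there again, and an error is returned when the record also already exists in the non-relational database. This prevents duplicate inserts and keeps the data consistent.
In one embodiment, inserting the data to be written into the non-relational database if the data identifier is already stored in the relational database includes: if the data identifier is already stored in the relational database but not yet in the non-relational database, inserting the data to be written into the non-relational database; if the data identifier is already stored in both the relational database and the non-relational database, determining the write result to be a write failure.
If the data identifier does not yet exist in the non-relational database, this is a first-time insert into the non-relational database. The data service device inserts the data to be written into the non-relational database and initializes its data version number there to 0. Because the data may have changed, after inserting the data to be written into the non-relational database the data service device adds an update task for it to the message queue, so that the corresponding target data already stored in the relational database is updated. The update logic is described in the embodiments above and is not repeated here.
In one embodiment, as shown in FIG. 9, the above data storage method further includes: when the write data request is a data insertion request and the data identifier is not yet stored in the relational database, initializing the enqueue flag of the data to be written to "pending enqueue" after the data to be written has been inserted into the non-relational database. That is, when a record is inserted for the first time, its update task waits to be enqueued according to the normal update logic.
In one embodiment, as shown in FIG. 9, the above data storage method further includes: when the data identifier is already stored in the relational database but not yet in the non-relational database, after inserting the data to be written into the non-relational database, adding the data identifier of the data to be written to the message queue and initializing the enqueue flag of the data to be written to "queued". That is, when a record is a repeated insert into the relational database, its update task is added to the message queue immediately, which makes synchronization more timely.
In this embodiment, because the non-relational database can sustain high-frequency, fast insert operations, the insert synchronization strategy is to insert into the relational database first and then synchronously insert into the non-relational database. The data to be written is no longer inserted into the non-relational database asynchronously through the message queue, so the non-relational data does not bear the risk of synchronization failure that asynchronous operations carry.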
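The three insert cases described above (new everywhere, present only in the relational database, present in both) can be sketched with dictionaries standing in for the two databases and a `deque` for the message queue. The record layout and the `insert` function name are assumptions for illustration, and failure handling for the individual database writes is omitted.

```python
from collections import deque
from typing import Deque, Dict

PENDING, QUEUED = 0, 1  # enqueue flag values, as in the text


def insert(sql: Dict[str, dict], nosql: Dict[str, dict],
           queue: Deque[str], data_id: str, content: str) -> bool:
    """Insert a record following the relational-first strategy."""
    if data_id not in sql:
        # First-time insert: relational database first, then non-relational,
        # both starting at version 0; the update task waits to be enqueued.
        sql[data_id] = {"content": content, "version": 0}
        nosql[data_id] = {"content": content, "version": 0,
                          "enqueue_flag": PENDING}
        return True
    if data_id in nosql:
        return False  # present in both databases: duplicate insert, report failure
    # Present only in the relational database: fill in the non-relational copy
    # and queue an update task at once so the relational copy is refreshed.
    nosql[data_id] = {"content": content, "version": 0,
                      "enqueue_flag": QUEUED}
    queue.append(data_id)
    return True
```

The second branch is what makes a retried insert safe after a partial failure: the relational row is left alone, the missing non-relational row is created, and the queue brings the two back in line.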
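In one embodiment, the above data storage method further includes: traversing each record in the relational database; when the data identifier of the currently traversed record is not stored in the non-relational database, inserting the currently traversed record into the non-relational database; when the data identifier of the currently traversed record is stored in the non-relational database but the corresponding data content is inconsistent, updating the data content in the relational database according to the data content in the non-relational database.
The quasi-real-time write synchronization strategy described above is generally enough to keep the data in the two databases consistent. As shown in FIG. 7, to avoid inconsistencies caused by extreme errors (such as repeated failures to insert data into the non-relational database, or message queue updates that still fail after multiple retries), the data service device uses a preset fallback reconciliation mechanism to guarantee the eventual consistency of the data in the two databases.
Specifically, refer to FIG. 10, which shows a schematic flowchart of fallback reconciliation between the relational and non-relational databases in one embodiment. As shown in FIG. 10, the steps for fallback reconciliation of the data in the relational and non-relational databases include:
S1002: the data service device traverses each record in the relational database at a target time frequency and checks whether the data identifier of each traversed record is also stored in the non-relational database.
In one embodiment, the above data storage method further includes: counting, at a preset time frequency, the total amount of data stored in the relational database; determining the target time frequency for traversal according to the total amount of data; traversing each record in the relational database includes: traversing each record in the relational database at the target time frequency.
The fallback reconciliation is run by a script at regular intervals (for example, every 30 minutes). The execution frequency of the reconciliation script (that is, the target time frequency) may be determined dynamically from the total amount of data stored in the relational database. It can be understood that the target time frequency is positively correlated with the total amount of data.
S1004: if the data identifier of the currently traversed record is not stored in the non-relational database, the data service device adds the record to the non-relational database, that is, inserts the currently traversed record into the non-relational database, and initializes the data version of the new record to 0.
S1006: if the data identifier of the currently traversed record is stored in the non-relational database, the data service device compares the data content for that data identifier in the relational database with the data content in the non-relational database. If they are consistent, it continues with the next record. If they are inconsistent, the data service device checks whether the data version of the content in the non-relational database is higher than the data version of the content in the relational database. If so, the data service device replaces the data content in the relational database with the data content for that data identifier in the non-relational database, and sets the enqueue flag of the corresponding record in the non-relational database to "pending enqueue". Data updates during fallback reconciliation take the data in the non-relational database as authoritative and do not refer to the data version.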
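One reconciliation pass over the steps above can be sketched as follows, again with dictionaries standing in for the two databases. The record layout and the `reconcile` function name are assumptions for illustration. Step S1006 mentions a version comparison but also states that reconciliation treats the non-relational data as authoritative; this sketch follows the latter and omits the version comparison.

```python
from typing import Dict, List

PENDING = 0  # "pending enqueue", as in the text


def reconcile(sql: Dict[str, dict], nosql: Dict[str, dict]) -> List[str]:
    """Run one fallback reconciliation pass; return the repaired identifiers."""
    repaired: List[str] = []
    for data_id, row in sql.items():
        other = nosql.get(data_id)
        if other is None:
            # Missing on the non-relational side: insert it with version 0.
            nosql[data_id] = {"content": row["content"], "version": 0,
                              "enqueue_flag": PENDING}
            repaired.append(data_id)
        elif other["content"] != row["content"]:
            # Contents differ: the non-relational copy is authoritative.
            row["content"] = other["content"]
            other["enqueue_flag"] = PENDING
            repaired.append(data_id)
    return repaired
```

A scheduler would call `reconcile` at the target time frequency; the returned list gives a simple measure of how much drift each pass had to repair.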
值得说明的是,正是这种兜底对账机制,可以保证两种数据库中数据的最终一致性。因而,如图8所示,在步骤S804中,即便在数据标识未成功加入消息队列时,依然可以将更新结果确定为更新成功。
本实施例中,在两种数据库之间进行准实时同步后,进一步进行兜底对账,可以避免出现一些极端错误带来数据不一致的问题,达到数据的最终一致性。
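One reconciliation pass, as described in steps S1002 to S1006, might look like the following Python sketch. It assumes both databases are plain dictionaries and that each non-relational entry is a dictionary with `content`, `version` and `enqueue_flag` keys; the function name `reconcile` and the flag string are hypothetical. Scheduling (for example, every 30 minutes) is deliberately left out.

```python
def reconcile(relational, nosql):
    """Run one fallback reconciliation pass over every relational record.

    Returns the identifiers that were added to the NoSQL store and the
    identifiers whose relational copy was overwritten.
    """
    added, overwritten = [], []
    for data_id, rel_content in list(relational.items()):
        entry = nosql.get(data_id)
        if entry is None:
            # S1004: missing in the NoSQL store, so add it with version 0.
            nosql[data_id] = {"content": rel_content, "version": 0}
            added.append(data_id)
        elif entry["content"] != rel_content:
            # S1006: contents differ; the NoSQL copy wins and the data
            # version is not consulted.
            relational[data_id] = entry["content"]
            entry["enqueue_flag"] = "to_be_enqueued"
            overwritten.append(data_id)
    return added, overwritten
```

Because the pass only reads the relational keys and writes whole values, running it again immediately afterwards changes nothing, which is the property one would want from a repair job that may be retried.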
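In a specific embodiment, as shown in FIG. 11, the data storage method provided in this application includes:
S1102: When a write data request is received, parse the write data request to obtain the data to be written. The data to be written includes a data identifier and target data content, and the target data content includes a target version of the data to be written.
S1104: When the write data request is an insert data request, determine whether the data identifier is already stored in the relational database, which is used to respond to read data requests that meet a preset complex access condition.
S1106: If the data identifier is not yet stored in the relational database, insert the data to be written into the relational database and the non-relational database, initialize the enqueue flag of the data to be written to "to be enqueued", and initialize the stored version of the data to be written to 0. The non-relational database is used to respond to read data requests that meet a preset high-frequency access condition.
S1108: If the data identifier is already stored in the relational database but not yet stored in the non-relational database, insert the data to be written into the non-relational database; after the insertion succeeds, immediately add the data identifier of the data to be written to the message queue, initialize the enqueue flag of the data to be written to "queued", and initialize the stored version of the data to be written to 0. Traverse each data identifier in the message queue in queue order; query the non-relational database for the target data content corresponding to the currently traversed data identifier; based on the target data content found, update the original data content stored in the relational database for that data identifier; and update the enqueue flag of the currently traversed item to "to be enqueued".
S1110: If the data identifier is already stored in both the relational database and the non-relational database, determine the write result as a write failure.
S1112: When the write data request is a data update request, determine the stored version of the original data content stored in the non-relational database for the data identifier.
S1114: When the target version equals the stored version, update the original data content stored in the non-relational database for the data identifier according to the target data content; after the update succeeds, update the stored version to the target version.
S1116: When the version of the data to be written is lower than the stored version, determine the write result as a write failure.
S1118: When the non-relational database has been updated and the enqueue flag of the data to be written is "to be enqueued", add the data identifier of the data to be written to the message queue and update the enqueue flag in the non-relational database to "queued". Traverse each data identifier in the message queue in queue order; query the non-relational database for the target data content corresponding to the currently traversed data identifier; based on the target data content found, update the original data content stored in the relational database for that data identifier; and update the enqueue flag of the currently traversed item to "to be enqueued".
S1120: Respond to the data insert request based on the insertion result of the data to be written.
S1122: Obtain the log generated when a write operation is performed on the master database.
S1124: Send the log to one or more target slave databases, so that the target slave databases synchronously perform the write operation according to the log.
S1126: When synchronization acknowledgement information returned by the target slave databases after performing the write operation is received, respond with a write-success result to the write data request that triggered the write operation, and send the log to the remaining slave databases, so that the remaining slave databases synchronously perform the write operation according to the log.
S1128: Traverse each piece of data in the relational database.
S1130: When the data identifier of the currently traversed data is not stored in the non-relational database, insert the currently traversed data into the non-relational database.
S1132: When the data identifier of the currently traversed data is stored in the non-relational database but the corresponding data content is inconsistent, update the data content in the relational database according to the data content in the non-relational database.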
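The data storage method above combines the two storage approaches and their respective strengths, provides high-availability guarantees for the data, and defines a quasi-real-time data storage and access scheme with an eventual-consistency guarantee. The scheme meets data access demands under high business concurrency, addressing the performance problems of a traditional relational database. For complex queries it offers a low-frequency complex query path, addressing the weakness of non-relational databases in that scenario. Data accuracy and consistency are handled by a reliable data synchronization scheme, with a fallback reconciliation script to guarantee eventual consistency. This greatly improves the data layer's capacity to support the business, removes the data access bottleneck, and provides a solid foundation for steady business growth.
When a write operation occurs, the data is first written into one type of database and then synchronized from that database to the other type. Because the data in the second database is obtained by synchronization, the data in the two databases stays consistent; because writing and synchronization are driven by the write event itself, synchronization is timely. The relational database supports low-concurrency data access and can filter data by complex query conditions; the non-relational database supports high-concurrency data access and can filter data by simple query conditions. With consistency between the two databases ensured, the strengths of the two storage technologies are combined and the data source is chosen by access scenario: high-concurrency access scenarios query the non-relational database, while scenarios with low concurrency but complex query conditions query the relational database. Both high-concurrency data access requests and complex data access requests can therefore be served, which improves data access performance.
It should be understood that although the steps in the flowcharts of FIG. 6 and FIG. 8 to FIG. 11 are shown in sequence as indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, there is no strict order restriction on these steps, and they may be performed in other orders. Moreover, at least some of the steps in FIG. 6 and FIG. 8 to FIG. 11 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be performed at different moments; nor are they necessarily performed one after another, and they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.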
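In an embodiment, as shown in FIG. 12, a data access apparatus 1200 is provided. The apparatus may be implemented as a software module, a hardware module, or a combination of the two as part of a computer device. The apparatus specifically includes a data read request module 1202, a data query module 1204, and a query result feedback module 1206, where:
The data read request module 1202 is configured to parse a read data request when it is received, to obtain a query condition.
The data query module 1204 is configured to query the non-relational database for data that meets the query condition when the query condition meets a preset high-frequency access condition, and to query the relational database for data that meets the query condition when the query condition meets a preset complex access condition. The data stored in the non-relational database is consistent with the data stored in the relational database.
The query result feedback module 1206 is configured to respond to the read data request based on the data found.
In an embodiment, as shown in FIG. 13, the data access apparatus 1200 further includes an access scenario identification module 1208, configured to determine that the query condition meets the preset high-frequency access condition when the query condition is a preset routing field, and to determine that the query condition meets the preset complex access condition when the query condition includes a query field other than the routing field.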
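The routing decision made by the access scenario identification module can be illustrated with a short Python sketch. It assumes a query condition is represented as a dictionary mapping field names to values, and that the preset routing field is called `user_id`; both the representation and the names `ROUTING_FIELD` and `choose_store` are hypothetical.

```python
ROUTING_FIELD = "user_id"  # hypothetical preset routing field


def choose_store(query):
    """Pick the database that should serve a parsed query condition."""
    fields = set(query)
    if fields == {ROUTING_FIELD}:
        # Only the routing field is used: high-frequency access condition.
        return "nosql"
    if fields - {ROUTING_FIELD}:
        # Any field other than the routing field: complex access condition.
        return "relational"
    raise ValueError("empty query condition")
```

For example, `choose_store({"user_id": 7})` selects the non-relational database, while `choose_store({"user_id": 7, "status": "paid"})` selects the relational database.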
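In an embodiment, the non-relational database includes multiple distributed sub-databases. As shown in FIG. 13, the data query module 1204 includes a high-frequency access query module 12042, configured to determine, when the query condition is the routing field, the distribution interval to which the query field belongs, and to query, in the distributed sub-database that stores data within that distribution interval, for data that meets the query condition.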
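As a rough illustration of locating a distributed sub-database by interval, the sketch below uses Python's standard `bisect` module. The interval boundaries in `BOUNDS`, the dictionary-per-shard model, and the function names are assumptions made for the example; the application does not specify how intervals are defined.

```python
import bisect

# Hypothetical layout: sub-database i stores routing values in
# the half-open interval [BOUNDS[i], BOUNDS[i + 1]).
BOUNDS = [0, 1000, 2000, 3000]


def shard_for(routing_value):
    """Return the index of the sub-database whose interval holds the value."""
    if not BOUNDS[0] <= routing_value < BOUNDS[-1]:
        raise KeyError(routing_value)
    return bisect.bisect_right(BOUNDS, routing_value) - 1


def query_by_routing_field(shards, routing_value):
    """Look the value up only in the sub-database responsible for it."""
    return shards[shard_for(routing_value)].get(routing_value)
```

Only one sub-database is touched per lookup, which is what lets a routing-field query stay cheap as the number of sub-databases grows.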
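In an embodiment, there are one or more non-relational databases. The high-frequency access query module 12042 is further configured to query, when the query condition is a routing field, for data that meets the query condition in the non-relational database that uses that routing field as its basis for distributed storage.
In an embodiment, the relational database includes a master database and at least one slave database. As shown in FIG. 13, the data query module 1204 further includes a complex access query module 12044, configured to initiate a query request to the master database based on the query condition; to determine one of the slave databases as the current master database when no query response to the query request is received from the master database within a preset duration; and to query the current master database for data that meets the query condition.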
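The failover behaviour of the complex access query module can be sketched as follows. This is a simulation under stated assumptions: a replica is an in-memory list of rows, an unresponsive master is modelled by raising the built-in `TimeoutError`, and promoting the first slave in the list is an arbitrary choice made for the example. The class names `Replica` and `RelationalCluster` are hypothetical.

```python
class Replica:
    """In-memory stand-in for one relational database instance."""

    def __init__(self, name, rows, responsive=True):
        self.name = name
        self.rows = rows
        self.responsive = responsive

    def query(self, predicate, timeout):
        if not self.responsive:
            # Simulates "no query response within the preset duration".
            raise TimeoutError(f"{self.name} did not answer within {timeout}s")
        return [row for row in self.rows if predicate(row)]


class RelationalCluster:
    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)

    def query(self, predicate, timeout=0.5):
        try:
            return self.master.query(predicate, timeout)
        except TimeoutError:
            if not self.slaves:
                raise
            # Promote one slave to be the current master and retry there.
            self.master = self.slaves.pop(0)
            return self.master.query(predicate, timeout)
```

A production system would also need to handle a promoted slave that is itself unresponsive; this sketch retries only once.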
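In an embodiment, as shown in FIG. 13, the data access apparatus 1200 further includes a semi-synchronous replication module 1210, configured to obtain the log generated when a write operation is performed on the master database; to send the log to one or more target slave databases, so that the target slave databases synchronously perform the write operation according to the log; and, when synchronization acknowledgement information returned by the target slave databases after performing the write operation is received, to respond with a write-success result to the write data request that triggered the write operation and to send the log to the remaining slave databases.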
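The semi-synchronous replication order of operations can be modelled with a small Python sketch. The `Slave` class, the `healthy` switch and the "write failed" outcome for a missing acknowledgement are assumptions added for illustration; the text only describes the success path. In a real deployment the remaining slaves would receive the log asynchronously, which this single-threaded sketch does not attempt to show.

```python
class Slave:
    """In-memory stand-in for a slave database that replays log entries."""

    def __init__(self, healthy=True):
        self.healthy = healthy
        self.log = []

    def apply(self, entry):
        if not self.healthy:
            return False          # no synchronization acknowledgement
        self.log.append(entry)    # replay the write described by the log entry
        return True               # synchronization acknowledgement


def replicate_write(master_log, entry, target_slaves, other_slaves):
    """Acknowledge the write once the target slaves confirm, then fan out."""
    master_log.append(entry)  # log generated by the write on the master
    if not all(slave.apply(entry) for slave in target_slaves):
        return "write failed"  # assumption: not specified in the text
    result = "write succeeded"  # the client response is decided at this point
    for slave in other_slaves:
        slave.apply(entry)      # remaining slaves catch up afterwards
    return result
```

The point of the design is that the client waits only for the target slaves, not for every slave, while still knowing that at least one copy besides the master holds the write.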
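In the data access apparatus above, the relational database supports low-concurrency data access and can filter data by complex query conditions, and the non-relational database supports high-concurrency data access and can filter data by simple query conditions. By combining the strengths of the two storage technologies while keeping the data in the two databases consistent, the data source is chosen by access scenario: high-concurrency access scenarios query the non-relational database, while scenarios with low concurrency but complex query conditions query the relational database. Both high-concurrency data access requests and complex data access requests can therefore be served, which improves data access performance.
For specific limitations on the data access apparatus, refer to the limitations on the data access method above; details are not repeated here. Each module in the data access apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in or independent of a processor in a computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and perform the operations corresponding to each module.
In an embodiment, as shown in FIG. 14, a data storage apparatus 1400 is provided. The apparatus may be implemented as a software module, a hardware module, or a combination of the two as part of a computer device. The apparatus specifically includes a data write request module 1402, a quasi-real-time data synchronization module 1404, and a write result feedback module 1406, where:
The data write request module 1402 is configured to parse a write data request when it is received, to obtain the data to be written.
The quasi-real-time data synchronization module 1404 is configured to write the data to be written into one type of database and to synchronize the data to be written from that type of database to another type of database. The types of database include a relational database and a non-relational database; the non-relational database is used to respond to read data requests that meet a preset high-frequency access condition, and the relational database is used to respond to read data requests that meet a preset complex access condition.
The write result feedback module 1406 is configured to respond to the write data request based on the write result of the data to be written.
In an embodiment, the data to be written includes a data identifier and target data content. As shown in FIG. 15, the quasi-real-time data synchronization module 1404 includes a data update synchronization module 14042, configured to update, when the write data request is a data update request, the original data content stored in the non-relational database for the data identifier according to the target data content. Synchronizing the data to be written from one type of database to another type of database includes: when the non-relational database has been updated, asynchronously updating the target data content from the non-relational database to the relational database through a message queue.
In an embodiment, the target data content includes a target version of the data to be written. The data update synchronization module 14042 is further configured to determine the stored version of the original data content stored in the non-relational database for the data identifier; when the target version equals the stored version, to update the original data content stored in the non-relational database for the data identifier according to the target data content; and, after the update succeeds, to update the stored version to the target version. When the version of the data to be written is lower than the stored version, the write result is determined as a write failure.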
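The version comparison described above resembles optimistic concurrency control, and a minimal Python sketch of that reading follows. Two points are assumptions rather than statements from the text: the stored version is incremented by one after a successful update, and a target version higher than the stored version is rejected. The function name `update_nosql` and the dictionary model are hypothetical.

```python
def update_nosql(nosql, data_id, content, target_version):
    """Apply an update only if the writer saw the current stored version."""
    entry = nosql[data_id]
    stored_version = entry["version"]
    if target_version != stored_version:
        # Lower than stored: the writer is stale, so the write fails.
        # Higher than stored is not covered by the text; rejecting it
        # here is an assumption of this sketch.
        return False
    entry["content"] = content
    # Assumption: advance the stored version so that a second writer
    # holding the old version is rejected.
    entry["version"] = stored_version + 1
    return True
```

With this rule, two writers that both read version 0 cannot both succeed: the first update moves the stored version forward and the second is reported as a write failure.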
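In an embodiment, the data to be written includes a data identifier and target data content. As shown in FIG. 15, the quasi-real-time data synchronization module 1404 further includes a data insert synchronization module 14044, configured to determine, when the write data request is an insert data request, whether the data identifier is already stored in the relational database; if the data identifier is not yet stored in the relational database, to insert the data to be written into the relational database and the non-relational database; and if the data identifier is already stored in the relational database, to insert the data to be written into the non-relational database and, after the insertion succeeds, to asynchronously update the target data content from the non-relational database to the relational database through a message queue.
In an embodiment, the data insert synchronization module 14044 is further configured to insert the data to be written into the non-relational database if the data identifier is already stored in the relational database but not yet stored in the non-relational database, and to determine the write result as a write failure if the data identifier is already stored in both the relational database and the non-relational database.
In an embodiment, the data update synchronization module 14042 is further configured to add the data identifier of the data to be written to the message queue when the enqueue flag of the data to be written is "to be enqueued", and to update the enqueue flag in the non-relational database to "queued"; to traverse each data identifier in the message queue in queue order; and to query the non-relational database for the target data content corresponding to the currently traversed data identifier, update the original data content stored in the relational database for that data identifier based on the target data content found, and update the enqueue flag of the currently traversed item to "to be enqueued".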
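The enqueue flag acts as a guard that keeps at most one pending update task per data identifier in the message queue. The Python sketch below illustrates this with `collections.deque` standing in for the message queue and dictionaries standing in for both databases; the names `enqueue_if_needed`, `drain`, `TO_BE_ENQUEUED` and `QUEUED` are hypothetical.

```python
from collections import deque

TO_BE_ENQUEUED = "to_be_enqueued"  # hypothetical encoding of "to be enqueued"
QUEUED = "queued"                  # hypothetical encoding of "queued"


def enqueue_if_needed(nosql, queue, data_id):
    """Queue an update task unless one is already pending for this identifier."""
    entry = nosql[data_id]
    if entry["enqueue_flag"] == TO_BE_ENQUEUED:
        queue.append(data_id)
        entry["enqueue_flag"] = QUEUED
        return True
    return False  # already queued; the pending task will read the latest content


def drain(nosql, relational, queue):
    """Consume the queue in order, copying NoSQL content to the relational store."""
    while queue:
        data_id = queue.popleft()
        entry = nosql[data_id]
        relational[data_id] = entry["content"]  # read the content at drain time
        entry["enqueue_flag"] = TO_BE_ENQUEUED  # later updates may queue again
```

Because the queue carries only identifiers and the content is read when the task is processed, several rapid updates to the same record collapse into a single relational write of the most recent content.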
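In an embodiment, the data storage apparatus 1400 further includes a data asynchronous update module 1408, configured to initialize the enqueue flag of the data to be written to "to be enqueued" after the data to be written is inserted into the non-relational database, when the write data request is an insert data request and the data identifier is not yet stored in the relational database; and, when the data identifier is already stored in the relational database but not yet stored in the non-relational database, to add the data identifier of the data to be written to the message queue after the data to be written is inserted into the non-relational database and to initialize the enqueue flag of the data to be written to "queued".
In an embodiment, as shown in FIG. 15, the data storage apparatus 1400 further includes a data fallback reconciliation module 1410, configured to traverse each piece of data in the relational database; to insert the currently traversed data into the non-relational database when its data identifier is not stored in the non-relational database; and to update the data content in the relational database according to the data content in the non-relational database when the data identifier of the currently traversed data is stored in the non-relational database but the corresponding data content is inconsistent.
In the data storage apparatus above, when a write operation occurs, the data is first written into one type of database and then synchronized from that database to the other type. Because the data in the second database is obtained by synchronization, the data in the two databases stays consistent; because writing and synchronization are driven by the write event itself, synchronization is timely. The relational database supports low-concurrency data access and can filter data by complex query conditions; the non-relational database supports high-concurrency data access and can filter data by simple query conditions. With consistency between the two databases ensured, the strengths of the two storage technologies are combined and the data source is chosen by access scenario: high-concurrency access scenarios query the non-relational database, while scenarios with low concurrency but complex query conditions query the relational database. Both high-concurrency data access requests and complex data access requests can therefore be served, which improves data access performance.
For specific limitations on the data storage apparatus, refer to the limitations on the data storage method above; details are not repeated here. Each module in the data storage apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in or independent of a processor in a computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and perform the operations corresponding to each module.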
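In an embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 16. The computer device includes a processor, a memory, and a network interface connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database of the computer device includes a relational database and a non-relational database that store consistent data. The network interface of the computer device is used to communicate with external terminals over a network connection. The computer-readable instructions, when executed by the processor, implement a data access and storage method.
A person skilled in the art can understand that the structure shown in FIG. 16 is merely a block diagram of the part of the structure related to the solution of this application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine some components, or have a different component arrangement.
In an embodiment, a computer device is further provided, including a memory and a processor. The memory stores computer-readable instructions, and the processor implements the steps in the foregoing method embodiments when executing the computer-readable instructions.
In an embodiment, a computer-readable storage medium is provided, storing computer-readable instructions that, when executed by a processor, implement the steps in the foregoing method embodiments.
In an embodiment, a computer program product or computer-readable instructions are provided. The computer program product or computer-readable instructions include computer-readable instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer-readable instructions from the computer-readable storage medium and executes them, so that the computer device performs the steps in the foregoing method embodiments.
A person of ordinary skill in the art can understand that all or part of the processes in the foregoing method embodiments may be completed by computer-readable instructions instructing the relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the foregoing method embodiments. Any reference to a memory, storage, database, or other medium used in the embodiments provided in this application may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, and the like. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM may take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the foregoing embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the foregoing embodiments are described; however, as long as a combination of these technical features contains no contradiction, it should be considered within the scope of this specification.
The foregoing embodiments express only several implementations of this application. Their descriptions are relatively specific and detailed, but they should not be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art may make several variations and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this patent application shall be subject to the appended claims.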

Claims (20)

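  1. A data access method, performed by a computer device, the method comprising:
    when a read data request is received, parsing the read data request to obtain a query condition;
    when the query condition meets a preset high-frequency access condition, querying a non-relational database for data that meets the query condition;
    when the query condition meets a preset complex access condition, querying a relational database for data that meets the query condition, wherein the data stored in the non-relational database is consistent with the data stored in the relational database; and
    responding to the read data request based on the data found.
  2. The method according to claim 1, wherein the method further comprises:
    when the query condition is a preset routing field, determining that the query condition meets the preset high-frequency access condition; and
    when the query condition comprises a query field other than the preset routing field, determining that the query condition meets the preset complex access condition.
  3. The method according to claim 1, wherein the non-relational database comprises a plurality of distributed sub-databases, and the querying a non-relational database for data that meets the query condition when the query condition meets a preset high-frequency access condition comprises:
    when the query condition is a preset routing field, determining the distribution interval to which the query field belongs; and
    querying, in the distributed sub-database that stores data within the distribution interval, for data that meets the query condition.
  4. The method according to claim 1, wherein there are one or more non-relational databases, and the querying a non-relational database for data that meets the query condition when the query condition meets a preset high-frequency access condition comprises:
    when the query condition is a routing field, querying, in the non-relational database that uses the routing field as its basis for distributed storage, for data that meets the query condition.
  5. The method according to claim 1, wherein the relational database comprises a master database and at least one slave database, and the querying a relational database for data that meets the query condition comprises:
    initiating a query request to the master database based on the query condition;
    when no query response to the query request is received from the master database within a preset duration, determining one of the slave databases as the current master database; and
    querying the current master database for data that meets the query condition.
  6. The method according to claim 5, wherein the method further comprises:
    obtaining a log generated when a write operation is performed on the master database;
    sending the log to one or more target slave databases, so that the target slave databases synchronously perform the write operation according to the log; and
    when synchronization acknowledgement information returned by the target slave databases after performing the write operation is received, responding with a write-success result to the write data request that triggered the write operation, and sending the log to the remaining slave databases.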
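  7. A data storage method, performed by a computer device, the method comprising:
    when a write data request is received, parsing the write data request to obtain data to be written;
    writing the data to be written into one type of database;
    synchronizing the data to be written from the one type of database to another type of database, wherein the types of database comprise a relational database and a non-relational database, the non-relational database is used to respond to read data requests that meet a preset high-frequency access condition, and the relational database is used to respond to read data requests that meet a preset complex access condition; and
    responding to the write data request based on a write result of the data to be written.
  8. The method according to claim 7, wherein the data to be written comprises a data identifier and target data content, and the writing the data to be written into one type of database comprises:
    when the write data request is a data update request, updating, according to the target data content, the original data content stored in the non-relational database for the data identifier; and
    the synchronizing the data to be written from the one type of database to another type of database comprises:
    when the non-relational database has been updated, asynchronously updating the target data content from the non-relational database to the relational database through a message queue.
  9. The method according to claim 8, wherein the target data content comprises a target version of the data to be written, and the updating, according to the target data content, the original data content stored in the non-relational database for the data identifier comprises:
    determining the stored version of the original data content stored in the non-relational database for the data identifier;
    when the target version equals the stored version, updating, according to the target data content, the original data content stored in the non-relational database for the data identifier; and
    after the update succeeds, updating the stored version.
  10. The method according to claim 7, wherein the data to be written comprises a data identifier and target data content, and the writing the data to be written into one type of database comprises:
    when the write data request is an insert data request, determining whether the data identifier is already stored in the relational database;
    if the data identifier is not yet stored in the relational database, inserting the data to be written into the relational database and the non-relational database; and
    if the data identifier is already stored in the relational database, inserting the data to be written into the non-relational database and, after the insertion succeeds, asynchronously updating the target data content from the non-relational database to the relational database through a message queue.
  11. The method according to claim 10, wherein the inserting the data to be written into the non-relational database if the data identifier is already stored in the relational database comprises:
    if the data identifier is already stored in the relational database but not yet stored in the non-relational database, inserting the data to be written into the non-relational database; and
    if the data identifier is already stored in both the relational database and the non-relational database, determining the write result as a write failure.
  12. The method according to any one of claims 8 to 11, wherein the asynchronously updating the target data content from the non-relational database to the relational database through a message queue comprises:
    when the enqueue flag of the data to be written is "to be enqueued", adding the data identifier of the data to be written to the message queue, and updating the enqueue flag in the non-relational database to "queued";
    traversing each data identifier in the message queue in queue order; and
    querying the non-relational database for the target data content corresponding to the currently traversed data identifier, updating, based on the target data content found, the original data content stored in the relational database for the data identifier, and updating the enqueue flag of the currently traversed item to "to be enqueued".
  13. The method according to any one of claims 8 to 11, wherein the data storage method further comprises:
    when the write data request is an insert data request and the data identifier is not yet stored in the relational database, initializing the enqueue flag of the data to be written to "to be enqueued" after the data to be written is inserted into the relational database and the non-relational database; and
    when the data identifier is already stored in the relational database but not yet stored in the non-relational database, adding the data identifier of the data to be written to the message queue after the data to be written is inserted into the non-relational database, and initializing the enqueue flag of the data to be written to "queued".
  14. The method according to claim 10, wherein the method further comprises:
    traversing each piece of data in the relational database;
    when the data identifier of the currently traversed data is not stored in the non-relational database, inserting the currently traversed data into the non-relational database; and
    when the data identifier of the currently traversed data is stored in the non-relational database but the corresponding data content is inconsistent, updating the data content in the relational database according to the data content in the non-relational database.
  15. The method according to claim 14, wherein the data storage method further comprises:
    counting, at a preset time frequency, the total amount of data stored in the relational database;
    determining, according to the total amount of data, a target time frequency for traversal; and
    the traversing each piece of data in the relational database comprises:
    traversing each piece of data in the relational database at the target time frequency.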
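  16. A data access apparatus, the apparatus comprising:
    a data read request module, configured to parse a read data request when it is received, to obtain a query condition;
    a data query module, configured to query a non-relational database for data that meets the query condition when the query condition meets a preset high-frequency access condition, and to query a relational database for data that meets the query condition when the query condition meets a preset complex access condition, wherein the data stored in the non-relational database is consistent with the data stored in the relational database; and
    a query result feedback module, configured to respond to the read data request based on the data found.
  17. The apparatus according to claim 16, wherein the data access apparatus further comprises an access scenario identification module, configured to determine that the query condition meets the preset high-frequency access condition when the query condition is a preset routing field, and to determine that the query condition meets the preset complex access condition when the query condition comprises a query field other than the routing field.
  18. A data storage apparatus, the apparatus comprising:
    a data write request module, configured to parse a write data request when it is received, to obtain data to be written;
    a quasi-real-time data synchronization module, configured to write the data to be written into one type of database and to synchronize the data to be written from the one type of database to another type of database, wherein the types of database comprise a relational database and a non-relational database, the non-relational database is used to respond to read data requests that meet a preset high-frequency access condition, and the relational database is used to respond to read data requests that meet a preset complex access condition; and
    a write result feedback module, configured to respond to the write data request based on a write result of the data to be written.
  19. One or more non-volatile computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the method according to any one of claims 1 to 15.
  20. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the steps of the method according to any one of claims 1 to 15.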
PCT/CN2020/124253 2020-03-20 2020-10-28 Data access method and apparatus, and data storage method and apparatus WO2021184761A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/696,576 US20220207036A1 (en) 2020-03-20 2022-03-16 Data access method and apparatus, and data storage method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010199062.5A CN111414403B (zh) 2020-03-20 2020-03-20 Data access method and apparatus, and data storage method and apparatus
CN202010199062.5 2020-03-20

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/696,576 Continuation US20220207036A1 (en) 2020-03-20 2022-03-16 Data access method and apparatus, and data storage method and apparatus

Publications (1)

Publication Number Publication Date
WO2021184761A1 true WO2021184761A1 (zh) 2021-09-23

Family

ID=71493112

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/124253 WO2021184761A1 (zh) 2020-03-20 2020-10-28 Data access method and apparatus, and data storage method and apparatus

Country Status (3)

Country Link
US (1) US20220207036A1 (zh)
CN (1) CN111414403B (zh)
WO (1) WO2021184761A1 (zh)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
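CN111414403B (zh) * 2020-03-20 2023-04-14 腾讯科技(深圳)有限公司 Data access method and apparatus, and data storage method and apparatus
CN112434057A (zh) * 2020-10-12 2021-03-02 南京江北新区生物医药公共服务平台有限公司 Data query method and apparatus
CN112818059B (zh) * 2021-01-27 2024-05-17 百果园技术(新加坡)有限公司 Real-time information synchronization method and apparatus based on a container release platform
CN113282581A (zh) * 2021-05-17 2021-08-20 广西南宁天诚智远知识产权服务有限公司 Database data calling method and apparatus
CN113407638A (zh) * 2021-07-16 2021-09-17 上海通联金融服务有限公司 Method for real-time relational database data synchronization
CN116204542B (zh) * 2023-04-28 2023-08-01 广东广宇科技发展有限公司 Fast database read/write processing method
CN116991692B (zh) * 2023-09-27 2024-02-09 广东广宇科技发展有限公司 Verification method based on database reads and writes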

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080195673A1 (en) * 2002-05-09 2008-08-14 International Business Machines Corporation Method, system, and program product for sequential coordination of external database application events with asynchronous internal database events
CN107180102A (zh) * 2017-05-25 2017-09-19 北京环境特性研究所 Storage method and system for target characteristic data
US20180081956A1 (en) * 2013-11-04 2018-03-22 Guangdong Electronics Industry Institute Ltd. Method for automatically synchronizing multi-source heterogeneous data resources
CN108009236A (zh) * 2017-11-29 2018-05-08 北京锐安科技有限公司 Big data query method, system, computer, and storage medium
CN109710668A (zh) * 2018-11-29 2019-05-03 中国电子科技集团公司第二十八研究所 Method for building multi-source heterogeneous data access middleware
CN111414403A (zh) * 2020-03-20 2020-07-14 腾讯科技(深圳)有限公司 Data access method and apparatus, and data storage method and apparatus

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8572091B1 (en) * 2011-06-27 2013-10-29 Amazon Technologies, Inc. System and method for partitioning and indexing table data using a composite primary key
US9111012B2 (en) * 2012-11-26 2015-08-18 Accenture Global Services Limited Data consistency management
US9727625B2 (en) * 2014-01-16 2017-08-08 International Business Machines Corporation Parallel transaction messages for database replication
US10628418B2 (en) * 2014-11-13 2020-04-21 Sap Se Data driven multi-provider pruning for query execution plan
CN106294499A (zh) * 2015-06-09 2017-01-04 阿里巴巴集团控股有限公司 Database data query method and device
WO2017042890A1 (ja) * 2015-09-08 2017-03-16 株式会社東芝 Database system, server device, program, and information processing method
CN106095863B (zh) * 2016-06-03 2019-09-10 众安在线财产保险股份有限公司 Multi-dimensional data query and storage system and method
US10860534B2 (en) * 2016-10-27 2020-12-08 Oracle International Corporation Executing a conditional command on an object stored in a storage system
US10896198B2 (en) * 2016-11-01 2021-01-19 Sap Se Scaling for elastic query service system
US10956369B1 (en) * 2017-04-06 2021-03-23 Amazon Technologies, Inc. Data aggregations in a distributed environment
CN107423368B (zh) * 2017-06-29 2020-07-17 中国测绘科学研究院 Spatio-temporal data indexing method in a non-relational database
US11036677B1 (en) * 2017-12-14 2021-06-15 Pure Storage, Inc. Replicated data integrity
US10970306B2 (en) * 2018-03-20 2021-04-06 Locus Robotics Corp. Change management system for data synchronization within an enterprise portal application
CN108549725A (zh) * 2018-04-28 2018-09-18 北京百度网讯科技有限公司 Database access control method, apparatus, system, device, and computer-readable medium

Also Published As

Publication number Publication date
CN111414403A (zh) 2020-07-14
US20220207036A1 (en) 2022-06-30
CN111414403B (zh) 2023-04-14

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20926324

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 160223)

122 Ep: pct application non-entry in european phase

Ref document number: 20926324

Country of ref document: EP

Kind code of ref document: A1