WO2020192663A1 - Data management method and related device - Google Patents

Data management method and related device

Info

Publication number
WO2020192663A1
WO2020192663A1 PCT/CN2020/080952 CN2020080952W WO2020192663A1 WO 2020192663 A1 WO2020192663 A1 WO 2020192663A1 CN 2020080952 W CN2020080952 W CN 2020080952W WO 2020192663 A1 WO2020192663 A1 WO 2020192663A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
attribute
storage system
record
data object
Prior art date
Application number
PCT/CN2020/080952
Other languages
French (fr)
Chinese (zh)
Inventor
田文罡
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Publication of WO2020192663A1 publication Critical patent/WO2020192663A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures

Definitions

  • This application relates to the field of computer technology, in particular to a data management method and related equipment.
  • Structured data is data that is logically expressed and implemented as a two-dimensional table structure and that follows data format and length specifications, such as sales information and property information.
  • Unstructured data is data whose structure is irregular or incomplete and that has no predefined data model, such as documents, pictures, audio, and video.
  • Structured data is generally stored in a relational database, large unstructured data is generally stored in a file storage system, and small unstructured data is generally stored in a key-value (KV) storage system.
  • a data object may contain both structured data and unstructured data.
  • For example, suppose the data object is the information of a picture.
  • The attribute information of the picture, such as its name, size, shooting time, and the latitude and longitude of the shooting location, is structured data.
  • The picture itself is unstructured data, and the thumbnail generated from the picture is also unstructured data.
  • The name, size, shooting time, and latitude and longitude information of the picture are stored in the relational database, the picture itself is stored in the file storage system, and the thumbnail generated from the picture is stored in the KV storage system. Because a data object may contain structured data and unstructured data at the same time, a single data object may be stored across multiple data systems.
  • the embodiments of the present application provide a data management method and related equipment, which are used to achieve data consistency when data objects are stored across multiple data systems.
  • an embodiment of the present application provides a data management method.
  • The method includes: generating a record of a data object in a relational data table, where the data object has multiple attributes, the multiple attributes include structured attributes and unstructured attributes, the record indicates the association between the structured attributes and the unstructured attributes of the data object, the relational data table is stored in a first storage system, and the data corresponding to the unstructured attributes of the data object is stored in a second storage system; receiving an operation instruction, where the operation instruction is used to perform an operation on the data object; in response to the operation instruction, determining the record of the data object from the first storage system; obtaining, according to the record, data corresponding to at least one of the multiple attributes of the data object from at least one of the first storage system and the second storage system; and performing the operation on the data object based on the data corresponding to the at least one attribute. Because the data corresponding to the multiple attributes of a data object stored across multiple data systems is obtained through the record, data consistency can be maintained when the data object is stored across multiple data systems.
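The record-then-fetch flow described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: an in-memory SQLite database stands in for the first storage system, a plain dict stands in for the second (KV) storage system, and the `picture` table, its columns, and `load_picture` are hypothetical names that do not come from the application.

```python
import sqlite3

# Hypothetical stand-ins: in-memory SQLite is the first storage system,
# a plain dict is the second (KV) storage system.
db = sqlite3.connect(":memory:")
kv_store = {"thumb/cat.jpg": b"thumbnail-bytes"}

# The record holds the structured values plus the identifier (key) of the
# unstructured data, which ties the two storage systems together.
db.execute("CREATE TABLE picture (name TEXT, size INTEGER, thumbnail_key TEXT)")
db.execute("INSERT INTO picture VALUES (?, ?, ?)", ("cat.jpg", 2048, "thumb/cat.jpg"))
db.commit()


def load_picture(name):
    """Find the record first, then follow it into the second storage system."""
    row = db.execute(
        "SELECT name, size, thumbnail_key FROM picture WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return None
    rec_name, size, thumbnail_key = row
    return {"name": rec_name, "size": size, "thumbnail": kv_store[thumbnail_key]}


picture = load_picture("cat.jpg")
```

In this sketch the second storage system is never consulted on its own; every read goes through the record, which is the property the paragraph above relies on for consistency.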
  • Generating a record of a data object in a relational data table includes: receiving an insert instruction or an update instruction, where the insert instruction is used to insert the data object, the update instruction is used to update the data object, and both instructions include the object type of the data object, the data corresponding to the structured attribute of the data object, and the data corresponding to the unstructured attribute of the data object; determining the relational data table corresponding to the data object according to the object type; generating the record of the data object in that relational data table according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute; and committing the transaction corresponding to the insert instruction or update instruction, where the transaction is committed after the data corresponding to the unstructured attribute of the data object has been stored in the second storage system.
  • When the received instruction is the insert instruction, the unstructured attributes of the data object include a key-value (KV) attribute, and the second storage system is a KV storage system, generating the record of the data object in the relational data table corresponding to the data object includes: generating a second key value according to a first version identifier and the first key value in the data corresponding to the KV attribute; and generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute in the record and the data corresponding to the KV attribute in the record, where the data corresponding to the KV attribute in the record includes the second key value, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
  • When the received instruction is the insert instruction, the unstructured attributes of the data object include a file attribute, and the second storage system is a file storage system, generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute includes: generating a second path according to the first version identifier and the first path in the data corresponding to the file attribute; and generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute in the record and the data corresponding to the file attribute in the record.
  • When the received instruction is the update instruction, the unstructured attributes of the data object include a KV attribute, and the second storage system is a KV storage system, generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute includes: generating a third key value according to a second version identifier and the first key value in the data corresponding to the KV attribute; and generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute in the record and the data corresponding to the KV attribute in the record.
  • When the received instruction is the update instruction, the unstructured attributes of the data object include a file attribute, and the second storage system is a file storage system, generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute includes: generating a third path according to the second version identifier and the first path in the data corresponding to the file attribute; and generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute in the record and the data corresponding to the file attribute in the record.
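The four insert and update cases above share one step: a stored key or path is derived from the caller's key or path plus a version identifier. The sketch below illustrates that step only. The source does not say how the version identifier is combined with the key or path, so the `#v` suffix, the integer version identifiers, and the function name `with_version` are all assumptions.

```python
SEPARATOR = "#v"  # hypothetical; the source does not specify an encoding


def with_version(identifier: str, version_id: int) -> str:
    """Derive the stored key or path from the caller's key or path."""
    return f"{identifier}{SEPARATOR}{version_id}"


first_key = "thumb/cat.jpg"
second_key = with_version(first_key, 1)   # insert: first version identifier
third_key = with_version(first_key, 2)    # update: second version identifier

first_path = "/photos/cat.jpg"
second_path = with_version(first_path, 1)  # insert
third_path = with_version(first_path, 2)   # update
```

Under this assumed encoding, an update writes its data under a new key or path, so it never overwrites the data that the currently committed record points to.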
  • the data corresponding to the unstructured attribute of the data object stored in the second storage system includes the identifier and content corresponding to the unstructured attribute;
  • the data corresponding to the unstructured attribute in the record stored in the first storage system includes an identifier corresponding to the unstructured attribute.
  • The operation instruction includes a query instruction, and the query instruction includes a query condition. Determining the record of the data object from the first storage system in response to the operation instruction includes: in response to the operation instruction, selecting, from the first storage system, the record of the data object that meets the query condition. Obtaining, according to the record, data corresponding to at least one of the multiple attributes of the data object from at least one of the first storage system and the second storage system includes: obtaining, according to the record, the data corresponding to the multiple attributes of the data object from the first storage system and the second storage system. Performing the operation on the data object based on the data corresponding to the at least one attribute includes: building the data object according to the data corresponding to each of the multiple attributes and the order of the multiple attributes in the record; and returning the data object as the query result.
  • When the unstructured attribute of the record includes a KV attribute and the second storage system is a KV storage system, obtaining, according to the record, the data corresponding to the multiple attributes of the data object from the first storage system and the second storage system includes: reading the KV data corresponding to a key value from the second storage system according to the key value, and removing the version identifier from the key value, where the key value is the data corresponding to the KV attribute in the record, and the version identifier includes a first version identifier and a second version identifier. The data corresponding to the KV attribute of the data object includes the key value with the version identifier removed and the KV data; the data corresponding to the structured attribute of the data object includes the data corresponding to the structured attribute in the record.
  • When the unstructured attributes of the record include a file attribute and the second storage system is a file storage system, obtaining, according to the record, the data corresponding to the multiple attributes of the data object from the first storage system and the second storage system includes: reading the file data corresponding to a path from the second storage system according to the path, and removing the version identifier from the path, where the path is the data corresponding to the file attribute in the record, and the version identifier includes a first version identifier and a second version identifier. The data corresponding to the file attribute of the data object includes the path with the version identifier removed and the file data; the data corresponding to the structured attribute of the data object includes the data corresponding to the structured attribute in the record.
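The read side of the two query cases above can be sketched as follows: the versioned key (or path) stored in the record is used to read from the second storage system, and the version identifier is removed before the result is handed back. The `#v` suffix encoding, the dict used as a KV store, and the function names are assumptions for illustration; the source does not define them.

```python
SEPARATOR = "#v"  # hypothetical encoding, not specified by the source


def strip_version(stored: str) -> str:
    """Remove a trailing version identifier, if one is present."""
    original, sep, _version = stored.rpartition(SEPARATOR)
    return original if sep else stored


def read_kv_attribute(kv_store: dict, stored_key: str):
    """Return the key as the application knows it, together with the KV data."""
    return strip_version(stored_key), kv_store[stored_key]


kv_store = {"thumb/cat.jpg#v2": b"new-thumbnail"}
key, data = read_kv_attribute(kv_store, "thumb/cat.jpg#v2")
```

The same `strip_version` helper applies unchanged to a versioned file path, which is why the sketch does not repeat the file case.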
  • The operation instruction further includes a deletion instruction, and the deletion instruction includes the object type of the data object, the data corresponding to the structured attribute of the data object, and the data corresponding to the unstructured attribute of the data object. Determining the record of the data object from the first storage system in response to the operation instruction includes: determining the relational data table corresponding to the data object according to the object type; and determining the record of the data object from the relational data table, where the data corresponding to the structured attribute in the record is the same as the data corresponding to the structured attribute of the data object. Obtaining, according to the record, data corresponding to at least one of the multiple attributes of the data object from at least one of the first storage system and the second storage system includes: obtaining, according to the record, the data corresponding to the multiple attributes of the data object from the first storage system. Performing the operation on the data object based on the data corresponding to the at least one attribute includes: deleting the data corresponding to the multiple attributes of the data object in a ...
  • When the unstructured attributes include a KV attribute and the second storage system is a KV storage system, the method further includes: when a verification instruction is received or it is detected that a verification condition is satisfied, traversing the key values in the second storage system, where a key value is the data corresponding to the KV attribute in the second storage system; and, while traversing the key values, if no key value identical to a fourth key value can be found in the relational data table stored in the first storage system, deleting the fourth key value and the KV data corresponding to the fourth key value from the second storage system, where the fourth key value is one of the multiple key values in the second storage system.
  • When the unstructured attributes include a file attribute and the second storage system is a file storage system, the method further includes: when a verification instruction is received or it is detected that a verification condition is satisfied, traversing the paths in the second storage system, where a path is the data corresponding to the file attribute in the second storage system; and, while traversing the paths, if no path identical to a fourth path can be found in the relational data table stored in the first storage system, deleting the fourth path and the file data corresponding to the fourth path from the second storage system, where the fourth path is one of the multiple paths in the second storage system.
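The verification pass described above amounts to deleting every entry in the second storage system that no record references. A minimal sketch follows, assuming an in-memory SQLite table as the first storage system and a dict as the KV storage system; the function `sweep_orphans`, the `picture` table, and the sample keys are hypothetical.

```python
import sqlite3


def sweep_orphans(db, kv_store, table, column):
    """Delete every KV entry whose key no record in `table.column` references."""
    # Table and column names are interpolated, so they must come from trusted code.
    referenced = {row[0] for row in db.execute(f'SELECT "{column}" FROM "{table}"')}
    orphans = [key for key in kv_store if key not in referenced]
    for key in orphans:
        del kv_store[key]
    return orphans


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE picture (name TEXT, thumbnail_key TEXT)")
db.execute("INSERT INTO picture VALUES ('cat.jpg', 'thumb/cat.jpg#v2')")
db.commit()

# One entry is still referenced by a record; the other is left over.
kv_store = {"thumb/cat.jpg#v1": b"stale", "thumb/cat.jpg#v2": b"current"}
removed = sweep_orphans(db, kv_store, "picture", "thumbnail_key")
```

For a file storage system the same loop would walk paths instead of keys and remove files instead of dict entries.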
  • Before the relational data table corresponding to the data object is determined according to the object type, the method further includes: receiving a definition instruction for the object type to which the data object belongs, where the definition instruction includes definition information of the object type, and the definition information is used to define the structure of the relational data table of the object type; and generating the relational data table of the object type in the first storage system according to the definition instruction.
  • Determining the relational data table corresponding to the data object according to the object type includes: determining, according to the insert instruction or update instruction, the object type to which the data object belongs; and determining the relational data table corresponding to the data object according to that object type.
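One way to picture the definition instruction is as a schema description from which a table is created. The sketch below assumes that the definition information lists structured attributes with a column type and unstructured attributes by name only, and that each unstructured attribute becomes a text column holding an identifier. The function `define_object_type` and the attribute names are hypothetical.

```python
import sqlite3


def define_object_type(db, object_type, structured, unstructured):
    """Create one table per object type from its definition information."""
    # Names are interpolated into SQL, so they must come from trusted code.
    columns = [f'"{name}" {sql_type}' for name, sql_type in structured.items()]
    # Unstructured attributes get a TEXT column that holds only an identifier.
    columns += [f'"{name}" TEXT' for name in unstructured]
    db.execute(f'CREATE TABLE "{object_type}" ({", ".join(columns)})')


db = sqlite3.connect(":memory:")
define_object_type(
    db,
    "picture",
    structured={"name": "TEXT", "size": "INTEGER"},
    unstructured=["thumbnail", "image_file"],
)
column_names = [row[1] for row in db.execute("PRAGMA table_info(picture)")]
```

With the table named after the object type, a later insert or update instruction only needs to carry the object type to locate the right table.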
  • an embodiment of the present application provides a data management device, which includes a unit for executing the method described in the first aspect or various possible implementations of the first aspect.
  • the above-mentioned data management device may be an electronic device, or a device for implementing data management in an electronic device (for example, an operating system, a database management system), or a server, such as a database server, an application server, etc.
  • the units included in the aforementioned data management device may be hardware circuits, software, or hardware circuits combined with software.
  • The embodiments of the present application provide another data management device, including a processor and a memory that are connected to each other, where the memory is used to store program instructions, and the processor is used to call the program instructions to execute the method described in the first aspect or any possible implementation of the first aspect.
  • An embodiment of the present application provides a computer-readable storage medium that stores program instructions; when the program instructions are executed by a processor, the processor executes the method described in the first aspect or any possible implementation of the first aspect.
  • An embodiment of the present application provides a computer program; when the computer program is executed by a processor, the processor executes the method described in the first aspect or any possible implementation of the first aspect.
  • the data management device may generate a record of the data object in the relational data table.
  • the record indicates the association relationship between the structured data and the unstructured data of the data object.
  • the relational data table is stored in the first storage system.
  • the unstructured data of the data object is stored in the second storage system.
  • The record of the data object can be obtained from the first storage system, and the data corresponding to at least one of the multiple attributes of the data object can be obtained from the first storage system and/or the second storage system according to the record; the operation is then performed on the data object based on the data corresponding to the at least one attribute. Because the data corresponding to the multiple attributes of a data object stored across multiple data systems is all obtained through the records in the relational data table, the data object can maintain data consistency when stored across multiple data systems.
  • FIG. 1A is a schematic diagram of a data management device provided by an embodiment of the present application.
  • FIG. 1B is a schematic diagram of yet another data management device provided by an embodiment of the present application.
  • FIG. 1C is a schematic diagram of another data management device provided by an embodiment of the present application.
  • FIG. 2 is a schematic diagram of the architecture of a data management system provided by an embodiment of the present application.
  • FIG. 3 is a flowchart of a data management method provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of another data management device provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of yet another data management device provided by an embodiment of the present application.
  • the data management method provided in the embodiments of the present application can be applied to a data management device, and the data management device includes a first storage system and a second storage system.
  • the first storage system and the second storage system are different types of storage systems.
  • the first storage system can be used to store the structured attributes of the data object and corresponding structured data
  • the second storage system can be used to store the unstructured attributes of the data object and the corresponding unstructured data.
  • Structured attributes are attributes used to describe or define characteristics of structured data
  • unstructured attributes are attributes used to describe or define characteristics of unstructured data.
  • Structured data, also known as row data, is data that is logically expressed and implemented as a two-dimensional table structure and is mainly stored and managed by relational databases. Unstructured data is data whose structure is irregular or incomplete, that has no predefined data model, and that is not convenient to represent with the two-dimensional logical tables of a database, such as documents, text, pictures, reports, images, and audio/video information.
  • the first storage system may be a database, such as a relational database.
  • The unstructured attributes of the data object may include a key-value (KV) attribute and a file attribute.
  • The data management device may include a KV storage system (such as a KV database) and a file storage system (referred to as a "file system").
  • The data management device may include terminal devices such as mobile phones, tablet computers, personal digital assistants (PDAs), and mobile internet devices (MIDs), and may also include database servers, application servers, and other devices with storage and processing functions; this is not limited in the embodiments of the present application.
  • the data management device can receive an operation instruction for a data object input by a user through an application program it runs, and execute the operation instruction on the data object.
  • The application can be an album that stores images or videos and can receive operation instructions input by the user on images or videos; or the application can be text creation software that can receive operation instructions input by the user on text; or the application can be instant messaging software that can receive operation instructions input by the user on office documents, text, pictures, images, audio/video, and other data in the software.
  • FIG. 1A is a schematic diagram of a data management device provided by an embodiment of the present application.
  • the data management device includes an application module, an operation module, an interface module, and a storage system module. These modules will be further introduced below.
  • the application module may include one or more application programs, and these application programs may receive operation instructions for the data object input by the user.
  • the application can include photo albums, mailboxes, document processing software, and so on.
  • the operation module is a data object management component that provides an interface to the application module. Through the operation module, the application module can implement operations such as defining, inserting, modifying, deleting, querying, and verifying data objects. Specifically, the operation module may execute the operation indicated by the instruction according to the instruction received from the application module. The received instructions are different, and the operations performed by the operation module are different. The following examples illustrate several different instructions.
  • the operation module can determine the definition information of the object type of the data object according to the definition instruction, and then store the definition information in the storage system module through the interface module.
  • the definition information of the object type of the data object includes the structured attribute and the unstructured attribute of the data object.
  • the operation module can determine the information of the data object that needs to be stored according to the insert instruction, and then store the information of the data object that needs to be inserted into the storage system module through the interface module. Among them, the operation module saves different types of data to different storage systems. Specifically, structured data is saved to the database, file data is saved to the file storage system, and key-value data is saved to the KV storage system.
  • the operation module can determine the information of the data object that needs to be updated according to the update instruction, and then store the information of the data object that needs to be updated in the storage system module through the interface module. Among them, the operation module saves different types of data to different storage systems. Specifically, structured data is saved to the database, file data is saved to the file storage system, and key-value data is saved to the KV storage system.
  • the operation module can determine the query condition according to the query instruction, and then select data objects that meet the query condition from the storage system module through the interface module according to the query condition and feed it back to the application module.
  • the operation module can determine the data object to be deleted according to the delete instruction, and then delete the data object to be deleted from the storage system module through the interface module.
  • the operation module can, according to the verification instruction, verify the data stored in the storage system module through the interface module to clear invalid data.
  • the interface module provides an interface for accessing the storage system module, and the operation module can access the data in the storage system module through the interface module.
  • the interface module includes a first storage system interface submodule and a second storage system interface submodule; the storage system module includes a first storage system submodule and a second storage system submodule.
  • the operation module can access the data in the first storage system through the first storage system interface sub-module, and can access the data in the second storage system through the second storage system interface sub-module.
  • The first storage system is a database system, such as a relational database, and the first storage system interface sub-module is a database system interface. The second storage system is a storage system for storing non-relational data, such as the KV storage system and/or the file storage system, and the second storage system interface sub-module includes a KV system interface sub-module and/or a file storage system interface sub-module.
  • The data management device stores the data corresponding to the structured attributes of a data object in the first storage system, and stores the data corresponding to the unstructured attributes of the same data object in the second storage system. Further, the data management device generates a relational data table in the first storage system to establish an association between the structured attributes and the unstructured attributes of the data object.
  • The relational data table contains the record of the data object. The record includes the name of the structured attribute of the data object, the name of the unstructured attribute, the data content (value) corresponding to the structured attribute, and the identifier under which the data corresponding to the unstructured attribute is stored in the second storage system.
  • When the operation module receives an operation instruction for the data object from the application module, it first determines the record of the data object from the relational data table stored in the first storage system. According to the record, it can then obtain the data corresponding to the structured attribute of the data object from the relational data table, determine the identifier of the data corresponding to the unstructured attribute, and obtain the data (data content) corresponding to the unstructured attribute of the data object from the second storage system according to that identifier.
  • the data management device performs corresponding operations on the data object based on the acquired data corresponding to the structured attribute of the data object and/or the data corresponding to the unstructured attribute.
  • For the detailed process by which the data management device generates the relational data table of the data object, stores the data, and operates on the data, refer to the related embodiment in FIG. 3.
  • the first storage system sub-module may be a database that supports multi-version concurrency control (MVCC), such as a lightweight database (SQLite).
  • MVCC can maintain multiple snapshot copies for each record in the database, and maintain the visibility of the copies through a start timestamp (begin timestamp) and an end timestamp (end timestamp).
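The visibility rule mentioned above, in which each copy carries a begin timestamp and an end timestamp, can be illustrated in general terms. This is a generic sketch of timestamp-based visibility, not a description of how any particular database implements it; the `Version` class, the integer timestamps, and the half-open interval convention are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Version:
    value: str
    begin_ts: int                 # first timestamp at which this copy is visible
    end_ts: Optional[int] = None  # None means the copy is still current


def visible(version: Version, snapshot_ts: int) -> bool:
    """A copy is visible when begin_ts <= snapshot_ts < end_ts."""
    if snapshot_ts < version.begin_ts:
        return False
    return version.end_ts is None or snapshot_ts < version.end_ts


def read(versions: List[Version], snapshot_ts: int) -> Optional[str]:
    """Return the value of the copy visible at the given snapshot, if any."""
    for version in versions:
        if visible(version, snapshot_ts):
            return version.value
    return None


# Two copies of one record: the first was superseded at timestamp 20.
history = [Version("v1", begin_ts=10, end_ts=20), Version("v2", begin_ts=20)]
```

A reader that took its snapshot at timestamp 15 keeps seeing the first copy even after the second one is written, which is what lets readers and writers proceed without blocking each other.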
  • the second storage system sub-module is a storage system that supports persistence, such as a KV system, the flash-friendly file system (F2FS), or the fourth extended file system (EXT4).
  • persistence is a mechanism for converting data between persistent state and transient state.
  • transient data (such as data in memory) is persisted into persistent data, and the persistent data can be stored for a long time.
  • When the data management device accesses a data object, it first accesses the record of the data object in the database and then operates on the data stored in the second storage system according to the content of the record. In this way, concurrent access to the second storage system (such as the file system or the KV system) is controlled by means of the concurrency control of the database.
  • For insertion and modification operations on data objects, the data management device must complete the operations in the file system and the KV system before committing the database transaction. For the deletion operation, the data management device must first operate on the data in the database and commit the transaction, and then operate on the data in the file system and the KV system.
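The ordering rule above (second storage system before commit on insert, commit before second storage system on delete) can be sketched as follows. An in-memory SQLite database and a dict stand in for the two storage systems, and the `doc` table and both function names are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE doc (title TEXT, body_key TEXT)")


def insert_doc(kv_store, title, body_key, body):
    """Write to the second storage system first, then commit the record."""
    try:
        db.execute("INSERT INTO doc VALUES (?, ?)", (title, body_key))
        kv_store[body_key] = body      # may raise; nothing is committed yet
        db.commit()                    # the record becomes visible last
    except Exception:
        db.rollback()
        raise


def delete_doc(kv_store, title):
    """Commit the record deletion first, then clean up the second storage system."""
    keys = [row[0] for row in
            db.execute("SELECT body_key FROM doc WHERE title = ?", (title,))]
    db.execute("DELETE FROM doc WHERE title = ?", (title,))
    db.commit()
    for key in keys:
        kv_store.pop(key, None)


store = {}
insert_doc(store, "notes", "body/notes#v1", b"hello")
count_after_insert = db.execute("SELECT COUNT(*) FROM doc").fetchone()[0]
delete_doc(store, "notes")
count_after_delete = db.execute("SELECT COUNT(*) FROM doc").fetchone()[0]
```

With this ordering, a failure between the two steps can leave unreferenced data in the second storage system but never a committed record that points at missing data.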
  • the operation module may include a data definition sub-module, a data insertion, update, deletion, and query sub-module, and a data verification sub-module.
  • the data definition sub-module is used to determine the definition information of the data object according to the definition instruction.
  • the data insertion, update, deletion, and query sub-module can perform operations such as insert, update, delete, and query on the actually stored data object according to the insert instruction, update instruction, delete instruction, and query instruction.
  • the data verification sub-module can verify data in different storage systems to clear invalid data and ensure data consistency in multiple storage systems.
  • The operation module can also be divided in other ways as needed. For example, the data insertion, update, deletion, and query sub-module can be divided into a data insertion sub-module, a data update sub-module, a data deletion sub-module, and a data query sub-module, which is not specifically limited in the embodiments of this application.
  • the operation module may further include a first storage system operation sub-module and a second storage system operation sub-module.
  • the first storage system operation submodule is used to perform operations on data in the first storage system
  • the second storage system operation submodule is used to perform operations on data in the second storage system.
  • the operation module may also include a system adaptation sub-module that can process the data object so that the data object can adapt to multiple storage systems, or adapt the data objects fed back by an operation to the application environments of different applications.
  • it may include a first storage system adaptation sub-module and a second storage system adaptation sub-module.
  • The first storage system adaptation sub-module may include a database adaptation sub-module, which can interface with different databases, facilitate database switching, encapsulate database operation interfaces, and provide database-like interfaces for upper-layer services, including opening the database (open); database operations such as create, insert, update, delete, and query; and transaction operations such as begin and commit.
  • The second storage system adaptation sub-module may include a KV storage system adaptation sub-module, which can interface with different KV storage systems, facilitate KV storage switching, encapsulate the KV operation interface, and provide KV-like interfaces for upper-layer services, including operations such as put, get, and delete.
  • The second storage system adaptation sub-module may also include a file storage system adaptation sub-module, which can interface with different file storage systems, facilitate file storage system switching, encapsulate the file storage system interface, and provide file-storage-like interfaces for upper-layer services, including operations such as opening files (open), reading files (read), writing files (write), and closing files (close).
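  • The adaptation idea can be sketched as follows. This is only an illustrative sketch, not the implementation of this application: the class names KVAdapter and InMemoryKV are hypothetical, and only the put, get, and delete operations come from the description above. A database adapter or file storage adapter could follow the same pattern with its own operations.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class KVAdapter(ABC):
    """Hypothetical KV-like interface offered to upper-layer services."""

    @abstractmethod
    def put(self, key: str, value: bytes) -> None:
        """Store value under key."""

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]:
        """Return the value stored under key, or None if absent."""

    @abstractmethod
    def delete(self, key: str) -> None:
        """Remove key if it exists."""


class InMemoryKV(KVAdapter):
    """Stand-in backend; a real adapter would wrap an actual KV store."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)
```

  • Because upper-layer code depends only on the abstract interface, switching the KV storage system means supplying another subclass, which is the switching convenience the description refers to.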
  • FIG. 1C is a schematic diagram of another data management device provided by an embodiment of the present application.
  • the data management method of the embodiment of the present application can also be applied to a data management system.
  • FIG. 2 is a schematic diagram of the architecture of a data management system provided by an embodiment of the present application. The system includes a client and a data management device. These two devices are further introduced below.
  • the client is a device that provides local services to customers. Except for some applications that only run locally, the operation of the client generally needs to cooperate with the server. More commonly used clients include web browsers used on the World Wide Web, email clients for receiving and sending emails, photo album clients for storing images or videos, text clients for creating text, and client software for instant messaging.
  • the client may receive an operation instruction for the data object input by the user.
  • the operation instruction may include an insert instruction, an update instruction, a definition instruction, a query instruction, a delete instruction, a verification instruction, and so on.
  • the client can be an album client that stores images or videos, and can receive operation instructions for images or videos entered by the user;
  • the client can be a text client that creates text, and can receive the text input by the user.
  • the client can be instant messaging client software, which can receive operating instructions entered by the user for office documents, text, pictures, images, and audio/video data in the software.
  • a data management device is a device that provides data storage and processing services for clients, and can implement data management.
  • the management can include definition, storage, update, deletion, verification, and so on.
  • The client and the data management device are two independent devices that communicate with each other through a network or a data line.
  • the data management device can receive an operation instruction from the client, and then execute the operation instruction on the data object.
  • the structure of the data management device can refer to the structure described in Figures 1A to 1C, and only replace the "application module" illustrated in Figures 1A to 1C with a "receiving module".
  • The interface module is used to receive operation instructions for data objects from the client.
  • the functions of the remaining modules except for the application module in the modules illustrated in FIGS. 1A to 1C can refer to the above description, which will not be repeated here.
  • FIG. 3 is a flowchart of a data management method provided by an embodiment of the present application.
  • the data management device described below may be the data management device shown in any one of FIG. 1A to FIG. 1C and FIG. 2; the method includes but is not limited to the following steps.
  • the data object has multiple attributes, and the multiple attributes include structured attributes and unstructured attributes.
  • Structured attributes are attributes used to describe or define characteristics of structured data
  • unstructured attributes are attributes used to describe or define characteristics of unstructured data.
  • Structured data is data logically expressed and realized by a two-dimensional table structure, and is mainly stored and managed by relational databases; unstructured data is data that has an irregular or incomplete structure, has no predefined data model, and is inconvenient to represent with the two-dimensional logical tables of a database, such as documents, texts, pictures, reports, images, and audio/video information.
  • The record generated in the relational data table includes the structured attribute of the data object and the data corresponding to the structured attribute, as well as the association relationship between the structured attribute and the unstructured attribute. The relational data table is stored in the first storage system.
  • S302 Store data corresponding to the unstructured attribute of the data object in a second storage system.
  • the record of a data object in the relational data table can contain structured attribute fields and unstructured attribute fields.
  • the value of the structured attribute field is the data corresponding to the structured attribute
  • the value of the unstructured field is the data corresponding to the unstructured attribute.
  • the unstructured data corresponding to the unstructured attribute is stored in the second storage system.
  • the structured attributes and unstructured attributes of the data object can be associated.
  • The records in the relational data table include structured attributes and unstructured attributes, as well as the data corresponding to each of the structured attribute and the unstructured attribute.
  • The data corresponding to the structured attribute is the data itself, that is, the data content or data value; the data corresponding to the unstructured attribute is not the original data content but an identifier of the data.
  • the real data content is stored in the second storage system.
  • S303 Receive an operation instruction, where the operation instruction is used to perform an operation on the data object.
  • The operation instruction may be a query statement (query), a check statement (check), a delete statement (delete), or the like described in a data definition language (DDL), a data manipulation language (DML), etc., or may be a function call statement, etc.
  • the operation instruction indicates the object type of the data object involved in the operation.
  • the operation instruction may also include data required to perform an operation on the data object, such as data corresponding to the structured attribute and data corresponding to the unstructured attribute of the data object.
  • S306 Perform the operation on the data object based on the data corresponding to the at least one attribute.
  • Before the data management device generates the record of the data object, the data object may be defined based on a data definition instruction; that is, the name and type of each attribute of the data object are defined, for example, one or more structured attributes and one or more unstructured attributes of the data object.
  • The definition process of the data object includes: the data management device receives a definition instruction for the object type to which the data object belongs, where the definition instruction includes definition information of the object type, and the definition information is used to define the structure of the relational data table of the object type; and the data management device generates, in the first storage system according to the definition instruction, the relational data table of the object type, so as to associate the structured attributes and unstructured attributes of the data object.
  • the definition information may be definition information of an object type of "picture”, and the unstructured attributes of the data object may include files and key values.
  • The definition information of the "picture" type can be "picture(name STRING, size INT, path FILE, latitude DOUBLE, longitude DOUBLE, time_taken STRING, thumbnail KV)", and the relational data table of the object type "picture" generated according to the definition information can refer to Table 1.
  • The structured attributes in the definition information of the "picture" type are "name", "size", "latitude", "longitude", and "time_taken".
  • The unstructured attributes are "path" and "thumbnail", where "path" is the file attribute and "thumbnail" is the KV attribute.
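  • To illustrate how definition information of this form separates into structured and unstructured attributes, the following is a minimal sketch. The function name parse_definition and the parsing approach are assumptions for illustration; only the "picture(...)" definition string and the FILE and KV attribute types come from the example above.

```python
import re

# Attribute types treated as unstructured in the example (file and KV attributes).
UNSTRUCTURED_TYPES = {"FILE", "KV"}


def parse_definition(definition):
    """Split 'type(name TYPE, ...)' into object type and attribute lists."""
    match = re.fullmatch(r"\s*(\w+)\s*\((.*)\)\s*", definition)
    if match is None:
        raise ValueError("unrecognized definition: " + definition)
    object_type, body = match.groups()
    structured, unstructured = [], []
    for field in body.split(","):
        name, type_name = field.split()
        target = unstructured if type_name in UNSTRUCTURED_TYPES else structured
        target.append((name, type_name))
    return object_type, structured, unstructured


object_type, structured, unstructured = parse_definition(
    "picture(name STRING, size INT, path FILE, latitude DOUBLE, "
    "longitude DOUBLE, time_taken STRING, thumbnail KV)"
)
```

  • With the example definition, the five structured attributes stay with the relational data table, while "path" and "thumbnail" are the attributes whose content would go to the second storage system.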
  • The definition command containing the definition information can be "create(picture(name STRING, size INT, path FILE, latitude DOUBLE, longitude DOUBLE, time_taken STRING, thumbnail KV))".
  • "picture” is the object type to which the data object belongs.
  • In step S301, there may be two situations in which the data management device generates a record of the data object in the relational data table.
  • the first case is to generate a record of the data object based on the received insert instruction
  • the second case is to generate a record of the data object based on the received update instruction.
  • the insert instruction is used to insert the data object
  • the update instruction is used to update the data object.
  • Both the insert instruction and the update instruction for the data object indicate the object type of the data object and the data corresponding to the attribute of the data object. The following will give a specific introduction to these two situations.
  • The process of generating a record of the data object in the relational data table includes: generating, according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute, a record of the data object in the relational data table corresponding to the data object; and, after the data corresponding to the unstructured attribute of the data object is stored in the second storage system, submitting the transaction corresponding to the insert instruction or update instruction.
  • the data management device may determine the object type to which the data object belongs according to the insertion instruction; and determine the relational data table corresponding to the data object according to the object type.
  • The insert instruction for the data object is "insert(picture("Snoopy",2M,(data/Snoopy.jpg,Snoopy.jpg),39.92,116.46,"2018-10-12",(t_snoopy_key,t_snoopy.jpg)))".
  • The data management device may determine, according to the insert instruction, that the object type to which the data object belongs is "picture", and then determine, according to the object type "picture", that the relational data table corresponding to the data object is the relational data table of the "picture" object type in the first storage system.
  • the relational data table can refer to Table 1 above.
  • the data management device may obtain the data corresponding to the structured attribute and the data corresponding to the unstructured attribute of the data object from the insert instruction.
  • The insert instruction is "insert(picture("Snoopy",2M,(data/Snoopy.jpg,Snoopy.jpg),39.92,116.46,"2018-10-12",(t_snoopy_key,t_snoopy.jpg)))".
  • The data management device can obtain, from the insert instruction, the data corresponding to the structured attributes "name", "size", "latitude", "longitude", and "time_taken" of the data object.
  • The data management device can obtain the data (data/Snoopy.jpg, Snoopy.jpg) and (t_snoopy_key, t_snoopy.jpg) corresponding to the unstructured attributes "path" and "thumbnail" respectively, and fill these data into the corresponding unstructured attribute fields in the relational data table. It can be seen that the data corresponding to the unstructured attributes of the data object includes both the identifier and the content of the data.
  • The data (data/Snoopy.jpg, Snoopy.jpg) corresponding to the unstructured attribute "path" includes the identifier of the unstructured data, namely the path data/Snoopy.jpg, and the data content, namely the Snoopy.jpg file;
  • the data (t_snoopy_key, t_snoopy.jpg) corresponding to the unstructured attribute "thumbnail” includes the thumbnail identifier t_snoopy_key and the content of the thumbnail t_snoopy.jpg.
  • the following describes the detailed process of the data management device generating the record of the data object in the relational data table corresponding to the data object during the process of inserting the data object.
  • the unstructured attribute of the data object includes a KV attribute
  • the second storage system is a KV storage system.
  • According to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute, the data management device generates a record of the data object in the relational data table corresponding to the data object.
  • The method includes: generating the second key value according to the first version identifier and the first key value in the data corresponding to the KV attribute; and generating a record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute in the record and the data corresponding to the KV attribute in the record; wherein the data corresponding to the KV attribute in the record includes the second key value, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
  • the version identifier is used to indicate the version of the record.
  • the version identifier of the record can distinguish whether the record is a record of the data object before update or a record of the data object after update.
  • the meaning represented by the first version identifier is the version identifier of the record that has been stored in the database before the data object is updated.
  • the meaning represented by the second version identifier is the version identifier of the record stored in the database after the data object is updated.
  • In one possible situation, there are only two version identifiers in the data management device, for example, version1 and version2. If the first version identifier is version1, the data management device determines that the second version identifier is version2; if the first version identifier is version2, the data management device determines that the second version identifier is version1. In yet another possible situation, there may be multiple version identifiers in the data management device, such as version1, version2, and version3. If the first version identifier is version1, the data management device determines that the second version identifier is version2 or any version identifier other than version1; if the first version identifier is version2, the data management device determines that the second version identifier is version3 or any version identifier other than version2.
  • The method of generating the second key value according to the first version identifier and the first key value in the data corresponding to the KV attribute may include: adding the first version identifier to the first key value in the data corresponding to the KV attribute to generate the second key value. It should be noted that there may also be other ways of generating the second key value based on the first version identifier and the first key value in the data corresponding to the KV attribute, which is not limited here.
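  • The key generation and the two-identifier case can be sketched as follows. The function names are hypothetical, and the underscore-suffix format is taken from the "t_snoopy_key_version1" example in this description; as noted above, other ways of combining the version identifier and the key value are possible.

```python
def next_version(current):
    """Two-identifier case: alternate between version1 and version2."""
    return "version2" if current == "version1" else "version1"


def versioned_key(key, version):
    """Add the version identifier to the key value as a suffix."""
    return key + "_" + version


second_key = versioned_key("t_snoopy_key", "version1")
third_key = versioned_key("t_snoopy_key", next_version("version1"))
```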
  • the unstructured attributes of the data object include file attributes
  • the second storage system is a file storage system. According to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute, the data management device generates a record of the data object in a relational data table corresponding to the data object.
  • The method includes: generating a second path according to the first version identifier and the first path in the data corresponding to the file attribute; and generating a record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute in the record and the data corresponding to the file attribute in the record; wherein the data corresponding to the file attribute in the record includes the second path, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
  • the method of generating the second path according to the first version identifier and the first path in the data corresponding to the file attribute may include: adding the first version identifier to the first path in the data corresponding to the file attribute To generate the second path.
  • the data management device creates the second path in the database. It should be noted that there may also be other ways of generating the second path based on the first version identifier and the first path in the data corresponding to the file attribute, which is not limited here.
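  • A minimal sketch of the path generation follows. The function name is hypothetical, and placing the version identifier as a directory directly above the file name is inferred from the "data/version1/Snoopy.jpg" example in this description; other layouts are possible, as noted above.

```python
import posixpath


def versioned_path(path, version):
    """Insert the version identifier as a directory above the file name."""
    directory, filename = posixpath.split(path)
    return posixpath.join(directory, version, filename)


second_path = versioned_path("data/Snoopy.jpg", "version1")
```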
  • The unstructured attributes included in the data object are the file attribute (data/Snoopy.jpg, Snoopy.jpg) and the KV attribute (t_snoopy_key, t_snoopy.jpg).
  • The insert instruction for the data object is "insert(picture("Snoopy",2M,(data/Snoopy.jpg,Snoopy.jpg),39.92,116.46,"2018-10-12",(t_snoopy_key,t_snoopy.jpg)))".
  • The data object contains 7 attributes, and the first version identifier is version1.
  • the first attribute of the data object is a structured attribute
  • the data corresponding to the structured attribute in the record is the data "Snoopy” corresponding to the structured attribute.
  • other structured attributes other than the first structured attribute among the seven attributes of the data object can also refer to this method, which will not be repeated here.
  • The third attribute of the data object is the file attribute; the second path "data/version1/Snoopy.jpg" is generated according to the first version identifier version1 and the first path "data/Snoopy.jpg" in the data "(data/Snoopy.jpg,Snoopy.jpg)" corresponding to the file attribute, and the data corresponding to the file attribute in the record is the second path "data/version1/Snoopy.jpg".
  • The seventh attribute of the data object is the KV attribute; the second key value "t_snoopy_key_version1" is generated according to the first version identifier version1 and the first key value "t_snoopy_key" in the data "(t_snoopy_key,t_snoopy.jpg)" corresponding to the KV attribute, and the data corresponding to the KV attribute in the record is the second key value "t_snoopy_key_version1".
  • a record of the data object is generated in the relational data table corresponding to the data object .
  • the record can refer to Table 2 below.
  • the data management device will store the data "(t_snoopy_key_version1,t_snoopy.jpg)" corresponding to the KV attribute of the data object in the KV storage system, and the data corresponding to the file attribute of the data object "(data/version1/Snoopy.jpg, Snoopy.jpg)” is stored in the file storage system.
  • the data corresponding to the KV attributes stored in the KV storage system can be referred to Table 3:
  • After the data management device stores the data corresponding to the unstructured attribute of the data object in the second storage system, the data management device submits the database transaction corresponding to the insert instruction. It should be noted that the data management device must submit the database transaction only after both the first storage system and the second storage system have saved the corresponding data. This order of operations ensures that the data of the data object has been successfully stored in every storage system.
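  • The commit ordering described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation of this application: an SQLite connection stands in for the first storage system, plain dictionaries stand in for the KV storage system and the file storage system, and the function name and the "picture" table layout are hypothetical.

```python
import sqlite3


def insert_picture(db, kv_store, file_store, record, thumbnail, image):
    """Write the record and the unstructured data, then commit last.

    record is (name, size, path, latitude, longitude, time_taken, thumbnail_key),
    where path and thumbnail_key already carry the version identifier.
    """
    try:
        # 1. Generate the record in the relational data table (not yet committed).
        db.execute("INSERT INTO picture VALUES (?, ?, ?, ?, ?, ?, ?)", record)
        # 2. Store the unstructured data in the second storage system.
        kv_store[record[6]] = thumbnail
        file_store[record[2]] = image
        # 3. Submit the database transaction only after both writes succeeded.
        db.commit()
    except Exception:
        # The record never becomes visible if any earlier step failed.
        db.rollback()
        raise
```

  • If a write to the second storage system fails, the rollback leaves no record pointing at missing data; any unstructured data already written but left without a record is the kind of invalid data that the data verification sub-module described earlier is meant to clear.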
  • In step S301, for the second case, after the data management device receives the update instruction for updating the data object, the process of generating the record of the data object includes: generating a record of the data object in the relational data table corresponding to the data object; and submitting the transaction corresponding to the insert instruction or the update instruction, where the transaction is submitted after the data corresponding to the unstructured attribute of the data object is stored in the second storage system.
  • the data management device may determine the object type to which the data object belongs according to the update instruction; determine the relational data table corresponding to the data object according to the object type.
  • The update instruction for the data object is "update(picture("Snoopy",2M,(data/Snoopy.jpg,Snoopy.jpg),39.92,116.46,"2018-10-12",(t_snoopy_key,t_snoopy.jpg)))".
  • The data management device may determine, according to the update instruction, that the object type to which the data object belongs is "picture", and then determine, according to the object type "picture", that the relational data table corresponding to the data object is the relational data table of the "picture" object type in the first storage system.
  • For the relational data table please refer to Table 1 above.
  • the following describes how the data management device generates the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute during the update process. method.
  • the unstructured attribute of the data object includes a KV attribute
  • the second storage system is a KV storage system.
  • According to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute, the data management device generates a record of the data object in a relational data table corresponding to the data object.
  • The method includes: generating the third key value according to the second version identifier and the first key value in the data corresponding to the KV attribute; and generating a record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute in the record and the data corresponding to the KV attribute in the record; wherein the data corresponding to the KV attribute in the record includes the third key value, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
  • the meaning of the second version can refer to the content introduced above.
  • For the method of generating the third key value according to the second version identifier and the first key value in the data corresponding to the KV attribute, refer to the above description of generating the second key value according to the first version identifier and the first key value in the data corresponding to the KV attribute; details are not repeated here.
  • the unstructured attributes of the data object include file attributes
  • the second storage system is a file storage system. According to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute, the data management device generates a record of the data object in a relational data table corresponding to the data object.
  • The method includes: generating a third path according to the second version identifier and the first path in the data corresponding to the file attribute; and generating a record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute in the record and the data corresponding to the file attribute in the record; wherein the data corresponding to the file attribute in the record includes the third path, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
  • For the method of generating the third path according to the second version identifier and the first path in the data corresponding to the file attribute, refer to the above description of generating the second path according to the first version identifier and the first path in the data corresponding to the file attribute; details are not repeated here.
  • The unstructured attributes included in the data object are the file attribute (data/Snoopy.jpg, Snoopy.jpg) and the KV attribute (t_snoopy_key, t_snoopy.jpg).
  • The update instruction is "update(picture("Snoopy",2M,(data/Snoopy.jpg,Snoopy.jpg),39.92,116.46,"2018-10-12",(t_snoopy_key,t_snoopy.jpg)))".
  • The data object contains 7 attributes, and the second version identifier is version2.
  • the first attribute of the data object is a structured attribute
  • the data corresponding to the structured attribute in the record includes the data "Snoopy” corresponding to the structured attribute of the data object.
  • other structured attributes other than the first structured attribute among the seven attributes in the data object can also refer to this method, which will not be repeated here.
  • The third attribute of the data object is the file attribute; the third path "data/version2/Snoopy.jpg" is generated according to the second version identifier version2 and the first path "data/Snoopy.jpg" in the data "(data/Snoopy.jpg,Snoopy.jpg)" corresponding to the file attribute, and the data corresponding to the file attribute in the record is the third path "data/version2/Snoopy.jpg".
  • The seventh attribute of the data object is the KV attribute; the third key value "t_snoopy_key_version2" is generated according to the second version identifier version2 and the first key value "t_snoopy_key" in the data "(t_snoopy_key,t_snoopy.jpg)" corresponding to the KV attribute, and the data corresponding to the KV attribute in the record is the third key value "t_snoopy_key_version2".
  • a record of the data object is generated in the relational data table corresponding to the data object .
  • the record can refer to Table 5 below.
  • the record of the earlier version of the data object is already stored in the relational data table, so after the data management device receives the update instruction and generates the record of the updated version of the data object in the relational database, the The relational data table stored in the first storage system contains both the old and new version records, as shown in Table 6:
  • After the data management device stores the data corresponding to the unstructured attribute of the data object in the second storage system, the data management device submits the database transaction corresponding to the update instruction. It should be noted that the data management device must submit the database transaction only after both the first storage system and the second storage system have saved the corresponding data. This order of operations ensures that the data object is successfully stored in every storage system.
  • the operation command received in step S303 is a query command
  • the query command includes a query condition
  • the process of determining the record of the data object from the first storage system includes: selecting a record of the data object satisfying the query condition from the first storage system.
  • The query command is "query(picture("time_taken ≥ 2018-10-12"))", which means querying pictures whose shooting time is on or after October 12, 2018.
  • the data management device traverses the relational data table corresponding to the picture in the first storage system, and selects records whose shooting time is greater than or equal to 2018-10-12.
  • The obtained records can be picture("Snoopy",2M,data/version1/Snoopy.jpg,39.92,116.46,"2018-10-12",t_snoopy_key_version1) and picture("Stitch",1.5M,data/version2/Stitch.jpg,38.23,129.78,"2018-10-17",t_Stitch_key_version2).
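  • The traversal and selection in this query example can be sketched as follows. The function name and the dictionary representation of records are assumptions for illustration, and the third record ("Old") is a hypothetical entry added only so that the filter has something to exclude. Dates in the YYYY-MM-DD form used here compare correctly as plain strings.

```python
def query_time_taken_at_or_after(records, date):
    """Select records whose shooting time is on or after the given date."""
    return [record for record in records if record["time_taken"] >= date]


pictures = [
    {"name": "Snoopy", "path": "data/version1/Snoopy.jpg",
     "time_taken": "2018-10-12", "thumbnail": "t_snoopy_key_version1"},
    {"name": "Stitch", "path": "data/version2/Stitch.jpg",
     "time_taken": "2018-10-17", "thumbnail": "t_Stitch_key_version2"},
    # Hypothetical record, not from the description, taken before the cutoff.
    {"name": "Old", "path": "data/version1/Old.jpg",
     "time_taken": "2018-09-30", "thumbnail": "t_old_key_version1"},
]

selected = query_time_taken_at_or_after(pictures, "2018-10-12")
```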
  • the operation instruction received in step S303 is a delete instruction
  • the delete instruction includes the object type of the data object, and the data corresponding to the structured attribute and the data corresponding to the unstructured attribute of the data object.
  • The method for determining the record of the data object from the first storage system includes: determining the relational data table corresponding to the data object according to the object type; and determining the record of the data object from the relational data table, where the data corresponding to the structured attribute in the record is the same as the data corresponding to the structured attribute of the data object.
  • The records stored in the database include: the records picture("Snoopy",2M,data/version1/Snoopy.jpg,39.92,116.46,"2018-10-12",t_snoopy_key_version1) and picture("Stitch",1.5M,data/version2/Stitch.jpg,38.23,129.78,"2018-10-17",t_Stitch_key_version2) in the first relational data table, and the record video("Show",300M,data/version1/Show.avi,47.56,119.73,"2018-10-23",t_Show_key_version1) in the second relational data table.
  • The delete instruction is "delete(picture("Snoopy",2M,(data/Snoopy.jpg,Snoopy.jpg),39.92,116.46,"2018-10-12",(t_snoopy_key,t_snoopy.jpg)))".
  • The meaning of this delete instruction is to delete the data object "picture("Snoopy",2M,(data/Snoopy.jpg,Snoopy.jpg),39.92,116.46,"2018-10-12",(t_snoopy_key,t_snoopy.jpg))".
  • the data management device determines that the relation data table corresponding to the data object is the first relation data table according to the object type picture.
  • the data management device determines the record of the data object from the first relational data table, and the data corresponding to the structured attribute in the record is the same as the data corresponding to the structured attribute of the data object.
  • the determined record is picture("Snoopy", 2M, data/version1/Snoopy.jpg, 39.92, 116.46, “2018-10-12”, t_snoopy_key_version1).
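  • The record matching for deletion, together with the deletion order stated earlier in this description (database first, then the file system and KV system), can be sketched as follows. This is only a sketch under stated assumptions: an SQLite connection stands in for the first storage system, dictionaries stand in for the second storage system, and the function name and table layout are hypothetical.

```python
import sqlite3

# Match a record by the data corresponding to its structured attributes.
MATCH = "name = ? AND size = ? AND latitude = ? AND longitude = ? AND time_taken = ?"


def delete_picture(db, kv_store, file_store, structured):
    """Delete the record first, commit, then remove the unstructured data."""
    row = db.execute(
        "SELECT path, thumbnail FROM picture WHERE " + MATCH, structured
    ).fetchone()
    if row is None:
        return False
    path, key = row  # versioned path and key value stored in the record
    db.execute("DELETE FROM picture WHERE " + MATCH, structured)
    db.commit()
    # Only after the transaction is submitted is the second storage system touched.
    kv_store.pop(key, None)
    file_store.pop(path, None)
    return True
```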
  • the following describes a detailed process in which the data management device obtains data corresponding to the multiple attributes of the data object from the first storage system and the second storage system according to the record in step S305.
  • the unstructured attributes of the record include a KV attribute
  • the second storage system is a KV storage system.
  • the method for the data management device to obtain the data corresponding to the multiple attributes of the data object from the first storage system and the second storage system according to the record includes: reading the KV data corresponding to a key value from the second storage system according to the key value, and removing the version identifier from the key value; the key value is the data corresponding to the KV attribute in the record, and the version identifier includes a first version identifier and a second version identifier; the data corresponding to the KV attribute of the data object is the key value after removal of the version identifier together with the KV data; and the data corresponding to the structured attribute of the data object is the data corresponding to the structured attribute in the record.
  • the unstructured attributes of the record include file attributes
  • the second storage system is a file storage system.
  • the method for the data management device to obtain the data corresponding to the multiple attributes of the data object from the first storage system and the second storage system according to the record includes: reading the file data corresponding to a path from the second storage system according to the path, and removing the version identifier from the path; the path is the data corresponding to the file attribute in the record, and the version identifier includes a first version identifier and a second version identifier; the data corresponding to the file attribute of the data object is the path after removal of the version identifier together with the file data; and the data corresponding to the structured attribute of the data object is the data corresponding to the structured attribute in the record.
  • the first attribute in the record is a structured attribute
  • the data corresponding to the structured attribute of the data object is the data "Snoopy" corresponding to the first attribute.
  • the other structured attributes among the seven attributes in the record, apart from the first attribute, can be handled in the same way, which is not repeated here.
  • the third attribute in the record is a path; the file data "Snoopy.jpg" corresponding to the path is read from the file storage system according to the path "data/version1/Snoopy.jpg", and the version identifier version1 is removed from the path. The data corresponding to the file attribute of the data object is the path "data/Snoopy.jpg" after removal of the version identifier together with the file data "Snoopy.jpg", that is, "(data/Snoopy.jpg, Snoopy.jpg)".
  • the seventh attribute in the record is a key value; the KV data "t_snoopy.jpg" corresponding to the key value is read from the KV storage system according to the key value "t_snoopy_key_version1", and the version identifier version1 is removed from the key value. The data corresponding to the KV attribute of the data object is the key value "t_snoopy_key" after removal of the version identifier together with the KV data "t_snoopy.jpg", that is, "(t_snoopy_key, t_snoopy.jpg)".
  • obtaining, according to the record, the data corresponding to at least one of the multiple attributes of the data object from at least one of the first storage system and the second storage system includes: obtaining the data corresponding to the multiple attributes of the data object from the first storage system according to the record.
  • the data management device obtains the data corresponding to the multiple attributes of the data object: "Snoopy", 2M, data/version1/Snoopy.jpg, 39.92, 116.46, "2018-10-12", t_snoopy_key_version1.
  • in step S306, if the operation instruction is a query instruction, performing the operation on the data object based on the data corresponding to the at least one attribute includes: generating a query result according to the at least one attribute, and returning the query result to the request initiator, such as an application.
  • the data management device creates the data object according to the data corresponding to each of the seven attributes in the data object and the sequence of the seven attributes in the record.
  • the data object is "picture("Snoopy",2M,(data/Snoopy.jpg,Snoopy.jpg),39.92,116.46,"2018-10-12",(t_snoopy_key,t_snoopy.jpg))".
  • the data management device can also create the data object picture("Stitch", 1.5M, (data/Stitch.jpg, Stitch.jpg), 38.23, 129.78, "2018-10-17", (t_Stitch_key, t_Stitch.jpg)). After that, the data management device uses these two data objects as the query result.
  • in step S306, if the operation instruction is a delete instruction, performing the operation on the data object based on the data corresponding to the at least one attribute includes: deleting the data corresponding to the at least one attribute from the first storage system; and committing the transaction corresponding to the delete instruction.
  • the data management device deletes, from the first relational data table of the first storage system, the data corresponding to the multiple attributes of the data object: "Snoopy", 2M, data/version1/Snoopy.jpg, 39.92, 116.46, "2018-10-12", t_snoopy_key_version1. After the deletion, the data management device commits the transaction corresponding to the delete instruction.
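As a rough sketch of the delete path, the snippet below uses Python's standard `sqlite3` module as a stand-in for the first storage system. The table layout, the column names (`lat`, `lon`, `taken`, `thumb_key`), and the helper `delete_object` are assumptions made for illustration; the source does not name a database or a schema.

```python
import sqlite3

# In-memory SQLite database standing in for the first storage system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE picture (name TEXT, size TEXT, path TEXT, "
    "lat REAL, lon REAL, taken TEXT, thumb_key TEXT)"
)
conn.execute(
    "INSERT INTO picture VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("Snoopy", "2M", "data/version1/Snoopy.jpg", 39.92, 116.46,
     "2018-10-12", "t_snoopy_key_version1"),
)
conn.commit()


def delete_object(conn, name, size, lat, lon, taken):
    """Delete the record matched on its structured attributes, then commit."""
    cur = conn.execute(
        "DELETE FROM picture "
        "WHERE name = ? AND size = ? AND lat = ? AND lon = ? AND taken = ?",
        (name, size, lat, lon, taken),
    )
    # Only the database is modified here; the KV and file data that the
    # record pointed to are left for the later verification operation.
    conn.commit()
    return cur.rowcount


deleted = delete_object(conn, "Snoopy", "2M", 39.92, 116.46, "2018-10-12")
remaining = conn.execute("SELECT COUNT(*) FROM picture").fetchone()[0]
```

Because the delete touches only the relational table, it either commits as a whole or not at all, which is what lets the database transaction act as the single point of consistency.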
  • the MVCC mechanism can maintain multiple snapshot copies for each record in the database, and maintain the visibility of the copies through a start timestamp (begin timestamp) and an end timestamp (end timestamp).
  • start timestamp is used to indicate when a record is created
  • end timestamp is used to indicate when a record expires (or is deleted).
  • the timestamp does not store the actual time at which a record was created or expired; it stores the system version number at which the record was created or expired. The system version number keeps growing as transactions are created, and each transaction records its own system version number when it begins.
  • the start timestamp of the first record corresponding to the data object is the system version number of the current storage transaction, and the end timestamp of the first record is undefined.
  • the start timestamp of the second record corresponding to the updated data object is the system version number of the current update transaction, and the end timestamp of the second record is undefined; the system version number of the update transaction is greater than the system version number of the storage transaction.
  • the end timestamp of the first record will be defined as the system version number of the update transaction.
  • when the data management device executes the update instruction and commits the database transaction, the first record is deleted; that is, after the data management device commits the database transaction, only the updated record is kept in the database. In addition, if another transaction performs read access to the data object while the update transaction is in progress, the record of the data object read by the other transaction is the first record. In this way, database updates and reads do not block each other.
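The timestamp bookkeeping can be illustrated with a small sketch. The visibility rule used here (a record is visible to a reader whose system version number lies in the half-open range from the start timestamp up to, but not including, the end timestamp) is a common MVCC convention and is an assumption, since the source does not spell the rule out. `VersionedRecord` and `is_visible` are hypothetical names, and the version numbers 10 and 20 are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VersionedRecord:
    data: tuple
    begin_ts: int                 # system version number of the creating transaction
    end_ts: Optional[int] = None  # None models an undefined end timestamp


def is_visible(record, reader_version):
    """Assumed rule: visible when begin_ts <= reader_version < end_ts."""
    if record.begin_ts > reader_version:
        return False
    return record.end_ts is None or reader_version < record.end_ts


# The storage transaction (version 10) creates the first record.
first = VersionedRecord(("Snoopy", "data/version1/Snoopy.jpg"), begin_ts=10)
# The update transaction (version 20) creates the second record and
# defines the end timestamp of the first record as its own version number.
second = VersionedRecord(("Snoopy", "data/version2/Snoopy.jpg"), begin_ts=20)
first.end_ts = 20

visible_to_15 = [r.data[1] for r in (first, second) if is_visible(r, 15)]
visible_to_25 = [r.data[1] for r in (first, second) if is_visible(r, 25)]
```

A reader that started before the update (version 15) still sees the first record, while a reader that starts afterwards (version 25) sees only the second, so neither side has to wait for the other.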
  • the records stored in the database can refer to the content of Table 6 above.
  • there are two corresponding records for a data object in the database which are the record containing the first version identifier and the record containing the second version identifier.
  • the records stored in the database can refer to the contents of Table 3 above. If during the execution of the update instruction, another transaction performs read access to the data object, the record of the data object read by the other transaction is a record containing the first version identifier.
  • the records finally stored in the database can refer to the contents shown in Table 3.
  • in the KV storage system and the file storage system, there will then be two versions of the unstructured attribute data of the data object.
  • for the KV storage system, the unstructured attribute data corresponding to the data object includes "t_snoopy_key_version1, t_snoopy.jpg" and "t_snoopy_key_version2, t_snoopy.jpg"; for the file storage system, the unstructured attribute data corresponding to the data object includes "data/version1/Snoopy.jpg, Snoopy.jpg" and "data/version2/Snoopy.jpg, Snoopy.jpg".
  • "t_snoopy_key_version1, t_snoopy.jpg" and "data/version1/Snoopy.jpg, Snoopy.jpg" are invalid data. This invalid data can be cleaned up through the verification operation.
  • the following describes how the data management device performs the verification operation.
  • the unstructured attribute includes a KV attribute
  • the second storage system is a KV storage system.
  • the method for performing a verification operation includes: when a verification instruction is received or when a verification condition is detected to be satisfied, traversing the key values in the second storage system, where a key value is the data corresponding to the KV attribute in the second storage system; and, in the process of traversing the key values, if no key value identical to a fourth key value can be found in the records stored in the first storage system, deleting the fourth key value and the KV data corresponding to the fourth key value from the second storage system, where the fourth key value is one of the multiple key values in the second storage system.
  • the unstructured attributes include file attributes
  • the second storage system is a file storage system.
  • the method for performing a verification operation includes: when a verification instruction is received or when a verification condition is detected to be satisfied, traversing the paths in the second storage system, where a path is the data corresponding to the file attribute in the second storage system; and, in the process of traversing the paths, if no path identical to a fourth path can be found in the relational data table stored in the first storage system, deleting the fourth path and the file data corresponding to the fourth path from the second storage system, where the fourth path is one of the multiple paths in the second storage system.
  • the verification condition may be that the current moment reaches a preset verification period, or that the amount of data stored in the data management device is greater than a preset value, and so on.
  • the check instruction may be "check (picture)", which means to check a data object of the object type "picture”.
  • the data management device will traverse the data corresponding to the unstructured attributes in the second storage system. In this way, by comparing the records stored in the database with the data corresponding to the unstructured attributes in the second storage system, invalid data in the second storage system can be cleared, so that data objects stored across multiple data systems maintain data consistency.
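A minimal sketch of the verification operation, assuming plain dictionaries as stand-ins for the KV storage system and the file storage system. The function name `verify`, the column positions, and the single surviving record are illustrative assumptions.

```python
def verify(kv_store, file_store, records, file_col=2, kv_col=6):
    """Delete KV entries and files that no record references any more."""
    live_keys = {r[kv_col] for r in records}
    live_paths = {r[file_col] for r in records}
    removed = []
    for key in list(kv_store):        # copy the keys so entries can be deleted
        if key not in live_keys:
            del kv_store[key]
            removed.append(key)
    for path in list(file_store):
        if path not in live_paths:
            del file_store[path]
            removed.append(path)
    return removed


kv_store = {
    "t_snoopy_key_version1": "t_snoopy.jpg",   # left over from before the update
    "t_snoopy_key_version2": "t_snoopy.jpg",
}
file_store = {
    "data/version1/Snoopy.jpg": "Snoopy.jpg",  # left over from before the update
    "data/version2/Snoopy.jpg": "Snoopy.jpg",
}
# Assumed state of the database after the update: only the version2 record.
records = [
    ("Snoopy", "2M", "data/version2/Snoopy.jpg", 39.92, 116.46,
     "2018-10-12", "t_snoopy_key_version2"),
]

removed = verify(kv_store, file_store, records)
```

The database is treated as the source of truth: anything in the second storage system that no record points to is considered invalid and is removed.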
  • for the method by which the data management device deletes, from the second storage system, the data corresponding to the unstructured attribute of the first record, refer to the method by which the data management device executes the verification instruction. Since the first record has been deleted, the unstructured data corresponding to the first record in the second storage system can be cleared by executing the check instruction.
  • the data management device includes a generating unit 401, a storage unit 402, a receiving unit 403, a determining unit 404, an obtaining unit 405, and an operating unit 406.
  • the generating unit 401, storage unit 402, receiving unit 403, determining unit 404, obtaining unit 405, and operating unit 406 will be introduced below.
  • the generating unit 401 is configured to generate a record of a data object in a relational data table, where the data object has multiple attributes, the multiple attributes include structured attributes and unstructured attributes, the record indicates the association between the structured attribute and the unstructured attribute of the data object, and the relational data table is stored in the first storage system.
  • for the operations performed by the generating unit 401, reference may be made to the related description in step 301 in FIG. 3 above.
  • the storage unit 402 is configured to store data corresponding to the unstructured attributes of the data object in the second storage system. For operations performed by the storage unit 402, reference may be made to the related description in step 302 of FIG. 3 above.
  • the receiving unit 403 is configured to receive an operation instruction, and the operation instruction is used to perform an operation on the data object.
  • the receiving unit 403 may be a circuit or component that can be configured to receive information, such as a data transmission interface, a communication interface, or a receiver.
  • the determining unit 404 is configured to determine the record of the data object from the first storage system in response to the operation instruction. For the operation performed by the determining unit 404, reference may be made to the related description in step 304 in FIG. 3 above.
  • the obtaining unit 405 is configured to obtain, according to the record, the data corresponding to at least one of the multiple attributes of the data object from at least one of the first storage system and the second storage system.
  • for the operations performed by the obtaining unit 405, reference may be made to the related description in step 305 of FIG. 3 above.
  • the operation unit 406 is configured to perform the operation on the data object based on the data corresponding to the at least one attribute. For the operations performed by the operation unit 406, reference may be made to the related description of step 306 above.
  • each operation in FIG. 4 may also correspond to the corresponding description of the method embodiment shown in FIG. 3.
  • the above-mentioned units can be implemented in hardware, software or a combination of software and hardware.
  • the generation unit 401, the storage unit 402, the determination unit 404, the acquisition unit 405, and the operation unit 406 may be functional modules implemented by software.
  • the functions of these functional modules are implemented by programs or codes stored in the memory.
  • the management device executes these programs or codes through at least one processor to realize the functions of each functional module. Since the data corresponding to the multiple attributes of the data object stored across multiple data systems are obtained through the records in the relational data table, the data management device can allow the data object to maintain data consistency when stored across multiple data systems .
  • the data management device includes a processor 501, a memory 502, and a communication interface 503.
  • the processor 501, the memory 502, and the communication interface 503 are connected to each other through a bus 504.
  • the memory 502 includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM); the memory 502 is used to store related instructions and data.
  • the communication interface 503 may be a circuit or component that can be configured to receive information, such as a data transmission interface, a communication interface, or a receiver.
  • the processor 501 may be one or more central processing units (CPU).
  • the CPU may be a single-core CPU or a multi-core CPU.
  • the processor 501 in the data management device performs the following operations by reading and executing the program code stored in the memory 502:
  • the data object has multiple attributes, and the multiple attributes include structured attributes and unstructured attributes, and the record indicates the association relationship between the structured attributes and unstructured attributes of the data object.
  • the relational data table is stored in the first storage system.
  • the data corresponding to the unstructured attribute of the data object is stored in the second storage system.
  • An operation instruction is received, and the operation instruction is used to perform an operation on the data object.
  • the record of the data object is determined from the first storage system.
  • the data object can maintain data consistency when stored across multiple data systems.
  • a computer program product is provided.
  • the method of the embodiment shown in FIG. 3 is implemented.
  • a computer-readable storage medium stores a computer program, and the computer program implements the method of the embodiment shown in FIG. 3 when the computer program is executed by a computer.

Abstract

Disclosed in the embodiments of the present application are a data management method and a related device, used to keep the data stored for a same data object consistent across a plurality of types of storage systems. Data corresponding to a structured attribute of the data object is stored in a first storage system, for example, a relational database. Data corresponding to an unstructured attribute of the data object is stored in another type of storage system, for example, a KV system or a file system. An association relationship between the structured data and the unstructured data of the data object is recorded by means of relational data stored in the database. When the data object is operated on, the record in the database is accessed first, a key value and a path for the unstructured attribute are acquired from the record in the database, and then the data corresponding to the unstructured attribute is accessed by means of an interface of the storage system of the other type. Thus, data consistency between systems of a plurality of storage types can be implemented by means of database transaction consistency and a specified data access sequence.

Description

A data management method and related device
Technical field
This application relates to the field of computer technology, and in particular to a data management method and related device.
Background
In computer systems, data is generally divided into structured data and unstructured data. Structured data is data that is logically expressed and implemented by a two-dimensional table structure and that follows data format and length specifications, such as sales information and property information. Unstructured data is data whose structure is irregular or incomplete and that has no predefined data model, such as documents, pictures, audio, and video.
In terms of storage, structured data is generally stored in a relational database, large unstructured data is generally stored in a file storage system, and small unstructured data is generally stored in a key-value (KV) system. In practical applications, a data object may contain both structured data and unstructured data. For example, suppose the data object is the information of a picture. The attribute information of the picture, such as its name, size, shooting time, and the latitude and longitude of the shooting location, is structured data; the picture itself is unstructured data; and the thumbnail generated from the picture is also unstructured data. The name, size, shooting time, and latitude and longitude of the picture are then stored in the relational database, the picture itself is stored in the file storage system, and the thumbnail generated from the picture is stored in the KV storage system. It can be seen that, because a data object may contain both structured data and unstructured data, a data object may be stored across multiple data systems.
In the prior art, when a data object is stored across multiple data systems, a user can generally operate on the data in each data system separately, so the same data object may become inconsistent across the data systems. For example, if a user deletes the file of a picture from the file storage system, the user can still obtain the attribute information of the picture through the database, but the picture cannot be displayed normally because its file has been deleted. How to keep data consistent when it is stored across multiple data systems is a problem that those skilled in the art urgently need to solve.
Summary of the invention
The embodiments of this application provide a data management method and related device, which are used to keep a data object consistent when it is stored across multiple data systems.
In a first aspect, an embodiment of this application provides a data management method. The method includes: generating a record of a data object in a relational data table, where the data object has multiple attributes, the multiple attributes include a structured attribute and an unstructured attribute, the record indicates the association between the structured attribute and the unstructured attribute of the data object, and the relational data table is stored in a first storage system; storing the data corresponding to the unstructured attribute of the data object in a second storage system; receiving an operation instruction, where the operation instruction is used to perform an operation on the data object; in response to the operation instruction, determining the record of the data object from the first storage system; obtaining, according to the record, the data corresponding to at least one of the multiple attributes of the data object from at least one of the first storage system and the second storage system; and performing the operation on the data object based on the data corresponding to the at least one attribute. Because the data corresponding to the multiple attributes of the data object, which is stored across multiple data systems, is all obtained through the record, the data object can remain consistent when stored across multiple data systems.
With reference to the first aspect, in a possible implementation, generating the record of the data object in the relational data table includes: receiving an insert instruction or an update instruction, where the insert instruction is used to insert the data object, the update instruction is used to update the data object, and both instructions include the object type of the data object, the data corresponding to the structured attribute of the data object, and the data corresponding to the unstructured attribute of the data object; determining the relational data table corresponding to the data object according to the object type; generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute; and committing the transaction corresponding to the insert instruction or the update instruction, where the transaction is committed after the data corresponding to the unstructured attribute of the data object has been stored in the second storage system.
With reference to the first aspect, in a possible implementation, the received instruction is the insert instruction, the unstructured attribute of the data object includes a key-value (KV) attribute, and the second storage system is a KV storage system. Generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute includes: generating a second key value according to a first version identifier and a first key value in the data corresponding to the KV attribute; and generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute in the record and the data corresponding to the KV attribute in the record, where the data corresponding to the KV attribute in the record includes the second key value, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
With reference to the first aspect, in a possible implementation, the received instruction is the insert instruction, the unstructured attribute of the data object includes a file attribute, and the second storage system is a file storage system. Generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute includes: generating a second path according to the first version identifier and a first path in the data corresponding to the file attribute; and generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute in the record and the data corresponding to the file attribute in the record, where the data corresponding to the file attribute in the record includes the second path, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
With reference to the first aspect, in a possible implementation, the received instruction is the update instruction, the unstructured attribute of the data object includes a KV attribute, and the second storage system is a KV storage system. Generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute includes: generating a third key value according to a second version identifier and the first key value in the data corresponding to the KV attribute; and generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute in the record and the data corresponding to the KV attribute in the record, where the data corresponding to the KV attribute in the record includes the third key value, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
With reference to the first aspect, in a possible implementation, the received instruction is the update instruction, the unstructured attribute of the data object includes a file attribute, and the second storage system is a file storage system. Generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute includes: generating a third path according to the second version identifier and the first path in the data corresponding to the file attribute; and generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute in the record and the data corresponding to the file attribute in the record, where the data corresponding to the file attribute in the record includes the third path, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
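The derivation of the versioned key value and path, and the order in which the stores are written, can be sketched as follows. The naming scheme (a `_version1` suffix on the key value and a `version1` directory inserted before the file name) is inferred from the examples elsewhere in this document; `versioned_key`, `versioned_path`, and `insert_object` are hypothetical helpers, and the dictionaries and list stand in for the real storage systems.

```python
import posixpath


def versioned_key(first_key, version_id):
    """Build the key value stored in the record, e.g. t_snoopy_key_version1."""
    return f"{first_key}_{version_id}"


def versioned_path(first_path, version_id):
    """Insert the version identifier as a directory before the file name."""
    directory, filename = posixpath.split(first_path)
    return posixpath.join(directory, version_id, filename)


def insert_object(table, kv_store, file_store, obj, version_id="version1"):
    """Store the unstructured data first, then make the record visible."""
    name, size, (path, file_data), lat, lon, taken, (key, kv_data) = obj
    new_path = versioned_path(path, version_id)
    new_key = versioned_key(key, version_id)
    # Step 1: write the unstructured data to the second storage systems.
    file_store[new_path] = file_data
    kv_store[new_key] = kv_data
    # Step 2: append the record last; this stands in for committing the
    # database transaction after the unstructured data has been stored.
    record = (name, size, new_path, lat, lon, taken, new_key)
    table.append(record)
    return record


table, kv_store, file_store = [], {}, {}
obj = ("Snoopy", "2M", ("data/Snoopy.jpg", "Snoopy.jpg"), 39.92, 116.46,
       "2018-10-12", ("t_snoopy_key", "t_snoopy.jpg"))
record = insert_object(table, kv_store, file_store, obj)
```

An update would follow the same shape with the second version identifier (for example `version_id="version2"`), so that the old and new unstructured data never overwrite each other while readers may still be using the old record.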
With reference to the first aspect, in a possible implementation, the data corresponding to the unstructured attribute of the data object that is stored in the second storage system includes the identifier and the content corresponding to the unstructured attribute; and the data corresponding to the unstructured attribute in the record stored in the first storage system includes the identifier corresponding to the unstructured attribute.
With reference to the first aspect, in a possible implementation, the operation instruction includes a query instruction, and the query instruction includes a query condition. Determining the record of the data object from the first storage system in response to the operation instruction includes: in response to the operation instruction, selecting from the first storage system the record of the data object that satisfies the query condition. Obtaining, according to the record, the data corresponding to at least one of the multiple attributes of the data object from at least one of the first storage system and the second storage system includes: obtaining the data corresponding to the multiple attributes of the data object from the first storage system and the second storage system according to the record. Performing the operation on the data object based on the data corresponding to the at least one attribute includes: building the data object according to the data corresponding to each of the multiple attributes of the data object and the order of the multiple attributes in the record; and returning the data object as the query result.
With reference to the first aspect, in a possible implementation, the unstructured attribute of the record includes a KV attribute, and the second storage system is a KV storage system. Obtaining, according to the record, the data corresponding to the multiple attributes of the data object from the first storage system and the second storage system includes: reading, according to a key, the KV data corresponding to that key from the second storage system, and removing the version identifier from the key. The key is the data corresponding to the KV attribute in the record, and the version identifier includes a first version identifier and a second version identifier. The data corresponding to the KV attribute of the data object includes the key with the version identifier removed and the KV data; the data corresponding to the structured attribute of the data object includes the data corresponding to the structured attribute in the record.
With reference to the first aspect, in a possible implementation, the unstructured attribute of the record includes a file attribute, and the second storage system is a file storage system. Obtaining, according to the record, the data corresponding to the multiple attributes of the data object from the first storage system and the second storage system includes: reading, according to a path, the file data corresponding to that path from the second storage system, and removing the version identifier from the path. The path is the data corresponding to the file attribute in the record, and the version identifier includes a first version identifier and a second version identifier. The data corresponding to the file attribute of the data object includes the path with the version identifier removed and the file data; the data corresponding to the structured attribute of the data object includes the data corresponding to the structured attribute in the record.
With reference to the first aspect, in a possible implementation, the operation instruction further includes a delete instruction, and the delete instruction includes the object type of the data object, the data corresponding to the structured attribute of the data object, and the data corresponding to the unstructured attribute of the data object. Determining the record of the data object from the first storage system in response to the operation instruction includes: determining, according to the object type, the relational data table corresponding to the data object; and determining the record of the data object from the relational data table, where the data corresponding to the structured attribute in the record is the same as the data corresponding to the structured attribute of the data object. Obtaining, according to the record, the data corresponding to at least one of the multiple attributes of the data object from at least one of the first storage system and the second storage system includes: obtaining, according to the record, the data corresponding to the multiple attributes of the data object from the first storage system. Performing the operation on the data object based on the data corresponding to the at least one attribute includes: deleting the data corresponding to the multiple attributes of the data object from the first storage system; and committing the transaction corresponding to the delete instruction.
With reference to the first aspect, in a possible implementation, the unstructured attribute includes a KV attribute, the second storage system is a KV storage system, and the method further includes: when a verification instruction is received or when a verification condition is detected to be satisfied, traversing the keys in the second storage system, where each key is the data corresponding to a KV attribute in the second storage system; and, while traversing the keys, if no key identical to a fourth key can be found in the relational data table stored in the first storage system, deleting the fourth key and the KV data corresponding to the fourth key from the second storage system, where the fourth key is one of the multiple keys in the second storage system.
With reference to the first aspect, in a possible implementation, the unstructured attribute includes a file attribute, the second storage system is a file storage system, and the method further includes: when a verification instruction is received or when a verification condition is detected to be satisfied, traversing the paths in the second storage system, where each path is the data corresponding to a file attribute in the second storage system; and, while traversing the paths, if no path identical to a fourth path can be found in the relational data table stored in the first storage system, deleting the fourth path and the file data corresponding to the fourth path from the second storage system, where the fourth path is one of the multiple paths in the second storage system.
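The verification pass described in the two implementations above can be sketched as follows. This is a minimal illustration only, not the claimed implementation: an in-memory SQLite database stands in for the first storage system, a plain dictionary stands in for the KV storage system, and the table name `photo`, the column name `thumb_key`, and the function name `remove_orphans` are all hypothetical.

```python
import sqlite3


def remove_orphans(db, kv_store, table="photo", key_column="thumb_key"):
    """Delete KV entries whose key is not referenced by any record in the table.

    `table` and `key_column` are hypothetical names chosen for this sketch.
    """
    # Collect every key that the relational data table still references.
    referenced = {row[0] for row in db.execute(f"SELECT {key_column} FROM {table}")}
    # Traverse the keys of the second storage system and pick the unreferenced ones.
    orphans = [key for key in kv_store if key not in referenced]
    for key in orphans:
        del kv_store[key]  # removes the key together with its KV data
    return orphans


# First storage system stand-in: a relational data table holding one record.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE photo (id INTEGER PRIMARY KEY, name TEXT, thumb_key TEXT)")
db.execute("INSERT INTO photo (name, thumb_key) VALUES ('a.jpg', 'k1')")
db.commit()

# Second storage system stand-in: 'k2' has no matching record and is therefore invalid.
kv = {"k1": b"thumb-a", "k2": b"stale"}
removed = remove_orphans(db, kv)
```

The same loop would apply to a file storage system by traversing paths instead of keys and deleting the file behind each unreferenced path.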
With reference to the first aspect, in a possible implementation, before the relational data table corresponding to the data object is determined according to the object type, the method further includes: receiving a definition instruction for the object type to which the data object belongs, where the definition instruction contains definition information of the object type, and the definition information is used to define the structure of the relational data table of the object type; and generating, according to the definition instruction, the relational data table of the object type in the first storage system.
With reference to the first aspect, in a possible implementation, determining the relational data table corresponding to the data object according to the object type includes: determining, according to the insert instruction or the update instruction, the object type to which the data object belongs; and determining, according to the object type, the relational data table corresponding to the data object.
In a second aspect, an embodiment of this application provides a data management device, which includes units for executing the method described in the first aspect or in any of the possible implementations of the first aspect.
The data management device may be an electronic device, an apparatus within an electronic device that implements data management (for example, an operating system or a database management system), or a server, such as a database server or an application server.
The units included in the data management device may be implemented as hardware circuits, as software, or as hardware circuits combined with software.
In a third aspect, an embodiment of this application provides another data management device, including a processor and a memory that are connected to each other, where the memory is used to store program instructions, and the processor is used to call the program instructions in the memory to execute the method described in the first aspect or in any possible implementation of the first aspect.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium that stores program instructions; when the program instructions are run by a processor, the processor executes the method described in the first aspect or in any possible implementation of the first aspect.
In a fifth aspect, an embodiment of this application provides a computer program; when the computer program runs on a processor, the processor executes the method described in the first aspect or in any possible implementation of the first aspect.
In the embodiments of this application, the data management device may generate a record of a data object in a relational data table. The record indicates the association between the structured data and the unstructured data of the data object. The relational data table is stored in the first storage system, and the unstructured data of the data object is stored in the second storage system. When the data needs to be operated on, the record of the data object can be obtained from the first storage system; according to the record, the data corresponding to at least one of the multiple attributes of the data object is obtained from the first storage system and/or the second storage system; and the operation is then performed on the data object based on the data corresponding to the at least one attribute. Because the data corresponding to the multiple attributes of the data object, although stored across multiple data systems, is always obtained through the record in the relational data table, the data object remains consistent even when it is stored across multiple data systems.
Description of the drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below.
FIG. 1A is a schematic diagram of a data management device provided by an embodiment of this application;
FIG. 1B is a schematic diagram of another data management device provided by an embodiment of this application;
FIG. 1C is a schematic diagram of another data management device provided by an embodiment of this application;
FIG. 2 is a schematic diagram of the architecture of a data management system provided by an embodiment of this application;
FIG. 3 is a flowchart of a data management method provided by an embodiment of this application;
FIG. 4 is a schematic diagram of another data management device provided by an embodiment of this application;
FIG. 5 is a schematic diagram of another data management device provided by an embodiment of this application.
Detailed description
The technical solutions in the embodiments of this application are described in more detail below.
The data management method provided in the embodiments of this application can be applied to a data management device that includes a first storage system and a second storage system. The first storage system and the second storage system are storage systems of different types. The first storage system can be used to store the structured attributes of a data object and the corresponding structured data, and the second storage system is used to store the unstructured attributes of the data object and the corresponding unstructured data. A structured attribute is an attribute that describes or defines the characteristics of structured data, and an unstructured attribute is an attribute that describes or defines the characteristics of unstructured data. Structured data, also called row data, is data that is logically expressed and implemented by a two-dimensional table structure, and it is mainly stored and managed in relational databases. Unstructured data is data whose structure is irregular or incomplete, which has no predefined data model, and which is inconvenient to represent with the two-dimensional logical tables of a database, such as documents, text, pictures, reports, images, and audio/video information. In one embodiment, the first storage system may be a database, such as a relational database.
Optionally, the data object may have multiple unstructured attributes, in which case there may be multiple second storage systems. For example, if the unstructured attributes of the data object include a key-value (KV) attribute and a file attribute, the data management device may include a KV storage system (such as a KV database) and a file storage system ("file system" for short). Specifically, the data management device may be a terminal device such as a mobile phone, a tablet computer, a personal digital assistant (PDA), or a mobile internet device (MID), or it may be a device with data storage and processing functions such as a database server or an application server; this is not limited in the embodiments of this application.
The data management device can receive, through an application program that it runs, an operation instruction entered by a user for a data object, and execute the operation instruction on the data object. For example, the application may be an album that stores images or videos and receives the user's operation instructions on those images or videos; it may be text software for creating text that receives the user's operation instructions on the text; or it may be instant messaging software that receives the user's operation instructions on data in the software such as office documents, text, pictures, images, and audio/video.
FIG. 1A is a schematic diagram of a data management device provided by an embodiment of this application. The data management device includes an application module, an operation module, an interface module, and a storage system module. These modules are introduced below.
The application module may contain one or more application programs, which can receive operation instructions entered by the user for data objects. For example, the applications may include an album, a mailbox, document processing software, and so on.
The operation module is a data object management component that provides an interface to the application module. Through the operation module, the application module can define, insert, modify, delete, query, and verify data objects. Specifically, the operation module executes the operation indicated by an instruction received from the application module. The operation it performs depends on the instruction received; several different instructions are described below as examples.
If the instruction is a definition instruction, the operation module can determine, according to the definition instruction, the definition information of the object type of the data object, and then store the definition information in the storage system module through the interface module. The definition information of the object type of the data object includes the structured attributes and the unstructured attributes of the data object.
If the instruction is an insert instruction, the operation module can determine, according to the insert instruction, the information of the data object to be stored, and then store that information in the storage system module through the interface module. The operation module saves different types of data to different storage systems: structured data is saved to the database, file data is saved to the file storage system, and key-value data is saved to the KV storage system.
If the instruction is an update instruction, the operation module can determine, according to the update instruction, the information of the data object to be updated, and then store that information in the storage system module through the interface module. As with insertion, structured data is saved to the database, file data is saved to the file storage system, and key-value data is saved to the KV storage system.
If the instruction is a query instruction, the operation module can determine the query condition according to the query instruction, then select, through the interface module, the data objects in the storage system module that satisfy the query condition and return them to the application module.
If the instruction is a delete instruction, the operation module can determine, according to the delete instruction, the data object to be deleted, and then delete that data object from the storage system module through the interface module.
If the instruction is a verification instruction, the operation module can verify, according to the verification instruction and through the interface module, the data stored in the storage system module, so as to clear invalid data.
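How the definition and insert instructions route data to different storage systems can be sketched roughly as follows. The sketch assumes a hypothetical `definition` structure that tags each attribute as `structured`, `kv`, or `file`, and it assumes that each unstructured value arrives as an (identifier, content) pair; dictionaries stand in for the KV and file storage systems. None of these names or shapes come from the application itself.

```python
import sqlite3

# Hypothetical definition information for an object type named "photo".
definition = {
    "type": "photo",
    "attributes": [("name", "structured"), ("thumbnail", "kv"), ("image", "file")],
}


def create_table_sql(definition):
    """Build the CREATE TABLE statement for the object type's relational data table."""
    columns = ", ".join(f"{name} TEXT" for name, _ in definition["attributes"])
    return f"CREATE TABLE {definition['type']} ({columns})"


def split_object(definition, values, kv_store, file_store):
    """Store unstructured content in its own system and return the relational record."""
    row = []
    for name, kind in definition["attributes"]:
        if kind == "structured":
            row.append(values[name])  # the value itself goes into the record
        else:
            identifier, content = values[name]
            target = kv_store if kind == "kv" else file_store
            target[identifier] = content  # content goes to the second storage system
            row.append(identifier)  # only the identifier goes into the record
    return tuple(row)


db = sqlite3.connect(":memory:")
db.execute(create_table_sql(definition))
kv_store, file_store = {}, {}

row = split_object(
    definition,
    {"name": "cat", "thumbnail": ("thumb-1", b"\x01"), "image": ("/img/cat.jpg", b"\x02")},
    kv_store,
    file_store,
)
db.execute("INSERT INTO photo VALUES (?, ?, ?)", row)
db.commit()
```

The record ends up holding the structured value plus one identifier per unstructured attribute, which is what later lets a single table lookup lead to every part of the object.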
The interface module provides the interface for accessing the storage system module; the operation module accesses the data in the storage system module through it. In one embodiment, as shown in FIG. 1B, the interface module contains a first storage system interface submodule and a second storage system interface submodule, and the storage system module contains a first storage system submodule and a second storage system submodule. The operation module can access the data in the first storage system through the first storage system interface submodule, and the data in the second storage system through the second storage system interface submodule.
In one embodiment, the first storage system is a database system, such as a relational database, and the first storage system interface submodule is a database system interface. The second storage system is a storage system for non-relational data, such as a KV storage system and/or a file storage system; correspondingly, the second storage system interface submodule includes a KV system interface submodule and/or a file storage system interface submodule. The data management device stores the data corresponding to the structured attributes of a data object in the first storage system, and stores the data corresponding to the unstructured attributes of the same data object in the second storage system. Further, the data management device generates a relational data table in the first storage system to establish the association between the structured attributes and the unstructured attributes of the data object. The relational data table contains the record of the data object. The record includes the names of the structured attributes, the names of the unstructured attributes, the data content (value) corresponding to each structured attribute, and the identifier of the data corresponding to each unstructured attribute (the data content corresponding to an unstructured attribute is itself stored in the second storage system).
When the operation module receives an operation instruction for the data object from the application module, it first determines the record of the data object in the relational data table stored in the first storage system. It can then obtain the data corresponding to the structured attributes of the data object from the relational data table according to the record, determine from the record the identifier of the data corresponding to each unstructured attribute, and use that identifier to obtain the data (data content) corresponding to the unstructured attribute from the second storage system. Finally, the data management device performs the corresponding operation on the data object based on the obtained data corresponding to the structured attributes and/or the data corresponding to the unstructured attributes. For the detailed process by which the data management device generates the relational data table of a data object, stores data, and operates on data, see the embodiment related to FIG. 3.
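The lookup order described above (record first, then the identifier, then the content) can be illustrated with a small sketch. It assumes an in-memory SQLite database as the first storage system and a dictionary as the second storage system; the table `note`, the column `body_key`, and both helper functions are hypothetical names introduced only for this example.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # stand-in for the first storage system
kv_store = {}  # stand-in for the second storage system

# One structured attribute (title) and the identifier of one unstructured attribute.
db.execute("CREATE TABLE note (id INTEGER PRIMARY KEY, title TEXT, body_key TEXT)")


def insert_note(title, body):
    """Store the content in the second system and its identifier in the record."""
    key = f"note-body-{title}"  # hypothetical identifier scheme
    kv_store[key] = body
    db.execute("INSERT INTO note (title, body_key) VALUES (?, ?)", (title, key))
    db.commit()


def query_notes(title):
    """Select matching records, then resolve each identifier to its content."""
    rows = db.execute(
        "SELECT title, body_key FROM note WHERE title = ?", (title,)
    ).fetchall()
    # The record supplies the structured value directly and the identifier
    # that is used to fetch the unstructured content.
    return [{"title": t, "body": kv_store[k]} for t, k in rows]


insert_note("todo", b"buy milk")
result = query_notes("todo")
```

Because every read goes through the record, the caller never addresses the second storage system directly; it only ever sees assembled data objects.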
In one embodiment, the first storage system submodule may be a database that supports multi-version concurrency control (MVCC), such as the lightweight database SQLite. MVCC can maintain multiple snapshot copies of each record in the database, and maintains the visibility of those copies through a begin timestamp and an end timestamp. The second storage system submodule is a storage system that supports persistence, such as a KV system, the flash-friendly file system (F2FS), or the fourth extended file system (EXT4). Persistence is the mechanism that converts data between a persistent state and a transient state; put simply, transient data (such as data in memory) is persisted into persistent data that can be stored for a long time. When the data management device accesses a data object, it first accesses the record of the data object in the database and then, according to the content of the record, operates on the data stored in the second storage system, so that the concurrency control of the database also provides concurrency-controlled access to the second storage system (such as a file system or a KV system).
For insert and modify operations on a data object, the data management device must commit the database transaction only after the operations on the file system and the KV system have completed; for a delete operation, the data management device must first operate on the data in the database and commit the transaction, and only then operate on the data in the file system and the KV system.
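The ordering rule for commits can be sketched as follows, again with an in-memory SQLite database and a dictionary as stand-ins. The names `doc`, `blob_key`, `insert_doc`, and `delete_doc` are hypothetical. The comment about leftover data is this sketch's reading of why the ordering is chosen, not a statement taken from the application.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # stand-in for the first storage system
db.execute("CREATE TABLE doc (name TEXT PRIMARY KEY, blob_key TEXT)")
kv_store = {}  # stand-in for the second storage system


def insert_doc(name, content):
    """Insert: finish the second-system write first, then commit the transaction."""
    key = "blob:" + name  # hypothetical identifier scheme
    db.execute("INSERT INTO doc (name, blob_key) VALUES (?, ?)", (name, key))
    kv_store[key] = content  # the second storage system operation completes here
    db.commit()  # only now does the record become durable


def delete_doc(name):
    """Delete: remove the record and commit first, then remove the content."""
    row = db.execute("SELECT blob_key FROM doc WHERE name = ?", (name,)).fetchone()
    if row is None:
        return False
    db.execute("DELETE FROM doc WHERE name = ?", (name,))
    db.commit()
    # If the process stopped here, the content would remain without a record;
    # a later verification pass could then clear it as invalid data.
    kv_store.pop(row[0], None)
    return True


insert_doc("a", b"x")
stored_after_insert = dict(kv_store)
first_delete = delete_doc("a")
second_delete = delete_doc("a")
```

With this ordering, a committed record should never point at content that is missing; the worst case after an interruption is unreferenced content in the second storage system.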
FIG. 1C shows a more specific implementation of the data management device. According to FIG. 1C, the operation module may include a data definition submodule; a data insertion, update, deletion, and query submodule; and a data verification submodule. The data definition submodule is used to determine the definition information of a data object according to a definition instruction. The data insertion, update, deletion, and query submodule can perform insert, update, delete, and query operations on the actually stored data objects according to insert, update, delete, and query instructions. The data verification submodule can verify the data in the different storage systems to clear invalid data and keep the data in the multiple storage systems consistent. It should be noted that the operation module can also be divided in other ways as needed; for example, the data insertion, update, deletion, and query submodule can be divided into a data insertion submodule, a data update submodule, a data deletion submodule, and a data query submodule. This is not specifically limited in the embodiments of this application.
Optionally, the operation module may further include a first storage system operation submodule and a second storage system operation submodule. The first storage system operation submodule is used to perform operations on the data in the first storage system, and the second storage system operation submodule is used to perform operations on the data in the second storage system.
Optionally, because the storage system module may include multiple storage systems, the operation module may further include a system adaptation submodule. The system adaptation submodule can process a data object so that it can be adapted to multiple storage systems, or so that the data object returned by an operation suits the environment of different applications. For example, it may include a first storage system adaptation submodule and a second storage system adaptation submodule. The first storage system adaptation submodule may include a database adaptation submodule, which can interface with different databases, makes it easy to switch databases, encapsulates the database operation interface, and provides database-like interfaces to upper-layer services, including opening the database (open); the create, insert, update, delete, and query operations; and transaction operations such as beginning a transaction (begin) and committing a transaction (commit).
In another possible implementation, the second storage system adaptation submodule may include a KV storage system adaptation submodule, which can interface with different KV storage systems, makes it easy to switch KV stores, encapsulates the KV operation interface, and provides KV-like interfaces to upper-layer services, including put, get, and delete operations. In another possible implementation, the second storage system adaptation submodule may also include a file storage system adaptation submodule, which can interface with different file storage systems, makes it easy to switch file storage systems, encapsulates the file storage system interface, and provides file-storage-like interfaces to upper-layer services, including opening a file (open), reading a file (read), writing a file (write), and closing a file (close). See FIG. 1C, which is a schematic diagram of another data management device provided by an embodiment of this application.
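One way to picture the KV storage system adaptation submodule is as an abstract interface with interchangeable back ends. The sketch below is only an illustration under that assumption: the class names `KVAdapter` and `InMemoryKV` are hypothetical, and only the put, get, and delete operations named in the text are modelled.

```python
from abc import ABC, abstractmethod


class KVAdapter(ABC):
    """Hypothetical KV-like interface offered to upper-layer services."""

    @abstractmethod
    def put(self, key, value):
        """Store value under key."""

    @abstractmethod
    def get(self, key):
        """Return the value stored under key, or None if the key is absent."""

    @abstractmethod
    def delete(self, key):
        """Remove key and its value if present."""


class InMemoryKV(KVAdapter):
    """Dictionary-backed stand-in for one concrete KV storage system."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


# Upper layers depend only on KVAdapter, so the back end can be swapped.
store: KVAdapter = InMemoryKV()
store.put("k", b"v")
fetched = store.get("k")
store.delete("k")
after_delete = store.get("k")
```

A database adapter (open, create, insert, update, delete, query, begin, commit) or a file adapter (open, read, write, close) could follow the same pattern, with one abstract class per kind of storage system.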
In another possible implementation, the data management method of the embodiments of this application can also be applied to a data management system. FIG. 2 is a schematic diagram of the architecture of a data management system provided by an embodiment of this application. The system includes a client and a data management device, which are introduced below.
A client is a device that provides local services to users. Apart from some applications that run only locally, a client generally needs to work together with a server. Commonly used clients include web browsers for the World Wide Web, email clients for sending and receiving email, album clients for storing images or videos, text clients for creating text, and instant messaging client software. In the embodiments of this application, the client may receive an operation instruction for a data object input by a user. The operation instruction may include an insert instruction, an update instruction, a definition instruction, a query instruction, a delete instruction, a verification instruction, and so on. For example, the client may be an album client that stores images or videos and can receive user-input operation instructions for images or videos; the client may be a text client that creates text and can receive user-input operation instructions for text; or the client may be instant messaging client software that can receive user-input operation instructions for data in the software such as office documents, text, pictures, images, and audio/video.
A data management device is a device that provides data storage and processing services for the client and can manage data. For example, the management may include definition, storage, update, deletion, verification, and so on. The client and the data management device are two independent devices that communicate over a network or a data cable. The data management device can receive an operation instruction from the client and then execute the operation instruction on the data object. For the structure of the data management device, refer to the structure described in FIG. 1A to FIG. 1C above, except that the “application module” shown in FIG. 1A to FIG. 1C is replaced with a “receiving module”, which is used to receive, from the client, operation instructions for data objects. For the functions of the remaining modules shown in FIG. 1A to FIG. 1C, other than the application module, refer to the descriptions above; details are not repeated here.
See FIG. 3, which is a flowchart of a data management method provided by an embodiment of this application. The data management device described below may be the data management device shown in any of FIG. 1A to FIG. 1C and FIG. 2. The method includes, but is not limited to, the following steps.
S301: Generate a record of a data object in a relational data table.
The data object has multiple attributes, including structured attributes and unstructured attributes. A structured attribute is an attribute used to describe or define a characteristic of structured data, and an unstructured attribute is an attribute used to describe or define a characteristic of unstructured data. Structured data is data that is logically expressed and implemented by a two-dimensional table structure, and it is mainly stored and managed in a relational database. Unstructured data is data whose structure is irregular or incomplete, that has no predefined data model, and that is inconvenient to represent in the two-dimensional logical tables of a database, such as documents, text, pictures, reports, images, and audio/video information. The record generated in the relational data table includes the structured attribute of the data object and the data corresponding to the structured attribute, as well as the association between the structured attribute and the unstructured attribute. The relational data table is stored in a first storage system.
S302: Store the data corresponding to the unstructured attribute of the data object in a second storage system.
The record of a data object in the relational data table may contain structured attribute fields and unstructured attribute fields. The value of a structured attribute field is the data corresponding to the structured attribute, and the value of an unstructured attribute field is an identifier of the data corresponding to the unstructured attribute, such as a key value or a path. Further, the unstructured data corresponding to the unstructured attribute is stored in the second storage system. In other words, the relational data table in the first storage system associates the structured attributes and unstructured attributes of the data object: a record in the relational data table contains the structured attributes and the unstructured attributes, together with the data corresponding to each. Note that, in a record of the relational data table, the data corresponding to a structured attribute is the data itself, that is, the data content or data value, whereas the data corresponding to an unstructured attribute is not the original data content but an identifier of the data; the actual data content is stored in the second storage system. The following embodiments describe this in detail.
S303: Receive an operation instruction, where the operation instruction is used to perform an operation on the data object.
Specifically, the operation instruction may be a query statement (query), a check statement (check), a delete statement (delete), a function call statement, or the like, described using a data definition language (DDL), a data manipulation language (DML), or the like. The operation instruction indicates the object type to which the data object involved in the operation belongs. Optionally, the operation instruction may further include the data required to perform the operation on the data object, such as the data corresponding to the structured attributes and the data corresponding to the unstructured attributes of the data object.
S304: In response to the operation instruction, determine the record of the data object from the first storage system.
S305: According to the record, obtain, from at least one of the first storage system and the second storage system, the data corresponding to at least one of the multiple attributes of the data object.
S306: Perform the operation on the data object based on the data corresponding to the at least one attribute.
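Steps S304 and S305 can be sketched as a lookup that returns a structured value directly from the record and follows the stored identifier into the second storage system for an unstructured value. This is only an illustrative sketch: the function `read_attribute`, the type labels in `attr_types`, and the use of plain dictionaries in place of the KV and file storage systems are all assumptions.

```python
from typing import Any, Dict


def read_attribute(
    record: Dict[str, Any],
    attr_types: Dict[str, str],
    attr: str,
    kv_store: Dict[str, bytes],
    file_store: Dict[str, bytes],
) -> Any:
    """Return the data for one attribute of a data object.

    record     -- one row of the relational data table (first storage system)
    attr_types -- attribute name to declared type, e.g. "STRING", "KV", "FILE"
    kv_store / file_store -- stand-ins for the second storage system
    """
    value = record[attr]
    attr_type = attr_types[attr]
    if attr_type == "KV":
        # The record holds only the key; the content lives in the KV store.
        return kv_store[value]
    if attr_type == "FILE":
        # The record holds only the path; the content lives in the file store.
        return file_store[value]
    # Structured attribute: the record holds the data itself.
    return value
```

An operation such as a query (S306) would then act on whatever `read_attribute` returns, without needing to know which storage system supplied it.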
In one embodiment, before the data management device generates the record of the data object, the data object may be defined based on a data definition instruction, that is, the name and type of each attribute of the data object are defined, for example, one or more structured attributes and one or more unstructured attributes of the data object. In a specific implementation, the definition process of the data object includes: the data management device receives a definition instruction for the object type to which the data object belongs, where the definition instruction contains definition information of the object type, and the definition information is used to define the structure of the relational data table of the object type; and, according to the definition instruction, the data management device generates the relational data table of the object type in the first storage system, so as to associate the structured attributes and unstructured attributes of the data object.
For example, the definition information may be the definition information of the object type “picture”, and the unstructured attributes of the data object may include a file and a key value. The definition information of the “picture” type may be “picture(name STRING, size INT, path FILE, latitude DOUBLE, longitude DOUBLE, time_taken STRING, thumbnail KV)”. For the relational data table of the object type “picture” generated according to this definition information, refer to Table 1.
Table 1
Attribute name | name | size | path | latitude | longitude | time_taken | thumbnail
Attribute type | STRING | INT | FILE | DOUBLE | DOUBLE | STRING | KV
In the definition information of the “picture” type, the structured attributes are “name”, “size”, “latitude”, “longitude”, and “time_taken” (the capture time), and the unstructured attributes are “path” and “thumbnail”, where “path” is a file attribute and “thumbnail” is a KV attribute. The definition instruction containing this definition information may be “create(picture(name STRING, size INT, path FILE, latitude DOUBLE, longitude DOUBLE, time_taken STRING, thumbnail KV))”, where “picture” is the object type to which the data object belongs.
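To make the classification of attributes concrete, the sketch below parses the “picture” definition information into attribute names and types and separates structured from unstructured attributes. It is a hedged illustration only: the function `parse_definition` and the rule that exactly the types FILE and KV count as unstructured are assumptions drawn from this one example, and the application does not specify a parser.

```python
import re
from typing import List, Tuple

# Assumption: in this example, FILE and KV are the unstructured types.
UNSTRUCTURED_TYPES = {"FILE", "KV"}


def parse_definition(definition: str):
    """Parse 'type(attr TYPE, attr TYPE, ...)' into its parts."""
    match = re.fullmatch(r"\s*(\w+)\s*\((.*)\)\s*", definition)
    if match is None:
        raise ValueError("unrecognized definition: " + definition)
    object_type, body = match.groups()

    attributes: List[Tuple[str, str]] = []
    for part in body.split(","):
        name, attr_type = part.split()  # e.g. "size INT"
        attributes.append((name, attr_type))

    structured = [n for n, t in attributes if t not in UNSTRUCTURED_TYPES]
    unstructured = [n for n, t in attributes if t in UNSTRUCTURED_TYPES]
    return object_type, attributes, structured, unstructured


object_type, attributes, structured, unstructured = parse_definition(
    "picture(name STRING, size INT, path FILE, latitude DOUBLE, "
    "longitude DOUBLE, time_taken STRING, thumbnail KV)"
)
```

For the definition above, `structured` contains the five attributes listed in the text and `unstructured` contains `path` and `thumbnail`.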
In step S301, the data management device may generate the record of the data object in the relational data table in either of two cases. In the first case, the record is generated based on a received insert instruction; in the second case, the record is generated based on a received update instruction. The insert instruction is used to insert the data object, and the update instruction is used to update the data object. Both the insert instruction and the update instruction for the data object indicate the object type of the data object and the data corresponding to the attributes of the data object. The two cases are described in detail below.
In the first case, after the data management device receives the insert instruction for inserting the data object, the process of generating the record of the data object in the relational data table includes: generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute; and committing the transaction corresponding to the insert instruction or update instruction after the data corresponding to the unstructured attribute of the data object has been stored in the second storage system.
The data management device may determine, according to the insert instruction, the object type to which the data object belongs, and determine, according to the object type, the relational data table corresponding to the data object.
For example, the insert instruction for the data object is “insert(picture(“Snoopy”,2M,(data/Snoopy.jpg,Snoopy.jpg),39.92,116.46,“2018-10-12”,(t_snoopy_key,t_snoopy.jpg)))”. The data management device can determine from the insert instruction that the object type to which the data object belongs is “picture”, and then determine from this object type that the relational data table corresponding to the data object is the relational data table of the “picture” object type in the first storage system. For this relational data table, refer to Table 1 above.
Specifically, the data management device can obtain, from the insert instruction, the data corresponding to the structured attributes and the data corresponding to the unstructured attributes of the data object. For example, the insert instruction is “insert(picture(“Snoopy”,2M,(data/Snoopy.jpg,Snoopy.jpg),39.92,116.46,“2018-10-12”,(t_snoopy_key,t_snoopy.jpg)))”. The data management device can obtain from the insert instruction the data “Snoopy”, 2M, 39.92, 116.46, and “2018-10-12” corresponding respectively to the structured attributes “name”, “size”, “latitude”, “longitude”, and “time_taken” of the data object, and fill these data into the corresponding structured attribute fields in the relational data table. Likewise, the data management device can obtain the data (data/Snoopy.jpg, Snoopy.jpg) and (t_snoopy_key, t_snoopy.jpg) corresponding respectively to the unstructured attributes “path” and “thumbnail”, and fill these data into the corresponding unstructured attribute fields in the relational data table. As can be seen, the data corresponding to an unstructured attribute of the data object includes both an identifier and the content of the data. For example, the data (data/Snoopy.jpg, Snoopy.jpg) corresponding to the unstructured attribute “path” includes the identifier of the unstructured data, namely the path data/Snoopy.jpg, and the data content, namely the file Snoopy.jpg; the data (t_snoopy_key, t_snoopy.jpg) corresponding to the unstructured attribute “thumbnail” includes the thumbnail identifier t_snoopy_key and the thumbnail content t_snoopy.jpg.
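The separation described above, where each unstructured value carries an identifier plus content while each structured value is the data itself, can be sketched as follows. The function `split_insert_values` is hypothetical, the values are assumed to be already parsed out of the insert instruction, and the size “2M” is kept as a string because the text does not define its literal syntax.

```python
from typing import Any, Dict, List, Tuple


def split_insert_values(attributes: List[Tuple[str, str]], values: List[Any]):
    """Split insert values into structured data and (identifier, content) pairs.

    attributes -- (name, type) pairs in definition order
    values     -- values in the same order; unstructured ones are 2-tuples
    """
    if len(attributes) != len(values):
        raise ValueError("attribute/value count mismatch")

    structured: Dict[str, Any] = {}
    unstructured: Dict[str, Tuple[str, Any]] = {}
    for (name, attr_type), value in zip(attributes, values):
        if attr_type in ("FILE", "KV"):
            identifier, content = value  # e.g. (path, file) or (key, thumbnail)
            unstructured[name] = (identifier, content)
        else:
            structured[name] = value
    return structured, unstructured


PICTURE_ATTRS = [
    ("name", "STRING"), ("size", "INT"), ("path", "FILE"),
    ("latitude", "DOUBLE"), ("longitude", "DOUBLE"),
    ("time_taken", "STRING"), ("thumbnail", "KV"),
]
structured, unstructured = split_insert_values(
    PICTURE_ATTRS,
    ["Snoopy", "2M", ("data/Snoopy.jpg", b"<Snoopy.jpg bytes>"),
     39.92, 116.46, "2018-10-12", ("t_snoopy_key", b"<t_snoopy.jpg bytes>")],
)
```

The byte strings stand in for the actual file and thumbnail contents, which the sketch does not attempt to load.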
The following describes in detail how, during insertion of the data object, the data management device generates the record of the data object in the relational data table corresponding to the data object.
In one embodiment, the unstructured attribute of the data object includes a KV attribute, and the second storage system is a KV storage system. The method by which the data management device generates the record of the data object in the relational data table corresponding to the data object, according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute, includes: generating a second key value according to a first version identifier and a first key value in the data corresponding to the KV attribute; and generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute in the record and the data corresponding to the KV attribute in the record, where the data corresponding to the KV attribute in the record includes the second key value, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
The version identifier indicates the version of the record. During an update of the data object, the version identifier of a record distinguishes whether the record is the record of the data object before the update or the record of the data object after the update. Optionally, the first version identifier is the version identifier of the record already stored in the database before the data object is updated, and the second version identifier is the version identifier of the record stored in the database after the data object is updated. In one possible case, only two version identifiers exist in the data management device, for example, version1 and version2. If the first version identifier is version1, the data management device determines that the second version identifier is version2; if the first version identifier is version2, the data management device determines that the second version identifier is version1. In another possible case, multiple version identifiers may exist in the data management device, for example, version1, version2, version3, and so on. If the first version identifier is version1, the data management device determines that the second version identifier is version2 or another version identifier other than version1; if the first version identifier is version2, the data management device determines that the second version identifier is version3 or another version identifier other than version2.
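The choice of the second version identifier can be illustrated with a small helper. This is a sketch under assumptions: the function `next_version` and its round-robin rule are hypothetical. With two identifiers it reproduces the toggle described above; with more identifiers it picks the next one in the list, which is only one of the choices the text permits.

```python
from typing import Sequence


def next_version(
    first_version: str,
    known_versions: Sequence[str] = ("version1", "version2"),
) -> str:
    """Return a second version identifier that differs from the first.

    Assumption: identifiers are reused cyclically in list order.
    """
    index = known_versions.index(first_version)  # ValueError if unknown
    return known_versions[(index + 1) % len(known_versions)]
```

For example, `next_version("version2")` yields `"version1"` in the two-identifier case, while `next_version("version2", ("version1", "version2", "version3"))` yields `"version3"`.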
Optionally, generating the second key value according to the first version identifier and the first key value in the data corresponding to the KV attribute may include: adding the first version identifier to the first key value in the data corresponding to the KV attribute to generate the second key value. Note that the second key value may also be generated from the first version identifier and the first key value in other ways, which are not limited here.
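One way to add a version identifier to a key value is to append it as a suffix. The sketch below assumes this form; the function name `versioned_key` is hypothetical, and the underscore separator is inferred from the “t_snoopy_key_version1” example used elsewhere in this description rather than mandated by it.

```python
def versioned_key(first_key: str, version_id: str) -> str:
    """Derive a versioned key by appending the version identifier.

    Assumption: '<key>_<version>' is the chosen form; the text allows others.
    """
    return f"{first_key}_{version_id}"
```

Because the version identifier is part of the key, the same first key value maps to different entries in the KV storage system for different versions.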
In one embodiment, the unstructured attribute of the data object includes a file attribute, and the second storage system is a file storage system. The method by which the data management device generates the record of the data object in the relational data table corresponding to the data object, according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute, includes: generating a second path according to the first version identifier and a first path in the data corresponding to the file attribute; and generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute in the record and the data corresponding to the file attribute in the record, where the data corresponding to the file attribute in the record includes the second path, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
Optionally, generating the second path according to the first version identifier and the first path in the data corresponding to the file attribute may include: adding the first version identifier to the first path in the data corresponding to the file attribute to generate the second path. If the second path does not exist before the insert instruction is executed, the data management device creates the second path in the database. Note that the second path may also be generated from the first version identifier and the first path in other ways, which are not limited here.
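One way to add a version identifier to a path is to insert it as a directory level just above the file name. The sketch below assumes this form, which matches the “data/version1/Snoopy.jpg” example used elsewhere in this description; the function name `versioned_path` is hypothetical, and creation of the directory itself is not shown.

```python
import posixpath


def versioned_path(first_path: str, version_id: str) -> str:
    """Derive a versioned path by inserting the version as a directory.

    Assumption: '<dir>/<version>/<file>' is the chosen form; the text
    allows other ways of combining the path and the version identifier.
    """
    directory, filename = posixpath.split(first_path)
    return posixpath.join(directory, version_id, filename)
```

`posixpath` is used so that the result uses forward slashes regardless of the platform the sketch runs on.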
The following takes the generation of the record, in the relational data table, of the data object picture(“Snoopy”,2M,(data/Snoopy.jpg,Snoopy.jpg),39.92,116.46,“2018-10-12”,(t_snoopy_key,t_snoopy.jpg)) as an example to describe the process of generating a record of a data object in the relational data table. The unstructured attributes of this data object are a file attribute (data/Snoopy.jpg, Snoopy.jpg) and a KV attribute (t_snoopy_key, t_snoopy.jpg). The insert instruction for the data object is “insert(picture(“Snoopy”,2M,(data/Snoopy.jpg,Snoopy.jpg),39.92,116.46,“2018-10-12”,(t_snoopy_key,t_snoopy.jpg)))”; the data object contains 7 attributes, and the first version identifier is version1. The 1st attribute of the data object is a structured attribute, so the data corresponding to this structured attribute in the record is the data “Snoopy” corresponding to the structured attribute. The other structured attributes among the 7 attributes are handled in the same way, and details are not repeated here. The 3rd attribute of the data object is a file attribute, so the second path “data/version1/Snoopy.jpg” is generated according to the first version identifier version1 and the first path “data/Snoopy.jpg” in the data “(data/Snoopy.jpg,Snoopy.jpg)” corresponding to the file attribute, and the data corresponding to the file attribute in the record is the second path “data/version1/Snoopy.jpg”. The 7th attribute of the data object is a KV attribute, so the second key value “t_snoopy_key_version1” is generated according to the first version identifier version1 and the first key value “t_snoopy_key” in the data “(t_snoopy_key,t_snoopy.jpg)” corresponding to the KV attribute, and the data corresponding to the KV attribute in the record is the second key value “t_snoopy_key_version1”. Then the record of the data object is generated in the relational data table corresponding to the data object according to the data corresponding to the structured attributes, the KV attribute, and the file attribute in the record. For this record, refer to Table 2.
Table 2
[Table 2 appears as an image in the original publication: Figure PCTCN2020080952-appb-000001]
The data management device stores the data “(t_snoopy_key_version1, t_snoopy.jpg)” corresponding to the KV attribute of the data object in the KV storage system, and stores the data “(data/version1/Snoopy.jpg, Snoopy.jpg)” corresponding to the file attribute of the data object in the file storage system. For the data corresponding to the KV attribute stored in the KV storage system, refer to Table 3:
Table 3
Attribute name | Key value | Thumbnail
Data | t_snoopy_key_version1 | t_snoopy.jpg
For the data corresponding to the file attribute stored in the file storage system, refer to Table 4:
Table 4
Attribute name | Path | Picture
Data | data/version1/Snoopy.jpg | Snoopy.jpg
After the data management device stores the data corresponding to the unstructured attribute of the data object in the second storage system, the data management device commits the database transaction corresponding to the insert instruction. Note that the data management device commits the database transaction only after both the first storage system and the second storage system have saved the corresponding data. This ordering ensures that the data of the data object has been stored successfully in every storage system.
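The ordering described above, in which the database transaction is committed only after the second storage system has saved its data, can be sketched with Python's standard `sqlite3` module standing in for the first storage system and a dictionary standing in for the KV storage system. The function names, the two-column table, and the rollback-on-failure behavior are assumptions made for illustration; the application does not prescribe them.

```python
import sqlite3


def open_database() -> sqlite3.Connection:
    """Create an in-memory stand-in for the first storage system."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE picture (name TEXT, thumbnail TEXT)")
    return conn


def insert_picture(conn, kv_store, name, versioned_key, thumbnail):
    """Write the record, store the unstructured data, then commit."""
    try:
        # 1. Record in the relational table holds only the identifier.
        conn.execute(
            "INSERT INTO picture (name, thumbnail) VALUES (?, ?)",
            (name, versioned_key),
        )
        # 2. Content goes to the second storage system.
        kv_store[versioned_key] = thumbnail
        # 3. Commit only after both writes have succeeded.
        conn.commit()
    except Exception:
        # Assumption: on failure, undo both sides so neither store
        # keeps a half-written data object.
        conn.rollback()
        kv_store.pop(versioned_key, None)
        raise
```

If the write to the second storage system fails in this sketch, the uncommitted row is rolled back, so the relational table never references content that was not stored.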
In step S301, in the second case, after the data management device receives the update instruction for updating the data object, the process of generating the record of the data object in the relational data table includes: generating the record of the data object in the relational data table corresponding to the data object; and committing the transaction corresponding to the insert instruction or update instruction, where the transaction corresponding to the insert instruction or update instruction is committed after the data corresponding to the unstructured attribute of the data object has been stored in the second storage system.
The data management device may determine, according to the update instruction, the object type to which the data object belongs, and determine, according to the object type, the relational data table corresponding to the data object.
For example, the update instruction for the data object is “update(picture(“Snoopy”,2M,(data/Snoopy.jpg,Snoopy.jpg),39.92,116.46,“2018-10-12”,(t_snoopy_key,t_snoopy.jpg)))”. The data management device can determine from the update instruction that the object type to which the data object belongs is “picture”, and then determine from this object type that the relational data table corresponding to the data object is the relational data table of the “picture” object type in the first storage system. For this relational data table, refer to Table 1 above.
The following describes how, during an update, the data management device generates the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute.
In one embodiment, the unstructured attribute of the data object includes a KV attribute, and the second storage system is a KV storage system. The method by which the data management device generates the record of the data object in the relational data table corresponding to the data object, according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute, includes: generating a third key value according to a second version identifier and the first key value in the data corresponding to the KV attribute; and generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute in the record and the data corresponding to the KV attribute in the record, where the data corresponding to the KV attribute in the record includes the third key value, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object. For the meaning of the second version identifier, refer to the description above. For the manner of generating the third key value according to the second version identifier and the first key value in the data corresponding to the KV attribute, refer to the manner described above of generating the second key value according to the first version identifier and the first key value; details are not repeated here.
In one embodiment, the unstructured attribute of the data object includes a file attribute, and the second storage system is a file storage system. The method by which the data management device generates the record of the data object in the relational data table corresponding to the data object, according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute, includes: generating a third path according to the second version identifier and the first path in the data corresponding to the file attribute; and generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute in the record and the data corresponding to the file attribute in the record, where the data corresponding to the file attribute in the record includes the third path, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object. For the manner of generating the third path according to the second version identifier and the first path in the data corresponding to the file attribute, refer to the manner described above of generating the second path according to the first version identifier and the first path; details are not repeated here.
The following describes how a record of a data object is generated in the relational data table, using the data object picture("Snoopy", 2M, (data/Snoopy.jpg, Snoopy.jpg), 39.92, 116.46, "2018-10-12", (t_snoopy_key, t_snoopy.jpg)) as an example. The unstructured attributes of this data object are the file attribute (data/Snoopy.jpg, Snoopy.jpg) and the KV attribute (t_snoopy_key, t_snoopy.jpg). The update instruction is "update(picture("Snoopy", 2M, (data/Snoopy.jpg, Snoopy.jpg), 39.92, 116.46, "2018-10-12", (t_snoopy_key, t_snoopy.jpg)))", the data object contains 7 attributes, and the second version identifier is version2. The first attribute of the data object is a structured attribute, so the data corresponding to that structured attribute in the record is the data "Snoopy" corresponding to the structured attribute of the data object. The other structured attributes among the 7 attributes are handled in the same way and are not described again here. The third attribute of the data object is a file attribute, so the third path "data/version2/Snoopy.jpg" is generated from the second version identifier version2 and the first path "data/Snoopy.jpg" in the data "(data/Snoopy.jpg, Snoopy.jpg)" corresponding to the file attribute, and the data corresponding to the file attribute in the record is the third path "data/version2/Snoopy.jpg". The seventh attribute of the data object is a KV attribute, so the third key value "t_snoopy_key_version2" is generated from the second version identifier version2 and the first key value "t_snoopy_key" in the data "(t_snoopy_key, t_snoopy.jpg)" corresponding to the KV attribute, and the data corresponding to the KV attribute in the record is the third key value "t_snoopy_key_version2". The record of the data object is then generated in the relational data table corresponding to the data object according to the data corresponding to the structured attributes, the KV attribute, and the file attribute in the record. The record is shown in Table 5.
Table 5
Figure PCTCN2020080952-appb-000002
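The record generation walked through above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `versioned_key`, `versioned_path`, and `build_record` are hypothetical, and the tuple layout simply mirrors the 7-attribute picture object from the example; the application itself does not prescribe an implementation.

```python
import posixpath


def versioned_key(key: str, version: str) -> str:
    """Append the version identifier to a key, e.g. t_snoopy_key -> t_snoopy_key_version2."""
    return f"{key}_{version}"


def versioned_path(path: str, version: str) -> str:
    """Insert the version identifier as a directory in front of the file name."""
    directory, filename = posixpath.split(path)
    return posixpath.join(directory, version, filename)


def build_record(obj: tuple, version: str) -> tuple:
    """Build the relational record: structured data is copied as-is, while the
    file attribute and the KV attribute are replaced by a versioned path and key."""
    name, size, (path, _file), lat, lon, taken, (key, _thumb) = obj
    return (name, size, versioned_path(path, version), lat, lon, taken,
            versioned_key(key, version))


# The data object from the example (attribute order as in the update instruction).
snoopy = ("Snoopy", "2M", ("data/Snoopy.jpg", "Snoopy.jpg"),
          39.92, 116.46, "2018-10-12", ("t_snoopy_key", "t_snoopy.jpg"))
record = build_record(snoopy, "version2")
print(record)
```

The record holds only references (the versioned path and key) to the unstructured data, which is what lets a single row in the relational table tie the three storage systems together.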
Because a record of an earlier version of the data object is already stored in the relational data table before the data object is updated, once the data management device receives the update instruction and generates the record of the updated version of the data object in the relational database, the relational data table stored in the first storage system contains both the old-version record and the new-version record, as shown in Table 6:
Table 6
Figure PCTCN2020080952-appb-000003
Correspondingly, the data corresponding to the KV attribute stored in the KV storage system is shown in Table 7:
Table 7
Attribute name    Key value                Thumbnail
Data              t_snoopy_key_version1    t_snoopy.jpg
Data              t_snoopy_key_version2    t_snoopy.jpg
The data corresponding to the file attribute stored in the file storage system is shown in Table 8:
Table 8
Attribute name    Path                        Picture
Data              data/version1/Snoopy.jpg    Snoopy.jpg
Data              data/version2/Snoopy.jpg    Snoopy.jpg
After the data management device stores the data corresponding to the unstructured attributes of the data object in the second storage system, it commits the database transaction corresponding to the update instruction. Note that the data management device commits the database transaction only after both the first storage system and the second storage system have saved the corresponding data, which ensures that the data object has been stored successfully in every storage system.
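The commit ordering described above can be illustrated as follows. This is a minimal sketch, not the device's actual implementation: an in-memory SQLite database stands in for the first storage system, plain dictionaries stand in for the KV and file storage systems, and the column names and the function `apply_update` are hypothetical (the application shows the table layout only as an image).

```python
import posixpath
import sqlite3

# Hypothetical in-memory stand-ins for the storage systems.
conn = sqlite3.connect(":memory:")   # first storage system (relational database)
conn.execute(
    "CREATE TABLE picture (name TEXT, size TEXT, path TEXT, "
    "latitude REAL, longitude REAL, time_taken TEXT, thumb_key TEXT)"
)
conn.commit()
kv_store = {}     # second storage system: KV storage
file_store = {}   # second storage system: file storage


def apply_update(obj: tuple, version: str) -> None:
    """Write the record, then the unstructured data, and commit last."""
    name, size, (path, file_data), lat, lon, taken, (key, thumb) = obj
    directory, filename = posixpath.split(path)
    vpath = posixpath.join(directory, version, filename)
    vkey = f"{key}_{version}"
    try:
        # 1. Generate the record inside a (still uncommitted) transaction.
        conn.execute("INSERT INTO picture VALUES (?, ?, ?, ?, ?, ?, ?)",
                     (name, size, vpath, lat, lon, taken, vkey))
        # 2. Store the unstructured data under the versioned key and path.
        kv_store[vkey] = thumb
        file_store[vpath] = file_data
        # 3. Commit only after both storage systems hold the data.
        conn.commit()
    except Exception:
        # If any write fails, the record never becomes visible.
        conn.rollback()
        raise


apply_update(("Snoopy", "2M", ("data/Snoopy.jpg", "Snoopy.jpg"),
              39.92, 116.46, "2018-10-12", ("t_snoopy_key", "t_snoopy.jpg")),
             "version2")
```

If a write to the second storage system fails, the rollback leaves no record pointing at missing data; any unstructured data already written is unreferenced and can be removed later by the verification operation.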
In one embodiment, the operation instruction received in step S303 is a query instruction that includes a query condition. In step S304, determining the record of the data object from the first storage system includes: selecting, from the first storage system, the records of the data objects that satisfy the query condition.
For example, the query instruction is "query(picture("time_taken≥2018-10-12"))", which means querying the pictures taken on or after October 12, 2018. The data management device traverses the relational data table corresponding to pictures in the first storage system and selects the records whose shooting time is greater than or equal to 2018-10-12. For example, the records obtained may be picture("Snoopy", 2M, data/version1/Snoopy.jpg, 39.92, 116.46, "2018-10-12", t_snoopy_key_version1) and picture("Stitch", 1.5M, data/version2/Stitch.jpg, 38.23, 129.78, "2018-10-17", t_Stitch_key_version2).
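The record selection in this query example amounts to a filter over the relational data table. The sketch below assumes the table is a list of tuples in the attribute order used above; the function name `query_taken_on_or_after` and the third record ("Pluto") are hypothetical additions used only to show a record being excluded.

```python
from datetime import date

# Records of the picture table; the "Pluto" row is a made-up record for illustration.
picture_table = [
    ("Snoopy", "2M", "data/version1/Snoopy.jpg", 39.92, 116.46,
     "2018-10-12", "t_snoopy_key_version1"),
    ("Stitch", "1.5M", "data/version2/Stitch.jpg", 38.23, 129.78,
     "2018-10-17", "t_Stitch_key_version2"),
    ("Pluto", "1M", "data/version1/Pluto.jpg", 40.00, 116.00,
     "2018-09-30", "t_pluto_key_version1"),
]

TIME_TAKEN = 5  # index of the shooting-time attribute in a record


def query_taken_on_or_after(table: list, threshold: str) -> list:
    """Select the records whose shooting time is on or after the threshold date."""
    limit = date.fromisoformat(threshold)
    return [r for r in table if date.fromisoformat(r[TIME_TAKEN]) >= limit]


matches = query_taken_on_or_after(picture_table, "2018-10-12")
print([r[0] for r in matches])
```

In a real relational database this traversal would typically be a `WHERE time_taken >= ...` clause; the list comprehension only makes the selection rule explicit.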
In one embodiment, the operation instruction received in step S303 is a delete instruction that includes the object type of the data object, the data corresponding to the structured attributes of the data object, and the data corresponding to its unstructured attributes. In step S304, determining the record of the data object from the first storage system includes: determining the relational data table corresponding to the data object according to the object type; and determining the record of the data object from the relational data table, where the data corresponding to the structured attributes in the record is the same as the data corresponding to the structured attributes of the data object.
For example, the records stored in the database include the records picture("Snoopy", 2M, data/version1/Snoopy.jpg, 39.92, 116.46, "2018-10-12", t_snoopy_key_version1) and picture("Stitch", 1.5M, data/version2/Stitch.jpg, 38.23, 129.78, "2018-10-17", t_Stitch_key_version2) in the first relational data table, and the record video("Show", 300M, data/version1/Show.avi, 47.56, 119.73, "2018-10-23", t_Show_key_version1) in the second relational data table. The delete instruction is "delete(picture("Snoopy", 2M, (data/Snoopy.jpg, Snoopy.jpg), 39.92, 116.46, "2018-10-12", (t_snoopy_key, t_snoopy.jpg)))", which means deleting the data object "picture("Snoopy", 2M, (data/Snoopy.jpg, Snoopy.jpg), 39.92, 116.46, "2018-10-12", (t_snoopy_key, t_snoopy.jpg))". After receiving the delete instruction, the data management device determines, according to the object type picture, that the relational data table corresponding to the data object is the first relational data table. The data management device then determines the record of the data object from the first relational data table, that is, the record whose structured-attribute data is the same as the structured-attribute data of the data object. The record determined is picture("Snoopy", 2M, data/version1/Snoopy.jpg, 39.92, 116.46, "2018-10-12", t_snoopy_key_version1).
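The two lookup steps in this delete example (pick the table by object type, then match on structured attributes) can be sketched as below. The dictionary `tables`, the index tuple `STRUCTURED`, and the function `find_record` are hypothetical names chosen for illustration; the positions of the structured attributes follow the 7-attribute layout of the example.

```python
# Relational data tables keyed by object type (records from the example).
tables = {
    "picture": [
        ("Snoopy", "2M", "data/version1/Snoopy.jpg", 39.92, 116.46,
         "2018-10-12", "t_snoopy_key_version1"),
        ("Stitch", "1.5M", "data/version2/Stitch.jpg", 38.23, 129.78,
         "2018-10-17", "t_Stitch_key_version2"),
    ],
    "video": [
        ("Show", "300M", "data/version1/Show.avi", 47.56, 119.73,
         "2018-10-23", "t_Show_key_version1"),
    ],
}

# Positions of the structured attributes; positions 2 and 6 hold the
# file attribute and the KV attribute and are not compared.
STRUCTURED = (0, 1, 3, 4, 5)


def find_record(object_type: str, obj: tuple):
    """Return the record whose structured data equals that of the data object."""
    wanted = tuple(obj[i] for i in STRUCTURED)
    for record in tables[object_type]:      # step 1: table chosen by object type
        if tuple(record[i] for i in STRUCTURED) == wanted:   # step 2: match
            return record
    return None


target = ("Snoopy", "2M", ("data/Snoopy.jpg", "Snoopy.jpg"),
          39.92, 116.46, "2018-10-12", ("t_snoopy_key", "t_snoopy.jpg"))
found = find_record("picture", target)
print(found)
```

Only the structured attributes are compared because the record stores versioned references, which by design differ from the unversioned path and key carried in the delete instruction.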
The following describes in detail how, in step S305, the data management device obtains the data corresponding to the multiple attributes of the data object from the first storage system and the second storage system according to the record.
In one possible implementation, the unstructured attribute of the record includes a KV attribute, and the second storage system is a KV storage system. The method by which the data management device obtains the data corresponding to the multiple attributes of the data object from the first storage system and the second storage system according to the record includes: reading, from the second storage system according to a key value, the KV data corresponding to the key value, and removing the version identifier from the key value, where the key value is the data corresponding to the KV attribute in the record, and the version identifier is the first version identifier or the second version identifier. The data corresponding to the KV attribute of the data object is the key value with the version identifier removed together with the KV data, and the data corresponding to the structured attribute of the data object is the data corresponding to the structured attribute in the record.
In another possible implementation, the unstructured attribute of the record includes a file attribute, and the second storage system is a file storage system. The method by which the data management device obtains the data corresponding to the multiple attributes of the data object from the first storage system and the second storage system according to the record includes: reading, from the second storage system according to a path, the file data corresponding to the path, and removing the version identifier from the path, where the path is the data corresponding to the file attribute in the record, and the version identifier is the first version identifier or the second version identifier. The data corresponding to the file attribute of the data object is the path with the version identifier removed together with the file data, and the data corresponding to the structured attribute of the data object is the data corresponding to the structured attribute in the record.
The following uses the record picture("Snoopy", 2M, data/version1/Snoopy.jpg, 39.92, 116.46, "2018-10-12", t_snoopy_key_version1) as an example to describe how the data management device obtains the data corresponding to the multiple attributes of the data object from the first storage system and the second storage system according to the record.
The first attribute in the record is a structured attribute, so the data corresponding to that structured attribute of the data object is the data "Snoopy" corresponding to the first attribute. The other structured attributes among the 7 attributes in the record are handled in the same way and are not described again here. The third attribute in the record is a path, so the file data "Snoopy.jpg" corresponding to the path is read from the file storage system according to the path "data/version1/Snoopy.jpg", and the version identifier version1 is removed from the path. The data corresponding to the file attribute of the data object is the path with the version identifier removed, "data/Snoopy.jpg", together with the file data "Snoopy.jpg", that is, "(data/Snoopy.jpg, Snoopy.jpg)". The seventh attribute in the record is a key value, so the KV data "t_snoopy.jpg" corresponding to the key value is read from the KV storage system according to the key value "t_snoopy_key_version1", and the version identifier version1 is removed from the key value. The data corresponding to the KV attribute of the data object is the key value with the version identifier removed, "t_snoopy_key", together with the KV data "t_snoopy.jpg", that is, "(t_snoopy_key, t_snoopy.jpg)".
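The reconstruction of the data object from its record can be sketched as follows. The helper names, the regular expression for version identifiers (assumed here to look like `version1`, `version2`, and so on), and the dictionaries standing in for the file and KV storage systems are all assumptions made for illustration.

```python
import re

# Assumed shape of a version identifier: "version" followed by digits.
VERSION = re.compile(r"version\d+")

# Hypothetical stand-ins for the second storage systems.
file_store = {"data/version1/Snoopy.jpg": "Snoopy.jpg"}
kv_store = {"t_snoopy_key_version1": "t_snoopy.jpg"}


def strip_version_from_path(path: str) -> str:
    """Drop the path component that is a version identifier."""
    return "/".join(p for p in path.split("/") if not VERSION.fullmatch(p))


def strip_version_from_key(key: str) -> str:
    """Drop a trailing '_versionN' suffix from a key, if present."""
    base, _, suffix = key.rpartition("_")
    return base if VERSION.fullmatch(suffix) else key


def rebuild_object(record: tuple) -> tuple:
    """Read the unstructured data via the versioned references, then
    return the data object with unversioned path and key."""
    name, size, vpath, lat, lon, taken, vkey = record
    file_data = file_store[vpath]   # read from the file storage system
    kv_data = kv_store[vkey]        # read from the KV storage system
    return (name, size, (strip_version_from_path(vpath), file_data),
            lat, lon, taken, (strip_version_from_key(vkey), kv_data))


record = ("Snoopy", "2M", "data/version1/Snoopy.jpg", 39.92, 116.46,
          "2018-10-12", "t_snoopy_key_version1")
obj = rebuild_object(record)
print(obj)
```

Stripping the version identifier keeps versioning invisible to the requester: the object returned has the same path and key that were supplied when it was stored.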
In the embodiment described above in which the operation instruction is a delete instruction, obtaining, according to the record, the data corresponding to at least one of the multiple attributes of the data object from at least one of the first storage system and the second storage system includes: obtaining the data corresponding to the multiple attributes of the data object from the first storage system according to the record. For example, if the record is picture("Snoopy", 2M, data/version1/Snoopy.jpg, 39.92, 116.46, "2018-10-12", t_snoopy_key_version1), the data management device obtains the data corresponding to the multiple attributes of the data object: "Snoopy", 2M, data/version1/Snoopy.jpg, 39.92, 116.46, "2018-10-12", t_snoopy_key_version1.
In one embodiment, in step S306, if the operation instruction is a query instruction, performing the operation on the data object based on the data corresponding to the at least one attribute includes: generating a query result according to the at least one attribute, and returning the query result to the requester, such as an application.
Continuing the example above, the data management device builds the data object from the data corresponding to each of its 7 attributes, following the order of the 7 attributes in the record. The data object is "picture("Snoopy", 2M, (data/Snoopy.jpg, Snoopy.jpg), 39.92, 116.46, "2018-10-12", (t_snoopy_key, t_snoopy.jpg))". In a similar way, the data management device can also build the data object picture("Stitch", 1.5M, (data/Stitch.jpg, Stitch.jpg), 38.23, 129.78, "2018-10-17", (t_Stitch_key, t_Stitch.jpg)). The data management device then returns these two data objects as the query result.
In one embodiment, in step S306, if the operation instruction is a delete instruction, performing the operation on the data object based on the data corresponding to the at least one attribute includes: deleting the data corresponding to the at least one attribute from the first storage system, and committing the transaction corresponding to the delete instruction. Continuing the example above, the data management device deletes, from the first relational data table of the first storage system, the data corresponding to the multiple attributes of the data object: "Snoopy", 2M, data/version1/Snoopy.jpg, 39.92, 116.46, "2018-10-12", t_snoopy_key_version1. After the deletion, the data management device commits the transaction corresponding to the delete instruction.
In one possible implementation, if an update instruction for a data object is received, two records corresponding to that data object exist in the database until the data management device commits the database transaction. Based on the database's multiversion concurrency control (MVCC) mechanism, the database retains only the updated record once the transaction is committed.
The MVCC mechanism is described further below. MVCC can maintain multiple snapshot copies of each record in the database and controls the visibility of each copy through a begin timestamp and an end timestamp. The begin timestamp indicates when a record was created, and the end timestamp indicates when a record expired (or was deleted). Note that a timestamp does not store the actual time at which a record was created or expired; it stores the system version number at the moment of that event. The system version number increases as transactions are created, and each transaction records its own system version number when it starts.
In one embodiment, while an insert instruction is executed, the begin timestamp of the first record corresponding to the data object is the system version number of the current storage transaction, and the end timestamp of the first record is undefined. While an update instruction is executed, the begin timestamp of the second record corresponding to the updated data object is the system version number of the current update transaction, and the end timestamp of the second record is undefined; the system version number of the update transaction is greater than that of the storage transaction. In addition, the end timestamp of the first record is set to the system version number of the update transaction. When the data management device has executed the update instruction and commits the database transaction, the first record is deleted; that is, after the data management device commits the database transaction, only the updated record remains in the database. Furthermore, if another transaction reads the data object while the update transaction is in progress, the record of the data object that the other transaction reads is the first record. In this way, updates and reads of the database do not block each other.
For example, while the data management device executes the update instruction, the records stored in the database are as shown in Table 6 above: one data object has two corresponding records in the database, namely the record containing the first version identifier and the record containing the second version identifier. Based on the database's multiversion concurrency control mechanism, after the data management device commits the database transaction, only the record containing the second version identifier remains in the database, and the records stored in the database are then as shown in Table 3 above. If another transaction reads the data object while the update instruction is being executed, the record of the data object that the other transaction reads is the record containing the first version identifier.
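The begin/end timestamp rules described above can be made concrete with a small model. This is a deliberately simplified sketch: the class `RecordVersion` and the functions `read`, `update`, and `purge_expired` are hypothetical, the system version numbers 10, 15, 20, and 25 are made up, and commit status is ignored, whereas a real MVCC engine also tracks whether the writing transaction has committed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecordVersion:
    data: tuple
    begin_ts: int                  # system version number of the creating transaction
    end_ts: Optional[int] = None   # undefined until the record expires


def read(versions: list, txn_ts: int):
    """Return the copy visible to a transaction with the given system version number."""
    for v in versions:
        if v.begin_ts <= txn_ts and (v.end_ts is None or txn_ts < v.end_ts):
            return v.data
    return None


def update(versions: list, new_data: tuple, txn_ts: int) -> None:
    """Expire the current copy and append the updated copy."""
    for v in versions:
        if v.end_ts is None:
            v.end_ts = txn_ts      # the first record ends where the update begins
    versions.append(RecordVersion(new_data, txn_ts))


def purge_expired(versions: list) -> None:
    """After the update transaction commits, keep only the updated record."""
    versions[:] = [v for v in versions if v.end_ts is None]


v1 = ("Snoopy", "data/version1/Snoopy.jpg", "t_snoopy_key_version1")
v2 = ("Snoopy", "data/version2/Snoopy.jpg", "t_snoopy_key_version2")

versions = [RecordVersion(v1, begin_ts=10)]   # insert by storage transaction 10
update(versions, v2, txn_ts=20)               # update transaction 20

seen_by_reader = read(versions, txn_ts=15)    # reader that started before the update
seen_later = read(versions, txn_ts=25)        # reader that started after the update
purge_expired(versions)
```

The reader with system version number 15 still sees the first record because that copy's end timestamp (20) is later than the reader's own number, which is why the update and the read do not block each other.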
After the update instruction has been executed and the database transaction has been committed, the database contains one record corresponding to the data object. Taking the update process described above as an example, the records finally stored in the database are as shown in Table 3. The KV storage system and the file storage system, however, each still hold two versions of the unstructured-attribute data of the data object. In the KV storage system, the unstructured-attribute data corresponding to the data object is "t_snoopy_key_version1, t_snoopy.jpg" and "t_snoopy_key_version2, t_snoopy.jpg"; in the file storage system, it is "data/version1/Snoopy.jpg, Snoopy.jpg" and "data/version2/Snoopy.jpg, Snoopy.jpg". Of these, "t_snoopy_key_version1, t_snoopy.jpg" and "data/version1/Snoopy.jpg, Snoopy.jpg" are invalid data, which can be cleaned up by a verification operation. The method by which the data management device performs the verification operation is described below.
In one embodiment, the unstructured attribute includes a KV attribute, and the second storage system is a KV storage system. The method of performing the verification operation includes: when a verification instruction is received or when a verification condition is detected to be satisfied, traversing the key values in the second storage system, where a key value is the data corresponding to a KV attribute in the second storage system; and, while traversing the key values, if no key value identical to a fourth key value can be found in the records stored in the first storage system, deleting the fourth key value and the KV data corresponding to the fourth key value from the second storage system, where the fourth key value is one of the multiple key values in the second storage system.
Taking the updated data object described above as an example, the fourth key value "t_snoopy_key_version1" exists in the KV storage system, but no identical key value can be found in the records stored in the database, so the fourth key value "t_snoopy_key_version1" and its corresponding KV data "t_snoopy.jpg" are deleted from the KV storage system.
In one embodiment, the unstructured attribute includes a file attribute, and the second storage system is a file storage system. The method of performing the verification operation includes: when a verification instruction is received or when a verification condition is detected to be satisfied, traversing the paths in the second storage system, where a path is the data corresponding to a file attribute in the second storage system; and, while traversing the paths, if no path identical to a fourth path can be found in the relational data table stored in the first storage system, deleting the fourth path and the file data corresponding to the fourth path from the second storage system, where the fourth path is one of the multiple paths in the second storage system.
Taking the updated data object described above as an example, the fourth path "data/version1/Snoopy.jpg" exists in the file storage system, but no identical path can be found in the records stored in the database, so the fourth path "data/version1/Snoopy.jpg" and its corresponding file data "Snoopy.jpg" are deleted from the file storage system.
The verification condition may be, for example, that the current time falls within a preset verification period, or that the amount of data stored in the data management device exceeds a preset value. The verification instruction may be, for example, "check(picture)", which means verifying the data objects whose object type is "picture". When the data management device receives the verification instruction, it traverses the data corresponding to the unstructured attributes in the second storage system. By comparing the records stored in the database with the data corresponding to the unstructured attributes in the second storage system in this way, invalid data in the second storage system can be removed, so that a data object stored across multiple data systems remains consistent.
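The verification operation is in effect a reference check between the relational records and the second storage systems. The sketch below is an illustration under stated assumptions: the function `verify`, the default index positions, and the dictionaries standing in for the KV and file storage systems are hypothetical.

```python
def verify(records: list, kv_store: dict, file_store: dict,
           path_index: int = 2, key_index: int = 6):
    """Delete every key or path in the second storage systems that no record
    in the first storage system refers to. Returns what was removed."""
    live_paths = {r[path_index] for r in records}
    live_keys = {r[key_index] for r in records}

    # Collect first, then delete, so the dictionaries are not modified while iterating.
    stale_keys = [k for k in kv_store if k not in live_keys]
    stale_paths = [p for p in file_store if p not in live_paths]
    for k in stale_keys:
        del kv_store[k]
    for p in stale_paths:
        del file_store[p]
    return stale_keys, stale_paths


# State after the update transaction has been committed (see Tables 7 and 8).
records = [("Snoopy", "2M", "data/version2/Snoopy.jpg", 39.92, 116.46,
            "2018-10-12", "t_snoopy_key_version2")]
kv_store = {"t_snoopy_key_version1": "t_snoopy.jpg",
            "t_snoopy_key_version2": "t_snoopy.jpg"}
file_store = {"data/version1/Snoopy.jpg": "Snoopy.jpg",
              "data/version2/Snoopy.jpg": "Snoopy.jpg"}

removed_keys, removed_paths = verify(records, kv_store, file_store)
print(removed_keys, removed_paths)
```

Building the sets of live keys and paths once keeps each membership check cheap, which matters if the second storage systems hold many more entries than the example shows.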
Note that the data management device can delete from the second storage system the data corresponding to the unstructured attributes of the first record in the same way that it executes the verification instruction. Because the first record has already been deleted, executing the verification instruction removes the unstructured data corresponding to the first record from the second storage system.
以上描述了本申请的方法实施例,下面对实现上述方法的装置实施例进行介绍。The method embodiments of the present application are described above, and the device embodiments implementing the above methods are introduced below.
参见图4,是本申请实施例提供的一种数据管理设备,该设备包括数据管理设备包括生成单元401,存储单元402,接收单元403,确定单元404,获取单元405和操作单元406。下面对生成单元401,存储单元402,接收单元403,确定单元404,获取单元405和操作单元406进行介绍。4, it is a data management device provided by an embodiment of the present application. The device includes a data management device including a generating unit 401, a storage unit 402, a receiving unit 403, a determining unit 404, an obtaining unit 405, and an operating unit 406. The generating unit 401, storage unit 402, receiving unit 403, determining unit 404, obtaining unit 405, and operating unit 406 will be introduced below.
The generating unit 401 is configured to generate a record of a data object in a relational data table, where the data object has multiple attributes, the multiple attributes include a structured attribute and an unstructured attribute, the record indicates an association between the structured attribute and the unstructured attribute of the data object, and the relational data table is stored in a first storage system. For the operations performed by the generating unit 401, refer to the related description of step 301 in FIG. 3 above.
The storage unit 402 is configured to store data corresponding to the unstructured attribute of the data object in a second storage system. For the operations performed by the storage unit 402, refer to the related description of step 302 in FIG. 3 above.
The receiving unit 403 is configured to receive an operation instruction, where the operation instruction is used to perform an operation on the data object. In one embodiment, the receiving unit 403 may be a circuit or component that can be configured to receive information, such as a data transmission interface, a communication interface, or a receiver. For the operations performed by the receiving unit 403, refer to the related description of step 303 in FIG. 3 above.
The determining unit 404 is configured to determine, in response to the operation instruction, the record of the data object from the first storage system. For the operations performed by the determining unit 404, refer to the related description of step 304 in FIG. 3 above.
The acquiring unit 405 is configured to acquire, according to the record, data corresponding to at least one of the multiple attributes of the data object from at least one of the first storage system and the second storage system. For the operations performed by the acquiring unit 405, refer to the related description of step 305 in FIG. 3 above.
The operation unit 406 is configured to perform the operation on the data object based on the data corresponding to the at least one attribute. For the operations performed by the operation unit 406, refer to the related description of step 306 in FIG. 3 above.
In addition, for specific implementation details of each operation in FIG. 4, refer to the corresponding description of the method embodiment shown in FIG. 3. The foregoing units may be implemented in hardware, in software, or in a combination of the two. In one embodiment, the generating unit 401, the storage unit 402, the determining unit 404, the acquiring unit 405, and the operation unit 406 may be functional modules implemented in software. The functions of these modules are implemented by programs or code stored in a memory, and the data management device implements the function of each module by executing these programs or code on at least one processor. Because the data corresponding to the multiple attributes of the data object, which is stored across multiple data systems, is all obtained through the record in the relational data table, the data management device allows the data object to remain consistent even when it is stored across multiple data systems.
FIG. 5 shows another data management device provided by an embodiment of this application. The data management device includes a processor 501, a memory 502, and a communication interface 503, which are connected to one another through a bus 504.
The memory 502 includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM). The memory 502 is used to store related instructions and data.
The communication interface 503 may be a circuit or component that can be configured to receive information, such as a data transmission interface, a communication interface, or a receiver.
The processor 501 may be one or more central processing units (CPUs). When the processor 501 is a single CPU, that CPU may be a single-core CPU or a multi-core CPU.
The processor 501 in the data management device reads and executes the program code stored in the memory 502 to perform the following operations:
Generate a record of a data object in a relational data table, where the data object has multiple attributes, the multiple attributes include a structured attribute and an unstructured attribute, the record indicates an association between the structured attribute and the unstructured attribute of the data object, and the relational data table is stored in a first storage system.
Store data corresponding to the unstructured attribute of the data object in a second storage system.
Receive an operation instruction, where the operation instruction is used to perform an operation on the data object.
In response to the operation instruction, determine the record of the data object from the first storage system.
Acquire, according to the record, data corresponding to at least one of the multiple attributes of the data object from at least one of the first storage system and the second storage system.
Perform the operation on the data object based on the data corresponding to the at least one attribute.
For specific details of each operation performed by the processor 501 in FIG. 5, refer to the corresponding description of the method embodiment shown in FIG. 3. Because the data corresponding to the multiple attributes of the data object, which is stored across multiple data systems, is all acquired through the record in the relational data table, the data object can remain consistent even when it is stored across multiple data systems.
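The sequence of operations above can be illustrated with a minimal sketch. It is not the implementation described in this application: SQLite stands in for the first storage system, a plain Python dictionary stands in for the second storage system, and the table name `product`, its columns, and the functions `insert_object` and `query_object` are all hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical stand-ins: SQLite plays the first (relational) storage system,
# a plain dict plays the second (KV) storage system.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, photo_key TEXT)"
)
kv_store = {}


def insert_object(obj_id, name, photo_key, photo_bytes):
    # Store the data of the unstructured attribute in the second storage system.
    kv_store[photo_key] = photo_bytes
    # Generate the record: structured data plus the key that links the
    # record to the unstructured data.
    db.execute(
        "INSERT INTO product VALUES (?, ?, ?)", (obj_id, name, photo_key)
    )
    db.commit()


def query_object(obj_id):
    # Determine the record of the data object from the first storage system.
    row = db.execute(
        "SELECT id, name, photo_key FROM product WHERE id = ?", (obj_id,)
    ).fetchone()
    if row is None:
        return None
    # Follow the record into the second storage system to fetch the
    # unstructured attribute, then assemble the result.
    return {"id": row[0], "name": row[1], "photo": kv_store.get(row[2])}


insert_object(1, "lamp", "photo/1", b"\x89PNG")
result = query_object(1)
```

In this sketch every read starts from the relational record, so data in the dictionary that no record points to is never returned to the caller.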
Another embodiment of this application provides a computer program product. When the computer program product runs on a computer, the method of the embodiment shown in FIG. 3 is implemented.
Another embodiment of this application provides a computer-readable storage medium that stores a computer program. When the computer program is executed by a computer, the method of the embodiment shown in FIG. 3 is implemented.
The foregoing describes only specific implementations of this application, but the protection scope of this application is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed in this application, and all such modifications or replacements shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (32)

  1. A data management method, characterized in that the method comprises:
    generating a record of a data object in a relational data table, wherein the data object has multiple attributes, the multiple attributes include a structured attribute and an unstructured attribute, the record contains data corresponding to the structured attribute and an association between the structured attribute and the unstructured attribute of the data object, and the relational data table is stored in a first storage system;
    storing data corresponding to the unstructured attribute of the data object in a second storage system;
    receiving an operation instruction, wherein the operation instruction is used to perform an operation on the data object;
    in response to the operation instruction, determining the record of the data object from the first storage system;
    acquiring, according to the record, data corresponding to at least one of the multiple attributes of the data object from at least one of the first storage system and the second storage system; and
    performing the operation on the data object based on the data corresponding to the at least one attribute.
  2. The method according to claim 1, characterized in that the generating a record of a data object in a relational data table comprises:
    receiving an insert instruction or an update instruction, wherein the insert instruction is used to insert the data object, the update instruction is used to update the data object, and the insert instruction and the update instruction each include an object type of the data object, data corresponding to the structured attribute of the data object, and data corresponding to the unstructured attribute of the data object;
    determining, according to the object type, the relational data table corresponding to the data object;
    generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute; and
    committing a transaction corresponding to the insert instruction or the update instruction;
    wherein the transaction corresponding to the insert instruction or the update instruction is committed after the data corresponding to the unstructured attribute of the data object has been stored in the second storage system.
  3. The method according to claim 2, characterized in that the received instruction is the insert instruction, the unstructured attribute of the data object includes a key-value (KV) attribute, and the second storage system is a KV storage system;
    the generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute comprises:
    generating a second key value according to a first version identifier and a first key value in the data corresponding to the KV attribute; and
    generating the record of the data object, wherein the data corresponding to the KV attribute in the record includes the second key value, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
  4. The method according to claim 2, characterized in that the received instruction is the insert instruction, the unstructured attribute of the data object includes a file attribute, and the second storage system is a file storage system;
    the generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute comprises:
    generating a second path according to the first version identifier and a first path in the data corresponding to the file attribute; and
    generating the record of the data object in the relational data table corresponding to the data object, wherein the data corresponding to the file attribute in the record includes the second path, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
  5. The method according to claim 2, characterized in that the received instruction is the update instruction, the unstructured attribute of the data object includes a KV attribute, and the second storage system is a KV storage system;
    the generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute comprises:
    generating a third key value according to a second version identifier and a first key value in the data corresponding to the KV attribute; and
    generating the record of the data object in the relational data table corresponding to the data object, wherein the data corresponding to the KV attribute in the record includes the third key value, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
  6. The method according to claim 2, characterized in that the received instruction is the update instruction, the unstructured attribute of the data object includes a file attribute, and the second storage system is a file storage system;
    the generating the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute comprises:
    generating a third path according to the second version identifier and a first path in the data corresponding to the file attribute; and
    generating the record of the data object in the relational data table corresponding to the data object, wherein the data corresponding to the file attribute in the record includes the third path, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
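Claims 3 to 6 derive a new key value or path from a version identifier and the key value or path supplied in the instruction, but they do not fix how the two are combined. The sketch below assumes one possible encoding purely for illustration: a `#` separator for keys, and the identifier placed before the file extension for paths. The function names, the separator, and the sample identifiers `v1` and `v2` are all hypothetical.

```python
import posixpath


def versioned_key(first_key: str, version_id: str) -> str:
    # Assumed encoding: append the version identifier after a "#" separator.
    return f"{first_key}#{version_id}"


def versioned_path(first_path: str, version_id: str) -> str:
    # Assumed encoding: place the version identifier before the extension.
    root, ext = posixpath.splitext(first_path)
    return f"{root}.{version_id}{ext}"


# Insert: first version identifier yields the second key value / second path.
second_key = versioned_key("user:42:avatar", "v1")
second_path = versioned_path("/docs/report.pdf", "v1")

# Update: second version identifier yields the third key value / third path.
third_key = versioned_key("user:42:avatar", "v2")
third_path = versioned_path("/docs/report.pdf", "v2")
```

Under this assumed scheme an update writes to a different key or path than the insert did, so the previously stored data is not overwritten in place.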
  7. The method according to any one of claims 3 to 6, characterized in that the data corresponding to the unstructured attribute stored in the second storage system includes an identifier and content of the unstructured attribute, and the data corresponding to the unstructured attribute stored in the relational data table includes the identifier of the unstructured attribute.
  8. The method according to any one of claims 1 to 7, characterized in that the operation instruction includes a query instruction, and the query instruction includes a query condition;
    the determining the record of the data object from the first storage system comprises:
    selecting, from the first storage system, the record of a data object that satisfies the query condition;
    the acquiring, according to the record, data corresponding to at least one of the multiple attributes of the data object from at least one of the first storage system and the second storage system comprises:
    acquiring, according to the record, data corresponding to the multiple attributes of the data object from the first storage system and the second storage system;
    the performing the operation on the data object based on the data corresponding to the at least one attribute comprises:
    returning a query result according to the acquired data corresponding to the multiple attributes.
  9. The method according to claim 8, characterized in that the unstructured attribute of the record includes a KV attribute, and the acquiring, according to the record, data corresponding to the multiple attributes of the data object from the first storage system and the second storage system comprises:
    reading, from the second storage system according to a key value, KV data corresponding to the key value, and removing a version identifier from the key value, wherein the key value is the data corresponding to the KV attribute in the record, and the version identifier includes a first version identifier and a second version identifier;
    wherein the data corresponding to the KV attribute of the data object includes the key value with the version identifier removed and the KV data, and the data corresponding to the structured attribute of the data object includes the data corresponding to the structured attribute in the record.
  10. The method according to claim 8, characterized in that the unstructured attribute of the record includes a file attribute, and the acquiring, according to the record, data corresponding to the multiple attributes of the data object from the first storage system and the second storage system comprises:
    reading, from the second storage system according to a path, file data corresponding to the path, and removing a version identifier from the path, wherein the path is the data corresponding to the file attribute in the record, and the version identifier includes a first version identifier and a second version identifier;
    wherein the data corresponding to the file attribute of the data object includes the path with the version identifier removed and the file data, and the data corresponding to the structured attribute of the data object includes the data corresponding to the structured attribute in the record.
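The read path of claims 9 and 10 can be sketched as follows for the KV case. The sketch assumes, hypothetically, that the version identifier was appended to the key after a `#` separator; the claims do not specify the encoding. The function `read_kv_attribute` and the dictionary standing in for the KV storage system are illustrative names only.

```python
def read_kv_attribute(kv_store, stored_key):
    # Read the KV data using the versioned key value held in the record.
    value = kv_store[stored_key]
    # Remove the version identifier (assumed to follow the last "#").
    original_key, sep, _version = stored_key.rpartition("#")
    if not sep:
        # No separator found: treat the stored key as unversioned.
        original_key = stored_key
    return original_key, value


kv_store = {"user:42:avatar#v2": b"new-image"}
key, data = read_kv_attribute(kv_store, "user:42:avatar#v2")
```

The caller receives the key value it originally supplied together with the stored content, so the version identifier stays internal to the storage layer.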
  11. The method according to any one of claims 1 to 7, characterized in that the operation instruction includes a delete instruction, and the delete instruction includes an object type of the data object, data corresponding to the structured attribute of the data object, and data corresponding to the unstructured attribute of the data object;
    wherein the determining the record of the data object from the first storage system comprises:
    determining, according to the object type, the relational data table corresponding to the data object; and
    determining the record of the data object from the relational data table, wherein the data corresponding to the structured attribute in the record is the same as the data corresponding to the structured attribute of the data object;
    the acquiring, according to the record, data corresponding to at least one of the multiple attributes of the data object from at least one of the first storage system and the second storage system comprises:
    acquiring, according to the record, data corresponding to the multiple attributes of the data object from the first storage system;
    the performing the operation on the data object based on the data corresponding to the at least one attribute comprises:
    deleting the data corresponding to the multiple attributes of the data object from the first storage system; and
    committing a transaction corresponding to the delete instruction.
  12. The method according to any one of claims 1 to 7, characterized in that the unstructured attribute includes a KV attribute, the second storage system is a KV storage system, and the method further comprises:
    when a verification instruction is received or when it is detected that a verification condition is satisfied, traversing key values in the second storage system, wherein each key value is data corresponding to a KV attribute in the second storage system; and
    during the traversal of the key values, if no key value identical to a fourth key value can be found in the relational data table stored in the first storage system, deleting the fourth key value and the KV data corresponding to the fourth key value from the second storage system, wherein the fourth key value is one of multiple key values in the second storage system.
  13. The method according to any one of claims 1 to 7, characterized in that the unstructured attribute includes a file attribute, the second storage system is a file storage system, and the method further comprises:
    when a verification instruction is received or when it is detected that a verification condition is satisfied, traversing paths in the second storage system, wherein each path is data corresponding to a file attribute in the second storage system; and
    during the traversal of the paths, if no path identical to a fourth path can be found in the relational data table stored in the first storage system, deleting the fourth path and the file data corresponding to the fourth path from the second storage system, wherein the fourth path is one of multiple paths in the second storage system.
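The verification pass of claims 12 and 13 amounts to removing entries in the second storage system that no record in the relational data table refers to. A minimal sketch, assuming a dictionary stands in for the second storage system and a set of key values has already been read from the relational data table (the function name `purge_orphans` and the sample keys are hypothetical):

```python
def purge_orphans(second_store, referenced_keys):
    # Traverse every key value (or path) held in the second storage system.
    removed = []
    for key in list(second_store):
        # A key with no matching entry in the relational data table is an
        # orphan: delete both the key and its data.
        if key not in referenced_keys:
            del second_store[key]
            removed.append(key)
    return removed


second_store = {"a#v1": b"old", "a#v2": b"new", "b#v1": b"kept"}
referenced = {"a#v2", "b#v1"}  # key values found in the relational data table
removed = purge_orphans(second_store, referenced)
```

Iterating over `list(second_store)` takes a snapshot of the keys, which allows entries to be deleted during the traversal.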
  14. The method according to any one of claims 2 to 6, characterized in that before the determining, according to the object type, the relational data table corresponding to the data object, the method further comprises:
    receiving a definition instruction for the object type to which the data object belongs, wherein the definition instruction contains definition information of the object type, and the definition information is used to define the structure of the relational data table of the object type; and
    generating, according to the definition instruction, the relational data table of the object type in the first storage system.
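As a rough illustration of claim 14, the sketch below turns a definition of an object type into a relational table. The format of the definition (a mapping from attribute name to an attribute kind), the function name `create_table_for_type`, and the use of SQLite as the first storage system are all assumptions made for this example; the claim itself does not prescribe them.

```python
import sqlite3


def create_table_for_type(db, type_name, definition):
    # `definition` maps attribute name -> kind ("structured", "kv", "file").
    # Every attribute becomes a column; unstructured attributes hold only
    # the key value or path that points into the second storage system.
    for ident in (type_name, *definition):
        if not ident.isidentifier():
            raise ValueError(f"invalid identifier: {ident!r}")
    columns = ", ".join(f"{attr} TEXT" for attr in definition)
    db.execute(f"CREATE TABLE {type_name} ({columns})")


db = sqlite3.connect(":memory:")
create_table_for_type(
    db, "product", {"name": "structured", "photo": "kv", "manual": "file"}
)
column_names = [row[1] for row in db.execute("PRAGMA table_info(product)")]
```

The identifier check is included because table and column names cannot be passed as bound parameters in SQL and are therefore interpolated into the statement text.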
  15. The method according to any one of claims 2 to 6, characterized in that the determining, according to the object type, the relational data table corresponding to the data object comprises:
    determining, according to the insert instruction or the update instruction, the object type to which the data object belongs; and
    determining, according to the object type, the relational data table corresponding to the data object.
  16. A data management device, characterized in that the data management device comprises a generating unit, a storage unit, a receiving unit, a determining unit, an acquiring unit, and an operation unit, wherein:
    the generating unit is configured to generate a record of a data object in a relational data table, wherein the data object has multiple attributes, the multiple attributes include a structured attribute and an unstructured attribute, the record contains data corresponding to the structured attribute and an association between the structured attribute and the unstructured attribute of the data object, and the relational data table is stored in a first storage system;
    the storage unit is configured to store data corresponding to the unstructured attribute of the data object in a second storage system;
    the receiving unit is configured to receive an operation instruction, wherein the operation instruction is used to perform an operation on the data object;
    the determining unit is configured to determine, in response to the operation instruction, the record of the data object from the first storage system;
    the acquiring unit is configured to acquire, according to the record, data corresponding to at least one of the multiple attributes of the data object from at least one of the first storage system and the second storage system; and
    the operation unit is configured to perform the operation on the data object based on the data corresponding to the at least one attribute.
  17. The data management device according to claim 16, characterized in that the generating unit is specifically configured to:
    receive an insert instruction or an update instruction, wherein the insert instruction is used to insert the data object, the update instruction is used to update the data object, and the insert instruction and the update instruction each include an object type of the data object, data corresponding to the structured attribute of the data object, and data corresponding to the unstructured attribute of the data object;
    determine, according to the object type, the relational data table corresponding to the data object;
    generate the record of the data object in the relational data table corresponding to the data object according to the data corresponding to the structured attribute and the data corresponding to the unstructured attribute; and
    commit a transaction corresponding to the insert instruction or the update instruction;
    wherein the transaction corresponding to the insert instruction or the update instruction is committed after the data corresponding to the unstructured attribute of the data object has been stored in the second storage system.
  18. The data management device according to claim 17, characterized in that the received instruction is the insert instruction, the unstructured attribute of the data object includes a key-value (KV) attribute, and the second storage system is a KV storage system;
    the generating unit is specifically configured to:
    generate a second key value according to a first version identifier and a first key value in the data corresponding to the KV attribute; and
    generate the record of the data object in the relational data table corresponding to the data object, wherein the data corresponding to the KV attribute in the record includes the second key value, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
  19. The data management device according to claim 17, characterized in that the received instruction is the insert instruction, the unstructured attribute of the data object includes a file attribute, and the second storage system is a file storage system;
    the generating unit is specifically configured to:
    generate a second path according to the first version identifier and a first path in the data corresponding to the file attribute; and
    generate the record of the data object in the relational data table corresponding to the data object, wherein the data corresponding to the file attribute in the record includes the second path, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
  20. The data management device according to claim 17, characterized in that the received instruction is the update instruction, the unstructured attribute of the data object includes a KV attribute, and the second storage system is a KV storage system;
    the generating unit is specifically configured to:
    generate a third key value according to a second version identifier and a first key value in the data corresponding to the KV attribute; and
    generate the record of the data object in the relational data table corresponding to the data object, wherein the data corresponding to the KV attribute in the record includes the third key value, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
  21. The data management device according to claim 17, characterized in that the received instruction is the update instruction, the unstructured attribute of the data object includes a file attribute, and the second storage system is a file storage system;
    the generating unit is specifically configured to:
    generate a third path according to the second version identifier and a first path in the data corresponding to the file attribute; and
    generate the record of the data object in the relational data table corresponding to the data object, wherein the data corresponding to the file attribute in the record includes the third path, and the data corresponding to the structured attribute in the record includes the data corresponding to the structured attribute of the data object.
  22. The data management device according to any one of claims 18 to 21, characterized in that the data corresponding to the unstructured attribute stored in the second storage system includes an identifier and content of the unstructured attribute, and the data corresponding to the unstructured attribute stored in the relational data table includes the identifier of the unstructured attribute.
  23. The data management device according to any one of claims 16 to 22, wherein the operation instruction includes a query instruction, and the query instruction includes a query condition;
    the determining unit is specifically configured to:
    in response to the operation instruction, select, from the first storage system, a record of a data object that satisfies the query condition;
    the acquiring unit is specifically configured to:
    acquire, according to the record, the data corresponding to the multiple attributes of the data object from the first storage system and the second storage system;
    the operating unit is specifically configured to return a query result according to the acquired data corresponding to the multiple attributes.
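The query flow of claim 23 can be sketched as below: records are selected from the relational store, and the content of each unstructured attribute is then fetched from the second store using the identifier held in the record. SQLite and a plain dictionary stand in for the first and second storage systems; the function name `query_objects` and its parameters are hypothetical, and the table name and condition are assumed to come from trusted code.

```python
import sqlite3


def query_objects(conn, table, where_sql, params, unstructured_cols, second_store):
    """Return matching objects with unstructured attributes resolved."""
    # Step 1: select records that satisfy the query condition (first store).
    cur = conn.execute(f"SELECT * FROM {table} WHERE {where_sql}", params)
    columns = [d[0] for d in cur.description]
    results = []
    for values in cur.fetchall():
        obj = dict(zip(columns, values))
        # Step 2: the record holds only the identifier of each unstructured
        # attribute; its content is read from the second store.
        for col in unstructured_cols:
            identifier = obj[col]
            obj[col] = {"id": identifier, "content": second_store.get(identifier)}
        results.append(obj)
    # Step 3: the merged objects form the query result.
    return results
```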
  24. The data management device according to claim 23, wherein the unstructured attributes of the record include a KV attribute, the second storage system is a KV storage system, and the acquiring unit is specifically configured to:
    read, according to a key value, the KV data corresponding to the key value from the second storage system, and remove the version identifier from the key value, wherein the key value is the data corresponding to the KV attribute in the record, and the version identifier includes a first version identifier and a second version identifier;
    wherein the data corresponding to the KV attribute of the data object includes the key value with the version identifier removed and the KV data, and the data corresponding to the structured attributes of the data object includes the data corresponding to the structured attributes in the record.
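The read path of claim 24 can be illustrated with the sketch below. The stored key carries a version identifier that is stripped before the key is returned to the caller. The `#v<N>` suffix format and the names `strip_version` and `read_kv_attribute` are assumptions; the claim does not specify how the version identifier is encoded in the key value.

```python
VERSION_SEP = "#v"  # assumed separator between the user key and the version


def strip_version(stored_key: str) -> str:
    """Remove a trailing "#v<digits>" version suffix, if present."""
    base, sep, version = stored_key.rpartition(VERSION_SEP)
    return base if sep and version.isdigit() else stored_key


def read_kv_attribute(record: dict, kv_attr: str, kv_store: dict) -> dict:
    """Read the KV data referenced by a record and hide the version suffix."""
    stored_key = record[kv_attr]      # versioned key kept in the relational row
    value = kv_store[stored_key]      # KV data read from the second store
    return {"key": strip_version(stored_key), "value": value}
```

The caller therefore sees the key it originally supplied, while the versioned key remains an internal detail of the two stores.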
  25. The data management device according to claim 23, wherein the unstructured attributes of the record include a file attribute, the second storage system is a file storage system, and the acquiring unit is specifically configured to:
    read, according to a path, the file data corresponding to the path from the second storage system, and remove the version identifier from the path, wherein the path is the data corresponding to the file attribute in the record, and the version identifier includes a first version identifier and a second version identifier;
    wherein the data corresponding to the file attribute of the data object includes the path with the version identifier removed and the file data, and the data corresponding to the structured attributes of the data object includes the data corresponding to the structured attributes in the record.
  26. The data management device according to any one of claims 16 to 22, wherein the operation instruction includes a delete instruction, and the delete instruction includes the object type of the data object, the data corresponding to the structured attributes of the data object, and the data corresponding to the unstructured attributes;
    wherein the determining unit is specifically configured to:
    determine, according to the object type, the relational data table corresponding to the data object;
    determine the record of the data object from the relational data table, wherein the data corresponding to the structured attributes in the record is the same as the data corresponding to the structured attributes of the data object;
    the acquiring unit is specifically configured to:
    acquire, according to the record, the data corresponding to the multiple attributes of the data object from the first storage system;
    the operating unit is specifically configured to:
    delete the data corresponding to the multiple attributes of the data object from the first storage system;
    commit the transaction corresponding to the delete instruction.
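A minimal sketch of the delete flow in claim 26, using SQLite as a stand-in for the first storage system. The mapping `TABLE_FOR_TYPE` and the function `delete_object` are hypothetical, and column names are assumed to come from trusted code. As in the claim, only the relational record is removed and the transaction is then committed; the second storage system is not touched here.

```python
import sqlite3

# Hypothetical mapping from object type to relational table name.
TABLE_FOR_TYPE = {"photo": "photo"}


def delete_object(conn, object_type: str, structured: dict) -> int:
    """Delete the record whose structured attributes match; return row count."""
    table = TABLE_FOR_TYPE[object_type]              # table chosen by object type
    where = " AND ".join(f"{col} = ?" for col in structured)
    # "with conn" commits on success and rolls back if an exception is raised.
    with conn:
        cur = conn.execute(
            f"DELETE FROM {table} WHERE {where}", tuple(structured.values())
        )
    return cur.rowcount
```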
  27. The data management device according to any one of claims 16 to 22, wherein the unstructured attributes include a KV attribute, the second storage system is a KV storage system, and the data management device further includes a verification unit;
    the verification unit is configured to traverse the key values in the second storage system when a verification instruction is received or when a verification condition is detected to be satisfied, wherein each key value is the data corresponding to a KV attribute in the second storage system;
    in the process of traversing the key values, if no key value identical to a fourth key value can be found in the relational data table stored in the first storage system, delete the fourth key value and the KV data corresponding to the fourth key value from the second storage system, wherein the fourth key value is one of the multiple key values in the second storage system.
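The verification pass of claim 27 amounts to removing every key in the KV store that no relational record references. A minimal sketch, assuming the KV store is a dictionary and that the set of keys referenced by the relational data table has already been collected; the name `sweep_orphan_keys` is hypothetical.

```python
def sweep_orphan_keys(kv_store: dict, referenced_keys: set) -> list:
    """Delete keys absent from the relational table; return the deleted keys."""
    # Collect first, then delete, so the dict is not modified while iterating.
    orphans = [key for key in kv_store if key not in referenced_keys]
    for key in orphans:
        del kv_store[key]  # removes both the key and its KV data
    return orphans
```

The same pattern applies to a file storage system by substituting paths for keys.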
  28. The data management device according to any one of claims 16 to 22, wherein the unstructured attributes include a file attribute, the second storage system is a file storage system, and the data management device further includes a verification unit;
    the verification unit is configured to traverse the paths in the second storage system when a verification instruction is received or when a verification condition is detected to be satisfied, wherein each path is the data corresponding to a file attribute in the second storage system;
    in the process of traversing the paths, if no path identical to a fourth path can be found in the relational data table stored in the first storage system, delete the fourth path and the file data corresponding to the fourth path from the second storage system, wherein the fourth path is one of the multiple paths in the second storage system.
  29. The data management device according to any one of claims 17 to 22, wherein the generating unit is further configured to:
    receive a definition instruction for the object type to which the data object belongs, wherein the definition instruction contains definition information of the object type, and the definition information is used to define the structure of the relational data table of the object type;
    generate, according to the definition instruction, the relational data table of the object type in the first storage system.
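One way to picture the definition instruction of claim 29 is a mapping from attribute names to kinds, from which the relational table is generated. The definition format, the kind labels `"kv"` and `"file"`, and the function `create_object_table` are assumptions for illustration; SQLite stands in for the first storage system, and the names are assumed to come from trusted code. Unstructured attributes are given a TEXT column because the relational table stores only their identifier.

```python
import sqlite3


def create_object_table(conn, object_type: str, definition: dict) -> None:
    """Create the relational table for an object type from its definition."""
    columns = []
    for name, kind in definition.items():
        # Unstructured attributes (KV or file) are stored as an identifier
        # (key or path); structured attributes keep their declared SQL type.
        sql_type = "TEXT" if kind in ("kv", "file") else kind
        columns.append(f"{name} {sql_type}")
    conn.execute(f"CREATE TABLE {object_type} ({', '.join(columns)})")
```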
  30. The data management device according to any one of claims 17 to 22, wherein the generating unit is specifically configured to:
    determine, according to the insert instruction or the update instruction, the object type to which the data object belongs;
    determine, according to the object type, the relational data table corresponding to the data object.
  31. A data management device, comprising a processor and a memory, wherein the memory is configured to store program instructions, and the processor is configured to execute, according to the program instructions, the method according to any one of claims 1 to 15.
  32. A computer-readable storage medium, wherein the computer-readable storage medium stores program instructions which, when executed by a computer, cause the computer to perform the method according to any one of claims 1 to 15.
PCT/CN2020/080952 2019-03-26 2020-03-24 Data management method and related device WO2020192663A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910240947.2 2019-03-26
CN201910240947.2A CN111753141A (en) 2019-03-26 2019-03-26 Data management method and related equipment

Publications (1)

Publication Number Publication Date
WO2020192663A1 (en) 2020-10-01

Family

ID=72609600

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/080952 WO2020192663A1 (en) 2019-03-26 2020-03-24 Data management method and related device

Country Status (2)

Country Link
CN (1) CN111753141A (en)
WO (1) WO2020192663A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115495398A (en) * 2022-09-28 2022-12-20 北京亚控科技发展有限公司 Interface resource operation method and device, electronic equipment, storage medium and product

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023123287A1 (en) * 2021-12-30 2023-07-06 深圳晶泰科技有限公司 Molecular data storage method and device, and molecular data application method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1187421A2 (en) * 2000-08-17 2002-03-13 FusionOne, Inc. Base rolling engine for data transfer and synchronization system
US20030084405A1 (en) * 2001-10-26 2003-05-01 Nec Corporation Contents conversion system, automatic style sheet selection method and program thereof
US6785690B1 (en) * 1996-03-18 2004-08-31 Hewlett-Packard Development Company, L.P. Method and system for storage, retrieval, and query of objects in a schemeless database
EP1868201A1 (en) * 2006-06-14 2007-12-19 Hitachi Consulting Co. Ltd. Contents metadata registering method, registering system, and registering program
EP1906319A1 (en) * 2006-09-29 2008-04-02 Omron Corporation Database generation apparatus and database use aid apparatus
CN101630322A (en) * 2009-08-26 2010-01-20 中国人民解放军信息工程大学 Method for storing and accessing file set under tree directory structure in database
CN105677826A (en) * 2016-01-04 2016-06-15 博康智能网络科技股份有限公司 Resource management method for massive unstructured data

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7155444B2 (en) * 2003-10-23 2006-12-26 Microsoft Corporation Promotion and demotion techniques to facilitate file property management between object systems
WO2015085507A1 (en) * 2013-12-11 2015-06-18 华为技术有限公司 Data storage method, data processing method and device, and mobile terminal
CN106844374B (en) * 2015-12-04 2020-04-03 北京四维图新科技股份有限公司 Method and device for storing and retrieving photos
CN107092685A (en) * 2017-04-24 2017-08-25 广州新盛通科技有限公司 A kind of method that file system and RDBMS store transaction data are used in combination

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115495398A (en) * 2022-09-28 2022-12-20 北京亚控科技发展有限公司 Interface resource operation method and device, electronic equipment, storage medium and product
CN115495398B (en) * 2022-09-28 2023-06-30 北京亚控科技发展有限公司 Interface resource operation method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN111753141A (en) 2020-10-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20778341; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20778341; Country of ref document: EP; Kind code of ref document: A1)