CN110765126B - Data storage and query method, device and storage medium of distributed database - Google Patents

Data storage and query method, device and storage medium of distributed database

Info

Publication number
CN110765126B
Authority
CN
China
Prior art keywords
data
time
physical
query
archive
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910854499.5A
Other languages
Chinese (zh)
Other versions
CN110765126A (en)
Inventor
何帆
何林强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd
Priority to CN201910854499.5A
Publication of CN110765126A
Application granted
Publication of CN110765126B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Abstract

The invention discloses a data storage and query method, a device, and a storage medium for a distributed database. The distributed database comprises a plurality of time slices divided by a time attribute, and each time slice comprises a plurality of physical slices. The data storage method comprises: receiving data to be stored; determining a physical storage path of the data to be stored according to the key value of the data to be stored; and storing the data to be stored in the physical slice corresponding to the physical storage path under each time slice or under some of the time slices. The data query method comprises: acquiring a query condition; traversing the physical slices under each time slice or under some of the time slices according to the query condition; and outputting the archive data and/or the spatio-temporal data matching the query condition as a query result. In this way, the data in the database can be queried quickly.

Description

Data storage and query method, device and storage medium of distributed database
Technical Field
The present invention relates to the field of distributed databases, and in particular, to a method, an apparatus, and a storage medium for storing and querying data of a distributed database.
Background
Existing distributed databases generally rely on the conventional relational database model, so relational data has to be stored and queried level by level, and query efficiency is low for relational data of large volume.
Disclosure of Invention
The invention provides a data storage and query method, a device, and a storage medium for a distributed database, and aims to solve the prior-art problem that relational data cannot be queried quickly.
In order to solve the above technical problem, one technical solution adopted by the present invention is to provide a data storage method for a distributed database. The distributed database includes a plurality of time slices divided by a time attribute, and each time slice comprises a plurality of physical slices. The data storage method comprises the following steps: receiving data to be stored; determining a physical storage path of the data to be stored according to the key value of the data to be stored; and storing the data to be stored in the physical slice corresponding to the physical storage path under each time slice or under some of the time slices.
In order to solve the above technical problem, another technical solution adopted by the present invention is to provide a data query method for a distributed database. The distributed database includes a plurality of time slices divided by a time attribute, and each time slice comprises a plurality of physical slices. The physical slices store archive data and spatio-temporal data. The archive data comprises an archive primary key value and attribute key values; the spatio-temporal data comprises an association key value, a time distribution key value, and a spatial distribution key value; the association key value is associated with the archive primary key value. The data query method comprises the following steps: acquiring a query condition; traversing the physical slices under each time slice or under some of the time slices according to the query condition; and outputting the archive data and/or the spatio-temporal data matching the query condition as a query result.
In order to solve the above technical problem, another technical solution adopted by the present invention is to provide a data storage device of a distributed database, where the data storage device of the distributed database includes a processor and a memory; the memory has stored therein a computer program for execution by the processor to implement the steps of any of the methods of storing distributed data.
In order to solve the above technical problem, another technical solution adopted by the present invention is to provide a data query apparatus for a distributed database, wherein the data query apparatus for the distributed database comprises a processor and a memory; the memory has stored therein a computer program for execution by the processor to implement the steps of any of the methods of querying distributed data.
In order to solve the above technical problem, another technical solution of the present invention is to provide a computer storage medium, in which a computer program is stored, and the computer program implements the steps of any one of the above methods when executed.
Different from the prior art, the invention provides a data storage and query method, a device, and a storage medium for a distributed database, wherein the distributed database comprises time slices and physical slices, and the data of the time slices and the data of the physical slices are queried independently, so that level-by-level querying is not needed and the query speed can be greatly increased.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a schematic flow chart diagram illustrating a first embodiment of a data storage method for a distributed database according to the present invention;
FIG. 2 is a flow chart of a second embodiment of the data storage method of the distributed database according to the present invention;
FIG. 3 is a flow chart of a third embodiment of the data storage method of the distributed database according to the present invention;
FIG. 4 is a schematic block diagram of a first embodiment of a data storage device of a distributed database in accordance with the present invention;
FIG. 5 is a flowchart illustrating a first embodiment of a method for querying data in a distributed database according to the present invention;
FIG. 6 is a flowchart illustrating a second embodiment of a method for querying data in a distributed database according to the present invention;
FIG. 7 is a flow chart of a third embodiment of the data query method of the distributed database according to the present invention;
FIG. 8 is a flowchart illustrating a fourth embodiment of the data query method of the distributed database according to the present invention;
FIG. 9 is a schematic structural diagram of a first embodiment of a data query device of a distributed database according to the present invention;
FIG. 10 is a schematic structural diagram of a second embodiment of a data storage device of a distributed database according to the present invention;
FIG. 11 is a schematic structural diagram of a second embodiment of the data query device of the distributed database according to the present invention;
FIG. 12 is a structural schematic of the distributed database of the present invention;
FIG. 13 is a schematic structural diagram of an embodiment of a computer storage medium.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Existing distributed databases cannot store relational data and unstructured data well at the same time.
In video surveillance, a large amount of archive data and spatio-temporal data needs to be stored; the distributed database provided here allows such data to be stored and queried efficiently.
The distributed database provided by the present application comprises a physical storage table and a logical management table. The physical storage table includes a plurality of time slices, which are divided by a time attribute. The time attribute may be a period of one day, a period of one week, or another length of time, and is not limited herein. Taking a one-day period as an example, a distributed database holding the data of June 2019 includes 30 time slices, from the time slice whose time attribute is June 1, 2019 to the time slice whose time attribute is June 30, 2019.
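The one-day example above can be sketched as follows. This is a minimal illustration only; the function name `daily_time_slices` is hypothetical and the patent does not prescribe how time slices are enumerated.

```python
from datetime import date, timedelta

def daily_time_slices(start: date, end: date) -> list[date]:
    """Return one time attribute (a calendar day) per time slice in [start, end]."""
    days = (end - start).days + 1
    return [start + timedelta(days=i) for i in range(days)]

# June 2019 with a one-day period yields 30 time slices.
slices = daily_time_slices(date(2019, 6, 1), date(2019, 6, 30))
print(len(slices), slices[0], slices[-1])
```

A weekly period would only change the step passed to `timedelta`.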
Each time slice comprises at least one physical slice, and typically a plurality of them. The physical slices are the shards that actually store the data.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a first embodiment of the data storage method for a distributed database according to the present invention. The data storage method in this embodiment includes the following steps:
S11: receiving data to be stored.
The data to be stored is received; specifically, it may be archive data or spatio-temporal data.
S12: determining a physical storage path of the data to be stored according to the key value of the data to be stored.
The physical storage path at which the data is to be stored is then determined from the key value of the data to be stored.
S13: storing the data to be stored in the physical slice corresponding to the physical storage path under each time slice or under some of the time slices.
In one specific embodiment, the data to be stored is stored in the physical slice corresponding to the physical storage path under every time slice; in another specific embodiment, it is stored in the physical slice corresponding to the physical storage path under only some of the time slices.
The storage procedure differs according to the type of the data to be stored.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of a second embodiment of the data storage method for a distributed database according to the present invention, which describes how the data is stored when the data to be stored is archive data. The data storage method in this embodiment includes the following steps:
S11a: receiving data to be stored.
In one embodiment, the data to be stored is archive data, and the key values of the archive data include an archive primary key value and attribute key values. Specifically, one representation of the archive data may be {"person_id": "1", "name": "John", "sex": 1}. In "person_id": "1", person_id is the archive primary key identifier and 1 is the archive primary key value. The attribute key values are attribute parameters that are independent of time, such as name and gender. For example, in "name": "John" and "sex": 1, name and sex are attribute identifiers, and John and 1 are the corresponding attribute key values.
Suppose 1 is predefined as male and 0 as female. Then {"person_id": "1", "name": "John", "sex": 1} is the archive data with an archive primary key value of 1, the name John, and the gender male.
S12a: determining a physical storage path of the data to be stored according to the key value of the data to be stored.
The archive physical storage path of the archive data is calculated from the archive primary key value. Specifically, a consistent hash composite algorithm is applied to the key value, here the archive primary key value, to obtain a routing value; the routing value is then divided by the number of physical slices in the time slice, and the remainder is taken as the physical storage path.
The number of physical slices may be the same or different from one time slice to another. For each time slice, the physical storage path is obtained by dividing the routing value by the number of physical slices in that time slice and taking the remainder.
In a specific embodiment, when multiple physical slices are established in a time slice, serial numbers are allocated to them in sequence and serve as the physical storage paths of the physical slices. The remainder obtained by dividing the routing value by the number of physical slices in the time slice is then the serial number of the physical slice in which the data is to be stored.
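The routing step can be sketched as below. The patent does not specify the hash algorithm, so MD5 is used here purely as a stand-in for a stable hash; the function names `route_value` and `physical_slice_index` are hypothetical, and serial numbers are assumed to start at 0.

```python
import hashlib

def route_value(key: str) -> int:
    """Map a key value to a stable integer routing value.

    Assumption: MD5 only stands in for the unspecified hash algorithm.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def physical_slice_index(key: str, num_physical_slices: int) -> int:
    """Routing value modulo the slice count gives the physical slice serial number."""
    if num_physical_slices < 1:
        raise ValueError("a time slice needs at least one physical slice")
    return route_value(key) % num_physical_slices

# The same key may map to different serial numbers in time slices
# that hold different numbers of physical slices.
print(physical_slice_index("1", 4), physical_slice_index("1", 3))
```

Python's built-in `hash()` is avoided because it is salted per process for strings, which would make the storage path unstable across runs.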
S13a: storing the data to be stored in the physical slice corresponding to the physical storage path under each time slice or under some of the time slices.
The archive data is stored in the physical slice corresponding to the archive physical storage path under each time slice.
That is, the data to be stored, namely the archive data, is stored in the physical slice with the corresponding serial number under each time slice.
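Steps S11a to S13a can be sketched as follows, under the assumption that the database is modelled as an in-memory dictionary from time-slice name to a list of physical slices. The names `store_archive` and `db` are hypothetical, and MD5 again stands in for the unspecified hash.

```python
import hashlib

def physical_slice_index(key: str, num_physical_slices: int) -> int:
    # Assumption: MD5 stands in for the patent's unspecified hash algorithm.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_physical_slices

def store_archive(db: dict, record: dict, key_field: str = "person_id") -> dict:
    """Store one archive record in every time slice; return slice name -> serial number."""
    placement = {}
    for slice_name, physical_slices in db.items():
        # The slice count may differ per time slice, so route per time slice.
        idx = physical_slice_index(str(record[key_field]), len(physical_slices))
        physical_slices[idx].append(record)
        placement[slice_name] = idx
    return placement

# Two time slices with different numbers of physical slices.
db = {
    "2019-06-01": [[] for _ in range(4)],
    "2019-06-02": [[] for _ in range(3)],
}
john = {"person_id": "1", "name": "John", "sex": 1}
placement = store_archive(db, john)
print(placement)
```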
Referring to FIG. 3, FIG. 3 is a schematic flowchart of a third embodiment of the data storage method for a distributed database according to the present invention, which describes how the data is stored when the data to be stored is spatio-temporal data. The data storage method in this embodiment includes the following steps:
S11b: receiving data to be stored.
In one embodiment, the data to be stored is spatio-temporal data, and the key values of the spatio-temporal data include an association key value, a time distribution key value, and a spatial distribution key value. Specifically, one representation of the spatio-temporal data may be {"person_id": "1", "record_place": "X Road", "record_time": "2019-05-01 00"}. In "person_id": "1", person_id is the association key identifier and 1 is the association key value. In "record_time": "2019-05-01 00", record_time is the time distribution key identifier and 2019-05-01 00 is the time distribution key value. In "record_place": "X Road", record_place is the spatial distribution key identifier and X Road is the spatial distribution key value.
This spatio-temporal record has an association key value of 1, a time distribution key value of 2019-05-01 00, and a spatial distribution key value of X Road. That is, the time 2019-05-01 00 denotes data of that day.
S12b: determining a physical storage path of the data to be stored according to the key value of the data to be stored.
A time storage path of the spatio-temporal data is calculated from the time distribution key value.
Because spatio-temporal data has a time attribute, it must be stored in the corresponding time slice. Specifically, the time storage path is calculated from the time distribution key value; for spatio-temporal data with a time distribution key value of 2019-05-01 00, the corresponding time slice is the one whose time attribute is May 1, 2019.
In a specific embodiment, serial numbers may be assigned to the time slices when they are established. Taking the time slices of May 2019 as an example, days 1 to 30 are assigned the serial numbers 1, 2, 3, ..., respectively. When the time storage path of the spatio-temporal data is calculated, the serial number of the time slice in which the data is to be stored is obtained from the time distribution key value.
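A minimal sketch of deriving the time storage path, assuming daily time slices numbered by day of month as in the example above. The function name `time_slice_of` is hypothetical; only the leading date part of the time distribution key value is parsed, so the value is used exactly as it appears in the text.

```python
from datetime import date

def time_slice_of(time_key: str) -> tuple[date, int]:
    """Return (time attribute, serial number) for a time distribution key value.

    Assumption: the key value starts with an ISO date (YYYY-MM-DD) and the
    serial number of a daily time slice is its day of the month.
    """
    day = date.fromisoformat(time_key[:10])
    return day, day.day

print(time_slice_of("2019-05-01 00"))
```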
A physical storage path of the spatio-temporal data is calculated from the association key value.
In a specific embodiment, the spatio-temporal data is associated with archive data: it serves as child data of the archive data and is linked to it through the association key value, i.e., the association key value of the spatio-temporal data is the archive primary key value of the corresponding archive data.
The physical storage path of the spatio-temporal data can be calculated from the association key value in the same way as for the archive primary key value: a consistent hash composite algorithm is applied to the association key value to obtain a routing value; the routing value is then divided by the number of physical slices in the time slice, and the remainder is taken as the physical storage path.
The time storage path and the physical storage path may be calculated simultaneously and then combined by taking their intersection, or the time storage path may be calculated first and the physical storage path then calculated under the corresponding time slice; this is not limited herein.
S13b: storing the data to be stored in the physical slice corresponding to the physical storage path under each time slice or under some of the time slices.
The spatio-temporal data is stored in the time slice corresponding to the time storage path and, within that time slice, in the physical slice corresponding to the physical storage path.
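Steps S11b to S13b can be sketched as follows, using the "time path first, then physical path" ordering. The in-memory `db` layout, the function names, and the use of MD5 as the hash are all assumptions made for illustration.

```python
import hashlib

def physical_slice_index(key: str, num_physical_slices: int) -> int:
    # Assumption: MD5 stands in for the patent's unspecified hash algorithm.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_physical_slices

def store_spatiotemporal(db: dict, record: dict) -> tuple[str, int]:
    """Store a spatio-temporal record in exactly one time slice and physical slice."""
    # Time storage path: the date part of the time distribution key value.
    time_path = record["record_time"][:10]
    physical_slices = db[time_path]
    # Physical storage path: routed by the association key value.
    idx = physical_slice_index(str(record["person_id"]), len(physical_slices))
    physical_slices[idx].append(record)
    return time_path, idx

db = {
    "2019-05-01": [[] for _ in range(4)],
    "2019-05-02": [[] for _ in range(4)],
}
event = {"person_id": "1", "record_place": "X Road", "record_time": "2019-05-01 00"}
time_path, idx = store_spatiotemporal(db, event)
print(time_path, idx)
```

Because the association key value equals the archive primary key value and both are routed by the same function, a spatio-temporal record lands in the same physical slice as its parent archive record within that time slice.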
In a specific embodiment, the distributed database provided by the present application further includes a logical parent table and a logical child table for performing logical management. The logical parent table includes an archive primary key definition value, and the logical child table includes an association key definition value and a time distribution definition value.
In one embodiment, if the data to be stored is archive data, the archive data is identified using the archive primary key definition value to obtain the archive primary key value.
If the data to be stored is spatio-temporal data, it is identified using the association key definition value to obtain the association key value, and using the time distribution definition value to obtain the time distribution key value.
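One way to picture the logical parent and child tables is as small definition records that name which field carries each key. The sketch below is an assumption-laden illustration: the dictionary layout, the function name `identify_keys`, and the rule that a record containing the time distribution field is treated as spatio-temporal data are not taken from the patent.

```python
# Hypothetical definition values held by the logical tables.
LOGICAL_PARENT_TABLE = {"archive_primary_key": "person_id"}
LOGICAL_CHILD_TABLE = {"association_key": "person_id", "time_distribution": "record_time"}

def identify_keys(record: dict) -> dict:
    """Extract the key values that drive routing, based on the logical tables."""
    time_field = LOGICAL_CHILD_TABLE["time_distribution"]
    if time_field in record:
        # Assumption: a record carrying the time field is spatio-temporal data.
        return {
            "type": "spatio-temporal",
            "association_key_value": record[LOGICAL_CHILD_TABLE["association_key"]],
            "time_distribution_key_value": record[time_field],
        }
    return {
        "type": "archive",
        "archive_primary_key_value": record[LOGICAL_PARENT_TABLE["archive_primary_key"]],
    }

print(identify_keys({"person_id": "1", "name": "John", "sex": 1}))
print(identify_keys({"person_id": "1", "record_place": "X Road", "record_time": "2019-05-01 00"}))
```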
The foregoing embodiments provide a distributed database and a storage method in which time slices and physical slices are established and the respective key values of archive data and spatio-temporal data are used to distribute the data across them, thereby achieving fast storage of relational data.
The data storage method of the distributed database is generally realized by a data storage device of the distributed database, so the invention also provides the data storage device of the distributed database. Referring to fig. 4, fig. 4 is a schematic structural diagram of a data storage device of a distributed database according to an embodiment of the present invention. The data storage device 100 of the distributed database of the present embodiment includes a processor 11 and a memory 12; the memory 12 stores a computer program, and the processor 11 is configured to execute the computer program to implement the steps of the data storage method of the distributed database.
Referring to fig. 5, fig. 5 is a schematic flowchart of a first embodiment of a data query method for a distributed database according to the present invention, where the data query method for the distributed database includes the following steps:
Each time slice contains multiple physical slices. The physical slices store archive data and spatio-temporal data; archive data is the parent data of spatio-temporal data, and one archive record may have a plurality of spatio-temporal records. The archive data comprises an archive primary key value and attribute key values; the spatio-temporal data comprises an association key value, a time distribution key value, and a spatial distribution key value, where the association key value is associated with the archive primary key value.
In particular embodiments, a physical slice may store both a person's archive data, such as gender and name, and that person's spatio-temporal data for the day.
S21: acquiring a query condition.
A query condition is acquired. It may be an archive query condition, a spatio-temporal query condition, or both.
S22: traversing the physical slices under each time slice or under some of the time slices according to the query condition.
The physical slices under all time slices, or under some of the time slices, are traversed according to the query condition to obtain the archive data and/or spatio-temporal data matching the query condition.
S23: outputting the archive data and/or the spatio-temporal data matching the query condition as a query result.
After the traversal, the archive data and/or spatio-temporal data matching the query condition is output as the query result.
Referring to FIG. 6, FIG. 6 is a schematic flowchart of a second embodiment of the data query method for a distributed database according to the present invention, in which the query condition is an archive query condition. The data query method in this embodiment includes the following steps:
S21a: acquiring a query condition.
In one embodiment, the query condition is an archive query condition, which includes at least one attribute condition. Specifically, there may be one attribute condition or two, for example name or gender alone, or name and gender together.
S22a: traversing the physical slices under each time slice or under some of the time slices according to the query condition.
Because the archive query condition is independent of time, the physical slices under every time slice are traversed according to the archive query condition.
The format of the archive query condition may be {"archive query condition": ["sex=1"]}, in which the attribute condition is "sex=1". The physical slices under all time slices are traversed according to this condition to obtain the archive data whose attribute key value is also 1.
S23a: outputting the archive data and/or the spatio-temporal data matching the query condition as a query result.
The archive data whose attribute key value matches the attribute condition is output; in this example, the archive data whose attribute key value sex is 1, i.e., all archive data with the gender male, is output as the query result.
In one embodiment, the child data of the archive data, i.e., its spatio-temporal data, may also be output as part of the query result.
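The archive query of steps S21a to S23a can be sketched as below. The in-memory `db` layout, the hand-placed records, the hypothetical third person "P3", and the de-duplication by archive primary key (needed here because each archive record is stored under every time slice) are assumptions for illustration.

```python
def parse_condition(cond: str) -> tuple[str, str]:
    """Split a condition such as 'sex=1' into (field, value)."""
    field, value = cond.split("=", 1)
    return field.strip(), value.strip()

def query_archives(db: dict, conditions: list[str]) -> list[dict]:
    """Traverse every physical slice of every time slice for matching archive data."""
    parsed = [parse_condition(c) for c in conditions]
    found = {}
    for physical_slices in db.values():
        for physical_slice in physical_slices:
            for rec in physical_slice:
                if "record_time" in rec:
                    continue  # skip spatio-temporal records
                if all(str(rec.get(f)) == v for f, v in parsed):
                    found[rec["person_id"]] = rec  # de-duplicate across time slices
    return list(found.values())

john = {"person_id": "1", "name": "John", "sex": 1}
jonny = {"person_id": "2", "name": "Jonny", "sex": 1}
p3 = {"person_id": "3", "name": "P3", "sex": 0}  # hypothetical non-matching record
db = {
    "2019-06-01": [[john], [jonny, p3]],
    "2019-06-02": [[john], [jonny], [p3]],
}
result = query_archives(db, ["sex=1"])
print(sorted(r["name"] for r in result))
```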
Referring to fig. 7, fig. 7 is a schematic flowchart of a third embodiment of a data query method for a distributed database according to the present invention, where the query condition is a spatio-temporal query condition, and the data query method for the distributed database of the present embodiment includes the following steps:
S21b: acquiring a query condition.
In a particular embodiment, the query condition is a spatio-temporal query condition, which includes a time distribution condition.
S22b: traversing the physical slices under each time slice or under some of the time slices according to the query condition.
In an embodiment, the format of the spatio-temporal query condition may be { [ "spatio-temporal query condition". The time slice whose time attribute matches the time distribution condition is determined from the time distribution condition, in a manner similar to the storage method of the above embodiment, which is not repeated here. All physical slices under that time slice are then traversed.
S23b: outputting the archive data and/or the spatio-temporal data matching the query condition as a query result.
The spatio-temporal data under those physical slices whose time distribution key value matches the time distribution condition is output together with its archive data. Taking the above query condition as an example, the spatio-temporal data whose time distribution key value is record_time=2019-05-01 00 is output.
In a specific embodiment, the archive data corresponding to the spatio-temporal data also needs to be output; the archive data can be determined from the association between the association key value of the spatio-temporal data and the archive primary key value, and is output as part of the query result.
In a specific embodiment, the spatio-temporal query condition further includes a spatial distribution condition, such as { [ "record_time=2019-05-01 00.
Referring to fig. 8, fig. 8 is a schematic flowchart of a fourth embodiment of a data query method for a distributed database according to the present invention, where query conditions include an archive query condition and a spatio-temporal query condition, and the data query method for the distributed database of the present embodiment includes the following steps:
S21c: acquiring a query condition.
In one embodiment, the query condition includes an archive query condition and a spatio-temporal query condition. The archive query condition includes an attribute condition, and the spatio-temporal query condition includes a time distribution condition.
S22c: traversing the physical slices under each time slice or under some of the time slices according to the query condition.
In a specific embodiment, the query condition may be { "archive query condition". It is then determined whether the number of successfully matched archive data items is greater than a preset threshold.
If the number is greater than the threshold, the time slice whose time attribute matches the time distribution condition is determined, the physical slices under that time slice are traversed, and the spatio-temporal data in those physical slices whose time distribution key value matches the time distribution condition is determined.
If not, i.e., the number is smaller than the threshold, the time slice whose time attribute matches the time distribution condition is determined, and under that time slice only the physical slices to which the archive data whose attribute key value matches the attribute condition belongs are traversed. In other words, the physical slices holding the matching archive data are traversed directly under the time slice corresponding to the time distribution condition.
S23c: outputting the archive data and/or the spatio-temporal data matching the query condition as a query result.
If the number of archive data items is greater than the threshold, the archive primary key values of the archive data obtained by the traversal are matched against the association key values of the spatio-temporal data, and the archive data whose archive primary key value matches an association key value, together with its spatio-temporal data, is output as the query result.
Matching the archive primary key values of the archive data against the association key values of the spatio-temporal data determines the archive data corresponding to each piece of spatio-temporal data.
Taking the above query condition as an example, in an embodiment, the archive data of sex =1 is queried on all physical slices under the time slice to obtain a result set: [ { "id": 1), "person _ id": 1, "" name ": john," "sex":1}; { "id": 1), "person _ id": 2), "name": jonny, "sex":1} ].
Querying spatio-temporal data of record _ time =2019-05-01 00 on a physical slice on the determined time slice to obtain a result set [ "id": 1", {" person _ id ": 1", "record _ place": X Road "," record _ time ": 2019-05-01"; { "id": 1"," person _ id ": 3", "record _ place": Y Road "," record _ time ": 2019-05-01-00" }.
And intersecting the result set of the file data with the result set of the spatio-temporal data, and determining that the file data { "id": 1"," person _ id ": 1", "record _ place": X Road "," record _ time ": 2019-05-01 00":00 ". So as to output the archive data { "id": 1), "person _ id": 1, "" name ": john," "sex":1} as the query result. The time-space data { "id": 1"," person _ id ": 1", "record _ place": X Road "," record _ time ": 2019-05-01 00" } are not returned as the query result.
In "id": "1", the value 1 is the serial number of the physical slice to which the archive data or spatio-temporal data belongs. It is included here only for illustration; in a specific embodiment, the serial number may be calculated by an algorithm from the primary key or association key and need not be stored in the record.
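The intersection step of the worked example can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `intersect_on_key` is hypothetical, the records are plain dictionaries, and the timestamp format is an assumption reconstructed from the example.

```python
# Result sets from the worked example; the timestamp format is an assumption.
archive_hits = [
    {"id": "1", "person_id": "1", "name": "john", "sex": 1},
    {"id": "1", "person_id": "2", "name": "jonny", "sex": 1},
]
spatiotemporal_hits = [
    {"id": "1", "person_id": "1", "record_place": "X Road",
     "record_time": "2019-05-01 00:00:00"},
    {"id": "1", "person_id": "3", "record_place": "Y Road",
     "record_time": "2019-05-01 00:00:00"},
]


def intersect_on_key(archives, records, key="person_id"):
    """Keep the archives whose primary key also appears as an association key.

    Hypothetical helper: the patent text only says the two result sets are
    intersected; a set lookup is one simple way to do that.
    """
    association_keys = {record[key] for record in records}
    return [archive for archive in archives if archive[key] in association_keys]


matched = intersect_on_key(archive_hits, spatiotemporal_hits)
```

Here `matched` contains only the archive of person 1, because person 2 has no spatio-temporal record at the queried time and person 3 has no archive matching sex = 1.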
If the number of archive data is smaller than the threshold, the spatio-temporal data whose time distribution key values match the time distribution condition is determined in the traversed physical slices, and that spatio-temporal data and its archive data are output as the query result.
In a specific embodiment, the spatio-temporal conditions may further include a spatial distribution condition, and the query manner is similar to that in the above embodiment, which is not described herein again.
In the above embodiment, the traversal method is chosen according to the number of matching archive data. For a large amount of data, intersecting the two result sets increases the query speed; setting a threshold therefore selects the better query method for each case and ensures query efficiency.
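The threshold-based choice between the two traversal methods might look like the sketch below. It is written under several assumptions that the text does not fix: each physical slice is modeled as a dictionary with `archives` and `records` lists, the function name `query_time_slice` is hypothetical, `person_id` serves as both archive primary key and association key, and a count exactly equal to the threshold is treated like the "smaller" case.

```python
def query_time_slice(physical_slices, attr_match, time_match, threshold,
                     key="person_id"):
    """Return (archive, record) pairs for one already-selected time slice."""
    # Archives matching the attribute condition, remembered per physical slice.
    hits_per_slice = [
        [a for a in s["archives"] if attr_match(a)] for s in physical_slices
    ]
    total = sum(len(hits) for hits in hits_per_slice)

    if total > threshold:
        # Many archives: query the spatio-temporal data on every physical
        # slice independently, then intersect on the key.
        records = [r for s in physical_slices for r in s["records"]
                   if time_match(r)]
        by_key = {a[key]: a for hits in hits_per_slice for a in hits}
        return [(by_key[r[key]], r) for r in records if r[key] in by_key]

    # Few archives: visit only the physical slices that hold a matching archive.
    pairs = []
    for s, hits in zip(physical_slices, hits_per_slice):
        if not hits:
            continue
        by_key = {a[key]: a for a in hits}
        pairs.extend((by_key[r[key]], r) for r in s["records"]
                     if time_match(r) and r[key] in by_key)
    return pairs


# Two physical slices under one time slice (illustrative data only).
T = "2019-05-01 00:00:00"
slices = [
    {"archives": [{"person_id": "2", "name": "jonny", "sex": 1}],
     "records": []},
    {"archives": [{"person_id": "1", "name": "john", "sex": 1},
                  {"person_id": "3", "name": "mary", "sex": 0}],
     "records": [{"person_id": "1", "record_place": "X Road", "record_time": T},
                 {"person_id": "3", "record_place": "Y Road", "record_time": T}]},
]
is_sex_1 = lambda a: a["sex"] == 1
at_T = lambda r: r["record_time"] == T

large_path = query_time_slice(slices, is_sex_1, at_T, threshold=0)
small_path = query_time_slice(slices, is_sex_1, at_T, threshold=100)
```

Both branches return the same pairs; they differ only in how many physical slices and records are scanned, which is why the threshold affects speed rather than the result.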
In an embodiment, the query condition may further include a paging condition and a sorting condition. The paging condition includes the number of entries per page of archive data, and the sorting condition includes the sorting order of the archive primary key.
The format of the query condition may be {"archive query condition": ["sex = 1"], "spatio-temporal query condition": ["record_time = 2019-05-01 00:00:00", "record_place = X Road"], "paging condition": …}.
After the result set has been determined according to the archive query condition and the spatio-temporal query condition, the archive data in the query result is first sorted by archive ID in the specified sorting order. That is, the archive data is sorted by the value Z in "id": "Z", from largest to smallest in this embodiment.
Then, the sorted archive data is paged according to the number of entries per page. With the paging condition of the example, every three entries form one page, so that each page displays that number of archive data entries together with their spatio-temporal data.
In a specific embodiment, after each physical slice is queried, the size of the result set may be checked first to determine whether it meets the paging condition, for example the minimum of 3 entries; if so, paging may be performed immediately. If not, the pages are displayed after the whole query has completed in sequence.
In particular embodiments, the paging condition may also be "paging condition": "3 entries per page, get page 1". That is, it further specifies which pages are displayed, for example only the first page.
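The sorting and paging steps can be sketched as below. The function name `sort_and_page` is hypothetical, and the sketch assumes the sort key is a numeric `id` field so that "largest to smallest" is well defined.

```python
def sort_and_page(archives, per_page, descending=True, key="id"):
    """Sort archive entries by key, then split them into pages of per_page."""
    ordered = sorted(archives, key=lambda a: a[key], reverse=descending)
    return [ordered[i:i + per_page] for i in range(0, len(ordered), per_page)]


# Seven illustrative archive entries, three entries per page.
archives = [{"id": n, "name": f"person-{n}"} for n in (4, 1, 7, 3, 6, 2, 5)]
pages = sort_and_page(archives, per_page=3)

# "3 entries per page, get page 1" then corresponds to the first page only.
first_page = pages[0]
```

With seven entries the result is three pages, the last of which holds a single entry.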
In the above embodiment, by providing time slices and physical slices, when the query condition includes both an archive query condition and a spatio-temporal query condition, the data can be queried independently under each condition, so that the required data is reached quickly without level-by-level traversal, which improves query efficiency.
The data query method of the distributed database is generally realized by a data query device of the distributed database, so the invention also provides the data query device of the distributed database. Referring to fig. 9, fig. 9 is a schematic structural diagram of an embodiment of a data query apparatus for a distributed database according to the present invention. The data query device 200 of the distributed database of the present embodiment includes a processor 21 and a memory 22; the memory 22 stores a computer program, and the processor 21 is configured to execute the computer program to implement the steps of the data query method of the distributed database as described above.
In an embodiment, the data storage device 100 of the distributed database and the data query device 200 of the distributed database may be specifically the same device, and are not limited herein.
As shown in fig. 10, the present application further provides a data storage apparatus 400 of a distributed database, which includes a receiving module 41, a determining module 42, and a storage module 43. The receiving module 41 is configured to receive data to be stored; the determining module 42 is configured to determine a physical storage path of the data to be stored according to a key value of the data to be stored; the storage module 43 is configured to store the data to be stored in the physical slice corresponding to the physical storage path under each time slice or under some of the time slices. The specific steps have already been described in the above embodiments and are not repeated here.
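The determine-then-store flow of the three modules can be sketched as follows. This is only an illustration under stated assumptions: the class `ShardedStore` and the helpers `physical_path` and `time_slice_of` are hypothetical names, time slices are assumed to be calendar months, `person_id` is assumed to be both the archive primary key and the association key, and an MD5 digest reduced modulo the slice count stands in for the hash named in claim 4 (plain hash-and-modulo is not full consistent hashing).

```python
import hashlib


def physical_path(key_value, num_physical_slices):
    """Routing value modulo the number of physical slices (see claim 4)."""
    digest = hashlib.md5(str(key_value).encode("utf-8")).hexdigest()
    routing_value = int(digest, 16)
    return routing_value % num_physical_slices


def time_slice_of(record):
    """Assumed time routing: one time slice per month, e.g. '2019-05'."""
    return record["record_time"][:7]


class ShardedStore:
    def __init__(self, time_slices, num_physical_slices):
        self.num_physical_slices = num_physical_slices
        self.time_slices = {
            name: [{"archives": [], "records": []}
                   for _ in range(num_physical_slices)]
            for name in time_slices
        }

    def store_archive(self, archive):
        # Archive data is written to the same physical slice under EVERY time slice.
        path = physical_path(archive["person_id"], self.num_physical_slices)
        for slices in self.time_slices.values():
            slices[path]["archives"].append(archive)
        return path

    def store_record(self, record):
        # Spatio-temporal data goes to ONE time slice, routed by association key.
        path = physical_path(record["person_id"], self.num_physical_slices)
        self.time_slices[time_slice_of(record)][path]["records"].append(record)
        return path


store = ShardedStore(["2019-04", "2019-05"], num_physical_slices=4)
archive = {"person_id": "1", "name": "john", "sex": 1}
record = {"person_id": "1", "record_place": "X Road",
          "record_time": "2019-05-01 00:00:00"}
archive_path = store.store_archive(archive)
record_path = store.store_record(record)
```

Because the archive primary key and the association key hash to the same path, a spatio-temporal record always lands in a physical slice that also holds its archive, which is what lets a query stay within one physical slice.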
As shown in fig. 11, the present application further provides a data query apparatus 500 of a distributed database, which includes an obtaining module 51, a traversing module 52, and an outputting module 53. The obtaining module 51 is configured to obtain a query condition; the traversing module 52 is configured to traverse the physical slices under each time slice or under some of the time slices according to the query condition; and the outputting module 53 is configured to output the archive data and/or spatio-temporal data matching the query condition as the query result. The specific steps have already been described in the above embodiments and are not repeated here.
As shown in fig. 12, the present application further provides a distributed database that includes a storage manager and a logic manager. The storage manager includes a plurality of time slices divided by time attribute; each time slice includes a plurality of physical slices divided by routing attribute. The physical slices are the actual storage slices and may store archive data and spatio-temporal data. The archive data is the parent data of the spatio-temporal data, and the spatio-temporal data is the child data of the archive data: one piece of archive data may correspond to a plurality of pieces of spatio-temporal data, while one piece of spatio-temporal data corresponds to only one piece of archive data. The logic manager performs logical management of the storage manager and includes a logical parent table and a logical child table. The logical parent table includes an archive primary key definition value, which identifies archive data to obtain an archive primary key value. The logical child table includes an association key definition value and a time distribution definition value. The association key definition value identifies spatio-temporal data to obtain an association key value, and the time distribution definition value identifies spatio-temporal data to obtain a time distribution key value.
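One way to picture the logical parent and child tables is as small objects that hold the definition values and use them to read key values out of a record. The class and method names below are hypothetical, and the sketch assumes a definition value is simply the name of the field that carries the key.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalParentTable:
    primary_key_definition: str  # field name holding the archive primary key

    def primary_key_value(self, archive):
        return archive[self.primary_key_definition]


@dataclass(frozen=True)
class LogicalChildTable:
    association_key_definition: str   # field linking a record to its archive
    time_distribution_definition: str  # field used to pick the time slice

    def association_key_value(self, record):
        return record[self.association_key_definition]

    def time_distribution_key_value(self, record):
        return record[self.time_distribution_definition]


parent = LogicalParentTable(primary_key_definition="person_id")
child = LogicalChildTable(association_key_definition="person_id",
                          time_distribution_definition="record_time")

archive = {"person_id": "1", "name": "john", "sex": 1}
records = [
    {"person_id": "1", "record_place": "X Road", "record_time": "2019-05-01"},
    {"person_id": "1", "record_place": "Y Road", "record_time": "2019-05-02"},
]

# One archive (parent) may own many spatio-temporal records (children).
children_of = defaultdict(list)
for r in records:
    children_of[child.association_key_value(r)].append(r)
```

Looking up `children_of[parent.primary_key_value(archive)]` returns both records, which reflects the one-to-many parent and child relationship described above.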
The logic of the data storage method and the data query method of the distributed database is embodied as a computer program. If the computer program is sold or used as an independent software product, it can be stored in a computer storage medium, so the invention also provides a computer storage medium. Referring to fig. 13, fig. 13 is a schematic structural diagram of an embodiment of the computer storage medium of the present invention. A computer program 31 is stored in the computer storage medium 300 of this embodiment, and when executed by a processor the computer program implements the data storage method or the data query method described above.
The computer storage medium 300 may be any medium that can store a computer program, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk, or it may be a server that stores the computer program; the server may send the stored computer program to another device to be run, or may run the stored computer program itself. From a physical point of view, the computer storage medium 300 may be a combination of a plurality of entities, for example, a plurality of servers, a server plus a memory, or a memory plus a removable hard disk.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes performed by the present specification and drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (9)

1. A data storage method of a distributed database is characterized in that the distributed database comprises a plurality of time slices, and the time slices are divided by time attributes; each of the time slices comprises a plurality of physical slices; the data storage method comprises the following steps:
receiving data to be stored;
determining a physical storage path of the data to be stored according to the key value of the data to be stored;
storing the data to be stored into the physical slice corresponding to the physical storage path under each time slice or under some of the time slices;
the distributed database also comprises a logic parent table and a logic child table, wherein the logic parent table comprises an archive main key definition value, and the logic child table comprises an association key definition value and a time distribution definition value; the data storage method comprises the following steps:
the data to be stored is archive data, and the archive data is identified by the archive primary key definition value to obtain an archive primary key value;
and the data to be stored is space-time data, the space-time data is identified by using the associated key definition value to obtain an associated key value, the associated key value is associated with the archive primary key value, and the space-time data is identified by using the time distribution definition value to obtain a time distribution key value.
2. The data storage method according to claim 1, wherein the data to be stored is archive data, and the key values of the archive data include an archive primary key value and an attribute key value;
the determining a physical storage path of the data to be stored according to the key value of the data to be stored includes:
calculating an archive physical storage path of the archive data according to the archive primary key value;
the storing the data to be stored into the physical slice corresponding to the physical storage path under each time slice or under some of the time slices includes:
storing the archive data in the physical slice corresponding to the archive physical storage path under each time slice.
3. The data storage method according to claim 1, wherein the data to be stored is spatio-temporal data, and key values of the spatio-temporal data include an association key value, a time distribution key value and a space distribution key value;
the determining a physical storage path of the data to be stored according to the key value of the data to be stored includes:
calculating a time storage path of the spatio-temporal data according to the time distribution key value, and calculating a physical storage path of the spatio-temporal data according to the association key value;
the storing the data to be stored into the physical slice corresponding to the physical storage path under each time slice or under some of the time slices includes:
storing the spatio-temporal data in the physical slice corresponding to the physical storage path within the time slice corresponding to the time storage path.
4. The data storage method of claim 1, wherein the determining a physical storage path of the data to be stored according to the key value of the data to be stored comprises:
computing a routing value from the key value by using a consistent hash composite algorithm;
dividing the routing value by the number of physical slices in the time slice to obtain a remainder;
taking the obtained remainder as the physical storage path.
5. A data query method of a distributed database is characterized in that the distributed database comprises a plurality of time slices, and the time slices are divided by time attributes; each of the time slices comprises a plurality of physical slices; the physical fragments are stored with archive data and space-time data, and the archive data comprises archive primary key values and attribute key values; the space-time data comprises an associated key value, a time distribution key value and a space distribution key value; the association key value is associated with the primary file key value; the data query method comprises the following steps:
acquiring a query condition;
traversing each time slice or physical slices under partial time slices according to the query condition;
outputting archive data or/and spatio-temporal data matched with the query conditions as a query result;
the query conditions comprise archive query conditions and spatio-temporal query conditions, the archive query conditions comprise attribute conditions, and the spatio-temporal query conditions comprise time distribution conditions;
the traversing of the physical slices under each time slice or part of the time slices according to the query condition includes:
determining the archive data of which the attribute key value is matched with the attribute condition;
determining whether the quantity of the archival data is greater than a preset threshold value;
if so, determining the time slice with the time attribute matched with the time distribution condition, and traversing the physical slice under the time slice;
determining time-space data with time distribution key values matched with the time distribution conditions in the physical fragments;
the outputting the archive data or/and the spatio-temporal data matched with the query condition as a query result comprises:
matching operation is carried out on the primary file key values of the file data and the associated key values of the space-time data;
outputting the archive data and the spatio-temporal data of the archive primary key value and the associated key value which are matched as query results;
if not, determining the time slice whose time attribute matches the time distribution condition, and traversing, under the time slice, the physical slice to which the archive data whose attribute key value matches the attribute condition belongs;
the outputting the archive data or/and the spatio-temporal data matched with the query condition as a query result comprises:
and determining the spatio-temporal data of which the time distribution key values are matched with the time distribution conditions in the physical fragments, and outputting the spatio-temporal data and the archival data thereof as a query result.
6. The data query method of claim 5, wherein the query condition further comprises a paging condition and a sorting condition, the paging condition comprises the number of entries per page of the archive data, and the sorting condition comprises a sorting order of the archive primary key values;
the query method further comprises the following steps:
sorting the archive data in the sorting order according to the archive primary key values of the archive data in the query result;
and paging the sorted archive data according to the number of entries per page, so that each page displays that number of archive data entries and the spatio-temporal data of the archive data.
7. A data storage device of a distributed database, wherein the data storage device of the distributed database comprises a processor and a memory; the memory has stored therein a computer program for execution by the processor to implement the steps of the method according to any one of claims 1-4.
8. A data query device of a distributed database, characterized by comprising a processor and a memory; the memory stores a computer program for execution by the processor to carry out the steps of the method according to claim 5 or 6.
9. A computer storage medium, characterized in that the computer storage medium stores a computer program which, when executed, implements the steps of the method according to any one of claims 1-6.
CN201910854499.5A 2019-09-10 2019-09-10 Data storage and query method, device and storage medium of distributed database Active CN110765126B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910854499.5A CN110765126B (en) 2019-09-10 2019-09-10 Data storage and query method, device and storage medium of distributed database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910854499.5A CN110765126B (en) 2019-09-10 2019-09-10 Data storage and query method, device and storage medium of distributed database

Publications (2)

Publication Number Publication Date
CN110765126A CN110765126A (en) 2020-02-07
CN110765126B true CN110765126B (en) 2023-02-07

Family

ID=69329410

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910854499.5A Active CN110765126B (en) 2019-09-10 2019-09-10 Data storage and query method, device and storage medium of distributed database

Country Status (1)

Country Link
CN (1) CN110765126B (en)

Also Published As

Publication number Publication date
CN110765126A (en) 2020-02-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant