WO2023024247A1 - Range query method, apparatus, device and storage medium for tag data
- Publication number
- WO2023024247A1 (PCT/CN2021/127321)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- query
- data
- tag data
- range
- target tag
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
Definitions
- the present application relates to the technical field of databases, in particular to a tag data range query method, device, computer equipment and non-volatile storage medium.
- Time series database is the short name; the full name is time-sequence database (时间序列数据库).
- Time series databases are mainly used to process data with time tags (changes in the order of time, that is, time serialization). Data with time tags is also called time series data.
- Tag data is used to identify the inherent attributes of data objects. In some time series databases, tag data is stored in string format and does not support range queries.
- This application provides a range query method for label data, including:
- query the attribute value of the target tag data in the range query data structure including:
- recording the query time of the target label data includes:
- the latest query time for target tag data is recorded, including:
- the attribute value of the target tag data and the latest query time of the target tag data are stored through the Hash table structure.
- judging whether there is an attribute value of the target label data in the range query data structure includes:
- it also includes:
- When the range query data structure reaches the pruning condition, the range query data structure is pruned according to the latest query time of each tag data in the range query data structure.
- the range query data structure is a B+ tree structure.
- the present application also provides a range query device for tag data, including:
- the creation unit is used to predetermine the corresponding relationship between the attribute value of the tag data and the storage address of the tag data in the time series index file of the time series database, and establish a range query data structure according to the attribute value of each tag data;
- the query unit is used to query the attribute value of the target tag data in the range query data structure according to the query range of the attribute value in the input tag data range query request;
- a determining unit configured to determine the storage address of the target label data in the time series index file according to the correspondence
- an acquisition unit configured to acquire the target tag data according to the storage address of the target tag data
- An output unit for outputting the target tag data.
- the present application also provides a computer device, including a memory and one or more processors, wherein computer-readable instructions are stored in the memory, and when the computer-readable instructions are executed by the one or more processors, the one or more processors perform the following steps:
- the present application also provides one or more non-volatile computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps:
- FIG. 1A is an application scenario diagram of a tag data range query method provided in the present application according to one or more embodiments
- Fig. 1 is a flow chart of a range query method for label data provided in the present application according to one or more embodiments;
- Figure 2 is a schematic diagram of the data reading relationship in the InfluxDB database
- Figure 3 is a schematic diagram of a time series index file
- FIG. 4 is a schematic diagram of a B+ tree structure provided in the present application according to one or more embodiments.
- FIG. 5 is a schematic diagram of a B+ tree query method provided in the present application according to one or more embodiments
- FIG. 6 is a schematic diagram of a HashSet structure provided in the present application according to one or more embodiments.
- Figure 7 is a schematic diagram of the calculation method of the HashSet structure
- Fig. 8 is a functional topology diagram of label data range query in a time-series database according to one or more embodiments of the present application;
- FIG. 9 is a schematic diagram of an elimination process of non-hot label data provided in the present application according to one or more embodiments.
- Fig. 10 is a schematic diagram of the structure of a label data range query method provided in the present application according to one or more embodiments;
- Fig. 11 is a schematic structural diagram of a tag data range query device provided in the present application according to one or more embodiments;
- Fig. 12 is a schematic structural diagram of a label data range query device provided in one or more embodiments of the present application.
- the core of the present application is to provide a tag data range query method, device, equipment and storage medium for realizing range query of tag data in a time series database.
- the tag data range query method provided in this application can be applied to the application environment shown in FIG. 1A.
- the terminal 102a communicates with the server 104a through a network.
- the terminal 102a can be, but is not limited to, any of various personal computers, notebook computers, smart phones, tablet computers and portable wearable devices, and the server 104a can be realized by an independent server or a server cluster composed of multiple servers.
- the server 104a pre-determines the correspondence between the attribute value of the tag data and the storage address of the tag data in the time series index file of the time series database, establishes a range query data structure according to the attribute value of each tag data, queries the attribute value of the target tag data in the range query data structure according to the query range of the attribute value in the input tag data range query request, determines the storage address of the target tag data in the time series index file according to the correspondence, acquires the target tag data according to its storage address, and outputs the target tag data to the terminal 102a.
- Fig. 1 is a flow chart of a tag data range query method provided by an embodiment of the present application
- Fig. 2 is a schematic diagram of data reading relationships in an InfluxDB database
- Fig. 3 is a schematic diagram of a time series index file.
- S101: Predetermine the corresponding relationship between the attribute values of the tag data and the storage addresses of the tag data in the time series index file of the time series database, and establish a range query data structure according to the attribute values of each tag data.
- S104: Obtain the target tag data according to the storage address of the target tag data.
- InfluxDB database reference data is shown in Table 1:
- Range query is a common database query, which is used to query tuples whose attribute values are within a specified range.
- the InfluxDB database is a time series database that does not support label data range queries.
- looking up data requires finding the corresponding value (value) in the data area through the data source key (SeriesKey) in the index area.
- the basic concepts involved are:
- Series: a data source, that is, a mapping composed of a data source key (SeriesKey) and a value (value); the query process finds the corresponding value (value) from the data source key (SeriesKey).
- Tag data: an inherent attribute of a data object; it can only be stored in string format and does not support range queries.
- Timestamp: marks the time when the field is generated.
- time series databases usually use Time Series Index (TSI) files to store the memory index, so that the data source keys (SeriesKey) of all data sources that meet the query conditions can be found in a short time, which solves the problem that the memory index is limited by the memory size and takes a long time to recover.
- the structure of the time series index file may be as shown in FIG. 3.
- the data source block (Series Block), tag block (Tag Block), table name block (Measurement Block), index file trailer (Index File Trailer), etc. are stored in different storage blocks (Block).
- the range query method for tag data constructs a range query data structure for performing range query on attribute values of tag data on the basis of time series index files.
- the range query data structure is a data structure that supports range queries, for example a balanced multi-way tree structure.
- the attribute values used to construct the range query data structure can cover all attributes of the tag data or only some of them, depending on the type of tag data and the user's query habits. It can be understood that, for the convenience of query, the attribute value used to construct the range query data structure is not the entire content of the tag data, whereas the tag data stored in the time series index file of the time series database is the full content of the tag data.
- the embodiment of the present application predetermines the correspondence between the attribute values of the tag data and the storage addresses of the tag data in the time series index file, and establishes a range query data structure according to the attribute values of each tag data, so that after a range query on the tag data based on attribute values is carried out in the range query data structure, the result can be further mapped to the time series index file to obtain the target tag data.
- the user can input the query range of the attribute value of the target tag data; the attribute values within the query range are then found in the range query data structure, and the storage address of the target tag data in the time series index file of the time series database is determined from the pre-established correspondence, so that the target tag data can be taken out of the time series index file of the time series database.
- the range query method for tag data establishes a range query data structure in advance according to the attribute values of the tag data, and establishes the correspondence between the attribute values of the tag data and the storage addresses of the tag data in the time series index file of the time series database. After the attribute value of the target tag data is found in the range query data structure according to the query range of the attribute value in the input tag data range query request, the storage address of the target tag data in the time series index file is determined from the correspondence, and the target tag data is then obtained and output. This fills the gap in the range query function for tag data in some time series databases, makes it convenient for users to perform range queries on tag data, and enriches the functions of the time series database.
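The flow summarized above (build the correspondence and the range structure, query by attribute range, resolve the storage address, fetch and output) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation described in the application: the time series index file is modeled as a hypothetical in-memory dict `tsi_file` keyed by storage address, the tag values and addresses are invented, and a sorted list searched with `bisect` stands in for the range query data structure.

```python
from bisect import bisect_left, bisect_right

# Hypothetical stand-in for the time series index file:
# storage address -> full tag data (deliberately out of order).
tsi_file = {
    0x30: {"tag": "tag03", "host": "c"},
    0x10: {"tag": "tag01", "host": "a"},
    0x40: {"tag": "tag04", "host": "d"},
    0x20: {"tag": "tag02", "host": "b"},
}

def build(index_file):
    """Predetermine attribute value -> storage address and build a sorted structure."""
    address_of = {record["tag"]: addr for addr, record in index_file.items()}
    sorted_values = sorted(address_of)  # stand-in for the range query data structure
    return address_of, sorted_values

def range_query(index_file, address_of, sorted_values, low, high):
    """Return the full tag data whose attribute value lies in [low, high]."""
    # Query the range structure for the attribute values of the target tag data.
    start = bisect_left(sorted_values, low)
    stop = bisect_right(sorted_values, high)
    hits = sorted_values[start:stop]
    # Resolve each storage address through the correspondence, then fetch and output.
    return [index_file[address_of[value]] for value in hits]

address_of, sorted_values = build(tsi_file)
result = range_query(tsi_file, address_of, sorted_values, "tag02", "tag03")
```

The range structure holds only the attribute value, while the full tag data stays in the index file and is reached through the stored address.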
- the addition and deletion of tag data in the range query data structure can be performed along with the creation and deletion of tag data in the time series index file, or along with the "hotness" of the tag data.
- step S102, querying the range query data structure according to the query range of the attribute value in the input tag data range query request to obtain the attribute values of the target tag data, includes:
- according to the query range in the tag data range query request, determine whether the range query data structure has an attribute value of the target tag data
- management of the range query data structure based on query time can be further realized. Since a range query usually queries a continuous run of tag data, for example tag data with tag numbers 1-18, if some tag data cannot be found in the temporary data structure, the attribute value of that tag data needs to be added to the range query data structure for the user's next query. In addition, target tag data that the user queries individually and that is not recorded in the range query data structure is also added to the range query data structure.
- recording the query time of the target tag data is specifically: recording the latest query time of the target tag data. Whenever a tag data is queried, its latest query time overwrites the previous query time. For tag data queried for the first time, its query time is created.
- another data structure may be used to record all tag data in the range query data structure and its corresponding latest query time, and update members along with updating and pruning of the range query data structure.
- the "hotness" of the tag data can be further determined, so as to eliminate non-hot data and add new hot data to the range query data structure, thereby simplifying the range query data structure and improving the efficiency of each query.
- the range query data structure is pruned according to the latest query time of each tag data in the range query data structure.
- the pruning condition can be: when the interval between the latest query time of a tag data and the current time exceeds a first threshold, delete that tag data from the range query data structure; or, when the number of members in the range query data structure exceeds a second threshold, delete the tag data whose latest query time is furthest in the past; and so on. Pruning rules for the range query data structure can also be established according to the actual application scenario.
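The two example pruning conditions can be illustrated with a short sketch. The function name `prune` and the parameters `max_idle` (first threshold) and `max_members` (second threshold) are hypothetical, and applying both conditions in a single pass is an assumption made here for brevity; the text presents them as alternative rules.

```python
def prune(last_query, now, max_idle, max_members):
    """Return the set of tag values to evict from the range query data structure.

    last_query maps tag value -> latest query time (same unit as `now`).
    """
    # Condition 1: idle time since the latest query exceeds the first threshold.
    evict = {tag for tag, ts in last_query.items() if now - ts > max_idle}

    # Condition 2: if the remaining members still exceed the second threshold,
    # drop those whose latest query time is furthest in the past.
    survivors = sorted((t for t in last_query if t not in evict), key=last_query.get)
    excess = len(survivors) - max_members
    if excess > 0:
        evict.update(survivors[:excess])
    return evict
```

The caller would then delete the returned tags from both the range query data structure and the record of query times.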
- tag data that the user wants to query but that is not recorded in the range query data structure is added to the range query data structure according to the creation rules of the range query data structure.
- FIG. 4 is a schematic diagram of a B+ tree structure provided by the embodiment of the application
- Fig. 5 is a schematic diagram of a query method of a B+ tree provided by the embodiment of the application
- Fig. 6 is a schematic diagram of a HashSet structure provided by the embodiment of the application
- Figure 7 is a schematic diagram of the calculation method of the HashSet structure
- Figure 8 is a functional topology diagram of tag data range query in a time series database provided by the embodiment of the application
- Figure 9 is a schematic diagram of the elimination process of non-hot tag data provided by the embodiment of the application
- FIG. 10 is a schematic diagram of the structure of a label data range query method provided by the embodiment of the present application.
- the range query data structure may specifically adopt a B-tree structure, preferably a B+ tree structure. Because the leaf nodes of a B+ tree are connected as a linked list, it can greatly improve the efficiency of range queries on tag fields.
- a 3rd-order B+ tree is created using the tag data.
- tag1-tag18 are the tag fields of the tag data, increasing in lexicographical order, although the corresponding tag blocks (Tag Block) in the time series index file may actually be out of order or discontinuous; the storage address data of the tag data is stored on the leaf nodes, whose physical structure corresponds to the tag blocks (Tag Block) in the time series index file.
- the tree building process has completed the lexicographical ordering of the tag data.
- Tree search: starting from the root node of the B+ tree, search for the lower limit value of the tag data range.
- Linked list lookup: traverse the linked list from the position of the lower limit value, and query all leaf nodes within the query range.
- Modify query time: update the query time for the target tag data.
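The tree search and linked list lookup steps can be sketched with a simplified B+ tree. This is a hedged illustration only: the tree is bulk-loaded once from sorted data and never rebalanced, the class and function names are hypothetical, and the zero-padded keys `tag01` to `tag18` and the integer addresses are assumptions chosen so that lexicographic order matches numeric order.

```python
from bisect import bisect_right

class Leaf:
    def __init__(self, keys, addrs):
        self.keys = keys      # sorted tag attribute values
        self.addrs = addrs    # storage addresses, parallel to keys
        self.next = None      # link to the next leaf (the leaf-level linked list)

class Inner:
    def __init__(self, seps, children):
        self.seps = seps          # seps[i] is the smallest key under children[i + 1]
        self.children = children

def min_key(node):
    while isinstance(node, Inner):
        node = node.children[0]
    return node.keys[0]

def bulk_load(pairs, order=3):
    """Build a static B+ tree from (key, address) pairs; no rebalancing is done."""
    if not pairs:
        raise ValueError("need at least one (key, address) pair")
    pairs = sorted(pairs)
    step = order - 1  # maximum keys per leaf
    leaves = [Leaf([k for k, _ in pairs[i:i + step]], [a for _, a in pairs[i:i + step]])
              for i in range(0, len(pairs), step)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    level = leaves
    while len(level) > 1:
        groups = [level[i:i + order] for i in range(0, len(level), order)]
        level = [Inner([min_key(c) for c in g[1:]], g) for g in groups]
    return level[0]

def range_query(root, low, high):
    # Tree search: descend from the root to the leaf that may hold the lower limit.
    node = root
    while isinstance(node, Inner):
        node = node.children[bisect_right(node.seps, low)]
    # Linked list lookup: walk the leaf chain until a key exceeds the upper limit.
    hits = []
    while node is not None:
        for key, addr in zip(node.keys, node.addrs):
            if key > high:
                return hits
            if key >= low:
                hits.append((key, addr))
        node = node.next
    return hits

root = bulk_load([(f"tag{i:02d}", 100 + i) for i in range(1, 19)])
found = range_query(root, "tag05", "tag08")
```

Only one root-to-leaf descent is needed per query; the rest of the range is read by following the `next` links, which is the property the text attributes to the B+ tree.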
- the query time of the target tag data can be stored through the HashSet structure. Then record the latest query time of the target tag data, specifically: store the attribute value of the target tag data and the latest query time of the target tag data through the HashSet structure.
- the method of recording the latest query time of the target tag data may specifically be: store the attribute value of the target tag data through the Hash table structure and the latest query time for the target label data.
- the characteristic of the Hash table structure is that duplicate key values (key) are not allowed to be inserted into the Hash table, and the newly inserted key value (key) will overwrite the old one.
- the HashSet structure stores the attribute values of the target tag data and the latest query time of the target tag data.
- the tag data (tags) stored in the B+ tree and the latest query time (time) are recorded through the HashSet structure.
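The overwrite behaviour described for the hash structure can be shown with a Python `dict`, used here as a stand-in for the HashSet / Hash table structure named in the text. The names `last_query` and `record_query` are hypothetical.

```python
import time

last_query = {}  # tag attribute value -> latest query time

def record_query(tag_value, now=None):
    """Insert or overwrite the latest query time of a tag.

    Returns True if the tag was already recorded, which the method also uses
    as the signal that the tag is present in the range query data structure.
    """
    already_recorded = tag_value in last_query
    # Duplicate keys are not kept: the new time replaces the old one.
    last_query[tag_value] = time.time() if now is None else now
    return already_recorded
```

A first call for a tag creates its query time; every later call replaces it, so the structure always holds one entry per tag.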
- A Bloom filter can be used to judge easily and efficiently whether a tag data is stored in the HashSet structure. Its core is a very large bit array and several hash functions. As shown in FIG. 7, assuming that the length of the bit array is m and the number of hash functions is k, the bit array is first initialized, and each bit in it is set to 0.
- For each element in the collection, the element is mapped through the k hash functions in turn; each mapping generates a hash value that corresponds to a point on the bit array, and the corresponding position of the bit array is marked as 1.
- To test an element W, the same method maps W to k points on the bit array through hashing. If any one of the k points is not 1, it can be judged that the element is definitely not in the set.
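A minimal Bloom filter along these lines is sketched below, with `m` bits and `k` hash functions. Deriving the `k` hash functions by salting SHA-256 with the function index is an assumption made for this example; the application does not specify the hash functions. One byte is used per bit for readability.

```python
import hashlib

class BloomFilter:
    def __init__(self, m, k):
        self.m = m                 # length of the bit array
        self.k = k                 # number of hash functions
        self.bits = bytearray(m)   # all positions initialized to 0

    def _positions(self, item):
        # Assumed scheme: the i-th hash function is SHA-256 salted with i.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # Any 0 among the k positions proves the item was never added.
        return all(self.bits[pos] for pos in self._positions(item))
```

A `False` result is definitive, while a `True` result only means the element may be present, so a positive answer would still be confirmed against the hash structure itself.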
- HashSet structure: by parsing the query statement, obtain the query range and the related tag fields, use the HashSet structure to mark whether the tag data has been inserted into the B+ tree, and record the query time, mainly including:
- Tag member record: records whether the tag data has been inserted into the temporary index of the B+ tree structure, using a set structure, and also records the timestamp of the latest query of the tag data. If it is the first query, the insertion time is recorded in the set as the timestamp. The set can be used to check whether a tag data node is in the B+ tree, and it can also be used by the elimination mechanism to scan whether a node is non-hot data.
- Tag data temporary query index: the range query data structure is composed of a B+ tree, which supports range query, deletion and ordered sorting of tag data fields.
- the completed B+ tree structure supports range query.
- the range query method for label data provided by the embodiment of the present application specifically includes:
- updating the query time of the target tag data specifically means adding or updating the corresponding tag:time key-value pair in the HashSet structure. Owing to the overwrite feature of the HashSet structure, the new query time overwrites the previous query time.
- Timed tasks: a background thread can be used to implement the timed task.
- the scanning object is the timestamp mark of each tag data object recorded in the HashSet structure. If the definition of non-hot data is met, the leaf node in the B+ tree structure and the corresponding member in the HashSet structure are deleted to maintain the query efficiency of the B+ tree.
- the user can be provided with the function of parameter adjustment.
- relevant parameters such as the elimination threshold T_thd should remain configurable, that is, exposed to users and modifiable through terminals or commands.
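The timed elimination task with a configurable threshold might look like the sketch below. The class `Evictor`, its method names, and the use of a plain `set` in place of the B+ tree leaf entries are all assumptions for illustration; `t_thd` corresponds to the elimination threshold and can be changed at run time.

```python
import threading
import time

class Evictor:
    def __init__(self, last_query, index_keys, t_thd):
        self.last_query = last_query   # dict: tag value -> latest query time
        self.index_keys = index_keys   # set standing in for B+ tree leaf entries
        self.t_thd = t_thd             # elimination threshold, user adjustable
        self._lock = threading.Lock()
        self._stop = threading.Event()

    def sweep(self, now=None):
        """Scan the recorded timestamps once and delete non-hot tags."""
        now = time.time() if now is None else now
        with self._lock:
            cold = [t for t, ts in self.last_query.items() if now - ts > self.t_thd]
            for tag in cold:
                del self.last_query[tag]      # remove the member record
                self.index_keys.discard(tag)  # remove the index entry
        return cold

    def start(self, interval):
        """Run sweep() every `interval` seconds in a background thread."""
        def loop():
            # Event.wait returns False on timeout, so the loop runs until stop().
            while not self._stop.wait(interval):
                self.sweep()
        thread = threading.Thread(target=loop, daemon=True)
        thread.start()
        return thread

    def stop(self):
        self._stop.set()
```

A real implementation would take the same lock on the query path, since queries and the background sweep touch the same two structures.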
- the range query method for label data provided by the embodiment of the present application specifically includes:
- Step S203: determine whether the index is an empty tree; if yes, go to step S204; if not, go to step S206.
- Step S206: determine whether the tag data is in the HashSet structure; if yes, go to step S207; if not, go to step S208.
- Step S207: B+ tree range query. Then go to step S210.
- Step S209: prune the temporary index (B+ tree). Then go to step S207.
- Step S210: update the latest query time of the target tag data. Then go to step S216.
- Step S213: determine whether the query interval time is greater than the elimination threshold; if yes, go to step S214; if not, repeat step S213.
- Step S215: delete the tag data node in the temporary index (B+ tree). Then go to step S216.
- the range query method for tag data provided by the embodiment of the present application can be realized by predesigning the creation/index component, the data query component and the elimination mechanism execution component.
- steps S202-S206, S208-S209 are the implementation blocks of the creation/index component
- steps S207 and S210 are the realization blocks of the data query component
- steps S211-215 are the realization blocks of the elimination mechanism execution component.
- various embodiments of the tag data range query method are described in detail above. On this basis, the present application also discloses a tag data range query apparatus, device, and storage medium corresponding to the above method.
- FIG. 11 is a schematic structural diagram of an apparatus for querying the range of label data provided by an embodiment of the present application.
- the range query device for tag data includes:
- the creation unit 301 is used to predetermine the corresponding relationship between the attribute value of the tag data and the storage address of the tag data in the time series index file of the time series database, and establish a range query data structure according to the attribute value of each tag data;
- the query unit 302 is used to query the attribute value of the target tag data in the range query data structure according to the query range of the attribute value in the input tag data range query request;
- a determining unit 303 configured to determine the storage address of the target tag data in the time series index file according to the correspondence
- An acquisition unit 304 configured to acquire the target tag data according to the storage address of the target tag data
- An output unit 305 configured to output target label data.
- query unit 302 specifically includes:
- the first judging subunit is used to judge whether there is an attribute value of the target tag data in the range query data structure according to the query range in the tag data range query request; if yes, enter the first recording subunit; if not, enter the second recording subunit;
- the first recording subunit is used to record the query time of the target tag data, and then perform the step of querying the target tag data in the range query data structure according to the query range of the attribute value in the input tag data range query request;
- the second recording subunit is used to record the query time of the target tag data and, after adding the attribute value of the target tag data to the range query data structure, to perform the step of querying the range query data structure according to the query range of the attribute value in the input tag data range query request to obtain the target tag data.
- the creating unit 301 also includes:
- the second judging subunit is used to judge whether the range query data structure meets the pruning condition; if yes, enter the pruning subunit;
- the pruning subunit is configured to prune the range query data structure according to the latest query time of each tag data in the range query data structure.
- a computer device may be a server. Please refer to FIG. 12 for its internal structure diagram.
- the computer device includes a processor, memory, network interface and database connected by a system bus. Wherein, the processor of the computer device is used to provide calculation and control capabilities.
- the memory of the computer device includes a non-volatile storage medium and an internal memory.
- the non-volatile storage medium stores an operating system, computer readable instructions and a database.
- the internal memory provides an environment for the execution of the operating system and computer readable instructions in the non-volatile storage medium.
- the database of the computer device is used to store data.
- the network interface of the computer device is used to communicate with an external terminal via a network connection. When the computer-readable instructions are executed by the processor, the above tag data range query method is realized.
- the above-described device and device embodiments are only illustrative.
- the division of modules is only a logical function division.
- the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or modules may be in electrical, mechanical or other forms.
- a module described as a separate component may or may not be physically separated, and a component shown as a module may or may not be a physical module, that is, it may be located in one place, or may also be distributed to multiple network modules. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
- each functional module in each embodiment of the present application may be integrated into one processing module, each module may exist separately physically, or two or more modules may be integrated into one module.
- the above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules.
- if an integrated module is realized in the form of a software function module and sold or used as an independent product, it can be stored in a storage medium.
- the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and executes all or part of the steps of the methods described in the various embodiments of the present application.
- a computer device includes a memory and one or more processors. Computer-readable instructions are stored in the memory. When the computer-readable instructions are executed by the processor, the one or more processors execute the above method.
- One or more non-volatile storage media storing computer-readable instructions.
- when the computer-readable instructions are executed by one or more processors, the one or more processors are caused to execute the above method.
- a range query method, device, equipment and storage medium for tag data provided by the present application have been introduced in detail above.
- Each embodiment in the description is described in a progressive manner, each embodiment focuses on the difference from other embodiments, and the same and similar parts of each embodiment can be referred to each other.
- as for the apparatus, device and storage media disclosed in the embodiments, since they correspond to the methods disclosed in the embodiments, their description is relatively brief; for relevant details, please refer to the description of the method part. It should be pointed out that those skilled in the art can make some improvements and modifications to the application without departing from the principles of the application, and these improvements and modifications also fall within the protection scope of the claims of the application.
- Nonvolatile memory can include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
- Volatile memory can include random access memory (RAM) or external cache memory.
- RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Abstract
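The present application discloses a range query method, apparatus, device and storage medium for tag data. A range query data structure is established in advance according to the attribute values of the tag data, and the correspondence between the attribute values of the tag data and the storage addresses of the tag data in the time series index file of the time series database is established. After the attribute value of the target tag data is found in the range query data structure according to the query range of the attribute value in the input tag data range query request, the storage address of the target tag data in the time series index file is determined from the correspondence, and the target tag data is then obtained and output.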
Description
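Cross-Reference to Related Applications
This application claims priority to the Chinese patent application filed with the China Patent Office on August 26, 2021, with application number 202110984987.5 and titled "Range query method, apparatus, device and storage medium for tag data", the entire contents of which are incorporated herein by reference.
The present application relates to the technical field of databases, and in particular to a range query method, apparatus, computer device and non-volatile storage medium for tag data.
A time series database is, in full, a time-sequence database. Time series databases are mainly used to process data carrying time tags (data that changes in time order, that is, time-serialized data); data carrying time tags is also called time series data. Tag data identifies the inherent attributes of a data object. In some time series databases, tag data is stored in string format and does not support range queries.
Time series databases currently offer two common ways to perform a data range query: one matches with fuzzy conditions, and the other concatenates query conditions with the or instruction. However, the first method places strict requirements on how the data field values stored in the time series database are named; it not only makes range query commands harder for users to write, but also easily misses field values that do not follow the naming rules. The second method only suits scenarios that query a small number of field values; for queries over a large number of field values, or over continuously varying field values, it is difficult to code. As a result, many time series databases do not implement range queries on tag data well.
Filling the gap in range query functionality for tag data in time series databases is a technical problem that those skilled in the art need to solve.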
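Summary of the Invention
The present application provides a range query method for tag data, comprising:
determining in advance a correspondence between attribute values of tag data and storage addresses of the tag data in a time series index file of a time series database, and building a range query data structure from the attribute values of the tag data;
querying the range query data structure, according to the query range of attribute values in an input tag data range query request, to obtain the attribute values of target tag data;
determining, from the correspondence, the storage address of the target tag data in the time series index file;
obtaining the target tag data according to the storage address of the target tag data; and
outputting the target tag data.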
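In one embodiment, querying the range query data structure, according to the query range of attribute values in the input tag data range query request, to obtain the attribute values of the target tag data comprises:
determining, according to the query range in the tag data range query request, whether the range query data structure contains the attribute value of the target tag data;
upon determining that the range query data structure contains the attribute value of the target tag data, recording the query time of the target tag data, and performing the step of querying the range query data structure, according to the query range of attribute values in the input tag data range query request, to obtain the attribute values of the target tag data; and
upon determining that the range query data structure does not contain the attribute value of the target tag data, recording the query time of the target tag data, adding the attribute value of the target tag data to the range query data structure, and then performing the step of querying the range query data structure, according to the query range of attribute values in the input tag data range query request, to obtain the attribute values of the target tag data.
In one embodiment, recording the query time of the target tag data comprises:
recording the latest query time of the target tag data.
In one embodiment, recording the latest query time of the target tag data comprises:
storing the attribute value of the target tag data and the latest query time of the target tag data in a hash table structure.
In one embodiment, determining whether the range query data structure contains the attribute value of the target tag data comprises:
when a latest query time is recorded for the target tag data, determining that the range query data structure contains the attribute value of the target tag data; and
when no latest query time is recorded for the target tag data, determining that the range query data structure does not contain the attribute value of the target tag data.
In one embodiment, the method further comprises:
when the range query data structure meets a pruning condition, pruning the range query data structure according to the latest query time of each piece of tag data in the range query data structure.
In one embodiment, the range query data structure is a B+ tree structure.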
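The present application further provides a range query apparatus for tag data, comprising:
a creation unit, configured to determine in advance a correspondence between attribute values of tag data and storage addresses of the tag data in a time series index file of a time series database, and to build a range query data structure from the attribute values of the tag data;
a query unit, configured to query the range query data structure, according to the query range of attribute values in an input tag data range query request, to obtain the attribute values of target tag data;
a determination unit, configured to determine, from the correspondence, the storage address of the target tag data in the time series index file;
an acquisition unit, configured to obtain the target tag data according to the storage address of the target tag data; and
an output unit, configured to output the target tag data.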
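The present application further provides a computer device comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
determining in advance a correspondence between attribute values of tag data and storage addresses of the tag data in a time series index file of a time series database, and building a range query data structure from the attribute values of the tag data;
querying the range query data structure, according to the query range of attribute values in an input tag data range query request, to obtain the attribute values of target tag data;
determining, from the correspondence, the storage address of the target tag data in the time series index file;
obtaining the target tag data according to the storage address of the target tag data; and
outputting the target tag data.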
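The present application further provides one or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
determining in advance a correspondence between attribute values of tag data and storage addresses of the tag data in a time series index file of a time series database, and building a range query data structure from the attribute values of the tag data;
querying the range query data structure, according to the query range of attribute values in an input tag data range query request, to obtain the attribute values of target tag data;
determining, from the correspondence, the storage address of the target tag data in the time series index file;
obtaining the target tag data according to the storage address of the target tag data; and
outputting the target tag data.
Details of one or more embodiments of the present application are set forth in the accompanying drawings and the description below. Other features and advantages of the present application will become apparent from the description, the drawings and the claims.
To describe the technical solutions of the embodiments of the present application or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.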
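FIG. 1A is an application scenario diagram of a range query method for tag data according to one or more embodiments of the present application;
FIG. 1 is a flowchart of a range query method for tag data according to one or more embodiments of the present application;
FIG. 2 is a schematic diagram of the data read relationships in the InfluxDB database;
FIG. 3 is a schematic diagram of a time series index file;
FIG. 4 is a schematic diagram of a B+ tree structure according to one or more embodiments of the present application;
FIG. 5 is a schematic diagram of a B+ tree query according to one or more embodiments of the present application;
FIG. 6 is a schematic diagram of a HashSet structure according to one or more embodiments of the present application;
FIG. 7 is a schematic diagram of how the HashSet structure is computed;
FIG. 8 is a functional topology diagram of a tag data range query in a time series database according to one or more embodiments of the present application;
FIG. 9 is a schematic diagram of the elimination process for non-hot tag data according to one or more embodiments of the present application;
FIG. 10 is an architecture diagram of a tag data range query method according to one or more embodiments of the present application;
FIG. 11 is a schematic structural diagram of a range query apparatus for tag data according to one or more embodiments of the present application;
FIG. 12 is a schematic structural diagram of a range query device for tag data according to one or more embodiments of the present application.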
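The core of the present application is to provide a range query method, apparatus, device and storage medium for tag data, used to implement range queries on tag data in a time series database.
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings of those embodiments. Evidently, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present application without creative effort fall within the protection scope of the present application.
The range query method for tag data provided by the present application can be applied in the application environment shown in FIG. 1A, in which a terminal 102a communicates with a server 104a over a network. The terminal 102a may be, but is not limited to, a personal computer, a notebook computer, a smartphone, a tablet computer or a portable wearable device; the server 104a may be implemented as an independent server or as a server cluster composed of multiple servers.
Specifically, the server 104a determines in advance the correspondence between attribute values of tag data and storage addresses of the tag data in the time series index file of the time series database, and builds a range query data structure from the attribute values of the tag data. According to the query range of attribute values in an input tag data range query request, it queries the range query data structure to obtain the attribute values of the target tag data, determines from the correspondence the storage address of the target tag data in the time series index file, obtains the target tag data according to that storage address, and outputs the target tag data to the terminal 102a.
FIG. 1 is a flowchart of a range query method for tag data according to an embodiment of the present application; FIG. 2 is a schematic diagram of the data read relationships in the InfluxDB database; FIG. 3 is a schematic diagram of a time series index file.
As shown in FIG. 1, the range query method for tag data provided by this embodiment of the present application, described by taking its application to the server in FIG. 1A as an example, comprises: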
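S101: Determine in advance the correspondence between attribute values of tag data and storage addresses of the tag data in the time series index file of the time series database, and build a range query data structure from the attribute values of the tag data.
S102: According to the query range of attribute values in the input tag data range query request, query the range query data structure to obtain the attribute values of the target tag data.
S103: Determine, from the correspondence, the storage address of the target tag data in the time series index file.
S104: Obtain the target tag data according to the storage address of the target tag data.
S105: Output the target tag data.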
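Reference data for the InfluxDB database is shown in Table 1:
Table 1: InfluxDB database reference data
A range query is a common database query that retrieves the tuples whose attribute values fall within a specified range.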
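As shown in FIG. 2, the InfluxDB database is a time series database that does not support range queries on tag data. In InfluxDB, finding data requires using the series key (SeriesKey) in the index area to locate the corresponding value in the data area. The basic concepts involved are:
Series: a data source, that is, a mapping formed by a series key (SeriesKey) and its corresponding value; a query looks up the value from the series key.
Measurement: the table name.
Tag: tag data, an inherent attribute of a data object; it can only be stored in string format and does not support range queries.
field: a dynamic attribute of the data; it supports the string, float, integer and boolean types, is associated with a timestamp, and supports range queries.
timestamp: the timestamp, marking the time at which the field was produced.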
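To balance query efficiency against memory constraints, a time series database usually stores its in-memory index in a time series index (Time Series Index, TSI) file. This makes it possible to find the series keys (SeriesKey) of all data sources matching the query conditions in a short time, and it addresses the problems that an in-memory index is limited by memory size and takes a long time to recover.
The structure of the time series index file may be as shown in FIG. 3. Separate storage blocks (Block) hold the series block (Series Block), the tag block (Tag Block), the measurement block (Measurement Block), the index file trailer (Index File Trailer) and so on. When data is queried, the measurement named in the command statement is first located in the measurement block, and the tag block corresponding to the tag data (Tag) of that measurement is obtained. Using the tag data and its value in the query condition, all series identifiers (Series ID) that satisfy the query condition are found, and the corresponding series block is then located from each series identifier.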
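The lookup order described above (measurement block, then tag block, then series IDs, then series block) can be sketched with plain dictionaries. This is only a toy model: the dictionary layout, the names `measurement_block`, `tag_blocks`, `series_block` and `find_series_keys`, and the sample series keys are all assumptions made for illustration, not the actual on-disk TSI format.

```python
# Toy in-memory stand-ins for the blocks of a time series index file.
# The layout below is hypothetical and chosen only to mirror the lookup order.
measurement_block = {"cpu": "tag_block_cpu"}   # measurement name -> its tag block
tag_blocks = {
    "tag_block_cpu": {                          # tag key -> tag value -> series IDs
        "host": {"server01": [1, 2], "server02": [3]},
        "region": {"east": [1, 3], "west": [2]},
    }
}
series_block = {                                # series ID -> series key
    1: "cpu,host=server01,region=east",
    2: "cpu,host=server01,region=west",
    3: "cpu,host=server02,region=east",
}


def find_series_keys(measurement, tag_key, tag_value):
    """Resolve an exact tag condition to the matching series keys."""
    tag_block = tag_blocks[measurement_block[measurement]]        # measurement -> tag block
    series_ids = tag_block.get(tag_key, {}).get(tag_value, [])    # tag condition -> series IDs
    return [series_block[sid] for sid in series_ids]              # series IDs -> series block


keys = find_series_keys("cpu", "host", "server01")
```

Note that the lookup is an exact match on the tag value; nothing in this chain supports asking for a range of tag values, which is the gap the method addresses.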
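The range query method for tag data provided by this embodiment therefore builds, on top of the time series index file, a range query data structure used to run range queries on the attribute values of tag data. Any data structure that supports range queries over its contents can be used, for example a balanced multiway tree.
The attribute values used to build the range query data structure may be all of the attribute values of the tag data or only some of them, and can be chosen according to the type of the tag data and the query habits of the user. It will be understood that, for ease of querying, the attribute values used to build the range query data structure are not the full content of the tag data, whereas the tag data stored in the time series index file of the time series database is the full content.
In addition, tag data may be stored in the time series index file in an order unrelated to its content, and tag data can be read from the time series index file only once its storage address is known. The method therefore determines in advance the correspondence between the attribute values of the tag data and the storage addresses of the tag data in the time series index file, and builds the range query data structure from the attribute values. Once a range query over attribute values has been carried out on the range query data structure, the result can be mapped back to the time series index file to obtain the target tag data.
With this method, the user inputs the query range for the attribute values of the target tag data, and the attribute values within that range are found in the range query data structure. The storage address of the target tag data in the time series index file of the time series database is then determined from the pre-established correspondence, and the target tag data is read from the time series index file.
In summary, the range query method for tag data provided by this embodiment builds a range query data structure in advance from the attribute values of the tag data and establishes the correspondence between those attribute values and the storage addresses of the tag data in the time series index file of the time series database. After the attribute values of the target tag data have been found in the range query data structure according to the query range in the input tag data range query request, the storage address of the target tag data in the time series index file is determined from the correspondence, and the target tag data is obtained and output. This fills the gap in range query functionality for tag data in some time series databases, makes it easier for users to run range queries on tag data, and enriches the functionality of the time series database.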
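Steps S101 to S105 can be illustrated with a minimal sketch. It assumes that a sorted list searched with `bisect` stands in for the range query data structure, and that a dictionary keyed by storage address stands in for the time series index file; `TagRangeIndex`, `query_tags` and the sample addresses are hypothetical names and values, not part of the source.

```python
import bisect


class TagRangeIndex:
    """Toy range query data structure: sorted attribute values plus a value -> address map."""

    def __init__(self):
        self._values = []        # attribute values, kept in sorted order
        self._address_of = {}    # correspondence: attribute value -> storage address

    def add(self, value, address):
        if value not in self._address_of:
            bisect.insort(self._values, value)
        self._address_of[value] = address

    def range_query(self, low, high):
        """Return the attribute values v with low <= v <= high."""
        lo = bisect.bisect_left(self._values, low)
        hi = bisect.bisect_right(self._values, high)
        return self._values[lo:hi]

    def address(self, value):
        return self._address_of[value]


def query_tags(index, tsi_file, low, high):
    values = index.range_query(low, high)              # S102: range query on attribute values
    addresses = [index.address(v) for v in values]     # S103: map values to storage addresses
    return [tsi_file[a] for a in addresses]            # S104: fetch; the caller outputs it (S105)


# Hypothetical index file: storage address -> full tag data, deliberately unordered.
tsi_file = {
    0x30: {"tag": "tag03", "series_ids": [7]},
    0x10: {"tag": "tag01", "series_ids": [1, 2]},
    0x20: {"tag": "tag02", "series_ids": [5]},
    0x40: {"tag": "tag10", "series_ids": [9]},
}

index = TagRangeIndex()
for addr, block in tsi_file.items():                   # S101: build index and correspondence
    index.add(block["tag"], addr)

result = query_tags(index, tsi_file, "tag02", "tag05")
```

The index file itself is never reordered; only the small side structure is sorted, which is why the value-to-address correspondence is needed.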
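Embodiment 2
In the above embodiment, tag data may be added to and removed from the range query data structure as the tag data is created and deleted in the time series index file, or according to how "hot" the tag data is.
On the basis of the above embodiment, in the range query method for tag data provided by this embodiment, step S102 (according to the query range of attribute values in the input tag data range query request, querying the range query data structure to obtain the attribute values of the target tag data) specifically comprises:
determining, according to the query range in the tag data range query request, whether the range query data structure contains the attribute value of the target tag data;
if so, recording the query time of the target tag data, and performing the step of querying the range query data structure, according to the query range of attribute values in the input tag data range query request, to obtain the target tag data;
if not, recording the query time of the target tag data, adding the attribute value of the target tag data to the range query data structure, and then performing the step of querying the range query data structure, according to the query range of attribute values in the input tag data range query request, to obtain the target tag data.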
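Recording the query time of the target tag data makes it possible to manage the range query data structure on the basis of query time. A range query usually covers a continuous run of tag data, for example the tag data numbered 1 to 18; if one of those items cannot be found in the range query data structure, its attribute value must be added to the range query data structure so that the user can find it in the next query. Likewise, target tag data that a user queries individually but that is not yet recorded in the range query data structure is also added to it.
To keep the query time data from occupying much storage and to make it easy to inspect, recording the query time of the target tag data specifically means recording its latest query time. Each time a piece of tag data is queried, its latest query time overwrites the previous one. For tag data queried for the first time, a query time entry is created.
Specifically, a separate data structure can record all the tag data in the range query data structure together with the corresponding latest query times, and its members are updated as the range query data structure is updated and pruned.
Determining whether the range query data structure contains the attribute value of the target tag data may then specifically be:
determining whether a latest query time is recorded for the target tag data;
if so, determining that the range query data structure contains the attribute value of the target tag data;
if not, determining that the range query data structure does not contain the attribute value of the target tag data.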
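The bookkeeping described above can be sketched as follows, assuming a plain Python dictionary serves as the separate data structure; the names `latest_query_time`, `record_query` and `in_range_index` are hypothetical. The timestamps are passed in explicitly so the behaviour is easy to follow.

```python
import time

latest_query_time = {}   # attribute value -> latest query time


def in_range_index(value):
    """A recorded latest query time implies the value is in the range query data structure."""
    return value in latest_query_time


def record_query(value, now=None):
    """Create the entry on the first query; overwrite the previous time afterwards."""
    latest_query_time[value] = time.time() if now is None else now


record_query("tag01", now=100.0)   # first query creates the entry
record_query("tag01", now=250.0)   # later query overwrites it
```

Because only one timestamp is kept per attribute value, the record grows with the number of indexed values rather than with the number of queries.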
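The query times of the tag data can further be used to determine how "hot" each piece of tag data is, so that non-hot data is eliminated from the range query data structure and new hot data is added to it. This keeps the range query data structure small and improves the efficiency of each query.
The range query method for tag data provided by this embodiment therefore further comprises:
determining whether the range query data structure meets a pruning condition;
if so, pruning the range query data structure according to the latest query time of each piece of tag data in the range query data structure.
The pruning condition may be, for example: when the interval between the latest query time of a piece of tag data and the current time exceeds a first threshold, that tag data is deleted from the range query data structure; or, when the number of members in the range query data structure exceeds a second threshold, the tag data whose latest query time is oldest is deleted. Pruning rules for the range query data structure can also be defined to suit the actual application scenario.
Tag data that the user wants to query but that is not recorded in the range query data structure is added to the range query data structure according to the creation rules of the range query data structure.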
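The two example pruning conditions can be sketched as below. The sketch assumes the latest query times live in a dictionary and, for brevity, that the range query data structure is represented by a set of attribute values; `prune`, `max_age` (the first threshold) and `max_members` (the second threshold) are hypothetical names.

```python
def prune(latest_query_time, index_values, now, max_age, max_members):
    """Remove cold entries from both the time record and the index stand-in."""
    # Condition 1: idle for longer than the first threshold.
    stale = [v for v, t in latest_query_time.items() if now - t > max_age]
    for value in stale:
        del latest_query_time[value]
        index_values.discard(value)

    # Condition 2: more members than the second threshold; drop the least recently queried.
    excess = len(latest_query_time) - max_members
    if excess > 0:
        oldest_first = sorted(latest_query_time, key=latest_query_time.get)
        for value in oldest_first[:excess]:
            del latest_query_time[value]
            index_values.discard(value)


times = {"tag01": 10.0, "tag02": 80.0, "tag03": 90.0, "tag04": 95.0}
values = set(times)
prune(times, values, now=100.0, max_age=50.0, max_members=2)
```

In this run `tag01` is removed by the age rule and `tag02` by the size rule, leaving the two most recently queried values.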
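Embodiment 3
FIG. 4 is a schematic diagram of a B+ tree structure according to an embodiment of the present application; FIG. 5 is a schematic diagram of a B+ tree query according to an embodiment of the present application; FIG. 6 is a schematic diagram of a HashSet structure according to an embodiment of the present application; FIG. 7 is a schematic diagram of how the HashSet structure is computed; FIG. 8 is a functional topology diagram of a tag data range query in a time series database according to an embodiment of the present application; FIG. 9 is a schematic diagram of the elimination process for non-hot tag data according to an embodiment of the present application; FIG. 10 is an architecture diagram of a tag data range query method according to an embodiment of the present application.
On the basis of the above embodiments, in the range query method for tag data provided by this embodiment, the range query data structure may be a B-tree structure, and is preferably a B+ tree structure. Because the leaf nodes of a B+ tree are linked in a list, a range query can scan consecutive leaves without returning to the upper levels of the tree, which improves the efficiency of range queries on tag fields.
FIG. 4 shows an order-3 B+ tree created from tag data. Assume that tag1 to tag18 are tag fields of tag data that increase in lexicographic order, whereas the corresponding tag blocks (Tag Block) in the time series index file may in fact be unordered or non-contiguous. The storage addresses of the tag data are stored in the leaf nodes and correspond physically to the tag blocks in the time series index file; on this basis, building the tree already sorts the tag data in lexicographic order.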
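The query process on the B+ tree structure is shown in FIG. 5 and specifically comprises:
Tree search: starting from the root node of the B+ tree, find the lower bound of the tag data range.
Linked-list search: starting from the position of the lower bound, traverse the leaf linked list and collect all leaf entries within the query range.
Update query time: update the query time of the target tag data.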
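The three phases above can be sketched on a small static B+ tree. This is a simplified illustration: the tree is bulk-loaded from sorted keys rather than built by insertions with node splits, and `Leaf`, `Inner`, `bulk_load`, `range_query` and the sample addresses are hypothetical names and values.

```python
import bisect


class Leaf:
    def __init__(self, keys, addrs):
        self.keys, self.addrs, self.next = keys, addrs, None


class Inner:
    def __init__(self, seps, children):
        # seps[i] is the smallest key reachable under children[i + 1]
        self.seps, self.children = seps, children


def min_key(node):
    while isinstance(node, Inner):
        node = node.children[0]
    return node.keys[0]


def bulk_load(items, fanout=3):
    """Build a static B+ tree from (key, address) pairs; leaves form a linked list."""
    items = sorted(items)
    if not items:
        return Leaf([], [])
    level = [Leaf([k for k, _ in items[i:i + fanout]], [a for _, a in items[i:i + fanout]])
             for i in range(0, len(items), fanout)]
    for left, right in zip(level, level[1:]):
        left.next = right
    while len(level) > 1:
        groups = [level[i:i + fanout] for i in range(0, len(level), fanout)]
        level = [Inner([min_key(c) for c in g[1:]], g) for g in groups]
    return level[0]


def range_query(root, low, high, touch):
    node = root
    while isinstance(node, Inner):                 # tree search: descend to the lower bound
        node = node.children[bisect.bisect_right(node.seps, low)]
    hits = []
    while node is not None:                        # linked-list search along the leaves
        for key, addr in zip(node.keys, node.addrs):
            if key > high:
                return hits
            if key >= low:
                hits.append((key, addr))
                touch(key)                         # update the query time of each hit
        node = node.next
    return hits


# tag01..tag18 in lexicographic order, with made-up, unordered storage addresses.
items = [(f"tag{i:02d}", 0x100 + (i * 7 % 18)) for i in range(1, 19)]
root = bulk_load(items)
touched = []
hits = range_query(root, "tag05", "tag08", touched.append)
```

Only one root-to-leaf descent is needed per query; the rest of the range is read by following `next` pointers between leaves.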
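On the basis of the above embodiments, in the range query method for tag data provided by this embodiment, the query time of the target tag data may be stored in a HashSet structure. Recording the latest query time of the target tag data is then specifically: storing the attribute value of the target tag data and its latest query time in the HashSet structure.
More generally, the latest query time of the target tag data may be recorded by storing the attribute value of the target tag data and its latest query time in a hash table structure. A hash table does not allow duplicate keys to be inserted; a newly inserted key overwrites the old one.
For example, one kind of hash table structure, the HashSet structure, can be used to store the attribute value of the target tag data and its latest query time. As shown in FIG. 6, the HashSet structure records the tag data (tags) stored in the B+ tree together with the latest query time (time). A Bloom filter makes it easy and efficient to judge whether a piece of tag data is stored in the HashSet structure; at its core it is a very large bit array and several hash functions. As shown in FIG. 7, assume that the bit array has length m and that there are k hash functions. The bit array is first initialized with every bit set to 0. Each element of the set is mapped through the k hash functions in turn; each mapping produces a hash value that corresponds to one position in the bit array, and that position is set to 1. To check whether an element W is in the set, W is mapped in the same way to k positions in the bit array. If any one of the k positions is not 1, the element is definitely not in the set; if all k positions are 1, the element may be in the set, since a Bloom filter can report false positives.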
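The Bloom filter procedure described above can be sketched as follows. The source does not say which hash functions are used, so deriving the k functions by salting SHA-256 with the function index is an assumption made here, as are the names `BloomFilter`, `add` and `might_contain`.

```python
import hashlib


class BloomFilter:
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = [0] * m                      # bit array of length m, initialised to 0

    def _positions(self, item):
        # k hash functions, derived by salting SHA-256 with the function index (assumption).
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1                   # mark each of the k positions

    def might_contain(self, item):
        # False: definitely absent. True: possibly present (false positives can occur).
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter(m=1024, k=3)
for tag in ("tag01", "tag02", "tag03"):
    bf.add(tag)
```

A `False` answer lets the caller skip the hash table lookup entirely; a `True` answer still has to be confirmed against the actual record.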
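The query statement is parsed to obtain the query range and the relevant tag field; the HashSet structure is used to mark whether a piece of tag data has been inserted into the B+ tree and to record its query time. The main parts are:
Tag member record: a set structure that records whether a piece of tag data has been inserted into the temporary index built as a B+ tree, together with the timestamp at which the tag data was most recently queried. For a first query, the insertion time is recorded in the set as the timestamp. The set can be used to check whether a tag data node is in the B+ tree, and it can also be used by the elimination mechanism to scan for nodes that are non-hot data.
Temporary query index for tag data: this range query data structure is a B+ tree. It supports range queries on tag data fields, deletion, and ordering by field; once built, the B+ tree structure supports range queries.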
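On the basis of the structures shown in FIGS. 2, 4 and 5, and as shown in FIG. 8, the range query method for tag data provided by this embodiment specifically comprises:
Load the time series index file from disk into memory and traverse it; use the HashSet structure to store the value of each piece of tag data and the time at which it was inserted into the B+ tree, and insert the tag data into the B+ tree at the same time. Note that the time series index file is not modified at this point; the B+ tree stores only the storage addresses of the tag blocks (Tag Block) in the time series index file. A timed task also monitors the time stored for each piece of tag data in the HashSet structure, and if that time is too old, the node is deleted from the B+ tree. Note again that only the node in the B+ tree is deleted, not the tag block in the time series index file.
In the B+ tree based tag data query process described above, updating the query time of the target tag data specifically means adding or updating the corresponding tag:time key-value pair in the HashSet structure; owing to the properties of the HashSet structure, the new query time overwrites the previous one.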
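As shown in FIG. 9, the elimination mechanism based on the HashSet structure defines hot tag data and uses a timed task to scan the tag data; tag data nodes in the B+ tree that are not hot data are deleted, which maintains the query efficiency of the tree. An elimination threshold $T_{thd}$ is defined, and for each piece of tag data the difference between the system time $T$ of the time series database and the latest query time $T_0$ of that tag data is computed as $\Delta T = T - T_0$. If $\Delta T > T_{thd}$, the tag data is considered non-hot data; all other tag data is defined as hot data.
The timed task can be implemented with a background thread. It scans the timestamps of the tag data objects recorded in the HashSet structure; if a piece of tag data meets the definition of non-hot data, the corresponding leaf node in the B+ tree structure and the corresponding member in the HashSet structure are deleted, which maintains the query efficiency of the B+ tree.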
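A background-thread version of this timed task might look like the sketch below. It assumes the record of latest query times is a dictionary, represents the B+ tree by a set of tag values for brevity, and guards both with a lock; `evict_cold`, `start_eviction_thread`, `t_thd` and `period` are hypothetical names.

```python
import threading
import time


def evict_cold(latest_query_time, index_values, now, t_thd):
    """One scan: delete every tag whose idle time (now - T0) exceeds the threshold t_thd."""
    cold = [tag for tag, t0 in latest_query_time.items() if now - t0 > t_thd]
    for tag in cold:
        del latest_query_time[tag]     # delete the member from the record table
        index_values.discard(tag)      # delete the node from the index stand-in
    return cold


def start_eviction_thread(latest_query_time, index_values, t_thd, period, lock, stop):
    """Run evict_cold every `period` seconds until the `stop` event is set."""
    def loop():
        while not stop.wait(period):   # wait() returns False on timeout, True once stopped
            with lock:
                evict_cold(latest_query_time, index_values, time.time(), t_thd)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread


# Deterministic single scan with an explicit "current time".
times = {"tag01": 100.0, "tag02": 940.0}
values = {"tag01", "tag02"}
removed = evict_cold(times, values, now=1000.0, t_thd=300.0)
```

Query code that reads or updates the same structures would need to take the same lock, and setting the `stop` event ends the loop at the next wake-up.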
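In addition, users can be given the ability to tune parameters. In the elimination mechanism, for example, relevant parameters such as the elimination threshold $T_{thd}$ should remain configurable, that is, exposed so that the user can modify them through a terminal or a command.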
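On the basis of FIGS. 4 to 9, and as shown in FIG. 10, the range query method for tag data provided by this embodiment specifically comprises:
S201: Receive a query instruction.
S202: Parse the tag field and the query range.
S203: Determine whether the index is an empty tree; if so, go to step S204; if not, go to step S206.
S204: Create the record table (HashSet structure), insert the tag data, and record the query time.
S205: Create the temporary index (B+ tree).
S206: Determine whether the tag data is in the HashSet structure; if so, go to step S207; if not, go to step S208.
S207: Run the B+ tree range query. Then go to step S210.
S208: Insert the tag data into the record table and record the query time.
S209: Prune the temporary index (B+ tree). Then go to step S207.
S210: Update the latest query time of the target tag data. Then go to step S216.
S211: Run the timed task.
S212: Check the latest query time of the tag data.
S213: Determine whether the interval since the latest query exceeds the elimination threshold; if so, go to step S214; if not, repeat step S213.
S214: Delete the record of the tag data from the record table.
S215: Delete the tag data node from the temporary index (B+ tree). Then go to step S216.
S216: Keep the temporary index (B+ tree) in memory.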
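In a specific implementation, the range query method for tag data provided by this embodiment can be realized by designing in advance a creation/index component, a data query component and an elimination mechanism execution component.
Steps S202 to S206 and S208 to S209 form the implementation block of the creation/index component, steps S207 and S210 form that of the data query component, and steps S211 to S215 form that of the elimination mechanism execution component. For the specific implementation, refer to the above embodiments; details are not repeated here.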
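The query path of the flow (roughly steps S203 to S210) can be sketched as below. Several simplifications are assumptions of this sketch rather than statements from the source: a sorted list stands in for the B+ tree, a dictionary stands in for both the record table and the time series index file, step S209 is read as adjusting the index after a new tag is recorded, and `TagQueryService` is a hypothetical name.

```python
import bisect


class TagQueryService:
    def __init__(self, tsi_tags):
        self.tsi_tags = tsi_tags   # stand-in for the index file: tag value -> storage address
        self.record = None         # record table: tag value -> latest query time
        self.index = None          # temporary index: sorted tag values (B+ tree stand-in)

    def query(self, low, high, now):
        if self.index is None:                             # S203: the index is still empty
            self.record, self.index = {}, []               # S204 and S205: create both structures
        for tag in self.tsi_tags:                          # S206: is each in-range tag recorded?
            if low <= tag <= high and tag not in self.record:
                self.record[tag] = now                     # S208: insert into the record table
                bisect.insort(self.index, tag)             # S209: adjust the temporary index
        lo = bisect.bisect_left(self.index, low)           # S207: range query on the index
        hi = bisect.bisect_right(self.index, high)
        hits = self.index[lo:hi]
        for tag in hits:                                   # S210: update the latest query time
            self.record[tag] = now
        return [(tag, self.tsi_tags[tag]) for tag in hits]


service = TagQueryService({"tag03": 48, "tag01": 16, "tag02": 32, "tag09": 64})
first = service.query("tag01", "tag02", now=10.0)
second = service.query("tag02", "tag05", now=20.0)
```

After the two calls, `tag09` has never been queried and so never enters the temporary index, which is the "hot data only" behaviour the flow aims for.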
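The embodiments of the range query method for tag data have been described in detail above. On that basis, the present application further discloses a range query apparatus, device and storage medium for tag data corresponding to the above method.
Embodiment 4
FIG. 11 is a schematic structural diagram of a range query apparatus for tag data according to an embodiment of the present application.
As shown in FIG. 11, the range query apparatus for tag data provided by this embodiment comprises:
a creation unit 301, configured to determine in advance a correspondence between attribute values of tag data and storage addresses of the tag data in a time series index file of a time series database, and to build a range query data structure from the attribute values of the tag data;
a query unit 302, configured to query the range query data structure, according to the query range of attribute values in an input tag data range query request, to obtain the attribute values of target tag data;
a determination unit 303, configured to determine, from the correspondence, the storage address of the target tag data in the time series index file;
an acquisition unit 304, configured to obtain the target tag data according to the storage address of the target tag data;
an output unit 305, configured to output the target tag data.
Further, the query unit 302 specifically comprises:
a first judgment subunit, configured to determine, according to the query range in the tag data range query request, whether the range query data structure contains the attribute value of the target tag data; if so, control passes to a first recording subunit; if not, control passes to a second recording subunit;
the first recording subunit, configured to record the query time of the target tag data and then perform the step of querying the range query data structure, according to the query range of attribute values in the input tag data range query request, to obtain the target tag data;
the second recording subunit, configured to record the query time of the target tag data, add the attribute value of the target tag data to the range query data structure, and then perform the step of querying the range query data structure, according to the query range of attribute values in the input tag data range query request, to obtain the target tag data.
Further, the creation unit 301 also comprises:
a second judgment subunit, configured to determine whether the range query data structure meets a pruning condition; if so, control passes to a pruning subunit;
the pruning subunit, configured to prune the range query data structure according to the latest query time of each piece of tag data in the range query data structure.
Because the apparatus embodiments correspond to the method embodiments, refer to the description of the method embodiments for the apparatus embodiments; details are not repeated here.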
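According to another aspect of the present application, a computer device is provided. The computer device may be a server, and its internal structure is shown in FIG. 12. The computer device comprises a processor, a memory, a network interface and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database of the computer device is used to store data. The network interface of the computer device is used to communicate with external terminals over a network connection. When executed by the processor, the computer-readable instructions implement the range query method for tag data described above.
It should be noted that the apparatus and device embodiments described above are merely illustrative. For example, the division into modules is only a division by logical function, and other divisions are possible in an actual implementation: multiple modules or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses or modules, and may be electrical, mechanical or of another form. Modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules; they may be located in one place or distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present application may be integrated into one processing module, may each exist physically on their own, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may be stored in a storage medium. On this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and performs all or some of the steps of the methods described in the embodiments of the present application.
A computer device comprises a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the processors, cause the one or more processors to perform the above method.
One or more non-volatile storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the above method.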
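The range query method, apparatus, device and storage medium for tag data provided by the present application have been described in detail above. The embodiments in the specification are described progressively; each embodiment focuses on its differences from the others, and for parts that are the same or similar the embodiments can be read against one another. For the apparatus, device and storage medium disclosed in the embodiments, the description is relatively brief because they correspond to the methods disclosed in the embodiments; for relevant details, refer to the description of the method part. It should be pointed out that a person of ordinary skill in the art can make improvements and modifications to the present application without departing from its principles, and these improvements and modifications also fall within the protection scope of the claims of the present application.
It should also be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or device that comprises the element.
A person of ordinary skill in the art will understand that all or part of the processes of the methods in the above embodiments can be carried out by computer-readable instructions instructing the relevant hardware. The computer-readable instructions can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to a memory, storage, database or other medium used in the embodiments provided by the present application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not every possible combination of the technical features in the above embodiments is described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and although they are described specifically and in detail, they should not be understood as limiting the scope of the invention patent. It should be pointed out that a person of ordinary skill in the art can make variations and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. The protection scope of this patent application is therefore defined by the appended claims.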
Claims (10)
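- A range query method for tag data, characterized by comprising: determining in advance a correspondence between attribute values of tag data and storage addresses of the tag data in a time series index file of a time series database, and building a range query data structure from the attribute values of the tag data; querying the range query data structure, according to the query range of attribute values in an input tag data range query request, to obtain the attribute values of target tag data; determining, from the correspondence, the storage address of the target tag data in the time series index file; obtaining the target tag data according to the storage address of the target tag data; and outputting the target tag data.
- The method according to claim 1, characterized in that querying the range query data structure, according to the query range of attribute values in the input tag data range query request, to obtain the attribute values of the target tag data comprises: determining, according to the query range of attribute values in the tag data range query request, whether the range query data structure contains the attribute value of the target tag data; upon determining that the range query data structure contains the attribute value of the target tag data, recording the query time of the target tag data, and performing the step of querying the range query data structure, according to the query range of attribute values in the input tag data range query request, to obtain the target tag data; and upon determining that the range query data structure does not contain the attribute value of the target tag data, recording the query time of the target tag data, adding the attribute value of the target tag data to the range query data structure, and then performing the step of querying the range query data structure, according to the query range of attribute values in the input tag data range query request, to obtain the target tag data.
- The method according to claim 2, characterized in that recording the query time of the target tag data comprises: recording the latest query time of the target tag data.
- The method according to claim 3, characterized in that recording the latest query time of the target tag data comprises: storing the attribute value of the target tag data and the latest query time of the target tag data in a hash table structure.
- The method according to claim 2, characterized in that the method further comprises: when a latest query time is recorded for the target tag data, determining that the range query data structure contains the attribute value of the target tag data; and when no latest query time is recorded for the target tag data, determining that the range query data structure does not contain the attribute value of the target tag data.
- The method according to claim 2, characterized by further comprising: when the range query data structure meets a pruning condition, pruning the range query data structure according to the latest query time of each piece of tag data in the range query data structure.
- The method according to any one of claims 1 to 6, characterized in that the range query data structure is a B+ tree structure.
- A range query apparatus for tag data, characterized by comprising: a creation unit, configured to determine in advance a correspondence between attribute values of tag data and storage addresses of the tag data in a time series index file of a time series database, and to build a range query data structure from the attribute values of the tag data; a query unit, configured to query the range query data structure, according to the query range of attribute values in an input tag data range query request, to obtain the attribute values of target tag data; a determination unit, configured to determine, from the correspondence, the storage address of the target tag data in the time series index file; an acquisition unit, configured to obtain the target tag data according to the storage address of the target tag data; and an output unit, configured to output the target tag data.
- A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the steps of the method according to any one of claims 1 to 7.
- One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the method according to any one of claims 1 to 7.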
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110984987.5A CN113434557B (zh) | 2021-08-26 | 2021-08-26 | 一种标签数据的范围查询方法、装置、设备及存储介质 |
CN202110984987.5 | 2021-08-26 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023024247A1 true WO2023024247A1 (zh) | 2023-03-02 |
Family
ID=77797928
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/127321 WO2023024247A1 (zh) | 2021-08-26 | 2021-10-29 | 一种标签数据的范围查询方法、装置、设备及存储介质 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113434557B (zh) |
WO (1) | WO2023024247A1 (zh) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117235120A (zh) * | 2023-11-09 | 2023-12-15 | 支付宝(杭州)信息技术有限公司 | 具有时序特性的超图数据存储和查询方法及装置 |
CN117560228A (zh) * | 2024-01-10 | 2024-02-13 | 西安电子科技大学杭州研究院 | 基于标签和图对齐的流式溯源图实时攻击检测方法及系统 |
CN117591577A (zh) * | 2024-01-18 | 2024-02-23 | 中核武汉核电运行技术股份有限公司 | 一种基于文件存储的核电历史数据对比方法及系统 |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113434557B (zh) * | 2021-08-26 | 2021-12-17 | 苏州浪潮智能科技有限公司 | 一种标签数据的范围查询方法、装置、设备及存储介质 |
CN117573944B (zh) * | 2024-01-17 | 2024-04-02 | 深圳十沣科技有限公司 | 数据检索方法、装置、设备及存储介质 |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103853752A (zh) * | 2012-11-30 | 2014-06-11 | 国际商业机器公司 | 管理时间序列数据库的方法和装置 |
CN109902088A (zh) * | 2019-02-13 | 2019-06-18 | 北京航空航天大学 | 一种面向流式时序数据的数据索引方法 |
WO2020047584A1 (en) * | 2018-09-04 | 2020-03-12 | Future Grid Pty Ltd | Method and system for indexing of time-series data |
US20200117763A1 (en) * | 2018-10-15 | 2020-04-16 | Ca, Inc. | Relational interval tree with distinct borders |
WO2020159397A1 (en) * | 2019-01-30 | 2020-08-06 | Siemens Aktiengesellschaft | Method and computerized device for processing numeric time series data |
CN113254451A (zh) * | 2021-06-01 | 2021-08-13 | 北京城市网邻信息技术有限公司 | 一种数据索引构建方法、装置、电子设备及存储介质 |
CN113434557A (zh) * | 2021-08-26 | 2021-09-24 | 苏州浪潮智能科技有限公司 | 一种标签数据的范围查询方法、装置、设备及存储介质 |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10007690B2 (en) * | 2014-09-26 | 2018-06-26 | International Business Machines Corporation | Data ingestion stager for time series database |
US10977234B2 (en) * | 2019-08-02 | 2021-04-13 | Timescale, Inc. | Combining compressed and uncompressed data at query time for efficient database analytics |
CN112818013B (zh) * | 2021-01-27 | 2023-07-21 | 北京百度网讯科技有限公司 | 时序数据库查询优化方法、装置、设备以及存储介质 |
CN112861022A (zh) * | 2021-02-01 | 2021-05-28 | 杭州全拓科技有限公司 | 一种基于人工智能的人员活动大数据记录查询方法 |
CN113297135A (zh) * | 2021-02-10 | 2021-08-24 | 阿里巴巴集团控股有限公司 | 数据处理方法以及装置 |
CN113111098B (zh) * | 2021-06-11 | 2021-10-29 | 阿里云计算有限公司 | 检测时序数据的查询的方法、装置及时序数据库系统 |
-
2021
- 2021-08-26 CN CN202110984987.5A patent/CN113434557B/zh active Active
- 2021-10-29 WO PCT/CN2021/127321 patent/WO2023024247A1/zh active Application Filing
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103853752A (zh) * | 2012-11-30 | 2014-06-11 | 国际商业机器公司 | 管理时间序列数据库的方法和装置 |
WO2020047584A1 (en) * | 2018-09-04 | 2020-03-12 | Future Grid Pty Ltd | Method and system for indexing of time-series data |
US20200117763A1 (en) * | 2018-10-15 | 2020-04-16 | Ca, Inc. | Relational interval tree with distinct borders |
WO2020159397A1 (en) * | 2019-01-30 | 2020-08-06 | Siemens Aktiengesellschaft | Method and computerized device for processing numeric time series data |
CN109902088A (zh) * | 2019-02-13 | 2019-06-18 | 北京航空航天大学 | 一种面向流式时序数据的数据索引方法 |
CN113254451A (zh) * | 2021-06-01 | 2021-08-13 | 北京城市网邻信息技术有限公司 | 一种数据索引构建方法、装置、电子设备及存储介质 |
CN113434557A (zh) * | 2021-08-26 | 2021-09-24 | 苏州浪潮智能科技有限公司 | 一种标签数据的范围查询方法、装置、设备及存储介质 |
Non-Patent Citations (1)
Title |
---|
FAN XINXIN: "Time series database technology system – inverted index of InfluxDB multidimensional query", 9 February 2018 (2018-02-09), XP093039924, Retrieved from the Internet <URL:http://hbasefly.com/2018/02/09/timeseries-database-5/> [retrieved on 20230418] * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117235120A (zh) * | 2023-11-09 | 2023-12-15 | 支付宝(杭州)信息技术有限公司 | 具有时序特性的超图数据存储和查询方法及装置 |
CN117560228A (zh) * | 2024-01-10 | 2024-02-13 | 西安电子科技大学杭州研究院 | 基于标签和图对齐的流式溯源图实时攻击检测方法及系统 |
CN117560228B (zh) * | 2024-01-10 | 2024-03-19 | 西安电子科技大学杭州研究院 | 基于标签和图对齐的流式溯源图实时攻击检测方法及系统 |
CN117591577A (zh) * | 2024-01-18 | 2024-02-23 | 中核武汉核电运行技术股份有限公司 | 一种基于文件存储的核电历史数据对比方法及系统 |
CN117591577B (zh) * | 2024-01-18 | 2024-05-03 | 中核武汉核电运行技术股份有限公司 | 一种基于文件存储的核电历史数据对比方法及系统 |
Also Published As
Publication number | Publication date |
---|---|
CN113434557A (zh) | 2021-09-24 |
CN113434557B (zh) | 2021-12-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2023024247A1 (zh) | 一种标签数据的范围查询方法、装置、设备及存储介质 | |
CN109299102B (zh) | 一种基于Elastcisearch的HBase二级索引系统及方法 | |
US11531682B2 (en) | Federated search of multiple sources with conflict resolution | |
WO2023087673A1 (zh) | 一种层次数据检索方法、装置和设备 | |
US10769124B2 (en) | Labeling versioned hierarchical data | |
TW202032386A (zh) | 資料儲存裝置、轉譯裝置及資料庫存取方法 | |
EP3365812A1 (en) | Create table for exchange | |
CN103595797B (zh) | 一种分布式存储系统中的缓存方法 | |
CN112800287B (zh) | 基于图数据库的全文索引方法和系统 | |
US20150205834A1 (en) | PROVIDING FILE METADATA QUERIES FOR FILE SYSTEMS USING RESTful APIs | |
US10762068B2 (en) | Virtual columns to expose row specific details for query execution in column store databases | |
CN109582831B (zh) | 一种支持非结构化数据存储与查询的图数据库管理系统 | |
US9659023B2 (en) | Maintaining and using a cache of child-to-parent mappings in a content-addressable storage system | |
WO2023232120A1 (zh) | 数据处理方法、电子设备及存储介质 | |
JP2022543306A (ja) | ブロックチェーンデータ処理の方法、装置、機器及び可読記憶媒体 | |
WO2023179787A1 (zh) | 分布式文件系统的元数据管理方法和装置 | |
CN111221785A (zh) | 一种多源异构数据的语义数据湖构建方法 | |
US20240078234A1 (en) | Apparatus, method and storage medium for database pagination | |
US20220035820A1 (en) | Storage structure of data object, method and system for storing and dynamically managing data object on computer, and storage medium and electronic device | |
US20170316042A1 (en) | Index page with latch-free access | |
CN113704248B (zh) | 一种基于外置索引的区块链查询优化方法 | |
CN107273443B (zh) | 一种基于大数据模型元数据的混合索引方法 | |
CN113779068B (zh) | 数据查询方法、装置、设备及存储介质 | |
CN114048219A (zh) | 图数据库更新方法及装置 | |
Xu et al. | Skia: Scalable and efficient in-memory analytics for big spatial-textual data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21954767 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 21954767 Country of ref document: EP Kind code of ref document: A1 |