CN116680276A - Data tag storage management method, device, equipment and storage medium - Google Patents

Data tag storage management method, device, equipment and storage medium

Info

Publication number
CN116680276A
Authority
CN
China
Prior art keywords
data
query
data tag
tag
algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310765129.0A
Other languages
Chinese (zh)
Inventor
孙涛
毛晓霖
李征文
李梦薇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Bowo Wisdom Technology Co ltd
Original Assignee
Shenzhen Bowo Wisdom Technology Co ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2272 Management thereof
    • G06F16/2246 Trees, e.g. B+trees
    • G06F16/2255 Hash tables
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/2455 Query execution
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of data storage and discloses a data tag storage management method, a device, equipment and a storage medium. The data tag storage management method comprises the following steps: acquiring related data information of a selected subject object; dividing the related data information into different groups of data tags based on its attributes, wherein, when the related data information meets a preset trigger rule, a trigger mode is selected for each group according to the attributes, scripts and interfaces of that group's data tags; and, after the trigger mode has been selected for each group of data tags, automatically triggering the corresponding data tags to perform storage management operations.

Description

Data tag storage management method, device, equipment and storage medium
Technical Field
The present invention relates to the field of data storage technologies, and in particular, to a method, an apparatus, a device, and a storage medium for managing data tag storage.
Background
With the rapid development of information technology, data storage and management become more and more important. In many practical application scenarios, the efficiency and reliability of data storage has a significant impact on the performance and stability of the system. Data tags play a key role in data storage management by assigning specific tags to data to facilitate indexing, querying, and management.
At present, traditional data tag storage management follows one of two approaches. In the first, the data tag is stored together with the data object, typically by adding tag attributes to the object; if one data object carries several tag attributes, several attribute fields must be added to identify it, and the tags and the data object are stored together. The second is the ordinary (non-reversed) key-value tag storage mode, in which the key of the key-value mapping is a data object or data object identifier and the value is a data tag.
These traditional methods are inefficient, particularly when the volume of data objects is large. With a large body of customer information, for example, an inefficient tag management method introduces long delays in customer tag management, which can seriously degrade the timeliness of customer classification and, in turn, the fine-grained management and customer experience that depend on that classification. In personalized pushing, for instance, the pushed information may already be past its validity period when it arrives, which seriously harms the customer experience.
Therefore, how to provide an efficient data tag storage management method is a problem to be solved by those skilled in the art.
Disclosure of Invention
The invention provides a data tag storage management method, a device, equipment and a storage medium, which are used for solving the technical problems.
The first aspect of the present invention provides a data tag storage management method, which includes:
acquiring related data information of a selected subject object; dividing the related data information into different groups of data tags based on its attributes, wherein, when the related data information meets a preset trigger rule, a trigger mode is selected for each group according to the attributes, scripts and interfaces of that group's data tags; and, after the trigger mode has been selected for each group of data tags, automatically triggering the corresponding data tags to perform storage management operations.
Optionally, in a first implementation manner of the first aspect of the present invention, after the triggering manner is selected by the data tags of different groups, a step of automatically triggering the corresponding data tag to perform a storage management operation includes:
positioning and searching the data tag through a quick searching algorithm to obtain an actual searching result, and adjusting parameters of the quick searching algorithm according to the actual searching result; wherein, the algorithm adopted in the adjustment is based on a dynamic error query threshold adjustment algorithm;
According to the actual searching result, inquiring and storing the data tag through a prefix tree and/or a hash table, and utilizing a self-adaptive mixed structure strategy, and selecting an optimal data structure in real time according to the property of the current data tag when inquiring each time;
partitioning the data labels with different access frequencies based on a locality principle, giving weights, and optimizing the data access layering according to the weights;
and managing the query cache through an elimination algorithm, and introducing a dynamic replacement strategy according to a real-time access mode to adaptively update the query cache.
Optionally, in a second implementation manner of the first aspect of the present invention, the performing, by using a fast search algorithm, a locating search on the data tag to obtain an actual search result, and dynamically adjusting parameters of the fast search algorithm according to the actual search result, where the method includes:
selecting a corresponding quick search algorithm according to the input data tag, wherein the quick search algorithm is used for locating and searching the subsequent data tag;
setting an initial false query threshold, wherein the initial false query threshold is used for judging whether parameters of a quick search algorithm need to be adjusted or not;
Positioning and searching the input data tag by using a quick searching algorithm to obtain an actual searching result;
based on a dynamic false query threshold adjustment algorithm, periodically evaluating the false query rate in the actual search result, comparing the false query rate with a set false query threshold, and if the false query rate is lower than the set false query threshold, not needing adjustment; and if the false query rate is higher than the set false query threshold, adjusting parameters of the quick search algorithm.
Optionally, in a third implementation manner of the first aspect of the present invention, according to the actual search result, the querying and storing the data tag through a prefix tree and/or a hash table, and using an adaptive hybrid structure policy, selecting, in real time, an optimal data structure according to a property of a current data tag when querying each time, includes:
initializing an empty prefix tree structure according to an input data set, wherein the empty prefix tree structure is used for inquiring and storing subsequent data labels;
evaluating the property of each data label to be queried, and selecting a prefix tree to be used as an optimal data structure according to an evaluation result by using an adaptive mixed structure strategy;
if the prefix tree is selected, then for a query, the data tag is matched character by character starting from the root node until a termination condition is reached; for storage, nodes are created or updated starting from the root node, following the character sequence of the data tag, until all characters are inserted;
updating and optimizing the self-adaptive mixed structure strategy according to the actual query result;
the query and stored performance metrics are periodically monitored and, as performance decreases, the adaptive hybrid architecture strategy is adjusted to accommodate new data conditions.
Optionally, in a fourth implementation manner of the first aspect of the present invention, according to the actual search result, the querying and storing, by using a prefix tree and/or a hash table, the data tag, and using an adaptive hybrid structure policy, selecting, in real time, an optimal data structure according to a property of a current data tag when querying each time, further includes:
initializing an empty hash table structure according to an input data set, wherein the empty hash table structure is used for inquiring and storing subsequent data labels;
if the hash table is selected, then for a query, the hash value of the data tag is calculated and the corresponding storage position is located directly; for storage, the hash value of the data tag is calculated and the data is stored in the hash table at the position given by that hash value;
Updating and optimizing the self-adaptive mixed structure strategy according to the actual query result;
the query and stored performance metrics are periodically monitored and, as performance decreases, the adaptive hybrid architecture strategy is adjusted to accommodate new data conditions.
Optionally, in a fifth implementation manner of the first aspect of the present invention, the partitioning and weighting according to the tag with a higher access frequency by using a locality principle, and optimizing the data access hierarchy according to the weight includes:
collecting and analyzing the input data set and determining the access frequency of the data tag;
based on the locality principle, grouping the data tags according to the access frequency;
the data labels of different groups are assigned with weights, and the weight value is in direct proportion to the access frequency of the data labels;
and performing tiered processing of data access operations according to the weight values, wherein tiered processing means that data with high weight values is stored preferentially on faster storage devices, and data with low weight values is stored on slower storage devices.
Optionally, in a sixth implementation manner of the first aspect of the present invention, the managing the query cache through the elimination algorithm, and introducing a dynamic replacement policy according to a real-time access mode, adaptively updating the query cache includes:
Establishing a query cache and selecting a corresponding elimination algorithm, wherein the query cache is used for storing data tags with high access frequency;
in the data access process, loading the data tag with high access frequency into a query cache;
and periodically checking whether the cache reaches the maximum capacity, and executing a cache elimination strategy if the cache reaches the maximum capacity.
Removing the data tag meeting the elimination condition from the cache according to the selected elimination algorithm;
collecting and analyzing real-time access mode data, and detecting whether the data access mode changes or not;
introducing a dynamic replacement strategy for the query cache according to the real-time access mode analysis result;
periodically monitoring the performance of the query cache and evaluating the effect of the elimination algorithm and the dynamic replacement strategy; and optimizing parameters of the elimination algorithm, the dynamic replacement strategy and the cache capacity according to the evaluated effect.
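The cache steps above can be illustrated with a minimal sketch. The text does not name a specific elimination algorithm, so least-recently-used (LRU) eviction is assumed here; the class name `LRUQueryCache` and its methods are hypothetical and are not taken from the invention.

```python
from collections import OrderedDict


class LRUQueryCache:
    """Query cache with LRU eviction (LRU is an assumed elimination algorithm)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()  # data tag -> cached query result

    def get(self, tag):
        """Return the cached result for a tag, or None on a cache miss."""
        if tag not in self.entries:
            return None
        self.entries.move_to_end(tag)  # mark the tag as most recently used
        return self.entries[tag]

    def put(self, tag, result) -> None:
        """Cache a result, evicting the least recently used tags when full."""
        self.entries[tag] = result
        self.entries.move_to_end(tag)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the oldest entry


cache = LRUQueryCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" becomes the most recently used tag
cache.put("c", 3)  # capacity exceeded, so "b" is evicted
```

A dynamic replacement strategy in the sense of the text could swap this eviction rule for another one when the observed access pattern changes; that part is not sketched here.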
A second aspect of the present invention provides a data tag storage management apparatus comprising:
the acquisition module is used for acquiring related data information of the main body object according to the selected main body object;
the classification module is used for dividing the related data information with different attributes into data labels with different groups based on the attributes of the related data information, wherein the triggering modes of the data labels comprise different triggering modes according to the attributes, scripts and interfaces of the data labels with different groups when the related data information meets the preset triggering rules;
And the triggering module is used for automatically triggering the corresponding data tag to carry out storage management operation after the data tags in different groups select the triggering mode.
A third aspect of the present invention provides a data tag storage management apparatus comprising: a memory and at least one processor, the memory having instructions stored therein; the at least one processor invokes the instructions in the memory to cause the data tag storage management apparatus to perform the data tag storage management method described above.
A fourth aspect of the present invention provides a computer-readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the above-described data tag storage management method.
The technical scheme provided by the invention has the following beneficial effects: related data information of a selected subject object is acquired; the related data information is divided into different groups of data tags based on its attributes, wherein, when the related data information meets a preset trigger rule, a trigger mode is selected for each group according to the attributes, scripts and interfaces of that group's data tags; and finally, after the trigger mode has been selected for each group of data tags, the corresponding data tags are automatically triggered to perform storage management operations. The invention thereby enables accurate query, monitoring and automated processing at the enterprise level, which helps optimize business processes and improve their efficiency, accuracy and response speed.
Drawings
FIG. 1 is a schematic diagram of an embodiment of a data tag storage management method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of another embodiment of a data tag storage management method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an embodiment of a data tag storage management apparatus according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention provide a data tag storage management method, apparatus, device, and storage medium, where the terms "first," "second," "third," "fourth," etc. (if any) in the description and claims of the present invention and the above figures are used for distinguishing similar objects, and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
For ease of understanding, a specific flow of an embodiment of the present invention is described below with reference to fig. 1, and one embodiment of a data tag storage management method in an embodiment of the present invention includes:
before describing the embodiments of the present invention in detail, the inventive concepts of the present invention will be briefly described: first, data information associated with a particular subject object (e.g., user, device, etc.) is obtained. Such data information may include, but is not limited to, user behavior data, device status data, environmental data, and the like.
The relevant data information is classified, archived and distributed to different sets of data tags according to the attributes (such as type, source, time, etc.) of the data information. This helps organize and manage the data for subsequent analysis and processing.
The triggering mode of the data tag refers to selecting different triggering modes according to the attribute, script and interface of the data tag of different groups when the preset triggering rule is met. This means that under certain conditions, for example when the amount of data reaches a certain threshold or when a certain event occurs, the data tag will trigger a corresponding operation, such as storing, updating or deleting data.
Finally, when the trigger rule is satisfied, the data tags of the different groups automatically trigger the corresponding storage management operation (e.g., storing, updating, or deleting the relevant data information). This automation helps manage the data information effectively and efficiently, and keeps the data up to date and accurate.
Step 101, acquiring related data information of a selected main object according to the main object;
it will be appreciated that the execution subject of the present invention may be a data tag storage management device, and may also be a terminal or a server, which is not limited herein. The embodiment of the invention is described taking the data tag storage management device as an execution subject.
Specifically, the refinement process of acquiring relevant data information from the selected subject object:
First, the subject objects whose data is to be acquired (such as users, devices, or websites) and the related data types (such as user behavior, device status, or website click-through rate) are identified. Related data information of the subject object is then collected more comprehensively through several data acquisition methods, such as web crawlers, sensor data, and active recording of user behavior.
Text mining and natural language processing are then applied to the acquired information to process and filter the raw data and extract the required core data. After extraction, preprocessing operations are performed, including removing duplicate data, handling null values, and converting units, to ensure the accuracy and consistency of the data and lay a foundation for subsequent analysis and processing.
Related data of the subject object is supplemented and enriched from different dimensions through APIs or third-party data sources. For example, device location information may be obtained through a geographic location API, or social network information for a user may be collected through a social media API.
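The preprocessing operations named above (removing duplicate data, handling null values, converting units) can be sketched as follows. This is only an illustration under assumed inputs: the record fields `id` and `temp_f`, the Fahrenheit-to-Celsius conversion, and the function name `preprocess` are hypothetical and do not come from the invention.

```python
def preprocess(records):
    """Drop null readings and duplicate ids, then convert Fahrenheit to Celsius.

    Each record is assumed to look like {"id": str, "temp_f": float or None}.
    """
    seen_ids = set()
    cleaned = []
    for rec in records:
        if rec.get("temp_f") is None:  # null-value handling: skip the record
            continue
        if rec["id"] in seen_ids:      # duplicate removal: keep the first copy
            continue
        seen_ids.add(rec["id"])
        celsius = (rec["temp_f"] - 32) * 5 / 9  # unit conversion
        cleaned.append({"id": rec["id"], "temp_c": round(celsius, 2)})
    return cleaned


raw = [
    {"id": "a", "temp_f": 212.0},
    {"id": "a", "temp_f": 212.0},  # duplicate of the first record
    {"id": "b", "temp_f": None},   # null reading
    {"id": "c", "temp_f": 32.0},
]
core = preprocess(raw)
```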
Step 102, dividing the related data information into different groups of data tags based on its attributes, wherein, when the related data information meets a preset trigger rule, a trigger mode is selected for each group according to the attributes, scripts and interfaces of that group's data tags;
specifically, first, the attribute of the related data information is defined and identified. These attributes may be data types (e.g., text, numbers, or images), data sources (e.g., user input, data acquired by a third party API, etc.), or other features (e.g., update frequency of data, etc.).
Then, a preset trigger rule is designed. These rules should be determined based on the attributes of the data information. For example, if the data source is a particular channel, or the data type is an image, a set of data tags is triggered; if the data update frequency is high, another group of data tags is triggered.
A separate attribute, script, and interface is defined for each group of data tags. The attributes may be the name, description, color, etc. of the data tag; the script may control the operations performed by the tag upon triggering, such as data visualization, message notification, or other tasks; the interface may be a way to interact with an external system or service, such as calling an API, sending a request, etc.
When the related data information meets the preset triggering rules, different triggering modes are selected according to the attributes, scripts and interfaces of the data labels of different groups. The method has certain flexibility, so that the system can select the most suitable triggering mode according to actual requirements, and automatic classification and processing of data information aiming at different attributes are realized.
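One way to picture the grouping and trigger-rule selection described in step 102 is the sketch below. It assumes each group carries a rule (a predicate over the data information) and a trigger mode; the names `TagGroup`, `GROUPS`, and `classify`, the example rules, and the numeric limit are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TagGroup:
    name: str
    rule: Callable[[dict], bool]  # preset trigger rule for this group
    mode: str                     # trigger mode: "attribute", "script", or "interface"


# Hypothetical groups mirroring the examples in the text:
# one keyed on data type, one keyed on update frequency.
GROUPS = [
    TagGroup("image_assets", lambda d: d.get("type") == "image", "script"),
    TagGroup("fast_changing", lambda d: d.get("updates_per_hour", 0) > 100, "interface"),
]


def classify(info: dict) -> list:
    """Return (group name, trigger mode) for every group whose rule matches."""
    return [(g.name, g.mode) for g in GROUPS if g.rule(info)]


matches = classify({"type": "image", "updates_per_hour": 500})
```

Keeping the rule and the trigger mode on the group makes it possible to add a new group without changing `classify`.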
Step 103, after the trigger mode has been selected for each group of data tags, automatically triggering the corresponding data tags to perform storage management operations.
Specifically, first, various trigger conditions are defined, such as the data reaching a certain threshold, a data type variation, a time period, etc., so as to automatically trigger the corresponding data tag when these conditions are satisfied.
And then different data storage structures, such as relational databases, non-relational databases or file systems, are designed for each group of data tags. This will facilitate partitioning of data according to different attributes into the appropriate storage environment.
And when the data meets the triggering condition, automatically inputting or updating the data into the corresponding data storage environment through the corresponding API, SDK or other methods according to the defined storage structure.
Finally, according to the script and the interface of the data label, storage management operations such as data backup, cleaning, compression, encryption, decryption, indexing and the like are realized. This helps to ensure the integrity and reliability of the data.
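As a rough illustration of a threshold-type trigger condition in step 103, the sketch below buffers records per tag group and persists them once the buffer reaches a threshold. The class `TriggeredStore` is hypothetical, and the in-memory `stored` dictionary merely stands in for a relational database, non-relational database, or file system.

```python
class TriggeredStore:
    """Buffers records per tag group and stores them when a threshold is reached."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.pending = {}  # group -> records waiting for the trigger
        self.stored = {}   # group -> persisted records (stand-in for a real backend)

    def add(self, group: str, record) -> bool:
        """Buffer a record; return True if the storage operation was triggered."""
        buffer = self.pending.setdefault(group, [])
        buffer.append(record)
        if len(buffer) >= self.threshold:  # trigger condition met
            self.stored.setdefault(group, []).extend(buffer)
            buffer.clear()
            return True
        return False


store = TriggeredStore(threshold=2)
first = store.add("device_status", {"cpu": 0.4})
second = store.add("device_status", {"cpu": 0.9})
```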
The embodiment of the invention has the following beneficial effects: related data information of a selected subject object is acquired; the related data information is divided into different groups of data tags based on its attributes, wherein, when the related data information meets a preset trigger rule, a trigger mode is selected for each group according to the attributes, scripts and interfaces of that group's data tags; and finally, after the trigger mode has been selected for each group of data tags, the corresponding data tags are automatically triggered to perform storage management operations. The invention thereby enables accurate query, monitoring and automated processing at the enterprise level, which helps optimize business processes and improve their efficiency, accuracy and response speed.
Referring to fig. 2, another embodiment of a data tag storage management method according to an embodiment of the present invention includes:
step 201: positioning and searching the data tag through a quick searching algorithm to obtain an actual searching result, and adjusting parameters of the quick searching algorithm according to the actual searching result; wherein, the algorithm adopted in the adjustment is based on a dynamic error query threshold adjustment algorithm;
specifically, the following is a specific process of locating the data tag by a quick search algorithm and adjusting parameters according to an actual search result, and includes the following refinement steps:
First, the data tags are preprocessed (for example, sorted or partitioned) to suit the requirements of the fast search algorithm, and the initial parameters of the fast search algorithm are set, including the query threshold, similarity measure, and weights.
A fast search algorithm (such as binary search or hash search) is then used to locate the target data tag, and the search result and related performance indicators (such as time and space cost) are recorded. Next, based on the dynamic error query threshold adjustment algorithm, the actual search result is compared with the expected target to calculate an error value, and the error values are averaged over a sliding window. According to the trend of the error value, a suitable parameter adjustment strategy is designed, such as dynamically adjusting the query threshold or weights, to optimize the search process.
Performance indicators are monitored in real time and displayed visually during the search, so that system anomalies can be found quickly and performance optimization is made easier. The parameters of the search algorithm are adjusted automatically, through methods such as machine learning, to adapt to continuously changing data sets and query scenarios.
Finally, distributed computing and multi-core processing are used to execute search tasks in parallel, improving search efficiency and system throughput. User feedback is introduced to measure the actual effect of the search results, and factors such as user satisfaction are considered when adjusting the search algorithm.
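The sliding-window error evaluation in step 201 might look like the sketch below. It assumes each query can be labeled as erroneous or not, and that adjustment means doubling a hypothetical `max_candidates` parameter; neither the class `ErrorRateMonitor` nor the function `tune` is defined by the invention.

```python
from collections import deque


class ErrorRateMonitor:
    """Tracks the erroneous-query rate over a sliding window of recent queries."""

    def __init__(self, threshold: float, window: int):
        self.threshold = threshold
        self.results = deque(maxlen=window)  # True marks an erroneous query

    def record(self, is_error: bool) -> None:
        self.results.append(is_error)  # the oldest entry drops out automatically

    def error_rate(self) -> float:
        return sum(self.results) / len(self.results) if self.results else 0.0

    def needs_adjustment(self) -> bool:
        # Adjust only when the rate is above the threshold, as in the text.
        return self.error_rate() > self.threshold


def tune(params: dict, monitor: ErrorRateMonitor) -> dict:
    """Hypothetical adjustment: widen the search when errors exceed the threshold."""
    if monitor.needs_adjustment():
        return {**params, "max_candidates": params["max_candidates"] * 2}
    return params


monitor = ErrorRateMonitor(threshold=0.25, window=4)
for outcome in [False, False, True, False]:
    monitor.record(outcome)
unchanged = tune({"max_candidates": 8}, monitor)  # rate 0.25 is not above 0.25
monitor.record(True)                              # window is now F, T, F, T
widened = tune({"max_candidates": 8}, monitor)    # rate 0.5 is above 0.25
```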
Step 202: according to the actual searching result, inquiring and storing the data tag through a prefix tree and/or a hash table, and utilizing a self-adaptive mixed structure strategy, and selecting an optimal data structure in real time according to the property of the current data tag when inquiring each time;
specifically, the following is a specific process of querying and storing the data tag through the prefix tree and/or the hash table according to the actual search result, and selecting the optimal data structure by utilizing the adaptive hybrid structure policy, which comprises the following refinement steps:
First, the data tags are classified: the attributes of each data tag (e.g., length, character distribution) are analyzed, and the tags are divided into those suited to prefix tree storage (e.g., long strings with common prefixes) and those suited to hash table storage (e.g., short strings or tags with no obvious prefix relationship). A prefix tree and a hash table are then built for the classified data tags, and each tag is stored in the corresponding data structure.
An adaptive hybrid structure strategy is then developed to select the best data structure (prefix tree or hash table) in real time at each query, based on the nature of the current data tag. The performance metrics of executed queries (e.g., query time, success rate) are recorded and monitored so that the adaptive strategy can be adjusted based on the performance data. The features the adaptive strategy uses to select a query mode, such as tag length and character types, are optimized, and the feature weights are updated in real time according to the performance metrics.
Lightweight switching logic is designed to move quickly between the prefix tree and the hash table and minimize performance loss during queries. The state of the data tags in each data structure, such as access frequency and count, is analyzed periodically, and the storage structure is reorganized based on this information to improve query efficiency. Redundant data in the prefix tree and hash table is then compressed and optimized, further reducing storage space and query time.
Finally, caching is used to hold hot data tags and query results temporarily in fast-access memory, reducing query time and system overhead.
The embodiment of the invention realizes the self-adaptive query and storage of the data tag, and ensures that the most suitable data structure can be quickly selected under different scenes so as to meet the performance requirement. In the whole process, the query system is more efficient and stable by monitoring the performance index and optimizing the storage structure.
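The per-query selection described in step 202 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the two features (tag length and shared-prefix ratio), the equal feature weights, and the 0.5 decision cutoff are all assumptions chosen for illustration, and the real-time weight update driven by performance metrics is left out.

```python
def shared_prefix_ratio(tag, known_tags, prefix_len=3):
    """Fraction of known tags sharing the first `prefix_len` characters with `tag`."""
    if not known_tags:
        return 0.0
    prefix = tag[:prefix_len]
    return sum(t.startswith(prefix) for t in known_tags) / len(known_tags)


def choose_structure(tag, known_tags, weights=None, length_cutoff=8):
    """Return "trie" or "hash" for one tag from a weighted feature score.

    The weights and cutoffs are hypothetical defaults, not values from the source.
    """
    weights = weights or {"length": 0.5, "prefix": 0.5}
    # Long tags and tags with many prefix neighbours favour the prefix tree.
    length_score = 1.0 if len(tag) >= length_cutoff else 0.0
    prefix_score = shared_prefix_ratio(tag, known_tags)
    score = weights["length"] * length_score + weights["prefix"] * prefix_score
    return "trie" if score >= 0.5 else "hash"
```

In a fuller version, the `weights` dictionary would be the natural place to apply the feature-weight updates that the text ties to monitored query time and success rate.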
Step 203: based on the locality principle, partitioning data tags with different access frequencies, assigning weights, and optimizing the data access hierarchy according to the weights;
specifically, partitioning and weighting data tags with different access frequencies based on the locality principle, and optimizing the data access hierarchy according to the weights, comprises the following refinement steps:
First, the access records of the data tags are monitored and access frequencies are counted, forming access history data in preparation for subsequent partitioning and weight assignment. Based on the locality principle, the data tags are partitioned by access frequency: tags with high access frequency go into the hot data area, and tags with low access frequency go into the cold data area. Each data partition is then assigned a weight, higher for the hot data area and lower for the cold data area, and the weights are adjusted dynamically according to actual access. An adaptive loading strategy is implemented so that, during data access, high-weight hot data is loaded first according to the weight of the partition in which the data tag resides.
A dynamic threshold adjustment strategy is designed: the access trends of the hot and cold data areas are monitored, and the threshold dividing hot data from cold data is adjusted in real time as access frequencies change.
A multi-level caching mechanism is introduced, storing data in caches of different levels according to how hot it is, to further optimize access performance and system resource utilization. An asynchronous access and preloading strategy is implemented: data likely to be accessed is preloaded in the background based on the access history data and a prediction model, reducing user waiting time.
The performance and access frequency of each data partition are evaluated periodically, and some data is migrated to areas of different weights as needed to maintain an optimal hierarchy.
Data association features are identified using digital fingerprint technology, and data with similar access frequencies is clustered according to these features, improving the access success rate.
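The hot/cold partitioning with a moving threshold can be sketched as follows. This is an illustrative Python example under stated assumptions: the source does not say how the threshold is computed, so the sketch takes it to be the access count at a fixed rank (`hot_fraction` of the distinct tags), and the function name and parameters are hypothetical.

```python
from collections import Counter


def partition_tags(access_log, hot_fraction=0.2):
    """Split tags into hot and cold sets by access count.

    The threshold is recomputed from the log on every call, so it moves as
    access frequencies change. Ties at the threshold all land in the hot set.
    """
    counts = Counter(access_log)
    if not counts:
        return set(), set(), 0
    ranked = sorted(counts.values(), reverse=True)
    hot_slots = max(1, round(len(ranked) * hot_fraction))
    threshold = ranked[hot_slots - 1]
    hot = {tag for tag, n in counts.items() if n >= threshold}
    cold = set(counts) - hot
    return hot, cold, threshold
```

Calling the function again after new accesses arrive re-derives the threshold, so a tag that was hot can fall into the cold area once another tag overtakes it.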
Step 204: managing the query cache through an elimination (eviction) algorithm, and introducing a dynamic replacement strategy according to the real-time access mode to adaptively update the query cache.
Specifically, managing the query cache through the elimination algorithm and introducing a dynamic replacement strategy according to the real-time access mode to adaptively update the query cache comprises the following refinement steps:
Cache initialization and configuration: the total cache size, the partitioning scheme, and a suitable elimination algorithm (such as LRU or LFU) are determined.
Request interception and analysis: the query request is intercepted before it accesses the data, and its characteristics are analyzed in preparation for the subsequent cache access.
Cache lookup and cache hit: after the request is intercepted, the cache is first checked for the query. On a hit, the cached result is returned directly and the hit is counted; on a miss, the data search continues.
Real-time access pattern recognition: the access records are analyzed in real time to determine the current access mode (such as hot spot access, cyclic access, or random access).
Dynamic replacement strategy: a replacement strategy matching the real-time access mode is introduced. For example, in hot spot access mode, data with high access frequency is retained preferentially; in cyclic access mode, data within the cycle period is retained preferentially; in random access mode, data with higher access frequency is retained to a moderate degree.
Cache elimination and updating: when the cache capacity reaches a threshold, part of the cached data is eliminated according to the selected elimination algorithm and dynamic replacement strategy, and the new query result is added to the cache.
Cache performance monitoring and assessment: cache performance metrics, such as hit rate and access latency, are collected to evaluate the effectiveness of the cache configuration and replacement strategy.
Adaptive cache adjustment: cache parameters such as cache capacity, elimination algorithm, and replacement strategy are adjusted dynamically according to the cache performance metrics to match the actual access characteristics.
Distributed and multi-level cache expansion: distributed and multi-level caching strategies are applied to further improve cache management efficiency. For example, caches are provided at several storage levels, such as memory, local disk, and remote server.
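The access pattern recognition step above can be sketched in Python as a classifier over a window of recent accesses. The detection rules are assumptions, since the source names the three modes but not how they are detected: here "hotspot" means one tag accounts for at least `hot_share` of the window, and "cyclic" means the whole window repeats with some period. The policy labels in `POLICY_FOR_PATTERN` are likewise hypothetical.

```python
from collections import Counter


def detect_period(window):
    """Return the smallest period p >= 2 such that window[i] == window[i - p], else None."""
    n = len(window)
    for p in range(2, n // 2 + 1):
        if all(window[i] == window[i - p] for i in range(p, n)):
            return p
    return None


def classify_access_pattern(window, hot_share=0.5):
    """Label a window of recent tag accesses as hotspot, cyclic, or random."""
    if not window:
        return "random"
    top_count = Counter(window).most_common(1)[0][1]
    if top_count / len(window) >= hot_share:
        return "hotspot"
    if detect_period(window) is not None:
        return "cyclic"
    return "random"


# Hypothetical mapping from detected mode to a replacement strategy label.
POLICY_FOR_PATTERN = {"hotspot": "LFU", "cyclic": "retain-cycle", "random": "LRU"}
```

A cache manager could look up `POLICY_FOR_PATTERN[classify_access_pattern(window)]` each time it re-evaluates the window and switch strategies only when the label changes.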
In the embodiment of the invention, the beneficial effects are as follows: the data tag is located and searched through a quick search algorithm to obtain an actual search result, and the parameters of the quick search algorithm are adjusted according to the actual search result, the adjustment being based on a dynamic false query threshold adjustment algorithm; the data tag is queried and stored through a prefix tree and/or a hash table according to the actual search result, and an adaptive hybrid structure strategy selects the optimal data structure in real time at each query according to the properties of the current data tag; data tags with different access frequencies are then partitioned and weighted based on the locality principle, and the data access hierarchy is optimized according to the weights; finally, the query cache is managed through an elimination algorithm, and a dynamic replacement strategy is introduced according to the real-time access mode to update the query cache adaptively. By adopting a quick search algorithm together with a dynamic false query threshold adjustment algorithm, the invention improves search efficiency, realizes adaptive adjustment, and maintains efficient search performance. Adjusting the algorithm parameters in real time adapts the search to different scenarios and reduces computational complexity and system consumption. Querying and storing data tags with the prefix tree and the hash table, combined with an adaptive hybrid structure strategy that selects the optimal data structure in real time according to the properties of the data tag, improves the utilization of the storage structures and further speeds up data access.
Based on the locality principle, the method partitions data tags with different access frequencies, assigns weights, and optimizes the data access hierarchy according to the weights. This favors storing frequently accessed data on high-speed storage media, makes fuller use of storage resources, and significantly improves data access performance. Managing the query cache through an elimination algorithm and introducing a dynamic replacement strategy allows cache performance to be optimized continuously. Even with limited cache capacity, both cache hit rate and access speed can be maintained, further improving system operating efficiency.
Another embodiment of the data tag storage management method in the embodiment of the present invention includes:
performing a location search on the data tag through a quick search algorithm to obtain an actual search result, and dynamically adjusting parameters of the quick search algorithm according to the actual search result, comprises:
selecting a corresponding quick search algorithm according to the input data tag, wherein the quick search algorithm is used for the subsequent location search of the data tag;
setting an initial false query threshold, wherein the initial false query threshold is used to determine whether the parameters of the quick search algorithm need to be adjusted;
performing a location search on the input data tag using the quick search algorithm to obtain an actual search result;
based on a dynamic false query threshold adjustment algorithm, periodically evaluating the false query rate in the actual search results and comparing it with the set false query threshold; if the false query rate is lower than the set false query threshold, no adjustment is needed; if the false query rate is higher than the set false query threshold, the parameters of the quick search algorithm are adjusted.
In the embodiment of the invention, the beneficial effects are as follows: the embodiment of the invention realizes quick search of data tags with dynamic parameter adjustment and optimizes the performance of the search algorithm. Performance metrics and user experience are attended to at every stage, ensuring that the search system is efficient, stable, and meets user needs.
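The periodic comparison of the false query rate against the threshold can be sketched as below. This Python example is only an illustration: the source does not name the quick search algorithm or its parameters, so `probe_depth` is a hypothetical tunable, doubling it is an arbitrary adjustment rule, and a "false query" is assumed to be a lookup whose returned tag differs from the expected one.

```python
from dataclasses import dataclass


@dataclass
class SearchTuner:
    threshold: float = 0.05    # initial false query threshold (assumed value)
    probe_depth: int = 4       # hypothetical parameter of the search algorithm
    max_probe_depth: int = 64  # upper bound so the parameter cannot grow forever

    def evaluate(self, results):
        """Evaluate one period of (returned_tag, expected_tag) pairs.

        Returns the false query rate and, if it exceeds the threshold,
        adjusts `probe_depth`; otherwise leaves the parameter unchanged.
        """
        if not results:
            return 0.0
        wrong = sum(1 for got, expected in results if got != expected)
        rate = wrong / len(results)
        if rate > self.threshold:
            self.probe_depth = min(self.probe_depth * 2, self.max_probe_depth)
        return rate
```

The cap on `probe_depth` reflects a common design choice for feedback loops of this kind: the adjustment stays bounded even if the false query rate remains high for many periods.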
Another embodiment of the data tag storage management method in the embodiment of the present invention includes:
querying and storing the data tag through a prefix tree and/or a hash table according to the actual search result and, using an adaptive hybrid structure strategy, selecting the optimal data structure in real time at each query according to the properties of the current data tag, comprises:
initializing an empty prefix tree structure according to an input data set, wherein the empty prefix tree structure is used for subsequent query and storage of data tags;
evaluating the properties of each data tag to be queried and, using the adaptive hybrid structure strategy, determining from the evaluation result whether the prefix tree is the optimal data structure;
if the prefix tree is selected: for a query, matching step by step from the root node along the character sequence of the data tag until a termination condition is reached; for storage, creating or updating nodes from the root node along the character sequence of the data tag until all characters are inserted;
updating and optimizing the adaptive hybrid structure strategy according to the actual query result;
periodically monitoring query and storage performance metrics and, when performance degrades, adjusting the adaptive hybrid structure strategy to accommodate new data conditions.
In the embodiment of the invention, the beneficial effects are as follows: the embodiment of the invention realizes adaptive query and storage of data tags and ensures that the most suitable data structure can be selected quickly in different scenarios to meet performance requirements. Throughout the process, monitoring performance metrics and optimizing the storage structure make the query system more efficient and stable.
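The prefix tree operations in this embodiment (character-by-character matching from the root for a query, node creation from the root for storage) can be sketched in Python as follows. The class and method names are illustrative, not taken from the source, and the termination condition is assumed to be either running out of characters or hitting a missing child node.

```python
class TrieNode:
    __slots__ = ("children", "is_tag")

    def __init__(self):
        self.children = {}   # maps one character to the child node
        self.is_tag = False  # True if a stored tag ends at this node


class TagTrie:
    def __init__(self):
        self.root = TrieNode()  # the empty prefix tree

    def insert(self, tag):
        """Create or reuse nodes from the root until every character is inserted."""
        node = self.root
        for ch in tag:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_tag = True

    def _walk(self, text):
        """Match character by character; stop early when a child is missing."""
        node = self.root
        for ch in text:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, tag):
        node = self._walk(tag)
        return node is not None and node.is_tag

    def with_prefix(self, prefix):
        """Return all stored tags that start with `prefix`, sorted."""
        start = self._walk(prefix)
        if start is None:
            return []
        found, stack = [], [(start, prefix)]
        while stack:
            node, text = stack.pop()
            if node.is_tag:
                found.append(text)
            for ch, child in node.children.items():
                stack.append((child, text + ch))
        return sorted(found)
```

The `with_prefix` method shows why tags with common prefixes suit this structure: one walk to the prefix node reaches every matching tag without scanning unrelated entries.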
Another embodiment of the data tag storage management method in the embodiment of the present invention includes:
querying and storing the data tag through a prefix tree and/or a hash table according to the actual search result and, using an adaptive hybrid structure strategy, selecting the optimal data structure in real time at each query according to the properties of the current data tag, further comprises:
initializing an empty hash table structure according to an input data set, wherein the empty hash table structure is used for subsequent query and storage of data tags;
if the hash table is selected: for a query, calculating the hash value of the data tag and quickly locating the corresponding storage position; for storage, calculating the hash value of the data tag and storing the data in the hash table according to that hash value;
updating and optimizing the adaptive hybrid structure strategy according to the actual query result;
periodically monitoring query and storage performance metrics and, when performance degrades, adjusting the adaptive hybrid structure strategy to accommodate new data conditions.
In the embodiment of the invention, the beneficial effects are as follows: by applying the adaptive hybrid structure strategy together with the actual search result, the embodiment of the invention optimizes online data tag query and storage and improves query efficiency.
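The hash table path (compute the hash value of the tag, then locate or fill the corresponding storage position) can be made explicit with a small bucketed table. This Python sketch is illustrative: the source does not specify a hash function or a collision strategy, so CRC32 and separate chaining are assumptions, as are the class and method names.

```python
import zlib


class TagHashTable:
    def __init__(self, bucket_count=16):
        # The empty hash table: one list (chain) per bucket.
        self.buckets = [[] for _ in range(bucket_count)]

    def _index(self, tag):
        """Hash the tag and map the hash value to a bucket position."""
        return zlib.crc32(tag.encode("utf-8")) % len(self.buckets)

    def put(self, tag, value):
        """Store `value` under `tag`, overwriting any existing entry."""
        bucket = self.buckets[self._index(tag)]
        for i, (key, _) in enumerate(bucket):
            if key == tag:
                bucket[i] = (tag, value)
                return
        bucket.append((tag, value))

    def get(self, tag, default=None):
        """Locate the bucket from the hash value and scan its chain."""
        for key, value in self.buckets[self._index(tag)]:
            if key == tag:
                return value
        return default
```

In practice Python's built-in `dict` already provides this behavior; the explicit buckets are shown only to make the hash-then-locate steps of the text visible.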
Another embodiment of the data tag storage management method in the embodiment of the present invention includes:
partitioning and weighting data tags with different access frequencies based on the locality principle, and optimizing the data access hierarchy according to the weights, comprises:
collecting and analyzing the input data set and determining the access frequency of each data tag;
grouping the data tags by access frequency based on the locality principle;
assigning weights to the data tags of the different groups, the weight being proportional to the access frequency of the data tags;
layering the data access operations according to the weights, wherein layering means that data with high weights is stored preferentially on faster storage devices and data with low weights is stored on slower storage devices.
In the embodiment of the invention, the beneficial effects are as follows: the embodiment of the invention realizes partitioned data access optimization based on the locality principle, ensures reasonable use of resources in different scenarios, and improves data access efficiency. Throughout the process, access history and trends are watched closely so that data partitions and weights can be adjusted dynamically as they change.
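A minimal Python sketch of the weighting and layering steps follows. The weight is taken as each tag's share of total accesses, which is proportional to access frequency as the text requires; the tier names and the two cutoffs are assumptions made for illustration and do not come from the source.

```python
def assign_tiers(access_counts, tiers=("memory", "ssd", "hdd"), cutoffs=(0.5, 0.1)):
    """Map each tag to (weight, tier).

    weight = tag's access count / total accesses.
    Tags at or above cutoffs[0] go to the fastest tier, those at or above
    cutoffs[1] to the middle tier, and the rest to the slowest tier.
    """
    total = sum(access_counts.values())
    placement = {}
    for tag, count in access_counts.items():
        weight = count / total if total else 0.0
        if weight >= cutoffs[0]:
            tier = tiers[0]
        elif weight >= cutoffs[1]:
            tier = tiers[1]
        else:
            tier = tiers[2]
        placement[tag] = (weight, tier)
    return placement
```

For example, with counts of 60, 30, 5, and 5 accesses, the first tag would be placed in the fastest tier, the second in the middle tier, and the last two in the slowest tier.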
Another embodiment of the data tag storage management method in the embodiment of the present invention includes:
managing the query cache through an elimination algorithm and introducing a dynamic replacement strategy according to the real-time access mode to adaptively update the query cache comprises:
establishing a query cache and selecting a corresponding elimination algorithm, wherein the query cache is used for storing data tags with high access frequency;
during data access, loading data tags with high access frequency into the query cache;
periodically checking whether the cache has reached its maximum capacity and, if so, executing the cache elimination strategy;
removing the data tags that meet the elimination condition from the cache according to the selected elimination algorithm;
collecting and analyzing real-time access mode data, and detecting whether the data access mode has changed;
introducing a dynamic replacement strategy for the query cache according to the result of the real-time access mode analysis;
periodically monitoring the performance of the query cache and evaluating the effect of the elimination algorithm and the dynamic replacement strategy; and optimizing the parameters of the elimination algorithm, the dynamic replacement strategy, and the cache capacity according to the evaluated effect.
In the embodiment of the invention, the beneficial effects are as follows: the embodiment of the invention realizes efficient, dynamic query cache management. Throughout the process, attention to the real-time access mode and performance metrics ensures that the query cache is updated adaptively and improves the cache hit rate and overall system performance.
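The capacity check, elimination, and performance monitoring steps can be sketched with a small least-recently-used cache. This Python example fixes the elimination algorithm to LRU purely for illustration (the source lists LRU and LFU only as examples), checks capacity on every insert rather than periodically, and uses hypothetical class and method names.

```python
from collections import OrderedDict


class QueryCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def get(self, tag):
        """Return the cached result, or None on a miss; records hit/miss counts."""
        if tag in self.entries:
            self.entries.move_to_end(tag)  # mark as most recently used
            self.hits += 1
            return self.entries[tag]
        self.misses += 1
        return None

    def put(self, tag, result):
        """Add a query result, then evict least recently used entries over capacity."""
        self.entries[tag] = result
        self.entries.move_to_end(tag)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

The `hit_rate` value is the kind of metric the final step would feed back into tuning, for example by raising `capacity` or switching the elimination algorithm when the rate drops.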
The data tag storage management method in the embodiment of the present invention is described above; the data tag storage management apparatus in the embodiment of the present invention is described below. Referring to fig. 2, one embodiment of the data tag storage management apparatus 1 in the embodiment of the present invention includes:
an obtaining module 11, configured to obtain, according to a selected subject object, related data information of the subject object;
a classification module 12, configured to divide related data information with different attributes into data tags of different groups based on the attributes of the related data information, wherein the triggering modes of the data tags include selecting different triggering modes according to the attributes, scripts and interfaces of the data tags of the different groups when the related data information meets a preset triggering rule;
and a triggering module 13, configured to automatically trigger the corresponding data tags to perform storage management operations after the data tags of the different groups have selected their triggering modes.
The present invention also provides a data tag storage management apparatus, including a memory and a processor, in which computer readable instructions are stored, which when executed by the processor, cause the processor to execute the steps of the data tag storage management method in the above embodiments.
The present invention also provides a computer readable storage medium, which may be a non-volatile computer readable storage medium, or may be a volatile computer readable storage medium, in which instructions are stored which, when executed on a computer, cause the computer to perform the steps of the data tag storage management method.
The beneficial effects are as follows: according to the data tag storage management method, device, equipment and storage medium, related data information of a subject object is obtained according to the selected subject object; based on the attributes of the related data information, related data information with different attributes is divided into data tags of different groups, wherein the triggering modes of the data tags include selecting different triggering modes according to the attributes, scripts and interfaces of the data tags of the different groups when the related data information meets a preset triggering rule; finally, after the data tags of the different groups have selected their triggering modes, the corresponding data tags are automatically triggered to perform storage management operations. The invention realizes accurate query, monitoring and automatic processing at the enterprise level, which helps optimize business processes and improve their efficiency, accuracy and response speed.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied, in essence or in whole or in part, in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A data tag storage management method, comprising:
acquiring, according to a selected subject object, related data information of the subject object;
dividing, based on the attributes of the related data information, related data information with different attributes into data tags of different groups, wherein the triggering modes of the data tags include selecting different triggering modes according to the attributes, scripts and interfaces of the data tags of the different groups when the related data information meets a preset triggering rule;
and after the data tags of the different groups have selected their triggering modes, automatically triggering the corresponding data tags to perform storage management operations.
2. The method according to claim 1, wherein automatically triggering the corresponding data tags to perform storage management operations after the data tags of the different groups have selected their triggering modes comprises:
performing a location search on the data tag through a quick search algorithm to obtain an actual search result, and adjusting parameters of the quick search algorithm according to the actual search result, wherein the adjustment is based on a dynamic false query threshold adjustment algorithm;
according to the actual search result, querying and storing the data tag through a prefix tree and/or a hash table and, using an adaptive hybrid structure strategy, selecting the optimal data structure in real time at each query according to the properties of the current data tag;
based on the locality principle, partitioning data tags with different access frequencies, assigning weights, and optimizing the data access hierarchy according to the weights;
and managing the query cache through an elimination algorithm, and introducing a dynamic replacement strategy according to the real-time access mode to adaptively update the query cache.
3. The method according to claim 2, wherein performing a location search on the data tag through a quick search algorithm to obtain an actual search result, and adjusting parameters of the quick search algorithm according to the actual search result, comprises:
selecting a corresponding quick search algorithm according to the input data tag, wherein the quick search algorithm is used for the subsequent location search of the data tag;
setting an initial false query threshold, wherein the initial false query threshold is used to determine whether the parameters of the quick search algorithm need to be adjusted;
performing a location search on the input data tag using the quick search algorithm to obtain an actual search result;
based on a dynamic false query threshold adjustment algorithm, periodically evaluating the false query rate in the actual search results and comparing it with the set false query threshold; if the false query rate is lower than the set false query threshold, no adjustment is needed; if the false query rate is higher than the set false query threshold, the parameters of the quick search algorithm are adjusted.
4. The method according to claim 2, wherein querying and storing the data tag through a prefix tree and/or a hash table according to the actual search result and, using an adaptive hybrid structure strategy, selecting the optimal data structure in real time at each query according to the properties of the current data tag, comprises:
initializing an empty prefix tree structure according to an input data set, wherein the empty prefix tree structure is used for subsequent query and storage of data tags;
evaluating the properties of each data tag to be queried and, using the adaptive hybrid structure strategy, determining from the evaluation result whether the prefix tree is the optimal data structure;
if the prefix tree is selected: for a query, matching step by step from the root node along the character sequence of the data tag until a termination condition is reached; for storage, creating or updating nodes from the root node along the character sequence of the data tag until all characters are inserted;
updating and optimizing the adaptive hybrid structure strategy according to the actual query result;
periodically monitoring query and storage performance metrics and, when performance degrades, adjusting the adaptive hybrid structure strategy to accommodate new data conditions.
5. The method according to claim 2, wherein querying and storing the data tag through a prefix tree and/or a hash table according to the actual search result and, using an adaptive hybrid structure strategy, selecting the optimal data structure in real time at each query according to the properties of the current data tag, further comprises:
initializing an empty hash table structure according to an input data set, wherein the empty hash table structure is used for subsequent query and storage of data tags;
if the hash table is selected: for a query, calculating the hash value of the data tag and quickly locating the corresponding storage position; for storage, calculating the hash value of the data tag and storing the data in the hash table according to that hash value;
updating and optimizing the adaptive hybrid structure strategy according to the actual query result;
periodically monitoring query and storage performance metrics and, when performance degrades, adjusting the adaptive hybrid structure strategy to accommodate new data conditions.
6. The method according to claim 2, wherein partitioning and weighting data tags with different access frequencies based on the locality principle, and optimizing the data access hierarchy according to the weights, comprises:
collecting and analyzing the input data set and determining the access frequency of each data tag;
grouping the data tags by access frequency based on the locality principle;
assigning weights to the data tags of the different groups, the weight being proportional to the access frequency of the data tags;
layering the data access operations according to the weights, wherein layering means that data with high weights is stored preferentially on faster storage devices and data with low weights is stored on slower storage devices.
7. The method according to claim 2, wherein managing the query cache through an elimination algorithm and introducing a dynamic replacement strategy according to the real-time access mode to adaptively update the query cache comprises:
establishing a query cache and selecting a corresponding elimination algorithm, wherein the query cache is used for storing data tags with high access frequency;
during data access, loading data tags with high access frequency into the query cache;
periodically checking whether the cache has reached its maximum capacity and, if so, executing the cache elimination strategy;
removing the data tags that meet the elimination condition from the cache according to the selected elimination algorithm;
collecting and analyzing real-time access mode data, and detecting whether the data access mode has changed;
introducing a dynamic replacement strategy for the query cache according to the result of the real-time access mode analysis;
periodically monitoring the performance of the query cache and evaluating the effect of the elimination algorithm and the dynamic replacement strategy; and optimizing the parameters of the elimination algorithm, the dynamic replacement strategy, and the cache capacity according to the evaluated effect.
8. A data tag storage management apparatus, characterized in that the data tag storage management apparatus comprises:
an acquisition module, configured to acquire, according to a selected subject object, related data information of the subject object;
a classification module, configured to divide related data information with different attributes into data tags of different groups based on the attributes of the related data information, wherein the triggering modes of the data tags include selecting different triggering modes according to the attributes, scripts and interfaces of the data tags of the different groups when the related data information meets a preset triggering rule;
and a triggering module, configured to automatically trigger the corresponding data tags to perform storage management operations after the data tags of the different groups have selected their triggering modes.
9. A data tag storage management apparatus, characterized in that the data tag storage management apparatus comprises: a memory and at least one processor, the memory having instructions stored therein;
the at least one processor invoking the instructions in the memory to cause the data tag storage management apparatus to perform the data tag storage management method of any of claims 1-7.
10. A computer readable storage medium having instructions stored thereon, which when executed by a processor, implement the data tag storage management method of any of claims 1-7.
CN202310765129.0A 2023-06-27 2023-06-27 Data tag storage management method, device, equipment and storage medium Pending CN116680276A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310765129.0A CN116680276A (en) 2023-06-27 2023-06-27 Data tag storage management method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310765129.0A CN116680276A (en) 2023-06-27 2023-06-27 Data tag storage management method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116680276A true CN116680276A (en) 2023-09-01

Family

ID=87785484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310765129.0A Pending CN116680276A (en) 2023-06-27 2023-06-27 Data tag storage management method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116680276A (en)

Similar Documents

Publication Publication Date Title
US10025867B2 (en) Cache efficiency by social graph data ordering
CN104834675A (en) Query performance optimization method based on user behavior analysis
US20100106681A1 (en) Identifying Files Associated With A Workflow
Wu et al. Prediction of web page accesses by proxy server log
CN112580817A (en) Managing machine learning features
CN108647266A (en) A kind of isomeric data is quickly distributed storage, exchange method
Kucukyilmaz et al. A machine learning approach for result caching in web search engines
CN106155934A (en) Based on the caching method repeating data under a kind of cloud environment
US20110179013A1 (en) Search Log Online Analytic Processing
US7895247B2 (en) Tracking space usage in a database
CN112561197A (en) Power data prefetching and caching method with active defense influence range
CN117235088B (en) Cache updating method, device, equipment, medium and platform of storage system
CN114281855A (en) Data request method, data request device, computer equipment, storage medium and program product
CN101635001A (en) Method and apparatus for extracting information from a database
CN117370058A (en) Service processing method, device, electronic equipment and computer readable medium
CN114785858B (en) Active resource caching method and device applied to mutual inductor online monitoring system
CN116680276A (en) Data tag storage management method, device, equipment and storage medium
CN115168509A (en) Processing method and device of wind control data, storage medium and computer equipment
Baskaran et al. Study of combined Web prefetching with Web caching based on machine learning technique
CN105282236B (en) A kind of distributed caching method and device
US11966393B2 (en) Adaptive data prefetch
CN109412883A (en) Recommendation paths method for tracing, device and system
Wu et al. Using provenance to boost the metadata prefetching in distributed storage systems
CN110705736A (en) Macroscopic economy prediction method and device, computer equipment and storage medium
Alrahwan et al. ASCF: Optimization of the Apriori Algorithm Using Spark‐Based Cuckoo Filter Structure

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination