CN114036904A - Dictionary data processing method and device, electronic equipment and storage medium - Google Patents

Dictionary data processing method and device, electronic equipment and storage medium

Info

Publication number
CN114036904A
Authority
CN
China
Prior art keywords
dictionary
target
variable length
data
field
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111417139.2A
Other languages
Chinese (zh)
Other versions
CN114036904B (en
Inventor
申毅杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202111417139.2A priority Critical patent/CN114036904B/en
Publication of CN114036904A publication Critical patent/CN114036904A/en
Application granted granted Critical
Publication of CN114036904B publication Critical patent/CN114036904B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/151 Transformation
    • G06F40/157 Transformation using dictionaries or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/374 Thesaurus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The disclosure relates to a dictionary data processing method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring a data record to be processed; for at least one variable-length field contained in the data record to be processed, acquiring a target dictionary corresponding to the variable-length field, the variable-length field having at least one field value; for any variable-length field, recording the dictionary codes corresponding to the field values of that field into the target dictionary corresponding to that field; and, if the total number of dictionary items in the target dictionary reaches a preset number threshold and the frequency distribution of the dictionary items in the target dictionary does not satisfy a frequency distribution condition, removing the target dictionary after a target data operation is executed. The target data operation is a data query operation generated in response to a data query request for the data record to be processed, and the frequency distribution condition is determined according to the frequency distribution of the dictionary items in an effective dictionary. The method and apparatus can improve the operating efficiency of the database.

Description

Dictionary data processing method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of database technologies, and in particular, to a dictionary data processing method and apparatus, an electronic device, and a storage medium.
Background
A database is a "warehouse that organizes, stores, and manages data according to a data structure," which is an organized, sharable, uniformly managed collection of large amounts of data that is stored in a computer for a long period of time.
Dictionary coding is a commonly used compression algorithm that encodes a field with a limited number of distinct values (i.e., a small value range) as a dictionary plus a series of indices that point from the actual values into the dictionary.
In the related art, when dictionary coding is applied to a high-cardinality variable-length field, the field content has no obvious data distribution characteristics. As a result, the dictionary coding operation becomes too costly and the resulting dictionary becomes too large, which makes execution of the database engine unstable and in turn reduces the operating efficiency of the database.
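As an illustration of the general technique described above, the following is a minimal sketch of dictionary coding for one column. The function names `dictionary_encode` and `dictionary_decode` are hypothetical and are not taken from the disclosure.

```python
def dictionary_encode(values):
    """Encode a column as (dictionary, indices).

    `dictionary` holds each distinct value once, in first-seen order;
    `indices` holds, for every row, the position of its value in `dictionary`.
    """
    dictionary = []   # distinct values
    position = {}     # value -> index into `dictionary`
    indices = []      # one index per input row
    for value in values:
        if value not in position:
            position[value] = len(dictionary)
            dictionary.append(value)
        indices.append(position[value])
    return dictionary, indices


def dictionary_decode(dictionary, indices):
    """Rebuild the original column from the dictionary and the indices."""
    return [dictionary[i] for i in indices]


# Illustrative column with few distinct values (low cardinality).
column = ["A chengshi", "B chengshi", "A chengshi", "C chengshi", "A chengshi"]
dictionary, indices = dictionary_encode(column)
```

With few distinct values the dictionary stays small and the rows shrink to integer indices; with many distinct values the dictionary approaches the size of the column itself, which is the problem the disclosure addresses.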
Disclosure of Invention
The disclosure provides a dictionary data processing method, a dictionary data processing device, an electronic device and a storage medium, which are used for at least solving the problem of unstable database operation in the related art. The technical scheme of the disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided a dictionary data processing method, including:
acquiring a data record to be processed; for at least one variable-length field contained in the data record to be processed, acquiring a target dictionary corresponding to the variable-length field; the variable-length field has at least one field value;
for any variable length field, recording dictionary codes corresponding to all field values of the variable length field into a target dictionary corresponding to the variable length field;
if the total number of dictionary items in the target dictionary reaches a preset number threshold value and the frequency distribution of each dictionary item in the target dictionary does not meet the frequency distribution condition, removing the target dictionary after target data operation is executed; the target data operation is a data query operation generated in response to a data query request for the data record to be processed, and the frequency distribution condition is determined according to frequency distribution of dictionary items in the effective dictionary.
In one possible implementation manner, the recording dictionary codes corresponding to field values of any variable length field into a target dictionary corresponding to the any variable length field includes:
for any of the field values, matching dictionary entries in the target dictionary corresponding to the field value;
if the dictionary item corresponding to the field value is not matched in the target dictionary, carrying out coding processing on the field value to obtain a dictionary code corresponding to the field value;
and recording the dictionary code corresponding to the field value as a new dictionary item into the target dictionary.
In one possible implementation, each dictionary entry has a corresponding dictionary index, and the recording of the dictionary code corresponding to each field value as a dictionary entry into the target dictionary includes:
if the dictionary item corresponding to the field value is matched in the target dictionary, determining the dictionary index of the dictionary item in the target dictionary;
and establishing a mapping relation between the field value and the dictionary index, and updating the occurrence frequency of the dictionary item.
In one possible implementation, the method further includes:
acquiring the occurrence frequency of each dictionary item in the target dictionary;
determining a target occurrence frequency among the occurrence frequencies of the dictionary items; the difference between the target occurrence frequency and the other occurrence frequencies meets a preset condition; the other occurrence frequencies are the occurrence frequencies of the dictionary items other than the target occurrence frequency;
and when the target occurrence frequency is smaller than a preset frequency threshold, determining that the frequency distribution of the dictionary items in the target dictionary does not meet the frequency distribution condition.
In one possible implementation manner, the determining the target occurrence frequency from the occurrence frequencies of the dictionary entries includes:
sorting the occurrence frequencies of the dictionary items to obtain sorted occurrence frequencies;
and determining the target occurrence frequency among the occurrence frequencies of the dictionary items according to the frequency distribution of the sorted occurrence frequencies.
In a possible implementation manner, the determining, according to the frequency distribution of the sorted occurrence frequencies, a target occurrence frequency from the occurrence frequencies of the dictionary entries includes:
determining the occurrence frequency corresponding to a preset quantile according to the frequency distribution;
and taking the occurrence frequency corresponding to the preset quantile as the target occurrence frequency.
In one possible implementation, the removing the target dictionary includes:
generating a discard identifier for the target dictionary;
wherein the discard identifier is used to instruct a database engine to remove the target dictionary from a preset memory after the target data operation is executed.
According to a second aspect of the embodiments of the present disclosure, there is provided a dictionary data processing apparatus including:
an acquisition unit configured to acquire a data record to be processed and, for at least one variable-length field contained in the data record to be processed, acquire a target dictionary corresponding to the variable-length field; each of the variable-length fields has at least one field value;
the recording unit is configured to record dictionary codes corresponding to field values of any variable length field into a target dictionary corresponding to the variable length field;
the removing unit is configured to remove the target dictionary after target data operation is executed if the total number of dictionary items in the target dictionary reaches a preset number threshold and the frequency distribution of each dictionary item in the target dictionary does not meet a frequency distribution condition; the target data operation is a data query operation generated in response to a data query request for the data record to be processed, and the frequency distribution condition is determined according to frequency distribution of dictionary items in the effective dictionary.
In one possible implementation, the recording unit is configured to perform matching, for any of the field values, dictionary entries corresponding to the field values in the target dictionary; if the dictionary item corresponding to the field value is not matched in the target dictionary, carrying out coding processing on the field value to obtain a dictionary code corresponding to the field value; and recording the dictionary code corresponding to the field value as a new dictionary item into the target dictionary.
In a possible implementation manner, each dictionary entry has a corresponding dictionary index, and the recording unit is configured to determine the dictionary index of the dictionary entry in the target dictionary if the dictionary entry corresponding to the field value is matched in the target dictionary; and establishing a mapping relation between the field value and the dictionary index, and updating the occurrence frequency of the dictionary item.
In one possible implementation manner, the removing unit is configured to acquire the occurrence frequency of each dictionary item in the target dictionary; determine a target occurrence frequency among the occurrence frequencies of the dictionary items, where the difference between the target occurrence frequency and the other occurrence frequencies meets a preset condition and the other occurrence frequencies are the occurrence frequencies of the dictionary items other than the target occurrence frequency; and, when the target occurrence frequency is smaller than a preset frequency threshold, determine that the frequency distribution of the dictionary items in the target dictionary does not meet the frequency distribution condition.
In a possible implementation manner, the removing unit is configured to sort the occurrence frequencies of the dictionary items to obtain sorted occurrence frequencies; and determine the target occurrence frequency among the occurrence frequencies of the dictionary items according to the frequency distribution of the sorted occurrence frequencies.
In a possible implementation manner, the removing unit is configured to determine, according to the frequency distribution, the occurrence frequency corresponding to a preset quantile; and take the occurrence frequency corresponding to the preset quantile as the target occurrence frequency.
In one possible implementation, the removing unit is configured to generate a discard identifier for the target dictionary; the discard identifier is used to instruct a database engine to remove the target dictionary from a preset memory after the target data operation is executed.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic device, including a memory and a processor, where the memory stores a computer program, and the processor implements the dictionary data processing method according to the first aspect or any one of the possible implementation manners of the first aspect when executing the computer program.
According to a fourth aspect of embodiments of the present disclosure, there is provided a storage medium having stored thereon a computer program which, when executed by a processor, implements a dictionary data processing method according to the first aspect or any one of the possible implementations of the first aspect.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product, the program product comprising a computer program, the computer program being stored in a readable storage medium, from which at least one processor of a device reads and executes the computer program, so that the device performs the dictionary data processing method according to any one of the possible implementations of the first aspect.
The technical solutions provided by the embodiments of the present disclosure bring at least the following beneficial effects. A data record to be processed is acquired and, for at least one variable-length field contained in the data record, a target dictionary that is constructed in advance in memory and corresponds to the variable-length field is acquired; each variable-length field has a plurality of field values. The dictionary code corresponding to each field value is then recorded in the target dictionary as a dictionary item. If the total number of dictionary items in the target dictionary reaches a preset number threshold, the occurrence frequency of a target dictionary item in the target dictionary is determined; the difference between the occurrence frequency of the target dictionary item and the occurrence frequencies of the other dictionary items (the dictionary items in the target dictionary other than the target dictionary item) meets a preset condition. If the occurrence frequency of the target dictionary item is smaller than a preset frequency threshold, the target dictionary is removed from memory after the data query operation, generated in response to a data query request for the data record to be processed, has been executed based on the dictionary codes in the target dictionary. This avoids dictionary-encoding every field value of a high-cardinality variable-length field whose data distribution has no obvious characteristics, prevents an oversized target dictionary from making execution of the database engine unstable, and effectively improves the operating efficiency of the database.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a flow diagram illustrating a method of dictionary data processing in accordance with an exemplary embodiment.
FIG. 2 is a flow diagram illustrating another dictionary data processing method in accordance with an exemplary embodiment.
FIG. 3 is a flowchart illustrating a method of dictionary data processing in accordance with another exemplary embodiment.
Fig. 4 is a block diagram illustrating a dictionary data processing apparatus according to an exemplary embodiment.
FIG. 5 is a block diagram illustrating an electronic device in accordance with an example embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
It should also be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for presentation, analyzed data, etc.) referred to in the present disclosure are both information and data that are authorized by the user or sufficiently authorized by various parties.
Fig. 1 is a flowchart illustrating a dictionary data processing method according to an exemplary embodiment, and the dictionary data processing method is used in an electronic device, which may be a server, as illustrated in fig. 1, and includes the following steps.
In step S110, a data record to be processed is acquired; for at least one variable-length field contained in the data record to be processed, a target dictionary that is constructed in advance in memory and corresponds to the variable-length field is acquired; each variable-length field has a plurality of field values.
The pending data records may refer to data records that need to be written to a database.
The data records to be processed can be a plurality of data tables. One row in each data table is called a record.
Wherein, a field may refer to an attribute of a record; for example, in an information record, the field "age" has the value 28 and the field type int.
Wherein, a variable-length field may refer to a field whose field values do not have a fixed length. In practical applications, the variable-length field may be a string field. For example, in an information record, a string field "city" has a plurality of field values, such as "A chengshi", "B chengshi", "C chengshi", and "D chengshi", whose lengths are not fixed; that is, the string field "city" is a variable-length field.
In a specific implementation, the electronic device acquires a data record to be processed that contains at least one variable-length field. For each such variable-length field, the electronic device constructs in memory, in advance, a target dictionary corresponding to that field. Specifically, the electronic device may build a hash table in memory for the variable-length field as the target dictionary and fix the maximum number of entries N of the target dictionary.
In step S120, for any of the variable length fields, the dictionary code corresponding to each field value of the any variable length field is recorded in the target dictionary corresponding to the any variable length field.
In a specific implementation, when the current number of entries in the target dictionary is less than N, the electronic device, after acquiring the data record to be processed, may attempt to dictionary-encode the field values of the variable-length field. That is, the electronic device may record the dictionary code corresponding to each field value into the target dictionary as a dictionary item, and update the index field of that field value to the index of the field value in the target dictionary.
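The recording step can be sketched as follows, under the assumption that the target dictionary is a hash table capped at N entries and that each entry tracks its occurrence frequency. The class name `TargetDictionary` and its members are hypothetical and only illustrate the matching, insertion, and frequency-update behaviour described in the text.

```python
class TargetDictionary:
    """Bounded in-memory dictionary for one variable-length field (a sketch)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries  # the fixed maximum number of entries N
        self.index_of = {}              # field value -> dictionary index (hash table)
        self.entries = []               # dictionary index -> field value
        self.frequency = []             # dictionary index -> occurrence count

    def record(self, value):
        """Return the dictionary index for `value`, or None if the dictionary
        is full and `value` is not yet recorded."""
        index = self.index_of.get(value)
        if index is not None:
            # Matched an existing dictionary item: update its frequency.
            self.frequency[index] += 1
            return index
        if len(self.entries) >= self.max_entries:
            # Capacity reached with an unrecorded value: the caller must now
            # decide whether to discard the dictionary.
            return None
        # No match: record the value as a new dictionary item.
        index = len(self.entries)
        self.index_of[value] = index
        self.entries.append(value)
        self.frequency.append(1)
        return index


city_dictionary = TargetDictionary(max_entries=2)
row_indices = [city_dictionary.record(v)
               for v in ["A chengshi", "B chengshi", "A chengshi", "C chengshi"]]
```

Returning `None` for the overflowing value is a design choice of this sketch; it signals the situation in which the discard decision of step S130 is taken.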
In step S130, if the total number of dictionary entries in the target dictionary reaches a preset number threshold and the frequency distribution of each dictionary entry in the target dictionary does not satisfy the frequency distribution condition, the target dictionary is removed after the target data operation is performed.
Wherein the quantity threshold may be a maximum number of terms of the target dictionary.
Wherein the target data operation is a data query operation generated in response to a data query request for the data record to be processed.
In a specific implementation, if the electronic device determines that the total number of dictionary entries in the target dictionary reaches a preset number threshold, that is, the dictionary reaches the maximum number of entries, and there is a field value that is not recorded in the target dictionary, the electronic device needs to perform a decision for determining whether to discard the target dictionary. In particular, the electronic device may determine a frequency of occurrence of target dictionary entries in the target dictionary. The difference between the occurrence frequency of the target dictionary item and the occurrence frequency of other dictionary items meets a preset condition; the other dictionary items are dictionary items in the target dictionary except the target dictionary item.
Specifically, the electronic device may count the occurrences of each dictionary item in the target dictionary to obtain the occurrence frequency of each dictionary item. For example, if the string field "city" in the data record to be processed contains three "A chengshi" field values, that is, the field value "A chengshi" appears three times in the data record to be processed, then the occurrence frequency of the dictionary item corresponding to the field value "A chengshi" is 3. The electronic device may then take the occurrence frequency o of the dictionary item at the p-th quantile (e.g., the 90th percentile) of the occurrence frequencies in the target dictionary as the occurrence frequency of the target dictionary item.
After determining the occurrence frequency of the target dictionary item, the electronic device compares it with a preset frequency threshold. If the occurrence frequency of the target dictionary item is smaller than the preset frequency threshold, the cardinality of the variable-length field is too high: if all field values of the variable-length field were dictionary-encoded, the target dictionary corresponding to the field would become too large, which easily makes execution of the database engine unstable. The electronic device therefore applies a discard policy to the target dictionary; that is, it removes the target dictionary from memory after the target data operation has been performed based on the dictionary codes in the target dictionary.
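The quantile selection can be sketched as below. The disclosure does not state how the quantile is computed, so this sketch assumes the nearest-rank method on the sorted frequencies; the function name `target_frequency` and the sample frequencies are hypothetical.

```python
def target_frequency(frequencies, percentile=90):
    """Pick the occurrence frequency at the given percentile.

    Assumption: nearest-rank quantile, i.e. the value at rank
    ceil(percentile / 100 * n) in the ascending order of frequencies.
    """
    if not frequencies:
        raise ValueError("the dictionary has no items")
    ordered = sorted(frequencies)
    # Integer ceiling of percentile * n / 100, clamped to at least rank 1.
    rank = max(1, (percentile * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Ten dictionary items: most appear once, one appears very often.
sample_frequencies = [1, 1, 1, 2, 1, 1, 3, 1, 1, 50]
o = target_frequency(sample_frequencies, percentile=90)
```

In this sample the 90th-percentile frequency is low even though one item is very frequent, which suggests that most dictionary items are rarely reused.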
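The discard policy can be sketched as a two-part check followed by deferred removal. The names `should_discard` and `DictionaryCache` are hypothetical, and the discard mark stands in for the discard identifier mentioned in the disclosure; how a real database engine stores or honours such a mark is not specified in the text.

```python
def should_discard(total_entries, max_entries, target_freq, frequency_threshold):
    """Discard only if the dictionary is full AND the target occurrence
    frequency is below the preset frequency threshold."""
    return total_entries >= max_entries and target_freq < frequency_threshold


class DictionaryCache:
    """Holds per-field dictionaries and removes marked ones after a query."""

    def __init__(self):
        self.dictionaries = {}       # field name -> dictionary (list of entries)
        self.discard_marked = set()  # fields whose dictionary carries a discard mark

    def mark_discard(self, field):
        self.discard_marked.add(field)

    def run_query(self, field, query):
        # The query still runs on the dictionary codes ...
        result = query(self.dictionaries[field])
        # ... and only afterwards is a marked dictionary removed from memory.
        if field in self.discard_marked:
            del self.dictionaries[field]
            self.discard_marked.discard(field)
        return result


cache = DictionaryCache()
cache.dictionaries["city"] = ["A chengshi", "B chengshi"]
if should_discard(total_entries=2, max_entries=2, target_freq=1, frequency_threshold=3):
    cache.mark_discard("city")
entry_count = cache.run_query("city", len)
```

Deferring the removal until the query has finished keeps the in-flight operation consistent while still freeing the memory afterwards.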
Wherein the target data operation is a data query operation generated in response to a data query request for the data record to be processed. Specifically, the data query request for the data record to be processed may refer to an SQL query statement; in the process of the data query operation generated by the electronic device in response to the data query request of the data record to be processed, the electronic device can generate or compile a specific data operation process for each operator and the data type to be processed thereof according to the definition of the semantics of the SQL query statement, and the execution efficiency of the operator is closely related to the data type of the operation thereof.
In order to make different operators able to process dictionary codes in the target dictionary, the operators need to be rewritten.
1) Scan operator:
The Scan operator is a leaf node in the operator tree; it reads the raw data that a query operates on directly from a distributed file system or database. The dictionary coding rule depends on whether the variable-length field in the raw data is already dictionary-encoded. The electronic device can rewrite the Scan operator and its expressions as follows. When the raw data is already dictionary-encoded (for example, in storage formats such as parquet/orc), the Scan operator is modified so that the field in each returned row directly contains the dictionary index, and the field's dictionary is returned at the same time. When the variable-length field in the raw data is not dictionary-encoded although the storage format supports dictionary-coding optimization (for example, parquet/orc), the field has a high cardinality and is not suitable for dictionary-coding optimization, so dictionary coding of that field is skipped. When the variable-length field of the raw data is not dictionary-encoded and the storage format does not support dictionary-coding optimization, a MayDictEncode operator is inserted after the Scan operator, and the field content is dictionary-encoded after the variable-length field is read.
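The three cases above amount to a small decision table, sketched here. The enum `ScanAction` and the function `plan_scan` are hypothetical names used only for illustration; they are not part of any real query engine API.

```python
from enum import Enum


class ScanAction(Enum):
    RETURN_INDICES_WITH_DICTIONARY = "return dictionary indices plus the field dictionary"
    SKIP_DICTIONARY_ENCODING = "leave the field unencoded"
    INSERT_MAY_DICT_ENCODE = "insert a MayDictEncode operator after Scan"


def plan_scan(field_is_dict_encoded, format_supports_dict_encoding):
    """Choose how the rewritten Scan operator treats one variable-length field."""
    if field_is_dict_encoded:
        # The stored data already carries a dictionary: reuse it directly.
        return ScanAction.RETURN_INDICES_WITH_DICTIONARY
    if format_supports_dict_encoding:
        # The format could have encoded the field but did not, which is
        # taken as a sign of high cardinality.
        return ScanAction.SKIP_DICTIONARY_ENCODING
    # The format has no dictionary support: try encoding after the scan.
    return ScanAction.INSERT_MAY_DICT_ENCODE
```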
2) Exchange operator:
The Exchange operator is the operator in the execution plan responsible for receiving RowBatch data; the corresponding sending end is the Exchange Sender operator. The electronic device can rewrite the Exchange operator and its expressions to add dictionary distribution and merging logic. Dictionary distribution: during a shuffle (the data shuffling process), for each shuffle block, a sub-dictionary is created from the in-memory dictionary and the index numbers that appear in the block, the indices of the corresponding field in the block are updated to point to the new sub-dictionary, and the sub-dictionary and indices are sent along with the shuffle data to the downstream process that initiated the shuffle read.
Dictionary merging: on the shuffle read side, the first sub-dictionary received is deserialized, and subsequently arriving sub-dictionaries are merged into it. When an entry of a new sub-dictionary already exists in the current dictionary, the index of the corresponding row is updated to point to the existing dictionary entry; if the entry does not exist, it is inserted into the dictionary and the index is updated accordingly.
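Dictionary distribution and merging can be sketched as two index-remapping functions. Serialization and the actual shuffle transport are omitted, and the function names `make_sub_dictionary` and `merge_sub_dictionary` are hypothetical.

```python
def make_sub_dictionary(dictionary, block_indices):
    """Sender side: keep only the entries a shuffle block uses and
    re-point the block's indices at the smaller sub-dictionary."""
    remap = {}           # old index -> index in the sub-dictionary
    sub_dictionary = []
    new_indices = []
    for old in block_indices:
        if old not in remap:
            remap[old] = len(sub_dictionary)
            sub_dictionary.append(dictionary[old])
        new_indices.append(remap[old])
    return sub_dictionary, new_indices


def merge_sub_dictionary(current, position, sub_dictionary, indices):
    """Reader side: merge an arriving sub-dictionary into `current`
    (mutated in place) and return the block's indices re-pointed at it.
    `position` maps entry -> index in `current`."""
    remap = []
    for value in sub_dictionary:
        if value not in position:
            # Entry not present yet: insert it into the merged dictionary.
            position[value] = len(current)
            current.append(value)
        remap.append(position[value])
    return [remap[i] for i in indices]


full = ["A chengshi", "B chengshi", "C chengshi", "D chengshi"]
sub1, idx1 = make_sub_dictionary(full, [3, 1, 3])
sub2, idx2 = make_sub_dictionary(full, [1, 2])

merged, position = [], {}
rows1 = merge_sub_dictionary(merged, position, sub1, idx1)
rows2 = merge_sub_dictionary(merged, position, sub2, idx2)
```

Each block carries only the entries it references, so the data sent per block is bounded by the block's own distinct values rather than by the full dictionary.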
3) Dictionary decoding operator:
The electronic device can rewrite the dictionary decoding operator and its expressions so that, when data is output to a user or stored as a new file:
if the output format supports dictionary coding, the DataWriter operator is modified to store the in-memory dictionary, together with the string field's indices into that dictionary, directly at the corresponding file position;
if the data is output to the user, or the output format does not support dictionary coding, a DictDecode operator is inserted to decode the in-memory dictionary and its indices at the corresponding positions before the result is returned to the user.
4) Filter operator:
The Filter operator, i.e., the screening operator, finds the records that satisfy a defined filter expression. The electronic device can rewrite the Filter operator and its expressions so that the filter condition is encoded using the in-memory dictionary and converted into a filter on dictionary indices. For example, for the condition y = "A chengshi", if the index of "A chengshi" in the dictionary is found to be 3, the expression in the Filter operator is rewritten as newY = 3.
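The rewrite of an equality filter can be sketched as a single dictionary lookup followed by integer comparisons. The function name `rewrite_equality_filter` and the sample dictionary order are assumptions chosen so that "A chengshi" has index 3, as in the example above.

```python
def rewrite_equality_filter(dictionary, literal):
    """Turn `field == literal` into a predicate over dictionary indices."""
    try:
        target_index = dictionary.index(literal)  # looked up once per query
    except ValueError:
        # The literal is not in the dictionary, so no row can match.
        return lambda index: False
    return lambda index: index == target_index


# Hypothetical dictionary in which "A chengshi" sits at index 3.
dictionary = ["D chengshi", "C chengshi", "B chengshi", "A chengshi"]
encoded_rows = [3, 0, 3, 1]

predicate = rewrite_equality_filter(dictionary, "A chengshi")
matching_rows = [row for row, index in enumerate(encoded_rows) if predicate(index)]
```

The string comparison happens once during the rewrite; every row is then tested with an integer comparison.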
5) Project operator:
The Project operator, i.e., the mapping operator, computes on the fields of a record according to a mapping rule. The electronic device can rewrite the Project operator and its expressions so that the computation is applied entry by entry to the current dictionary, forming a new dictionary. For example, for substr(y,3), when the "A chengshi" dictionary entry is processed, the corresponding entry of the new dictionary is "bei".
Merging identical values in the new dictionary: after the Project operator operates on the dictionary entries, the newly generated entries may contain duplicates. To keep dictionary entries unique, the new dictionary should be deduplicated and the dictionary indices updated to point to the new positions.
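Projection over a dictionary, including the deduplication step, can be sketched as follows. The function name `project_dictionary` is hypothetical, the sample city names are invented for illustration, and the mapping used here (the first three characters of each entry) is only an assumed reading of substr(y,3).

```python
def project_dictionary(dictionary, indices, mapping):
    """Apply `mapping` once per dictionary entry, deduplicate the results,
    and re-point every row index at the deduplicated dictionary."""
    new_dictionary = []
    position = {}   # projected value -> index in new_dictionary
    remap = []      # old dictionary index -> new dictionary index
    for entry in dictionary:
        value = mapping(entry)
        if value not in position:
            position[value] = len(new_dictionary)
            new_dictionary.append(value)
        remap.append(position[value])
    return new_dictionary, [remap[i] for i in indices]


# Hypothetical data: two entries share the same three-character prefix.
dictionary = ["beijing", "beihai", "shanghai"]
indices = [0, 1, 2, 1]
new_dictionary, new_indices = project_dictionary(
    dictionary, indices, lambda s: s[:3]
)
```

The mapping runs once per distinct entry instead of once per row, and the remap table keeps the new dictionary free of duplicates.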
6) The Aggregate operator:
The Aggregate operator, i.e. the aggregation operator, groups data according to a grouping rule and aggregates each group. For example, when computing the total number of male and female employees, male/female is the grouping rule (grouping key) and the total (sum) is the aggregation expression (aggregate function). The electronic device can rewrite the Aggregate operator and its expressions so that, when a dictionary column is the grouping key, the hash table is built, or the sort is performed, on dictionary indexes, which reduces repeated operations on the variable length field during construction.
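A minimal sketch of grouping on dictionary indexes, using the male/female counting example: the function name `count_by_dictionary_key` is hypothetical, and a flat counter array stands in for the hash table, which is an assumption made for brevity.

```python
from typing import Dict, List


def count_by_dictionary_key(dictionary: List[str], indices: List[int]) -> Dict[str, int]:
    """Count records per group using the dictionary index as the grouping key."""
    counts = [0] * len(dictionary)
    for idx in indices:
        counts[idx] += 1            # integer key, no string hashing or comparison
    # Strings are touched only once per group when the result is produced.
    return {dictionary[i]: c for i, c in enumerate(counts) if c > 0}


dictionary = ["male", "female"]
indices = [0, 1, 1, 0, 1]           # encoded grouping column
totals = count_by_dictionary_key(dictionary, indices)
```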
7) The Join operator:
The Join operator, i.e. the connection operator, combines corresponding rows from two tables of different sources into one or more new records according to the join key. The electronic device can rewrite the Join operator and its expressions so that, when the join keys of both tables are dictionary-encoded, the two dictionaries are first merged, the dictionary indexes in the records are updated to point to the merged dictionary, and the subsequent join compares dictionary indexes directly. When a non-join-key column is dictionary-encoded, the dictionary and its indexes can be kept unchanged while the joined records are constructed.
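The merge-then-compare approach for dictionary-encoded join keys can be sketched as follows. The helper names and sample data are hypothetical; the sketch assumes each input dictionary has unique entries, keeps the left table's indexes unchanged, and uses a simple hash join on the integer indexes.

```python
from typing import Dict, List, Tuple


def merge_dictionaries(left: List[str], right: List[str]) -> Tuple[List[str], List[int]]:
    """Merge right into left; return the merged dictionary and a remap for right indexes."""
    merged = list(left)
    position: Dict[str, int] = {v: i for i, v in enumerate(merged)}
    right_remap: List[int] = []
    for value in right:
        if value not in position:
            position[value] = len(merged)
            merged.append(value)
        right_remap.append(position[value])
    return merged, right_remap


def join_on_index(left_idx: List[int], right_idx: List[int]) -> List[Tuple[int, int]]:
    """Hash join on dictionary indexes; returns (left_row, right_row) pairs."""
    buckets: Dict[int, List[int]] = {}
    for row, idx in enumerate(right_idx):
        buckets.setdefault(idx, []).append(row)
    return [(l, r) for l, idx in enumerate(left_idx) for r in buckets.get(idx, [])]


left_dict, left_idx = ["a", "b"], [0, 1, 1]
right_dict, right_idx = ["b", "c"], [0, 1, 0]
merged, remap = merge_dictionaries(left_dict, right_dict)
right_idx_merged = [remap[i] for i in right_idx]   # right records now point into merged
pairs = join_on_index(left_idx, right_idx_merged)
```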
In the dictionary data processing method above, a data record to be processed is obtained and, for at least one variable length field contained in the data record, a target dictionary that was constructed in advance in memory and corresponds to the variable length field is obtained; each variable length field has a plurality of field values. The dictionary code corresponding to each field value is then recorded in the target dictionary as a dictionary entry. If the total number of dictionary entries in the target dictionary reaches a preset number threshold, the occurrence frequency of a target dictionary entry in the target dictionary is determined, where the difference between the occurrence frequency of the target dictionary entry and the occurrence frequencies of the other dictionary entries satisfies a preset condition, and the other dictionary entries are the entries in the target dictionary other than the target dictionary entry. If the occurrence frequency of the target dictionary entry is less than a preset frequency threshold, the target dictionary is removed from memory after the data query operation, generated in response to a data query request for the data record to be processed, has been executed based on the dictionary codes in the target dictionary. This avoids dictionary-encoding every field value of a high-cardinality variable length field whose data distribution shows no obvious skew, prevents an oversized target dictionary from destabilizing the database engine, and effectively improves the running efficiency of the database.
In an exemplary embodiment, recording the dictionary code corresponding to each field value as a dictionary entry into the target dictionary includes: for any field value, matching dictionary entries corresponding to the field value in the target dictionary; if the dictionary item corresponding to the field value is not matched in the target dictionary, carrying out coding processing on the field value to obtain a dictionary code corresponding to the field value; and recording the dictionary code corresponding to the field value as a new dictionary item into the target dictionary.
In a specific implementation, while recording the dictionary codes corresponding to the field values as dictionary entries in the target dictionary, the electronic device may query whether a dictionary entry corresponding to the field value already exists in the target dictionary. If no such entry exists, the target dictionary has not yet recorded a dictionary code for the field value; the electronic device then performs dictionary encoding on the field value to obtain the corresponding dictionary code. Finally, the electronic device records the dictionary code corresponding to the field value as a new dictionary entry in the target dictionary.
According to the technical scheme of the embodiment, when dictionary codes corresponding to all field values are recorded into a target dictionary as dictionary items, by inquiring whether the dictionary items corresponding to the field values exist in the target dictionary or not, only under the condition that the dictionary items corresponding to the field values do not exist in the target dictionary, the electronic equipment carries out dictionary coding processing on the field values to obtain the dictionary codes corresponding to the field values; and recording the dictionary code corresponding to the field value as a new dictionary item into the target dictionary, thereby avoiding repeated dictionary coding of the field value recorded in the target dictionary and improving the operating efficiency of the electronic equipment.
In an exemplary embodiment, each dictionary entry has a corresponding dictionary index, and recording the dictionary code corresponding to each field value as a dictionary entry into the target dictionary includes: if the dictionary item corresponding to the field value exists in the target dictionary, determining the dictionary item corresponding to the field value as an existing dictionary item; and updating the index field corresponding to the field value into the dictionary index corresponding to the existing dictionary entry, and updating the occurrence frequency of the existing dictionary entry.
In a specific implementation, each dictionary entry has a corresponding dictionary index. While recording the dictionary codes corresponding to the field values as dictionary entries in the target dictionary, if the electronic device detects that a dictionary entry corresponding to a field value already exists in the target dictionary, it determines that entry to be an existing dictionary entry. The electronic device then updates the index field corresponding to the field value to the dictionary index of the existing dictionary entry and updates the occurrence frequency of the existing dictionary entry, i.e., the occurrence count (occurrence) of the field value corresponding to that dictionary entry.
According to the technical scheme of the embodiment, if the dictionary entry corresponding to the field value exists in the target dictionary, the dictionary entry corresponding to the field value is determined to be the existing dictionary entry, the index field corresponding to the field value is directly updated to be the dictionary index corresponding to the existing dictionary entry, the occurrence frequency of the existing dictionary entry is updated, repeated dictionary encoding on the field value recorded in the target dictionary is not needed, and invalid occupation of processing resources of electronic equipment is avoided.
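The lookup-then-encode behaviour of the two preceding embodiments can be sketched in Python as follows. The class name `TargetDictionary` and its attributes are hypothetical; using the entry's position as its dictionary index and starting a new entry's occurrence count at 1 are assumptions that the text does not state.

```python
from typing import Dict, List


class TargetDictionary:
    """Illustrative in-memory dictionary for one variable length field."""

    def __init__(self) -> None:
        self.entries: List[str] = []        # dictionary index -> field value
        self.index_of: Dict[str, int] = {}  # field value -> dictionary index
        self.occurrence: List[int] = []     # dictionary index -> occurrence count

    def record(self, value: str) -> int:
        """Return the dictionary index for value, encoding it only if it is new."""
        idx = self.index_of.get(value)
        if idx is None:
            # No matching entry: encode the value and add a new dictionary entry.
            idx = len(self.entries)
            self.entries.append(value)
            self.index_of[value] = idx
            self.occurrence.append(1)       # assumption: a new entry counts once
        else:
            # Existing entry: reuse its index and update the occurrence count.
            self.occurrence[idx] += 1
        return idx


target = TargetDictionary()
index_field = [target.record(v) for v in ["a", "b", "a", "a"]]
```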
In an exemplary embodiment, the method further comprises: if the dictionary item corresponding to the field value does not exist in the target dictionary, recording the field value to a preset target storage position; and if the dictionary entry corresponding to the field value exists in the target dictionary, updating the index field corresponding to the field value into the dictionary index corresponding to the existing dictionary entry.
In a specific implementation, suppose the electronic device detects that the total number of dictionary entries in the target dictionary has reached the number threshold, that is, the dictionary has reached its maximum number of entries. After obtaining the data record to be processed, if the electronic device determines that no dictionary entry corresponding to a field value exists in the target dictionary, then, so as not to enlarge the dictionary any further, it records the field value (value) directly, that is, it records the field value to a preset target storage location.
If the electronic device determines that a dictionary entry corresponding to the field value already exists in the target dictionary, it updates the index field corresponding to the field value to the dictionary index of the existing dictionary entry and updates the occurrence count (occurrence) of the field value corresponding to that dictionary entry.
According to the technical scheme of the embodiment, if the total number of dictionary entries in the target dictionary reaches the number threshold and no dictionary entry corresponding to the field value exists in the target dictionary, the field value is recorded to the preset target storage position, so that the data volume of the target dictionary is prevented from being enlarged without restriction while the data record to be processed is accurately recorded.
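A minimal sketch of encoding against a capped dictionary follows. The function name `encode_with_cap` and the tagged-tuple representation are hypothetical; in particular, modelling the "preset target storage location" as a raw value stored inline next to the indexes is an assumption made only for illustration.

```python
from typing import Dict, List, Tuple, Union

Encoded = Tuple[str, Union[int, str]]   # ("index", i) or ("raw", value)


def encode_with_cap(values: List[str], max_entries: int) -> Tuple[List[str], List[Encoded]]:
    """Dictionary-encode values, storing raw values once the dictionary is full."""
    entries: List[str] = []
    index_of: Dict[str, int] = {}
    encoded: List[Encoded] = []
    for value in values:
        idx = index_of.get(value)
        if idx is not None:
            encoded.append(("index", idx))          # existing entry: reuse its index
        elif len(entries) < max_entries:
            index_of[value] = len(entries)          # room left: add a new entry
            entries.append(value)
            encoded.append(("index", index_of[value]))
        else:
            encoded.append(("raw", value))          # dictionary full: keep the raw value
    return entries, encoded


entries, encoded = encode_with_cap(["a", "b", "c", "a", "d"], max_entries=2)
```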
FIG. 2 is a flow chart illustrating another dictionary data processing method according to an exemplary embodiment, as shown in FIG. 2, including the steps of:
In step S210, a data record to be processed is acquired; for at least one variable length field contained in the data record to be processed, a target dictionary which is constructed in advance in a memory and corresponds to the variable length field is acquired; each of the variable length fields has a plurality of field values.
In step S220, whether a dictionary entry corresponding to the field value already exists in the target dictionary is queried.
In step S221, if there is no dictionary entry corresponding to the field value in the target dictionary, dictionary encoding processing is performed on the field value to obtain a dictionary code corresponding to the field value.
In step S222, the dictionary code corresponding to the field value is recorded as a new dictionary entry into the target dictionary.
In step S223, if there is a dictionary entry corresponding to the field value in the target dictionary, it is determined that the dictionary entry corresponding to the field value is an existing dictionary entry.
In step S224, the index field corresponding to the field value is updated to the dictionary index corresponding to the existing dictionary entry, and the occurrence frequency of the existing dictionary entry is updated.
In step S230, if the total number of dictionary entries in the target dictionary reaches a preset number threshold, the occurrence frequency of each dictionary entry in the target dictionary is obtained.
In step S240, the occurrence frequency of each dictionary entry is sorted to obtain the sorted occurrence frequency.
In step S250, in the sorted occurrence frequencies, the occurrence frequency of the target in the preset quantile is determined.
In step S260, if the occurrence frequency of the target is less than a preset frequency threshold, removing the target dictionary from the memory after performing target data operation based on the dictionary code in the target dictionary; the target data operation is a data query operation generated in response to a data query request for the data record to be processed.
It should be noted that, for the specific limitations of the above steps, reference may be made to the above specific limitations of a dictionary data processing method, which is not described herein again.
In an exemplary embodiment, removing the target dictionary from the memory after performing the target data operation based on the dictionary code in the target dictionary comprises: generating a discard identifier for the target dictionary; and the discarding identifier is used for indicating the database engine to remove the target dictionary from the memory after the target data operation is executed based on the dictionary code in the target dictionary.
In a specific implementation, to remove the target dictionary from the memory after the target data operation has been performed based on the dictionary codes in the target dictionary, the electronic device may generate a discard identifier for the target dictionary; specifically, the electronic device may set the discard-dictionary flag bit of the target dictionary to true in the database engine. When the database engine detects that the discard-dictionary flag bit of the target dictionary is true, this indicates that only part of the records of the variable length field are dictionary-coded. When a subsequent operator uses the field, the electronic device decodes it based on the dictionary index, and removes the dictionary from the memory after all the coded fields have been processed.
According to the technical scheme of the embodiment, the discard identifier for the target dictionary is generated, so that the database engine can be accurately instructed to remove the target dictionary from the memory after the target data operation is executed based on the dictionary code in the target dictionary.
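The role of the discard identifier can be illustrated with a small sketch. The class `DictionaryColumn`, the attribute `discard`, and the function `run_query_then_release` are hypothetical names; a real database engine would track the flag and release memory through its own mechanisms.

```python
from typing import Callable, List, Optional


class DictionaryColumn:
    """Illustrative encoded column carrying a discard flag for its dictionary."""

    def __init__(self, dictionary: List[str], indices: List[int]) -> None:
        self.dictionary: Optional[List[str]] = dictionary
        self.indices = indices
        self.discard = False    # set to True when the dictionary is judged ineffective


def run_query_then_release(column: DictionaryColumn,
                           query: Callable[[DictionaryColumn], list]) -> list:
    """Run the query on the encoded column, then drop the dictionary if it is flagged."""
    result = query(column)
    if column.discard:
        column.dictionary = None    # release the reference so the memory can be reclaimed
    return result


column = DictionaryColumn(["x", "y"], [1, 0, 1])
column.discard = True
result = run_query_then_release(column, lambda c: [c.dictionary[i] for i in c.indices])
```

The dictionary stays available for the whole query and is dropped only afterwards, so operators that already hold dictionary indexes can still decode them.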
In an exemplary embodiment, determining a frequency of occurrence of target dictionary entries in a target dictionary comprises: acquiring the occurrence frequency of each dictionary item in the target dictionary; sequencing the occurrence frequency of each dictionary item to obtain the sequenced occurrence frequency; and determining the occurrence frequency of the target dictionary items according to the ordered distribution condition of the occurrence frequency.
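In an exemplary embodiment, determining the occurrence frequency of the target dictionary entry according to the distribution of the sorted occurrence frequencies comprises: determining, among the sorted occurrence frequencies, the target occurrence frequency located at a preset quantile; and taking the target occurrence frequency as the occurrence frequency of the target dictionary entry.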
In a specific implementation, in the process of determining the occurrence frequency of target dictionary items in a target dictionary, the electronic equipment can acquire the occurrence frequency of each dictionary item in the target dictionary; then, the electronic equipment sorts the occurrence frequency of each dictionary item according to the sequence of the occurrence frequency from large to small to obtain the sorted occurrence frequency. Then, the electronic device can determine the occurrence frequency of the target dictionary item according to the sorted distribution situation of the occurrence frequency. Specifically, the electronic device may determine the occurrence frequency of the target in the preset quantile from the sorted occurrence frequencies; and finally, the electronic equipment takes the occurrence frequency of the target as the occurrence frequency of the target dictionary item.
In practical applications, the preset quantile may be the 90th percentile. For example, the electronic device may take the occurrence frequency located at the 90th percentile of the sorted occurrence frequencies as the target occurrence frequency, that is, determine the occurrence frequency o of the dictionary entry at the 90th percentile of the sorted occurrence frequencies. The difference between the target occurrence frequency and the occurrence frequencies of the other dictionary entries satisfies a preset condition; the other dictionary entries are the dictionary entries in the target dictionary other than the target dictionary entry.
According to the technical scheme, the occurrence frequency of each dictionary item in the target dictionary is obtained, the occurrence frequency of each dictionary item is sequenced to obtain the sequenced occurrence frequency, and the occurrence frequency of the target dictionary item is determined according to the distribution condition of the sequenced occurrence frequency, so that the target dictionary item with the difference between the occurrence frequency and the occurrence frequency of other dictionary items meeting the preset condition can be accurately determined in the occurrence frequency of each dictionary item.
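The quantile-based check can be sketched as follows. The text does not specify a rank convention, so this sketch assumes the nearest-rank definition, in which the 90th percentile is the value that 90% of the occurrence counts do not exceed. The function names, the threshold value 5, and the sample counts are all hypothetical.

```python
from typing import List


def target_occurrence(occurrences: List[int], percent: int = 90) -> int:
    """Return the occurrence count at the given percentile (nearest-rank, an assumption)."""
    if not occurrences:
        raise ValueError("occurrences must not be empty")
    ordered = sorted(occurrences, reverse=True)      # largest count first
    n = len(ordered)
    rank_from_bottom = max(1, (percent * n + 99) // 100)   # integer ceil(percent * n / 100)
    return ordered[n - rank_from_bottom]


def should_discard(occurrences: List[int], frequency_threshold: int,
                   percent: int = 90) -> bool:
    """Flag the dictionary for discard when the target occurrence is below the threshold."""
    return target_occurrence(occurrences, percent) < frequency_threshold


flat = [1, 1, 1, 1, 1, 1, 1, 1, 2, 50]               # almost every value is unique
repeated = [40, 35, 30, 30, 28, 25, 22, 20, 20, 18]  # values repeat often
flat_target = target_occurrence(flat)
discard_flat = should_discard(flat, frequency_threshold=5)
discard_repeated = should_discard(repeated, frequency_threshold=5)
```

Taking a percentile rather than the maximum keeps a single very frequent value, such as the count of 50 above, from masking a dictionary whose remaining entries are almost all unique.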
Fig. 3 is a flowchart illustrating a dictionary data processing method according to another exemplary embodiment, as shown in fig. 3, including the steps of:
In step S310, a data record to be processed is acquired; for at least one variable length field contained in the data record to be processed, a target dictionary which is constructed in advance in a memory and corresponds to the variable length field is acquired; each of the variable length fields has a plurality of field values.
In step S320, the dictionary code corresponding to each field value is recorded as a dictionary entry into the target dictionary.
In step S330, if the total number of dictionary entries in the target dictionary reaches a preset number threshold, the occurrence frequency of each dictionary entry in the target dictionary is obtained.
In step S340, the occurrence frequency of each dictionary entry is sorted to obtain the sorted occurrence frequency.
In step S350, in the sorted occurrence frequencies, the occurrence frequency of the target in the preset quantile is determined.
In step S360, if the occurrence frequency of the target is less than a preset frequency threshold, generating a discarding identifier for the target dictionary; wherein the discarding identifier is used for instructing a database engine to remove the target dictionary from the memory after target data operation is executed based on dictionary codes in the target dictionary; the target data operation is a data query operation generated in response to a data query request for the data record to be processed.
It should be noted that, for the specific limitations of the above steps, reference may be made to the above specific limitations of a dictionary data processing method, which is not described herein again.
It should be understood that although the steps in the flowcharts of fig. 1, 2 and 3 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the order in which the steps are performed is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in fig. 1, 2, and 3 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
It is understood that the same/similar parts between the embodiments of the method described above in this specification can be referred to each other, and each embodiment focuses on the differences from the other embodiments, and it is sufficient that the relevant points are referred to the descriptions of the other method embodiments.
FIG. 4 is a block diagram illustrating a dictionary data processing apparatus according to an exemplary embodiment. Referring to fig. 4, the apparatus includes:
an obtaining unit 410 configured to obtain a data record to be processed and, for at least one variable length field contained in the data record to be processed, obtain a target dictionary which is constructed in advance in a memory and corresponds to the variable length field; each of the variable length fields has a plurality of field values;
a recording unit 420 configured to perform recording of dictionary codes corresponding to the respective field values as dictionary entries into the target dictionary;
the determining unit 430 is configured to determine the occurrence frequency of the target dictionary items in the target dictionary if the total number of the dictionary items in the target dictionary reaches a preset number threshold; the difference between the occurrence frequency of the target dictionary item and the occurrence frequency of other dictionary items meets a preset condition; the other dictionary items are dictionary items in the target dictionary except the target dictionary item;
a removing unit 440, configured to remove the target dictionary from the memory after performing target data operation based on dictionary codes in the target dictionary if the occurrence frequency of the target dictionary entries is less than a preset frequency threshold; the target data operation is a data query operation generated in response to a data query request for the data record to be processed.
In an exemplary embodiment, the recording unit 420 is specifically configured to perform querying whether a dictionary entry corresponding to the field value already exists in the target dictionary; if the dictionary item corresponding to the field value does not exist in the target dictionary, performing dictionary coding processing on the field value to obtain a dictionary code corresponding to the field value; and recording the dictionary code corresponding to the field value as a new dictionary item into the target dictionary.
In an exemplary embodiment, each dictionary entry has a corresponding dictionary index, and the recording unit 420 is specifically configured to determine that the dictionary entry corresponding to the field value is an existing dictionary entry if the dictionary entry corresponding to the field value exists in the target dictionary; and updating the index field corresponding to the field value to the dictionary index corresponding to the existing dictionary item, and updating the occurrence frequency of the existing dictionary item.
In an exemplary embodiment, if the total number of dictionary entries in the target dictionary reaches the number threshold, the dictionary data processing apparatus further includes: a value recording unit, configured to perform recording the field value to a preset target storage location if a dictionary entry corresponding to the field value does not exist in the target dictionary; and the updating unit is specifically configured to update the index field corresponding to the field value to the dictionary index corresponding to the existing dictionary entry if the dictionary entry corresponding to the field value exists in the target dictionary.
In an exemplary embodiment, the removing unit 440 is specifically configured to generate a discard identifier for the target dictionary; wherein the discard identifier is used for instructing a database engine to remove the target dictionary from the memory after the target data operation is executed based on the dictionary codes in the target dictionary.
In an exemplary embodiment, the determining unit 430 is specifically configured to perform obtaining of occurrence frequency of each dictionary item in the target dictionary; sequencing the occurrence frequency of each dictionary item to obtain the sequenced occurrence frequency; and determining the occurrence frequency of the target dictionary items according to the ordered distribution condition of the occurrence frequency.
In an exemplary embodiment, the determining unit 430 is specifically configured to determine, from the sorted occurrence frequencies, a target occurrence frequency in a preset quantile; and taking the occurrence frequency of the target as the occurrence frequency of the target dictionary item.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
FIG. 5 is a block diagram illustrating an electronic device 500 for performing a dictionary data processing method in accordance with one illustrative embodiment. For example, the electronic device 500 may be a server. Referring to fig. 5, electronic device 500 includes a processing component 520 that further includes one or more processors and memory resources, represented by memory 522, for storing instructions, such as applications, that are executable by processing component 520. The application programs stored in memory 522 may include one or more modules that each correspond to a set of instructions. Further, the processing component 520 is configured to execute instructions to perform the above-described method.
The electronic device 500 may further include: a power component 524 configured to perform power management for the electronic device 500, a wired or wireless network interface 526 configured to connect the electronic device 500 to a network, and an input/output (I/O) interface 528. The electronic device 500 may operate based on an operating system stored in the memory 522, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, or the like.
In an exemplary embodiment, a computer-readable storage medium comprising instructions, such as the memory 522 comprising instructions, executable by the processor of the electronic device 500 to perform the above-described method is also provided. The storage medium may be a computer-readable storage medium, which may be, for example, a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product is also provided, which includes instructions executable by a processor of the electronic device 500 to perform the above-described method.
It should be noted that the descriptions of the above-mentioned apparatus, the electronic device, the computer-readable storage medium, the computer program product, and the like according to the method embodiments may also include other embodiments, and specific implementations may refer to the descriptions of the related method embodiments, which are not described in detail herein.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A dictionary data processing method, comprising:
acquiring a data record to be processed; for at least one variable length field contained in the data record to be processed, acquiring a target dictionary corresponding to the variable length field; the variable length field has at least one field value;
for any variable length field, recording dictionary codes corresponding to all field values of the variable length field into a target dictionary corresponding to the variable length field;
if the total number of dictionary items in the target dictionary reaches a preset number threshold value and the frequency distribution of each dictionary item in the target dictionary does not meet the frequency distribution condition, removing the target dictionary after target data operation is executed; the target data operation is a data query operation generated in response to a data query request for the data record to be processed, and the frequency distribution condition is determined according to frequency distribution of dictionary items in the effective dictionary.
2. The method for processing dictionary data according to claim 1, wherein the recording of dictionary codes corresponding to each field value of any one of the variable length fields into a target dictionary corresponding to the any one of the variable length fields includes:
for any of the field values, matching dictionary entries in the target dictionary corresponding to the field value;
if the dictionary item corresponding to the field value is not matched in the target dictionary, carrying out coding processing on the field value to obtain a dictionary code corresponding to the field value;
and recording the dictionary code corresponding to the field value as a new dictionary item into the target dictionary.
3. The method for processing dictionary data according to claim 2, wherein each dictionary entry has a corresponding dictionary index, and the recording of dictionary codes corresponding to field values of any variable length field into a target dictionary corresponding to any variable length field comprises:
if the dictionary item corresponding to the field value is matched in the target dictionary, determining the dictionary index of the dictionary item in the target dictionary;
and establishing a mapping relation between the field value and the dictionary index, and updating the occurrence frequency of the dictionary item.
4. The dictionary data processing method according to claim 1, further comprising:
acquiring the occurrence frequency of each dictionary item in the target dictionary;
determining the occurrence frequency of a target in the occurrence frequency of each dictionary item; the difference between the target occurrence frequency and other occurrence frequencies meets a preset condition; the other occurrence frequencies are the occurrence frequencies of all dictionary items except the target occurrence frequency;
and when the occurrence frequency of the target is smaller than a preset frequency threshold value, judging that the frequency distribution of each dictionary item in the target dictionary does not meet the frequency distribution condition.
5. The dictionary data processing method according to claim 4, wherein the determining of the target occurrence frequency among the occurrence frequencies of the dictionary entries comprises:
sequencing the occurrence frequency of each dictionary item to obtain the sequenced occurrence frequency;
and determining the occurrence frequency of the target in the occurrence frequency of each dictionary item according to the ordered frequency distribution condition of the occurrence frequency.
6. The dictionary data processing method of claim 1, wherein the removing the target dictionary comprises:
generating a discard identification for the target dictionary;
wherein the discard identification is used for instructing a database engine to remove the target dictionary from a preset memory after the target data operation is executed.
7. A dictionary data processing apparatus, comprising:
an acquisition unit configured to acquire a data record to be processed and, for at least one variable length field contained in the data record to be processed, acquire a target dictionary corresponding to the variable length field; each of the variable length fields has at least one field value;
the recording unit is configured to record dictionary codes corresponding to field values of any variable length field into a target dictionary corresponding to the variable length field;
the removing unit is configured to remove the target dictionary after target data operation is executed if the total number of dictionary items in the target dictionary reaches a preset number threshold and the frequency distribution of each dictionary item in the target dictionary does not meet a frequency distribution condition; the target data operation is a data query operation generated in response to a data query request for the data record to be processed, and the frequency distribution condition is determined according to frequency distribution of dictionary items in the effective dictionary.
8. A server, comprising:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to execute the instructions to implement the dictionary data processing method of any one of claims 1 to 6.
9. A computer-readable storage medium, wherein instructions in the computer-readable storage medium, when executed by a processor of a server, enable the server to perform the dictionary data processing method of any one of claims 1 to 6.
10. A computer program product comprising instructions, wherein the instructions, when executed by a processor of a server, enable the server to perform the dictionary data processing method of any one of claims 1 to 6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111417139.2A CN114036904B (en) 2021-11-25 2021-11-25 Dictionary data processing method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114036904A (en) 2022-02-11
CN114036904B (en) 2024-07-12

Family

ID=80138974

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111417139.2A Active CN114036904B (en) 2021-11-25 2021-11-25 Dictionary data processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114036904B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112069773A (en) * 2020-07-23 2020-12-11 北京三快在线科技有限公司 Data processing system, method, apparatus, electronic device, and computer-readable medium
CN112527970A (en) * 2020-12-24 2021-03-19 上海浦东发展银行股份有限公司 Data dictionary standardization processing method, device, equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115982206A (en) * 2023-02-09 2023-04-18 中国证券登记结算有限责任公司 Method and device for processing data
CN115982206B (en) * 2023-02-09 2023-08-29 中国证券登记结算有限责任公司 Method and device for processing data

Also Published As

Publication number Publication date
CN114036904B (en) 2024-07-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant