CN117271533A - Construction method and device of large data linked list and terminal equipment - Google Patents

Info

Publication number
CN117271533A
Authority
CN
China
Prior art keywords
suffix
type
barrel
block
blocks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311564163.8A
Other languages
Chinese (zh)
Other versions
CN117271533B (en)
Inventor
韩凌波
李晓玉
冯天心
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Ocean University
Original Assignee
Guangdong Ocean University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Ocean University filed Critical Guangdong Ocean University
Priority to CN202311564163.8A priority Critical patent/CN117271533B/en
Publication of CN117271533A publication Critical patent/CN117271533A/en
Application granted granted Critical
Publication of CN117271533B publication Critical patent/CN117271533B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3068 Precoding preceding compression, e.g. Burrows-Wheeler transformation
    • H03M7/3077 Sorting
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3084 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
    • H03M7/3086 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method employing a sliding window, e.g. LZ77
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a method, a device, and terminal equipment for constructing a large-scale data linked list, relating to the field of computer technology. The method comprises: dividing the character string and its suffixes into blocks according to the size of the device memory; pre-computing, by induced sorting in memory, the successor characters of the L-type and S-type substrings of each character block; using priority queues and suffix bucket blocks to simulate the in-memory induced-sorting process so as to sort the S-type substrings recursively; inducing the S-type suffixes from the sorted S-type substrings, inducing the L-type suffixes from the sorted S-type suffixes, inducing the S-type suffixes from the sorted L-type suffixes, and sorting all suffixes into their corresponding suffix bucket blocks by the same sorting method; and, at recursion level 0 of the program, merging the suffix bucket blocks by multi-way merge sort while computing the pointer field of every suffix, writing the pointer fields to the corresponding suffix linked-list bucket blocks according to suffix position, sorting every suffix linked-list bucket block by position, and merging all suffix linked-list bucket blocks to generate the final linked list.

Description

Construction method and device of large data linked list and terminal equipment
Technical Field
The present invention relates to the field of computer technology, and in particular to a method and a device for constructing a large-scale data linked list, and to terminal equipment.
Background
The suffix linked list is derived from the suffix array. It is a full-text index data structure applied mainly in the field of data compression. A suffix linked list is an integer array of length n+1: the first element is the list head, which points to the smallest suffix of the character string, and each remaining element points to the position of the next suffix that is lexicographically larger than the current one, so the whole suffix linked list can be traversed in ascending order starting from the list head. The construction efficiency of a large-scale data suffix linked list directly determines the computational efficiency of the compression process and is the key to the whole computation.
Existing linked-list construction techniques mainly target small data: the whole computation is completed in memory, so they do not apply to large-scale data that exceeds the computer's memory. In theory, if the suffix array of the large data is known, the suffixes can be sorted and linked by scanning the suffix array sequentially and using the suffix position as the sort key, which produces the final large-scale data linked list. The drawback of this approach is that it requires additional time to compute the suffix array, which is especially costly for large data.
Disclosure of Invention
In view of the problems in the prior art, the invention provides a method, a device, and terminal equipment for constructing a large-scale data linked list.
To achieve the above object, the invention provides the following technical solutions. In one aspect, the invention provides a method for constructing a large-scale data suffix linked list, comprising the following steps:
according to the size of the device memory and of the character string X, logically dividing X into a plurality of blocks and dividing the suffixes into different suffix bucket blocks, such that induced sorting of each block can be completed in memory; a suffix bucket block consists of consecutive suffix buckets and is a partitioned data block stored on external memory; a suffix bucket consists of the suffixes that share the same first character;
reading each block of X into memory sequentially from right to left, performing induced sorting on the S-type substrings, storing the successor characters of the L-type and S-type substrings in access order, and writing them to the corresponding suffix bucket blocks on external memory;
using the first character of each substring and the access order as the sort key, sorting the L-type substrings in ascending order, suffix bucket block by suffix bucket block, with a minimum priority queue, and storing the resulting L-type substring order in the corresponding suffix bucket block on external memory; sorting the S-type substrings in descending order, suffix bucket block by suffix bucket block, with a maximum priority queue, and storing the resulting S-type substring order in the corresponding suffix bucket block on external memory;
sequentially traversing the suffix bucket blocks of the S-type substring sequence, naming the S-type substrings, and generating a new reduced character string X1; if the characters of X1 are not unique, recursively executing the above steps;
reading each block of X into memory sequentially from right to left and, following the access rule of induced sorting, computing the successor characters of the L-type and S-type suffixes from the sorted S-type suffixes and storing them in order in the corresponding suffix bucket blocks on external memory;
if the current recursion level is not level 0, using the first character of each suffix and the access order as the sort key, performing induced sorting of the L-type suffixes with a minimum priority queue and of the S-type suffixes with a maximum priority queue, suffix bucket block by suffix bucket block, and generating the suffix array of the current level;
and if the current recursion level is level 0, using the first character of each suffix and the access order as the sort key, sorting the L-type suffixes with a minimum priority queue and the S-type suffixes with a maximum priority queue into the suffix bucket blocks to which they belong; then using multi-way merge sort to fetch the suffixes from the suffix bucket blocks in lexicographic order, and computing each suffix's pointer field and the suffix linked-list bucket block to which it belongs from the position information of adjacent suffixes; and then sorting each suffix linked-list bucket block by position and merging the suffix linked-list bucket blocks to generate the final linked list.
A suffix linked-list bucket block holds the suffixes that occupy a contiguous range of positions in the suffix linked list. Each suffix is represented by a pair: the first item is the suffix position and the second item is the suffix's pointer field, which points to the position of the next suffix larger than the current one. Because suffixes are not written into a suffix linked-list bucket block in order of position, each suffix linked-list bucket block must be sorted by suffix position before the blocks are merged into the final linked list.
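The last step can be sketched in memory as follows. This is only an illustration under stated assumptions: the 12-character sample string, the function name `link_bucket_blocks`, and the fixed `block_len` are hypothetical, the naive sort stands in for the multi-way merge that delivers the suffixes in lexicographic order, and Python dictionaries stand in for the external-memory blocks.

```python
def link_bucket_blocks(sorted_positions, block_len):
    """Build the linked-list array from 1-based suffix positions given in
    ascending lexicographic order (assumed to cover every position once)."""
    blocks = {}
    for rank, pos in enumerate(sorted_positions):
        is_last = rank + 1 == len(sorted_positions)
        # Pointer field: position of the next larger suffix, 0 = list head.
        pointer = 0 if is_last else sorted_positions[rank + 1]
        # Route the (position, pointer) pair to its block by position range.
        block_id = (pos - 1) // block_len
        blocks.setdefault(block_id, []).append((pos, pointer))

    psi = [sorted_positions[0]]            # element 0 is the list head
    for block_id in sorted(blocks):        # merge the blocks in position order
        for _pos, pointer in sorted(blocks[block_id]):  # sort block by position
            psi.append(pointer)
    return psi


X = "bbcbbabbcbba"                         # hypothetical sample input
order = sorted(range(1, len(X) + 1), key=lambda i: X[i - 1:])
psi = link_bucket_blocks(order, block_len=4)
print(psi)
```

Pairs arrive in lexicographic order, so each block fills up out of position order. That is why every block is sorted by position before the blocks are concatenated.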
In another aspect, the invention also provides a device for constructing a large-scale data linked list, comprising:
a character string preprocessing module, which reads the character string from external memory, computes the character types, and divides the character string into a plurality of blocks;
a substring successor character calculation module, which computes the successor character sequence of the substrings of each block of X and stores it in the corresponding suffix bucket block on external memory;
a substring sorting module, which computes the L-type substring order from the lexicographic order of the S-type characters of each block of X, and then computes the S-type substring order;
a substring naming module, which scans the sorted S-type substrings, names them, and generates a new reduced character string X1;
a determiner A, which determines whether the characters of the new character string X1 are unique;
a suffix successor character calculation module, which computes the successor characters of the suffixes of each block of X and stores them in the corresponding suffix bucket block on external memory;
a determiner B, which determines whether the current recursion level is level 0;
a suffix sorting module, which computes the L-type suffix order from the lexicographic order of the S-type suffixes of each block of X1, then computes the S-type suffix order, and finally merges the two types of suffixes;
a suffix bucket block calculation module, which computes each L-type suffix bucket block from the lexicographic order of the S-type suffixes of each block of X, and then computes the S-type suffix bucket blocks;
a link information calculation module, which uses multi-way merge sort to take the suffixes out of the suffix bucket blocks in ascending order, computes each suffix's pointer field from the information of its adjacent suffix, and writes the suffix to the corresponding suffix linked-list bucket block according to its position;
and a suffix linked-list bucket block merging module, which sorts every suffix linked-list bucket block by suffix position and finally merges all blocks to generate the final linked list.
In yet another aspect, the invention further provides terminal equipment comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method for constructing a large-scale data linked list according to the first aspect.
Compared with the prior art, the invention has the beneficial effects that:
The terminal equipment provided by the invention first sorts the suffixes recursively by induced sorting; when the program is at recursion level 0, it computes, for each suffix, the suffix linked-list bucket block to which the suffix belongs and the suffix's pointer field; it then sorts each suffix linked-list bucket block by suffix position and finally merges the suffix linked-list bucket blocks in order to generate the final linked list. In the application scenario of computing the LZ77 factors of large data, the large-scale data linked list construction method provided by this embodiment builds the suffix linked list directly from the original character string, so no suffix array needs to be computed and the computational efficiency of the compression process is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the following will briefly introduce the drawings that are required to be used in the embodiments or the description of the prior art. It is apparent that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.
Fig. 1 is a schematic diagram of a suffix array and a suffix linked list storage structure of a character string X according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a suffix bucket structure of a character string X according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a suffix linked-list bucket of a character string X according to an embodiment of the present application;
Fig. 4 is a schematic flow chart of the steps of a method for constructing a large-scale data linked list according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a device for constructing a large-scale data linked list according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of terminal equipment according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system configurations, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
The technical scheme of the invention is further described below with reference to the accompanying drawings and examples.
First, technical terms that may be used in the embodiments of the present application are described herein in a unified manner.
Character string X: an array of n characters, X[1, n] = X[1]X[2]...X[n], where X[n] = $ is the unique and lexicographically smallest character.
Character type: L-type or S-type. In the character string X, a character is L-type if it is lexicographically larger than its right neighbor, or if it equals its right neighbor and the right neighbor is L-type; otherwise the character is S-type. If the left neighbor of an S-type character is L-type, the character is S*-type; if the left neighbor of an L-type character is S-type, the character is L*-type. In particular, the last character X[n] is designated S-type.
Substring: a string that starts at some character of X and ends at the nearest S*-type character to its right (an S-type character whose left neighbor is L-type); the substring beginning with X[i] is denoted sub(X, i).
Suffix: a character string that runs from some character of X to the end of X; the suffix beginning with the character X[i] is denoted suf(X, i).
Substring/suffix type: the type of a substring or suffix is the type of its first character.
Successor character: if it exists, the successor character of the suffix suf(X, i) and of the substring sub(X, i) is X[i-1].
Suffix array SA: the integer array that stores the positions of the suffixes of X in ascending lexicographic order; it is denoted SA(X).
Suffix linked list ψ: an integer array of length n+1. The first element ψ[0] is the list head and points to the lexicographically smallest suffix; the i-th element ψ[i] points to the right neighbor of the suffix suf(X, i) in SA(X) or, if there is none, to the list head.
Suffix bucket: the suffixes in the suffix array that begin with the same character form a suffix bucket, and several adjacent suffix buckets form a suffix bucket block. Sizing rule for a suffix bucket block: the suffixes belonging to one suffix bucket block must be sortable by induced sorting in memory.
Suffix linked-list bucket block: the suffixes in the suffix linked list that begin with the same character are linked together to form a suffix linked-list bucket, and several adjacent suffix linked-list buckets form a suffix linked-list bucket block. The size of a suffix linked-list bucket block is determined by the same rule as the size of a suffix bucket block.
Taking the character string X[1, 12] = "bbcbbabbcbba" as an example, the terms above are explained as follows.
Fig. 1 shows the storage structure of the suffix array and the suffix linked list of the character string X.
Referring to Fig. 1, the character types of the character string X are shown. For example, the character X[05] = b is lexicographically larger than X[06] = a, so X[05] is L-type; X[04] = b equals X[05] = b and X[05] is L-type, so X[04] is also L-type; X[06] is S-type and its left neighbor X[05] is L-type, so X[06] is S*-type. The S*-type substrings contained in X are X[06, 12] and X[12], because X[06] and X[12] are both S*-type; in particular, X[12] is defined to be an S*-type substring consisting of a single character.
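The classification rule can be checked with a short sketch. It assumes X is the 12-character string bbcbbabbcbba, which is the reading consistent with the positions quoted above (X[05] = b, X[06] = a). The function names are illustrative only.

```python
def classify(text):
    """Return the L/S type of every character, scanning right to left."""
    n = len(text)
    types = ["S"] * n                      # the last character is S-type by convention
    for i in range(n - 2, -1, -1):
        larger = text[i] > text[i + 1]
        equal_and_l = text[i] == text[i + 1] and types[i + 1] == "L"
        if larger or equal_and_l:
            types[i] = "L"
    return types


def s_star_positions(types):
    """1-based positions of S-type characters whose left neighbor is L-type."""
    return [i + 1 for i in range(1, len(types))
            if types[i] == "S" and types[i - 1] == "L"]


X = "bbcbbabbcbba"
types = classify(X)
print("".join(types))            # SSLLLSSSLLLS
print(s_star_positions(types))   # [6, 12]
```

A single right-to-left pass is enough because each character's type depends only on its right neighbor's character and type.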
Referring to Fig. 1, the storage structure of the suffix array of the character string X is shown. The suffix array SA(X) is an integer array of length n that stores the suffix positions in ascending order, so a suffix toward the left of the array is smaller than one toward the right: if i < j, then suf(X, SA[i]) is lexicographically smaller than suf(X, SA[j]).
Referring to Fig. 1, the storage structure of the suffix linked list of the character string X is shown. ψ[0] is the list head and points to the smallest suffix suf(X, 12); ψ[12] points to suf(X, 06), the right neighbor of suf(X, 12) in SA(X); and so on, until the largest suffix suf(X, 03) points back to the list head ψ[0]. "Right neighbor" refers to position in SA: the suffix suf(X, 12) is at position 1 of SA(X), that is, SA[1] = 12, and its right neighbor is SA[2] = 06, which corresponds to the suffix suf(X, 06).
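As a small in-memory illustration of the two structures, the sketch below builds SA(X) naively, derives ψ from it, and then walks ψ from the list head. It assumes X is the 12-character string bbcbbabbcbba (consistent with SA[1] = 12 and SA[2] = 06 above). The naive sort is for illustration only and is not the construction method of this application.

```python
X = "bbcbbabbcbba"
n = len(X)

# Naive suffix array with 1-based positions (fine for a tiny example only).
SA = sorted(range(1, n + 1), key=lambda i: X[i - 1:])

# psi[0] is the list head; psi[i] is the right neighbor of suf(X, i) in SA,
# or 0 (the list head) when suf(X, i) is the largest suffix.
psi = [0] * (n + 1)
psi[0] = SA[0]
for rank in range(n):
    psi[SA[rank]] = SA[rank + 1] if rank + 1 < n else 0

# Walking the list from the head visits the suffixes in ascending order.
visited = []
p = psi[0]
while p != 0:
    visited.append(p)
    p = psi[p]

print(SA)    # [12, 6, 11, 5, 10, 4, 7, 1, 8, 2, 9, 3]
print(psi)   # [12, 8, 9, 0, 7, 10, 11, 1, 2, 3, 4, 5, 6]
```

The traversal reproduces SA exactly, which is why the linked list can replace the suffix array wherever only the next larger suffix is needed.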
Fig. 2 is a schematic diagram of the suffix buckets of the character string X according to an embodiment of the present application. The suffix bucket of character a spans SA[01] to SA[02]; the suffix bucket of character b spans SA[03] to SA[10], where SA[03] to SA[06] on the left form the L-type suffix bucket and SA[07] to SA[10] on the right form the S-type suffix bucket; the suffix bucket of character c spans SA[11] to SA[12]. Adjacent suffix buckets can form a suffix bucket block; for example, the suffix buckets of characters a and b may form one suffix bucket block spanning SA[01] to SA[10], and the suffix buckets of characters b and c may form one suffix bucket block spanning SA[03] to SA[12].
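The bucket boundaries quoted above follow from character counts alone. The sketch below assumes X is the 12-character string bbcbbabbcbba. The function names are illustrative, and an empty range is returned as a pair whose start exceeds its end.

```python
from collections import Counter


def classify(text):
    """L/S type of every character, scanning right to left."""
    types = ["S"] * len(text)
    for i in range(len(text) - 2, -1, -1):
        if text[i] > text[i + 1] or (text[i] == text[i + 1] and types[i + 1] == "L"):
            types[i] = "L"
    return types


def bucket_ranges(text):
    """Map each character to its 1-based (first, last) range in SA."""
    counts = Counter(text)
    ranges, start = {}, 1
    for ch in sorted(counts):
        ranges[ch] = (start, start + counts[ch] - 1)
        start += counts[ch]
    return ranges


def split_by_type(text, ranges):
    """Split every bucket into its L-type part (left) and S-type part (right)."""
    l_counts = Counter(ch for ch, t in zip(text, classify(text)) if t == "L")
    return {ch: ((lo, lo + l_counts[ch] - 1), (lo + l_counts[ch], hi))
            for ch, (lo, hi) in ranges.items()}


X = "bbcbbabbcbba"
ranges = bucket_ranges(X)
print(ranges)                          # {'a': (1, 2), 'b': (3, 10), 'c': (11, 12)}
print(split_by_type(X, ranges)["b"])   # ((3, 6), (7, 10))
```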
Fig. 3 is a schematic structural diagram of the suffix linked-list buckets of the character string X according to an embodiment of the present application. The suffix linked list ψ(X) is formed by linking, in order, the list head, the S-type suffix linked-list bucket of character a, the L-type suffix linked-list bucket of character b, the S-type suffix linked-list bucket of character b, and the L-type suffix linked-list bucket of character c. The suffix linked-list buckets of characters a and b can form one suffix linked-list bucket block, and those of characters b and c can form another. The arrows represent the linking order, not the actual storage locations.
The design idea of this embodiment is to divide the input characters and their suffixes into blocks, run induced sorting on each block in memory to pre-compute the successor character sequence of the substrings or suffixes in the block, then use priority queues and suffix bucket blocks to simulate the induced-sorting process of the in-memory algorithm and sort the substrings or suffixes, and finally, at recursion level 0, generate the final suffix linked list from the suffix positions and their lexicographic order.
The specific computation in this embodiment is as follows. First, the S-type substrings are sorted recursively; the order of the S-type suffixes is induced from the sorted S-type substrings, the order of the L-type suffixes is induced from the sorted S-type suffixes, the order of the S-type suffixes is induced from the sorted L-type suffixes, and all suffixes are sorted into their corresponding suffix bucket blocks. Second, at recursion level 0 of the program, the suffixes in each suffix bucket block are sorted and their pointer fields are computed at the same time. Finally, using multi-way merge sort, the suffixes of each suffix bucket block are written to the corresponding suffix linked-list bucket blocks according to position, each suffix linked-list bucket block is sorted by position, and the suffix linked-list bucket blocks are merged in order to generate the final linked list.
The technical scheme of the present application is described below by specific examples.
Referring to Fig. 4, which shows the flow of steps of a method for constructing a large-scale data linked list according to an embodiment of the present application, the method may specifically include the following steps:
S101, using the character string preprocessing module, read the character string X from external memory in reverse order, compute each character's type in turn by comparing adjacent characters, logically divide X into several character blocks according to the memory size M using S-type characters as separators, and divide the suffixes into several suffix bucket blocks.
In this embodiment, the size of each logical block of X must fit within the memory needed to perform induced sorting of that block in memory, unless the block consists of a single S-type substring. While X is being read, whenever an S-type character is encountered, the length of the current block of X and the length of the current S-type substring determine whether that character starts the current block or ends the next block.
In this embodiment, the suffix array is divided into suffix bucket blocks as follows: while X is being read, an integer array of length 256 counts the size of each suffix bucket; then, according to the order and size of the suffix buckets and the size of memory, the suffix array is divided into several suffix bucket blocks, each as large as possible while still allowing the induced-sorting process for that block to be completed in memory. Sorting the suffixes in suffix bucket blocks serves mainly to simulate the in-memory induced-sorting process and to reduce the number of external-memory reads and writes and the I/O volume.
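A minimal sketch of this grouping is given below. It assumes a byte alphabet and a simple greedy rule in which adjacent buckets are packed together until the next bucket would exceed `capacity`. The name `capacity` (the number of suffixes assumed to fit in memory) and the greedy rule itself are assumptions for illustration; the text does not spell out the exact formula.

```python
def bucket_blocks(text: bytes, capacity: int):
    """Count bucket sizes with a 256-entry array and greedily group
    adjacent buckets into blocks of at most `capacity` suffixes."""
    counts = [0] * 256
    for byte in text:                  # one counter per possible first character
        counts[byte] += 1

    blocks, current, size = [], [], 0
    for ch in range(256):              # buckets in lexicographic order
        if counts[ch] == 0:
            continue
        if current and size + counts[ch] > capacity:
            blocks.append(current)     # close the block before it overflows
            current, size = [], 0
        current.append(ch)
        size += counts[ch]
    if current:
        blocks.append(current)
    return counts, blocks


counts, blocks = bucket_blocks(b"bbcbbabbcbba", capacity=10)
print([[chr(c) for c in block] for block in blocks])   # [['a', 'b'], ['c']]
```

In this sketch a single bucket larger than `capacity` still becomes a block of its own. A real implementation would need a separate strategy for that case.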
S102, read each block of X into memory in turn and induce the order of the L-type and S-type substrings from the order of the S-type characters in the block; during induction, write the successor characters of the L-type and S-type substrings, in the order they are read, to the corresponding suffix bucket blocks on external memory; after induction, scan the S-type substrings in sorted order and write them out in sequence to the corresponding suffix bucket blocks on external memory.
S103, using a minimum priority queue PQ1 with the substring's first character, sequence number, and position as the sort key, sort the L-type substring tuples in ascending order, suffix bucket block by suffix bucket block, and store the resulting L-type substring order on external memory. The specific computation is as follows:
(1) Initialize PQ1 to empty and set the substring sequence number idx = 0. Read the current suffix bucket block into the in-memory array A, where each substring is a triple <chr, idx, pos> giving the substring's first character, sequence number, and position; sort the elements of array A in ascending order, so that all L-type substrings in array A end up in ascending order.
(2) Traverse the suffix buckets of the L-type suffix bucket block in ascending order; within one suffix bucket, first traverse the S-type characters in the blocks of X, then array A and PQ1. For each traversed substring tuple e = <chr, idx, pos>, perform the following operations: compute from e.pos the number of the suffix bucket block to which the substring e belongs, and read e's successor character chr0 from the external-memory space corresponding to that number; if the successor is L-type, compute from chr0 the suffix bucket block to which the successor belongs; if that is the current suffix bucket block, push the successor tuple e0 = <chr0, idx+1, pos-1> into PQ1; otherwise, write e0 out to the suffix bucket block corresponding to the successor.
S104, using a maximum priority queue PQ2 with the substring's first character, sequence number, and position as the sort key, sort the S-type substrings in descending order, suffix bucket block by suffix bucket block, and store them on external memory. The specific computation is as follows:
(1) Initialize PQ2 to empty and set the substring sequence number idx = n. Read the current suffix bucket block into the in-memory array A, where each substring is a triple <chr, idx, pos> giving the substring's first character, sequence number, and position; sort the elements of array A in descending order, so that all S-type substrings in array A end up in descending order.
(2) Traverse the suffix buckets of the S-type suffix bucket block in descending order; within one suffix bucket, first traverse the L-type suffix bucket block, then array A and PQ2. For each traversed substring e = <chr, idx, pos>, perform the following operations: compute from e.pos the number of the suffix bucket block to which the substring e belongs, and read e's successor character chr0 from the external-memory space corresponding to that number; if the successor is S-type, compute from chr0 the suffix bucket block to which the successor belongs; if that is the current bucket block, push the successor tuple e0 = <chr0, idx-1, pos-1> into PQ2; otherwise, write e0 out to the corresponding suffix bucket block. If e0 is an S*-type substring (an S-type substring whose left neighbor is L-type), write it to the external-memory S-type substring sequence SStar. The SStar sequence stores the S-type substrings of the character string X in descending order.
S105, read the S-type substring sequence SStar from external memory in reverse order, compare adjacent substrings for equality, name the S-type substrings, and generate a new reduced character string X1 from the substring positions and names.
In this embodiment, two adjacent S-type substrings are compared as follows: first compute, from the positions of the two substrings, the block of X to which each belongs; then read the original substrings from the S-type substring external-memory space of those blocks and compare them character by character.
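The naming step can be sketched in memory as follows. This is an illustration under assumptions: the substrings are taken to start at the characters at positions 6 and 12 of the sample string bbcbbabbcbba, each substring runs to the next such starting character, Python's built-in sort stands in for the induced sorting of S103 and S104, and the function name is hypothetical.

```python
def name_substrings(text, star_positions):
    """Name the substrings that start at `star_positions` (1-based, ascending)
    and build the reduced string X1 in position order."""
    # Each substring ends at the next start position; the last one is a
    # single character.
    ends = star_positions[1:] + [star_positions[-1]]
    subs = {p: text[p - 1:e] for p, e in zip(star_positions, ends)}

    # Stand-in for the sorted substring sequence read back from external memory.
    ordered = sorted(star_positions, key=lambda p: subs[p])

    names, name, previous = {}, -1, None
    for p in ordered:
        if subs[p] != previous:        # compare adjacent substrings for equality
            name += 1                  # a new, larger name for a new substring
            previous = subs[p]
        names[p] = name

    x1 = [names[p] for p in star_positions]
    all_unique = name + 1 == len(star_positions)
    return x1, all_unique


x1, all_unique = name_substrings("bbcbbabbcbba", [6, 12])
print(x1, all_unique)   # [1, 0] True
```

When `all_unique` is true, every character of X1 is distinct, so the order of X1's suffixes can be read off directly and no further recursion is needed.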
S106, recursively executing the steps from S101 to S105 if the character in the newly generated contracted character string X1 is not unique.
S107, deducing the sequence of S type suffixes in each X partition according to the word sequence of X1 characters or SA (X1); and sequentially reading the X blocks into the memory, sequentially deducing the sequence of the L type and the S type suffixes according to the sequence of the S type suffixes in the blocks, and sequentially writing the read L type and S type successor character sequences into corresponding suffix barrel blocks of the external memory in the deducing process.
S108, judging whether the current layer is a recursion 0 layer.
S109, using the minimum priority queue PQ1 with the first character of each suffix as the sorting key, sorting the L-type suffixes in ascending order, one suffix bucket block at a time, and storing the generated L-type suffix sequence in the corresponding suffix bucket block of the external memory; using the maximum priority queue PQ2 with the first character of each suffix as the sorting key, sorting the S-type suffixes in descending order, one suffix bucket block at a time, and storing the generated S-type suffix sequence in the corresponding suffix bucket block of the external memory; and finally merging the L-type and S-type suffix bucket blocks to generate SA(X1).
In this embodiment of the present application, the ascending sort procedure of the L-type suffix is:
(1) Initializing PQ1 to empty and reading the current L-type suffix bucket block into the memory array A, where each suffix is a 2-tuple <chr, pos> representing the first character and the position of the suffix; performing a stable ascending sort on the elements of array A, so that all L-type suffixes in array A are arranged in ascending order;
(2) Traversing each suffix bucket in the L-type and S-type suffix bucket blocks in ascending order; for the same suffix bucket, traversing the S-type suffix bucket first and then array A and PQ1. For each traversed suffix e = <chr, pos>, the following operations are performed: calculating from e.pos the number of the suffix bucket block to which e belongs, and reading the successor character chr0 of e from the external memory space corresponding to that number; if the successor is of L type, calculating from chr0 the suffix bucket block to which the successor belongs; if it is the current suffix bucket block, pushing the successor tuple e0 = <chr0, pos-1> into PQ1; otherwise, writing e0 out to the suffix bucket block corresponding to the successor. The S-type bucket blocks are the S-type suffix bucket blocks corresponding to the blocks of the character string X in S107.
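The core idea of inducing L-type suffixes with a minimum priority queue can be shown with a simplified, purely in-memory sketch. It is not the external-memory procedure above: bucket blocks, array A and disk I/O are omitted, and the function name `induce_l_suffixes`, the `types` string and the tie-breaking counter `seq` are assumptions made for illustration. The sketch follows the usual induced-sorting convention that, among suffixes sharing a first character, L-type suffixes precede S-type ones.

```python
import heapq

def induce_l_suffixes(text, types, sorted_s):
    """Return the L-type suffix positions of text in ascending order.

    text:     string ending with a unique smallest sentinel, e.g. "banana$".
    types:    string of 'L'/'S' marks, one per character of text.
    sorted_s: positions of the seed S-type suffixes (S preceded by L),
              already in ascending lexicographic order.
    """
    heap = []
    for rank, pos in enumerate(sorted_s):
        # Key: (first character, 1 = S-type seed, rank among seeds, position).
        heapq.heappush(heap, (text[pos], 1, rank, pos))
    out = []
    seq = 0  # induction order, used to order L-type suffixes within a bucket
    while heap:
        _, flag, _, pos = heapq.heappop(heap)
        if flag == 0:
            out.append(pos)
        # The suffix starting at pos-1 is induced from the suffix at pos.
        if pos > 0 and types[pos - 1] == "L":
            heapq.heappush(heap, (text[pos - 1], 0, seq, pos - 1))
            seq += 1
    return out
```

For "banana$" with types "LSLSLLS" and seed order [6, 3, 1], the function yields the L-type suffixes in the order a$, banana$, na$, nana$.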
In this embodiment of the present application, the descending order sorting process of the S-type suffix is:
(1) Initializing PQ2 to empty and reading the current S-type suffix bucket block into the memory array A, where each suffix is a 2-tuple <chr, pos> representing the first character and the position of the suffix; performing a stable descending sort on the elements of array A, so that all S-type suffixes in array A are arranged in descending order;
(2) Traversing each suffix bucket in the S-type and L-type suffix bucket blocks in descending order; for the same suffix bucket, traversing the L-type suffix bucket first and then array A and PQ2. For each traversed suffix e = <chr, pos>, the following operations are performed: calculating from e.pos the number of the suffix bucket block to which e belongs, and reading the successor character chr0 of e from the external memory space corresponding to that number; if the successor is of S type, calculating from chr0 the suffix bucket block to which the successor belongs; if it is the current suffix bucket block, pushing the successor tuple e0 = <chr0, pos-1> into PQ2; otherwise, writing e0 out to the suffix bucket block corresponding to the successor. The L-type suffix bucket blocks are the suffix bucket blocks of L-type suffixes produced by the ascending sort of the L-type suffixes described above.
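The descending pass can be sketched in the same simplified way. Python's `heapq` is a min-heap, so the sketch emulates a maximum priority queue by negating the character code. As before, this is an in-memory illustration under assumed names (`induce_s_suffixes`, `types`, `seq`), not the bucket-block procedure itself, and the sentinel suffix is left out because nothing induces it.

```python
import heapq

def induce_s_suffixes(text, types, sorted_l):
    """Return the induced S-type suffix positions in descending order.

    text:     string ending with a unique smallest sentinel.
    types:    string of 'L'/'S' marks, one per character of text.
    sorted_l: L-type suffix positions in ascending lexicographic order.
    """
    heap = []
    for rank, pos in enumerate(sorted_l):
        # Flag 1 marks L-type entries so that S-type entries (flag 0) with the
        # same first character pop first; -rank pops the largest L-type first.
        heapq.heappush(heap, (-ord(text[pos]), 1, -rank, pos))
    out = []
    seq = 0  # induction order among S-type suffixes of the same bucket
    while heap:
        _, flag, _, pos = heapq.heappop(heap)
        if flag == 0:
            out.append(pos)
        # The suffix starting at pos-1 is induced from the suffix at pos.
        if pos > 0 and types[pos - 1] == "S":
            heapq.heappush(heap, (-ord(text[pos - 1]), 0, seq, pos - 1))
            seq += 1
    return out
```

For "banana$" with the L-type order [5, 0, 4, 2], the result is [1, 3], that is, anana$ followed by ana$ in descending order.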
S110, sorting the suffixes into their corresponding suffix bucket blocks by the L-type and S-type suffix sorting method of step S109; taking out the smallest suffix from the L-type and S-type suffix bucket blocks in turn by multi-way merge sort, and calculating the pointer field of each suffix; distributing the suffixes to the suffix linked list bucket blocks to which they belong according to their positions; and sorting the suffixes within each suffix linked list bucket block by position, then merging all suffix linked list bucket blocks in sequence to generate the final linked list.
In this embodiment of the present application, the pointer field of a suffix and the suffix linked list bucket block to which it belongs are calculated as follows: assume that the length of a suffix linked list bucket block is K, and that e1, e2 and e3 are suffixes popped in sequence from the minimum priority queue. When e2 is popped, the pointer field and the suffix linked list bucket block of e1 are calculated as e2.pos and e1.pos/K respectively, and the 2-tuple <e1.pos, e2.pos> of e1 is constructed and saved to the (e1.pos/K)-th suffix linked list bucket block; when e3 is popped, the pointer field e3.pos and the suffix linked list bucket block e2.pos/K of e2 are obtained, and the 2-tuple <e2.pos, e3.pos> of e2 is constructed and saved to the (e2.pos/K)-th suffix linked list bucket block. When the priority queue is empty, the last suffix points to the head of the linked list.
In this embodiment of the present application, each element of a suffix linked list bucket block is a 2-tuple whose first item is a suffix position and whose second item is the position of the suffix pointed to by the current suffix. The elements of each suffix linked list bucket block are sorted in ascending order by the first item of the tuple, so that the suffixes within each block are arranged in position order, and the suffix linked list bucket blocks are then merged in sequence to generate the final linked list.
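The pointer-field calculation and the bucket block merge can be sketched as below. The sketch keeps everything in memory and takes the already merged suffix order as a plain list; `build_suffix_linked_list` is an assumed name, and the "head" that the last suffix points to is taken here to be the lexicographically smallest suffix, which is an interpretation rather than something the text states explicitly.

```python
from collections import defaultdict

def build_suffix_linked_list(sorted_positions, k):
    """Build <pos, next_pos> tuples ordered by pos.

    sorted_positions: suffix positions in ascending lexicographic order,
                      i.e. the order in which the multi-way merge pops them.
    k:                length K of one suffix linked list bucket block.
    """
    buckets = defaultdict(list)
    head = sorted_positions[0]  # assumption: the list head is the smallest suffix
    successors = sorted_positions[1:] + [head]
    for cur, nxt in zip(sorted_positions, successors):
        # The pointer field of cur is nxt; cur belongs to bucket block cur // K.
        buckets[cur // k].append((cur, nxt))
    linked = []
    for block_no in sorted(buckets):
        # Sort each block by position, then concatenate the blocks in order.
        linked.extend(sorted(buckets[block_no]))
    return linked
```

Because each block only holds positions from one range of length K, sorting the blocks independently and concatenating them yields a list fully ordered by position.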
In the embodiment of the present application, to speed up steps S102 and S107, the terminal device may use multithreading and pipelining to accelerate the induced sorting. The terminal device may organize the sorting process into a three-stage pipeline: multithreaded parallelism improves the random read/write speed of the memory, and the pipeline resolves the data dependences of induced sorting.
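A three-stage pipeline of this kind might look like the following sketch. The text does not say what the three stages are, so the split into read, induce and write stages, the function name `run_pipeline` and the queue sizes are all assumptions chosen only to show the structure.

```python
import queue
import threading

_DONE = object()  # sentinel that tells the next stage to stop

def run_pipeline(blocks, read, induce, write):
    """Run read -> induce -> write as three threads linked by bounded queues."""
    q1 = queue.Queue(maxsize=2)
    q2 = queue.Queue(maxsize=2)

    def stage_read():
        for block in blocks:
            q1.put(read(block))
        q1.put(_DONE)

    def stage_induce():
        while (item := q1.get()) is not _DONE:
            q2.put(induce(item))
        q2.put(_DONE)

    def stage_write():
        while (item := q2.get()) is not _DONE:
            write(item)

    threads = [threading.Thread(target=f)
               for f in (stage_read, stage_induce, stage_write)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Each stage runs in a single thread and the queues are FIFO, so blocks leave the pipeline in the order they entered, which preserves the ordering that induced sorting depends on.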
Referring to fig. 5, a schematic structural diagram of a large data linked list construction device provided in an embodiment of the present application is shown, which may specifically include the following modules:
the character string preprocessing module 201 is configured to read the character string X from the external memory into the memory in reverse order, compare the lexicographic order of adjacent characters during reading to calculate the type of each character and identify the S-type characters, and segment X into a plurality of blocks using S-type characters as separators, so that the first and last characters of each block are S-type characters;
the substring successor character calculation module 202 is configured to read each block of X into the memory, perform induced sorting on the L-type and S-type substrings according to the order of the S-type characters, record the L-type and S-type successor character sequences of the substrings during sorting, and store them in the corresponding suffix bucket blocks of the external memory;
the substring sorting module 203 is configured to sort the L-type substrings according to the order of the S-type characters of each block using the minimum priority queue, and write the sorted substring sequences to the external memory space; then, according to the order of the ordered L-type substrings, sequencing the S-type substrings by using a maximum priority queue, and writing the sequenced S-type substring sequence into an external memory space;
the substring naming module 204 is configured to traverse the S-type substring sequence, compare adjacent S-type substrings and name each S-type substring, and generate a new contracted character string X1 according to the names and positions of the S-type substrings;
a determiner A module 205, configured to determine whether the characters in the newly generated character string X1 are unique; if X1 contains identical characters, modules 201 to 204 are executed recursively, otherwise the next step is entered;
the suffix prefix character calculation module 206 is configured to read each block of X into the memory, perform inductive sorting on L-type and S-type suffixes according to the order of S-type suffixes, record L-type and S-type prefix character sequences of substrings in the sorting process, and store the sequences in suffix bucket blocks corresponding to the external memory;
In the embodiment of the present application, the order of the S-type suffixes in each block of X used by module 206 is determined, at the highest recursion level, by the contracted character string X1 from module 204: because X1 is the naming result of the S-type substrings of X, each character in X1 uniquely represents an S-type substring of X and thus also gives the order of the S-type suffixes in X. At levels other than the highest recursion level, the order is determined by SA(X1) returned by module 208, because SA(X1) gives the suffix order of X1, which represents the order of the S-type substrings of X and hence the order of the S-type suffixes.
A determiner B module 207 for determining whether the current recursion layer is layer 0, if not, executing a module 208, otherwise executing a module 209;
a suffix ordering module 208, configured to order the L-type suffixes by using the minimum priority queue according to the order of the S-type suffixes of each suffix bucket block and the L-type successor character sequence calculated in the module 206, and write the ordered suffix sequence to the suffix bucket block corresponding to the external memory; then, according to the ordered sequence of the L-type suffixes and the S-type preceding character sequence calculated in the module 206, ordering the S-type suffixes by using a maximum priority queue, and writing the ordered suffix sequence into an external memory space; finally merging L-type and S-type suffix barrel blocks to generate a suffix array SA (X1) of X1;
in the embodiment of the present application, the detailed calculation process of the module 208 is given in step S109 in fig. 4.
A suffix bucket calculation module 209, configured to sort the L-type suffixes by using a minimum priority queue according to the order of the S-type suffixes of each suffix bucket and the L-type successor character sequence calculated in the module 206, and write the sorted suffix sequence to a corresponding suffix bucket of the external memory; then, according to the ordered sequence of the L-type suffixes and the S-type preceding character sequence calculated in the module 206, ordering the S-type suffixes by using a maximum priority queue, and writing the ordered suffix sequence into a suffix bucket block corresponding to an external memory;
the link information calculating module 210 is configured to take out the smallest suffix from the suffix bucket blocks (L type and S type) in turn by multi-way merge sort, calculate the pointer field of each suffix and the suffix linked list bucket block to which it belongs from the positions and order of the adjacent suffixes taken out consecutively, and write the suffix to the corresponding suffix linked list bucket block;
the suffix linked list bucket block merging module 211 is configured to sort the suffix tuples in each suffix linked list bucket block by position and merge the suffix linked list bucket blocks in sequence to generate the final suffix linked list.
In the embodiment of the present application, the detailed execution of the modules 209, 210 and 211 is given in step S110 in fig. 4.
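The type calculation performed by the character string preprocessing module 201 follows the standard right-to-left scan used in induced sorting, which can be sketched as below. The function names are illustrative, and the sketch assumes that the "S-type characters" used as separators are the S-type characters immediately preceded by an L-type character; the text does not spell this out.

```python
def classify_types(text):
    """Mark each character 'L' or 'S' by scanning text from right to left.

    text is assumed to end with a unique smallest sentinel, which is S-type.
    """
    n = len(text)
    types = ["S"] * n
    for i in range(n - 2, -1, -1):
        # A character is L-type if it is larger than its right neighbour,
        # or equal to it while the right neighbour is itself L-type.
        if text[i] > text[i + 1] or (text[i] == text[i + 1] and types[i + 1] == "L"):
            types[i] = "L"
    return "".join(types)

def separator_positions(types):
    """Positions of S-type characters preceded by an L-type character."""
    return [i for i in range(1, len(types))
            if types[i] == "S" and types[i - 1] == "L"]
```

For "banana$" this gives the types "LSLSLLS" and candidate cut positions 1, 3 and 6.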
Referring to fig. 6, the embodiment of the application further discloses a terminal device, which includes a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor implements the method for constructing a large data linked list according to the foregoing embodiments when executing the computer program.
The terminal device 300 may be a terminal device in the foregoing embodiments, and the terminal device may be a computing device such as a desktop computer, a cloud server, or the like. The terminal device 300 may include, but is not limited to, a processor 310, a memory 320. It will be appreciated by those skilled in the art that fig. 6 is merely an example of a terminal device 300 and is not meant to be limiting as to the terminal device 300, and may include more or fewer components than shown, or may combine certain components, or different components, e.g., the terminal device 300 may also include input and output devices, network access devices, buses, etc.
The processor 310 may be a central processing unit (Central Processing Unit, CPU), but may also be another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor or any conventional processor.
The memory 320 may be an internal storage unit of the terminal device 300, for example, a memory of the terminal device 300, etc. The memory 320 may also be a removable external storage device of the terminal device 300, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the terminal device 300. Further, the memory 320 may also include both an internal storage unit and an external storage device of the terminal device 300. The memory 320 is used for storing the computer program 321 and other programs and data required by the terminal device 300. The memory 320 may also be used to temporarily or permanently store data that has been output or is to be output.
The terms describing the positional relationship in the drawings are merely illustrative, and are not to be construed as limiting the present patent.
It is to be understood that the above examples of the present invention are provided by way of illustration only and not by way of limitation of the embodiments of the present invention. Other variations or modifications based on the above teachings will be apparent to those of ordinary skill in the art. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention as set forth in the claims.

Claims (9)

1. A method for constructing a large data linked list, characterized by comprising the following steps:
according to the size of the equipment memory, logically dividing the character string X into a plurality of blocks, dividing the suffix into different suffix barrel blocks, wherein the dividing principle is that each block X and the corresponding suffix barrel block can complete the induction and sorting process in the memory;
reading each block of the character string X into the memory sequentially from right to left, carrying out inductive sorting on the S-type substrings, respectively storing the preceding characters of the L-type substrings and the S-type substrings in sequence according to the access sequence, and respectively storing the preceding characters to suffix barrel blocks corresponding to the external memory;
the first character of the substring and the access sequence are used as ordering keywords, a minimum priority queue is adopted, the L-type substring is ordered in an ascending order in a bucket block mode, and the generated L-type substring sequence is stored in a suffix bucket block corresponding to an external memory; the S type substring is ordered in descending order in a bucket block mode by adopting a maximum priority queue, and the generated S type substring sequence is stored into a suffix bucket block corresponding to an external memory;
sequentially traversing the suffix bucket blocks of the S-type substrings in ascending order, naming the S-type substrings, and generating a new contracted character string X1; if the characters in the new contracted character string are not unique, the procedure enters recursion;
reading each block X into a memory sequentially from right to left, adopting a generalized ordering access rule, respectively calculating the successor characters of the type L and the type S suffixes according to the ordered type S suffixes, and respectively sequentially storing the successor characters into suffix barrel blocks corresponding to the external memory;
at a higher recursion level of the program, using the minimum and maximum priority queues respectively, with the first character of each suffix and the access order as sorting keys, inductively sorting the L-type and S-type suffixes in the form of suffix bucket blocks, and generating the suffix array of the current level;
and at recursion level 0 of the program, using the minimum and maximum priority queues respectively, with the first character of each suffix and the access order as sorting keys, sorting the L-type and S-type suffixes into the suffix bucket blocks to which they belong; then, using multi-way merge sort, obtaining the suffixes from the suffix bucket blocks in lexicographic order, and calculating the pointer field of each suffix and the suffix linked list bucket block to which it belongs from the position information of adjacent suffixes; and then sorting the suffix linked list bucket blocks by position and merging the blocks to generate the final suffix linked list.
2. The method of claim 1, wherein the L-type substring sorting process is: using the minimum priority queue PQ1, with the first character, sequence number and position of each substring as sorting keys, sorting in ascending order in the form of suffix bucket blocks, the specific steps being: initializing PQ1 to empty, setting the substring sequence number idx = 0, and reading the current suffix bucket block into the memory array A, where each substring is a triplet <chr, idx, pos> representing the first character, sequence number and position of the substring, and sorting the elements of A in ascending order; traversing each suffix bucket of the L-type suffix bucket blocks in ascending order, and, for the same suffix bucket, traversing the S-type characters in the blocks of X first and then array A and PQ1; for each traversed substring tuple e = <chr, idx, pos>, calculating from e.pos the number of the suffix bucket block to which e belongs and reading the successor character chr0 of e from the external memory space corresponding to that number; if the successor is of L type, calculating from chr0 the suffix bucket block to which the successor belongs; if it is the current suffix bucket block, pushing the successor tuple e0 = <chr0, idx+1, pos-1> into PQ1; otherwise, writing e0 out to the suffix bucket block corresponding to the successor.
3. The method of claim 1, wherein the S-type substring sorting process is: using the maximum priority queue PQ2, with the first character, sequence number and position of each substring as sorting keys, sorting in descending order in the form of suffix bucket blocks, the specific steps being: initializing PQ2 to empty, setting the sequence number idx = n, and reading the current S-type suffix bucket block into the memory array A, where each substring is a triplet <chr, idx, pos>, and sorting array A in descending order; traversing each suffix bucket of the S-type suffix bucket blocks in descending order, and, for the same suffix bucket, traversing the L-type suffix bucket block first and then array A and PQ2; for each traversed substring e, calculating from e.pos the number of the suffix bucket block to which e belongs and reading the successor character chr0 of e from the external memory space corresponding to that number; if the successor is of S type, calculating from chr0 the suffix bucket block to which the successor belongs; if it is the current suffix bucket block, pushing the successor tuple e0 = <chr0, idx-1, pos-1> into PQ2; otherwise, writing e0 out to the corresponding suffix bucket block.
4. The method of claim 1, wherein the L-type suffix sorting process is: initializing PQ1 to empty and reading the current L-type suffix bucket block into the memory array A, where each suffix is a 2-tuple <chr, pos>, and performing a stable ascending sort on the elements of A; traversing each suffix bucket in the L-type and S-type suffix bucket blocks, and, for the same suffix bucket, traversing the S-type suffix bucket first and then A and PQ1; for each traversed suffix e = <chr, pos>, calculating from e.pos the number of the suffix bucket block to which e belongs and reading the successor character chr0 of e from the external memory space corresponding to that number; if the successor is of L type, calculating from chr0 the suffix bucket block to which the successor belongs; if it is the current suffix bucket block, pushing the successor tuple e0 = <chr0, pos-1> into PQ1; otherwise, writing e0 out to the suffix bucket block corresponding to the successor.
5. The method of claim 1, wherein the S-type suffix sorting process is: initializing PQ2 to empty and reading the current S-type suffix bucket block into the memory array A, where each suffix is a 2-tuple <chr, pos>, and performing a stable descending sort on the elements of A; traversing each suffix bucket in the S-type and L-type suffix bucket blocks in descending order, and, for the same suffix bucket, traversing the L-type suffix bucket first and then array A and PQ2; for each traversed suffix e = <chr, pos>, calculating from e.pos the number of the suffix bucket block to which e belongs and reading the preceding character chr0 of e from the external memory space corresponding to that number; if the successor is of S type, calculating from chr0 the suffix bucket block to which the successor belongs; if it is the current suffix bucket block, pushing the successor tuple e0 = <chr0, pos-1> into PQ2; otherwise, writing e0 out to the suffix bucket block corresponding to the successor.
6. The method of claim 1, wherein the pointer field of a suffix and the suffix linked list bucket block to which it belongs are calculated as follows: taking out the smallest suffix from the S-type and L-type suffix bucket blocks in turn by multi-way merge sort; assuming that the length of a suffix linked list bucket block is K and that e1 and e2 are suffixes popped in sequence, when e2 is popped, calculating the pointer field and the suffix linked list bucket block of e1 as e2.pos and e1.pos/K respectively, constructing the suffix 2-tuple <e1.pos, e2.pos> of e1 and saving it to the (e1.pos/K)-th suffix linked list bucket block; after the sorting is finished, pointing the last suffix to the head of the linked list; the suffix linked list bucket block is composed of suffix 2-tuples, where the first item of the tuple is the suffix position and the second item is the position of the suffix pointed to by the current suffix, and the suffixes in the same block belong to a certain block of the character string X but are not ordered by position.
7. The method according to claim 1 or 6, wherein the suffix linked list bucket block merging process is: sorting the elements of each suffix linked list bucket block in ascending order by the first item of the tuple, so that the suffixes in each block are arranged in position order, and merging the suffix linked list bucket blocks from left to right in sequence to generate the final linked list.
8. A device for constructing a large data linked list, characterized by comprising: a character string preprocessing module, which reads the character string from the external memory, calculates the character types and cuts the character string into blocks; a substring successor character calculation module, which calculates the successor character sequences of the substrings of each block of X and stores them in the corresponding suffix bucket blocks of the external memory; a substring sorting module, which calculates the L-type substring sequence according to the lexicographic order of the S-type characters of each block of X, and then calculates the S-type substring sequence; a substring naming module, which scans the sorted S-type substrings and names them to generate a contracted character string X1; a determiner A, which determines whether the characters in X1 are unique; a suffix prefix character calculation module, which calculates the suffix prefix characters of each block of X and stores them in the corresponding suffix bucket blocks of the external memory; a determiner B, which determines whether the current recursion layer is layer 0; a suffix sorting module, which calculates the L-type suffix sequence according to the lexicographic order of the S-type suffixes of each block of X1, then calculates the S-type suffix sequence, and finally merges the two types of suffixes; a suffix bucket block calculation module, which calculates the L-type suffix bucket blocks according to the lexicographic order of the S-type suffixes of each block of X, and then calculates the S-type suffix bucket blocks; a link information calculation module, which takes out the suffixes from the suffix bucket blocks in ascending order by multi-way merge sort, calculates the pointer field of each suffix, and writes it into the corresponding suffix linked list bucket block according to its position; and a suffix linked list bucket block merging module, which sorts all suffix linked list bucket blocks by position and finally merges them to generate the final linked list.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method for constructing a large data linked list according to any one of claims 1-7 when executing the computer program.
CN202311564163.8A 2023-11-22 2023-11-22 Construction method and device of large data linked list and terminal equipment Active CN117271533B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311564163.8A CN117271533B (en) 2023-11-22 2023-11-22 Construction method and device of large data linked list and terminal equipment

Publications (2)

Publication Number Publication Date
CN117271533A true CN117271533A (en) 2023-12-22
CN117271533B CN117271533B (en) 2024-01-16

Family

ID=89203095

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311564163.8A Active CN117271533B (en) 2023-11-22 2023-11-22 Construction method and device of large data linked list and terminal equipment

Country Status (1)

Country Link
CN (1) CN117271533B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090187560A1 (en) * 2008-01-11 2009-07-23 International Business Machines Corporation String pattern conceptualization method and program product for string pattern conceptualization
CN115982310A (en) * 2023-03-21 2023-04-18 广东海洋大学 Link table generation method with verification function and electronic equipment
CN115982311A (en) * 2023-03-21 2023-04-18 广东海洋大学 Chain table generation method and device, terminal equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
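Mohamed Ibrahim Abouelhoda et al.: "Replacing suffix trees with enhanced suffix arrays", Journal of Discrete Algorithms, pages 53 - 86 *
黄政林; 张冰: "A distributed suffix tree construction and matching algorithm", Journal of Huazhong University of Science and Technology (Natural Science Edition), no. 1, pages 226 - 231 *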

Also Published As

Publication number Publication date
CN117271533B (en) 2024-01-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant