CN114840726A - Method for realizing character string storage and search by hash table - Google Patents

Publication number
CN114840726A
Authority
CN
China
Prior art keywords
hash table
character string
pos
collision
character strings
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202210537791.6A
Other languages
Chinese (zh)
Inventor
李道双
陆怀军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Huoya Technology Co ltd
Original Assignee
Nanjing Huoya Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Huoya Technology Co ltd
Priority to CN202210537791.6A
Publication of CN114840726A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9014 Indexing; Data structures therefor; Storage structures hash tables
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for realizing character string storage and search by a hash table, relating to the technical field of hash tables. The method comprises: S1, determining the collision order of character strings; S2, counting the collision order; S3, storing the character strings into the hash table according to the collision order; and S4, finding the position of a character string in the hash table. Compared with the general-purpose methods in a standard library, the method greatly improves character string query efficiency and greatly reduces latency and jitter during trading, allows a trading system implemented in software to reach the performance of a trading system implemented in hardware, and helps improve character string search performance in a high-frequency trading counter implemented in software.

Description

Method for realizing character string storage and search by hash table
Technical Field
The invention relates to the technical field of hash tables, in particular to a method for realizing character string storage and search by a hash table.
Background
Character string search is involved in virtually every software project. The hash table is a classic data structure and one of the core methods for character string storage and search. In most scenarios where performance requirements are not high, the tools provided by a standard library are used to store and search character strings; this is the common approach and meets most needs.
High-frequency trading counters currently on the market are implemented in one of two ways: in hardware or in software. Hardware has the advantages of fast data processing and low jitter, but the disadvantage of high development cost; software has low development cost, but the disadvantages of slower data processing and higher jitter. Implementing a software high-frequency trading counter therefore requires solving the data processing speed problem, and one of the core tasks is improving character string search performance.
Disclosure of Invention
In view of the defects of the prior art, the invention provides a method for realizing character string storage and search by a hash table, which solves the problems described in the background.
To achieve this purpose, the invention adopts the following technical scheme: a method for realizing character string storage and search by a hash table, comprising the following operation steps:
S1, collision order of character strings:
determine the length of the hash table according to the number of character strings and denote it Len; calculate the hash value A of a character string str, then take A modulo Len to obtain the position pos in the hash table;
where hash(str) = A and A mod Len = pos;
the longest character string is 16 bytes;
S2, collision order statistics:
because different character strings str may yield the same position value pos, that is, different character strings may collide, the collisions must be resolved;
the first time a given pos value is obtained, the collision order is 1; the second time that pos value is obtained, the collision order is 2; and so on, recording the collision order of the character string str at position pos as map[str] = seq[pos];
S3, storing the character strings into the hash table according to the collision order:
traverse all the character strings by collision order, from low to high, that is, make max(seq[pos]) passes, each pass going through all the character strings;
S4, finding the position of a character string in the hash table:
calculate the hash value A of the character string str, then take A modulo Len to obtain the position pos in the hash table;
where hash(str) = A and A mod Len = pos;
jump to the corresponding position of the hash table and check whether the character string at the current position is equal to the target character string.
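Steps S1 and S2 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name collision_orders is hypothetical, and zlib.crc32 stands in for hash(str), which the text does not specify.

```python
from collections import defaultdict
import zlib


def collision_orders(strings, table_len, hash_fn=zlib.crc32):
    """Return (pos_of, order_of) for each string, following S1 and S2.

    hash_fn is an assumption: the source only says hash(str) = A.
    """
    seq = defaultdict(int)       # seq[pos]: number of strings mapped to pos so far
    pos_of, order_of = {}, {}
    for s in strings:
        a = hash_fn(s.encode("utf-8"))   # hash(str) = A
        pos = a % table_len              # A mod Len = pos
        seq[pos] += 1
        pos_of[s] = pos
        order_of[s] = seq[pos]           # map[str] = seq[pos]
    return pos_of, order_of
```

The largest value in order_of equals max(seq[pos]), which is the number of passes that step S3 makes over the character strings.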
Further, in step S2, seq[pos] denotes the collision order at position pos.
Further, the detailed steps of step S3 are as follows:
(1) set the collision order to N, starting from 1, and traverse all the character strings;
(1-1) if the collision order of the character string is not equal to N, repeat this step with the next character string.
Further, (1-2) if the collision order of the character string equals N and the position given by the pos value is empty, store the character string at that position and go to step (1-1).
Further, (1-3) if the pointer at the current position is not null, jump to the next position indicated by the pointer and check the pointer at that position, repeating this step.
Further, (1-4) if the pointer at the current position is null, search backwards in the hash table for a free position, store the character string at that position, make the pointer of the last position point to it, and go to step (1-1).
Further, (1-5) when the traversal for collision order N ends, set N = N + 1; if N is less than or equal to max(seq[pos]), go to step (1); otherwise, end.
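Steps (1) to (1-5) can be sketched as below, under stated assumptions. The names build_table and find_free are hypothetical. The text does not say how the free position is searched for, so this sketch scans forward from the end of the chain and wraps around, which is only one possible reading. Keys are assumed to be distinct, and no more numerous than Len.

```python
def find_free(slots, start):
    """Assumed search rule: first empty slot after `start`, wrapping around."""
    n = len(slots)
    for step in range(1, n):
        i = (start + step) % n
        if slots[i] is None:
            return i
    raise RuntimeError("hash table is full")


def build_table(keys, table_len, hash_fn):
    """Store keys pass by pass in increasing collision order (step S3)."""
    if len(keys) > table_len:
        raise ValueError("table_len must be at least the number of keys")
    home, order, seq = {}, {}, {}
    for k in keys:                        # S1 and S2: position and collision order
        pos = hash_fn(k) % table_len
        seq[pos] = seq.get(pos, 0) + 1
        home[k], order[k] = pos, seq[pos]

    slots = [None] * table_len            # stored keys
    nxt = [None] * table_len              # pointer to the next position in a chain
    for n in range(1, max(seq.values(), default=0) + 1):   # (1) and (1-5)
        for k in keys:
            if order[k] != n:             # (1-1) not this pass
                continue
            cur = home[k]
            if slots[cur] is None:        # (1-2) home position is empty
                slots[cur] = k
                continue
            while nxt[cur] is not None:   # (1-3) follow the pointers
                cur = nxt[cur]
            free = find_free(slots, cur)  # (1-4) take a free position and link it
            slots[free] = k
            nxt[cur] = free
    return slots, nxt
```

Because every string of collision order 1 is stored before any string of a higher order, each occupied home position holds a string that actually hashes to it; later strings only take positions that no string hashes to.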
Further, in step S4, if the character string at the current position is equal to the target character string, the search ends; otherwise, check whether the pointer is null.
Further, if the pointer is null, the search ends; otherwise, go to step (1-1) according to the pointer value.
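The lookup in step S4 amounts to walking the pointer chain from the home position, which can be sketched as follows. The function name lookup is hypothetical, and the example table is the one obtained for the hash value set {48, 35, 64, 29, 76, 40, 15, 50, 44, 65} with Len = 11 when free positions are searched forward with wrap-around, which is an assumption rather than something the text specifies.

```python
# Example table (assumed layout, see the note above): keys and chain pointers.
EXAMPLE_SLOTS = [44, 65, 35, None, 48, 15, 50, 29, 40, 64, 76]
EXAMPLE_NEXT = [None, None, None, None, 5, None, None, 8, None, None, 1]


def lookup(key, slots, nxt, hash_fn):
    """Return the index where `key` is stored, or None if it is absent."""
    cur = hash_fn(key) % len(slots)   # hash(str) = A; A mod Len = pos
    while cur is not None:
        if slots[cur] == key:         # equal to the target: done
            return cur
        cur = nxt[cur]                # null pointer ends the search
    return None
```

With the example table and the identity function as hash_fn, looking up 15 visits position 4 and then position 5, while looking up 26, which also maps to position 4, reaches a null pointer and reports that the key is absent.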
Furthermore, the method for realizing character string storage and search by a hash table is applied in the field of data processing.
The invention provides a method for realizing character string storage and search by a hash table, which has the following beneficial effects:
compared with the general-purpose methods in a standard library, the method greatly improves character string query efficiency and greatly reduces latency and jitter during trading, allows a trading system implemented in software to reach the performance of a trading system implemented in hardware, and helps improve character string search performance in a high-frequency trading counter implemented in software.
Drawings
FIG. 1 is a schematic flow chart of counting the collision order of character strings in the method for realizing character string storage and search by a hash table according to the present invention;
FIG. 2 is a schematic flow chart of storing the character strings into the hash table according to the collision order in the method according to the present invention;
FIG. 3 is a schematic diagram of the first traversal in the method according to the present invention;
FIG. 4 is a schematic diagram of the second traversal in the method according to the present invention;
FIG. 5 is a schematic flow chart of finding the position of a character string in the hash table in the method according to the present invention.
Detailed Description
Referring to FIG. 1 to FIG. 5, the present invention provides a technical solution: a method for realizing character string storage and search by a hash table, comprising the following operation steps:
S1, collision order of character strings:
determine the length of the hash table according to the number of character strings and denote it Len; calculate the hash value A of a character string str, then take A modulo Len to obtain the position pos in the hash table;
where hash(str) = A and A mod Len = pos;
the longest character string is 16 bytes;
S2, collision order statistics:
because different character strings str may yield the same position value pos, that is, different character strings may collide, the collisions must be resolved;
the first time a given pos value is obtained, the collision order is 1; the second time that pos value is obtained, the collision order is 2; and so on, recording the collision order of the character string str at position pos as map[str] = seq[pos];
S3, storing the character strings into the hash table according to the collision order:
traverse all the character strings by collision order, from low to high, that is, make max(seq[pos]) passes, each pass going through all the character strings;
S4, finding the position of a character string in the hash table:
calculate the hash value A of the character string str, then take A modulo Len to obtain the position pos in the hash table;
where hash(str) = A and A mod Len = pos;
jump to the corresponding position of the hash table and check whether the character string at the current position is equal to the target character string.
In step S2, seq[pos] denotes the collision order at position pos.
The detailed steps of step S3 are as follows:
(1) set the collision order to N, starting from 1, and traverse all the character strings;
(1-1) if the collision order of the character string is not equal to N, repeat this step with the next character string.
(1-2) if the collision order of the character string equals N and the position given by the pos value is empty, store the character string at that position and go to step (1-1).
(1-3) if the pointer at the current position is not null, jump to the next position indicated by the pointer and check the pointer at that position, repeating this step.
(1-4) if the pointer at the current position is null, search backwards in the hash table for a free position, store the character string at that position, make the pointer of the last position point to it, and go to step (1-1).
(1-5) when the traversal for collision order N ends, set N = N + 1; if N is less than or equal to max(seq[pos]), go to step (1); otherwise, end.
In step S4, if the character string at the current position is equal to the target character string, the search ends; otherwise, check whether the pointer is null.
If the pointer is null, the search ends; otherwise, go to step (1-1) according to the pointer value.
The method for realizing character string storage and search by a hash table is applied in the field of data processing.
In summary, an embodiment of the method for realizing character string storage and search by a hash table is as follows:
assume a set of character string hash values containing 10 elements, and a hash table with a storage length of 11;
hash value set: {48, 35, 64, 29, 76, 40, 15, 50, 44, 65};
they are then stored in the hash table as follows:
the first traversal processes the hash values of the character strings whose collision order is 1 (as shown in FIG. 3), and the second traversal processes the hash values of the character strings whose collision order is 2 (as shown in FIG. 4);
Note on the figures: the hash table has a length of 11, the first position has index 0, the last position has index 10, and a pointer value of 5 means that the pointer points to the position with index 5.
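The split of this hash value set into a first and a second traversal can be checked with a short sketch. The function name passes_by_collision_order is hypothetical; the data and the table length of 11 come from the embodiment above.

```python
def passes_by_collision_order(hash_values, table_len):
    """Group (hash value, pos) pairs by collision order."""
    seq = {}       # seq[pos]: number of values mapped to pos so far
    passes = {}    # collision order -> list of (hash value, pos)
    for a in hash_values:
        pos = a % table_len                  # A mod Len = pos
        seq[pos] = seq.get(pos, 0) + 1
        passes.setdefault(seq[pos], []).append((a, pos))
    return passes


passes = passes_by_collision_order([48, 35, 64, 29, 76, 40, 15, 50, 44, 65], 11)
for order in sorted(passes):
    print(order, passes[order])
# 1 [(48, 4), (35, 2), (64, 9), (29, 7), (76, 10), (50, 6), (44, 0)]
# 2 [(40, 7), (15, 4), (65, 10)]
```

Seven values go to their own position in the first traversal; 40, 15 and 65 collide at positions 7, 4 and 10 and are handled in the second traversal, so max(seq[pos]) is 2 for this set.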
Referring to FIG. 1 to FIG. 5, in summary, the method for realizing character string storage and search by a hash table comprises the following steps:
S1, collision order of character strings:
determine the length of the hash table according to the number of character strings and denote it Len; calculate the hash value A of a character string str, then take A modulo Len to obtain the position pos in the hash table;
where hash(str) = A and A mod Len = pos;
the longest character string is 16 bytes;
S2, collision order statistics:
because different character strings str may yield the same position value pos, that is, different character strings may collide, the collisions must be resolved;
the first time a given pos value is obtained, the collision order is 1; the second time that pos value is obtained, the collision order is 2; and so on, recording the collision order of the character string str at position pos as map[str] = seq[pos];
where seq[pos] denotes the collision order at position pos;
S3, storing the character strings into the hash table according to the collision order:
traverse all the character strings by collision order, from low to high, that is, make max(seq[pos]) passes, each pass going through all the character strings;
the detailed steps are as follows:
(1) set the collision order to N, starting from 1, and traverse all the character strings;
(1-1) if the collision order of the character string is not equal to N, repeat this step with the next character string.
(1-2) if the collision order of the character string equals N and the position given by the pos value is empty, store the character string at that position and go to step (1-1).
(1-3) if the pointer at the current position is not null, jump to the next position indicated by the pointer and check the pointer at that position, repeating this step.
(1-4) if the pointer at the current position is null, search backwards in the hash table for a free position, store the character string at that position, make the pointer of the last position point to it, and go to step (1-1).
(1-5) when the traversal for collision order N ends, set N = N + 1; if N is less than or equal to max(seq[pos]), go to step (1); otherwise, end;
S4, finding the position of a character string in the hash table:
calculate the hash value A of the character string str, then take A modulo Len to obtain the position pos in the hash table;
where hash(str) = A and A mod Len = pos;
jump to the corresponding position of the hash table and check whether the character string at the current position is equal to the target character string;
if the character string at the current position is equal to the target character string, the search ends; otherwise, check whether the pointer is null;
if the pointer is null, the search ends; otherwise, go to step (1-1) according to the pointer value.
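One consequence of steps S1 to S4 can be illustrated end to end: under the assumptions of this sketch, the number of comparisons needed to find a stored string equals its collision order. The names build and probes are hypothetical, zlib.crc32 stands in for the unspecified hash function, free positions are searched forward with wrap-around (an assumption), and keys are assumed to be distinct.

```python
import zlib


def build(keys, table_len):
    """Build the table pass by pass; returns (slots, nxt, order)."""
    if len(keys) > table_len:
        raise ValueError("table_len must be at least the number of keys")
    home, order, seq = {}, {}, {}
    for k in keys:
        pos = zlib.crc32(k.encode("utf-8")) % table_len
        seq[pos] = seq.get(pos, 0) + 1
        home[k], order[k] = pos, seq[pos]
    slots, nxt = [None] * table_len, [None] * table_len
    for n in range(1, max(seq.values(), default=0) + 1):
        for k in keys:
            if order[k] != n:
                continue
            cur = home[k]
            if slots[cur] is None:
                slots[cur] = k
                continue
            while nxt[cur] is not None:      # walk to the end of the chain
                cur = nxt[cur]
            # assumed rule: first empty slot after the chain end, wrapping around
            free = next(i % table_len
                        for i in range(cur + 1, cur + table_len)
                        if slots[i % table_len] is None)
            slots[free], nxt[cur] = k, free
    return slots, nxt, order


def probes(key, slots, nxt):
    """Count string comparisons made by the S4 lookup; None if key is absent."""
    cur = zlib.crc32(key.encode("utf-8")) % len(slots)
    count = 0
    while cur is not None:
        count += 1
        if slots[cur] == key:
            return count
        cur = nxt[cur]
    return None
```

For any set of distinct keys, probes(k, slots, nxt) returns order[k] in this sketch, so every string of collision order 1 is found with a single comparison at its home position.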
The invention provides a method for realizing character string storage and search by a hash table, which has the following beneficial effects:
compared with the general-purpose methods in a standard library, the method greatly improves character string query efficiency and greatly reduces latency and jitter during trading, allows a trading system implemented in software to reach the performance of a trading system implemented in hardware, and helps improve character string search performance in a high-frequency trading counter implemented in software.

Claims (10)

1. A method for realizing character string storage and search by a hash table, characterized in that the method comprises the following operation steps:
S1, collision order of character strings:
determine the length of the hash table according to the number of character strings and denote it Len; calculate the hash value A of a character string str, then take A modulo Len to obtain the position pos in the hash table;
where hash(str) = A and A mod Len = pos;
the longest character string is 16 bytes;
S2, collision order statistics:
because different character strings str may yield the same position value pos, that is, different character strings may collide, the collisions must be resolved;
the first time a given pos value is obtained, the collision order is 1; the second time that pos value is obtained, the collision order is 2; and so on, recording the collision order of the character string str at position pos as map[str] = seq[pos];
S3, storing the character strings into the hash table according to the collision order:
traverse all the character strings by collision order, from low to high, that is, make max(seq[pos]) passes, each pass going through all the character strings;
S4, finding the position of a character string in the hash table:
calculate the hash value A of the character string str, then take A modulo Len to obtain the position pos in the hash table;
where hash(str) = A and A mod Len = pos;
jump to the corresponding position of the hash table and check whether the character string at the current position is equal to the target character string.
2. The method for realizing character string storage and search by a hash table according to claim 1, characterized in that: in step S2, seq[pos] denotes the collision order at position pos.
3. The method for realizing character string storage and search by a hash table according to claim 1, characterized in that: the detailed steps of step S3 are as follows:
(1) set the collision order to N, starting from 1, and traverse all the character strings;
(1-1) if the collision order of the character string is not equal to N, repeat this step with the next character string.
4. The method for realizing character string storage and search by a hash table according to claim 3, characterized in that: (1-2) if the collision order of the character string equals N and the position given by the pos value is empty, store the character string at that position and go to step (1-1).
5. The method for realizing character string storage and search by a hash table according to claim 3, characterized in that: (1-3) if the pointer at the current position is not null, jump to the next position indicated by the pointer and check the pointer at that position, repeating this step.
6. The method for realizing character string storage and search by a hash table according to claim 3, characterized in that: (1-4) if the pointer at the current position is null, search backwards in the hash table for a free position, store the character string at that position, make the pointer of the last position point to it, and go to step (1-1).
7. The method for realizing character string storage and search by a hash table according to claim 3, characterized in that: (1-5) when the traversal for collision order N ends, set N = N + 1; if N is less than or equal to max(seq[pos]), go to step (1); otherwise, end.
8. The method for realizing character string storage and search by a hash table according to claim 1, characterized in that: in step S4, if the character string at the current position is equal to the target character string, the search ends; otherwise, check whether the pointer is null.
9. The method for realizing character string storage and search by a hash table according to claim 8, characterized in that: if the pointer is null, the search ends; otherwise, go to step (1-1) according to the pointer value.
10. The method for realizing character string storage and search by a hash table according to any one of claims 1-9, characterized in that: the method is applied in the field of data processing.
CN202210537791.6A 2022-05-17 2022-05-17 Method for realizing character string storage and search by hash table Withdrawn CN114840726A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210537791.6A CN114840726A (en) 2022-05-17 2022-05-17 Method for realizing character string storage and search by hash table

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210537791.6A CN114840726A (en) 2022-05-17 2022-05-17 Method for realizing character string storage and search by hash table

Publications (1)

Publication Number Publication Date
CN114840726A (en) 2022-08-02

Family

ID=82571215

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210537791.6A Withdrawn CN114840726A (en) 2022-05-17 2022-05-17 Method for realizing character string storage and search by hash table

Country Status (1)

Country Link
CN (1) CN114840726A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication
Application publication date: 20220802