CN111767436B - HASH index data storage and reading method and system - Google Patents

Info

Publication number
CN111767436B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010581485.3A
Other languages
Chinese (zh)
Other versions
CN111767436A (en)
Inventor
王金山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Si Tech Information Technology Co Ltd
Original Assignee
Beijing Si Tech Information Technology Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G06F16/9014 Indexing; Data structures therefor; Storage structures hash tables

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a system for storing and reading HASH index data. Index records are stored in the memory corresponding to the HASH index data; the number of index records is determined by how many data items share the same HASH value; each index record consists, in order, of an index flag, an index value, a next pointer, and a next-deleted-record pointer. The storage method comprises: storing the data records whose HASH values repeat in the same linked list; when an index record is inserted, if the index flag of the head index record of the linked list is in the in-use state, the head index record is reused directly; if the index flag of the head index record is in the deleted state, the record pointed to by the next pointer of the head index record is fetched. The method addresses the problem that insertion and deletion become slow when certain HASH index values are highly repetitive.

Description

HASH index data storage and reading method and system
Technical Field
The invention relates to the technical field of index data for in-memory databases, and in particular to a method and a system for storing and reading HASH index data.
Background
To find a particular record quickly among a huge number of in-memory records, an index must be created on frequently accessed fields. For equality lookups, a HASH index is typically used.
When HASH values repeat, a conventional HASH index typically stores the repeated values on the same linked list. If an index value is highly repetitive, the corresponding linked list becomes very long. In the usage scenario of an in-memory database, index records are inserted and deleted frequently, and for performance reasons a delete does not physically remove the record; instead, it sets a delete flag on the record so that a subsequent insert can reuse the memory. Consequently, to reuse previously deleted memory space, an insert must traverse the linked list sequentially to find a deleted slot and then place the new record there.
When the table holds a large amount of data, for example when more than 1,000,000 records share a repeated value, insertion becomes very slow.
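The cost described above can be made concrete with a minimal Python sketch of the conventional scheme. The names `Node`, `insert_by_scan`, and `build_chain`, and the choice to prepend when no deleted slot exists, are illustrative assumptions rather than details from the text; the function also reports how many nodes it visited so that the linear scan is visible.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple

IN_USE, DELETED = 1, 0  # flag values, matching the definitions given later in the text


@dataclass(eq=False)
class Node:
    """Conventional chain node: a flag, a value, and a single next pointer."""
    flag: int
    value: Any
    next: Optional["Node"] = None


def insert_by_scan(head: Optional[Node], value: Any) -> Tuple[Optional[Node], int]:
    """Reuse the first deleted node found by walking the chain.

    Returns (head, nodes_visited). If no deleted node exists, a new node
    is prepended (an assumption made for this sketch).
    """
    visited = 0
    node = head
    while node is not None:
        visited += 1
        if node.flag == DELETED:
            node.flag, node.value = IN_USE, value
            return head, visited
        node = node.next
    return Node(IN_USE, value, head), visited


def build_chain(n: int, deleted_index: int) -> Optional[Node]:
    """Build n nodes holding the same value; the node at deleted_index is marked deleted."""
    head = None
    for i in reversed(range(n)):
        flag = DELETED if i == deleted_index else IN_USE
        head = Node(flag, "dup", head)
    return head
```

With the only deleted slot at the far end of a chain of `n` records, `insert_by_scan` visits all `n` nodes, which is the behavior the background identifies as the bottleneck.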
Disclosure of Invention
To address the problem that insertion and deletion cannot be performed quickly when certain HASH index values are highly repetitive, the invention provides a method and a system for storing and reading HASH index data.
The invention discloses a method for storing and reading HASH index data, wherein index records are stored in the memory corresponding to the HASH index data;
the number of index records is determined by how many data items share the same HASH value;
each index record consists, in order, of an index flag, an index value, a next pointer, and a next-deleted-record pointer.
Preferably, the method for storing HASH index data comprises:
storing the data records whose HASH values repeat in the same linked list;
when an index record is inserted: if the index flag of the head index record of the linked list is 1, the head index record is reused directly; if the index flag of the head index record is 0, the record pointed to by the next pointer of the head index record is fetched. If that record is null, there is no reusable record, and a new record is inserted directly at the head; if that record is not null, it is reused.
Preferably, an index flag of 1 indicates that the index record is in use, and an index flag of 0 indicates that the index record has been deleted.
Preferably, the method for reading HASH index data queries and reads data according to the HASH value, the index flag, and the index value; the specific process is as follows:
first, the HASH value is used to locate the index records corresponding to that HASH value; the query then proceeds through those index records, reading the index flag of each record first and, for records whose index flag is 1, reading the index value and comparing it with the queried index value, until the matching index record is found.
Preferably, the method for deleting HASH index data comprises:
when an index record is deleted, its index space is not released and the record is not removed from the index linked list; only its index flag is changed to 0;
when an index record is deleted and it is not at the head of the linked list, the next pointer of the head record is set to point to the currently deleted index record, and the next-deleted-record pointer of the currently deleted index record is set to point to the record that the next pointer of the head record originally pointed to.
A HASH index data storage and reading system at least comprises a processor and a memory, wherein the memory stores an executable program of the method; the processor runs the executable program of the method to index the memory data.
Compared with the prior art, the invention has the beneficial effects that:
with the HASH index data storage and reading method and system, traversal of the entire index linked list is avoided when data are inserted, and the corresponding index record is located directly by its memory address; when more than 1,000,000 records share a repeated value, execution efficiency can be improved by a factor of 100. The efficiency of the HASH index is thereby effectively improved, which benefits applications that use HASH index data and improves the reliability and practicality of the HASH index.
Drawings
Fig. 1 is a schematic diagram of a method for storing and reading HASH index data according to the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention is described in further detail below with reference to the accompanying drawing:
Referring to Fig. 1, in the method for storing and reading HASH index data, index records are stored in the memory corresponding to the HASH index data;
the number of index records is determined by how many data items share the same HASH value;
each index record consists, in order, of an index flag, an index value, a next pointer (next), and a next-deleted-record pointer (next2).
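As a minimal sketch of this four-field record layout, assuming Python and the hypothetical names `IndexRecord`, `IN_USE`, `DELETED`, and `field_order` (the text itself names only the four fields and the pointers `next` and `next2`):

```python
from dataclasses import dataclass, fields
from typing import Any, List, Optional

IN_USE = 1   # flag value the text uses for a record that is in use
DELETED = 0  # flag value the text uses for a deleted record


@dataclass(eq=False)
class IndexRecord:
    """One index record; the field order follows the order given in the text."""
    flag: int = IN_USE                      # index flag
    value: Any = None                       # index value
    next: Optional["IndexRecord"] = None    # next record on the same chain
    next2: Optional["IndexRecord"] = None   # next deleted record


def field_order() -> List[str]:
    """Return the field names in declaration order."""
    return [f.name for f in fields(IndexRecord)]
```

Keeping the deleted-record pointer inside each record means that no separate free-list structure has to be allocated; the deleted records themselves form the chain.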
In a specific implementation, the method for storing HASH index data comprises:
storing the data records whose HASH values repeat in the same linked list;
when an index record is inserted: if the index flag of the head index record of the linked list is 1, the head index record is reused directly; if the index flag of the head index record is 0, the record pointed to by the next pointer (next) of the head index record is fetched. If that record is null, there is no reusable record, and a new record is inserted directly at the head; if that record is not null, it is reused.
In a specific implementation, an index flag of 1 indicates that the index record is in use, and an index flag of 0 indicates that the index record has been deleted.
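The insertion rule can be sketched in Python as below. This is an interpretation under stated assumptions, not the patented implementation. The passage ties direct head reuse to flag 1, yet flag 1 is defined as in use, so the sketch assumes that the head is reused in place only when it is itself deleted (flag 0), and that an in-use head hands out the most recently deleted record instead. It further assumes that the head's `next2` field anchors the chain of deleted records, so that the main `next` chain stays intact. `IndexRecord` and `insert` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Optional

IN_USE, DELETED = 1, 0


@dataclass(eq=False)
class IndexRecord:
    flag: int = IN_USE
    value: Any = None
    next: Optional["IndexRecord"] = None    # next record on the same chain
    next2: Optional["IndexRecord"] = None   # next deleted record


def insert(head: Optional[IndexRecord], value: Any) -> IndexRecord:
    """Store value on the chain and return the (possibly new) head.

    No step walks the chain, so the cost does not depend on its length.
    """
    if head is None:                        # empty chain: create the first record
        return IndexRecord(IN_USE, value)
    if head.flag == DELETED:                # assumption: a deleted head is reused in place
        head.flag, head.value = IN_USE, value
        return head
    slot = head.next2                       # assumption: head.next2 anchors the deleted chain
    if slot is None:                        # nothing to reuse: the new record becomes the head
        return IndexRecord(IN_USE, value, next=head)
    head.next2 = slot.next2                 # pop the deleted record off the deleted chain
    slot.next2 = None
    slot.flag, slot.value = IN_USE, value
    return head
```

Because a new head is only created when the deleted chain is empty, the anchor never has to be moved from the old head to the new one in this sketch.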
In a specific implementation, the method for reading HASH index data queries and reads data according to the HASH value, the index flag, and the index value; the specific process is as follows:
first, the HASH value is used to locate the index records corresponding to that HASH value; the query then proceeds through those index records, reading the index flag of each record first and, for records whose index flag is 1, reading the index value and comparing it with the queried index value, until the matching index record is found.
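The read path can be sketched as follows, assuming Python, a list of bucket heads, and the hypothetical names `IndexRecord` and `find`. Selecting the bucket with `hash(value) % len(buckets)` is an assumption; the text only says that the HASH value is used first.

```python
from dataclasses import dataclass
from typing import Any, List, Optional

IN_USE, DELETED = 1, 0


@dataclass(eq=False)
class IndexRecord:
    flag: int = IN_USE
    value: Any = None
    next: Optional["IndexRecord"] = None
    next2: Optional["IndexRecord"] = None


def find(buckets: List[Optional[IndexRecord]], value: Any) -> Optional[IndexRecord]:
    """Return the first in-use record whose index value equals value, or None."""
    record = buckets[hash(value) % len(buckets)]  # step 1: locate the chain by HASH value
    while record is not None:
        # step 2: check the flag first; the value is compared only for in-use records
        if record.flag == IN_USE and record.value == value:
            return record
        record = record.next
    return None
```

Deleted records stay on the chain, so the lookup skips them by flag rather than relying on their absence.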
In a specific implementation, the method for deleting HASH index data comprises:
when an index record is deleted, its index space is not released and the record is not removed from the index linked list; only its index flag is changed to 0;
when an index record is deleted and it is not at the head of the linked list, the next pointer (next) of the head record is set to point to the currently deleted index record, and the next-deleted-record pointer (next2) of the currently deleted index record is set to point to the record that the next pointer (next) of the head record originally pointed to.
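A minimal Python sketch of the deletion step follows. It keeps the record on its chain, only flips the flag, and pushes a non-head record onto the chain of deleted records. One deliberate deviation is labeled as an assumption: the text threads this step through the head record's next pointer, whereas the sketch uses the head's `next2` field so that the `next` chain walked by lookups is left unchanged. `IndexRecord` and `delete` are hypothetical names, and the guard against deleting a record twice is an addition for this sketch.

```python
from dataclasses import dataclass
from typing import Any, Optional

IN_USE, DELETED = 1, 0


@dataclass(eq=False)
class IndexRecord:
    flag: int = IN_USE
    value: Any = None
    next: Optional["IndexRecord"] = None    # next record on the same chain
    next2: Optional["IndexRecord"] = None   # next deleted record


def delete(head: IndexRecord, record: IndexRecord) -> None:
    """Mark record as deleted without unlinking it or freeing its memory."""
    if record.flag == DELETED:      # already deleted: nothing to do (sketch-only guard)
        return
    record.flag = DELETED           # the record stays on the main chain
    if record is not head:
        record.next2 = head.next2   # link to the previously deleted record, if any
        head.next2 = record         # assumption: head.next2 anchors the deleted chain
```

Each deletion is a constant number of pointer writes, and the most recently deleted record is always the first candidate for reuse.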
A HASH index data storage and reading system at least comprises a processor and a memory, wherein the memory stores an executable program of the method; the processor runs the executable program of the method to index the memory data.
The method and system for storing and reading HASH index data avoid traversing the entire index linked list when data are inserted and locate the corresponding index record directly by its memory address; when more than 1,000,000 records share a repeated value, execution efficiency can be improved by a factor of 100. The efficiency of the HASH index is thereby effectively improved, which benefits applications that use HASH index data and improves the reliability and practicality of the HASH index.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (2)

1. A method for storing and reading HASH index data, characterized in that index records are stored in the memory corresponding to the HASH index data;
the number of index records is determined by how many data items share the same HASH value;
each index record consists, in order, of an index flag, an index value, a next pointer, and a next-deleted-record pointer;
the method for storing the HASH index data comprises:
storing the data records whose HASH values repeat in the same linked list;
when an index record is inserted: if the index flag of the head index record of the linked list is 1, the head index record is reused directly; if the index flag of the head index record is 0, the record pointed to by the next pointer of the head index record is fetched; if that record is null, there is no reusable record, and a new record is inserted directly at the head; if that record is not null, it is reused; wherein an index flag of 1 indicates that the index record is in use, and an index flag of 0 indicates that the index record has been deleted;
the method for reading the HASH index data queries and reads data according to the HASH value, the index flag, and the index value; the specific process is as follows:
first, the HASH value is used to locate the index records corresponding to that HASH value; the query then proceeds through those index records, reading the index flag of each record first and, for records whose index flag is 1, reading the index value and comparing it with the queried index value, until the matching index record is found;
the method for deleting the HASH index data comprises:
when an index record is deleted, its index space is not released and the record is not removed from the index linked list; only its index flag is changed to 0;
when an index record is deleted and it is not at the head of the linked list, the next pointer of the head record is set to point to the currently deleted index record, and the next-deleted-record pointer of the currently deleted index record is set to point to the record that the next pointer of the head record originally pointed to.
2. A HASH index data storage and reading system, at least comprising a processor and a memory, characterized in that: the memory stores therein an executable program of the method of claim 1; the processor running the executable program of the method of claim 1 to index the memory data.
CN202010581485.3A 2020-06-23 2020-06-23 HASH index data storage and reading method and system Active CN111767436B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010581485.3A CN111767436B (en) 2020-06-23 2020-06-23 HASH index data storage and reading method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010581485.3A CN111767436B (en) 2020-06-23 2020-06-23 HASH index data storage and reading method and system

Publications (2)

Publication Number Publication Date
CN111767436A (en) 2020-10-13
CN111767436B (en) 2023-11-10

Family

ID=72722047

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010581485.3A Active CN111767436B (en) 2020-06-23 2020-06-23 HASH index data storage and reading method and system

Country Status (1)

Country Link
CN (1) CN111767436B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114579812B (en) * 2022-03-14 2023-09-15 上海壁仞智能科技有限公司 Management method and device of linked list queue, task management method and storage medium

Citations (9)

Publication number Priority date Publication date Assignee Title
CN101001159A (en) * 2006-12-30 2007-07-18 华为技术有限公司 Decoding method and decoder
CN101187901A (en) * 2007-12-20 2008-05-28 康佳集团股份有限公司 High speed cache system and method for implementing file access
CN101241492A (en) * 2007-02-06 2008-08-13 中兴通讯股份有限公司 EMS memory data storage apparatus possessing capacity dynamic control function and its accomplishing method
CN102375852A (en) * 2010-08-24 2012-03-14 中国移动通信集团公司 Method for building data index as well as method and system using data index for inquiring data
CN102779180A (en) * 2012-06-29 2012-11-14 华为技术有限公司 Operation processing method of data storage system and data storage system
CN102890675A (en) * 2011-07-18 2013-01-23 阿里巴巴集团控股有限公司 Method and device for storing and finding data
CN103810246A (en) * 2013-12-27 2014-05-21 北京天融信软件有限公司 Index building method and device and index query method and device
CN103823865A (en) * 2014-02-25 2014-05-28 南京航空航天大学 Database primary memory indexing method
WO2016122548A1 (en) * 2015-01-29 2016-08-04 Hewlett Packard Enterprise Development Lp Hash index

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US10558705B2 (en) * 2010-10-20 2020-02-11 Microsoft Technology Licensing, Llc Low RAM space, high-throughput persistent key-value store using secondary memory

Non-Patent Citations (1)

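Title
"GPU-based extendible hashing method"; Hu Xuexuan (胡学萱) et al.; Journal of South China University of Technology (Natural Science Edition); Vol. 43, No. 1; pp. 111-117 *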

Also Published As

Publication number Publication date
CN111767436A (en) 2020-10-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant