CN110764705B - Data reading and writing method, device, equipment and storage medium - Google Patents

Data reading and writing method, device, equipment and storage medium

Info

Publication number
CN110764705B
Authority
CN
China
Prior art keywords
key
key value
index
target
key name
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911007543.5A
Other languages
Chinese (zh)
Other versions
CN110764705A (en
Inventor
高魁
谢永恒
万月亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Ruian Technology Co Ltd
Original Assignee
Beijing Ruian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Ruian Technology Co Ltd
Priority to CN201911007543.5A
Publication of CN110764705A
Application granted
Publication of CN110764705B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/0643Management of files
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/0644Management of space entities, e.g. partitions, extents, pools
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • G06F3/0679Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses a data reading and writing method, a device, equipment and a storage medium. The method comprises the following steps: acquiring target data from the partition, and acquiring a target key name and a target key value in the target data; according to the current reading sequence number and/or writing sequence number, storing the target key value and the target key name index in a first array adjacently in a ring writing mode, and storing the target key name and the target key value index in a second array adjacently; the target key name index is used for indicating the storage position of the target key name in the second array, and the target key value index is used for indicating the storage position of the target key value in the first array; and carrying out annular reading on the data in the first array and the second array according to the current reading sequence number. By using the technical scheme of the embodiment of the invention, the data can be directly read and written in the memory, thereby avoiding the discrete I/O problem of the disk and improving the data processing performance.

Description

Data reading and writing method, device, equipment and storage medium
Technical Field
Embodiments of the present invention relate to data processing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for reading and writing data.
Background
In current big data processing frameworks, the streaming framework Spark Streaming offers high throughput and strong fault tolerance when processing batch streaming services.
However, when Spark Streaming processes a batch streaming service, the relationship between the RDD (Resilient Distributed Dataset) at the current stage and the upstream RDD partitions is not one-to-one; this is a wide dependency, and a shuffle operation is required. As shown in the prior-art flow chart of the Spark Streaming shuffle operation in FIG. 1, data from the partitions is first aggregated and the data in the aggregated data set is sorted according to key values; when the amount of data in memory reaches a certain threshold, the data set is written to disk, then pulled back from disk and aggregated again to obtain the final result. Writing to disk during the shuffle operation causes random I/O, which degrades stream-processing performance. To address the discrete disk I/O caused by this reading and writing, most current schemes use SSDs (solid-state disks or solid-state drives), but equipping a large-scale big data cluster with SSDs is too costly.
Disclosure of Invention
The embodiment of the invention provides a data reading and writing method, a device, equipment and a storage medium, so as to realize the reading and writing of data in a memory directly.
In a first aspect, an embodiment of the present invention provides a method for reading and writing data, where the method includes:
acquiring target data from the partition, and acquiring a target key name and a target key value in the target data;
according to the current reading sequence number and/or writing sequence number, storing the target key value and the target key name index in a first array adjacently in a ring writing mode, and storing the target key name and the target key value index in a second array adjacently;
the target key name index is used for indicating the storage position of the target key name in the second array, and the target key value index is used for indicating the storage position of the target key value in the first array;
and carrying out annular reading on the data in the first array and the second array according to the current reading sequence number.
In a second aspect, an embodiment of the present invention further provides a data reading and writing device, where the device includes:
the target data acquisition module is used for acquiring target data from the partition and acquiring a target key name and a target key value in the target data;
The target data writing module is used for adjacently storing the target key value and the target key name index in a first array in a ring writing mode according to the current reading sequence number and/or writing sequence number, and adjacently storing the target key name and the target key value index in a second array;
the target key name index is used for indicating the storage position of the target key name in the second array, and the target key value index is used for indicating the storage position of the target key value in the first array;
and the target data reading module is used for carrying out annular reading on the data in the first array and the second array according to the current reading sequence number.
In a third aspect, an embodiment of the present invention further provides a computer device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements a method for reading and writing data provided in any embodiment of the present invention when the processor executes the program.
In a fourth aspect, embodiments of the present invention also provide a computer readable storage medium having stored thereon a computer program which, when executed by a computer processor, implements a method of reading and writing data as provided by any of the embodiments of the present invention.
According to the embodiment of the invention, the key value and the key name index of the target data are stored in the first array, the key name and the key value index are stored in the second array, and the target data are written in a ring writing mode and read in a ring reading mode. This solves the technical problem of discrete disk I/O that arises because data must be written to disk in the traditional data stream processing process. It avoids the degradation of stream-processing performance that random I/O causes when data is written to disk, and achieves the technical effect of improving data processing performance.
Drawings
FIG. 1 is a flow chart of the operation of spark streaming shuffle in the prior art;
FIG. 2 is a flow chart of a method for reading and writing data according to a first embodiment of the present invention;
FIG. 3 is a flow chart of a method for reading and writing data according to a second embodiment of the present invention;
FIG. 4 is a flow chart of a method for reading and writing data according to a third embodiment of the present invention;
FIG. 5 is a schematic diagram of a data reading and writing device according to a fourth embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a data reading and writing device in a fifth embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.
Example 1
Fig. 2 is a flowchart of a data reading and writing method according to the first embodiment of the present invention. The method may be performed by a data reading and writing device, which may be implemented in software and/or hardware and is generally integrated in a data reading and writing server. The method is applicable to cases where a buffer operation is required when a streaming big data processing framework is used to process streaming data. The method specifically comprises the following steps:
Step 110, acquiring target data from the partition, and acquiring a target key name and a target key value in the target data.
RDD is a fault-tolerant, parallel data structure that allows users to explicitly store data to disk and memory and to control partitioning of the data. The data set inside an RDD is logically (and physically) divided into small sets, each such small set being referred to as a partition, and an RDD may contain multiple partitions. For target data in a partition, the target data is expressed in the form of (key, value), wherein the key is a target key name, and the value is a target key value.
Step 120, according to the current reading sequence number and/or writing sequence number, the target key value and the target key name index are stored in the first array adjacently in a ring writing manner, and the target key name and the target key value index are stored in the second array adjacently.
The target key name index is used for indicating the storage position of the target key name in the second array, and the target key value index is used for indicating the storage position of the target key value in the first array.
The read sequence number indicates the current read position in the first array, and the write sequence number indicates the current write position in the first array. Ring writing means that, once the write sequence number has completed one round of writing over the first array and the second array, it returns to the start and begins a new round. The first array and the second array have the same dimension. The first array stores the target key value and the target key name index; the target key name index is the result of a hash calculation on the target key name, indicates the storage position of the target key name in the second array, and is stored in the position immediately after the target key value. The second array stores the target key name and the target key value index; the target key value index indicates the storage position of the target key value in the first array and is stored in the position immediately after the target key name.
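The two-array layout described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the array size `SLOTS`, the use of `zlib.crc32` as the hash, the flat even/odd pairing of positions, and the function names are all assumptions made for the example, and hash collisions are deliberately ignored here.

```python
import zlib

SLOTS = 4                       # hypothetical number of (item, index) pairs per array
first = [None] * (2 * SLOTS)    # [key value, key name index, key value, key name index, ...]
second = [None] * (2 * SLOTS)   # [key name, key value index, key name, key value index, ...]

def key_name_index(key: str) -> int:
    """Hash the key name to an even position in the second array (assumed scheme)."""
    return (zlib.crc32(key.encode("utf-8")) % SLOTS) * 2

def store(key: str, value: int, write_seq: int) -> int:
    """Store one record at the write position; return the advanced write sequence number."""
    name_pos = key_name_index(key)
    first[write_seq] = value           # key value
    first[write_seq + 1] = name_pos    # key name index, adjacent to the key value
    second[name_pos] = key             # key name
    second[name_pos + 1] = write_seq   # key value index, adjacent to the key name
    return (write_seq + 2) % len(first)  # ring writing: wrap to the start after one round

write_seq = store("apple", 5, 0)
```

Each array points into the other: following the index next to a key value leads to its key name, and following the index next to a key name leads back to its key value.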
Step 130, performing annular reading on the data in the first array and the second array according to the current reading sequence number.
Annular reading means that, once the read sequence number has completed one round of reading over the first array and the second array, it returns to the start and begins a new round.
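The wrap-around behaviour of a sequence number can be illustrated with modular arithmetic. This is only a sketch: the stride of 2 assumes that each record occupies two adjacent positions (an item and its index), and the array length of 8 is a hypothetical value.

```python
def advance(seq: int, array_len: int, stride: int = 2) -> int:
    """Move a sequence number to the next pair, wrapping to 0 after one full round."""
    return (seq + stride) % array_len

positions = []
read_seq = 0
for _ in range(5):
    positions.append(read_seq)
    read_seq = advance(read_seq, array_len=8)
```

After visiting positions 0, 2, 4 and 6, the fifth step returns to position 0, which is the start of a new round.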
According to the technical scheme of this embodiment, the target key value and the target key name index of the target data are stored in the first array, the target key name and the target key value index are stored in the second array, and the two arrays are written in a ring writing mode and read in a ring reading mode. This solves the technical problem in the prior art that, when a shuffle operation exceeds the memory threshold, data must be written to disk, causing discrete disk I/O. It achieves the technical effect of processing the data directly in memory, thereby improving the execution performance of the data processing.
Example two
Fig. 3 is a flowchart of a data reading and writing method in a second embodiment of the present invention, where the writing process of data is further refined based on the foregoing embodiment, and the method in this embodiment specifically includes the steps of:
step 210, obtaining target data from the partition, and obtaining a target key name and a target key value in the target data;
Step 220, using the hash value of the target key name as the target key name index, locating a first key name storage position and a first key value index storage position in the second array, and obtaining a first key value index at the first key value index storage position;
The hash value of the target key name is taken as the target key name index, which indicates the storage position of the target key name in the second array. In the second array, the storage position matching the target key name index is obtained as the first key name storage position. Because the second array stores the target key name and the target key value index, with the target key value index in the position immediately after the target key name, the storage position following the first key name storage position is taken as the first key value index storage position;
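Step 220 can be sketched as a lookup in the second array. The hash function (`zlib.crc32`), the mapping of the hash to an even position, and the name `locate` are assumptions for illustration; the source only states that the hash value of the key name serves as the index.

```python
import zlib

def locate(second: list, key: str):
    """Return (first key name position, first key value index position, first key value index)."""
    slots = len(second) // 2
    name_pos = (zlib.crc32(key.encode("utf-8")) % slots) * 2  # assumed hash-to-position scheme
    index_pos = name_pos + 1          # the key value index sits right after the key name
    return name_pos, index_pos, second[index_pos]

second = [None] * 8                   # hypothetical, initially empty second array
name_pos, index_pos, value_index = locate(second, "apple")
```

On an empty array the returned first key value index is `None`, which corresponds to the "empty" branch checked in the following step.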
step 230, judging whether the first key value index is empty;
wherein if the first key value index is empty, no data has previously been stored at the position, and if the first key value index is not empty, data has previously been stored at the position.
Step 240, if it is determined that the first key value index is not null, locating a first key value storage location and a first key name index storage location in a first array according to the first key value index, and obtaining a first key value at the first key value storage location;
If the first key value index is determined not to be null, data has previously been stored at the position. That data may have been stored in the first array and the second array and not yet read by the read sequence number in the current round; or it may have been read by the read sequence number in the current round and then overwritten by new data; or it may have been read by the read sequence number in the current round and not yet overwritten, in which case the first key value index is not null but the first key value is null. Therefore, the previously stored data needs to be acquired and its key name matched against the target key name of the target data: if the key names are the same, the key values can be accumulated directly; if the key names differ, they cannot. In the first array, the storage position matching the first key value index is obtained as the first key value storage position, and the storage position after the first key value storage position is taken as the first key name index storage position. The first key value is acquired at the first key value storage position.
Step 250, determining whether a hash collision-free storage condition is satisfied according to one or more of the first key value storage location, the read sequence number, the first key value, the first key name and the target key name;
A hash conflict means that the first key name, which is stored at the position in the second array pointed to by the target key name index (the result of the hash calculation on the target key name), differs from the target key name.
If the position number of the first key value storage position is smaller than the read sequence number and the first key value is empty, the hash collision-free storage condition is satisfied. In this case the data previously stored at the position has been read according to the read sequence number and has not been overwritten by new data, so the first key value and the first key name are empty and the target data can be stored directly at the position.
If the position number of the first key value storage position is smaller than the read sequence number, the first key value is non-null, and the target key name is consistent with the first key name, the hash collision-free storage condition is satisfied. In this case the data previously stored at the position has been read according to the read sequence number and has been overwritten by new data, and the key name of the new data (namely, the first key name) is the same as the target key name of the target data, so the target key value of the target data can be accumulated directly with the key value of the new data.
If the position number of the first key value storage position is smaller than the read sequence number, the first key value is non-null, and the target key name is inconsistent with the first key name, the hash collision-free storage condition is not satisfied. In this case the data previously stored at the position has been read according to the read sequence number and has been overwritten by new data, but the key name of the new data (namely, the first key name) differs from the target key name of the target data; that is, a hash collision exists and the target data cannot be stored at the position.
If the position number of the first key value storage position is greater than the read sequence number and the target key name is consistent with the first key name, the hash collision-free storage condition is satisfied. In this case the data previously stored at the position has not yet been read according to the read sequence number, and the key name of the stored data (namely, the first key name) is the same as the target key name of the target data, so the key value of the target data can be accumulated directly with the key value of the stored data.
If the position number of the first key value storage position is greater than the read sequence number and the target key name is inconsistent with the first key name, the hash collision-free storage condition is not satisfied. In this case the data previously stored at the position has not yet been read according to the read sequence number, but the key name of the stored data (namely, the first key name) differs from the target key name of the target data; that is, a hash collision exists and the target data cannot be stored at the position.
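The five cases above reduce to a small decision function. The sketch below is an illustration under assumptions: the function name and parameter names are hypothetical, `None` stands for "empty", and the case where the position number equals the read sequence number is not covered by the text, so it is treated here as "not yet read".

```python
from typing import Optional

def collision_free(value_pos: int, read_seq: int, first_value: Optional[int],
                   first_name: Optional[str], target_name: str) -> bool:
    """Return True when the hash collision-free storage condition is satisfied."""
    if value_pos < read_seq:
        # Already read in this round.
        if first_value is None:
            return True                      # read and not overwritten: slot is free
        return first_name == target_name     # overwritten: free only if key names match
    # Not yet read in this round (value_pos == read_seq is an assumption, see lead-in).
    return first_name == target_name
```

For example, `collision_free(2, 4, None, None, "a")` corresponds to the first case (position already read and still empty), while `collision_free(6, 4, 3, "b", "a")` corresponds to the last case (unread data under a different key name).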
Step 260, if the hash collision-free storage condition is satisfied, storing the target key name index in the first key name index storage location, storing the target key name in the first key name storage location, and accumulating and storing the target key value in the first key value storage location.
If the hash collision-free storage condition is met, the target data can be stored at the position: the target key value of the target data is accumulated and stored in the first key value storage position in the first array, and the target key name of the target data is stored in the first key name storage position in the second array.
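Step 260 might look like the following sketch. It assumes that key values are numeric and that "accumulating" means addition, neither of which the text states explicitly; the function name and the sample arrays are hypothetical.

```python
def accumulate_store(first, second, name_pos, value_pos, key, value):
    """Step 260 sketch: write into an existing slot when no hash collision exists."""
    current = first[value_pos]
    # Accumulate with the stored key value, or store directly if the slot was emptied.
    first[value_pos] = value if current is None else current + value
    first[value_pos + 1] = name_pos   # target key name index
    second[name_pos] = key            # target key name

# "apple" already holds the value 3 at position 0 of the first array.
first = [3, 4, None, None, None, None]
second = [None, None, None, None, "apple", 0]
accumulate_store(first, second, name_pos=4, value_pos=0, key="apple", value=2)
```

After the call, position 0 of the first array holds the accumulated value for "apple".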
Step 270, if the hash collision-free storage condition is not satisfied, in the second array, taking the next key name storage position after the first key name storage position as a starting point, obtaining a second key name storage position closest to the first key name storage position and not storing a key name, and determining a second key name index according to the second key name storage position;
If the hash collision-free storage condition is not satisfied, a hash collision exists and the target data cannot be stored at the position. Therefore, starting from the key name storage position after the first key name storage position in the second array, the first empty key name storage position is acquired as the second key name storage position, and a second key name index is determined according to the second key name storage location.
Step 280, determining a second key value storage position in the first array according to the writing sequence number, and determining a second key value index according to the second key value storage position;
and taking the position of the first array pointed by the current writing sequence number as a second key value storage position for storing the target key value of the target data. The second key value storage location is used as a second key value index.
Step 290, storing the target key value in the second key value storage location, storing the target key name in the second key name storage location, storing the second key name index in a storage location subsequent to the second key value storage location, and storing the second key value index in a storage location subsequent to the second key name storage location.
After obtaining the second key name storage location, the second key name index, the second key value storage location and the second key value index according to step 270 and step 280, storing the target key value in the second key value storage location, and storing the second key name index in a storage location subsequent to the second key value storage location, wherein the target key value and the second key name index are both stored in the first array; and storing the target key name in the second key name storage position, and storing a second key value index in a storage position behind the second key name storage position, wherein the target key name and the second key value index are both stored on a second array.
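Steps 270 to 290 resemble linear probing in an open-addressing hash table. The sketch below rests on several assumptions: records occupy two adjacent positions, the probe wraps around the end of the second array, `None` marks an empty key name position, and the function name is hypothetical.

```python
def store_after_collision(first, second, first_name_pos, key, value, write_seq):
    """Steps 270-290 sketch: probe for an empty key name slot, then write the record."""
    n = len(second)
    pos = (first_name_pos + 2) % n            # start at the next key name position
    while pos != first_name_pos:
        if second[pos] is None:               # nearest position not storing a key name
            break
        pos = (pos + 2) % n                   # assumed wrap-around probing
    else:
        raise RuntimeError("no empty key name storage position")
    first[write_seq] = value                  # second key value storage position
    first[write_seq + 1] = pos                # second key name index
    second[pos] = key                         # second key name storage position
    second[pos + 1] = write_seq               # second key value index
    return pos, (write_seq + 2) % len(first)  # advance the write sequence number

first = [1, 2, None, None, None, None, None, None]
second = [None, None, "pear", 0, None, None, None, None]
# "plum" hashes (hypothetically) to position 2, which is occupied by "pear".
name_pos, write_seq = store_after_collision(first, second, 2, "plum", 7, 2)
```

Here the probe finds position 4 empty, so "plum" is stored there and its value is written at the current write position in the first array.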
Step 2100, if it is determined that the first key value index is empty, determining a third key name index according to the first key name storage location;
If the first key value index is empty, no data has been written at the position in either a previous writing process or the current one, and the target data can be written directly at the position. A third key name index is acquired according to the first key name storage position.
Step 2110, determining a third key value storage location in the first array according to the write sequence number, and determining a third key value index according to the third key value storage location;
and determining a storage position of a third key value according to the position in the first array pointed by the current writing sequence number, and determining a third key value index according to the third key value storage position.
Step 2120, storing the target key value in the third key value storage location, storing a target key name in the first key name storage location, storing the third key name index in a storage location subsequent to the third key value storage location, and storing the third key value index in a storage location subsequent to the first key name storage location.
Storing a target key value into a third key value storage position, and storing a third key name index into a storage position behind the third key value storage position, wherein the target key value and the third key name index are both stored on a first array; storing a target key name in the first key name storage location, and storing a third key value index in a storage location subsequent to the first key name storage location, wherein the target key name and the third key value index are both stored on a second array.
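Steps 2100 to 2120 cover the simplest write path, where the hashed slot has never been used. A minimal sketch, assuming two adjacent positions per record and a hypothetical function name:

```python
def store_in_empty_slot(first, second, first_name_pos, key, value, write_seq):
    """Steps 2100-2120 sketch: write a record whose hashed slot has never been used."""
    third_name_index = first_name_pos         # derived from the first key name position
    third_value_index = write_seq             # position pointed to by the write sequence number
    first[third_value_index] = value                 # target key value
    first[third_value_index + 1] = third_name_index  # third key name index
    second[first_name_pos] = key                     # target key name
    second[first_name_pos + 1] = third_value_index   # third key value index
    return (write_seq + 2) % len(first)       # advance the write sequence number

first = [None] * 8
second = [None] * 8
write_seq = store_in_empty_slot(first, second, first_name_pos=6, key="fig", value=9, write_seq=0)
```

The key name lands at the hashed position in the second array, while the key value lands wherever the write sequence number currently points in the first array.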
Step 2130, performing annular reading on the data in the first array and the second array according to the current reading sequence number.
According to the technical scheme of the embodiment, the target key value, the target key name and the target key name index of the target data are obtained, the first key value index position is determined according to the target key name index, and the writing of the data is classified into three cases by judging whether the first key value index position is empty or not and whether the hash collision-free storage condition is met or not, so that different writing situations are distinguished. The method solves the technical problem that when the memory threshold is exceeded by the shuffle operation in the prior art, data is required to be written into a disk, so that discrete I/O of the disk is caused. The method achieves the technical effect that the data can be directly processed in the memory by annular writing of the data without writing the data into a disk, thereby improving the execution performance of the data processing.
Example III
Fig. 4 is a flowchart of a data reading and writing method in the third embodiment of the present invention, where the data reading flow is further refined based on the foregoing embodiment, and the method in this embodiment specifically includes the steps of:
step 310, obtaining target data from the partition, and obtaining a target key name and a target key value in the target data;
step 320, according to the current reading sequence number and/or writing sequence number, storing the target key value and the target key name index in the first array adjacently in a ring writing manner, and storing the target key name and the target key value index in the second array adjacently;
the target key name index is used for indicating the storage position of the target key name in the second array, and the target key value index is used for indicating the storage position of the target key value in the first array;
step 330, determining a current key value storage position pointed by the read sequence number in the first array according to the current read sequence number, obtaining a current key value, and emptying the current key value storage position;
and acquiring the current key value according to the position in the first array pointed by the current reading sequence number, and setting the position to be empty to indicate that the position is read, so that writing of other data is convenient.
Step 340, determining a current key name index according to a storage position subsequent to the current key value storage position, determining a storage position of the current key name in the second array according to the current key name index, obtaining the current key name, and emptying the storage position of the current key name.
The position in the first array pointed by the reading sequence number is the storage position of the current key value, and the storage position behind the storage position of the current key value stores the current key name index. And according to the current key name index, positioning the storage position of the current key name in the second array, acquiring the current key name, and emptying the storage position of the current key name to indicate that the storage position is read, so that writing of other data is facilitated.
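Steps 330 and 340 can be sketched as a single read operation. The sketch assumes `None` marks an emptied position and two adjacent positions per record; the function name and sample arrays are hypothetical. Note that only the key value and key name positions are emptied, while the two index positions are left in place.

```python
def ring_read(first, second, read_seq):
    """Steps 330-340 sketch: read one record and empty its key value and key name slots."""
    value = first[read_seq]
    first[read_seq] = None                 # empty the current key value position
    name_index = first[read_seq + 1]       # current key name index (retained)
    name = None
    if name_index is not None:
        name = second[name_index]
        second[name_index] = None          # empty the current key name position
    return name, value, (read_seq + 2) % len(first)

first = [5, 2, None, None]
second = [None, None, "apple", 0]
name, value, read_seq = ring_read(first, second, 0)
```

After the read, the emptied positions are available for new data, and the retained indexes let a later write recognise that the slot was used before.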
According to the embodiment of the invention, the target data is circularly written in the first array and the second array, the key value position pointed by the current reading sequence number is emptied according to the reading sequence number, the key name position in the second array is emptied according to the key name index of the position behind the key value position, and the key name index storage position in the first array and the key value index storage position in the second array are reserved. The method solves the technical problem that when the memory threshold is exceeded by the shuffle operation in the prior art, data is required to be written into a disk, so that discrete I/O of the disk is caused. The method has the advantages that the key value position and the key name position are emptied by annular writing and annular reading of the data so as to facilitate writing of new data, the data is directly processed in the memory, the data is not required to be written into a magnetic disk, and therefore the execution performance of the data processing is improved.
Example four
Fig. 5 is a schematic structural diagram of a data reading and writing device according to a fourth embodiment of the present invention, where the data reading and writing device includes: a target data acquisition module 410, a target data writing module 420, and a target data reading module 430. Wherein:
a target data acquisition module 410, configured to obtain target data from a partition, and obtain a target key name and a target key value in the target data;
a target data writing module 420, configured to store the target key value and the target key name index adjacently in a first array in a ring writing manner according to a current reading sequence number and/or writing sequence number, and store the target key name and the target key value index adjacently in a second array;
the target key name index is used for indicating the storage position of the target key name in the second array, and the target key value index is used for indicating the storage position of the target key value in the first array;
and a target data reading module 430, configured to perform annular reading on the data in the first array and the second array according to the current reading sequence number.
According to this technical scheme, the target key value and the target key name in the target data, together with the indexes of their storage positions, are written into the first array and the second array in a ring writing manner and read in a ring reading manner. This solves the technical problem that, when a shuffle operation exceeds the memory threshold, data must be written to disk, which causes discrete disk I/O, and achieves the technical effect that the data is processed directly in memory, which improves the execution performance of data processing.
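The adjacency of values and indexes in the two arrays can be pictured with a short sketch. It assumes flat Python lists in which even positions hold a key value (first array) or key name (second array) and the following odd position holds the index into the other array. The helper names `make_arrays` and `link` are hypothetical and are used only for illustration.

```python
EMPTY = None  # assumed marker for an empty storage position


def make_arrays(pairs):
    """Allocate both arrays with room for `pairs` key-value pairs."""
    first = [EMPTY] * (2 * pairs)    # [value, name_index, value, name_index, ...]
    second = [EMPTY] * (2 * pairs)   # [name, value_index, name, value_index, ...]
    return first, second


def link(first, second, value_pos, name_pos, name, value):
    """Store one pair so that each array points at the other."""
    first[value_pos] = value
    first[value_pos + 1] = name_pos   # key name index follows the key value
    second[name_pos] = name
    second[name_pos + 1] = value_pos  # key value index follows the key name


first, second = make_arrays(4)
link(first, second, value_pos=0, name_pos=4, name="apple", value=3)
```

With this layout, a reader holding only a position in the first array can reach the key name, and a writer holding only a hashed position in the second array can reach the key value.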
Based on the above embodiment, the target data writing module 420 includes:
a first key value index obtaining unit, configured to locate a first key name storage location and a first key value index storage location in the second array with the hash value of the target key name as the target key name index, and obtain a first key value index at the first key value index storage location;
a first key value obtaining unit, configured to locate a first key value storage location and a first key name index storage location in a first array according to the first key value index if it is determined that the first key value index is not null, and obtain a first key value at the first key value storage location;
a condition judging unit, configured to determine whether a hash collision-free storage condition is satisfied according to one or more of the first key value storage location, the read sequence number, the first key value, the first key name, and the target key name;
and the satisfaction condition executing unit is used for storing the target key name index in the first key name index storage position, storing the target key name in the first key name storage position and accumulating and storing the target key value in the first key value storage position if the hash collision-free storage condition is satisfied.
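The branch taken when the hash collision-free storage condition is satisfied might look like the sketch below. It assumes that "accumulating" means numeric addition and that an emptied key value position is treated as having no prior value. Both points are assumptions, as is the function name `store_without_collision`.

```python
EMPTY = None  # assumed marker for an empty storage position


def store_without_collision(first, second, name_pos, value_pos, name, value):
    """Reuse the already linked positions and accumulate the key value."""
    first[value_pos + 1] = name_pos   # target key name index
    second[name_pos] = name           # target key name
    current = first[value_pos]
    # Assumed accumulation rule: add to the existing value, or start fresh
    # if the position was emptied by an earlier read.
    first[value_pos] = value if current is EMPTY else current + value


first = [3, 4, EMPTY, EMPTY, EMPTY, EMPTY]
second = [EMPTY, EMPTY, EMPTY, EMPTY, "apple", 0]
store_without_collision(first, second, name_pos=4, value_pos=0, name="apple", value=2)
```

Here the second write for the same key name updates the stored key value in place instead of consuming a new position.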
Based on the above embodiment, the target data writing module 420 further includes:
an unsatisfied condition executing unit, configured to, if the hash collision-free storage condition is not satisfied, acquire in the second array, starting from the next key name storage position after the first key name storage position, a second key name storage position that is closest to the first key name storage position and does not store a key name, and determine a second key name index according to the second key name storage position;
a second key value index determining unit, configured to determine a second key value storage location in the first array according to the write sequence number, and determine a second key value index according to the second key value storage location;
and the first target data storage unit is used for storing the target key value in the second key value storage position, storing a target key name in the second key name storage position, storing the second key name index in a storage position behind the second key value storage position, and storing the second key value index in a storage position behind the second key name storage position.
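The collision branch handled by these three units resembles linear probing over the key name positions. The sketch below assumes key names occupy every second position of the second array, that the search wraps around the end of the array, and that the writing sequence number advances by one pair after each write. None of these details is fixed by the text above, and `store_after_collision` is a hypothetical name.

```python
EMPTY = None  # assumed marker for an empty storage position


def store_after_collision(first, second, first_name_pos, write_seq, name, value):
    """Place a colliding pair in the nearest free key name position."""
    n = len(second)
    pos = (first_name_pos + 2) % n            # next key name position
    while pos != first_name_pos and second[pos] is not EMPTY:
        pos = (pos + 2) % n                   # assumed wrap-around search
    if pos == first_name_pos:
        raise RuntimeError("no free key name storage position")
    value_pos = write_seq % len(first)        # second key value storage position
    first[value_pos] = value
    first[value_pos + 1] = pos                # second key name index
    second[pos] = name
    second[pos + 1] = value_pos               # second key value index
    return (value_pos + 2) % len(first)       # assumed next writing sequence number


first = [1, 2, 5, 4, EMPTY, EMPTY, EMPTY, EMPTY]
second = [EMPTY, EMPTY, "a", 0, "c", 2, EMPTY, EMPTY]
next_write = store_after_collision(first, second, first_name_pos=2,
                                   write_seq=4, name="b", value=9)
```

In this example the key name "b" collides with "a" at position 2, skips the occupied position 4, and lands at position 6.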
On the basis of the above embodiment, the condition judgment unit specifically includes:
A first satisfaction condition determining unit, configured to determine that the hash collision free storage condition is satisfied if it is determined that the position number of the first key value storage position is smaller than the read sequence number and the first key value is null;
a second satisfaction condition determining unit, configured to determine that the hash collision free storage condition is satisfied if it is determined that the position number of the first key value storage position is smaller than the read sequence number, the first key value is non-null, and the target key name is consistent with the first key name;
a first unsatisfied condition determining unit, configured to determine that the hash collision free storage condition is not satisfied if it is determined that the position number of the first key value storage position is smaller than the read sequence number, the first key value is non-null, and the target key name is inconsistent with the first key name;
a third satisfaction condition determining unit, configured to determine that the hash collision free storage condition is satisfied if it is determined that the position number of the first key value storage position is greater than the read sequence number and the target key name is consistent with the first key name; and
and a second unsatisfied condition determining unit, configured to determine that the hash collision free storage condition is not satisfied if it is determined that the position number of the first key value storage position is greater than the read sequence number and the target key name is inconsistent with the first key name.
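The five determinations above collapse into a small predicate, sketched below. The text does not say what happens when the position number equals the read sequence number, so this sketch treats that case like the "greater than" case. That choice, and the name `collision_free`, are assumptions.

```python
EMPTY = None  # assumed marker for an empty storage position


def collision_free(value_pos, read_seq, first_value, first_name, target_name):
    """Return True if the hash collision-free storage condition is satisfied."""
    names_match = target_name == first_name
    if value_pos < read_seq:
        # Already read (empty) positions may be reused; otherwise names must match.
        return first_value is EMPTY or names_match
    # value_pos > read_seq (and, by assumption, value_pos == read_seq)
    return names_match


results = [
    collision_free(0, 4, EMPTY, "a", "b"),  # smaller, empty
    collision_free(0, 4, 1, "a", "a"),      # smaller, non-empty, same name
    collision_free(0, 4, 1, "a", "b"),      # smaller, non-empty, different name
    collision_free(6, 4, 1, "a", "a"),      # greater, same name
    collision_free(6, 4, 1, "a", "b"),      # greater, different name
]
```

The five entries of `results` correspond, in order, to the first satisfaction, second satisfaction, first unsatisfied, third satisfaction, and second unsatisfied condition determining units.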
On the basis of the above embodiment, the first key value index obtaining unit is specifically configured to:
take the hash value of the target key name as the target key name index, acquire the storage position in the second array that matches the target key name index as the first key name storage position, and take the storage position after the first key name storage position as the first key value index storage position.
On the basis of the above embodiment, the first key value obtaining unit is specifically configured to:
if the first key value index is determined not to be empty, acquire the storage position in the first array that matches the first key value index as the first key value storage position, take the storage position after the first key value storage position as the first key name index storage position, and obtain a first key value at the first key value storage position.
Based on the above embodiment, the target data writing module 420 further includes:
a third key name index determining unit, configured to determine a third key name index according to the first key name storage location if it is determined that the first key value index is empty;
a third key value index determining unit, configured to determine a third key value storage location in the first array according to the write sequence number, and determine a third key value index according to the third key value storage location;
and a second target data storage unit, configured to store the target key value in the third key value storage location, store the target key name in the first key name storage location, store the third key name index in a storage location subsequent to the third key value storage location, and store the third key value index in a storage location subsequent to the first key name storage location.
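The case in which the first key value index is empty (the hashed key name position has never been linked) can be sketched as follows. The mapping from hash value to storage position is not spelled out above, so the sketch assumes a CRC32 hash reduced modulo the number of pairs and doubled so that it lands on a key name position. `name_slot` and `store_first_time` are hypothetical names.

```python
import zlib

EMPTY = None  # assumed marker for an empty storage position


def name_slot(name, second):
    """Assumed mapping from the key name hash to a key name position."""
    pairs = len(second) // 2
    return 2 * (zlib.crc32(name.encode("utf-8")) % pairs)


def store_first_time(first, second, write_seq, name, value):
    """Store a pair whose hashed key name position has no key value index yet."""
    name_pos = name_slot(name, second)          # third key name index
    if second[name_pos + 1] is not EMPTY:
        raise ValueError("position already linked; another branch applies")
    value_pos = write_seq % len(first)          # third key value storage position
    first[value_pos] = value
    first[value_pos + 1] = name_pos
    second[name_pos] = name
    second[name_pos + 1] = value_pos            # third key value index
    return name_pos, value_pos


first = [EMPTY] * 8
second = [EMPTY] * 8
name_pos, value_pos = store_first_time(first, second, 0, "apple", 3)
```

A deterministic hash is used here only so that the example behaves the same on every run. The text above refers simply to the hash value of the target key name.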
Based on the above embodiment, the target data reading module 430 includes:
a key value storage position emptying unit, configured to determine, according to the current read sequence number, a current key value storage position pointed by the read sequence number in the first array, obtain a current key value, and empty the current key value storage position;
and the key name storage position emptying unit is used for determining a current key name index according to the later storage position of the current key value storage position, determining the storage position of the current key name in the second array according to the current key name index, acquiring the current key name and emptying the storage position of the current key name.
The data read-write device provided by the embodiment of the invention can execute the data read-write method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example five
Fig. 6 is a schematic structural diagram of a device according to a fifth embodiment of the present invention. As shown in Fig. 6, the device includes a processor 50, a memory 51, an input device 52 and an output device 53. The number of processors 50 in the device may be one or more, and one processor 50 is taken as an example in Fig. 6. The processor 50, the memory 51, the input device 52 and the output device 53 in the device may be connected by a bus or by other means; connection by a bus is taken as an example in Fig. 6.
The memory 51 is a computer readable storage medium, and may be used to store a software program, a computer executable program, and modules, such as program instructions/modules (e.g., a target data acquisition module 410, a target data writing module 420, and a target data reading module 430 in a data reading and writing device) corresponding to a data reading and writing method in an embodiment of the present invention. The processor 50 executes various functional applications of the device and data processing, i.e., implements the above-described data reading and writing method, by running software programs, instructions, and modules stored in the memory 51. The method comprises the following steps:
acquiring target data from the partition, and acquiring a target key name and a target key value in the target data;
According to the current reading sequence number and/or writing sequence number, storing the target key value and the target key name index in a first array adjacently in a ring writing mode, and storing the target key name and the target key value index in a second array adjacently;
the target key name index is used for indicating the storage position of the target key name in the second array, and the target key value index is used for indicating the storage position of the target key value in the first array;
and carrying out annular reading on the data in the first array and the second array according to the current reading sequence number.
The memory 51 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to the use of the terminal, etc. In addition, the memory 51 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 51 may further include memory located remotely from the processor 50, which may be connected to the device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input means 52 may be used to receive entered numeric or character information and to generate key signal inputs related to user settings and function control of the device. The output means 53 may comprise a display device such as a display screen.
Example six
A sixth embodiment of the present invention also provides a storage medium containing computer-executable instructions, which when executed by a computer processor, are for performing a method of reading and writing data, the method comprising:
acquiring target data from the partition, and acquiring a target key name and a target key value in the target data;
according to the current reading sequence number and/or writing sequence number, storing the target key value and the target key name index in a first array adjacently in a ring writing mode, and storing the target key name and the target key value index in a second array adjacently;
the target key name index is used for indicating the storage position of the target key name in the second array, and the target key value index is used for indicating the storage position of the target key value in the first array;
and carrying out annular reading on the data in the first array and the second array according to the current reading sequence number.
Of course, the storage medium containing the computer executable instructions provided in the embodiments of the present invention is not limited to the method operations described above, and may also perform the related operations in the data reading and writing method provided in any embodiment of the present invention.
From the above description of embodiments, it will be clear to a person skilled in the art that the present invention may be implemented by means of software and necessary general purpose hardware, but of course also by means of hardware, although in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, etc., and include several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments of the present invention.
It should be noted that, in the embodiment of the data reading and writing apparatus, each unit and module included are only divided according to the functional logic, but not limited to the above-mentioned division, so long as the corresponding functions can be implemented; in addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the present invention.
Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.

Claims (7)

1. A method of reading and writing data, comprising:
acquiring target data from the partition, and acquiring a target key name and a target key value in the target data;
according to the current reading sequence number and/or writing sequence number, storing the target key value and the target key name index in a first array adjacently in a ring writing mode, and storing the target key name and the target key value index in a second array adjacently;
wherein storing the target key value and the target key name index adjacently in the first array in a ring writing mode according to the current reading sequence number and/or writing sequence number, and storing the target key name and the target key value index adjacently in the second array, comprises:
taking the hash value of the target key name as the target key name index, locating a first key name storage position and a first key value index storage position in the second array, and acquiring a first key value index at the first key value index storage position;
if the first key value index is determined not to be empty, positioning a first key value storage position and a first key name index storage position in a first array according to the first key value index, and acquiring a first key value at the first key value storage position;
determining whether a hash collision-free storage condition is met according to one or more of the first key value storage position, the reading sequence number, the first key value, the first key name and the target key name;
if yes, storing the target key name index in the first key name index storage position, storing the target key name in the first key name storage position, and accumulating and storing the target key value in the first key value storage position;
if not, in the second array, taking the next key name storage position behind the first key name storage position as a starting point, acquiring a second key name storage position closest to the first key name storage position and not storing a key name, and determining a second key name index according to the second key name storage position;
Determining a second key value storage position in the first array according to the writing sequence number, and determining a second key value index according to the second key value storage position;
storing the target key value in the second key value storage position, storing a target key name in the second key name storage position, storing the second key name index in a storage position subsequent to the second key value storage position, and storing the second key value index in a storage position subsequent to the second key name storage position;
if the first key value index is determined to be empty, a third key name index is determined according to the first key name storage position;
determining a third key value storage location in the first array according to the write sequence number, and determining a third key value index according to the third key value storage location;
storing the target key value in the third key value storage location, storing a target key name in the first key name storage location, storing the third key name index in a storage location subsequent to the third key value storage location, and storing the third key value index in a storage location subsequent to the first key name storage location;
the target key name index is used for indicating the storage position of the target key name in the second array, and the target key value index is used for indicating the storage position of the target key value in the first array;
And carrying out annular reading on the data in the first array and the second array according to the current reading sequence number.
2. The method of claim 1, wherein determining whether a hash collision free storage condition is satisfied based on one or more of the first key value storage location, the read sequence number, the first key value, the first key name, and the target key name comprises at least one of:
if the position number of the first key value storage position is smaller than the reading sequence number and the first key value is empty, determining that the hash collision-free storage condition is met;
if the position number of the first key value storage position is smaller than the reading sequence number, the first key value is non-null, and the target key name is consistent with the first key name, determining that the hash collision-free storage condition is met;
if the position number of the first key value storage position is smaller than the reading sequence number, the first key value is non-null, and the target key name is inconsistent with the first key name, determining that the hash collision-free storage condition is not met; and
if the position number of the first key value storage position is determined to be larger than the reading sequence number and the target key name is consistent with the first key name, determining that the hash-conflict-free storage condition is met; and
And if the position number of the first key value storage position is determined to be larger than the reading sequence number and the target key name is inconsistent with the first key name, determining that the hash collision-free storage condition is not met.
3. The method of claim 1, wherein taking the hash value of the target key name as the target key name index and locating a first key name storage location and a first key value index storage location in the second array comprises:
in the second array, a storage position matched with the target key name index is obtained to be used as the first key name storage position, and a storage position after the first key name storage position is used as the first key value index storage position;
positioning a first key value storage position and a first key name index storage position in a first array according to the first key value index, wherein the positioning comprises the following steps:
and in the first array, a storage position matched with the first key value index is obtained to be used as a first key value storage position, and a storage position behind the first key value storage position is used as the first key name index storage position.
4. A method according to any one of claims 1-3, wherein performing a circular read of data in the first array and the second array according to the current read sequence number comprises:
Determining a current key value storage position pointed by the reading sequence number in the first array according to the current reading sequence number, acquiring a current key value, and emptying the current key value storage position;
and determining a current key name index according to the later storage position of the current key value storage position, determining the storage position of the current key name in the second array according to the current key name index, acquiring the current key name, and emptying the storage position of the current key name.
5. A data reading and writing apparatus, comprising:
the target data acquisition module is used for acquiring target data from the partition and acquiring a target key name and a target key value in the target data;
the target data writing module is used for adjacently storing the target key value and the target key name index in a first array in a ring writing mode according to the current reading sequence number and/or writing sequence number, and adjacently storing the target key name and the target key value index in a second array;
the target key name index is used for indicating the storage position of the target key name in the second array, and the target key value index is used for indicating the storage position of the target key value in the first array;
A target data writing module comprising:
a first key value index obtaining unit, configured to locate a first key name storage location and a first key value index storage location in the second array with the hash value of the target key name as the target key name index, and obtain a first key value index at the first key value index storage location;
a first key value obtaining unit, configured to locate a first key value storage location and a first key name index storage location in a first array according to the first key value index if it is determined that the first key value index is not null, and obtain a first key value at the first key value storage location;
a condition judging unit, configured to determine whether a hash collision-free storage condition is satisfied according to one or more of the first key value storage location, the read sequence number, the first key value, the first key name, and the target key name;
the satisfaction condition executing unit is used for storing the target key name index in the first key name index storage position, storing the target key name in the first key name storage position and accumulating and storing the target key value in the first key value storage position if the hash collision-free storage condition is satisfied;
an unsatisfied condition executing unit, configured to, if the hash collision-free storage condition is not satisfied, acquire in the second array, starting from the next key name storage position after the first key name storage position, a second key name storage position that is closest to the first key name storage position and does not store a key name, and determine a second key name index according to the second key name storage position;
a second key value index determining unit, configured to determine a second key value storage location in the first array according to the write sequence number, and determine a second key value index according to the second key value storage location;
a first target data storage unit, configured to store the target key value in the second key value storage location, store a target key name in the second key name storage location, store the second key name index in a storage location subsequent to the second key value storage location, and store the second key value index in a storage location subsequent to the second key name storage location;
a third key name index determining unit, configured to determine a third key name index according to the first key name storage location if it is determined that the first key value index is empty;
A third key value index determining unit configured to determine a third key value storage location in the first array according to the write sequence number, and determine a third key value index according to the third key value storage location;
a second target data storage unit configured to store the target key value in the third key value storage location, store a target key name in the first key name storage location, store the third key name index in a storage location subsequent to the third key value storage location, and store the third key value index in a storage location subsequent to the first key name storage location;
and the target data reading module is used for carrying out annular reading on the data in the first array and the second array according to the current reading sequence number.
6. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements a method of reading and writing data according to any of claims 1-4 when executing the program.
7. A storage medium containing computer executable instructions which, when executed by a computer processor, are for performing a method of reading and writing data as claimed in any of claims 1 to 4.
CN201911007543.5A 2019-10-22 2019-10-22 Data reading and writing method, device, equipment and storage medium Active CN110764705B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911007543.5A CN110764705B (en) 2019-10-22 2019-10-22 Data reading and writing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110764705A CN110764705A (en) 2020-02-07
CN110764705B true CN110764705B (en) 2023-08-04

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102323947A (en) * 2011-09-05 2012-01-18 东北大学 Generation method of pre-join table on ring-shaped schema database
CN102722449A (en) * 2012-05-24 2012-10-10 中国科学院计算技术研究所 Key-Value local storage method and system based on solid state disk (SSD)
CN103186668A (en) * 2013-03-11 2013-07-03 北京京东世纪贸易有限公司 Method and device for processing data as well as data storage system based on key value data base
CN105320775A (en) * 2015-11-11 2016-02-10 中科曙光信息技术无锡有限公司 Data access method and apparatus
CN106096023A (en) * 2016-06-24 2016-11-09 腾讯科技(深圳)有限公司 Method for reading data, method for writing data and data server
WO2018103315A1 (en) * 2016-12-09 2018-06-14 上海壹账通金融科技有限公司 Monitoring data processing method, apparatus, server and storage equipment
CN108572958A (en) * 2017-03-07 2018-09-25 腾讯科技(深圳)有限公司 Data processing method and device
CN106886375A (en) * 2017-03-27 2017-06-23 百度在线网络技术(北京)有限公司 Method and apparatus for data storage
CN108664487A (en) * 2017-03-28 2018-10-16 Tcl集团股份有限公司 Hash table data writing and reading method and system
CN108563532A (en) * 2018-02-28 2018-09-21 深圳和而泰数据资源与云技术有限公司 Data processing method and relevant apparatus
CN108595268A (en) * 2018-04-24 2018-09-28 咪咕文化科技有限公司 MapReduce-based data distribution method, device and computer readable storage medium
CN109558084A (en) * 2018-11-29 2019-04-02 文华学院 Data processing method and relevant device
CN109543080A (en) * 2018-12-04 2019-03-29 北京字节跳动网络技术有限公司 Cache data processing method, device, electronic equipment and storage medium
CN109656923A (en) * 2018-12-19 2019-04-19 北京字节跳动网络技术有限公司 Data processing method, device, electronic equipment and storage medium
CN109710190A (en) * 2018-12-26 2019-05-03 百度在线网络技术(北京)有限公司 Data storage method, device, equipment and storage medium
CN110049091A (en) * 2019-01-10 2019-07-23 阿里巴巴集团控股有限公司 Data storage method and device, electronic equipment, storage medium

Also Published As

Publication number Publication date
CN110764705A (en) 2020-02-07

Similar Documents

Publication Publication Date Title
CN110096336B (en) Data monitoring method, device, equipment and medium
CN105786405B (en) Online upgrading method, apparatus and system
US10878335B1 (en) Scalable text analysis using probabilistic data structures
US11650990B2 (en) Method, medium, and system for joining data tables
US10523743B2 (en) Dynamic load-based merging
US11429566B2 (en) Approach for a controllable trade-off between cost and availability of indexed data in a cloud log aggregation solution such as splunk or sumo
CN109547807B (en) Information processing method and device based on live broadcast and server
CN110764705B (en) Data reading and writing method, device, equipment and storage medium
US10642530B2 (en) Global occupancy aggregator for global garbage collection scheduling
CN110955857A (en) Service processing method and device for high concurrency environment
CN103973470A (en) Cluster management method and equipment for shared-nothing cluster
CN106708865B (en) Method and device for accessing window data in stream processing system
US20220035666A1 (en) Method and apparatus for data processing, server and storage medium
CN116881051B (en) Data backup and recovery method and device, electronic equipment and storage medium
CN111046004B (en) Data file storage method, device, equipment and storage medium
CN111400241B (en) Data reconstruction method and device
CN109101595B (en) Information query method, device, equipment and computer readable storage medium
CN103414756A (en) Task distributing method and distributing node and system
US8935710B1 (en) Unique event identification
CN112181662B (en) Task scheduling method and device, electronic equipment and storage medium
CN111131512B (en) Equipment information processing method and device, storage medium and processor
JP7133037B2 (en) Message processing method, device and system
CN113076197A (en) Load balancing method and device, storage medium and electronic equipment
US10567507B2 (en) Message processing method and apparatus, and message processing system
CN111061712A (en) Data connection operation processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant