CN112732194B - Irregular data storage method, device and storage medium - Google Patents

Info

Publication number
CN112732194B
Authority
CN
China
Prior art keywords
data
area
storage
tail
storage unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110043234.4A
Other languages
Chinese (zh)
Other versions
CN112732194A (en)
Inventor
曲远汶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongdun Technology Co ltd
Tongdun Holdings Co Ltd
Original Assignee
Tongdun Technology Co ltd
Tongdun Holdings Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongdun Technology Co ltd, Tongdun Holdings Co Ltd
Priority to CN202110043234.4A
Publication of CN112732194A
Application granted
Publication of CN112732194B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608Saving storage space on storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/0644Management of space entities, e.g. partitions, extents, pools
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0652Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0683Plurality of storage devices
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an irregular data storage method, an irregular data storage device and a storage medium. The method comprises: acquiring the data length of received source data, and determining, based on the data length, the storage unit type used to store the source data, wherein each storage unit of every storage unit type comprises an identification bit area, a dynamic information area, a basic data area and a tail storage area; and, if the data length is less than or equal to the wide data area length of the smallest storage unit type, or is greater than the sum of the basic data area length and the tail storage area length of the previous storage unit type of two adjacent storage unit types and less than or equal to the wide data area length of the next storage unit type, dividing the storage unit of the storage unit type corresponding to the source data into a first identification bit area and a merged first wide data area for storing the source data, wherein the merged first wide data area is obtained by merging the first dynamic information area and the first basic data area.

Description

Irregular data storage method, device and storage medium
Technical Field
The present invention relates to data storage technologies, and in particular, to an irregular data storage method, an irregular data storage device, and a storage medium.
Background
The memory has very high read-write performance, and in a service scene with high concurrency and high performance requirements, the memory is usually used as a hot spot data cache of an application service, so that the processing pressure of a downstream application or a database can be relieved, and the throughput and the response capability of the application service can be improved.
In local caching in the prior art, application services are usually deployed in a virtual machine or a container to avoid wasting resources; however, the available space of the local memory is limited, typically only a few gigabytes, so the amount of data that the local memory can cache is small.
Moreover, the currently adopted scheme of local cache needs to be implemented based on a data structure provided by a programming language or a third-party tool, and because different data may have different language modes, the memory usage amount is greatly enlarged during data storage, so that the effective storage efficiency of the local cache is reduced.
Under the conditions of less storage capacity and lower storage efficiency of the local memory, the subsequent data retrieval and processing performance of the local cache is reduced.
Disclosure of Invention
The embodiment of the invention provides an irregular data storage method, an irregular data storage device and a storage medium, and has the advantages of high storage capacity, high processing speed, high efficiency, high performance and the like.
In a first aspect of the embodiments of the present invention, an irregular data storage method is provided, including:
acquiring the data length of received source data, and determining, based on the data length, the storage unit type used to store the source data, wherein data lengths are preset in correspondence with the different storage unit types, and each storage unit of every storage unit type comprises an identification bit area, a dynamic information area, a basic data area and a tail storage area;
if the data length is less than or equal to the wide data area length of the smallest storage unit type, or is greater than the sum of the basic data area length and the tail storage area length of the previous storage unit type of two adjacent storage unit types and less than or equal to the wide data area length of the next storage unit type,
dividing the storage unit of the storage unit type corresponding to the source data into a first identification bit area and a first wide data area for storing the source data, wherein the first wide data area is obtained by merging a first dynamic information area and a first basic data area.
Optionally, in a possible implementation manner of the first aspect, if the data length is greater than the wide data area length of the previous storage unit type of two adjacent storage unit types and less than or equal to the sum of the basic data length and the tail storage structure length of the next storage unit type,
the storage unit of the storage unit type corresponding to the source data is divided into a second identification bit area, a second dynamic information area, a second basic data area and a tail storage area, which together store the source data.
Optionally, in a possible implementation manner of the first aspect, dividing the storage unit of the storage unit type corresponding to the source data into a first identification bit area and a first wide data area for storing the source data includes:
storing the information of the source data other than the identification information in the first wide data area, and storing the identification information of the source data in the first identification bit area, wherein the identification information is calculated by:
calculating the difference between the actual length of the first wide data area and the actual length of the source data; and
summing the difference and a preset value to obtain the identification information.
Dividing the storage unit into a second identification bit area, a second dynamic information area, a second basic data area and a tail storage area to store the source data respectively includes:
acquiring index information of the tail storage area, wherein the index information comprises a segment index and a unit index;
storing the segment index in the second identification bit area;
and storing the unit index in the second dynamic information area.
Optionally, in a possible implementation manner of the first aspect, the dividing the storage unit into a second identification bit area, a second dynamic information area, a second basic data area, and a tail storage area to store the source data respectively includes:
splitting the source data into front-segment data and rear-segment data, and storing the front-segment data in the second basic data area, wherein the length of the front-segment data is the same as that of the second basic data area;
and storing the rear-segment data in a tail storage area.
Optionally, in a possible implementation manner of the first aspect, storing the rear-segment data in the tail storage area includes:
preferentially storing the rear-segment data in an idle tail storage area, and acquiring the segment index and the unit index of that idle tail storage area;
and if no idle tail storage area currently exists, creating a new tail storage area, storing the rear-segment data in the newly created tail storage area, and acquiring the segment index and the unit index of the newly created tail storage area.
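The idle-first allocation of tail storage areas described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `TailPool`, the number of units per segment, and the in-memory list layout are all hypothetical, since the text only specifies that idle areas are reused first and that a segment index and a unit index are returned.

```python
from typing import List, Optional, Tuple


class TailPool:
    """Hypothetical pool of tail storage areas grouped into segments."""

    def __init__(self, units_per_segment: int = 4) -> None:
        # units_per_segment is an assumption; the source gives no segment size.
        self.units_per_segment = units_per_segment
        self.segments: List[List[Optional[bytes]]] = []
        # (segment index, unit index) of every idle tail storage area
        self.idle: List[Tuple[int, int]] = []

    def store(self, rear: bytes) -> Tuple[int, int]:
        """Store rear-segment data, preferring an idle area; return its index."""
        if not self.idle:
            # No idle tail storage area exists: create a new tail segment.
            segment_index = len(self.segments)
            self.segments.append([None] * self.units_per_segment)
            self.idle.extend(
                (segment_index, unit) for unit in range(self.units_per_segment)
            )
        segment_index, unit_index = self.idle.pop(0)
        self.segments[segment_index][unit_index] = rear
        return segment_index, unit_index

    def release(self, segment_index: int, unit_index: int) -> None:
        """Clear a tail storage area and record it as idle again."""
        self.segments[segment_index][unit_index] = None
        self.idle.append((segment_index, unit_index))
```

The returned pair corresponds to the segment index and the unit index that the method stores in the second identification bit area and the second dynamic information area.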
Optionally, in a possible implementation manner of the first aspect, the first identification bit area and the merged first wide data area form a head storage structure;
receiving source data, acquiring a group index corresponding to the source data, and comparing the data in each head storage structure under the group index with the source data in sequence;
if the comparison result is the same, the source data is not stored;
and if the comparison result is different, selecting a free head storage structure under the group index to store the source data.
Optionally, in a possible implementation manner of the first aspect, if there is no free head storage structure, one head storage structure and the tail index corresponding to that head storage structure are randomly selected;
if the randomly selected head storage structure has a tail index, the data in the corresponding tail storage unit is cleared, and the cleared tail is recorded as idle;
and the data in the randomly selected head storage structure is cleared, and the randomly selected head storage structure is used to store the source data.
Optionally, in a possible implementation manner of the first aspect, the second identification bit area, the second dynamic information area, and the second basic data area form a head storage structure, and the tail storage area forms a tail storage structure;
receiving source data, acquiring a group index corresponding to the source data, and comparing the data in each head storage structure under the group index with the front-segment data in sequence;
if the comparison result is the same, acquiring the segment index and the unit index of the tail storage structure from the second identification bit area and the second dynamic information area, and then comparing the data in the tail storage structure with the rear-segment data;
and if the comparison result is different, preferentially storing the rear-segment data in an idle tail storage structure, and acquiring the segment index and the unit index of that idle tail storage structure.
Optionally, in a possible implementation manner of the first aspect, if there is currently no idle tail storage structure, a new tail storage area is created, the remaining source data is stored in the tail storage structure corresponding to the newly created tail storage area, and the segment index and the unit index of that tail storage structure are acquired;
selecting a free head storage structure under the group index to store the front-segment data;
if no free head storage structure exists, randomly selecting one head storage structure and the tail index corresponding to that head storage structure;
if the randomly selected head storage structure has a tail index, clearing the data in the corresponding tail storage structure, and marking that tail storage structure as idle;
and clearing the data in the randomly selected head storage structure, and storing the front-segment data in it.
In a second aspect of the embodiments of the present invention, an irregular data storage device is provided, including:
an acquisition module, configured to acquire the data length of received source data and determine the storage unit type based on the data length, wherein data lengths are preset in correspondence with the different storage unit types, and the storage unit comprises an identification bit area, a dynamic information area, a basic data area and a tail storage area;
a judging module, configured to judge whether the data length is less than or equal to the wide data area length of the smallest storage unit, or is greater than the sum of the basic data area and the tail storage area of the previous storage unit of two adjacent storage units and less than or equal to the wide data area length of the next storage unit;
and a storage dividing module, configured to divide the storage unit into a first identification bit area and a merged first wide data area for storing the source data, wherein the merged first wide data area is obtained by merging the first dynamic information area and the first basic data area.
In a third aspect of the embodiments of the present invention, a readable storage medium is provided, in which a computer program is stored, which, when being executed by a processor, is adapted to carry out the method according to the first aspect of the present invention and various possible designs of the first aspect of the present invention.
According to the irregular data storage method, the irregular data storage device and the irregular data storage medium, different storage units are divided into limited different-length head storage structure types and fixed uniform-length tail storage structures in the data storage process. When the data is stored, the appropriate storage space is selected according to the length of the source data, so that the waste of the storage space can be reduced, the occupied storage space can be aligned, the data can be conveniently and quickly searched, and the data storage method has the advantages of high storage capacity, high processing speed, high efficiency, high performance and the like.
In addition, the invention greatly reduces the storage space occupied by data by optimizing the design of the storage structure and distributing the storage structure according to the data characteristics. And the storage rule designed by the invention is combined to realize the rapid positioning and searching of the data.
In the process of storing the data, the invention automatically traverses the storage units, searches for an idle storage unit to preferentially store the data, establishes a new tail storage structure and/or selects an idle tail storage structure if no corresponding idle unit exists, and deletes the data in the head storage structure corresponding to the tail storage structure. Through the mode, the data can be preferentially cached in the idle storage unit when the data is cached, if the idle storage unit does not exist, the storage unit which stores the data is randomly acquired to erase the stored data, and then the storage unit which erases the data stores the current source data, so that idle resources are preferentially utilized when the source data is cached and stored, and the practicability is improved.
Drawings
FIG. 1 is a flow chart of a first embodiment of an irregular data storage method;
FIG. 2 is a flow chart of a second embodiment of an irregular data storage method;
FIG. 3 is a flow chart of a third embodiment of an irregular data storage method;
FIG. 4 is a flow chart of a fourth embodiment of an irregular data storage method;
FIG. 5 is a flow chart of a first embodiment of an irregular data storage device.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein.
It should be understood that, in the various embodiments of the present invention, the sequence numbers of the processes do not mean the execution sequence, and the execution sequence of the processes should be determined by the functions and the internal logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
It should be understood that in the present application, "comprising" and "having" and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that, in the present invention, "a plurality" means two or more. "And/or" merely describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "Comprises A, B and C" and "comprises A, B, C" mean that all three of A, B and C are comprised; "comprises A, B or C" means that one of A, B and C is comprised; "comprises A, B and/or C" means that any one, any two, or all three of A, B and C are comprised.
It should be understood that in the present invention, "B corresponding to A", "A corresponds to B", or "B corresponds to A" means that B is associated with A, and B can be determined from A. Determining B from A does not mean that B is determined from A alone; B may also be determined from A and/or other information. The matching of A and B means that the similarity of A and B is greater than or equal to a preset threshold value.
As used herein, "if" may be interpreted as "at …" or "when …" or "in response to a determination" or "in response to a detection", depending on the context.
The technical solution of the present invention will be described in detail below with specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
Embodiment 1 of the present invention provides an irregular data storage method, as shown in fig. 1, which is a flowchart of a first implementation manner, and includes:
Step S110, acquiring the data length of the received source data, and determining, based on the data length, the storage unit type used to store the source data, wherein data lengths are preset in correspondence with the different storage unit types, and each storage unit of every storage unit type comprises an identification bit area, a dynamic information area, a basic data area and a tail storage area. The source data may be data as shown in table 1. In the invention, based on the read-write principle of source data in a storage medium, space on the storage medium is allocated to any data with the byte as the minimum unit. In non-Chinese character string data, each character occupies 1 byte. In a Chinese character string, each character occupies 2 bytes.
[Table 1 image: example source data, sequence numbers 1 to 4]
TABLE 1
The data of sequence number 1 in table 1 is composed of digits, letters and symbols, and the corresponding set of element types is [digit, letter, symbol]. The data of sequence number 2 in table 1 is composed entirely of digits, and the corresponding set of element types is [digit]. The data of sequence number 3 in table 1 is composed of digits and letters, and the corresponding set of element types is [digit, letter]. The data of sequence number 4 in table 1 is composed entirely of letters, and the corresponding set of element types is [letter].
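The byte-counting rule stated in step S110 (1 byte per non-Chinese character, 2 bytes per Chinese character) can be illustrated with a short sketch. The choice of the GBK encoding is an assumption made here because it yields exactly these widths for ASCII and Chinese characters; the source does not name an encoding, and the function name `data_length` is hypothetical.

```python
def data_length(text: str) -> int:
    """Return the length in bytes of the source data under the stated rule.

    Assumption: GBK encoding, in which ASCII characters take 1 byte and
    Chinese characters take 2 bytes. The source does not name an encoding.
    """
    return len(text.encode("gbk"))
```

For example, under this assumption a 13-character string of digits, letters and ASCII symbols has a data length of 13 bytes, while a string of two Chinese characters has a data length of 4 bytes.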
In step S110, a plurality of storage unit types are preset, and each storage unit type has its own identification bit area length, dynamic information area length, basic data area length and wide data area length, which may be the same as or different from those of the other types, as shown in table 2.
Serial number   Identification bit area length   Dynamic information area length   Basic data area length   Wide data area length
Type 1          1 byte                           2 bytes                           8 bytes                  10 bytes
Type 2          1 byte                           2 bytes                           16 bytes                 18 bytes
Type 3          1 byte                           2 bytes                           24 bytes                 26 bytes
Type 4          1 byte                           2 bytes                           32 bytes                 34 bytes
TABLE 2
Table 2 shows 4 storage unit types.
Through step S110, the source data can be processed after being received, and different storage unit types can be determined according to the length of the source data.
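As an illustration only, the preset storage unit types of table 2 could be represented as a small data structure. The class and field names below are hypothetical; the byte lengths are those of table 2, and the wide data area length is derived as the dynamic information area length plus the basic data area length, which matches every row of the table.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class UnitType:
    """Hypothetical description of one storage unit type from table 2."""

    name: str
    id_len: int    # identification bit area length, in bytes
    dyn_len: int   # dynamic information area length, in bytes
    base_len: int  # basic data area length, in bytes

    @property
    def wide_len(self) -> int:
        # The wide data area is the dynamic information area merged with
        # the basic data area, so its length is the sum of the two.
        return self.dyn_len + self.base_len


UNIT_TYPES: List[UnitType] = [
    UnitType("Type 1", 1, 2, 8),
    UnitType("Type 2", 1, 2, 16),
    UnitType("Type 3", 1, 2, 24),
    UnitType("Type 4", 1, 2, 32),
]
```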
Step S120, determining whether the data length is less than or equal to the wide data area length of the smallest storage unit type, or is greater than the sum of the basic data area and the tail storage area of the previous storage unit type of two adjacent storage unit types and less than or equal to the wide data area length of the next storage unit type; if so, step S130 is performed.
The smallest storage unit type is the one with the smallest byte length among the preset storage unit types. For example, table 2 presets 4 storage unit types, and the smallest storage unit type is the one with the smallest byte length among these 4. Storage unit type 1 is only 21 bytes long, which is smaller than the byte length of storage unit type 2, storage unit type 3 and storage unit type 4, so type 1 is the smallest storage unit type.
After the plurality of storage unit types are set, the byte length of each storage unit type is obtained, and all the storage unit types are arranged in ascending order of byte length. The two adjacent storage unit types in step S120 are two storage unit types that are adjacent after the plurality of storage unit types are arranged in ascending order.
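The ascending ordering and the condition of step S120 can be sketched as below. This is a hedged illustration: the function name `wide_unit_type` is hypothetical, the lengths are those of table 2, and the 5-byte tail storage area length is an assumption taken from the "8 + 5" arithmetic in the worked examples of this embodiment.

```python
from typing import List, Optional, Tuple

# Assumption: tail storage area length, taken from the "8 + 5" worked examples.
TAIL_LEN = 5

# (type number, basic data area length, wide data area length) from table 2,
# deliberately listed out of order to show the ascending sort.
UNIT_TYPES: List[Tuple[int, int, int]] = [
    (3, 24, 26),
    (1, 8, 10),
    (4, 32, 34),
    (2, 16, 18),
]


def wide_unit_type(data_len: int) -> Optional[int]:
    """Return the type number whose wide data area should hold the data,
    or None when the step S120 condition is not satisfied."""
    # Sorting by wide data area length stands in for sorting by byte length;
    # both give the same order for the types of table 2.
    ordered = sorted(UNIT_TYPES, key=lambda t: t[2])
    for i, (number, _base_len, wide_len) in enumerate(ordered):
        if i == 0:
            # Smallest storage unit type: only the upper bound applies.
            fits = data_len <= wide_len
        else:
            prev_base_len = ordered[i - 1][1]
            fits = prev_base_len + TAIL_LEN < data_len <= wide_len
        if fits:
            return number
    return None
```

With these values, data of length 14 or 18 maps to type 2, while data of length 13 does not satisfy the condition and is left to the split storage path.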
Step S130, dividing the storage unit of the storage unit type corresponding to the source data into a first identification bit area and a first wide data area for storing the source data, where the first wide data area is obtained by merging the first dynamic information area and the first basic data area. The first dynamic information area belongs to a subset of the dynamic information areas, and the first basic data area belongs to a subset of the basic data areas.
Step S130 further includes:
storing identification information of the source data in the first identification bit area, wherein the identification information is obtained by calculating the difference between the actual length of the first wide data area and the actual length of the source data, and summing the difference and a preset value.
In one embodiment, taking the source data corresponding to sequence number 2 in table 1 as an example, its length is greater than the sum of the basic data area and the tail storage area of storage unit type 1 (14 > 8 + 5), but less than or equal to the wide data area length of storage unit type 2 (14 < 18). That is, the length is greater than the sum of the basic data area and the tail storage area of the previous storage unit type and less than or equal to the wide data area length of the next storage unit type, so the data of sequence number 2 in table 1 satisfies the condition in step S120. The source data corresponding to sequence number 2 in table 1 is then stored as follows:
[Figure: storage layout of the source data of sequence number 2]
Explanation: a value greater than or equal to 100 in the identification bit indicates that the data is not split; in this case the dynamic information area and the basic data area are merged into a wide data area, and the complete data is stored directly. The length of the wide data area minus the actual length of the data is 4, so 4 + 100 = 104 is stored in the identification bit.
In one embodiment, taking the source data corresponding to sequence number 3 in table 1 as an example, its length is greater than the sum of the basic data area and the tail storage area of storage unit type 1 (18 > 8 + 5), but less than or equal to the wide data area length of storage unit type 2 (18 <= 18). That is, the length is greater than the sum of the basic data area and the tail storage area of the previous storage unit type of two adjacent storage unit types and less than or equal to the wide data area length of the next storage unit type, so the source data does not need to be split into a head and a tail. The data of sequence number 3 in table 1 therefore satisfies the condition in step S120, and it is stored as follows:
[Figure: storage layout of the source data of sequence number 3]
Explanation: a value greater than or equal to 100 in the identification bit indicates that the data is not split; in this case the dynamic information area and the basic data area are merged into a wide data area, and the complete data is stored directly. The length of the wide data area minus the actual length of the data is 0, so 0 + 100 = 100 is stored in the identification bit.
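The identification-bit arithmetic in the two examples above can be checked with a minimal sketch. The function names `pack_wide` and `unpack_wide` are hypothetical, the preset value 100 is the one used in the examples, and the zero padding of unused bytes in the wide data area is an assumption, since the source does not say how unused bytes are filled.

```python
PRESET = 100  # preset value used in the worked examples


def pack_wide(data: bytes, wide_len: int) -> bytes:
    """Build a unit: a 1-byte identification bit area plus the wide data area."""
    if len(data) > wide_len:
        raise ValueError("data does not fit in the wide data area")
    # Identification information = (wide data area length - data length) + preset.
    ident = (wide_len - len(data)) + PRESET
    # Assumption: unused bytes of the wide data area are zero-padded.
    return bytes([ident]) + data.ljust(wide_len, b"\x00")


def unpack_wide(unit: bytes) -> bytes:
    """Recover the stored data from a unit produced by pack_wide."""
    ident = unit[0]
    if ident < PRESET:
        raise ValueError("identification value below 100: data is not stored unsplit")
    wide_len = len(unit) - 1
    data_len = wide_len - (ident - PRESET)
    return unit[1:1 + data_len]
```

For a 14-byte value in an 18-byte wide data area the identification byte is 104, and for an 18-byte value it is 100, matching the two explanations; reading the unit back subtracts the preset value to recover the actual data length.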
In one possible embodiment, the first identification bit region and the first wide data region form a header storage structure;
after acquiring the data length of the received source data, as shown in fig. 2, the method further includes:
step S1101, receiving source data, acquiring a group index corresponding to the source data, and comparing data in the header storage structure under the group index with the source data in sequence. The source data is compared to the data within the header storage structure in the manner described above.
Step S1102, if the comparison results are the same, the step of saving the source data is not performed; if the result is the same, it is proved that the header storage structure has the same or corresponding data as the source data, and the source data is not repeatedly stored at this time.
Step S1103, if the comparison results are different, determining a free header storage structure under the group index to store the source data. A different result shows that no header storage structure already holds data that is the same as or corresponds to the source data, so the source data is stored.
The method further includes: step S1104, if there is no free header storage structure, randomly selecting one header storage structure. If no free header storage structure exists, every header storage structure under the group index is already occupied by data, so one header storage structure is selected at random to make room for the source data.
Step S1105, if the randomly selected head storage structure has a tail index, clearing the data in the corresponding tail storage unit, and recording the cleared tail storage unit as idle. That is, after a head storage structure is randomly selected, its tail index is used to locate the tail storage unit, whose data is then cleared.
And step S1106, clearing the data in the randomly selected header storage structure, and storing the source data by using the randomly selected header storage structure. The source data is stored in a header storage structure. Through the scheme, the source data can be stored continuously after the header storage structure stores full data.
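Steps S1101 to S1106 can be sketched as follows. This is an illustrative sketch under assumptions, not the patented implementation: `HeaderSlot`, `store_unsplit` and the dictionary that maps a (segment index, unit index) pair to tail data are hypothetical names, and a value of `None` is assumed to mark a free header or an idle tail unit.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class HeaderSlot:
    data: Optional[str] = None                   # None marks a free header storage structure
    tail_ref: Optional[Tuple[int, int]] = None   # (segment index, unit index), if any


def store_unsplit(group: List[HeaderSlot],
                  tails: Dict[Tuple[int, int], Optional[str]],
                  source: str,
                  rng=random) -> bool:
    """Return True if the source data was written, False if it was already present."""
    # S1101/S1102: compare with each header in the group; skip the write on a match.
    if any(slot.data == source for slot in group):
        return False
    # S1103: prefer a free header storage structure.
    slot = next((s for s in group if s.data is None), None)
    if slot is None:
        # S1104: no free header, so pick one at random to overwrite.
        slot = rng.choice(group)
        # S1105: release the tail unit of the selected header, if it has one.
        if slot.tail_ref is not None:
            tails[slot.tail_ref] = None
            slot.tail_ref = None
    # S1106: overwrite the header with the source data.
    slot.data = source
    return True
```

Random replacement keeps the bookkeeping small: no access counters or timestamps are needed, at the cost of occasionally discarding data that is still in use.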
Further, as shown in fig. 3, the method further includes:
step S140, if the data length is larger than the wide data area length of the previous storage unit type in the two adjacent storage unit types and is less than or equal to the sum of the basic data length and the tail storage structure length in the next storage unit type;
step S150, dividing the storage unit in the storage unit type corresponding to the source data into a second flag region, a second dynamic information region, a second basic data region, and a tail storage region, and storing the source data respectively. At this time, compared with step S130, another storage method is adopted to store the source data.
In an embodiment, take the data corresponding to sequence number 1 in Table 1 as an example. Its length is greater than the wide data area length of storage unit type 1 (13 > 10), and is less than or equal to the sum of the basic data length and the tail storage structure length of storage unit type 1 (13 <= 8 + 5). That is, the length is greater than the wide data area length of the previous storage unit type in the two adjacent storage unit types, and is less than or equal to the sum of the basic data length and the tail storage structure length of the next storage unit type. In this case, the source data needs to be split into a head and a tail.
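The choice between the unsplit layout (step S120) and the split layout (step S140) can be sketched as a walk over the storage unit types in increasing size. The lengths below (wide 10 and basic 8 for type 1, wide 18 and basic 16 for type 2, tail 5) are the ones quoted in the worked examples; the names `UnitType` and `select_layout` are hypothetical, and the sketch assumes only these two types exist.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class UnitType:
    name: str
    wide_len: int   # dynamic information area and basic data area, merged
    basic_len: int  # basic data area alone


TAIL_LEN = 5  # fixed tail storage area length used in the worked examples

TYPES = [
    UnitType("type 1", wide_len=10, basic_len=8),
    UnitType("type 2", wide_len=18, basic_len=16),
]


def select_layout(data_len: int,
                  types: List[UnitType] = TYPES,
                  tail_len: int = TAIL_LEN) -> Tuple[str, str]:
    """Return (storage unit type, "unsplit" or "split") for a given data length."""
    for t in types:  # types are assumed to be sorted by increasing size
        if data_len <= t.wide_len:
            return t.name, "unsplit"   # fits in the merged wide data area
        if data_len <= t.basic_len + tail_len:
            return t.name, "split"     # basic data area plus one tail unit
    raise ValueError("data is longer than the largest storage unit type allows")


# Lengths of sequence numbers 3, 1 and 4 in Table 1.
layouts = [select_layout(n) for n in (18, 13, 19)]
```

Because each type is tried first unsplit and then split before moving to the next type, a length always lands in the smallest layout that can hold it.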
In step S150, as shown in fig. 4, the method includes:
S1501, acquiring index information of the tail storage area, wherein the index information comprises a segment index and a unit index. In step S1501, the ASCII codes of the characters in the source data are first summed to obtain a summed value, and a modulo operation is then applied to the summed value to obtain the segment index and/or unit index. In other words, the data is converted into ASCII codes character by character, the codes are summed, and the sum is taken modulo to generate the segment index and/or unit index corresponding to the header storage structure. Obtaining a segment index and/or unit index by summing ASCII codes and taking the modulus is prior art and is not described further in this application. With this scheme, data can be located quickly, and cached and stored data can be distributed evenly.
S1502, storing the segment index in the second flag region;
s1503, storing the unit index in the second dynamic information area.
Further, the method also comprises the following steps: s1504, splitting the source data into front-segment data and rear-segment data, and storing the front-segment data in the second basic data area, wherein the length of the front-segment data is the same as that of the second basic data area;
S1505, storing the rear-segment data in the tail storage area.
In one embodiment, since the source data satisfies the condition of step S140, the source data needs to be split into front-segment data and rear-segment data. Assume that the tail storage area to be used is the 1002nd unit of the 80th segment of the tail storage structure. The data of sequence number 1 in Table 1 satisfies this condition, and is stored as follows:
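The ASCII-sum-and-modulo computation mentioned in step S1501 can be sketched as follows. The text does not state which modulus is used, so the `modulus` parameter (for example, the number of segments or the number of units per segment) is an assumption, as is the function name.

```python
def ascii_sum_index(data: str, modulus: int) -> int:
    """Sum the ASCII codes of the characters in `data` and reduce the sum modulo `modulus`."""
    if modulus <= 0:
        raise ValueError("modulus must be positive")
    # encode("ascii") raises UnicodeEncodeError for non-ASCII input, which keeps
    # the sketch faithful to the "ASCII code of each character" wording.
    return sum(data.encode("ascii")) % modulus


# 'a' + 'b' + 'c' = 97 + 98 + 99 = 294, and 294 % 10 = 4
example_index = ascii_sum_index("abc", 10)
```

The result always lies in the range 0 to `modulus - 1`, which matches the zero-based index values used in the worked examples.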
Figure BDA0002896110370000101
Figure BDA0002896110370000111
Explanation: the identification bit stores the segment index of the tail, and the dynamic information area stores the unit index of the tail (index values start from 0, so the index value of the 80th segment is 79 and the index value of the 1002nd unit is 1001). The basic data area stores the first 8 characters of the data, and the tail storage structure unit stores the last 5 characters of the data.
In an embodiment, take the data corresponding to sequence number 4 in Table 1 as an example. Its length is greater than the wide data area length of header storage structure type 2 (19 > 18), but is less than or equal to the sum of the basic data length of header storage structure type 2 and the tail storage structure length (19 <= 16 + 5). That is, the length is greater than the wide data area length of the previous storage unit type in two adjacent storage unit types, and is less than or equal to the sum of the basic data length and the tail storage structure length of the next storage unit type, so the source data needs to be split into a head and a tail. Assume that the tail is to be stored in the 1111th unit of the 91st segment of the tail storage structure. The data is stored as follows:
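The split layout described for sequence number 1 can be sketched as follows. The function names and the dictionary representation of a storage unit are hypothetical, and the 13-character string is a placeholder, because the actual content of Table 1 is not reproduced here.

```python
from typing import Dict, Tuple, Union


def split_for_storage(source: str, basic_len: int, tail_len: int) -> Tuple[str, str]:
    """Split source data into front-segment data (fills the basic data area) and rear-segment data."""
    if not basic_len < len(source) <= basic_len + tail_len:
        raise ValueError("length does not match the split layout of this storage unit type")
    return source[:basic_len], source[basic_len:]


def build_split_record(source: str, basic_len: int, tail_len: int,
                       segment_no: int, unit_no: int) -> Dict[str, Union[int, str]]:
    """Build the four areas; segment_no and unit_no are 1-based positions of the tail unit."""
    front, rear = split_for_storage(source, basic_len, tail_len)
    return {
        "identification_bit": segment_no - 1,  # zero-based segment index of the tail
        "dynamic_info": unit_no - 1,           # zero-based unit index of the tail
        "basic_data": front,
        "tail": rear,
    }


# Placeholder 13-character value stored in type 1 (basic 8, tail 5),
# with the tail in the 1002nd unit of the 80th segment.
record = build_split_record("abcdefghijklm", basic_len=8, tail_len=5,
                            segment_no=80, unit_no=1002)
```

With these inputs the record holds segment index 79, unit index 1001, the first 8 characters in the basic data area and the last 5 characters in the tail, matching the explanation above.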
Figure BDA0002896110370000112
Explanation: the identification bit stores the segment index of the tail, the dynamic information area stores the unit index of the tail, the basic data area stores the first 16 characters of the data, and the tail storage structure unit stores the last 3 characters of the data.
With this method, a limited number of header storage structure types of different lengths and a tail storage structure of fixed, uniform length are defined. When data is stored, a suitable storage space is selected according to the data length, which reduces wasted storage space, keeps the occupied storage space aligned, and allows data to be looked up quickly.
In a possible implementation manner, the second identification bit area, the second dynamic information area and the second basic data area form a head storage structure, and the tail storage area forms a tail storage structure;
and the source data is divided into front data and rear data according to the length of the second basic data area.
After acquiring the data length of the received source data, the method further comprises:
step S1201, receiving source data, obtaining a group index corresponding to the source data, and comparing data in a header storage structure under the group index with front-segment data in sequence. The purpose of this step is to find the data in the header storage structure that is the same as or corresponds to the previous segment of data.
Step S1202, if the comparison results are the same, acquiring the segment index and the unit index of the tail storage structure from the second identification bit area and the second dynamic information area, and then comparing the data in the tail storage structure with the rear-segment data. If the front-segment data is the same, it does not need to be stored in a head storage structure again; what remains is to judge whether the tail storage structure holds data that is the same as or corresponds to the rear-segment data.
Step S1203, if the comparison results are different, first storing the rear-segment data in an idle tail storage structure, and acquiring the segment index and the unit index of that idle tail storage structure. A different result shows that the tail storage structure does not hold data that is the same as or corresponds to the rear-segment data, so the rear-segment data is stored in an idle tail storage structure.
In a possible implementation manner, in step S1204, if there is currently no idle tail storage structure, a new tail segment storage area is created, the rear-segment data is stored in a tail storage structure of the newly created tail segment storage area, and the segment index and the unit index of that tail storage structure are acquired.
Step S1205, determining a free header storage structure under the group index to store the previous segment data.
Step S1206, if there is no free header storage structure, randomly selecting a header storage structure and a tail index corresponding to the header storage structure.
Step S1207, if the randomly selected head storage unit has a tail index, clearing data in a tail storage structure, and marking the tail storage structure as an idle state.
And S1208, clearing the data in the randomly selected header storage structure, and storing the front-segment data.
Through the above steps, the data in the head storage structure is compared with the front-segment data, and the data in the tail storage structure is compared with the rear-segment data. If the head storage structure matches the front-segment data exactly and the tail storage structure matches the rear-segment data exactly, the source data is identical to the data already held in the head and tail storage structures, and does not need to be stored again.
If the head storage structure differs from the front-segment data, or the tail storage structure differs from the rear-segment data, the source data needs to be stored.
According to the scheme, the source data can be judged, and the strategy for storing or not storing the source data is generated according to the judgment result, so that the storage space is saved.
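Steps S1201 to S1208 can be sketched as follows. This is an illustrative sketch under assumptions, not the patented implementation: `Header`, `store_split`, the dictionary keyed by (segment index, unit index) and the `units_per_segment` parameter are hypothetical, `None` is assumed to mark a free header or an idle tail unit, and only split-layout headers are modelled.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

TailRef = Tuple[int, int]  # (segment index, unit index)


@dataclass
class Header:
    front: Optional[str] = None         # None marks a free head storage structure
    tail_ref: Optional[TailRef] = None  # where the rear-segment data lives


def store_split(group: List[Header], tails: Dict[TailRef, Optional[str]],
                front: str, rear: str, units_per_segment: int = 4, rng=random) -> bool:
    """Return True if the data was written, False if an identical copy already exists."""
    # S1201/S1202: a duplicate requires a matching front segment AND a matching tail.
    for h in group:
        if h.front == front and h.tail_ref is not None and tails.get(h.tail_ref) == rear:
            return False
    # S1203: store the rear-segment data in an idle tail unit if one exists.
    ref = next((r for r, v in tails.items() if v is None), None)
    if ref is None:
        # S1204: no idle tail unit, so create a new tail segment and use its first unit.
        seg = max((s for s, _ in tails), default=-1) + 1
        for u in range(units_per_segment):
            tails[(seg, u)] = None
        ref = (seg, 0)
    tails[ref] = rear
    # S1205: prefer a free head storage structure for the front-segment data.
    slot = next((h for h in group if h.front is None), None)
    if slot is None:
        # S1206: none free, so select one at random.
        slot = rng.choice(group)
        # S1207: mark the tail unit of the selected header as idle.
        if slot.tail_ref is not None:
            tails[slot.tail_ref] = None
    # S1208: overwrite the header with the front-segment data and the new tail reference.
    slot.front, slot.tail_ref = front, ref
    return True
```

Writing the tail before the header follows the order of the steps: the segment index and unit index must be known before they can be recorded in the identification bit area and the dynamic information area.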
An embodiment of the present invention further provides an irregular data storage device, as shown in fig. 5, including:
an acquisition module, configured to acquire the data length of received source data and determine the type of a storage unit based on the data length, wherein data lengths are preset in correspondence with different types of storage units, and the storage unit comprises an identification bit area, a dynamic information area, a basic data area and a tail storage area;
a judging module, configured to judge whether the data length is less than or equal to the wide data area length of the minimum storage unit, or is greater than the sum of the basic data area and the tail storage area of the previous storage unit in two adjacent storage units and less than or equal to the wide data area length of the next storage unit;
and a storage dividing module, configured to divide the storage unit into a first identification bit area and a merged first wide data area for storing the source data, wherein the merged first wide data area is an area obtained by merging the first dynamic information area and the first basic data area.
The readable storage medium may be a computer storage medium or a communication medium. Communication media includes any medium that facilitates transfer of a computer program from one place to another. Computer storage media may be any available media that can be accessed by a general purpose or special purpose computer. For example, a readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. Of course, the readable storage medium may also be an integral part of the processor. The processor and the readable storage medium may reside in an Application Specific Integrated Circuits (ASIC). Additionally, the ASIC may reside in user equipment. Of course, the processor and the readable storage medium may also reside as discrete components in a communication device. The readable storage medium may be read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tapes, floppy disks, optical data storage devices, and the like.
The present invention also provides a program product comprising execution instructions stored in a readable storage medium. The at least one processor of the device may read the execution instructions from the readable storage medium, and the execution of the execution instructions by the at least one processor causes the device to implement the methods provided by the various embodiments described above.
In the above embodiments of the terminal or the server, it should be understood that the Processor may be a Central Processing Unit (CPU), other general-purpose processors, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present invention may be embodied directly in a hardware processor, or in a combination of the hardware and software modules within the processor.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and these modifications or substitutions do not depart from the spirit of the corresponding technical solutions of the embodiments of the present invention.

Claims (11)

1. An irregular data storage method, comprising:
acquiring the data length of received source data, and determining the type of a storage unit for storing the source data based on the data length, wherein the data length is preset in correspondence with different types of storage units, and the storage unit in each type of storage unit respectively comprises an identification bit area, a dynamic information area, a basic data area and a tail storage area;
if the data length is less than or equal to the length of the wide data area of the minimum storage unit type, or is greater than any one of the sum of the basic data area and the tail storage area of the previous storage unit type in the two adjacent storage unit types and less than or equal to the length of the wide data area of the next storage unit type;
and dividing the storage unit in the storage unit type corresponding to the source data into a first identification bit area and a first wide data area for storing the source data, wherein the first wide data area is an area obtained by combining a first dynamic information area and a first basic data area.
2. The irregular data storage method of claim 1,
if the data length is larger than the length of the wide data area of the previous storage unit type in the two adjacent storage unit types and is less than or equal to the sum of the length of the basic data and the length of the tail storage structure in the next storage unit type;
and dividing the storage unit in the storage unit type corresponding to the source data into a second identification bit area, a second dynamic information area, a second basic data area and a tail storage area to store the source data respectively.
3. The irregular data storage method of claim 1,
dividing the storage unit in the storage unit type corresponding to the source data into a first identification bit area and a merged first wide data area for storing the source data, wherein the storage unit comprises:
storing information other than identification information in the source data in a first wide data area, storing identification information of the source data in a first identification bit area, wherein the identification information is calculated by,
calculating the difference value between the actual length of the first wide data area and the actual length of the source data;
and summing the difference value and a preset value to obtain identification information.
4. The irregular data storage method of claim 2,
the step of dividing the storage unit into a second identification bit area, a second dynamic information area, a second basic data area and a tail storage area for respectively storing the source data comprises the following steps:
splitting the source data into front-segment data and rear-segment data, and storing the front-segment data in the second basic data area, wherein the length of the front-segment data is the same as that of the second basic data area;
and storing the latter data in a tail storage area.
5. The irregular data storage method of claim 4,
storing the back segment of data in the tail storage area includes:
storing the back segment data to an idle tail storage area at first, and acquiring a segment index and a unit index of the idle tail storage area;
and if no idle tail storage area exists at present, newly building a tail section storage area, and storing the residual source data in the newly built tail storage area to obtain the section index and the unit index of the newly built tail section storage area.
6. The irregular data storage method of claim 1,
the first identification bit area and the merged first wide data area form a header storage structure;
receiving source data, acquiring a group index corresponding to the source data, and comparing data in a header storage structure under the group index with the source data in sequence;
if the comparison result is the same, the source data is not stored;
and if the comparison result is different, determining that the free header storage structure under the group index stores the source data.
7. The irregular data storage method of claim 6,
if no free head storage structure exists, randomly selecting a head storage structure and a tail index corresponding to the head storage structure;
if the randomly selected head storage structure has a tail index, clearing data in a tail storage unit, and recording that the cleared tail is in an idle state;
and clearing the data in the randomly selected header storage structure, and storing the randomly selected header storage structure for the source data.
8. The irregular data storage method of claim 4,
the second identification bit area, the second dynamic information area and the second basic data area form a head storage structure, and the tail storage area forms a tail storage structure;
receiving source data, acquiring a group index corresponding to the source data, and comparing data in a header storage structure under the group index with front-segment data in sequence;
if the comparison result is the same, acquiring the segment index and the unit index of the tail storage structure from the second identification bit area and the second dynamic information area, and comparing the data in the tail storage structure with the later segment data again;
and if the comparison result is different, storing the rear segment data to an idle tail storage structure firstly, and acquiring the segment index and the unit index of the idle tail storage structure.
9. The irregular data storage method of claim 8,
if no idle tail storage structure exists at present, a tail section storage area is newly built, and a section index and a unit index of the tail section storage area are obtained after the residual source data are stored in a tail storage structure corresponding to the newly built tail storage area;
determining a free header storage structure under the group index to store the front-segment data;
if no free head storage structure exists, a head storage structure and a tail index corresponding to the head storage structure are randomly selected;
if the randomly selected head storage unit has a tail index, clearing data in a tail storage structure, and marking the tail storage structure as an idle state;
and clearing the data in the randomly selected header storage structure, and storing the front-segment data.
10. An irregular data storage device, comprising:
an acquisition module, configured to acquire the data length of received source data and determine the type of a storage unit based on the data length, wherein data lengths are preset in correspondence with different types of storage units, and the storage unit comprises an identification bit area, a dynamic information area, a basic data area and a tail storage area;
a judging module, configured to judge whether the data length is less than or equal to the wide data area length of the minimum storage unit, or is greater than the sum of the basic data area and the tail storage area of the previous storage unit in two adjacent storage units and less than or equal to the wide data area length of the next storage unit;
and a storage dividing module, configured to divide the storage unit into a first identification bit area and a merged first wide data area for storing the source data, wherein the merged first wide data area is an area obtained by merging the first dynamic information area and the first basic data area.
11. A readable storage medium, in which a computer program is stored which, when being executed by a processor, is adapted to carry out the method of any one of claims 1 to 9.
CN202110043234.4A 2021-01-13 2021-01-13 Irregular data storage method, device and storage medium Active CN112732194B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110043234.4A CN112732194B (en) 2021-01-13 2021-01-13 Irregular data storage method, device and storage medium

Publications (2)

Publication Number Publication Date
CN112732194A CN112732194A (en) 2021-04-30
CN112732194B true CN112732194B (en) 2022-08-19

Family

ID=75593110

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110043234.4A Active CN112732194B (en) 2021-01-13 2021-01-13 Irregular data storage method, device and storage medium

Country Status (1)

Country Link
CN (1) CN112732194B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105426123A (en) * 2015-11-06 2016-03-23 安徽容知日新信息技术有限公司 Data management method, acquisition station and equipment monitoring system
CN110287044A (en) * 2019-07-02 2019-09-27 广州虎牙科技有限公司 Without lock shared drive processing method, device, electronic equipment and readable storage medium storing program for executing
CN111897487A (en) * 2020-06-15 2020-11-06 北京瀚诺半导体科技有限公司 Method, apparatus, electronic device and medium for managing data

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2549775B (en) * 2016-04-28 2018-05-16 Imagination Tech Ltd Directed placement of data in memory

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
一种面向非规则引用的Cell多核处理器自适应Cache行策略;曹倩等;《计算机学报》;20110515(第05期);全文 *

Also Published As

Publication number Publication date
CN112732194A (en) 2021-04-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant