CN106383670A - Data processing method and storage device - Google Patents

Publication number
CN106383670A
Authority
CN
China
Prior art keywords
data block
mapping relations
eigenvalue
storage device
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610839436.9A
Other languages
Chinese (zh)
Other versions
CN106383670B (en)
Inventor
袁冉胤
游俊
李伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201610839436.9A
Publication of CN106383670A
Application granted
Publication of CN106383670B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G06F3/0641 De-duplication techniques
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G06F3/061 Improving I/O performance

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention provides a data processing scheme. A storage device stores a first mapping relationship that maps a first characteristic value to specific-format data. The storage device computes the characteristic value of a first data block and obtains the first characteristic value. When the storage device queries the first mapping relationship with that value and finds that the first mapping relationship contains it, the first data block is specific-format data, and the storage device performs no deduplication operation on the first data block.

Description

Data processing method and storage device
Technical field
The present invention relates to the technical field of data storage, and in particular to a data processing method and a storage device.
Background
Data deduplication (de-duplication) is a widely used technique in the field of data storage. It deletes repeated data and retains only one unique copy, thereby eliminating redundant data. The technique can greatly reduce the demand for physical storage space and so helps meet growing data storage requirements.
In the prior art, if the characteristic value of a data block awaiting deduplication already exists in the storage device, the storage device has already stored that data block, together with a mapping between the characteristic value and the storage address of the data block. In this case the storage device updates the reference count of that storage address. To read the data block later, the storage device must query the mapping with the characteristic value of the data block to find the storage address, and then read the data from that storage address. Multiple access operations on the storage address are therefore unavoidable.
Summary of the invention
In a first aspect, an embodiment of the present invention provides a data processing scheme. A storage device stores a first mapping relationship, which includes a mapping between a first characteristic value and the specific-format data corresponding to the first characteristic value. The storage device computes the characteristic value of a first data block, and that value is the first characteristic value. The storage device queries the first mapping relationship with the first characteristic value of the first data block and determines that the first mapping relationship contains the first characteristic value; the first data block is therefore specific-format data, and the storage device performs no deduplication operation on it. The specific-format data may be all-0 or all-1 data of a specific length, a combination of 0 and 1 data, or data with a high repetition count (multiplicity), where the repetition count can be judged from a reference count. The characteristic value may be a fingerprint of the data block obtained with a hash algorithm. When the first data block is specific-format data, the storage device does not need to perform any further deduplication operation. That is, it does not need to do the following: update the reference count of the storage address corresponding to the first characteristic value when it finds that the first data block is already stored; or, when the first data block is not yet stored, allocate a storage address for the first data block, store the block at that address, and establish a mapping between the first characteristic value and the address. This reduces access operations on storage addresses. More importantly, when the first data block is accessed, it can be obtained directly by querying the first mapping relationship with its characteristic value. The storage device no longer needs to determine the storage address of the first data block from the first characteristic value and then access that address, which further reduces access operations on storage addresses.
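The first-aspect lookup can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes SHA-256 as the hash algorithm, a 4096-byte block length, and a Python dictionary as the first mapping relationship. The names `fingerprint`, `BLOCK_SIZE`, `first_mapping`, `is_specific_format` and `read_specific` are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block length; the source does not fix one


def fingerprint(block: bytes) -> str:
    """Characteristic value of a block (SHA-256 is an assumption)."""
    return hashlib.sha256(block).hexdigest()


# First mapping relationship: characteristic value -> the specific-format
# data itself (here only the all-0 and all-1 blocks, as an example).
first_mapping = {
    fingerprint(pattern): pattern
    for pattern in (b"\x00" * BLOCK_SIZE, b"\xff" * BLOCK_SIZE)
}


def is_specific_format(block: bytes) -> bool:
    """True if the block's characteristic value is in the first mapping,
    in which case no deduplication operation is performed on it."""
    return fingerprint(block) in first_mapping


def read_specific(characteristic_value: str) -> bytes:
    """Return the block straight from the first mapping, with no
    storage-address lookup in between."""
    return first_mapping[characteristic_value]
```

Because the mapping holds the data rather than an address, a read of a specific-format block needs one lookup instead of an index lookup followed by an access to a storage address.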
Optionally, the storage device stores a second mapping relationship, which contains a mapping between a second characteristic value and a first storage address; the first storage address stores the data corresponding to the second characteristic value. The storage device computes the characteristic value of a second data block, and that value is the second characteristic value. The storage device queries the first mapping relationship with the second characteristic value and determines that the first mapping relationship does not contain it. The storage device then queries the second mapping relationship with the second characteristic value and determines that the second mapping relationship contains it. The storage device updates the reference count of the first storage address. On the one hand, limiting the specific-format data to a certain range of data keeps the size of the first mapping relationship within a certain range and prevents the storage device from occupying too much cache when it loads the first mapping relationship. On the other hand, data that is not specific-format data can still be processed by the existing deduplication flow, which saves storage space on the storage device.
Optionally, the storage device stores a second mapping relationship, which contains a mapping between a second characteristic value and a first storage address; the first storage address stores the data corresponding to the second characteristic value. The storage device computes the characteristic value of a third data block, and that value is the third characteristic value. The storage device queries the first mapping relationship with the third characteristic value and determines that the first mapping relationship does not contain it. The storage device then queries the second mapping relationship with the third characteristic value and determines that the second mapping relationship does not contain it either. The storage device stores the third data block at a second storage address and establishes a mapping between the third characteristic value and the second storage address in the second mapping relationship; that is, the second mapping relationship now includes that mapping. On the one hand, the size of the first mapping relationship can be kept within a certain range, which prevents the storage device from occupying too much cache when it loads the first mapping relationship. On the other hand, data that is not specific-format data can still be processed by the existing deduplication flow, which saves storage space on the storage device. Further, the storage device updates the reference count of the second storage address.
Optionally, the storage device divides a data segment to obtain the first data block, and establishes a mapping between the data segment and the first characteristic value of the first data block. When the data segment is accessed, the storage device determines the first characteristic value from that mapping and queries the first mapping relationship with it to obtain the first data block. It no longer needs to access a storage address in the storage device, which reduces access operations on storage addresses.
Optionally, the second mapping relationship contains a mapping between the first characteristic value and a third storage address, and the third storage address stores data whose characteristic value is the first characteristic value. When the reference count of the third storage address is greater than a threshold R, the storage device establishes, in the first mapping relationship, a mapping between the first characteristic value and the data whose characteristic value is the first characteristic value; that data thus becomes the specific-format data corresponding to the first characteristic value. R is an integer greater than 0. Further, the storage device deletes, from the second mapping relationship, the mapping between the first characteristic value and the third storage address, and deletes the data at the third storage address; alternatively, it uses an invalid flag to mark that mapping and that data as invalid. This further reduces access operations on storage addresses.
In a second aspect, an embodiment of the present invention correspondingly provides a storage device for carrying out the various implementations of the first aspect. The storage device includes structural units that carry out the various implementations of the first aspect; alternatively, the storage device includes an interface and a processor that carry out the various implementations of the first aspect.
Correspondingly, the present invention also provides a non-volatile computer-readable storage medium and a computer program product. When the memory of the storage device provided by the embodiments of the present invention loads the computer instructions contained in the non-volatile computer-readable storage medium or the computer program product, and the central processing unit (CPU) of the storage device executes those instructions, the storage device carries out the various possible implementations of the first aspect.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings used in the description of the embodiments are briefly introduced below.
Fig. 1 is a schematic structural diagram of a storage device according to an embodiment of the present invention;
Fig. 2 is a flowchart of a data processing method according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of dividing a data segment into blocks according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the mapping between the logical address of a data segment and the characteristic values of data blocks according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a storage device according to an embodiment of the present invention.
Detailed description of embodiments
The technical solutions in the embodiments of the present invention are described clearly below with reference to the accompanying drawings.
As shown in Fig. 1, the storage device includes a central processing unit (CPU) 101, a memory 102 and an interface 103. Computer instructions are stored in the memory 102, and the CPU 101 executes them to manage the storage system and perform deduplication operations. In addition, to save computing resources of the CPU 101, a field programmable gate array (FPGA) or other hardware may perform all of the CPU operations described in the embodiments, or the FPGA or other hardware and the CPU may each perform part of those operations, so as to implement the technical solutions described in the embodiments. For ease of description, the embodiments refer uniformly to a processor that implements the technical solutions. The interface 103 communicates with the processor, and may specifically be a host bus adapter (HBA) card, a Peripheral Component Interconnect Express (PCIe) interface card, or the like.
In the embodiment shown in Fig. 2, the storage device stores a first mapping relationship and a second mapping relationship. The first mapping relationship includes a mapping between a first characteristic value and the specific-format data corresponding to it; the second mapping relationship includes a mapping between a second characteristic value and a first storage address. The specific-format data may be all-0 or all-1 data of a specific length, a combination of 0 and 1 data, or data whose repetition count exceeds a threshold; the first characteristic value is the characteristic value of the specific-format data. The first storage address stores a unique data block in the storage device, and the second characteristic value is the characteristic value of that unique data block. A unique data block is one that differs from all other data blocks in the storage device. A repetition count exceeding a threshold may be implemented as the reference count of the storage address of the data block exceeding a threshold. The embodiment shown in Fig. 2 discloses the following data processing scheme:
Step 201: The storage device obtains a data segment.
Specifically, the storage device obtains the data segment and the logical address of the data segment through the interface 103 shown in Fig. 1.
Step 202: The storage device divides the data segment to obtain data blocks.
The storage device may divide the data segment with a variable-length or fixed-length chunking algorithm to obtain one or more data blocks. As shown in Fig. 3, the embodiment takes as an example a data segment that is divided into data block A, data block B and data block C.
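Step 202 with a fixed-length chunking algorithm can be sketched as follows. The function name `split_fixed` and the choice to leave a shorter final block unpadded are assumptions made for illustration; the source does not specify either, and variable-length chunking is not shown.

```python
from typing import List


def split_fixed(segment: bytes, block_size: int) -> List[bytes]:
    """Divide a data segment into consecutive blocks of block_size bytes.

    The final block may be shorter when the segment length is not a
    multiple of block_size (padding policy is left open here).
    """
    if block_size <= 0:
        raise ValueError("block_size must be a positive integer")
    return [
        segment[offset:offset + block_size]
        for offset in range(0, len(segment), block_size)
    ]
```

Concatenating the returned blocks in order reproduces the original segment, which is what the read path in step 204 relies on.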
Step 203: The storage device calculates the characteristic values of the data blocks.
The storage device calculates the characteristic value of a data block with a hash algorithm. In the embodiment, the storage device calculates the characteristic values of one or more data blocks. The characteristic values of data block A, data block B and data block C shown in Fig. 3 are the first characteristic value, the second characteristic value and the third characteristic value, respectively.
Step 204: The storage device establishes a mapping between the data segment and the characteristic values of the data blocks.
After the storage device has stored the data segment, it must be able to return the data segment when it receives a read request for it. It therefore establishes a mapping between the data segment and the characteristic values of data block A, data block B and data block C, and reads the corresponding data according to those characteristic values. For example, it establishes the mapping shown in Fig. 4 between the logical address of the data segment and the characteristic values of data block A, data block B and data block C. As another example, it calculates a characteristic value of the data segment, and establishes a mapping between the logical address of the data segment and the characteristic value of the data segment, together with a mapping between the characteristic value of the data segment and the characteristic values of data block A, data block B and data block C. The logical address may be a logical block address (LBA).
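The first example in step 204, a mapping from the logical address of a data segment to the ordered characteristic values of its blocks, can be sketched as follows. This is an illustration under assumptions: SHA-256 stands in for the hash algorithm, and a single dictionary `block_by_value` stands in for whichever of the first or second mapping relationships would resolve a characteristic value to data. The names `segment_map`, `record_segment` and `read_segment` are hypothetical.

```python
import hashlib
from typing import Dict, List

# Logical address of a segment (for example an LBA) -> ordered list of the
# characteristic values of its blocks.
segment_map: Dict[int, List[str]] = {}

# Stand-in for the lookups that resolve a characteristic value to data.
block_by_value: Dict[str, bytes] = {}


def record_segment(logical_address: int, blocks: List[bytes]) -> None:
    """Record which characteristic values make up the segment, in order."""
    values = []
    for block in blocks:
        value = hashlib.sha256(block).hexdigest()
        block_by_value[value] = block
        values.append(value)
    segment_map[logical_address] = values


def read_segment(logical_address: int) -> bytes:
    """Serve a read request by resolving each characteristic value in order."""
    return b"".join(block_by_value[v] for v in segment_map[logical_address])
```

For instance, after `record_segment(100, [b"A", b"B", b"C"])`, a call to `read_segment(100)` reassembles the three blocks in their original order.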
Step 205: The storage device queries the first mapping relationship with the characteristic value of a data block and determines whether the first mapping relationship contains it. If it does, the data block is specific-format data, no deduplication operation is performed, and the flow ends; otherwise step 206 is performed.
The storage device queries the first mapping relationship with the characteristic values of data block A, data block B and data block C and determines whether it contains the first, second and third characteristic values. The characteristic value of data block A is the first characteristic value, which the first mapping relationship contains; data block A is therefore specific-format data, and no deduplication operation is performed on it. The first mapping relationship does not contain the characteristic values of data blocks B and C, so step 206 is performed for data blocks B and C. When a data block is specific-format data, the storage device does not need to perform any further deduplication operation. That is, it does not need to do the following: update the reference count of the storage address corresponding to the first characteristic value when it finds that data block A is already stored; or, when data block A is not yet stored, allocate a storage address for data block A, store data block A at that address, and establish a mapping between the first characteristic value and the address in the second mapping relationship. This reduces access operations on storage addresses. More importantly, when data block A is accessed, it can be obtained directly by querying the first mapping relationship with its characteristic value. The storage device no longer needs to determine the storage address of data block A from the first characteristic value and then access that address, which further reduces access operations on storage addresses.
Step 206: The storage device queries the second mapping relationship with the characteristic value of the data block and determines whether the second mapping relationship contains it. If it does, step 207 is performed; otherwise step 208 is performed.
The storage device queries the second mapping relationship with the second characteristic value of data block B and determines that the second mapping relationship contains it, so step 207 is performed. The storage device queries the second mapping relationship with the third characteristic value of data block C and determines that the second mapping relationship does not contain it, so step 208 is performed.
Step 207: The storage device updates the reference count of the first storage address.
The storage device has determined that the second mapping relationship already contains the second characteristic value of data block B; that is, the data block stored at the first storage address, which corresponds to the second characteristic value in the second mapping relationship, is identical to data block B. Data block B therefore does not need to be stored again. The second characteristic value of data block B also corresponds to the first storage address, so the reference count of the first storage address is incremented by 1. In the embodiment, the reference count is the repetition count of the data block stored at a storage address contained in the second mapping relationship. When a data block is first stored at a storage address in the second mapping relationship, the reference count of that address is 1; each time another data block with the same characteristic value arrives, the reference count is incremented by 1.
Step 208: The storage device stores the data block at a second storage address and establishes, in the second mapping relationship, a mapping between the characteristic value of the data block and the second storage address.
The storage device has determined, by querying the second mapping relationship with the third characteristic value of data block C, that the second mapping relationship does not contain it; that is, data block C is being written for the first time. The storage device therefore allocates the second storage address for data block C, stores data block C at the second storage address, and establishes a mapping between the third characteristic value and the second storage address in the second mapping relationship.
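Steps 205 to 208 can be sketched together as a single write path. This is a toy in-memory model under assumptions, not the device's implementation: SHA-256 stands in for the hash algorithm, integers stand in for storage addresses, and the class name `DedupStore` with its attributes and return strings is hypothetical.

```python
import hashlib
from typing import Dict, Iterable


class DedupStore:
    """Toy model of the two-level lookup (all names are illustrative)."""

    def __init__(self, specific_blocks: Iterable[bytes]):
        # First mapping: characteristic value -> specific-format data.
        self.first: Dict[str, bytes] = {self._fp(b): b for b in specific_blocks}
        # Second mapping: characteristic value -> storage address.
        self.second: Dict[str, int] = {}
        self.data: Dict[int, bytes] = {}   # storage address -> data block
        self.refs: Dict[int, int] = {}     # storage address -> reference count
        self._next_addr = 0

    @staticmethod
    def _fp(block: bytes) -> str:
        return hashlib.sha256(block).hexdigest()

    def write(self, block: bytes) -> str:
        value = self._fp(block)
        # Step 205: specific-format data skips deduplication entirely.
        if value in self.first:
            return "specific"
        # Step 206: look the value up in the second mapping.
        addr = self.second.get(value)
        if addr is not None:
            # Step 207: duplicate block, only the reference count changes.
            self.refs[addr] += 1
            return "duplicate"
        # Step 208: first write, allocate an address and record the mapping.
        addr = self._next_addr
        self._next_addr += 1
        self.data[addr] = block
        self.second[value] = addr
        self.refs[addr] = 1
        return "new"
```

In this model a block that hits the first mapping touches neither `data` nor `refs`, which mirrors the reduction in storage-address accesses that the text describes.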
In the embodiment, deduplication operations are performed only on data that is not specific-format data. On the one hand, limiting the specific-format data to a certain range of data keeps the size of the first mapping relationship within a certain range and prevents the storage device from occupying too much cache when it loads the first mapping relationship. On the other hand, data that is not specific-format data is processed by the existing deduplication flow, which saves storage space on the storage device.
In one implementation of specific-format data, data of length n bytes serves as a data segment basic unit from which data segments are built. Here n is an integer greater than 0, and its value may be determined according to the resource utilization of the storage device. In general, the larger n is, the more resources of the storage device are consumed and the higher the resource utilization. If n is 1, then 1 byte (8 bits) serves as the data segment basic unit. With 8 bits, 256 basic units are possible, from 00000000 to 11111111, and a data segment is formed from a set number of identical basic units. The data segment is divided into data blocks by a fixed-length or variable-length chunking algorithm. All the data blocks obtained from one such data segment have the same content. Any one of them is chosen as specific-format data, its characteristic value (for example a hash value) is determined, and a mapping between the characteristic value and the data block is established in the first mapping relationship.
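Building the first mapping relationship from n-byte basic units can be sketched as follows. The sketch assumes SHA-256 as the hash algorithm and a block size that is a multiple of n, so that each block is simply one basic unit repeated; the function name `build_first_mapping` is hypothetical. The number of entries is 256 to the power n, which illustrates why a larger n consumes more resources.

```python
import hashlib
from itertools import product
from typing import Dict


def build_first_mapping(n: int, block_size: int) -> Dict[str, bytes]:
    """Map the characteristic value of every block made of one repeated
    n-byte basic unit to that block."""
    if n <= 0 or block_size <= 0 or block_size % n != 0:
        raise ValueError("block_size must be a positive multiple of n")
    mapping: Dict[str, bytes] = {}
    # Enumerate all 256**n possible basic units.
    for unit in product(range(256), repeat=n):
        block = bytes(unit) * (block_size // n)
        mapping[hashlib.sha256(block).hexdigest()] = block
    return mapping
```

With n = 1 the mapping holds the 256 single-byte patterns, including the all-0 and all-1 blocks mentioned in the text.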
In another implementation, the storage device may determine specific-format data from the reference counts of the storage addresses recorded in the second mapping relationship; the reference count characterizes the repetition count of the data stored at a storage address. For example, data block D at a storage address M whose reference count is greater than a threshold R is taken as specific-format data, where R is an integer greater than 0. A mapping between the characteristic value T of data block D and data block D is then established in the first mapping relationship; that is, the first mapping relationship includes a mapping between T and data block D. Because the second mapping relationship already contains T, the storage device can obtain T directly from the second mapping relationship. After the mapping between T and data block D has been established, the storage device deletes the mapping between T and storage address M and the data at storage address M, or uses an invalid flag to mark that mapping and that data as invalid. When the storage device again obtains a data block whose characteristic value is T, it no longer needs to perform a deduplication operation, which reduces access operations on storage addresses.
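Promoting a frequently repeated block into the first mapping relationship can be sketched as follows. The sketch uses plain dictionaries for both mapping relationships and implements only the deletion variant; the invalid-flag variant is not shown. The function name `promote_hot_blocks` and its parameter names are hypothetical.

```python
from typing import Dict


def promote_hot_blocks(
    first: Dict[str, bytes],   # first mapping: characteristic value -> data
    second: Dict[str, int],    # second mapping: characteristic value -> address
    data: Dict[int, bytes],    # storage address -> data block
    refs: Dict[int, int],      # storage address -> reference count
    threshold_r: int,
) -> None:
    """Move every block whose reference count exceeds R into the first mapping."""
    # Iterate over a snapshot because the second mapping is modified in the loop.
    for value, addr in list(second.items()):
        if refs[addr] > threshold_r:
            first[value] = data[addr]
            # Deletion variant: drop the old mapping and the stored data.
            del second[value]
            del data[addr]
            del refs[addr]
```

After promotion, a later block with characteristic value T is found in the first mapping and needs no reference-count update.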
In the embodiment, the first mapping relationship may be implemented with an array, a binary tree or the like; the embodiment imposes no limitation on this.
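As one possible reading of the array option, the first mapping relationship could be kept as an array sorted by characteristic value and searched by binary search. The sketch below is an assumption for illustration only; the class name `SortedArrayMapping` is hypothetical and the source does not describe the layout of the array.

```python
import bisect
from typing import Iterable, List, Optional, Tuple


class SortedArrayMapping:
    """Read-only mapping stored as two parallel arrays sorted by key."""

    def __init__(self, items: Iterable[Tuple[str, bytes]]):
        pairs = sorted(items, key=lambda pair: pair[0])
        self._keys: List[str] = [key for key, _ in pairs]
        self._values: List[bytes] = [value for _, value in pairs]

    def get(self, characteristic_value: str) -> Optional[bytes]:
        """Binary search for the value; None when it is absent."""
        i = bisect.bisect_left(self._keys, characteristic_value)
        if i < len(self._keys) and self._keys[i] == characteristic_value:
            return self._values[i]
        return None
```

A lookup costs a logarithmic number of comparisons in the number of entries, comparable to a balanced binary tree, while the array itself is compact to load into cache.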
The embodiment may be applied to online deduplication scenarios such as data backup; for example, the storage device receives a data segment and performs the operations described in the embodiment. The embodiment may also be applied to offline deduplication scenarios; for example, the storage device reads a stored data segment and performs the operations described in the embodiment. The embodiment imposes no limitation on this.
Based on the scheme described above, another embodiment of the present invention provides a storage device as shown in Fig. 5. The storage device stores a first mapping relationship, which includes a mapping between a first characteristic value and the specific-format data corresponding to it. The storage device includes a computing unit 501 and a determining unit 502. The computing unit 501 is configured to compute the characteristic value of a first data block, which is the first characteristic value. The determining unit 502 is configured to query the first mapping relationship with the first characteristic value of the first data block and determine that the first mapping relationship contains the first characteristic value; the first data block is then specific-format data, and the storage device performs no deduplication operation on it.
Optionally, the storage device stores a second mapping relationship, which contains a mapping between a second characteristic value and a first storage address; the first storage address stores the data corresponding to the second characteristic value. The storage device further includes an updating unit 503. The computing unit 501 is further configured to compute the characteristic value of a second data block, which is the second characteristic value. The determining unit 502 is further configured to query the first mapping relationship with the second characteristic value and determine that the first mapping relationship does not contain it, and to query the second mapping relationship with the second characteristic value and determine that the second mapping relationship contains it. The updating unit 503 is configured to update the reference count of the first storage address. Optionally, the storage device further includes a storage unit 504 and an establishing unit 505. The computing unit 501 is further configured to compute the characteristic value of a third data block, which is the third characteristic value. The determining unit 502 is further configured to query the first mapping relationship with the third characteristic value and determine that the first mapping relationship does not contain it, and to query the second mapping relationship with the third characteristic value and determine that the second mapping relationship does not contain it. The storage unit 504 is configured to store the third data block at a second storage address. The establishing unit 505 is configured to establish a mapping between the third characteristic value and the second storage address in the second mapping relationship. Further, the updating unit 503 is further configured to update the reference count of the second storage address.
Optionally, the storage device further includes a dividing unit 506. The dividing unit 506 is configured to divide a data segment to obtain the first data block. The establishing unit 505 is further configured to establish a mapping between the data segment and the first feature value of the first data block.
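As an illustration of the dividing unit 506 and the segment-to-feature-value mapping, the following Python sketch splits a data segment into blocks and records the feature value of each block against the segment. Fixed-size division, the 4096-byte block size, SHA-256 as the feature-value function, and the names `divide_segment` and `map_segment` are assumptions, not details taken from the embodiment.

```python
import hashlib


def divide_segment(segment: bytes, block_size: int = 4096):
    # Assumption: fixed-size division; the last block may be shorter.
    return [segment[i:i + block_size] for i in range(0, len(segment), block_size)]


def map_segment(segment_id: str, segment: bytes, block_size: int = 4096):
    # Map the data segment to the feature values of the blocks it was divided into.
    blocks = divide_segment(segment, block_size)
    return {segment_id: [hashlib.sha256(b).hexdigest() for b in blocks]}
```

With such a mapping, a segment containing specific-format blocks could be described by feature values alone, without those blocks passing through the deduplication path.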
For the effects achieved by the storage device shown in Fig. 5 and for further implementations, refer to the corresponding descriptions in the foregoing embodiments; details are not repeated here.
In one implementation of the storage device shown in Fig. 5, the foregoing units are installed in the storage device and can be loaded into the memory of the storage device; the CPU in the storage device executes the instructions in the memory to implement the functions of the corresponding embodiments of the present invention. In another implementation, the units in the storage device can be implemented by hardware, or by a combination of hardware and instructions in memory executed by the CPU. The foregoing units are also referred to as structural units.
An embodiment of the present invention further provides a non-volatile computer-readable storage medium and a computer program product. The computer instructions contained in the non-volatile computer-readable storage medium and the computer program product are loaded into memory and executed by a CPU to implement the functions of the storage device in the embodiments of the present invention.
The descriptions given in the embodiments of the present invention are exemplary. The terms "first", "second", "third", and so on in the embodiments of the present invention are not intended to strictly define an order. For example, when "first", "second", and "third" refer to data blocks, they merely distinguish different data blocks; when they refer to feature values, they merely indicate different feature values. There may also be one or more data blocks between the first data block and the second data block.
In the embodiments of the present invention, the first mapping relationship and the second mapping relationship may each include multiple mappings. For example, the first mapping relationship may contain multiple mappings, one of which is the mapping between the first feature value and the specific-format data corresponding to the first feature value. The specific-format data corresponding to the first feature value is the specific-format data whose feature value is the first feature value.
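A first mapping relationship holding several mappings can be sketched as a small lookup table built once from the known specific-format data. In the Python sketch below, the choice of an all-zero block and an all-0xFF block as specific-format data, the 4096-byte block size, SHA-256 as the feature-value function, and the helper name `is_specific_format` are assumptions for illustration only.

```python
import hashlib

BLOCK_SIZE = 4096  # assumption: block size used only for this example


def feature_value(data: bytes) -> str:
    # Assumption: SHA-256 stands in for the feature-value calculation.
    return hashlib.sha256(data).hexdigest()


# Assumption: two example patterns treated as specific-format data.
specific_format_data = [bytes(BLOCK_SIZE), b"\xff" * BLOCK_SIZE]

# One mapping per specific-format pattern: feature value -> specific-format data.
first_mapping = {feature_value(d): d for d in specific_format_data}


def is_specific_format(block: bytes) -> bool:
    # A block is specific-format data when its feature value is a key of the first mapping.
    return feature_value(block) in first_mapping
```

Each entry pairs a feature value with the data whose feature value it is, matching the definition given in the preceding paragraph.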
It should be understood that the devices and methods disclosed in the several embodiments provided by the present invention may be implemented in other ways. For example, the division of units in the device embodiments described above is merely a division by logical function; other divisions are possible in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections between devices or units through some interfaces, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.

Claims (15)

1. A data processing method, characterized in that a storage device stores a first mapping relationship, the first mapping relationship including a mapping between a first feature value and the specific-format data corresponding to the first data block, the method comprising:
the storage device calculating the feature value of a first data block, the feature value of the first data block being the first feature value;
the storage device querying the first mapping relationship according to the first feature value corresponding to the first data block and determining that the first mapping relationship contains the first feature value, whereupon the first data block is the specific-format data and the storage device no longer performs a data deduplication operation on the first data block.
2. The method according to claim 1, characterized in that the storage device stores a second mapping relationship; the second mapping relationship contains a mapping between a second feature value and a first storage address, wherein the first storage address stores the data corresponding to the second feature value; the method further comprising:
the storage device calculating the feature value of a second data block, the feature value of the second data block being the second feature value;
the storage device querying the first mapping relationship according to the second feature value of the second data block and determining that the first mapping relationship does not contain the second feature value;
the storage device querying the second mapping relationship according to the second feature value of the second data block and determining that the second mapping relationship contains the second feature value;
the storage device updating the reference count of the first storage address.
3. The method according to claim 1, characterized in that the storage device stores a second mapping relationship; the second mapping relationship contains a mapping between a second feature value and a first storage address, wherein the first storage address stores the data corresponding to the second feature value; the method further comprising:
the storage device calculating the feature value of a third data block, the feature value of the third data block being a third feature value;
the storage device querying the first mapping relationship according to the third feature value and determining that the first mapping relationship does not contain the third feature value;
the storage device querying the second mapping relationship according to the third feature value and determining that the second mapping relationship does not contain the third feature value;
the storage device storing the third data block to a second storage address;
the storage device establishing, in the second mapping relationship, a mapping between the third feature value and the second storage address.
4. The method according to claim 3, characterized in that the method further comprises:
the storage device updating the reference count of the second storage address.
5. The method according to claim 1, characterized in that the method further comprises:
the storage device dividing a data segment to obtain the first data block;
the storage device establishing a mapping relationship between the data segment and the first feature value of the first data block.
6. A storage device, characterized in that the storage device stores a first mapping relationship, the first mapping relationship including a mapping between a first feature value and the specific-format data of the first data block, the storage device comprising a computing unit and a determining unit, wherein:
the computing unit is configured to calculate the feature value of a first data block, the feature value of the first data block being the first feature value;
the determining unit is configured to query the first mapping relationship according to the first feature value corresponding to the first data block and determine that the first mapping relationship contains the first feature value, whereupon the first data block is the specific-format data and the storage device no longer performs a data deduplication operation on the first data block.
7. The storage device according to claim 6, characterized in that the storage device stores a second mapping relationship; the second mapping relationship contains a mapping between a second feature value and a first storage address, wherein the first storage address stores the data corresponding to the second feature value; the storage device further comprises an updating unit, wherein:
the computing unit is further configured to calculate the feature value of a second data block, the feature value of the second data block being the second feature value;
the determining unit is further configured to query the first mapping relationship according to the second feature value of the second data block and determine that the first mapping relationship does not contain the second feature value;
the determining unit is further configured to query the second mapping relationship according to the second feature value of the second data block and determine that the second mapping relationship contains the second feature value;
the updating unit is further configured to update the reference count of the first storage address.
8. The storage device according to claim 6, characterized in that the storage device stores a second mapping relationship; the second mapping relationship contains a mapping between a second feature value and a first storage address, wherein the first storage address stores the data corresponding to the second feature value; the storage device further comprises a storage unit and an establishing unit, wherein:
the computing unit is further configured to calculate the feature value of a third data block, the feature value of the third data block being a third feature value;
the determining unit is further configured to query the first mapping relationship according to the third feature value and determine that the first mapping relationship does not contain the third feature value;
the determining unit is further configured to query the second mapping relationship according to the third feature value and determine that the second mapping relationship does not contain the third feature value;
the storage unit is configured to store the third data block to a second storage address;
the establishing unit is configured to establish, in the second mapping relationship, a mapping between the third feature value and the second storage address.
9. The storage device according to claim 8, characterized in that the storage device further comprises an updating unit,
the updating unit being configured to update the reference count of the second storage address.
10. The storage device according to claim 6, characterized in that the storage device further comprises a dividing unit and an establishing unit; the dividing unit is configured to divide a data segment to obtain the first data block;
the establishing unit is configured to establish a mapping between the data segment and the first feature value of the first data block.
11. A storage device, characterized in that the storage device stores a first mapping relationship, the first mapping relationship including a mapping between a first feature value and the specific-format data corresponding to the first feature value, the storage device comprising a computing unit and a determining unit; the storage device comprises an interface and a processor, wherein the processor is configured to:
calculate the feature value of a first data block, the feature value of the first data block being the first feature value;
query the first mapping relationship according to the first feature value of the first data block and determine that the first mapping relationship contains the first feature value, whereupon the first data block is the specific-format data and the processor no longer performs a data deduplication operation on the first data block.
12. The storage device according to claim 11, characterized in that the storage device stores a second mapping relationship; the second mapping relationship contains a mapping between a second feature value and a first storage address, wherein the first storage address stores the data corresponding to the second feature value; the processor is further configured to:
calculate the feature value of a second data block, the feature value of the second data block being the second feature value;
query the first mapping relationship according to the second feature value of the second data block and determine that the first mapping relationship does not contain the second feature value;
query the second mapping relationship according to the second feature value of the second data block and determine that the second mapping relationship contains the second feature value;
update the reference count of the first storage address.
13. The storage device according to claim 11, characterized in that the storage device stores a second mapping relationship; the second mapping relationship contains a mapping between a second feature value and a first storage address, wherein the first storage address stores the data corresponding to the second feature value; the processor is further configured to:
calculate the feature value of a third data block, the feature value of the third data block being a third feature value;
query the first mapping relationship according to the third feature value and determine that the first mapping relationship does not contain the third feature value;
query the second mapping relationship according to the third feature value and determine that the second mapping relationship does not contain the third feature value;
store the third data block to a second storage address;
establish, in the second mapping relationship, a mapping between the third feature value and the second storage address.
14. The storage device according to claim 13, characterized in that the processor is further configured to update the reference count of the second storage address.
15. The storage device according to claim 11, characterized in that the processor is further configured to:
divide a data segment to obtain the first data block;
establish a mapping relationship between the data segment and the first feature value of the first data block.
CN201610839436.9A 2016-09-21 2016-09-21 Data processing method and storage device Active CN106383670B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610839436.9A CN106383670B (en) 2016-09-21 2016-09-21 Data processing method and storage device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610839436.9A CN106383670B (en) 2016-09-21 2016-09-21 Data processing method and storage device

Publications (2)

Publication Number Publication Date
CN106383670A true CN106383670A (en) 2017-02-08
CN106383670B CN106383670B (en) 2020-02-14

Family

ID=57935887

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610839436.9A Active CN106383670B (en) 2016-09-21 2016-09-21 Data processing method and storage device

Country Status (1)

Country Link
CN (1) CN106383670B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120123999A1 (en) * 2010-11-16 2012-05-17 Actifio, Inc. System and method for managing data with service level agreements that may specify non-uniform copying of data
CN102591592A (en) * 2010-12-14 2012-07-18 微软公司 Data deduplication in a virtualization environment
CN103279502A (en) * 2013-05-06 2013-09-04 北京赛思信安技术有限公司 Framework and method of repeated data deleting file system combined with parallel file system
CN103514250A (en) * 2013-06-20 2014-01-15 易乐天 Method and system for deleting global repeating data and storage device
CN104376584A (en) * 2013-08-15 2015-02-25 华为技术有限公司 Data compression method, computer system and device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112394874A (en) * 2019-08-13 2021-02-23 华为技术有限公司 Key value KV storage method and device and storage equipment
CN113467716A (en) * 2021-06-11 2021-10-01 苏州浪潮智能科技有限公司 Data storage method, device, equipment and readable medium
CN113467716B (en) * 2021-06-11 2023-05-23 苏州浪潮智能科技有限公司 Method, device, equipment and readable medium for data storage

Also Published As

Publication number Publication date
CN106383670B (en) 2020-02-14

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant