CN109521957A - Data processing method and device - Google Patents

Data processing method and device

Info

Publication number
CN109521957A
CN109521957A
Authority
CN
China
Prior art keywords
data
cache pool
reading
pending
cached
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811231230.3A
Other languages
Chinese (zh)
Inventor
李�杰
张在贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd filed Critical Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201811231230.3A priority Critical patent/CN109521957A/en
Publication of CN109521957A publication Critical patent/CN109521957A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G06F3/0611 Improving I/O performance in relation to response time
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656 Data buffering arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present application disclose a data processing method and device. When target data is written into a distributed storage system, cached data is simultaneously written into a cache pool, where the cache pool is built from a storage device with a capacity greater than or equal to 100 GB. When pending data needs to be read, the pending data can be looked up in the cached data in the cache pool, the found pending data is read, and the read pending data is processed. Since the capacity of the storage device is greater than that of a memory device, more cached data can be stored in the cache pool; in addition to improving data reading efficiency, this improves the read hit rate of the pending data, thereby further reducing the read latency of the pending data and effectively improving data reading performance.

Description

Data processing method and device
Technical field
The present invention relates to the field of computer technology, and in particular to a data processing method and device.
Background art
Distributed storage stores data dispersed across multiple independent devices. Compared with the traditional approach of storing all data on a centralized storage server, distributed storage uses a scalable system architecture in which multiple storage servers share the storage load, improving reliability, availability and scalability. At the same time, however, distributed storage has no advantage in read efficiency and suffers from a certain amount of latency.
In the prior art, in order to improve the read efficiency of distributed storage, a memory device can be added to store part of the data. Since the speed of reading data from a memory device is higher than the speed of reading data from a distributed storage system, data reading efficiency can be improved. However, with this way of reading data, the partial data stored in the memory device may not be the data that needs to be read, so the hit rate of data reads is often low and the improvement in data reading efficiency is limited.
Summary of the invention
In order to solve the above technical problem, embodiments of the present application provide a data processing method for improving the hit rate of data reads and thereby improving data reading efficiency.
An embodiment of the present application provides a data processing method, the method comprising:
looking up pending data in a cache pool, wherein cached data is stored in the cache pool, the cached data is written into the cache pool when target data is written into a distributed storage system, and the cache pool is built from a storage device with a capacity greater than or equal to 100 GB; and
reading the found pending data.
Optionally, the method further comprises:
if the pending data does not exist in the cache pool, reading the pending data from the distributed storage system and writing the pending data into the cache pool.
Optionally, the method further comprises:
when the utilization rate of the cache pool exceeds a preset value, deleting a preset amount of the oldest cached data; and/or
deleting cached data whose storage time exceeds a preset time threshold.
Optionally, the cached data is single-copy data.
Optionally, the cached data is identical to the target data.
Optionally, the cached data is part of the target data.
Optionally, the storage device includes a solid-state drive, a random access memory, or a hard disk drive.
An embodiment of the present application further provides a data processing device, the device comprising:
a lookup unit, configured to look up pending data in a cache pool, wherein cached data is stored in the cache pool, the cached data is written into the cache pool when target data is written into a distributed storage system, and the cache pool is built from a storage device with a capacity greater than or equal to 100 GB; and
a reading unit, configured to read the found pending data.
Optionally, the device further comprises:
a read-write unit, configured to, if the pending data does not exist in the cache pool, read the pending data from the distributed storage system and write the pending data into the cache pool.
Optionally, the device further comprises:
a deleting unit, configured to delete a preset amount of the oldest cached data when the utilization rate of the cache pool exceeds a preset value; and/or
delete cached data whose storage time exceeds a preset time threshold.
Optionally, the cached data is single-copy data.
Embodiments of the present application provide a data processing method and device. When target data is written into a distributed storage system, cached data is simultaneously written into a cache pool, where the cache pool is built from a storage device with a capacity greater than or equal to 100 GB. When pending data needs to be read, the pending data can be looked up in the cached data in the cache pool and the found pending data can be read. Since the capacity of the storage device is greater than that of a memory device, more cached data can be stored in the cache pool; in addition to improving data reading efficiency, this improves the read hit rate of the pending data, thereby further reducing the read latency of the pending data and effectively improving data reading performance.
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the present application more clearly, the drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some of the embodiments described in the present application; those of ordinary skill in the art may also obtain other drawings based on these drawings.
Fig. 1 is a flowchart of a data processing method provided by an embodiment of the present application;
Fig. 2 is a structural block diagram of a data processing device provided by an embodiment of the present application.
Detailed description of the embodiments
The inventors found that, in the prior art, in order to improve the read efficiency of distributed storage, a memory device can be added to store part of the data. Since the speed of reading data from a memory device is higher than the speed of reading data from a distributed storage system, data reading efficiency can be improved. However, with this way of reading data, the partial data stored in the memory device may not be the data that needs to be read, so the hit rate of data reads is often low and the improvement in data reading efficiency is limited.
For example, non-linear editing (abbreviated as NLE) is a way of quickly and accurately accessing material from a storage system and then editing that material; specifically, non-linear editing can be carried out using NLE software such as Nova. In an NLE scenario, if data reads are not fast and accurate enough, the software interface will stutter and fail to respond to user operations in time, seriously affecting the user experience.
Based on this, an embodiment of the present application provides a data processing method: when target data is written into the storage system, cached data is simultaneously written into a cache pool, where the cache pool is built from a storage device with a capacity greater than or equal to 100 GB. When pending data needs to be read, the pending data can be looked up in the cached data in the cache pool and the found pending data can be read. Since the capacity of the storage device is greater than that of a memory device, more cached data can be stored in the cache pool; in addition to improving data reading efficiency, this improves the read hit rate of the pending data, thereby further reducing the read latency of the pending data and effectively improving data reading performance.
The specific implementations of the data processing method and device provided by the embodiments of the present application are described in detail below through embodiments with reference to the drawings.
Referring to Fig. 1, which shows a flowchart of a data processing method provided by an embodiment of the present application, the method may include the following steps.
S101: look up pending data in a cache pool.
In the embodiments of the present application, the cache pool is built from a storage device with a capacity greater than or equal to 100 GB, where the storage device may include a solid-state drive, a random access memory, a hard disk drive, and the like. Since the cache pool has a relatively large capacity, it can store more data; meanwhile, the speed of reading data from the cache pool is higher than the speed of reading data from the distributed storage system.
Cached data can be stored in the cache pool. The cached data is written into the cache pool when the target data is written into the distributed storage system, and the write mode may be an asynchronous write.
Specifically, when the target data is written into the distributed storage system, the target data itself may be written into the cache pool as the cached data; when the target data is large, part of the target data may instead be written into the cache pool as the cached data, so as to avoid generating excessive write operations that occupy too much bandwidth. The part of the target data may be a preset amount of data at the beginning of the target data, a preset amount of data at the end of the target data, or data selected from the target data according to a preset data period.
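As an illustration only (not the patent's implementation), the following Python sketch shows one way this write path could look; the store/cache_pool objects with write/put methods, the SIZE_THRESHOLD cutoff, and the choice of caching the leading bytes are all assumptions introduced for clarity:

```python
import threading

SIZE_THRESHOLD = 64 * 1024 * 1024   # hypothetical cutoff above which only part of the data is cached
HEAD_BYTES = 4 * 1024 * 1024        # hypothetical size of the leading slice that gets cached

def write_target_data(store, cache_pool, key, data):
    """Write target data to the distributed storage system and asynchronously
    write cached data (the whole object or a leading slice) into the cache pool."""
    store.write(key, data)   # write to the distributed storage system

    # Cache the whole object when it is small; otherwise cache only part of it
    # so the extra write operations do not occupy too much bandwidth.
    cached = data if len(data) <= SIZE_THRESHOLD else data[:HEAD_BYTES]

    # Asynchronous write into the cache pool, so caching never delays the main write.
    threading.Thread(target=cache_pool.put, args=(key, cached), daemon=True).start()
```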
In order to use the storage space of the cache pool efficiently, the cached data in the cache pool may be stored as a single copy, without data persistence or a backup strategy, which reduces the redundancy of the cache pool. The cached data in the cache pool may include only data that has been read, without saving dirty data, where dirty data is data that has not yet been written into the distributed storage system. In this way, the read cached data can be stored in a targeted manner, which helps improve the read efficiency of the system.
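A hedged way to make this policy concrete is a small configuration object; the dataclass and its field names below are invented for illustration and are not prescribed by the patent:

```python
from dataclasses import dataclass

@dataclass
class CachePoolPolicy:
    replica_count: int = 1           # single copy: no persistence or backup of cached data
    store_dirty_data: bool = False   # only data that has been read is cached; dirty data
                                     # (not yet written to the distributed store) is excluded
    min_capacity_bytes: int = 100 * 10**9  # built on a storage device of at least 100 GB
```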
In the embodiments of the present application, the cached data in the cache pool may also be cleaned up.
Specifically, cached data whose storage time exceeds a preset time threshold may be deleted. The preset time threshold may be set according to the usage pattern of the software corresponding to the cached data. For example, if the software corresponding to the cached data generates a large amount of data per unit time and runs for a short time, the preset time threshold may be short, for example 1 day; if the software generates a small amount of data per unit time and runs for a long time, the preset time threshold may be long, for example 3 days.
Specifically, when the utilization rate of the cache pool exceeds a preset value, a preset amount of the oldest cached data may also be deleted, where the preset value may be 90%.
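A minimal sketch of the two cleanup policies described above, assuming the cache pool contents are tracked as a mapping of key to (data, write_time); the 1-day and 90% values follow the examples in the text, while everything else is illustrative:

```python
import time

PRESET_TIME_THRESHOLD = 24 * 3600   # e.g. 1 day; the text suggests up to 3 days for long-running software
UTILIZATION_PRESET = 0.90           # example preset value from the text: 90%

def clean_cache_pool(entries, capacity_bytes):
    """entries maps key -> (data, write_time); returns the cleaned mapping."""
    now = time.time()

    # Policy 1: delete cached data whose storage time exceeds the preset time threshold.
    entries = {key: value for key, value in entries.items()
               if now - value[1] <= PRESET_TIME_THRESHOLD}

    # Policy 2: when utilization exceeds the preset value, delete the oldest cached data first.
    used = sum(len(data) for data, _ in entries.values())
    for key in sorted(entries, key=lambda k: entries[k][1]):   # oldest write_time first
        if used <= UTILIZATION_PRESET * capacity_bytes:
            break
        data, _ = entries.pop(key)
        used -= len(data)
    return entries
```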
S102: read the found pending data.
After the pending data is found in the cache pool, the pending data can be read. Since the speed of reading data from the cache pool is higher than the speed of reading data from the distributed storage system, data reading efficiency can be improved. Meanwhile, the capacity of the cache pool is larger than that of the memory device used in the prior art, so more cached data can be stored, which improves the hit rate of the pending data.
If the pending data does not exist in the cache pool, the pending data can be read from the distributed storage system and written into the cache pool. Since the hit rate of the pending data in the cache pool is high, the probability of having to read data from the distributed storage system is small, which significantly improves data reading efficiency for a large amount of pending data.
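Putting S101 and S102 together, the read path could be sketched as follows, again against an assumed store/cache_pool interface rather than the patent's actual implementation:

```python
def read_pending_data(store, cache_pool, key):
    """Return the pending data, preferring the cache pool over the distributed store."""
    data = cache_pool.get(key)      # S101: look up the pending data in the cache pool
    if data is not None:
        return data                 # cache hit: read directly from the cache pool

    data = store.read(key)          # cache miss: read from the distributed storage system
    cache_pool.put(key, data)       # write the pending data into the cache pool for later reads
    return data
```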
An embodiment of the present application provides a data processing method. When target data is written into a distributed storage system, cached data is simultaneously written into a cache pool, where the cache pool is built from a storage device with a capacity greater than or equal to 100 GB. When pending data needs to be read, the pending data can be looked up in the cached data in the cache pool and the found pending data can be read. Since the capacity of the storage device is greater than that of a memory device, more cached data can be stored in the cache pool; in addition to improving data reading efficiency, this improves the read hit rate of the pending data, thereby further reducing the read latency of the pending data and effectively improving data reading performance.
Based on the above data processing method, an embodiment of the present application further provides a data processing device. Referring to Fig. 2, which is a structural block diagram of a data processing device provided by an embodiment of the present application, the device includes:
a lookup unit 110, configured to look up pending data in a cache pool, wherein cached data is stored in the cache pool, the cached data is written into the cache pool when target data is written into a distributed storage system, and the cache pool is built from a storage device with a capacity greater than or equal to 100 GB; and
a reading unit 120, configured to read the found pending data.
Optionally, the device further comprises:
a read-write unit, configured to, if the pending data does not exist in the cache pool, read the pending data from the distributed storage system and write the pending data into the cache pool.
Optionally, the device further comprises:
a deleting unit, configured to delete a preset amount of the oldest cached data when the utilization rate of the cache pool exceeds a preset value; and/or
delete cached data whose storage time exceeds a preset time threshold.
Optionally, the cached data is single-copy data.
An embodiment of the present application provides a data processing device. When target data is written into a distributed storage system, cached data is simultaneously written into a cache pool, where the cache pool is built from a storage device with a capacity greater than or equal to 100 GB. When pending data needs to be read, the pending data can be looked up in the cached data in the cache pool and the found pending data can be read. Since the capacity of the storage device is greater than that of a memory device, more cached data can be stored in the cache pool; in addition to improving data reading efficiency, this improves the read hit rate of the pending data, thereby further reducing the read latency of the pending data and effectively improving data reading performance.
" first " in the titles such as " first ... " mentioned in the embodiment of the present application, " first ... " is used only to do name Word mark, does not represent first sequentially.The rule is equally applicable to " second " etc..
From the above description of the embodiments, those skilled in the art can clearly understand that all or part of the steps in the methods of the above embodiments can be implemented by software plus a general-purpose hardware platform. Based on this understanding, the technical solution of the present application can be embodied in the form of a software product, which can be stored in a storage medium such as a read-only memory (ROM)/RAM, a magnetic disk, or an optical disc, and which includes several instructions for causing a computer device (which may be a personal computer, a server, or a network communication device such as a router) to execute the methods described in the embodiments of the present application or in certain parts of the embodiments.
The embodiments in this specification are described in a progressive manner; for the same or similar parts between the embodiments, reference may be made to each other, and each embodiment focuses on its differences from the other embodiments. In particular, since the method embodiments and the device embodiments are substantially similar to the system embodiment, they are described relatively simply, and for related details reference may be made to the description of the system embodiment. The device and system embodiments described above are merely illustrative; the modules described as separate components may or may not be physically separated, and the components shown as modules may or may not be physical modules, that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of a given embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.
The above is only a preferred embodiment of the present application and is not intended to limit the protection scope of the present application. It should be pointed out that those skilled in the art can make several improvements and modifications without departing from the present application, and these improvements and modifications should also be regarded as falling within the protection scope of the present application.

Claims (10)

1. A data processing method, characterized in that the method comprises:
looking up pending data in a cache pool, wherein cached data is stored in the cache pool, the cached data is written into the cache pool when target data is written into a distributed storage system, and the cache pool is built from a storage device with a capacity greater than or equal to 100 GB; and
reading the found pending data.
2. The method according to claim 1, characterized in that the method further comprises:
if the pending data does not exist in the cache pool, reading the pending data from the distributed storage system and writing the pending data into the cache pool.
3. The method according to claim 1, characterized in that the method further comprises:
when the utilization rate of the cache pool exceeds a preset value, deleting a preset amount of the oldest cached data; and/or
deleting cached data whose storage time exceeds a preset time threshold.
4. The method according to any one of claims 1 to 3, characterized in that the cached data is single-copy data.
5. The method according to any one of claims 1 to 3, characterized in that the cached data is identical to the target data.
6. The method according to any one of claims 1 to 3, characterized in that the cached data is part of the target data.
7. The method according to any one of claims 1 to 3, characterized in that the storage device includes a solid-state drive, a random access memory, or a hard disk drive.
8. A data processing device, characterized in that the device comprises:
a lookup unit, configured to look up pending data in a cache pool, wherein cached data is stored in the cache pool, the cached data is written into the cache pool when target data is written into a distributed storage system, and the cache pool is built from a storage device with a capacity greater than or equal to 100 GB; and
a reading unit, configured to read the found pending data.
9. The device according to claim 8, characterized in that the device further comprises:
a read-write unit, configured to, if the pending data does not exist in the cache pool, read the pending data from the distributed storage system and write the pending data into the cache pool.
10. The device according to claim 8, characterized in that the device further comprises:
a deleting unit, configured to delete a preset amount of the oldest cached data when the utilization rate of the cache pool exceeds a preset value; and/or
delete cached data whose storage time exceeds a preset time threshold.
CN201811231230.3A 2018-10-22 2018-10-22 Data processing method and device Pending CN109521957A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811231230.3A CN109521957A (en) 2018-10-22 2018-10-22 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811231230.3A CN109521957A (en) 2018-10-22 2018-10-22 Data processing method and device

Publications (1)

Publication Number Publication Date
CN109521957A true CN109521957A (en) 2019-03-26

Family

ID=65772300

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811231230.3A Pending CN109521957A (en) 2018-10-22 2018-10-22 Data processing method and device

Country Status (1)

Country Link
CN (1) CN109521957A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110417901A (en) * 2019-07-31 2019-11-05 北京金山云网络技术有限公司 Data processing method, device and gateway server
CN112714409A (en) * 2021-01-27 2021-04-27 深圳市徕纳智能科技有限公司 Old man watch based on NB-IOT (network B-Internet of things) and data processing method
CN114063923A (en) * 2021-11-17 2022-02-18 海光信息技术股份有限公司 Data reading method and device, processor and electronic equipment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107632784A (en) * 2017-09-14 2018-01-26 郑州云海信息技术有限公司 The caching method of a kind of storage medium and distributed memory system, device and equipment

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107632784A (en) * 2017-09-14 2018-01-26 郑州云海信息技术有限公司 The caching method of a kind of storage medium and distributed memory system, device and equipment

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110417901A (en) * 2019-07-31 2019-11-05 北京金山云网络技术有限公司 Data processing method, device and gateway server
WO2021017884A1 (en) * 2019-07-31 2021-02-04 北京金山云网络技术有限公司 Data processing method and apparatus, and gateway server
CN110417901B (en) * 2019-07-31 2022-04-29 北京金山云网络技术有限公司 Data processing method and device and gateway server
CN112714409A (en) * 2021-01-27 2021-04-27 深圳市徕纳智能科技有限公司 Old man watch based on NB-IOT (network B-Internet of things) and data processing method
CN114063923A (en) * 2021-11-17 2022-02-18 海光信息技术股份有限公司 Data reading method and device, processor and electronic equipment

Similar Documents

Publication Publication Date Title
CN105242881B (en) Distributed memory system and its data read-write method
CN103064639B (en) Date storage method and device
US8225029B2 (en) Data storage processing method, data searching method and devices thereof
US10509701B2 (en) Performing data backups using snapshots
US8782368B2 (en) Storing chunks in containers
CN108647151A (en) It is a kind of to dodge system metadata rule method, apparatus, equipment and storage medium entirely
CN103999058B (en) Tape drive system server
CN101510223B (en) Data processing method and system
US7577808B1 (en) Efficient backup data retrieval
CN107832423B (en) File reading and writing method for distributed file system
CN103617097B (en) File access pattern method and device
CN102982182B (en) Data storage planning method and device
CN109521957A (en) A kind of data processing method and device
CN103329111A (en) Data processing method, device and system based on block storage
CN103917962A (en) Reading files stored on a storage system
CN104503703B (en) The treating method and apparatus of caching
CN107491523A (en) The method and device of data storage object
CN103902479A (en) Quick reconstruction mechanism for metadata cache on basis of metadata log
CN103019890A (en) Block-level disk data protection system and method thereof
CN108733306A (en) A kind of Piece file mergence method and device
CN113626431A (en) LSM tree-based key value separation storage method and system for delaying garbage recovery
CN109471843A (en) A kind of metadata cache method, system and relevant apparatus
CN109189772A (en) File management method and system for no file system storage medium
CN107168651A (en) A kind of small documents polymerize storage processing method
CN104657461A (en) File system metadata search caching method based on internal memory and SSD (Solid State Disk) collaboration

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190326

RJ01 Rejection of invention patent application after publication