CN104123300A - Data distributed storage system and method - Google Patents

Info

Publication number
CN104123300A
Authority
CN
China
Prior art keywords
data
back end
unit
internal memory
memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310150539.0A
Other languages
Chinese (zh)
Other versions
CN104123300B (en)
Inventor
吴朱华
潘志铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI PEOPLEYUN INFORMATION TECHNOLOGY Co Ltd
Original Assignee
SHANGHAI PEOPLEYUN INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANGHAI PEOPLEYUN INFORMATION TECHNOLOGY Co Ltd
Priority to CN201310150539.0A
Publication of CN104123300A
Application granted
Publication of CN104123300B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data distributed storage system and method. The system comprises a node cluster module, a data import module, and a storage module. The node cluster module connects the data nodes in a cluster to the corresponding management node. The data import module scans the input data in data blocks of a set size and loads it into memory, groups the data in memory according to its feature values, and sends the grouped data to the corresponding data nodes. In the storage module, after a data node receives a file fragment, it keeps the data fragment in memory and writes a log to the hard disk; it then judges whether the data in memory exceeds a set threshold and, if so, reorganizes and compresses the data, writes it to the hard disk, and deletes the corresponding log files used for recovering the in-memory data. The system and method realize a cluster based on fast in-memory computing, improve the real-time loading and processing of large-scale data, and shorten the response time of the system.

Description

Data distributed storage system and method
Technical field
The invention belongs to the technical field of database storage and relates to a distributed storage system, in particular to a data distributed storage system; the invention also relates to a data distributed storage method.
Background art
At present, databases store data in one of three ways: 1. single-machine data storage; 2. master-slave backup storage; 3. storage on a distributed file system. Each of these approaches, however, has certain shortcomings.
Single-machine data storage is easy to manage and use, but its scalability is seriously limited, so it struggles to meet the access demands of today's massive data volumes, and data security is also a problem. Master-slave backup storage solves only the security problem; the other problems remain. Database storage on a distributed file system addresses both data security and access to massive data, but it is not suitable for data access and processing that require low latency.
In view of this, there is an urgent need for a new distributed storage system and method for databases that overcomes the above defects of existing storage systems.
Summary of the invention
The technical problem to be solved by the invention is to provide a distributed storage system for databases that realizes a cluster based on fast in-memory computing, improves the real-time loading and processing of large-scale data, and shortens the response time of the whole system.
In addition, the invention provides a data distributed storage method that realizes a cluster based on fast in-memory computing, improves the real-time loading and processing of large-scale data, and shortens the response time of the whole system.
To solve the above technical problem, the invention adopts the following technical scheme:
A data distributed storage system, the system comprising:
a registration module, configured to register the data nodes in the cluster with the management node through a client;
a data import module, configured to scan the input data in data blocks of a set size and load it into memory, group the data in memory according to the feature values of the data, and then send the grouped data to the corresponding data nodes; the data import module specifically comprises a data scanning unit, a grouping rule matching unit, a data grouping unit, and a data sending unit; the data scanning unit is configured to scan the input data in data blocks of a set size and load it into memory, to split the data according to its feature values, and to generate from the feature values an integer value that serves as the identification code of the data; the grouping rule matching unit is configured to group the identification codes of different data according to the grouping rules; the data grouping unit is configured to group the scanned data blocks of the set size in memory according to the feature values of the data; the data sending unit sends the grouped data to the corresponding data nodes;
a storage module, configured to keep the data fragment in memory after a data node receives a file fragment, and to judge whether the data needs to be backed up to other data nodes, in which case the backup is performed by the backup module; the data node writes a log to the hard disk for recovering the in-memory data; the storage module judges whether the size of the data in memory exceeds a set threshold and, if so, classifies the data according to metadata features, reorganizes the data, and then compresses it; the data is reorganized mainly by sorting it according to its feature values and the similarity between data items, so that the most similar data is stored contiguously in preparation for the subsequent compressed storage; after reorganization, because similar data is stored together, the LZAM algorithm is used to compress it to obtain a higher compression ratio; the compressed data is then written to the hard disk, and the corresponding log files used for recovering the in-memory data are deleted;
a backup module, configured to back up the data according to a set number of backups after the data has been transmitted to the corresponding data node, the backup copies being distributed across other data nodes;
a retrieval module, configured to retrieve the corresponding data after the management node receives a data retrieval request; the retrieval module specifically comprises a positioning unit, a failure judgment unit, a request distribution unit, a retrieval unit, and a result merging unit; the management node locates, through the positioning unit, the data nodes involved in the data retrieval request; the management node uses a Lease mechanism, through the failure judgment unit, to determine whether such a data node has failed, and directly returns a request failure message if it has; if the node is valid, the management node distributes the request to the corresponding node through the request distribution unit; after receiving the data retrieval request, the data node retrieves the corresponding data through the retrieval unit and returns the result to the client; the client uses the result merging unit to merge the received results.
A data distributed storage system, the system comprising:
a node cluster module, configured to connect the data nodes in the cluster to the corresponding management node;
a data import module, configured to scan the input data in data blocks of a set size and load it into memory, group the data in memory according to the feature values of the data, and then send the grouped data to the corresponding data nodes;
a storage module, in which, after a data node receives a file fragment, the data fragment is kept in memory and the data node writes a log to the hard disk for recovering the in-memory data; the storage module judges whether the size of the data in memory exceeds a set threshold and, if so, reorganizes the data, compresses it, writes it to the hard disk, and deletes the corresponding log files used for recovering the in-memory data.
As a preferred embodiment of the invention, the data import module specifically comprises a data splitting unit, a file scanning unit, a grouping rule matching unit, a data grouping unit, and a data sending unit;
The data splitting unit is configured to scan the input data in data blocks of a set size and load it into memory; the grouping rule matching unit is configured to set, according to the data type, different rules for computing the feature values of the data; the data grouping unit is configured to group the scanned data blocks of the set size according to the features of the data; the data sending unit sends the grouped data to the corresponding data nodes.
As a preferred embodiment of the invention, the system further comprises a backup module, configured to back up the data according to a set number of backups after the data has been transmitted to the corresponding data node, the backup copies being distributed across other data nodes.
As a preferred embodiment of the invention, the system further comprises a retrieval module, configured to retrieve the corresponding data after the management node receives a data retrieval request;
The retrieval module specifically comprises a positioning unit, a failure judgment unit, a request distribution unit, a retrieval unit, and a result merging unit;
The management node locates, through the positioning unit, the data nodes involved in the data retrieval request; the management node uses a Lease mechanism, through the failure judgment unit, to determine whether such a data node has failed, and directly returns a request failure message if it has; if the node is valid, the management node distributes the request to the corresponding node through the request distribution unit; after receiving the data retrieval request, the data node retrieves the corresponding data through the retrieval unit and returns the result to the client; the client uses the result merging unit to merge the received results.
A data distributed storage method, the method comprising the following steps:
Node cluster step: the data nodes in the cluster are connected to the corresponding management node;
Data import step: the input data is scanned in data blocks of a set size and loaded into memory, the data in memory is grouped according to the feature values of the data, and the grouped data is then sent to the corresponding data nodes;
Storage step: after a data node receives a file fragment, the data fragment is kept in memory, and the data node writes a log to the hard disk for recovering the in-memory data; it is judged whether the size of the data in memory exceeds a set threshold and, if so, the data is reorganized, compressed, and written to the hard disk, and the corresponding log files used for recovering the in-memory data are deleted.
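The role of the on-disk log in the storage step can be illustrated with a minimal Python sketch. The patent does not specify the log format, so the JSON-lines layout and the names `append_log` and `recover_memory` are assumptions made only for illustration: each received fragment is appended to the log, and after a loss of memory the fragment table is rebuilt by replaying the log in order.

```python
import json
import os
import tempfile


def append_log(log_path, fragment_id, payload):
    """Append one received fragment to the on-disk log (hypothetical format)."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps({"id": fragment_id, "payload": payload}) + "\n")
        log.flush()
        os.fsync(log.fileno())  # push the entry to disk before continuing


def recover_memory(log_path):
    """Rebuild the in-memory fragment table by replaying the log in order."""
    memory = {}
    if not os.path.exists(log_path):
        return memory
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            entry = json.loads(line)
            memory[entry["id"]] = entry["payload"]
    return memory


# Simulate a data node that logged two fragments and then lost its memory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "node.log")
    append_log(path, "frag-1", "alpha")
    append_log(path, "frag-2", "beta")
    recovered = recover_memory(path)
```

Once the in-memory data has been written to the hard disk in compressed form, the log entries that cover it are no longer needed for recovery, which is why the storage step deletes them.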
As a preferred embodiment of the invention, the data import step comprises:
a data scanning step, in which the input data is scanned in data blocks of a set size and loaded into memory;
a grouping rule matching step, in which different rules for computing the feature values of the data are set according to the data type;
a data grouping step, in which the scanned data blocks of the set size are grouped according to the features of the data;
a data sending step, in which the grouped data is sent to the corresponding data nodes.
As a preferred embodiment of the invention, the method further comprises a backup step: after the data has been transmitted to the corresponding data node, the data is backed up according to a set number of backups, and the backup copies are distributed across other data nodes.
As a preferred embodiment of the invention, the method further comprises a retrieval step, in which the corresponding data is retrieved after the management node receives a data retrieval request;
The retrieval step specifically comprises:
the management node locates the data nodes involved in the data retrieval request;
the management node uses a Lease mechanism to determine whether such a data node has failed, and directly returns a request failure message if it has; if the node is valid, the management node distributes the request to the corresponding node;
after receiving the data retrieval request, the data node retrieves the corresponding data and returns the result to the client;
the client merges the received results.
The beneficial effects of the invention are as follows: the data distributed storage system and method proposed by the invention realize a cluster based on in-memory computing, enable real-time transaction management of large-scale data, and improve the response time of the system. On each data node, the in-memory data is backed up to disk, which ensures the safety of the data on a single machine. In addition, the system adopts a redundant design: every piece of data has redundant backups on different nodes, so the crash of any one node does not affect data integrity or system availability.
Brief description of the drawings
Fig. 1 is a schematic diagram of the composition of the data distributed storage system of the invention.
Fig. 2 is a flow chart of data import in the data distributed storage method of the invention.
Fig. 3 is a schematic diagram of the composition of the data import module of the system of the invention.
Fig. 4 is a flow chart of data storage in the data distributed storage method of the invention.
Fig. 5 is a flow chart of data retrieval in the data distributed storage method of the invention.
Detailed description of the embodiments
The preferred embodiments of the invention are described in detail below with reference to the accompanying drawings.
Embodiment one
Referring to Fig. 1, the invention discloses a data distributed storage system, the system comprising: a registration module 1 (which may also be called the "node cluster module"), a data import module 2, a storage module 3, a backup module, and a retrieval module 4.
The registration module 1 is configured to register the data nodes in the cluster with the management node through a client.
The data import module 2 is configured to scan the input data in data blocks of a set size and load it into memory, group the data in memory according to the feature values of the data, and then send the grouped data to the corresponding data nodes.
Specifically, referring to Fig. 3, in this embodiment the data import module comprises a data splitting unit, a file scanning unit, a grouping rule matching unit, a data grouping unit, and a data sending unit.
The data splitting unit is configured to scan the input data in data blocks of a set size and load it into memory; the grouping rule matching unit is configured to set, according to the data type, different rules for computing the feature values of the data; the data grouping unit is configured to group the scanned data blocks of the set size in memory according to the feature values of the data; the data sending unit sends the grouped data to the corresponding data nodes.
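As a rough illustration of rule matching and grouping, the following Python sketch picks a per-type rule, derives an integer identification code from the resulting feature value, and groups records by target data node. The rule table, the CRC32 hash, and the modulo mapping to nodes are all assumptions; the patent does not say how feature values are computed or how groups map to data nodes.

```python
import zlib
from collections import defaultdict

# Hypothetical per-type rules: each returns the bytes the feature value is computed from.
FEATURE_RULES = {
    "user": lambda rec: rec["user_id"].encode("utf-8"),
    "order": lambda rec: rec["region"].encode("utf-8"),
}


def identification_code(record):
    """Turn the record's feature value into a stable integer code (assumed: CRC32)."""
    rule = FEATURE_RULES[record["type"]]
    return zlib.crc32(rule(record))


def group_for_nodes(records, node_names):
    """Group records by the data node their identification code maps to."""
    groups = defaultdict(list)
    for record in records:
        node = node_names[identification_code(record) % len(node_names)]
        groups[node].append(record)
    return dict(groups)


nodes = ["node-a", "node-b", "node-c"]
block = [
    {"type": "order", "region": "east", "amount": 10},
    {"type": "order", "region": "east", "amount": 25},
    {"type": "user", "user_id": "u-17"},
]
groups = group_for_nodes(block, nodes)
```

Records that share a feature value (here, both "east" orders) always land in the same group, so they travel to the same data node together.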
The storage module 3 is configured to keep the data fragment in memory after a data node receives a file fragment, and to judge whether the data needs to be backed up to other data nodes; if so, the backup is performed by the backup module. The backup module is configured to back up the data according to a set number of backups after the data has been transmitted to the corresponding data node; the backup copies are distributed across other data nodes. The data node writes a log to the hard disk for recovering the in-memory data. The storage module judges whether the size of the data in memory exceeds a set threshold and, if so, reorganizes the data and then compresses it. The data is reorganized mainly by sorting it according to its feature values and the similarity between data items, so that the most similar data is stored contiguously in preparation for the subsequent compressed storage. After reorganization, because similar data is stored together, the LZAM algorithm is used to compress it to obtain a higher compression ratio; the compressed data is then written to the hard disk, and the corresponding log files used for recovering the in-memory data are deleted.
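The threshold check in the storage module might look like the following Python sketch. The class `DataNodeBuffer`, its file names, and the length-prefixed log format are hypothetical; reorganization and compression are deliberately left out so that only the buffer, log, and flush cycle is shown.

```python
import os
import tempfile


class DataNodeBuffer:
    """In-memory fragment store backed by a recovery log (illustrative only)."""

    def __init__(self, directory, threshold_bytes):
        self.log_path = os.path.join(directory, "recovery.log")
        self.data_path = os.path.join(directory, "data.bin")
        self.threshold_bytes = threshold_bytes
        self.fragments = []
        self.size = 0

    def receive(self, fragment: bytes):
        # Log first, so the fragment can be recovered if memory is lost.
        with open(self.log_path, "ab") as log:
            log.write(len(fragment).to_bytes(4, "big") + fragment)
        self.fragments.append(fragment)
        self.size += len(fragment)
        if self.size > self.threshold_bytes:
            self.flush()

    def flush(self):
        # Reorganization and compression would happen here; omitted in this sketch.
        with open(self.data_path, "ab") as out:
            for fragment in self.fragments:
                out.write(fragment)
        self.fragments.clear()
        self.size = 0
        os.remove(self.log_path)  # persisted data no longer needs the log


with tempfile.TemporaryDirectory() as tmp:
    node = DataNodeBuffer(tmp, threshold_bytes=10)
    node.receive(b"12345")       # 5 bytes: stays in memory, log exists
    log_exists_before = os.path.exists(node.log_path)
    node.receive(b"678901")      # 11 bytes in total: exceeds the threshold
    log_exists_after = os.path.exists(node.log_path)
    with open(node.data_path, "rb") as f:
        persisted = f.read()
```

Deleting the log only after the data file has been written keeps at least one durable copy of every fragment at all times, which is the property the log is there to provide.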
The retrieval module 4 is configured to retrieve the corresponding data after the management node receives a data retrieval request. The retrieval module specifically comprises a positioning unit, a failure judgment unit, a request distribution unit, a retrieval unit, and a result merging unit.
Specifically, the management node locates, through the positioning unit, the data nodes involved in the data retrieval request; the management node uses a Lease mechanism, through the failure judgment unit, to determine whether such a data node has failed, and directly returns a request failure message if it has; if the node is valid, the management node distributes the request to the corresponding node through the request distribution unit; after receiving the data retrieval request, the data node retrieves the corresponding data through the retrieval unit and returns the result to the client; the client uses the result merging unit to merge the received results.
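A lease-based failure check can be sketched in Python as follows. The patent names the Lease mechanism without giving details, so the `LeaseTable` class, the renewal-by-heartbeat model, and the explicit `now` arguments (used here to keep the example deterministic) are assumptions.

```python
class LeaseTable:
    """Management-node view of which data nodes hold a valid lease."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.expiry = {}  # node name -> time at which its lease runs out

    def renew(self, node, now):
        """Called when a data node's heartbeat arrives."""
        self.expiry[node] = now + self.lease_seconds

    def is_alive(self, node, now):
        return self.expiry.get(node, float("-inf")) > now

    def dispatch(self, nodes, now):
        """Return the nodes to forward a request to, or raise if any has failed."""
        failed = [n for n in nodes if not self.is_alive(n, now)]
        if failed:
            raise RuntimeError(f"request failed: lease expired for {failed}")
        return list(nodes)


leases = LeaseTable(lease_seconds=10)
leases.renew("node-a", now=0)
leases.renew("node-b", now=0)
leases.renew("node-a", now=8)    # node-a renews; node-b stays silent

ok = leases.dispatch(["node-a"], now=12)
try:
    leases.dispatch(["node-a", "node-b"], now=12)
    failure = None
except RuntimeError as exc:
    failure = str(exc)
```

With a lease, the management node never has to probe a data node at request time: a node that has not renewed within the lease period is simply treated as failed, and the request fails fast.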
The composition of the data distributed storage system of the invention has been described above. Along with the system, the invention also discloses a data distributed storage method; referring to Fig. 2 and Fig. 4, the method comprises the following steps:
[Step S1] Node cluster step (that is, the registration step): the data nodes in the cluster are connected to the corresponding management node. The connection can be completed by registration; for example, a client sends registration information and the data nodes in the cluster are registered with the management node.
[Step S2] Data import step: the input data is scanned in data blocks of a set size and loaded into memory, the data in memory is grouped according to the feature values of the data, and the grouped data is then sent to the corresponding data nodes. With reference to Fig. 3, the data import step specifically comprises:
Step S21, data scanning step: the input data is scanned in data blocks of a set size and loaded into memory;
Step S22, grouping rule matching step: different rules for computing the feature values of the data are set according to the data type;
Step S23, data grouping step: the scanned data blocks of the set size are grouped according to the features of the data;
Step S24, data sending step: the grouped data is sent to the corresponding data nodes.
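Scanning the input in blocks of a set size (step S21) can be sketched in Python as below. The newline-delimited record format and the function name `scan_blocks` are assumptions; the patent only states that the input is read block by block. The sketch carries a partial record over to the next block so that no record is split.

```python
import io


def scan_blocks(stream, block_size):
    """Yield lists of complete lines, reading block_size bytes at a time."""
    leftover = b""
    while True:
        chunk = stream.read(block_size)
        if not chunk:
            break
        lines = (leftover + chunk).split(b"\n")
        leftover = lines.pop()  # the last piece may be a partial record
        if lines:
            yield lines
    if leftover:
        yield [leftover]


source = io.BytesIO(b"r1,east\nr2,west\nr3,east\nr4,north\n")
blocks = list(scan_blocks(source, block_size=16))
records = [line for block in blocks for line in block]
```

Each yielded block can then be passed to the rule matching and grouping steps while the next block is being read, so memory use is bounded by the block size rather than by the size of the input.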
[Step S3] Storage step: as shown in Fig. 4, after a data node receives a file fragment, the data fragment is kept in memory, and it is judged whether the data needs to be backed up to other data nodes; if so, the backup is performed.
The backup step comprises: after the data has been transmitted to the corresponding data node, the data is backed up according to a set number of backups, and the backup copies are distributed across other data nodes. The data node writes a log to the hard disk for recovering the in-memory data.
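One possible way to place the backup copies is sketched below in Python. The patent only requires that the copies go to other data nodes; walking the node list as a ring starting after the primary is an assumed placement policy, and `choose_backup_nodes` is a hypothetical name.

```python
def choose_backup_nodes(primary, nodes, backup_count):
    """Pick backup_count nodes other than the primary, walking the list as a ring."""
    if backup_count > len(nodes) - 1:
        raise ValueError("not enough other data nodes for the requested backups")
    start = nodes.index(primary)
    return [nodes[(start + i) % len(nodes)] for i in range(1, backup_count + 1)]


cluster = ["node-a", "node-b", "node-c", "node-d"]
replicas = choose_backup_nodes("node-c", cluster, backup_count=2)
```

Because the primary is never selected and each offset maps to a distinct node, the copies of a fragment always sit on different machines, which is what allows a single node failure to be tolerated.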
It is judged whether the size of the data in memory exceeds a set threshold; if so, the data is reorganized and then compressed. The data is reorganized mainly by sorting it according to its feature values and the similarity between data items, so that the most similar data is stored contiguously in preparation for the subsequent compressed storage. After reorganization, because similar data is stored together, the LZAM algorithm is used to compress it to obtain a higher compression ratio; the compressed data is then written to the hard disk, and the corresponding log files used for recovering the in-memory data are deleted.
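The reorganize-then-compress idea can be sketched with Python's standard `lzma` module. This assumes that the "LZAM" algorithm named in the text refers to LZMA, which the source does not confirm; the record layout of `(feature_value, payload)` pairs and the function names are likewise hypothetical.

```python
import lzma


def reorganize_and_compress(records):
    """Sort (feature_value, payload) pairs so similar records are adjacent, then compress."""
    ordered = sorted(records, key=lambda rec: rec[0])
    blob = b"\n".join(payload for _, payload in ordered)
    return lzma.compress(blob)  # assumption: LZMA stands in for the patent's "LZAM"


def decompress_records(blob):
    """Recover the reorganized payloads from a compressed blob."""
    return lzma.decompress(blob).split(b"\n")


in_memory = [
    (3, b"region=west;status=open"),
    (1, b"region=east;status=open"),
    (3, b"region=west;status=closed"),
    (1, b"region=east;status=closed"),
]
compressed = reorganize_and_compress(in_memory)
restored = decompress_records(compressed)
```

Dictionary-based compressors encode repeated byte sequences as references to earlier occurrences, so placing similar records next to each other tends to give the compressor more nearby matches to exploit.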
[Step S4] Retrieval step: the corresponding data is retrieved after the management node receives a data retrieval request. Referring to Fig. 5, the retrieval step specifically comprises:
Step S40: the client sends the data retrieval request to the management node;
Step S41: the management node locates the data nodes involved in the data retrieval request;
Step S42: the management node uses a Lease mechanism to determine whether such a data node has failed, and directly returns a request failure message if it has; if the node is valid, the management node distributes the request to the corresponding node;
Step S43: after receiving the data retrieval request, the data node retrieves the corresponding data and returns the result to the client;
Step S44: the client merges the received results.
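Steps S43 and S44 can be illustrated with the following Python sketch, in which each data node answers a key-range request and the client merges the partial results. The in-process `NODE_DATA` table, the sorted-by-key layout, and the function names are assumptions standing in for real network calls.

```python
import heapq

# Hypothetical per-node storage: each data node keeps its rows sorted by key.
NODE_DATA = {
    "node-a": [(1, "alice"), (4, "dave")],
    "node-b": [(2, "bob"), (5, "erin")],
    "node-c": [(3, "carol")],
}


def query_node(node, low, high):
    """Stand-in for a data node answering a key-range retrieval request."""
    return [row for row in NODE_DATA[node] if low <= row[0] <= high]


def retrieve(nodes, low, high):
    """Client side: collect each node's partial result and merge them in key order."""
    partials = [query_node(node, low, high) for node in nodes]
    return list(heapq.merge(*partials))


merged = retrieve(["node-a", "node-b", "node-c"], low=2, high=4)
```

Since every partial result is already sorted, `heapq.merge` produces the combined result in a single pass without re-sorting the rows on the client.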
In summary, the data distributed storage system and method proposed by the invention realize a cluster based on in-memory computing, enable real-time transaction management of large-scale data, and improve the response time of the system. On each data node, the in-memory data is backed up to disk, which ensures the safety of the data on a single machine. In addition, the system adopts a redundant design: every piece of data has redundant backups on different nodes, so the crash of any one node does not affect data integrity or system availability.
The description and application of the invention given here are illustrative and are not intended to limit the scope of the invention to the above embodiments. Variations and changes of the embodiments disclosed here are possible, and alternatives to and equivalents of the various components of the embodiments are known to those of ordinary skill in the art. Those skilled in the art should understand that the invention may be realized in other forms, structures, arrangements, and proportions, and with other components, materials, and parts, without departing from its spirit or essential characteristics. Other variations and changes may be made to the embodiments disclosed here without departing from the scope and spirit of the invention.

Claims (9)

1. A data distributed storage system, characterized in that the system comprises:
a registration module, configured to register the data nodes in the cluster with the management node through a client;
a data import module, configured to scan the input data in data blocks of a set size and load it into memory, group the data in memory according to the feature values of the data, and then send the grouped data to the corresponding data nodes; the data import module specifically comprises a data splitting unit, a data scanning unit, a grouping rule matching unit, a data grouping unit, and a data sending unit; the data splitting unit is configured to scan the input data in data blocks of a set size and load it into memory; the grouping rule matching unit is configured to set, according to the data type, different rules for computing the feature values of the data; the data grouping unit is configured to group the scanned data blocks of the set size in memory according to the feature values of the data; the data sending unit sends the grouped data to the corresponding data nodes;
a storage module, configured to keep the data fragment in memory after a data node receives a file fragment, and to judge whether the data needs to be backed up to other data nodes, in which case the backup is performed by the backup module; the data node writes a log to the hard disk for recovering the in-memory data; the storage module judges whether the size of the data in memory exceeds a set threshold and, if so, classifies the data according to metadata features, reorganizes the data, and then compresses it; the data is reorganized mainly by sorting it according to its feature values and the similarity between data items, so that the most similar data is stored contiguously in preparation for the subsequent compressed storage; after reorganization, because similar data is stored together, the LZAM algorithm is used to compress it to obtain a higher compression ratio; the compressed data is then written to the hard disk, and the corresponding log files used for recovering the in-memory data are deleted;
a backup module, configured to back up the data according to a set number of backups after the data has been transmitted to the corresponding data node, the backup copies being distributed across other data nodes;
a retrieval module, configured to retrieve the corresponding data after the management node receives a data retrieval request; the retrieval module specifically comprises a positioning unit, a failure judgment unit, a request distribution unit, a retrieval unit, and a result merging unit; the management node locates, through the positioning unit, the data nodes involved in the data retrieval request; the management node uses a Lease mechanism, through the failure judgment unit, to determine whether such a data node has failed, and directly returns a request failure message if it has; if the node is valid, the management node distributes the request to the corresponding node through the request distribution unit; after receiving the data retrieval request, the data node retrieves the corresponding data through the retrieval unit and returns the result to the client; the client uses the result merging unit to merge the received results.
2. A data distributed storage system, characterized in that the system comprises:
a node cluster module, configured to connect the data nodes in the cluster to the corresponding management node;
a data import module, configured to scan the input data in data blocks of a set size and load it into memory, group the data in memory according to the feature values of the data, and then send the grouped data to the corresponding data nodes;
a storage module, in which, after a data node receives a data fragment, the data fragment is kept in memory and the data node writes a log to the hard disk for recovering the in-memory data; the storage module judges whether the size of the data in memory exceeds a set threshold and, if so, reorganizes the data, compresses it, writes it to the hard disk, and deletes the corresponding log files used for recovering the in-memory data.
3. The data distributed storage system according to claim 2, characterized in that:
the data import module specifically comprises a data splitting unit, a file scanning unit, a grouping rule matching unit, a data grouping unit, and a data sending unit;
the data splitting unit is configured to scan the input data in data blocks of a set size and load it into memory; the grouping rule matching unit is configured to set, according to the data type, different rules for computing the feature values of the data; the data grouping unit is configured to group the scanned data blocks of the set size according to the features of the data; the data sending unit sends the grouped data to the corresponding data nodes.
4. The data distributed storage system according to claim 2, characterized in that:
the system further comprises a backup module, configured to back up the data according to a set number of backups after the data has been transmitted to the corresponding data node, the backup copies being distributed across other data nodes.
5. The data distributed storage system according to claim 2, characterized in that:
the system further comprises a retrieval module, configured to retrieve the corresponding data after the management node receives a data retrieval request;
the retrieval module specifically comprises a positioning unit, a failure judgment unit, a request distribution unit, a retrieval unit, and a result merging unit;
the management node locates, through the positioning unit, the data nodes involved in the data retrieval request; the management node uses a Lease mechanism, through the failure judgment unit, to determine whether such a data node has failed, and directly returns a request failure message if it has; if the node is valid, the management node distributes the request to the corresponding node through the request distribution unit; after receiving the data retrieval request, the data node retrieves the corresponding data through the retrieval unit and returns the result to the client; the client uses the result merging unit to merge the received results.
6. A data distributed storage method, characterized in that the method comprises the following steps:
a node cluster step: the data nodes in the cluster are connected to the corresponding management node;
a data import step: the input data is scanned in data blocks of a set size and loaded into memory, the data in memory is grouped according to the feature values of the data, and the grouped data is then sent to the corresponding data nodes;
a storage step: after a data node receives a file fragment, the data fragment is kept in memory, and the data node writes a log to the hard disk for recovering the in-memory data; it is judged whether the size of the data in memory exceeds a set threshold and, if so, the data is reorganized, compressed, and written to the hard disk, and the corresponding log files used for recovering the in-memory data are deleted.
7. data distributed storage method according to claim 6, is characterized in that:
Described data importing step comprises:
Data scanning step, scans and is written into internal memory to the data of input according to setting big or small data block;
Packet rule match step, sets the eigenwert of different regular computational datas according to different data types;
Packet step, divides into groups the data block of the setting size through overscanning according to the feature of data;
Data sending step, is sent to corresponding back end by the data after grouping.
8. data distributed storage method according to claim 6, is characterized in that:
Described method also comprises backup-step:, to after on corresponding back end these data are backed up according to the backup number of setting in data transmission, the data of backup will be distributed on other back end.
9. The distributed data storage method according to claim 6, characterized in that:
The method further comprises a retrieval step: after the management node receives a data retrieval request, the corresponding data is retrieved;
The retrieval step specifically comprises:
The management node locates the data nodes involved in the data retrieval request;
The management node uses the Lease mechanism to determine whether each such data node has failed; if it has failed, request-failure information is returned directly; if it is valid, the management node dispatches the request to the corresponding node;
After receiving the data retrieval request, the data node retrieves the corresponding data and returns the result to the client;
The client merges the received results.
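The retrieval flow, including the lease check, can be sketched roughly as below. Everything named here is hypothetical: `ManagementNode`, the `placement` map from feature value to node, and the modelling of each data node as a callable are illustrative choices. As a simplification, the partial results are returned through the management node rather than sent from each data node directly to the client, and the merge is a plain concatenate-and-sort.

```python
import time


class ManagementNode:
    """Hypothetical management node that tracks leases held by data nodes."""

    def __init__(self, lease_seconds, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock
        self.lease_expiry = {}  # node id -> time at which its lease lapses
        self.placement = {}     # feature value -> node id
        self.data_nodes = {}    # node id -> callable(query) returning rows

    def renew_lease(self, node_id):
        # A live data node is expected to call this periodically.
        self.lease_expiry[node_id] = self.clock() + self.lease_seconds

    def is_alive(self, node_id):
        return self.lease_expiry.get(node_id, float("-inf")) > self.clock()

    def retrieve(self, features, query):
        # Locate the data nodes involved in the request.
        node_ids = sorted({self.placement[f] for f in features})
        # Fail fast if any involved node no longer holds a valid lease.
        for node_id in node_ids:
            if not self.is_alive(node_id):
                raise RuntimeError(f"data node {node_id} has no valid lease")
        # Dispatch the request to each node and collect the partial results.
        return [self.data_nodes[n](query) for n in node_ids]


def merge_results(partials):
    """Client-side merge: concatenate the partial results and sort them."""
    return sorted(row for part in partials for row in part)
```

Passing the clock in as a parameter makes lease expiry easy to exercise in tests without waiting for real time to pass.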
CN201310150539.0A 2013-04-26 2013-04-26 Data distributed storage system and method Expired - Fee Related CN104123300B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310150539.0A CN104123300B (en) 2013-04-26 2013-04-26 Data distribution formula storage system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310150539.0A CN104123300B (en) 2013-04-26 2013-04-26 Data distribution formula storage system and method

Publications (2)

Publication Number Publication Date
CN104123300A true CN104123300A (en) 2014-10-29
CN104123300B CN104123300B (en) 2017-10-13

Family

ID=51768713

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310150539.0A Expired - Fee Related CN104123300B (en) 2013-04-26 2013-04-26 Data distributed storage system and method

Country Status (1)

Country Link
CN (1) CN104123300B (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104572987A (en) * 2015-01-04 2015-04-29 浙江大学 Method and system for improving storage efficiency of simple regenerating codes by compression
CN104679847A (en) * 2015-02-13 2015-06-03 王磊 Method and equipment for building online real-time updating mass audio fingerprint database
CN104731676A (en) * 2015-03-24 2015-06-24 浪潮集团有限公司 Method for accelerating data recovery of cluster system
CN105159818A (en) * 2015-08-28 2015-12-16 东北大学 Log recovery method in memory data management and log recovery simulation system in memory data management
CN105335513A (en) * 2015-10-30 2016-02-17 迈普通信技术股份有限公司 Distributed file system and file storage method
WO2016095791A1 (en) * 2014-12-19 2016-06-23 Huawei Technologies Co., Ltd. Replicated database distribution for workload balancing after cluster reconfiguration
CN105912601A (en) * 2016-04-05 2016-08-31 国电南瑞科技股份有限公司 Partition storage method for distributed real-time memory database of energy management system
CN106648442A (en) * 2015-10-29 2017-05-10 阿里巴巴集团控股有限公司 Metadata node internal memory mirroring method and device
CN106649481A (en) * 2016-09-30 2017-05-10 郑州云海信息技术有限公司 A method and system of log optimization for SQL Server database
WO2017092384A1 (en) * 2015-12-01 2017-06-08 深圳市华讯方舟软件技术有限公司 Clustered database distributed storage method and device
CN106886555A (en) * 2016-12-27 2017-06-23 苏州春禄电子科技有限公司 A kind of anti-loss of data based on block chain technology and the data-storage system for damaging
CN107203554A (en) * 2016-03-17 2017-09-26 北大方正集团有限公司 A kind of distributed search method and device
CN107436738A (en) * 2017-08-17 2017-12-05 北京理工大学 A kind of data storage method and system
WO2017215339A1 (en) * 2016-06-14 2017-12-21 武汉斗鱼网络科技有限公司 Search cluster optimisation method and system based on rbf neural network
CN108664223A (en) * 2018-05-18 2018-10-16 百度在线网络技术(北京)有限公司 A kind of distributed storage method, device, computer equipment and storage medium
CN108920215A (en) * 2018-07-18 2018-11-30 郑州云海信息技术有限公司 A method of passing through initramfs collection system log
CN108921728A (en) * 2018-07-03 2018-11-30 北京科东电力控制系统有限责任公司 Distributed real-time database system based on power network dispatching system
CN108984686A (en) * 2018-07-02 2018-12-11 中国电子科技集团公司第五十二研究所 A kind of distributed file system indexing means and device merged based on log
CN109360605A (en) * 2018-09-25 2019-02-19 安吉康尔(深圳)科技有限公司 Gene order-checking data archiving method, server and computer readable storage medium
CN109522310A (en) * 2018-11-16 2019-03-26 北京锐安科技有限公司 Data storage, search method, system and storage medium
CN109885536A (en) * 2019-02-26 2019-06-14 深圳众享互联科技有限公司 One kind is based on the storage of distributed data fragment and fuzzy search method
CN110019210A (en) * 2017-11-24 2019-07-16 阿里巴巴集团控股有限公司 Method for writing data and equipment
CN110069483A (en) * 2017-08-17 2019-07-30 阿里巴巴集团控股有限公司 Loading data is to the method for Distributed Data Warehouse, node and system
CN114281604A (en) * 2022-03-02 2022-04-05 北京金山云网络技术有限公司 Data recovery method and device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101079896A (en) * 2007-06-22 2007-11-28 西安交通大学 A multi-availability mechanism coexistence framework of concurrent storage system
US20120150824A1 (en) * 2010-12-10 2012-06-14 Inventec Corporation Processing System of Data De-Duplication
CN102906751A (en) * 2012-07-25 2013-01-30 华为技术有限公司 Method and device for data storage and data query
CN103020077A (en) * 2011-09-24 2013-04-03 国家电网公司 Method for managing memory of real-time database of power system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101079896A (en) * 2007-06-22 2007-11-28 西安交通大学 A multi-availability mechanism coexistence framework of concurrent storage system
US20120150824A1 (en) * 2010-12-10 2012-06-14 Inventec Corporation Processing System of Data De-Duplication
CN103020077A (en) * 2011-09-24 2013-04-03 国家电网公司 Method for managing memory of real-time database of power system
CN102906751A (en) * 2012-07-25 2013-01-30 华为技术有限公司 Method and device for data storage and data query

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
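Huang Chongmin (黄翀民): "Research and Optimization of Distributed File Systems in Search Engines", China Master's Theses Full-text Database, Information Science and Technology *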

Cited By (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016095791A1 (en) * 2014-12-19 2016-06-23 Huawei Technologies Co., Ltd. Replicated database distribution for workload balancing after cluster reconfiguration
CN107005596A (en) * 2014-12-19 2017-08-01 华为技术有限公司 Replicated database distribution for workload balancing after cluster reconfiguration
US10102086B2 (en) 2014-12-19 2018-10-16 Futurewei Technologies, Inc. Replicated database distribution for workload balancing after cluster reconfiguration
CN104572987A (en) * 2015-01-04 2015-04-29 浙江大学 Method and system for improving storage efficiency of simple regenerating codes by compression
CN104572987B (en) * 2015-01-04 2017-12-22 浙江大学 A kind of method and system that simple regeneration code storage efficiency is improved by compressing
CN104679847B (en) * 2015-02-13 2019-03-15 高第网络技术(北京)有限公司 A kind of method and apparatus constructing online real-time update magnanimity audio-frequency fingerprint library
CN104679847A (en) * 2015-02-13 2015-06-03 王磊 Method and equipment for building online real-time updating mass audio fingerprint database
CN104731676A (en) * 2015-03-24 2015-06-24 浪潮集团有限公司 Method for accelerating data recovery of cluster system
CN105159818A (en) * 2015-08-28 2015-12-16 东北大学 Log recovery method in memory data management and log recovery simulation system in memory data management
CN105159818B (en) * 2015-08-28 2018-01-02 东北大学 Journal recovery method and its analogue system in main-memory data management
CN106648442A (en) * 2015-10-29 2017-05-10 阿里巴巴集团控股有限公司 Metadata node internal memory mirroring method and device
CN105335513A (en) * 2015-10-30 2016-02-17 迈普通信技术股份有限公司 Distributed file system and file storage method
CN105335513B (en) * 2015-10-30 2018-09-25 迈普通信技术股份有限公司 A kind of distributed file system and file memory method
WO2017092384A1 (en) * 2015-12-01 2017-06-08 深圳市华讯方舟软件技术有限公司 Clustered database distributed storage method and device
CN107203554A (en) * 2016-03-17 2017-09-26 北大方正集团有限公司 A kind of distributed search method and device
CN105912601A (en) * 2016-04-05 2016-08-31 国电南瑞科技股份有限公司 Partition storage method for distributed real-time memory database of energy management system
WO2017173842A1 (en) * 2016-04-05 2017-10-12 国电南瑞科技股份有限公司 Method of partitioning and storing real-time distributed database in memory in energy management system
WO2017215339A1 (en) * 2016-06-14 2017-12-21 武汉斗鱼网络科技有限公司 Search cluster optimisation method and system based on rbf neural network
CN106649481A (en) * 2016-09-30 2017-05-10 郑州云海信息技术有限公司 A method and system of log optimization for SQL Server database
CN106886555A (en) * 2016-12-27 2017-06-23 苏州春禄电子科技有限公司 A kind of anti-loss of data based on block chain technology and the data-storage system for damaging
CN110069483A (en) * 2017-08-17 2019-07-30 阿里巴巴集团控股有限公司 Loading data is to the method for Distributed Data Warehouse, node and system
CN110069483B (en) * 2017-08-17 2023-04-28 阿里巴巴集团控股有限公司 Method, node and system for loading data into distributed data warehouse
CN107436738B (en) * 2017-08-17 2019-10-25 北京理工大学 A kind of data storage method and system
CN107436738A (en) * 2017-08-17 2017-12-05 北京理工大学 A kind of data storage method and system
CN110019210A (en) * 2017-11-24 2019-07-16 阿里巴巴集团控股有限公司 Method for writing data and equipment
CN110019210B (en) * 2017-11-24 2024-01-09 阿里云计算有限公司 Data writing method and device
CN108664223A (en) * 2018-05-18 2018-10-16 百度在线网络技术(北京)有限公司 A kind of distributed storage method, device, computer equipment and storage medium
US11842072B2 (en) 2018-05-18 2023-12-12 Baidu Online Network Technology (Beijing) Co., Ltd. Distributed storage method and apparatus, computer device, and storage medium
CN108984686B (en) * 2018-07-02 2021-03-30 中国电子科技集团公司第五十二研究所 Distributed file system indexing method and device based on log merging
CN108984686A (en) * 2018-07-02 2018-12-11 中国电子科技集团公司第五十二研究所 A kind of distributed file system indexing means and device merged based on log
CN108921728A (en) * 2018-07-03 2018-11-30 北京科东电力控制系统有限责任公司 Distributed real-time database system based on power network dispatching system
CN108920215A (en) * 2018-07-18 2018-11-30 郑州云海信息技术有限公司 A method of passing through initramfs collection system log
CN109360605A (en) * 2018-09-25 2019-02-19 安吉康尔(深圳)科技有限公司 Gene order-checking data archiving method, server and computer readable storage medium
CN109522310A (en) * 2018-11-16 2019-03-26 北京锐安科技有限公司 Data storage, search method, system and storage medium
CN109885536A (en) * 2019-02-26 2019-06-14 深圳众享互联科技有限公司 One kind is based on the storage of distributed data fragment and fuzzy search method
CN114281604A (en) * 2022-03-02 2022-04-05 北京金山云网络技术有限公司 Data recovery method and device, electronic equipment and storage medium
CN114281604B (en) * 2022-03-02 2022-07-29 北京金山云网络技术有限公司 Data recovery method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN104123300B (en) 2017-10-13

Similar Documents

Publication Publication Date Title
CN104123300A (en) Data distributed storage system and method
CN1318974C (en) Method for compression and search of database backup data
CN101820426B (en) Data compression method in on-line backup service software
CN104932956B (en) A kind of cloud disaster-tolerant backup method towards big data
CN102906751B (en) A kind of method of data storage, data query and device
CN101334797B (en) Distributed file systems and its data block consistency managing method
CN102419766B (en) Data redundancy and file operation methods based on Hadoop distributed file system (HDFS)
CN101630282B (en) Data backup method based on Erasure coding and copying technology
CN102456059A (en) Data deduplication processing system
CN103733195A (en) Managing storage of data for range-based searching
CN104199816A (en) Managing storage of individually accessible data units
CN102591947A (en) Fast and low-RAM-footprint indexing for data deduplication
CN102411637A (en) Metadata management method of distributed file system
CN103279502B (en) A kind of framework and method with the data de-duplication file system be combined with parallel file system
CN102722583A (en) Hardware accelerating device for data de-duplication and method
CN103067525A (en) Cloud storage data backup method based on characteristic codes
CN102469142A (en) Data transmission method for data deduplication program
CN105893169A (en) File storage method and system based on erasure codes
CN109446267B (en) Cross-database data integration system and method based on 95598 ex-situ double-active disaster recovery model
CN116233111A (en) Minio-based large file uploading method
CN103019891A (en) Method and system for restoring deleted file
CN105353988A (en) Metadata reading and writing method and device
CN1851691A (en) Database back-up data compression and search method
CN102385624B (en) DFS (distributed file system)-oriented log data organization method
CN103207916A (en) Metadata processing method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20171013

Termination date: 20180426
