CN109521958A - Delay processing method and device for data distribution - Google Patents

Delay processing method and device for data distribution

Info

Publication number
CN109521958A
Authority
CN
China
Prior art keywords
cluster
delay timer
data
timing
data distribution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811232307.9A
Other languages
Chinese (zh)
Other versions
CN109521958B (en
Inventor
甄天桥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201811232307.9A
Publication of CN109521958A
Application granted
Publication of CN109521958B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0602: Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0626: Reducing size or complexity of storage systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0628: Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646: Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647: Migration mechanisms
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0668: Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067: Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

This application discloses a delay processing method and device for data distribution. The method includes: when a first trigger event that triggers data distribution in a cluster is detected, starting a delay timer; before the timed duration of the delay timer reaches a preset duration, detecting whether a second trigger event that triggers data distribution in the cluster occurs; if a second trigger event occurs before the timed duration reaches the preset duration, restarting the delay timer; and, when the timed duration of the delay timer reaches the preset duration, redistributing the data in the cluster. Because all trigger events that occur within a short time are merged, the cluster needs to perform data distribution only once, which reduces the computing resources the cluster consumes.

Description

Delay processing method and device for data distribution
Technical Field
This application relates to the technical field of data processing, and in particular to a delay processing method and device for data distribution.
Background
In each cluster of a distributed storage system, when a hard disk is pulled out, the data stored on that disk is usually first set to a degraded state, an abnormal state in which the data waits to be processed. When the disk has been out for longer than a certain threshold, it is considered to have permanently exited its cluster; the data it held is restored to other hard disks by a data recovery process, and the data in the cluster is redistributed. Conversely, when a hard disk considered to have permanently exited rejoins the cluster, or another new hard disk is added to the cluster, a data balancing process balances the cluster's data onto the newly added disk, which completes the redistribution of the cluster data. In other words, whenever a hard disk permanently exits the cluster or a new hard disk joins it, the data on the cluster is usually redistributed.
In practice, several hard disks are often pulled out of a cluster and/or inserted into it within a short time, and these removals and insertions do not happen at the same instant: there is some interval between one disk being pulled or inserted and the next. As a result, the cluster performs data distribution several times within a short period, which consumes considerable computing resources. Moreover, some of the data is migrated between different hard disks several times to no purpose, which wastes the cluster's computing resources.
Summary of the Invention
The embodiments of this application provide a delay processing method and device for data distribution, so as to avoid wasting cluster computing resources through repeated migration of the data in a cluster.
In a first aspect, an embodiment of this application provides a delay processing method for data distribution, the method comprising:
When a first trigger event that triggers data distribution in a cluster is detected, starting a delay timer;
Before the timed duration of the delay timer reaches a preset duration, detecting whether a second trigger event that triggers data distribution in the cluster occurs;
If it is determined that a second trigger event that triggers data distribution in the cluster occurs before the timed duration of the delay timer reaches the preset duration, restarting the delay timer;
When the timed duration of the delay timer reaches the preset duration, redistributing the data in the cluster.
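The four steps above amount to a restartable timer. The following is a minimal Python sketch under stated assumptions, not the patent's implementation: the class name `DelayTimer`, its methods `on_trigger` and `poll`, and the 10-second preset are all hypothetical, and timestamps are passed in explicitly so the behaviour can be followed without a real clock.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DelayTimer:
    """Hypothetical restartable delay timer driven by caller-supplied timestamps (seconds)."""
    preset: float                       # the "preset duration" from the text
    started_at: Optional[float] = None  # None means the timer is idle

    def on_trigger(self, now: float) -> None:
        # The first event starts the timer; any later event restarts it.
        self.started_at = now

    def poll(self, now: float) -> bool:
        """Return True once, when the preset duration has elapsed since the last (re)start."""
        if self.started_at is None or now - self.started_at < self.preset:
            return False
        self.started_at = None          # idle again until the next trigger event
        return True


timer = DelayTimer(preset=10.0)
timer.on_trigger(now=0.0)             # first trigger event: start timing
timer.on_trigger(now=4.0)             # second trigger event: restart timing
fired_early = timer.poll(now=10.0)    # only 6 s since the restart, so no redistribution yet
fired_late = timer.poll(now=14.0)     # 10 s since the restart, so redistribute once
```

Injecting the timestamp rather than reading a system clock is purely a choice made for this sketch; a real master node would more likely use a timer facility of its runtime.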
In some possible embodiments, detecting the first trigger event that triggers data distribution comprises:
Detecting that a first external memory exits or joins the cluster;
Detecting whether a second trigger event that triggers data distribution in the cluster occurs comprises:
Detecting whether a second external memory exits or joins the cluster.
In some possible embodiments, the external memory is specifically a hard disk.
In some possible embodiments, the method further comprises:
When the delay timer is started, recording a first version identifier of the cluster's external memory state record file;
When the timed duration of the delay timer reaches the preset duration, obtaining a second version identifier of the cluster's external memory state record file;
In that case, redistributing the data in the cluster when the timed duration of the delay timer reaches the preset duration comprises:
When the timed duration of the delay timer reaches the preset duration and the first version identifier and the second version identifier are inconsistent, redistributing the data in the cluster.
In some possible embodiments, the method further comprises:
When the first version identifier is consistent with the second version identifier, restarting the delay timer.
In some possible embodiments, the second version identifier is greater than the first version identifier, and the method further comprises:
After the data in the cluster has been redistributed, resetting the version number of the external memory state record file to zero.
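The version check performed when the timer expires can be sketched as a small decision function. This is an illustrative Python sketch only: the names `Action`, `on_timer_expired`, and `version_after_redistribution` are hypothetical, and representing version identifiers as integers is an assumption made for the example.

```python
from enum import Enum


class Action(Enum):
    REDISTRIBUTE = "redistribute"
    RESTART_TIMER = "restart_timer"


def on_timer_expired(first_version: int, second_version: int) -> Action:
    """Decide what to do once the delay timer has reached the preset duration.

    first_version:  identifier recorded when the delay timer was started.
    second_version: identifier read from the state record file at expiry.
    """
    if second_version != first_version:
        return Action.REDISTRIBUTE      # identifiers inconsistent: redistribute the data
    return Action.RESTART_TIMER         # identifiers consistent: restart the delay timer


def version_after_redistribution() -> int:
    """Version number to store once redistribution has completed (reset to zero)."""
    return 0
```

For example, `on_timer_expired(1, 2)` selects redistribution, while `on_timer_expired(2, 2)` asks for the timer to be restarted.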
In some possible embodiments, redistributing the data comprises:
Determining a third external memory that joined the cluster within a target period, the target period running from a first moment, at which the first trigger event is detected, to a second moment, at which the timed duration of the delay timer is determined to have reached the preset duration;
Distributing part of the data in the cluster onto the third external memory;
Determining a fourth external memory that exited the cluster within the target period;
Restoring the data stored on the fourth external memory to a fifth external memory in the cluster.
In a second aspect, an embodiment of this application further provides a delay processing device for data distribution, the device comprising:
A start unit, configured to start a delay timer when a first trigger event that triggers data distribution in a cluster is detected;
A detection unit, configured to detect, before the timed duration of the delay timer reaches a preset duration, whether a second trigger event that triggers data distribution in the cluster occurs;
A first restart unit, configured to restart the delay timer if it is determined that a second trigger event that triggers data distribution in the cluster occurs before the timed duration of the delay timer reaches the preset duration;
A data distribution unit, configured to redistribute the data in the cluster when the timed duration of the delay timer reaches the preset duration.
In some possible embodiments, detecting the first trigger event that triggers data distribution is specifically detecting that a first external memory exits or joins the cluster;
The detection unit is specifically configured to detect whether a second external memory exits or joins the cluster.
In some possible embodiments, the external memory is specifically a hard disk.
In some possible embodiments, the device further comprises:
A recording unit, configured to record, when the delay timer is started, a first version identifier that characterizes the state of the external memories in the cluster;
An acquiring unit, configured to obtain, when the timed duration of the delay timer reaches the preset duration, a second version identifier that characterizes the state of the external memories in the cluster;
In that case, the data distribution unit is specifically configured to redistribute the data in the cluster when the timed duration of the delay timer reaches the preset duration and the first version identifier and the second version identifier are inconsistent.
In some possible embodiments, the device further comprises:
A second restart unit, configured to restart the delay timer when the first version identifier is consistent with the second version identifier.
In some possible embodiments, the second version identifier is greater than the first version identifier, and the device further comprises:
A reset unit, configured to reset the version number of the external memory state record file to zero after the data in the cluster has been redistributed.
In some possible embodiments, the data distribution unit comprises:
A first determining subunit, configured to determine a third external memory that joined the cluster within a target period, the target period running from a first moment, at which the first trigger event is detected, to a second moment, at which the timed duration of the delay timer is determined to have reached the preset duration;
A distribution subunit, configured to distribute part of the data in the cluster onto the third external memory;
A second determining subunit, configured to determine a fourth external memory that exited the cluster within the target period;
A recovery subunit, configured to restore the data stored on the fourth external memory to a fifth external memory in the cluster.
In the above implementations of the embodiments of this application, all trigger events that trigger data distribution in the cluster within a short time are merged, so that the cluster performs data distribution only once. This reduces the cluster's consumption of computing resources and reduces the waste caused when part of the data in the cluster is migrated repeatedly to no purpose. Specifically, when a first trigger event that triggers data distribution in the cluster is detected, a delay timer can be started; before the timed duration of the delay timer reaches a preset duration, it is detected whether a second trigger event that triggers data distribution in the cluster occurs; if a second trigger event occurs before the timed duration reaches the preset duration, the delay timer is restarted; and when the timed duration of the delay timer reaches the preset duration, the data in the cluster is redistributed. Because all trigger events within the short time are merged, the cluster does not have to perform data distribution once per trigger event and instead performs it exactly once, so the data in the cluster is migrated between hard disks only once. This avoids the waste of cluster computing resources caused by repeated data migration; and since a single data distribution consumes fewer computing resources than several, the performance of the cluster improves to some extent.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of this application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below evidently show only some of the embodiments of this application; a person of ordinary skill in the art can derive other drawings from them.
Fig. 1 is a schematic diagram of an application scenario in an embodiment of this application;
Fig. 2 is a schematic flowchart of a delay processing method for data distribution in an embodiment of this application;
Fig. 3 is a schematic structural diagram of a delay processing device for data distribution in an embodiment of this application.
Detailed Description
In each cluster of a distributed storage system, when a hard disk permanently exits the cluster or a new hard disk joins it, the data on the cluster usually needs to be redistributed. When several hard disks are pulled out of the cluster and/or inserted into it within a short time, these events usually do not occur at the same instant; there is some interval between one removal or insertion and the next. The cluster therefore performs data distribution several times within a short period. During those repeated distributions, part of the data in the cluster is migrated between different hard disks several times to no purpose, which wastes the cluster's computing resources; and performing data distribution several times in a short period also consumes more of the cluster's computing resources and degrades its performance.
To solve this technical problem, the embodiments of this application provide a delay processing method for data distribution. All trigger events that trigger data distribution in the cluster within a short time are merged, so that the cluster performs data distribution only once. This reduces the cluster's consumption of computing resources and reduces the waste caused when part of the data in the cluster is migrated repeatedly to no purpose. Specifically, when a first trigger event that triggers data distribution in the cluster is detected, a delay timer can be started; before the timed duration of the delay timer reaches a preset duration, it is detected whether a second trigger event that triggers data distribution in the cluster occurs; if a second trigger event occurs before the timed duration reaches the preset duration, the delay timer is restarted; and when the timed duration of the delay timer reaches the preset duration, the data in the cluster is redistributed.
Because all trigger events within the short time are merged, the cluster does not have to perform data distribution once per trigger event and instead performs it exactly once, so the data in the cluster is migrated between hard disks only once. This avoids the waste of cluster computing resources caused by repeated data migration; and since a single data distribution consumes fewer computing resources than several, the performance of the cluster improves to some extent.
For example, the embodiments of this application can be applied to the exemplary application scenario shown in Fig. 1. In this scenario, the cluster includes one master node and multiple slave nodes. The master node continuously checks whether a first hard disk joins or exits the master node or a slave node in the cluster. If so, it starts the delay timer. While the delay timer is running, if the master node also detects that a second hard disk joins or exits the master node or a slave node in the cluster, it restarts the delay timer so that timing begins again. When the timed duration of the delay timer reaches the preset duration, the master node has the data in the cluster redistributed. In this way, although several hard disks join or exit the cluster within a short time, the master node only needs to have the cluster's data distributed once, which reduces the computing resources the cluster consumes on data distribution in that short time.
It should be understood that the above scenario is only one example scenario provided by the embodiments of this application, and the embodiments are not limited to it.
To make the above objects, features, and advantages of this application clearer, various non-limiting implementations of the embodiments are described below with reference to the drawings. The described embodiments are evidently only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of this application without creative effort fall within the protection scope of this application.
Referring to Fig. 2, which shows a schematic flowchart of a delay processing method for data distribution in an embodiment of this application, the method may specifically include:
S201: When a first trigger event that triggers data distribution in the cluster is detected, start the delay timer.
In practice, data distribution in a cluster is triggered whenever an external memory such as a hard disk or floppy disk exits the cluster, or a user adds a new external memory to it. If the master node in the cluster detects such a trigger event, for example an external memory joining or exiting the cluster, the master node can start the delay timer.
S202: Before the timed duration of the delay timer reaches the preset duration, detect whether a second trigger event that triggers data distribution in the cluster occurs.
Note that in this embodiment, in order to merge all trigger events that occur in the cluster within a period of time, the master node does not redistribute the data in the cluster as soon as it detects the first trigger event; it starts the delay timer instead. While the timed duration of the delay timer has not yet reached the preset duration, the master node keeps checking whether a second trigger event that triggers data distribution occurs in the cluster.
Like the first trigger event, the second trigger event may be an event, detected by the master node, in which an external memory joins or exits the cluster and thereby triggers data distribution.
In most practical scenarios, the external memory that joins or exits the cluster is a hard disk.
S203: If it is determined that a second trigger event that triggers data distribution in the cluster occurs before the timed duration of the delay timer reaches the preset duration, restart the delay timer.
S204: When the timed duration of the delay timer reaches the preset duration, redistribute the data in the cluster.
Note that if the master node detects a second trigger event in the cluster before the timed duration of the delay timer reaches the preset duration, it again does not redistribute the data immediately; it restarts the delay timer so that timing begins again. Thus, although both a first and a second trigger event occur within a short time, neither triggers data distribution immediately; the data is redistributed only when the delay timer reaches the preset duration. The master node has in effect merged the first and second trigger events, so a cluster that would otherwise have performed data distribution twice now performs it once. This reduces the number of data distributions the cluster performs in a short time, and with it the computing resources consumed, and avoids wasting computing resources on pointless migration of part of the cluster's data between multiple hard disks.
This embodiment uses two trigger events within a short time only as an example. In practice, after the master node detects the second trigger event and restarts the delay timer, if it detects a third trigger event that triggers data distribution before the restarted timer reaches the preset duration, it can restart the delay timer once more. This process repeats until the delay timer runs for the full preset duration without the master node detecting any further trigger event, at which point the data in the cluster is redistributed.
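To make the effect of merging an arbitrary number of trigger events concrete, the sketch below computes when redistribution would run for a given list of event timestamps. It is a hypothetical illustration: the function name `redistribution_times`, the timestamps, and the 10-second preset are assumptions, and an event arriving exactly at the deadline is treated here as arriving after expiry.

```python
from typing import List, Optional


def redistribution_times(event_times: List[float], preset: float) -> List[float]:
    """Return the times at which redistribution runs when trigger events are merged.

    Each event starts or restarts the delay timer; redistribution runs once
    `preset` seconds have passed without a further event.
    """
    fire_times: List[float] = []
    deadline: Optional[float] = None
    for t in sorted(event_times):
        if deadline is not None and t >= deadline:
            fire_times.append(deadline)   # the timer expired before this event arrived
        deadline = t + preset             # start or restart the timer
    if deadline is not None:
        fire_times.append(deadline)       # the last timer eventually expires
    return fire_times


events = [0.0, 3.0, 5.0, 40.0]   # three disks swapped in quick succession, one much later
merged = redistribution_times(events, preset=10.0)
```

With these assumed inputs, the first three events collapse into one redistribution at 15 s and the isolated fourth event produces another at 50 s, so `merged` is `[15.0, 50.0]`: two redistributions rather than the four that would occur if every event triggered one immediately.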
As one specific example of redistributing the data in the cluster, the master node may first determine the third external memory that joined the cluster within a target period; there may be one or more such third external memories. The target period runs from a first moment, at which the master node detects the first trigger event, to a second moment, at which the master node determines that the timed duration of the delay timer has reached the preset duration. The master node can then, according to a preset data distribution strategy, distribute part of the data in the cluster onto the newly added third external memory, so that the data is evenly distributed across all hard disks in the cluster. Next, the master node can determine the fourth external memory that exited the cluster within the target period; likewise, there may be one or more. Finally, the master node can start the corresponding data recovery process to restore the data that was on the fourth external memory to a fifth external memory in the cluster, so that the data is not lost after the fourth external memory exits. In the end, the data is evenly distributed across the external memories in the cluster, and the redistribution is complete.
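The first and third steps of this example, working out which external memories joined and which exited during the target period, can be sketched as follows. The names `DiskEvent` and `plan_redistribution` are hypothetical, and one rule here is an assumption that the text does not state: if the same disk appears more than once inside the window, only its last event counts.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DiskEvent:
    time: float    # when the master node observed the event
    disk: str      # illustrative disk name
    joined: bool   # True = joined the cluster, False = exited the cluster


def plan_redistribution(events: List[DiskEvent], t1: float,
                        t2: float) -> Tuple[List[str], List[str]]:
    """Return (disks to balance data onto, disks whose data must be recovered).

    t1 is the moment the first trigger event was detected; t2 is the moment the
    delay timer reached the preset duration. Events outside [t1, t2] are ignored.
    """
    last_state: Dict[str, bool] = {}
    for ev in sorted(events, key=lambda e: e.time):
        if t1 <= ev.time <= t2:
            last_state[ev.disk] = ev.joined   # assumption: the last event per disk wins
    joined = sorted(d for d, j in last_state.items() if j)
    exited = sorted(d for d, j in last_state.items() if not j)
    return joined, exited


events = [
    DiskEvent(0.0, "disk-a", joined=False),
    DiskEvent(2.0, "disk-b", joined=True),
    DiskEvent(4.0, "disk-c", joined=False),
    DiskEvent(30.0, "disk-d", joined=True),   # outside the target period
]
joined, exited = plan_redistribution(events, t1=0.0, t2=14.0)
```

The two lists would then feed the data balancing and data recovery processes the text describes; those processes themselves are outside the scope of this sketch.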
In practice, when an external memory joins or exits the cluster, the record file of the cluster's external memory states changes accordingly. For example, when an external memory joins the cluster, the existing record file is updated to a new record file that includes the state of the newly added external memory. In one example, a version identifier, such as a version number, can therefore be added to the record file. After a trigger event occurs, the record file is updated on the basis of that event at the same time as the delay timer is started, so the first version identifier of the updated record file can be recorded. When the timed duration of the delay timer reaches the preset duration, the second version identifier of the cluster's current, latest record file is obtained. If the state change of the hard disk has been persisted, the second version identifier will differ from the first. Therefore, to prevent a state change that has not yet been persisted from affecting the redistribution, the version identifiers of the record file can be compared to determine whether the hard disk's state change has been persisted. If, for reasons such as heavy traffic on the master node, the state change has not been persisted, redistribution of the data in the cluster can be refused even though the timed duration of the delay timer has reached the preset duration. For example, in some embodiments the master node allows the data in the cluster to be redistributed only when it determines that the timed duration of the delay timer has reached the preset duration and that the first and second version identifiers of the record file are inconsistent.
Further, when the master node determines that the first and second version identifiers of the record file are consistent, the state change of the external memory has not been persisted, so the master node can restart the delay timer and wait for the state change to be persisted.
In practice, the version identifier of the record file can be expressed as a number. For example, the first version identifier may be something like "version 1" and the second something like "version 2". On this basis, the number of the current, latest record file's version identifier is larger than that of the previously recorded one, so when deciding whether to allow redistribution, the master node can check whether the second version identifier is greater than the first and allow redistribution only if it is. Further, to keep this round of delay processing from affecting the next, the version identifier of the record file can be reset to zero once this round completes.
In some possible application scenarios, the master node of the cluster may be switched, that is, the master role may move from the current node to another node in the cluster. If such a switch has to complete while the master node is delay-processing the cluster's data distribution, the version identifier of the record file can be used to tell the new master node to continue the delay processing, so that it is not interrupted. Specifically, the new master node checks whether the version identifier of the record file is zero. If it is, no delay processing of data distribution is currently needed. If it is not, delay processing is needed, and the new master node looks for an existing event that is waiting for delay processing. If no such event exists, it creates one, starts the delay timer, and carries out the delay processing flow; if one exists, it continues the delay processing on the basis of that event, with no additional handling.
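The check a new master node performs after a switch can be sketched as below. This is a hypothetical illustration: the class `DelayEvent`, the function `resume_after_switch`, and the `timer_running` flag are invented for the example and stand in for whatever event and timer mechanism a real system uses.

```python
from typing import Optional


class DelayEvent:
    """Hypothetical placeholder for an event that is waiting for delay processing."""

    def __init__(self) -> None:
        self.timer_running = True   # in this sketch, creating the event starts the delay timer


def resume_after_switch(version_id: int,
                        pending: Optional[DelayEvent]) -> Optional[DelayEvent]:
    """Run on the new master node once the master switch has completed.

    version_id: current version identifier of the state record file.
    pending:    an already existing waiting event, or None if there is none.
    """
    if version_id == 0:
        return None                 # zero: no delay processing is currently needed
    if pending is None:
        return DelayEvent()         # non-zero, no event yet: create one and start timing
    return pending                  # non-zero, event exists: continue with it unchanged
```

Calling `resume_after_switch(3, None)` would create a new waiting event, while passing an existing event returns that same event untouched.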
In this embodiment, all trigger events that trigger the cluster to perform data distribution within a short time are merged, so that the cluster performs data distribution only once. This reduces the consumption of cluster computing resources and avoids the waste of cluster computing resources caused by part of the data in the cluster being migrated repeatedly and needlessly. Specifically, when a first trigger event that triggers the cluster to perform data distribution is detected, the delay timer may be started to perform timing. Before the timing duration of the delay timer reaches the preset duration, it is detected whether there is a second trigger event that triggers the cluster to perform data distribution. If it is determined that such a second trigger event exists before the timing duration of the delay timer reaches the preset duration, the delay timer is restarted to perform timing. When the timing duration of the delay timer reaches the preset duration, the data in the cluster is redistributed. It can be seen that, by merging all trigger events that trigger the cluster to perform data distribution within a short time, the cluster does not need to perform data distribution multiple times in response to multiple trigger events within that short time, but performs it exactly once. In this way, the data in the cluster only needs to be migrated once between different hard disks, which avoids the waste of cluster computing resources caused by repeated data migration. Moreover, the computing resources consumed by a single data distribution are fewer than those consumed by multiple data distributions, so the performance of the cluster can be improved to a certain extent.
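The merging of trigger events described above behaves like a restartable (debounce-style) timer. The following is a minimal sketch, assuming a caller-supplied clock value instead of a real timer so that the behaviour is deterministic; the class `DelayedRedistributor`, its methods and the callback are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class DelayedRedistributor:
    """Merges trigger events that arrive within `preset_duration`."""

    preset_duration: float                 # assumed unit: seconds
    redistribute: Callable[[], None]       # hypothetical redistribution callback
    _deadline: Optional[float] = field(default=None, init=False)

    def on_trigger(self, now: float) -> None:
        # The first trigger event starts the timer; any trigger event that
        # arrives before expiry restarts it.
        self._deadline = now + self.preset_duration

    def tick(self, now: float) -> bool:
        # Called periodically; redistributes once when the timer has expired.
        if self._deadline is not None and now >= self._deadline:
            self._deadline = None
            self.redistribute()
            return True
        return False


runs: List[str] = []
r = DelayedRedistributor(preset_duration=10.0,
                         redistribute=lambda: runs.append("redistribute"))
r.on_trigger(now=0.0)             # first trigger event: timer starts
r.on_trigger(now=6.0)             # second trigger event: timer restarts
fired_early = r.tick(now=10.0)    # restarted timer has not expired yet
fired_late = r.tick(now=16.0)     # timer expired: a single redistribution
```

Two trigger events six seconds apart lead to one redistribution rather than two, which is the saving the embodiment aims at.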
In addition, an embodiment of the present application further provides a delay processing device for data distribution. Referring to Fig. 3, which shows a schematic structural diagram of a delay processing device for data distribution in an embodiment of the present application, the device 300 includes:
A start unit 301, configured to start a delay timer to perform timing when a first trigger event that triggers a cluster to perform data distribution is detected;
A detection unit 302, configured to detect, before the timing duration of the delay timer reaches a preset duration, whether there is a second trigger event that triggers the cluster to perform data distribution;
A first restarting unit 303, configured to restart the delay timer to perform timing if it is determined that a second trigger event that triggers the cluster to perform data distribution exists before the timing duration of the delay timer reaches the preset duration;
A data distribution unit 304, configured to redistribute the data in the cluster when the timing duration of the delay timer reaches the preset duration.
In some possible embodiments, detecting the first trigger event for data distribution is specifically detecting that a first external memory exits or joins the cluster;
The detection unit 302 is specifically configured to detect whether a second external memory exits or joins the cluster.
In some possible embodiments, the external memory is specifically a hard disk.
In some possible embodiments, the device 300 further includes:
A recording unit, configured to record, when the delay timer is started to perform timing, a first version identifier characterizing the state of the external memory in the cluster;
An acquiring unit, configured to obtain, when the timing duration of the delay timer reaches the preset duration, a second version identifier characterizing the state of the external memory in the cluster;
In this case, the data distribution unit is specifically configured to redistribute the data in the cluster when the timing duration of the delay timer reaches the preset duration and the first version identifier is inconsistent with the second version identifier.
In some possible embodiments, the device 300 further includes:
A second restarting unit, configured to restart the delay timer to perform timing when the first version identifier is consistent with the second version identifier.
In some possible embodiments, the second version identifier is greater than the first version identifier, and the device further includes:
A resetting unit, configured to reset the version number of the external memory state record file to zero after the redistribution of the data in the cluster is completed.
In some possible embodiments, the data distribution unit 304 includes:
A first determining subunit, configured to determine a third external memory that joins the cluster within a target time period, where the target time period is the period from a first moment, at which the first trigger event is detected, to a second moment, at which the timing duration of the delay timer is determined to reach the preset duration;
A distribution subunit, configured to distribute part of the data in the cluster to the third external memory;
A second determining subunit, configured to determine a fourth external memory that exits the cluster within the target time period;
A recovery subunit, configured to restore the data stored on the fourth external memory to a fifth external memory in the cluster.
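The work of the two determining subunits can be sketched as a filter over the disk events that fall inside the target time period. This is an illustrative sketch only: the `DiskEvent` type, the `plan_redistribution` function and the disk names are assumptions, and the case of a disk that both joins and exits within the same period is not handled.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DiskEvent:
    time: float   # moment the event was detected (assumed unit: seconds)
    disk: str     # hypothetical disk name
    kind: str     # "join" or "exit"


def plan_redistribution(events: List[DiskEvent],
                        t_first: float,
                        t_expired: float) -> Tuple[List[str], List[str]]:
    """Return (disks that joined, disks that exited) within the target period."""
    window = [e for e in events if t_first <= e.time <= t_expired]
    joined = [e.disk for e in window if e.kind == "join"]   # receive part of the data
    exited = [e.disk for e in window if e.kind == "exit"]   # data must be restored elsewhere
    return joined, exited


events = [
    DiskEvent(time=0.0, disk="disk-a", kind="join"),    # first trigger event
    DiskEvent(time=4.0, disk="disk-b", kind="exit"),    # second trigger event
    DiskEvent(time=30.0, disk="disk-c", kind="join"),   # outside the target period
]
joined, exited = plan_redistribution(events, t_first=0.0, t_expired=14.0)
```

Here `joined` plays the role of the third external memory and `exited` the role of the fourth; choosing the fifth external memory that receives the restored data is left out of the sketch.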
It can be seen that, in this embodiment, by merging all trigger events that trigger the cluster to perform data distribution within a short time, the cluster does not need to perform data distribution multiple times in response to multiple trigger events within that short time, but performs it exactly once. In this way, the data in the cluster only needs to be migrated once between different hard disks, which avoids the waste of cluster computing resources caused by repeated data migration. Moreover, the computing resources consumed by a single data distribution are fewer than those consumed by multiple data distributions, so the performance of the cluster can be improved to a certain extent.
The word "first" in names such as "first external memory", "first trigger event", "first version identifier" and "first restarting unit" mentioned in the embodiments of the present application is used only as a naming label and does not indicate first in order. The same applies to "second", "third", "fourth", "fifth" and so on.
From the description of the above embodiments, those skilled in the art can clearly understand that all or part of the steps in the methods of the above embodiments can be implemented by software plus a general-purpose hardware platform. Based on this understanding, the technical solution of the present application may be embodied in the form of a software product. The computer software product may be stored in a storage medium, such as a read-only memory (ROM)/RAM, a magnetic disk or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network communication device such as a router) to execute the methods described in the embodiments of the present application or in certain parts of the embodiments.
The embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the device embodiment is substantially similar to the method embodiment, it is described relatively briefly, and the relevant parts may be found in the description of the method embodiment. The device embodiment described above is merely illustrative. The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
The above are merely exemplary embodiments of the present application and are not intended to limit the protection scope of the present application.

Claims (10)

1. A delay processing method for data distribution, characterized in that the method comprises:
When a first trigger event that triggers a cluster to perform data distribution is detected, starting a delay timer to perform timing;
Before the timing duration of the delay timer reaches a preset duration, detecting whether there is a second trigger event that triggers the cluster to perform data distribution;
If it is determined that a second trigger event that triggers the cluster to perform data distribution exists before the timing duration of the delay timer reaches the preset duration, restarting the delay timer to perform timing;
When the timing duration of the delay timer reaches the preset duration, redistributing the data in the cluster.
2. The method according to claim 1, characterized in that detecting the first trigger event for data distribution comprises:
Detecting that a first external memory exits or joins the cluster;
Detecting whether there is a second trigger event that triggers the cluster to perform data distribution comprises:
Detecting whether a second external memory exits or joins the cluster.
3. The method according to claim 2, characterized in that the external memory is specifically a hard disk.
4. The method according to claim 1, characterized in that the method further comprises:
When the delay timer is started to perform timing, recording a first version identifier of the external memory state record file of the cluster;
When the timing duration of the delay timer reaches the preset duration, obtaining a second version identifier of the external memory state record file of the cluster;
In this case, redistributing the data in the cluster when the timing duration of the delay timer reaches the preset duration comprises:
When the timing duration of the delay timer reaches the preset duration and the first version identifier is inconsistent with the second version identifier, redistributing the data in the cluster.
5. The method according to claim 4, characterized in that the method further comprises:
When the first version identifier is consistent with the second version identifier, restarting the delay timer to perform timing.
6. The method according to claim 4 or 5, characterized in that the second version identifier is greater than the first version identifier, and the method further comprises:
After the redistribution of the data in the cluster is completed, resetting the version number of the external memory state record file to zero.
7. The method according to claim 1, characterized in that redistributing the data comprises:
Determining a third external memory that joins the cluster within a target time period, where the target time period is the period from a first moment, at which the first trigger event is detected, to a second moment, at which the timing duration of the delay timer is determined to reach the preset duration;
Distributing part of the data in the cluster to the third external memory;
Determining a fourth external memory that exits the cluster within the target time period;
Restoring the data stored on the fourth external memory to a fifth external memory in the cluster.
8. A delay processing device for data distribution, characterized in that the device comprises:
A start unit, configured to start a delay timer to perform timing when a first trigger event that triggers a cluster to perform data distribution is detected;
A detection unit, configured to detect, before the timing duration of the delay timer reaches a preset duration, whether there is a second trigger event that triggers the cluster to perform data distribution;
A first restarting unit, configured to restart the delay timer to perform timing if it is determined that a second trigger event that triggers the cluster to perform data distribution exists before the timing duration of the delay timer reaches the preset duration;
A data distribution unit, configured to redistribute the data in the cluster when the timing duration of the delay timer reaches the preset duration.
9. The device according to claim 8, characterized in that the device further comprises:
A recording unit, configured to record, when the delay timer is started to perform timing, a first version identifier characterizing the state of the external memory in the cluster;
An acquiring unit, configured to obtain, when the timing duration of the delay timer reaches the preset duration, a second version identifier characterizing the state of the external memory in the cluster;
In this case, the data distribution unit is specifically configured to redistribute the data in the cluster when the timing duration of the delay timer reaches the preset duration and the first version identifier is inconsistent with the second version identifier.
10. The device according to claim 9, characterized in that the device further comprises:
A second restarting unit, configured to restart the delay timer to perform timing when the first version identifier is consistent with the second version identifier.
CN201811232307.9A 2018-10-22 2018-10-22 Delay processing method and device for data distribution Active CN109521958B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811232307.9A CN109521958B (en) 2018-10-22 2018-10-22 Delay processing method and device for data distribution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811232307.9A CN109521958B (en) 2018-10-22 2018-10-22 Delay processing method and device for data distribution

Publications (2)

Publication Number Publication Date
CN109521958A true CN109521958A (en) 2019-03-26
CN109521958B CN109521958B (en) 2022-02-18

Family

ID=65772996

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811232307.9A Active CN109521958B (en) 2018-10-22 2018-10-22 Delay processing method and device for data distribution

Country Status (1)

Country Link
CN (1) CN109521958B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104035728A (en) * 2014-03-31 2014-09-10 深圳英飞拓科技股份有限公司 Hard disk hot plug handling method, device and node
CN104461389A (en) * 2014-12-03 2015-03-25 上海新储集成电路有限公司 Automatically learning method for data migration in mixing memory
CN107395721A (en) * 2017-07-20 2017-11-24 郑州云海信息技术有限公司 A kind of method and system of metadata cluster dilatation
CN107422977A (en) * 2017-07-31 2017-12-01 北京小米移动软件有限公司 Trigger action processing method, device and computer-readable recording medium
US20170371928A1 (en) * 2016-06-28 2017-12-28 International Business Machines Corporation Data arrangement management in a distributed data cluster environment of a shared pool of configurable computing resources
CN107562382A (en) * 2017-08-30 2018-01-09 郑州云海信息技术有限公司 A kind of disk automatic dynamic expansion method and system based on timed task

Also Published As

Publication number Publication date
CN109521958B (en) 2022-02-18


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant