CN110333931A - System of shared storage for model training - Google Patents
- Publication number
- CN110333931A (application number CN201910446352.2A)
- Authority
- CN
- China
- Prior art keywords
- cluster
- physical
- virtual machine
- machine
- several
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
- G06F16/1824—Distributed file systems implemented using Network-attached Storage [NAS] architecture
- G06F16/183—Provision of network file services by network file servers, e.g. by using NFS, CIFS
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
The present invention provides a system of shared storage for model training, comprising: a first cluster formed of multiple first physical machines, which provides distributed shared storage; a second cluster formed of multiple second physical machines, which provides virtual machines; and a scheduling platform that schedules the second cluster so that the distributed shared storage is mounted on the virtual machines. The embodiments of the present invention thus realize a virtualization system based on shared storage, on which users can carry out model training. Supported by large-scale clusters and built on shared storage and virtualization, the system greatly improves the efficiency of machine-learning training and improves user experience.
Description
Technical field
The present invention relates to the field of machine learning, and more specifically to a system of shared storage for model training.
Background technique
With the rise of machine learning, how to train models has received increasing attention. Because training is carried out on large amounts of sample data, the requirements on storage are very high. Regarding the training environment, one current scheme is based on a Docker containerized environment. Since Docker is an open-source application container engine, it is easy to port and can also implement virtualization; a Docker-based environment is therefore easy to build and fast to set up.
However, containers have no interfaces between one another, and their storage environments are isolated from each other, which makes it difficult to share data; even when data is shared, many unexpected problems arise. In addition, containers have other drawbacks: kernel-level isolation is weak, and shared storage introduces many factors of instability, so the user experience is poor.
Summary of the invention
The present invention provides a system of shared storage for model training that can guarantee the stability of the system and improve user experience.
The system of shared storage for model training provided by the present invention comprises:
a first cluster formed of multiple first physical machines, providing distributed shared storage;
a second cluster formed of multiple second physical machines, providing virtual machines; and
a scheduling platform that schedules the second cluster, so that the distributed shared storage is mounted on the virtual machines.
In one implementation of the invention, the scheduling platform schedules the second cluster by selecting several second physical machines from the second cluster and sending scheduling requests to the several second physical machines.
In one implementation of the invention, after receiving the scheduling requests, the several second physical machines in the second cluster start virtual machines and allocate hardware resources to their respective virtual machines.
In one implementation of the invention, the distributed shared storage provided by the first cluster is mounted on the virtual machines started by the several second physical machines.
In one implementation of the invention, an agent is provided on each second physical machine in the second cluster; the agent obtains the virtual-machine information of the second physical machine on which it resides, and sends the acquired virtual-machine information to the scheduling platform.
In one implementation of the invention, the scheduling platform performs scheduling according to the virtual-machine information using a scheduling algorithm.
In one implementation of the invention, one of the several second physical machines is set as a server, and the other second physical machines among the several second physical machines serve as worker machines; the server provides a Network File System (NFS) service, and the worker machines mount and use the NFS service.
In this way, the shared storage can be guaranteed to support multi-writer multi-reader access, the stability of the storage is fully utilized, the stability of the whole system is guaranteed, and bottlenecks of performance and stability are avoided.
In one implementation of the invention, the block storage (RBD) mounted by the server's virtual machine serves as the backend of the NFS service.
In one implementation of the invention, the Internet Protocol (IP) address or domain name of the server is fixed.
In one implementation of the invention, when the state of a second physical machine deteriorates such that normal program operation can no longer be guaranteed, a migration is performed so that another second physical machine takes over as the server.
In this way, the state and performance of the current server are guaranteed, thereby guaranteeing the stability of the whole system and preventing a poorly performing server from affecting the performance of the whole system.
In one implementation of the invention, the system further comprises a control center that receives user requests and sends instructions to the scheduling platform; the scheduling platform schedules the second cluster after receiving the instructions.
It can be seen that the embodiments of the present invention realize a virtualization system based on shared storage, on which users can carry out model training. Supported by large-scale clusters and built on shared storage and virtualization, the system greatly improves the efficiency of machine-learning training and improves user experience.
Brief description of the drawings
The above and other objects, features and advantages of the present invention will become more apparent from the following detailed description of embodiments of the present invention, taken in conjunction with the accompanying drawings. The accompanying drawings are provided to aid understanding of the embodiments and constitute a part of the specification; together with the embodiments, they serve to explain the present invention and are not to be construed as limiting it. In the drawings, identical reference labels generally denote identical components or steps.
Fig. 1 is an exemplary block diagram of the system of shared storage for model training according to an embodiment of the present invention;
Fig. 2 is a schematic block diagram of the virtualization system according to an embodiment of the present invention.
Detailed description of embodiments
In order to make the objects, technical solutions and advantages of the present invention more apparent, example embodiments of the present invention are described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them, and it should be understood that the present invention is not limited by the example embodiments described herein. Based on the embodiments described herein, all other embodiments obtained by those skilled in the art without creative labor shall fall within the scope of the present invention.
Shared storage refers to a parallel architecture in which two or more processors share one main memory. Each processor can store information into, or retrieve information from, the main memory; communication between processors is realized by accessing the shared memory. The embodiments of the present invention select a large-scale distributed shared storage, which provides a stable and efficient large-scale storage scheme, for example a distributed file system such as Ceph or HDFS.
Ceph is a reliable, automatically rebalancing and automatically recovering distributed storage system. According to usage scenarios, Ceph can be divided into three major parts: object storage, block device storage, and file system service. In the field of virtualization, the block device storage of Ceph is the most commonly used; for example, in the OpenStack project, Ceph block devices can back the Cinder storage backend, the Glance image storage, and the virtual-machine data storage of OpenStack. More intuitively, a Ceph cluster can provide raw-format block storage as the hard disk of a virtual machine instance. The advantage of Ceph over other storage systems is that it is not merely storage: it also fully utilizes the computing capability of the storage nodes. When storing each piece of data, Ceph calculates the position where the data will be stored, distributing the data as evenly as possible. Thanks to its good design, which adopts methods such as the CRUSH algorithm and hash rings, Ceph avoids the traditional single-point-of-failure problem, and its performance is not affected as the scale expands. A key concern in distributed storage systems is how to distribute data evenly; common data distribution algorithms include consistent hashing and the CRUSH algorithm of Ceph. CRUSH is a pseudorandom, controlled data distribution and replication algorithm. Ceph is designed for large-scale distributed storage, so its data distribution algorithm must be able to compute storage positions quickly and accurately even in large-scale clusters, while keeping data migration as small as possible upon hardware failure or expansion. The CRUSH algorithm of Ceph is designed meticulously for exactly these characteristics; it can be said that CRUSH is one of the cores of Ceph.
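The idea behind CRUSH-style placement can be illustrated with a deliberately simplified sketch. This is not the actual CRUSH algorithm; it uses rendezvous (highest-random-weight) hashing, a related pseudorandom placement technique, and the node names and replica count are illustrative assumptions. The key properties it demonstrates are the ones the paragraph above describes: placement is computed, not looked up, so any client can locate data without a central directory, and removing an uninvolved node does not move the data.

```python
import hashlib

def placement(object_id: str, nodes: list[str], replicas: int = 3) -> list[str]:
    """Rendezvous (highest-random-weight) hashing sketch: every node gets a
    pseudorandom score for this object, and the top `replicas` scores win.
    Deterministic, so any client computes the same placement with no central
    directory; removing a node only remaps objects that had that node in
    their winning set."""
    scored = sorted(
        nodes,
        key=lambda n: hashlib.sha256(f"{object_id}:{n}".encode()).hexdigest(),
        reverse=True,
    )
    return scored[:replicas]
```

Because the score is a pure function of object name and node name, expansion or failure of one node leaves the placement of unrelated objects untouched, which is the "data migration as small as possible" property the text attributes to CRUSH.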
HDFS (Hadoop Distributed File System) is a distributed file system designed to run on commodity hardware. It has much in common with existing distributed file systems, but the differences are also apparent: HDFS is a fault-tolerant system designed to be deployed on inexpensive machines, and it provides high-throughput access to application data, making it well suited to applications with very large data sets. HDFS relaxes part of the POSIX constraints in order to allow streaming access to file-system data. HDFS was originally developed as the infrastructure of the Apache Nutch search-engine project and is now a part of the Apache Hadoop Core project.
The large-scale distributed shared storage selected by the embodiments of the present invention can be built and accessed conveniently in a large-scale cluster, so the cluster network of the embodiments can also access the large-scale distributed shared storage very easily.
To facilitate training machine-learning models, an embodiment of the invention provides a first cluster formed of multiple first physical machines, which may be called a large-scale distributed shared storage cluster or simply a large-scale shared storage cluster. The first cluster includes a large number of first physical machines (also referred to as processors) and is capable of providing shared storage, in particular distributed shared storage. An embodiment of the invention also provides a second cluster formed of multiple second physical machines, which may be called a large-scale virtual-machine cluster. The second cluster includes a large number of second physical machines (also referred to as processors) and is capable of providing virtual machines. It should be noted that the embodiments of the present invention place no limit on the number of physical machines (first physical machines or second physical machines) constituting a cluster (the first cluster or the second cluster); the order of magnitude may be hundreds, thousands, tens of thousands, or even more.
Fig. 1 is an exemplary block diagram of the system of shared storage for model training according to an embodiment of the present invention. The system 10 shown in Fig. 1 includes a first cluster 110, a second cluster 120 and a scheduling platform 130.
The first cluster 110, formed of multiple first physical machines, provides distributed shared storage;
the second cluster 120, formed of multiple second physical machines, provides virtual machines;
the scheduling platform 130 schedules the second cluster 120, so that the distributed shared storage is mounted on the virtual machines.
Illustratively, the first cluster may be a large-scale shared cluster, and any existing shared storage scheme may be used. The embodiments of the present invention use a stable and efficient large-scale storage scheme, in particular a distributed shared storage scheme such as Ceph or HDFS, which can be built and accessed conveniently in a large-scale cluster.
Illustratively, the second cluster may be a large-scale virtualization cluster. The embodiments place no limitation on the virtualization scheme used in the second cluster; for example, it may be based on the Kernel-based Virtual Machine (KVM), or on KVM-QEMU (Quick Emulator), etc. The second cluster includes a large number of second physical machines, and each second physical machine has built-in virtualization support.
In the embodiments of the present invention, the scheduling platform 130 selects several second physical machines from the second cluster 120 and sends scheduling requests to them. After receiving the scheduling requests, these second physical machines in the second cluster 120 start virtual machines and allocate hardware resources to their respective virtual machines, whereupon the distributed shared storage provided by the first cluster 110 is mounted on the virtual machines they started. The scheduling platform 130 may also be called a scheduler.
Illustratively, the system may further include a control center that receives user requests and sends instructions to the scheduling platform 130, and the scheduling platform 130 schedules the second cluster 120 after receiving the instructions.
Referring to Fig. 2, a control center 140, a scheduling platform 130 and a second cluster 120 are shown, where the second cluster 120 includes multiple second physical machines 1201.
The control center 140 can receive and process user requests, and then sends instructions to the scheduling platform 130. Upon receiving an instruction, the scheduling platform 130 can select one or several second physical machines from the second cluster 120 and send scheduling requests to them. After receiving the scheduling requests, the selected second physical machines start their respective virtual machines and allocate hardware resources to them; then, through access to the large-scale storage, the shared storage provided by the first cluster is mounted on these virtual machines.
A user request may include a request to create and enter a virtual machine, to carry out model training using a virtual machine, and so on. It is understood that user requests may also include other requests to create virtual machines; the present invention places no limitation on this.
The scheduling platform 130 can select the one or several second physical machines according to the condition of each second physical machine in the second cluster 120, for example according to each machine's available resources, topology and configuration.
Illustratively, an agent can be set on each second physical machine 1201 in the second cluster 120; the agent obtains the virtual-machine information of the second physical machine on which it resides and sends the acquired information to the scheduling platform 130, so that the scheduling platform 130 can schedule according to the virtual-machine information of each second physical machine 1201.
The virtual-machine information may include the machine state, the physical resources available to virtual machines, and so on. The scheduling platform 130 can use a scheduling algorithm to select second physical machines. For example, the scheduling platform 130 can first filter out the second physical machines that do not satisfy the virtual-machine requirements, then perform a weight calculation on the remaining second physical machines and select the one or several second physical machines with the larger calculated weights. It is understood that the scheduling platform 130 can use a scheduling algorithm to perform high-performance scheduling over the large-scale cluster (i.e., the second cluster); the embodiments of the present invention place no limitation on the scheduling algorithm used, and a more efficient scheduling algorithm may be adopted.
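The filter-then-weight selection described above can be sketched as follows. This is a minimal illustration only: the `HostInfo` field names, the spare-capacity weighting formula, and the requirement parameters are assumptions for the sketch, not part of the patent.

```python
from dataclasses import dataclass

@dataclass
class HostInfo:
    """Virtual-machine information reported by a host's agent (fields assumed)."""
    name: str
    healthy: bool
    free_cpus: int
    free_mem_gb: int

def schedule(hosts: list[HostInfo], need_cpus: int, need_mem_gb: int,
             count: int = 1) -> list[str]:
    # Step 1: filter out hosts that cannot satisfy the VM requirements.
    feasible = [h for h in hosts
                if h.healthy and h.free_cpus >= need_cpus and h.free_mem_gb >= need_mem_gb]
    # Step 2: weight the remaining hosts (here a simple spare-capacity score)
    # and pick the `count` hosts with the largest weights.
    feasible.sort(key=lambda h: h.free_cpus + h.free_mem_gb, reverse=True)
    return [h.name for h in feasible[:count]]
```

A real scheduler would use richer signals (topology, machine configuration) in the weight, as the text suggests; the two-phase filter-then-score structure is the part being illustrated.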
In this way, with the virtualization scheme shown in Fig. 2, the system of the embodiments of the present invention can mount the same shared storage of the first cluster on multiple designated virtual-machine devices, enabling efficient machine-learning development on shared storage. Moreover, building the shared storage by means of the virtualization scheme is simple, convenient and fast.
In order to mount the same shared storage on multiple designated virtual-machine devices, the storage needs to support multi-writer multi-reader access.
As one implementation, a single storage in the first cluster that supports multi-writer multi-reader access can be chosen, and the shared storage can then be mounted directly in multiple virtual-machine devices. With this implementation, the first cluster can be constructed very conveniently and quickly.
Specifically, in this case it is only necessary to mount the shared storage directly in multiple virtual machines. However, since such storage is not very stable, its performance may become a bottleneck, so this implementation may require selecting a suitable storage backend. Since few existing storage products on the market satisfy this requirement, constructing the first cluster in this way may require considerable procurement effort and excessive cost. In addition, even when a multi-writer multi-reader shared storage is obtained, bottlenecks of performance and stability may still exist.
As another implementation, a storage in the first cluster that supports only single-writer multi-reader access (i.e., one that does not support multi-writer access) can be chosen; for example, the block storage of Ceph (RBD, RADOS Block Device) can be used as the backend. Since RBD itself does not support multi-writer multi-reader access, the following scheme enables the shared storage to support it.
One of the several second physical machines is set as a server (master), and the other second physical machines among the several second physical machines serve as worker machines (workers). The server provides a Network File System (NFS) service; the worker machines mount and use the NFS service. The RBD mounted by the server's virtual machine serves as the backend of the NFS service.
In this way, the stability of the storage is fully utilized, the stability of the whole system is guaranteed, and bottlenecks of performance and stability are avoided.
As an example, suppose there are N second physical machines; one of them (labeled M1) serves as the master, and the other N-1 second physical machines are workers.
Specifically, the RBD mounted in a certain virtual-machine device is exposed as an NFS server, becoming an NFS service for the other virtual machines to mount and use. The second physical machine that provides the NFS service is called the master, and the second physical machines that mount the NFS service are called workers.
Optionally, the Internet Protocol (IP) address or domain name of the server (master) is fixed, so that the master does not become a problematic single point.
Illustratively, when the state of a second physical machine deteriorates such that normal program operation can no longer be guaranteed, a migration is performed so that another second physical machine takes over as the server.
That is, the state of the current server can be monitored continuously, and when its degraded state can no longer satisfy the demand, the server can be migrated online. In this way, the performance of the server remains in a good state, guaranteeing the stability of the whole system.
Returning to the example above: M1 among the N second physical machines serves as the master, and the other N-1 second physical machines serve as workers. When M1's state is detected to deteriorate such that performance can no longer be guaranteed, M2 can be re-selected as the master, with M1 and the remaining N-2 second physical machines serving as workers.
The state of the current server may include the state of its IP address, whether its performance can guarantee the normal operation of programs, and other states related to the performance of the shared storage.
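The online migration described above can be sketched as follows. This is a simplified illustration; the health-check predicate is an assumption, and a real system would additionally remount the RBD on the new master, re-export it over NFS, and re-point the fixed domain name.

```python
def migrate_master(machines: list[str], healthy: dict[str, bool]) -> list[str]:
    """Monitor the current master (machines[0]); if its state has degraded,
    promote the first healthy worker to master and demote the old master to
    worker. Returns the new machine order, master first."""
    master = machines[0]
    if healthy.get(master, False):
        return machines  # master state is fine, no migration needed
    for candidate in machines[1:]:
        if healthy.get(candidate, False):
            # promote the candidate; the old master joins the workers
            rest = [m for m in machines if m != candidate]
            return [candidate] + rest
    raise RuntimeError("no healthy machine available to serve as master")
```

In the N-machine example, a degraded M1 is replaced by M2 as master, and M1 plus the remaining N-2 machines continue as workers, exactly as the text describes.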
It can be seen that the embodiments of the present invention realize a virtualization system based on shared storage, on which users can carry out model training. Supported by large-scale clusters and built on shared storage and virtualization, the system greatly improves the efficiency of machine-learning training. Specifically, when a user performs model training locally, mounting the shared storage on a virtual machine enables efficient and fast computation using virtualization. In addition, the stability of the storage is fully utilized, so the system is stable and the user experience is improved.
Those of ordinary skill in the art will appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled artisans may implement the described functions in different ways for each specific application, but such implementations should not be considered beyond the scope of the present invention.
Although example embodiments have been described herein with reference to the accompanying drawings, it should be understood that the above example embodiments are merely exemplary and are not intended to limit the scope of the present invention thereto. Those of ordinary skill in the art can make various changes and modifications therein without departing from the scope and spirit of the present invention. All such changes and modifications are intended to fall within the scope of the present invention as claimed in the appended claims.
In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be realized in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a logical functional division, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another device, or some features may be ignored or not executed.
In the specification provided here, numerous specific details are set forth. It should be appreciated, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail so as not to obscure the understanding of this specification.
Similarly, it should be understood that, in order to streamline the present disclosure and aid understanding of one or more of the various inventive aspects, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof. However, this method of disclosure is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects may lie in fewer than all features of a single disclosed embodiment, and those features may still solve the corresponding technical problem. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that, except where features are mutually exclusive, all features disclosed in this specification (including any accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, equivalent or similar purpose.
Furthermore, those skilled in the art will appreciate that, although some embodiments described herein include certain features that are included in other embodiments but not others, combinations of features of different embodiments are within the scope of the invention and form different embodiments. For example, in the claims, any one of the claimed embodiments may be used in any combination.
The various component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some modules of the device according to embodiments of the present invention. The present invention may also be implemented as device programs (e.g., computer programs and computer program products) for performing some or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may take the form of one or more signals; such signals may be downloaded from Internet websites, provided on carrier signals, or provided in any other form.
It should be noted that the above-described embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second and third does not indicate any ordering; these words may be interpreted as names.
The above is merely a description of specific embodiments of the present invention, and the protection scope of the present invention is not limited thereto. Any person skilled in the art can readily conceive of changes or substitutions within the technical scope disclosed by the present invention, and all such changes or substitutions shall fall within the protection scope of the present invention. The protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (11)
1. A system of shared storage for model training, characterized in that the system comprises:
a first cluster composed of a plurality of first physical machines, which provides distributed shared storage;
a second cluster composed of a plurality of second physical machines, which provides virtual machines; and
a scheduling platform, which schedules the second cluster so that the distributed shared storage is mounted on the virtual machines.
2. The system according to claim 1, characterized in that the scheduling platform schedules the second cluster by:
selecting several second physical machines from the second cluster, and sending a scheduling request to the several second physical machines.
3. The system according to claim 2, characterized in that after receiving the scheduling request, the several second physical machines in the second cluster start virtual machines and allocate hardware resources to their respective virtual machines.
4. The system according to claim 3, characterized in that the distributed shared storage provided by the first cluster is mounted on the virtual machines started by the several second physical machines.
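For illustration only (not part of the claims), the scheduling flow of claims 2 to 4 can be sketched as follows. All names here (SchedulingPlatform, PhysicalMachine, the "most free CPUs" selection rule, and the string-valued storage endpoint) are hypothetical stand-ins; a real system would start actual virtual machines and perform real mounts.

```python
class PhysicalMachine:
    """A second physical machine in the second cluster (hypothetical model)."""

    def __init__(self, name, cpus, mem_gb):
        self.name = name
        self.cpus = cpus        # free CPU cores
        self.mem_gb = mem_gb    # free memory
        self.vms = []

    def handle_scheduling_request(self, vm_cpus, vm_mem_gb, storage_endpoint):
        # Claim 3: on receiving a scheduling request, start a virtual machine
        # and allocate hardware resources to it.
        if vm_cpus > self.cpus or vm_mem_gb > self.mem_gb:
            return None
        self.cpus -= vm_cpus
        self.mem_gb -= vm_mem_gb
        # Claim 4: the distributed shared storage provided by the first
        # cluster is mounted on the started virtual machine.
        vm = {"host": self.name, "cpus": vm_cpus, "mem_gb": vm_mem_gb,
              "mounted": storage_endpoint}
        self.vms.append(vm)
        return vm


class SchedulingPlatform:
    """Claim 2: selects several second physical machines from the second
    cluster and sends them scheduling requests."""

    def __init__(self, second_cluster, storage_endpoint):
        self.cluster = second_cluster
        self.storage = storage_endpoint

    def schedule(self, count, vm_cpus, vm_mem_gb):
        # Pick the machines with the most free CPUs — a simple stand-in for
        # the scheduling algorithm of claim 6.
        chosen = sorted(self.cluster, key=lambda m: m.cpus, reverse=True)[:count]
        started = []
        for machine in chosen:
            vm = machine.handle_scheduling_request(vm_cpus, vm_mem_gb, self.storage)
            if vm is not None:
                started.append(vm)
        return started
```

A hypothetical usage: with three idle machines of 16 cores each, `SchedulingPlatform(cluster, "ceph-cluster:/shared").schedule(count=2, vm_cpus=4, vm_mem_gb=8)` starts two VMs, each recording the shared-storage endpoint it mounts.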
5. The system according to claim 1, characterized in that an agent is provided on each second physical machine in the second cluster; the agent obtains virtual machine information of the second physical machine on which it resides, and sends the obtained virtual machine information to the scheduling platform.
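For illustration only (not part of the claims), the per-machine agent of claim 5 can be sketched as below. The names (Agent, collect_vm_info, report_to) and the list-based "inbox" standing in for the scheduling platform are hypothetical; a real agent would report over RPC or HTTP.

```python
class Agent:
    """Runs on each second physical machine; gathers information about the
    virtual machines on that machine and reports it to the scheduling
    platform (claim 5)."""

    def __init__(self, host_name, vms):
        self.host_name = host_name
        self.vms = vms  # list of dicts describing the local VMs

    def collect_vm_info(self):
        # Snapshot of this machine's virtual machine information.
        return {"host": self.host_name,
                "vm_count": len(self.vms),
                "vms": list(self.vms)}

    def report_to(self, platform_inbox):
        # In a real deployment this would be a network call to the
        # scheduling platform; here the "platform" is just a list.
        platform_inbox.append(self.collect_vm_info())
```

The scheduling platform can then feed these reports into its scheduling algorithm (claim 6), e.g. preferring machines with fewer running VMs.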
6. The system according to claim 5, characterized in that the scheduling platform performs scheduling with a scheduling algorithm according to the virtual machine information.
7. The system according to claim 2, characterized in that a certain one of the several second physical machines is set as a server, and the other ones of the several second physical machines serve as worker machines,
wherein the server provides a Network File System (NFS) service, and the worker machines mount and use the NFS service.
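For illustration only (not part of the claims), the server/worker split of claim 7 can be sketched as follows. The role-assignment rule (first machine becomes the server), the export path `/shared`, and the mount point `/mnt/shared` are hypothetical examples.

```python
def assign_roles(machines):
    """Pick one machine as the NFS server; the rest become worker machines
    (claim 7). Here, simply the first machine in the list is chosen."""
    server, *workers = machines
    return server, workers


def worker_mount_command(server_host, export="/shared", mount_point="/mnt/shared"):
    """Build the shell command a worker machine would run to mount and use
    the NFS service provided by the server (claim 7)."""
    return f"mount -t nfs {server_host}:{export} {mount_point}"
```

Every worker mounts the same export, so all training jobs on the worker machines see one shared file tree.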
8. The system according to claim 7, characterized in that the RBD block storage mounted by the virtual machine of the server serves as the server side of the NFS service.
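For illustration only (not part of the claims), one way the server's virtual machine could expose an RBD block device through NFS, per claim 8, is sketched below as generated shell commands. The pool name, image name, mount point, and client network are hypothetical; a real deployment would also format the device and manage `/etc/exports` persistently.

```python
def rbd_nfs_export_commands(pool, image, mount_point="/shared",
                            allowed_net="10.0.0.0/24"):
    """Return the shell commands the server VM would run to map an RBD
    image, mount it, and export it over NFS (hypothetical sketch)."""
    return [
        f"rbd map {pool}/{image}",                           # attach the RBD image as a block device
        f"mount /dev/rbd/{pool}/{image} {mount_point}",      # mount the block device locally
        f"exportfs -o rw,sync {allowed_net}:{mount_point}",  # export the mount over NFS
    ]
```

This puts the distributed block storage of the first cluster behind a single NFS endpoint that the worker machines of claim 7 can mount.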
9. The system according to claim 7, characterized in that the Internet Protocol (IP) address or domain name of the server is fixed.
10. The system according to claim 7, characterized in that when the state of the certain second physical machine changes such that normal program operation can no longer be guaranteed, a migration is performed so that another second physical machine serves as the server.
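For illustration only (not part of the claims), the failover of claim 10 can be sketched as below: when the current server machine becomes unhealthy, another second physical machine takes over the server role, and because the server is addressed by a fixed IP address or domain name (claim 9), workers keep the same mount target. The function name and the health-map representation are hypothetical.

```python
def migrate_server(current_server, machines, healthy):
    """Return the machine that should act as the NFS server.

    `healthy` maps machine name -> bool. If the current server is still
    healthy, keep it; otherwise migrate the role to the first healthy
    candidate (claim 10).
    """
    if healthy.get(current_server, False):
        return current_server
    for machine in machines:
        if machine != current_server and healthy.get(machine, False):
            return machine
    raise RuntimeError("no healthy machine available to act as server")
```

After migration, the fixed domain name would be repointed at the new server, so the worker machines' existing `mount` targets remain valid.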
11. The system according to any one of claims 1 to 10, characterized by further comprising:
a control center, which receives a user request and sends an instruction to the scheduling platform;
wherein the scheduling platform schedules the second cluster after receiving the instruction.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910446352.2A CN110333931A (en) | 2019-05-27 | 2019-05-27 | The system of shared storage for training pattern |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110333931A true CN110333931A (en) | 2019-10-15 |
Family
ID=68140167
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910446352.2A Pending CN110333931A (en) | 2019-05-27 | 2019-05-27 | The system of shared storage for training pattern |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110333931A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103856343A (en) * | 2012-12-05 | 2014-06-11 | 北京华胜天成科技股份有限公司 | Method and system for configurating virtual machine network information |
CN103853599A (en) * | 2014-03-17 | 2014-06-11 | 北京京东尚科信息技术有限公司 | Extension method of node calculating ability |
CN106095527A (en) * | 2016-06-07 | 2016-11-09 | 国云科技股份有限公司 | A kind of storage pool implementation method being applicable to cloud platform virtual machine |
CN106878457A (en) * | 2017-03-24 | 2017-06-20 | 网宿科技股份有限公司 | The attached storage method of distributed network and system |
CN106919346A (en) * | 2017-02-21 | 2017-07-04 | 无锡华云数据技术服务有限公司 | A kind of shared Storage Virtualization implementation method based on CLVM |
CN108234551A (en) * | 2016-12-15 | 2018-06-29 | 腾讯科技(深圳)有限公司 | A kind of data processing method and device |
US10091212B2 (en) * | 2016-03-04 | 2018-10-02 | BlueTalon, Inc. | Policy management, enforcement, and audit for data security |
Non-Patent Citations (1)
Title |
---|
Hou Lisha: "Cloud Computing and Internet of Things Technology", 31 May 2017, University of Electronic Science and Technology of China Press *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114780272A (en) * | 2022-04-18 | 2022-07-22 | 北京亚康万玮信息技术股份有限公司 | Intelligent fault self-healing scheduling method and device based on shared storage and virtualization |
CN114780272B (en) * | 2022-04-18 | 2023-03-17 | 北京亚康万玮信息技术股份有限公司 | Intelligent fault self-healing scheduling method and device based on shared storage and virtualization |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3270289B1 (en) | Container-based multi-tenant computing infrastructure | |
US11263084B2 (en) | Saving program execution state | |
CN110062924B (en) | Capacity reservation for virtualized graphics processing | |
US10540212B2 (en) | Data-locality-aware task scheduling on hyper-converged computing infrastructures | |
CN105893139B (en) | Method and device for providing storage service for tenant in cloud storage environment | |
US8260840B1 (en) | Dynamic scaling of a cluster of computing nodes used for distributed execution of a program | |
US9898522B2 (en) | Distributed storage of aggregated data | |
CN108351806B (en) | Distributed stream-based database triggers | |
US8972990B2 (en) | Providing a seamless transition for resizing virtual machines from a development environment to a production environment | |
US9591094B2 (en) | Caching of machine images | |
US20170353348A1 (en) | APPLICATION RESILIENCY USING APIs | |
US20160378751A1 (en) | Fast query processing in columnar databases with gpus | |
US9800484B2 (en) | Optimizing resource utilization in a networked computing environment | |
CN104350460A (en) | Determining virtual machine placement | |
CN110333931A (en) | The system of shared storage for training pattern | |
US10230594B2 (en) | Intelligently managing pattern contents across multiple racks based on workload and human interaction usage patterns | |
Srinivasan et al. | Google Cloud Platform for Architects: Design and manage powerful cloud solutions | |
US11323322B2 (en) | Non-disruptively merging coordinated timing networks | |
CN109716280A (en) | Flexible rank storage arrangement | |
US11126371B2 (en) | Caching file data within a clustered computing system | |
Awada | Application-Container Orchestration Tools and Platform-as-a-Service Clouds: A Survey | |
US10565006B2 (en) | Platform for analytic applications | |
US11650809B2 (en) | Autonomous and optimized cloning, reinstating, and archiving of an application in a containerized platform | |
Xiao et al. | Co-located Compute and Binary File Storage in Data-Intensive Computing | |
Brock et al. | Multi-cloud application execution and data management |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20191015 |