CN110333931A - System of shared storage for model training - Google Patents

System of shared storage for model training

Info

Publication number
CN110333931A
CN110333931A (application CN201910446352.2A)
Authority
CN
China
Prior art keywords
cluster
physical
virtual machine
machine
several
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910446352.2A
Other languages
Chinese (zh)
Inventor
黄维啸
王曙光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Megvii Technology Co Ltd
Beijing Maigewei Technology Co Ltd
Original Assignee
Beijing Maigewei Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Maigewei Technology Co Ltd filed Critical Beijing Maigewei Technology Co Ltd
Priority to CN201910446352.2A priority Critical patent/CN110333931A/en
Publication of CN110333931A publication Critical patent/CN110333931A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10 - File systems; File servers
    • G06F 16/18 - File system types
    • G06F 16/182 - Distributed file systems
    • G06F 16/1824 - Distributed file systems implemented using Network-attached Storage [NAS] architecture
    • G06F 16/183 - Provision of network file services by network file servers, e.g. by using NFS, CIFS
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 - Arrangements for executing specific programs
    • G06F 9/455 - Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F 9/45533 - Hypervisors; Virtual machine monitors
    • G06F 9/45558 - Hypervisor-specific management and integration aspects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning

Abstract

The present invention provides a system of shared storage for model training, comprising: a plurality of first physical machines forming a first cluster, which provides distributed shared storage; a plurality of second physical machines forming a second cluster, which provides virtual machines; and a dispatching platform, which schedules the second cluster so that the distributed shared storage is mounted on the virtual machines. It can thus be seen that embodiments of the present invention realize a virtualization system based on shared storage with which users can carry out model training. With the support of large-scale clusters, and on the basis of shared storage and virtualization, the efficiency of machine-learning training is greatly improved and the user experience is enhanced.

Description

System of shared storage for model training
Technical field
The present invention relates to the field of machine learning, and more specifically to a system of shared storage for model training.
Background art
With the rise of machine learning, how to obtain trained models has received more and more attention. Since training is carried out on a large amount of sample data, the requirements on storage are very high. Regarding the environment for model training, one current scheme is based on a Docker containerized environment: because Docker is an open-source application container engine, it is easy to port and can also implement virtualization, so a Docker containerized environment is easy to build and fast to set up.
However, since containers have no interfaces between one another, their storage environments are isolated from each other, which makes it difficult to share data; even when data is shared, many unexpected problems arise. In addition, container-based schemes have other drawbacks: kernel-level isolation is weak, and under shared storage many factors of instability exist, so the user experience is poor.
Summary of the invention
The present invention provides a system of shared storage for model training that can guarantee the stability of the system and improve the user experience.
The system of shared storage for model training provided by the present invention comprises:
a plurality of first physical machines forming a first cluster, which provides distributed shared storage;
a plurality of second physical machines forming a second cluster, which provides virtual machines;
a dispatching platform, which schedules the second cluster so that the distributed shared storage is mounted on the virtual machines.
In one implementation of the invention, the dispatching platform schedules the second cluster by: selecting several second physical machines from the second cluster, and sending a scheduling request to the several second physical machines.
In one implementation of the invention, after receiving the scheduling request, the several second physical machines in the second cluster start virtual machines and allocate hardware resources to their respective virtual machines.
In one implementation of the invention, the distributed shared storage provided by the first cluster is mounted on the virtual machines started by the several second physical machines.
In one implementation of the invention, an agent is provided on every second physical machine in the second cluster; the agent obtains the virtual machine information of the second physical machine on which it resides and sends the acquired virtual machine information to the dispatching platform.
In one implementation of the invention, the dispatching platform uses a scheduling algorithm to schedule according to the virtual machine information.
In one implementation of the invention, a certain second physical machine among the several second physical machines is set as the server (master), and the other second physical machines among the several second physical machines serve as worker machines. The server provides a Network File System (NFS) service, and the worker machines mount and use the NFS service.
In this way, it can be guaranteed that the shared storage supports multi-write and multi-read, the stability of the storage is fully utilized, the stability of the entire system is ensured, and bottlenecks of performance and stability are avoided.
In one implementation of the invention, the block storage (RBD) mounted on the virtual machine of the server serves as the backend of the NFS service.
In one implementation of the invention, the Internet Protocol (IP) address or domain name of the server is fixed.
In one implementation of the invention, when the state of a certain second physical machine deteriorates such that the normal operation of programs can no longer be guaranteed, a migration is carried out so that another second physical machine becomes the server.
In this way, the state and performance of the current server can be guaranteed, thereby guaranteeing the stability of the entire system and preventing a server in poor condition from affecting the performance of the entire system.
In one implementation of the invention, the system further comprises: a control centre, which receives user requests and sends instructions to the dispatching platform; the dispatching platform schedules the second cluster after receiving the instruction.
It can thus be seen that embodiments of the present invention realize a virtualization system based on shared storage with which users can carry out model training. With the support of large-scale clusters, and on the basis of shared storage and virtualization, the efficiency of machine-learning training is greatly improved and the user experience is enhanced.
Brief description of the drawings
The above and other objects, features and advantages of the present invention will become more apparent from the following detailed description of embodiments of the present invention taken in conjunction with the accompanying drawings. The drawings are provided for a further understanding of embodiments of the invention, constitute part of the specification, serve together with the embodiments to explain the invention, and are not to be construed as limiting the invention. In the drawings, identical reference labels generally represent the same parts or steps.
Fig. 1 is an exemplary block diagram of the system of shared storage for model training according to an embodiment of the present invention;
Fig. 2 is a schematic block diagram of a system providing virtualization according to an embodiment of the present invention.
Specific embodiments
To make the objects, technical solutions and advantages of the present invention more apparent, example embodiments of the present invention are described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention, and it should be understood that the present invention is not limited by the example embodiments described herein. All other embodiments obtained by those skilled in the art, based on the embodiments described in the present invention and without creative effort, shall fall within the scope of the present invention.
Shared storage refers to a parallel architecture in which two or more processors share a main memory. Each processor can store information into, or retrieve information from, the main memory, and communication between processors is realized through access to the shared memory. Embodiments of the present invention select a large-scale distributed shared storage that can provide a stable and efficient large-scale storage scheme, for example the distributed file systems Ceph, HDFS, and so on.
Ceph is a reliable, automatically rebalancing, automatically recovering distributed storage system. According to usage scenarios, Ceph can be divided into three major parts: object storage, block device storage, and file system service. In the virtualization field, Ceph's block device storage is the most commonly used; for example, in the OpenStack project, Ceph's block device storage can back OpenStack's Cinder storage, Glance's image storage, and the data storage of virtual machines. More intuitively, a Ceph cluster can provide raw-format block storage serving as the hard disk of a virtual machine instance. Ceph's advantage over other storage systems is that it is not merely storage: it also fully utilizes the computing power on the storage nodes. When storing each piece of data, Ceph calculates the position where the data will be stored, balancing the data distribution as much as possible; and thanks to Ceph's good design, which adopts methods such as the CRUSH algorithm and hash rings, it has no traditional single point of failure, and its performance is not affected as the scale expands. A key concern for distributed storage systems is how to distribute data more evenly; common data distribution algorithms include consistent hashing and Ceph's CRUSH algorithm. CRUSH is a pseudorandom controlled data distribution and replication algorithm. Ceph is designed for large-scale distributed storage, so its data distribution algorithm must be able to compute storage positions quickly and accurately even under a large-scale cluster, while keeping data migration as small as possible during hardware failures or hardware expansion. Ceph's CRUSH algorithm is carefully designed for exactly these characteristics; it can be said that CRUSH is one of the cores of Ceph.
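The pseudorandom, table-free placement attributed to CRUSH above can be illustrated with a much simpler relative, a consistent-hash ring. The sketch below is ours, not Ceph's code; the class name `Ring` and the virtual-node count are assumptions made purely for illustration. It demonstrates the two properties the description emphasises: the storage position of a key is computed rather than looked up, and adding a node migrates only a small fraction of the data.

```python
import hashlib
from bisect import bisect_right

class Ring:
    """Toy consistent-hash ring; illustrative only, not Ceph's CRUSH."""

    def __init__(self, nodes, vnodes=100):
        # Each storage node gets `vnodes` points on the hash ring.
        self._points = []
        for node in nodes:
            for i in range(vnodes):
                h = int.from_bytes(
                    hashlib.md5(f"{node}:{i}".encode()).digest()[:8], "big")
                self._points.append((h, node))
        self._points.sort()

    def locate(self, key):
        """Compute (not look up) the storage node for a data key."""
        h = int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")
        idx = bisect_right(self._points, (h, "")) % len(self._points)
        return self._points[idx][1]

ring = Ring(["osd-1", "osd-2", "osd-3"])
keys = [f"object-{i}" for i in range(1000)]
before = {k: ring.locate(k) for k in keys}

# Expanding the cluster: only keys whose ring segment the new node takes
# over are migrated; there is no central placement table to rebuild.
bigger = Ring(["osd-1", "osd-2", "osd-3", "osd-4"])
moved = sum(1 for k in keys if bigger.locate(k) != before[k])
```

CRUSH itself additionally models replica counts and failure domains (hosts, racks, rows), which this toy ring omits.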
HDFS (Hadoop Distributed File System) is a distributed file system designed to run on commodity hardware. It has much in common with existing distributed file systems, but the differences from other distributed file systems are also apparent. HDFS is a fault-tolerant system designed to be deployed on inexpensive machines. It provides high-throughput access to application data and is well suited to applications with very large datasets. HDFS relaxes part of the POSIX constraints in order to achieve the goal of streaming reads of file system data. HDFS was originally developed as the infrastructure of the Apache Nutch search engine project and is part of the Apache Hadoop Core project.
The large-scale distributed shared storage selected by the embodiments of the present invention can be conveniently built and accessed in a large-scale cluster, so the cluster network of the embodiments of the present invention can also very easily access the large-scale distributed shared storage.
To facilitate the training of machine-learning models, embodiments of the present invention provide a plurality of first physical machines forming a first cluster, which may be called a large-scale distributed shared storage cluster, a large-scale shared storage cluster, or the like. The first cluster includes a large number of first physical machines (which may also be called processors) and can provide shared storage, in particular distributed shared storage. Embodiments of the present invention also provide a plurality of second physical machines forming a second cluster, which may be called a large-scale virtual machine cluster or the like. The second cluster includes a large number of second physical machines (which may also be called processors) and can provide virtual machines.
It should be noted that embodiments of the present invention place no limitation on the number of physical machines (first physical machines or second physical machines) constituting a cluster (the first cluster or the second cluster); for example, the order of magnitude may be hundreds, thousands, tens of thousands, or even more.
Fig. 1 is an exemplary block diagram of the system of shared storage for model training according to an embodiment of the present invention. The system 10 shown in Fig. 1 includes: a first cluster 110, a second cluster 120 and a dispatching platform 130.
A plurality of first physical machines form the first cluster 110 and provide distributed shared storage;
A plurality of second physical machines form the second cluster 120 and provide virtual machines;
The dispatching platform 130 schedules the second cluster 120 so that the distributed shared storage is mounted on the virtual machines.
Illustratively, the first cluster may be a large-scale shared cluster and may use any existing shared storage scheme. Embodiments of the present invention use a stable and efficient large-scale storage scheme, in particular a distributed shared storage scheme such as Ceph or HDFS. The shared storage scheme adopted by embodiments of the present invention can be conveniently built and accessed in a large-scale cluster.
Illustratively, the second cluster may be a large-scale virtual cluster. Embodiments of the present invention place no limitation on the virtualization scheme used in the second cluster; for example, it may be based on the Kernel-based Virtual Machine (KVM), or on KVM-QEMU (Quick Emulator), etc. The second cluster includes a large number of second physical machines, every one of which has built-in virtualization support.
In an embodiment of the present invention, the dispatching platform 130 selects several second physical machines from the second cluster 120 and sends a scheduling request to the several second physical machines. After receiving the scheduling request, the several second physical machines in the second cluster 120 start virtual machines and allocate hardware resources to their respective virtual machines, whereupon the distributed shared storage provided by the first cluster 110 is mounted on the virtual machines started by the several second physical machines. The dispatching platform 130 may also be called a scheduler.
Illustratively, the system may also include a control centre, which receives user requests and sends instructions to the dispatching platform 130; the dispatching platform 130 then schedules the second cluster 120 after receiving the instruction.
Referring to Fig. 2, the control centre 140, the dispatching platform 130 and the second cluster 120 are shown, where the second cluster 120 includes a plurality of second physical machines 1201.
The control centre 140 can receive and handle user requests and then send instructions to the dispatching platform 130. Upon receiving an instruction, the dispatching platform 130 can select one or several second physical machines from the second cluster 120 and send a scheduling request to that machine or those machines. After receiving the scheduling request, these second physical machines start their respective virtual machines and allocate hardware resources to them; then, through access to the large-scale storage, the shared storage provided by the first cluster is mounted on these virtual machines.
The user request may include a request to create and enter a virtual machine, to carry out model training using a virtual machine, and so on. It should be understood that the user request may also include other requests for creating virtual machines; the present invention does not limit this.
The dispatching platform 130 can select one or several second physical machines according to the situation of each second physical machine in the second cluster 120, for example according to the available resources of each second physical machine, the machine topology formed by the second physical machines, the machine configuration, and so on.
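The request flow just described (the control centre receives the user request, the dispatching platform selects second physical machines, and each selected machine starts a virtual machine, allocates hardware and mounts the shared storage) can be sketched as follows. Every class and method name here is an illustrative invention of ours, not part of the patented system, and the mount is modelled as a plain record rather than a real filesystem operation.

```python
class PhysicalMachine:
    """Stands in for one second physical machine in the second cluster."""

    def __init__(self, name, cpus):
        self.name, self.free_cpus = name, cpus
        self.vms = []

    def handle_scheduling_request(self, vm_cpus, storage_path):
        """Start a VM, allocate hardware, and record the shared-storage mount."""
        self.free_cpus -= vm_cpus
        vm = {"cpus": vm_cpus, "mounts": [storage_path]}
        self.vms.append(vm)
        return vm

class DispatchingPlatform:
    def __init__(self, second_cluster, storage_path):
        self.cluster = second_cluster
        self.storage_path = storage_path  # shared storage from the first cluster

    def dispatch(self, vm_cpus, count):
        # Pick machines with enough free resources, then request VMs.
        chosen = [m for m in self.cluster if m.free_cpus >= vm_cpus][:count]
        return [m.handle_scheduling_request(vm_cpus, self.storage_path)
                for m in chosen]

class ControlCentre:
    def __init__(self, platform):
        self.platform = platform

    def handle_user_request(self, vm_cpus, count):
        # A user request is turned into an instruction to the platform.
        return self.platform.dispatch(vm_cpus, count)

cluster = [PhysicalMachine(f"pm-{i}", cpus=8) for i in range(4)]
centre = ControlCentre(DispatchingPlatform(cluster, "/mnt/shared"))
vms = centre.handle_user_request(vm_cpus=4, count=2)
```

In a real deployment the "mount" step would be an actual NFS or RBD mount inside the started VM; here it is only bookkeeping to show the flow of control.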
Illustratively, an agent can be set up on every second physical machine 1201 in the second cluster 120; the agent obtains the virtual machine information of the second physical machine on which it resides and sends the acquired virtual machine information to the dispatching platform 130. The dispatching platform 130 can then schedule according to the virtual machine information of each second physical machine 1201.
The virtual machine information may include: the machine state, the physical resources available to virtual machines, and so on. The dispatching platform 130 may use a scheduling algorithm to select second physical machines. For example, the dispatching platform 130 may first filter out the second physical machines that do not satisfy the virtual machine's requirements, then compute a weight for each remaining second physical machine, and select the one or several second physical machines with the larger computed weights. It should be understood that the dispatching platform 130 can perform high-performance scheduling over the large-scale cluster (i.e., the second cluster) using a scheduling algorithm, but embodiments of the present invention place no limitation on the scheduling algorithm used; for example, a more efficient scheduling algorithm may be employed.
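The filter-then-weigh scheduling step described above can be sketched as follows. The weight formula (a sum of normalised free resources) is our own assumption, since the embodiment deliberately leaves the scheduling algorithm open; the field names are likewise illustrative.

```python
def schedule(machines, required_cpus, required_mem_gb, count=1):
    """Filter out machines that cannot host the VM, weigh the rest,
    and return the `count` machines with the largest weights."""
    feasible = [m for m in machines
                if m["free_cpus"] >= required_cpus
                and m["free_mem_gb"] >= required_mem_gb]
    # Illustrative weight: normalised free CPU plus normalised free memory,
    # so emptier machines are preferred.
    def weight(m):
        return m["free_cpus"] / 64 + m["free_mem_gb"] / 512
    return sorted(feasible, key=weight, reverse=True)[:count]

machines = [
    {"name": "pm-1", "free_cpus": 2,  "free_mem_gb": 16},
    {"name": "pm-2", "free_cpus": 32, "free_mem_gb": 256},
    {"name": "pm-3", "free_cpus": 16, "free_mem_gb": 64},
]
picked = schedule(machines, required_cpus=8, required_mem_gb=32, count=1)
```

Here `pm-1` is filtered out for insufficient resources and `pm-2` wins the weight comparison; a production scheduler could swap in any other weighting (e.g. bin-packing or topology-aware scores) without changing the filter/weigh structure.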
In this way, through the virtualization scheme shown in Fig. 2, the system of the embodiments of the present invention can mount the same shared storage of the first cluster on multiple virtual machine devices, thereby realizing efficient development of machine learning over shared storage. Moreover, the embodiments of the present invention build the shared storage by means of the virtualization scheme, which is simple and convenient to realize and very fast.
In order to mount the same shared storage on multiple virtual machine devices, the storage needs to support multi-write and multi-read.
As one implementation, a single storage in the first cluster that supports multi-write and multi-read can be chosen, and the shared storage can then be mounted directly on multiple virtual machine devices. In this implementation, the first cluster can be constructed very conveniently and quickly.
Specifically, in this case it is only necessary to mount the shared storage directly on multiple virtual machines. However, since the stability of such storage is not good, its performance may also become a bottleneck, so this implementation may require the selection of a suitable storage backend. Because storage currently available on the market can hardly satisfy this requirement, constructing the first cluster would require considerable manpower for procurement and may lead to excessively high costs. In addition, even if this implementation achieves a shared storage that satisfies multi-write and multi-read, bottlenecks of performance and stability may still exist.
As another implementation, a single storage in the first cluster can be chosen even if it does not itself support multi-write and multi-read; for example, the block storage of Ceph (RBD, RADOS block device) can be used as the backend. Since RBD does not support multi-write and multi-read, the following scheme can be used so that the shared storage satisfies the need to support multi-write and multi-read.
A certain second physical machine among the several second physical machines is set as the server (master), and the other second physical machines among the several second physical machines serve as worker machines (workers). The server provides a Network File System (NFS) service, and the worker machines mount and use the NFS service. The RBD mounted on the virtual machine of the server serves as the backend of the NFS service.
In this way, the stability of the storage can be fully utilized, the stability of the entire system is ensured, and bottlenecks of performance and stability are avoided.
As an example, suppose the several second physical machines number N; a certain second physical machine among them (labelled M1) serves as the master, while the other N-1 second physical machines are workers.
Specifically, the RBD mounted on a certain virtual machine device is exposed through the NFS server and becomes an NFS service for the other virtual machines to mount and use. The second physical machine that provides NFS may be called the master, and the second physical machines that mount the NFS service may be called workers.
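The master/worker arrangement can be modelled as below: the single-writer block device is attached to exactly one master, and all workers read and write only through the service the master exposes, so the backing device never sees a second writer. This is a toy model under our own naming, not real NFS or RBD code.

```python
class BlockDevice:
    """Stands in for the RBD: attachable by exactly one writer."""

    def __init__(self):
        self.data = {}
        self.writer = None

    def attach_writer(self, who):
        if self.writer is not None:
            raise RuntimeError("RBD does not support multiple writers")
        self.writer = who

class NFSMaster:
    """The one machine that mounts the RBD and exposes it as a service."""

    def __init__(self, rbd):
        rbd.attach_writer(self)
        self.rbd = rbd

    def write(self, path, blob):   # all writes funnel through the master
        self.rbd.data[path] = blob

    def read(self, path):
        return self.rbd.data[path]

class Worker:
    """Mounts the service instead of the device itself."""

    def __init__(self, master):
        self.mount = master

    def put(self, path, blob):
        self.mount.write(path, blob)

    def get(self, path):
        return self.mount.read(path)

rbd = BlockDevice()
master = NFSMaster(rbd)
workers = [Worker(master) for _ in range(3)]
workers[0].put("/data/sample.bin", b"training sample")
```

In the real scheme the funnel is the NFS protocol itself (workers mount the export, the master holds the RBD), but the invariant is the same: many readers and writers at the file level, a single writer at the block level.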
Optionally, the Internet Protocol (IP) address or domain name of the server (master) is fixed, which solves the problem of the master becoming a large single point.
Illustratively, when the state of a certain second physical machine deteriorates such that the normal operation of programs can no longer be guaranteed, a migration is carried out so that another second physical machine becomes the server.
That is, the state of the current server can be monitored continuously; when the state of the current server deteriorates and can no longer satisfy demand, the server can be migrated online. In this way, the performance of the server is always kept in a good state, guaranteeing the stability of the entire system.
Returning to the example above, M1 among the N second physical machines serves as the master and the other N-1 second physical machines serve as workers. When it is detected that the state of M1 has deteriorated and performance can no longer be guaranteed, M2 can be reselected as the master, with M1 and the remaining N-2 second physical machines serving as workers.
The state of the current server may include: the state of its IP address, whether its performance can guarantee the normal operation of programs, and other states related to the performance of the shared storage.
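The online migration described above amounts to a monitoring loop with re-election: keep the current master while it is healthy, and promote the healthiest other machine when it degrades. The sketch below is a hedged illustration; the scalar health scores and the threshold are invented for the example, and a real system would aggregate the IP, program and storage state signals the description lists.

```python
def elect_master(machines, current, healthy_threshold=0.5):
    """Return the master after one monitoring round.

    `machines` maps machine name -> health score in [0, 1]
    (an invented summary of the states listed in the description)."""
    if machines[current] >= healthy_threshold:
        return current                      # master still fit: no migration
    # Promote the healthiest other machine; the old master becomes a worker.
    return max((m for m in machines if m != current), key=machines.get)

machines = {"M1": 0.9, "M2": 0.8, "M3": 0.7}
master = elect_master(machines, "M1")          # M1 remains master
machines["M1"] = 0.2                           # M1's state deteriorates
master_after = elect_master(machines, master)  # M2 takes over
```

Because the master's IP address or domain name is fixed (as described above), such a handover can be made transparent to the workers: the name simply starts resolving to the new machine.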
It can thus be seen that embodiments of the present invention realize a virtualization system based on shared storage with which users can carry out model training. With the support of large-scale clusters, and on the basis of shared storage and virtualization, the efficiency of machine-learning training is greatly improved. Specifically, when a user performs model training, the shared storage is mounted on a virtual machine, and virtualization is then used for efficient and fast computation. In addition, the stability of the storage can be fully utilized, so the system is highly stable and the user experience is improved.
Those of ordinary skill in the art may be aware that the units and algorithm steps described in connection with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are executed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled professionals may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of the present invention.
Although example embodiments have been described herein with reference to the accompanying drawings, it should be understood that the above example embodiments are merely exemplary and are not intended to limit the scope of the present invention thereto. Those of ordinary skill in the art can make various changes and modifications therein without departing from the scope and spirit of the present invention. All such changes and modifications are intended to be included within the scope of the present invention as claimed in the appended claims.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative; for instance, the division of the units is only a division by logical function, and other division manners are possible in actual implementation: multiple units or components may be combined or integrated into another device, or some features may be ignored or not executed.
In the specification provided here, numerous specific details are set forth. It is to be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail so as not to obscure the understanding of this specification.
Similarly, it should be understood that, in order to streamline the present disclosure and aid the understanding of one or more of the various inventive aspects, in the description of exemplary embodiments of the invention the various features of the invention are sometimes grouped together into a single embodiment, figure, or description thereof. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the corresponding claims reflect, the inventive point is that fewer than all features of a single disclosed embodiment can be used to solve the corresponding technical problem. Thus, the claims following the specific embodiments are hereby expressly incorporated into those specific embodiments, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will understand that, except where features are mutually exclusive, any combination may be used to combine all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
Furthermore, those skilled in the art will appreciate that, although some embodiments described herein include certain features included in other embodiments rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the claims, any one of the claimed embodiments can be used in any combination.
Various component embodiments of the invention can be implemented in hardware, or to run on one or more processors Software module realize, or be implemented in a combination thereof.It will be understood by those of skill in the art that can be used in practice Microprocessor or digital signal processor (Digital Signal Processing, DSP) Lai Shixian are implemented according to the present invention The some or all functions of some modules in the article analytical equipment of example.The present invention is also implemented as executing here Some or all program of device (for example, computer program and computer program product) of described method.In this way Realization program of the invention can store on a computer-readable medium, or can have the shape of one or more signal Formula.Such signal can be downloaded from an internet website to obtain, and perhaps be provided on the carrier signal or with any other shape Formula provides.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and ability Field technique personnel can be designed alternative embodiment without departing from the scope of the appended claims.In the claims, Any reference symbol between parentheses should not be configured to limitations on claims.Word "comprising" does not exclude the presence of not Element or step listed in the claims.Word "a" or "an" located in front of the element does not exclude the presence of multiple such Element.The present invention can be by means of including the hardware of several different elements and being come by means of properly programmed computer real It is existing.In the unit claims listing several devices, several in these devices can be through the same hardware branch To embody.The use of word first, second, and third does not indicate any sequence.These words can be explained and be run after fame Claim.
The above description is merely a specific embodiment, or an explanation of a specific embodiment, and the protection scope of the present invention is not limited thereto. Any person skilled in the art can readily conceive of changes or substitutions within the technical scope disclosed by the present invention, and such changes or substitutions shall be covered by the protection scope of the present invention. The protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (11)

1. A system of shared storage for model training, characterized in that the system comprises:
a first cluster formed of a plurality of first physical machines, providing distributed shared storage;
a second cluster formed of a plurality of second physical machines, providing virtual machines; and
a scheduling platform that schedules the second cluster so that the distributed shared storage is mounted on the virtual machines.
2. The system according to claim 1, characterized in that the scheduling platform schedules the second cluster by:
the scheduling platform selecting several second physical machines from the second cluster, and sending a scheduling request to the several second physical machines.
3. The system according to claim 2, characterized in that, after receiving the scheduling request, the several second physical machines in the second cluster start virtual machines and allocate hardware resources to their respective virtual machines.
4. The system according to claim 3, characterized in that the distributed shared storage provided by the first cluster is mounted on the virtual machines started by the several second physical machines.
5. The system according to claim 1, characterized in that an agent is provided on each second physical machine in the second cluster, the agent obtaining virtual machine information of the second physical machine on which it resides and sending the obtained virtual machine information to the scheduling platform.
6. The system according to claim 5, characterized in that the scheduling platform performs scheduling with a scheduling algorithm according to the virtual machine information.
7. The system according to claim 2, characterized in that one of the several second physical machines is set as a server, and the other second physical machines among the several second physical machines serve as worker machines,
wherein the server provides a Network File System (NFS) service, and the worker machines mount and use the NFS service.
8. The system according to claim 7, characterized in that the RBD block storage mounted by the virtual machine of the server serves as the server side of the NFS service.
9. The system according to claim 7, characterized in that the Internet Protocol (IP) address or domain name of the server is fixed.
10. The system according to claim 7, characterized in that, when the state of a certain second physical machine changes such that normal program operation cannot be guaranteed, a migration is performed so that another second physical machine serves as the server.
11. The system according to any one of claims 1 to 10, characterized in that the system further comprises:
a control center that receives a user request and sends an instruction to the scheduling platform;
wherein the scheduling platform schedules the second cluster after receiving the instruction.
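As an illustration outside the claims themselves, the scheduling flow of claims 1 to 4 and the control center of claim 11 can be sketched as follows. This is a minimal sketch under stated assumptions: every class name, method name, and path (`SchedulingPlatform`, `handle_scheduling_request`, `/mnt/distributed-shared-storage`, and so on) is hypothetical and not taken from the disclosure.

```python
# Hypothetical sketch of the claimed scheduling flow; names are illustrative.
from dataclasses import dataclass, field
from typing import List

@dataclass
class SecondPhysicalMachine:
    """A physical machine in the second cluster that can host virtual machines."""
    name: str
    free_cpus: int
    vms: List[str] = field(default_factory=list)

    def handle_scheduling_request(self, vm_name: str, cpus: int, storage_path: str) -> str:
        # On receiving a scheduling request, start a virtual machine, allocate
        # hardware resources to it (claim 3), and mount the distributed shared
        # storage provided by the first cluster on that virtual machine (claim 4).
        assert cpus <= self.free_cpus, "insufficient hardware resources"
        self.free_cpus -= cpus
        self.vms.append(vm_name)
        return f"{vm_name}: mounted {storage_path}"

class SchedulingPlatform:
    def __init__(self, second_cluster: List[SecondPhysicalMachine]):
        self.second_cluster = second_cluster

    def schedule(self, vm_name: str, cpus: int, storage_path: str) -> str:
        # Select a second physical machine from the second cluster and send it
        # a scheduling request (claim 2). Selection here is simply "most free
        # CPUs"; the claims leave the scheduling algorithm open (claim 6).
        machine = max(self.second_cluster, key=lambda m: m.free_cpus)
        return machine.handle_scheduling_request(vm_name, cpus, storage_path)

# A control center would call schedule() after receiving a user request (claim 11).
cluster = [SecondPhysicalMachine("pm-a", 8), SecondPhysicalMachine("pm-b", 16)]
platform = SchedulingPlatform(cluster)
print(platform.schedule("vm-1", 4, "/mnt/distributed-shared-storage"))
```

In this sketch the selection policy is deliberately trivial; per claim 6, any scheduling algorithm driven by the agent-reported virtual machine information could be substituted.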
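Similarly, the server/worker arrangement of claims 7 to 10 (one second physical machine providing the NFS service at a fixed address, the others mounting it, with the server role migrating on failure) can be sketched as follows; the class and attribute names and the address `nfs.cluster.local` are hypothetical, not taken from the disclosure.

```python
# Hypothetical sketch of the NFS server/worker roles of claims 7-10.
class NFSCluster:
    def __init__(self, machines, service_address="nfs.cluster.local"):
        # The service address (IP or domain name) is fixed (claim 9), so
        # worker machines never need to re-point their mounts after migration.
        self.machines = list(machines)
        self.service_address = service_address
        self.healthy = {m: True for m in self.machines}
        self.server = self.machines[0]  # claim 7: one machine acts as the server

    def workers(self):
        # Claim 7: all second physical machines other than the server
        # act as worker machines that mount and use the NFS service.
        return [m for m in self.machines if m != self.server]

    def mark_unhealthy(self, machine):
        self.healthy[machine] = False
        if machine == self.server:
            self.migrate_server()

    def migrate_server(self):
        # Claim 10: when the server's state changes such that normal program
        # operation cannot be guaranteed, another second physical machine
        # takes over as the server.
        candidates = [m for m in self.machines if self.healthy[m]]
        if not candidates:
            raise RuntimeError("no healthy machine left to serve NFS")
        self.server = candidates[0]

cluster7 = NFSCluster(["pm-a", "pm-b", "pm-c"])
assert cluster7.server == "pm-a" and cluster7.workers() == ["pm-b", "pm-c"]
cluster7.mark_unhealthy("pm-a")  # server failure triggers migration
assert cluster7.server == "pm-b"
assert cluster7.service_address == "nfs.cluster.local"  # fixed address survives
```

Per claim 8, the server's virtual machine would back this NFS service with mounted RBD block storage; that storage layer is omitted from the sketch.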
CN201910446352.2A 2019-05-27 2019-05-27 System of shared storage for model training Pending CN110333931A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910446352.2A CN110333931A (en) 2019-05-27 2019-05-27 System of shared storage for model training

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910446352.2A CN110333931A (en) 2019-05-27 2019-05-27 System of shared storage for model training

Publications (1)

Publication Number Publication Date
CN110333931A true CN110333931A (en) 2019-10-15

Family

ID=68140167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910446352.2A Pending CN110333931A (en) 2019-05-27 2019-05-27 System of shared storage for model training

Country Status (1)

Country Link
CN (1) CN110333931A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103856343A (en) * 2012-12-05 2014-06-11 北京华胜天成科技股份有限公司 Method and system for configurating virtual machine network information
CN103853599A (en) * 2014-03-17 2014-06-11 北京京东尚科信息技术有限公司 Extension method of node calculating ability
CN106095527A (en) * 2016-06-07 2016-11-09 国云科技股份有限公司 A kind of storage pool implementation method being applicable to cloud platform virtual machine
CN106878457A (en) * 2017-03-24 2017-06-20 网宿科技股份有限公司 The attached storage method of distributed network and system
CN106919346A (en) * 2017-02-21 2017-07-04 无锡华云数据技术服务有限公司 A kind of shared Storage Virtualization implementation method based on CLVM
CN108234551A (en) * 2016-12-15 2018-06-29 腾讯科技(深圳)有限公司 A kind of data processing method and device
US10091212B2 (en) * 2016-03-04 2018-10-02 BlueTalon, Inc. Policy management, enforcement, and audit for data security

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Hou Lisha: "Cloud Computing and Internet of Things Technology", 31 May 2017, University of Electronic Science and Technology of China Press *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114780272A (en) * 2022-04-18 2022-07-22 北京亚康万玮信息技术股份有限公司 Intelligent fault self-healing scheduling method and device based on shared storage and virtualization
CN114780272B (en) * 2022-04-18 2023-03-17 北京亚康万玮信息技术股份有限公司 Intelligent fault self-healing scheduling method and device based on shared storage and virtualization

Similar Documents

Publication Publication Date Title
EP3270289B1 (en) Container-based multi-tenant computing infrastructure
US11263084B2 (en) Saving program execution state
CN110062924B (en) Capacity reservation for virtualized graphics processing
US10540212B2 (en) Data-locality-aware task scheduling on hyper-converged computing infrastructures
CN105893139B (en) Method and device for providing storage service for tenant in cloud storage environment
US8260840B1 (en) Dynamic scaling of a cluster of computing nodes used for distributed execution of a program
US9898522B2 (en) Distributed storage of aggregated data
CN108351806B (en) Distributed stream-based database triggers
US8972990B2 (en) Providing a seamless transition for resizing virtual machines from a development environment to a production environment
US9591094B2 (en) Caching of machine images
US20170353348A1 (en) APPLICATION RESILIENCY USING APIs
US20160378751A1 (en) Fast query processing in columnar databases with gpus
US9800484B2 (en) Optimizing resource utilization in a networked computing environment
CN104350460A (en) Determining virtual machine placement
CN110333931A (en) System of shared storage for model training
US10230594B2 (en) Intelligently managing pattern contents across multiple racks based on workload and human interaction usage patterns
Srinivasan et al. Google Cloud Platform for Architects: Design and manage powerful cloud solutions
US11323322B2 (en) Non-disruptively merging coordinated timing networks
CN109716280A (en) Flexible rank storage arrangement
US11126371B2 (en) Caching file data within a clustered computing system
Awada Application-Container Orchestration Tools and Platform-as-a-Service Clouds: A Survey
US10565006B2 (en) Platform for analytic applications
US11650809B2 (en) Autonomous and optimized cloning, reinstating, and archiving of an application in a containerized platform
Xiao et al. Co-located Compute and Binary File Storage in Data-Intensive Computing
Brock et al. Multi-cloud application execution and data management

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191015