CN109272116A - A kind of method and device of deep learning - Google Patents
A kind of method and device of deep learning
- Publication number
- CN109272116A CN109272116A CN201811032430.6A CN201811032430A CN109272116A CN 109272116 A CN109272116 A CN 109272116A CN 201811032430 A CN201811032430 A CN 201811032430A CN 109272116 A CN109272116 A CN 109272116A
- Authority
- CN
- China
- Prior art keywords
- training
- model
- deep learning
- container
- role
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5077—Logical partitioning of resources; Management or configuration of virtualized resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/45562—Creating, deleting, cloning virtual machine instances
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/45575—Starting, stopping, suspending or resuming virtual machine instances
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a method of deep learning, comprising: after a deep learning task is received, matching a corresponding container image according to the deep learning framework selected by the user; pulling the corresponding container image onto the specified nodes according to the roles and node information selected by the user, and creating container resources; starting the containers, and executing the logic corresponding to each container's role to train the model. A device of deep learning is also disclosed. The invention enables automated construction of deep learning environments, automated task management, automated deployment of online deep learning inference services, and so on.
Description
Technical field
The present invention relates to communication technology, and in particular to a method and device of deep learning.
Background technique
Machine learning studies how computers can simulate or implement human learning behavior in order to acquire new knowledge or skills, and reorganize existing knowledge structures to continuously improve their own performance. Machine learning involves a series of processes: data selection, data analysis, feature selection, algorithm design, training, verification, testing, and releasing online inference applications. Deep learning is a branch of machine learning.
The process of deep learning generally requires data preparation, data analysis, feature selection, dataset splitting, model design, model training (which further involves preparing the training environment and submitting and monitoring training jobs), model evaluation, and model deployment. Moreover, when the model changes, the online model must be updated in time. The whole deep learning process is complex and involves many links, which makes deep learning cumbersome to use and inefficient.
Summary of the invention
To solve the above technical problems, the present invention provides a method and device of deep learning that enable automated construction of deep learning environments.
To achieve the object of the invention, the present invention provides a method of deep learning, comprising:
after a deep learning task is received, matching a corresponding container image according to the deep learning framework selected by the user, pulling the corresponding container image onto the specified nodes according to the roles and node information selected by the user, and creating container resources;
starting the containers, and executing the logic corresponding to each container's role to train the model.
Further, after executing the corresponding logic according to the role to train the model, the method further includes:
saving the trained model and parameters, and then automatically releasing the container resources.
Further, after executing the corresponding logic according to the role to train the model, the method further includes:
performing model testing using the trained model and parameters, and outputting the test results.
Further, after executing the corresponding logic according to the role to train the model, the method further includes:
pushing the trained model to one or more online servers.
Further, the process of pushing the trained model to one or more online servers includes:
collecting the data generated when users use the application, and adding it to the training data;
performing model training using the training data, and updating the model when the newly trained model tests better than the current model.
A device of deep learning, comprising:
a creation module, configured to, after a deep learning task is received, match a corresponding container image according to the deep learning framework selected by the user, pull the corresponding container image onto the specified nodes according to the roles and node information selected by the user, and create container resources;
a training module, configured to start the containers and execute the logic corresponding to each container's role to train the model.
Further, after executing the corresponding logic according to the role to train the model, the training module further: saves the trained model and parameters, and then automatically releases the container resources.
Further, after executing the corresponding logic according to the role to train the model, the training module further: pushes the trained model to one or more online servers.
Further, the process in which the training module pushes the trained model to one or more online servers includes: collecting the data generated when users use the application, and adding it to the training data; performing model training using the training data, and updating the model when the newly trained model tests better than the current model.
A device of deep learning, comprising a processor and a computer-readable storage medium in which instructions are stored, wherein, when the instructions are executed by the processor, the above method of deep learning is implemented.
The method of this embodiment proposes containerized management of deep learning training environments and automated release of online inference. It enables automated online deployment of trained models, collects data generated during users' online use to further enrich the data set for the next round of training, and, after training is completed, automatically updates and releases the online model, completing an automated service closed loop.
By making many operations in the deep learning process intelligent and automated, the burden on deep learning developers and users is reduced, allowing developers to focus on model design and greatly improving efficiency.
Other features and advantages of the present invention will be set forth in the following description and will in part become apparent from the description or be understood by practicing the invention. The objects and other advantages of the invention can be realized and obtained by the structures particularly pointed out in the description, claims, and drawings.
Detailed description of the invention
The drawings are provided for a further understanding of the technical solution of the present invention and constitute a part of the specification. Together with the embodiments of the application, they serve to explain the technical solution of the present invention and do not limit it.
Fig. 1 is a flowchart of the method of deep learning according to an embodiment of the present invention;
Fig. 2 is a flowchart of the method of deep learning according to an application example of the present invention;
Fig. 3 is a schematic diagram of a device of deep learning according to an embodiment of the present invention.
Specific embodiment
To make the objects, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in detail below with reference to the drawings. It should be noted that, in the absence of conflict, the embodiments in this application and the features in the embodiments may be combined with each other arbitrarily.
The steps shown in the flowcharts of the drawings may be executed in a computer system as a set of computer-executable instructions. Also, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from that herein.
Fig. 1 is a flowchart of the method of deep learning according to an embodiment of the present invention. As shown in Fig. 1, the method of this embodiment may include:
Step 11: after a deep learning task is received, matching a corresponding container image according to the deep learning framework selected by the user, pulling the corresponding container image onto the specified nodes according to the roles and node information selected by the user, and creating container resources;
Step 12: starting the containers, and executing the logic corresponding to each container's role to train the model.
The embodiment of the present invention enables automated construction of deep learning environments, automated task management, automated deployment of online deep learning inference services, and so on. By combining deep learning with Kubernetes, it can build deep learning environments, allocate resources for tasks, manage task states, and release applications.
Kubernetes is a leading distributed-architecture scheme based on container technology. With Kubernetes, applications built on a microservice architecture can be constructed easily, while Kubernetes provides the system architecture with high availability (HA) and strong horizontal scaling capability. Kubernetes can easily implement resource quota mechanisms and resource isolation through Namespaces. It is a general resource management system that can provide unified resource management and scheduling for upper-layer applications, and its introduction brings great advantages in cluster utilization, unified resource management, and data sharing.
Fig. 2 is a flowchart of the method of deep learning according to an application example of the present invention. As shown in Fig. 2, the method of this example may include:
Step 101: Docker images (container images) of common deep learning frameworks, such as TensorFlow, MXNet, and PyTorch, are built into the system.
TensorFlow is the second-generation artificial intelligence learning system developed by Google on the basis of DistBelief; it can feed complex data structures into artificial neural networks for analysis and processing.
Meanwhile, this example also supports users making deep learning images themselves and uploading them to the system; the system stores the images uniformly in a Docker registry (container repository).
Step 102: the user selects the parameters of model training.
The dataset path used for model training, the model file path, the number of nodes for each role in the cluster, and the individual configuration of the nodes are passed to the platform as parameters.
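The parameters described in step 102 can be sketched as a job specification, for example as follows (not from the patent; all field names and values are assumptions for illustration):

```python
# Sketch: packaging the user's training parameters into a job specification.
def build_training_job(framework, dataset_path, model_path, roles):
    """Package a training-job specification as a plain dict.

    `roles` maps a role name (e.g. "ps", "worker") to its node count and
    per-node configuration.
    """
    return {
        "framework": framework,
        "dataset_path": dataset_path,
        "model_path": model_path,
        "roles": roles,
    }

job = build_training_job(
    framework="tensorflow",
    dataset_path="/data/mnist",
    model_path="/models/mnist-cnn",
    roles={
        "ps":     {"nodes": 1, "config": {"cpu": 4, "memory": "8Gi"}},
        "worker": {"nodes": 2, "config": {"cpu": 8, "memory": "16Gi", "gpu": 1}},
    },
)
```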
Step 103: after the platform receives the relevant parameters, it determines the type of the submitted task, for example whether it uses the Caffe or the TensorFlow deep learning framework. It matches the corresponding container image from the Docker registry according to the framework needed, pulls the corresponding container image in the Docker registry onto the specified nodes according to the roles and node information selected by the user, and creates the container resources.
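The framework-to-image matching in step 103 might look like the following sketch (not from the patent; the registry address and image tags are hypothetical):

```python
# Sketch: resolving the user's chosen framework to a container image
# reference in the registry. Registry host and tags are made up.
REGISTRY = "registry.example.com"

FRAMEWORK_IMAGES = {
    "tensorflow": "dl/tensorflow:1.12-gpu",
    "mxnet":      "dl/mxnet:1.3",
    "pytorch":    "dl/pytorch:1.0",
    "caffe":      "dl/caffe:1.0",
}

def match_image(framework):
    """Return the fully qualified image reference for a framework."""
    try:
        return f"{REGISTRY}/{FRAMEWORK_IMAGES[framework.lower()]}"
    except KeyError:
        raise ValueError(f"unsupported deep learning framework: {framework}")
```

The platform would then instruct the specified nodes to pull the resolved image before creating the containers.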
Step 104: after the container resources are created, the containers are started automatically.
Inside each container, according to the parameters passed in step 102, the container determines whether its role is Parameter Server, Worker, or another role, so that different roles execute different logic to train the model.
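The role dispatch of step 104 can be sketched as follows (not from the patent): each container reads its role from the environment at start-up and branches accordingly. The environment variable names are assumptions.

```python
# Sketch: per-container role dispatch. A real platform might inject the
# role via environment variables when creating each container.
import os

def run_role(env=None):
    """Dispatch to parameter-server or worker logic based on the role."""
    env = env if env is not None else os.environ
    role = env.get("JOB_ROLE", "worker")
    index = int(env.get("TASK_INDEX", "0"))
    if role == "ps":
        # A parameter server holds shards of the model variables and
        # applies gradient updates sent by workers.
        return f"parameter server {index}: serving model shards"
    elif role == "worker":
        # A worker reads training data and computes gradients.
        return f"worker {index}: running training steps"
    raise ValueError(f"unknown role: {role}")
```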
Step 105: after the task is completed, the trained model, parameters, etc. are saved automatically, and the container resources are released automatically afterwards.
Step 106: the trained model and parameters are used to test the model, and metrics such as precision and recall are output.
Step 107: later, as needed, the model can be pushed automatically to multiple online servers.
The server front end uses HAProxy for load balancing and provides an API service. During online inference, the related data submitted by users is collected and labeled.
HAProxy is free, open-source load-balancing software written in the C language.
The new data collected in step 107 can later be used for this model by repeating the actions of step 102 to step 106, and the model can also be modified appropriately at this time. When the newly trained model tests better than the previous model, the model deployed in step 107 can be updated.
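The closed-loop update decision described above (only replace the deployed model when the new one tests better) can be sketched as follows (not from the patent; the single-number "metric" comparison is an assumption, whereas a real platform might compare precision and recall jointly):

```python
# Sketch: replace the deployed model only when the candidate scores better
# on the model test of step 106.
def maybe_update(deployed, candidate):
    """Return the model that should stay deployed.

    Each model is a dict with at least a "metric" entry holding its
    test score (higher is better).
    """
    if candidate["metric"] > deployed["metric"]:
        return candidate
    return deployed

current = {"name": "model-v1", "metric": 0.91}
newer = {"name": "model-v2", "metric": 0.94}
```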
The deep learning scheme of the embodiment of the present invention proposes containerized management of deep learning training environments and automated release of online inference. It makes many operations in the deep learning process intelligent and automated, reduces the burden on deep learning developers and users, allows developers to focus on model design, and greatly improves efficiency.
Fig. 3 is a schematic diagram of a device of deep learning according to an embodiment of the present invention. As shown in Fig. 3, the device 300 of this embodiment comprises:
a creation module 301, configured to, after a deep learning task is received, match a corresponding container image according to the deep learning framework selected by the user, pull the corresponding container image onto the specified nodes according to the roles and node information selected by the user, and create container resources;
a training module 302, configured to start the containers and execute the logic corresponding to each container's role to train the model.
With the device of this embodiment, when a training task is submitted, resources can be allocated and the training environment created automatically, improving the degree of automation of job submission and the efficiency of deep learning model training.
In one embodiment, after executing the corresponding logic according to the role to train the model, the training module 302 may further: save the trained model and parameters, and then automatically release the container resources, so that resources can be recovered automatically after the task is completed.
In one embodiment, after executing the corresponding logic according to the role to train the model, the training module 302 further: pushes the trained model to one or more online servers, automating the model deployment process.
In one embodiment, the process in which the training module 302 pushes the trained model to one or more online servers includes: collecting the data generated when users use the application, and adding it to the training data; performing model training using the training data, and updating the model when the newly trained model tests better than the current model.
The device of this embodiment can automate the overly tedious and complicated steps in the whole traditional deep learning application process and, by collecting online data, continuously update and release the model, finally achieving a better model application effect.
The embodiment of the present invention also provides a device of deep learning, comprising a processor and a computer-readable storage medium in which instructions are stored, wherein, when the instructions are executed by the processor, the above method of deep learning is implemented.
The embodiment of the present invention also provides a computer-readable storage medium storing computer-executable instructions which, when executed, implement the above method of deep learning.
Those skilled in the art will appreciate that all or some of the steps in the methods disclosed above, and the functional modules/units in the systems and devices, may be implemented as software, firmware, hardware, or suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be executed by several physical components in cooperation. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage medium includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
Claims (10)
1. A method of deep learning, characterized by comprising:
after a deep learning task is received, matching a corresponding container image according to the deep learning framework selected by the user, pulling the corresponding container image onto the specified nodes according to the roles and node information selected by the user, and creating container resources;
starting the containers, and executing the logic corresponding to each container's role to train the model.
2. The method according to claim 1, characterized in that, after executing the corresponding logic according to the role to train the model, the method further comprises:
saving the trained model and parameters, and then automatically releasing the container resources.
3. The method according to claim 1, characterized in that, after executing the corresponding logic according to the role to train the model, the method further comprises:
performing model testing using the trained model and parameters, and outputting the test results.
4. The method according to claim 1, characterized in that, after executing the corresponding logic according to the role to train the model, the method further comprises:
pushing the trained model to one or more online servers.
5. The method according to claim 1, characterized in that the process of pushing the trained model to one or more online servers comprises:
collecting the data generated when users use the application, and adding it to the training data;
performing model training using the training data, and updating the model when the newly trained model tests better than the current model.
6. A device of deep learning, characterized by comprising:
a creation module, configured to, after a deep learning task is received, match a corresponding container image according to the deep learning framework selected by the user, pull the corresponding container image onto the specified nodes according to the roles and node information selected by the user, and create container resources;
a training module, configured to start the containers and execute the logic corresponding to each container's role to train the model.
7. The device according to claim 6, characterized in that:
the training module, after executing the corresponding logic according to the role to train the model, further: saves the trained model and parameters, and then automatically releases the container resources.
8. The device according to claim 6, characterized in that:
the training module, after executing the corresponding logic according to the role to train the model, further: pushes the trained model to one or more online servers.
9. The device according to claim 6, characterized in that:
the process in which the training module pushes the trained model to one or more online servers comprises: collecting the data generated when users use the application, and adding it to the training data; performing model training using the training data, and updating the model when the newly trained model tests better than the current model.
10. A device of deep learning, comprising a processor and a computer-readable storage medium in which instructions are stored, characterized in that, when the instructions are executed by the processor, the above method of deep learning is implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811032430.6A CN109272116A (en) | 2018-09-05 | 2018-09-05 | A kind of method and device of deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811032430.6A CN109272116A (en) | 2018-09-05 | 2018-09-05 | A kind of method and device of deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109272116A true CN109272116A (en) | 2019-01-25 |
Family
ID=65187217
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811032430.6A Pending CN109272116A (en) | 2018-09-05 | 2018-09-05 | A kind of method and device of deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109272116A (en) |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109885389A (en) * | 2019-02-19 | 2019-06-14 | 山东浪潮云信息技术有限公司 | A kind of parallel deep learning scheduling training method and system based on container |
CN110414687A (en) * | 2019-07-12 | 2019-11-05 | 苏州浪潮智能科技有限公司 | A kind of method and apparatus for the training of deep learning frame distribution |
CN110780987A (en) * | 2019-10-30 | 2020-02-11 | 上海交通大学 | Deep learning classroom analysis system and method based on container technology |
CN110928553A (en) * | 2019-10-16 | 2020-03-27 | 中国平安人寿保险股份有限公司 | Deployment method, device and system of deep learning model |
CN110941421A (en) * | 2019-11-29 | 2020-03-31 | 广西电网有限责任公司 | Development machine learning device and using method thereof |
CN111190690A (en) * | 2019-12-25 | 2020-05-22 | 中科曙光国际信息产业有限公司 | Intelligent training device based on container arrangement tool |
CN111461332A (en) * | 2020-03-24 | 2020-07-28 | 北京五八信息技术有限公司 | Deep learning model online reasoning method and device, electronic equipment and storage medium |
CN111629061A (en) * | 2020-05-28 | 2020-09-04 | 苏州浪潮智能科技有限公司 | Inference service system based on Kubernetes |
WO2020186899A1 (en) * | 2019-03-19 | 2020-09-24 | 华为技术有限公司 | Method and apparatus for extracting metadata in machine learning training process |
CN112148419A (en) * | 2019-06-28 | 2020-12-29 | 杭州海康威视数字技术股份有限公司 | Mirror image management method, device and system in cloud platform and storage medium |
CN112364897A (en) * | 2020-10-27 | 2021-02-12 | 曙光信息产业(北京)有限公司 | Distributed training method and device, storage medium and electronic equipment |
CN112579303A (en) * | 2020-12-30 | 2021-03-30 | 苏州浪潮智能科技有限公司 | Method and equipment for allocating deep learning development platform resources |
CN112633501A (en) * | 2020-12-25 | 2021-04-09 | 深圳晶泰科技有限公司 | Development method and system of machine learning model framework based on containerization technology |
CN112700004A (en) * | 2020-12-25 | 2021-04-23 | 南方电网深圳数字电网研究院有限公司 | Deep learning model training method and device based on container technology and storage medium |
CN112862098A (en) * | 2021-02-10 | 2021-05-28 | 杭州幻方人工智能基础研究有限公司 | Method and system for processing cluster training task |
CN113241056A (en) * | 2021-04-26 | 2021-08-10 | 标贝(北京)科技有限公司 | Method, device, system and medium for training speech synthesis model and speech synthesis |
WO2022012305A1 (en) * | 2020-07-13 | 2022-01-20 | 华为技术有限公司 | Method and apparatus for managing model file in inference application |
WO2022134001A1 (en) * | 2020-12-25 | 2022-06-30 | 深圳晶泰科技有限公司 | Machine learning model framework development method and system based on containerization technology |
CN114791856A (en) * | 2022-06-27 | 2022-07-26 | 北京瑞莱智慧科技有限公司 | K8 s-based distributed training task processing method, related equipment and medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104714852A (en) * | 2015-03-17 | 2015-06-17 | 华中科技大学 | Parameter synchronization optimization method and system suitable for distributed machine learning |
CN106529682A (en) * | 2016-10-28 | 2017-03-22 | 北京奇虎科技有限公司 | Method and apparatus for processing deep learning task in big-data cluster |
CN107135257A (en) * | 2017-04-28 | 2017-09-05 | 东方网力科技股份有限公司 | Task is distributed in a kind of node cluster method, node and system |
CN107370796A (en) * | 2017-06-30 | 2017-11-21 | 香港红鸟科技股份有限公司 | A kind of intelligent learning system based on Hyper TF |
CN107733977A (en) * | 2017-08-31 | 2018-02-23 | 北京百度网讯科技有限公司 | A kind of cluster management method and device based on Docker |
CN107766940A (en) * | 2017-11-20 | 2018-03-06 | 北京百度网讯科技有限公司 | Method and apparatus for generation model |
CN108062246A (en) * | 2018-01-25 | 2018-05-22 | 北京百度网讯科技有限公司 | For the resource regulating method and device of deep learning frame |
- 2018-09-05: application CN201811032430.6A filed in China (CN); publication CN109272116A; legal status: Pending
Non-Patent Citations (1)
Title |
---|
Lin Guifang (林桂芳): "Research and Implementation of a Heterogeneous TensorFlow Architecture" (异构化TensorFlow架构的研究与实现), China Master's Theses Full-text Database, Information Science and Technology Series * |
Cited By (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109885389A (en) * | 2019-02-19 | 2019-06-14 | 山东浪潮云信息技术有限公司 | A kind of parallel deep learning scheduling training method and system based on container |
WO2020186899A1 (en) * | 2019-03-19 | 2020-09-24 | 华为技术有限公司 | Method and apparatus for extracting metadata in machine learning training process |
CN112148419B (en) * | 2019-06-28 | 2024-01-02 | 杭州海康威视数字技术股份有限公司 | Mirror image management method, device and system in cloud platform and storage medium |
CN112148419A (en) * | 2019-06-28 | 2020-12-29 | 杭州海康威视数字技术股份有限公司 | Mirror image management method, device and system in cloud platform and storage medium |
CN110414687A (en) * | 2019-07-12 | 2019-11-05 | 苏州浪潮智能科技有限公司 | A kind of method and apparatus for the training of deep learning frame distribution |
CN110928553A (en) * | 2019-10-16 | 2020-03-27 | 中国平安人寿保险股份有限公司 | Deployment method, device and system of deep learning model |
CN110780987A (en) * | 2019-10-30 | 2020-02-11 | 上海交通大学 | Deep learning classroom analysis system and method based on container technology |
CN110941421A (en) * | 2019-11-29 | 2020-03-31 | 广西电网有限责任公司 | Development machine learning device and using method thereof |
CN111190690A (en) * | 2019-12-25 | 2020-05-22 | 中科曙光国际信息产业有限公司 | Intelligent training device based on container arrangement tool |
CN111461332A (en) * | 2020-03-24 | 2020-07-28 | 北京五八信息技术有限公司 | Deep learning model online reasoning method and device, electronic equipment and storage medium |
CN111629061A (en) * | 2020-05-28 | 2020-09-04 | 苏州浪潮智能科技有限公司 | Inference service system based on Kubernetes |
WO2021238251A1 (en) * | 2020-05-28 | 2021-12-02 | 苏州浪潮智能科技有限公司 | Inference service system based on kubernetes |
CN111629061B (en) * | 2020-05-28 | 2023-01-24 | 苏州浪潮智能科技有限公司 | Inference service system based on Kubernetes |
CN114090516A (en) * | 2020-07-13 | 2022-02-25 | 华为技术有限公司 | Management method and device of model file in inference application |
WO2022012305A1 (en) * | 2020-07-13 | 2022-01-20 | 华为技术有限公司 | Method and apparatus for managing model file in inference application |
CN114090516B (en) * | 2020-07-13 | 2023-02-03 | 华为技术有限公司 | Management method and device of model file in inference application |
CN112364897A (en) * | 2020-10-27 | 2021-02-12 | 曙光信息产业(北京)有限公司 | Distributed training method and device, storage medium and electronic equipment |
CN112364897B (en) * | 2020-10-27 | 2024-05-28 | 曙光信息产业(北京)有限公司 | Distributed training method and device, storage medium and electronic equipment |
CN112700004A (en) * | 2020-12-25 | 2021-04-23 | 南方电网深圳数字电网研究院有限公司 | Deep learning model training method and device based on container technology and storage medium |
CN112633501A (en) * | 2020-12-25 | 2021-04-09 | 深圳晶泰科技有限公司 | Development method and system of machine learning model framework based on containerization technology |
WO2022134001A1 (en) * | 2020-12-25 | 2022-06-30 | 深圳晶泰科技有限公司 | Machine learning model framework development method and system based on containerization technology |
CN112579303A (en) * | 2020-12-30 | 2021-03-30 | 苏州浪潮智能科技有限公司 | Method and equipment for allocating deep learning development platform resources |
CN112862098A (en) * | 2021-02-10 | 2021-05-28 | 杭州幻方人工智能基础研究有限公司 | Method and system for processing cluster training task |
CN113241056A (en) * | 2021-04-26 | 2021-08-10 | 标贝(北京)科技有限公司 | Method, device, system and medium for training speech synthesis model and speech synthesis |
CN113241056B (en) * | 2021-04-26 | 2024-03-15 | 标贝(青岛)科技有限公司 | Training and speech synthesis method, device, system and medium for speech synthesis model |
CN114791856A (en) * | 2022-06-27 | 2022-07-26 | 北京瑞莱智慧科技有限公司 | K8 s-based distributed training task processing method, related equipment and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109272116A (en) | A kind of method and device of deep learning | |
CN106528395B (en) | The generation method and device of test case | |
US20210294320A1 (en) | Data acquisition method and apparatus | |
CN105487980B (en) | The method and device that repairing applications are operating abnormally | |
CN111461338A (en) | Intelligent system updating method and device based on digital twin | |
CN108958892A (en) | A kind of method and apparatus creating the container for deep learning operation | |
CN106227652B (en) | Automated testing method and system | |
US9319280B2 (en) | Calculating the effect of an action in a network | |
CN104461693B (en) | Virtual machine update method and system under a kind of desktop cloud computing environment | |
CN106529673A (en) | Deep learning network training method and device based on artificial intelligence | |
CN109062700A (en) | A kind of method for managing resource and server based on distributed system | |
US10223104B2 (en) | Optimizing a build process by scaling build agents based on system need | |
CN108462746A (en) | A kind of container dispositions method and framework based on openstack | |
CN111274036A (en) | Deep learning task scheduling method based on speed prediction | |
CA3056859C (en) | Window parameter configuration method and system, computer-readable media | |
CN108182075A (en) | A kind of program by the automatic escalation target software of socket communication modes | |
CN113836754A (en) | Multi-agent simulation modeling oriented simulation method, device, equipment and medium | |
CN110209574A (en) | A kind of data mining system based on artificial intelligence | |
CN106101213A (en) | Information-distribution type storage method | |
CN107547317A (en) | Virtualize control method, device and the communication system of BAS Broadband Access Server | |
Vilalta et al. | Architecture to deploy and operate a digital twin optical network | |
CN104572941B (en) | Date storage method, device and equipment | |
Troia et al. | Machine learning-assisted planning and provisioning for SDN/NFV-enabled metropolitan networks | |
AU2013351905A1 (en) | An application server and computer readable storage medium for generating project specific configuration data | |
CN107741874A (en) | A kind of GIS clouds virtual machine automatically creates method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2019-01-25 |