CN110032449A - Method and device for optimizing the performance of a GPU server - Google Patents
Method and device for optimizing the performance of a GPU server
- Publication number
- CN110032449A (application number CN201910303999.XA)
- Authority
- CN
- China
- Prior art keywords
- gpu
- server
- gpu server
- deep learning
- response
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
- G06F9/5016—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
Abstract
The invention discloses a method for optimizing the performance of a GPU server, comprising: building a deep learning framework on the GPU server and training with the deep learning framework to obtain a deep learning model; monitoring performance data of the GPU server during training; judging, from the monitored performance data, whether the GPU server is operating abnormally and whether GPU utilization is below a first predetermined value; in response to the GPU server operating abnormally, changing the configuration of the GPU server or of the deep learning model; and in response to GPU utilization being below the first predetermined value, increasing the size of the transmitted data blocks and increasing the number of data-transfer threads. The invention also discloses a computer device and a readable storage medium for optimizing the performance of a GPU server. The proposed method and device monitor the state of each subsystem of the server, find and eliminate bottlenecks, and improve the computing performance of the GPU.
Description
Technical field
The present invention relates to the field of GPU servers and, more particularly, to a method and device for optimizing the performance of a GPU server.
Background art
In recent years, AI technology has made rapid progress in fields such as image recognition, natural language processing, and recommender systems, opening up broad possibilities for commercial deployment. An AI model must first be trained on a large amount of data to reach sufficient accuracy before it can play a role in actual production. Beyond the algorithms themselves, the most important driver of these breakthroughs has been the rapid growth of computing power, in which GPU accelerator cards play a vital role.
Deep learning frameworks are commonly used to help developers quickly implement the development and training of AI models, and may also be deployed in production environments for inference. Caffe (Convolutional Architecture for Fast Feature Embedding) is one widely used deep learning framework. Caffe can run on CPUs or GPUs; in the model training phase, the GPU is currently the most powerful computing unit. To extract the maximum computing performance from the GPU, the CPU, the memory system, the PCIe system, the cooling system, and the other I/O systems must cooperate to keep the GPU in its optimal working state.
Summary of the invention
In view of this, an object of the embodiments of the present invention is to propose a method and device for optimizing the performance of a GPU server that monitor the state of each subsystem of the server, find and eliminate bottlenecks, and improve the computing performance of the GPU.
Based on the above objective, one aspect of the embodiments of the present invention provides a method for optimizing the performance of a GPU server, comprising the following steps: building a deep learning framework on the GPU server and training with the deep learning framework to obtain a deep learning model; monitoring performance data of the GPU server during training; judging, from the monitored performance data, whether the GPU server is operating abnormally and whether GPU utilization is below a first predetermined value; in response to the GPU server operating abnormally, changing the configuration of the GPU server or of the deep learning model; and in response to GPU utilization being below the first predetermined value, increasing the size of the transmitted data blocks and increasing the number of data-transfer threads.
In some embodiments, monitoring performance data of the GPU server during training includes: monitoring the temperature and utilization of the CPU and GPU, disk input/output activity, and the size of the memory cache.
In some embodiments, judging whether the GPU server is operating abnormally includes: detecting whether the CPU or GPU temperature exceeds a second predetermined value.
In some embodiments, changing the configuration of the GPU server or the deep learning model in response to abnormal operation includes: increasing the fan speed in response to detecting that the CPU or GPU temperature exceeds the second predetermined value.
In some embodiments, judging whether the GPU server is operating abnormally includes: detecting whether the test data of the deep learning framework has been fully cached in memory.
In some embodiments, changing the configuration of the GPU server or the deep learning model in response to abnormal operation includes: extending the training time in response to detecting that the test data of the deep learning framework has not been fully cached in memory.
In some embodiments, judging whether the GPU server is operating abnormally includes: detecting whether CPU utilization exceeds a third predetermined value.
In some embodiments, changing the configuration of the GPU server or the deep learning model in response to abnormal operation includes: replacing the CPU in response to detecting that CPU utilization exceeds the third predetermined value.
Another aspect of the embodiments of the present invention provides a computer device, comprising: at least one processor; and a memory storing computer instructions executable on the processor, the instructions, when executed by the processor, implementing the following steps: building a deep learning framework on the GPU server and training with the deep learning framework; monitoring performance data of the GPU server during training; judging, from the monitored performance data, whether the GPU server is operating abnormally and whether GPU utilization is below a first predetermined value; in response to the GPU server operating abnormally, changing the configuration of the GPU server or of the deep learning model; and in response to GPU utilization being below the first predetermined value, increasing the size of the transmitted data blocks and increasing the number of data-transfer threads.
A further aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs the method described above.
The present invention has the following beneficial effects: it monitors the state of each subsystem of the server, finds and eliminates bottlenecks, and improves the computing performance of the GPU.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; those of ordinary skill in the art can obtain further embodiments from these drawings without creative effort.
Fig. 1 is a flow diagram of an embodiment of the method for optimizing the performance of a GPU server provided by the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are further described below with reference to specific embodiments and the accompanying drawings.
It should be noted that, in the embodiments of the present invention, all uses of "first" and "second" serve to distinguish two non-identical entities or parameters that share a name. "First" and "second" are used only for convenience of expression and should not be construed as limiting the embodiments; subsequent embodiments will not repeat this note.
Based on the above objective, a first aspect of the embodiments of the present invention proposes an embodiment of a method for optimizing the performance of a GPU server. Fig. 1 shows a flow diagram of this embodiment. As shown in Fig. 1, the embodiment comprises the following steps:
S1. Build a deep learning framework on the GPU server and train with the deep learning framework to obtain a deep learning model;
S2. Monitor performance data of the GPU server during training;
S3. From the monitored performance data, judge whether the GPU server is operating abnormally and whether GPU utilization is below a first predetermined value;
S4. In response to the GPU server operating abnormally, change the configuration of the GPU server or of the deep learning model; and, in response to GPU utilization being below the first predetermined value, increase the size of the transmitted data blocks and increase the number of data-transfer threads.
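For illustration only (the function name, the metrics dictionary, and the concrete threshold values are assumptions of this sketch, not part of the claims), the decision logic of steps S3 and S4 may be sketched as follows:

```python
# Illustrative constants named after the patent's "predetermined values";
# the concrete numbers are the examples given later in the description.
FIRST_PREDETERMINED_PCT = 95   # GPU utilization floor
SECOND_PREDETERMINED_C = 80    # CPU/GPU temperature ceiling
THIRD_PREDETERMINED_PCT = 90   # per-core CPU utilization ceiling

def plan_actions(metrics: dict) -> list:
    """Map one round of monitored performance data (step S3) to the
    corrective actions of step S4."""
    actions = []
    if max(metrics["cpu_temp_c"], metrics["gpu_temp_c"]) > SECOND_PREDETERMINED_C:
        actions.append("increase fan speed")
    if not metrics["data_fully_cached"]:
        actions.append("extend training time")
    if metrics["cpu_util_pct"] > THIRD_PREDETERMINED_PCT:
        actions.append("replace CPU with higher-spec part")
    if metrics["gpu_util_pct"] < FIRST_PREDETERMINED_PCT:
        actions.append("increase batch size and data-transfer threads")
    return actions
```

Each branch corresponds to one of the embodiments described below.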
A deep learning framework runs on the GPU server. This embodiment uses Caffe as an example; other embodiments may train on the GPU server with other deep learning frameworks.
The performance data of the GPU server monitored during training include: the temperature, utilization, and running frequency of the CPU and GPU; the memory-cache (page cache) status of the training data; disk I/O activity; and real-time memory bandwidth.
In this embodiment, the turbostat tool can be used to monitor the utilization and running frequency of the CPU cores, and an IPMI tool to obtain the CPU temperature; the management tool provided by the GPU vendor monitors GPU utilization and temperature; the free command reports memory usage, including the size of the page cache; the iostat command reports real-time disk I/O, showing whether a process is reading disk data into memory; and the intel-cmt-cat tool observes memory bandwidth in real time.
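As one hedged illustration of the collection step (the parsing function is an assumption of this sketch; in practice the named tools would be invoked periodically and their other fields read the same way), the buff/cache column reported by the Linux `free -b` command can be extracted as follows:

```python
def parse_free_bytes(free_output: str) -> dict:
    """Parse the output of `free -b` and return total, used, and
    buff/cache sizes (in bytes) from the Mem: line."""
    for line in free_output.splitlines():
        if line.startswith("Mem:"):
            fields = line.split()
            # free -b columns: total, used, free, shared, buff/cache, available
            return {
                "total": int(fields[1]),
                "used": int(fields[2]),
                "buff_cache": int(fields[5]),
            }
    raise ValueError("no Mem: line found in free output")
```

Sampling this value over time shows whether the page cache is still growing while the training data is being read from disk.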
In response to the GPU server operating abnormally, changing the configuration of the GPU server or of the deep learning model includes: increasing the fan speed in response to detecting that the CPU or GPU temperature exceeds a second predetermined value. The temperature data of the CPU and GPU indicate whether the cooling system meets the cooling requirements. For example, when the CPU or GPU temperature exceeds the second predetermined value, the fan speed can be increased to improve heat dissipation and keep the CPU and GPU operating normally. The second predetermined value can be set manually, for example to 80 degrees Celsius, or to another value according to actual conditions.
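A minimal form of this check may be sketched as follows (the duty-cycle representation and the step size are assumptions of this sketch; a real server would apply the new speed through its fan controller, e.g. via an IPMI command):

```python
SECOND_PREDETERMINED_C = 80  # example threshold from the description

def adjust_fan(cpu_temp_c: float, gpu_temp_c: float,
               current_duty_pct: int, step_pct: int = 10) -> int:
    """Return the new fan duty cycle: raise it when either the CPU or
    the GPU temperature exceeds the threshold, capped at 100%."""
    if max(cpu_temp_c, gpu_temp_c) > SECOND_PREDETERMINED_C:
        return min(100, current_duty_pct + step_pct)
    return current_duty_pct
```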
In response to the GPU server operating abnormally, changing the configuration of the GPU server or of the deep learning model includes: extending the training time in response to detecting that the test data of the deep learning framework has not been fully cached in memory. When the deep learning framework starts for the first time, whether the test data has been fully cached in memory is judged from disk input/output (I/O) activity and from the size of the memory cache relative to the size of the training data set. Because the CPU fetches data from memory far faster than from disk, caching the training data greatly improves CPU performance and, in turn, GPU performance. When disk I/O has ceased and the cache has stopped growing at a size larger than the training data, the data has been fully cached in memory. If the data has not been fully cached, the training time is extended.
In response to the GPU server operating abnormally, changing the configuration of the GPU server or of the deep learning model includes: replacing the CPU in response to detecting that CPU utilization exceeds a third predetermined value. Whether the CPU specification is adequate is judged from the per-core CPU utilization. If the per-core utilization exceeds the third predetermined value, the CPU is a bottleneck and should be replaced with a higher-specification part. The third predetermined value can also be set manually, for example to 90%, or to another value in other embodiments.
Whether the GPU is a bottleneck is judged from GPU utilization. If GPU utilization is unstable, jumping widely and rarely reaching saturation, that is, GPU utilization is below the first predetermined value, then the GPU is a bottleneck: training data is not being delivered to it in time. GPU performance can then be optimized by increasing the data transmission speed, for example by suitably increasing the size of the transmitted data blocks (the batch size) or by increasing the number of data-transfer threads, which reduces the probability of the GPU idling and improves overall performance. The first predetermined value can also be set manually, for example to 95%.
It is important to note that the steps in the above embodiments of the method for optimizing the performance of a GPU server can be interleaved, replaced, added, or deleted; such reasonable permutations and combinations also belong to the scope of protection of the present invention, which should not be confined to the embodiments described.
Based on the above objective, a second aspect of the embodiments of the present invention proposes a computer device comprising at least one processor and a memory. The memory stores computer instructions executable on the processor; when executed by the processor, the instructions implement the following steps: building a deep learning framework on the GPU server and training with the deep learning framework; monitoring performance data of the GPU server during training; judging, from the monitored performance data, whether the GPU server is operating abnormally and whether GPU utilization is below a first predetermined value; in response to the GPU server operating abnormally, changing the configuration of the GPU server or of the deep learning model; and in response to GPU utilization being below the first predetermined value, increasing the size of the transmitted data blocks and increasing the number of data-transfer threads.
The present invention also provides a computer-readable storage medium for optimizing the performance of a GPU server, storing a computer program which, when executed by a processor, performs the method described above.
Finally, it should be noted that those of ordinary skill in the art will appreciate that all or part of the processes in the above method embodiments can be implemented by a computer program instructing the relevant hardware. The program of the method for optimizing the performance of a GPU server can be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium of the program may be a magnetic disk, an optical disc, a read-only memory (ROM), a random-access memory (RAM), or the like. The embodiments of the computer program can achieve the same or similar effects as any of the foregoing method embodiments.
In addition, the methods disclosed according to the embodiments of the present invention may also be implemented as a computer program executed by a processor, which may be stored in a computer-readable storage medium. When executed by the processor, the computer program performs the above functions defined in the methods disclosed by the embodiments of the present invention.
Furthermore, the above method steps and system units may also be realized with a controller and a computer-readable storage medium storing a computer program that causes the controller to realize the above steps or unit functions.
In addition, it should be appreciated that the computer-readable storage medium (for example, a memory) herein may be volatile memory or non-volatile memory, or may include both. By way of example and not limitation, non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random-access memory (RAM), which can serve as external cache memory. By way of example and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and direct Rambus RAM (DRRAM). The storage devices of the disclosed aspects are intended to comprise, without being limited to, these and other suitable types of memory.
Those skilled in the art will also understand that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as software or hardware depends upon the particular application and the design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
The various illustrative logical blocks, modules, and circuits described in connection with the disclosure herein may be implemented or performed with the following components designed to perform the functions herein: a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, for example a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP, or any other such configuration.
The steps of a method or algorithm described in connection with the disclosure herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM, flash memory, ROM, EPROM, EEPROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC, and the ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
In one or more exemplary designs, the functions may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code. Computer-readable media include both computer storage media and communication media, the latter including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer. By way of example and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer or processor. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using coaxial cable, fiber-optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber-optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
Described above are exemplary embodiments of the present disclosure. It should be noted that many modifications and variations may be made without departing from the scope of the embodiments as defined by the claims. The functions, steps, and/or actions of the method claims in accordance with the disclosed embodiments need not be performed in any particular order. Furthermore, although elements disclosed in the embodiments may be described or claimed in the singular, the plural is also contemplated unless limitation to the singular is explicitly stated. As used herein, the singular form "a" is intended to include the plural form as well, unless the context clearly supports otherwise. It should also be understood that "and/or" as used herein refers to any and all possible combinations of one or more of the associated listed items.
The serial numbers of the disclosed embodiments are for description only and do not represent the relative merits of the embodiments. Those of ordinary skill in the art will appreciate that all or part of the steps of the above embodiments may be completed by hardware, or by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
Those of ordinary skill in the art should understand that the discussion of any of the above embodiments is merely exemplary and is not intended to imply that the scope of the disclosure (including the claims) is limited to these examples. Under the idea of the embodiments of the present invention, the technical features of the above embodiments, or of different embodiments, may also be combined, and many other variations of different aspects of the embodiments exist as described above; they are not provided in detail for the sake of brevity. Therefore, any omission, modification, equivalent replacement, or improvement made within the spirit and principles of the embodiments of the present invention should be included within their scope of protection.
Claims (10)
1. A method for optimizing the performance of a GPU server, comprising:
building a deep learning framework on the GPU server and training with the deep learning framework to obtain a deep learning model;
monitoring performance data of the GPU server during training;
judging, from the monitored performance data, whether the GPU server is operating abnormally and whether GPU utilization is below a first predetermined value;
in response to the GPU server operating abnormally, changing the configuration of the GPU server or of the deep learning model; and
in response to GPU utilization being below the first predetermined value, increasing the size of the transmitted data blocks and increasing the number of data-transfer threads.
2. The method according to claim 1, wherein monitoring performance data of the GPU server during training comprises: monitoring the temperature and utilization of the CPU and GPU, disk input/output activity, and the size of the memory cache.
3. The method according to claim 2, wherein judging whether the GPU server is operating abnormally comprises: detecting whether the CPU or GPU temperature exceeds a second predetermined value.
4. The method according to claim 3, wherein changing the configuration of the GPU server or the deep learning model in response to the GPU server operating abnormally comprises: increasing the fan speed in response to detecting that the CPU or GPU temperature exceeds the second predetermined value.
5. The method according to claim 2, wherein judging whether the GPU server is operating abnormally comprises: detecting whether the test data of the deep learning framework has been fully cached in memory.
6. The method according to claim 5, wherein changing the configuration of the GPU server or the deep learning model in response to the GPU server operating abnormally comprises: extending the training time in response to detecting that the test data of the deep learning framework has not been fully cached in memory.
7. The method according to claim 2, wherein judging whether the GPU server is operating abnormally comprises: detecting whether CPU utilization exceeds a third predetermined value.
8. The method according to claim 7, wherein changing the configuration of the GPU server or the deep learning model in response to the GPU server operating abnormally comprises: replacing the CPU in response to detecting that CPU utilization exceeds the third predetermined value.
9. A computer device, comprising:
at least one processor; and
a memory storing computer instructions executable on the processor, the instructions, when executed by the processor, implementing the following steps:
building a deep learning framework on the GPU server and training with the deep learning framework;
monitoring performance data of the GPU server during training;
judging, from the monitored performance data, whether the GPU server is operating abnormally and whether GPU utilization is below a first predetermined value;
in response to the GPU server operating abnormally, changing the configuration of the GPU server or of the deep learning model; and
in response to GPU utilization being below the first predetermined value, increasing the size of the transmitted data blocks and increasing the number of data-transfer threads.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, performs the method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910303999.XA CN110032449A (en) | 2019-04-16 | 2019-04-16 | Method and device for optimizing the performance of a GPU server |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110032449A true CN110032449A (en) | 2019-07-19 |
Family
ID=67238562
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910303999.XA Pending CN110032449A (en) | 2019-04-16 | 2019-04-16 | Method and device for optimizing the performance of a GPU server |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110032449A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107659609A (en) * | 2017-07-26 | 2018-02-02 | 北京天云融创软件技术有限公司 | A kind of deep learning support platform and deep learning training method based on cloud computing |
CN108415776A (en) * | 2018-03-06 | 2018-08-17 | 华中科技大学 | A kind of memory in distributed data processing system estimates the method with configuration optimization |
CN108446200A (en) * | 2018-02-07 | 2018-08-24 | 福建星瑞格软件有限公司 | Server intelligence O&M method based on big data machine learning and computer equipment |
CN108881446A (en) * | 2018-06-22 | 2018-11-23 | 深源恒际科技有限公司 | A kind of artificial intelligence plateform system based on deep learning |
US20180341856A1 (en) * | 2017-05-24 | 2018-11-29 | International Business Machines Corporation | Balancing memory consumption of multiple graphics processing units in deep learning |
CN109062692A (en) * | 2018-07-24 | 2018-12-21 | 郑州云海信息技术有限公司 | A kind of optimization method and system of recognition of face deep learning training platform |
Non-Patent Citations (5)
Title |
---|
RAMA VELPURI, ANAND ADKOLI: "Oracle8i Backup and Recovery Handbook", 30 September 2001 *
ZHANG YU: "Computer Organization and Architecture", 31 August 2011, China Railway Publishing House *
PAN YI, HE KEKE, YE HUI, LIU HUAFU: "Knowledge Discovery in Data Streams", 31 December 2016, Huazhong University of Science and Technology Press *
DONG SHI: "Research on Key Technologies of Network Traffic Identification Based on Flow Records", 30 October 2017, Scientific and Technical Documentation Press *
CHEN MIN: "Introduction to Cognitive Computing", 30 May 2017, Huazhong University of Science and Technology Press *
Cited By (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112306623A (en) * | 2019-07-31 | 2021-02-02 | 株式会社理光 | Processing method and device for deep learning task and computer readable storage medium |
CN110647423A (en) * | 2019-08-15 | 2020-01-03 | 苏州浪潮智能科技有限公司 | Method, device and readable medium for creating storage volume mirror image based on application |
CN110597465B (en) * | 2019-08-30 | 2023-01-06 | 苏州浪潮智能科技有限公司 | Method and device for improving performance of GPU server and readable medium |
CN110597465A (en) * | 2019-08-30 | 2019-12-20 | 苏州浪潮智能科技有限公司 | Method and device for improving performance of GPU server and readable medium |
CN111090504A (en) * | 2019-10-12 | 2020-05-01 | 苏州浪潮智能科技有限公司 | Method, equipment and medium for realizing timing task based on placemaker |
CN111090504B (en) * | 2019-10-12 | 2022-07-19 | 苏州浪潮智能科技有限公司 | Method, equipment and medium for realizing timing task based on placemaker |
CN111104238A (en) * | 2019-10-30 | 2020-05-05 | 苏州浪潮智能科技有限公司 | CE-based memory diagnosis method, device and medium |
CN111124722A (en) * | 2019-10-30 | 2020-05-08 | 苏州浪潮智能科技有限公司 | Method, equipment and medium for isolating fault memory |
CN111124722B (en) * | 2019-10-30 | 2022-11-29 | 苏州浪潮智能科技有限公司 | Method, equipment and medium for isolating fault memory |
CN111104238B (en) * | 2019-10-30 | 2022-06-03 | 苏州浪潮智能科技有限公司 | CE-based memory diagnosis method, device and medium |
CN112988527A (en) * | 2019-12-13 | 2021-06-18 | 中国电信股份有限公司 | GPU management platform anomaly detection method and device and storage medium |
CN111143148B (en) * | 2019-12-30 | 2023-09-12 | 北京奇艺世纪科技有限公司 | Model parameter determining method, device and storage medium |
CN111143148A (en) * | 2019-12-30 | 2020-05-12 | 北京奇艺世纪科技有限公司 | Model parameter determination method, device and storage medium |
CN111222636A (en) * | 2020-01-07 | 2020-06-02 | 深圳鲲云信息科技有限公司 | Deep learning model conversion method and device, server and storage medium |
CN111222636B (en) * | 2020-01-07 | 2023-06-06 | 深圳鲲云信息科技有限公司 | Deep learning model conversion method, device, server and storage medium |
CN111259939A (en) * | 2020-01-10 | 2020-06-09 | 苏州浪潮智能科技有限公司 | Tuning management method, device, equipment and medium for deep learning model |
CN111259939B (en) * | 2020-01-10 | 2022-06-07 | 苏州浪潮智能科技有限公司 | Tuning management method, device, equipment and medium for deep learning model |
CN111367878A (en) * | 2020-03-16 | 2020-07-03 | 中国银行股份有限公司 | IPFS node monitoring method and device |
CN111367878B (en) * | 2020-03-16 | 2023-08-18 | 中国银行股份有限公司 | IPFS node monitoring method and device |
WO2021208558A1 (en) * | 2020-04-16 | 2021-10-21 | 苏州浪潮智能科技有限公司 | Large deep learning model training method and system, device, and medium |
CN111488987B (en) * | 2020-04-16 | 2022-12-06 | 苏州浪潮智能科技有限公司 | Method, system, equipment and medium for deep learning large model training |
CN111488987A (en) * | 2020-04-16 | 2020-08-04 | 苏州浪潮智能科技有限公司 | Deep learning large model training method, system, equipment and medium |
CN111736463B (en) * | 2020-05-09 | 2023-03-03 | 刘炜 | Adaptive deep learning control method based on operation platform |
CN111736463A (en) * | 2020-05-09 | 2020-10-02 | 刘炜 | Adaptive deep learning control method based on operation platform |
CN112257856A (en) * | 2020-12-18 | 2021-01-22 | 鹏城实验室 | Deep learning framework determination method and device and readable storage medium |
CN112732591A (en) * | 2021-01-15 | 2021-04-30 | 杭州中科先进技术研究院有限公司 | Edge computing framework for cache deep learning |
CN114138449A (en) * | 2021-12-14 | 2022-03-04 | 河南省儿童医院郑州儿童医院 | Rehabilitation training system based on virtual reality |
CN115113714B (en) * | 2022-06-30 | 2023-07-21 | 苏州浪潮智能科技有限公司 | Method, device, equipment and storage medium for controlling dynamic current of high-power supply |
CN115113714A (en) * | 2022-06-30 | 2022-09-27 | 苏州浪潮智能科技有限公司 | High-power supply dynamic current control method, device, equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110032449A (en) | A kind of method and device for the performance optimizing GPU server | |
US11310313B2 (en) | Multi-threaded processing of search responses returned by search peers | |
US11184467B2 (en) | Multi-thread processing of messages | |
Chen et al. | Fedgraph: Federated graph learning with intelligent sampling | |
US11882198B2 (en) | Methods and systems for communicating relevant content | |
TWI738721B (en) | Task scheduling method and device | |
US11093496B1 (en) | Performance-based query plan caching | |
US20110138270A1 (en) | System of Enabling Efficient XML Compression with Streaming Support | |
CN105138679A (en) | Data processing system and method based on distributed caching | |
CN104461929B (en) | Distributed data cache method based on blocker | |
US11237813B1 (en) | Model driven state machine transitions to configure an installation of a software program | |
US10540360B2 (en) | Identifying relationship instances between entities | |
US11297147B2 (en) | Managed data export to a remote network from edge devices | |
CN111104198A (en) | Method, equipment and medium for improving operation efficiency of scanning system plug-in | |
CN111221715A (en) | Method, system, device and medium for dynamically optimizing Caffe performance | |
CN110990148A (en) | Method, device and medium for optimizing storage performance | |
CN116232971A (en) | Communication method and network system based on structured P2P relay network | |
US11836095B2 (en) | Forwarding incoming IO to SCM namespaces | |
US10268375B2 (en) | Methods for proactive prediction of disk failure in the disk maintenance pipeline and devices thereof | |
CN113934767A (en) | Data processing method and device, computer equipment and storage medium | |
WO2022070278A1 (en) | Anomaly determination system, anomaly determination method, and program | |
WO2012124295A1 (en) | Computer system, control system, control method and control program | |
US11146663B2 (en) | Facilitating improved overall performance of remote data facility replication systems | |
WO2021208238A1 (en) | K-truss graph-based storage system cache prefetching method, system, and medium | |
CN116909853A (en) | Monitoring method and system of storage system, electronic equipment and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190719 ||