CN112799924A - Simulation test system and method for cloud storage system storing training data - Google Patents

Simulation test system and method for cloud storage system storing training data

Info

Publication number
CN112799924A
Authority
CN
China
Prior art keywords
training
trained
data
model
storage system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110089178.8A
Other languages
Chinese (zh)
Other versions
CN112799924B (en)
Inventor
余虹建
李锦丰
朱军
李秋庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Juyun Technology Co ltd
Original Assignee
Beijing Juyun Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Juyun Technology Co ltd
Priority to CN202110089178.8A
Publication of CN112799924A
Application granted
Publication of CN112799924B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment, for performance assessment
    • G06F11/3457 Performance evaluation by simulation
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging
    • G06F11/3485 Performance evaluation by tracing or monitoring for I/O devices
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiment of the invention provides a simulation test system and method for a cloud storage system storing training data. The system comprises a training task generator, a training task execution simulator and a cloud storage system simulator. The training task generator is used for obtaining a model to be trained and training parameters and generating a training task. The training task execution simulator is used for simulating a model training server, enabling the simulated model training server to obtain the data to be trained required by the model to be trained from the cloud storage system simulator, load the data to be trained and execute the training task, and for recording the time taken to load the data to be trained and the time required to execute the training task as test result data. The cloud storage system simulator is used for simulating a cloud storage system and at least one virtual server carrying the cloud storage system; all data to be trained are stored in the disk resources, and all or part of the data to be trained are stored in the memory resources. The system enables effective testing of the cloud storage system storing the training data.

Description

Simulation test system and method for cloud storage system storing training data
Technical Field
The invention relates to the technical field of simulation test, in particular to a simulation test system and method for a cloud storage system for storing training data.
Background
As more and more AI services are driven by deep learning, more and more effort goes into training deep learning models. Training a deep learning model often requires a large amount of training data, and during model training the training data is read and written frequently, which places high demands on the equipment that stores it. For this reason, storage systems dedicated to storing training data have been developed.
However, there is currently no effective method for testing the performance of a storage system dedicated to storing training data.
Disclosure of Invention
The embodiment of the invention aims to provide a simulation test system and a simulation test method for a cloud storage system storing training data, so as to realize effective test on the cloud storage system storing the training data.
In order to achieve the above object, an embodiment of the present invention provides a simulation test system for a cloud storage system storing training data, where the simulation test system includes: a training task generator, a training task execution simulator and a cloud storage system simulator;
the training task generator is used for obtaining a model to be trained and training parameters and generating a training task;
the training task execution simulator is used for simulating a model training server, enabling the simulated model training server to obtain data to be trained, required by the model to be trained, from the cloud storage system simulator based on the model to be trained and training parameters corresponding to the training task, loading the data to be trained, and executing the training task; recording the time length for loading the data to be trained and the time length required for executing the training task as test result data;
the cloud storage system simulator is used for simulating a cloud storage system and carrying at least one virtual server of the cloud storage system; each virtual server provides processor resources, memory resources and disk resources required by the cloud storage system; all data to be trained are stored in the disk resources, and all or part of the data to be trained are stored in the memory resources.
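The storage layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `VirtualServer`, the `memory_fraction` parameter, and the choice to cache the first keys in sorted order are hypothetical and do not come from the disclosed embodiment.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualServer:
    """Hypothetical stand-in for one virtual server carrying the simulated storage system."""
    cpu_cores: int
    disk: dict = field(default_factory=dict)    # holds ALL data to be trained
    memory: dict = field(default_factory=dict)  # holds all or part of that data

    def store(self, samples: dict, memory_fraction: float = 0.5) -> None:
        """Write every sample to disk and copy a fraction of them into memory."""
        if not 0.0 <= memory_fraction <= 1.0:
            raise ValueError("memory_fraction must be between 0 and 1")
        self.disk.update(samples)
        keys = sorted(samples)
        # Assumption: the cached subset is simply the first keys in sorted order.
        cached = keys[: int(len(keys) * memory_fraction)]
        self.memory.update({k: samples[k] for k in cached})


server = VirtualServer(cpu_cores=4)
server.store({f"sample-{i}": bytes(8) for i in range(10)}, memory_fraction=0.3)
```

Whatever caching rule is chosen, the invariant to preserve is that every key in memory is also present on disk.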
Further, the simulation test system further includes: a training task scheduler;
the training task generator is specifically used for obtaining a plurality of models to be trained and corresponding training parameters and generating a plurality of training tasks for different models to be trained;
the training task scheduler is used for selecting one scheduling strategy from a plurality of preset task scheduling strategies as the target strategy of the current simulation test, where each task scheduling strategy specifies the number of training tasks to be completed in one simulation test and their completion order; and for selecting, from the plurality of training tasks and according to the target strategy, one or more training tasks as the target training tasks of the current simulation test;
the training task execution simulator is specifically used for simulating a model training server, enabling the simulated model training server to obtain data to be trained, required by a model to be trained, from the cloud storage system simulator according to a task completion sequence in the target strategy and based on the model to be trained and training parameters corresponding to each target training task, loading the data to be trained, and executing the target training task; and recording the time length of each target training task for loading the data to be trained and the time length required for executing the training task, and calculating the average value of the time lengths of each target training task for loading the data to be trained and the average value of the time lengths required for executing the training task as test result data.
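A minimal sketch of the scheduling and averaging described above might look as follows. The `SchedulingPolicy` fields, the two orderings ("fifo" and "reverse"), and the sample timings are hypothetical placeholders; the source states only that a strategy fixes the number of tasks and their completion order.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class SchedulingPolicy:
    """Hypothetical policy: how many tasks to run in one test and in what order."""
    task_count: int
    order: str  # "fifo" or "reverse"; illustrative orderings only


def select_target_tasks(tasks, policy):
    """Pick the target training tasks for this test according to the policy."""
    chosen = list(tasks[: policy.task_count])
    return chosen[::-1] if policy.order == "reverse" else chosen


def summarize(timings):
    """Average the per-task (load_seconds, run_seconds) pairs into test result data."""
    loads, runs = zip(*timings)
    return {"avg_load_s": mean(loads), "avg_run_s": mean(runs)}


policies = [SchedulingPolicy(2, "fifo"), SchedulingPolicy(3, "reverse")]
target_policy = policies[1]  # the scheduler selects one preset policy as the target
targets = select_target_tasks(["task-a", "task-b", "task-c", "task-d"], target_policy)
result = summarize([(1.0, 10.0), (2.0, 20.0), (3.0, 30.0)])  # made-up timings
```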
Further, the training task execution simulator simulates a model training server, and includes: simulating processor resources and I/O resources of a model training server; obtaining data to be trained required by the model to be trained from the cloud storage system simulator through the I/O resource based on the model to be trained and the training parameters corresponding to the training task by the processor resource, loading the data to be trained, and executing the training task; and recording the time length for loading the data to be trained and the time length required for executing the training task as test result data.
Further, the cloud storage system simulator, which simulates a cloud storage system, includes: providing processor resources required by the cloud storage system based on each virtual server, and simulating a data connector and a cache manager of the cloud storage system;
the simulated data connector is used for receiving a data loading request sent by the training task execution simulator and forwarding the data loading request to the cache manager;
the simulated cache manager is used for judging whether the memory resource stores the data to be trained required by the model to be trained, if so, the data to be trained required by the model to be trained is obtained from the memory resource and returned to the training task execution simulator; and if not, obtaining the data to be trained required by the model to be trained from the disk resources and returning the data to be trained to the training task execution simulator.
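The request path through the data connector and cache manager can be illustrated with the following sketch. The class and method names (`DataConnector.handle_request`, `CacheManager.load`) and the hit/miss counters are assumptions added for illustration; the source describes only the forward-then-check-memory-else-disk behavior.

```python
class CacheManager:
    """Hypothetical simulated cache manager: serve from memory on a hit, else from disk."""

    def __init__(self, memory: dict, disk: dict):
        self.memory = memory
        self.disk = disk
        self.hits = 0
        self.misses = 0

    def load(self, key: str) -> bytes:
        if key in self.memory:
            self.hits += 1
            return self.memory[key]
        self.misses += 1
        return self.disk[key]  # all data to be trained is stored on disk


class DataConnector:
    """Hypothetical simulated data connector: receives load requests and forwards them."""

    def __init__(self, cache_manager: CacheManager):
        self.cache_manager = cache_manager

    def handle_request(self, key: str) -> bytes:
        return self.cache_manager.load(key)


disk = {"batch-0": b"aaaa", "batch-1": b"bbbb"}
memory = {"batch-0": disk["batch-0"]}  # only part of the data is cached
connector = DataConnector(CacheManager(memory, disk))
first = connector.handle_request("batch-0")   # served from memory
second = connector.handle_request("batch-1")  # falls back to disk
```

Counting hits and misses is optional, but it makes it easy to relate a measured loading time to the fraction of data held in memory.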
Further, the training task generator obtaining a model to be trained and training parameters includes: obtaining the model structure, the initial model parameters and the training duration of the model to be trained.
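One possible shape for the generated training task is sketched below. The `TrainingTask` and `TrainingTaskGenerator` names, the field types, and the sequential task IDs are hypothetical; the source names only the three inputs (model structure, initial model parameters, training duration).

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class TrainingTask:
    """Hypothetical training task record produced by the generator."""
    task_id: int
    model_structure: str         # e.g. a name or serialized description of the layers
    initial_parameters: dict     # initial model parameters
    training_duration_s: float   # training duration, assumed here to be in seconds


class TrainingTaskGenerator:
    """Wraps a model and its training parameters into a training task."""

    def __init__(self):
        self._ids = count(1)

    def generate(self, model_structure, initial_parameters, training_duration_s):
        if training_duration_s <= 0:
            raise ValueError("training duration must be positive")
        return TrainingTask(next(self._ids), model_structure,
                            dict(initial_parameters), training_duration_s)


generator = TrainingTaskGenerator()
task = generator.generate("two-layer-mlp", {"w": 0.0, "b": 0.0}, training_duration_s=60.0)
```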
In order to achieve the above object, an embodiment of the present invention provides a simulation test method for a cloud storage system storing training data, which is applied to the simulation test system described in the present invention, and the simulation test method includes:
the training task generator obtains a model to be trained and training parameters and generates a training task;
the training task execution simulator simulates a model training server, and enables the simulated model training server to obtain data to be trained, which is needed by the model to be trained, from the cloud storage system simulator based on the model to be trained and training parameters corresponding to the training task, load the data and execute the training task; recording the time length for loading the data to be trained and the time length required for executing the training task as test result data;
the cloud storage system simulator simulates a cloud storage system and at least one virtual server carrying the cloud storage system; each virtual server provides processor resources, memory resources and disk resources required by the cloud storage system; all data to be trained are stored in the disk resources, and all or part of the data to be trained are stored in the memory resources.
Further, the simulation test system further includes: a training task scheduler;
the training task generator obtains a model to be trained and training parameters and generates a training task, and the training task generator comprises the following steps: obtaining a plurality of models to be trained and corresponding training parameters, generating a plurality of training tasks aiming at different models to be trained, and sending the training tasks to a training task scheduler;
the simulation test method further includes: the training task scheduler selects one scheduling strategy from a plurality of preset task scheduling strategies as the target strategy of the current simulation test, where each task scheduling strategy specifies the number of training tasks to be completed in one simulation test and their completion order; and selects, from the plurality of training tasks and according to the target strategy, one or more training tasks as the target training tasks of the current simulation test;
the training task execution simulator simulates a model training server, and enables the simulated model training server to obtain data to be trained, which is needed by the model to be trained, from the cloud storage system simulator based on the model to be trained and training parameters corresponding to the training task, load the data and execute the training task; and recording the time length for loading the data to be trained and the time length required for executing the training task as test result data, wherein the step comprises the following steps of:
simulating a model training server, and enabling the simulated model training server to obtain data to be trained, which is needed by the model to be trained, from the cloud storage system simulator according to a task completion sequence in the target strategy and based on the model to be trained and the training parameters corresponding to each target training task, and load the data to be trained, and execute the target training task; and recording the time length of each target training task for loading the data to be trained and the time length required for executing the training task, and calculating the average value of the time lengths of each target training task for loading the data to be trained and the average value of the time lengths required for executing the training task as test result data.
Further, the step of simulating a model training server by the training task execution simulator includes: simulating processor resources and I/O resources of a model training server; obtaining data to be trained required by the model to be trained from the cloud storage system simulator through the I/O resource based on the model to be trained and the training parameters corresponding to the training task by the processor resource, loading the data to be trained, and executing the training task; and recording the time length for loading the data to be trained and the time length required for executing the training task as test result data.
Further, the step of simulating the cloud storage system by the cloud storage system simulator includes: providing processor resources required by the cloud storage system based on each virtual server, and simulating a data connector and a cache manager of the cloud storage system;
the step in which the training task execution simulator enables the simulated model training server to obtain, from the cloud storage system simulator, and load the data to be trained required by the model to be trained, based on the model to be trained and the training parameters corresponding to the training task, includes:
the training task execution simulator sends a data loading request to the cloud storage system simulator;
the data connector simulated by the cloud storage system simulator receives the data loading request sent by the training task execution simulator and forwards the data loading request to the cache manager;
the simulated cache manager judges whether the memory resource stores the data to be trained required by the model to be trained, if so, the simulated cache manager obtains the data to be trained required by the model to be trained from the memory resource and returns the data to be trained to the training task execution simulator; and if not, obtaining the data to be trained required by the model to be trained from the disk resources and returning the data to be trained to the training task execution simulator.
Further, the training task generator obtaining a model to be trained and training parameters includes: obtaining the model structure, the initial model parameters and the training duration of the model to be trained.
In order to achieve the above object, an embodiment of the present invention provides an electronic device, which includes a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface and the memory communicate with one another through the communication bus;
a memory for storing a computer program;
and the processor is used for realizing the steps of the simulation test method for the cloud storage system storing the training data when executing the program stored in the memory.
In order to achieve the above object, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the simulation testing method for a cloud storage system storing training data are implemented.
In order to achieve the above object, an embodiment of the present invention further provides a computer program product containing instructions, which when run on a computer, causes the computer to perform any of the above steps of the simulation test method for a cloud storage system storing training data.
The embodiment of the invention has the following beneficial effects:
the simulation test system for the cloud storage system storing the training data provided by the embodiment of the invention comprises: a training task generator, a training task execution simulator and a cloud storage system simulator; the training task generator is used for obtaining a model to be trained and training parameters and generating a training task; the training task execution simulator is used for simulating a model training server, enabling the simulated model training server to obtain the data to be trained required by the model to be trained from the cloud storage system simulator based on the model to be trained and the training parameters corresponding to the training task, load the data to be trained, and execute the training task, and for recording the time taken to load the data to be trained and the time required to execute the training task as test result data; the cloud storage system simulator is used for simulating a cloud storage system and at least one virtual server carrying the cloud storage system; each virtual server provides the processor resources, memory resources and disk resources required by the cloud storage system; all data to be trained are stored in the disk resources, and all or part of the data to be trained are stored in the memory resources. The simulation test system provided by the embodiment of the invention enables effective testing of the cloud storage system storing the training data.
Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other embodiments can be obtained by using the drawings without creative efforts.
Fig. 1 is a structural diagram of a simulation test system for a cloud storage system storing training data according to an embodiment of the present invention;
Fig. 2 is another structural diagram of a simulation test system for a cloud storage system storing training data according to an embodiment of the present invention;
Fig. 3 is a flowchart of a simulation test method for a cloud storage system storing training data according to an embodiment of the present invention;
Fig. 4 is an interaction diagram of a simulation test system for a cloud storage system storing training data according to an embodiment of the present invention;
Fig. 5 is a task execution flow diagram of a training task execution simulator provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention discloses a simulation test system for a cloud storage system storing training data, which comprises the following components: a training task generator 110, a training task execution simulator 120, and a cloud storage system simulator 130;
a training task generator 110, configured to obtain a model to be trained and training parameters, and generate a training task;
the training task execution simulator 120 is used for simulating a model training server, enabling the simulated model training server to obtain data to be trained, which is needed by the model to be trained, from the cloud storage system simulator 130 based on the model to be trained and the training parameters corresponding to the training task, loading the data to be trained, and executing the training task; recording the time length for loading the data to be trained and the time length required for executing the training task as test result data;
the cloud storage system simulator 130 is used for simulating a cloud storage system and carrying at least one virtual server of the cloud storage system; each virtual server provides processor resources, memory resources and disk resources required by the cloud storage system; all data to be trained are stored in the disk resources, and all or part of the data to be trained are stored in the memory resources.
By adopting the simulation test system for the cloud storage system storing the training data, which is provided by the embodiment of the invention, the training task generator is used for obtaining the model to be trained and the training parameters and generating the training task; the training task execution simulator is used for simulating a model training server, enabling the simulated model training server to obtain data to be trained, which are needed by the model to be trained, from the cloud storage system simulator based on the model to be trained and the training parameters corresponding to the training task, loading the data to be trained, and executing the training task; recording the time length for loading the data to be trained and the time length required for executing the training task as test result data; the cloud storage system simulator is used for simulating a cloud storage system and carrying at least one virtual server of the cloud storage system; each virtual server provides processor resources, memory resources and disk resources required by the cloud storage system; all data to be trained are stored in the disk resources, and all or part of the data to be trained are stored in the memory resources. The simulation test system provided by the embodiment of the invention realizes effective test of the cloud storage system for storing the training data.
In the embodiment of the invention, the cloud storage system storing the training data can be evaluated according to the time taken to load the data to be trained and the time required to execute the training task, both of which are recorded in the test result data. For example, if the time taken to load the data to be trained is shorter than a preset loading duration and the time required to execute the training task is shorter than a preset task duration, the cloud storage system storing the training data can be judged to perform well; if the loading time is not shorter than the preset loading duration and/or the time required to execute the training task is not shorter than the preset task duration, the performance of the cloud storage system can be judged to be insufficient. The preset loading duration and the preset task duration may be set according to the actual application and are not specifically limited here.
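The evaluation rule in the preceding paragraph reduces to a two-threshold check, sketched below. The function name `evaluate`, the return labels, and the example durations are hypothetical; the preset limits are left to the application, as the text says.

```python
def evaluate(load_s: float, run_s: float, max_load_s: float, max_run_s: float) -> str:
    """Return "good" only if BOTH measured durations are below their preset limits."""
    if load_s < max_load_s and run_s < max_run_s:
        return "good"
    # Reached when either duration is not shorter than its limit (the "and/or" case).
    return "insufficient"


verdict_fast = evaluate(load_s=1.5, run_s=40.0, max_load_s=2.0, max_run_s=60.0)
verdict_slow_load = evaluate(load_s=2.0, run_s=40.0, max_load_s=2.0, max_run_s=60.0)
```

Note that a duration exactly equal to its limit counts as insufficient, matching the "not shorter than" wording.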
The simulation test system and method for the cloud storage system storing training data provided by the invention are described in detail with specific embodiments in the following with reference to the accompanying drawings.
In an embodiment of the present invention, as shown in fig. 2, the simulation test system for a cloud storage system storing training data provided in an embodiment of the present invention may include: a training task generator 110, a training task execution simulator 120, a cloud storage system simulator 130, and a training task scheduler 210;
the training task generator 110 is specifically configured to obtain a plurality of models to be trained and corresponding training parameters, and generate a plurality of training tasks for different models to be trained.
In this embodiment of the present invention, the training task generator 110 obtaining the model to be trained and the training parameters may include: obtaining the model structure, the initial model parameters and the training duration of the model to be trained.
A training task scheduler 210, configured to select one scheduling strategy from a plurality of preset task scheduling strategies as the target strategy of the current simulation test, where each task scheduling strategy specifies the number of training tasks to be completed in one simulation test and their completion order; and to select, from the plurality of training tasks and according to the target strategy, one or more training tasks as the target training tasks of the current simulation test.
A training task execution simulator 120, configured to simulate a model training server, and enable the simulated model training server to, according to the task completion order in the target strategy and based on the model to be trained and the training parameters corresponding to each target training task, obtain the data to be trained required by the model to be trained from the cloud storage system simulator, load the data to be trained, and execute the target training task; and to record, for each target training task, the time taken to load the data to be trained and the time required to execute the training task, and calculate the average loading time and the average execution time over the target training tasks as test result data.
In the embodiment of the present invention, the training task execution simulator 120 simulates a model training server, which may include: simulating processor resources and I/O resources of a model training server; obtaining data to be trained required by the model to be trained from a cloud storage system simulator through I/O resources by processor resources based on the model to be trained and training parameters corresponding to the training task, loading the data to be trained, and executing the training task; and recording the time length for loading the data to be trained and the time length required for executing the training task as test result data.
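As a rough sketch of how simulated processor and I/O resources could yield the two recorded durations, the model below reduces each resource to a single rate. This is an assumption made for illustration: the source does not say how the resources are modeled, and the names `io_bytes_per_s` and `samples_per_s`, as well as treating execution time as the compute phase only, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SimulatedTrainingServer:
    """Hypothetical model training server reduced to one I/O rate and one compute rate."""
    io_bytes_per_s: float  # simulated I/O resource
    samples_per_s: float   # simulated processor resource

    def run(self, dataset_bytes: int, num_samples: int, epochs: int) -> dict:
        """Return the simulated loading time and execution time as test result data."""
        load_s = dataset_bytes / self.io_bytes_per_s
        execute_s = epochs * num_samples / self.samples_per_s
        return {"load_s": load_s, "execute_s": execute_s}


server = SimulatedTrainingServer(io_bytes_per_s=100e6, samples_per_s=500.0)
result = server.run(dataset_bytes=200_000_000, num_samples=1_000, epochs=5)
```

A rate-based model keeps the test deterministic; a wall-clock measurement around real I/O calls is the alternative when the storage simulator actually moves data.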
The cloud storage system simulator 130 is used for simulating a cloud storage system and carrying at least one virtual server of the cloud storage system; each virtual server provides processor resources, memory resources and disk resources required by the cloud storage system; all data to be trained are stored in disk resources, and all or part of the data to be trained are stored in memory resources.
In this embodiment of the present invention, the simulating the cloud storage system by the cloud storage system simulator 130 may include: providing processor resources required by the cloud storage system based on each virtual server, and simulating a data connector 201 and a cache manager 202 of the cloud storage system;
the simulated data connector 201 is used for receiving a data loading request sent by the training task execution simulator and forwarding the data loading request to the cache manager;
the simulated cache manager 202 is used for judging whether the memory resource stores the data to be trained required by the model to be trained; if so, obtaining the data to be trained required by the model to be trained from the memory resource and returning it to the training task execution simulator; and if not, obtaining the data to be trained required by the model to be trained from the disk resources and returning it to the training task execution simulator.
By adopting the simulation test system for the cloud storage system storing the training data, which is provided by the embodiment of the invention, the training task generator is used for obtaining the model to be trained and the training parameters and generating the training task; the training task execution simulator is used for simulating a model training server, enabling the simulated model training server to obtain data to be trained, which are needed by the model to be trained, from the cloud storage system simulator based on the model to be trained and the training parameters corresponding to the training task, loading the data to be trained, and executing the training task; recording the time length for loading the data to be trained and the time length required for executing the training task as test result data; the cloud storage system simulator is used for simulating a cloud storage system and carrying at least one virtual server of the cloud storage system; each virtual server provides processor resources, memory resources and disk resources required by the cloud storage system; all data to be trained are stored in the disk resources, and all or part of the data to be trained are stored in the memory resources. The simulation test system provided by the embodiment of the invention realizes effective test of the cloud storage system for storing the training data.
The embodiment of the invention also discloses a simulation test method for a cloud storage system storing training data, which is applied to the simulation test system described above, the simulation test system comprising a training task generator, a training task execution simulator and a cloud storage system simulator. Fig. 3 is a flowchart of a simulation test method for a cloud storage system storing training data according to an embodiment of the present invention. As shown in Fig. 3, the method may include the following steps:
Step 301: the training task generator obtains the model to be trained and the training parameters, and generates a training task.
Step 302: the training task execution simulator simulates a model training server, and enables the simulated model training server, based on the model to be trained and the training parameters corresponding to the training task, to obtain the data to be trained required by the model to be trained from the cloud storage system simulator, load the data, and execute the training task; the time taken to load the data to be trained and the time required to execute the training task are recorded as test result data.
Step 303: the cloud storage system simulator simulates a cloud storage system and at least one virtual server carrying the cloud storage system; each virtual server provides the processor resources, memory resources and disk resources required by the cloud storage system; all of the data to be trained is stored in the disk resources, and all or part of the data to be trained is stored in the memory resources.
With the method provided by the embodiment of the invention, the training task generator obtains the model to be trained and the training parameters and generates the training task. The training task execution simulator simulates a model training server, so that the simulated model training server, based on the model to be trained and the training parameters corresponding to the training task, obtains the data to be trained required by the model to be trained from the cloud storage system simulator, loads that data, and executes the training task; the time taken to load the data to be trained and the time required to execute the training task are recorded as test result data. The cloud storage system simulator simulates a cloud storage system and at least one virtual server carrying the cloud storage system; each virtual server provides the processor resources, memory resources and disk resources required by the cloud storage system; all of the data to be trained is stored in the disk resources, and all or part of the data to be trained is stored in the memory resources. The simulation test method provided by the embodiment of the invention thereby enables effective testing of a cloud storage system that stores training data.
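The three steps of the method can be sketched end to end as follows. This is only an illustration under stated assumptions: the per-read costs (`mem_us`, `disk_us`), the `compute_us` field, and the choice to model execution time as load time plus compute time are hypothetical and do not come from the embodiment, which does not prescribe a cost model.

```python
from dataclasses import dataclass


@dataclass
class TrainingTask:
    model: str        # model to be trained
    compute_us: int   # simulated compute time in microseconds (assumed parameter)
    sample_ids: list  # data to be trained required by the model


class CloudStorageSimulator:
    """All samples live on disk; a subset is also held in memory."""

    def __init__(self, in_memory, mem_us=100, disk_us=5000):
        self.in_memory = set(in_memory)
        self.mem_us = mem_us    # hypothetical cost of a memory read
        self.disk_us = disk_us  # hypothetical cost of a disk read

    def load_cost_us(self, sample_id):
        return self.mem_us if sample_id in self.in_memory else self.disk_us


def generate_task(model, compute_us, num_samples):
    """Training task generator: build a task from a model and its parameters."""
    return TrainingTask(model, compute_us, list(range(num_samples)))


def execute_task(task, storage):
    """Training task execution simulator: load the data, then 'train'."""
    load_us = sum(storage.load_cost_us(s) for s in task.sample_ids)
    # Assumption: execution time = data loading time + compute time.
    return {"load_us": load_us, "exec_us": load_us + task.compute_us}


storage = CloudStorageSimulator(in_memory=range(4))
task = generate_task("model-A", compute_us=1_000_000, num_samples=10)
result = execute_task(task, storage)
print(result)
```

Using integer microseconds keeps the recorded durations exact and reproducible, which is convenient when comparing runs with different amounts of data held in memory.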
The simulation test system in the embodiment of the invention may further comprise a training task scheduler. Fig. 4 is another flowchart of a simulation test method for a cloud storage system storing training data according to an embodiment of the present invention. As shown in Fig. 4, the method may include the following steps:
Step 401: the training task generator obtains a plurality of models to be trained and the corresponding training parameters, and generates a plurality of training tasks for different models to be trained.
Step 402: the training task generator sends the plurality of models to be trained, the corresponding training parameters, and the generated training tasks for different models to be trained to the training task scheduler.
In the embodiment of the present invention, the obtaining of the model to be trained and the training parameters by the training task generator may include: obtaining the model structure, the initial model parameters and the training duration of the model to be trained. See, for example, Table 1 below.
The training task generator may generate a plurality of training tasks for different models to be trained, for example training task Job1 and training task Job2. The model structure, initial model parameters and training duration of the model to be trained corresponding to training task Job1 are, respectively: ResNet50, a 4-card GPU server, and 50 minutes; those corresponding to training task Job2 are, respectively: VGG16, a 1-card GPU server, and 60 minutes.
Table 1: Models to be trained and training parameters
[Table 1 is provided as an image (Figure BDA0002912068210000111) in the original publication.]
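The two example tasks described for Table 1 can be written down as a small data structure. This is a sketch only: the field names, the `generate_tasks` helper, and the spelling of the model names as ResNet50 and VGG16 are assumptions made for illustration, since the table itself is only available as an image.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingTask:
    name: str              # e.g. "Job1"
    model_structure: str   # structure of the model to be trained
    server: str            # GPU server configuration listed for the task
    training_minutes: int  # training duration


def generate_tasks(specs):
    """Hypothetical generator: one task per (model, server, minutes) spec."""
    return [TrainingTask(f"Job{i}", *spec) for i, spec in enumerate(specs, start=1)]


specs = [
    ("ResNet50", "4-card GPU server", 50),
    ("VGG16", "1-card GPU server", 60),
]
tasks = generate_tasks(specs)
for t in tasks:
    print(t)
```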
Step 403: the training task scheduler receives the plurality of models to be trained and the corresponding training parameters sent by the training task generator, together with the information on the plurality of training tasks for different models to be trained.
Step 404: the training task scheduler selects one scheduling strategy from a plurality of preset task scheduling strategies as the target strategy of the current simulation test, and selects, from the plurality of training tasks and according to the target strategy, one or more training tasks of the current simulation test as target training tasks.
Each task scheduling strategy specifies the number and the completion order of the training tasks to be completed in one simulation test.
For example, the task scheduling policy T1 includes: completing the training task Job1 in Table 1 first, and then completing the training task Job2 in Table 1; the task scheduling policy T2 includes: completing the training task Job2 in Table 1 first, and then completing the training task Job1 in Table 1. If the task scheduling policy T1 is selected as the target task scheduling policy, the training task Job1 and the training task Job2 of the current simulation test may be selected from the plurality of training tasks as the target training tasks according to the task scheduling policy T1.
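A scheduling policy of this kind can be represented as an ordered list of task names. In the sketch below, the second policy is labeled `T2` to distinguish it from the first, and the random choice in `choose_policy` is an assumption, because the description does not state how the target policy is picked from the preset ones.

```python
import random

# Each policy lists the tasks to complete in one test run, in completion order.
POLICIES = {
    "T1": ["Job1", "Job2"],
    "T2": ["Job2", "Job1"],
}


def choose_policy(policies, rng=random):
    """Pick one preset policy as the target policy (random choice is an assumption)."""
    name = rng.choice(sorted(policies))
    return name, policies[name]


def select_target_tasks(order, available):
    """Return the target training tasks in the policy's completion order."""
    return [available[name] for name in order]


# Hypothetical task descriptions keyed by task name.
available = {"Job1": "ResNet50 task", "Job2": "VGG16 task"}
targets = select_target_tasks(POLICIES["T2"], available)
print(targets)
```

Keeping the order inside the policy means the execution simulator only needs to iterate over the returned list.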
Step 405: the training task scheduler sends the target strategy to the training task execution simulator.
Step 406: the training task execution simulator receives the target strategy sent by the training task scheduler.
Step 407: the training task execution simulator simulates a model training server; the simulated model training server, following the task completion order in the target strategy and based on the model to be trained and the training parameters corresponding to each target training task, sends the cloud storage system simulator a request for the data to be trained required by the model to be trained, loads the data, and executes the target training task. The time each target training task takes to load its data to be trained and the time required to execute each training task are recorded, and the average of the loading times and the average of the execution times over the target training tasks are calculated as test result data.
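The averaging described in step 407 amounts to taking the mean of the recorded loading times and the mean of the recorded execution times. A minimal sketch follows; the record layout and the numeric values are hypothetical and serve only to show the calculation.

```python
from statistics import mean

# Hypothetical per-task records produced by one simulation test (seconds).
records = [
    {"task": "Job1", "load_s": 12.0, "exec_s": 3000.0},
    {"task": "Job2", "load_s": 18.0, "exec_s": 3600.0},
]


def summarize(records):
    """Average the data loading time and the execution time over all tasks."""
    return {
        "avg_load_s": mean(r["load_s"] for r in records),
        "avg_exec_s": mean(r["exec_s"] for r in records),
    }


summary = summarize(records)
print(summary)
```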
Specifically, the step of simulating the model training server by the training task execution simulator may include: simulating the processor resources and I/O resources of a model training server; obtaining, by the processor resources through the I/O resources and based on the model to be trained and the training parameters corresponding to the training task, the data to be trained required by the model to be trained from the cloud storage system simulator, loading the data, and executing the training task; and recording the time taken to load the data to be trained and the time required to execute the training task as test result data.
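One simple way to simulate processor and I/O resources is to give each a rate and derive durations from it. The sketch below assumes a bandwidth figure for the I/O resource and a step throughput for the processor resource; both parameters, and the class name `SimulatedTrainingServer`, are hypothetical, as the description does not specify how the resources are modeled.

```python
from dataclasses import dataclass


@dataclass
class SimulatedTrainingServer:
    io_mb_per_s: float  # simulated I/O resource (assumed bandwidth)
    steps_per_s: float  # simulated processor resource (assumed throughput)

    def run(self, data_mb, train_steps):
        # Time to pull the data to be trained through the I/O resource.
        load_s = data_mb / self.io_mb_per_s
        # Time the processor resource spends on the training steps.
        compute_s = train_steps / self.steps_per_s
        # Assumption: execution time covers loading plus computing.
        return {"load_s": load_s, "exec_s": load_s + compute_s}


server = SimulatedTrainingServer(io_mb_per_s=2000.0, steps_per_s=4.0)
result = server.run(data_mb=8000.0, train_steps=1000)
print(result)
```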
Step 408: the cloud storage system simulator receives the request sent by the training task execution simulator.
Step 409: the cloud storage system simulator sends the data to be trained required by the model to be trained to the training task execution simulator.
Step 410: the cloud storage system simulator simulates a cloud storage system and at least one virtual server carrying the cloud storage system; each virtual server provides the processor resources, memory resources and disk resources required by the cloud storage system; all of the data to be trained is stored in the disk resources, and all or part of the data to be trained is stored in the memory resources.
Specifically, the step of simulating the cloud storage system by the cloud storage system simulator may include: providing, based on each virtual server, the processor resources required by the cloud storage system, and simulating a data connector and a cache manager of the cloud storage system.
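The resource layout of a virtual server, with all data on disk and only part of it in memory, can be sketched as follows. The capacity-based rule for deciding what is also kept in memory is an assumption chosen for illustration; the description only states that all or part of the data is stored in the memory resources.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualServer:
    cpu_cores: int        # processor resources
    memory_capacity: int  # memory resources, in bytes
    disk: dict = field(default_factory=dict)    # holds every item
    memory: dict = field(default_factory=dict)  # holds all or part of the items

    def store(self, key, blob):
        # All data to be trained is written to the disk resource.
        self.disk[key] = blob
        # Assumed rule: also keep it in memory while capacity allows.
        used = sum(len(b) for b in self.memory.values())
        if used + len(blob) <= self.memory_capacity:
            self.memory[key] = blob


server = VirtualServer(cpu_cores=4, memory_capacity=5)
for i in range(3):
    server.store(f"sample-{i}", b"xx")  # each hypothetical sample is 2 bytes

print(sorted(server.disk))
print(sorted(server.memory))
```

Varying `memory_capacity` is one way to test how the proportion of data held in memory affects loading time.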
With the method provided by the embodiment of the invention, the training task generator obtains the model to be trained and the training parameters and generates the training task. The training task execution simulator simulates a model training server, so that the simulated model training server, based on the model to be trained and the training parameters corresponding to the training task, obtains the data to be trained required by the model to be trained from the cloud storage system simulator, loads that data, and executes the training task; the time taken to load the data to be trained and the time required to execute the training task are recorded as test result data. The cloud storage system simulator simulates a cloud storage system and at least one virtual server carrying the cloud storage system; each virtual server provides the processor resources, memory resources and disk resources required by the cloud storage system; all of the data to be trained is stored in the disk resources, and all or part of the data to be trained is stored in the memory resources. The simulation test method provided by the embodiment of the invention thereby enables effective testing of a cloud storage system that stores training data.
Fig. 5 shows the task execution flow of the training task execution simulator. As shown in Fig. 5, the step in which the training task execution simulator enables the simulated model training server, based on the model to be trained and the training parameters corresponding to the training task, to obtain and load the data to be trained required by the model to be trained from the cloud storage system simulator may include:
Step 501: the training task execution simulator sends a data loading request to the cloud storage system simulator;
Step 502: the data connector simulated by the cloud storage system simulator receives the data loading request sent by the training task execution simulator and forwards it to the cache manager;
Step 503: the cache manager simulated by the cloud storage system simulator judges whether the data to be trained required by the model to be trained is stored in the memory resource;
Step 504: if the cache manager simulated by the cloud storage system simulator judges that the memory resource stores the data to be trained required by the model to be trained, it obtains that data from the memory resource and returns it to the training task execution simulator;
Step 505: if the cache manager simulated by the cloud storage system simulator judges that the memory resource does not store the data to be trained required by the model to be trained, it obtains that data from the disk resource and returns it to the training task execution simulator.
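Replaying a stream of data loading requests through steps 501 to 505 gives a count of memory hits and disk reads, from which a total loading cost can be derived. The sketch below is illustrative only: the function name `replay`, the unit costs, and the request stream are hypothetical.

```python
def replay(requests, cached_keys, mem_cost, disk_cost):
    """Count memory hits and disk reads for a stream of load requests."""
    hits = sum(1 for key in requests if key in cached_keys)  # step 504 path
    misses = len(requests) - hits                            # step 505 path
    return {
        "hits": hits,
        "misses": misses,
        "total_cost": hits * mem_cost + misses * disk_cost,
    }


# Hypothetical stream: "a" is held in memory, "b" and "c" are on disk only.
stats = replay(["a", "b", "a", "c"], cached_keys={"a"}, mem_cost=1, disk_cost=10)
print(stats)
```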
An embodiment of the present invention further provides an electronic device, as shown in Fig. 6, including a processor 601, a communication interface 602, a memory 603, and a communication bus 604, where the processor 601, the communication interface 602, and the memory 603 communicate with one another through the communication bus 604;
the memory 603 is used for storing a computer program;
the processor 601 is configured to implement the following steps when executing the program stored in the memory 603:
the training task generator obtains a model to be trained and training parameters and generates a training task;
the training task execution simulator simulates a model training server, and enables the simulated model training server, based on the model to be trained and the training parameters corresponding to the training task, to obtain the data to be trained required by the model to be trained from the cloud storage system simulator, load the data and execute the training task; the time taken to load the data to be trained and the time required to execute the training task are recorded as test result data;
the cloud storage system simulator simulates a cloud storage system and at least one virtual server carrying the cloud storage system; each virtual server provides the processor resources, memory resources and disk resources required by the cloud storage system; all of the data to be trained is stored in the disk resources, and all or part of the data to be trained is stored in the memory resources.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In another embodiment of the present invention, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any of the above methods for simulation testing of a cloud storage system storing training data.
In another embodiment, the present invention further provides a computer program product containing instructions which, when run on a computer, cause the computer to execute the simulation test method for a cloud storage system storing training data of any of the above embodiments.
The above embodiments may be implemented wholly or partially in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via a wired connection (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or a wireless connection (e.g., infrared, radio, microwave). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus, the electronic device and the storage medium, since they are substantially similar to the method embodiments, the description is relatively simple, and the relevant points can be referred to the partial description of the method embodiments.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A simulation test system for a cloud storage system storing training data, the simulation test system comprising: a training task generator, a training task execution simulator and a cloud storage system simulator;
the training task generator is used for obtaining a model to be trained and training parameters and generating a training task;
the training task execution simulator is used for simulating a model training server, enabling the simulated model training server to obtain data to be trained, required by the model to be trained, from the cloud storage system simulator based on the model to be trained and training parameters corresponding to the training task, loading the data to be trained, and executing the training task; recording the time length for loading the data to be trained and the time length required for executing the training task as test result data;
the cloud storage system simulator is used for simulating a cloud storage system and at least one virtual server carrying the cloud storage system; each virtual server provides processor resources, memory resources and disk resources required by the cloud storage system; all data to be trained are stored in the disk resources, and all or part of the data to be trained are stored in the memory resources.
2. The system of claim 1, wherein the simulation test system further comprises: a training task scheduler;
the training task generator is specifically used for obtaining a plurality of models to be trained and corresponding training parameters and generating a plurality of training tasks for different models to be trained;
the training task scheduler is used for selecting one scheduling strategy from a plurality of preset task scheduling strategies as a target strategy of the current simulation test, each task scheduling strategy comprising the number and the completion order of the training tasks to be completed in one simulation test; and for selecting, from the plurality of training tasks and according to the target strategy, one or more training tasks of the current simulation test as target training tasks;
the training task execution simulator is specifically used for simulating a model training server, enabling the simulated model training server to obtain data to be trained, required by a model to be trained, from the cloud storage system simulator according to a task completion sequence in the target strategy and based on the model to be trained and training parameters corresponding to each target training task, loading the data to be trained, and executing the target training task; and recording the time length of each target training task for loading the data to be trained and the time length required for executing the training task, and calculating the average value of the time lengths of each target training task for loading the data to be trained and the average value of the time lengths required for executing the training task as test result data.
3. The system of claim 1, wherein the training task execution simulator simulates a model training server, comprising: simulating processor resources and I/O resources of a model training server; obtaining data to be trained required by the model to be trained from the cloud storage system simulator through the I/O resource based on the model to be trained and the training parameters corresponding to the training task by the processor resource, loading the data to be trained, and executing the training task; and recording the time length for loading the data to be trained and the time length required for executing the training task as test result data.
4. The system of claim 1, wherein the cloud storage system simulator, simulating a cloud storage system, comprises: providing processor resources required by the storage system based on each virtual server, and simulating a data connector and a cache manager of the storage system;
the simulated data connector is used for receiving a data loading request sent by the training task execution simulator and forwarding the data loading request to the cache manager;
the simulated cache manager is used for judging whether the memory resource stores the data to be trained required by the model to be trained, if so, the data to be trained required by the model to be trained is obtained from the memory resource and returned to the training task execution simulator; and if not, obtaining the data to be trained required by the model to be trained from the disk resources and returning the data to be trained to the training task execution simulator.
5. The system according to any one of claims 1 to 4, wherein the obtaining of the model to be trained and the training parameters by the training task generator comprises: obtaining the model structure, the initial model parameters and the training duration of the model to be trained.
6. A simulation test method for a cloud storage system storing training data, which is applied to the simulation test system of claim 1, the simulation test method comprising:
the training task generator is used for obtaining a model to be trained and training parameters and generating a training task;
the training task execution simulator simulates a model training server, and enables the simulated model training server to obtain data to be trained, which is needed by the model to be trained, from the cloud storage system simulator based on the model to be trained and training parameters corresponding to the training task, load the data and execute the training task; recording the time length for loading the data to be trained and the time length required for executing the training task as test result data;
the cloud storage system simulator simulates a cloud storage system and at least one virtual server carrying the cloud storage system; each virtual server provides processor resources, memory resources and disk resources required by the cloud storage system; all data to be trained are stored in the disk resources, and all or part of the data to be trained are stored in the memory resources.
7. The method of claim 6, wherein the simulation test system further comprises: a training task scheduler;
the step in which the training task generator obtains a model to be trained and training parameters and generates a training task comprises: obtaining a plurality of models to be trained and corresponding training parameters, generating a plurality of training tasks for different models to be trained, and sending the training tasks to the training task scheduler;
the simulation test method further comprises: the training task scheduler selects one scheduling strategy from a plurality of preset task scheduling strategies as a target strategy of the current simulation test, each task scheduling strategy comprising the number and the completion order of the training tasks to be completed in one simulation test; and selects, from the plurality of training tasks and according to the target strategy, one or more training tasks of the current simulation test as target training tasks;
the step in which the training task execution simulator simulates a model training server, enables the simulated model training server to obtain the data to be trained required by the model to be trained from the cloud storage system simulator based on the model to be trained and the training parameters corresponding to the training task, load the data and execute the training task, and records the time length for loading the data to be trained and the time length required for executing the training task as test result data, comprises:
simulating a model training server, and enabling the simulated model training server to obtain data to be trained, which is needed by the model to be trained, from the cloud storage system simulator according to a task completion sequence in the target strategy and based on the model to be trained and the training parameters corresponding to each target training task, and load the data to be trained, and execute the target training task; and recording the time length of each target training task for loading the data to be trained and the time length required for executing the training task, and calculating the average value of the time lengths of each target training task for loading the data to be trained and the average value of the time lengths required for executing the training task as test result data.
8. The method of claim 6, wherein the training task execution simulator, the step of simulating a model training server, comprises: simulating processor resources and I/O resources of a model training server; obtaining data to be trained required by the model to be trained from the cloud storage system simulator through the I/O resource based on the model to be trained and the training parameters corresponding to the training task by the processor resource, loading the data to be trained, and executing the training task; and recording the time length for loading the data to be trained and the time length required for executing the training task as test result data.
9. The method of claim 6, wherein the step of simulating the cloud storage system by the cloud storage system simulator comprises: providing processor resources required by the cloud storage system based on each virtual server, and simulating a data connector and a cache manager of the cloud storage system;
the training task execution simulator is used for enabling a simulated model training server to obtain data to be trained, required by a model to be trained, from the cloud storage system simulator and loaded on the basis of the model to be trained and training parameters corresponding to a training task, and comprises the following steps:
the training task execution simulator sends a data loading request to the cloud storage system simulator;
the data connector simulated by the cloud storage system simulator receives the data loading request sent by the training task execution simulator and forwards the data loading request to the cache manager;
the simulated cache manager judges whether the memory resource stores the data to be trained required by the model to be trained, if so, the simulated cache manager obtains the data to be trained required by the model to be trained from the memory resource and returns the data to be trained to the training task execution simulator; and if not, obtaining the data to be trained required by the model to be trained from the disk resources and returning the data to be trained to the training task execution simulator.
10. The method according to any one of claims 6 to 9, wherein the obtaining of the model to be trained and the training parameters by the training task generator comprises: obtaining the model structure, the initial model parameters and the training duration of the model to be trained.
CN202110089178.8A 2021-01-22 2021-01-22 Simulation test system and method for cloud storage system for storing training data Active CN112799924B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110089178.8A CN112799924B (en) 2021-01-22 2021-01-22 Simulation test system and method for cloud storage system for storing training data

Publications (2)

Publication Number Publication Date
CN112799924A true CN112799924A (en) 2021-05-14
CN112799924B CN112799924B (en) 2023-07-21

Family

ID=75811248

Country Status (1)

Country Link
CN (1) CN112799924B (en)

Also Published As

Publication number Publication date
CN112799924B (en) 2023-07-21

Similar Documents

Publication Publication Date Title
CN110837410B (en) Task scheduling method and device, electronic equipment and computer readable storage medium
US7415444B2 (en) Determining compliance rates for probabilistic requests
CN113377540A (en) Cluster resource scheduling method and device, electronic equipment and storage medium
CN112148582B (en) Policy testing method and device, computer readable medium and electronic equipment
CN114095567B (en) Data access request processing method and device, computer equipment and medium
CN111045932B (en) Business system simulation processing method and device, electronic equipment and storage medium
EP4095700A1 (en) Method and system for micro-service testing, and storage medium
GB2524737A (en) A system and method for testing a workflow
CN108427639A (en) Automated testing method, application server and computer readable storage medium
US20150286753A1 (en) Estimating Think Times
CN113191114B (en) Method and apparatus for validating a system
CN112749072B (en) Testing method and device for cloud storage system for storing training data
CN111159038B (en) Method for simulating CPU load and electronic equipment
US11144693B1 (en) Method and system for generating verification tests at runtime
CN112799924B (en) Simulation test system and method for cloud storage system for storing training data
US20230401086A1 (en) Quality control system for quantum-as-a-service brokers
CN113010376B (en) Monitoring method and device for cloud storage system for storing training data
US9892010B2 (en) Persistent command parameter table for pre-silicon device testing
US20210303766A1 (en) Pre-silicon chip model of extracted workload inner loop instruction traces
US7395196B2 (en) Test-cases for functional verification of system-level interconnect
US10733345B1 (en) Method and system for generating a validation test
CN108848183B (en) Login method and device for simulation user
US10503854B1 (en) Method and system for generating validation tests
CN110618778A (en) Method and system for automatically generating business data, electronic equipment and computer storage medium
CN105930260A (en) Method and apparatus for testing system availability

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant