CN116542344A - Model automatic deployment method, platform and system

Model automatic deployment method, platform and system

Info

Publication number
CN116542344A
CN116542344A (application CN202310819279.5A)
Authority
CN
China
Prior art keywords: model, parameter, platform, terminal equipment, parameters
Prior art date: 2023-07-05
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310819279.5A
Other languages
Chinese (zh)
Inventor
殷俊
金达
郑春煌
周祥明
程德强
傅凯
张朋
蔡丹平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2023-07-05
Publication date: 2023-08-04
Application filed by Zhejiang Dahua Technology Co Ltd
Priority to CN202310819279.5A
Publication of CN116542344A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The embodiment of the application discloses a model automatic deployment method, platform and system. The method comprises the following steps: the platform determines a first model structure and initial training parameters according to a set index from the terminal equipment; the platform trains the first model structure by applying training samples and the initial training parameters to obtain a first model, where the first model comprises the first model structure and first parameters; the platform sends the first model and a test sample to the terminal equipment, so that the terminal equipment applies the test sample and the first parameters of the first model and runs the first model to obtain a first test result; the platform receives the first test result from the terminal equipment and adjusts the first model structure and/or the first parameters of the first model according to the first test result to obtain a second model, where the second model comprises a second model structure and second parameters. Automatic deployment of the model is thereby realized.

Description

Model automatic deployment method, platform and system
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a method, a platform, and a system for model automation deployment.
Background
In different application scenarios, the requirements on the speed, precision or accuracy of the model are different, and the devices on which the model is deployed also differ; for example, the devices include supercomputers, servers or simple terminal equipment, and different devices have different data-processing capacities.
In the related art, a model is trained using an existing training framework and then deployed on the corresponding device. In this approach, manual participation is required in both the training-framework selection stage and the model-effect verification stage, so the model effect cannot be evaluated objectively.
Disclosure of Invention
The embodiment of the application provides a model automatic deployment method, a platform and a system, which are used for realizing automatic deployment of a model.
In a first aspect, an embodiment of the present application provides a method for automatically deploying a model, the method at least includes the following steps:
the platform determines a first model structure and initial training parameters according to set indexes from the terminal equipment;
the platform trains the first model structure by applying training samples and initial training parameters to obtain a first model; wherein the first model comprises a first model structure and a first parameter;
The platform sends the first model and the test sample to the terminal equipment so that the terminal equipment applies the test sample and the first parameter of the first model, and operates the first model to obtain a first test result;
the platform receives a first test result from the terminal equipment, and adjusts a first model structure and/or a first parameter of the first model according to the first test result to obtain a second model; wherein the second model comprises a second model structure and a second parameter.
In a second aspect, an embodiment of the present application provides a model automation deployment platform, including:
a determining module for: determining a first model structure and initial training parameters according to set indexes from terminal equipment;
training module for: training the first model structure by applying training samples and initial training parameters to obtain a first model; wherein the first model comprises a first model structure and a first parameter;
a data sending module, configured to: the platform sends the first model and the test sample to the terminal equipment so that the terminal equipment of the user can apply the test sample and the first parameter of the first model, and the first model is operated to obtain a first test result;
an adjustment module for: the platform receives a first test result from the terminal equipment, and adjusts a first model structure and/or a first parameter of the first model according to the first test result to obtain a second model; wherein the second model comprises a second model structure and a second parameter.
In a third aspect, an embodiment of the present application provides a model automation deployment system, including a platform and a terminal device;
the terminal equipment is used for testing the model from the platform and sending the test result to the platform;
the platform is for performing the method of the first aspect.
In a fourth aspect, an embodiment of the present application provides a chip, the chip including a processor and a memory; the processor is coupled to the memory and is configured to read a computer program stored in the memory, causing the chip to perform the method of the first aspect.
In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method of the first aspect.
According to the embodiment of the application, the model is automatically deployed by interaction between the platform and the terminal equipment of the user. In the process, the platform automatically selects the first model structure and the initial training parameters according to the setting index from the terminal equipment without manual selection. In addition, the platform trains the first model structure by applying training samples and initial training parameters to obtain a first model comprising the first model structure and the first parameters. The platform sends the first model and the test sample to the terminal device. And the terminal equipment uses the test sample and the first parameter to operate the first model to obtain a first test result. The terminal equipment sends the first test result to the platform, and the platform adjusts the first model structure and the first parameter according to the first test result to obtain a second model comprising a second model structure and the second parameter. Therefore, the platform updates the model according to the test result fed back by the terminal equipment, and automatic deployment of the model is realized.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments of the present application will be briefly described below, and it is obvious that the drawings that are described below are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is an application scenario schematic diagram of a method for model automatic deployment according to an embodiment of the present application;
FIG. 2 is a flow chart of a method for automated deployment of models according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a platform for determining a first model structure and initial training parameters according to an embodiment of the present application;
fig. 4 is a schematic diagram of a first model of interaction between a platform and a terminal device according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a platform for adjusting a first model structure and a first parameter according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a model automated deployment system according to an embodiment of the present application;
FIG. 7 is an interactive flow chart of a model automated deployment method according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a model automated deployment system according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a model-automated deployment platform according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of another model automation deployment platform according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
Any number of elements in the figures are for illustration and not limitation, and any naming is used for distinction only and not for any limiting sense.
In a specific practical process, the requirements on the speed, precision or accuracy of the model differ according to the application scenario. The devices on which the model is deployed also differ across scenarios; for example, the devices include supercomputers, servers or simple terminal equipment, and different devices have different data-processing capacities.
In the related art, a model is trained using an existing training framework and then deployed on the corresponding device. In this approach, manual participation is required in both the training stage and the model-effect verification stage, so the model effect cannot be evaluated objectively.
In addition, because the target categories of the models are finely subdivided and unevenly distributed, the related-art method can only perform open-loop training, and models obtained through open-loop training cannot meet developers' requirements without supervision. A labor-saving method is therefore urgently needed, in which a machine performs semi-automatic supervision to complete the whole closed-loop training.
Therefore, the application provides a model automatic deployment method in which a target model and the target parameters of the target model are obtained through interaction between a platform and a user's terminal equipment. The terminal equipment applies the target model and the target parameters and can realize the model function in the corresponding scenario, such as target recognition. In this process, the platform determines a set index according to the user requirements, automatically selects a first model structure and initial training parameters according to the set index, trains the first model structure with the initial training parameters, and sends the preliminarily trained model (comprising the model structure and the training parameters) to the terminal equipment. The platform evaluates the test result obtained by the terminal equipment applying the model, and adjusts the model structure and the training parameters according to the evaluation result until the test result meets the set condition. The model obtained by the last update is taken as the target model, and the parameters obtained by the last update are taken as the target parameters. By this design, automatic deployment of the model is realized.
After the design concept of the embodiment of the present application is introduced, some simple descriptions are made below for application scenarios applicable to the technical solution of the embodiment of the present application, and it should be noted that the application scenarios described below are only used to illustrate the embodiment of the present application and are not limiting. In specific implementation, the technical scheme provided by the embodiment of the application can be flexibly applied according to actual needs.
Referring to fig. 1, an application scenario schematic diagram of a model automatic deployment method according to an embodiment of the present application is provided, where 11 is a model automation deployment platform, abbreviated as platform, and 12 is a user's terminal equipment; the terminal equipment in fig. 1 is exemplified by a computer 121 and a server 122. In this example, the model automation deployment platform 11 sends the M-th model and the M-th parameter to the computer 121 and receives the M-th test result sent back by the computer 121; the model automation deployment platform 11 sends the P-th model and the P-th parameter to the server 122 and receives the P-th test result sent back by the server 122. Here, M and P are positive integers whose values are determined by how many rounds of the cycle are needed to obtain the target model and the target parameters that meet the set condition.
Of course, the method provided in the embodiment of the present application is not limited to the application scenario shown in fig. 1, but may be used in other possible application scenarios, and the embodiment of the present application is not limited. The functions that can be implemented by each device in the application scenario shown in fig. 1 will be described together in the following method embodiments, which are not described in detail herein.
In order to further explain the technical solutions provided in the embodiments of the present application, the following details are described with reference to the accompanying drawings and the detailed description. Although the embodiments of the present application provide the method operational steps as shown in the following embodiments or figures, more or fewer operational steps may be included in the method based on routine or non-inventive labor. In steps where there is logically no necessary causal relationship, the execution order of the steps is not limited to the execution order provided by the embodiments of the present application.
The technical solution provided in the embodiment of the present application is described below with reference to the application scenario shown in fig. 1.
Referring to fig. 2, an embodiment of the present application provides a model automation deployment method, including the following steps:
s201: the platform determines a first model structure and initial training parameters according to the set index from the terminal device.
S202: and training the first model structure by the platform through the training sample and the initial training parameters to obtain a first model.
Wherein the first model comprises a first model structure and a first parameter.
S203: the platform sends the first model and the test sample to the terminal equipment so that the terminal equipment of the user can apply the test sample and the first parameter and operate the first model to obtain a first test result.
S204: the platform receives a first test result from the terminal equipment, and adjusts the first model structure and the first parameter according to the first test result to obtain a second model.
Wherein the second model comprises a second model structure and a second parameter. According to the embodiment of the application, the model is automatically deployed by interaction between the platform and the terminal equipment of the user. In the process, the platform automatically selects the first model structure and the initial training parameters according to the setting index from the terminal equipment without manual selection. In addition, the platform trains the first model structure by applying training samples and initial training parameters to obtain a first model comprising the first model structure and the first parameters. The platform sends the first model and the test sample to the terminal device. And the terminal equipment uses the test sample and the first parameter to operate the first model to obtain a first test result. The terminal equipment sends the first test result to the platform, and the platform adjusts the first model structure and the first parameter according to the first test result to obtain a second model comprising a second model structure and the second parameter. Therefore, the platform updates the model according to the test result fed back by the terminal equipment, and automatic deployment of the model is realized.
Referring to S201, the platform may be an artificial intelligence (AI) platform that includes an AI chip for storing and running the network model.
Optionally, the setting indicator includes one or more of a precision indicator, an accuracy indicator, and a time-consuming indicator.
For the user, the set index is generated according to the corresponding application scenario; it can be obtained from the user requirements and matched to the actual application scenario. The application scenario may be, for example, a pedestrian detection scenario, an urban traffic monitoring scenario, a user image analysis scenario, and the like.
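Purely as an illustration, the set index sent by the terminal equipment could be represented as a small structure such as the following Python sketch; the field names and values are assumptions and are not prescribed by this application.

```python
# Hypothetical representation of the set index from the terminal equipment.
# Field names and values are illustrative assumptions only.
from dataclasses import dataclass

@dataclass
class SetIndex:
    scene: str             # e.g. "pedestrian_detection" or "urban_traffic_monitoring"
    precision: float       # precision indicator required by the user
    accuracy: float        # accuracy indicator required by the user
    max_latency_ms: float  # time-consuming indicator (per-inference budget)

set_index = SetIndex(scene="pedestrian_detection",
                     precision=0.90, accuracy=0.85, max_latency_ms=50.0)
```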
Fig. 3 is a schematic diagram of determining a first model structure and initial training parameters by using a platform according to an embodiment of the present application, and the process may be implemented through steps S201-1 to S201-2.
S201-1: the platform determines the number of training categories and the complexity of the model according to the set index, and determines the first model structure according to the number of training categories and the complexity of the model.
The user requirements differ across application scenarios, so different set indexes can be determined from them, and each application scenario also places certain requirements on the number of training categories. Thus, the platform may determine the number of training categories and the model complexity according to the user requirements; since both are related to the model structure, the platform can determine the corresponding first model structure from them. Optionally, the mainstream backbone networks for the first model structure include VGG, GoogLeNet, DarkNet, DenseNet, and the like.
S201-2: the platform determines initial training parameters corresponding to the first model structure according to the corresponding relation between the models stored in the preset matching table and the training parameters.
A matching table, which may be called a preset matching table, is pre-stored on the platform; it stores the correspondence between models and training parameters. The training parameters here include the kind, number and initial values of the parameters, and comprise distillation parameters, pruning parameters, adaptive sample-balancing parameters, data preprocessing modes, and the like. These training parameters control the training mode and the training-network preset values.
In this embodiment of the present application, the platform may determine, according to a correspondence between a model stored in a preset matching table and a training parameter, an initial training parameter corresponding to the first model structure.
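As a minimal sketch of S201-1 and S201-2, assuming the SetIndex structure above and an illustrative preset matching table, the selection could look as follows; the thresholds, backbone names and table contents are assumptions, not values taken from this application.

```python
# Illustrative preset matching table: backbone -> initial training parameters
# (the kinds, counts and initial values below are placeholders).
PRESET_MATCHING_TABLE = {
    "VGG":      {"lr": 1e-3, "distill_T": 4.0, "prune_ratio": 0.0},
    "DenseNet": {"lr": 5e-4, "distill_T": 2.0, "prune_ratio": 0.2},
    "DarkNet":  {"lr": 1e-3, "distill_T": 2.0, "prune_ratio": 0.3},
}

def select_first_model(set_index: "SetIndex", num_training_classes: int):
    # S201-1: higher accuracy demands and more classes suggest a more complex
    # structure; a tight latency budget pushes toward a lighter one.
    complexity = set_index.accuracy * num_training_classes / max(set_index.max_latency_ms, 1.0)
    if complexity > 5.0:
        structure = "DenseNet"
    elif complexity > 1.0:
        structure = "VGG"
    else:
        structure = "DarkNet"
    # S201-2: look up the initial training parameters for the chosen structure.
    initial_params = PRESET_MATCHING_TABLE[structure]
    return structure, initial_params
```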
Referring to S202, the platform trains the first model structure by applying training samples and initial training parameters to obtain a first model.
In the training process, the model structure is unchanged, but the model parameters are continuously optimized, so the first model comprises the first model structure and first parameters, where the first parameters are optimized from the initial training parameters. Optionally, a fixed number of training iterations, such as N, may be set; that is, the platform trains the first model structure using the training samples and the initial training parameters and stops after N iterations, and the model at this point is called the first model. The training process may be implemented by a training framework already available on the platform. Since the initial training parameters are also updated during the iterations, the training parameters of the first model are the first parameters.
The training process is the same as that in the related art, and is not described here.
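For concreteness only, the fixed N-iteration training budget described above might look like the following sketch; PyTorch is an assumption here, since the application does not name a particular training framework.

```python
import itertools
import torch
from torch import nn

def train_first_model(model: nn.Module, loader, lr: float, n_iterations: int = 1000) -> nn.Module:
    # S202: the model structure stays fixed while its parameters are optimized;
    # training stops after a fixed budget of N iterations.
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _, (inputs, labels) in zip(range(n_iterations), itertools.cycle(loader)):
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), labels)
        loss.backward()
        optimizer.step()
    return model  # the stopped model is the "first model" (first structure + first parameters)
```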
Referring to S203, the platform transmits the first model and the test sample to the terminal device. And the terminal equipment applies the test sample and the first parameter, and operates the first model to obtain a first test result.
Because the calculation power of the terminal equipment of the user is different from that of the AI chip, the terminal equipment receives the first model and the test sample from the platform, and the first model is operated by applying the test sample and the first parameter of the first model to obtain a first test result. By the design, the test result of the terminal equipment can be detected in real time in the model training process, so that the model structure and training parameters can be adjusted in time.
Fig. 4 is a schematic diagram of a first model of interaction between a platform and a terminal device according to an embodiment of the present application, where the platform and the terminal device may interact and obtain a first test result through steps S203-1 to S203-2.
S203-1: and the platform carries out quantization processing on the first model according to the equipment information of the terminal equipment, and sends the quantized first model and the test sample to an equipment port of the terminal equipment.
The computing power of the user's terminal equipment differs from that of the AI chip, and the first model obtained through platform training can be run only after conversion. The conversion is related to the difference between the computing power of the platform and that of the terminal equipment: if the first model is not processed, its running effect on the terminal equipment differs from the platform's training result in fp16 or fp32. Since the computing power of the terminal equipment is related to its device information, the platform can perform quantization processing on the first model (convert it into a model that the terminal equipment can run) according to the device information of the terminal equipment, so that the terminal equipment can recognize and run the quantized first model. The quantization mode may include offline quantization, online quantization, binary quantization, linear quantization, logarithmic quantization, and so on. The platform sends the quantized first model and the test sample to a device port of the terminal equipment.
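One possible offline (post-training) step is sketched below with PyTorch's dynamic quantization as an assumption; the device_info fields and the fp16 fallback are illustrative, and any of the other quantization modes listed above could be substituted.

```python
import torch
from torch import nn

def quantize_for_device(first_model: nn.Module, device_info: dict) -> nn.Module:
    # S203-1: quantize the first model according to the terminal equipment's
    # device information before sending it to the device port.
    if device_info.get("supports_int8", True):
        # Offline/dynamic quantization of the linear layers to int8.
        return torch.quantization.quantize_dynamic(first_model, {nn.Linear}, dtype=torch.qint8)
    return first_model.half()  # assumed fallback: fp16 on devices without int8 support
```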
S203-2: after the driver of the equipment port is matched with the first model key, the terminal equipment applies the test sample and the first parameter, and operates the first model to obtain a first test result.
The driver of the device port performs key matching with the first model, where the key is a key set by the platform. After the driver of the device port successfully matches the key of the first model, the terminal equipment applies the test sample and the first parameter and runs the first model to obtain the first test result. The first test result here may be a model recognition result, for example a label.
Alternatively, the first test result may be an encrypted compressed tag.
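A minimal sketch of S203-2 on the terminal side follows, assuming the platform-set key travels with the quantized model and that the model object is directly callable on a test sample; the key format and the inference call are assumptions.

```python
import hashlib
import hmac

def run_first_model_on_device(driver_key: bytes, model_key: bytes, model, test_samples):
    # The device-port driver key-matches the first model before running it.
    if not hmac.compare_digest(hashlib.sha256(driver_key).digest(),
                               hashlib.sha256(model_key).digest()):
        raise PermissionError("key matching failed; the first model will not be run")
    # Apply the test sample and the first parameters (already inside the model)
    # and collect the first test result, e.g. a list of labels.
    return [model(sample) for sample in test_samples]
```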
Referring to S204, the platform receives a first test result from the terminal device, and adjusts the first model structure and/or the first parameter according to the first test result to obtain the second model. Wherein the second model comprises a second model structure and a second parameter. The terminal equipment sends the first test result to the platform, and the platform can determine how to adjust the first model structure and the first parameters according to the first test result, so that an adjusted model and adjusted parameters can obtain better operation effects on the terminal equipment, and project requirements on different terminal equipment are met. For example, if the first model structure is adjusted, the obtained second model structure is different from the first model structure; if the first model structure is not adjusted, the obtained second model structure is the same as the first model structure.
Optionally, fig. 5 is a schematic diagram of adjusting the first model structure and the first parameter by the platform according to the embodiment of the present application, where the process of adjusting the first model structure and the first parameter by the platform may be implemented through steps S204-1 to S204-3.
S204-1: and the platform receives the encrypted compressed label and decrypts and decompresses the encrypted compressed label.
After the terminal equipment runs the first model, it outputs the model running result in a set format, which is called a label. In order to solve the problem of asynchronous running of multiple models under massive data, the terminal equipment compresses the label data, encrypts it with a set key, and finally outputs the encrypted compressed label. The terminal equipment sends the encrypted compressed label to the platform, and the platform decrypts it with the matching key and decompresses it. Exemplary compression means include VarByte, zlib, Snappy, LZ4, and the like.
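A sketch of producing and reading the encrypted compressed label is shown below, using zlib (one of the compression means listed above); the Fernet scheme from the cryptography package is an assumed encryption choice, since the application only requires that a set key be used.

```python
import json
import zlib
from cryptography.fernet import Fernet  # assumed encryption scheme, not mandated by the application

def pack_labels(labels: list, key: bytes) -> bytes:
    # Terminal side: compress the label data, then encrypt it with the set key.
    return Fernet(key).encrypt(zlib.compress(json.dumps(labels).encode("utf-8")))

def unpack_labels(blob: bytes, key: bytes) -> list:
    # Platform side: decrypt with the matching key, then decompress.
    return json.loads(zlib.decompress(Fernet(key).decrypt(blob)))
```

In this sketch a key generated once with Fernet.generate_key() would be shared between the platform and the device port.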
S204-2: and the platform compares the result after decryption and decompression operation with a pre-stored real label to obtain an evaluation parameter.
The platform stores real labels in advance, and compares the result after decryption and decompression operation with the stored real labels in advance to obtain evaluation parameters. Optionally, the evaluation parameters include mAP, F1 Score, confusion matrix, accuracy, recall, forward time consumption, and the like.
In the actual application process, the evaluation parameters can be quantitatively analyzed and decomposed into sub-indicators, such as class confusion, target detection-rate distribution, mAP, or algorithm time consumption.
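As an illustration of S204-2, a few of the evaluation parameters named above (accuracy and per-class recall) can be computed by comparing the decrypted labels with the pre-stored real labels; mAP and forward time consumption would come from the detection results and the test log and are omitted in this sketch.

```python
def evaluate(pred_labels: list, true_labels: list) -> dict:
    # Compare the decrypted/decompressed results with the pre-stored real labels.
    assert len(pred_labels) == len(true_labels)
    accuracy = sum(p == t for p, t in zip(pred_labels, true_labels)) / len(true_labels)
    recall_per_class = {}
    for c in set(true_labels):
        tp = sum(p == c and t == c for p, t in zip(pred_labels, true_labels))
        fn = sum(p != c and t == c for p, t in zip(pred_labels, true_labels))
        recall_per_class[c] = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "recall_per_class": recall_per_class}
```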
S204-3: the platform adjusts the first model structure of the first model according to the evaluation parameters to obtain a second model structure, and/or adjusts the number or the parameter value of the parameters of the first parameters to obtain second parameters.
The evaluation parameters may indicate advantages and disadvantages of the first model, so that the platform may adjust the first model structure of the first model according to the evaluation parameters to obtain the second model structure, and/or adjust the number of parameters and parameter values of the first parameters to obtain the second parameters.
Alternatively, the adjustment process may generally include the following two cases, see fig. 6:
first case Q1: if the evaluation parameter characterizes that the time consumption of the first model is greater than the first time length threshold, reducing network blocks with the time consumption greater than the second time length threshold in the first model structure of the first model, and obtaining a second model structure.
Different evaluation parameters characterize different properties of the first model. If the time consumption of the first model characterized by the evaluation parameter is greater than the first time-length threshold, network blocks whose time consumption is greater than the second time-length threshold can be removed from the first model structure, and the number and type of the removed network blocks may be determined according to how far the time consumption of the first model exceeds the first time-length threshold. In this example, the second time-length threshold is smaller than the first time-length threshold.
In this case, the first model structure is optimized from a time-consuming point of view, resulting in the second model structure.
Second case Q2: if the evaluation parameter characterizes that the accuracy of the first model is smaller than the accuracy threshold, the number of parameters or the parameter values of the first parameters are adjusted to obtain the second parameters.
Accuracy is also an important factor affecting the model's effect in application, so if the evaluation parameter characterizes that the accuracy of the first model is smaller than the accuracy threshold, the number of parameters or the parameter values of the first parameters can be adjusted to obtain the second parameters.
Optionally, the precision of the first model includes a category target precision, a dimension precision, and a quantized precision. In the second case, it can be further divided into the following cases, still referring to fig. 6:
q2-1: and if the category target precision is smaller than the first setting precision, modifying the weight of the weakness targets in the first parameter.
Wherein the weights of the disadvantaged targets in the first parameter may be modified in case the category target accuracy is smaller than the first set accuracy threshold. Here, the disadvantaged object is an object that is misidentified, for example, an apple is identified as a banana, and then the banana is the disadvantaged object.
Q2-2: and if the size precision is smaller than the second setting precision, modifying the weight of the weakness size in the first parameter.
Wherein the weight of the size of the weakness in the first parameter may be modified in case the size accuracy is smaller than the second setting accuracy. The weakness here is the size of the identification error.
Q2-3: if the quantized loss of accuracy is greater than a loss threshold, parameters characterizing the training pattern in the first parameters are adjusted so as to use a training pattern that is lower than the current accuracy.
Wherein if the quantized loss of accuracy is greater than the loss threshold, a training pattern lower than the current accuracy is required, for example by adjusting a parameter of the first parameters that characterizes the training pattern.
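The adjustment rules Q1 and Q2 above could be sketched as follows; all thresholds, the block representation and the parameter names are illustrative assumptions rather than values fixed by this application.

```python
def adjust_model(eval_params: dict, model_structure: list, first_params: dict,
                 t1_ms: float = 80.0, t2_ms: float = 10.0,
                 acc_threshold: float = 0.9, loss_threshold: float = 0.02):
    # model_structure is assumed to be a list of blocks like {"name": ..., "time_ms": ...}.
    structure, params = list(model_structure), dict(first_params)
    # Q1: overall time consumption above the first threshold -> remove slow blocks.
    if eval_params["forward_time_ms"] > t1_ms:
        structure = [blk for blk in structure if blk["time_ms"] <= t2_ms]
    # Q2: accuracy below the threshold -> adjust parameter counts / values.
    if eval_params["accuracy"] < acc_threshold:
        for target in eval_params.get("weak_targets", []):   # Q2-1: raise weak-target weights
            params.setdefault("class_weights", {})[target] = 2.0
        for size in eval_params.get("weak_sizes", []):        # Q2-2: raise weak-size weights
            params.setdefault("size_weights", {})[size] = 2.0
        if eval_params.get("quant_accuracy_loss", 0.0) > loss_threshold:
            params["train_precision"] = "fp16"                # Q2-3: lower-precision training mode
    return structure, params  # second model structure and second parameters
```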
In one possible implementation manner, after obtaining the second model, the method in the embodiment of the application further includes the following steps:
It is judged whether the first test result meets the set condition. If so, the second model is determined to be the target model; otherwise, the platform trains the second model structure by applying the second parameters to obtain a new model and sends the new model to the terminal equipment, so that the terminal equipment runs the new model by applying the test sample and the new parameters of the new model, until the platform determines that the test result of the terminal equipment meets the set condition.
Optionally, the set condition includes that the number of iterations is greater than a number threshold, or that the evaluation parameter is greater than an index threshold. For example, the platform updates the first model structure to obtain the second model structure, and/or updates the first parameters to obtain the second parameters, and then continues training the second model structure with the second parameters for another N iterations to obtain a new model and new parameters.
The platform sends the new model to the terminal equipment, so the terminal equipment applies the test sample and the new parameters, runs the new model to obtain a new test result, and sends the new test result to the platform. This repeats until the platform determines that the test result of the terminal equipment meets the set condition, at which point the platform takes the model obtained by the last update as the target model and the parameters obtained by the last update as the target parameters.
In the actual application process, the method is repeated for a plurality of times, and a target model which has higher precision and meets the set condition can be obtained. The terminal equipment applies the target parameters and runs the target model, so that a better model application effect can be obtained.
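Putting the pieces together, the closed loop might be orchestrated roughly as below; the platform and device objects and the stopping rule are stand-ins for whatever the deployment actually provides, so this is a sketch of the control flow only.

```python
def closed_loop_deploy(platform, device, set_index, max_rounds: int = 10,
                       metric_threshold: float = 0.9):
    # platform/device are assumed objects exposing the steps described above.
    structure, params = platform.select_structure_and_params(set_index)        # S201
    model = None
    for _ in range(max_rounds):
        model, params = platform.train(structure, params)                      # S202 / S709
        result = device.run(platform.quantize(model), platform.test_samples)   # S203 / S711
        eval_params = platform.evaluate(result)                                # S204-2
        if eval_params["accuracy"] >= metric_threshold:                        # set condition met
            return model, params                                               # target model and parameters
        structure, params = platform.adjust(eval_params, structure, params)    # S204-3
    return model, params  # fall back to the last updated model after max_rounds
```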
In order to make the technical solution of the present application more perfect, fig. 7 is an interaction flow chart of a model automation deployment method provided in an embodiment of the present application, where the interaction method at least includes the following steps:
S700: and the terminal equipment sends the set index to the platform.
The terminal equipment and the platform can perform data transmission according to a set transmission protocol, and set indexes are sent to the platform.
S701: the platform determines a first model structure and initial training parameters according to the set index.
The implementation of this step may refer to step S201, which is not described here in detail.
S702: and training the first model structure by the platform through the training sample and the initial training parameters to obtain a first model.
Wherein the first model comprises a first model structure and a first parameter. The implementation of this step may refer to step S202, which is not described here in detail.
S703: the platform sends the first model and the test sample to the terminal equipment of the user.
The implementation of this step may refer to step S203, which is not described here in detail.
S704: and the terminal equipment applies the test sample and the first parameter of the first model, and operates the first model to obtain a first test result.
The implementation of this step may refer to step S203, which is not described here in detail.
S705: and the terminal equipment sends the first test result to the platform.
S706: and the platform adjusts the first model structure and the first parameter according to the first test result to obtain a second model.
Wherein the second model comprises a second model structure and a second parameter.
The implementation of this step may refer to step S204, which is not described here in detail.
S707: whether the first test result satisfies the setting condition is determined, if so, S708 is executed, otherwise S709 is executed.
S708: the platform determines the second model as the target model.
S709: and training the second model structure by the platform by applying the second parameter to obtain a new model.
The implementation of this step may refer to the training procedure in step S202, which is not described here in detail.
S710: the platform sends the new model to the terminal device.
S711: and the terminal equipment uses the new parameters in the test sample and the new model to operate the new model, so as to obtain a new test result.
The implementation of this step may refer to the test procedure in step S203, which is not described here in detail.
S712: and the terminal sends the new test result to the platform.
S713: and obtaining a target model until the platform determines that the test result of the terminal equipment meets the set condition.
The embodiment of the application responds in real time, is simple to apply, and allows flexible pre-research on custom model training and platform baseline-model training for the terminal equipment of different users. It overcomes the defects of the traditional scheme, such as the need for manual intervention, a large number of discarded schemes, long training cycles, and problems that are difficult to analyze.
The embodiment of the application also provides a model automation deployment system, referring to fig. 8, including a platform 81 and a user terminal device 82; the terminal device 82 is used for testing the model from the platform and sending the test result to the platform; the platform 81 is used for executing the model automatic deployment method in the embodiment of the present application, and specific steps of the method may be referred to the foregoing embodiment, which is not described herein.
As shown in fig. 9, based on the same inventive concept as the above model automation deployment method, the embodiment of the present application further provides a model automation deployment platform, which includes a determining module 91, a training module 92, a data sending module 93, and an adjusting module 94.
Wherein, the determining module 91 is configured to: determining a first model structure and initial training parameters according to set indexes from terminal equipment;
training module 92 for: training the first model structure by applying training samples and initial training parameters to obtain a first model; wherein the first model comprises a first model structure and a first parameter;
a data transmitting module 93, configured to: the platform sends the first model and the test sample to the terminal equipment so that the terminal equipment of the user can apply the test sample and the first parameter of the first model, and the first model is operated to obtain a first test result;
An adjustment module 94 for: the platform receives a first test result from the terminal equipment, and adjusts a first model structure and/or a first parameter of the first model according to the first test result to obtain a second model; wherein the second model comprises a second model structure and a second parameter.
In an alternative embodiment, the apparatus further includes a determining module configured to:
judging whether the first test result meets the set condition, if so, determining the second model as a target model;
otherwise, the platform trains the second model structure by applying the second parameters to obtain a new model, and sends the new model to the terminal equipment so that the terminal equipment runs the new model by applying the test sample and the new parameters in the new model until the platform determines that the test result of the terminal equipment meets the set condition.
In an alternative embodiment, the setting indicator includes one or more of a precision indicator, an accuracy indicator, and a time-consuming indicator;
the determining module 91 is specifically configured to:
the platform determines the number of training categories and the complexity of the model according to the set indexes from the terminal equipment, and determines a corresponding first model structure according to the number of the training categories and the complexity of the model;
The platform determines initial training parameters corresponding to the first model structure according to the correspondence between model structures and training parameters stored in the preset matching table; the initial training parameters are used for controlling the training mode and the training-network preset values.
In an alternative embodiment, the data sending module 93 is specifically configured to:
the platform carries out quantization processing on the first model according to the equipment information of the terminal equipment, and sends the quantized first model and the test sample to an equipment port of the terminal equipment, so that the terminal equipment applies the test sample and the first parameter after a driver of the equipment port is matched with a secret key of the first model, and the first model is operated to obtain a first test result.
In an alternative embodiment, the first test result is an encrypted compressed tag; the adjustment module 94 is specifically configured to:
the platform receives the encrypted compressed label, and decrypts and decompresses the encrypted compressed label;
the platform compares the result after decryption and decompression operation with a pre-stored real label to obtain an evaluation parameter;
the platform adjusts the first model structure of the first model according to the evaluation parameters to obtain a second model structure, and/or adjusts the number or the parameter value of the parameters of the first parameters to obtain second parameters.
In an alternative embodiment, the adjustment module 94 is specifically configured to:
if the evaluation parameter characterizes that the time consumption of the first model is greater than a first time-length threshold, reducing network blocks whose time consumption is greater than a second time-length threshold in the first model structure of the first model to obtain a second model structure; or
if the evaluation parameter characterizes that the accuracy of the first model is smaller than an accuracy threshold, adjusting the number of parameters or the parameter values of the first parameters to obtain second parameters.
In an alternative embodiment, the precision of the first model includes a category target precision, a dimension precision, and a quantized precision; the adjustment module 94 is specifically configured to:
if the category target precision is smaller than the first setting precision, modifying the weight of the weak target in the first parameter; wherein the weak target is a target of identification error;
if the size precision is smaller than the second setting precision, modifying the weight of the weakness size in the first parameter; wherein, the weakness size is the size of the identification error;
if the quantized loss of accuracy is greater than a loss threshold, parameters characterizing the training pattern in the first parameters are adjusted so as to use a training pattern that is lower than the current accuracy.
In an alternative embodiment, the setting condition includes the number of iterations being greater than a number threshold, or the evaluation parameter being greater than an index threshold.
The model automatic deployment platform and the model automatic deployment method provided in the embodiments of the present application are based on the same inventive concept, can obtain the same beneficial effects, and are not described in detail here.
Based on the same inventive concept as the above-mentioned model automation deployment method, the embodiment of the present application further provides a model automation deployment platform, which may be specifically a desktop computer, a portable computer, a smart phone, a tablet computer, a personal digital assistant (Personal Digital Assistant, PDA), a server, or the like. As shown in fig. 10, the electronic device may include a processor 1001 and a memory 1002.
The processor 1001 may be a general purpose processor such as a Central Processing Unit (CPU), digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, and may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present application. The general purpose processor may be a microprocessor or any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present application may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in the processor for execution.
The memory 1002 is a non-volatile computer-readable storage medium that can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The Memory may include at least one type of storage medium, which may include, for example, flash Memory, hard disk, multimedia card, card Memory, random access Memory (Random Access Memory, RAM), static random access Memory (Static Random Access Memory, SRAM), programmable Read-Only Memory (Programmable Read Only Memory, PROM), read-Only Memory (ROM), charged erasable programmable Read-Only Memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), magnetic Memory, magnetic disk, optical disk, and the like. The memory is any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to such. The memory 1002 in the embodiments of the present application may also be circuitry or any other device capable of implementing a memory function for storing program instructions and/or data.
Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the above method embodiments may be implemented by hardware associated with program instructions, where the foregoing program may be stored in a computer readable storage medium, and when executed, the program performs steps including the above method embodiments; such computer storage media can be any available media or data storage device that can be accessed by a computer including, but not limited to: various media that can store program code, such as a mobile storage device, a random access memory (RAM, random Access Memory), a magnetic memory (e.g., a floppy disk, a hard disk, a magnetic tape, a magneto-optical disk (MO), etc.), an optical memory (e.g., CD, DVD, BD, HVD, etc.), and a semiconductor memory (e.g., ROM, EPROM, EEPROM, a nonvolatile memory (NAND FLASH), a Solid State Disk (SSD)), etc.
Alternatively, the integrated units described above may be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product. Based on such understanding, the technical solutions of the embodiments of the present application may be essentially or partly contributing to the prior art, and the computer software product may be stored in a storage medium, and include several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the methods of the embodiments of the present application. And the aforementioned storage medium includes: various media that can store program code, such as a mobile storage device, a random access memory (RAM, random Access Memory), a magnetic memory (e.g., a floppy disk, a hard disk, a magnetic tape, a magneto-optical disk (MO), etc.), an optical memory (e.g., CD, DVD, BD, HVD, etc.), and a semiconductor memory (e.g., ROM, EPROM, EEPROM, a nonvolatile memory (NAND FLASH), a Solid State Disk (SSD)), etc.
The foregoing embodiments are only used for describing the technical solutions of the present application in detail, but the descriptions of the foregoing embodiments are only used for helping to understand the methods of the embodiments of the present application, and should not be construed as limiting the embodiments of the present application. Variations or alternatives readily occur to those skilled in the art and are intended to be encompassed within the scope of the embodiments of the present application.

Claims (12)

1. A method of automated deployment of a model, comprising:
the platform determines a first model structure and initial training parameters according to set indexes from the terminal equipment;
the platform trains the first model structure by applying training samples and the initial training parameters to obtain a first model; wherein the first model comprises the first model structure and a first parameter;
the platform sends the first model and the test sample to the terminal equipment so that the terminal equipment can apply the test sample and the first parameter of the first model, and operate the first model to obtain a first test result;
the platform receives a first test result from the terminal equipment, and adjusts a first model structure and/or the first parameter of the first model according to the first test result to obtain a second model; wherein the second model comprises a second model structure and a second parameter.
2. The method of claim 1, wherein after the second model is obtained, the method further comprises:
judging whether the first test result meets a set condition, if so, determining the second model as a target model;
Otherwise, the platform trains the second model structure by applying the second parameters to obtain a new model, and sends the new model to the terminal equipment so that the terminal equipment can run the new model by applying the test sample and the new parameters in the new model until the platform determines that the test result of the terminal equipment meets the set condition.
3. The method of claim 1, wherein the set indicator comprises one or more of a precision indicator, an accuracy indicator, and a time-consuming indicator;
the platform determines a first model structure and initial training parameters according to set indexes from terminal equipment, and comprises the following steps:
the platform determines the number of training categories and the complexity of the model according to the set index, and determines a corresponding first model structure according to the number of training categories and the complexity of the model;
the platform determines initial training parameters corresponding to the first model structure according to the corresponding relation between the model structure and training parameters stored in a preset matching table; the initial training parameters are used for controlling training modes and training network preset values.
4. The method of claim 1, wherein the platform sends the first model and the test sample to a terminal device to cause the terminal device to apply the test sample and the first parameter of the first model, run the first model, and obtain a first test result, comprising:
and the platform carries out quantization processing on the first model according to the equipment information of the terminal equipment, and sends the quantized first model and the test sample to an equipment port of the terminal equipment, so that the terminal equipment can apply the test sample and the first parameter of the first model after a driver of the equipment port is matched with the first model key, and operate the first model to obtain a first test result.
5. The method of claim 1, wherein the first test result is an encrypted compressed label;
the platform receives a first test result from the terminal equipment, adjusts a first model structure and/or the first parameter of the first model according to the first test result, and obtains a second model, and the method comprises the following steps:
the platform receives the encrypted compressed label and decrypts and decompresses the encrypted compressed label;
The platform compares the result after decryption and decompression operation with a pre-stored real label to obtain an evaluation parameter;
the platform adjusts the first model structure of the first model according to the evaluation parameters to obtain a second model structure, and/or adjusts the number or the parameter value of the parameters of the first parameters to obtain second parameters.
6. The method according to claim 5, wherein said adjusting the first model structure of the first model according to the evaluation parameter to obtain a second model structure, and/or adjusting the number of parameters or parameter values of the first parameter to obtain a second parameter, comprises:
if the evaluation parameter characterizes that the time consumption of the first model is greater than a first time length threshold, reducing network blocks with the time consumption greater than a second time length threshold in a first model structure of the first model, and obtaining a second model structure; or
if the evaluation parameter characterizes that the accuracy of the first model is smaller than an accuracy threshold value, adjusting the number or the parameter value of the parameters of the first parameter to obtain a second parameter.
7. The method of claim 6, wherein the precision of the first model comprises a category target precision, a dimension precision, and a quantized precision;
And if the evaluation parameter characterizes that the accuracy of the first model is smaller than an accuracy threshold, adjusting the number or the parameter value of the parameters of the first parameter includes:
if the category target precision is smaller than the first setting precision, modifying the weight of the weak target in the first parameter; wherein the weak target is a target of identification error;
if the size precision is smaller than the second setting precision, modifying the weight of the weakness size in the first parameter; wherein the weakness size is a size of an identification error;
and if the quantized precision loss is larger than a loss threshold value, adjusting the parameters which characterize the training mode in the first parameters so as to use the training mode with lower precision than the current precision.
8. The method according to any one of claims 1 to 7, wherein the setting condition includes that the number of iterations is greater than a number threshold, or that the evaluation parameter is greater than an index threshold.
9. A model automation deployment platform, comprising:
a determining module for: determining a first model structure and initial training parameters according to set indexes from terminal equipment;
training module for: training the first model structure by using a training sample and the initial training parameters to obtain a first model; wherein the first model comprises the first model structure and a first parameter;
A data sending module, configured to: the platform sends the first model and the test sample to the terminal equipment so that the terminal equipment of the user can apply the test sample and the first parameter of the first model, and the first model is operated to obtain a first test result;
an adjustment module for: the platform receives a first test result from the terminal equipment, and adjusts a first model structure and/or the first parameter of the first model according to the first test result to obtain a second model; wherein the second model comprises a second model structure and a second parameter.
10. The model automatic deployment system is characterized by comprising a platform and terminal equipment;
the terminal equipment is used for testing the model from the platform and sending the test result to the platform;
the platform is used for executing the method of any one of claims 1-8.
11. A chip, wherein the chip comprises a processor and a memory; a processor coupled to the memory, the processor being configured to read a computer program stored in the memory, to cause the chip to perform the steps of the method according to any one of claims 1 to 8.
12. A computer readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the steps of the method of any of claims 1 to 8.
CN202310819279.5A, priority date 2023-07-05, filing date 2023-07-05; Model automatic deployment method, platform and system; status: Pending; publication CN116542344A

Priority Applications (1)

Application Number: CN202310819279.5A; Priority date: 2023-07-05; Filing date: 2023-07-05; Title: Model automatic deployment method, platform and system

Applications Claiming Priority (1)

Application Number: CN202310819279.5A; Priority date: 2023-07-05; Filing date: 2023-07-05; Title: Model automatic deployment method, platform and system

Publications (1)

Publication Number Publication Date
CN116542344A (published 2023-08-04)

Family

ID=87458207

Family Applications (1)

Application Number: CN202310819279.5A; Title: Model automatic deployment method, platform and system; Status: Pending; Publication: CN116542344A

Country Status (1)

Country Link
CN (1) CN116542344A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11636348B1 (en) * 2016-05-30 2023-04-25 Apple Inc. Adaptive training of neural network models at model deployment destinations
CN111178517A (en) * 2020-01-20 2020-05-19 上海依图网络科技有限公司 Model deployment method, system, chip, electronic device and medium
CN113762520A (en) * 2020-06-04 2021-12-07 杭州海康威视数字技术股份有限公司 Data processing method, device and equipment
CN113780466A (en) * 2021-09-27 2021-12-10 重庆紫光华山智安科技有限公司 Model iterative optimization method and device, electronic equipment and readable storage medium
WO2023050707A1 (en) * 2021-09-28 2023-04-06 苏州浪潮智能科技有限公司 Network model quantization method and apparatus, and computer device and storage medium
CN114594963A (en) * 2022-03-21 2022-06-07 深圳市商汤科技有限公司 Model deployment method and device, electronic equipment and storage medium
CN114861836A (en) * 2022-07-05 2022-08-05 浙江大华技术股份有限公司 Model deployment method based on artificial intelligence platform and related equipment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
SATVIK GARG ET AL.: "On Continuous Integration / Continuous Delivery for Automated Deployment of Machine Learning Models using MLOps", 《2021 IEEE FOURTH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND KNOWLEDGE ENGINEERING (AIKE)》, pages 25 - 28 *
XU Mengwei et al.: "An Autonomous Learning System for Intelligence on Mobile Terminals" (面向移动终端智能的自治学习系统), Journal of Software (《软件学报》), vol. 31, pp. 3004-3018 *
XU Zhili et al.: "AI Model Deployment Scheme Based on Cloud-Native Service Mesh" (基于云原生服务网格的AI模型部署方案), Designing Techniques of Posts and Telecommunications (《邮电设计技术》), no. 03, pp. 32-36 *


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination