CN114691111A - Code recognition model training method and device based on visualization - Google Patents

Code recognition model training method and device based on visualization

Info

Publication number
CN114691111A
CN114691111A (application number CN202011586098.5A)
Authority
CN
China
Prior art keywords
model
code
training
trained
recognition model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011586098.5A
Other languages
Chinese (zh)
Inventor
刘晓娟
李兆军
尹非凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd filed Critical Beijing Sankuai Online Technology Co Ltd
Priority to CN202011586098.5A priority Critical patent/CN114691111A/en
Publication of CN114691111A publication Critical patent/CN114691111A/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00: Arrangements for software engineering
    • G06F 8/30: Creation or generation of source code
    • G06F 8/34: Graphical or visual programming
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The embodiment of the disclosure provides a visualization-based code recognition model training method and device. The method includes: displaying a model configuration page; obtaining a model training mode, a model network structure, and hyper-parameters entered by a user on the model configuration page; obtaining a to-be-trained code recognition model matching the model network structure; obtaining, according to the model training mode, pre-stored code training samples corresponding to the to-be-trained code recognition model; and loading the hyper-parameters and training the to-be-trained code recognition model on the code training samples. The embodiment of the disclosure can save considerable manpower, material resources, and time, reduce the amount of code to be developed, shorten model training time, and improve model training efficiency.

Description

Code recognition model training method and device based on visualization
Technical Field
The embodiment of the disclosure relates to the technical field of model training, in particular to a code recognition model training method and device based on visualization.
Background
Deep learning has become a major trend in industrial intelligence and is widely applied across business lines such as takeout, delivery, Daxiang (enterprise messaging), bike sharing, and Kuailv (supply chain).
The core of deep learning is finding, within a complex network, an optimal model that fits the data. Deep learning therefore places great emphasis on model design: even a simple change or minor adjustment to the structure can substantially improve model performance, so flexible and diverse model configuration is essential. Hyper-parameters likewise play a critical role in deep learning model training and can be used to control algorithm behavior.
At present, in the training mode based on visual deep learning model components, an algorithm platform provides mainstream deep learning model components, and users build experiments in a visual interface by dragging components; user-defined network structures are not supported. In the training mode based on user-defined deep learning code, the algorithm platform integrates the open-source JupyterLab to provide an interactive programming environment in which users can write, debug, and run code in the IDE.
In the training mode based on visual deep learning model components, the platform constrains the model's input and output types and algorithms, which severely limits the problem types and techniques available for predictive and prescriptive modeling. Users can train models only with a limited set of deep learning algorithms and cannot define new algorithms or network structures. Experimentally hand-tuning hyper-parameters such as the type and number of layers, connection modes, and nodes is difficult, so flexibility is low. Deep learning algorithms are complex and poorly generalizable, and mainstream deep learning methods cannot cover complex business scenarios; developing new algorithms and models requires large amounts of manpower, material resources, and time. The customizability, adaptability, input/output uncertainty, and hyper-parameter diversity of deep learning models make it difficult for the platform to provide tailored adaptation. In the training mode based on user-defined deep learning code, on the other hand, users must write complete model training code themselves, which demands strong skills and raises the entry barrier. Code with the same function has to be developed repeatedly by different users, so the development workload is heavy, the process is time-consuming, and model training efficiency is low.
Disclosure of Invention
The embodiment of the disclosure provides a visualization-based code recognition model training method and device, which minimize the limitations of model training and the user's cost of making mistakes, reduce labor cost and resource waste, spare users from developing code, reduce the amount of code to be developed, and improve model training efficiency.
According to a first aspect of embodiments of the present disclosure, there is provided a visualization-based code recognition model training method, including:
displaying a model configuration page;
obtaining a model training mode, a model network structure and a hyper-parameter input by a user in the model configuration page;
acquiring a to-be-trained code recognition model matched with the model network structure;
according to the model training mode, obtaining a pre-stored code training sample corresponding to the code recognition model to be trained;
and loading the hyper-parameters, and training the code recognition model to be trained according to the code training sample.
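Purely for illustration, the five-step flow above can be sketched as a small Python pipeline. Every store, function, and key name below is a hypothetical stand-in, not part of the disclosure:

```python
# Illustrative sketch of the claimed flow; all names here are hypothetical.

# Stand-ins for the platform's pre-stored model and sample stores.
MODEL_STORE = {"cnn-3layer": {"kind": "cnn", "layers": 3}}
SAMPLE_STORE = {"code-classify": [("def f(): pass", "python")]}

def train_from_page_config(config):
    """config mimics what the user enters on the model configuration page."""
    model = MODEL_STORE[config["network_structure"]]    # match the network structure
    samples = SAMPLE_STORE[config["training_mode"]]     # fetch pre-stored training samples
    hyper = dict(config["hyper_parameters"])            # load the hyper-parameters
    # "Training" is reduced to recording how many samples were consumed.
    return {"model": model, "hyper": hyper, "trained_on": len(samples)}

result = train_from_page_config({
    "training_mode": "code-classify",
    "network_structure": "cnn-3layer",
    "hyper_parameters": {"lr": 0.01, "epochs": 5},
})
```

The point of the sketch is only the data flow: one page-level configuration object drives model lookup, sample retrieval, and hyper-parameter loading.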
Optionally, before the loading the hyper-parameter and training the to-be-trained code recognition model according to the code training sample, the method further includes:
obtaining model input parameters and model output parameters input by the user in the model configuration page;
the loading the hyper-parameters and training the code recognition model to be trained according to the code training sample comprises the following steps:
inputting the code training sample into the code recognition model to be trained according to the model input parameter;
obtaining a code prediction result corresponding to the code training sample output by the to-be-trained code recognition model according to the model output parameter;
calculating to obtain a loss value corresponding to the to-be-trained code recognition model according to an initial code result corresponding to the code training sample and the code prediction result;
and under the condition that the loss value is in a preset range, taking the trained code recognition model as a final target code recognition model.
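As a hedged illustration of the loss-based acceptance condition above: the disclosure does not fix a specific loss function or range, so both the mean squared error and the bounds below are assumptions.

```python
def mse_loss(initial_results, predictions):
    # One possible loss between the initial code results and the model's
    # code prediction results (the specific loss function is an assumption).
    return sum((a - b) ** 2 for a, b in zip(initial_results, predictions)) / len(predictions)

def within_preset_range(loss, low=0.0, high=0.05):
    # The "preset range" deciding whether the trained model is accepted as
    # the final target code recognition model; the bounds are assumed.
    return low <= loss <= high

loss = mse_loss([1.0, 0.0, 1.0], [0.9, 0.1, 0.95])
accept_as_target_model = within_preset_range(loss)
```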
Optionally, before the loading the hyper-parameter and training the to-be-trained code recognition model according to the code training sample, the method further includes:
obtaining a loss function type input by the user in the model configuration page;
the calculating to obtain the loss value corresponding to the to-be-trained code recognition model according to the initial code result corresponding to the code training sample and the code prediction result includes:
and calculating to obtain a loss value which is matched with the type of the loss function and corresponds to the to-be-trained code recognition model according to the initial code result and the code prediction result.
Optionally, after the calculating, according to the initial code result and the code prediction result corresponding to the code training sample, a loss value corresponding to the to-be-trained code recognition model, the method further includes:
under the condition that the loss value is out of the preset range, obtaining a loss gradient corresponding to the loss value;
adjusting the hyper-parameter according to the loss gradient to obtain a target hyper-parameter;
and loading the target hyper-parameter, and iteratively executing the training process of the to-be-trained code recognition model according to the code training sample until the loss value corresponding to the to-be-trained code recognition model falls within the preset range.
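The iterate-until-in-range loop above can be sketched with a toy one-dimensional loss. Treating the loss as a function of a single scalar hyper-parameter is a deliberate simplification (real losses depend on the model weights too), and the step size and threshold are assumptions:

```python
def loss_fn(h):
    # Toy loss as a function of one scalar hyper-parameter (an assumption).
    return (h - 0.3) ** 2

def loss_gradient(h):
    return 2.0 * (h - 0.3)

def tune_hyper_parameter(h, preset_max=1e-4, step=0.4, max_iters=100):
    """While the loss is outside the preset range, adjust the hyper-parameter
    along the loss gradient and retrain; stop once the loss is in range."""
    for _ in range(max_iters):
        if loss_fn(h) <= preset_max:         # loss within the preset range
            return h
        h = h - step * loss_gradient(h)      # gradient-based adjustment
    return h

target_h = tune_hyper_parameter(1.0)
```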
Optionally, before the loading the hyper-parameter and training the to-be-trained code recognition model according to the code training sample, the method further includes:
verifying the configuration validity of the model input parameters, the model output parameters and the hyper-parameters;
generating configuration error prompt information under the condition that the configuration validity does not meet the model training condition;
and sending the configuration error prompt message to the user, so that the user reconfigures the model input parameters, the model output parameters and/or the hyper-parameters according to the configuration error prompt message.
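A minimal sketch of the configuration validity check and error prompt described above. The concrete rules (non-empty parameter lists, a positive learning rate) are assumed examples; the disclosure does not enumerate specific checks:

```python
def check_config_validity(model_input_params, model_output_params, hyper_params):
    """Return (is_valid, error_prompts); the rules are assumed examples of
    the validity checks performed before training."""
    prompts = []
    if not model_input_params:
        prompts.append("model input parameters are missing")
    if not model_output_params:
        prompts.append("model output parameters are missing")
    if hyper_params.get("lr", 0) <= 0:
        prompts.append("hyper-parameter 'lr' must be positive")
    return (len(prompts) == 0, prompts)

valid, error_prompts = check_config_validity([], ["label"], {"lr": -0.1})
# Not valid: the caller would send error_prompts to the user for reconfiguration.
```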
Optionally, the loading the hyper-parameter and training the to-be-trained code recognition model according to the code training sample includes:
assigning values to the hyper-parameters to obtain assigned hyper-parameters;
and loading the assigned hyper-parameters, and training the code recognition model to be trained according to the code training sample.
Optionally, before the loading the hyper-parameter and training the to-be-trained code recognition model according to the code training sample, the method further includes:
acquiring a hook function input by the user in the model configuration page;
the training the to-be-trained code recognition model according to the code training sample comprises the following steps:
and automatically loading the hook function in the training process of the to-be-trained code recognition model according to the code training sample.
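One way the automatic loading of user-supplied hook functions during training could look, as a dependency-free sketch (the trainer class, event names, and hook signature are all illustrative assumptions):

```python
class SketchTrainer:
    """Minimal trainer that automatically loads user-supplied hooks at
    fixed points of the training loop (all names are illustrative)."""

    def __init__(self, hooks=None):
        self.hooks = hooks or {}

    def _fire(self, event, **kwargs):
        fn = self.hooks.get(event)
        if fn is not None:
            fn(self, **kwargs)

    def train(self, samples):
        self._fire("on_train_begin")
        for step, _sample in enumerate(samples):
            self._fire("on_step_end", step=step)   # hook loaded automatically
        self._fire("on_train_end")

seen_steps = []
trainer = SketchTrainer(hooks={"on_step_end": lambda t, step: seen_steps.append(step)})
trainer.train(["sample_a", "sample_b", "sample_c"])
```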
According to a second aspect of embodiments of the present disclosure, there is provided a visualization-based code recognition model training apparatus, including:
the configuration page display module is used for displaying a model configuration page;
the model parameter acquisition module is used for acquiring a model training mode, a model network structure and a hyper-parameter input by a user in the model configuration page;
the model to be trained acquisition module is used for acquiring a code recognition model to be trained matched with the model network structure;
a code sample obtaining module, configured to obtain, according to the model training mode, a pre-stored code training sample corresponding to the to-be-trained code recognition model;
and the code model training module is used for loading the hyper-parameters and training the code recognition model to be trained according to the code training sample.
Optionally, the apparatus further includes:
an input/output parameter obtaining module, configured to obtain a model input parameter and a model output parameter that are input by the user in the model configuration page;
the code model training module comprises:
the code sample input unit is used for inputting the code training sample to the to-be-trained code recognition model according to the model input parameters;
the prediction result output unit is used for acquiring a code prediction result corresponding to the code training sample output by the to-be-trained code recognition model according to the model output parameter;
the loss value calculation unit is used for calculating and obtaining a loss value corresponding to the to-be-trained code recognition model according to the initial code result corresponding to the code training sample and the code prediction result;
and the target code model acquisition unit is used for taking the trained code recognition model as a final target code recognition model under the condition that the loss value is within a preset range.
Optionally, the apparatus further includes:
a loss function type obtaining module, configured to obtain a loss function type input by the user in the model configuration page;
the loss value calculation unit includes:
and the loss value calculating subunit is used for calculating, according to the initial code result and the code prediction result, a loss value that matches the loss function type and corresponds to the to-be-trained code recognition model.
Optionally, the apparatus further includes:
the loss gradient acquisition module is used for acquiring a loss gradient corresponding to the loss value under the condition that the loss value is out of the preset range;
the target hyper-parameter acquisition module is used for adjusting the hyper-parameters according to the loss gradient to obtain target hyper-parameters;
and the model iteration training module is used for loading the target hyper-parameter, and iteratively executing the training process of the to-be-trained code recognition model according to the code training sample until the loss value corresponding to the to-be-trained code recognition model is in the preset range.
Optionally, the apparatus further includes:
the configuration validity checking module is used for checking the configuration validity of the model input parameters, the model output parameters and the hyper-parameters;
the configuration prompt information generation module is used for generating configuration error prompt information under the condition that the configuration validity does not meet the model training condition;
and the configuration prompt information sending module is used for sending the configuration error prompt information to the user so that the user reconfigures the model input parameters, the model output parameters and the hyper-parameters according to the configuration error prompt information.
Optionally, the code model training module includes:
the hyper-parameter assignment unit is used for assigning values to the hyper-parameters to obtain assigned hyper-parameters;
and the code model training unit is used for loading the assigned hyper-parameters and training the code recognition model to be trained according to the code training sample.
Optionally, the apparatus further includes:
the hook function acquisition module is used for acquiring a hook function input by the user in the model configuration page;
the code model training module comprises:
and the hook function loading unit is used for automatically loading the hook function in the training process of the to-be-trained code recognition model according to the code training sample.
According to a third aspect of embodiments of the present disclosure, there is provided an electronic apparatus including:
a processor, a memory, and a computer program stored on the memory and executable on the processor, the processor implementing any one of the visualization-based code recognition model training methods described above when executing the program.
According to a fourth aspect of embodiments of the present disclosure, there is provided a readable storage medium, wherein instructions, when executed by a processor of an electronic device, enable the electronic device to perform any one of the visualization-based code recognition model training methods described above.
The embodiment of the disclosure provides a visualization-based code recognition model training method and device. The method displays a model configuration page; obtains a model training mode, a model network structure, and hyper-parameters entered by a user on the model configuration page; obtains a to-be-trained code recognition model matching the model network structure; obtains, according to the model training mode, pre-stored code training samples corresponding to the to-be-trained code recognition model; and loads the hyper-parameters and trains the to-be-trained code recognition model on the code training samples. Embodiments of the disclosure let the user define the model training mode, the model network structure, and the hyper-parameters, avoiding both the experimental hand-tuning of hyper-parameters such as layer type and number, connection modes, and nodes, and the repeated development of the same code by users, thereby saving considerable manpower, material resources, and time, reducing the amount of code to be developed, shortening model training time, and improving model training efficiency.
Drawings
To describe the technical solutions of the embodiments of the present disclosure more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present disclosure, and those skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 is a flowchart illustrating steps of a visualization-based code recognition model training method according to an embodiment of the present disclosure;
FIG. 2 is a flowchart illustrating steps of another visualization-based code recognition model training method according to an embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of a visualization-based code recognition model training apparatus according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of another visualization-based code recognition model training apparatus according to an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of a code model training module according to an embodiment of the present disclosure.
Detailed Description
Technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are some, but not all, of the embodiments of the present disclosure. All other embodiments, which can be obtained by a person skilled in the art without making creative efforts based on the embodiments of the present disclosure, belong to the protection scope of the embodiments of the present disclosure.
Example one
Referring to fig. 1, a flowchart illustrating steps of a visualized code recognition model training method provided by an embodiment of the present disclosure is shown, and as shown in fig. 1, the visualized code recognition model training method may specifically include the following steps:
step 101: a model configuration page is displayed.
Embodiments of the present disclosure may be applied in scenarios where model training patterns, model network structures, and hyper-parameters of a model are customized by a user for model training.
The model configuration page refers to a page preset by an operator and used for setting relevant parameters for model training by a user, in this embodiment, a model editor may be preset, when the user needs to set the relevant parameters for model training, the model editor may be started, at this time, a model configuration page may be displayed, and the user sets the relevant parameters required for model training in the model configuration page.
When a user desires some type of code recognition model training, a model configuration page may be opened and displayed.
After the model configuration page is displayed, step 102 is performed.
Step 102: and acquiring a model training mode, a model network structure and a hyper-parameter input by a user in the model configuration page.
The model training mode indicates the type of model to be trained, that is, what kind of model the training is meant to produce; once the model training mode is determined, the training samples required for training the model can be determined.
The model network structure refers to the infrastructure of the model to be trained, such as which network layers are included, whether sub-models need to be spliced, and the like.
The hyper-parameters are parameters set before model training starts; they can be optimized during training, and a set of optimal hyper-parameters can be selected for training so as to improve the performance and effect of the trained model.
After the model configuration page is displayed, the user can input, in the model configuration page, the model training mode, the model network structure, and the hyper-parameters corresponding to the code recognition model to be trained.
After acquiring the model training mode, the model network structure and the hyper-parameters corresponding to the code recognition model to be trained input by the user in the model configuration page, step 103 and step 104 are executed.
Step 103: and acquiring a to-be-trained code recognition model matched with the model network structure.
The code recognition model to be trained refers to a code recognition model which is determined according to a model network structure and is not trained yet.
After the model network structure input by the user is obtained, the code recognition model to be trained can be determined according to it. Specifically, when the user clicks the run button on the model configuration page, a corresponding deep learning framework can be pulled from a repository according to the environment, and the user-defined model network structure is automatically loaded, thereby obtaining the code recognition model to be trained.
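Assembling a model from a user-defined network structure might look like the following sketch. The spec format (a list of layer kind/size pairs) and the supported layer kinds are assumptions for illustration, not the platform's actual interface:

```python
def build_model_from_structure(structure):
    """Assemble a toy 'model' from a user-defined network structure spec;
    the spec format and layer kinds are illustrative assumptions."""
    supported = {"dense", "relu", "dropout"}
    layers = []
    for kind, size in structure:
        if kind not in supported:
            raise ValueError(f"unsupported layer kind: {kind}")
        layers.append({"kind": kind, "size": size})
    return {"layers": layers, "trained": False}

model = build_model_from_structure([("dense", 128), ("relu", 128), ("dense", 10)])
```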
Step 104: and obtaining a pre-stored code training sample corresponding to the code recognition model to be trained according to the model training mode.
After the model training mode input by the user in the model configuration page is obtained, the code training samples required by the code recognition model to be trained can be obtained from the sample repository according to the model training mode.
It can be understood that different model training modes for the same network structure require different samples. Therefore, in this embodiment, model training modes are preset, with different modes corresponding to different model types; after the model training mode entered by the user on the model configuration page is obtained, the code training samples required for training the corresponding type of model can be obtained.
In this embodiment, the obtained code training samples may be code authorized by other users and pre-stored in a code database, or code downloaded from the Internet; the specific manner of obtaining the code training samples may be determined according to business requirements, which is not limited in this embodiment.
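The two sample sources mentioned (a pre-stored code database and a remote download) can be sketched as a lookup with an optional fallback. The database contents and the fetcher are hypothetical:

```python
def get_training_samples(training_mode, code_db, fetch_remote=None):
    """Look up pre-stored samples for the given training mode, optionally
    falling back to a remote fetcher (e.g. an Internet download)."""
    samples = code_db.get(training_mode)
    if samples is None and fetch_remote is not None:
        samples = fetch_remote(training_mode)
    if samples is None:
        raise LookupError(f"no code training samples for mode: {training_mode}")
    return samples

code_db = {"defect-detect": ["int main() { return 0; }"]}
local = get_training_samples("defect-detect", code_db)
remote = get_training_samples("style-check", code_db,
                              fetch_remote=lambda mode: ["// fetched for " + mode])
```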
After the code recognition model to be trained and the code training sample corresponding to the code recognition model to be trained are obtained, step 105 is executed.
Step 105: and loading the hyper-parameters, and training the code recognition model to be trained according to the code training sample.
After the code recognition model to be trained and the code training sample are obtained, the hyper-parameters input by the user in the model configuration page can be loaded, and the code recognition model to be trained is trained according to the code training sample.
The embodiment of the disclosure allows a user to customize the model training mode, the model network structure, and the hyper-parameters of the model, avoiding both the experimental hand-tuning of hyper-parameters such as layer type and number, connection modes, and nodes, and the repeated development of code by users, thereby saving considerable manpower, material resources, and time.
In this embodiment, values may also be assigned to the user-configured hyper-parameters before training, as described below in conjunction with a specific implementation.
In a specific implementation manner of the present disclosure, the step 105 may include:
substep A1: and assigning the super parameters to obtain assigned super parameters.
In the embodiment of the disclosure, a distributed computing module may be preset. When the user runs a model training task from the visual model configuration page, the configuration made in the visual interface takes effect in the distributed computing module. During distributed computing, the specified dataset can be loaded into partitions for the platform's distributed computation to complete feature preprocessing; intermediate results are assigned to the user-specified hyper-parameters; input and output data are converted according to the user-specified format and stored as HDFS files in TFRecord format; and the hyper-parameters are encapsulated into a specified format.
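The "assign intermediate results to user-specified hyper-parameters, then encapsulate in a specified format" step can be illustrated without the TFRecord/HDFS machinery. JSON stands in for the platform's serialization format here, purely as an assumption:

```python
import json

def encapsulate_hyper_parameters(declared, intermediate_results):
    """Assign intermediate results to the user-declared hyper-parameters and
    encapsulate them in a specified format (JSON here, standing in for the
    platform's TFRecord/HDFS pipeline)."""
    assigned = dict(declared)
    for name, value in intermediate_results.items():
        if name in assigned:        # only user-specified hyper-parameters are assigned
            assigned[name] = value
    return json.dumps(assigned, sort_keys=True)

payload = encapsulate_hyper_parameters(
    {"lr": None, "batch_size": 32},
    {"lr": 0.005, "vocab_size": 9000},   # vocab_size was never declared: ignored
)
```

This mirrors the separation of declaration (on the page) from assignment (during distributed computation) described above.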
Of course, in this embodiment, the hyper-parameters may also be assigned according to the experience of developers, so as to obtain assigned hyper-parameters.
It can be understood that the manner of assigning the hyper-parameters may be determined according to business requirements, and this embodiment is not limited thereto.
After assigning the hyper-parameters to obtain assigned hyper-parameters, sub-step A2 is performed.
Substep A2: and loading the assignment hyperparameters, and training the code recognition model to be trained according to the code training sample.
After the assigned hyper-parameters are obtained, they can be loaded, and the code recognition model to be trained is trained according to the code training samples.
The embodiment of the disclosure separates hyper-parameter setting from hyper-parameter assignment, enables dynamic assignment of hyper-parameters without developer intervention, and saves labor cost.
In this embodiment, a hook function may also be defined by the user on the model configuration page to enable advanced functionality during model training, as described below in conjunction with a specific implementation.
In another specific implementation manner of the present disclosure, before the step 105, the method may further include:
step B1: and acquiring the hook function input by the user in the model configuration page.
The step 105 may include:
substep C1: and automatically loading the hook function in the training process of the to-be-trained code recognition model according to the code training sample.
In the embodiment of the disclosure, a hook function may be set by the user in the model configuration page in advance, and the hook function is automatically loaded during the training of the code recognition model to be trained according to the code training samples.
By letting the user define advanced functions in the model configuration page, the embodiment of the disclosure makes those functions available during model training and improves the configurability of the code recognition model.
According to the visualization-based code recognition model training method provided by the embodiment of the disclosure, a model configuration page is displayed; a model training mode, a model network structure, and hyper-parameters entered by a user on the model configuration page are obtained; a to-be-trained code recognition model matching the model network structure is obtained; pre-stored code training samples corresponding to the to-be-trained code recognition model are obtained according to the model training mode; and the hyper-parameters are loaded and the to-be-trained code recognition model is trained on the code training samples. Embodiments of the disclosure let the user define the model training mode, the model network structure, and the hyper-parameters, avoiding both the experimental hand-tuning of hyper-parameters such as layer type and number, connection modes, and nodes, and the repeated development of the same code by users, thereby saving considerable manpower, material resources, and time, reducing the amount of code to be developed, shortening model training time, and improving model training efficiency.
Example two
Referring to fig. 2, a flowchart illustrating steps of another visualized code recognition model training method provided in an embodiment of the present disclosure is shown, and as shown in fig. 2, the visualized code recognition model training method may specifically include the following steps:
step 201: a model configuration page is displayed.
Embodiments of the present disclosure may be applied in scenarios where model training patterns, model network structures, and hyper-parameters of a model are customized by a user for model training.
The model configuration page refers to a page preset by an operator and used for setting relevant parameters for model training by a user, in this embodiment, a model editor may be preset, when the user needs to set the relevant parameters for model training, the model editor may be started, at this time, a model configuration page may be displayed, and the user sets the relevant parameters required for model training in the model configuration page.
When a user desires some type of code recognition model training, a model configuration page may be opened and displayed.
After the model configuration page is displayed, step 202 is performed.
Step 202: and acquiring a model training mode, a model network structure and a hyper-parameter input by a user in the model configuration page.
The model training mode indicates the type of model to be trained, that is, what kind of model the training is meant to produce; once the model training mode is determined, the training samples required for training the model can be determined.
The model network structure refers to the infrastructure of the model to be trained, such as which network layers are included, whether sub-models need to be spliced, and the like.
The hyper-parameters are parameters set before model training starts. They can be optimized during training, and a set of optimal hyper-parameters is selected so as to improve the performance and effect of the trained model.
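As a concrete illustration, the three kinds of user input described above might be captured in a structure such as the following; every key and value here is a hypothetical example invented for the sketch, not a field defined by the disclosure.

```python
# Illustrative shape of the user's entries in the model configuration page.
configuration = {
    "training_mode": "code_classification",   # selects the model type and its samples
    "network_structure": [                    # user-defined layers to assemble
        {"kind": "embedding", "units": 64},
        {"kind": "lstm", "units": 128},
    ],
    "hyper_parameters": {                     # set before training starts
        "learning_rate": 0.01,
        "batch_size": 32,
        "epochs": 10,
    },
}
print(configuration["training_mode"])
```

A real configuration page would serialize its form fields into a comparable structure before handing it to the training backend.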
After the model configuration page is displayed, the user can enter, in the model configuration page, the model training mode, the model network structure and the hyper-parameters corresponding to the code recognition model to be trained.
Step 203: and obtaining the model input parameters and the model output parameters input by the user in the model configuration page.
The model input parameters describe the code fields that can be fed into the to-be-trained code recognition model in each training step, and may include parameters such as the number of input fields and the types of the input fields.
The model output parameters describe the code fields that the to-be-trained code recognition model can output in each training step, and may include parameters such as the number of output fields and the types of the output fields.
In this example, the model input parameters and the model output parameters may also be input by the user within the model configuration page.
Step 204: and acquiring a to-be-trained code recognition model matched with the model network structure.
The code recognition model to be trained refers to a code recognition model which is determined according to a model network structure and is not trained yet.
After the model network structure entered by the user is obtained, the to-be-trained code recognition model can be determined from it. Specifically, when the user clicks the run button on the model configuration page, the corresponding deep learning framework can be pulled from a repository according to the environment and the user-defined model network structure loaded automatically, yielding the to-be-trained code recognition model.
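The matching step can be sketched as a small builder that turns the user-defined structure description into an untrained model object. The class and function names below are illustrative assumptions; a real implementation would instantiate layers from the pulled deep learning framework instead of plain objects.

```python
# Hypothetical sketch: assemble an untrained code recognition model from a
# user-defined network-structure description. All names are illustrative.

class Layer:
    def __init__(self, kind, units):
        self.kind = kind          # e.g. "embedding", "lstm", "dense"
        self.units = units

class CodeRecognitionModel:
    def __init__(self, layers):
        self.layers = layers      # ordered list of Layer objects
        self.trained = False      # the model has not been trained yet

def build_model(structure):
    """structure: list of dicts such as {"kind": "lstm", "units": 128}."""
    return CodeRecognitionModel([Layer(s["kind"], s["units"]) for s in structure])

model = build_model([
    {"kind": "embedding", "units": 64},
    {"kind": "lstm", "units": 128},
    {"kind": "dense", "units": 10},
])
```

The builder is the piece that makes the structure "user-defined": the page only has to emit the list of layer specifications.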
Step 205: and acquiring a pre-stored code training sample corresponding to the code recognition model to be trained according to the model training mode.
After the model training mode entered by the user in the model configuration page is obtained, the code training samples required by the to-be-trained code recognition model can be fetched from a sample repository according to that model training mode.
It can be understood that, even for the same network structure, different model training modes require different samples. Therefore, in this embodiment, model training modes are preset so that each mode corresponds to a model type; after the mode entered by the user on the model configuration page is obtained, the code training samples needed to train a model of that type can be retrieved.
In this embodiment, the obtained code training samples may be code authorized by other users and pre-stored in a code database, or code downloaded from the internet. The specific way of obtaining the code training samples may be determined according to business requirements, which is not limited in this embodiment.
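A minimal sketch of retrieving pre-stored samples by training mode might use an in-memory repository as below; the mode names, sample format, and function name are invented for illustration and are not part of the disclosure.

```python
# Hypothetical in-memory sample repository keyed by model training mode;
# each mode corresponds to a model type and therefore to its own samples.
SAMPLE_REPOSITORY = {
    "code_classification": [("def add(a, b): return a + b", "function")],
    "code_completion":     [("for i in range(", "loop_header")],
}

def fetch_training_samples(training_mode):
    """Return the pre-stored code training samples for the given mode."""
    if training_mode not in SAMPLE_REPOSITORY:
        raise KeyError(f"no samples stored for mode {training_mode!r}")
    return SAMPLE_REPOSITORY[training_mode]

samples = fetch_training_samples("code_classification")
```

In practice the repository would be a database or object store, but the lookup contract — mode in, labeled code samples out — stays the same.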
Step 206: and loading the hyper-parameters, and inputting the code training sample into the code recognition model to be trained according to the model input parameters.
After the code recognition model to be trained and the code training sample are obtained, the hyper-parameters input by the user in the model configuration page can be loaded, and the code training sample is input into the code recognition model to be trained according to the model input parameters, so that the code training sample is correspondingly processed by the code recognition model to be trained.
Step 207: and obtaining a code prediction result corresponding to the code training sample output by the to-be-trained code recognition model according to the model output parameter.
After the input code training sample matched with the model input parameter is processed by the to-be-trained code recognition model, a code prediction result corresponding to the code training sample output by the to-be-trained code recognition model can be obtained according to the model output parameter.
Step 208: and calculating to obtain a loss value corresponding to the to-be-trained code recognition model according to the initial code result corresponding to the code training sample and the code prediction result.
The loss value is a numerical value used to measure how well the to-be-trained code recognition model has been trained.
After the code prediction result is obtained, the loss value corresponding to the to-be-trained code recognition model can be calculated and obtained by combining the initial code result and the code prediction result corresponding to the code training sample.
After the loss value corresponding to the to-be-trained code recognition model is calculated, step 209 is executed.
Step 209: and under the condition that the loss value is in a preset range, taking the trained code recognition model as a final target code recognition model.
The preset range refers to a range preset by a service person and used for determining whether a loss value of the code recognition model to be trained meets a service requirement, and a specific numerical value of the preset range may be determined according to the service requirement, which is not limited in this embodiment.
After the loss value corresponding to the to-be-trained code recognition model is obtained through calculation, it can be judged whether the loss value is within the preset range.
If the loss value is not within the preset range, the to-be-trained code recognition model does not yet meet the service requirement; at this point, more code training samples can be obtained to continue training it.
If the loss value is within the preset range, the to-be-trained code recognition model meets the business requirements; at this point, it can be taken as the target code recognition model and used in subsequent code recognition scenarios.
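The accept-or-continue decision of steps 208-209 can be sketched as a loop that stops once the loss value enters the preset range. The linear loss decrease below is a stand-in for one real pass over the code training samples, used only so the sketch is self-contained.

```python
def train_until_in_range(initial_loss, preset_max, step=0.5, max_iters=100):
    """Keep training until the loss value falls inside the preset range."""
    loss = initial_loss
    for iteration in range(max_iters):
        if loss <= preset_max:       # loss within preset range:
            return loss, iteration   # accept as the target code recognition model
        loss -= step                 # stand-in for one training update
    return loss, max_iters           # budget exhausted without converging

final_loss, iterations = train_until_in_range(initial_loss=5.0, preset_max=0.1)
print(final_loss, iterations)  # 0.0 10
```

The preset range here is a single upper bound; a two-sided range would only change the `if` condition.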
In the embodiment of the disclosure, the model training mode, the model network structure, the hyper-parameters, the model input parameters and the model output parameters can all be customized by the user, which avoids the experimental, manual tuning of hyper-parameters such as layer type, layer count, connection mode and node count, as well as the repeated development and compilation of code by the user, thereby saving a large amount of manpower, material resources and time.
In this embodiment, the loss function type may also be customized on a model configuration page by a user to determine a loss value calculation manner of the code recognition model, and specifically, the detailed description may be made in combination with the following specific implementation manner.
In another specific implementation manner of the present disclosure, the method may further include:
step D1: and obtaining the loss function type input by the user in the model configuration page.
The step 208 may include:
sub-step E1: and calculating to obtain a loss value which is matched with the type of the loss function and corresponds to the to-be-trained code recognition model according to the initial code result and the code prediction result.
In the embodiments of the present disclosure, the loss function type refers to a function type for calculating the loss value, i.e., the loss function type indicates a specific way of calculating the loss value.
The loss function type can be customized by a user in the model configuration page to be used as the type of the loss function calculated in the subsequent model training process.
After the code prediction result output by the to-be-trained code recognition model is obtained, a loss value corresponding to the to-be-trained code recognition model and matched with the type of the loss function can be obtained through calculation according to the initial code result and the code prediction result.
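Dispatching on the user-selected loss function type might look like the following; the type names "mse" and "cross_entropy" are illustrative choices for the sketch, not types enumerated by the disclosure.

```python
import math

def compute_loss(loss_type, expected, predicted):
    """Compute the loss matching the user-selected loss function type."""
    if loss_type == "mse":
        # mean squared error between initial code result and prediction
        return sum((e - p) ** 2 for e, p in zip(expected, predicted)) / len(expected)
    if loss_type == "cross_entropy":
        eps = 1e-12   # guard against log(0)
        # expected: one-hot labels, predicted: probabilities
        return -sum(e * math.log(p + eps) for e, p in zip(expected, predicted))
    raise ValueError(f"unsupported loss function type: {loss_type!r}")

mse = compute_loss("mse", [1.0, 0.0], [0.8, 0.2])
```

The `ValueError` branch is what the configuration-validity check of the next section would catch before training starts.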
According to the embodiment of the disclosure, the loss function type of the model can be customized by the user, so that the way the loss value is calculated is fixed in advance, realizing autonomous configuration and controllability of model training.
In this embodiment, before training the code recognition model to be trained, the validity of the parameters of the model training defined by the user may be checked, and specifically, the following detailed description may be made in conjunction with the following specific implementation manner.
In a specific implementation manner of the embodiment of the present disclosure, before step 206, the method may further include:
step F1: and verifying the configuration validity of the model input parameters, the model output parameters and the hyper-parameters.
In embodiments of the present disclosure, configuration validity may be used to indicate whether a user-configured model parameter is valid.
Before the code recognition model to be trained needs to be trained, validity of model input parameters, model output parameters and hyper-parameters configured by a user can be verified.
After the configuration validity of the model input parameters, the model output parameters, and the hyper-parameters is checked, step F2 is performed.
Step F2: and generating configuration error prompt information under the condition that the configuration validity does not meet the model training condition.
The configuration error prompt message refers to a prompt message indicating that the model parameters configured by the user are invalid.
In this example, the configuration error prompt message may be a text prompt, a voice prompt, or the like; the specific form of the prompt may be determined according to business requirements, which is not limited in this embodiment.
If the configuration validity does not satisfy the model training condition, configuration error prompt information may be generated, and step F3 is executed.
Step F3: and sending the configuration error prompt message to the user, so that the user reconfigures the model input parameters, the model output parameters and/or the hyper-parameters according to the configuration error prompt message.
After the configuration error prompt message is generated, it can be sent to the user, so that the user can reconfigure the model input parameters, the model output parameters and/or the hyper-parameters according to the message; this check-and-prompt cycle may repeat until the user has corrected the configuration.
According to the embodiment of the disclosure, the validity of the model parameters configured by the user is verified, and the user is prompted to modify them when the check fails the training conditions; this avoids errors being reported during model training and the resulting waste of resources, blind waiting, and even invalid results.
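An illustrative validity check for steps F1-F3 is sketched below. The concrete rules (positive field counts, learning rate in (0, 1]) are assumptions made for the sketch, not requirements stated in the disclosure.

```python
def check_configuration(input_params, output_params, hyper_params):
    """Collect configuration error prompts; an empty list means valid.
    The validation rules here are illustrative assumptions only."""
    errors = []
    if input_params.get("num_fields", 0) <= 0:
        errors.append("model input parameters: num_fields must be positive")
    if output_params.get("num_fields", 0) <= 0:
        errors.append("model output parameters: num_fields must be positive")
    learning_rate = hyper_params.get("learning_rate", 0.0)
    if not 0.0 < learning_rate <= 1.0:
        errors.append("hyper-parameters: learning_rate must be in (0, 1]")
    return errors  # a non-empty list becomes the configuration error prompt

prompts = check_configuration({"num_fields": 3}, {"num_fields": 0},
                              {"learning_rate": 0.01})
```

Running the check before loading the hyper-parameters is what keeps an invalid configuration from occupying training resources.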
In this embodiment, the hyper-parameters may also be adjusted automatically when the loss value is outside the preset range; specifically, a detailed description may be made in conjunction with the following specific implementation manner.
In another specific implementation manner of the present disclosure, the method may further include:
step G1: : and under the condition that the loss value is out of the preset range, obtaining a loss gradient corresponding to the loss value.
Step G2: and adjusting the hyperparameter according to the loss gradient to obtain a target hyperparameter.
Step G3: and loading the target hyper-parameter, and iteratively executing the training process of the to-be-trained code recognition model according to the code training sample until the loss value corresponding to the to-be-trained code recognition model is in the preset range.
In this embodiment, when the loss value is outside the preset range, a loss gradient corresponding to the loss value can be acquired. Specifically, during model training, whether the model has converged is measured automatically against the loss function defined by the user. If the result has not converged, the gradient of the loss function is computed, the hyper-parameters are adjusted along the descending-gradient direction, and the output of the algorithm is recalculated. These steps repeat until the result converges and, through continued iteration of the model, the loss function reaches a minimum.
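Steps G1-G3 can be sketched with a toy loss that depends on a single hyper-parameter `h`. The quadratic loss and its analytic gradient are invented for illustration; a real platform would obtain the gradient from the training framework rather than a closed form.

```python
def loss(h):
    return (h - 3.0) ** 2            # toy loss, minimal at h = 3

def loss_gradient(h):
    return 2.0 * (h - 3.0)           # step G1: gradient of the toy loss

def tune_hyperparameter(h, rate=0.1, preset_max=1e-4, max_iters=1000):
    """Adjust h along the descending gradient until the loss is in range."""
    for _ in range(max_iters):
        if loss(h) <= preset_max:    # loss value within the preset range
            break
        h -= rate * loss_gradient(h) # step G2: follow the decreasing gradient
    return h                         # step G3 would retrain with this target value

target_h = tune_hyperparameter(h=0.0)
```

Each pass shrinks the distance to the optimum by a constant factor here, so the loop terminates well inside the iteration budget.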
The embodiment of the disclosure minimizes the constraints on model training and the cost of user error, reduces labor cost and resource waste, and makes the model training platform more intelligent.
According to the visualization-based code recognition model training method provided by the embodiment of the disclosure, a model configuration page is displayed; the model training mode, the model network structure and the hyper-parameters entered by a user in the model configuration page are obtained; a to-be-trained code recognition model matching the model network structure is obtained; pre-stored code training samples corresponding to the to-be-trained code recognition model are obtained according to the model training mode; and the hyper-parameters are loaded and the to-be-trained code recognition model is trained on the code training samples. Because the model training mode, the model network structure and the hyper-parameters can all be customized by the user, the embodiment of the disclosure avoids the experimental, manual tuning of hyper-parameters such as layer type, layer count, connection mode and node count, as well as the repeated development and compilation of code by the user, thereby saving a large amount of manpower, material resources and time, reducing the amount of code to be developed, shortening model training time and improving model training efficiency.
Example three
Referring to fig. 3, a schematic structural diagram of a visualization-based code recognition model training apparatus provided in an embodiment of the present disclosure is shown. As shown in fig. 3, the visualization-based code recognition model training apparatus 300 may specifically include the following modules:
a configuration page display module 310, configured to display a model configuration page;
a model parameter obtaining module 320, configured to obtain a model training mode, a model network structure, and a hyper-parameter input by a user in the model configuration page;
a model to be trained obtaining module 330, configured to obtain a code to be trained recognition model matched with the model network structure;
a code sample obtaining module 340, configured to obtain, according to the model training mode, a pre-stored code training sample corresponding to the to-be-trained code recognition model;
and the code model training module 350 is configured to load the hyper-parameter, and train the to-be-trained code recognition model according to the code training sample.
Optionally, the code model training module 350 includes:
the hyper-parameter assignment unit is used for assigning values to the hyper-parameters to obtain assigned hyper-parameters;
and the code model training unit is used for loading the assigned hyper-parameters and training the code recognition model to be trained according to the code training sample.
Optionally, the method further comprises:
the hook function acquisition module is used for acquiring a hook function input by the user in the model configuration page;
the code model training module 350 includes:
and the hook function loading unit is used for automatically loading the hook function in the training process of the to-be-trained code recognition model according to the code training sample.
According to the visualization-based code recognition model training apparatus provided by the embodiment of the disclosure, a model configuration page is displayed; the model training mode, the model network structure and the hyper-parameters entered by a user in the model configuration page are obtained; a to-be-trained code recognition model matching the model network structure is obtained; pre-stored code training samples corresponding to the to-be-trained code recognition model are obtained according to the model training mode; and the hyper-parameters are loaded and the to-be-trained code recognition model is trained on the code training samples. Because the model training mode, the model network structure and the hyper-parameters can all be customized by the user, the embodiment of the disclosure avoids the experimental, manual tuning of hyper-parameters such as layer type, layer count, connection mode and node count, as well as the repeated development and compilation of code by the user, thereby saving a large amount of manpower, material resources and time, reducing the amount of code to be developed, shortening model training time and improving model training efficiency.
Example four
Referring to fig. 4, a schematic structural diagram of another visualization-based code recognition model training apparatus provided in an embodiment of the present disclosure is shown. As shown in fig. 4, the visualization-based code recognition model training apparatus 400 may specifically include the following modules:
a configuration page display module 410 for displaying a model configuration page;
a model parameter obtaining module 420, configured to obtain a model training mode, a model network structure, and a hyper-parameter input by a user in the model configuration page;
an input/output parameter obtaining module 430, configured to obtain a model input parameter and a model output parameter that are input by the user in the model configuration page;
a model to be trained obtaining module 440, configured to obtain a code recognition model to be trained that is matched with the model network structure;
a code sample obtaining module 450, configured to obtain, according to the model training mode, a pre-stored code training sample corresponding to the to-be-trained code recognition model;
and the code model training module 460 is configured to load the hyper-parameter, and train the to-be-trained code recognition model according to the code training sample.
Optionally, referring to fig. 5, a schematic diagram of a code model training module provided in an embodiment of the present disclosure is shown, and as shown in fig. 5, the code model training module 460 includes:
a code sample input unit 461, configured to input the code training sample to the to-be-trained code recognition model according to the model input parameter;
a prediction result output unit 462, configured to obtain, according to the model output parameter, a code prediction result corresponding to the code training sample output by the to-be-trained code recognition model;
a loss value calculation unit 463, configured to calculate a loss value corresponding to the to-be-trained code recognition model according to the initial code result corresponding to the code training sample and the code prediction result;
and the target code model obtaining unit 464 is configured to, in a case that the loss value is within a preset range, take the trained code recognition model as a final target code recognition model.
Optionally, the method further comprises:
a loss function type obtaining module, configured to obtain a loss function type input by the user in the model configuration page;
the loss value calculation unit 473 includes:
and the loss value operator unit is used for calculating and obtaining a loss value which is matched with the type of the loss function and corresponds to the to-be-trained code recognition model according to the initial code result and the code prediction result.
Optionally, the method further comprises:
the loss gradient acquisition module is used for acquiring a loss gradient corresponding to the loss value under the condition that the loss value is out of the preset range;
the target hyper-parameter acquisition module is used for adjusting the hyper-parameters according to the loss gradient to obtain target hyper-parameters;
and the model iteration training module is used for loading the target hyper-parameter, and iteratively executing the training process of the to-be-trained code recognition model according to the code training sample until the loss value corresponding to the to-be-trained code recognition model is in the preset range.
Optionally, the method further comprises:
the configuration validity checking module is used for checking the configuration validity of the model input parameters, the model output parameters and the hyper-parameters;
the configuration prompt information generation module is used for generating configuration error prompt information under the condition that the configuration effectiveness does not meet the model training condition;
and the configuration prompt information sending module is used for sending the configuration error prompt information to the user, so that the user reconfigures the model input parameters, the model output parameters and/or the hyper-parameters according to the configuration error prompt information.
According to the visualization-based code recognition model training apparatus provided by the embodiment of the disclosure, a model configuration page is displayed; the model training mode, the model network structure and the hyper-parameters entered by a user in the model configuration page are obtained; a to-be-trained code recognition model matching the model network structure is obtained; pre-stored code training samples corresponding to the to-be-trained code recognition model are obtained according to the model training mode; and the hyper-parameters are loaded and the to-be-trained code recognition model is trained on the code training samples. Because the model training mode, the model network structure and the hyper-parameters can all be customized by the user, the embodiment of the disclosure avoids the experimental, manual tuning of hyper-parameters such as layer type, layer count, connection mode and node count, as well as the repeated development and compilation of code by the user, thereby saving a large amount of manpower, material resources and time, reducing the amount of code to be developed, shortening model training time and improving model training efficiency.
An embodiment of the present disclosure also provides an electronic device, including: a processor, a memory, and a computer program stored on the memory and executable on the processor, the processor implementing the visualization-based code recognition model training method of the foregoing embodiments when executing the program.
Embodiments of the present disclosure also provide a readable storage medium, wherein when the instructions of the storage medium are executed by a processor of an electronic device, the electronic device is enabled to execute the visualization-based code recognition model training method of the foregoing embodiments.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system is apparent from the description above. In addition, embodiments of the present disclosure are not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the embodiments of the present disclosure as described herein, and any descriptions of specific languages are provided above to disclose the best modes of the embodiments of the present disclosure.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the disclosure may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the disclosure, various features of the embodiments of the disclosure are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that is, claimed embodiments of the disclosure require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of an embodiment of this disclosure.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
The various component embodiments of the disclosure may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. It will be appreciated by those skilled in the art that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components in a motion picture generating device according to embodiments of the present disclosure. Embodiments of the present disclosure may also be implemented as an apparatus or device program for performing a portion or all of the methods described herein. Such programs implementing embodiments of the present disclosure may be stored on a computer-readable medium or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit embodiments of the disclosure, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. Embodiments of the disclosure may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The usage of the words first, second and third, etcetera do not indicate any ordering. These words may be interpreted as names.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The above description is only for the purpose of illustrating the preferred embodiments of the present disclosure and is not to be construed as limiting the embodiments of the present disclosure, and any modifications, equivalents, improvements and the like that are made within the spirit and principle of the embodiments of the present disclosure are intended to be included within the scope of the embodiments of the present disclosure.
The above description is only a specific implementation of the embodiments of the present disclosure, but the scope of the embodiments of the present disclosure is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the embodiments of the present disclosure, and all the changes or substitutions should be covered by the scope of the embodiments of the present disclosure. Therefore, the protection scope of the embodiments of the present disclosure shall be subject to the protection scope of the claims.

Claims (10)

1. A code recognition model training method based on visualization is characterized by comprising the following steps:
displaying a model configuration page;
obtaining a model training mode, a model network structure and a hyper-parameter input by a user in the model configuration page;
acquiring a to-be-trained code recognition model matched with the model network structure;
according to the model training mode, obtaining a pre-stored code training sample corresponding to the code recognition model to be trained;
and loading the hyper-parameters, and training the code recognition model to be trained according to the code training sample.
2. The method according to claim 1, wherein before said loading the hyper-parameters and training the code recognition model to be trained according to the code training samples, further comprising:
obtaining model input parameters and model output parameters input by the user in the model configuration page;
the loading the hyper-parameters and training the code recognition model to be trained according to the code training sample comprises the following steps:
inputting the code training sample into the code recognition model to be trained according to the model input parameter;
obtaining a code prediction result corresponding to the code training sample output by the to-be-trained code recognition model according to the model output parameter;
calculating to obtain a loss value corresponding to the to-be-trained code recognition model according to an initial code result corresponding to the code training sample and the code prediction result;
and under the condition that the loss value is in a preset range, taking the trained code recognition model as a final target code recognition model.
3. The method according to claim 2, wherein before the loading of the hyper-parameters and the training of the to-be-trained code recognition model according to the code training sample, the method further comprises:
obtaining a loss function type input by the user on the model configuration page;
and the calculating of the loss value corresponding to the to-be-trained code recognition model according to the initial code result corresponding to the code training sample and the code prediction result comprises:
calculating, according to the initial code result and the code prediction result, a loss value corresponding to the to-be-trained code recognition model and matching the loss function type.
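Claim 3 lets the user select a loss-function type on the configuration page, which amounts to dispatching on that type at loss-computation time. A minimal dispatch table, assuming two common loss types; the names and the set of supported types are illustrative.

```python
# Hypothetical loss-type dispatch for claim 3: the user-selected type
# chooses which loss function is applied.
import math

def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def cross_entropy(pred, target):
    eps = 1e-12  # avoid log(0)
    return -sum(t * math.log(p + eps) for p, t in zip(pred, target))

LOSS_FUNCTIONS = {"mse": mse, "cross_entropy": cross_entropy}

def compute_loss(loss_type, pred, target):
    return LOSS_FUNCTIONS[loss_type](pred, target)
```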
4. The method according to claim 2, wherein after the calculating of the loss value corresponding to the to-be-trained code recognition model according to the initial code result corresponding to the code training sample and the code prediction result, the method further comprises:
obtaining a loss gradient corresponding to the loss value when the loss value falls outside the preset range;
adjusting the hyper-parameters according to the loss gradient to obtain target hyper-parameters; and
loading the target hyper-parameters, and iteratively performing the training of the to-be-trained code recognition model according to the code training sample until the loss value corresponding to the to-be-trained code recognition model falls within the preset range.
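The loop in claim 4 — adjust the hyper-parameter along the loss gradient and retrain until the loss lands in the preset range — can be sketched with a toy stand-in. Here "training loss" is a quadratic in a single hyper-parameter `lr` with a known optimum, so the gradient is computable in closed form; a real system would obtain both from actual training runs.

```python
# Toy claim-4 iteration: gradient step on the hyper-parameter until the
# loss is within the preset range. Everything here is illustrative.

OPTIMUM = 0.5  # assumed best hyper-parameter value for the toy loss

def loss_of(lr):
    return (lr - OPTIMUM) ** 2  # stand-in for the model's training loss

def tune(lr, max_loss=1e-4, step=0.1, max_iters=1000):
    for _ in range(max_iters):
        loss = loss_of(lr)
        if loss <= max_loss:          # loss within the preset range: stop
            return lr, loss
        gradient = 2 * (lr - OPTIMUM) # loss gradient w.r.t. the hyper-parameter
        lr = lr - step * gradient     # adjusted (target) hyper-parameter
    return lr, loss_of(lr)

lr, loss = tune(lr=0.0)
```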
5. The method according to claim 2, wherein before the loading of the hyper-parameters and the training of the to-be-trained code recognition model according to the code training sample, the method further comprises:
verifying the configuration validity of the model input parameters, the model output parameters, and the hyper-parameters;
generating configuration error prompt information when the configuration validity does not satisfy a model training condition; and
sending the configuration error prompt information to the user, so that the user reconfigures the model input parameters, the model output parameters, and/or the hyper-parameters according to the configuration error prompt information.
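The pre-training check of claim 5 is essentially configuration validation that returns error prompts instead of starting a doomed run. A minimal sketch; the specific validity rules (which fields are required, that a learning rate must be positive) are assumptions for illustration.

```python
# Hypothetical claim-5 validity check: collect human-readable error
# prompts for the user to reconfigure against.

def validate_config(config: dict):
    errors = []
    for key in ("model_input", "model_output", "hyper_params"):
        if key not in config:
            errors.append(f"missing required field: {key}")
    lr = config.get("hyper_params", {}).get("lr")
    if lr is not None and lr <= 0:
        errors.append("hyper-parameter 'lr' must be positive")
    return (len(errors) == 0), errors

ok, errors = validate_config({"model_input": "tokens", "hyper_params": {"lr": -1}})
```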
6. The method according to claim 1, wherein the loading of the hyper-parameters and the training of the to-be-trained code recognition model according to the code training sample comprises:
assigning values to the hyper-parameters to obtain assigned hyper-parameters; and
loading the assigned hyper-parameters, and training the to-be-trained code recognition model according to the code training sample.
7. The method according to claim 1, wherein before the loading of the hyper-parameters and the training of the to-be-trained code recognition model according to the code training sample, the method further comprises:
obtaining a hook function input by the user on the model configuration page;
and the training of the to-be-trained code recognition model according to the code training sample comprises:
automatically loading the hook function during the training of the to-be-trained code recognition model according to the code training sample.
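Claim 7's user-supplied hook functions, loaded automatically during training, can be sketched with plain callables. The hook points ("on_start", "on_step") and their signatures are assumptions; the patent only says the hook is loaded during training.

```python
# Hypothetical claim-7 hook mechanism: user callables are invoked
# automatically at assumed points in the training loop.

def train_with_hooks(samples, hooks):
    log = []
    for hook in hooks.get("on_start", []):
        log.append(hook("start"))           # before training begins
    for i, sample in enumerate(samples):
        for hook in hooks.get("on_step", []):
            log.append(hook(f"step-{i}"))   # once per training sample
    return log

log = train_with_hooks(["s1", "s2"], {
    "on_start": [lambda where: f"hook@{where}"],
    "on_step": [lambda where: f"hook@{where}"],
})
```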
8. A visualization-based code recognition model training apparatus, characterized by comprising:
a configuration page display module, configured to display a model configuration page;
a model parameter obtaining module, configured to obtain a model training mode, a model network structure, and hyper-parameters input by a user on the model configuration page;
a to-be-trained model obtaining module, configured to obtain a to-be-trained code recognition model matching the model network structure;
a code sample obtaining module, configured to obtain, according to the model training mode, a pre-stored code training sample corresponding to the to-be-trained code recognition model; and
a code model training module, configured to load the hyper-parameters and train the to-be-trained code recognition model according to the code training sample.
9. An electronic device, comprising:
a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the visualization-based code recognition model training method according to any one of claims 1 to 7.
10. A readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the visualization-based code recognition model training method according to any one of claims 1 to 7.
CN202011586098.5A 2020-12-28 2020-12-28 Code recognition model training method and device based on visualization Pending CN114691111A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011586098.5A CN114691111A (en) 2020-12-28 2020-12-28 Code recognition model training method and device based on visualization

Publications (1)

Publication Number Publication Date
CN114691111A true CN114691111A (en) 2022-07-01

Family

ID=82130859

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011586098.5A Pending CN114691111A (en) 2020-12-28 2020-12-28 Code recognition model training method and device based on visualization

Country Status (1)

Country Link
CN (1) CN114691111A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116992241A (en) * 2023-09-26 2023-11-03 深圳前海环融联易信息科技服务有限公司 Model generation method and device, storage medium and computer equipment
CN116992241B (en) * 2023-09-26 2024-01-19 深圳前海环融联易信息科技服务有限公司 Model generation method and device, storage medium and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination