CN110689124A - Method and system for constructing neural network model - Google Patents

Method and system for constructing neural network model

Info

Publication number
CN110689124A
Authority
CN
China
Prior art keywords
neural network
network model
model
training
user interface
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910943429.7A
Other languages
Chinese (zh)
Inventor
刘汶成
武华亭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nine Chapter Yunji Technology Co Ltd Beijing
Original Assignee
Nine Chapter Yunji Technology Co Ltd Beijing
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nine Chapter Yunji Technology Co Ltd Beijing filed Critical Nine Chapter Yunji Technology Co Ltd Beijing
Priority to CN201910943429.7A priority Critical patent/CN110689124A/en
Publication of CN110689124A publication Critical patent/CN110689124A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation using electronic means
    • G06N 3/065 Analogue means
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention provides a method and a system for constructing a neural network model. The method comprises the following steps: displaying a user interface comprising at least two components; receiving a first input performed by a user within the user interface; and, in response to the first input, constructing a first neural network model within the user interface. The invention lowers the barrier to using neural networks and makes them convenient to use: users who cannot write code can still construct, train and use neural network models, understand neural networks more intuitively, deepen their understanding, and master them quickly.

Description

Method and system for constructing neural network model
Technical Field
The invention relates to the field of big data processing, in particular to a method and a system for constructing a neural network model.
Background
With the rise of deep learning, neural networks have become more and more widely used in artificial intelligence: they require no feature engineering of the raw data, and a fully trained neural network model predicts quickly and accurately. They perform especially well on complex problems in fields such as NLP (Natural Language Processing), CV (Computer Vision), autonomous driving and face recognition. However, compared with conventional rule-based machine learning algorithms, a neural network algorithm acquires 'knowledge' directly from the raw data, so model training takes longer and the resulting model is relatively hard to interpret.
In existing data analysis systems, a neural network can only be constructed through relatively complex coding, and complex parameters must also be tuned during construction, which places certain professional demands on algorithm engineers in every field.
Therefore, existing neural network models usually have to be constructed by writing code, which places high professional demands on algorithm engineers.
Disclosure of Invention
The embodiment of the invention provides a method and a system for constructing a neural network model, which aim to solve the problem that the professional requirement on algorithm engineers is high when the neural network model is constructed in a coding mode in the prior art.
The embodiment of the invention provides a method for constructing a neural network model, which comprises the following steps:
displaying a user interface comprising at least two components;
receiving a first input performed by a user within the user interface;
in response to the first input, a first neural network model is constructed within the user interface.
Further, before receiving a first input performed by a user within the user interface, the method further comprises:
receiving selection operation of a user on at least two components;
and responding to the selection operation, determining the at least two components as target components, and displaying the target components in the designated positions of the user interface.
Further, the step of receiving a first input performed by a user within the user interface includes: receiving a first touch operation of the user on the target component;
said step of building a first neural network model within said user interface in response to said first input, comprising:
responding to the first touch operation, and acquiring a connection relation between the at least two components and component parameters of the at least two components;
and generating the first neural network model according to the connection relation and the component parameters of the at least two components.
Further, the user interface is an editing interface in a canvas mode, and after the first neural network model is built in the user interface, the method further includes:
generating a first neural network model code corresponding to the first neural network model;
displaying the first neural network model code within an editing interface of a code pattern.
Further, after the building the first neural network model within the user interface, the method further comprises:
receiving a second input performed by a user within the user interface, the second input being an editing operation on the first neural network model;
in response to the second input, updating the first neural network model and a first neural network model code corresponding to the first neural network model.
Further, the target component includes a second neural network model that has been completely constructed, and the step of constructing a first neural network model within the user interface includes:
building the first neural network model based on the second neural network model of the latest version;
or,
and constructing the first neural network model based on the second neural network model of a preset version.
Further, after the constructing the first neural network model in the user interface, the method further includes:
and displaying the first neural network model in an animation mode or a three-dimensional mode.
Further, after the constructing the first neural network model in the user interface, the method further includes:
receiving a third input of a user in the user interface, wherein the third input is a query operation on model meta information of the first neural network model;
graphically displaying model meta-information characterizing the first neural network model in response to the third input.
Further, after the constructing the first neural network model in the user interface, the method further includes:
and carrying out visual model training on the first neural network model to obtain a target neural network model.
Further, the method further comprises:
and displaying, through a chart, the training metrics of the first neural network model in each training epoch while the first neural network model is trained.
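The per-epoch chart described above can be backed by a simple recorder that accumulates the metrics reported after every training epoch; the chart widget then plots one stored series per metric. This is a framework-neutral sketch (the `MetricRecorder` name and its fields are illustrative, not from the patent; in Keras the same role would be played by a `Callback`):

```python
class MetricRecorder:
    """Collects per-epoch training metrics so a UI chart can plot them."""

    def __init__(self):
        self.history = {}  # metric name -> list of per-epoch values

    def on_epoch_end(self, epoch, metrics):
        # metrics: e.g. {"loss": 0.42, "accuracy": 0.88}
        for name, value in metrics.items():
            self.history.setdefault(name, []).append(value)

    def series(self, name):
        # The chart reads one series per metric; unknown metrics are empty.
        return self.history.get(name, [])


recorder = MetricRecorder()
recorder.on_epoch_end(0, {"loss": 0.90, "accuracy": 0.55})
recorder.on_epoch_end(1, {"loss": 0.60, "accuracy": 0.71})
print(recorder.series("loss"))  # -> [0.9, 0.6]
```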
Further, before performing visualization model training on the first neural network model, the method further includes:
acquiring the allocated training resources;
and visually displaying the training resources.
Further, the step of obtaining the allocated training resources comprises:
displaying a training resource allocation region on the user interface;
and acquiring the training resources according to a fourth input of the user in the training resource allocation area.
Further, before performing the visual model training on the first neural network model, the method further includes:
displaying a model hyper-parameter input area on the user interface;
acquiring at least one model hyper-parameter input by a user according to fifth input of the user in the model hyper-parameter input area;
and after the model hyper-parameter is obtained, performing visual model training on the first neural network model according to the model hyper-parameter.
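The hyper-parameter input area above can be modelled as a form whose raw string values are merged over sensible defaults before training starts. The parameter names and default values below are assumptions for illustration only (the patent does not enumerate them):

```python
# Hypothetical defaults for the model hyper-parameter input area.
DEFAULTS = {"learning_rate": 0.001, "batch_size": 32, "epochs": 10}

def collect_hyperparameters(form):
    """Merge user-entered values from the input area over the defaults."""
    params = dict(DEFAULTS)
    for key, raw in form.items():
        if key not in DEFAULTS:
            raise ValueError(f"unknown hyper-parameter: {key}")
        # Coerce the form's string value to the default's type (float/int).
        params[key] = type(DEFAULTS[key])(raw)
    if params["learning_rate"] <= 0 or params["epochs"] < 1:
        raise ValueError("invalid hyper-parameter value")
    return params

print(collect_hyperparameters({"learning_rate": "0.01", "epochs": "5"}))
```

Training then proceeds with the validated parameter set, so a typo in the form fails fast instead of silently misconfiguring the run.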
Further, after the obtaining the target neural network model, the method further includes:
generating an evaluation index corresponding to the target neural network model;
and displaying the evaluation index in a chart form.
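Fig. 7a shows an ROC curve; one common evaluation index derived from it is the AUC, which can be computed directly from labels and predicted scores. A minimal pure-Python sketch (quadratic in the number of samples, fine for display purposes; the function name is ours):

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Equivalent to the area under the ROC curve; score ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Three of the four positive/negative pairs are ranked correctly:
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # -> 0.75
```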
Further, after the obtaining the target neural network model, the method further includes:
storing the target neural network model to a model repository;
updating at least one statistical view within the model repository, wherein each statistical view is generated according to a different statistical rule.
Further, after the obtaining the target neural network model, the method further includes:
and exporting, based on an export operation, the target neural network model in a visual output mode.
Further, the method further comprises:
and importing a third neural network model corresponding to the import operation based on the import operation.
Further, the step of importing, based on the import operation, a third neural network model corresponding to the import operation includes:
analyzing the model file of the third neural network model to obtain target structure information;
performing model training on the third neural network model according to the target structure information to obtain a trained third neural network model;
and importing the trained third neural network model.
Further, the performing model training on the third neural network model according to the target structure information to obtain a trained third neural network model includes:
freezing part of the target structure information;
training a network layer corresponding to the target structure information which is not frozen to obtain a target training result;
and obtaining a trained third neural network model according to the network layer corresponding to the frozen part of the target structure information and the target training result.
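The freeze-then-train flow above can be sketched without any framework: frozen layers keep their imported weights unchanged, only the unfrozen layers receive training updates, and the final model merges both parts. This is a schematic sketch with illustrative names; in Keras the equivalent would be setting `layer.trainable = False` on the frozen layers before compiling and fitting:

```python
def fine_tune(model, frozen_names, train_step):
    """Freeze part of the imported structure, train the rest, and merge.

    `model` maps layer name -> weights (any object); `train_step` returns
    updated weights for one trainable layer. All names are illustrative.
    """
    trained = {}
    for name, weights in model.items():
        if name in frozen_names:
            trained[name] = weights  # frozen: passed through unchanged
        else:
            trained[name] = train_step(name, weights)  # updated by training
    return trained


model = {"conv_1": "w0", "conv_2": "w0", "dense": "w0"}
result = fine_tune(model, {"conv_1", "conv_2"},
                   lambda name, w: w + "_trained")
print(result)  # conv layers keep "w0"; only "dense" is updated
```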
The embodiment of the present invention further provides a system for constructing a neural network model, including:
a first display module for displaying a user interface comprising at least two components;
a first receiving module for receiving a first input performed by a user within the user interface;
a building module to build a first neural network model within the user interface in response to the first input.
Further, the system further comprises:
a second receiving module, configured to receive a selection operation of at least two components by a user before the first receiving module receives a first input performed by the user in the user interface;
and the second display module is used for responding to the selection operation, determining the at least two components as target components and displaying the target components in the designated positions of the user interface.
Further, the first receiving module is further configured to: receive a first touch operation of the user on the target component;
the building module comprises:
the processing submodule is used for responding to the first touch operation and acquiring the connection relation between the at least two components and the component parameters of the at least two components;
and the generation submodule is used for generating the first neural network model according to the connection relation and the component parameters of the at least two components.
Further, the user interface is an editing interface in canvas mode, and the system further comprises:
the first generation module is used for generating a first neural network model code corresponding to a first neural network model after the construction module constructs the first neural network model in the user interface;
and the third display module is used for displaying the first neural network model code in an editing interface of a code mode.
Further, the system further comprises:
a third receiving module, configured to receive a second input performed by a user in the user interface after the building module builds the first neural network model in the user interface, where the second input is an editing operation on the first neural network model;
a first update module to update the first neural network model and a first neural network model code corresponding to the first neural network model in response to the second input.
Further, the target component includes a second neural network model that has been built, the build module further to:
building the first neural network model based on the second neural network model of the latest version;
or,
and constructing the first neural network model based on the second neural network model of a preset version.
Further, the system further comprises:
and the first display module is used for displaying the first neural network model in an animation mode or a three-dimensional mode after the construction module constructs the first neural network model in the user interface.
Further, the system further comprises:
a fourth receiving module, configured to receive a third input of the user in the user interface after the constructing module constructs the first neural network model in the user interface, where the third input is a query operation on model meta information of the first neural network model;
and the second display module is used for responding to the third input and graphically displaying the model meta-information characterizing the first neural network model.
Further, the system further comprises:
and the first acquisition module is used for performing visual model training on the first neural network model after the construction module constructs the first neural network model in the user interface to acquire the target neural network model.
Further, the system further comprises:
and the third display module is used for displaying, through a chart, the training metrics of the first neural network model in each training epoch while the first neural network model is trained.
Further, the system further comprises:
the second acquisition module is used for acquiring the allocated training resources before the first acquisition module carries out visual model training on the first neural network model;
and the fourth display module is used for visually displaying the training resources.
Further, the second obtaining module is further configured to:
displaying a training resource allocation region on the user interface;
and acquiring the training resources according to a fourth input of the user in the training resource allocation area.
Further, the system further comprises:
a fifth display module, configured to display a model hyper-parameter input area on the user interface before the first obtaining module performs visual model training on the first neural network model;
the third acquisition module is used for acquiring at least one model hyper-parameter input by the user according to the fifth input of the user in the model hyper-parameter input area;
and the training module is used for performing visual model training on the first neural network model according to the model hyper-parameters after the model hyper-parameters are obtained.
Further, the system further comprises:
the second generation module is used for generating an evaluation index corresponding to the target neural network model after the first acquisition module acquires the target neural network model;
and the sixth display module is used for displaying the evaluation index in a chart form.
Further, the system further comprises:
the storage module is used for storing the target neural network model to a model warehouse after the first acquisition module acquires the target neural network model;
and the second updating module is used for updating at least one statistical view in the model warehouse, wherein each statistical view is generated according to different statistical rules.
Further, the system further comprises:
and the export module is used for exporting the target neural network model in a visual output mode based on an export operation after the first acquisition module acquires the target neural network model.
Further, the system further comprises:
and the importing module is used for importing a third neural network model corresponding to the importing operation based on the importing operation.
Further, the import module includes:
the analysis submodule is used for analyzing the model file of the third neural network model to obtain target structure information;
the training submodule is used for carrying out model training on the third neural network model according to the target structure information to obtain a trained third neural network model;
and the importing submodule is used for importing the trained third neural network model.
Further, the training sub-module is further to:
freezing part of the target structure information;
training a network layer corresponding to the target structure information which is not frozen to obtain a target training result;
and obtaining a trained third neural network model according to the network layer corresponding to the frozen part of the target structure information and the target training result.
The embodiment of the invention provides a system for constructing a neural network model, which comprises a processor, a memory and a computer program stored on the memory and capable of running on the processor, wherein when the computer program is executed by the processor, the method for constructing the neural network model is realized.
The embodiment of the invention provides a computer readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements any one of the above methods for constructing a neural network model.
According to the technical scheme, a user interface comprising at least two components is displayed, and the first neural network model is constructed according to a first input performed by the user within that interface. The first neural network model is thus constructed in a visual manner, which lowers the barrier to using neural networks and makes them convenient to use: users who cannot write code can still construct, train and use neural network models, understand neural networks more intuitively, deepen their understanding, and master them quickly.
Drawings
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
FIG. 1 is a schematic diagram of a method for constructing a neural network model according to an embodiment of the present invention;
FIG. 2 is a schematic representation of a user interface according to an embodiment of the present invention;
FIG. 3 is a second schematic diagram of a user interface according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a model meta-information visualization according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a model training visualization according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a model hyper-parameter of an embodiment of the present invention;
FIG. 7a is a schematic representation of a ROC curve according to an embodiment of the present invention;
FIG. 7b shows a list of model evaluation metrics in accordance with an embodiment of the present invention;
FIG. 8 illustrates a statistical radar map generated based on model types and application scenarios in accordance with an embodiment of the present invention;
FIG. 9 is a schematic diagram of a system for constructing a neural network model according to an embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a method for constructing a neural network model.
The method for constructing the neural network model provided by the invention can be applied to a data analysis system.
In order to encapsulate the low-level details and obtain good compatibility, the underlying implementation preferably uses the Keras, TensorFlow, Torch and/or PyTorch frameworks, on which visualization of every stage of the neural network model's life cycle is built. Preferably, the neural network models involved in the embodiment of the present invention are models supported by Keras and/or TensorFlow, specifically models stored as Keras H5-format model files, TensorFlow CheckPoint-format model files and/or GraphDef-format model files.
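A system supporting several on-disk model formats typically has to decide, at import time, which loader to dispatch to. The sketch below guesses the format from the file extension; the extension mapping is an assumption for illustration (a TensorFlow checkpoint is really a file family, and a `.pb` file can hold things other than a GraphDef):

```python
import os

# Assumed extension mapping for the three formats named above.
FORMAT_BY_EXTENSION = {
    ".h5": "keras_h5",
    ".ckpt": "tensorflow_checkpoint",
    ".pb": "tensorflow_graphdef",
}

def detect_model_format(path):
    """Guess which supported format a model file is in, from its extension."""
    ext = os.path.splitext(path)[1].lower()
    try:
        return FORMAT_BY_EXTENSION[ext]
    except KeyError:
        raise ValueError(f"unsupported model file: {path}")

print(detect_model_format("lenet.h5"))  # -> keras_h5
```

A production importer would additionally sniff the file contents (e.g. the HDF5 magic bytes) rather than trust the extension alone.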
Referring to fig. 1, a method for constructing a neural network model according to an embodiment of the present invention includes:
step 101, displaying a user interface comprising at least two components.
Embodiments of the present invention may first display a user interface, where the user interface includes at least two components. After displaying the user interface comprising at least two components, step 102 may be performed.
Step 102, receiving a first input executed by a user in a user interface.
After displaying the user interface comprising at least two components, a first input performed by a user within the user interface may be received, where the first input is an operation for building a neural network model.
Step 103, responding to the first input, and constructing a first neural network model in the user interface.
After receiving a first input performed by a user within the user interface, a corresponding first neural network model may be built within the user interface according to the first input of the user.
By displaying a user interface comprising at least two components and constructing the first neural network model according to a first input performed by the user within that interface, this process builds the first neural network model in a visual manner. This lowers the barrier to using neural networks and makes them easier to use: users who cannot write code can still construct, train and use neural network models, understand neural networks more intuitively, deepen their understanding, and master them quickly.
In the embodiment of the present invention, before receiving the first input performed by the user in the user interface, the method further includes: receiving selection operation of a user on at least two components; in response to the selection operation, at least two components are determined to be target components, and the target components are displayed within the designated locations of the user interface.
The user interface comprises at least two components, the at least two components can be selected by a user before the first input is executed, the at least two components selected by the user can be determined as target components according to the selection operation executed by the user, and then the determined target components are displayed at the designated position of the user interface.
Wherein, the selecting operation includes but is not limited to: dragging operation, selecting operation and the like, for example, dragging the component to a specified position (canvas) in the user interface for displaying through the dragging operation; or after some components are selected in a single-click mode, a double-click mode, a shortcut key mode and the like, triggering a specified position (canvas) in a user interface to display the selected components.
After the target component is determined, receiving a first input performed by a user in the user interface includes: receiving a first touch operation of the user on the target component. Accordingly, the step of constructing a first neural network model within the user interface in response to the first input comprises: responding to the first touch operation, and acquiring a connection relation between the at least two components and component parameters of the at least two components; and generating the first neural network model according to the connection relation and the component parameters of the at least two components.
Receiving the first input of the user means receiving a first touch operation of the user on the target component. According to the first touch operation, the connection relation between the at least two components included in the target component is determined, and the at least two components are connected with connecting lines according to the determined connection relation.
The connecting line can reflect the connection relation of the components at the two ends of the connecting line; the starting end of the connecting wire is connected with the output end of one component, and the terminal end of the connecting wire is provided with an arrow which is connected with the input end of the other component. The plurality of components are connected using connecting wires for determining a structure of the neural network model.
In a feasible embodiment, the connection relation between the at least two components is obtained from the user's first touch operation, and when the components are connected with connection lines according to the determined relation, the connection relation can be determined based on the operation track of the first touch operation on the at least two components. For example, if the user clicks at the output end of component A and drags a connection line to the input end of component B, the connection relation between component A and component B is determined as component A connecting to component B.
The type of the start end of a connection line must be consistent with the type of its termination end; that is, the data format output at the start end must be the same as the data format accepted at the termination end, so as to guarantee the correctness of the model.
Further, after determining that the at least two components are the target components, the component parameters of the at least two components are also acquired. In a possible embodiment, the process of obtaining the device parameters of the at least two devices in response to the first touch operation includes: the method comprises the steps of displaying a component parameter configuration interface according to a first touch operation of a user on a target component, obtaining an operation executed by the user on the component parameter configuration interface, and determining component parameters of at least two components based on the operation of the user.
For example, a click operation of a user on a target component in a user interface is received, component parameter configuration frames are displayed in a preset area according to the click operation, the component parameter configuration frames comprise configuration frames corresponding to all components, and component parameters of all the components are determined according to the operation of the user in all the configuration frames.
After determining the connection relationship between the at least two components and obtaining the component parameters of the at least two components, a first neural network model may be generated according to the connection relationship and the component parameters corresponding to the at least two components.
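Generating a model from the connection relations amounts to ordering the component graph topologically while enforcing the start-end type check described above. A pure-Python sketch of that bookkeeping (the data shapes and "in"/"out" type labels are illustrative, not the patent's actual representation):

```python
def order_components(components, connections):
    """Topologically order components from their connection relations.

    `components` maps name -> {"in": type, "out": type}; `connections`
    is a list of (source, target) pairs. Raises if a wire joins
    mismatched data formats, mirroring the connection-line type check.
    """
    for src, dst in connections:
        if components[src]["out"] != components[dst]["in"]:
            raise ValueError(f"type mismatch on wire {src} -> {dst}")
    incoming = {name: 0 for name in components}
    for _, dst in connections:
        incoming[dst] += 1
    ready = [n for n, deg in incoming.items() if deg == 0]
    order = []
    while ready:
        node = ready.pop()
        order.append(node)
        for src, dst in connections:
            if src == node:
                incoming[dst] -= 1
                if incoming[dst] == 0:
                    ready.append(dst)
    if len(order) != len(components):
        raise ValueError("connection graph contains a cycle")
    return order


components = {
    "conv": {"in": "image", "out": "tensor"},
    "flatten": {"in": "tensor", "out": "vector"},
    "dense": {"in": "vector", "out": "vector"},
}
order = order_components(components, [("conv", "flatten"), ("flatten", "dense")])
print(order)  # -> ['conv', 'flatten', 'dense']
```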
The components are displayed in a graphical mode in the corresponding area of the user interface, a user can select and drag each graphical component to a designated position (canvas) of the user interface by using a mouse, and when the user selects the component to perform dragging operation, the relative distance between the components and the position in the canvas can be adjusted.
The following describes the editing interface with reference to fig. 2; the editing interface has a canvas mode, a code mode, and a meta-information mode. In canvas mode, a plurality of components of the neural network model are displayed on the left side of the editing interface. The components include, but are not limited to: Dense (fully connected layer), Activation (activation layer), Dropout (discard layer), Flatten (flattening layer), Reshape (reshaping layer), Permute (rearrangement layer), RepeatVector (input repetition layer), Lambda (custom layer), ActivityRegularization (regularization layer), Masking (sequence value masking layer), BatchNormalization (normalization layer), UpSampling2D (two-dimensional upsampling layer), Conv2D (two-dimensional convolution layer), Conv2DTranspose (two-dimensional transposed convolution layer), LeakyReLU (leaky ReLU activation layer), ReLU, MaxPooling2D (two-dimensional max pooling layer), LSTM (long short-term memory layer), etc.; fig. 2 lists only a few representative components. The middle region is the canvas of the neural network model, and the right part shows the component parameters of the corresponding component; fig. 2 shows the component parameters of the two-dimensional convolution.
In an embodiment of the present invention, the user interface is an editing interface in the canvas mode, and after the first neural network model is built in the user interface, the method further includes: generating a first neural network model code corresponding to the first neural network model; and displaying the first neural network model code within an editing interface in the code mode.
After the first neural network model is constructed and the generate-model button is clicked, the corresponding first neural network code can be generated from the index automatically generated for each component in the model, the component parameters of the components, and the connection relationships among the components. The generated first neural network code is displayed in the code mode, which ensures information synchronization between the code-mode editing interface and the canvas-mode editing interface.
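The code-generation step can be sketched as a simple traversal of the components in topology order. This is a hypothetical illustration, not the patent's actual implementation; the component data layout (a dictionary keyed by the auto-generated index) and the Keras-style `Sequential`/`add` output format are assumptions.

```python
# Hypothetical sketch: emit framework-style model code from the components,
# their parameters, and the connection relationship (here simplified to a
# linear topology order). Names and data layout are illustrative assumptions.

def generate_model_code(components, topology_order):
    """components: {index: (layer_type, params_dict)};
    topology_order: component indices in connection order."""
    lines = ["model = Sequential()"]
    for idx in topology_order:
        layer_type, params = components[idx]
        args = ", ".join(f"{k}={v!r}" for k, v in params.items())
        lines.append(f"model.add({layer_type}({args}))")
    return "\n".join(lines)

components = {
    1: ("Conv2D", {"filters": 32, "kernel_size": 3}),
    2: ("Activation", {"activation": "relu"}),
    3: ("Flatten", {}),
}
code = generate_model_code(components, [1, 2, 3])
print(code)
```

Because the generated text is derived purely from the component graph, regenerating it after any canvas edit keeps the code view and the topology view synchronized.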
The first neural network code corresponds to a code view and the first neural network model corresponds to a visual view (a topology view). The two views can be switched with a single action: clicking a view-switching button toggles back and forth between the topology view and the code view. The two views are two display forms of the same neural network model.
After the first neural network model is built and the corresponding topology snapshot is generated, the background automatically determines the connection relation among the components according to the topology snapshot, obtains the component parameters of the components, and generates the corresponding first neural network model code according to the components, the connection relation and the component parameters. Correspondingly, after the code in the code view is updated and a new snapshot is generated, the components of the topology view are also updated correspondingly, specifically, the corresponding components (including component parameters) and the connection relationship between the components are updated according to the updated code, so that the update of the topology view is realized.
In an embodiment of the invention, after the first neural network model is built in the user interface, the method further includes:
receiving a second input executed by the user in the user interface, wherein the second input is an editing operation on the first neural network model; in response to the second input, the first neural network model and a first neural network model code corresponding to the first neural network model are updated.
After the first neural network model is built, a second input performed by the user in the user interface may also be received, where the second input is an editing operation on the first neural network model, including but not limited to at least one of: adding a component, deleting a component, adjusting the position of a component, adjusting the connection relationships, changing component parameters, and the like. For example, when a new component is added, it can be dragged from the candidate components onto the canvas, and the network topology of the existing first neural network model is adjusted; when a component is deleted, the component to be deleted is right-clicked and deleted, after which the network topology of the existing first neural network model is adjusted; and when the position of a component in the topology needs to be adjusted, the existing connecting line is selected and deleted, and the components are reconnected as required. When several components need to be operated on simultaneously, a multi-select function key can be pressed, the components to be operated on are clicked in turn with the mouse, and the required operation, such as copying, moving, or deleting, is then performed. For example, when several new components are added at once, the multi-select function key can be pressed, the new components to be added are clicked in turn with the mouse, and the clicked components are then moved together to a designated position (the canvas) in the user interface.
When receiving operations of adding, deleting, adjusting the position of, adjusting the connection relation of or changing the component parameters of the component, which are executed by a user, the first neural network model can be updated according to the operations executed by the user, and after the first neural network model is updated, the first neural network model code corresponding to the first neural network model is updated, so that the update synchronization of the editing interface of the code mode and the editing interface of the canvas mode is ensured.
After the first neural network model is updated and the fact that the topology structure of the updated first neural network model passes validity verification is determined, a corresponding topology snapshot can be generated, and the background can automatically generate an updated first neural network model code according to the new topology snapshot. Correspondingly, after the code in the code view is updated and a new snapshot is generated, the components of the topology view are also updated correspondingly.
It should be noted that the validity verification of the topology of the first neural network model ensures the usability of the first neural network model. When verifying the validity of the topology, it is checked whether the topology of the first neural network model satisfies a preset construction rule, where the preset construction rule at least includes: all network components are in the neural network topology with no free-standing components, and the neural network has exactly one input and one output. For example, after operations such as dragging, copying, pasting and connecting in the canvas are completed and a check-network button is clicked, the system verifies the validity of the topology of the neural network model; that is, it checks that all network components of the neural network model are in the topology, that no isolated free network component exists, and that the whole neural network model has exactly one input and one output. If the verification passes, the save button can be clicked to save the current state, and the neural network model can continue to be updated and saved according to the user's operations. Each save corresponds to a version of the neural network model.
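The construction rules above can be sketched as a small graph check: no free-standing components, exactly one input (a node with no incoming connection) and exactly one output (a node with no outgoing connection). The edge-list data layout is an assumption for illustration.

```python
# Minimal sketch of the topology validity rules: every component must be
# connected, and the graph must have exactly one input and one output.
# The (node, edge-list) representation is an illustrative assumption.

def verify_topology(nodes, edges):
    src = {a for a, b in edges}          # nodes with an outgoing connection
    dst = {b for a, b in edges}          # nodes with an incoming connection
    connected = src | dst
    free_standing = [n for n in nodes if n not in connected]
    inputs = [n for n in nodes if n in src and n not in dst]
    outputs = [n for n in nodes if n in dst and n not in src]
    return not free_standing and len(inputs) == 1 and len(outputs) == 1

print(verify_topology(["in", "conv", "out"],
                      [("in", "conv"), ("conv", "out")]))              # True
print(verify_topology(["in", "conv", "out", "orphan"],
                      [("in", "conv"), ("conv", "out")]))              # False
```

A failed check would block the save, matching the behavior of the check-network button described above.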
In an embodiment of the invention, the target component comprises a second neural network model which is constructed completely, and the step of constructing the first neural network model in the user interface comprises the following steps: constructing a first neural network model based on the second neural network model of the latest version; or constructing the first neural network model based on the preset version of the second neural network model.
The target component may comprise a second neural network model that has already been built, and in the case that the target component comprises the second neural network model, the first neural network model may be built on the basis of the second neural network model build. Specifically, other components can be added on the basis of the second neural network model to form a new neural network model. The second neural network model may be a trained model or an untrained model, and whether the second neural network model is created by means of visualization is not limited herein.
In the embodiment, the first neural network model is constructed on the basis of the second neural network model, so that the implementation process of constructing the neural network model is simplified to a certain extent, and the utilization rate of the existing neural network is improved.
Furthermore, when the first neural network model is constructed based on the second neural network model, in order to ensure that the constructed first neural network model can be successfully used, when the first neural network model is constructed according to the existing second neural network model, the framework types of the first neural network model and the second neural network model are required to be ensured to be the same.
The second neural network model may correspond to different versions, and when the first neural network model is constructed by referring to the second neural network model, the reference mode and the reference version corresponding to the reference mode also need to be determined. Specifically, the method comprises the following steps:
the neural network model reference can support two modes: the linkage reference and the snapshot reference. The linkage reference builds the first neural network model on the latest version of the second neural network model; that is, when the referenced second neural network model changes, the constructed first neural network model changes with it. For example, when network model B references network model A and network model A has a version update, the topology corresponding to network model B is automatically updated. The snapshot reference builds the first neural network model on a preset version of the second neural network model; that is, when the first neural network model is built on a preset version of the second neural network model and the referenced second neural network model subsequently changes, the constructed first neural network model does not change with it. For example, as shown in fig. 3, when network model B references the first version of network model A (snapshot 1), the corresponding topology in network model B remains the topology of the first version of network model A even if network model A is subsequently updated. The right-hand area in fig. 3 may show the corresponding reference mode, as well as the referenced version information.
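The two reference modes can be sketched as two resolution strategies against a versioned repository: a linkage reference resolves to the latest version at lookup time, while a snapshot reference is pinned to a fixed version. The repository class and its layout are illustrative assumptions.

```python
# Sketch of linkage vs. snapshot references. A linkage reference follows
# the latest version of the referenced model; a snapshot reference is
# pinned to a preset version. Data structures are illustrative assumptions.

class ModelRepo:
    def __init__(self):
        self.versions = {}                      # name -> list of topologies

    def save(self, name, topology):
        self.versions.setdefault(name, []).append(topology)

    def resolve(self, name, mode="linkage", version=None):
        if mode == "linkage":
            return self.versions[name][-1]      # always the latest version
        return self.versions[name][version - 1] # pinned snapshot (1-based)

repo = ModelRepo()
repo.save("A", "topology-v1")
repo.save("A", "topology-v2")                   # model A gets a version update
print(repo.resolve("A"))                        # linkage  -> topology-v2
print(repo.resolve("A", "snapshot", 1))         # snapshot -> topology-v1
```

This mirrors the fig. 3 example: model B with a snapshot reference keeps seeing snapshot 1 even after model A is updated.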
In an embodiment of the present invention, after the first neural network model is built in the user interface, the method further includes: and displaying the first neural network model in an animation mode or a three-dimensional mode.
In the case of constructing the first neural network model and determining the topological structure effectiveness of the first neural network model, the features and the respective partial functions of the first neural network model can be viewed in various forms. Such as an animation mode and a stereoscopic mode (3D mode).
The animation mode includes, but is not limited to, at least one of: a Flash animation mode and a GIF animation mode. Supported actions in the animation include arrow pointing, dotted-line flow, image expansion and contraction, generation of monitoring index curves, jumping increases and decreases of index values, and the like; the GIF mode converts the animation into a picture in gif format. In the 3D mode, by loading a 3D processing engine, scaling, rotation, view tracking, object marking and the like of the neural network model can be supported. These modes are used to represent the flow direction of training data through each network layer, the change of data before and after processing, the convergence of model parameters, the visual features of each layer, and the like.
In an embodiment of the present invention, after the building the first neural network model in the user interface, the method further includes:
receiving a third input of the user in the user interface, wherein the third input is a query operation on model meta-information of the first neural network model; and responding to the third input, and graphically displaying model meta-information for characterizing the first neural network model.
When the first neural network model has been built, model meta-information of the model can be viewed by performing a third input on a model meta-information button of the model building page. The model meta-information includes, but is not limited to, at least one of: structural information of the model and index information of the model. The structural information of the model includes, but is not limited to, at least one of: the total number of components of the neural network model, the number of core network layers, the number of fully connected layers, and the size of the convolution kernels. The index information of the model includes, but is not limited to, at least one of: the model complexity, the model prediction speed, the model computation amount, and the memory usage of the neural network model. The structural information of the model reflects, to some extent, certain index characteristics of the model; the structural information and the index information are positively correlated, and the correspondence between them is as follows: the total number of components influences the model complexity, the number of core network layers influences the model prediction speed, the number of fully connected layers influences the model computation amount, and the size of the convolution kernels influences the memory usage of the neural network model.
By visualizing the model meta-information in a chart mode, a user can quickly establish visual understanding on the model, and quick iterative tuning is facilitated.
In order to facilitate the user in viewing and operating the model, the various model meta-information of the first neural network model can be displayed in various chart forms within one display interface. For example, fig. 4 is a schematic diagram of the model meta-information visualization; in it, the total number of components, the number of network layers, the number of neurons, the model complexity, the component distribution, the resource usage, and the like are shown in the form of bar charts and numerical statistics charts.
Furthermore, the resources used for training can be intelligently recommended. Before intelligent recommendation, a typical data set is provided and typical resources are selected, a typical (thorough) training run is completed, and the system automatically records the training metadata, i.e. the relevant information of the typical training process. The training metadata includes, but is not limited to, at least one of: the number of samples in the data set, the file size of the data set, the usage of central processing unit (CPU), memory and graphics processing unit (GPU) resources, the number of training iteration rounds, and the total elapsed time. The resources used for training and the number of training iterations are positively correlated with the elapsed time. The training resources required for the current model can then be recommended based on the training metadata recorded by the system for the typical data set and the selected typical resources.
For example, suppose the typical data set is 100 face images to be processed, each of size 100 × 100, the typical resources are 2 GB of memory and a 2-core CPU, and the training duration is 1 hour; the training metadata then comprises the number of face images, the image size, the memory size, the CPU size, and the training duration of the thorough training. When a user starts custom training and allocates certain resources, the system gives an estimated training time according to the typical training data, the user's current allocation, and the training metadata of the thorough training. For instance, if the data to be processed is a batch of 200 face images of size 100 × 100 and the user's current allocation is 2 GB of memory and a 2-core CPU, the training time estimated from the training metadata of the thorough training can be 2 hours. When the resources allocated by the user change, the estimated training time changes synchronously.
Alternatively, when the user selects a specific training duration, the system may evaluate the required training resources based on the training metadata and the selected duration, and intelligently recommend training resources. For example, if the user selects 100 face images of size 100 × 100 to be processed within half an hour of training, 2 GB of memory and a 4-core CPU can be intelligently recommended.
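The estimates in these examples are consistent with a simple scaling rule: training time grows with the amount of data and shrinks as CPU cores are added. The linear model below is an assumption for illustration, not the patent's actual estimator; the baseline numbers come from the thorough-training example above.

```python
# Rough sketch of the estimation logic implied by the examples: time scales
# linearly with data size and inversely with allocated CPU cores, relative
# to a recorded baseline (thorough) training run. The linearity is assumed.

def estimate_hours(n_samples, cpu_cores,
                   base_samples=100, base_cores=2, base_hours=1.0):
    return base_hours * (n_samples / base_samples) * (base_cores / cpu_cores)

print(estimate_hours(200, 2))   # twice the data, same resources -> 2.0 hours
print(estimate_hours(100, 4))   # same data, twice the cores     -> 0.5 hours
```

The second call matches the resource-recommendation example: to finish the baseline data set in half an hour, doubling the CPU cores (to 4) suffices under this model.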
In an embodiment of the present invention, after the building the first neural network model in the user interface, the method further includes:
and carrying out visual model training on the first neural network model to obtain a target neural network model.
Further, in order to avoid training invalid neural network models to some extent, the visual model training of the first neural network model may be performed only once the first neural network model has been constructed and the validity of its topology has been determined.
The visual model training comprises at least two rounds of training on a first neural network model, and when the first neural network model is trained to meet preset conditions, a target neural network model is obtained, wherein the preset conditions are as follows: the first neural network model is trained to preset ideal parameters.
Wherein, the method also comprises: and when the first neural network model is trained, the training indexes of the first neural network model in each training period are displayed through a chart.
The general model training needs to go through a plurality of epochs to enable the model to learn relatively ideal parameters to reach an available level. When the first neural network model reaches the available level through a plurality of rounds of training, the corresponding target neural network model can be determined and obtained. All indexes of each stage of the whole training process can be displayed in a form of a chart, for example:
the abscissa of the chart is a training stage, the name of each stage is p/q, p is the current training epoch, and q is the total epoch of the model training, wherein the total round q of the model training is a hyper-parameter which can be set by a user, and the length of the abscissa of each stage represents the length of the epoch using time. The ordinate of the graph can be the time length corresponding to the model precision, the processor utilization rate, the memory occupation size, the GPU utilization rate, the disk read-write rate, the storage file size, the network use information and the training period. In the training process, the change of the indexes of each epoch can intuitively know the comprehensive indexes of model training in each training period, and the understanding of the model training is deepened.
Specifically, please refer to fig. 5, in which fig. 5 is a model training visualization diagram corresponding to an example in which the ordinate is the time length corresponding to the model precision and the training period, and the abscissa is the model training round. The comprehensive training indexes of the model in each training period are displayed through the chart, so that the bottleneck of model training can be directly known, the model can be better adjusted, training resources can be adjusted, and iterative training of the model can be accelerated.
In an embodiment of the present invention, before performing the visual model training on the first neural network model, the method further includes: acquiring distributed training resources; and visually displaying the training resources.
Wherein the step of obtaining the allocated training resources comprises: displaying a training resource allocation region on a user interface; and acquiring the training resources according to the fourth input of the user in the training resource allocation area.
The distribution of model training resources directly affects the training speed, and can even decide whether the training can run normally. The memory is used for loading the model and the training data, the CPU and the GPU are used for calculating, the three training resources can be distributed in a graphical mode before training according to the needs of users, and recommended values can be displayed to be used as references. Whether the GPU is used as the training resource can be determined by user selection, and if the user selects the GPU to be used as the training resource, the resources can be distributed to the memory, the CPU and the GPU in a graphical mode in a training resource distribution area.
Wherein the recommended value of the system may be displayed before receiving a fourth input by the user within the training resource allocation region. For example, the recommended values of the system are displayed in the training resource allocation area in a scroll bar manner, the memory, the CPU, and the GPU correspond to one scroll bar, and the corresponding resource allocation manner may be determined according to the sliding input performed by the user in each scroll bar, so as to acquire the training resources.
It should be noted that, after the training resource allocation manner is determined and the allocated training resources are acquired, the training resources may be visually displayed in a graph manner. After the allocated resources are obtained, the first neural network model can be subjected to visual model training, wherein in the training process, the use condition of the resources can be updated in real time in a chart form.
In an embodiment of the present invention, before performing the visual model training on the first neural network model, the method further includes: displaying a model hyper-parameter input area on a user interface; acquiring at least one model hyper-parameter input by the user according to the fifth input of the user in the model hyper-parameter input area; and after the model hyper-parameters are obtained, performing visual model training on the first neural network model according to the model hyper-parameters.
And the model hyper-parameter input area comprises a graphical input sub-area corresponding to each model hyper-parameter.
The purpose of setting the model hyper-parameters is to improve the performance and effect of learning. Wherein the deeply learned model hyper-parameters include, but are not limited to, at least one of: iteration times, the number of hidden layers, the number of neurons in each layer, learning rate and the like. The model hyper-parameters can be set in various graphical modes such as a table, a dragging bar, a digital input box and the like of a page.
For example, take setting the number of iterations through the drag bar: the drag bar position is proportional to the value of the model hyper-parameter. With a drag bar track of total length L0 and an iteration-count range of [1, 10], the value is set by sliding the drag bar and calculated as: number of iterations = (current drag bar length L1 / total track length L0) × 10. Further, when the drag bar is slid, the iteration count displayed on the user interface changes dynamically with the position of the drag bar.
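The drag-bar mapping above can be written as a one-line function. Rounding to the nearest integer and clamping to the lower bound of the [1, 10] range are assumptions added for illustration.

```python
# Drag-bar mapping: the current bar length L1 relative to the track length
# L0 is scaled into the iteration range [1, 10]. Rounding and clamping to
# the lower bound are illustrative assumptions.

def iterations_from_drag_bar(l1, l0, max_iterations=10):
    value = round(l1 / l0 * max_iterations)
    return max(1, value)    # keep the value inside the [1, 10] range

print(iterations_from_drag_bar(50, 100))    # half-way along the track -> 5
print(iterations_from_drag_bar(100, 100))   # fully slid -> 10
print(iterations_from_drag_bar(0, 100))     # clamped to the minimum -> 1
```

Re-evaluating this function on every slider event gives the dynamically changing value described above.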
As shown in fig. 6, a model hyper-parameter input area is presented in the user interface, where the model hyper-parameter input area includes graphical input sub-regions for model hyper-parameters such as the learning rate, the decay period, the number of iterations, and the batch size. By receiving the user's input in the graphical input sub-region corresponding to a given model hyper-parameter, that model hyper-parameter can be obtained.
In an embodiment of the present invention, after obtaining the target neural network model, the method further includes: generating an evaluation index corresponding to the target neural network model; and displaying the evaluation index in a chart form.
After the model training is completed, the model evaluation indexes can be displayed in various charts, including but not limited to line graphs, bar graphs, radar charts, and the like. The evaluation indexes include r2 (R squared), F1 (F1-score), neg_mean_squared_log_error (negated mean squared logarithmic error), recall, explained_variance (explained variance, which measures how well the model explains the fluctuation of the data set), accuracy, ROC AUC (area under the ROC curve), neg_mean_squared_error (negated mean squared error), precision, neg_median_absolute_error (negated median absolute error), log_loss (logarithmic loss), Fbeta (the Fβ score combines precision and recall into one score, with recall weighted β times as heavily as precision), neg_mean_absolute_error (negated mean absolute error), and the like. As shown in fig. 7a and fig. 7b, a schematic ROC (receiver operating characteristic) curve and a list of model evaluation indexes are respectively shown.
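The Fβ score mentioned in the list can be computed directly from its standard definition, Fβ = (1 + β²)·P·R / (β²·P + R), which reduces to F1 when β = 1. A minimal implementation:

```python
# F-beta score from its standard definition: precision (P) and recall (R)
# are combined into one score, with recall weighted beta times as heavily
# as precision. beta = 1 gives the ordinary F1 score.

def f_beta(precision, recall, beta=1.0):
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

print(round(f_beta(0.8, 0.6), 4))           # beta=1 is the F1 score -> 0.6857
print(round(f_beta(0.8, 0.6, beta=2), 4))   # recall weighted more heavily
```

With β = 2 the score moves toward the recall value (0.6), illustrating the weighting described above.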
In an embodiment of the present invention, after obtaining the target neural network model, the method further includes:
storing the target neural network model to a model repository; at least one statistical view within the model repository is updated, wherein each statistical view is generated according to a different statistical rule.
And after the model training is finished, the model automatically enters a model warehouse, and the model warehouse is a device for uniformly managing the models. A user can carry out standard management on the model through the warehouse management page, check and evaluate the quality of the model through various modes, and release the model as an online service for specific business application of a production environment after screening.
The model warehouse classifies and manages trained models. On the warehouse overview page, multi-dimensional statistical pages can be freely switched; each statistical page corresponds to a statistical rule, including but not limited to statistics by time, statistics by model type, screening by a certain evaluation index, and statistics by project, business attribute, algorithm type, and the like. Statistics by time means counting model rankings and numbers at hour, day, week, month and year granularity, according to the time the model was trained or entered the warehouse. Statistics by model type means classified statistics according to the type of the model, such as binary classification, multi-classification, regression, and the like. View types include, but are not limited to, line graphs, pie charts, bar graphs, stacked/percentage bar graphs, scatter plots, and the like. FIG. 8 is a radar map of model types applied in different scenarios. Each broken line in the radar map shown in fig. 8 represents a model type, such as the convolutional neural network model, the deep neural network model, the recurrent neural network model, the generative adversarial network model, and the long short-term memory network model; each vertex of the radar map represents an application scenario, such as image classification, speech recognition, target detection, text classification, and image segmentation. The closer a broken line is to a vertex, the better the application effect of the model type represented by that line in that scenario. As shown in fig. 8, in the speech recognition scenario, the broken line representing the deep neural network model is closer to the corresponding vertex than those of the other four neural network models; that is, the deep neural network model performs better in the speech recognition scenario than the other four models. Similarly, as can be seen from fig. 8, the convolutional neural network model performs best in the image classification scenario, the target detection scenario, and the image segmentation scenario, and the long short-term memory network performs best in the text classification scenario.
In an embodiment of the present invention, after obtaining the target neural network model, the method further includes:
and based on the derivation operation, the target neural network model is derived in a visual output mode.
Under different scenes, the requirements of system resources and models are different, and in order to facilitate the multiplexing and migration of model achievements among different systems, the models have better compatibility and wider application, and the import and export functions are indispensable.
When the model in the data analysis system is exported, in addition to the data of the model, model meta-information such as the model structure, the model hyper-parameters, the model training process, the model evaluation indexes and the like of the model also support visual output through charts, for example, visual output through formats such as static pictures, flash, dynamic pictures and the like. Specifically, the information is input into a model export compression package as accessory information of the model, for example, the information is output in a static jpg picture format by default, and a user can also select an output format according to needs. Model meta-information such as a model structure, model hyper-parameters, a model training process, model evaluation indexes and the like of the model are visually output through a chart, so that a system without visualization capacity can conveniently display the model visualization information.
In an embodiment of the present invention, the method further includes: and importing a third neural network model corresponding to the import operation based on the import operation.
When importing the model of the data analysis system, the imported model can be imported without any modification, or partially optimized and modified when being imported again, so as to perform optimized importing.
When the import operation is executed, it can first be distinguished whether the third neural network model to be imported is a model exported by the data analysis system. If so, the model file is imported as-is: when a model exported by the data analysis system is imported into the system again, the data analysis system completely retains the relevant information, and no special processing of the model is required.
If the third neural network model to be imported does not belong to the model exported by the data analysis system, the model of the third-party system is imported, the data analysis system firstly analyzes the model, and the importing operation is executed under the condition that the analysis is successful.
That is, the step of importing the third neural network model corresponding to the import operation based on the import operation includes:
analyzing the model file of the third neural network model to obtain target structure information; performing model training on the third neural network model according to the target structure information to obtain a trained third neural network model; and importing the trained third neural network model.
First, the analysis process is explained. The data analysis system reads the suffix of the model file and checks whether it matches a framework supported by the data analysis system. If it matches, the system tries to analyze the model file with the corresponding framework to obtain the target structure information; if not, the system analyzes the model file with each supported framework in sequence, and if all attempts fail, it prompts that the model is not supported. The target structure information includes, but is not limited to: the components, the component parameters, and the connection relations among the components.
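The suffix-matching-with-fallback analysis described above can be sketched as follows. This is an illustrative sketch, not the patented code: `parse_model_file` is a hypothetical name, and `parsers` stands in for the set of supported frameworks (the suffixes shown, such as "h5" or "onnx", are examples only).

```python
def parse_model_file(filename, contents, parsers):
    """Try the parser matching the file suffix first, then fall back to
    every other supported framework in sequence; raise if none succeeds.

    `parsers` maps a file suffix to a callable that either returns the
    target structure information or raises ValueError on failure."""
    suffix = filename.rsplit(".", 1)[-1].lower()
    tried = []
    if suffix in parsers:
        # The suffix matches a supported framework: try it first.
        tried.append(suffix)
        try:
            return parsers[suffix](contents)
        except ValueError:
            pass  # fall through to the remaining frameworks
    for name, parser in parsers.items():
        # Try each supported framework in sequence.
        if name in tried:
            continue
        try:
            return parser(contents)
        except ValueError:
            continue
    # All analysis attempts failed: prompt that the model is not supported.
    raise ValueError("model not supported")
```

On success the return value would carry the components, component parameters, and connection relations among the components.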
The analysis process includes checking the file format. If the analysis succeeds, the components, the component parameters, and the connection relations among the components are obtained, and the third neural network model is reconstructed from them. The third neural network model is then retrained, and the corresponding model meta-information, such as the model hyper-parameters, the model training process, and the model evaluation indexes, is obtained from the training result.
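Reassembling a model from the parsed structure information might look like the following sketch, where the model is represented simply as an adjacency-list graph. `build_model` is a hypothetical name; a real system would instantiate actual framework layers instead of dictionary nodes.

```python
def build_model(components, parameters, connections):
    """Assemble a model graph from the target structure information.

    components:  list of component names, e.g. ["input", "dense", "output"]
    parameters:  dict mapping a component name to its component parameters
    connections: list of (source, target) pairs between components"""
    graph = {name: {"params": parameters.get(name, {}), "next": []}
             for name in components}
    for src, dst in connections:
        # A connection referencing a component outside the parsed set
        # indicates a malformed model file.
        if src not in graph or dst not in graph:
            raise ValueError(f"unknown component in connection: {src}->{dst}")
        graph[src]["next"].append(dst)
    return graph
```

The resulting graph carries exactly the three pieces of target structure information named above: the components, their parameters, and the connection relations among them.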
During the model import process, the imported model can be optimized, that is, the model file can be optimized. To facilitate this, corresponding model code can be generated based on the acquired target structure information, and the imported model can then be optimized through training based on that model code.
To improve the efficiency of the optimization and shorten the optimization time, the optimization of the model file is implemented by transfer learning: for the imported third neural network model, part of the target structure information is frozen, and only the network layers corresponding to the unfrozen target structure information are retrained.
Further, when optimizing the imported model, the step of performing model training on the third neural network model according to the target structure information to obtain a trained third neural network model includes: freezing part of the target structure information; training a network layer corresponding to the unfrozen target structure information to obtain a target training result; and obtaining a trained third neural network model according to the network layer corresponding to the frozen part of the target structure information and the target training result.
The following describes a process of implementing optimization processing by using a transfer learning mode in combination with a specific application scenario.
Assume the model to be optimized has M network layers, and the target structure information corresponding to layers 1 to N is the frozen part, where M and N are positive integers and M is greater than N. The transfer learning optimization then proceeds as follows: select N consecutive network layers, from the first layer to the Nth layer, and freeze the target structure information corresponding to them; verify the validity of the frozen model; and, if verification passes, train the frozen model. During training, the target structure information of layers 1 to N is kept unchanged, and only the target structure information corresponding to layers N+1 to M is updated.
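The freeze-and-retrain step can be illustrated with the following toy sketch, in which each layer is reduced to a single weight and one gradient step stands in for a full training run. `fine_tune` is a hypothetical name; in a real framework freezing would typically mean excluding the frozen layers' parameters from the optimizer.

```python
def fine_tune(layers, freeze_upto, gradients, lr=0.1):
    """Transfer-learning style update: layers 1..N (indices < freeze_upto)
    stay frozen, layers N+1..M take a gradient step."""
    # N must be a positive integer strictly less than M.
    assert 0 < freeze_upto < len(layers)
    updated = []
    for i, (w, g) in enumerate(zip(layers, gradients)):
        if i < freeze_upto:
            updated.append(w)            # frozen: weight kept unchanged
        else:
            updated.append(w - lr * g)   # trainable: gradient descent step
    return updated
```

Because only the layers beyond the frozen prefix are updated, the retraining touches far fewer parameters than training the whole model, which is the source of the efficiency gain claimed above.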
In the method for constructing a neural network model according to the embodiment of the present invention, models are created and edited visually, which lowers the barrier to using neural networks: users who cannot write code can still construct, train, and use neural network models, understand neural networks more intuitively, deepen their understanding, and master them more quickly. Adjusting the model hyper-parameters visually improves working efficiency; analyzing the model meta-information through visual statistics makes the model easier to understand intuitively; and training the model visually reveals training bottlenecks, so the model, its hyper-parameters, and the training resources can be adjusted more effectively, accelerating iterative training and improving efficiency. Managing the model warehouse visually enables flexible and efficient management; importing and exporting models visually provides better compatibility and extensibility, makes cross-system use flexible and convenient, facilitates visualization of models across different systems, and enhances model reusability.
An embodiment of the present invention further provides a system for constructing a neural network model, as shown in fig. 9, including:
a first display module 10 for displaying a user interface comprising at least two components;
a first receiving module 20, configured to receive a first input performed by a user in a user interface;
a building module 30 for building a first neural network model within the user interface in response to the first input.
Further, the system further comprises:
the second receiving module is used for receiving the selection operation of the user on the at least two components before the first receiving module receives the first input executed by the user in the user interface;
and the second display module is used for responding to the selection operation, determining at least two components as target components and displaying the target components in the designated positions of the user interface.
Further, the first receiving module is further configured to: receiving a first touch operation of a user on a target assembly;
the construction module comprises:
the processing submodule is used for responding to the first touch operation and acquiring the connection relation between the at least two assemblies and the assembly parameters of the at least two assemblies;
and the generation submodule is used for generating a first neural network model according to the connection relation and the component parameters of the at least two components.
Further, the user interface is an editing interface in canvas mode, and the system further comprises:
the first generation module is used for generating a first neural network model code corresponding to the first neural network model after the construction module constructs the first neural network model in the user interface;
and the third display module is used for displaying the first neural network model code in the editing interface of the code mode.
Further, the system further comprises:
the third receiving module is used for receiving a second input executed by the user in the user interface after the building module builds the first neural network model in the user interface, wherein the second input is an editing operation on the first neural network model;
and a first updating module for responding to the second input and updating the first neural network model and the first neural network model code corresponding to the first neural network model.
Further, the target component includes a second neural network model that has been built, and the build module is further to:
constructing a first neural network model based on the second neural network model of the latest version;
or,
and constructing the first neural network model based on the second neural network model of the preset version.
Further, the system further comprises:
and the first display module is used for displaying the first neural network model in an animation mode or a three-dimensional mode after the construction module constructs the first neural network model in the user interface.
Further, the system further comprises:
the fourth receiving module is used for receiving a third input of the user in the user interface after the constructing module constructs the first neural network model in the user interface, wherein the third input is an inquiry operation of model meta-information of the first neural network model;
and the second display module is used for responding to the third input and displaying the model meta-information representing the first neural network model characteristic in a graph mode.
Further, the system further comprises:
and the first acquisition module is used for performing visual model training on the first neural network model after the construction module constructs the first neural network model in the user interface to acquire the target neural network model.
Further, the system further comprises:
and the third display module is used for displaying the training indexes of the first neural network model in each training period through a chart while training the first neural network model.
Further, the system further comprises:
the second acquisition module is used for acquiring the distributed training resources before the first acquisition module carries out visual model training on the first neural network model;
and the fourth display module is used for visually displaying the training resources.
Further, the second obtaining module is further configured to:
displaying a training resource allocation region on a user interface;
and acquiring the training resources according to the fourth input of the user in the training resource allocation area.
Further, the system further comprises:
the fifth display module is used for displaying the model hyper-parameter input area on the user interface before the first acquisition module performs visual model training on the first neural network model;
the third acquisition module is used for acquiring at least one model hyper-parameter input by the user according to the fifth input of the user in the model hyper-parameter input area;
and the training module is used for performing visual model training on the first neural network model according to the model hyper-parameters after the model hyper-parameters are obtained.
Further, the system further comprises:
the second generation module is used for generating an evaluation index corresponding to the target neural network model after the first acquisition module acquires the target neural network model;
and the sixth display module is used for displaying the evaluation indexes in a chart form.
Further, the system further comprises:
the storage module is used for storing the target neural network model to the model warehouse after the first acquisition module acquires the target neural network model;
and the second updating module is used for updating at least one statistical view in the model warehouse, wherein each statistical view is generated according to different statistical rules.
Further, the system further comprises:
and the derivation module is used for deriving the target neural network model in a visual output mode based on the derivation operation after the first acquisition module acquires the target neural network model.
Further, the system further comprises:
and the importing module is used for importing a third neural network model corresponding to the importing operation based on the importing operation.
Further, the import module includes:
the analysis submodule is used for analyzing the model file of the third neural network model to obtain target structure information;
the training submodule is used for carrying out model training on the third neural network model according to the target structure information to obtain a trained third neural network model;
and the importing submodule is used for importing the trained third neural network model.
Further, the training sub-module is further configured to:
freezing part of the target structure information;
training a network layer corresponding to the unfrozen target structure information to obtain a target training result;
and obtaining a trained third neural network model according to the network layer corresponding to the frozen part of the target structure information and the target training result.
The present invention further provides a system for constructing a neural network model, including a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the above method for constructing a neural network model.
The present invention further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the above method for constructing a neural network model.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (10)

1. A method of constructing a neural network model, comprising:
displaying a user interface comprising at least two components;
receiving a first input performed by a user within the user interface;
constructing a first neural network model within the user interface in response to the first input.
2. The method of claim 1, prior to receiving a first input performed by a user within the user interface, further comprising:
receiving selection operation of a user on at least two components;
and responding to the selection operation, determining the at least two components as target components, and displaying the target components in the designated positions of the user interface.
3. The method of claim 2, wherein the target component comprises a second neural network model that has been completely built, and wherein the step of building a first neural network model within the user interface comprises:
building the first neural network model based on the second neural network model of the latest version;
or,
and constructing the first neural network model based on the second neural network model of a preset version.
4. The method of claim 1, wherein after constructing the first neural network model within the user interface, further comprising:
and carrying out visual model training on the first neural network model to obtain a target neural network model.
5. The method of claim 4, further comprising:
and when the first neural network model is trained, the training indexes of the first neural network model in each training period are displayed through a chart.
6. A system for constructing a neural network model, comprising:
a first display module for displaying a user interface comprising at least two components;
a first receiving module for receiving a first input performed by a user within the user interface;
a building module to build a first neural network model within the user interface in response to the first input.
7. The system of claim 6, further comprising:
a second receiving module, configured to receive a selection operation of at least two components by a user before the first receiving module receives a first input performed by the user in the user interface;
and the second display module is used for responding to the selection operation, determining the at least two components as target components and displaying the target components in the designated positions of the user interface.
8. The system of claim 7, wherein the target component comprises a second neural network model that has been completely constructed, the construction module further to:
building the first neural network model based on the second neural network model of the latest version;
or,
and constructing the first neural network model based on the second neural network model of a preset version.
9. The system of claim 6, further comprising:
and the first acquisition module is used for performing visual model training on the first neural network model after the construction module constructs the first neural network model in the user interface to acquire the target neural network model.
10. The system of claim 9, further comprising:
and the third display module is used for displaying the training indexes of the first neural network model in each training period through a chart while training the first neural network model.
CN201910943429.7A 2019-09-30 2019-09-30 Method and system for constructing neural network model Pending CN110689124A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910943429.7A CN110689124A (en) 2019-09-30 2019-09-30 Method and system for constructing neural network model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910943429.7A CN110689124A (en) 2019-09-30 2019-09-30 Method and system for constructing neural network model

Publications (1)

Publication Number Publication Date
CN110689124A true CN110689124A (en) 2020-01-14

Family

ID=69111431

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910943429.7A Pending CN110689124A (en) 2019-09-30 2019-09-30 Method and system for constructing neural network model

Country Status (1)

Country Link
CN (1) CN110689124A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111708520A (en) * 2020-06-16 2020-09-25 北京百度网讯科技有限公司 Application construction method and device, electronic equipment and storage medium
CN112949614A (en) * 2021-04-29 2021-06-11 成都市威虎科技有限公司 Face detection method and device for automatically allocating candidate areas and electronic equipment
CN115294967A (en) * 2022-05-18 2022-11-04 国网浙江省电力有限公司营销服务中心 Full-automatic construction method of learning model search space suitable for speech classification
CN116977525A (en) * 2023-07-31 2023-10-31 之江实验室 Image rendering method and device, storage medium and electronic equipment
WO2024045128A1 (en) * 2022-09-01 2024-03-07 西门子股份公司 Artificial intelligence model display method and apparatus, electronic device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060224533A1 (en) * 2005-03-14 2006-10-05 Thaler Stephen L Neural network development and data analysis tool
CN108319456A (en) * 2018-01-29 2018-07-24 徐磊 A kind of development approach for exempting to program deep learning application
CN108537328A (en) * 2018-04-13 2018-09-14 众安信息技术服务有限公司 Method for visualizing structure neural network

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060224533A1 (en) * 2005-03-14 2006-10-05 Thaler Stephen L Neural network development and data analysis tool
CN108319456A (en) * 2018-01-29 2018-07-24 徐磊 A kind of development approach for exempting to program deep learning application
CN108537328A (en) * 2018-04-13 2018-09-14 众安信息技术服务有限公司 Method for visualizing structure neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Penguin Account - Yunweibang (企鹅号-运维帮): "TensorEditor: a visual deep network construction tool that builds a network structure in seconds", Tencent Cloud: HTTPS://CLOUD.TENCENT.COM/DEVELOPER/NEWS/231571 *
Alibaba Cloud (阿里云): "Visual development of deep learning networks", HTTPS://WWW.ALIBABACLOUD.COM/HELP/ZH/DOC-DETAIL/126303.HTM?SPM=A2C63.P38356.B99.25.299FCA95DMM8EC *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111708520A (en) * 2020-06-16 2020-09-25 北京百度网讯科技有限公司 Application construction method and device, electronic equipment and storage medium
CN111708520B (en) * 2020-06-16 2023-08-29 北京百度网讯科技有限公司 Application construction method and device, electronic equipment and storage medium
CN112949614A (en) * 2021-04-29 2021-06-11 成都市威虎科技有限公司 Face detection method and device for automatically allocating candidate areas and electronic equipment
CN115294967A (en) * 2022-05-18 2022-11-04 国网浙江省电力有限公司营销服务中心 Full-automatic construction method of learning model search space suitable for speech classification
WO2024045128A1 (en) * 2022-09-01 2024-03-07 西门子股份公司 Artificial intelligence model display method and apparatus, electronic device and storage medium
CN116977525A (en) * 2023-07-31 2023-10-31 之江实验室 Image rendering method and device, storage medium and electronic equipment
CN116977525B (en) * 2023-07-31 2024-03-01 之江实验室 Image rendering method and device, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
CN110689124A (en) Method and system for constructing neural network model
US11521221B2 (en) Predictive modeling with entity representations computed from neural network models simultaneously trained on multiple tasks
US20200349482A1 (en) Techniques for workflow analysis and design task optimization
WO2022022233A1 (en) Ai model updating method and apparatus, computing device and storage medium
CN109960761A (en) Information recommendation method, device, equipment and computer readable storage medium
US20220245465A1 (en) Picture searching method and apparatus, electronic device and computer readable storage medium
CN111723292A (en) Recommendation method and system based on graph neural network, electronic device and storage medium
CN112561031A (en) Model searching method and device based on artificial intelligence and electronic equipment
CN114118192A (en) Training method, prediction method, device and storage medium of user prediction model
CN113420009A (en) Electromagnetic data analysis device, system and method based on big data
CN110956277A (en) Interactive iterative modeling system and method
CN114861910A (en) Neural network model compression method, device, equipment and medium
CN116738081A (en) Front-end component binding method, device and storage medium
CN114072809A (en) Small and fast video processing network via neural architectural search
CN112148942B (en) Business index data classification method and device based on data clustering
US11775813B2 (en) Generating a recommended target audience based on determining a predicted attendance utilizing a machine learning approach
CN112817563A (en) Target attribute configuration information determination method, computer device, and storage medium
CN111861750A (en) Feature derivation system based on decision tree method and readable storage medium
US9594756B2 (en) Automated ranking of contributors to a knowledge base
JP2024508502A (en) Methods and devices for pushing information
CN114444676A (en) Model channel pruning method and device, computer equipment and storage medium
CN116188914A (en) Image AI processing method in meta-universe interaction scene and meta-universe interaction system
CN115983377A (en) Automatic learning method, device, computing equipment and medium based on graph neural network
CN114268625B (en) Feature selection method, device, equipment and storage medium
CN111414538A (en) Text recommendation method and device based on artificial intelligence and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200114
