CN107766940B - Method and apparatus for generating a model - Google Patents

Method and apparatus for generating a model

Info

Publication number
CN107766940B
CN107766940B (application CN201711157646.0A)
Authority
CN
China
Prior art keywords
model
sample data
training
terminal
user
Prior art date
Legal status
Active
Application number
CN201711157646.0A
Other languages
Chinese (zh)
Other versions
CN107766940A (en)
Inventor
谢永康
施恩
胡鸣人
李曙鹏
李亚帅
臧硕
潘子豪
赵颖
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201711157646.0A
Publication of CN107766940A
Application granted
Publication of CN107766940B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Abstract

Methods and apparatus for generating models are disclosed. One embodiment of the method comprises: in response to receiving a model generation request sent by a user's terminal that includes a user identifier, looking up the model information set corresponding to the user identifier in a preset model table and sending the model information set to the terminal, where the model information in the model information set includes a model category and model parameters, and the model table represents the correspondence between user identifiers and model information; in response to receiving the model category and model parameters, sent by the terminal, that the user selected from the model information set, determining a neural network matching the selected model category and model parameters; and, in response to receiving a sample data set sent by the terminal, training a model with a machine learning method based on the sample data set and the neural network. This embodiment can generate a user-customized neural network model.

Description

Method and apparatus for generating a model
Technical Field
Embodiments of the present application relate to the field of computer technology, in particular to the field of Internet technology, and more particularly to a method and apparatus for generating a model.
Background
Artificial Intelligence (AI) is a new technical science that studies and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence. Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence; research in this field includes robotics, speech recognition, image recognition, natural language processing, expert systems, and more.
With the development of deep learning, existing AI technology can often achieve good results in general-purpose domains, such as general character recognition, image classification over a limited set, general word segmentation and general speech recognition. However, the scenarios that require AI capabilities in real-world applications are often customized, such as character recognition for specific documents, word segmentation of medical terminology, and speech recognition in special scenarios.
Disclosure of Invention
Embodiments of the present application provide a method and an apparatus for generating a model.
In a first aspect, an embodiment of the present application provides a method for generating a model, including: in response to a received model generation request which is sent by a user terminal and comprises a user identifier, searching a model information set corresponding to the user identifier from a preset model table, and sending the model information set to the terminal, wherein the model information in the model information set comprises a model category and a model parameter, and the model table is used for representing the corresponding relation between the user identifier and the model information; in response to receiving a model category and a model parameter which are sent by a terminal and selected by a user from a model information set, determining a neural network matched with the model category and the model parameter selected by the user; and responding to a received sample data set sent by the terminal, and training by utilizing a machine learning method based on the sample data set and the neural network to obtain a model.
In some embodiments, the sample data in the set of sample data comprises input sample data and output sample data; and the method further comprises: acquiring a predetermined number of input sample data and a predetermined number of output sample data corresponding to the predetermined number of input sample data from the sample data set; inputting the input sample data into the model for each piece of acquired input sample data to obtain an output result corresponding to the input sample data, and accumulating the times of correct verification if the similarity between the output result and the output sample data corresponding to the input sample data is greater than a preset similarity threshold; the ratio of the number of times the correctness was verified to a predetermined number is determined as the accuracy.
In some embodiments, the method further comprises: and if the accuracy is greater than the preset accuracy threshold, converting the model into an application, and issuing the application to a target server for downloading and using by at least one terminal.
In some embodiments, the method further comprises: receiving feedback data sent by a terminal downloading an application and adding the feedback data to a sample data set, wherein the feedback data comprises actual input data input into the application by the terminal downloading the application, an actual output result corresponding to the actual input data, and an expected output result input by a user of the terminal downloading the application, and the similarity between the actual output result and the expected output result is less than a preset similarity threshold; and retraining the model based on the actual input data, the actual output result and the expected output result, and re-issuing the updated model to the target server.
In some embodiments, the method further comprises: and if the accuracy is less than or equal to a preset accuracy threshold, determining recommended model information from the model information set according to the accuracy, the model type and the model parameters selected by the user, and sending the recommended model information to the terminal so that the user can reselect the model type and the model parameters.
In some embodiments, training a model based on the sample data set and the neural network comprises: selecting sample data from the sample data set and executing the following training steps: training the neural network with the sample data; adjusting network parameters of the neural network based on the training result; determining whether a training completion condition has been satisfied; and, in response to determining that the training completion condition is not satisfied and that no stop-training message sent by the user's terminal has been received, selecting other sample data from the sample data set to continue the training steps; wherein the training completion condition includes at least one of: the number of times the neural network has been trained reaches a preset training-count threshold; the loss value between the outputs of the neural network in two adjacent trainings is smaller than a preset threshold.
In a second aspect, an embodiment of the present application provides an apparatus for generating a model, including: a receiving unit configured to, in response to receiving a model generation request sent by a user's terminal that includes a user identifier, look up the model information set corresponding to the user identifier in a preset model table and send the model information set to the terminal, where the model information in the model information set includes a model category and model parameters, and the model table represents the correspondence between user identifiers and model information; a determining unit configured to, in response to receiving the model category and model parameters sent by the terminal and selected by the user from the model information set, determine a neural network matching the selected model category and model parameters; and a training unit configured to, in response to receiving a sample data set sent by the terminal, train a model with a machine learning method based on the sample data set and the neural network.
In some embodiments, the sample data in the set of sample data comprises input sample data and output sample data; and the apparatus further comprises a verification unit configured to: acquiring a predetermined number of input sample data and a predetermined number of output sample data corresponding to the predetermined number of input sample data from the sample data set; inputting the input sample data into the model for each piece of acquired input sample data to obtain an output result corresponding to the input sample data, and accumulating the times of correct verification if the similarity between the output result and the output sample data corresponding to the input sample data is greater than a preset similarity threshold; the ratio of the number of times the correctness was verified to a predetermined number is determined as the accuracy.
In some embodiments, the apparatus further comprises an issuing unit configured to: and if the accuracy is greater than the preset accuracy threshold, converting the model into an application, and issuing the application to a target server for downloading and using by at least one terminal.
In some embodiments, the apparatus further comprises a feedback unit configured to: receiving feedback data sent by a terminal downloading an application and adding the feedback data to a sample data set, wherein the feedback data comprises actual input data input into the application by the terminal downloading the application, an actual output result corresponding to the actual input data, and an expected output result input by a user of the terminal downloading the application, and the similarity between the actual output result and the expected output result is less than a preset similarity threshold; and retraining the model based on the actual input data, the actual output result and the expected output result, and re-issuing the updated model to the target server.
In some embodiments, the apparatus further comprises a recommendation unit configured to: and if the accuracy is less than or equal to a preset accuracy threshold, determining recommended model information from the model information set according to the accuracy, the model type and the model parameters selected by the user, and sending the recommended model information to the terminal so that the user can reselect the model type and the model parameters.
In some embodiments, the training unit is further configured to: select sample data from the sample data set and execute the following training steps: training the neural network with the sample data; adjusting network parameters of the neural network based on the training result; determining whether a training completion condition has been satisfied; and, in response to determining that the training completion condition is not satisfied and that no stop-training message sent by the user's terminal has been received, selecting other sample data from the sample data set to continue the training steps; wherein the training completion condition includes at least one of: the number of times the neural network has been trained reaches a preset training-count threshold; the loss value between the outputs of the neural network in two adjacent trainings is smaller than a preset threshold.
In a third aspect, an embodiment of the present application provides a server, including: one or more processors; storage means for storing one or more programs which, when executed by one or more processors, cause the one or more processors to carry out a method according to any one of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of the first aspect.
According to the method and apparatus for generating a model provided by the embodiments of the application, a user-customized model is trained based on the model category and model parameters of the neural network selected by the user and the training sample data uploaded by the user. In this way, customized data is effectively used to generate a customized model.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for generating a model according to the present application;
FIG. 3 is a schematic illustration of an application scenario of a method for generating a model according to the present application;
FIG. 4 is a flow diagram of yet another embodiment of a method for generating a model according to the present application;
FIG. 5 is a schematic diagram of an embodiment of an apparatus for generating a model according to the present application;
FIG. 6 is a schematic structural diagram of a computer system suitable for implementing the terminal device or the server according to the embodiments of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the present method for generating a model or apparatus for generating a model may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. Various communication client applications, such as a model generation application, a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, social platform software, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting selection of model training parameters and uploading of the sample data used for model training, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, desktop computers, and the like.
The server 105 may be a server that provides various services, such as a background model generation server that provides support for model information displayed on the terminal devices 101, 102, 103. The background model generation server can analyze the received model generation request and the sample data and generate the model.
It should be noted that the method for generating the model provided in the embodiment of the present application is generally performed by the server 105, and accordingly, the apparatus for generating the model is generally disposed in the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. In an alternative embodiment, the method for generating the model may be performed on a server or a server cluster composed of a plurality of servers, and the trained neural network model may be run on various types of electronic devices such as a server, a PC, a mobile terminal, a vehicle-mounted terminal, and the like.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for generating a model according to the present application is shown. The method for generating the model comprises the following steps:
step 201, in response to receiving a model generation request including a user identifier sent by a terminal of a user, searching a model information set corresponding to the user identifier from a preset model table, and sending the model information set to the terminal.
In this embodiment, an electronic device (for example, the server shown in fig. 1) on which the method for generating a model runs may receive a model generation request from a user's terminal through a wired or wireless connection, then look up the model information set corresponding to the user identifier in a preset model table, and send the model information set to the terminal for the user to select a model category and model parameters, where the model information in the model information set includes the model category and the model parameters, and the model table represents the correspondence between user identifiers and model information. The model categories may be, for example, convolutional neural networks, recurrent neural networks, and the like. Model parameters may include, but are not limited to, the number of network layers, kernel function type, error precision, learning rate, etc. The electronic device may set model information for each user individually, or may divide users into different permission levels and associate the model table with the user permissions. The server can then offer different model categories and model parameters to users with different permissions for them to select from. Different permission levels can be set for users in advance according to their user identifiers; for example, with two levels, users with high-level permission may select neural networks with more than 10 layers, while users with low-level permission may only select neural networks with 10 layers or fewer.
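As an illustration of the lookup described above, the following Python sketch shows one way a server might map a user identifier to a permission level and return the corresponding model information set. All names (USER_PERMISSIONS, MODEL_TABLE, lookup_model_info) and the concrete parameter ranges are hypothetical; the patent does not prescribe any particular data structure.

```python
# Hypothetical sketch of the model-table lookup described above. The
# dictionaries, field names and the two permission tiers are assumptions.

# Permission level associated with each user identifier.
USER_PERMISSIONS = {
    "user_001": "high",  # may select neural networks with more than 10 layers
    "user_002": "low",   # limited to neural networks with 10 layers or fewer
}

# Model table: maps a permission level to the model information set offered to
# users of that level; each entry holds a model category and its selectable
# model parameters (number of layers, learning-rate range, ...).
MODEL_TABLE = {
    "high": [
        {"category": "convolutional", "layers": list(range(3, 21)), "learning_rate": (1e-4, 1e-1)},
        {"category": "recurrent", "layers": list(range(3, 16)), "learning_rate": (1e-4, 1e-1)},
    ],
    "low": [
        {"category": "convolutional", "layers": list(range(3, 11)), "learning_rate": (1e-3, 1e-1)},
        {"category": "feedforward", "layers": list(range(3, 6)), "learning_rate": (1e-3, 1e-1)},
    ],
}


def lookup_model_info(user_id):
    """Return the model information set for the given user identifier, or
    None if the user is unknown (the request would then be rejected)."""
    permission = USER_PERMISSIONS.get(user_id)
    if permission is None:
        return None
    return MODEL_TABLE[permission]


if __name__ == "__main__":
    # A terminal sending a model generation request with identifier
    # "user_002" would receive only the low-permission information set.
    print(lookup_model_info("user_002"))
```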
Step 202, responding to the received model category and model parameter which are sent by the terminal and selected by the user from the model information set, and determining the neural network matched with the model category and model parameter selected by the user.
In this embodiment, after receiving the model information set, the terminal displays it for the user to select a model category and model parameters. The user may select model parameters from the listed model information or may enter them manually; if the parameters are entered manually, the server needs to verify the values the user entered. The model categories in the model information set are associated with the model parameters: once the user selects a model category, the selectable model parameters depend on the selected category. For example, if the user selects a convolutional neural network, the number of network layers may be chosen from 3-10 layers; if a feedforward neural network is selected, the number of network layers may be chosen from 3-5 layers. The server then determines, from a candidate set of neural networks, the neural network that matches the model category and model parameters selected by the user and transmitted by the terminal. The neural networks in the candidate set are mathematical models that perform information processing on structures similar to the synaptic connections of the brain; in engineering and academia they are often referred to simply as "neural networks" or neural-like networks.
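The sketch below illustrates how such a server might validate a manually entered layer count against the selected model category and pick a matching network from the candidate set. The 3-10 and 3-5 layer ranges come from the example above; the function names and the candidate descriptions are assumptions.

```python
# Illustrative parameter validation and candidate-network matching. The layer
# ranges follow the example above; everything else is an assumption.

ALLOWED_LAYERS = {
    "convolutional": range(3, 11),  # 3-10 layers selectable
    "feedforward": range(3, 6),     # 3-5 layers selectable
}


def validate_selection(category, num_layers):
    """Verify a manually entered parameter against the selected category
    before any network is instantiated."""
    return category in ALLOWED_LAYERS and num_layers in ALLOWED_LAYERS[category]


def match_candidate_network(category, num_layers, candidates):
    """Return the first candidate neural network matching the user-selected
    category and parameters, or None if nothing matches."""
    if not validate_selection(category, num_layers):
        return None
    for net in candidates:
        if net["category"] == category and net["layers"] == num_layers:
            return net
    return None


# Example candidate set; in practice each entry would reference an actual
# network definition rather than a short description.
CANDIDATES = [
    {"category": "convolutional", "layers": 5, "name": "cnn_5_layer"},
    {"category": "feedforward", "layers": 3, "name": "mlp_3_layer"},
]

print(match_candidate_network("convolutional", 5, CANDIDATES))  # -> cnn_5_layer entry
print(match_candidate_network("feedforward", 8, CANDIDATES))    # invalid -> None
```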
Optionally, the user may also input a name of the model to be generated through the terminal, where the name is user-defined, and the server may check the name to avoid duplication with the model names of other users.
Step 203, responding to the received sample data set sent by the terminal, and training by using a machine learning method based on the sample data set and the neural network to obtain a model.
In this embodiment, the terminal may transmit the sample data set at the same time as the model category and the model parameters. Alternatively, it may wait until the server has verified the model category and parameters and determined the neural network, at which point the server sends a message to the terminal prompting the user to send the sample data set. This avoids uploading the sample data set, and thus wasting network traffic, in cases where a model of the specified category and parameters cannot be generated for the user. The training process of the model is explained by taking a neural network for identifying roads in remote sensing images and a sample data set of remote sensing images as an example. The remote sensing image sample data set includes original remote sensing images and the road information in those images. The electronic device may input an original remote sensing image into the neural network to obtain an output result; if the similarity between the output result and the road information in the original remote sensing image is smaller than a preset similarity threshold, the network parameters are adjusted, the original remote sensing image is input into the neural network again, and the output result is compared with the road information once more. The network parameters are adjusted continuously until the similarity between the output result and the road information in the original remote sensing image is greater than the preset similarity threshold. Other original remote sensing images are then selected for training, until the similarity between the output results for a preset number of original remote sensing images and the road information in those images is greater than the preset similarity threshold, at which point the model has been trained. The electronic device may train an initial neural network, which may be an untrained neural network or a neural network whose training has not been completed; each layer of the initial neural network may be given initial parameters, and these parameters are continuously adjusted during training. The initial neural network may be any type of untrained or not-fully-trained artificial neural network, or a model obtained by combining several such networks. In this way, the electronic device can input the original remote sensing image at the input side of the initial neural network, process it sequentially through the parameters of each layer, and obtain the output at the output side of the initial neural network, the output being the road information in the original remote sensing image.
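A minimal sketch of the per-sample loop just described, assuming a generic model interface (forward and adjust_parameters) and a placeholder similarity measure; none of these names come from the patent, and no particular deep-learning framework is implied.

```python
# Per-sample training loop from the road-extraction example above: keep
# adjusting the network on one image until its output is similar enough to
# the labelled road information, then move to the next image. The model
# interface and similarity measure are stand-ins.
import random


class StubModel:
    """Stand-in for the initial neural network being trained."""

    def forward(self, image):
        return image  # placeholder inference

    def adjust_parameters(self, output, label):
        pass          # placeholder parameter update


def similarity(output, label):
    """Placeholder similarity in [0, 1]; in practice this could be, e.g.,
    an overlap measure between predicted and labelled road pixels."""
    return random.random()


def train_on_sample(model, image, road_label, threshold=0.9, max_steps=100):
    """Adjust the network on a single remote sensing image until the output's
    similarity to the road label exceeds the threshold (with a step budget
    so the sketch always terminates)."""
    for _ in range(max_steps):
        output = model.forward(image)
        if similarity(output, road_label) > threshold:
            return True   # sample learned; the caller moves on to the next image
        model.adjust_parameters(output, road_label)
    return False


print(train_on_sample(StubModel(), image=[0.1, 0.2], road_label=[0.1, 0.2]))
```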
In some optional implementations of this embodiment, obtaining the model based on the sample data set and the neural network training includes: for sample data in the sample data set, executing the following training steps: training the neural network with the sample data and adjusting the network parameters of the neural network based on the training result; determining whether a training completion condition has been satisfied; and, in response to determining that the condition is not satisfied and that no stop-training message sent by the user's terminal has been received, selecting other sample data from the sample data set to continue the training steps. Setting a training completion condition prevents the neural network from being trained in an infinite loop. The training completion condition may include, but is not limited to, at least one of: the number of times the neural network has been trained reaches a preset training-count threshold; the loss value between the outputs of the neural network in two adjacent trainings is smaller than a preset threshold. The training result is the output obtained after the sample data is input into the neural network. An optional training process is again illustrated with a neural network for identifying roads in remote sensing images and a sample data set of remote sensing images. The remote sensing image sample data set includes original remote sensing images and the road information in those images, and the road features may include the color, texture, height, temperature, shadow and direction change of a road, among others. Some original remote sensing images are selected from the remote sensing image sample data set and input into the neural network; the neural network extracts the road features of the input images; a loss value for the road feature extraction is determined at least from the difference between the extracted road features and the road information in the original images; and the network parameters of the neural network are adjusted according to the loss value. Before training starts, the kernel parameters of the neural network are initialized with a number of different small random numbers. "Small" ensures that the network does not enter a saturation state because the kernel parameter values are too large, which would cause training to fail; "different" ensures that the network can learn normally. In fact, if the same number were used for all initial kernel parameters, the network would be unable to learn. Errors are corrected by comparing the results obtained from network training with the true class labels, and the kernel parameters are continuously optimized by adjusting them to minimize the error.
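The following sketch puts the optional training steps together: kernel parameters initialised with different small random numbers, a loop over the sample data, the two completion conditions (training-count threshold and the loss between the outputs of two adjacent trainings), and an early exit when the user's terminal requests a stop. The constants, names and callable hooks are assumptions, not the patent's own implementation.

```python
# Training-step loop with the completion conditions described above. The
# thresholds, the parameter count and the hook functions are assumptions.
import random

MAX_TRAININGS = 1000          # preset training-count threshold
OUTPUT_LOSS_THRESHOLD = 1e-4  # preset threshold on the loss between adjacent outputs


def init_kernel_params(n):
    """Different small random numbers: small enough that the network does not
    saturate, different so that it can actually learn."""
    return [random.uniform(-0.05, 0.05) for _ in range(n)]


def output_loss(a, b):
    """Loss between two outputs, here a mean squared difference (assumption)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / max(len(a), 1)


def iter_samples(sample_set):
    """Cycle over the sample data set so training may run for many steps."""
    while True:
        for sample in sample_set:
            yield sample


def train(sample_set, forward, adjust_params, stop_requested):
    params = init_kernel_params(128)
    prev_output = None
    for count, sample in enumerate(iter_samples(sample_set), start=1):
        output = forward(params, sample)             # training result on this sample
        params = adjust_params(params, sample, output)  # adjust network parameters
        # Completion condition 1: training count reached the preset threshold.
        if count >= MAX_TRAININGS:
            break
        # Completion condition 2: the loss between the outputs of two adjacent
        # trainings is smaller than the preset threshold.
        if prev_output is not None and output_loss(prev_output, output) < OUTPUT_LOSS_THRESHOLD:
            break
        prev_output = output
        # Neither condition met: continue unless a stop-training message from
        # the user's terminal has been received.
        if stop_requested():
            break
    return params
```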
Optionally, progress information is output to the user's terminal during training, so that the user can consult it and decide whether to end the training process early. If the user wants to stop training, the terminal sends a stop-training message.
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the method for generating a model according to the present embodiment. In the application scenario of fig. 3, a user sends a model generation request including a user identifier to a server through a terminal 300, and the server finds a model information set matching the user identifier in a model table and sends the model information set to the terminal 300. The terminal 300 presents a set of model information for the user to enter a custom model name 301 and select a model category 302 and a network tier number 303. The terminal sends the user-entered custom model name 301, the selected model category 302, and the number of network layers 303 to the server. The server determines the neural network based on the model class 302 and the number of network layers 303. And then obtaining a model based on a sample data set uploaded by the terminal and neural network training.
According to the method provided by the embodiments of the application, a user uploads a customized sample data set, and a customized model can be generated without writing any code.
With further reference to FIG. 4, a flow 400 of yet another embodiment of a method for generating a model is shown. The process 400 of the method for generating a model includes the steps of:
step 401, in response to receiving a model generation request including a user identifier sent by a terminal of a user, searching a model information set corresponding to the user identifier from a preset model table, and sending the model information set to the terminal.
Step 402, responding to the received model category and model parameter which are sent by the terminal and selected by the user from the model information set, and determining the neural network matched with the model category and model parameter selected by the user.
And step 403, in response to receiving the sample data set sent by the terminal, training by using a machine learning method based on the sample data set and the neural network to obtain a model.
Steps 401-403 are substantially the same as steps 201-203 and are therefore not described again.
Step 404, a predetermined number of input sample data and a predetermined number of output sample data corresponding to the predetermined number of input sample data are acquired from the sample data set.
In this embodiment, after the model training is completed, the model may further be evaluated using the sample data set. For example, for supervised learning, the sample data in the sample data set includes input sample data and output sample data. If 10,000 pairs of input and output sample data were used to train the model, 100 of those pairs may be selected for evaluating it.
Step 405, for each piece of acquired input sample data, inputting the input sample data into the model to obtain an output result corresponding to the input sample data, and if the similarity between the output result and the output sample data corresponding to the input sample data is greater than a preset similarity threshold, accumulating the times of correct verification.
In this embodiment, in order to evaluate the quality of the generated model, part of the sample data may be selected from the sample data set and input into the model generated in step 203 to obtain output results. Each output result is compared with the corresponding output sample data to determine their similarity; common measures such as cosine similarity or Euclidean distance may be used. If the similarity between the output result and the output sample data corresponding to the input sample data is greater than a preset similarity threshold, the verification is considered to have passed. Verification is performed a predetermined number of times using the predetermined number of input sample data, and the verified-correct count is accumulated each time a verification passes.
In step 406, the ratio of the number of times the correctness is verified to a predetermined number is determined as the accuracy.
In this embodiment, step 405 is repeated a predetermined number of times, accumulating the number of times the verification is correct. The ratio of the number of times of verifying correctness to a predetermined number is taken as the accuracy. For example, 100 verifications are performed, and if the number of times of verifying correctness is 90 times, the accuracy is 90%.
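Steps 404-406 amount to a simple accuracy computation over held-back sample pairs. The sketch below uses cosine similarity, one of the measures mentioned above; the model interface, the threshold value and the demo data are assumptions.

```python
# Evaluation sketch for steps 404-406: run a predetermined number of sample
# pairs through the trained model, count a verification as correct when the
# similarity to the expected output exceeds the threshold, and report the
# ratio as the accuracy.
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def evaluate(model, sample_pairs, similarity_threshold=0.9):
    """sample_pairs: a predetermined number of (input_sample, output_sample)
    pairs drawn from the sample data set, e.g. 100 out of 10,000."""
    correct = 0
    for input_sample, output_sample in sample_pairs:
        output_result = model(input_sample)
        if cosine_similarity(output_result, output_sample) > similarity_threshold:
            correct += 1                    # accumulate the verified-correct count
    return correct / len(sample_pairs)      # accuracy, e.g. 90 / 100 = 90 %


if __name__ == "__main__":
    def identity_model(x):
        return x

    pairs = [([1.0, 0.0], [1.0, 0.0])] * 90 + [([1.0, 0.0], [0.0, 1.0])] * 10
    print(evaluate(identity_model, pairs))  # 0.9, i.e. an accuracy of 90 %
```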
Step 407, if the accuracy is greater than the predetermined accuracy threshold, converting the model into an application, and issuing the application to the target server for downloading and use by at least one terminal.
In this embodiment, if the accuracy is greater than the predetermined accuracy threshold, the model is considered to have passed the evaluation and can be made available to other users. A RESTful (REST, Representational State Transfer) architecture may be used to turn the model into an application and publish it to the target server through a container service. In the name "Representational State Transfer" the subject is omitted: the "representation" is in essence the representation of a "resource". A resource is an entity on the network, or a specific piece of information on the network; it is a conceptual entity exposed to clients. Examples of resources are application objects, database records, algorithms, and so on. A resource is addressed by a URI (Uniform Resource Identifier), and each resource corresponds to a particular URI. To obtain a resource it is sufficient to access its URI, so the URI becomes the address or unique identifier of the resource. All resources share a uniform interface through which state is transferred between the client and the server. The container service provides high-performance, scalable management of container applications, supports life-cycle management of containerized applications, offers multiple application release modes and continuous delivery capability, and supports micro-service architectures. The container service simplifies the construction of a container management cluster, integrates cloud virtualization, storage, network and security capabilities, and creates an optimal cloud environment for running containers.
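The publishing step can be illustrated with a small HTTP service. The patent only states that a RESTful architecture and a container service are used; the choice of Flask, the URI and the JSON payload below are assumptions for illustration, not part of the patented method.

```python
# One possible way to expose the trained model behind a REST interface, as
# step 407 describes. Framework, route and payload format are assumptions.
from flask import Flask, jsonify, request

app = Flask(__name__)


def load_model():
    """Placeholder for loading the trained, evaluation-passed model."""
    return lambda inputs: inputs  # identity stand-in


model = load_model()


@app.route("/models/custom-model/predict", methods=["POST"])
def predict():
    # Each published model corresponds to one URI (the "resource" in REST);
    # the client POSTs its input data and receives the model's output.
    inputs = request.get_json()["inputs"]
    return jsonify({"outputs": model(inputs)})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```

In a container-service setting, such a service would typically be packaged into an image and deployed to the target server, where downloading terminals call the published URI.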
Alternatively, the model obtained by unsupervised learning may be released directly without evaluation.
In some optional implementations of this embodiment, feedback data sent by a terminal that has downloaded the application for use is received and added to the sample data set, where the feedback data includes the actual input data entered into the application by that terminal, the actual output result corresponding to the actual input data, and the expected output result entered by the user of that terminal, and the similarity between the actual output result and the expected output result is less than the predetermined similarity threshold; the model is then retrained based on the actual input data, the actual output result and the expected output result, and the updated model is re-issued to the target server. The expected output result is the correct output result for the actual input data entered by the user of the model. When the user determines, either manually or by machine, that the actual output result corresponding to the actual input data does not match the expected output result, the expected output result, the actual input data and the corresponding actual output result are fed back to the server together. The server receives the feedback data, adds it to the sample data set, retrains the model with data-set version management, and supports continuous integration of the model and the service.
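A minimal sketch of this feedback path, assuming a dictionary-shaped feedback record and caller-supplied similarity and retraining hooks; the threshold value and all names are illustrative only.

```python
# Sketch of the feedback handling described above. A terminal that downloaded
# the application reports the actual input, the actual output it received and
# the expected output it wanted; when the two outputs are not similar enough,
# the triple is added to the sample data set and the model is retrained.
SIMILARITY_THRESHOLD = 0.9


def handle_feedback(sample_data_set, feedback, similarity, retrain_and_republish):
    """feedback: dict with 'actual_input', 'actual_output' and 'expected_output'."""
    if similarity(feedback["actual_output"], feedback["expected_output"]) >= SIMILARITY_THRESHOLD:
        return  # the output was already acceptable; nothing to learn from this report
    # Add the corrected pair (input -> expected output) to the sample data set.
    sample_data_set.append((feedback["actual_input"], feedback["expected_output"]))
    # Retrain on the enlarged set and re-issue the updated model to the target server.
    retrain_and_republish(sample_data_set)
```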
And step 408, if the accuracy is less than or equal to the preset accuracy threshold, determining recommended model information from the model information set according to the accuracy and the model type and the model parameters selected by the user, and sending the recommended model information to the terminal so that the user can reselect the model type and the model parameters.
In this embodiment, if the accuracy is less than or equal to the predetermined accuracy threshold, the model has failed the evaluation, so the user is advised to retrain it. Adjusted model information may be determined as the recommended model information (for example, whether the model parameters or the model category needs to be adjusted) based on the difference between the accuracy and the predetermined accuracy threshold, and the amount of adjustment may be determined according to the characteristics of different model categories. For example, if the predetermined accuracy threshold is 90% and the measured accuracy is 89% with the current number of network layers at 3, it may be recommended to adjust the number of network layers to 5. If the measured accuracy is only 20% with 3 network layers, changing the number of layers alone is unlikely to improve the accuracy much, so it may be recommended to switch the model category to another network, for example to change the feedforward neural network to a convolutional neural network.
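This rule of thumb can be written as a small decision function. The 0.05 cut-off between a "small" and a "large" shortfall and the concrete suggestions are assumptions; the text only gives the two examples reproduced in the comments.

```python
# Illustration of the recommendation rule above: a small accuracy shortfall
# suggests adjusting a parameter (more layers), a large shortfall suggests a
# different model category. Cut-off and suggestions are assumptions.
def recommend(accuracy, accuracy_threshold, category, num_layers):
    shortfall = accuracy_threshold - accuracy
    if shortfall <= 0:
        return None  # evaluation passed; no recommendation needed
    if shortfall < 0.05:
        # e.g. threshold 90 %, measured 89 %: keep the category, add layers.
        return {"category": category, "layers": num_layers + 2}
    # e.g. threshold 90 %, measured 20 %: changing the layer count alone is
    # unlikely to help, so recommend another model category.
    if category == "feedforward":
        return {"category": "convolutional", "layers": num_layers}
    return {"category": "recurrent", "layers": num_layers}


print(recommend(0.89, 0.90, "convolutional", 3))  # {'category': 'convolutional', 'layers': 5}
print(recommend(0.20, 0.90, "feedforward", 3))    # {'category': 'convolutional', 'layers': 3}
```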
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the flow 400 of the method for generating a model in the present embodiment highlights the step of expanding the sample data. Therefore, more sample data can be introduced by the scheme described by the embodiment, so that the speed and the accuracy of model generation are improved.
With further reference to fig. 5, as an implementation of the method shown in the above figures, the present application provides an embodiment of an apparatus for generating a model, which corresponds to the embodiment of the method shown in fig. 2, and which can be applied in various electronic devices.
As shown in fig. 5, the apparatus 500 for generating a model of this embodiment includes: a receiving unit 501, a determining unit 502 and a training unit 503. The receiving unit 501 is configured to, in response to receiving a model generation request sent by a user's terminal that includes a user identifier, look up the model information set corresponding to the user identifier in a preset model table and send the model information set to the terminal, where the model information in the model information set includes a model category and model parameters, and the model table represents the correspondence between user identifiers and model information; the determining unit 502 is configured to, in response to receiving the model category and model parameters sent by the terminal and selected by the user from the model information set, determine a neural network matching the selected model category and model parameters; and the training unit 503 is configured to, in response to receiving the sample data set sent by the terminal, train a model with a machine learning method based on the sample data set and the neural network.
In this embodiment, the specific processes of the receiving unit 501, the determining unit 502 and the training unit 503 of the apparatus 500 for generating a model may refer to step 201, step 202 and step 203 in the corresponding embodiment of fig. 2.
In some optional implementations of this embodiment, the sample data in the sample data set includes input sample data and output sample data; and the apparatus 500 further comprises a verification unit configured to: acquiring a predetermined number of input sample data and a predetermined number of output sample data corresponding to the predetermined number of input sample data from the sample data set; inputting the input sample data into the model for each piece of acquired input sample data to obtain an output result corresponding to the input sample data, and accumulating the times of correct verification if the similarity between the output result and the output sample data corresponding to the input sample data is greater than a preset similarity threshold; the ratio of the number of times the correctness was verified to a predetermined number is determined as the accuracy.
In some optional implementations of this embodiment, the apparatus 500 further includes a publishing unit configured to: and if the accuracy is greater than the preset accuracy threshold, converting the model into an application, and issuing the application to a target server for downloading and using by at least one terminal.
In some optional implementations of this embodiment, the apparatus 500 further includes a feedback unit configured to: receiving feedback data sent by a terminal downloading an application and adding the feedback data to a sample data set, wherein the feedback data comprises actual input data input into the application by the terminal downloading the application, an actual output result corresponding to the actual input data, and an expected output result input by a user of the terminal downloading the application, and the similarity between the actual output result and the expected output result is less than a preset similarity threshold; and retraining the model based on the actual input data, the actual output result and the expected output result, and re-issuing the updated model to the target server.
In some optional implementations of this embodiment, the apparatus 500 further includes a recommending unit configured to: and if the accuracy is less than or equal to a preset accuracy threshold, determining recommended model information from the model information set according to the accuracy, the model type and the model parameters selected by the user, and sending the recommended model information to the terminal so that the user can reselect the model type and the model parameters.
In some optional implementations of this embodiment, the training unit 503 is further configured to: select sample data from the sample data set and execute the following training steps: training the neural network with the sample data; adjusting network parameters of the neural network based on the training result; determining whether a training completion condition has been satisfied; and, in response to determining that the training completion condition is not satisfied and that no stop-training message sent by the user's terminal has been received, selecting other sample data from the sample data set to continue the training steps; wherein the training completion condition includes at least one of: the number of times the neural network has been trained reaches a preset training-count threshold; the loss value between the outputs of the neural network in two adjacent trainings is smaller than a preset threshold.
Referring now to FIG. 6, shown is a block diagram of a computer system 600 suitable for use in implementing a server according to embodiments of the present application. The server shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU) 601 that can perform various appropriate actions and processes according to a program stored in a Read-Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the system 600 are also stored. The CPU 601, the ROM 602 and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. The driver 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is mounted in the storage section 608 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program performs the above-described functions defined in the method of the present application when executed by a Central Processing Unit (CPU) 601. It should be noted that the computer readable medium described herein can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor includes a receiving unit, a determining unit, and a training unit. The names of the units do not form a limitation on the units themselves in some cases, for example, the receiving unit may also be described as a "unit which, in response to receiving a model generation request including a user identifier sent by a terminal of a user, looks up a model information set corresponding to the user identifier from a preset model table, and sends the model information set to the terminal".
As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments; or may be present separately and not assembled into the device. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: in response to a received model generation request which is sent by a user terminal and comprises a user identifier, searching a model information set corresponding to the user identifier from a preset model table, and sending the model information set to the terminal, wherein the model information in the model information set comprises a model category and a model parameter, and the model table is used for representing the corresponding relation between the user identifier and the model information; in response to receiving a model category and a model parameter which are sent by a terminal and selected by a user from a model information set, determining a neural network matched with the model category and the model parameter selected by the user; and responding to a received sample data set sent by the terminal, and training by utilizing a machine learning method based on the sample data set and the neural network to obtain a model.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by a person skilled in the art that the scope of the invention as referred to in the present application is not limited to the embodiments with a specific combination of the above-mentioned features, but also covers other embodiments with any combination of the above-mentioned features or their equivalents without departing from the inventive concept. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (14)

1. A method for generating a model, comprising:
in response to a model generation request including a user identifier sent by a terminal of a user, searching a model information set corresponding to the user identifier from a preset model table, and sending the model information set to the terminal, wherein the model information in the model information set includes a model category and a model parameter, the model table is used for representing a corresponding relation between the user identifier and the model information, and the model parameter includes at least one of the following: network layer number, kernel function type, error precision and learning rate;
in response to receiving the model category and the model parameter which are sent by the terminal and selected by the user from the model information set, determining a neural network matched with the model category and the model parameter selected by the user;
and responding to the received sample data set sent by the terminal, and training by utilizing a machine learning method based on the sample data set and the neural network to obtain a model.
2. The method of claim 1, sample data in the set of sample data comprising input sample data and output sample data; and
the method further comprises the following steps:
acquiring a predetermined number of input sample data and the predetermined number of output sample data corresponding to the predetermined number of input sample data from the sample data set;
inputting the input sample data into the model for each piece of acquired input sample data to obtain an output result corresponding to the input sample data, and accumulating the times of correct verification if the similarity between the output result and the output sample data corresponding to the input sample data is greater than a preset similarity threshold;
determining a ratio of the number of times the verification is correct to the predetermined number as an accuracy.
3. The method of claim 2, wherein the method further comprises:
and if the accuracy is greater than a preset accuracy threshold, converting the model into an application, and issuing the application to a target server for downloading and using by at least one terminal.
4. The method of claim 3, wherein the method further comprises:
receiving feedback data sent by a terminal downloading the application and adding the feedback data to the sample data set, wherein the feedback data comprises actual input data input into the application by the terminal downloading the application, an actual output result corresponding to the actual input data, and an expected output result input by a user of the terminal downloading the application, and the similarity between the actual output result and the expected output result is smaller than the preset similarity threshold;
retraining the model based on the actual input data, the actual output result, and the expected output result, and reissuing the updated model to the target server.
5. The method of claim 2, wherein the method further comprises:
and if the accuracy is less than or equal to a preset accuracy threshold, determining recommended model information from the model information set according to the accuracy and the model type and model parameters selected by the user, and sending the recommended model information to the terminal so that the user can reselect the model type and the model parameters.
6. The method according to one of claims 1-5, wherein said training a model based on said set of sample data and said neural network comprises:
selecting sample data from the sample data set and executing the following training steps: training the neural network by using the selected sample data; adjusting network parameters of the neural network based on a training result; determining whether a training completion condition is satisfied;
in response to determining that the training completion condition is not satisfied and that no stop-training message sent by the terminal of the user has been received, selecting other sample data from the sample data set to continue executing the training steps;
wherein the training completion condition includes at least one of:
the number of times the neural network has been trained reaches a preset training count threshold;
a loss value between outputs of the neural network in two adjacent training iterations is smaller than a preset threshold.
7. An apparatus for generating a model, comprising:
a receiving unit configured to, in response to a model generation request including a user identifier and sent by a terminal of a user, search a model information set corresponding to the user identifier from a preset model table and send the model information set to the terminal, wherein model information in the model information set includes a model category and a model parameter, the model table is used to represent a correspondence between the user identifier and the model information, and the model parameter includes at least one of the following: a number of network layers, a kernel function type, an error precision, and a learning rate;
a determining unit configured to determine, in response to receiving the model category and the model parameter which are sent by the terminal and selected by the user from the model information set, a neural network matched with the model category and the model parameter selected by the user;
a training unit configured to, in response to receiving a sample data set sent by the terminal, train a model by using a machine learning method based on the sample data set and the neural network.
8. The apparatus of claim 7, wherein sample data in the sample data set comprises input sample data and output sample data; and
the apparatus further comprises a verification unit configured to:
acquire, from the sample data set, a predetermined number of pieces of input sample data and the output sample data corresponding to each acquired piece of input sample data;
for each acquired piece of input sample data, input the input sample data into the model to obtain an output result corresponding to the input sample data, and increment a count of correct verifications if a similarity between the output result and the output sample data corresponding to the input sample data is greater than a preset similarity threshold;
determine a ratio of the count of correct verifications to the predetermined number as an accuracy.
9. The apparatus of claim 8, wherein the apparatus further comprises an issuing unit configured to:
if the accuracy is greater than a preset accuracy threshold, convert the model into an application and publish the application to a target server for at least one terminal to download and use.
10. The apparatus of claim 8, wherein the apparatus further comprises a feedback unit configured to:
receive feedback data sent by a terminal that has downloaded an application and add the feedback data to the sample data set, wherein the feedback data comprises actual input data input into the application by the terminal that has downloaded the application, an actual output result corresponding to the actual input data, and an expected output result input by a user of the terminal that has downloaded the application, and a similarity between the actual output result and the expected output result is smaller than the preset similarity threshold;
retrain the model based on the actual input data, the actual output result, and the expected output result, and republish the updated model to a target server.
11. The apparatus of claim 8, wherein the apparatus further comprises a recommending unit configured to:
if the accuracy is less than or equal to a preset accuracy threshold, determine recommended model information from the model information set according to the accuracy and the model category and model parameter selected by the user, and send the recommended model information to the terminal so that the user can reselect a model category and a model parameter.
12. The apparatus according to one of claims 7-11, wherein the training unit is further configured to:
selecting sample data from the sample data set and executing the following training steps: training the neural network by using the selected sample data; adjusting network parameters of the neural network based on a training result; determining whether a training completion condition is satisfied;
in response to determining that the training completion condition is not satisfied and that no stop-training message sent by the terminal of the user has been received, selecting other sample data from the sample data set to continue executing the training steps;
wherein the training completion condition includes at least one of:
the number of times the neural network has been trained reaches a preset training count threshold;
a loss value between outputs of the neural network in two adjacent training iterations is smaller than a preset threshold.
13. A server, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-6.
14. A computer-readable storage medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-6.
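Claim 1 recites a lookup-and-match flow: the server retrieves a model information set by user identifier, the user picks a model category and model parameters, and the server selects a matching neural network before training. The following is a minimal Python sketch of that flow only; the names MODEL_TABLE, handle_model_generation_request, and match_network are hypothetical, and the matched "network" is reduced to a configuration dictionary rather than a real neural network. It is not the patented implementation.

# Illustrative sketch of the claim 1 flow; names and data are hypothetical.
MODEL_TABLE = {
    "user-001": [
        {"category": "image_classification",
         "parameters": {"num_layers": 5, "kernel": "relu",
                        "error_tolerance": 1e-3, "learning_rate": 0.01}},
        {"category": "text_classification",
         "parameters": {"num_layers": 3, "kernel": "tanh",
                        "error_tolerance": 1e-2, "learning_rate": 0.05}},
    ],
}

def handle_model_generation_request(user_id):
    """Return the model information set recorded for a user identifier."""
    return MODEL_TABLE.get(user_id, [])

def match_network(category, parameters):
    """Stand-in for selecting a neural network matching the user's choice."""
    return {"category": category, **parameters, "weights": None}

# The terminal picks one entry from the information set it was sent.
infos = handle_model_generation_request("user-001")
selected = infos[0]
network = match_network(selected["category"], selected["parameters"])
print(network["category"], network["num_layers"])

Training the matched network on the uploaded sample data set is sketched separately for claim 6.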
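The accuracy check of claim 2 amounts to counting, over a predetermined number of (input, expected output) pairs, how often the model output is sufficiently similar to the expected output, and dividing by that predetermined number. A possible sketch, assuming caller-supplied model and similarity callables and a sample set with at least predetermined pairs; the toy usage at the end is purely illustrative.

# Illustrative sketch of the claim 2 verification step; interfaces are assumed.
def verify_accuracy(model, samples, similarity, similarity_threshold, predetermined):
    """Ratio of correct verifications over a predetermined number of samples."""
    correct = 0
    for input_data, expected in samples[:predetermined]:
        output = model(input_data)
        if similarity(output, expected) > similarity_threshold:
            correct += 1                      # accumulate correct verifications
    return correct / predetermined            # ratio to the predetermined number

# Toy usage: a 'model' that doubles its input, compared by closeness.
toy_model = lambda x: 2 * x
closeness = lambda a, b: 1.0 / (1.0 + abs(a - b))
data = [(i, 2 * i) for i in range(10)] + [(10, 35)]
print(verify_accuracy(toy_model, data, closeness, 0.9, 10))  # 1.0 on the first 10 pairs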
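Claims 3 and 5 branch on the verified accuracy: publish the model as an application when the accuracy exceeds the threshold, otherwise send recommended model information so the user can reselect. The sketch below assumes publish and recommend callables supplied by the caller, and the recommendation heuristic (pick the configuration with the most layers) is only a placeholder, since the claims do not fix one.

# Illustrative sketch of the accuracy branch in claims 3 and 5; all names are assumed.
def handle_verified_model(model, model_infos, accuracy, accuracy_threshold,
                          publish, recommend):
    """Publish when accurate enough, otherwise recommend another configuration."""
    if accuracy > accuracy_threshold:
        application = {"model": model}    # wrap the trained model as an application
        publish(application)              # release to the target server for download
        return "published"
    # Placeholder recommendation: the entry with the most network layers.
    recommendation = max(model_infos,
                         key=lambda info: info["parameters"]["num_layers"])
    recommend(recommendation)             # let the user reselect category/parameters
    return "recommended"

infos = [{"category": "cnn", "parameters": {"num_layers": 5}},
         {"category": "mlp", "parameters": {"num_layers": 3}}]
print(handle_verified_model("toy-model", infos, 0.62, 0.9,
                            publish=print, recommend=print))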
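Claim 4 folds terminal feedback (actual input, actual output, expected output) back into the sample data set before retraining. A sketch under the assumption that feedback items are dictionaries with those three fields and that a similarity function is supplied by the caller; retraining itself would reuse a routine like the one sketched for claim 6, followed by republishing the updated model.

# Illustrative sketch of the claim 4 feedback step; field names are hypothetical.
def absorb_feedback(sample_set, feedback_items, similarity, similarity_threshold):
    """Append divergent feedback to the sample set so the model can be retrained."""
    for item in feedback_items:
        actual, expected = item["actual_output"], item["expected_output"]
        # Claim 4 only concerns feedback whose actual output differs from the
        # expected output; the check is kept here as a defensive filter.
        if similarity(actual, expected) < similarity_threshold:
            sample_set.append((item["actual_input"], expected))
    return sample_set

samples = [(1, 2), (2, 4)]
feedback = [{"actual_input": 3, "actual_output": 7, "expected_output": 6}]
closeness = lambda a, b: 1.0 / (1.0 + abs(a - b))
print(absorb_feedback(samples, feedback, closeness, 0.9))
# [(1, 2), (2, 4), (3, 6)] -- the divergent case joins the training data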
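The training loop of claim 6 repeats train / adjust / check-completion steps until a preset training count is reached, the loss between two adjacent iterations barely changes, or the terminal asks to stop. In the toy sketch below a single-weight linear model y = weight * x stands in for the neural network, stop_requested models the stop-training message, and all thresholds are illustrative assumptions rather than values from the patent.

# Illustrative sketch of the claim 6 loop; the model and thresholds are toy choices.
import random

def train(weight, sample_set, learning_rate=0.001, max_iterations=1000,
          loss_delta=1e-9, stop_requested=lambda: False):
    """Train a toy one-weight model until a completion condition is met."""
    previous_loss = None
    for _ in range(max_iterations):                       # preset training count threshold
        x, y = random.choice(sample_set)                  # select sample data
        prediction = weight * x
        loss = (prediction - y) ** 2                      # training result for this step
        weight -= learning_rate * 2 * (prediction - y) * x  # adjust the parameter
        if previous_loss is not None and abs(previous_loss - loss) < loss_delta:
            break                                         # adjacent-iteration loss change is tiny
        if stop_requested():
            break                                         # stop-training message from the terminal
        previous_loss = loss
    return weight

samples = [(x, 3.0 * x) for x in range(1, 20)]
print(round(train(0.0, samples), 2))                      # converges towards 3.0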
CN201711157646.0A 2017-11-20 2017-11-20 Method and apparatus for generating a model Active CN107766940B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711157646.0A CN107766940B (en) 2017-11-20 2017-11-20 Method and apparatus for generating a model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711157646.0A CN107766940B (en) 2017-11-20 2017-11-20 Method and apparatus for generating a model

Publications (2)

Publication Number Publication Date
CN107766940A CN107766940A (en) 2018-03-06
CN107766940B true CN107766940B (en) 2021-07-23

Family

ID=61280122

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711157646.0A Active CN107766940B (en) 2017-11-20 2017-11-20 Method and apparatus for generating a model

Country Status (1)

Country Link
CN (1) CN107766940B (en)

Families Citing this family (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109643229B (en) * 2018-04-17 2022-10-04 深圳鲲云信息科技有限公司 Application development method and platform of network model and computer readable storage medium
CN110163380B (en) * 2018-04-28 2023-07-07 腾讯科技(深圳)有限公司 Data analysis method, model training method, device, equipment and storage medium
US11263540B2 (en) * 2018-05-07 2022-03-01 Apple Inc. Model selection interface
CN108664610B (en) * 2018-05-11 2020-11-03 京东数字科技控股有限公司 Method and apparatus for processing data
CN110569313B (en) * 2018-05-17 2023-12-05 北京京东尚科信息技术有限公司 Model table level judging method and device of data warehouse
CN108681426B (en) * 2018-05-25 2020-08-11 第四范式(北京)技术有限公司 Method and system for performing feature processing on data
CN110543946B (en) * 2018-05-29 2022-07-05 百度在线网络技术(北京)有限公司 Method and apparatus for training a model
CN108764487B (en) 2018-05-29 2022-07-08 北京百度网讯科技有限公司 Method and device for generating model, method and device for identifying information
CN108829518B (en) * 2018-05-31 2020-01-03 北京百度网讯科技有限公司 Method and device for pushing information
CN108847066A (en) * 2018-05-31 2018-11-20 上海与德科技有限公司 A kind of content of courses reminding method, device, server and storage medium
CN108764369B (en) * 2018-06-07 2021-10-22 深圳市公安局公交分局 Figure identification method and device based on data fusion and computer storage medium
CN109086789A (en) * 2018-06-08 2018-12-25 四川斐讯信息技术有限公司 A kind of image-recognizing method and system
CN108805091B (en) * 2018-06-15 2021-08-10 北京字节跳动网络技术有限公司 Method and apparatus for generating a model
CN108960316B (en) * 2018-06-27 2020-10-30 北京字节跳动网络技术有限公司 Method and apparatus for generating a model
CN108984683B (en) * 2018-06-29 2021-06-25 北京百度网讯科技有限公司 Method, system, equipment and storage medium for extracting structured data
CN109165249B (en) 2018-08-07 2020-08-04 阿里巴巴集团控股有限公司 Data processing model construction method and device, server and user side
CN108985386A (en) * 2018-08-07 2018-12-11 北京旷视科技有限公司 Obtain method, image processing method and the corresponding intrument of image processing model
CN109272116A (en) * 2018-09-05 2019-01-25 郑州云海信息技术有限公司 A kind of method and device of deep learning
CN109242025B (en) * 2018-09-14 2021-05-04 北京旷视科技有限公司 Model iteration correction method, device and system
CN109344885A (en) * 2018-09-14 2019-02-15 深圳增强现实技术有限公司 Deep learning identifying system, method and electronic equipment
CN109635833A (en) * 2018-10-30 2019-04-16 银河水滴科技(北京)有限公司 A kind of image-recognizing method and system based on cloud platform and model intelligent recommendation
CN109376267B (en) * 2018-10-30 2020-11-13 北京字节跳动网络技术有限公司 Method and apparatus for generating a model
CN109492771A (en) * 2018-11-12 2019-03-19 北京百度网讯科技有限公司 Exchange method, device and system
CN109492698B (en) * 2018-11-20 2022-11-18 腾讯科技(深圳)有限公司 Model training method, object detection method and related device
CN109711436A (en) * 2018-12-05 2019-05-03 量子云未来(北京)信息科技有限公司 A kind of artificial intelligence training pattern construction method, device and storage medium
CN111291882A (en) * 2018-12-06 2020-06-16 北京百度网讯科技有限公司 Model conversion method, device, equipment and computer storage medium
CN109710819A (en) * 2018-12-29 2019-05-03 北京航天数据股份有限公司 A kind of model display method, apparatus, equipment and medium
CN109816114B (en) * 2018-12-29 2021-12-28 大唐软件技术股份有限公司 Method and device for generating machine learning model
CN113243014A (en) * 2019-01-31 2021-08-10 西门子股份公司 Method and device for establishing communication model of network equipment
CN111797869A (en) * 2019-04-09 2020-10-20 Oppo广东移动通信有限公司 Model training method and device, storage medium and electronic equipment
CN112068854B (en) * 2019-06-10 2023-09-01 杭州海康威视数字技术股份有限公司 Intelligent device algorithm updating system, intelligent device and platform server
CN110288089B (en) * 2019-06-28 2021-07-09 北京百度网讯科技有限公司 Method and apparatus for transmitting information
CN110532344A (en) * 2019-08-06 2019-12-03 北京如优教育科技有限公司 Automatic Selected Topic System based on deep neural network model
CN111024708B (en) * 2019-09-06 2022-02-22 腾讯科技(深圳)有限公司 Method, device, system and equipment for processing product defect detection data
CN111078984B (en) * 2019-11-05 2024-02-06 深圳奇迹智慧网络有限公司 Network model issuing method, device, computer equipment and storage medium
CN110837620A (en) * 2019-11-14 2020-02-25 帝国理工创新有限公司 Advanced online database system for publishing and running model and hosting data
CN110990870A (en) * 2019-11-29 2020-04-10 上海能塔智能科技有限公司 Operation and maintenance, processing method, device, equipment and medium using model library
CN111199287A (en) * 2019-12-16 2020-05-26 北京淇瑀信息科技有限公司 Feature engineering real-time recommendation method and device and electronic equipment
EP4062587A1 (en) * 2020-01-03 2022-09-28 Huawei Technologies Co., Ltd. Network entity for determining a model for digitally analyzing input data
CN111242317B (en) * 2020-01-09 2023-11-24 深圳供电局有限公司 Method, device, computer equipment and storage medium for managing application
CN111428869A (en) * 2020-03-19 2020-07-17 北京源清慧虹信息科技有限公司 Model generation method and device, computer equipment and storage medium
CN113469364B (en) * 2020-03-31 2023-10-13 杭州海康威视数字技术股份有限公司 Reasoning platform, method and device
CN113747462A (en) * 2020-05-30 2021-12-03 华为技术有限公司 Information processing method and related equipment
CN114372569A (en) * 2020-10-14 2022-04-19 新智数字科技有限公司 Data measurement method, data measurement device, electronic equipment and computer readable medium
WO2022082516A1 (en) * 2020-10-21 2022-04-28 华为技术有限公司 Data transmission method and communication apparatus
CN112381000A (en) * 2020-11-16 2021-02-19 深圳前海微众银行股份有限公司 Face recognition method, device, equipment and storage medium based on federal learning
CN112650528A (en) * 2020-12-31 2021-04-13 新奥数能科技有限公司 Personalized algorithm generation method and device, electronic equipment and computer readable medium
WO2022174362A1 (en) * 2021-02-17 2022-08-25 Huawei Technologies Co., Ltd. Entities and methods for trained data model selection in 5g mobile networks
CN113241056B (en) * 2021-04-26 2024-03-15 标贝(青岛)科技有限公司 Training and speech synthesis method, device, system and medium for speech synthesis model
CN113408634B (en) * 2021-06-29 2022-07-05 深圳市商汤科技有限公司 Model recommendation method and device, equipment and computer storage medium
CN113554401A (en) * 2021-08-05 2021-10-26 杭州拼便宜网络科技有限公司 Inventory data management method, device, equipment and storage medium
CN115001559B (en) * 2022-03-17 2023-04-18 中国科学院计算技术研究所 User terminal distribution model construction method suitable for satellite network
WO2023212926A1 (en) * 2022-05-06 2023-11-09 Oppo广东移动通信有限公司 Communication methods, and devices

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101782976A (en) * 2010-01-15 2010-07-21 南京邮电大学 Automatic selection method for machine learning in cloud computing environment
CN106250986A (en) * 2015-06-04 2016-12-21 波音公司 Advanced analysis base frame for machine learning
CN108604222A (en) * 2015-12-28 2018-09-28 云脑科技有限公司 System and method for deployment customized machine learning service
CN105912500A (en) * 2016-03-30 2016-08-31 百度在线网络技术(北京)有限公司 Machine learning model generation method and machine learning model generation device
CN106230792A (en) * 2016-07-21 2016-12-14 北京百度网讯科技有限公司 Machine learning method based on mobile office, terminal unit and system
CN106779087A (en) * 2016-11-30 2017-05-31 福建亿榕信息技术有限公司 A kind of general-purpose machinery learning data analysis platform
CN108229686A (en) * 2016-12-14 2018-06-29 阿里巴巴集团控股有限公司 Model training, Forecasting Methodology, device, electronic equipment and machine learning platform
CN109656529A (en) * 2018-10-31 2019-04-19 北京大学 A kind of on-line customization method and system for client deep learning

Also Published As

Publication number Publication date
CN107766940A (en) 2018-03-06

Similar Documents

Publication Publication Date Title
CN107766940B (en) Method and apparatus for generating a model
CN108520220B (en) Model generation method and device
CN108898186B (en) Method and device for extracting image
CN110288049B (en) Method and apparatus for generating image recognition model
CN108830235B (en) Method and apparatus for generating information
CN108121800B (en) Information generation method and device based on artificial intelligence
CN109564575A (en) Classified using machine learning model to image
CN108520470B (en) Method and apparatus for generating user attribute information
CN108197652B (en) Method and apparatus for generating information
CN107609506B (en) Method and apparatus for generating image
CN109145828B (en) Method and apparatus for generating video category detection model
CN111523413B (en) Method and device for generating face image
US10474926B1 (en) Generating artificial intelligence image processing services
CN111159220B (en) Method and apparatus for outputting structured query statement
CN110009059B (en) Method and apparatus for generating a model
CN111539903B (en) Method and device for training face image synthesis model
CN112650841A (en) Information processing method and device and electronic equipment
CN111311480B (en) Image fusion method and device
US20230177089A1 (en) Identifying similar content in a multi-item embedding space
US20230368028A1 (en) Automated machine learning pre-trained model selector
CN112069412A (en) Information recommendation method and device, computer equipment and storage medium
CN109840072B (en) Information processing method and device
CN109034085B (en) Method and apparatus for generating information
CN110879865B (en) Recommendation method and device for nuclear products
CN109584012B (en) Method and device for generating item push information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant