CN112241452A - Model training method and device, electronic equipment and storage medium - Google Patents
Model training method and device, electronic equipment and storage medium
- Publication number
- CN112241452A CN112241452A CN202011110803.4A CN202011110803A CN112241452A CN 112241452 A CN112241452 A CN 112241452A CN 202011110803 A CN202011110803 A CN 202011110803A CN 112241452 A CN112241452 A CN 112241452A
- Authority
- CN
- China
- Prior art keywords
- hyponym
- training sample
- clean data
- data
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F16/355 — Information retrieval of unstructured textual data; class or cluster creation or modification
- G06F16/367 — Creation of semantic tools; ontology
- G06F18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/23213 — Non-hierarchical clustering using statistics or function optimisation, with a fixed number of clusters, e.g. K-means clustering
- G06F18/24 — Classification techniques
- G06F40/284 — Natural language analysis; lexical analysis, e.g. tokenisation or collocates
- G06V2201/07 — Indexing scheme relating to image or video recognition; target detection
Abstract
The application discloses a model training method and device, electronic equipment and a storage medium, relating to the technical field of deep learning. The specific scheme is as follows: a task category label input by a user is received; at least one training sample corresponding to the task category label is generated based on that label; one training sample is extracted from all training samples corresponding to the task category label as the current training sample; in response to the model to be trained not yet satisfying a preset convergence condition, the current training sample is input into the model to be trained and used to train it; and the operation of extracting a current training sample is repeated until the model to be trained satisfies the preset convergence condition. According to the embodiments of the application, model training can be performed without acquiring labeled training samples in advance, greatly reducing the manpower consumption and monetary cost of manual labeling.
Description
Technical Field
The application relates to the field of artificial intelligence, further to the technical field of deep learning, and in particular to a model training method and device, electronic equipment and a storage medium.
Background
With the advent of the big data era, data acquisition has become relatively easy, but data used for training usually must be manually screened and labeled beforehand. A large amount of training data therefore means a large expenditure of manpower, time and money on data labeling, which greatly limits the training speed of artificial intelligence models and, in turn, the model iteration speed and the time at which a model can go online.
In the prior art, a supervised training method is generally adopted. Taking an image classification task as an example, a suitable amount of training data is labeled manually, features are extracted from the training data with a traditional feature-extraction operator or a deep learning network, a classifier performs classification prediction on the extracted features, and the manual labels serve as the expected output during training. This approach depends on a large amount of manually labeled data and therefore consumes considerable manpower, money and time on labeling work. Under an urgent task, a large amount of labeled data may simply be unavailable, so the model cannot meet the performance requirements in the short term and cannot go online; if the training task is meant to control an extremely important risk, a risk that leaks through may cause irreparable consequences for the company. The existing model training methods are therefore not only inefficient but can also lead to incalculable losses.
Disclosure of Invention
The application provides a model training method, a model training device, electronic equipment and a storage medium with which model training can be performed without acquiring labeled training samples in advance, greatly reducing the manpower consumption and monetary cost of manual labeling.
In a first aspect, the present application provides a model training method, including:
receiving a task category label input by a user;
generating at least one training sample corresponding to the task category label based on the task category label;
extracting one training sample from all training samples corresponding to the task category label as a current training sample;
in response to the model to be trained not satisfying a preset convergence condition, inputting the current training sample into the model to be trained and training the model with it; and repeating the operation of extracting a current training sample until the model to be trained satisfies the preset convergence condition.
In a second aspect, the present application provides a model training apparatus, the apparatus comprising a receiving module, a generating module, an extracting module and a training module, wherein:
the receiving module is used for receiving the task category labels input by the user;
the generating module is used for generating at least one training sample corresponding to the task category label based on the task category label;
the extraction module is used for extracting one training sample from all the training samples corresponding to the task category labels as a current training sample;
the training module is used for, in response to the model to be trained not satisfying the preset convergence condition, inputting the current training sample into the model to be trained and training the model with it, and for repeating the operation of extracting a current training sample until the model to be trained satisfies the preset convergence condition.
In a third aspect, an embodiment of the present application provides an electronic device, including:
one or more processors;
a memory for storing one or more programs,
which, when executed by the one or more processors, cause the one or more processors to implement the model training method of any of the embodiments of the present application.
In a fourth aspect, the present application provides a storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the model training method according to any embodiment of the present application.
The technical solution of the application solves the technical problem that prior-art model training depends on a large amount of manually labeled data and thus consumes substantial manpower, money and time on data labeling work: model training can be performed without acquiring labeled training samples in advance, greatly reducing the manpower consumption and monetary cost of manual labeling.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a first flowchart of a model training method according to an embodiment of the present disclosure;
FIG. 2 is a second flowchart of a model training method provided by an embodiment of the present application;
FIG. 3 is a third flowchart of a model training method according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a model training apparatus provided in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a generating module provided in an embodiment of the present application;
FIG. 6 is a block diagram of an electronic device for implementing a model training method according to an embodiment of the present application.
Detailed Description
The following description of exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments to aid understanding, and these details are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present application. Descriptions of well-known functions and constructions are likewise omitted in the following for clarity and conciseness.
Example one
Fig. 1 is a first flowchart of a model training method provided in an embodiment of the present application, where the method may be performed by a model training apparatus or an electronic device, where the apparatus or the electronic device may be implemented by software and/or hardware, and the apparatus or the electronic device may be integrated in any intelligent device with a network communication function. As shown in fig. 1, the model training method may include the steps of:
S101, acquiring a task category label input by a user.
In this step, the electronic device may obtain a task category tag input by the user. Specifically, the task category label input by the user is in a text format. For example, the task type label entered by the user is "dog".
S102, generating at least one training sample corresponding to the task category label based on the task category label.
In this step, the electronic device may generate at least one training sample corresponding to the task category label. Specifically, the electronic device may generate at least one hyponym of the task category label based on a pre-constructed knowledge graph; then, using each hyponym as a keyword, capture at least one image corresponding to that hyponym; and combine each hyponym with each of its corresponding images to form training samples, obtaining at least one training sample for the task category label. For example, assuming the task category label input by the user is "dog", the electronic device may generate the hyponyms "Husky" and "Shiba Inu" based on the pre-constructed knowledge graph, and then use "Husky" and "Shiba Inu" as keywords to capture at least one image for each. Suppose the images corresponding to "Husky" are "Husky image 1" and "Husky image 2", and the images corresponding to "Shiba Inu" are "Shiba Inu image 1" and "Shiba Inu image 2". The electronic device may then combine "Husky" with "Husky image 1" into one training sample, "Husky" with "Husky image 2" into another, "Shiba Inu" with "Shiba Inu image 1" into a third, and "Shiba Inu" with "Shiba Inu image 2" into a fourth.
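The sample-generation flow of S102 can be sketched as follows. The knowledge graph and image crawler are stubbed with in-memory dictionaries; all names here (`KNOWLEDGE_GRAPH`, `IMAGE_INDEX`, `TrainingSample`) are illustrative assumptions, not structures named in the application:

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the pre-constructed knowledge graph and the
# image-capture step described in the text.
KNOWLEDGE_GRAPH = {"dog": ["Husky", "Shiba Inu"]}
IMAGE_INDEX = {
    "Husky": ["husky_image_1.jpg", "husky_image_2.jpg"],
    "Shiba Inu": ["shiba_image_1.jpg", "shiba_image_2.jpg"],
}

@dataclass
class TrainingSample:
    hyponym: str   # the hyponym acts as the sample's label
    image: str     # path of the captured image

def generate_training_samples(task_label: str) -> list[TrainingSample]:
    """Expand a task category label into (hyponym, image) training samples."""
    samples = []
    for hyponym in KNOWLEDGE_GRAPH.get(task_label, []):
        for image in IMAGE_INDEX.get(hyponym, []):
            samples.append(TrainingSample(hyponym, image))
    return samples

samples = generate_training_samples("dog")
```

With the stub data above, "dog" expands to four samples, two per hyponym, matching the worked example in the text.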
S103, extracting one training sample from all the training samples corresponding to the task category labels as a current training sample.
In this step, the electronic device may extract one training sample from all training samples corresponding to the task category label as the current training sample. Specifically, it is assumed that the number of training samples corresponding to the task category label is M, where M is a natural number greater than 1; the electronic device may extract one training sample from the M training samples as a current training sample.
S104, in response to the model to be trained not satisfying the preset convergence condition, inputting the current training sample into the model to be trained and training the model with the current training sample; and repeating the operation of extracting a current training sample until the model to be trained satisfies the preset convergence condition.
In this step, the electronic device may, in response to the model to be trained not yet satisfying the preset convergence condition, input the current training sample into the model to be trained and train the model with it, repeating the operation of extracting a current training sample until the model satisfies the preset convergence condition. Specifically, the electronic device may calculate a loss function value for the current training sample from the prediction result and the ground-truth result corresponding to that sample, train the model to be trained according to this loss function value, and repeat the extraction operation until the convergence condition is met. In particular, the electronic device may back-propagate the loss through the model to be trained, adjusting the weights layer by layer, to obtain the model for the next training period.
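The extract-train-repeat loop of S103 and S104 can be sketched in a framework-free form. The one-parameter "model", learning rate and loss threshold are illustrative assumptions; a real implementation would back-propagate through a deep network:

```python
import random

# Minimal sketch of the S103-S104 loop: repeatedly extract one training
# sample, take a gradient step on its loss, and stop once a preset
# convergence condition (loss below a threshold) holds.
def train_until_convergence(samples, target=1.0, lr=0.1,
                            loss_threshold=1e-4, max_steps=10_000):
    random.seed(0)                    # deterministic sample extraction
    weight = 0.0                      # the "model to be trained"
    for _ in range(max_steps):
        x = random.choice(samples)    # extract the current training sample
        pred = weight * x
        loss = (pred - target * x) ** 2
        if loss < loss_threshold:     # preset convergence condition met
            break
        grad = 2 * (pred - target * x) * x
        weight -= lr * grad           # train with the current sample
    return weight

w = train_until_convergence([1.0, 2.0, 0.5])
```

The learned weight approaches the target slope, illustrating how the loop keeps drawing samples until the convergence condition is satisfied.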
The model training method provided by this embodiment first receives a task category label input by a user; then generates at least one training sample corresponding to that label; extracts one training sample from all training samples corresponding to the label as the current training sample; and, in response to the model to be trained not satisfying a preset convergence condition, inputs the current training sample into the model and trains it, repeating the extraction operation until the model satisfies the convergence condition. That is, the application can generate at least one training sample from the task category label alone and use those samples for model training. Existing model training methods rely on a large amount of manually labeled data and must spend considerable manpower, money and time on data labeling work. By generating the training samples directly from the task category label, this technical problem is solved; moreover, the technical solution of this embodiment is simple to implement, easy to popularize and widely applicable.
Example two
Fig. 2 is a second flowchart of the model training method according to an embodiment of the present application. This embodiment further optimizes and extends the technical solution above, and may be combined with the various optional implementations described herein. As shown in fig. 2, the model training method may include the following steps:
S201, acquiring a task category label input by a user.
S202, generating at least one hyponym corresponding to the task type label based on the pre-constructed knowledge graph.
In this step, the electronic device may generate at least one hyponym of the task category label based on a pre-constructed knowledge graph. Specifically, the electronic device may input the task category label into the knowledge graph and obtain at least one hyponym of the label through the graph. For example, the electronic device may input the user-supplied task category label "dog" into the knowledge graph and obtain the hyponyms "Husky" and "Shiba Inu" of "dog".
S203, taking each hyponym corresponding to the task type label as a keyword to capture at least one image corresponding to each hyponym.
In this step, the electronic device may capture at least one image for each hyponym by using each hyponym of the task category label as a keyword. Assuming the task category label input by the user is "dog", the electronic device may generate the hyponyms "Husky" and "Shiba Inu" based on the pre-constructed knowledge graph, and then use "Husky" and "Shiba Inu" as keywords to capture at least one image for each. For example, the images corresponding to "Husky" are "Husky image 1" and "Husky image 2", and the images corresponding to "Shiba Inu" are "Shiba Inu image 1" and "Shiba Inu image 2".
S204, combining each hyponym and each image corresponding to each hyponym into each training sample to obtain at least one training sample corresponding to the task class label.
In this step, the electronic device may combine each hyponym with each of its corresponding images to form training samples, obtaining at least one training sample for the task category label. For example, the images corresponding to the hyponym "Husky" are "Husky image 1" and "Husky image 2", and the images corresponding to the hyponym "Shiba Inu" are "Shiba Inu image 1" and "Shiba Inu image 2". The electronic device may then combine "Husky" with "Husky image 1" into one training sample, "Husky" with "Husky image 2" into another, "Shiba Inu" with "Shiba Inu image 1" into a third, and "Shiba Inu" with "Shiba Inu image 2" into a fourth.
S205, extracting one training sample from all the training samples corresponding to the task category labels as a current training sample.
S206, in response to the model to be trained not satisfying the preset convergence condition, inputting the current training sample into the model to be trained and training the model with it; and repeating the operation of extracting the current training sample until the model to be trained satisfies the preset convergence condition.
As in the first embodiment, the method of this embodiment generates at least one training sample from the user-supplied task category label and trains the model with those samples until the preset convergence condition is met, so no labeled training samples need to be acquired in advance; the manpower, money and time that prior-art methods spend on manual labeling are saved, and the solution remains simple to implement, easy to popularize and widely applicable.
EXAMPLE III
Fig. 3 is a third flowchart of the model training method according to an embodiment of the present application. This embodiment further optimizes and extends the technical solution above, and may be combined with the various optional implementations described herein. As shown in fig. 3, the model training method may include the following steps:
S301, acquiring a task category label input by a user.
S302, generating at least one hyponym corresponding to the task type label based on a pre-constructed knowledge graph.
S303, taking each hyponym corresponding to the task type label as a keyword to capture at least one image corresponding to each hyponym.
S304, one image is extracted from the images corresponding to all the hyponyms and is used as a current image.
In this step, the electronic device may extract one image from the images corresponding to all the hyponyms as the current image. Specifically, assume the N images corresponding to all the hyponyms are respectively: image 1, image 2, …, image N, where N is a natural number greater than or equal to 1. The electronic device may extract one of these N images as the current image.
S305, performing quality evaluation on the current image with a preset quality evaluation algorithm and determining whether the current image is dirty data or clean data; repeating this operation until every image corresponding to every hyponym has been determined to be dirty data or clean data; and thereby obtaining the clean data corresponding to each hyponym.
In this step, the electronic device may perform quality evaluation on the current image with a preset quality evaluation algorithm to determine whether it is dirty data or clean data, repeating the operation until every image corresponding to every hyponym has been determined to be dirty data or clean data, and thereby obtaining the clean data corresponding to each hyponym. Specifically, the preset quality evaluation algorithm includes at least one of the following: sharpness detection, solid-color image detection, and image corruption detection. The electronic device applies one or more of these quality evaluation methods to the current image and determines whether it is a dirty data sample or a clean data sample.
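The three quality checks named above can be illustrated with toy stand-ins. Here an image is a 2-D list of grayscale values; the thresholds and the sharpness proxy (mean absolute horizontal gradient instead of, say, Laplacian variance) are illustrative assumptions, not the patent's actual algorithms:

```python
def is_solid_color(img, tol=2):
    # "solid-color detection": nearly zero dynamic range
    flat = [p for row in img for p in row]
    return max(flat) - min(flat) <= tol

def is_sharp(img, threshold=10.0):
    # crude sharpness proxy: mean absolute horizontal gradient
    diffs = [abs(row[i + 1] - row[i]) for row in img for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs) >= threshold

def is_corrupt(img):
    # ragged rows stand in for a damaged image file
    widths = {len(row) for row in img}
    return len(widths) != 1

def is_clean(img):
    return not is_solid_color(img) and not is_corrupt(img) and is_sharp(img)
```

An image passing all three checks would be kept as clean data; any failure marks it as dirty data to be discarded.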
Preferably, in a specific embodiment of the present application, the electronic device may further cluster the clean data corresponding to all the hyponyms with a preset clustering algorithm to obtain at least one group of clean data; then determine the position of each group of clean data in the pre-constructed knowledge graph; calculate the distance between each group's position and a predetermined center position; and, in response to there being clean data whose distance from the predetermined center position is greater than or equal to a preset threshold, remove that data and retain only the clean data whose distance from the center position is below the threshold, using it as the clean data corresponding to all the hyponyms. Specifically, the preset clustering algorithm includes at least one of the following: the K-means clustering algorithm and the Mean-Shift clustering algorithm.
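The outlier-removal step above can be sketched as follows. The "center position" here is a simple centroid and the feature vectors are illustrative 2-D points; a real system would use K-means or Mean-Shift cluster centers over image features:

```python
import math

def centroid(points):
    # centroid stands in for the predetermined center position
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def filter_outliers(points, threshold):
    # keep only clean data whose distance to the center is below the threshold
    center = centroid(points)
    return [p for p in points if math.dist(p, center) < threshold]

kept = filter_outliers([(0, 0), (1, 0), (0, 1), (10, 10)], threshold=5.0)
```

The far-away point is discarded while the tight cluster survives, mirroring the removal of clean data that lies too far from the cluster center.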
Preferably, in a specific embodiment of the present application, the electronic device may further apply a preset data augmentation method to each piece of clean data corresponding to each hyponym, obtaining augmented clean data for each hyponym, and use this augmented clean data as the clean data corresponding to all the hyponyms. Specifically, the preset data augmentation method includes at least one of the following: affine transformation, perspective transformation, color perturbation, mixup data augmentation, and generative adversarial networks (GAN).
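Toy versions of three of the listed augmentation methods can be sketched on grayscale-list images; general affine or perspective transforms and GAN-based synthesis would need a real imaging library and are omitted. All function names and parameters are illustrative assumptions:

```python
import random

def horizontal_flip(img):
    # horizontal flip, the simplest affine transformation
    return [list(reversed(row)) for row in img]

def color_jitter(img, max_shift=10, seed=0):
    # color perturbation: shift every pixel by the same random offset
    rng = random.Random(seed)
    shift = rng.randint(-max_shift, max_shift)
    return [[min(255, max(0, p + shift)) for p in row] for row in img]

def mixup(img_a, img_b, lam=0.5):
    # mixup: pixel-wise convex combination of two images
    return [[lam * a + (1 - lam) * b for a, b in zip(ra, rb)]
            for ra, rb in zip(img_a, img_b)]

aug = mixup([[0, 100]], [[200, 100]], lam=0.25)
# 0.25*0 + 0.75*200 = 150.0 ; 0.25*100 + 0.75*100 = 100.0
```

Each augmented image can be paired with its hyponym to enlarge the clean training set.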
S306, combining each hyponym and each clean data corresponding to each hyponym into each training sample.
S307, extracting one training sample from all the training samples corresponding to the task category labels as a current training sample.
S308, in response to the model to be trained not satisfying the preset convergence condition, inputting the current training sample into the model to be trained and training the model with it; and repeating the operation of extracting the current training sample until the model to be trained satisfies the preset convergence condition.
It should be noted that the embodiments of the present application are applicable both to models for image classification tasks and to models for target detection tasks. When training an image classification model, keywords are first generated from the class labels of the task via the knowledge graph, and web images are captured with those keywords; poor-quality images are removed by quality evaluation, images far from the cluster center are removed by a clustering method to obtain cleaned images, the training data is expanded with a data augmentation algorithm, and the preprocessed images together with their class labels form the training data; the image classification model is then trained on this data and obtained when training completes. Likewise, when training a target detection model, keywords are generated from non-target labels via the knowledge graph, and web images captured with those keywords serve as background images; poor-quality backgrounds are removed by quality evaluation, and training data is synthesized from target templates prepared in advance and the background images by a data augmentation method, the training data including the location and category of each target; the detection model for the target detection task is then trained on this data and obtained when training completes.
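The detection-data synthesis described above (paste a prepared target template onto a background image and record the box and category) can be shown in a toy form. Images are 2-D grayscale lists; the template, paste position and annotation format are illustrative assumptions, not the application's actual data format:

```python
def compose_detection_sample(background, template, top, left, category):
    # Paste the target template into a copy of the background and record
    # the resulting bounding box and category as the training annotation.
    h, w = len(template), len(template[0])
    img = [row[:] for row in background]        # copy so the background survives
    for r in range(h):
        for c in range(w):
            img[top + r][left + c] = template[r][c]
    bbox = (left, top, left + w, top + h)       # (x_min, y_min, x_max, y_max)
    return img, {"bbox": bbox, "category": category}

bg = [[0] * 6 for _ in range(6)]                # crawled background image
tpl = [[255, 255], [255, 255]]                  # prepared target template
img, ann = compose_detection_sample(bg, tpl, top=2, left=3, category="dog")
```

Varying the paste position, scale and background per sample is what turns this compositing step into a data augmentation method for detection.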
The model training method provided by the embodiment of the application first receives a task category label input by a user; then generates at least one training sample corresponding to the task category label based on the task category label; extracts one training sample from all the training samples corresponding to the task category label as the current training sample; in response to determining that the model to be trained does not meet the preset convergence condition, inputs the current training sample into the model to be trained and trains the model to be trained using the current training sample; and repeatedly executes the operation of extracting the current training sample until the model to be trained meets the preset convergence condition. That is, the application may generate at least one training sample corresponding to the task category label based on the task category label, and perform model training using that at least one training sample. Existing model training methods rely on a large amount of manually labeled data, consuming considerable manpower, capital, and time on data labeling work. Because the technical means of generating at least one corresponding training sample based on the task category label is adopted, the technical problem in the prior art that model training relies on a large amount of manually labeled data and consumes considerable manpower, capital, and time on data labeling work is solved; moreover, the technical scheme of the embodiment of the application is simple and convenient to implement, convenient to popularize, and wide in application range.
Example Four
Fig. 4 is a schematic structural diagram of a model training apparatus according to an embodiment of the present application. As shown in fig. 4, the apparatus 400 includes: a receiving module 401, a generating module 402, an extracting module 403 and a training module 404; wherein,
the receiving module 401 is configured to receive a task category label input by a user;
the generating module 402 is configured to generate at least one training sample corresponding to the task category label based on the task category label;
the extracting module 403 is configured to extract one training sample from all the training samples corresponding to the task category label as the current training sample;
the training module 404 is configured to, in response to determining that a model to be trained does not meet a preset convergence condition, input the current training sample into the model to be trained, and train the model to be trained using the current training sample; and repeatedly execute the operation of extracting the current training sample until the model to be trained meets the preset convergence condition.
Fig. 5 is a schematic structural diagram of the generating module provided in an embodiment of the present application. As shown in fig. 5, the generating module 402 includes: a generating submodule 4021, a grabbing submodule 4022 and a combining submodule 4023; wherein,
the generating submodule 4021 is configured to generate at least one hyponym corresponding to the task category label based on a pre-constructed knowledge graph;
the grabbing submodule 4022 is configured to capture at least one image corresponding to each hyponym by using each hyponym corresponding to the task category label as a keyword;
the combining submodule 4023 is configured to combine each hyponym and each image corresponding to each hyponym into a training sample, so as to obtain at least one training sample corresponding to the task category label.
Further, the generating module 402 further includes: a cleaning submodule 4024 (not shown in the figure), configured to extract one image from the images corresponding to all the hyponyms as the current image; perform quality evaluation on the current image by adopting a preset quality evaluation algorithm to determine whether the current image is dirty data or clean data; repeat the above operations until each image corresponding to each hyponym is determined to be dirty data or clean data, so as to obtain the clean data corresponding to each hyponym; and perform the operation of combining each hyponym and each clean data item corresponding to each hyponym into each training sample.
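Two of the quality checks such a cleaning submodule might apply, solid-color detection and a Laplacian-variance sharpness test, can be sketched as below. The thresholds and the particular formulas are illustrative assumptions, not the patent's algorithm.

```python
import numpy as np

def is_clean(image, sharpness_thresh=10.0, color_std_thresh=1.0):
    """Return True if the image passes both checks: it is not a (near-)solid
    color, and its Laplacian variance indicates sufficient sharpness.
    Thresholds here are illustrative and would be tuned in practice."""
    img = np.asarray(image, dtype=float)
    if img.std() < color_std_thresh:   # solid-color image detection
        return False
    # Discrete Laplacian; low variance over it indicates a blurry image.
    lap = (img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
           - 4.0 * img[1:-1, 1:-1])
    return bool(lap.var() >= sharpness_thresh)   # sharpness detection

rng = np.random.default_rng(0)
flat = np.full((32, 32), 128)                # solid gray: flagged dirty
noisy = rng.integers(0, 256, size=(32, 32))  # high-frequency detail: clean
```

An image-corruption check (e.g. a failed decode) would typically be a separate step before these pixel-level tests.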
Further, the generating module 402 further includes: a convergence submodule 4025 (not shown in the figures), configured to cluster the clean data corresponding to all the hyponyms by adopting a preset clustering algorithm to obtain at least one group of clean data corresponding to all the hyponyms; determine the position of each group of clean data in the pre-constructed knowledge graph; calculate the distance between the position of each group of clean data and a predetermined center position; in response to determining that at least one group of clean data whose distance from the predetermined center position is greater than or equal to a preset threshold exists among all the clean data, remove from all the clean data the clean data whose distance from the predetermined center position is greater than or equal to the preset threshold, so as to obtain the clean data whose distance from the predetermined center position is less than the preset threshold; and take the clean data whose distance from the predetermined center position is less than the preset threshold as the clean data corresponding to all the hyponyms.
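The convergence submodule's outlier removal reduces to a distance test against a central position. In the sketch below the "position" of each clean-data item is abstracted to a feature vector and the center to their mean, both simplifying assumptions; the patent instead derives positions from the pre-constructed knowledge graph.

```python
import numpy as np

def remove_outliers(features, threshold):
    """Keep only clean-data items whose distance to the central position is
    below the threshold; items at or beyond the threshold are removed."""
    feats = np.asarray(features, dtype=float)
    center = feats.mean(axis=0)                     # predetermined center
    dists = np.linalg.norm(feats - center, axis=1)  # distance of each item
    return feats[dists < threshold]                 # drop far-away items

points = [[0.0, 0.0], [0.1, 0.0], [-0.1, 0.0], [10.0, 0.0]]
kept = remove_outliers(points, threshold=5.0)  # the far point [10, 0] is removed
```

With K-means or mean shift, the same test would be applied per cluster against each cluster's own center.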
Further, the generating module 402 further includes: an augmentation submodule 4026 (not shown in the figure), configured to perform data augmentation on each clean data item corresponding to each hyponym by adopting a preset data augmentation method to obtain augmented clean data corresponding to each hyponym; and take the augmented clean data corresponding to each hyponym as the clean data corresponding to all the hyponyms.
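A minimal sketch of such an augmentation submodule, using two of the listed techniques: a horizontal flip (a trivial affine transformation) and a brightness shift (a simple color perturbation). The particular transforms and shift values are illustrative assumptions; a real system would add perspective warps, GAN-based synthesis, etc.

```python
import numpy as np

def augment(image, brightness_shifts=(-20, 20)):
    """Expand one clean image into several training images: the original, a
    horizontally flipped copy, and brightness-shifted copies."""
    out = [image, image[:, ::-1]]  # original + mirrored copy
    for shift in brightness_shifts:
        shifted = np.clip(image.astype(int) + shift, 0, 255).astype(np.uint8)
        out.append(shifted)
    return out

img = np.arange(16, dtype=np.uint8).reshape(4, 4)
augmented = augment(img)  # one clean image expanded to four
```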
Further, the preset quality evaluation algorithm includes at least one of the following: sharpness detection, solid-color image detection and image corruption detection; the preset clustering algorithm includes at least one of the following: a K-means clustering algorithm and a mean shift clustering algorithm; and the preset data augmentation method includes at least one of the following: affine transformation, perspective transformation, color perturbation, and data augmentation with a generative adversarial network.
The model training apparatus described above can execute the method provided by any embodiment of the present application, and has the functional modules and beneficial effects corresponding to the executed method. For technical details not described in detail in this embodiment, reference may be made to the model training method provided in any embodiment of the present application.
Example Five
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 6 is a block diagram of an electronic device according to an embodiment of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 6, the electronic apparatus includes: one or more processors 601, a memory 602, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 6, one processor 601 is taken as an example.
The memory 602 is a non-transitory computer readable storage medium as provided herein. Wherein the memory stores instructions executable by at least one processor to cause the at least one processor to perform the model training methods provided herein. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the model training method provided herein.
The memory 602, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules (e.g., the receiving module 401, the generating module 402, the extracting module 403, and the training module 404 shown in fig. 4) corresponding to the model training method in the embodiments of the present application. The processor 601 executes various functional applications of the server and data processing by running non-transitory software programs, instructions and modules stored in the memory 602, that is, implementing the model training method in the above method embodiment.
The memory 602 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created from use of the model-trained electronic device, and the like. Further, the memory 602 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory 602 optionally includes memory located remotely from processor 601, and these remote memories may be connected to a model training electronic device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the model training method may further include: an input device 603 and an output device 604. The processor 601, the memory 602, the input device 603 and the output device 604 may be connected by a bus or other means, and fig. 6 illustrates the connection by a bus as an example.
The input device 603 may receive input numeric or character information and generate key signal inputs related to user settings and function controls of the model-trained electronic device, such as a touch screen, keypad, mouse, track pad, touch pad, pointer stick, one or more mouse buttons, track ball, joystick, or other input device. The output devices 604 may include a display device, auxiliary lighting devices (e.g., LEDs), and tactile feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), the internet, and blockchain networks.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and overcomes the defects of difficult management and weak service scalability found in traditional physical hosts and VPS services.
According to the technical scheme of the embodiment of the application, a task category label input by a user is received; at least one training sample corresponding to the task category label is then generated based on the task category label; one training sample is extracted from all the training samples corresponding to the task category label as the current training sample; in response to determining that the model to be trained does not meet the preset convergence condition, the current training sample is input into the model to be trained and used to train it; and the operation of extracting the current training sample is repeatedly executed until the model to be trained meets the preset convergence condition. That is, the application may generate at least one training sample corresponding to the task category label based on the task category label, and perform model training using that at least one training sample. Existing model training methods rely on a large amount of manually labeled data, consuming considerable manpower, capital, and time on data labeling work. Because the technical means of generating at least one corresponding training sample based on the task category label is adopted, the technical problem in the prior art that model training relies on a large amount of manually labeled data and consumes considerable manpower, capital, and time on data labeling work is solved; model training can thus be realized without acquiring labeled training samples in advance, greatly reducing the manpower consumption and capital cost of manual labeling. Moreover, the technical scheme of the embodiment of the application is simple and convenient to implement, convenient to popularize, and wide in application range.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present application can be achieved, and the present application is not limited herein.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.
Claims (14)
1. A method of model training, the method comprising:
receiving a task category label input by a user;
generating at least one training sample corresponding to the task category label based on the task category label;
extracting one training sample from all the training samples corresponding to the task category label as the current training sample;
in response to determining that the model to be trained does not meet a preset convergence condition, inputting the current training sample into the model to be trained, and training the model to be trained by using the current training sample; and repeatedly executing the operation of extracting the current training sample until the model to be trained meets the preset convergence condition.
2. The method of claim 1, wherein generating at least one training sample corresponding to the task category label based on the task category label comprises:
generating at least one hyponym corresponding to the task category label based on a pre-constructed knowledge graph;
taking each hyponym corresponding to the task category label as a keyword to capture at least one image corresponding to each hyponym;
and combining each hyponym and each image corresponding to each hyponym into a training sample to obtain at least one training sample corresponding to the task category label.
3. The method according to claim 2, wherein after capturing at least one image corresponding to each hyponym by taking each hyponym corresponding to the task category label as a keyword, and before combining each hyponym and each image corresponding to each hyponym into each training sample, the method further comprises:
extracting an image from the images corresponding to all the hyponyms as a current image;
performing quality evaluation on the current image by adopting a preset quality evaluation algorithm to determine whether the current image is dirty data or clean data; repeatedly executing the above operations until each image corresponding to each hyponym is determined to be dirty data or clean data, so as to obtain the clean data corresponding to each hyponym; and performing the operation of combining each hyponym and each clean data item corresponding to each hyponym into each training sample.
4. The method of claim 3, further comprising:
clustering clean data corresponding to all hyponyms by adopting a preset clustering algorithm to obtain at least one group of clean data corresponding to all hyponyms;
determining the position of each group of clean data in the pre-constructed knowledge graph; calculating the distance between the position of each group of clean data and a predetermined central position;
in response to determining that at least one group of clean data whose distance from the predetermined center position is greater than or equal to a preset threshold exists among all the clean data, removing from all the clean data the clean data whose distance from the predetermined center position is greater than or equal to the preset threshold, so as to obtain the clean data whose distance from the predetermined center position is less than the preset threshold; and taking the clean data whose distance from the predetermined center position is less than the preset threshold as the clean data corresponding to all the hyponyms.
5. The method of claim 4, further comprising:
performing data augmentation on each clean data item corresponding to each hyponym by adopting a preset data augmentation method to obtain augmented clean data corresponding to each hyponym; and taking the augmented clean data corresponding to each hyponym as the clean data corresponding to all the hyponyms.
6. The method of claim 5, wherein the preset quality evaluation algorithm comprises at least one of the following: sharpness detection, solid-color image detection and image corruption detection; the preset clustering algorithm comprises at least one of the following: a K-means clustering algorithm and a mean shift clustering algorithm; and the preset data augmentation method comprises at least one of the following: affine transformation, perspective transformation, color perturbation, and data augmentation with a generative adversarial network.
7. A model training apparatus, the apparatus comprising: a receiving module, a generating module, an extracting module and a training module; wherein,
the receiving module is used for receiving the task category labels input by the user;
the generating module is used for generating at least one training sample corresponding to the task category label based on the task category label;
the extraction module is used for extracting one training sample from all the training samples corresponding to the task category labels as a current training sample;
the training module is used for, in response to determining that the model to be trained does not meet the preset convergence condition, inputting the current training sample into the model to be trained, and training the model to be trained by using the current training sample; and repeatedly executing the operation of extracting the current training sample until the model to be trained meets the preset convergence condition.
8. The apparatus of claim 7, wherein the generating module comprises: a generating submodule, a grabbing submodule and a combining submodule; wherein,
the generating submodule is used for generating at least one hyponym corresponding to the task category label based on a pre-constructed knowledge graph;
the grabbing submodule is used for capturing at least one image corresponding to each hyponym by taking each hyponym corresponding to the task category label as a keyword;
and the combining submodule is used for combining each hyponym and each image corresponding to each hyponym into each training sample to obtain at least one training sample corresponding to the task category label.
9. The apparatus of claim 8, wherein the generating module further comprises: a cleaning submodule, used for extracting one image from the images corresponding to all the hyponyms as the current image; performing quality evaluation on the current image by adopting a preset quality evaluation algorithm to determine whether the current image is dirty data or clean data; repeatedly executing the above operations until each image corresponding to each hyponym is determined to be dirty data or clean data, so as to obtain the clean data corresponding to each hyponym; and performing the operation of combining each hyponym and each clean data item corresponding to each hyponym into each training sample.
10. The apparatus of claim 9, wherein the generating module further comprises: a convergence submodule, used for clustering the clean data corresponding to all the hyponyms by adopting a preset clustering algorithm to obtain at least one group of clean data corresponding to all the hyponyms; determining the position of each group of clean data in the pre-constructed knowledge graph; calculating the distance between the position of each group of clean data and a predetermined center position; in response to determining that at least one group of clean data whose distance from the predetermined center position is greater than or equal to a preset threshold exists among all the clean data, removing from all the clean data the clean data whose distance from the predetermined center position is greater than or equal to the preset threshold, so as to obtain the clean data whose distance from the predetermined center position is less than the preset threshold; and taking the clean data whose distance from the predetermined center position is less than the preset threshold as the clean data corresponding to all the hyponyms.
11. The apparatus of claim 10, wherein the generating module further comprises: an augmentation submodule, used for performing data augmentation on each clean data item corresponding to each hyponym by adopting a preset data augmentation method to obtain augmented clean data corresponding to each hyponym; and taking the augmented clean data corresponding to each hyponym as the clean data corresponding to all the hyponyms.
12. The apparatus of claim 11, wherein the preset quality evaluation algorithm comprises at least one of the following: sharpness detection, solid-color image detection and image corruption detection; the preset clustering algorithm comprises at least one of the following: a K-means clustering algorithm and a mean shift clustering algorithm; and the preset data augmentation method comprises at least one of the following: affine transformation, perspective transformation, color perturbation, and data augmentation with a generative adversarial network.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
14. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011110803.4A CN112241452B (en) | 2020-10-16 | 2020-10-16 | Model training method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112241452A true CN112241452A (en) | 2021-01-19 |
CN112241452B CN112241452B (en) | 2024-01-05 |
Family
ID=74168766
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011110803.4A Active CN112241452B (en) | 2020-10-16 | 2020-10-16 | Model training method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112241452B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112767392A (en) * | 2021-03-02 | 2021-05-07 | 百果园技术(新加坡)有限公司 | Image definition determining method, device, equipment and storage medium |
CN113627610A (en) * | 2021-08-03 | 2021-11-09 | 北京百度网讯科技有限公司 | Deep learning model training method for meter box prediction and meter box prediction method |
CN113642635A (en) * | 2021-08-12 | 2021-11-12 | 百度在线网络技术(北京)有限公司 | Model training method and device, electronic device and medium |
CN113743448A (en) * | 2021-07-15 | 2021-12-03 | 上海朋熙半导体有限公司 | Model training data acquisition method, model training method and device |
CN114331379A (en) * | 2021-12-31 | 2022-04-12 | 北京百度网讯科技有限公司 | Method for outputting to-do task, model training method and device |
CN114428677A (en) * | 2022-01-28 | 2022-05-03 | 北京百度网讯科技有限公司 | Task processing method, processing device, electronic equipment and storage medium |
CN115984652A (en) * | 2023-02-13 | 2023-04-18 | 中国科学院自动化研究所 | Training method and device of symbol generation system, electronic equipment and storage medium |
CN116402166A (en) * | 2023-06-09 | 2023-07-07 | 天津市津能工程管理有限公司 | Training method and device of prediction model, electronic equipment and storage medium |
CN116701923A (en) * | 2022-10-13 | 2023-09-05 | 荣耀终端有限公司 | Operator performance evaluation method and device |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109886335A (en) * | 2019-02-21 | 2019-06-14 | 厦门美图之家科技有限公司 | Disaggregated model training method and device |
WO2019228358A1 (en) * | 2018-05-31 | 2019-12-05 | 华为技术有限公司 | Deep neural network training method and apparatus |
CN110728295A (en) * | 2019-09-02 | 2020-01-24 | 深圳中科保泰科技有限公司 | Semi-supervised landform classification model training and landform graph construction method |
CN110766038A (en) * | 2019-09-02 | 2020-02-07 | 深圳中科保泰科技有限公司 | Unsupervised landform classification model training and landform image construction method |
CN111324769A (en) * | 2020-01-20 | 2020-06-23 | 腾讯科技(北京)有限公司 | Training method of video information processing model, video information processing method and device |
CN111368788A (en) * | 2020-03-17 | 2020-07-03 | 北京迈格威科技有限公司 | Training method and device of image recognition model and electronic equipment |
WO2020193708A1 (en) * | 2019-03-28 | 2020-10-01 | F. Hoffmann-La Roche Ag | Machine learning using distance-based similarity labels |
CN111767946A (en) * | 2020-06-19 | 2020-10-13 | 北京百度网讯科技有限公司 | Medical image hierarchical model training and prediction method, device, equipment and medium |
CN111767954A (en) * | 2020-06-30 | 2020-10-13 | 苏州科达科技股份有限公司 | Vehicle fine-grained identification model generation method, system, equipment and storage medium |
Non-Patent Citations (4)
Title |
---|
ZIHAO DONG et al.: "Learning sparse features with lightweight ScatterNet for small sample training", KNOWLEDGE-BASED SYSTEMS, pages 1 - 13 *
GONG HAO: "Research on Building and Searching a National Security Threat Knowledge Base Based on Social Networks", Information Science and Technology, pages 1 - 89 *
LAI GANGMING: "Research on Improving Word Vector Models Based on Part-of-Speech Tagging and Dependency Parsing", Information Science and Technology, pages 1 - 106 *
TAO XINMIN et al.: "Semi-supervised KFDA Algorithm Based on Low-Density Separation Geometric Distance", College of Engineering and Technology, Northeast Forestry University, pages 493 - 510 *
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112767392A (en) * | 2021-03-02 | 2021-05-07 | 百果园技术(新加坡)有限公司 | Image definition determining method, device, equipment and storage medium |
CN112767392B (en) * | 2021-03-02 | 2024-04-09 | 百果园技术(新加坡)有限公司 | Image definition determining method, device, equipment and storage medium |
CN113743448A (en) * | 2021-07-15 | 2021-12-03 | 上海朋熙半导体有限公司 | Model training data acquisition method, model training method and device |
CN113743448B (en) * | 2021-07-15 | 2024-04-30 | 上海朋熙半导体有限公司 | Model training data acquisition method, model training method and device |
CN113627610A (en) * | 2021-08-03 | 2021-11-09 | 北京百度网讯科技有限公司 | Deep learning model training method for meter box prediction and meter box prediction method |
CN113627610B (en) * | 2021-08-03 | 2022-07-05 | 北京百度网讯科技有限公司 | Deep learning model training method for meter box prediction and meter box prediction method |
CN113642635A (en) * | 2021-08-12 | 2021-11-12 | 百度在线网络技术(北京)有限公司 | Model training method and device, electronic device and medium |
CN113642635B (en) * | 2021-08-12 | 2023-09-15 | 百度在线网络技术(北京)有限公司 | Model training method and device, electronic equipment and medium |
CN114331379A (en) * | 2021-12-31 | 2022-04-12 | 北京百度网讯科技有限公司 | Method for outputting to-do task, model training method and device |
CN114331379B (en) * | 2021-12-31 | 2023-08-15 | 北京百度网讯科技有限公司 | Method for outputting task to be handled, model training method and device |
CN114428677A (en) * | 2022-01-28 | 2022-05-03 | 北京百度网讯科技有限公司 | Task processing method, processing device, electronic equipment and storage medium |
CN114428677B (en) * | 2022-01-28 | 2023-09-12 | 北京百度网讯科技有限公司 | Task processing method, processing device, electronic equipment and storage medium |
CN116701923A (en) * | 2022-10-13 | 2023-09-05 | 荣耀终端有限公司 | Operator performance evaluation method and device |
CN116701923B (en) * | 2022-10-13 | 2024-05-17 | 荣耀终端有限公司 | Operator performance evaluation method and device |
CN115984652A (en) * | 2023-02-13 | 2023-04-18 | 中国科学院自动化研究所 | Training method and device of symbol generation system, electronic equipment and storage medium |
CN116402166A (en) * | 2023-06-09 | 2023-07-07 | 天津市津能工程管理有限公司 | Training method and device of prediction model, electronic equipment and storage medium |
CN116402166B (en) * | 2023-06-09 | 2023-09-01 | 天津市津能工程管理有限公司 | Training method and device of prediction model, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN112241452B (en) | 2024-01-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112241452B (en) | Model training method and device, electronic equipment and storage medium | |
CN112560912B (en) | Classification model training method and device, electronic equipment and storage medium | |
US11694461B2 (en) | Optical character recognition method and apparatus, electronic device and storage medium | |
CN111507104B (en) | Method and device for establishing label labeling model, electronic equipment and readable storage medium | |
CN111967302B (en) | Video tag generation method and device and electronic equipment | |
CN112001180A (en) | Multi-mode pre-training model acquisition method and device, electronic equipment and storage medium | |
CN111104514A (en) | Method and device for training document label model | |
US11928563B2 (en) | Model training, image processing method, device, storage medium, and program product | |
CN111598164A (en) | Method and device for identifying attribute of target object, electronic equipment and storage medium | |
CN111680517A (en) | Method, apparatus, device and storage medium for training a model | |
CN110543558B (en) | Question matching method, device, equipment and medium | |
CN111814633B (en) | Display scene detection method, device, equipment and storage medium | |
CN111709873A (en) | Training method and device of image conversion model generator | |
CN112541359A (en) | Document content identification method and device, electronic equipment and medium | |
CN112507090A (en) | Method, apparatus, device and storage medium for outputting information | |
CN112149741A (en) | Training method and device of image recognition model, electronic equipment and storage medium | |
CN111753911A (en) | Method and apparatus for fusing models | |
CN111539438A (en) | Text content identification method and device and electronic equipment | |
CN111862031A (en) | Face synthetic image detection method and device, electronic equipment and storage medium | |
CN113342946B (en) | Model training method and device for customer service robot, electronic equipment and medium | |
CN111563541B (en) | Training method and device of image detection model | |
CN113963186A (en) | Training method of target detection model, target detection method and related device | |
CN112560854A (en) | Method, apparatus, device and storage medium for processing image | |
CN112016523A (en) | Cross-modal face recognition method, device, equipment and storage medium | |
CN111932530A (en) | Three-dimensional object detection method, device and equipment and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||