WO2020233200A1 - Model training method and device and information prediction method and device

Info

Publication number: WO2020233200A1
Application number: PCT/CN2020/078590
Authority: WO
Inventors: 陈日伟
Original assignee: 北京字节跳动网络技术有限公司
Priority date: 2019-05-17
Filing date: 2020-03-10
Publication date: 2020-11-26
Also published as: CN110110811A

Abstract

Disclosed in embodiments of the present invention are a model training method and device and an information prediction method and device. A specific embodiment of the model training method comprises: obtaining a sample image set; training to obtain a corresponding prediction model on the basis of a first sample image in the sample image set and an annotation result corresponding to the first sample image; executing the following processing steps: inputting at least one second sample image in the sample image set into the prediction model to obtain a corresponding prediction result; sending the at least one second sample image and the prediction result to an annotation end used by an annotator; and obtaining an annotation result obtained after the annotator adjusts the prediction result, and continuing to train the prediction model on the basis of the obtained annotation result and the sample image corresponding to the obtained annotation result. The embodiment employs an annotation mode of combining machine annotation with manual annotation, so that the annotation efficiency and the model prediction accuracy can be improved, and the cold starting of the model is facilitated.

Description

Method and device for training model, method and device for predicting information

Cross references to related applications

This application is based on the Chinese patent application with application number 201910414089.9 and a filing date of May 17, 2019, and claims priority to that Chinese patent application. The entire content of the Chinese patent application is hereby incorporated into this application by reference.

Technical field

The embodiments of the present disclosure relate to the field of computer technology, in particular to methods and devices for training models, and methods and devices for predicting information.

Background

Before a machine learning model is trained, it is usually necessary to prepare training data and annotate it. The training data may include, for example, image data. In existing approaches to annotating image data, an image that has no annotation result is usually provided to an annotator, who annotates it directly.

Summary of the invention

The embodiments of the present disclosure propose methods and devices for training models, and methods and devices for predicting information.

In the first aspect, an embodiment of the present disclosure provides a method for training a model, including: obtaining a sample image set, where the sample image set contains a first sample image and a second sample image, the first sample image being a sample image that has a corresponding annotation result and the second sample image being a sample image that has no corresponding annotation result; training a corresponding prediction model based on the first sample image in the sample image set and its corresponding annotation result; and performing the following processing steps: inputting at least one second sample image in the sample image set into the trained prediction model to obtain a corresponding prediction result; sending the at least one second sample image and the prediction result to an annotation terminal used by an annotator; and obtaining the annotation result produced by the annotator adjusting the prediction result, and continuing to train the trained prediction model based on the obtained annotation result and its corresponding sample image.

In some embodiments, after continuing to train the trained prediction model based on the obtained annotation result and its corresponding sample image, the method further includes: if a second sample image still exists in the sample image set, continuing to perform the above processing steps.

In some embodiments, obtaining the annotation result produced by the annotator adjusting the prediction result includes: receiving, from the annotation terminal, the annotation result produced by the annotator adjusting the prediction result.

In some embodiments, the annotation terminal is used to store the annotation result produced by the annotator adjusting the prediction result in a designated storage location; and obtaining the annotation result produced by the annotator adjusting the prediction result includes: obtaining, from the designated storage location, the annotation result produced by the annotator adjusting the prediction result.

In some embodiments, each sample image in the sample image set displays the face of a target object, the target object is a human or an animal, the annotation result is used to indicate the positions of the key points of the face displayed in the corresponding sample image, and the trained prediction model is used to locate key points of a face.

In the second aspect, an embodiment of the present disclosure provides a method for predicting information, including: receiving an image to be detected; and inputting the image to be detected into a prediction model trained by the method described in any implementation of the first aspect, to obtain a corresponding prediction result.

In some embodiments, the method further includes: sending the image to be detected and the prediction result to an annotation terminal used by an annotator, so that the annotator adjusts the prediction result, uses the adjusted prediction result as an annotation result, and sends the image to be detected and the annotation result to a model training terminal used to train the prediction model, so that the model training terminal continues to train the prediction model based on the image to be detected and the annotation result.

In a third aspect, an embodiment of the present disclosure provides an apparatus for training a model, including: an acquisition unit configured to acquire a sample image set, where the sample image set contains a first sample image and a second sample image, the first sample image being a sample image that has a corresponding annotation result and the second sample image being a sample image that has no corresponding annotation result; a training unit configured to train a corresponding prediction model based on the first sample image in the sample image set and its corresponding annotation result; and a processing unit configured to perform the following processing steps: inputting at least one second sample image in the sample image set into the trained prediction model to obtain a corresponding prediction result; sending the at least one second sample image and the prediction result to an annotation terminal used by an annotator; and obtaining the annotation result produced by the annotator adjusting the prediction result, and continuing to train the trained prediction model based on the obtained annotation result and its corresponding sample image.

In some embodiments, the processing unit is further configured to: after continuing to train the trained prediction model based on the obtained annotation result and its corresponding sample image, continue to perform the above processing steps if a second sample image still exists in the sample image set.

In some embodiments, the processing unit is further configured to receive, from the annotation terminal, the annotation result produced by the annotator adjusting the prediction result.

In some embodiments, the annotation terminal is used to store the annotation result produced by the annotator adjusting the prediction result in a designated storage location; and the processing unit is further configured to obtain, from the designated storage location, the annotation result produced by the annotator adjusting the prediction result.

In a fourth aspect, an embodiment of the present disclosure provides an apparatus for predicting information, including: a receiving unit configured to receive an image to be detected; and a prediction unit configured to input the image to be detected into a prediction model trained by the method described in any implementation of the first aspect, to obtain a corresponding prediction result.

In some embodiments, the apparatus further includes: a sending unit configured to send the image to be detected and the prediction result to an annotation terminal used by an annotator, so that the annotator adjusts the prediction result, uses the adjusted prediction result as an annotation result, and sends the image to be detected and the annotation result to a model training terminal used to train the prediction model, so that the model training terminal continues to train the prediction model based on the image to be detected and the annotation result.

In a fifth aspect, an embodiment of the present disclosure provides an electronic device that includes: one or more processors; and a storage device on which one or more programs are stored, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method described in any implementation of the first aspect and the second aspect.

In a sixth aspect, the embodiments of the present disclosure provide a computer-readable medium on which a computer program is stored, and when the program is executed by a processor, the method as described in any one of the first aspect and the second aspect is implemented.

The method and device for training a model provided by the above-mentioned embodiments of the present disclosure obtain a sample image set, train a corresponding prediction model based on the first sample image in the sample image set and its corresponding annotation result, and then perform the following processing steps: inputting at least one second sample image in the sample image set into the prediction model to obtain a corresponding prediction result; sending the at least one second sample image and the prediction result to the annotation terminal used by the annotator; and obtaining the annotation result produced by the annotator adjusting the prediction result, and continuing to train the prediction model based on the obtained annotation result and its corresponding sample image. This can improve annotation efficiency and model prediction accuracy, and facilitates the cold start of the model.

The method and device for predicting information provided by the above-mentioned embodiments of the present disclosure perform prediction operations using a prediction model trained by the method described in any implementation of the first aspect, so that prediction results with higher accuracy can be obtained while the model is in the cold start phase.

Description of the drawings

By reading the detailed description of the non-limiting embodiments with reference to the following drawings, other features, purposes and advantages of the present disclosure will become more apparent:

FIG. 1 is an exemplary system architecture diagram in which some embodiments of the present disclosure can be applied;

FIG. 2 is a flowchart of an embodiment of a method for training a model according to the present disclosure;

FIG. 3 is a flowchart of another embodiment of the method for training a model according to the present disclosure;

FIG. 4 is a flowchart of an embodiment of a method for predicting information according to the present disclosure;

FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for training a model according to the present disclosure;

FIG. 6 is a schematic structural diagram of an embodiment of an apparatus for predicting information according to the present disclosure;

FIG. 7 is a schematic structural diagram of a computer system suitable for implementing an electronic device of some embodiments of the present disclosure.

Detailed description

The present disclosure will be further described in detail below in conjunction with the drawings and embodiments. It can be understood that the specific embodiments described here are only used to explain the relevant disclosure, but not to limit the disclosure. In addition, it should be noted that, for ease of description, only the parts related to the relevant disclosure are shown in the drawings.

It should be noted that the embodiments in the present disclosure and the features in the embodiments can be combined with each other if there is no conflict. Hereinafter, the present disclosure will be described in detail with reference to the drawings and in conjunction with embodiments.

FIG. 1 shows an exemplary system architecture 100 to which embodiments of the method and apparatus for training a model and the method and apparatus for predicting information of the present disclosure can be applied.

As shown in FIG. 1, the system architecture 100 may include terminal devices 101 and 102 and servers 103 and 104. The terminal device 101 may be communicatively connected to the servers 103 and 104. The server 103 and the server 104 may be communicatively connected. The terminal device 102 and the server 104 may be communicatively connected. Here, various types of communication connection may be used between the servers and between a server and a terminal device, such as wired links, wireless communication links, or fiber optic cables.

It should be noted that the terminal device 101 may be, for example, a terminal device used by an annotator. The terminal device 102 may be, for example, a terminal device used by a user who needs predictions. The server 103 may be, for example, a server for performing model training. The server 104 may be, for example, a server for performing prediction operations. In addition, a prediction application supported by the server 104 may be installed on the terminal device 102. The prediction application can be associated with a prediction model trained by the server 103. The prediction model can be used to make predictions on images, and the prediction model can be deployed on the server 104. In addition, the annotation results corresponding to the sample images used by the server 103 when training the prediction model may be produced by the above-mentioned annotator.

It should be pointed out that the terminal devices 101 and 102 may be hardware or software. When the terminal devices 101 and 102 are hardware, they can be various electronic devices, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, and so on. When the terminal devices 101 and 102 are software, they can be installed in the electronic devices listed above, and can be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module. There is no specific limitation here.

The servers 103 and 104 may be hardware or software. When the servers 103 and 104 are hardware, they can be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the servers 103 and 104 are software, they can be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module. There is no specific limitation here.

It should be noted that the method for training a model provided by some embodiments of the present disclosure is generally executed by the server 103, and accordingly, the device for training the model is generally set in the server 103. In addition, the method for predicting information provided by some embodiments of the present disclosure is generally executed by the server 104, and accordingly, the device for predicting information is generally set in the server 104.

It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. According to implementation needs, there can be any number of terminal devices, networks and servers.

Continuing to refer to FIG. 2, it shows a process 200 of an embodiment of the method for training a model according to the present disclosure. The process 200 of the method for training a model includes the following steps:

Step 201: Obtain a sample image set. The sample image set contains a first sample image and a second sample image. The first sample image is a sample image that has a corresponding annotation result, and the second sample image is a sample image that has no corresponding annotation result.

In this embodiment, the execution subject of the method for training the model may be a server (for example, the server 103 shown in FIG. 1). The above-mentioned execution subject may, for example, obtain a pre-generated sample image set locally or from a connected server. The sample image set contains a first sample image and a second sample image. The first sample image is a sample image that has a corresponding annotation result, and the second sample image is a sample image that has no corresponding annotation result. It should be noted that the annotation result corresponding to the first sample image may be a machine annotation result or a manual annotation result, which is not specifically limited here. Preferably, the annotation result corresponding to the first sample image is a manual annotation result.

It should be noted that the sample image may be various sample images, for example, a sample image displaying a designated object. The designated object may be, for example, a person, an animal (for example, a cat or a dog, etc.), a vehicle or a building, and so on. The annotation result can be used to indicate the position of the designated object displayed in the corresponding sample image, for example.

Optionally, each sample image in the sample image set may display the face of the target object, for example. The target object may be a human or an animal, for example. The annotation result can be used to indicate the position of key points (such as eyebrows, eyes, mouth, nose, etc.) of the face displayed in the corresponding sample image, for example.
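The distinction drawn in step 201 can be sketched as a small data structure. This is only an illustration under stated assumptions: the names `SampleImage`, `KeyPoints` and `split_samples`, the image identifiers, and the coordinates are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical type: one (x, y) pixel coordinate per facial key point.
KeyPoints = List[Tuple[float, float]]


@dataclass
class SampleImage:
    image_id: str                           # hypothetical identifier for the image
    annotation: Optional[KeyPoints] = None  # None means no annotation result yet


def split_samples(sample_set):
    """Return (first_samples, second_samples) in the sense of step 201."""
    first = [s for s in sample_set if s.annotation is not None]
    second = [s for s in sample_set if s.annotation is None]
    return first, second


# Made-up sample image set: one annotated image, two without annotation results.
sample_set = [
    SampleImage("img_001", [(30.0, 42.0), (58.0, 41.5)]),
    SampleImage("img_002"),
    SampleImage("img_003"),
]
first_samples, second_samples = split_samples(sample_set)
```

Under this sketch, a second sample image becomes a first sample image as soon as its `annotation` field is filled in.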

Step 202: Train a corresponding prediction model based on the first sample image in the sample image set and its corresponding annotation result.

In this embodiment, the above-mentioned execution subject may train a corresponding prediction model based on the first sample image in the sample image set and its corresponding annotation result. For example, the above-mentioned execution subject may take the first sample image in the sample image set as input, take the annotation result corresponding to the input first sample image as output, and train the prediction model. Specifically, the above-mentioned execution subject may use the first sample image in the sample image set as input and the annotation result corresponding to the input first sample image as output, and train a target initial model to obtain the corresponding prediction model. The target initial model may be an untrained or not fully trained neural network, such as a Convolutional Neural Network (CNN) or a Recurrent Neural Network (RNN).

It should be noted that when the annotation result is used to indicate the position of the designated object displayed in the corresponding sample image, the trained prediction model can be used to locate the designated object displayed in the image.

Optionally, when the labeling result is used to indicate the position of the key points of the face displayed in the corresponding sample image, the prediction model obtained by training can be used to locate the key points of the face.
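The "input as sample, annotation as output" training described in step 202 can be illustrated with a deliberately tiny stand-in. The sketch below assumes a one-feature linear model fitted by stochastic gradient descent in place of the CNN or RNN named above; the function name `train`, the hyperparameters, and the toy data are all hypothetical.

```python
import random


def train(weights, samples, epochs=1000, lr=0.1, seed=0):
    """Fit y ~ w * x + b by stochastic gradient descent on squared error.

    `samples` is a list of (feature, annotation) pairs. Passing weights
    that were already trained continues training from them rather than
    starting over.
    """
    w, b = weights
    rng = random.Random(seed)
    for _ in range(epochs):
        # Visit the samples in a fresh random order each epoch.
        for x, y in rng.sample(samples, len(samples)):
            error = (w * x + b) - y
            w -= lr * error * x
            b -= lr * error
    return w, b


# Toy "first sample images": noise-free points on the line y = 2x + 1.
first_samples = [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0)]
initial_model = (0.0, 0.0)  # stands in for the target initial model
prediction_model = train(initial_model, first_samples)
```

Because `train` takes the current weights as an argument, the same function could be called again with newly annotated samples, which is the sense in which a trained model is "continued" rather than retrained from scratch.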

Step 203: Input at least one second sample image in the sample image set into the trained prediction model to obtain a corresponding prediction result.

In this embodiment, the above-mentioned execution subject may input at least one second sample image in the sample image set into the trained prediction model to obtain a corresponding prediction result.

Step 204: Send at least one second sample image and the prediction result to an annotation terminal used by an annotator.

In this embodiment, the above-mentioned execution subject may send the above-mentioned at least one second sample image and the prediction result to the annotation terminal used by the annotator (for example, the terminal device 101 shown in FIG. 1), so that the annotator can adjust the prediction result and use the adjusted prediction result as the annotation result. Thereafter, the annotator may, for example, send the annotation result to the execution subject through the annotation terminal, or store the annotation result in a designated storage location. The above-mentioned execution subject may have the authority to access the storage location.

It should be noted that by providing the prediction results to the annotators for adjustment, the workload of the annotators can be effectively reduced and their annotation efficiency improved.
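Why adjusting a prediction can take less effort than annotating from scratch can be sketched with a simulated annotator. This is a hedged illustration only: the function `adjust`, the tolerance of 2 pixels, and all coordinates are assumptions made for the example, and a real annotator works by visual inspection rather than against a known ground truth.

```python
def adjust(prediction, truth, tolerance=2.0):
    """Simulated annotator: keep predicted key points that are close enough
    to where they should be, and move only the rest.

    Returns (annotation, number_of_points_moved).
    """
    annotation, moved = [], 0
    for (px, py), (tx, ty) in zip(prediction, truth):
        if abs(px - tx) <= tolerance and abs(py - ty) <= tolerance:
            annotation.append((px, py))   # accepted as predicted
        else:
            annotation.append((tx, ty))   # corrected by the annotator
            moved += 1
    return annotation, moved


# Hypothetical prediction result for one second sample image (step 203).
prediction = [(30.5, 41.0), (70.0, 40.0), (50.0, 61.0)]
# Hypothetical positions the annotator judges to be correct.
truth = [(30.0, 42.0), (58.0, 41.0), (50.0, 60.0)]

annotation, moved = adjust(prediction, truth)
```

In this made-up case the annotator moves one of three points instead of placing all three, and the resulting `annotation` is what would be returned as the annotation result.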

Step 205: Obtain the annotation result obtained by the annotator after adjusting the prediction result, and continue to train the trained prediction model based on the obtained annotation result and its corresponding sample image.

In this embodiment, the above-mentioned execution subject may obtain the annotation result obtained by the annotator after adjusting the prediction result. For example, the annotation result sent by the annotator through the annotation terminal is received, or the annotation result stored by the annotator through the annotation terminal is obtained from the aforementioned storage location. Then, the above-mentioned execution subject may continue to train the prediction model trained in step 202 based on the obtained labeling result and the sample image corresponding to the labeling result, so as to improve the prediction accuracy of the prediction model.

The method provided by the above-mentioned embodiment of the present disclosure obtains a sample image set, trains a corresponding prediction model based on the first sample image in the sample image set and its corresponding annotation result, and then performs the following processing steps: inputting at least one second sample image in the sample image set into the prediction model to obtain a corresponding prediction result; sending the at least one second sample image and the prediction result to the annotation terminal used by the annotator; and obtaining the annotation result produced by the annotator adjusting the prediction result, and continuing to train the prediction model based on the obtained annotation result and its corresponding sample image. This can improve annotation efficiency and model prediction accuracy, and facilitates the cold start of the model.

Continuing to refer to FIG. 3, it shows a process 300 of another embodiment of a method for training a model. The process 300 of the method for training a model includes the following steps:

Step 301: Obtain a sample image set. Each sample image in the sample image set displays the face of a target object. The sample image set contains a first sample image and a second sample image. The first sample image is a sample image that has a corresponding annotation result, the second sample image is a sample image that has no corresponding annotation result, and the annotation result is used to indicate the positions of the key points of the face displayed in the corresponding sample image.

In this embodiment, the execution subject of the method for training the model may be a server (for example, the server 103 shown in FIG. 1). The above-mentioned execution subject may, for example, obtain a pre-generated sample image set locally or from a connected server. The sample image set contains a first sample image and a second sample image. The first sample image is a sample image that has a corresponding annotation result, and the second sample image is a sample image that has no corresponding annotation result. In addition, each sample image in the sample image set displays the face of a target object. The target object can be a human or an animal (for example, a cat or a dog). The annotation result is used to indicate the positions of the key points of the face displayed in the corresponding sample image.

It should be noted that the annotation result corresponding to the first sample image may be a machine annotation result or a manual annotation result, which is not specifically limited here. Preferably, the annotation result corresponding to the first sample image is a manual annotation result.

Step 302: Train a corresponding prediction model based on the first sample image in the sample image set and its corresponding annotation result, and the prediction model is used to locate key facial points.

In this embodiment, the above-mentioned execution subject may train to obtain a corresponding prediction model based on the first sample image in the sample image set and its corresponding annotation result. Among them, the prediction model is used to locate key points on the face.

Here, the above-mentioned execution subject may take the first sample image in the sample image set as input, take the annotation result corresponding to the input first sample image as output, and train the corresponding prediction model. Specifically, the above-mentioned execution subject may use the first sample image in the sample image set as input and the annotation result corresponding to the input first sample image as output, and train a target initial model to obtain the corresponding prediction model. The target initial model can be an untrained or not fully trained neural network, such as a convolutional neural network or a recurrent neural network.

Step 303: Input at least one second sample image in the sample image set into the trained prediction model to obtain a corresponding prediction result.

In this embodiment, the above-mentioned execution subject may input at least one second sample image in the sample image set into the trained prediction model to obtain a corresponding prediction result. The prediction result is used to indicate the positions of the key points of the faces respectively displayed on the at least one second sample image.

Step 304: Send at least one second sample image and the prediction result to the annotation terminal used by the annotator.

In this embodiment, the above-mentioned execution subject may send the above-mentioned at least one second sample image and the prediction result to the annotation terminal used by the annotator (for example, the terminal device 101 shown in FIG. 1), so that the annotator can adjust the prediction result and use the adjusted prediction result as the annotation result. Thereafter, the annotator may, for example, store the annotation result in a designated storage location. The above-mentioned execution subject may have the authority to access the storage location.

Step 305: Obtain, from the designated storage location, the annotation result produced by the annotator adjusting the prediction result, and continue to train the trained prediction model based on the obtained annotation result and its corresponding sample image.

In this embodiment, the above-mentioned execution subject can obtain, from the designated storage location, the annotation result produced by the annotator adjusting the prediction result, and continue to train the trained prediction model based on the obtained annotation result and its corresponding sample image.
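The hand-off through a designated storage location can be sketched as follows. The disclosure does not specify what the storage location is or how results are encoded; this example assumes, purely for illustration, a shared directory holding one JSON file per image, and the function names `store_annotation` and `load_annotations` are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def store_annotation(storage_dir, image_id, points):
    """Annotation-terminal side: write the adjusted result to the storage location."""
    path = Path(storage_dir) / f"{image_id}.json"
    path.write_text(json.dumps({"image_id": image_id, "points": points}))


def load_annotations(storage_dir):
    """Training side: read every annotation result found in the storage location."""
    results = {}
    for path in sorted(Path(storage_dir).glob("*.json")):
        record = json.loads(path.read_text())
        # JSON has no tuple type, so convert each [x, y] list back to a tuple.
        results[record["image_id"]] = [tuple(p) for p in record["points"]]
    return results


# A temporary directory stands in for the designated storage location.
with tempfile.TemporaryDirectory() as storage_dir:
    store_annotation(storage_dir, "img_002", [(31.0, 40.0), (57.5, 41.0)])
    fetched = load_annotations(storage_dir)
```

With this kind of arrangement the training side only needs read access to the location, which matches the remark that the execution subject may have the authority to access it.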

Step 306: Determine whether there is a second sample image in the sample image set.

In this embodiment, after performing step 305, the execution subject can detect whether there is a second sample image in the sample image set. If there is still a second sample image, the above-mentioned execution subject can go to step 303. If there is no second sample image, the above-mentioned execution subject may end the execution of the process 300.
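The control flow of process 300, including the check in step 306, can be sketched as a loop that runs until no second sample image remains. Everything concrete here is an assumption for illustration: `StubModel` is a placeholder that merely remembers annotations, `annotate` simulates the annotator by looking up a known answer, and the batch size of 2 is arbitrary.

```python
class StubModel:
    """Hypothetical placeholder for the prediction model: it remembers every
    annotation it was trained on and predicts the most recent one."""

    def __init__(self):
        self.annotations = {}

    def train(self, annotations):
        # Calling train again continues from the current state.
        self.annotations.update(annotations)

    def predict(self, image_id):
        return list(self.annotations.values())[-1]


def annotate(predictions, truth):
    """Simulated annotator: replaces each prediction with the correct position."""
    return {image_id: truth[image_id] for image_id in predictions}


def run_process(model, first, second_ids, truth, batch_size=2):
    model.train(first)                                       # step 302
    remaining = list(second_ids)
    rounds = 0
    while remaining:                                         # step 306
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        predictions = {i: model.predict(i) for i in batch}   # step 303
        adjusted = annotate(predictions, truth)              # steps 304 and 305
        model.train(adjusted)                                # step 305
        rounds += 1
    return rounds


truth = {"img_002": (31.0, 40.0), "img_003": (29.0, 43.0), "img_004": (33.0, 39.0)}
model = StubModel()
rounds = run_process(model, {"img_001": (30.0, 42.0)}, list(truth), truth)
```

Each pass removes the images it annotated from `remaining`, so the loop terminates once every second sample image has received an annotation result.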

It can be seen from FIG. 3 that, compared with the embodiment corresponding to FIG. 2, the process 300 of the method for training a model in this embodiment can further improve annotation efficiency, and can train a prediction model that has higher prediction accuracy and is used for locating key points of a face.

With further reference to FIG. 4, it shows a process 400 of an embodiment of a method for predicting information. The process 400 of the method for predicting information includes the following steps:

Step 401: Receive an image to be detected.

In this embodiment, the execution subject of the method for predicting information may be a server (for example, the server 104 shown in FIG. 1). The foregoing execution subject may, for example, receive in real time the image to be detected sent by the user through the terminal device (for example, the terminal device 102 shown in FIG. 1).

A prediction model may run on the above-mentioned execution subject. The prediction model may be a model trained using the method described in the embodiment shown in FIG. 2 or FIG. 3. In addition, the prediction model may, for example, be synchronized to the execution subject by the model training terminal (for example, the server 103 shown in FIG. 1) used to train the prediction model.

Step 402: Input the image to be detected into the prediction model to obtain the corresponding prediction result.

In this embodiment, the execution subject may input the image to be detected into the prediction model to obtain the corresponding prediction result. Then, the execution subject can also output the prediction result, for example, to the terminal device that sent the image to be detected, so that the terminal device shows the prediction result to the user.

In some optional implementations of this embodiment, when the prediction model is in the cold start phase, few sample images have been used to train it, so its prediction accuracy generally does not reach the expected prediction accuracy. Therefore, the above-mentioned execution subject can send the image to be detected and the obtained prediction result to the annotation terminal used by the annotator (for example, the terminal device 101 shown in FIG. 1), so that the annotator can adjust the prediction result, use the adjusted prediction result as the annotation result, and send the image to be detected and the annotation result to the model training terminal, so that the model training terminal continues to train the prediction model based on the image to be detected and the annotation result. In addition, after training the prediction model based on the image to be detected and the annotation result, the model training terminal may synchronize the trained prediction model to the execution subject. In this way, the above-mentioned execution subject can obtain prediction results with higher accuracy.

The method provided by the above-mentioned embodiment of the present disclosure uses the prediction model trained by the method described in the embodiment shown in FIG. 2 or FIG. 3 to perform the prediction operation, so that prediction results with higher accuracy can be obtained while the model is in the cold start phase.
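The cold-start feedback path on the prediction side can be sketched as a small service object. This is an illustrative sketch, not the disclosed implementation: the class `PredictionService`, its `cold_start` flag, the `pending_review` list standing in for "sent to the annotation terminal", and the lambda models are all hypothetical.

```python
class PredictionService:
    """Hypothetical prediction-side component (the role of server 104)."""

    def __init__(self, model, cold_start=True):
        self.model = model            # callable: image_id -> prediction result
        self.cold_start = cold_start
        self.pending_review = []      # items that would go to the annotation terminal

    def handle(self, image_id):
        result = self.model(image_id)                        # step 402
        if self.cold_start:
            # During cold start, forward the image and prediction for adjustment.
            self.pending_review.append((image_id, result))
        return result

    def sync_model(self, new_model):
        """Called when the model training terminal synchronizes a newer model."""
        self.model = new_model


# A freshly started model that always predicts the same made-up point.
service = PredictionService(model=lambda image_id: [(0.0, 0.0)])
first_result = service.handle("query_001")

# After the annotator's correction has been used for further training,
# the training side pushes an updated model (here another stand-in).
service.sync_model(lambda image_id: [(30.0, 42.0)])
second_result = service.handle("query_002")
```

Keeping the model behind `sync_model` means the service can keep answering requests with the old model until a newly trained one arrives.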

With further reference to FIG. 5, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of a device for training a model. The device embodiment corresponds to the method embodiment shown in FIG. The device can be specifically applied to various electronic devices.

As shown in FIG. 5, the apparatus 500 for training a model of this embodiment may include: an acquiring unit 501 configured to acquire a sample image set, where a first sample image and a second sample image exist in the sample image set, the first sample image is a sample image that has a corresponding annotation result, and the second sample image is a sample image that has no corresponding annotation result; a training unit 502 configured to train a corresponding prediction model based on the first sample image in the sample image set and its corresponding annotation result; and a processing unit 503 configured to perform the following processing steps: input at least one second sample image in the sample image set into the trained prediction model to obtain the corresponding prediction result; send the at least one second sample image and the prediction result to the annotation terminal used by the annotator; obtain the annotation result obtained by the annotator after adjusting the prediction result, and continue to train the trained prediction model based on the obtained annotation result and its corresponding sample image.
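As a rough illustration of how the three units divide the work, the following Python sketch maps each unit to one function. The names `Sample`, `MeanModel`, `acquire`, `train` and `process` are hypothetical, the scalar annotation and mean-predicting model are toy assumptions, and the `annotate` callback stands in for the round trip to the annotation terminal.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Sample:
    image_id: str
    annotation: Optional[float] = None  # None marks a second (unannotated) sample image


class MeanModel:
    """Toy model (an assumption): predicts the mean of the annotations it has seen."""

    def __init__(self) -> None:
        self.seen: List[float] = []

    def fit(self, samples: List[Sample]) -> None:
        self.seen.extend(s.annotation for s in samples if s.annotation is not None)

    def predict(self, sample: Sample) -> float:
        if not self.seen:
            raise ValueError("model has not been trained yet")
        return sum(self.seen) / len(self.seen)


def acquire(sample_set: List[Sample]) -> Tuple[List[Sample], List[Sample]]:
    """Acquiring unit 501: split the set into first and second sample images."""
    first = [s for s in sample_set if s.annotation is not None]
    second = [s for s in sample_set if s.annotation is None]
    return first, second


def train(first: List[Sample]) -> MeanModel:
    """Training unit 502: train an initial model on the annotated samples."""
    model = MeanModel()
    model.fit(first)
    return model


def process(model: MeanModel, batch: List[Sample],
            annotate: Callable[[Dict[str, float]], Dict[str, float]]) -> Dict[str, float]:
    """Processing unit 503: predict, have the annotator adjust, continue training."""
    predictions = {s.image_id: model.predict(s) for s in batch}
    adjusted = annotate(predictions)  # annotator adjusts the machine predictions
    for s in batch:
        s.annotation = adjusted[s.image_id]
    model.fit(batch)
    return predictions
```

The annotator only corrects an existing prediction rather than annotating from scratch, which is the source of the efficiency gain the embodiment describes.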

In this embodiment, for the specific processing of the acquiring unit 501, the training unit 502, and the processing unit 503 in the apparatus 500 for training a model, and the technical effects thereof, reference may be made to the related descriptions of step 202, step 203, step 204 and step 205 in the embodiment shown in FIG. 2, which will not be repeated here.

In some optional implementations of this embodiment, the processing unit 503 may be further configured to: after continuing to train the trained prediction model based on the obtained annotation result and its corresponding sample image, continue to perform the above processing steps if a second sample image still exists in the sample image set.
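The repetition condition can be sketched as a simple loop that keeps running the processing steps while unannotated images remain. The function name `run_processing_rounds`, the `batch_size` parameter and the `processing_steps` callback are assumptions made for illustration; the disclosure does not fix how many second sample images are handled per round.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def run_processing_rounds(unannotated: Sequence[T], batch_size: int,
                          processing_steps: Callable[[List[T]], None]) -> int:
    """Repeat the processing steps until no second sample image remains.

    Returns the number of rounds that were executed.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    remaining = list(unannotated)
    rounds = 0
    while remaining:  # a second sample image still exists in the set
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        processing_steps(batch)  # predict, send to annotator, continue training
        rounds += 1
    return rounds
```

Because the model is retrained after every round, later batches would typically be predicted by a better model and need fewer manual adjustments.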

In some optional implementations of this embodiment, the processing unit 503 may be further configured to: receive the annotation result, returned by the annotation terminal, that the annotator obtained by adjusting the prediction result.

In some optional implementations of this embodiment, the annotation terminal may be used to store the annotation result obtained by the annotator after adjusting the prediction result in a designated storage location; and the processing unit 503 may be further configured to: obtain, from the designated storage location, the annotation result obtained by the annotator after adjusting the prediction result.
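One possible reading of the "designated storage location" variant is a shared directory to which the annotation terminal writes results and from which the processing unit reads them. The sketch below assumes one JSON file per image named after a hypothetical `image_id`; the file layout, the function names and the key-point payload are illustrative assumptions, not details given in the disclosure.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, List


def store_annotation(storage_dir: Path, image_id: str,
                     keypoints: List[List[float]]) -> Path:
    """Annotation terminal side: write one adjusted result to the designated location."""
    path = storage_dir / f"{image_id}.json"
    payload = {"image_id": image_id, "keypoints": keypoints}
    path.write_text(json.dumps(payload), encoding="utf-8")
    return path


def fetch_annotations(storage_dir: Path,
                      image_ids: Iterable[str]) -> Dict[str, List[List[float]]]:
    """Processing unit side: read whichever requested results are already stored."""
    results: Dict[str, List[List[float]]] = {}
    for image_id in image_ids:
        path = storage_dir / f"{image_id}.json"
        if path.exists():  # images not yet annotated are simply skipped
            data = json.loads(path.read_text(encoding="utf-8"))
            results[image_id] = data["keypoints"]
    return results
```

Compared with receiving results pushed back by the annotation terminal, this pull-style arrangement lets the training side collect results whenever it is ready to continue training.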

In some optional implementations of this embodiment, each sample image in the sample image set may show the face of a target object, the target object may be a human or an animal, the annotation result may be used to indicate the positions of the key points of the face shown in the corresponding sample image, and the trained prediction model may be used to locate the key points of a face.
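For the facial key point case, an annotation result can be pictured as a mapping from key point names to image coordinates, with the annotator overriding only the points that the model placed badly. The following sketch assumes hypothetical key point names and two helper functions, `apply_adjustments` and `mean_point_error`; neither the names nor the error measure comes from the disclosure.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]
Keypoints = Dict[str, Point]


def apply_adjustments(predicted: Keypoints, corrections: Keypoints) -> Keypoints:
    """Merge the annotator's corrections into the predicted key points."""
    unknown = set(corrections) - set(predicted)
    if unknown:
        raise KeyError(f"unknown key points: {sorted(unknown)}")
    return {**predicted, **corrections}


def mean_point_error(predicted: Keypoints, annotated: Keypoints) -> float:
    """Average Euclidean distance between predicted and annotated positions."""
    total = sum(math.dist(predicted[name], annotated[name]) for name in annotated)
    return total / len(annotated)


predicted = {"left_eye": (30.0, 40.0), "right_eye": (70.0, 40.0), "nose_tip": (50.0, 60.0)}
annotated = apply_adjustments(predicted, {"nose_tip": (50.0, 63.0)})
print(mean_point_error(predicted, annotated))  # 1.0: only the nose tip moved, by 3 pixels
```

A falling mean error over successive rounds would indicate that the annotator needs to move fewer points as training continues.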

The device provided by the above embodiment of the present disclosure obtains a sample image set, trains a corresponding prediction model based on the first sample image in the sample image set and its corresponding annotation result, and then executes the following processing steps: inputting at least one second sample image in the sample image set into the prediction model to obtain the corresponding prediction result; sending the at least one second sample image and the prediction result to the annotation terminal used by the annotator; and obtaining the annotation result obtained after the annotator adjusts the prediction result, and continuing to train the prediction model based on the obtained annotation result and its corresponding sample image. This can improve annotation efficiency and model prediction accuracy, and facilitates the cold start of the model.

With further reference to FIG. 6, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of a device for predicting information. The device embodiment corresponds to the method embodiment shown in FIG. 4. The device can be specifically applied to various electronic devices.

As shown in FIG. 6, the apparatus 600 for predicting information of this embodiment may include: a receiving unit 601 configured to receive the image to be detected; and a prediction unit 602 configured to input the image to be detected into the prediction model (for example, a prediction model trained by the method described in the embodiment shown in FIG. 2 or FIG. 3) to obtain the corresponding prediction result.

In this embodiment, for the specific processing of the receiving unit 601 and the prediction unit 602 in the apparatus 600 for predicting information, and the technical effects thereof, reference may be made to the related descriptions of step 401 and step 402 in the embodiment shown in FIG. 4, respectively, which will not be repeated here.

In some optional implementations of this embodiment, the apparatus 600 may further include: a sending unit (not shown in the figure) configured to send the image to be detected and the prediction result to the annotation terminal used by the annotator, so that the annotator can adjust the prediction result, use the adjusted prediction result as the annotation result, and send the image to be detected and the annotation result to the model training terminal used to train the prediction model, so that the model training terminal continues to train the prediction model based on the image to be detected and the annotation result.

The device provided by the above embodiment of the present disclosure performs the prediction operation using the prediction model trained by the method described in the embodiment shown in FIG. 2 or FIG. 3, and can obtain prediction results of relatively high accuracy even when the model is in the cold start phase.

Referring now to FIG. 7, it shows a schematic structural diagram of an electronic device (for example, the servers 103 and 104 in FIG. 1) 700 suitable for implementing the embodiments of the present disclosure. The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablets), PMPs (portable multimedia players), and vehicle-mounted terminals (for example, car navigation terminals), as well as fixed terminals such as digital TVs and desktop computers. The electronic device shown in FIG. 7 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.

As shown in FIG. 7, the electronic device 700 may include a processing device (such as a central processing unit, a graphics processor, etc.) 701, which may execute various appropriate actions and processing according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage device 708 into a random access memory (RAM) 703. The RAM 703 also stores various programs and data required for the operation of the electronic device 700. The processing device 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.

Generally, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output devices 707 including, for example, a liquid crystal display (LCD), speaker, and vibrator; a storage device 708 such as a hard disk; and a communication device 709. The communication device 709 may allow the electronic device 700 to perform wireless or wired communication with other devices to exchange data. Although FIG. 7 shows an electronic device 700 having various devices, it should be understood that it is not required to implement or have all the illustrated devices. More or fewer devices may alternatively be implemented or provided. Each block shown in FIG. 7 may represent one device, or may represent multiple devices as needed.

In particular, according to an embodiment of the present disclosure, the process described above with reference to the flowchart can be implemented as a computer software program. For example, the embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program contains program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from the network through the communication device 709, or installed from the storage device 708, or installed from the ROM 702. When the computer program is executed by the processing device 701, the above-mentioned functions defined in the method of the embodiment of the present disclosure are executed.

It should be noted that the computer-readable medium described in the embodiment of the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above. In the embodiments of the present disclosure, the computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device. In the embodiments of the present disclosure, the computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, and a computer-readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. The computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium. The computer-readable signal medium may send, propagate, or transmit the program for use by or in combination with the instruction execution system, apparatus, or device.
The program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: wire, optical cable, RF (Radio Frequency), etc., or any suitable combination of the above.

The above-mentioned computer-readable medium may be included in the above-mentioned electronic device, or it may exist alone without being assembled into the electronic device. The above-mentioned computer-readable medium carries one or more programs. When the one or more programs are executed by the electronic device, the electronic device is caused to: acquire a sample image set, where a first sample image and a second sample image exist in the sample image set, the first sample image is a sample image that has a corresponding annotation result, and the second sample image is a sample image that has no corresponding annotation result; train a corresponding prediction model based on the first sample image in the sample image set and its corresponding annotation result; and perform the following processing steps: input at least one second sample image in the sample image set into the trained prediction model to obtain the corresponding prediction result; send the at least one second sample image and the prediction result to the annotation terminal used by the annotator; obtain the annotation result obtained after the annotator adjusts the prediction result, and continue to train the trained prediction model based on the obtained annotation result and its corresponding sample image. Alternatively, the electronic device may be caused to: receive the image to be detected associated with the above prediction model; and input the image to be detected into the prediction model to obtain the corresponding prediction result.

The computer program code for performing the operations of the embodiments of the present disclosure can be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code can be executed entirely on the user's computer, partly on the user's computer, as an independent software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the accompanying drawings illustrate the possible architecture, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or part of code, which contains one or more executable instructions for realizing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from the order marked in the drawings. For example, two blocks shown in succession may actually be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and any combination of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The units involved in the embodiments described in the present disclosure may be implemented in software or in hardware. The name of a unit does not, under certain circumstances, constitute a limitation on the unit itself. For example, the acquiring unit can also be described as a "unit for acquiring a sample image set".

The above description is only a preferred embodiment of the present disclosure and an explanation of the applied technical principles. Those skilled in the art should understand that the scope of disclosure involved in this disclosure is not limited to the technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above disclosed concept, for example, a technical solution formed by replacing the above features with technical features having similar functions that are disclosed in (but not limited to) the present disclosure.

Claims

A method for training models, including:

Acquiring a sample image set, where a first sample image and a second sample image exist in the sample image set, the first sample image is a sample image corresponding to the annotation result, and the second sample image is a sample image that does not correspond to the annotation result;

Training to obtain a corresponding prediction model based on the first sample image in the sample image set and its corresponding annotation result;

Performing the following processing steps: inputting at least one second sample image in the sample image set into the trained prediction model to obtain the corresponding prediction result; sending the at least one second sample image and the prediction result to the annotation terminal used by the annotator; and obtaining the annotation result obtained by the annotator after adjusting the prediction result, and continuing to train the trained prediction model based on the obtained annotation result and its corresponding sample image.
The method according to claim 1, wherein, after continuing to train the trained prediction model based on the obtained annotation result and its corresponding sample image, the method further comprises:

If there is a second sample image in the sample image set, continue to perform the processing steps.
The method according to claim 1, wherein said obtaining the annotation result obtained by the annotator after adjusting the prediction result comprises:

Receiving the annotation result, returned by the annotation terminal, that the annotator obtained by adjusting the prediction result.
The method according to claim 1, wherein the labeling terminal is used to store the labeling result obtained by the labeling staff after adjusting the prediction result to a designated storage location; and

The obtaining the annotation result obtained by the annotator after adjusting the prediction result includes:

Obtain the labeling result obtained by the labeling personnel after adjusting the prediction result from the designated storage location.
The method according to any one of claims 1 to 4, wherein each sample image in the sample image set shows the face of a target object, the target object is a human or an animal, the annotation result is used to indicate the positions of the key points of the face shown in the corresponding sample image, and the trained prediction model is used to locate the key points of a face.
A method for predicting information, including:

Receive the image to be detected;

The image to be detected is input into the prediction model trained by the method according to any one of claims 1-5 to obtain the corresponding prediction result.
The method according to claim 6, wherein the method further comprises:

Sending the image to be detected and the prediction result to the annotation terminal used by the annotator, so that the annotator can adjust the prediction result, use the adjusted prediction result as the annotation result, and send the image to be detected and the annotation result to a model training terminal for training the prediction model, so that the model training terminal continues to train the prediction model based on the image to be detected and the annotation result.
A device for training a model, including:

The acquiring unit is configured to acquire a sample image set, in which a first sample image and a second sample image exist, the first sample image being a sample image that has a corresponding annotation result, and the second sample image being a sample image that has no corresponding annotation result;

The training unit is configured to train to obtain a corresponding prediction model based on the first sample image in the sample image set and its corresponding annotation result;

The processing unit is configured to perform the following processing steps: input at least one second sample image in the sample image set into the trained prediction model to obtain the corresponding prediction result; send the at least one second sample image and the prediction result to the annotation terminal used by the annotator; obtain the annotation result obtained by the annotator after adjusting the prediction result, and continue to train the trained prediction model based on the obtained annotation result and its corresponding sample image.
The apparatus according to claim 8, wherein the processing unit is further configured to:

After continuing to train the trained prediction model based on the obtained annotation result and its corresponding sample image, continue to perform the processing steps if a second sample image exists in the sample image set.
The apparatus according to claim 8, wherein the processing unit is further configured to:

Receive the annotation result, returned by the annotation terminal, that the annotator obtained by adjusting the prediction result.
The device according to claim 8, wherein the labeling terminal is used to store the labeling result obtained by the labeling personnel after adjusting the prediction result to a designated storage location; and

The processing unit is further configured to:

Obtain the labeling result obtained by the labeling personnel after adjusting the prediction result from the designated storage location.
The device according to any one of claims 8-11, wherein each sample image in the sample image set shows the face of a target object, the target object is a human or an animal, the annotation result is used to indicate the positions of the key points of the face shown in the corresponding sample image, and the trained prediction model is used to locate the key points of a face.
A device for predicting information, including:

The receiving unit is configured to receive the image to be detected;

The prediction unit is configured to input the image to be detected into a prediction model trained by the method according to any one of claims 1 to 5 to obtain a corresponding prediction result.
The device according to claim 13, wherein the device further comprises:

The sending unit is configured to send the image to be detected and the prediction result to the annotation terminal used by the annotator, so that the annotator can adjust the prediction result, use the adjusted prediction result as the annotation result, and send the image to be detected and the annotation result to the model training terminal for training the prediction model, so that the model training terminal continues to train the prediction model based on the image to be detected and the annotation result.
An electronic device including:

One or more processors;

A storage device on which one or more programs are stored,

When the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-7.
A computer-readable medium having a computer program stored thereon, wherein the program is executed by a processor to implement the method according to any one of claims 1-7.