CN112861892A - Method and device for determining attributes of targets in pictures - Google Patents

Info

Publication number: CN112861892A
Application number: CN201911178972.9A
Authority: CN (China)
Inventor: 祝勇义
Applicant/Assignee: Hangzhou Hikvision Digital Technology Co Ltd
Legal status: Granted; Active
Granted publication: CN112861892B (in Chinese)
Original language: Chinese (zh)
Prior art keywords: machine learning model, sample set, labels, target

Classifications

    • G06F 18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/40 — Software arrangements specially adapted for pattern recognition, e.g. user interfaces or toolboxes therefor
    • G06N 20/00 — Machine learning
    • G06N 3/045 — Combinations of networks
    • G06N 3/08 — Learning methods

Abstract

An embodiment of the present application provides a method and a device for determining attributes of targets in pictures. The method includes: obtaining labels for a plurality of test samples in a first test sample set using a first machine learning model, and adding the test samples whose label confidence exceeds a preset value, together with their labels, to the first labeled sample set to obtain a second labeled sample set, where the first machine learning model is trained based on the first labeled sample set; and obtaining a target machine learning model according to the second labeled sample set and a second machine learning model, where the target machine learning model is used to determine the attribute of a target in a picture. With the method and device, a user does not need to label a large number of training samples, and the efficiency of determining the attributes of targets in pictures is improved.

Description

Method and device for determining attributes of targets in pictures
Technical Field
Embodiments of the present application relate to computer technologies and, in particular, to a method and an apparatus for determining an attribute of a target in a picture.
Background
Machine learning is a technique that simulates or implements human learning behavior to acquire new knowledge or skills. By training process, machine learning models can be divided into supervised machine learning and unsupervised machine learning.
Supervised machine learning requires training samples to be labeled in advance to obtain labeled samples. Obtaining a high-precision machine learning model therefore requires a large number of labeled samples, which makes obtaining such a model inefficient and, in turn, makes determining the attributes of targets in pictures with the model inefficient.
Disclosure of Invention
Embodiments of the present application provide a method and a device for determining the attributes of targets in pictures, which improve the efficiency of determining those attributes with a machine learning model.
In a first aspect, an embodiment of the present application provides a method for determining an attribute of a target in a picture, including: obtaining labels for a plurality of test samples in a first test sample set using a first machine learning model, and adding the test samples whose label confidence exceeds a preset value, together with their labels, to the first labeled sample set to obtain a second labeled sample set, where the first machine learning model is trained based on the first labeled sample set; and obtaining a target machine learning model according to the second labeled sample set and a second machine learning model, where the target machine learning model is used to determine the attribute of a target in a picture.
In this scheme, when obtaining the target machine learning model, the number of labeled samples is gradually expanded based on the intermediate machine learning models already obtained. The user does not need to label a large number of training samples, which saves manpower and material resources, improves the efficiency of obtaining the target machine learning model, and thereby improves the efficiency of determining the attributes of targets in pictures with a machine learning model.
In one possible design, the second machine learning model has more layers than the first machine learning model; and/or the layers of the second machine learning model include at least one first layer that contains more neurons than the corresponding layer of the first machine learning model.
In this scheme, the complexity of the machine learning model is increased after the labeled set is expanded, so the complexity can be strengthened gradually. This solves the current problem that a high-precision machine learning model cannot be obtained when few manually labeled samples are available: a high-precision model can be obtained on the basis of relatively few manually labeled samples.
In one possible design, before obtaining the target machine learning model according to the second labeled sample set and the second machine learning model, the method further includes: performing transfer learning on the first machine learning model to obtain the second machine learning model.
This scheme provides a specific implementation for obtaining the second machine learning model and can improve the efficiency of obtaining the target machine learning model.
In one possible design, obtaining the target machine learning model according to the second labeled sample set and the second machine learning model includes: training a third machine learning model based on the second labeled sample set and the second machine learning model; and judging whether the precision of the third machine learning model reaches a preset precision. If so, the third machine learning model is taken as the target machine learning model; otherwise, the procedure returns to the step of obtaining labels for a plurality of test samples in a second test sample set using the third machine learning model and adding the test samples whose label confidence exceeds the preset value, together with their labels, to the second labeled sample set to obtain a third labeled sample set, until a target machine learning model whose precision is greater than or equal to the preset precision is determined.
This scheme provides a concrete implementation for obtaining a target machine learning model whose precision is greater than or equal to the preset precision, which in turn can improve the precision of identifying the attributes of targets in pictures.
In one possible design, obtaining the target machine learning model according to the second labeled sample set and the second machine learning model includes: training a third machine learning model based on the second labeled sample set and the second machine learning model; and judging whether the number of iterations reaches a preset number. If so, the third machine learning model is taken as the target machine learning model; otherwise, the procedure returns to the step of obtaining labels for a plurality of test samples in the second test sample set using the third machine learning model and adding the test samples whose label confidence exceeds the preset value, together with their labels, to the second labeled sample set to obtain a third labeled sample set, until the preset number of iterations is reached and the target machine learning model is obtained.
This scheme provides another specific implementation for obtaining the target machine learning model; it adapts to the size of the available test sample sets and can obtain the target machine learning model efficiently.
In one possible design, before obtaining the labels of the plurality of test samples in the first test sample set using the first machine learning model, the method further includes: receiving the first labeled sample set; and training the first machine learning model based on the first labeled sample set.
In one possible design, the target machine learning model can be loaded into a device to determine the attributes of targets in pictures.
In one possible design, the device includes any one of: a server, a terminal device, a network video recorder (NVR), an image processor (ISP), and a graphics processor (GPU).
In a second aspect, an embodiment of the present application provides an apparatus for determining an attribute of a target in a picture, including a processing module configured to: obtain labels for a plurality of test samples in a first test sample set using a first machine learning model, and add the test samples whose label confidence exceeds a preset value, together with their labels, to the first labeled sample set to obtain a second labeled sample set, where the first machine learning model is trained based on the first labeled sample set. The processing module is further configured to: obtain a target machine learning model according to the second labeled sample set and a second machine learning model, where the target machine learning model is used to determine the attribute of a target in a picture.
In one possible design, the second machine learning model has more layers than the first machine learning model; and/or the layers of the second machine learning model include at least one first layer that contains more neurons than the corresponding layer of the first machine learning model.
In one possible design, the processing module is further configured to perform transfer learning on the first machine learning model to obtain the second machine learning model.
In one possible design, the processing module is specifically configured to: train a third machine learning model based on the second labeled sample set and the second machine learning model; and judge whether the precision of the third machine learning model reaches a preset precision. If so, the third machine learning model is taken as the target machine learning model; otherwise, the procedure returns to the step of obtaining labels for a plurality of test samples in a second test sample set using the third machine learning model and adding the test samples whose label confidence exceeds the preset value, together with their labels, to the second labeled sample set to obtain a third labeled sample set, until a target machine learning model whose precision is greater than or equal to the preset precision is determined.
In one possible design, the processing module is specifically configured to: train a third machine learning model based on the second labeled sample set and the second machine learning model; and judge whether the number of iterations reaches a preset number. If so, the third machine learning model is taken as the target machine learning model; otherwise, the procedure returns to the step of obtaining labels for a plurality of test samples in the second test sample set using the third machine learning model and adding the test samples whose label confidence exceeds the preset value, together with their labels, to the second labeled sample set to obtain a third labeled sample set, until the preset number of iterations is reached and the target machine learning model is obtained.
In one possible design, the processing module is further configured to receive the first labeled sample set before obtaining the labels of the plurality of test samples in the first test sample set using the first machine learning model, and to train the first machine learning model based on the first labeled sample set.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor and a memory; the memory is configured to store computer-executable instructions for causing the processor to execute the computer-executable instructions to implement the method of the first aspect or any possible design of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer storage medium, including: computer-executable instructions for implementing the method of the first aspect or any of the possible designs of the first aspect.
The method for determining the attributes of targets in pictures includes: obtaining labels for a plurality of test samples in a first test sample set using a first machine learning model, and adding the test samples whose label confidence exceeds a preset value, together with their labels, to the first labeled sample set to obtain a second labeled sample set, where the first machine learning model is trained based on the first labeled sample set; and obtaining a target machine learning model according to the second labeled sample set and a second machine learning model, where the target machine learning model is used to determine the attribute of a target in a picture. When obtaining the target machine learning model, the number of labeled samples is gradually expanded based on the intermediate machine learning models already obtained. The user does not need to label a large number of training samples, which saves manpower and material resources, improves the efficiency of obtaining the target machine learning model, and thereby improves the efficiency of determining the attributes of targets in pictures with a machine learning model.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to these drawings without inventive exercise.
FIG. 1 is a diagram of a system architecture provided by an embodiment of the present application;
FIG. 2 is a first flowchart of a method for determining an attribute of a target in a picture according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a complexity upgrade of a machine learning model according to an embodiment of the present application;
FIG. 4 is a schematic diagram of obtaining a target machine learning model according to an embodiment of the present application;
FIG. 5 is a second flowchart of the method for determining an attribute of a target in a picture according to an embodiment of the present application;
FIG. 6 is a third flowchart of the method for determining an attribute of a target in a picture according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an apparatus for determining an attribute of a target in a picture according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the present application, "at least one" means one or more, "a plurality" means two or more. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone, wherein A and B can be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "at least one of the following" or similar expressions refer to any combination of these items, including any combination of the singular or plural items. For example, at least one (one) of a, b, or c, may represent: a, b, c, a-b, a-c, b-c, or a-b-c, wherein a, b, c may be single or multiple. The terms "first," "second," and the like in this application are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
In machine learning techniques, a server may train a machine learning model based on training samples. In one mode, the server sends the trained machine learning model to a terminal device; after the terminal device receives a picture to be recognized, it identifies the attribute of the target in the picture using the machine learning model. In another mode, after receiving a picture to be recognized, the terminal device sends it to the server; the server identifies the attribute of the target in the picture using the trained machine learning model and returns the recognition result to the terminal device. The system architecture involved in both modes may be as shown in fig. 1. The terminal device may be a computer, a mobile phone, a camera, a vehicle-mounted driver-assistance device, or the like.
The machine learning model in the embodiments of the present application includes, but is not limited to, a neural network learning model.
The following describes a method for determining an attribute of an object in a picture according to the present application with reference to a specific embodiment.
Fig. 2 is a first flowchart of a method for determining attributes of a target in a picture according to an embodiment of the present application. The execution subject of this embodiment may be a device for determining attributes of a target in a picture; the device may be implemented by hardware or software and may be part or all of a server. As shown in fig. 2, the method of this embodiment may include:
Step S201: obtain labels for a plurality of test samples in a first test sample set using a first machine learning model, and add the test samples whose label confidence exceeds a preset value, together with their labels, to the first labeled sample set to obtain a second labeled sample set; the first machine learning model is trained based on the first labeled sample set.
In this embodiment, the label of the training sample may represent an attribute of the target in the picture corresponding to the training sample, for example, if the training sample is a picture of yam, the label of the training sample may represent that the target in the picture is "yam". Alternatively, the tag may be a vector or a string.
The labeled sample set includes a plurality of labeled training samples (the training sample labeled with a label may be referred to as a labeled sample). The test sample set in this embodiment includes a plurality of unlabeled test samples.
This step is explained in detail below.
After the first machine learning model is trained based on the first labeled sample set, the labels of a plurality of test samples in the first test sample set can be obtained with the first machine learning model, and the test samples whose label confidence exceeds the preset value, together with their labels, are added to the first labeled sample set to obtain the second labeled sample set.
That is, each test sample in the first test sample set is used as an input to the first machine learning model, and its label is obtained from the output; if the confidence of the label exceeds the preset value, the test sample and its label are added to the first labeled sample set, thereby expanding the labeled sample set.
The confidence of a label means the degree of accuracy or reliability with which the attribute of the target in the picture corresponding to the test sample is the attribute indicated by the label obtained from the machine learning model. In one approach, the confidence of a label is determined by the largest component of the vector corresponding to the label. For example, if label 1 of test sample 1 in the first test sample set is (0, 0.04, 0.1, 0.85), whose largest component is 0.85, the confidence of label 1 may be 85%. Alternatively, the confidence of the label may be obtained by conventional confidence-estimation methods, which are not described here again. Optionally, the preset value may be any value in the interval [80%, 95%].
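As an illustration, the max-component confidence rule above can be sketched in Python. This is our own sketch: the function name, the default preset value, and the use of NumPy are assumptions, not part of the application.

```python
import numpy as np

def pseudo_label(probs, preset_value=0.85):
    """Return (label_index, confidence) when the largest component of
    the model's output vector exceeds the preset value, else None.
    Confidence is the largest component, as in the example above."""
    confidence = float(np.max(probs))
    if confidence > preset_value:
        return int(np.argmax(probs)), confidence
    return None

# The example vector: label 1 of test sample 1 = (0, 0.04, 0.1, 0.85).
result = pseudo_label(np.array([0.0, 0.04, 0.1, 0.85]), preset_value=0.80)
# result -> (3, 0.85): the sample is kept, since 85% exceeds the 80% preset value
```

A sample whose largest component stays at or below the preset value would return `None` and remain in the unlabeled pool.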
For the first labeled sample set: in one scheme, the first labeled sample set may be received by the device for determining attributes of targets in pictures from another apparatus, or input by a user to that device; that is, the first labeled sample set is the initial labeled sample set. In other words, before obtaining the labels of the plurality of test samples in the first test sample set using the first machine learning model, the method further includes: receiving the first labeled sample set. Accordingly, the first machine learning model is the first intermediate machine learning model obtained in the process of obtaining the target machine learning model, and the target machine learning model is the machine learning model finally obtained for determining the attribute of the target in the picture.
In another scheme, the first labeled sample set is an intermediate labeled sample set, i.e., a labeled sample set that has already been expanded by steps similar to step S201. For example, the first labeled sample set may be obtained as follows: obtain labels for a plurality of test samples in an earlier test sample set using an earlier machine learning model, and add the test samples whose label confidence exceeds the preset value, together with their labels, to an earlier labeled sample set to obtain the first labeled sample set, where that earlier machine learning model is trained based on that earlier labeled sample set. It is understood that the labeled sample set at any given point may be either the initial labeled sample set or an expanded one.
Step S202: obtain a target machine learning model according to the second labeled sample set and the second machine learning model; the target machine learning model is used to determine the attributes of targets in pictures.
For the second machine learning model: optionally, the second machine learning model has a higher complexity than the first machine learning model. For example: the second machine learning model has more layers than the first machine learning model; and/or the layers of the second machine learning model include at least one first layer that contains more neurons than the corresponding layer of the first machine learning model.
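The two complexity conditions (deeper, and/or at least one wider layer) can be captured in a minimal sketch. Representing a model as a list of per-layer neuron counts is our illustrative encoding, not from the application.

```python
def is_more_complex(second_layers, first_layers):
    """True if the second model has more layers than the first, and/or
    at least one layer of the second model contains more neurons than
    the corresponding layer of the first model."""
    deeper = len(second_layers) > len(first_layers)
    wider = any(s > f for f, s in zip(first_layers, second_layers))
    return deeper or wider

# A first model with layer widths 4/8/3 versus two candidate upgrades:
wider_hidden = is_more_complex([4, 16, 3], [4, 8, 3])    # True: wider hidden layer
extra_layers = is_more_complex([4, 8, 8, 3], [4, 8, 3])  # True: more layers
```

Either condition alone suffices, matching the "and/or" in the design above.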
In one approach, transfer learning may be performed on the first machine learning model to obtain the second machine learning model. For example: as shown in fig. 3(a), the first machine learning model is a 3-layer machine learning model, and as shown in fig. 3(b), the second machine learning model is a 5-layer machine learning model. That is, two copies of the first machine learning model are combined into one second machine learning model, with the output layer of the first copy deleted. The parameters (filters, connection weights, etc.) between the second layer (the hidden layer) of the first copy and the input layer of the second copy are reinitialized, while the parameters between the remaining layers retain the values of the corresponding layers of the first machine learning model. This approach can improve the efficiency of obtaining the target machine learning model.
In another approach, the second machine learning model may be a model that is independent of, but more complex than, the first machine learning model; in this case the parameters between the layers of the second machine learning model are all initialized afresh.
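The fig. 3 layer-reuse upgrade can be sketched as follows, representing a model as a list of weight matrices. The helper name and shapes are our assumptions; the point is only which weights are reused and which are freshly initialized.

```python
import numpy as np

rng = np.random.default_rng(0)

def grow_model(model):
    """Build the 5-layer model of fig. 3(b) from the 3-layer model of
    fig. 3(a): chain two copies of `model` (a list of weight matrices),
    drop the first copy's output-layer weights, re-initialize only the
    hidden-to-input bridge, and reuse every other matrix unchanged."""
    front = model[:-1]                    # copy 1 without its output layer
    hidden_width = model[-2].shape[1]     # width of copy 1's hidden layer
    input_width = model[0].shape[0]       # input width of copy 2
    bridge = rng.standard_normal((hidden_width, input_width))  # re-initialized
    return front + [bridge] + list(model)  # reused + new + reused

# 3-layer model: input(4) -> hidden(8) -> output(3), i.e. two weight matrices.
small = [rng.standard_normal((4, 8)), rng.standard_normal((8, 3))]
big = grow_model(small)
# big has 4 weight matrices, i.e. a 5-layer network; big[2] and big[3]
# are the very same arrays as small[0] and small[1] (parameters retained).
```

Only the bridging matrix is new; retraining then starts from mostly pre-trained weights, which is why this can be faster than training the larger model from scratch.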
According to the second labeled sample set and the second machine learning model, the target machine learning model can be obtained in, but not limited to, the following three implementations.
In one implementation, obtaining the target machine learning model according to the second labeled sample set and the second machine learning model includes:
Step a1: train a third machine learning model based on the second machine learning model and the second labeled sample set.
That is, the second machine learning model is used as the initial machine learning model, and the third machine learning model is obtained through training on the second labeled sample set.
Step a2: judge whether the precision of the third machine learning model reaches a preset precision. If so, take the third machine learning model as the target machine learning model; if not, return to the step of obtaining labels for a plurality of test samples in a second test sample set using the third machine learning model and adding the test samples whose label confidence exceeds the preset value, together with their labels, to the second labeled sample set to obtain a third labeled sample set, until a target machine learning model whose precision is greater than or equal to the preset precision is determined.
That is, if the precision of the third machine learning model does not reach the preset precision, the third machine learning model is used as a new first machine learning model, the second test sample set as a new first test sample set, the second labeled sample set as a new first labeled sample set, and a fourth machine learning model as a new second machine learning model, and steps S201, a1, and a2 are repeated until a target machine learning model whose precision is greater than or equal to the preset precision is determined.
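Steps S201, a1, and a2 form a self-training loop. A hedged sketch follows; `train_fn`, `predict_fn`, and `evaluate_fn` are caller-supplied stand-ins for the training, pseudo-labeling, and precision-evaluation operations described above, and their names are our assumptions.

```python
def self_train(train_fn, predict_fn, evaluate_fn, labeled, unlabeled,
               preset_value=0.85, preset_precision=0.95, max_rounds=100):
    """Repeatedly: train a model on the labeled set, pseudo-label the
    unlabeled pool, move samples whose label confidence exceeds the
    preset value (with their labels) into the labeled set, and stop once
    the model's precision reaches the preset precision or no sample
    clears the threshold."""
    model = train_fn(labeled)
    for _ in range(max_rounds):
        if evaluate_fn(model) >= preset_precision:
            break
        kept, rest = [], []
        for sample in unlabeled:
            label, confidence = predict_fn(model, sample)
            if confidence > preset_value:
                kept.append((sample, label))
            else:
                rest.append(sample)
        if not kept:
            break                      # no expansion possible this round
        labeled = labeled + kept       # the expanded labeled sample set
        unlabeled = rest
        model = train_fn(labeled)      # retrain on the larger labeled set
    return model, labeled
```

Swapping the precision test for an iteration counter gives the second implementation below; in a full system, `train_fn` could also upgrade model complexity each round, as in fig. 3.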
In another implementation, obtaining the target machine learning model according to the second labeled sample set and the second machine learning model includes:
Step b1: train a third machine learning model based on the second labeled sample set and the second machine learning model.
Step b2: judge whether the number of iterations reaches a preset number. If so, take the third machine learning model as the target machine learning model; if not, return to the step of obtaining labels for a plurality of test samples in the second test sample set using the third machine learning model and adding the test samples whose label confidence exceeds the preset value, together with their labels, to the second labeled sample set to obtain a third labeled sample set, until the preset number of iterations is reached and the target machine learning model is obtained.
That is, the third machine learning model is used as a new first machine learning model, the second test sample set as a new first test sample set, the second labeled sample set as a new first labeled sample set, and a fourth machine learning model as a new second machine learning model, and steps S201, b1, and b2 are repeated until the preset number of iterations is reached and the target machine learning model is obtained.
Here, one iteration comprises one expansion of the labeled sample set and one training pass, on the labeled sample set obtained from that expansion, to produce a new machine learning model.
In yet another implementation, obtaining the target machine learning model according to the second labeled sample set and the second machine learning model includes: training the target machine learning model directly based on the second machine learning model and the second labeled sample set.
For example, the process of obtaining the target machine learning model may be as shown in fig. 4: as training progresses, the number of labeled samples gradually increases and the complexity of the machine learning model gradually increases, until the target machine learning model is finally obtained.
Step S203, determining the attribute of the target in the picture according to the target machine learning model; or sending the target machine learning model to the terminal equipment so that the terminal equipment determines the attribute of the target in the picture according to the target machine learning model.
That is to say: the target machine learning model may be loaded into a device to determine the attribute of a target in a picture. The device into which the target machine learning model can be loaded comprises any one of the following: a server, a terminal device, a network video recorder (NVR), an image signal processor (ISP), or a graphics processing unit (GPU).
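As a sketch of such loading and use, the snippet below serializes a trained model and later loads it to determine attributes. Pickle serialization, the file path, and feature-vector inputs are illustrative assumptions; in practice the model format depends on the target device.

```python
import pickle

def save_model(model, path):
    """Serialize a trained target machine learning model for deployment."""
    with open(path, "wb") as f:
        pickle.dump(model, f)

def determine_attributes(path, pictures):
    """Load the target model on a device and determine the attribute
    (predicted label) of the target in each picture (here, feature vectors)."""
    with open(path, "rb") as f:
        model = pickle.load(f)
    return model.predict(pictures)
```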
Illustratively, the target machine learning model may be used to determine the type of Chinese medicine in a plurality of Chinese medicine pictures, determine the type of animal in a plurality of animal pictures, perform face recognition, and the like.
The target machine learning model may be loaded on a camera, a computer, a server, a cloud platform, or the like to perform recognition of target attributes in pictures, so that it can continuously adapt to changing requirements.
In this embodiment, when the target machine learning model is obtained, the number of labeled samples is gradually expanded based on the intermediate machine learning models, so that a user does not need to label a large number of training samples; this saves manpower and material resources, improves the efficiency of obtaining the target machine learning model, and further improves the efficiency of determining the attribute of the target in the picture by using the machine learning model.
Meanwhile, if the complexity of the machine learning model is increased each time the labeled set is expanded, the complexity of the machine learning model can be enhanced step by step, which solves the existing problem that a high-precision machine learning model cannot be obtained when there are few manually labeled samples.
Two specific implementations of the embodiment shown in fig. 2 are described below using specific examples.
Fig. 5 is a second flowchart of a method for determining an attribute of an object in a picture according to an embodiment of the present application, and referring to fig. 5, the method according to the embodiment includes:
step S501, receiving a first labeling sample set.
The first labeled sample set in this embodiment is the initial labeled sample set; that is, the first labeled sample set may be received by the apparatus for determining the attribute of the target in the picture from another device, or input by a user into the apparatus.
And S502, training to obtain a first machine learning model based on the first labeled sample set.
The first machine learning model in this embodiment is an intermediate machine learning model obtained for the first time.
Step S503, obtaining labels of a plurality of test samples in the first test sample set by adopting the first machine learning model, and adding the test samples with the confidence degrees of the labels being greater than a preset value and the labels of the test samples to the first labeled sample set to obtain a second labeled sample set.
This step is an expansion process of the labeled sample set.
And step S504, training to obtain a third machine learning model based on the second machine learning model and the second labeled sample set.
The second machine learning model and the third machine learning model in this embodiment are both intermediate machine learning models. The third machine learning model is an intermediate machine learning model obtained by training based on the labeled sample set expanded in step S503.
Optionally, the second machine learning model is a model of higher complexity than the first machine learning model; for the method of acquiring the second machine learning model in this embodiment, refer to the acquisition method described in the embodiment shown in fig. 2. For example, the first machine learning model may have 5 layers and the second machine learning model 9 layers; or the first machine learning model may have 5 layers and the second machine learning model 10 layers.
And step S505, determining that the precision of the third machine learning model is smaller than the preset precision.
Whether the precision of a machine learning model is greater than the preset precision may be determined as follows: the machine learning model is used to recognize a preset number of pictures in which the attributes of the targets are known; if the recognition accuracy is greater than or equal to a preset accuracy, the precision of the machine learning model is determined to be greater than the preset precision, and if the recognition accuracy is less than the preset accuracy, the precision is determined to be less than the preset precision.
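This check can be sketched as follows (a minimal illustration; the preset precision value and the `predict` interface are assumptions for the example):

```python
def meets_preset_precision(model, known_pictures, known_attributes,
                           preset_precision=0.95):
    """Recognize a preset number of pictures whose target attributes are known,
    and compare the recognition accuracy with the preset precision."""
    predictions = model.predict(known_pictures)
    accuracy = sum(p == a for p, a in zip(predictions, known_attributes)) \
        / len(known_attributes)
    return accuracy >= preset_precision
```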
Step S506, obtaining labels of a plurality of test samples in the second test sample set by adopting a third machine learning model, and adding the test samples with the confidence degrees of the labels being greater than a preset value and the labels of the test samples to the second labeling sample set to obtain a third labeling sample set.
This step is a further augmentation process of the annotated set of samples.
And step S507, training to obtain a fifth machine learning model based on the fourth machine learning model and the third labeled sample set.
Namely, the fourth machine learning model is used as the initial machine learning model, and the fifth machine learning model is obtained through training based on the third labeled sample set.
The fourth machine learning model and the fifth machine learning model in this embodiment are both intermediate machine learning models. The fifth machine learning model is an intermediate machine learning model obtained by training based on the labeled sample set expanded in step S506.
Optionally, the fourth machine learning model is a higher complexity model than the third machine learning model; the fourth machine learning model is obtained after the third machine learning model is subjected to transfer learning, or the fourth machine learning model is a model which is irrelevant to the third machine learning model but has higher complexity than the third machine learning model. For example, when the number of layers of the second machine learning model is 9, the number of layers of the third machine learning model is also 9, and then the number of layers of the fourth machine learning model may be 17. For another example, when the number of layers of the second machine learning model is 10, the number of layers of the third machine learning model is also 10, and then the number of layers of the fourth machine learning model may be 20.
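The layer-count progressions in these examples (5 → 9 → 17 → 33, and 5 → 10 → 20 → 40) follow simple doubling patterns, which can be expressed as below. This is an observation about the examples, not a requirement of the method.

```python
def next_layer_count(layers, scheme="2n-1"):
    """Roughly double the depth when moving to the next, more complex model:
    5 -> 9 -> 17 -> 33 under '2n-1', or 5 -> 10 -> 20 -> 40 under '2n'."""
    return 2 * layers - 1 if scheme == "2n-1" else 2 * layers
```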
And step S508, determining that the precision of the fifth machine learning model is smaller than the preset precision.
Step S509, obtaining labels of the multiple test samples in the third test sample set by using a fifth machine learning model, and adding the test sample with the confidence of the label being greater than the preset value and the label of the test sample to the third labeled sample set to obtain a fourth labeled sample set.
This step is a further augmentation process of the annotated set of samples.
And step S510, training to obtain a seventh machine learning model based on the fourth labeled sample set and the sixth machine learning model.
Namely, the sixth machine learning model is used as the initial machine learning model, and the seventh machine learning model is obtained through training based on the fourth labeled sample set.
The sixth machine learning model and the seventh machine learning model in this embodiment are both intermediate machine learning models. The seventh machine learning model is an intermediate machine learning model obtained by training based on the labeled sample set expanded in step S509.
Optionally, the sixth machine learning model is a higher complexity model than the fifth machine learning model; the sixth machine learning model is obtained after the fifth machine learning model is subjected to transfer learning, or the sixth machine learning model is a model which is irrelevant to the fifth machine learning model but has higher complexity than the fifth machine learning model. For example, when the number of layers of the fourth machine learning model is 17, and the number of layers of the fifth machine learning model is 17, the number of layers of the sixth machine learning model may be 33. For another example, when the number of layers of the fourth machine learning model is 20, and the number of layers of the fifth machine learning model is also 20, then the number of layers of the sixth machine learning model may be 40.
And step S511, determining that the precision of the seventh machine learning model is greater than or equal to the preset precision, and taking the seventh machine learning model as the target machine learning model.
Step S512, determining the attribute of the target in the image according to the target machine learning model; or, the target machine learning model is sent to the terminal device, so that the terminal device determines the attribute of the target in the image according to the target machine learning model.
According to the method, manpower and material resources can be saved, the efficiency of obtaining the target machine learning model is improved, and the efficiency of determining the attribute of the target in the picture by using the machine learning model is further improved. Meanwhile, if the complexity of the machine learning model is increased each time the labeled set is expanded, the complexity of the machine learning model can be enhanced step by step, which solves the existing problem that a high-precision machine learning model cannot be obtained when there are few manually labeled samples.
Fig. 6 is a flowchart of a third method for determining an attribute of an object in a picture according to an embodiment of the present application, and referring to fig. 6, the method according to the embodiment includes:
step S601, receiving a first labeling sample set.
The first labeled sample set in this embodiment is the initial labeled sample set; that is, the first labeled sample set may be received by the apparatus for determining the attribute of the target in the picture from another device, or input by a user into the apparatus.
Step S602, training to obtain a first machine learning model based on the first labeled sample set.
The first machine learning model in this embodiment is an intermediate machine learning model obtained for the first time.
Step S603, obtaining labels of a plurality of test samples in the first test sample set by adopting the first machine learning model, and adding the test samples with the confidence degrees of the labels being greater than a preset value and the labels of the test samples to the first labeling sample set to obtain a second labeling sample set.
This step is an expansion process of the labeled sample set.
And S604, training to obtain a third machine learning model based on the second labeled sample set and the second machine learning model.
And training to obtain a third machine learning model based on the second labeled sample set by taking the second machine learning model as the initial machine learning model.
The second machine learning model and the third machine learning model in this embodiment are both intermediate machine learning models. The third machine learning model is an intermediate machine learning model obtained by training based on the labeled sample set expanded in step S603.
Optionally, the second machine learning model is a model of higher complexity than the first machine learning model; for the method of acquiring the second machine learning model in this embodiment, refer to the acquisition method described in the embodiment shown in fig. 2. For example, the first machine learning model may have 5 layers and the second machine learning model 9 layers; or the first machine learning model may have 5 layers and the second machine learning model 10 layers.
Wherein, steps S602 to S604 are the first iteration process.
And step S605, determining that the iteration times are less than the preset times.
Step S606, obtaining labels of a plurality of test samples in the second test sample set by adopting a third machine learning model, and adding the test samples with the confidence degrees of the labels larger than a preset value and the labels of the test samples to the second labeling sample set to obtain a third labeling sample set.
This step is a further augmentation process of the annotated set of samples.
And S607, training to obtain a fifth machine learning model based on the third labeled sample set and the fourth machine learning model.
Namely, the fourth machine learning model is used as the initial machine learning model, and the fifth machine learning model is obtained through training based on the third labeled sample set.
The fourth machine learning model and the fifth machine learning model in this embodiment are both intermediate machine learning models. The fifth machine learning model is an intermediate machine learning model obtained by training based on the labeled sample set expanded in step S606.
Optionally, the fourth machine learning model is a higher complexity model than the third machine learning model; the fourth machine learning model is obtained after the third machine learning model is subjected to transfer learning, or the fourth machine learning model is a model which is irrelevant to the third machine learning model but has higher complexity than the third machine learning model. For example, when the number of layers of the second machine learning model is 9, the number of layers of the third machine learning model is also 9, and then the number of layers of the fourth machine learning model may be 17. For another example, when the number of layers of the second machine learning model is 10, the number of layers of the third machine learning model is also 10, and then the number of layers of the fourth machine learning model may be 20.
Wherein, steps S606 to S607 are the second iteration process.
And step S608, determining that the iteration times are less than the preset times.
And step S609, acquiring labels of a plurality of test samples in the third test sample set by adopting a fifth machine learning model, and adding the test samples with the confidence degrees of the labels being greater than a preset value and the labels of the test samples to the third labeled sample set to obtain a fourth labeled sample set.
This step is a further augmentation process of the annotated set of samples.
And S610, training to obtain a seventh machine learning model based on the sixth machine learning model and the fourth labeled sample set.
Namely, the sixth machine learning model is used as the initial machine learning model, and the seventh machine learning model is obtained through training based on the fourth labeled sample set.
The sixth machine learning model and the seventh machine learning model in this embodiment are both intermediate machine learning models. The seventh machine learning model is an intermediate machine learning model obtained by training based on the labeled sample set expanded in step S609.
Optionally, the sixth machine learning model is a higher complexity model than the fifth machine learning model; the sixth machine learning model is obtained after the fifth machine learning model is subjected to transfer learning, or the sixth machine learning model is a model which is irrelevant to the fifth machine learning model but has higher complexity than the fifth machine learning model. For example, when the number of layers of the fourth machine learning model is 17, and the number of layers of the fifth machine learning model is 17, the number of layers of the sixth machine learning model may be 33. For another example, when the number of layers of the fourth machine learning model is 20, and the number of layers of the fifth machine learning model is also 20, then the number of layers of the sixth machine learning model may be 40.
Wherein, the steps S609 to S610 are a third iteration process.
And S611, determining that the iteration times are equal to the preset times, and taking the seventh machine learning model as the target machine learning model. In this embodiment, the preset number of times is 3.
Step S612, determining the attribute of the target in the image according to the target machine learning model; or, the target machine learning model is sent to the terminal device, so that the terminal device determines the attribute of the target in the image according to the target machine learning model.
According to the method, manpower and material resources can be saved, the efficiency of obtaining the target machine learning model is improved, and the efficiency of determining the attribute of the target in the picture by using the machine learning model is further improved. Meanwhile, if the complexity of the machine learning model is increased each time the labeled set is expanded, the complexity of the machine learning model can be enhanced step by step, which solves the existing problem that a high-precision machine learning model cannot be obtained when there are few manually labeled samples.
The method for determining the attribute of an object in a picture according to the present application is explained above, and the apparatus according to the present application is explained below using a specific example.
Fig. 7 is a schematic structural diagram of an apparatus for determining an attribute of an object in a picture according to an embodiment of the present application, and as shown in fig. 7, the apparatus according to the embodiment may include: a receiving module 71 and a processing module 72.
A processing module 72 for: obtaining labels of a plurality of test samples in a first test sample set by adopting a first machine learning model, and adding the test samples with the confidence degrees of the labels being greater than a preset value and the labels of the test samples to the first labeling sample set to obtain a second labeling sample set; wherein the first machine learning model is trained based on the first set of labeled samples.
The processing module 72 is further configured to: and acquiring a target machine learning model according to the second labeling sample set and the second machine learning model, wherein the target machine learning model is used for determining the attribute of the target in the picture.
Optionally, the number of layers of the second machine learning model is greater than the number of layers of the first machine learning model; and/or at least one first layer exists in the layers included in the second machine learning model, and the number of the neurons included in the first layer is larger than that of the neurons included in the layer corresponding to the first layer in the first machine learning model.
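A sketch of this complexity condition follows, representing each model as a list of per-layer neuron counts (that representation is an assumption for illustration):

```python
def is_higher_complexity(first_model_layers, second_model_layers):
    """True if the second model has more layers than the first, and/or some
    layer of the second model has more neurons than the corresponding layer
    of the first model."""
    more_layers = len(second_model_layers) > len(first_model_layers)
    wider_layer = any(b > a for a, b in
                      zip(first_model_layers, second_model_layers))
    return more_layers or wider_layer
```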
Optionally, the processing module 72 is further configured to perform transfer learning on the first machine learning model to obtain the second machine learning model.
Optionally, the processing module 72 is specifically configured to: training to obtain a third machine learning model based on the second labeling sample set and the second machine learning model; and judging whether the precision of the third machine learning model reaches a preset precision, if so, taking the third machine learning model as a target machine learning model, otherwise, returning to the step of acquiring labels of a plurality of test samples in a second test sample set by adopting the third machine learning model, adding the test samples with the confidence degrees of the labels being greater than the preset value and the labels of the test samples into the second labeling sample set to obtain a third labeling sample set, and until the target machine learning model with the precision being greater than or equal to the preset precision is determined.
Optionally, the processing module 72 is specifically configured to: training to obtain a third machine learning model based on the second labeling sample set and the second machine learning model; and judging whether the iteration times reach preset times, if so, taking the third machine learning model as a target machine learning model, otherwise, returning to the step of acquiring labels of a plurality of test samples in a second test sample set by adopting the third machine learning model, and adding the test samples with the confidence degrees of the labels being greater than the preset value and the labels of the test samples to the second labeled sample set to obtain a third labeled sample set, and obtaining the target machine learning model until the iteration times reach the preset times.
Optionally, the receiving module 71 is configured to receive the first labeled sample set before the first machine learning model is used to obtain the labels of the plurality of test samples in the first test sample set; and training to obtain the first machine learning model based on the first labeled sample set.
The apparatus of this embodiment may be used to implement the technical solutions of the foregoing method embodiments; the implementation principles and technical effects are similar and are not described herein again.
Fig. 8 is a schematic structural diagram of an electronic device provided in an embodiment of the present application, and referring to fig. 8, the electronic device of the present embodiment includes: a processor 82, a memory 81 and a communication bus 83, the communication bus 83 is used for connecting the processor 82 and the memory 81, and the processor 82 is coupled with the memory 81;
the memory 81 is used for storing a computer program;
the processor 82 is configured to call the computer program stored in the memory 81 to implement the method in the above method embodiment.
Wherein the computer program may also be stored in a memory external to the electronic device.
It should be understood that in the embodiments of the present application, the processor 82 may be a CPU or a GPU; the processor 82 may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory 81 may include a read-only memory and a random access memory, and provides instructions and data to the processor 82. The memory 81 may also include a non-volatile random access memory. For example, the memory 81 may also store information of the device type.
The memory 81 may be a volatile memory or a nonvolatile memory, or may include both volatile and nonvolatile memories. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example but not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DR RAM).
The bus 83 may include a power bus, a control bus, a status signal bus, and the like, in addition to the data bus. But for clarity of illustration the various buses are labeled as bus 83 in the figures.
The embodiments of the present application provide a readable storage medium, which includes a program or instructions, when the program or instructions are run on a computer, the method as described in any of the above method embodiments is performed.
Those of ordinary skill in the art will understand that: all or a portion of the steps of implementing the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A method for determining attributes of objects in pictures is characterized by comprising the following steps:
obtaining labels of a plurality of test samples in a first test sample set by adopting a first machine learning model, and adding the test samples with the confidence degrees of the labels being greater than a preset value and the labels of the test samples to the first labeling sample set to obtain a second labeling sample set; wherein the first machine learning model is trained based on the first set of labeled samples;
and acquiring a target machine learning model according to the second labeling sample set and the second machine learning model, wherein the target machine learning model is used for determining the attribute of the target in the picture.
2. The method of claim 1, wherein the number of layers of the second machine learning model is greater than the number of layers of the first machine learning model; and/or the presence of a gas in the gas,
at least one first layer exists in the layers included in the second machine learning model, and the number of neurons included in the first layer is larger than the number of neurons included in the layer corresponding to the first layer in the first machine learning model.
3. The method according to claim 1 or 2, wherein before said obtaining a target machine learning model according to the second labeled sample set and the second machine learning model, further comprising:
and performing transfer learning on the first machine learning model to obtain the second machine model.
4. The method of claim 1, wherein obtaining a target machine learning model from the second set of labeled samples and a second machine learning model comprises:
training to obtain a third machine learning model based on the second labeling sample set and the second machine learning model;
and judging whether the precision of the third machine learning model reaches a preset precision, if so, taking the third machine learning model as a target machine learning model, otherwise, returning to the step of acquiring labels of a plurality of test samples in a second test sample set by adopting the third machine learning model, adding the test samples with the confidence degrees of the labels being greater than the preset value and the labels of the test samples into the second labeled sample set to obtain a third labeled sample set, and until the target machine learning model with the precision being greater than or equal to the preset precision is determined.
5. The method of claim 1, wherein obtaining a target machine learning model from the second set of labeled samples and a second machine learning model comprises:
training to obtain a third machine learning model based on the second labeling sample set and the second machine learning model;
and judging whether the number of iterations reaches a preset number; if so, taking the third machine learning model as the target machine learning model; otherwise, returning to the step of obtaining labels of a plurality of test samples in a second test sample set by adopting the third machine learning model and adding the test samples with the confidence degrees of the labels being greater than the preset value and the labels of the test samples to the second labeled sample set to obtain a third labeled sample set, until the number of iterations reaches the preset number to obtain the target machine learning model.
6. The method of claim 1, wherein prior to using the first machine learning model to obtain labels for the plurality of test samples in the first set of test samples, further comprising:
receiving the first set of annotated samples;
and training to obtain the first machine learning model based on the first labeled sample set.
7. The method of claim 1, wherein the target machine learning model is configured to be loaded into a device to enable determination of the attribute of a target in a picture.
8. The method of claim 7, wherein the device comprises any one of:
the system comprises a server, terminal equipment, a network video recorder NVR, an image processor ISP and a graphic processor GPU.
9. An apparatus for determining attributes of objects in a picture, comprising:
a processing module to: obtaining labels of a plurality of test samples in a first test sample set by adopting a first machine learning model, and adding the test samples with the confidence degrees of the labels being greater than a preset value and the labels of the test samples to the first labeling sample set to obtain a second labeling sample set; wherein the first machine learning model is trained based on the first set of labeled samples;
the processing module is further configured to: and acquiring a target machine learning model according to the second labeling sample set and the second machine learning model, wherein the target machine learning model is used for determining the attribute of the target in the picture.
10. The apparatus of claim 9, wherein the number of layers of the second machine learning model is greater than the number of layers of the first machine learning model; and/or the presence of a gas in the gas,
at least one first layer exists in the layers included in the second machine learning model, and the number of neurons included in the first layer is larger than the number of neurons included in the layer corresponding to the first layer in the first machine learning model.
CN201911178972.9A 2019-11-27 2019-11-27 Method and device for determining attribute of object in picture Active CN112861892B (en)

Publications (2)

Publication Number Publication Date
CN112861892A true CN112861892A (en) 2021-05-28
CN112861892B CN112861892B (en) 2023-09-01

Family

ID=75984830

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911178972.9A Active CN112861892B (en) 2019-11-27 2019-11-27 Method and device for determining attribute of object in picture

Country Status (1)

Country Link
CN (1) CN112861892B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023283765A1 (en) * 2021-07-12 2023-01-19 上海联影医疗科技股份有限公司 Method and apparatus for training machine learning models, computer device, and storage medium

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130066632A1 (en) * 2011-09-14 2013-03-14 At&T Intellectual Property I, L.P. System and method for enriching text-to-speech synthesis with automatic dialog act tags
CN104616031A (en) * 2015-01-22 2015-05-13 哈尔滨工业大学深圳研究生院 Transfer learning method and device
US20160253597A1 (en) * 2015-02-27 2016-09-01 Xerox Corporation Content-aware domain adaptation for cross-domain classification
CN108021931A (en) * 2017-11-20 2018-05-11 阿里巴巴集团控股有限公司 A kind of data sample label processing method and device
US20180336467A1 (en) * 2017-07-31 2018-11-22 Seematics Systems Ltd System and method for enriching datasets while learning
CN108898218A (en) * 2018-05-24 2018-11-27 阿里巴巴集团控股有限公司 A kind of training method of neural network model, device and computer equipment
CN108985208A (en) * 2018-07-06 2018-12-11 北京字节跳动网络技术有限公司 The method and apparatus for generating image detection model
CN108985334A (en) * 2018-06-15 2018-12-11 广州深域信息科技有限公司 The generic object detection system and method for Active Learning are improved based on self-supervisory process
CN109284784A (en) * 2018-09-29 2019-01-29 北京数美时代科技有限公司 A kind of content auditing model training method and device for live scene video
US20190042953A1 (en) * 2017-08-07 2019-02-07 International Business Machines Corporation Filter for harmful training samples in online learning systems
CN109359793A (en) * 2018-08-03 2019-02-19 阿里巴巴集团控股有限公司 A kind of prediction model training method and device for new scene
US20190073447A1 (en) * 2017-09-06 2019-03-07 International Business Machines Corporation Iterative semi-automatic annotation for workload reduction in medical image labeling
CN109460795A (en) * 2018-12-17 2019-03-12 北京三快在线科技有限公司 Classifier training method, apparatus, electronic equipment and computer-readable medium
CN109829375A (en) * 2018-12-27 2019-05-31 深圳云天励飞技术有限公司 A kind of machine learning method, device, equipment and system
CN109960800A (en) * 2019-03-13 2019-07-02 安徽省泰岳祥升软件有限公司 Weakly supervised file classification method and device based on Active Learning
US20190318208A1 (en) * 2016-12-27 2019-10-17 Cloudminds (Shenzhen) Robotics Systems Co., Ltd Image identification system and image identification method


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIU Ziyang et al., "Co-training method based on margin growth", Chinese Journal of Scientific Instrument, vol. 39, no. 03, pages 45-53 *
ZHAO Chuanjun et al., "Cross-domain text sentiment classification based on grouped boosting ensemble", Journal of Computer Research and Development, vol. 52, no. 03, pages 629-638 *


Also Published As

Publication number Publication date
CN112861892B (en) 2023-09-01

Similar Documents

Publication Publication Date Title
CN110110843B (en) Method and system for processing images
US10977530B2 (en) ThunderNet: a turbo unified network for real-time semantic segmentation
US20200034627A1 (en) Object detection using spatio-temporal feature maps
US20190147105A1 (en) Partitioning videos
CN112889108B (en) Speech classification using audiovisual data
US11144782B2 (en) Generating video frames using neural networks
JP2019535084A (en) Image processing neural networks with separable convolutional layers
US11354548B1 (en) Image processing with recurrent attention
US10334202B1 (en) Ambient audio generation based on visual information
CN113538235B (en) Training method and device for image processing model, electronic equipment and storage medium
CN113657483A (en) Model training method, target detection method, device, equipment and storage medium
CN111582459B (en) Method for executing operation, electronic equipment, device and storage medium
CN113836303A (en) Text type identification method and device, computer equipment and medium
US20200151458A1 (en) Apparatus and method for video data augmentation
CN112861892A (en) Method and device for determining attributes of targets in pictures
CN110502975B (en) Batch processing system for pedestrian re-identification
CN112148836A (en) Multi-modal information processing method, device, equipment and storage medium
US10635972B1 (en) Recurrent neural networks with rectified linear units
KR20230068989A (en) Method and electronic device for performing learning of multi-task model
CN113554550B (en) Training method and device for image processing model, electronic equipment and storage medium
CN114510592A (en) Image classification method and device, electronic equipment and storage medium
CN111898620A (en) Training method of recognition model, character recognition method, device, equipment and medium
CN106446902B (en) Non-legible image-recognizing method and device
CN110196981B (en) Text representation method, apparatus, device and storage medium
CN113378773B (en) Gesture recognition method, gesture recognition device, gesture recognition apparatus, gesture recognition storage medium, and gesture recognition program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant