CN116152603A - Attribute identification model training method, system, medium and device - Google Patents
- Publication number: CN116152603A (application CN202310152682.7A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
- G06V10/30—Noise filtering
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
Abstract
The application provides a training method, system, medium and device for an attribute identification model, comprising the following steps: acquiring the attribute labels of a clear target image; training a clear data attribute identification model on the clear image and its labels; adding noise of different degrees to the clear image to generate fused noise images; identifying the fused noise images with the trained clear data attribute identification model to obtain their valid attribute labels; and training an attribute identification model on the fused noise images and the corresponding valid labels, so that image attributes can be identified with the trained model. With this diffusion-model-based training method, training is reinforced only on the parts of the attribute labels that remain identifiable from visible visual information, which effectively improves the accuracy of the attribute identification model and, at the same time, enhances the generalization ability of the identification network.
Description
Technical Field
The invention belongs to the technical field of computer vision for artificial intelligence applications, relates to a training method, and in particular relates to a training method, system, medium and device for an attribute identification model.
Background
Attribute identification of targets is one of the important tasks in computer vision and artificial intelligence applications. The technology is widely used in scenarios such as security, surveillance, and pedestrian and traffic flow analysis. The diffusion model is mainly applied in generative modeling, such as image generation, text generation, speech generation, and waveform signal generation, where it achieves excellent performance.
Attribute identification mainly recognizes the basic attributes of a target, for example the sex, age, clothing color, and accessories of a pedestrian, or the color, brand, and series of a car. A common approach uses a deep learning network; before the network is trained, the data set is expanded by image augmentation such as horizontal flipping, translation, scaling, random erasing, cropping, and noise blurring. Augmentation by noise blurring, however, can destroy the visual information of the image itself.
During training, data whose visual information has been weakened or lost still carries the labels of the raw data. Such data, whose labels can no longer be read directly from the image, is called fuzzy data. As can be seen from Fig. 1, when an image is reduced in resolution and noise is added, some attributes, such as a shoulder bag, a hat, or even the hairstyle, can no longer be recognized visually once the image becomes blurred.
In general, a model is trained on clear data together with augmented fuzzy data, and the fuzzy data reuses the labels of the source data. This caps the recognition accuracy the model can reach: some attributes are simply unrecognizable when visual information is lost or insufficient, and such training forces the model to guess attributes that an unclear sample cannot reveal, which is the main reason the upper limit of recognition accuracy drops. Nevertheless, noise-based data augmentation is indispensable, because it effectively strengthens the generalization ability of the identification network.
Therefore, in existing target attribute identification technology, sharing labels between clear data and fused noise data lowers the upper limit of model identification accuracy, which in turn leads to low accuracy of the attribute identification model and insufficient generalization ability of the identification network.
Disclosure of Invention
In view of the above drawbacks of the prior art, the present invention provides a training method, system, medium and device for an attribute identification model, which solve the prior-art problem that sharing labels between clear data and fused noise data during training lowers the upper limit of model identification accuracy, which in turn leads to low accuracy of the attribute identification model and insufficient generalization ability of the identification network.
To achieve the above and other related objects, in a first aspect the present invention provides a training method for an attribute identification model, comprising the following steps: acquiring a target clear image and the attribute labels of the target clear image; training a clear data attribute identification model based on the target clear image and the attribute labels; adding noise of different degrees to the target clear image to generate a plurality of fused noise images; identifying the fused noise images based on the trained clear data attribute identification model, and acquiring the valid attribute labels of the fused noise images; and training an attribute identification model based on the fused noise images and the corresponding valid attribute labels, so as to perform attribute identification of images with the trained attribute identification model.
In the present application, a clear image of a target is first acquired and its attributes are labeled; a clear data attribute identification model is then trained on the clear image. Meanwhile, noise of different degrees is added to the clear image to generate a plurality of fused noise images. The fused noise images are then fed to the trained clear data attribute identification model to obtain their valid attribute labels, completing the attribute identification process. With this diffusion-based training method, training is reinforced only on the parts of the attribute labels that remain identifiable from visible visual information, which improves the accuracy of the attribute identification model.
In one implementation manner of the first aspect, training a clear data attribute identification model based on the target clear image and the attribute labels comprises the following steps: inputting the target clear image and the attribute labels into the clear data attribute identification model; and adjusting the parameters of the clear data attribute identification model so that it outputs the valid attribute labels of the target image.
In this implementation, the target clear image and its attribute labels are fed into the attribute identification model for training; the model parameters are then adjusted until the model outputs the valid attribute labels of the target image.
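As an illustration only, the two steps above can be sketched with a toy stand-in for the recognition network (the patent does not specify the network architecture; the per-attribute logistic classifier below is an assumption):

```python
import numpy as np

def train_attribute_model(images, labels, lr=0.5, epochs=200):
    """Toy stand-in for S12: fit one linear classifier per binary attribute
    on flattened clear images by gradient descent on the logistic loss."""
    x = images.reshape(len(images), -1)
    w = np.zeros((x.shape[1], labels.shape[1]))
    b = np.zeros(labels.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # per-attribute sigmoid score
        w -= lr * x.T @ (p - labels) / len(x)   # adjust model parameters
        b -= lr * (p - labels).mean(axis=0)
    return w, b
```

Training the real model follows the same loop: feed clear images with their labels, compare the predictions to the labels, and adjust the parameters until the labels are reproduced.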
In one implementation manner of the first aspect, adding noise of different degrees to the target clear image to generate a plurality of fused noise images comprises the following steps: generating noise of different degrees with a Gaussian random function; and fusing the noise into the target clear image in order of increasing noise to obtain the fused noise images.
In this implementation, images under different degrees of noise are obtained by gradually adding noise of increasing strength to the original clear image; each resulting noise image keeps its corresponding attribute labels.
In one implementation manner of the first aspect, the fused noise image is:

X_t = √(ᾱ_t)·X_0 + √(1 − ᾱ_t)·ε, ε ~ N(0, I)

wherein α_t = 1 − β_t = 1 − 0.001×t; t is the noise degree coefficient; N(0, β_t) is the noise added at each step; α_t is the diffusion coefficient; ᾱ_t = α_1·α_2·…·α_t is the cumulative product of the diffusion coefficients; X_0 is the target image; and X_t is the fused noise image.
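A minimal sketch of this closed-form noising step, assuming the standard diffusion forward process with the schedule given above (β_t = 0.001×t):

```python
import numpy as np

def fuse_noise(x0, t, seed=None):
    """Produce the fused noise image X_t from the clear image X_0:
    X_t = sqrt(bar_alpha_t) * X_0 + sqrt(1 - bar_alpha_t) * eps,
    with alpha_s = 1 - 0.001*s and bar_alpha_t their cumulative product."""
    alphas = 1.0 - 0.001 * np.arange(1, t + 1)
    bar_alpha = np.prod(alphas)
    eps = np.random.default_rng(seed).normal(size=x0.shape)
    return np.sqrt(bar_alpha) * x0 + np.sqrt(1.0 - bar_alpha) * eps
```

Larger t gives a smaller ᾱ_t, so the clear image is attenuated and the noise share grows, matching the progressively blurrier images in Fig. 4B.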
In an implementation manner of the first aspect, identifying the fused noise image based on the trained clear data attribute identification model and acquiring its valid attribute labels comprises the following steps: acquiring the first attribute definition precision of each attribute of the target clear image based on the trained clear data attribute identification model; acquiring the second attribute definition precision of each attribute of the fused noise image based on the trained clear data attribute identification model; and acquiring the valid attribute labels from the first attribute definition precision and the second attribute definition precision.
In this implementation, each attribute of the clear image and its definition precision are obtained through the clear data attribute identification model; the definition precision of each attribute of the fused noise image is obtained through the same model; the valid attribute labels are then selected according to the two precisions, and the invalid attribute labels are removed.
In an implementation manner of the first aspect, acquiring the valid attribute labels from the first attribute definition precision and the second attribute definition precision comprises the following steps: when the second attribute definition precision is lower than the product of the first attribute definition precision and a preset threshold, the corresponding label is judged to be an invalid attribute label; and when the second attribute definition precision is greater than or equal to that product, the corresponding label is judged to be a valid attribute label.
In an embodiment of the present invention, the preset threshold is a function of the noise degree coefficient t and of ᾱ_t, the cumulative product of the diffusion coefficients of the noise in the fused noise image.
In this implementation, whether a label is a valid attribute label is decided by the attribute precision formula and the judgment condition above. Valid attribute labels enter the next step of the flow; invalid attribute labels are not processed further.
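The decision rule above can be sketched as follows (the attribute names and threshold value are illustrative, not from the patent):

```python
def split_valid_labels(first_precision, second_precision, threshold):
    """S14 rule: a label is valid when the fused-noise precision is at least
    threshold * clear-data precision; otherwise it is invalid and dropped."""
    valid, invalid = {}, []
    for attr, clear_p in first_precision.items():
        noisy_p = second_precision[attr]
        if noisy_p >= clear_p * threshold:
            valid[attr] = noisy_p       # kept for the next training stage
        else:
            invalid.append(attr)        # not processed further
    return valid, invalid
```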
In one implementation manner of the first aspect, training an attribute identification model based on the fused noise images and the corresponding valid attribute labels, so as to perform attribute identification with the trained model, comprises the following steps: initializing the parameters of the attribute identification model; training the attribute identification model on each fused noise image and its valid attribute labels, in order of decreasing noise; and training the attribute identification model once more on the target image and its attribute labels to obtain the trained attribute identification model.
In this implementation, the parameters of the attribute identification model are initialized and images of different noise degrees are fed in for training; the final retraining pass on the clear image improves the accuracy of the attribute identification model.
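The back-diffusion training order described above, sketched with the embodiment's noise coefficients (the final t = 0 entry stands for the clear target image, trained last):

```python
def backdiffusion_order(noise_coeffs):
    """S15 order: train on fused noise images from the largest noise
    coefficient down to the smallest, then retrain on the clear image."""
    return sorted(noise_coeffs, reverse=True) + [0]
```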
In a second aspect, the present application provides an attribute identification model training system, comprising: an acquisition module for acquiring a target clear image and its attribute labels; an identification training module for training a clear data attribute identification model based on the target clear image and the attribute labels; a noise diffusion module for adding noise of different degrees to the target image to generate a plurality of fused noise images; a label generation module for identifying the fused noise images based on the trained clear data attribute identification model and acquiring their valid attribute labels; and a back-diffusion training module for training an attribute identification model based on the fused noise images and the corresponding valid attribute labels, so as to perform attribute identification with the trained model.
In the present application, the acquisition module collects a clear image of the target, which is labeled manually; the identification training module trains the attribute identification model on the clear image; the diffusion noise module synthesizes fused noise images of different noise degrees; the label generation module judges which attribute labels of the fused noise images are valid; and the back-diffusion training module finally trains on the valid labels to realize attribute identification of images. This method improves the recognition accuracy of images.
In a third aspect, the present application provides an attribute identification model training device, comprising a processor and a memory. The memory stores a computer program; the processor, connected to the memory, executes the stored computer program so that the attribute identification model training device performs the attribute identification model training method.
As described above, the attribute identification model training method, system, medium and device of the invention have the following beneficial effects:
(1) The diffusion-model-based training method avoids making the network guess during training, and reinforces training only on the parts of the attribute labels that are identifiable from visible visual information. This avoids the loss of the upper limit of model identification accuracy caused by training fused noise data together with clear data under shared labels, and thereby improves the accuracy of the attribute identification model.
(2) The method effectively increases the generalization ability of the identification network.
(3) The adopted model structure is lightweight, places low demands on equipment, and has strong applicability.
Drawings
Fig. 1 is a schematic diagram illustrating an implementation of the attribute identification model training method of the present invention in an application scenario.
FIG. 2 is a flow chart of an attribute identification model training method according to an embodiment of the invention.
Fig. 3 is a schematic flow chart of S12 in the attribute identification model training method of the present invention.
Fig. 4A is a schematic flow chart of S13 in the attribute identification model training method of the present invention.
Fig. 4B is a schematic diagram of a process of gradually adding noise to an original image of a pedestrian and an original image attribute tag according to an embodiment of the attribute recognition model training method of the present invention.
Fig. 4C is a schematic diagram showing a process of gradually adding noise to an original image of a vehicle and an original image attribute tag according to an embodiment of the attribute recognition model training method of the present invention.
Fig. 5 is a schematic flow chart of S14 in the attribute identification model training method of the present invention.
FIG. 6 is a schematic diagram illustrating an effective attribute tag recognition process according to an embodiment of the training method of the attribute recognition model of the present invention.
Fig. 7 is a schematic flow chart of S15 in the attribute identification model training method of the present invention.
FIG. 8 is a schematic diagram of the training system for attribute identification model according to the present invention in an embodiment.
FIG. 9 is a schematic diagram of the training device for the attribute identification model according to the present invention in an embodiment.
Description of element reference numerals
81. Acquisition module
82. Recognition training module
83. Diffusion noise module
84. Label generating module
85. Reverse diffusion training module
91. Processor
92. Memory
S11 to S15 steps
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the disclosure below, which describes embodiments of the present invention with reference to specific examples. The invention may also be practiced or applied through other, different embodiments, and the details in this description may be modified or varied in various respects without departing from the spirit and scope of the present invention. It should be noted that the following embodiments and the features in them may be combined with each other as long as they do not conflict.
It should be noted that the illustrations provided in the following embodiments merely illustrate the basic concept of the present invention schematically. The drawings show only the components related to the present invention, not the actual number, shape, and size of components in implementation; in actual implementation the form, quantity, and proportion of each component may be changed arbitrarily, and the component layout may be more complicated.
The attribute identification model training method provided in the embodiment of the present application will be described in detail below with reference to the drawings in the embodiment of the present application.
Referring to fig. 1 and fig. 2, an implementation schematic diagram of the attribute identification model training method of the present invention in an application scenario and a flow schematic diagram of the attribute identification model training method of the present invention in an embodiment are shown respectively. As shown in fig. 1 and 2, the present embodiment provides a training method for an attribute identification model.
The attribute identification model training method specifically comprises the following steps:
s11, acquiring a target clear image and an attribute tag of the target clear image.
Please continue to refer to fig. 2. In this embodiment, a fixed camera photographs the target area to collect images within it. The collected image is a clear image of the target, of a common image type such as RGB, RGBD, gray-scale, or thermal.
After the clear target image is obtained, it is manually annotated with the corresponding labels, for example, for a pedestrian image: sex, age, clothing, etc.; for a vehicle image: vehicle type, brand, license plate number, color, series, etc.
And S12, training a clear data attribute identification model based on the target clear image and the attribute label.
Referring to fig. 3, a flowchart of S12 in the attribute identification model training method of the present invention is shown. As shown in fig. 3, the step S12 includes the following steps:
s121, inputting the target clear image and the attribute label into the clear data attribute identification model.
In this embodiment, an identification network model is constructed according to user requirements.
Specifically, a training data set and a test data set are first acquired. The training data set includes the target clear images and their attribute labels. The clear images and labels are then input to the identification network model for training. Through continuous training, the model is updated according to its identification results, finally yielding the trained clear data attribute identification model.
S122, adjusting parameters of the clear data attribute identification model to enable the clear data attribute identification model to output effective attribute labels of the target image.
In this embodiment, the parameters of the clear data attribute model are adjusted so that the clear data attribute identification model finally outputs the valid attribute labels of the target image.
S13, adding noise with different degrees into the target clear image to generate a plurality of fusion noise images.
Referring to fig. 4A, a flowchart of S13 in the training method of the attribute identification model according to the present invention is shown. As shown in fig. 4A, the step S13 includes the following steps:
s131, generating noise with different degrees by adopting a Gaussian random function.
In this embodiment, the Gaussian random function is used to generate the noise N(0, β_t). Different degrees of noise correspond to different noise degree coefficients t: the larger the coefficient, the stronger the noise. The noise coefficient ranges from 0 to 100; in this embodiment, values such as 5, 10, 20, 40, and 100 are preferred.
The noise to be added to the target clear image is computed for the different noise coefficients from the Gaussian random function, sampling the noise for degree t from the distribution N(0, β_t) with β_t = 0.001×t. The noise generated for the different coefficients is then used to produce, from the target clear image, the images with fused noise added.
S132, fusing the noise into the target clear image in order of increasing noise to obtain the fused noise images. Referring to fig. 4B and fig. 4C, which respectively show the process of gradually adding noise to an original pedestrian image with its original attribute labels, and the process of gradually adding noise to an original vehicle image with its original attribute labels, in an embodiment of the attribute identification model training method of the present invention.
As shown in fig. 4B and 4C, the noise generated with the coefficients computed by the Gaussian random function in step S131 is added to the target clear image to obtain fused noise images with different noise coefficients.
In this embodiment, a diffusion probability model is used to add the different degrees of noise to the original image, as follows:

X_t = √(ᾱ_t)·X_0 + √(1 − ᾱ_t)·ε, ε ~ N(0, I)

wherein α_t = 1 − β_t = 1 − 0.001×t; t is the noise degree coefficient; N(0, β_t) is the noise added at each step; α_t is the diffusion coefficient; ᾱ_t = α_1·α_2·…·α_t is the cumulative product of the diffusion coefficients; X_0 is the target image; and X_t is the fused noise image.
Specifically, values such as t = 5, 10, 20, 30, 40, 50, 100 are preferred in this embodiment. The noise obtained in step S131 is fused into the target clear image in order of increasing coefficient, yielding fused noise images with different noise coefficients. For example, in fig. 4B the noise coefficients t = 5, 10, 20, 40, 100 are applied in turn to the original clear pedestrian image, giving images at each value of t. In the original clear image, all the attributes of the target can be clearly identified: sex, age, upper garment, lower garment, cap, mask, and accessories (e.g., crossbody bag, shoulder bag). At t = 5 all of these attributes can still be recognized in the fused noise image; only the sharpness drops. At t = 10 all attributes are likewise still recognizable. By analogy, at t = 40 only a faint human outline can be made out in the fused noise image, and the other attributes can no longer be recognized. At t = 100, no target attribute can be identified in the fused noise image.
In fig. 4C, the noise coefficients t = 30 and t = 50 are applied in turn to the original clear vehicle image, giving fused noise images with those coefficients. In the original clear image, the identifiable attributes include: license plate, vehicle color, vehicle model, vehicle brand, series, etc. With noise of coefficient 30 added, attributes such as the vehicle, its color, and its model can still be identified in the t = 30 image, while the license plate, series, etc. cannot. With noise of coefficient 50 added, only the category of the target (e.g., that it is a vehicle) can be identified, and then only vaguely, in the t = 50 image; the other attributes (color, brand, series, license plate) essentially cannot be identified.
S14, identifying the fusion noise image based on the trained clear data attribute identification model, and obtaining an effective attribute label of the fusion noise image.
Referring to fig. 5 and fig. 6, a flow chart of S14 in the attribute identification model training method of the present invention and an effective attribute tag identification flow chart in an embodiment of the attribute identification model training method of the present invention are shown respectively. As shown in fig. 5 and 6, the step S14 includes the steps of:
S141, acquiring the first attribute definition precision of each attribute of the target clear image based on the trained clear data attribute identification model.
In this embodiment, the target clear image is input to the trained clear data attribute identification model to obtain the first attribute definition precision of each attribute of the target clear image. The first attribute definition precision is computed as: number of correctly identified samples / total number of identified samples.
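The precision formula stated above, as a one-line sketch:

```python
def attribute_precision(predictions, ground_truth):
    """Number of correctly identified samples / total number of samples."""
    correct = sum(p == g for p, g in zip(predictions, ground_truth))
    return correct / len(ground_truth)
```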
Specifically: the attributes contained in a clear image of a pedestrian target are sex, age, coat, lower garment, cap, mask, accessories, etc. The sex attribute in the pedestrian target clear image corresponds to one attribute precision, namely the first sex attribute precision; the age attribute corresponds to the first age attribute precision; the coat attribute corresponds to the first coat attribute precision; the lower-garment attribute likewise corresponds to one precision, namely the first lower-garment attribute precision. By analogy, the first definition precision of every attribute of the pedestrian target clear image can be obtained.
Similarly, the attributes contained in a clear image of a vehicle include color, license plate, vehicle model, train, etc. The color attribute in the clear image of the vehicle corresponds to one attribute precision, namely the first color attribute precision; the license plate attribute corresponds to the first license-plate attribute precision; the vehicle model attribute corresponds to the first vehicle-model attribute precision; and the train attribute corresponds to the first train attribute precision.
In this way, the first attribute definition precision of each attribute of the target clear image is obtained through the trained clear data attribute identification model.
S142, obtaining second attribute definition precision of each attribute of the fused noise image based on the trained definition data attribute identification model.
Similarly, the plurality of fused noise images obtained in step S132 are input into the trained clear data attribute identification model to obtain the second attribute definition precision of each attribute of each fused noise image.
Specifically, the clear image of the pedestrian target has several fused noise images, so each attribute corresponds to one second attribute definition precision per noise level. For example, assume there are 4 noise levels and three attributes A, B and C. The precisions are then: precision A_clear image, precision B_clear image, precision C_clear image; precision A_noise level 1, precision B_noise level 1, precision C_noise level 1; precision A_noise level 2, precision B_noise level 2, precision C_noise level 2; precision A_noise level 3, precision B_noise level 3, precision C_noise level 3; precision A_noise level 4, precision B_noise level 4, precision C_noise level 4.
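The per-attribute, per-noise-level precisions above can be organized as a simple grid; the numeric values below are illustrative stand-ins, not measurements from the patent:

```python
# Precision of each attribute at each noise level, per S141/S142.
# Level 0 is the clear image; columns are attributes A, B, C.
accuracy_grid = {
    0: {"A": 0.95, "B": 0.92, "C": 0.88},   # clear image
    1: {"A": 0.93, "B": 0.90, "C": 0.80},
    2: {"A": 0.90, "B": 0.75, "C": 0.55},
    3: {"A": 0.70, "B": 0.40, "C": 0.30},
    4: {"A": 0.35, "B": 0.20, "C": 0.15},
}

def second_precision(attr, level):
    """Second attribute definition precision of `attr` at noise `level`."""
    return accuracy_grid[level][attr]

print(second_precision("A", 2))  # → 0.9
```

Note how precision degrades with noise level, which is exactly what the validity test in S143 exploits.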
S143, acquiring the effective attribute tag according to the first attribute definition precision and the second attribute definition precision.
Based on the first attribute definition precision and the second attribute definition precision obtained in the above steps, it is judged whether the second attribute definition precision is not lower than the product of the first attribute definition precision and a preset threshold, which determines whether the attribute label is a valid attribute label.
Specifically, when the second attribute definition precision is lower than the product of the first attribute definition precision and a preset threshold value, judging that the corresponding label is an invalid attribute label; and when the second attribute definition precision is greater than or equal to the product of the first attribute definition precision and a preset threshold value, judging the corresponding label as an effective attribute label.
The preset threshold is calculated by the following formula:

preset threshold = ᾱ_t = α_1 · α_2 · … · α_t, with α_s = 1 − β_s

wherein t is the noise coefficient of the corresponding degree, and ᾱ_t is the cumulative product of the diffusion coefficients of the noise in the fused noise image. The preset threshold therefore takes values between 0 and 1.
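The cumulative-product threshold can be computed directly; a minimal sketch (the linear β schedule is an assumption, since the patent does not fix the schedule values):

```python
import numpy as np

def preset_threshold(t, num_steps=100, beta_start=1e-4, beta_end=0.02):
    """Return the preset threshold at noise coefficient t: the cumulative
    product of the diffusion coefficients alpha_s = 1 - beta_s up to step t."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bar = np.cumprod(1.0 - betas)
    return float(alpha_bar[t - 1])

# Larger noise coefficients give smaller thresholds, so heavily noised
# images need to retain less of the clear-image precision to keep a label.
for t in (30, 50, 100):
    print(t, preset_threshold(t))
```

The threshold always lies in (0, 1) and decreases monotonically in t, consistent with the value range stated above.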
For example, based on the attribute precisions described in steps S141 and S142: when precision A_noise level 3 < precision A_clear image × preset threshold, the attribute-A label at noise level 3 is considered invalid. When precision A_noise level 2 ≥ precision A_clear image × preset threshold, the attribute-A label at noise level 2 is a valid attribute label.
And S15, training an attribute recognition model based on the fused noise image and the corresponding effective attribute label so as to recognize the attribute of the image based on the trained attribute recognition model.
Referring to fig. 7, a flowchart of S15 in the attribute identification model training method of the present invention is shown. As shown in fig. 7, the step S15 includes the following steps:
S151, initializing parameters of the attribute identification model.
That is, the parameters of the attribute identification model are initialized with the parameters of the trained clear data attribute identification model.
S152, training the attribute identification model based on each fused noise image and the corresponding effective attribute label, in order from the largest noise to the smallest.
The corresponding effective attribute labels are combined with each fused noise image, and training proceeds in order from the largest noise to the smallest.

Specifically, taking a pedestrian target as an example: the fused noise image obtained by adding noise with coefficient t=100 to the clear pedestrian image is input, together with its labels, into the attribute identification model for training, producing a new model, namely the first attribute identification model. Then the fused noise image with noise coefficient t=90 and its labels are input into the first attribute identification model for training, yielding another new model, namely the second attribute identification model. By analogy, the fused noise images and their labels are iteratively trained in order of decreasing noise coefficient, progressively generating new models.
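The large-to-small iterative training of S152, together with the stage-wise learning-rate reduction of S153, can be sketched as follows. The model object and `train_one_stage` are hypothetical placeholders, since the patent does not fix a network architecture; the factor `gamma` in (0, 1) is the learning-rate reduction multiple described below.

```python
def train_one_stage(model, images, labels, lr):
    """Placeholder for one round of supervised training at a fixed noise level."""
    return model  # a real implementation would update the model weights here

def train_curriculum(model, fused_batches, gamma=0.1, base_lr=1e-3):
    """Train on fused noise images in order of decreasing noise coefficient.

    fused_batches: dict mapping noise coefficient t -> (images, valid_labels).
    Each stage starts from the previous stage's weights, and the learning
    rate is reduced to gamma times the previous stage's rate.
    """
    lr = base_lr
    history = []
    for t in sorted(fused_batches, reverse=True):   # e.g. 100, 90, 80, ...
        images, labels = fused_batches[t]
        model = train_one_stage(model, images, labels, lr)
        history.append((t, lr))
        lr *= gamma
    return model, history

model, history = train_curriculum(
    model=None,
    fused_batches={100: ([], []), 90: ([], []), 80: ([], [])})
print(history)  # stages run at t=100, 90, 80 with shrinking learning rates
```

A final fine-tuning pass on the clear target images (S153) would follow this loop, reusing the last stage's weights.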
And S153, training the attribute identification model again based on the target image and the corresponding attribute label to obtain a trained attribute identification model.
In this embodiment, when the recognition network model is trained with the fused noise images and labels, the initial learning rate of each stage is reduced to γ times the learning rate of the previous training, where γ takes values in the range 0 to 1.
And training the obtained attribute recognition model again based on the target image and the corresponding attribute label, and obtaining the final attribute recognition network model through iterative training.
The protection scope of the attribute identification model training method described in the embodiments of the present application is not limited to the execution sequence of the steps listed in the embodiments, and all the schemes implemented by adding or removing steps and replacing steps according to the principles of the present application in the prior art are included in the protection scope of the present application.
The present embodiment additionally provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the attribute identification model training method as described in fig. 1.
The present application may be a system, method, and/or computer program product at any possible level of technical detail. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for causing a processor to implement aspects of the present application.
The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include the following: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read-only memory (CD-ROM), digital versatile disks (DVD), memory sticks, floppy disks, and mechanical encoding devices such as punch cards or raised structures in grooves having instructions stored thereon, and any suitable combination of the foregoing. Computer readable storage media, as used herein, are not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., optical pulses through fiber optic cables), or electrical signals transmitted through wires.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device. Computer program instructions for carrying out operations of the present application may be assembly instructions, instruction set architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, integrated circuit configuration data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk or C++ and a procedural programming language such as the "C" language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
In some embodiments, aspects of the present application are implemented by personalizing electronic circuitry, such as programmable logic circuitry, field programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), with state information of computer readable program instructions, so that the electronic circuitry may execute the computer readable program instructions.
The embodiment of the application also provides an attribute identification model training system, which can realize the attribute identification model training method, but the implementation device of the attribute identification model training method includes but is not limited to the structure of the attribute identification model training system listed in the embodiment, and all the structural deformation and replacement of the prior art according to the principles of the application are included in the protection scope of the application.
The attribute recognition model training system provided in this embodiment will be described in detail below with reference to drawings.
The present embodiment provides an attribute identification model training system, including:
the acquisition module is used for acquiring a target clear image and an attribute tag of the target clear image;
the identification training module is used for training a clear data attribute identification model based on the target clear image and the attribute tag;
the diffusion noise module is used for adding noise with different degrees into the target image to generate a plurality of fusion noise images;
the label generation module is used for identifying the fusion noise image based on the trained clear data attribute identification model and acquiring an effective attribute label of the fusion noise image;
And the back diffusion training module is used for training an attribute recognition model based on the fused noise image and the corresponding effective attribute label so as to recognize the attribute of the image based on the trained attribute recognition model.
Referring to fig. 8, a schematic diagram of an attribute identification model training system according to an embodiment of the invention is shown. As shown in fig. 8, the attribute identification model training system includes: an acquisition module 81, an identification training module 82, a diffuse noise module 83, a tag generation module 84, and a back-diffusion training module 85.
The acquisition module 81 is configured to acquire a target clear image and an attribute tag of the target clear image. Specifically, a fixed camera photographs a target area to obtain collected images of the target area, where the collected images are clear images of the collected objects. The target clear image may be of any common image type, such as RGB, RGBD, grayscale image, thermal image, etc.
After the clear target image is obtained, corresponding labels are annotated on it manually.
The recognition training module 82 is connected to the acquisition module 81, and is configured to train a clear data attribute recognition model based on the target clear image and the attribute tag.
The recognition training module 82 inputs the target sharp image and the attribute tags into the sharp data attribute recognition model.
In this embodiment, an identification training network model is constructed according to the user requirements.
Specifically, first, a training data set and a test data set are acquired. The training data set includes: a target clear image and an attribute tag of the target clear image. Then, the object clear image and the attribute label of the object clear image are input into the recognition network model, and training is carried out on the recognition network model. And updating the identification network model according to the identification result through continuous training, and finally obtaining the clear data attribute identification model after label updating.
The parameters of the clear data attribute identification model are adjusted so that the clear data attribute identification model outputs the effective attribute labels of the target image. In this embodiment, by adjusting the parameters of the clear data attribute model, the effective attribute labels of the target image are finally obtained through the clear data attribute identification model.
The diffuse noise module 83 is connected to the recognition training module 82 and is configured to add noise of different degrees to the target image to generate a plurality of fused noise images.
Specifically, gaussian random functions are used to generate noise to varying degrees. In this embodiment, the Gaussian random function is used to generate the noise N (0, beta) t ). The noise values of different degrees are different noise degree coefficients, and the larger the noise degree coefficient is, the larger the noise is.
And adding the noise coefficient calculated by the Gaussian random function into a clear image of the target clear image to obtain fusion noise images with different noise coefficients.
The tag generation module 84 is connected to the diffuse noise module 83, and is configured to identify the fused noise image based on a trained clear data attribute identification model, and obtain an effective attribute tag of the fused noise image.
The target clear image is input into the trained clear data attribute identification model to obtain the first attribute definition precision of each attribute of the target clear image. The formula is: first attribute definition precision = number of correctly identified samples / total number of identified samples.
And acquiring second attribute definition precision of each attribute of the fused noise image based on the trained definition data attribute identification model.
And acquiring the effective attribute tag according to the first attribute definition precision and the second attribute definition precision.
Based on the first attribute definition precision and the second attribute definition precision obtained in the steps, whether the second attribute definition precision is not lower than the product of the first attribute definition precision and a preset threshold value is judged, and whether the attribute is a valid attribute label is further judged.
Specifically, when the second attribute definition precision is lower than the product of the first attribute definition precision and a preset threshold value, judging that the corresponding label is an invalid attribute label; and when the second attribute definition precision is greater than or equal to the product of the first attribute definition precision and a preset threshold value, judging the corresponding label as an effective attribute label.
The inverse diffusion training module 85 is configured to train an attribute recognition model based on the fused noise image and the corresponding valid attribute tag, so as to perform attribute recognition of the image based on the trained attribute recognition model.
The parameters of the attribute identification model are initialized with the parameters of the trained clear data attribute identification model.
And training the attribute identification model based on each fused noise image and the corresponding effective attribute label in sequence from the large noise to the small noise.
The corresponding effective attribute labels are combined with each fused noise image, and training proceeds in order from the largest noise to the smallest. Specifically, taking a pedestrian target as an example: the fused noise image obtained by adding noise with coefficient t=100 to the clear pedestrian image is input, together with its labels, into the attribute identification model for training, producing a new model, namely the first attribute identification model. Then the fused noise image with noise coefficient t=90 and its labels are input into the first attribute identification model for training, yielding another new model, namely the second attribute identification model. By analogy, the fused noise images and their labels are iteratively trained in order of decreasing noise coefficient, progressively generating new models.
And finally, training the attribute identification model again based on the target image and the corresponding attribute label to obtain a trained attribute identification model.
It should be noted that, it should be understood that the division of the modules of the above system is merely a division of a logic function, and may be fully or partially integrated into a physical entity or may be physically separated. And these modules may all be implemented in software in the form of calls by the processing element; or can be realized in hardware; the method can also be realized in a form of calling software by a processing element, and the method can be realized in a form of hardware by a part of modules. For example, the x module may be a processing element that is set up separately, may be implemented in a chip of the system, or may be stored in a memory of the system in the form of program code, and the function of the x module may be called and executed by a processing element of the system. The implementation of the other modules is similar. In addition, all or part of the modules can be integrated together or can be independently implemented. The processing element described herein may be an integrated circuit having signal processing capabilities. In implementation, each step of the above method or each module above may be implemented by an integrated logic circuit of hardware in a processor element or an instruction in a software form.
The above modules may be one or more integrated circuits configured to implement the above methods, for example: one or more application specific integrated circuits (ASICs), one or more digital signal processors (DSPs), one or more field programmable gate arrays (FPGAs), etc. For another example, when a module is implemented in the form of program code scheduled by a processing element, the processing element may be a general purpose processor, such as a central processing unit (CPU) or another processor that may invoke the program code. For another example, the modules may be integrated together and implemented in the form of a system-on-a-chip (SOC).
Referring to fig. 9, fig. 9 is a schematic structural diagram of an attribute identification model training device according to an embodiment of the present invention. As shown in fig. 9, the present embodiment provides an attribute identification model training device including: a processor 91 and a memory 92; the memory 92 is used for storing a computer program; the processor 91 is connected to the memory 92 and executes the computer program stored in the memory 92, so as to cause the attribute identification model training device to perform the steps of the attribute identification model training method as described above.
Preferably, the memory may comprise random access memory (Random Access Memory, abbreviated as RAM), and may further comprise non-volatile memory, such as at least one magnetic disk memory.
The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU for short), a network processor (Network Processor, NP for short), etc.; but also digital signal processors (Digital Signal Processing, DSP for short), application specific integrated circuits (Application Specific Integrated Circuit, ASIC for short), field programmable gate arrays (Field Programmable Gate Array, FPGA for short) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components.
In summary, the attribute identification model training method, system, medium and device provided by the application have the following beneficial effects:
According to the training method based on the diffusion model, the network is prevented from guessing during training, and the identifiable portion of the attribute labels is reinforced in training according to the visually identifiable information. This avoids the loss of the upper limit of model identification accuracy caused by fusing noisy data and clear data under one label during training, improves the accuracy of the attribute identification model, and effectively increases the generalization capability of the identification network. Meanwhile, a lightweight model structure is adopted, so the requirements on equipment are low and the applicability is strong. The invention therefore effectively overcomes various defects in the prior art and has high industrial utilization value.
The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Any person skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the invention. Accordingly, all equivalent modifications and variations completed by those of ordinary skill in the art without departing from the spirit and technical ideas disclosed by the invention shall still be covered by the claims of the invention.
Claims (10)
1. The attribute identification model training method is characterized by comprising the following steps of:
acquiring a target clear image and an attribute tag of the target clear image;
training a clear data attribute recognition model based on the target clear image and the attribute tag;
adding noise with different degrees into the target clear image to generate a plurality of fusion noise images;
identifying the fusion noise image based on the trained clear data attribute identification model, and acquiring an effective attribute label of the fusion noise image;
and training an attribute recognition model based on the fused noise image and the corresponding effective attribute label so as to recognize the attribute of the image based on the trained attribute recognition model.
2. The attribute recognition model training method of claim 1, wherein training a sharp data attribute recognition model based on the target sharp image and the attribute tag comprises the steps of:
inputting the target clear image and the attribute tag into the clear data attribute identification model;
and adjusting parameters of the clear data attribute identification model so that the clear data attribute identification model outputs effective attribute labels of the target image.
3. The method of training a model for attribute identification of claim 1 wherein adding different levels of noise to the target sharp image to generate a plurality of fused noise images comprises the steps of:
generating noise with different degrees by adopting a Gaussian random function;
and fusing the noise with the target clear image in sequence according to the increasing sequence of the noise so as to acquire the fused noise image.
4. A method of training a model for attribute identification according to claim 3, wherein the fused noise image is: x_t = √(ᾱ_t) · x_0 + √(1 − ᾱ_t) · ε, where x_0 is the target clear image, ε is the noise generated by the Gaussian random function, and ᾱ_t is the cumulative product of the diffusion coefficients.
5. The attribute recognition model training method of claim 1, wherein the fused noise image is recognized based on a trained clear data attribute recognition model, and acquiring a valid attribute tag of the fused noise image comprises the steps of:
acquiring first attribute definition precision of each attribute of the target definition image based on the trained definition data attribute identification model;
acquiring second attribute definition precision of each attribute of the fused noise image based on the trained definition data attribute identification model;
and acquiring the effective attribute tag according to the first attribute definition precision and the second attribute definition precision.
6. The attribute identification model training method of claim 5 wherein obtaining valid attribute tags from the first and second attribute resolution levels comprises the steps of:
when the second attribute definition precision is lower than the product of the first attribute definition precision and a preset threshold value, judging that the corresponding label is an invalid attribute label;
and when the second attribute definition precision is greater than or equal to the product of the first attribute definition precision and a preset threshold value, judging the corresponding label as an effective attribute label.
8. The attribute recognition model training method of claim 1 wherein training an attribute recognition model based on the fused noise image and the corresponding valid attribute tags to perform attribute recognition of the image based on the trained attribute recognition model comprises the steps of:
initializing parameters of the attribute identification model;
training the attribute identification model based on each fused noise image and the corresponding effective attribute label in sequence from big to small according to the noise;
and training the attribute identification model again based on the target image and the corresponding attribute label to obtain a trained attribute identification model.
9. An attribute identification model training system, comprising:
the acquisition module is used for acquiring a target clear image and an attribute tag of the target clear image;
the identification training module is used for training a clear data attribute identification model based on the target clear image and the attribute tag;
The noise diffusion module is used for adding noise with different degrees into the target clear image to generate a plurality of fusion noise images;
the label generation module is used for identifying the fusion noise image based on the trained clear data attribute identification model and acquiring an effective attribute label of the fusion noise image;
and the back diffusion training module is used for training an attribute recognition model based on the fused noise image and the corresponding effective attribute label so as to recognize the attribute of the image based on the trained attribute recognition model.
10. An attribute identification model training device, comprising: a processor and a memory;
the memory is used for storing a computer program;
the processor is connected to the memory for executing a computer program stored by the memory for causing the attribute identification model training means to perform the attribute identification model training method of any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310152682.7A CN116152603A (en) | 2023-02-21 | 2023-02-21 | Attribute identification model training method, system, medium and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116152603A true CN116152603A (en) | 2023-05-23 |
Family
ID=86354072
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310152682.7A Pending CN116152603A (en) | 2023-02-21 | 2023-02-21 | Attribute identification model training method, system, medium and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116152603A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116386023A (en) * | 2023-05-29 | 2023-07-04 | 松立控股集团股份有限公司 | High-phase locomotive brand recognition method and system based on space-time diffusion and electronic equipment |
CN116704269A (en) * | 2023-08-04 | 2023-09-05 | 腾讯科技(深圳)有限公司 | Data processing method, device, equipment and storage medium |
CN117710763A (en) * | 2023-11-23 | 2024-03-15 | 广州航海学院 | Image noise recognition model training method, image noise recognition method and device |
Similar Documents
Publication | Title |
---|---|
CN108229531B (en) | Object feature extraction method and device, storage medium and electronic equipment | |
CN110222831B (en) | Robustness evaluation method and device of deep learning model and storage medium | |
US11367271B2 (en) | Similarity propagation for one-shot and few-shot image segmentation | |
CN116152603A (en) | Attribute identification model training method, system, medium and device | |
CN108229488B (en) | Method and device for detecting key points of object and electronic equipment | |
CN112465840B (en) | Semantic segmentation model training method, semantic segmentation method and related device | |
CN113837257B (en) | Target detection method and device | |
CN108491812B (en) | Method and device for generating face recognition model | |
CN113841161A (en) | Extensible architecture for automatically generating content distribution images | |
CN114861842B (en) | Few-sample target detection method and device and electronic equipment | |
CN112418195A (en) | Face key point detection method and device, electronic equipment and storage medium | |
CN114691912A (en) | Method, apparatus and computer-readable storage medium for image processing | |
CN114693694A (en) | Method, apparatus and computer-readable storage medium for image processing | |
CN113744280B (en) | Image processing method, device, equipment and medium | |
JP7294275B2 (en) | Image processing device, image processing program and image processing method | |
Rasheed et al. | A Novel Model Driven Framework for Image Enhancement and Object Recognition | |
CN111723859A (en) | Target positioning method and system based on weak tags | |
CN116363561A (en) | Time sequence action positioning method, device, equipment and storage medium | |
CN116363363A (en) | Unsupervised domain adaptive semantic segmentation method, device, equipment and readable storage medium | |
CN113408528B (en) | Quality recognition method and device for commodity image, computing equipment and storage medium | |
CN115731451A (en) | Model training method and device, electronic equipment and storage medium | |
CN115375657A (en) | Method for training polyp detection model, detection method, device, medium, and apparatus | |
CN115761389A (en) | Image sample amplification method and device, electronic device and storage medium | |
CN113920377A (en) | Method of classifying image, computer device, and storage medium | |
CN114912568A (en) | Method, apparatus and computer-readable storage medium for data processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |