CN112052818A - Unsupervised domain adaptive pedestrian detection method, unsupervised domain adaptive pedestrian detection system and storage medium - Google Patents


Info

Publication number: CN112052818A (granted publication: CN112052818B)
Application number: CN202010968987.1A
Authority: CN (China)
Legal status: Granted; currently active
Prior art keywords: pedestrian, pedestrian detection, image data, network, data
Other languages: Chinese (zh)
Inventor: 谭宇志
Current and original assignee: Zhejiang Smart Video Security Innovation Center Co Ltd
Application filed by: Zhejiang Smart Video Security Innovation Center Co Ltd

Classifications

    • G06V 40/23: Recognition of whole body movements, e.g. for sport training
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06V 10/40: Extraction of image or video features
    • G06V 20/53: Recognition of crowd images, e.g. recognition of crowd congestion
    • G06V 40/103: Static body considered as a whole, e.g. static pedestrian or occupant recognition


Abstract

The embodiments of the present application provide an unsupervised domain adaptive pedestrian detection method, system and storage medium. The method only requires collecting unlabeled image data in a new scene and does not require large-scale image annotation, which greatly reduces the manpower and material cost of data annotation during development and improves efficiency; it also improves the transfer capability of the model, making it better able to adapt to scene changes.

Description

Unsupervised domain adaptive pedestrian detection method, unsupervised domain adaptive pedestrian detection system and storage medium
Technical Field
The present application belongs to the technical field of pedestrian detection, and in particular relates to an unsupervised domain adaptive pedestrian detection method, system and storage medium.
Background
As pedestrian detection is increasingly used in fields such as intelligent security and autonomous driving, its application scenes are becoming more and more varied. However, lighting conditions, background, camera angle and so on all differ between scenes, which means the data distribution usually differs from scene to scene. Since the practical application scenes of pedestrian detection are so diverse, directly applying a network model trained in one scene to another scene causes a large drop in detection performance. Deep learning methods rely on large amounts of labeled data to improve the generalization performance of a network model, so the prior art mainly re-labels a large amount of data for each scene in a data-driven manner and then retrains the network on the newly labeled data to obtain a network model for the new scene.
Specifically, existing methods mainly handle this problem with supervised transfer learning: new data are manually annotated in the new scene, and the original model is fine-tuned on the newly annotated data to achieve model transfer. The concrete procedure is: 1. collect data in the target domain scene; 2. manually annotate the collected data; 3. fine-tune the model already trained on the source domain using the new data and its label information; 4. perform target-domain object detection with the fine-tuned model.
However, although this approach can adapt the network model to a new scene, each scene switch generates a large amount of new scene data that must be annotated, which consumes a great deal of manpower and material resources and brings great inconvenience to applications and developers. In addition, as scenes keep changing, the transfer capability of the model gradually deteriorates. Therefore, the above method of fine-tuning on labeled data hits a serious bottleneck in practical applications, and a new pedestrian detection algorithm is urgently needed to solve these problems.
Disclosure of Invention
The present invention provides an unsupervised domain adaptive pedestrian detection method, system and storage medium, aiming to solve the problem in the prior art that fine-tuning on labeled data requires manually annotating a large amount of data in each new scene, which is time-consuming and labor-intensive.
According to a first aspect of embodiments of the present application, there is provided an unsupervised domain adaptive pedestrian detection method, comprising the steps of:
randomly selecting labeled image data and unlabeled image data;
applying random data enhancement to the labeled image data to obtain enhanced labeled image data; applying random data enhancement to the unlabeled image data twice to obtain first enhanced unlabeled image data and second enhanced unlabeled image data respectively;
inputting the enhanced labeled image data to a first pedestrian detection network to obtain first pedestrian prediction features; inputting the first enhanced unlabeled image data to the first pedestrian detection network to obtain second pedestrian prediction features; inputting the second enhanced unlabeled image data to a second pedestrian detection network to obtain third pedestrian prediction features;
obtaining a supervised learning cost according to the label features of the labeled image data and the first pedestrian prediction features; obtaining a consistency cost according to the second pedestrian prediction features and the third pedestrian prediction features;
adding the supervised learning cost and the consistency cost to obtain a total cost;
updating the weight parameters of the first pedestrian detection network by a stochastic gradient descent algorithm according to the total cost; and
updating the weight parameters of the second pedestrian detection network by an exponential moving average algorithm according to the weight parameters of the first pedestrian detection network.
In some embodiments of the present application, the unsupervised domain adaptive pedestrian detection method further comprises:
repeating the above steps until the first pedestrian detection network and the second pedestrian detection network converge, obtaining an updated first pedestrian detection network and an updated second pedestrian detection network;
and inputting unlabeled image data to be detected into the updated first pedestrian detection network to obtain a pedestrian detection result.
In some embodiments of the present application, the labeled image data is randomly selected from image data whose labels are known; the unlabeled image data is randomly selected from the unlabeled image data to be detected.
In some embodiments of the present application, the first, second and third pedestrian prediction features include the size information, classification information and position information of pedestrians in the image.
In some embodiments of the present application, the supervised learning cost and the consistency cost include a pedestrian classification loss, a pedestrian center point offset loss, and a pedestrian bounding box width and height loss.
In some embodiments of the present application, the first and second pedestrian detection networks initially employ the same neural network architecture.
In some embodiments of the present application, the random data enhancement includes random enhancement of the image size or pixels.
In some embodiments of the present application, the weight parameters of the second pedestrian detection network are computed as an exponential moving average of the weight parameters of the first pedestrian detection network during training.
According to a second aspect of the embodiments of the present application, there is provided an unsupervised domain adaptive pedestrian detection system, specifically comprising:
a training data selection module, configured to randomly select labeled image data and unlabeled image data;
a data enhancement module, configured to apply random data enhancement to the labeled image data to obtain enhanced labeled image data, and to apply random data enhancement to the unlabeled image data twice to obtain first enhanced unlabeled image data and second enhanced unlabeled image data respectively;
a feature prediction network module, configured to input the enhanced labeled image data to a first pedestrian detection network to obtain first pedestrian prediction features, to input the first enhanced unlabeled image data to the first pedestrian detection network to obtain second pedestrian prediction features, and to input the second enhanced unlabeled image data to a second pedestrian detection network to obtain third pedestrian prediction features;
a supervised learning cost module, configured to obtain a supervised learning cost according to the label features of the labeled image data and the first pedestrian prediction features;
a consistency cost module, configured to obtain a consistency cost according to the second pedestrian prediction features and the third pedestrian prediction features;
a total cost module, configured to add the supervised learning cost and the consistency cost to obtain a total cost;
a first pedestrian detection network update module, configured to update the weight parameters of the first pedestrian detection network by a stochastic gradient descent algorithm according to the total cost; and
a second pedestrian detection network update module, configured to update the weight parameters of the second pedestrian detection network by an exponential moving average algorithm according to the weight parameters of the first pedestrian detection network.
In some embodiments of the present application, the unsupervised domain adaptive pedestrian detection system further comprises:
a training convergence module, configured to repeat the above steps until the first pedestrian detection network and the second pedestrian detection network converge, obtaining an updated first pedestrian detection network and an updated second pedestrian detection network; and
a pedestrian detection module, configured to input unlabeled image data to be detected into the updated first pedestrian detection network to obtain a pedestrian detection result.
According to a third aspect of the embodiments of the present application, there is provided a computer-readable storage medium having a computer program stored thereon; the computer program, when executed by a processor, implements the unsupervised domain adaptive pedestrian detection method.
With the unsupervised domain adaptive pedestrian detection method, system and storage medium of the embodiments of the present application, labeled image data and unlabeled image data are randomly selected; random data enhancement is applied to the labeled image data to obtain enhanced labeled image data, and random data enhancement is applied to the unlabeled image data twice to obtain first enhanced unlabeled image data and second enhanced unlabeled image data respectively; the enhanced labeled image data is input to a first pedestrian detection network to obtain first pedestrian prediction features, the first enhanced unlabeled image data is input to the first pedestrian detection network to obtain second pedestrian prediction features, and the second enhanced unlabeled image data is input to a second pedestrian detection network to obtain third pedestrian prediction features; a supervised learning cost is obtained according to the label features of the labeled image data and the first pedestrian prediction features, and a consistency cost is obtained according to the second and third pedestrian prediction features; the supervised learning cost and the consistency cost are added to obtain a total cost; the weight parameters of the first pedestrian detection network are updated by a stochastic gradient descent algorithm according to the total cost; and the weight parameters of the second pedestrian detection network are updated by an exponential moving average algorithm according to the weight parameters of the first pedestrian detection network. The method strengthens the transfer capability of the model through transfer learning: by unsupervised domain adaptation, the model is trained jointly on the labeled data of the existing scene and the unlabeled data of the new scene, so that the representation ability learned on existing-scene data can be transferred to new-scene data. Only unlabeled image data in the new scene needs to be collected, and no large-scale image annotation is required, which greatly reduces the manpower and material cost of data annotation during development, improves efficiency, improves the transfer capability of the model, and makes it better able to adapt to scene changes.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a flow chart illustrating the steps of an unsupervised domain adaptive pedestrian detection method according to an embodiment of the present application;
FIG. 2 is a schematic diagram illustrating the steps of an unsupervised domain adaptive pedestrian detection method according to another embodiment of the present application;
FIG. 3 is a flow diagram of an unsupervised domain adaptive pedestrian detection method according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an unsupervised domain adaptive pedestrian detection system according to an embodiment of the present application.
Detailed Description
In the process of implementing the present application, the inventor found that existing approaches to handling pedestrian detection data in a new scene use supervised transfer learning: image data in the new scene are manually annotated to obtain new data, and the detection network of the original scene is fine-tuned on them. As scenes keep changing, the transfer capability of the model gradually deteriorates, the detection results become inaccurate, and the large amount of manual data annotation is time-consuming and labor-intensive.
To address the problems that the transfer capability of the model gradually deteriorates and that labeled data is costly, the present application provides an unsupervised domain adaptive transfer learning method that only needs to collect unlabeled data in the target scene and does not require manually annotating new data.
The present application constructs two identical pedestrian detection network models, one serving as a student model and the other as a teacher model. The teacher network generates pseudo labels from the target domain data, and the generated pseudo labels guide the student network that is to be trained. By making the output of the student network as close as possible to the output of the teacher network, the student network becomes better adapted to the distribution of the target domain data.
With the unsupervised domain adaptive pedestrian detection method, system and storage medium of the present application, labeled image data and unlabeled image data are randomly selected; random data enhancement is applied to the labeled image data to obtain enhanced labeled image data, and random data enhancement is applied to the unlabeled image data twice to obtain first enhanced unlabeled image data and second enhanced unlabeled image data respectively; the enhanced labeled image data is input to a first pedestrian detection network to obtain first pedestrian prediction features, the first enhanced unlabeled image data is input to the first pedestrian detection network to obtain second pedestrian prediction features, and the second enhanced unlabeled image data is input to a second pedestrian detection network to obtain third pedestrian prediction features; a supervised learning cost is obtained according to the label features of the labeled image data and the first pedestrian prediction features, and a consistency cost is obtained according to the second and third pedestrian prediction features; the supervised learning cost and the consistency cost are added to obtain a total cost; the weight parameters of the first pedestrian detection network are updated by a stochastic gradient descent algorithm according to the total cost; and the weight parameters of the second pedestrian detection network are updated by an exponential moving average algorithm according to the weight parameters of the first pedestrian detection network. The method strengthens the transfer capability of the model through transfer learning: by unsupervised domain adaptation, the model is trained jointly on the labeled data of the existing scene and the unlabeled data of the new scene, so that the representation ability learned on existing-scene data can be transferred to new-scene data. Only unlabeled image data in the new scene needs to be collected, and no large-scale re-annotation of images is required, which greatly improves efficiency and thus greatly improves the transfer performance of the model.
In order to make the technical solutions and advantages of the embodiments of the present application clearer, exemplary embodiments of the present application are described in further detail below with reference to the accompanying drawings. It is clear that the described embodiments are only some of the embodiments of the present application, not an exhaustive list of all embodiments. It should be noted that the embodiments in the present application and the features in the embodiments may be combined with each other without conflict.
Example 1
A flowchart of the steps of an unsupervised domain adaptive pedestrian detection method according to an embodiment of the present application is shown in FIG. 1.
As shown in fig. 1, the unsupervised domain adaptive pedestrian detection method specifically includes the following steps:
S101: randomly select labeled image data and unlabeled image data.
The labeled image data is randomly selected from image data whose labels are known; the unlabeled image data is randomly selected from the unlabeled image data to be detected.
S102: apply random data enhancement to the labeled image data to obtain enhanced labeled image data, and apply random data enhancement to the unlabeled image data twice to obtain first enhanced unlabeled image data and second enhanced unlabeled image data respectively.
The random data enhancement includes random enhancement of the image size or pixels.
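For illustration only, the following is a minimal sketch of the kind of random size and pixel enhancement described above, written with torchvision-style transforms. The concrete resize range, crop size and color-jitter parameters are assumptions for the sketch and are not values taken from this embodiment.

```python
from torchvision import transforms

def build_random_enhancement(out_size=(512, 512)):
    """Random size / pixel enhancement; the concrete parameters are illustrative assumptions."""
    return transforms.Compose([
        transforms.RandomResizedCrop(out_size, scale=(0.6, 1.0)),  # random size enhancement
        transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),  # random pixel enhancement
        transforms.RandomHorizontalFlip(p=0.5),
        transforms.ToTensor(),
    ])

enhance = build_random_enhancement()
# The same unlabeled image x_t is enhanced twice to produce two different views,
# one for the student network and one for the teacher network:
#   x_t_view1 = enhance(x_t); x_t_view2 = enhance(x_t)
# For labeled images, the geometric transforms (crop, flip) would also have to be
# applied to the pedestrian boxes; that bookkeeping is omitted here.
```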
S103: input the enhanced labeled image data to the first pedestrian detection network to obtain the first pedestrian prediction features; input the first enhanced unlabeled image data to the first pedestrian detection network to obtain the second pedestrian prediction features; and input the second enhanced unlabeled image data to the second pedestrian detection network to obtain the third pedestrian prediction features.
The first pedestrian detection network and the second pedestrian detection network initially adopt the same neural network architecture.
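As a sketch of this arrangement in PyTorch (the backbone shown is a placeholder assumption, not the patent's network), the student and teacher can be built from the same architecture, with the teacher initialized to the student's weights and excluded from gradient-based updates:

```python
import copy
import torch.nn as nn

def build_detector() -> nn.Module:
    # Placeholder backbone; in practice this would be an anchor-based (SSD, YOLO-V3)
    # or anchor-free (Center-Net, YOLO-V1) pedestrian detection network, as noted below.
    return nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 6, 1))

student = build_detector()          # first pedestrian detection network (student model)
teacher = copy.deepcopy(student)    # second network: identical architecture and initial weights
for p in teacher.parameters():
    p.requires_grad_(False)         # the teacher is updated by EMA, not by backpropagation
```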
S104: obtain the supervised learning cost according to the label features of the labeled image data and the first pedestrian prediction features, and obtain the consistency cost according to the second pedestrian prediction features and the third pedestrian prediction features.
The first, second and third pedestrian prediction features include the size information, classification information and position information of pedestrians in the image.
S105: add the supervised learning cost and the consistency cost to obtain the total cost.
S106: update the weight parameters of the first pedestrian detection network by a stochastic gradient descent algorithm according to the total cost.
The supervised learning cost and the consistency cost each include a pedestrian classification loss, a pedestrian center point offset loss, and a pedestrian bounding box width and height loss.
S107: update the weight parameters of the second pedestrian detection network by an exponential moving average algorithm according to the weight parameters of the first pedestrian detection network.
The weight parameters of the second pedestrian detection network are computed as an exponential moving average of the weight parameters of the first pedestrian detection network during training.
A schematic diagram of steps in an unsupervised domain adapted pedestrian detection method according to another embodiment of the present application is shown in fig. 2.
As shown in FIG. 2, in some embodiments of the present application, the unsupervised domain adaptive pedestrian detection method further comprises the following steps:
S108: repeat steps S101 to S107 until the first pedestrian detection network and the second pedestrian detection network converge, obtaining an updated first pedestrian detection network and an updated second pedestrian detection network.
S109: input unlabeled image data to be detected into the updated first pedestrian detection network to obtain a pedestrian detection result.
A flow diagram of an unsupervised domain adaptive pedestrian detection method according to an embodiment of the present application is shown in FIG. 3.
To implement the unsupervised domain adaptive pedestrian detection method, two identical pedestrian detection network models need to be built, namely a first pedestrian detection network and a second pedestrian detection network; the first pedestrian detection network serves as the student model and the second pedestrian detection network serves as the teacher model.
In this embodiment, the pedestrian detection network may adopt an anchor-based network, such as an SSD or YOLO-V3 network structure; an anchor-free pedestrian detection network, such as a Center-Net or YOLO-V1 network structure, may also be used.
As shown in FIG. 3, the unsupervised domain adaptive pedestrian detection method of the present application specifically includes the following steps:
1) Training data are randomly selected from the labeled image data of the existing scene and the unlabeled image data of the new scene respectively: labeled image data (x_s, B_s) and unlabeled image data x_t, where B_s denotes the labels of the labeled image data x_s.
2) The labeled image data (x_s, B_s) undergo random data enhancement to obtain the enhanced labeled data; x_t undergoes random data enhancement twice to obtain the first enhanced unlabeled image data and the second enhanced unlabeled image data respectively.
3) The enhanced labeled data and the first enhanced unlabeled image data are input into the student network to obtain the output features f_s and f_t^S respectively; the second enhanced unlabeled image data is input into the teacher network to obtain the output feature f_t^T, which serves as the pseudo label.
4) The supervision loss l_supervised between the output feature f_s and the label B_s is computed, and the consistency loss l_consist between f_t^S and f_t^T is computed.
5) The supervision loss l_supervised and the consistency loss l_consist are summed to obtain the total cost l_total.
6) The student network weight parameters are updated by a stochastic gradient descent (SGD) algorithm.
7) The student network weights are propagated to the teacher network by an exponential moving average (EMA).
8) Return to step 1) and repeat this series of steps until the student network converges.
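A minimal sketch of one iteration of this loop is given below in PyTorch style. The data-enhancement function, the two loss functions and the EMA decay value are passed in as parameters and are assumptions made for illustration; they are not prescribed by this embodiment.

```python
import torch

def train_step(student, teacher, optimizer, labeled_batch, unlabeled_batch,
               enhance, supervised_loss_fn, consistency_loss_fn, ema_decay=0.99):
    """One pass through steps 1)-7); `enhance` is an assumed batched random data enhancement."""
    x_s, b_s = labeled_batch                  # labeled image data (x_s, B_s) from the existing scene
    x_t = unlabeled_batch                     # unlabeled image data x_t from the new scene

    x_s_aug = enhance(x_s)                    # enhanced labeled data
    x_t_aug1, x_t_aug2 = enhance(x_t), enhance(x_t)   # two differently enhanced views of x_t

    f_s = student(x_s_aug)                    # student output features on labeled data
    f_t_student = student(x_t_aug1)           # student features f_t^S on the first enhanced view
    with torch.no_grad():
        f_t_teacher = teacher(x_t_aug2)       # teacher features f_t^T on the second view: the pseudo label

    l_supervised = supervised_loss_fn(f_s, b_s)                 # supervision loss
    l_consist = consistency_loss_fn(f_t_student, f_t_teacher)   # consistency loss
    l_total = l_supervised + l_consist                          # total cost

    optimizer.zero_grad()
    l_total.backward()
    optimizer.step()                          # SGD update of the student (first) network

    with torch.no_grad():                     # EMA update of the teacher (second) network
        for p_t, p_s in zip(teacher.parameters(), student.parameters()):
            p_t.mul_(ema_decay).add_(p_s, alpha=1.0 - ema_decay)
    return l_total.item()
```

Step 8) then amounts to calling this routine repeatedly over randomly sampled labeled and unlabeled batches until the student network converges.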
In the testing stage, the image to be detected from the target domain in the new scene is input into the trained student network, and the student network outputs the classification confidence, the box offsets, and the box width and height. Feature points whose pedestrian classification confidence is greater than a certain threshold, together with their corresponding boxes, are taken as the final output; in this embodiment the threshold is set to 0.7.
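For the Center-Net style output described in this embodiment, inference could look roughly like the sketch below. The heatmap/size/offset tensor layout and the coordinate handling are assumptions made for the sketch; only the 0.7 confidence threshold comes from this embodiment.

```python
import torch

def detect_pedestrians(student, image, score_threshold=0.7):
    """Return (x1, y1, x2, y2, score) boxes for feature points above the confidence threshold."""
    with torch.no_grad():
        heatmap, wh, offset = student(image.unsqueeze(0))   # assumed output layout
    scores = heatmap[0, 0]                                  # pedestrian class confidence per feature point
    ys, xs = torch.nonzero(scores > score_threshold, as_tuple=True)
    boxes = []
    for y, x in zip(ys.tolist(), xs.tolist()):
        cx = x + offset[0, 0, y, x].item()                  # center refined by the predicted offset
        cy = y + offset[0, 1, y, x].item()
        w, h = wh[0, 0, y, x].item(), wh[0, 1, y, x].item() # predicted box width and height
        boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, scores[y, x].item()))
    return boxes
```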
In a specific implementation, the total cost of the student network, i.e., its loss function, consists of two parts: the supervision loss on the source domain data of the existing scene and the consistency loss on the target domain data of the new scene.
Regarding the supervision loss on the source domain data of the existing scene, taking the Center-Net detection network as an example, the output at each feature point of the network includes the classification information and the position information of the object to which that feature point belongs. The classification information is represented as the class confidence for each class in the detection task. The position information is represented by the offset of each point from the center of the target to which it belongs, together with the width and height of that target's bounding box.
Therefore, the supervision loss of the network includes the pedestrian center point classification loss, the pedestrian center point offset loss, and the pedestrian bounding box width and height loss.
The total supervision loss l_supervised is a weighted sum of three losses, as given in equation (1):
L_supervised = L_k + λ_shape · L_shape + λ_off · L_off    (1)
where L_k is the classification loss, L_shape is the pedestrian bounding box width and height loss, L_off is the pedestrian center point offset loss, λ_shape is the weight of the bounding box width and height loss, and λ_off is the weight of the center point offset loss.
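As an illustration of equation (1), the weighted sum could be computed as in the sketch below. The concrete form of each term (binary cross-entropy on the center heatmap, masked L1 on box size and offset) and the weight values are assumed Center-Net style choices, not values fixed by this embodiment.

```python
import torch
import torch.nn.functional as F

def supervised_loss(pred_heatmap, pred_wh, pred_offset,
                    gt_heatmap, gt_wh, gt_offset, gt_mask,
                    lambda_shape=0.1, lambda_off=1.0):
    """Equation (1): L = L_k + lambda_shape * L_shape + lambda_off * L_off.

    Predictions are assumed to be sigmoid-normalized heatmaps plus per-point size and
    offset maps; gt_mask marks the feature points that carry a ground-truth pedestrian.
    """
    l_k = F.binary_cross_entropy(pred_heatmap, gt_heatmap)                                     # center classification loss
    l_shape = (gt_mask * torch.abs(pred_wh - gt_wh)).sum() / gt_mask.sum().clamp(min=1)        # box width/height loss
    l_off = (gt_mask * torch.abs(pred_offset - gt_offset)).sum() / gt_mask.sum().clamp(min=1)  # center offset loss
    return l_k + lambda_shape * l_shape + lambda_off * l_off
```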
Regarding the consistency loss on the target domain data of the new scene: since the target domain data has no manual label information, the label information used is derived from the output of the teacher network. Its form is the same as that of the supervision loss on the source domain data of the existing scene, including the classification confidence loss and the box position and width/height losses, and is not described again here.
Specifically, as shown in FIG. 3, the student network receives two types of input data: image data from the source domain and image data from the target domain.
From the source domain image data, the student network outputs the feature f_s, which includes the pedestrian positions and class confidences in that image data.
Because the labels of the source domain image data are known, the supervised loss l_supervised can be calculated by comparing the pedestrian positions and class confidences predicted by the network with the true labels. During training, gradient descent makes l_supervised smaller and smaller, so that the student network outputs increasingly accurate pedestrian detection results on the source domain.
On the other hand, from the target domain image data the student network outputs the feature f_t^S, which includes the pedestrian positions and class confidences in that image data.
The target domain has no label information and therefore cannot be used directly to train the network. The present application therefore provides a method for constructing pseudo labels: the target domain image data is input into the teacher network to obtain the feature f_t^T, and f_t^T serves as the "pseudo label".
The output feature f_t^S of the student network is then compared with the pseudo label f_t^T and the loss l_consist is calculated. During training, gradient descent makes l_consist smaller and smaller, so that the student network and the teacher network output increasingly accurate pedestrian detection predictions on the target domain.
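Since the consistency loss has the same form as the supervision loss but uses the teacher output f_t^T in place of the ground-truth label, a sketch under the same assumed Center-Net style output layout could look as follows. The rule of supervising size and offset only at confident teacher points is an assumption made for illustration.

```python
import torch
import torch.nn.functional as F

def consistency_loss(student_out, teacher_out, lambda_shape=0.1, lambda_off=1.0):
    """Consistency cost: the teacher prediction f_t^T plays the role of the label for f_t^S."""
    s_heatmap, s_wh, s_offset = student_out   # student features on the first enhanced view
    t_heatmap, t_wh, t_offset = teacher_out   # teacher features (pseudo label) on the second view
    mask = (t_heatmap > 0.5).float()          # assumed: only confident teacher points supervise size/offset
    l_k = F.binary_cross_entropy(s_heatmap, t_heatmap)
    l_shape = (mask * torch.abs(s_wh - t_wh)).sum() / mask.sum().clamp(min=1)
    l_off = (mask * torch.abs(s_offset - t_offset)).sum() / mask.sum().clamp(min=1)
    return l_k + lambda_shape * l_shape + lambda_off * l_off
```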
In order to obtain the above-mentioned "pseudo label", the teacher model constructed in the present application is the same as the student model.
In the training process, the weights of the teacher network model are obtained by computing an exponential moving average (EMA) of the student network weights. The parameters of the teacher network after the current iteration are expressed as follows:
Y_t = λ · Y_{t-1} + (1 - λ) · X_t
where Y_t is the parameter of the teacher network after the current iteration, Y_{t-1} is the parameter of the teacher network after the previous update, X_t is the parameter of the student network after the current update, and λ is the weight assigned to the model parameters from the previous iteration.
When t = 0, Y is initialized to coincide with X, i.e., Y_0 = X_0.
The teacher network can therefore be viewed as a weighted sum of the student network over a series of different iteration stages, so its weights are smoother over time. At the same time, because it integrates student networks from different stages, it has stronger generalization capability.
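Written as a standalone helper, the parameter-wise form of this update is sketched below; the decay value λ is a hyperparameter whose concrete value is not specified here, and keeping non-learned buffers in sync is an added assumption rather than part of the formula.

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, decay=0.999):
    """Parameter-wise form of Y_t = decay * Y_(t-1) + (1 - decay) * X_t."""
    for y, x in zip(teacher.parameters(), student.parameters()):
        y.mul_(decay).add_(x, alpha=1.0 - decay)
    for y, x in zip(teacher.buffers(), student.buffers()):
        y.copy_(x)   # assumed detail: copy non-learned buffers such as batch-norm statistics
```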
The student network update in the embodiment of the present application uses stochastic gradient descent; other optimization-based update methods may also be adopted.
The teacher network update in the embodiment of the present application uses exponential moving average weighting; other weighting schemes may be substituted.
The pedestrian detection network in the embodiment of the present application is not limited to a specific deep network model.
In the unsupervised domain adaptive pedestrian detection method of the embodiment of the present application, the unlabeled target domain image is input to the student network and to the teacher network after different data enhancements. The teacher network predicts information such as the size, position and class confidence of the pedestrians in the image, and this information serves as the pseudo label that guides the learning of the student network. After the student network updates its weights, the teacher network weights are updated with the moving average.
Meanwhile, the teacher network and the student network receive images obtained from the same data through different data enhancements. By constraining the outputs of the two networks to be consistent, the student network can learn the underlying similarity of the target domain data. And because the teacher network has a higher generalization ability than the student network, making the output of the student network consistent with that of the teacher network enhances the generalization ability of the student network.
Finally, as shown in FIG. 3, by continuously iterating the above training process, the student model and the teacher model mutually promote each other's performance, so that the teacher model predicts better pseudo labels and the student model better adapts to the distribution of the target domain data.
With the unsupervised domain adaptive pedestrian detection method of the embodiment of the present application, labeled image data and unlabeled image data are randomly selected; random data enhancement is applied to the labeled image data to obtain enhanced labeled image data, and random data enhancement is applied to the unlabeled image data twice to obtain first enhanced unlabeled image data and second enhanced unlabeled image data respectively; the enhanced labeled image data is input to a first pedestrian detection network to obtain first pedestrian prediction features, the first enhanced unlabeled image data is input to the first pedestrian detection network to obtain second pedestrian prediction features, and the second enhanced unlabeled image data is input to a second pedestrian detection network to obtain third pedestrian prediction features; a supervised learning cost is obtained according to the label features of the labeled image data and the first pedestrian prediction features, and a consistency cost is obtained according to the second and third pedestrian prediction features; the supervised learning cost and the consistency cost are added to obtain a total cost; the weight parameters of the first pedestrian detection network are updated by a stochastic gradient descent algorithm according to the total cost; and the weight parameters of the second pedestrian detection network are updated by an exponential moving average algorithm according to the weight parameters of the first pedestrian detection network. The method strengthens the transfer capability of the model through transfer learning: by unsupervised domain adaptation, the model is trained jointly on the labeled data of the existing scene and the unlabeled data of the new scene, so that the representation ability learned on existing-scene data can be transferred to new-scene data. Only unlabeled image data in the new scene needs to be collected, and no large-scale image annotation is required, which greatly reduces the manpower and material cost of data annotation during development, improves efficiency, improves the transfer capability of the model, and makes it better able to adapt to scene changes.
Example 2
For details not disclosed in the unsupervised domain adaptive pedestrian detection system of this embodiment, please refer to the implementation of the unsupervised domain adaptive pedestrian detection method in the other embodiments.
Fig. 4 shows a schematic structural diagram of an unsupervised domain adapted pedestrian detection system according to an embodiment of the application.
As shown in fig. 4, the unsupervised domain adaptive pedestrian detection system according to the embodiment of the present application includes a training data selecting module 10, a data enhancing module 20, a feature prediction network module 30, a supervised learning cost module 40, a consistency cost module 50, a total cost module 60, a first pedestrian detection network updating module 70, and a second pedestrian detection network updating module 80.
Specifically, the method comprises the following steps:
the training data selection module 10: the method is used for randomly selecting the image data with the label and the image data without the label.
The data enhancement module 20: the system comprises a random data enhancement module, a tag image data acquisition module, a tag image data storage module and a tag image data processing module, wherein the random data enhancement module is used for enhancing the tag image data to obtain enhanced tag image data; the method is used for enhancing the label-free image data through random data to respectively obtain first enhanced label-free image data and second enhanced label-free image data.
Feature prediction network module 30: the system comprises a first pedestrian detection network, a second pedestrian detection network and a third pedestrian prediction network, wherein the first pedestrian prediction network is used for inputting enhanced labeled data to the first pedestrian detection network to obtain first pedestrian prediction characteristics; the first enhanced non-tag data is input to the first pedestrian detection network to obtain a second pedestrian prediction characteristic; and the third pedestrian prediction feature is obtained by inputting the second enhanced unlabeled image data to the second pedestrian detection network.
The supervised learning cost module 40: and obtaining the supervised learning cost according to the label characteristics of the labeled image data and the first pedestrian prediction characteristics.
Consistency cost module 50: and obtaining the consistency cost according to the second pedestrian prediction characteristic and the third pedestrian prediction characteristic.
Total cost module 60: and adding the supervised learning cost and the consistency cost to obtain the total cost.
The first pedestrian detection network update module 70: and updating the weight parameter of the first pedestrian detection network through a random gradient descent algorithm according to the total cost.
The second pedestrian detection network updating module 80: and the weight parameter updating module is used for updating the weight parameter of the second pedestrian detection network through an exponential moving average algorithm according to the weight parameter of the first pedestrian detection network.
In some embodiments of the present application, the unsupervised domain adaptive pedestrian detection system further comprises:
a training convergence module, configured to repeat the above steps until the first pedestrian detection network and the second pedestrian detection network converge, obtaining an updated first pedestrian detection network and an updated second pedestrian detection network; and
a pedestrian detection module, configured to input unlabeled image data to be detected into the updated first pedestrian detection network to obtain a pedestrian detection result.
An application flow diagram of the unsupervised domain adaptive pedestrian detection system according to the embodiment of the present application is also shown in FIG. 3.
First, two identical pedestrian detection network models need to be built, namely a first pedestrian detection network and a second pedestrian detection network; the first pedestrian detection network serves as the student model and the second pedestrian detection network serves as the teacher model.
In this embodiment, the pedestrian detection network may adopt an anchor-based network, such as an SSD or YOLO-V3 network structure; an anchor-free pedestrian detection network, such as a Center-Net or YOLO-V1 network structure, may also be used.
As shown in FIG. 3, the application of the unsupervised domain adaptive pedestrian detection system of the present application includes the following specific steps:
1) Training data are randomly selected from the labeled image data of the existing scene and the unlabeled image data of the new scene respectively: labeled image data (x_s, B_s) and unlabeled image data x_t, where B_s denotes the labels of the labeled image data x_s.
2) The labeled image data (x_s, B_s) undergo random data enhancement to obtain the enhanced labeled data; x_t undergoes random data enhancement twice to obtain the first enhanced unlabeled image data and the second enhanced unlabeled image data respectively.
3) The enhanced labeled data and the first enhanced unlabeled image data are input into the student network to obtain the output features f_s and f_t^S respectively; the second enhanced unlabeled image data is input into the teacher network to obtain the output feature f_t^T, which serves as the pseudo label.
4) The supervision loss l_supervised between the output feature f_s and the label B_s is computed, and the consistency loss l_consist between f_t^S and f_t^T is computed.
5) The supervision loss l_supervised and the consistency loss l_consist are summed to obtain the total cost l_total.
6) The student network weight parameters are updated by a stochastic gradient descent (SGD) algorithm.
7) The student network weights are propagated to the teacher network by an exponential moving average (EMA).
8) Return to step 1) and repeat this series of steps until the student network converges.
In the testing stage, the image to be detected from the target domain in the new scene is input into the trained student network, and the student network outputs the classification confidence, the box offsets, and the box width and height. Feature points whose pedestrian classification confidence is greater than a certain threshold, together with their corresponding boxes, are taken as the final output; in this embodiment the threshold is set to 0.7.
In a specific implementation, the total cost of the student network, i.e., its loss function, consists of two parts: the supervision loss on the source domain data of the existing scene and the consistency loss on the target domain data of the new scene.
Regarding the supervision loss on the source domain data of the existing scene, taking the Center-Net detection network as an example, the output at each feature point of the network includes the classification information and the position information of the object to which that feature point belongs. The classification information is represented as the class confidence for each class in the detection task. The position information is represented by the offset of each point from the center of the target to which it belongs, together with the width and height of that target's bounding box.
Therefore, the supervision loss of the network includes the pedestrian center point classification loss, the pedestrian center point offset loss, and the pedestrian bounding box width and height loss.
In the unsupervised domain adaptive pedestrian detection system of the embodiment of the present application, the unlabeled target domain image is input to the student network and to the teacher network after different data enhancements. The teacher network predicts information such as the size, position and class confidence of the pedestrians in the image, and this information serves as the pseudo label that guides the learning of the student network. After the student network updates its weights, the teacher network weights are updated with the moving average.
Meanwhile, the teacher network and the student network receive images obtained from the same data through different data enhancements. By constraining the outputs of the two networks to be consistent, the student network can learn the underlying similarity of the target domain data. And because the teacher network has a higher generalization ability than the student network, making the output of the student network consistent with that of the teacher network enhances the generalization ability of the student network.
Finally, as shown in FIG. 3, by continuously iterating the above training process, the student model and the teacher model mutually promote each other's performance, so that the teacher model predicts better pseudo labels and the student model better adapts to the distribution of the target domain data.
With the unsupervised domain adaptive pedestrian detection system of the embodiment of the present application, labeled image data and unlabeled image data are randomly selected; random data enhancement is applied to the labeled image data to obtain enhanced labeled image data, and random data enhancement is applied to the unlabeled image data twice to obtain first enhanced unlabeled image data and second enhanced unlabeled image data respectively; the enhanced labeled image data is input to a first pedestrian detection network to obtain first pedestrian prediction features, the first enhanced unlabeled image data is input to the first pedestrian detection network to obtain second pedestrian prediction features, and the second enhanced unlabeled image data is input to a second pedestrian detection network to obtain third pedestrian prediction features; a supervised learning cost is obtained according to the label features of the labeled image data and the first pedestrian prediction features, and a consistency cost is obtained according to the second and third pedestrian prediction features; the supervised learning cost and the consistency cost are added to obtain a total cost; the weight parameters of the first pedestrian detection network are updated by a stochastic gradient descent algorithm according to the total cost; and the weight parameters of the second pedestrian detection network are updated by an exponential moving average algorithm according to the weight parameters of the first pedestrian detection network. The system strengthens the transfer capability of the model through transfer learning: by unsupervised domain adaptation, the model is trained jointly on the labeled data of the existing scene and the unlabeled data of the new scene, so that the representation ability learned on existing-scene data can be transferred to new-scene data. Only unlabeled image data in the new scene needs to be collected, and no large-scale re-annotation of images is required, which greatly improves efficiency and thus greatly improves the transfer performance of the model.
Example 3
The present embodiment provides a computer-readable storage medium having a computer program stored thereon; the computer program is executed by a processor to implement the unsupervised domain adaptive pedestrian detection method of the other embodiments.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, these information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present invention. The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination", depending on the context.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. An unsupervised domain adaptive pedestrian detection method, comprising the steps of:
randomly selecting labeled image data and unlabeled image data;
applying random data enhancement to the labeled image data to obtain enhanced labeled image data; applying random data enhancement to the unlabeled image data twice to obtain first enhanced unlabeled image data and second enhanced unlabeled image data respectively;
inputting the enhanced labeled image data to a first pedestrian detection network to obtain first pedestrian prediction features; inputting the first enhanced unlabeled image data to the first pedestrian detection network to obtain second pedestrian prediction features; inputting the second enhanced unlabeled image data to a second pedestrian detection network to obtain third pedestrian prediction features;
obtaining a supervised learning cost according to label features of the labeled image data and the first pedestrian prediction features; obtaining a consistency cost according to the second pedestrian prediction features and the third pedestrian prediction features;
adding the supervised learning cost and the consistency cost to obtain a total cost;
updating weight parameters of the first pedestrian detection network by a stochastic gradient descent algorithm according to the total cost; and
updating weight parameters of the second pedestrian detection network by an exponential moving average algorithm according to the weight parameters of the first pedestrian detection network.
2. The unsupervised domain adapted pedestrian detection method of claim 1, further comprising:
repeating the above steps until the first pedestrian detection network and the second pedestrian detection network converge, to obtain an updated first pedestrian detection network and an updated second pedestrian detection network;
and inputting the unlabeled image data to be detected into the updated first pedestrian detection network to obtain a pedestrian detection result.
3. The unsupervised domain adapted pedestrian detection method of claim 1, wherein the labeled image data is randomly selected from image data whose label data is known, and the unlabeled image data is randomly selected from the unlabeled image data to be detected.
4. The unsupervised domain adapted pedestrian detection method of claim 1, wherein the first, second and third pedestrian prediction features comprise pedestrian size information, classification information and position information in the image.
5. The unsupervised domain adapted pedestrian detection method of claim 3, wherein the supervised learning cost and the consistency cost comprise a pedestrian classification loss, a pedestrian center-point offset loss, and a pedestrian bounding-box width and height loss.
6. The unsupervised domain adapted pedestrian detection method of claim 1, wherein the first and second pedestrian detection networks initially employ the same neural network architecture.
7. The unsupervised domain adapted pedestrian detection method of claim 1, wherein the weight parameters of the second pedestrian detection network are obtained as an exponential moving average of the weight parameters of the first pedestrian detection network during training.
8. An unsupervised domain adapted pedestrian detection system, comprising:
a training data selection module, configured to randomly select labeled image data and unlabeled image data;
a data enhancement module, configured to apply random data augmentation to the labeled image data to obtain enhanced labeled image data, and to apply random data augmentation to the unlabeled image data to obtain first enhanced unlabeled image data and second enhanced unlabeled image data, respectively;
a feature prediction network module, configured to input the enhanced labeled image data into a first pedestrian detection network to obtain a first pedestrian prediction feature, input the first enhanced unlabeled image data into the first pedestrian detection network to obtain a second pedestrian prediction feature, and input the second enhanced unlabeled image data into a second pedestrian detection network to obtain a third pedestrian prediction feature;
a supervised learning cost module, configured to obtain a supervised learning cost according to the label features of the labeled image data and the first pedestrian prediction feature;
a consistency cost module, configured to obtain a consistency cost according to the second pedestrian prediction feature and the third pedestrian prediction feature;
a total cost module, configured to add the supervised learning cost and the consistency cost to obtain a total cost;
a first pedestrian detection network update module, configured to update the weight parameters of the first pedestrian detection network through a stochastic gradient descent algorithm according to the total cost;
and a second pedestrian detection network update module, configured to update the weight parameters of the second pedestrian detection network through an exponential moving average algorithm according to the weight parameters of the first pedestrian detection network.
9. The unsupervised domain adapted pedestrian detection system of claim 8, further comprising:
a training convergence module, configured to repeat the above steps until the first pedestrian detection network and the second pedestrian detection network converge, to obtain an updated first pedestrian detection network and an updated second pedestrian detection network;
and a pedestrian detection module, configured to input unlabeled image data to be detected into the updated first pedestrian detection network to obtain a pedestrian detection result.
10. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the unsupervised domain adapted pedestrian detection method according to any one of claims 1-7.
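
For readers implementing the method, the following is a minimal, hypothetical sketch in Python with PyTorch (the patent itself does not prescribe a framework) of one training iteration as recited in claims 1, 6 and 7: a first (student) pedestrian detection network updated by stochastic gradient descent on the sum of the supervised learning cost and the consistency cost, and a second (teacher) network of the same architecture updated as an exponential moving average of the student's weights. All names used here (augment, supervised_cost, consistency_cost, decay, and so on) are illustrative assumptions rather than terms fixed by the patent.

    import torch

    def ema_update(teacher, student, decay=0.99):
        # Claim 7: the second network's weights are an exponential moving
        # average of the first network's weights over the course of training.
        with torch.no_grad():
            for t_param, s_param in zip(teacher.parameters(), student.parameters()):
                t_param.mul_(decay).add_(s_param, alpha=1.0 - decay)

    def train_step(student, teacher, optimizer, labeled_img, labels, unlabeled_img,
                   augment, supervised_cost, consistency_cost):
        # One iteration following the steps of claim 1.
        # Random data augmentation: one view of the labeled image and two
        # independently augmented views of the unlabeled image.
        labeled_aug = augment(labeled_img)
        unlabeled_aug_1 = augment(unlabeled_img)
        unlabeled_aug_2 = augment(unlabeled_img)

        # First pedestrian detection network (student).
        pred_labeled = student(labeled_aug)          # first pedestrian prediction feature
        pred_unlabeled_s = student(unlabeled_aug_1)  # second pedestrian prediction feature

        # Second pedestrian detection network (teacher); its predictions are
        # treated as fixed targets, so no gradients are tracked.
        with torch.no_grad():
            pred_unlabeled_t = teacher(unlabeled_aug_2)  # third pedestrian prediction feature

        # Total cost = supervised learning cost + consistency cost.
        total_cost = (supervised_cost(pred_labeled, labels)
                      + consistency_cost(pred_unlabeled_s, pred_unlabeled_t))

        # Update the first network by stochastic gradient descent on the total cost.
        optimizer.zero_grad()
        total_cost.backward()
        optimizer.step()

        # Update the second network as an exponential moving average of the first.
        ema_update(teacher, student)
        return total_cost.item()

Per claim 6 the two networks start from the same neural network architecture, so in this sketch the teacher could be created as teacher = copy.deepcopy(student) and the optimizer as torch.optim.SGD(student.parameters(), lr=0.01); the decay factor of 0.99 and the particular augmentations are illustrative choices, not values fixed by the patent.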
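Claims 4 and 5 indicate that the prediction features carry pedestrian classification, center-point and size information, and that the costs are composed of a pedestrian classification loss, a center-point offset loss, and a bounding-box width and height loss. The sketch below assumes, purely for illustration, that each prediction is a dictionary of dense maps in the style of anchor-free center-point detectors; the dictionary keys and the particular choice of binary cross-entropy, L1 and MSE terms are assumptions, not details fixed by the patent.

    import torch
    import torch.nn.functional as F

    def supervised_cost(pred, target, weights=(1.0, 1.0, 0.1)):
        # Claim 5: classification loss + center-point offset loss
        # + bounding-box width and height loss, combined with assumed weights.
        w_cls, w_off, w_wh = weights
        cls_loss = F.binary_cross_entropy_with_logits(pred['cls'], target['cls'])
        offset_loss = F.l1_loss(pred['offset'], target['offset'])
        wh_loss = F.l1_loss(pred['wh'], target['wh'])
        return w_cls * cls_loss + w_off * offset_loss + w_wh * wh_loss

    def consistency_cost(student_pred, teacher_pred):
        # Consistency between the second and third prediction features (claim 1):
        # the teacher's outputs serve as soft targets for the student.
        cls_term = F.mse_loss(torch.sigmoid(student_pred['cls']),
                              torch.sigmoid(teacher_pred['cls']))
        offset_term = F.l1_loss(student_pred['offset'], teacher_pred['offset'])
        wh_term = F.l1_loss(student_pred['wh'], teacher_pred['wh'])
        return cls_term + offset_term + wh_term

These two functions could be passed directly as the supervised_cost and consistency_cost arguments of the train_step sketch above.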
CN202010968987.1A 2020-09-15 2020-09-15 Method, system and storage medium for detecting pedestrians without supervision domain adaptation Active CN112052818B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010968987.1A CN112052818B (en) 2020-09-15 2020-09-15 Method, system and storage medium for detecting pedestrians without supervision domain adaptation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010968987.1A CN112052818B (en) 2020-09-15 2020-09-15 Method, system and storage medium for detecting pedestrians without supervision domain adaptation

Publications (2)

Publication Number Publication Date
CN112052818A true CN112052818A (en) 2020-12-08
CN112052818B CN112052818B (en) 2024-03-22

Family

ID=73602952

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010968987.1A Active CN112052818B (en) 2020-09-15 2020-09-15 Method, system and storage medium for detecting pedestrians without supervision domain adaptation

Country Status (1)

Country Link
CN (1) CN112052818B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100246980A1 (en) * 2009-03-31 2010-09-30 General Electric Company System and method for automatic landmark labeling with minimal supervision
KR20190042429A (en) * 2017-10-15 2019-04-24 알레시오 주식회사 Method for image processing
CN111226258A (en) * 2017-10-15 2020-06-02 阿莱西奥公司 Signal conversion system and signal conversion method
US20190325299A1 (en) * 2018-04-18 2019-10-24 Element Ai Inc. Unsupervised domain adaptation with similarity learning for images
CN109902798A (en) * 2018-05-31 2019-06-18 华为技术有限公司 The training method and device of deep neural network
KR102001781B1 (en) * 2018-08-28 2019-10-01 건국대학교 산학협력단 Method of improving learning accuracy of neural networks and apparatuses performing the same
CN109977918A (en) * 2019-04-09 2019-07-05 华南理工大学 A kind of target detection and localization optimization method adapted to based on unsupervised domain
CN110135295A (en) * 2019-04-29 2019-08-16 华南理工大学 A kind of unsupervised pedestrian recognition methods again based on transfer learning
CN110659591A (en) * 2019-09-07 2020-01-07 中国海洋大学 SAR image change detection method based on twin network

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114943868A (en) * 2021-05-31 2022-08-26 阿里巴巴新加坡控股有限公司 Image processing method, image processing device, storage medium and processor
CN114943868B (en) * 2021-05-31 2023-11-14 阿里巴巴新加坡控股有限公司 Image processing method, device, storage medium and processor
CN113255807A (en) * 2021-06-03 2021-08-13 北京的卢深视科技有限公司 Face analysis model training method, electronic device and storage medium
CN113255807B (en) * 2021-06-03 2022-03-25 北京的卢深视科技有限公司 Face analysis model training method, electronic device and storage medium
CN113536920A (en) * 2021-06-11 2021-10-22 复旦大学 Semi-supervised three-dimensional point cloud target detection method
CN113536920B (en) * 2021-06-11 2022-06-17 复旦大学 Semi-supervised three-dimensional point cloud target detection method
CN114399683A (en) * 2022-01-18 2022-04-26 南京甄视智能科技有限公司 End-to-end semi-supervised target detection method based on improved yolov5
CN114550215A (en) * 2022-02-25 2022-05-27 北京拙河科技有限公司 Target detection method and system based on transfer learning
CN114445670A (en) * 2022-04-11 2022-05-06 腾讯科技(深圳)有限公司 Training method, device and equipment of image processing model and storage medium

Also Published As

Publication number Publication date
CN112052818B (en) 2024-03-22

Similar Documents

Publication Publication Date Title
CN112052818B (en) Method, system and storage medium for detecting pedestrians without supervision domain adaptation
Qin et al. Ultra fast structure-aware deep lane detection
Lian et al. Road extraction methods in high-resolution remote sensing images: A comprehensive review
CN113221905B (en) Semantic segmentation unsupervised domain adaptation method, device and system based on uniform clustering and storage medium
JP2018097807A (en) Learning device
EP3767536A1 (en) Latent code for unsupervised domain adaptation
CN107872644A (en) Video frequency monitoring method and device
CN104217225A (en) A visual target detection and labeling method
CN108491766B (en) End-to-end crowd counting method based on depth decision forest
Fang et al. Survey on the application of deep reinforcement learning in image processing
CN113469186B (en) Cross-domain migration image segmentation method based on small number of point labels
CN113128478B (en) Model training method, pedestrian analysis method, device, equipment and storage medium
CN112132014B (en) Target re-identification method and system based on non-supervised pyramid similarity learning
CN112232355B (en) Image segmentation network processing method, image segmentation device and computer equipment
CN110096979B (en) Model construction method, crowd density estimation method, device, equipment and medium
CN104268546A (en) Dynamic scene classification method based on topic model
CN113807399A (en) Neural network training method, neural network detection method and neural network detection device
KR20230171966A (en) Image processing method and device and computer-readable storage medium
CN114175068A (en) Method for performing on-device learning on machine learning network of automatic driving automobile through multi-stage learning by using adaptive hyper-parameter set and on-device learning device using same
CN112927266A (en) Weak supervision time domain action positioning method and system based on uncertainty guide training
CN114511077A (en) Training point cloud processing neural networks using pseudo-element based data augmentation
CN111914949B (en) Zero sample learning model training method and device based on reinforcement learning
CN113326825A (en) Pseudo tag generation method and device, electronic equipment and storage medium
CN116681961A (en) Weak supervision target detection method based on semi-supervision method and noise processing
Alajlan et al. Automatic lane marking prediction using convolutional neural network and S-Shaped Binary Butterfly Optimization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant