WO2017124336A1 - Method and system for adapting deep model for object representation from source domain to target domain - Google Patents

Method and system for adapting deep model for object representation from source domain to target domain

Info

Publication number
WO2017124336A1
Authority
WO
WIPO (PCT)
Prior art keywords
objects
fine
criterions
deep model
target domain
Prior art date
Application number
PCT/CN2016/071501
Other languages
French (fr)
Inventor
Xiaoou Tang
Zhanpeng Zhang
Ping Luo
Chen Change Loy
Original Assignee
Sensetime Group Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sensetime Group Limited filed Critical Sensetime Group Limited
Priority to CN201680079452.0A priority Critical patent/CN108604304A/en
Priority to PCT/CN2016/071501 priority patent/WO2017124336A1/en
Publication of WO2017124336A1 publication Critical patent/WO2017124336A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133Distances to prototypes
    • G06F18/24137Distances to cluster centroïds

Abstract

A method for adapting a deep model for object representation from a source domain to a target domain, comprises: extracting, by the deep model for the source domain, features for objects from input images for the target domain; inferring group labels for objects according to the extracted features; discovering criterions based on target domain priors derived from the input images and the inferred group labels, wherein the criterions contain information indicating which objects should not be inferred to have a same group label; and fine-tuning the deep model for the source domain according to the discovered criterions, wherein the fine-tuned deep model is outputted as a deep model for the target domain. A system for adapting a deep model for object representation from a source domain to a target domain is also disclosed.

Description

[Title established by the ISA under Rule 37.2] METHOD AND SYSTEM FOR ADAPTING DEEP MODEL FOR OBJECT REPRESENTATION FROM SOURCE DOMAIN TO TARGET DOMAIN

Technical Field
The disclosures relate to a method and a system for adapting a deep model for object representation from a source domain to a target domain.
Background
Deep learning approaches have achieved substantial advances in object (e.g., face, dog, basketball) recognition. However, contemporary deep models, for example deep convolution networks, usually overfit to the training data distributions and thus are not directly generalisable to other, unseen target domains. In addition, the annotated data in the unseen target domain is usually not sufficient for training a new deep model. These problems limit the application of deep learning to tasks such as object tracking, retrieval, and clustering in unseen images/videos. One example is face clustering in movies, i.e., grouping detected faces into different subsets according to different characters. Clustering faces in movies is extremely challenging since characters’ appearance may vary drastically under different scenes as the story progresses. In addition, the various cinematic styles of different movies make it difficult to learn a universal face representation for all movies. Conventional techniques that assume fixed handcrafted features for clustering are infeasible for this problem: handcrafted features are susceptible to large appearance, illumination, and viewpoint variations, and thus cannot cope with the drastic appearance changes in movies.
Deep learning approaches have achieved substantial advances in object representation learning. These methods could arguably provide a more robust representation for object recognition. However, contemporary deep models for object recognition are trained with web images or photos from albums. These models overfit to the training data distributions and thus are not directly generalisable to applications in a different target domain.
Therefore, it is desired to provide a method for adapting a deep model from the source domain to the target domain automatically.
Summary
The following presents a simplified summary of the disclosure in order to provide a basic understanding of some aspects of the disclosure. This summary is not an extensive overview of the disclosure. It is intended to neither identify key or critical elements of the disclosure nor delineate any scope of particular embodiments of the disclosure, or any scope of the claims. Its sole purpose is to present some concepts of the disclosure in a simplified form as a prelude to the more detailed description that is presented later.
In an aspect, disclosed is a method for adapting a deep model for object representation from a source domain to a target domain, comprising: extracting, by a deep model for the source domain, features for objects from input images for the target domain; inferring group labels for objects according to the extracted features; discovering criterions based on target domain priors derived from the input images and the inferred group labels, wherein the criterions contain information indicating which objects should not be inferred to have a same group label; and fine-tuning the deep model for the source domain according to the discovered criterions, wherein the fine-tuned deep model is outputted as a deep model for the target domain.
In one embodiment of the present application, the extracting, the inferring, the discovering, and the fine-tuning are implemented in an iterative feedback loop that is performed for predetermined times, wherein in starting iteration of the iterative feedback loop, the features for objects are extracted from input images for the target domain by the deep model for the source domain, in iterations following the starting  iteration, the features for objects are extracted from input images for the target domain by the fine-tuned deep model fine-tuned in a previous iteration of the iterative feedback loop.
In one embodiment of the present application, the inferring comprises: computing, according to the extracted features of the objects, a judgment score for each of candidate group label distributions for the objects; determining a candidate group label distribution having highest judgment score; and inferring, based on the determined distribution, group labels for objects, wherein the higher the similarity between the features of the objects having same group label is, the higher the judgment score is.
In one embodiment of the present application, the target domain prior comprises information on the objects in the input images or relationship between objects in the input images.
In one embodiment of the present application, the discovering comprises: computing degrees of difference between objects that are inferred to have the same group label; and choosing pairs of object, having a degree of difference larger than a threshold, as the criterions.
In one embodiment of the present application, the discovering comprises: choosing pairs of object from the objects, which is inferred to have the same group label but should have different group labels according to the target domain prior as the criterions.
In one embodiment of the present application, the fine-tuning comprises: computing a fine-tuning score for each of candidate parameter adjustments according to the discovered criterions; determining the candidate parameter adjustment having highest fine-tuning score; and fine-tuning the deep model with the determined parameter adjustment, wherein the fine-tuning score indicates the similarity between  the objects having a same group label, and the higher the similarity is, the higher the fine-tuning score is.
In an aspect, disclosed is a system for adapting a deep model for object representation from a source domain to a target domain, comprising: a feature extraction unit configured to receive the deep model for the source domain and use the deep model to extract features for objects from input images for the target domain; an inference unit configured to infer group labels for objects according to the extracted features; a criterions discovery unit configured to discover criterions based on target domain priors derived from the input images and the inferred group labels, wherein the criterions contain information indicating which objects should not be inferred to have a same group label; and a training unit configured to fine-tune the deep model for the source domain according to the discovered criterions, wherein the fine-tuned deep model is outputted as the deep model for the target domain.
In an aspect, disclosed is a system for adapting a deep model for object representation from a source domain to a target domain, comprising: a memory that stores executable components; and a processor electrically coupled to the memory to execute the executable components for: extracting, by a deep model for the source domain, features for objects from input images for the target domain; inferring group labels for objects according to the extracted features; discovering criterions based on target domain priors derived from the input images and the inferred group labels, wherein the criterions contain information indicating which objects should not be inferred to have a same group label; and fine-tuning the deep model for the source domain according to the discovered criterions, wherein the fine-tuned deep model is outputted as the deep model for the target domain.
Brief Description of the Drawing
Exemplary non-limiting embodiments of the present application are described below with reference to the attached drawings. The drawings are illustrative  and generally not to an exact scale. The same or similar elements on different figures are referenced with the same reference numbers.
Fig. 1 shows the overall pipeline of the system for adapting a deep model for object representation from a source domain to a target domain according to some embodiments of the present application;
Fig. 2 shows the steps used for the inference unit according to some embodiments of the present application;
Fig. 3 shows the steps used for the criterions discovery unit according to some embodiments of the present application; and
Fig. 4 shows the steps used for the training unit according to some embodiments of the present application.
Detailed Description
Reference will now be made in detail to some specific embodiments of the invention including the best modes contemplated by the inventors for carrying out the invention. Examples of these specific embodiments are illustrated in the accompanying drawings. While the invention is described in conjunction with these specific embodiments, it will be appreciated by one skilled in the art that it is not intended to limit the invention to the described embodiments. On the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. The present application may be practiced without some or all of these specific details. In other instances, well-known process operations have not been described in detail in order not to unnecessarily obscure the present application.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a” , “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising” , when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Fig. 1 shows the overall pipeline of the system for adapting a deep model for object representation from a source domain to a target domain according to some embodiments of the present application. In some embodiments, the deep model may be a deep convolution network (DCN). The system for adapting a deep model for object representation from a source domain to a target domain 100 comprises a feature extraction unit 101, an inference unit 102, a criterions discovery unit 103 and a training unit 104. The feature extraction unit 101 is configured to extract features for objects from input images for the target domain by a deep model for the source domain; the inference unit 102 is configured to infer group labels for objects according to the extracted features; the criterions discovery unit 103 is configured to discover criterions based on target domain priors derived from the input images and the inferred group labels; and the training unit 104 is configured to fine-tune the deep model for the source domain according to the discovered criterions and output the fine-tuned deep model as the deep model for the target domain.
In some embodiments of the present application, the criterions may contain information indicating which objects should not be inferred to have a same group label. The group label may indicate the property, name, classification and the like of the objects. For example, if the system is used for face recognition in a movie, the group label may be the name of the character. If the system is used for object detection in a photo, the group label may be the classification of the object, such as “chair”, “table” and the like.
In some embodiments, the system 100 carries out its functions in an iterative way. In other words, the units 101-104 may be implemented as an iterative feedback loop. Specifically, in each iteration, the feature extraction unit 101 extracts the features from the input images. After that, the inference unit 102 infers group labels for the objects according to the extracted features. Then the criterions discovery unit 103 discovers criterions from the inferred group labels. With the discovered criterions, the training unit 104 fine-tunes the deep model. Then the next iteration is performed. This iterative feedback loop ends when the desired performance is achieved or the predetermined running time is reached. In this way, the deep model is fine-tuned several times and becomes more suitable for the target domain. During the iterative feedback loop, in the starting iteration the features for objects are extracted from the input images for the target domain by the deep model for the source domain; in iterations following the starting iteration, the features for objects are extracted from the input images by the deep model fine-tuned in the previous iteration of the iterative feedback loop. At the end of the iterative feedback loop, the deep model fine-tuned in the last iteration is outputted.
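The iterative feedback loop described above can be sketched roughly as follows. This is a minimal illustration only: the four unit behaviours are passed in as callables, and all function and parameter names are hypothetical placeholders rather than names used by the present application.

```python
# Minimal sketch of the iterative feedback loop formed by units 101-104.
# The four callables are hypothetical placeholders for the units described above.

def adapt_deep_model(source_model, target_images, target_priors,
                     extract_features, infer_group_labels,
                     discover_criterions, fine_tune,
                     num_iterations=5):
    """Return a deep model adapted from the source domain to the target domain."""
    model = source_model
    for _ in range(num_iterations):                            # predetermined number of iterations
        features = extract_features(model, target_images)      # feature extraction unit 101
        labels = infer_group_labels(features)                  # inference unit 102
        criterions = discover_criterions(features, labels,     # criterions discovery unit 103
                                         target_priors)
        model = fine_tune(model, target_images, criterions)    # training unit 104
    return model                                               # deep model for the target domain
```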
In some embodiments, the feature extraction unit 101 may be configured with a deep convolutional network (DCN) that consists of successive convolutional filter banks; that is, the deep convolutional network is used as the deep model. The DCN may be initialized by training on a large source domain for image classification/recognition (e.g., the large-scale image classification dataset IMAGENET, or a large-scale face dataset), or received from another unit, or input by a user. For example, when the system 100 is used for face recognition, the pre-trained DCN may be the DCN used in DeepID2+. Specifically, the input may be, for example, a 55×47 RGB face image. The DCN has a plurality of, for example four, successive convolution layers followed by one fully connected layer. Each convolution layer contains learnable filters and is followed by a 2×2 max-pooling layer and Rectified Linear Units (ReLUs) as the activation function. In this embodiment, the number of feature maps generated by each convolution layer is 128, and the dimension of the face representation generated by the final fully connected layer is 512. The DCN is pre-trained on CelebFace (as an example), with around 290,000 face images from 12,000 identities. The training process is conducted by back-propagation using both the identification and verification loss functions. It should be appreciated that other databases with a different number of training face images may be applicable.
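As a rough illustration of the architecture just described, a PyTorch sketch is given below. The convolution kernel sizes, padding, and the use of a lazily initialized fully connected layer are assumptions made for illustration; they are not the exact DeepID2+ configuration.

```python
import torch
import torch.nn as nn

class FaceDCN(nn.Module):
    """Sketch of the DCN described above: four conv blocks (128 feature maps each,
    2x2 max-pooling, ReLU) and a 512-d fully connected face representation.
    Kernel size and padding are illustrative assumptions."""

    def __init__(self, feature_dim=512):
        super().__init__()
        layers, in_channels = [], 3
        for _ in range(4):                                   # four successive convolution layers
            layers += [nn.Conv2d(in_channels, 128, kernel_size=3, padding=1),
                       nn.MaxPool2d(2),                      # 2x2 max-pooling
                       nn.ReLU(inplace=True)]                # ReLU activation
            in_channels = 128
        self.conv = nn.Sequential(*layers)
        self.fc = nn.LazyLinear(feature_dim)                 # 512-d representation

    def forward(self, x):                                    # x: (N, 3, 55, 47) RGB face crops
        h = self.conv(x)
        return self.fc(h.flatten(1))

features = FaceDCN()(torch.randn(8, 3, 55, 47))             # -> shape (8, 512)
```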
Fig. 2 shows the steps used for the inference unit according to some embodiments of the present application. In these embodiments, the extracted features are fed into the inference unit 102, and the inference unit 102 then operates to find an appropriate group label distribution for the objects in the input images according to the extracted features, i.e., it infers the group label for each object according to the features thereof. The process of inference may be implemented by the following steps.
At step S201, a judgment score for each of candidate group label distributions for the objects is computed according to the features of the objects, wherein the higher the similarity between the features of the objects having the same group label is, the higher the judgment score is, i.e., the judgment score represents the degree of appropriateness of that distribution. At step S202, the judgment scores of the different distributions are compared with each other, and the candidate group label distribution having the highest judgment score is determined. At step S203, group labels for the objects are inferred based on the determined distribution.
In a specific example, the judgment score may be the value of a function that contains variables related to the features of the objects, the relations between the features, or the like. For extracted features $X = \{x^i_j\}$, where $x^i_j$ denotes the feature of the $j$-th object of the $i$-th cluster (the clusters may be predetermined), the group label of each $x^i_j$ in $X$ is denoted as $y^i_j$ and may be inferred by maximizing a function $p(X, Y)$:

$$p(X, Y) = \prod_{i,j} \Big[ p\big(x^i_j \mid y^i_j\big) \prod_{x^m_k \in \mathcal{N}(x^i_j)} p\big(y^i_j, y^m_k \mid x^i_j, x^m_k\big) \Big], \qquad (1)$$

where $\mathcal{N}(x^i_j)$ signifies the set of input images that are the neighbors of $x^i_j$ in the feature space, and $p(y^i_j = l, y^m_k = l' \mid x^i_j, x^m_k)$ represents the probability of assigning group labels $l$ and $l'$ to $x^i_j$ and $x^m_k$ respectively, i.e., the probability of assigning group labels $l$ and $l'$ to the objects to which $x^i_j$ and $x^m_k$ correspond. A Gaussian distribution may be employed to model the first term in Eqn. (1):

$$p\big(x^i_j \mid y^i_j = l\big) = \mathcal{N}\big(x^i_j;\, \mu_l, \Sigma_l\big), \qquad (2)$$

where $\mu_l$ and $\Sigma_l$ denote the mean and covariance matrix of the Gaussian of the $l$-th character, which are obtained and updated in the learning process. The second term in Eqn. (1) is defined as

$$p\big(y^i_j = l, y^m_k = l' \mid x^i_j, x^m_k\big) \propto \exp\big(\alpha \cdot \mathbf{1}(l = l') \cdot \upsilon(x^i_j, x^m_k)\big), \qquad (3)$$

wherein $\mathbf{1}(\cdot)$ is the indicator function and $\alpha$ is a trade-off coefficient between Eqn. (2) and Eqn. (3). Furthermore, $\upsilon(\cdot,\cdot)$ is a pre-computed function that encodes the relation between any pair of features $x^i_j$ and $x^m_k$, where a positive relation (i.e., $\upsilon(\cdot,\cdot) > 0$) means that the features are likely from the same character; otherwise, they belong to different characters. Specifically, the computation of $\upsilon$ is a combination of the similarity between the appearances of a pair of features (i.e., the similarity between the features of a pair of objects) and the pairwise spatial and temporal criterions of the features, which may be obtained from the input images. For instance, when the system is used for face representation learning and clustering in movies, face images in the same location in two successive frames belong to the same character, while face images appearing in the same frame belong to different characters. In general, Eqn. (3) encourages face images with a positive relation to be the same character. For example, if $\upsilon(x^i_j, x^m_k) > 0$ and $l = l'$, then $p(y^i_j = l, y^m_k = l' \mid x^i_j, x^m_k)$ is large; however, if $\upsilon(x^i_j, x^m_k) > 0$ but $l \neq l'$, then $p(y^i_j = l, y^m_k = l' \mid x^i_j, x^m_k)$ is small, indicating that the group label distribution violates the pairwise criterions. The group label distribution $Y$ giving Eqn. (1) its highest value may be considered the most appropriate distribution and may be determined as the resulting group label distribution, from which the group labels for the objects can be inferred.
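As a rough, non-authoritative illustration of how the judgment score of Eqn. (1) could be evaluated for one candidate group label distribution, a sketch in log form is given below. The input structures (neighbor lists, the pre-computed relation matrix, and per-label Gaussian parameters) are assumptions made for the example, not data formats defined by the present application.

```python
import numpy as np
from scipy.stats import multivariate_normal

def judgment_score(features, labels, neighbors, relation, means, covs, alpha=1.0):
    """Log of Eqn. (1) for one candidate group-label distribution (sketch only).

    features : (N, D) array of extracted features
    labels   : length-N sequence, candidate group label of each object
    neighbors: dict mapping object index -> list of neighbor indices in feature space
    relation : (N, N) pre-computed pairwise relation v(., .)
    means, covs : per-label Gaussian parameters {l: mu_l}, {l: Sigma_l}
    """
    # First term of Eqn. (1): Gaussian likelihood of each feature under its label (Eqn. (2)).
    unary = sum(multivariate_normal.logpdf(features[i], means[l], covs[l])
                for i, l in enumerate(labels))
    # Second term, via Eqn. (3): reward neighboring pairs with positive relation and equal labels.
    pairwise = sum(alpha * (labels[i] == labels[j]) * relation[i, j]
                   for i in range(len(labels)) for j in neighbors[i])
    return unary + pairwise          # higher score = more appropriate distribution
```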
Fig. 3 shows the steps used for the criterions discovery unit 103 according to some embodiments of the present application. After inferring the group labels, the resulting group labels for the objects as well as the input images are fed into the criterions discovery unit 103. In the criterions discovery unit 103, the following steps are performed. At step S301, the degrees of difference between objects that are inferred to have the same group label are computed. At step S302, the object pairs having a degree of difference larger than a threshold are chosen as the criterions. At step S303, the object pairs that are inferred to have the same group label but should have different group labels according to the target domain prior are chosen as the criterions. These criterions will be used in the training unit 104 to fine-tune the DCN of the feature extraction unit 101. In some embodiments, step S302 may be omitted; in some embodiments, step S303 may be omitted.
In some embodiments, the degrees of difference between objects that are inferred to have the same group label may be obtained by calculating the distance between the features of each pair of objects in the feature space, for example, by calculating the L2-distance between the features of two objects. Then the top 20% (or another percentage) of object pairs with the largest degree of difference (for example, L2-distance) are chosen as the criterions; that is, the object pairs having a degree of difference larger than a threshold are chosen as the criterions. For example, in the scenario where the 20% of object pairs with the largest degree of difference (for example, L2-distance) are chosen as the criterions, the threshold is the shortest L2-distance in the top 20% of all L2-distances. A large L2-distance means that the two objects likely belong to different group labels, so the inference for two objects having a large L2-distance is likely erroneous; the DCN used to extract the features should be corrected, and the information that “these two objects belong to different labels” will be used as a criterion in the correction process. So, at step S302, the object pairs having a degree of difference larger than a threshold are chosen as the criterions.
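A minimal sketch of steps S301-S302, assuming the extracted features and inferred labels are available as NumPy arrays, might look as follows.

```python
import numpy as np

def discover_distance_criterions(features, labels, top_fraction=0.20):
    """Steps S301-S302 (sketch): among objects sharing an inferred group label,
    pick the pairs with the largest L2 feature distance as the criterions."""
    pairs, dists = [], []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == labels[j]:                                  # same inferred group label
                pairs.append((i, j))
                dists.append(np.linalg.norm(features[i] - features[j]))  # L2-distance
    if not pairs:
        return []
    order = np.argsort(dists)[::-1]                                     # largest distances first
    k = max(1, int(np.ceil(top_fraction * len(pairs))))                 # e.g. top 20%
    return [pairs[idx] for idx in order[:k]]                            # pairs that should differ
```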
In some embodiments, before calculating the degrees of difference between objects that are inferred to have the same group label, the overall similarity degree of all objects having the same group label may first be calculated, for example, as the trace of the covariance matrix, i.e., trace(Σl), wherein Σl denotes the covariance matrix of the Gaussian of the l-th group label; the lower the overall similarity degree is, the larger trace(Σl) is. Then, only the objects whose group label has trace(Σl) larger than a threshold are considered when calculating the degrees of difference between objects that are inferred to have the same group label.
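A possible form of this pre-filtering step, assuming the per-label covariance matrices are kept in a dictionary, is sketched below.

```python
import numpy as np

def labels_to_examine(covariances, trace_threshold):
    """Optional pre-filtering (sketch): keep only group labels whose Gaussian covariance
    has a large trace, i.e. low within-group similarity, for pairwise difference checks."""
    return [l for l, sigma in covariances.items() if np.trace(sigma) > trace_threshold]
```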
In some embodiments, the target domain prior comprises information on the objects in the input images or the relationship between objects in the input images. For example, when the system is used for face tracking or clustering in a movie, the target domain prior can be the context extracted from the subtitles that helps to identify a character’s face. Other similar priors can be in a pairwise form: faces appearing in the same frame of a video/movie are unlikely to belong to the same person (negative pair), while two faces in the same location in neighboring frames are more likely to belong to the same person (positive pair). If a pair of objects are inferred to have the same group label, but it is known from the target domain prior that these two objects should not have the same group label, the label inference for these two objects is likely erroneous; the DCN used to extract the features should be corrected, and the information that “these two objects belong to different labels” will be used as a criterion in the correction process. So, at step S303, object pairs that are inferred to have the same group label but should have different group labels according to the target domain prior are chosen as the criterions.
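One possible way to derive such pairwise priors from per-frame face detections is sketched below. The use of bounding-box overlap (IoU) as a proxy for “same location”, and the chosen threshold, are assumptions made for illustration.

```python
def pairwise_video_priors(detections, iou_threshold=0.5):
    """Sketch: derive pairwise target-domain priors from detections.
    detections: list of (frame_index, box, object_id) with box = (x1, y1, x2, y2).
    Returns (negative_pairs, positive_pairs) of object-id tuples."""

    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / float(area_a + area_b - inter + 1e-9)

    negative, positive = [], []
    for frame_i, box_i, id_i in detections:
        for frame_j, box_j, id_j in detections:
            if id_i >= id_j:
                continue
            if frame_i == frame_j:                                  # same frame -> likely different persons
                negative.append((id_i, id_j))
            elif abs(frame_i - frame_j) == 1 and iou(box_i, box_j) > iou_threshold:
                positive.append((id_i, id_j))                       # same location in neighboring frames
    return negative, positive
```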
In some embodiments, the criterions may contain information on which pairs of objects that are assigned the same group label are actually not the same object.
Fig. 4 shows the steps used for the training unit 104 according to some embodiments of the present application. In the training unit, the original DCN or the DCN used in the previous iteration is fine-tuned according to the discovered criterions. The parameters of the DCN are adjusted in order to make the extracted features more consistent with the criterions. At step S401, a fine-tuning score for each of the candidate parameter adjustments is computed according to the discovered criterions; at step S402, the candidate parameter adjustment having the highest fine-tuning score is determined as the resulting parameter adjustment of the deep model; and at step S403, the deep model is fine-tuned with the determined parameter adjustment, and the fine-tuned deep model for the target domain is outputted.
In some embodiments, the fine-tuning score may be inversely proportional to the value of a function that contains variables related to the features of the objects, the relations between the features, or the like. For example, for criterions obtained from the criterions discovery unit 103, the function may be a contrastive loss function that encourages the features of objects with the same group label to be close and those of objects with different group labels to be far away from each other. The formulation of the contrastive loss may be:
$$E_c = \frac{1}{2} \sum_{(i,j)} \begin{cases} \lVert x_i - x_j \rVert_2^2, & \text{if } C(I_i, I_j) = 1 \\ \max\big(0,\; \tau - \lVert x_i - x_j \rVert_2\big)^2, & \text{if } C(I_i, I_j) = -1 \end{cases}$$
where $E_c$ is the loss, $I_i$ and $I_j$ are the objects $i$ and $j$, $x$ denotes the corresponding feature, and $\tau$ is the margin between different identities. $C(I_i, I_j) = 1$ means that objects $I_i$ and $I_j$ have the same group label, while $C(I_i, I_j) = -1$ means that they have different group labels. When the system is used for face recognition, $I_i$ and $I_j$ may be the face images $i$ and $j$, $x$ may denote the corresponding feature, and $\tau$ may be the margin between different identities; $C(I_i, I_j) = 1$ may mean that face images $I_i$ and $I_j$ are of the same person, while $C(I_i, I_j) = -1$ may mean that they are of different persons. The features extracted by the DCN under different parameter adjustments are different, so different values of $E_c$ are obtained; the more consistent the features are with the criterions, the smaller the value of $E_c$ is. By minimizing $E_c$, the most appropriate parameter adjustment may be obtained; in other words, the parameter adjustment that makes $E_c$ smallest is the most appropriate one. In some embodiments, the candidate parameter adjustments may be included in a parameter adjustment set. In some embodiments, the process of minimizing $E_c$ may be an iterative process, in which the candidate parameter adjustment is obtained by modifying the parameter adjustment from the previous iteration, and the process ends when the value of $E_c$ converges. After minimizing $E_c$, the deep model may be fine-tuned with the determined parameter adjustment.
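A sketch of this contrastive loss, assuming the paired features are PyTorch tensors and c holds the +1/-1 pair labels, might be:

```python
import torch

def contrastive_loss(x_i, x_j, c, tau=1.0):
    """Sketch of E_c above: c = +1 for same-group pairs, c = -1 for the discovered
    'should-differ' criterion pairs; tau is the margin between different identities."""
    d = torch.norm(x_i - x_j, dim=1)                       # L2 distance between pair features
    same = 0.5 * d.pow(2)                                  # pull same-label features together
    diff = 0.5 * torch.clamp(tau - d, min=0).pow(2)        # push criterion pairs beyond the margin
    return torch.where(c == 1, same, diff).mean()
```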
In some embodiments, the triplet loss or other loss functions may also be used, which learn an embedding in which the distances between the positive pairs are smaller than those of the negative pairs.
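A corresponding sketch of the triplet-loss alternative (the margin value is an assumption) might be:

```python
import torch

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Sketch of the triplet-loss alternative: positive pairs are kept closer
    than negative pairs by at least `margin`."""
    d_pos = torch.norm(anchor - positive, dim=1)
    d_neg = torch.norm(anchor - negative, dim=1)
    return torch.clamp(d_pos - d_neg + margin, min=0).mean()
```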
As will be appreciated by one skilled in the art, the present application may be embodied as a system, a method or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment with hardware aspects that may all generally be referred to herein as a “unit”, “circuit,” “module” or “system.” Much of the inventive functionality and many of the inventive principles, when implemented, are best supported with or in integrated circuits (ICs), such as a digital signal processor and software therefor, or application-specific ICs. It is expected that one of ordinary skill, notwithstanding possibly significant effort and many design choices motivated by, for example, available time, current technology, and economic considerations, when guided by the concepts and principles disclosed herein, will be readily capable of generating ICs with minimal experimentation. Therefore, in the interest of brevity and minimization of any risk of obscuring the principles and concepts according to the present application, further discussion of such software and ICs, if any, will be limited to the essentials with respect to the principles and concepts used by the preferred embodiments. In addition, the present invention may take the form of an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware. For example, the system may comprise a memory that stores executable components and a processor, electrically coupled to the memory, to execute the executable components to perform operations of the system, as discussed with reference to Figs. 1-4. Furthermore, the present invention may take the form of a computer program product embodied in any tangible medium of expression having computer-usable program code embodied in the medium.
Although the preferred examples of the present application have been described, those skilled in the art can make variations or modifications to these examples upon learning the basic inventive concept. The appended claims are intended to be construed as comprising the preferred examples and all variations or modifications falling within the scope of the present application.
Obviously, those skilled in the art can make variations or modifications to the present application without departing from the spirit and scope of the present application. As such, if these variations or modifications belong to the scope of the claims and the equivalent technique, they may also fall within the scope of the present application.

Claims (21)

  1. A method for adapting a deep model for object representation from a source domain to a target domain, comprising:
    extracting, by the deep model for the source domain, features for objects from input images for the target domain;
    inferring group labels for objects according to the extracted features;
    discovering criterions based on target domain priors derived from the input images and the inferred group labels, wherein the criterions contain information indicating which objects should not be inferred to have a same group label; and
    fine-tuning the deep model for the source domain according to the discovered criterions, wherein the fine-tuned deep model is outputted as a deep model for the target domain.
  2. The method of claim 1, wherein the extracting, the inferring, the discovering, and the fine-tuning are implemented in an iterative feedback loop that is performed for predetermined times, wherein
    in starting iteration of the iterative feedback loop, the features for objects are extracted from input images for the target domain by the deep model for the source domain,
    in iterations following the starting iteration, the features for objects are extracted from input images for the target domain by the fine-tuned deep model fine-tuned in a previous iteration of the iterative feedback loop.
  3. The method of claim 1 or 2, wherein the inferring comprises:
    computing, according to the extracted features of the objects, a judgment score for each of candidate group label distributions for the objects;
    determining a candidate group label distribution having highest judgment score; and
    inferring, based on the determined distribution, group labels for objects,
    wherein the higher the similarity between the features of the objects having same group label is, the higher the judgment score is.
  4. The method of claim 1 or 2, wherein the target domain prior comprises information on the objects in the input images or relationship between objects in the input images.
  5. The method of claim 1 or 2, wherein the discovering comprises:
    computing degrees of difference between objects that are inferred to have the same group label; and
    choosing pairs of object, having a degree of difference larger than a threshold, as the criterions.
  6. The method of claim 1 or 2, wherein the discovering comprises:
    choosing pairs of object from the objects, which are inferred to have the same group label but should have different group labels according to the target domain prior as the criterions.
  7. The method of claim 6, wherein the fine-tuning comprises:
    computing a fine-tuning score for each of candidate parameter adjustments according to the discovered criterions;
    determining the candidate parameter adjustment having highest fine-tuning score; and
    fine-tuning the deep model with the determined parameter adjustment,
    wherein the fine-tuning score indicates the similarity between the objects having a same group label, and the higher the similarity is, the higher the fine-tuning score is.
  8. A system for adapting a deep model for object representation from a source domain to a target domain, comprising:
    a feature extraction unit configured to receive the deep model for the source domain and use the deep model to extract features for objects from input images for the target domain;
    an inference unit configured to infer group labels for objects according to the extracted features;
    a criterions discovery unit configured to discover criterions based on target domain priors derived from the input images and the inferred group labels, wherein the criterions contain information indicating which objects should not be inferred to have a same group label; and
    a training unit configured to fine-tune the deep model for the source domain according to the discovered criterions, wherein the fine-tuned deep model is outputted as a deep model for the target domain.
  9. The system of claim 8, wherein the feature extraction unit, the inference unit, the criterions discovery unit, and the training unit are implemented in an iterative feedback loop that is performed a predetermined number of times, wherein
    in the starting iteration of the iterative feedback loop, the features for objects are extracted from input images for the target domain by the deep model for the source domain, and
    in iterations following the starting iteration, the features for objects are extracted from input images for the target domain by the deep model fine-tuned in a previous iteration of the iterative feedback loop.
  10. The system of claim 8 or 9, wherein the inference unit is configured for:
    computing, according to the extracted features of the objects, a judgment score for each of candidate group label distributions for the objects;
    determining a candidate group label distribution having the highest judgment score; and
    inferring, based on the determined distribution, group labels for objects,
    wherein the higher the similarity between the features of the objects having the same group label is, the higher the judgment score is.
  11. The system of claim 8 or 9, wherein the target domain prior comprises information on the objects in the input images or on a relationship between the objects in the input images.
  12. The system of claim 8 or 9, wherein the criterions discovery unit is configured for:
    computing degrees of difference between objects that are inferred to have the same group label; and
    choosing, as the criterions, pairs of objects having a degree of difference larger than a threshold.
  13. The system of claim 8 or 9, wherein the criterions discovery unit is configured for:
    choosing, as the criterions, pairs of objects that are inferred to have the same group label but should have different group labels according to the target domain prior.
  14. The system of claim 13, wherein the training unit is configured for:
    computing a fine-tuning score for each of candidate parameter adjustments according to the discovered criterions;
    determining the candidate parameter adjustment having the highest fine-tuning score; and
    fine-tuning the deep model with the determined parameter adjustment,
    wherein the fine-tuning score indicates the similarity between the objects having a same group label, and the higher the similarity is, the higher the fine-tuning score is.
  15. A system for object representation, comprising:
    a memory that stores executable components; and
    a processor electrically coupled to the memory to execute the executable components for:
    extracting, by the deep model for the source domain, features for objects from input images for the target domain;
    inferring group labels for objects according to the extracted features;
    discovering criterions based on target domain priors derived from the input images and the inferred group labels, wherein the criterions contain information indicating which objects should not be inferred to have a same group label; and
    fine-tuning the deep model for the source domain according to the discovered criterions, wherein the fine-tuned deep model is outputted as a deep model for the target domain.
  16. The system of claim 15, wherein the extracting, the inferring, the discovering, and the fine-tuning are implemented in an iterative feedback loop that is performed a predetermined number of times, wherein
    in the starting iteration of the iterative feedback loop, the features for objects are extracted from input images for the target domain by the deep model for the source domain, and
    in iterations following the starting iteration, the features for objects are extracted from input images for the target domain by the deep model fine-tuned in a previous iteration of the iterative feedback loop.
  17. The system of claim 15 or 16, wherein the inferring comprises:
    computing, according to the extracted features of the objects, a judgment score for each of candidate group label distributions for the objects;
    determining a candidate group label distribution having the highest judgment score; and
    inferring, based on the determined distribution, group labels for objects,
    wherein the higher the similarity between the features of the objects having the same group label is, the higher the judgment score is.
  18. The system of claim 15 or 16, wherein the target domain prior comprises information on the objects in the input images or on a relationship between the objects in the input images.
  19. The system of claim 15 or 16, wherein the discovering comprises:
    computing degrees of difference between objects that are inferred to have the same group label; and
    choosing, as the criterions, pairs of objects having a degree of difference larger than a threshold.
  20. The system of claim 15 or 16, wherein the discovering comprises:
    choosing, as the criterions, pairs of objects that are inferred to have the same group label but should have different group labels according to the target domain prior.
  21. The system of claim 20, wherein the fine-tuning comprises:
    computing a fine-tuning score for each of candidate parameter adjustments according to the discovered criterions;
    determining the candidate parameter adjustment having the highest fine-tuning score; and
    fine-tuning the deep model with the determined parameter adjustment,
    wherein the fine-tuning score indicates the similarity between the objects having a same group label, and the higher the similarity is, the higher the fine-tuning score is.
PCT/CN2016/071501 2016-01-20 2016-01-20 Method and system for adapting deep model for object representation from source domain to target domain WO2017124336A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201680079452.0A CN108604304A (en) 2016-01-20 2016-01-20 For adapting the depth model indicated for object from source domain to the method and system of aiming field
PCT/CN2016/071501 WO2017124336A1 (en) 2016-01-20 2016-01-20 Method and system for adapting deep model for object representation from source domain to target domain

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/071501 WO2017124336A1 (en) 2016-01-20 2016-01-20 Method and system for adapting deep model for object representation from source domain to target domain

Publications (1)

Publication Number Publication Date
WO2017124336A1

Family

ID=59361172

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/071501 WO2017124336A1 (en) 2016-01-20 2016-01-20 Method and system for adapting deep model for object representation from source domain to target domain

Country Status (2)

Country Link
CN (1) CN108604304A (en)
WO (1) WO2017124336A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112669247A (en) * 2020-12-09 2021-04-16 深圳先进技术研究院 Priori guidance type network for multitask medical image synthesis
CN113255823B (en) * 2021-06-15 2021-11-05 中国人民解放军国防科技大学 Unsupervised domain adaptation method and unsupervised domain adaptation device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101582813B (en) * 2009-06-26 2011-07-20 西安电子科技大学 Distributed migration network learning-based intrusion detection system and method thereof
CN101840569B (en) * 2010-03-19 2011-12-07 西安电子科技大学 Projection pursuit hyperspectral image segmentation method based on transfer learning
US9231851B2 (en) * 2011-01-31 2016-01-05 Futurewei Technologies, Inc. System and method for computing point-to-point label switched path crossing multiple domains
US9681250B2 (en) * 2013-05-24 2017-06-13 University Of Maryland, College Park Statistical modelling, interpolation, measurement and anthropometry based prediction of head-related transfer functions
CN104199023B (en) * 2014-09-15 2017-02-08 南京大学 RFID indoor positioning system based on depth perception and operating method thereof

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102902966A (en) * 2012-10-12 2013-01-30 大连理工大学 Super-resolution face recognition method based on deep belief networks
CN103793718A (en) * 2013-12-11 2014-05-14 台州学院 Deep study-based facial expression recognition method
CN104318215A (en) * 2014-10-27 2015-01-28 中国科学院自动化研究所 Cross view angle face recognition method based on domain robustness convolution feature learning
CN104616033A (en) * 2015-02-13 2015-05-13 重庆大学 Fault diagnosis method for rolling bearing based on deep learning and SVM (Support Vector Machine)
CN105160866A (en) * 2015-08-07 2015-12-16 浙江高速信息工程技术有限公司 Traffic flow prediction method based on deep learning nerve network structure

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11155809B2 (en) 2014-06-24 2021-10-26 Bio-Rad Laboratories, Inc. Digital PCR barcoding
CN113011568A (en) * 2021-03-31 2021-06-22 华为技术有限公司 Model training method, data processing method and equipment
CN113159199A (en) * 2021-04-27 2021-07-23 广东工业大学 Cross-domain image classification method based on structural feature enhancement and class center matching
CN113159199B (en) * 2021-04-27 2022-12-27 广东工业大学 Cross-domain image classification method based on structural feature enhancement and class center matching

Also Published As

Publication number Publication date
CN108604304A (en) 2018-09-28

Similar Documents

Publication Publication Date Title
Chaudhuri et al. Multilabel remote sensing image retrieval using a semisupervised graph-theoretic method
US10902243B2 (en) Vision based target tracking that distinguishes facial feature targets
US9449432B2 (en) System and method for identifying faces in unconstrained media
CN108140032B (en) Apparatus and method for automatic video summarization
WO2017124336A1 (en) Method and system for adapting deep model for object representation from source domain to target domain
US9940577B2 (en) Finding semantic parts in images
CN108288051B (en) Pedestrian re-recognition model training method and device, electronic equipment and storage medium
CN108268823B (en) Target re-identification method and device
CN108664526B (en) Retrieval method and device
CN105100894A (en) Automatic face annotation method and system
US9875397B2 (en) Method of extracting feature of input image based on example pyramid, and facial recognition apparatus
CN109460774B (en) Bird identification method based on improved convolutional neural network
Kim et al. Deep stereo confidence prediction for depth estimation
WO2019007253A1 (en) Image recognition method, apparatus and device, and readable medium
US10007678B2 (en) Image processing apparatus, image processing method, and recording medium
Wang et al. Scene text detection and tracking in video with background cues
CN110765882B (en) Video tag determination method, device, server and storage medium
Miclea et al. Real-time semantic segmentation-based stereo reconstruction
Wang et al. Aspect-ratio-preserving multi-patch image aesthetics score prediction
CN109635647B (en) Multi-picture multi-face clustering method based on constraint condition
CN113705596A (en) Image recognition method and device, computer equipment and storage medium
Gallagher et al. Using context to recognize people in consumer images
Lee et al. Property-specific aesthetic assessment with unsupervised aesthetic property discovery
CN110472591A (en) It is a kind of that pedestrian's recognition methods again is blocked based on depth characteristic reconstruct
CN114519863A (en) Human body weight recognition method, human body weight recognition apparatus, computer device, and medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16885609

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16885609

Country of ref document: EP

Kind code of ref document: A1