WO2021120752A1 - Region-based adaptive model training method and device, image detection method and device, apparatus and medium - Google Patents


Info

Publication number
WO2021120752A1
WO2021120752A1 (PCT/CN2020/116742)
Authority
WO
WIPO (PCT)
Prior art keywords
feature
domain
model
image
global
Prior art date
Application number
PCT/CN2020/116742
Other languages
English (en)
Chinese (zh)
Inventor
周侠
吕彬
高鹏
吕传峰
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2021120752A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/467Encoded features or binary features, e.g. local binary patterns [LBP]

Definitions

  • This application relates to the field of artificial intelligence classification models, and in particular to a domain adaptive model training method, an image detection method, devices, computer equipment, and storage media.
  • OCT: Optical Coherence Tomography
  • This application provides a domain adaptive model training method, an image detection method, a device, computer equipment, and a storage medium. No manual labeling of target domain images is required: by adaptively aligning the distribution differences between image data from different domain sources, the training efficiency of the domain adaptive model is improved; by introducing a feature regularization loss value, the robustness and accuracy of the domain adaptive model are improved; and the category of a target domain image is automatically identified through the domain adaptive model, realizing cross-domain image detection, improving recognition reliability, and saving cost.
  • a domain adaptive model training method including:
  • the image sample set includes a plurality of image samples; the image samples include source domain image samples and target domain image samples; each source domain image sample is associated with a category label and a domain label; each target domain image sample is associated with a domain label;
  • the image samples are input into a Faster-RCNN-based domain adaptive model containing initial parameters, and image conversion is performed on each image sample through the preprocessing model to obtain a preprocessed image sample;
  • the domain adaptive model includes the preprocessing model, a feature extraction model, a region extraction model, a detection model, a global feature model, and a local feature model;
  • An image detection method including:
  • an image detection instruction is received, and the target domain image to be detected is obtained;
  • the target domain image to be detected is input into the image detection model trained by the above domain adaptive model training method, image features in the image are extracted through the image detection model, and the source domain category result output by the image detection model according to the image features is obtained.
  • a domain adaptive model training device including:
  • the acquisition module is used to acquire an image sample set; the image sample set includes a plurality of image samples; the image samples include source domain image samples and target domain image samples; each source domain image sample is associated with a category label and a domain label; each target domain image sample is associated with a domain label;
  • the input module is used to input the image samples into a Faster-RCNN-based domain adaptive model containing initial parameters, and to perform image conversion on each image sample through the preprocessing model to obtain a preprocessed image sample;
  • the domain adaptive model includes the preprocessing model, a feature extraction model, a region extraction model, a detection model, a global feature model, and a local feature model;
  • an extraction module configured to perform image feature extraction on the preprocessed image through the feature extraction model to obtain a feature vector map;
  • the first loss module is used to perform region extraction and equal sampling on the feature vector map through the region extraction model to obtain region feature vector maps; to perform local feature extraction and binary classification on the region feature vector maps corresponding to the image samples through the local feature model to obtain a local domain classification result, and to obtain a local feature alignment loss value according to the local domain classification result and the domain label corresponding to the image sample; and, at the same time, to perform regularization and global feature recognition on the feature vector map through the global feature model to obtain a feature regularization loss value and a global domain classification result, and to obtain a global feature alignment loss value according to the global domain classification result and the domain label corresponding to the feature vector map;
  • the second loss module is used to perform bounding-box regression and source domain classification on the region feature vector maps corresponding to the source domain image samples through the detection model to obtain a recognition result, and to obtain a detection loss value according to the recognition result and the category label corresponding to the source domain image sample; a total loss value is then obtained from the global feature alignment loss value, the detection loss value, the local feature alignment loss value, and the feature regularization loss value;
  • the training module is configured to iteratively update the initial parameters of the domain adaptive model when the total loss value does not reach a preset convergence condition; when the total loss value reaches the preset convergence condition, the converged domain adaptive model is recorded as the trained domain adaptive model.
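The second loss module's combination of the four loss terms can be sketched as a weighted sum, and the training module's stopping rule as a threshold check. The weights and threshold below are illustrative assumptions; the application names the four terms but fixes neither weights nor the exact form of the convergence condition.

```python
def total_loss(detection_loss, global_align_loss, local_align_loss,
               feature_reg_loss, w_global=1.0, w_local=1.0, w_reg=0.1):
    # Weighted sum of the four loss values named in the text.
    # w_global, w_local, and w_reg are illustrative hyperparameters.
    return (detection_loss
            + w_global * global_align_loss
            + w_local * local_align_loss
            + w_reg * feature_reg_loss)

def converged(total, threshold=0.05):
    # Stand-in for the "preset convergence condition": stop once the
    # total loss value falls below a threshold (assumed form).
    return total < threshold

print(total_loss(1.0, 0.5, 0.5, 1.0))  # 2.1
```

With these example weights, a detection loss of 1.0, alignment losses of 0.5 each, and a regularization loss of 1.0 give a total of 2.1; the parameters are then updated until `converged` returns true.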
  • An image detection device includes:
  • the receiving module is used to receive the image detection instruction and obtain the target domain image to be detected;
  • the detection module is used to input the target domain image to be detected into the image detection model trained by the above domain adaptive model training method, extract image features from the image through the image detection model, and obtain the source domain category result output by the image detection model according to the image features.
  • a computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; when the processor executes the computer-readable instructions, the following steps are implemented:
  • the image sample set includes a plurality of image samples; the image samples include source domain image samples and target domain image samples; each source domain image sample is associated with a category label and a domain label; each target domain image sample is associated with a domain label;
  • the image samples are input into a Faster-RCNN-based domain adaptive model containing initial parameters, and image conversion is performed on each image sample through the preprocessing model to obtain a preprocessed image sample;
  • the domain adaptive model includes the preprocessing model, a feature extraction model, a region extraction model, a detection model, a global feature model, and a local feature model;
  • a computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; when the processor executes the computer-readable instructions, the following steps are further implemented:
  • an image detection instruction is received, and the target domain image to be detected is obtained;
  • the target domain image to be detected is input into the image detection model trained by the above domain adaptive model training method, image features in the image are extracted through the image detection model, and the source domain category result output by the image detection model according to the image features is obtained.
  • One or more readable storage media storing computer-readable instructions; when the computer-readable instructions are executed by one or more processors, the one or more processors execute the following steps:
  • the image sample set includes a plurality of image samples; the image samples include source domain image samples and target domain image samples; each source domain image sample is associated with a category label and a domain label; each target domain image sample is associated with a domain label;
  • the image samples are input into a Faster-RCNN-based domain adaptive model containing initial parameters, and image conversion is performed on each image sample through the preprocessing model to obtain a preprocessed image sample;
  • the domain adaptive model includes the preprocessing model, a feature extraction model, a region extraction model, a detection model, a global feature model, and a local feature model;
  • One or more readable storage media storing computer-readable instructions; when the computer-readable instructions are executed by one or more processors, the one or more processors further execute the following steps:
  • an image detection instruction is received, and the target domain image to be detected is obtained;
  • the target domain image to be detected is input into the image detection model trained by the above domain adaptive model training method, image features in the image are extracted through the image detection model, and the source domain category result output by the image detection model according to the image features is obtained.
  • In the domain adaptive model training method, device, computer equipment, and storage medium, an image sample set containing multiple image samples is obtained, where the image samples include source domain image samples and target domain image samples; the samples are input into a Faster-RCNN-based domain adaptive model containing initial parameters.
  • Image conversion is performed on each image sample by the preprocessing model to obtain a preprocessed image sample; image feature extraction is performed on the preprocessed image by the feature extraction model to obtain a feature vector map; region extraction and equal sampling are performed on the feature vector map by the region extraction model to obtain region feature vector maps; local feature extraction and binary classification are performed on the region feature vector maps corresponding to the image samples by the local feature model to obtain a local feature alignment loss value; at the same time, regularization and global feature recognition are performed on the feature vector map by the global feature model to obtain a feature regularization loss value and a global feature alignment loss value; bounding-box regression and source domain classification are performed on the region feature vector maps corresponding to the source domain image samples by the detection model to obtain a detection loss value; a total loss value is obtained from the global feature alignment loss value, the detection loss value, the local feature alignment loss value, and the feature regularization loss value; when the total loss value does not reach the preset convergence condition, the initial parameters are iteratively updated until convergence, and the trained domain adaptive model is obtained.
  • Thus the target domain image samples are trained without manual labeling, and the Faster-RCNN-based domain adaptive model is converged according to a total loss value combining the global feature alignment loss, the detection loss, the local feature alignment loss, and the feature regularization loss. Cross-domain image recognition improves the accuracy and reliability of image recognition and saves labor costs.
  • The image detection method, device, computer equipment, and storage medium provided in this application receive an image detection instruction and obtain the target domain image to be detected; the image is input into the image detection model trained by the above domain adaptive model training method; image features are extracted from the image through the image detection model, and the source domain category result output according to the image features is obtained. This application thus uses the domain adaptive model to automatically recognize the category of the target domain image, realizing cross-domain image detection, improving recognition reliability, and saving cost.
  • FIG. 1 is a schematic diagram of an application environment of a domain adaptive model training method or an image detection method in an embodiment of the present application
  • FIG. 2 is a flowchart of a method for training a domain adaptive model in an embodiment of the present application
  • FIG. 3 is a flowchart of step S20 of a domain adaptive model training method in an embodiment of the present application
  • FIG. 4 is a flowchart of step S40 of the method for training a domain adaptive model in an embodiment of the present application
  • FIG. 5 is a flowchart of step S40 of a method for training a domain adaptive model in another embodiment of the present application
  • FIG. 6 is a flowchart of step S40 of the domain adaptive model training method in another embodiment of the present application.
  • Fig. 7 is a flowchart of an image detection method in an embodiment of the present application.
  • Fig. 8 is a functional block diagram of a domain adaptive model training device in an embodiment of the present application.
  • Fig. 9 is a functional block diagram of an image detection device in an embodiment of the present application.
  • Fig. 10 is a schematic diagram of a computer device in an embodiment of the present application.
  • the domain adaptive model training method provided by this application can be applied in an application environment as shown in Fig. 1, where a client (computer device) communicates with a server through a network.
  • The client (computer equipment) includes, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, cameras, and portable wearable devices.
  • the server can be implemented as an independent server or a server cluster composed of multiple servers.
  • a method for training a domain adaptive model is provided, and its technical solution mainly includes the following steps S10-S60:
  • An image sample set is acquired; the image sample set includes a plurality of image samples; the image samples include source domain image samples and target domain image samples; each source domain image sample is associated with a category label and a domain label; each target domain image sample is associated with a domain label.
  • Receiving a training request for the domain adaptive model triggers acquisition of the image sample set used to train the model. The image sample set is a collection of collected image samples, which include the source domain image samples and the target domain image samples. A source domain image sample is an image collected in a known field or by a known device and marked with a category label, where the category label indicates the category of the source domain image sample; for example, an OCT image sample collected on a known OCT acquisition device is marked with category labels for the regions it contains (choroidal region, macular hole region, etc.). Each source domain image sample is associated with a category label and a domain label. The domain label serves as the distinguishing identifier between source domain image samples and target domain image samples, and includes a source domain label and a target domain label; for example, the labels may be a model label corresponding to a known device (source domain label) and a model label corresponding to a different device (target domain label).
  • The domain adaptive model includes the preprocessing model, the feature extraction model, the region extraction model, the detection model, the global feature model, and the local feature model.
  • The domain adaptive model is a neural network model based on Faster-RCNN for image target detection. It contains the initial parameters, which include the network structure of the domain adaptive model and the individual parameters of the model; the network structure of the domain adaptive model includes the network structure of Faster-RCNN.
  • The preprocessing model performs image conversion on the input image sample to obtain the preprocessed image sample. Image conversion is an image processing procedure that adjusts the size parameters of the image and enhances its pixels, and the procedure can be set according to requirements; for example, image conversion may include scaling the input image to an image with preset size parameters, performing image enhancement on the scaled image, and performing pixel enhancement operations on the converted image sample.
  • the domain adaptive model includes a preprocessing model, a feature extraction model, a region extraction model, a detection model, a global feature model, and a local feature model.
  • In step S20, that is, performing image conversion on the image sample through the preprocessing model to obtain the preprocessed image sample includes:
  • S201: Perform size matching on the image sample through the preprocessing model according to a preset size parameter to obtain a matched image sample.
  • The size of the image samples varies with the collection device, so the image samples need to be converted into a unified format through the preprocessing model; the size parameter is set according to requirements. The size parameter includes the length, width, and number of channels of the image, where the number of channels is the number of channels after the image sample is converted; the preprocessing model converts the image sample into the matched image sample. Size matching applies scaling processing, merging processing, or both to the image sample so that it meets the requirements of the size parameter. For example, if the size of an image sample is 600×800 with three channels and the size parameter is (600×600, 3), the matched image sample obtained after size matching is a 600×600 image with three channels.
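The size-matching step can be sketched as a plain nearest-neighbour rescale to the preset size parameter. This is a stand-in: the application leaves the scaling/merging method and interpolation choice unspecified.

```python
import numpy as np

def match_size(img, target_h=600, target_w=600):
    # Nearest-neighbour rescale of an (H, W, C) image to the preset
    # size parameter; channel count is left unchanged.
    h, w, _ = img.shape
    rows = np.arange(target_h) * h // target_h   # source row per output row
    cols = np.arange(target_w) * w // target_w   # source column per output column
    return img[rows][:, cols]

sample = np.zeros((600, 800, 3), dtype=np.uint8)  # the 600x800, 3-channel example
print(match_size(sample).shape)  # (600, 600, 3)
```

Applied to the 600×800 three-channel example above, the output matches the (600×600, 3) size parameter.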
  • S202 Perform denoising and image enhancement processing on the matched image sample through the preprocessing model according to the gamma transformation algorithm to obtain the preprocessed image sample.
  • The preprocessing model reduces image noise in the matched image samples. The denoising processing can be set according to requirements; it can be median filtering, Gaussian filtering, mean filtering, Wiener filtering, or Fourier filtering, among others. The preprocessing model then performs image enhancement processing on the denoised matched image samples to obtain the preprocessed image samples. The image enhancement processing uses the gamma transformation algorithm to enhance each pixel in the denoised matched image sample. The gamma transformation algorithm corrects the image by transforming its grey scale: it corrects images whose grey levels are too high or too low, enhancing contrast.
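A minimal sketch of the gamma transformation described above. The exponent value 0.5 is an illustrative choice, not one fixed by the application; gamma < 1 lifts low grey levels, gamma > 1 suppresses high ones.

```python
import numpy as np

def gamma_transform(img, gamma=0.5):
    # Gamma correction: normalise to [0, 1], raise to the power gamma,
    # rescale to [0, 255]. This is the grey-scale correction the text
    # describes; the gamma value is an assumption.
    normalised = img.astype(np.float64) / 255.0
    corrected = np.power(normalised, gamma)
    return (corrected * 255.0).astype(np.uint8)

patch = np.array([[64, 128], [192, 255]], dtype=np.uint8)
print(gamma_transform(patch, gamma=0.5))  # darker pixels are lifted the most
```

On the sample patch, 64 maps to 127 and 128 to 180, while 255 stays fixed, illustrating the contrast correction for low grey levels.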
  • Image features are extracted from the preprocessed image; the image features are the color features, texture features, shape features, and spatial relationship features in the image. The feature extraction model includes the network structure of Faster-RCNN. The features are extracted to obtain the feature vector map, a multi-channel (i.e., multi-dimensional) matrix containing vectors of the image features.
  • S40: Perform region extraction and equal sampling on the feature vector map through the region extraction model to obtain region feature vector maps; perform local feature extraction and binary classification on the region feature vector maps corresponding to the image samples through the local feature model to obtain a local domain classification result, and obtain a local feature alignment loss value according to the local domain classification result and the domain label corresponding to the image sample; at the same time, perform regularization and global feature recognition on the feature vector map through the global feature model to obtain a feature regularization loss value and a global domain classification result, and obtain a global feature alignment loss value according to the global domain classification result and the domain label corresponding to the feature vector map.
  • The region extraction model is also called a region proposal network (RPN, Region Proposal Network).
  • The region extraction model performs region extraction and equal sampling processing on the feature vector map. Region extraction extracts a plurality of candidate region boxes from the feature vector map; a candidate region box is a target region of interest containing an anchor that meets the preset requirements. Equal sampling maps all the candidate region boxes onto the feature vector map and performs ROI Pooling on the mapped regions to obtain region feature vector maps of the same size; the purpose of equal sampling is to pool candidate regions of different sizes into region feature vector maps of the same size.
  • The local feature model extracts local features from the region feature vector maps. Local feature extraction extracts features of the same nature hidden in a local region, such as edge points or lines, yielding a plurality of local feature vector maps. The local feature model then performs binary classification on all the local feature vector maps, identifying whether each one belongs to the source domain or the target domain. The local domain classification result includes a local source domain label result and a local target domain label result, together with the probability value identified for each. A local feature alignment loss value is computed from the local domain classification result and the domain label of the corresponding input image sample; this loss is back-propagated to adjust the parameters of the local feature model, continuously aligning the local features of the source domain image samples with those of the target domain image samples.
  • The global feature model performs regularization processing on the feature vector map and then performs global feature recognition on the regularized feature vector map: global features are extracted from the regularized feature vector map, and the regularized map is classified according to the extracted global features, the classification being binary. The feature regularization loss value is obtained from the regularized feature vector map; minimizing this loss on the extracted global features prevents overfitting. The global domain classification result includes a global source domain label result and a global target domain label result, together with the probability value identified for each; the global feature alignment loss value is obtained from the gap between the global domain classification result and the domain label corresponding to the feature vector map.
  • The global feature alignment loss value is used for backpropagation to adjust the parameters of the global feature model, continuously aligning the global features of the source domain image samples with those of the target domain image samples and reducing the difference between them: global features extracted from the source domain image samples are used to perform binary classification on the target domain image samples, and the global features in the target domain image samples that are effective for source domain classification are extracted. The global features are the color, texture, and shape features embodied in the regularized feature vector map, which represent the relevant features of the overall object.
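Both feature models perform binary domain classification and back-propagate an alignment loss into the feature extractor; the gradient reversal layer (named later in this description as a component of the local feature model) is the standard device for making this adversarial. The sketch below shows the two ingredients in isolation; it is a generic illustration, not the application's exact formulation.

```python
import numpy as np

def domain_bce_loss(logit, domain_label):
    # Binary cross-entropy for domain discrimination: source = 0, target = 1.
    p = 1.0 / (1.0 + np.exp(-logit))   # sigmoid probability of "target domain"
    return -(domain_label * np.log(p) + (1 - domain_label) * np.log(1 - p))

def reverse_gradient(upstream_grad, lam=1.0):
    # Gradient reversal layer: identity in the forward pass; on the way
    # back the gradient is negated (and scaled by lam), so the feature
    # extractor is pushed to make the two domains indistinguishable while
    # the domain classifier learns to tell them apart.
    return -lam * np.asarray(upstream_grad)

print(round(float(domain_bce_loss(0.0, 1)), 4))  # 0.6931 (= ln 2: maximal domain confusion)
print(reverse_gradient([0.3, -0.2]))             # [-0.3  0.2]
```

A logit of 0 means the classifier cannot tell the domains apart, which is exactly the state the reversed gradient drives the features toward.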
  • Step S40, that is, performing region extraction and equal sampling on the feature vector map through the region extraction model to obtain region feature vector maps, includes:
  • S401: Perform region extraction on the feature vector map through the region extraction network layer in the region extraction model to obtain at least one candidate region box.
  • The region extraction model includes a region extraction network layer and a region-of-interest pooling layer. The region extraction network layer includes a 3×3 convolutional layer, an activation layer, two 1×1 convolutional layers with different dimensional parameters, a softmax layer, and a fully connected layer. Region extraction proceeds as follows: the feature vector map is convolved by the 3×3 convolutional layer to obtain a first feature map; the first feature map is input into the first 1×1 convolutional layer and the second 1×1 convolutional layer to obtain a second feature map and a third feature map of different dimensions; the anchors in the second feature map are classified through the softmax layer; the third feature map and the softmax output of the second feature map are classified through the fully connected layer and refined through bbox regression; finally, at least one candidate region box is output.
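The layer layout just described matches the standard Faster-RCNN RPN head. The shape bookkeeping below shows what each branch emits; k = 9 anchors per position and 256 channels are common defaults assumed here, not values stated in this application.

```python
def rpn_head_shapes(feat_h, feat_w, k=9, channels=256):
    # A 3x3 conv (padding 1) keeps the spatial size and yields the first
    # feature map; two parallel 1x1 convs then emit 2k anchor scores
    # (the second feature map, fed to softmax) and 4k bbox-regression
    # offsets (the third feature map) per spatial position.
    first_map = (channels, feat_h, feat_w)  # after 3x3 conv + activation
    cls_map = (2 * k, feat_h, feat_w)       # foreground/background per anchor
    reg_map = (4 * k, feat_h, feat_w)       # bbox regression deltas per anchor
    return first_map, cls_map, reg_map

print(rpn_head_shapes(38, 50))  # ((256, 38, 50), (18, 38, 50), (36, 38, 50))
```

The "different dimensional parameters" of the two 1×1 convolutions are visible here: one produces 2k channels for classification, the other 4k for regression.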
  • S402: Perform equal sampling processing on the feature vector map and all the candidate region boxes through the region-of-interest pooling layer in the region extraction model to obtain region feature vector maps.
  • The region-of-interest pooling layer is also called ROI pooling. It maps each candidate region box to the position in the feature vector map corresponding to the target in that box, i.e., the region whose vector values match the candidate region box is queried from the feature vector map, and fixed-size pooling with a preset size is performed on the mapped region. Pooling the region corresponding to each candidate region box in this way yields region feature vector maps of the same size for all candidate region boxes.
  • this application performs region extraction on the feature vector map through the region extraction network layer to obtain the candidate region frames, and performs equalized sampling processing on the feature vector map and all the candidate region frames through the region of interest pooling layer to obtain the regional feature vector maps.
  • in this way, the region extraction network layer and the region of interest pooling layer can automatically identify interesting or useful regions in the feature vector map and convert them into same-sized regional feature vector maps convenient for subsequent feature extraction, which improves recognition efficiency and accuracy and avoids interference from uninteresting or useless regions on feature extraction.
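The fixed-size pooling step can be sketched as follows (a simplified NumPy version; the 2×2 output grid and integer box coordinates are assumptions, since real ROI pooling typically pools to a 7×7 grid with fractional coordinates):

```python
import numpy as np

def roi_pool(feature_map, box, out_size=2):
    """Max-pool the region of `feature_map` covered by `box`
    (x0, y0, x1, y1 in feature-map coordinates) into a fixed
    out_size x out_size grid, regardless of the box's size."""
    x0, y0, x1, y1 = box
    region = feature_map[:, y0:y1, x0:x1]
    C, h, w = region.shape
    ys = np.linspace(0, h, out_size + 1).astype(int)
    xs = np.linspace(0, w, out_size + 1).astype(int)
    out = np.zeros((C, out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            # each output cell is the max over its sub-window (at least 1 pixel)
            cell = region[:, ys[i]:max(ys[i + 1], ys[i] + 1),
                             xs[j]:max(xs[j + 1], xs[j] + 1)]
            out[:, i, j] = cell.max(axis=(1, 2))
    return out

fmap = np.arange(36, dtype=float).reshape(1, 6, 6)
a = roi_pool(fmap, (0, 0, 4, 4))   # 4x4 candidate region frame
b = roi_pool(fmap, (1, 2, 6, 5))   # 5x3 candidate region frame
print(a.shape, b.shape)            # both (1, 2, 2): same size regardless of box
```

Candidate region frames of different sizes all come out as same-sized regional feature vector maps, which is the property the text relies on for subsequent feature extraction.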
  • the local feature extraction processing and two-class recognition are performed on the regional feature vector map corresponding to the image sample through the local feature model to obtain the local domain classification result, and the local feature alignment loss value is obtained according to the local domain classification result and the domain label corresponding to the image sample, including:
  • S403 Perform local feature extraction on the regional feature vector map by using a feature extractor in the local feature model to obtain a local feature vector map.
  • the local feature model includes a feature extractor, a domain classifier, a gradient reversal layer, and a domain difference measurer.
  • the feature extractor performs local feature extraction on each regional feature vector map, and the extraction method can be set according to requirements, such as the SIFT (Scale Invariant Feature Transform) method, the SURF (Speeded Up Robust Features) method, the Harris corner method, and the LBP (Local Binary Pattern) method.
  • preferably, the local feature extraction method is the LBP method: because the LBP method has rotation invariance and gray-level invariance, it can extract local features more effectively. After processing by the feature extractor, multiple local feature vector maps are obtained.
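Since the preferred extraction method is LBP, a minimal sketch of the basic 3×3 LBP operator may help (the neighbor ordering is an arbitrary choice; production code would typically use a library implementation such as scikit-image's):

```python
import numpy as np

def lbp_3x3(img):
    """Basic 3x3 local binary pattern: each interior pixel gets an 8-bit code,
    one bit per neighbor whose value is >= the center value."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]  # clockwise from top-left
    H, W = img.shape
    codes = np.zeros((H - 2, W - 2), dtype=int)
    center = img[1:H - 1, 1:W - 1]
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = img[1 + dy:H - 1 + dy, 1 + dx:W - 1 + dx]
        codes |= ((neighbor >= center).astype(int) << bit)
    return codes

patch = np.array([[5, 5, 5],
                  [1, 3, 1],
                  [5, 5, 5]], dtype=float)
print(lbp_3x3(patch))   # one code for the single interior pixel: [[119]]
```

Because the code depends only on relative gray-level comparisons, uniform brightness shifts leave it unchanged, which is the gray-level invariance the text mentions.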
  • S404 Perform two-class recognition on the local feature vector graph by using a domain classifier in the local feature model to obtain the local domain classification result.
  • the goal of the domain classifier is to maximize the loss of the domain classifier, confusing the domain label recognition results of the target domain image samples and the source domain image samples, so that the two-class recognition of the domain classifier outputs a local target domain label result for the local feature vector map corresponding to a source domain image sample; in this way, the local features of the two domains can be aligned with each other.
  • S405 Perform reverse alignment on the local domain classification result through the gradient reversal layer in the local feature model to obtain a reverse domain label.
  • the gradient reversal layer is also referred to as a GRL (Gradient Reversal Layer) layer.
  • the reverse alignment means that the gradient direction is automatically reversed during the backward propagation process, and no processing is performed during the forward propagation process.
  • the local domain classification result is automatically inverted through the gradient reversal layer to obtain the reverse domain label opposite to the local domain classification result.
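The behavior of the gradient reversal layer can be sketched numerically (the reversal strength LAMBDA is an assumed hyperparameter; in a real framework this is implemented as a custom autograd function rather than two separate calls):

```python
import numpy as np

LAMBDA = 1.0  # assumed reversal strength

def grl_forward(x):
    """Forward propagation: the gradient reversal layer performs no processing
    (identity)."""
    return x

def grl_backward(upstream_grad):
    """Backward propagation: the gradient is automatically negated (and
    optionally scaled), so the feature extractor is driven to *maximize* the
    domain classifier's loss, confusing the two domains."""
    return -LAMBDA * upstream_grad

# tiny numeric check: the gradient seen below the layer is sign-flipped
feature = np.array([0.5])
out = grl_forward(feature)                # identity on the forward pass
domain_loss_grad = np.array([2.0])        # dL/d(out) from the domain classifier
print(grl_backward(domain_loss_grad))     # [-2.]
```

The sign flip is what lets a single optimizer simultaneously minimize the classifier's loss (above the layer) and maximize it (below the layer).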
  • S406 Perform a difference comparison between the reverse domain label and the domain label corresponding to the regional feature vector graph by the domain difference metric in the local feature model to obtain the local feature alignment loss value.
  • the domain difference measurer includes the local feature alignment loss function, and the difference comparison is a loss value obtained through calculation of the local feature alignment loss function: the reverse domain label and the domain label corresponding to the regional feature vector map are input into the local feature alignment loss function to obtain the local feature alignment loss value.
  • in the standard binary cross-entropy form, the local feature alignment loss value is:
  • L_local = -(1/n) Σ_{i=1}^{n} Σ_j [ d_i · log(p_{i,j}) + (1 - d_i) · log(1 - p_{i,j}) ]
  • where n is the total number of image samples, d_i is the domain label of the i-th image sample, and p_{i,j} is the reverse domain label corresponding to the j-th regional feature vector map of the same i-th image sample.
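As a rough sketch, the instance-level computation above can be written as a binary cross-entropy over region proposals (the uniform averaging over all proposals is an assumption; the patent's exact weighting is not reproduced in this text):

```python
import numpy as np

def local_alignment_loss(p, d):
    """Binary cross-entropy between per-region domain predictions p[i][j]
    (for the j-th region of the i-th image) and per-image domain labels
    d[i] (1 = source domain, 0 = target domain), averaged over regions."""
    total, count = 0.0, 0
    for i, d_i in enumerate(d):
        for p_ij in p[i]:
            total += -(d_i * np.log(p_ij) + (1 - d_i) * np.log(1 - p_ij))
            count += 1
    return total / count

# two images, two region proposals each; maximally confused predictions (0.5)
p = [[0.5, 0.5], [0.5, 0.5]]
d = [1, 0]
print(round(local_alignment_loss(p, d), 4))   # 0.6931 (= ln 2)
```

When the domain classifier is fully confused (all predictions at 0.5), the loss sits at ln 2, which is the alignment target the adversarial training pushes toward.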
  • the present application performs local feature extraction on the regional feature vector map through the feature extractor in the local feature model to obtain the local feature vector map; performs two-class recognition on the local feature vector map through the domain classifier to obtain the local domain classification result; performs reverse alignment on the local domain classification result through the gradient reversal layer to obtain a reverse domain label; and calculates the loss between the reverse domain label and the domain label corresponding to the regional feature vector map through the local feature alignment loss function in the domain difference measurer to obtain the local feature alignment loss value.
  • in this way, the local features of the source domain image samples and the target domain image samples are automatically aligned, so that useful local features can be extracted effectively to identify source domain and target domain image samples; the local feature alignment loss value reflects the gap between the local features of the source domain image samples and those of the target domain image samples, and continuously reducing it while iterating the initial parameters improves the training efficiency of the model as well as the recognition accuracy and reliability.
  • the feature vector map is regularized and global feature recognition processing is performed on it through the global feature model to obtain the feature regular loss value and the global domain classification result, and the global feature alignment loss value is obtained according to the global domain classification result and the domain label corresponding to the feature vector map, including:
  • S407 Perform regularization processing on the feature vector graph through the feature regular model in the global feature model to obtain a global regular feature map, and at the same time calculate the feature regular loss value through the regular loss function in the feature regular model .
  • the feature regular model performs regularization processing on each feature vector map; the regularization processing squares and sums the feature vectors corresponding to each pixel in the feature vector map and then takes the square root of the whole, finally obtaining the global regular feature map in one-to-one correspondence with the feature vector map.
  • in this way, the feature vector corresponding to each pixel in each feature vector map can be kept small, close to zero.
  • n is the total number of the image samples in the image sample set;
  • E_i is the global regular feature map corresponding to the i-th image sample (0 < i ≤ n);
  • R is a preset distance constant.
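One plausible reading of these variables can be sketched as follows. This is an assumed form (only the variable definitions appear in this text, not the formula itself): the loss penalizes global regular feature maps whose L2 norm exceeds the preset distance constant R, averaged over the n image samples.

```python
import numpy as np

def feature_regular_loss(E, R=1.0):
    """Assumed feature-regularization loss: penalize each global regular
    feature map E[i] whose L2 norm exceeds the preset distance constant R,
    averaged over the n image samples. Chosen to push feature norms toward
    a small radius, per the 'small, close to zero' description above."""
    n = len(E)
    return sum(max(np.linalg.norm(e) - R, 0.0) ** 2 for e in E) / n

E = [np.array([3.0, 4.0]),   # norm 5 -> penalty (5 - 1)^2 = 16
     np.array([0.6, 0.0])]   # norm 0.6 <= R -> no penalty
print(feature_regular_loss(E, R=1.0))   # 8.0
```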
  • S408 Perform global feature extraction processing and classification recognition on the global regular feature map through the global feature model to obtain the global domain classification result.
  • the global feature extraction processing performs histogram feature extraction on the feature vector corresponding to each pixel in the global regular feature map, and classification recognition is performed based on the extracted global features.
  • the classification recognition is two-class recognition, that is, the recognition result has only two classification results: the global domain classification result includes the global source domain label result and the global target domain label result.
  • the global domain classification result also includes the probability values corresponding to the global source domain label result and the global target domain label result.
  • the global loss model includes a global feature alignment loss function.
  • the global feature alignment loss value is calculated by inputting the global domain classification result and the domain label corresponding to the feature vector map into the global feature alignment loss function.
  • by continuously reducing the global feature alignment loss value, the gap between the global features of the source domain image samples and those of the target domain image samples is narrowed, which improves the training efficiency of the model.
  • this application obtains the global domain classification result corresponding to the feature vector map by performing regularization processing, global feature extraction processing, and classification recognition on the feature vector map, and obtains the feature regular loss value and the global feature alignment loss value. Introducing the feature regular loss value and the global feature alignment loss value improves the robustness and accuracy of the domain adaptive model as well as its training efficiency.
  • S50 Perform boundary regression and source domain classification recognition on the regional feature vector map corresponding to the source domain image sample through the detection model to obtain a recognition result, and obtain the detection loss value according to the recognition result and the category label corresponding to the source domain image sample; and obtain the total loss value according to the global feature alignment loss value, the detection loss value, the local feature alignment loss value, and the feature regular loss value.
  • the detection model only performs boundary regression and source domain classification recognition on the regional feature vector maps corresponding to the source domain image samples. The boundary regression locates the target image area in the regional feature vector map corresponding to the source domain image sample; the target image area is the area on which image recognition needs to be performed, that is, the area that can reflect the category characteristics of the source domain image sample.
  • the source domain classification recognition extracts image features from the target image area in the regional feature vector map corresponding to the source domain image sample, including image features related to the category of the source domain image sample, and performs predictive recognition based on the extracted image features to identify the category of the source domain image sample. The method of extracting image features related to the source domain image sample category can be set according to requirements, preferably the extraction method of the VGG16 neural network model. The recognition result thus obtained characterizes the category contained in the source domain image sample, and the detection loss value is the loss value between the recognition result and the category label corresponding to the source domain image sample.
  • the global feature alignment loss value, the detection loss value, the local feature alignment loss value, and the feature regular loss value are input into a total loss function, and the total loss value is calculated by the total loss function.
  • the total loss value is:
  • L_total = λ1 · L_global + λ2 · L_local + λ3 · L_detection + λ4 · L_norm
  • where λ1 is the weight of the global feature alignment loss value;
  • L_global is the global feature alignment loss value;
  • λ2 is the weight of the local feature alignment loss value;
  • L_local is the local feature alignment loss value;
  • λ3 is the weight of the detection loss value;
  • L_detection is the detection loss value;
  • λ4 is the weight of the feature regular loss value;
  • L_norm is the feature regular loss value.
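The weighted combination of the four losses can be sketched directly (the weight values w1 to w4, standing in for λ1 to λ4, are placeholders and not the patent's settings):

```python
def total_loss(l_global, l_local, l_detection, l_norm,
               w1=1.0, w2=1.0, w3=1.0, w4=0.1):
    """Total loss: weighted sum of the global feature alignment loss,
    local feature alignment loss, detection loss, and feature regular
    loss. The weights are placeholder hyperparameters."""
    return w1 * l_global + w2 * l_local + w3 * l_detection + w4 * l_norm

print(total_loss(0.5, 0.4, 1.2, 2.0))   # 0.5 + 0.4 + 1.2 + 0.1*2.0 = 2.3
```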
  • the convergence condition may be that the total loss value is small and no longer drops after 50,000 calculations, that is, training is stopped when the total loss value is small and will not decrease after 50,000 calculations.
  • the convergence condition can also be that the total loss value is less than a set threshold, that is, the training is stopped when the total loss value is less than the set threshold. The domain adaptive model after convergence is recorded as the trained domain adaptive model.
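The threshold-based convergence condition can be sketched as a simple training loop (the `step` callable stands in for one real parameter update of the domain adaptive model, and the threshold value is an assumption):

```python
def train_until_converged(step, threshold=0.01, max_iters=50000):
    """Iterate parameter updates until the total loss drops below a set
    threshold (one of the two convergence conditions described above),
    or until the iteration budget is exhausted."""
    loss = float("inf")
    for it in range(1, max_iters + 1):
        loss = step()          # one update of the initial parameters
        if loss < threshold:   # convergence: total loss below the threshold
            return it, loss
    return max_iters, loss

# stand-in for a decreasing training loss
losses = iter([0.8, 0.3, 0.05, 0.009, 0.008])
it, loss = train_until_converged(lambda: next(losses))
print(it, loss)   # 4 0.009
```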
  • by continuously updating the initial parameters of the domain adaptive model during iteration, the data distribution difference between the source domain image samples and the target domain image samples can be gradually reduced, thereby transferring the knowledge of the source domain image samples to learn the knowledge of the target domain image samples.
  • using the existing knowledge of the source domain image samples to learn the knowledge of the target domain image samples through the algorithm, that is, finding the similarity between the knowledge of the source domain image samples and the knowledge of the target domain image samples, makes it possible to recognize the target domain image samples based on the categories of the source domain image samples, with increasingly higher recognition accuracy.
  • this application acquires an image sample set containing multiple image samples, the image samples including source domain image samples and target domain image samples; inputs the image samples into a Faster-RCNN-based domain adaptive model containing initial parameters; performs image conversion on the image samples through the preprocessing model to obtain preprocessed image samples; performs image feature extraction on the preprocessed images through the feature extraction model to obtain feature vector maps; performs region extraction and equalized sampling on the feature vector maps through the region extraction model to obtain regional feature vector maps; and performs local feature extraction processing and two-class recognition on the regional feature vector maps corresponding to the image samples through the local feature model to obtain the local feature alignment loss value;
  • performs regularization and global feature recognition processing on the feature vector maps through the global feature model to obtain the feature regular loss value and the global feature alignment loss value; and performs boundary regression and source domain classification recognition on the regional feature vector maps corresponding to the source domain image samples through the detection model to obtain the detection loss value.
  • this application provides a domain adaptive model training method that obtains image samples of the source domain and the target domain for training, without manual labeling of the target domain image samples, and adapts to the distribution differences of image data from different domain sources through global feature alignment and local feature alignment.
  • the training efficiency of the domain adaptive model is thereby improved, and the introduced feature regular loss value improves the robustness and accuracy of the domain adaptive model.
  • converging the Faster-RCNN-based domain adaptive model according to the total loss value composed of the global feature alignment loss value, the detection loss value, the local feature alignment loss value, and the feature regular loss value realizes cross-domain image recognition, improves the accuracy and reliability of image recognition, and saves labor costs.
  • the image detection method provided by this application can be applied in the application environment as shown in Fig. 1, in which the client (computer equipment) communicates with the server through the network.
  • the client includes, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, cameras, and portable wearable devices.
  • the server can be implemented as an independent server or a server cluster composed of multiple servers.
  • an image detection method is provided, and the technical solution mainly includes the following steps S100-S200:
  • An image detection instruction is received, and an image of a target area to be detected is acquired.
  • the image of the target domain to be detected is acquired by the same device as the one that collected the target domain image samples, and it is the image that needs to be recognized.
  • the image detection instruction is triggered when the image of the target domain to be detected needs to be recognized.
  • the trigger mode of the image detection instruction can be set according to requirements, for example, triggered automatically after the image of the target domain to be detected is collected, or triggered by clicking the OK button after the image of the target domain to be detected is collected.
  • the method of acquiring the image of the target domain to be detected can also be set according to requirements; for example, it may be acquired through the path where the image of the target domain to be detected is stored, or obtained from the image detection instruction carrying the image of the target domain to be detected, and so on.
  • the image features include features such as the colors in the image.
  • the image detection model is trained and completed by the above-mentioned domain adaptive model training method. It outputs the source domain category result according to the extracted image features; the categories of the source domain category result are the same as the full set of category labels, and the source domain category result characterizes the category of the target domain image to be detected.
  • this application acquires the image of the target domain to be detected, inputs it into the image detection model trained by the above-mentioned domain adaptive model training method, extracts the image features in the target domain image to be detected through the image detection model, and obtains the source domain category result output by the image detection model according to the image features. In this way, this application automatically recognizes the category of the target domain image to be detected through the domain adaptive model, realizing cross-device or cross-domain image detection, improving the accuracy and reliability of cross-domain recognition results, and saving costs.
  • a domain adaptive model training device is provided, and the domain adaptive model training device corresponds to the domain adaptive model training method in the above-mentioned embodiment in a one-to-one correspondence.
  • the domain adaptive model training device includes an acquisition module 11, an input module 12, an extraction module 13, a first loss module 14, a second loss module 15 and a training module 16.
  • the detailed description of each functional module is as follows:
  • the obtaining module 11 is used to obtain an image sample set; the image sample set includes a plurality of image samples; the image samples include source domain image samples and target domain image samples; one source domain image sample is associated with one category label and one domain label; one target domain image sample is associated with one domain label;
  • the input module 12 is configured to input the image sample into a Faster-RCNN-based domain adaptive model containing initial parameters, and perform image conversion on the image sample through the preprocessing model to obtain a preprocessed image sample;
  • the domain adaptive model includes the preprocessing model, the feature extraction model, the region extraction model, the detection model, the global feature model, and the local feature model;
  • the extraction module 13 is configured to perform image feature extraction on the preprocessed image through the feature extraction model to obtain a feature vector image
  • the first loss module 14 is configured to perform region extraction and equalized sampling on the feature vector map through the region extraction model to obtain a regional feature vector map; perform local feature extraction processing and two-class recognition on the regional feature vector map corresponding to the image sample through the local feature model to obtain a local domain classification result, and obtain the local feature alignment loss value according to the local domain classification result and the domain label corresponding to the image sample; and at the same time perform regularization and global feature recognition processing on the feature vector map through the global feature model to obtain a feature regular loss value and a global domain classification result, and obtain a global feature alignment loss value according to the global domain classification result and the domain label corresponding to the feature vector map;
  • the second loss module 15 is configured to perform boundary regression and source domain classification recognition on the regional feature vector map corresponding to the source domain image sample through the detection model to obtain the recognition result, and to obtain the detection loss value according to the recognition result and the category label corresponding to the source domain image sample; and to obtain a total loss value according to the global feature alignment loss value, the detection loss value, the local feature alignment loss value, and the feature regular loss value;
  • the training module 16 is configured to iteratively update the initial parameters of the domain adaptive model when the total loss value does not reach the preset convergence condition, until the total loss value reaches the preset convergence condition, The domain adaptive model after convergence is recorded as a trained domain adaptive model.
  • the input module 12 includes:
  • the matching sub-module is configured to perform size matching on the image sample through the preprocessing model according to preset size parameters to obtain a matching image sample;
  • the conversion sub-module is configured to perform denoising and image enhancement processing on the matched image samples through the preprocessing model according to the gamma transformation algorithm to obtain the preprocessed image samples.
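The gamma transformation used by the conversion sub-module can be sketched as follows (the gamma value 0.5 is an illustrative choice, not the patent's setting; denoising is omitted here):

```python
import numpy as np

def gamma_transform(img, gamma=0.5):
    """Gamma transformation for image enhancement: normalize pixel values
    to [0, 1], raise them to the power gamma, and rescale to [0, 255].
    gamma < 1 brightens dark regions; gamma > 1 darkens bright regions."""
    normalized = np.clip(img, 0, 255) / 255.0
    return (normalized ** gamma) * 255.0

dark = np.array([[16.0, 64.0]])
print(np.round(gamma_transform(dark, 0.5)))   # [[ 64. 128.]]
```

Because the mapping is nonlinear, low intensities are lifted proportionally more than high ones, which is what makes it useful for enhancing under-exposed matched image samples.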
  • the first loss module 14 includes:
  • An extraction sub-module configured to perform region extraction on the feature vector graph through the region extraction network layer in the region extraction model to obtain at least one candidate region frame;
  • the pooling sub-module is used to perform balanced sampling processing on the feature vector map and all the candidate region frames through the region of interest pool layer in the region extraction model to obtain a region feature vector map.
  • the first loss module 14 further includes:
  • the local extraction sub-module is used to perform local feature extraction on the regional feature vector map by the feature extractor in the local feature model to obtain a local feature vector map;
  • the local classification sub-module is used to perform two-class recognition on the local feature vector graph through the domain classifier in the local feature model to obtain the local domain classification result;
  • the local inversion sub-module is used to reverse and align the local domain classification results through the gradient inversion layer in the local feature model to obtain a reverse domain label;
  • the local loss sub-module is used to compare the difference between the reverse domain label and the domain label corresponding to the regional feature vector graph by the domain difference measurer in the local feature model to obtain the local feature alignment loss value.
  • the first loss module 14 further includes:
  • the global regularization sub-module is used to perform regularization processing on the feature vector map through the feature regular model in the global feature model to obtain a global regular feature map, and at the same time calculate the feature regular loss value through the regular loss function in the feature regular model;
  • a global classification sub-module configured to perform global feature extraction processing and classification recognition on the global regular feature map through the global feature model to obtain the global domain classification result;
  • the global loss sub-module is used to input the global domain classification result and the domain label corresponding to the feature vector graph into a global loss model, and calculate the global domain classification result and the feature by the global loss model The difference between the domain labels corresponding to the vector graph is used to obtain the global feature alignment loss value.
  • Each module in the above-mentioned domain adaptive model training device can be implemented in whole or in part by software, hardware, and a combination thereof.
  • the above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
  • an image detection device is provided, and the image detection device corresponds to the image detection method in the above-mentioned embodiment one-to-one.
  • the image detection device includes a receiving module 101 and a detection module 102.
  • the detailed description of each functional module is as follows:
  • the receiving module 101 is configured to receive an image detection instruction and obtain an image of a target area to be detected;
  • the detection module 102 is configured to input the image of the target domain to be detected into the image detection model trained by the domain adaptive model training method according to any one of claims 1 to 5, extract the image features in the target domain image to be detected through the image detection model, and obtain the source domain classification result output by the image detection model according to the image features.
  • Each module in the above-mentioned image detection device can be implemented in whole or in part by software, hardware, and a combination thereof.
  • the above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 10.
  • the computer equipment includes a processor, a memory, a network interface, and a database connected through a system bus.
  • the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a readable storage medium and an internal memory.
  • the readable storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer readable instructions in the readable storage medium.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instructions are executed by the processor to implement a domain adaptive model training method or image detection method.
  • the readable storage medium provided in this embodiment includes a non-volatile readable storage medium and a volatile readable storage medium.
  • a computer device including a memory, a processor, and computer-readable instructions stored in the memory and capable of running on the processor.
  • when the processor executes the computer-readable instructions, the domain adaptive model training method in the foregoing embodiments is implemented, or the image detection method in the foregoing embodiments is implemented.
  • one or more readable storage media storing computer readable instructions are provided.
  • the readable storage media provided in this embodiment include non-volatile readable storage media and volatile readable storage media; the readable storage media store computer-readable instructions, and when the computer-readable instructions are executed by one or more processors, the one or more processors implement the image detection method in the above-mentioned embodiments.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention concerns a region-based adaptive model training method, an image detection method and device, an apparatus, and a medium. The region-based adaptive model training method comprises the steps of: acquiring an image sample set containing a plurality of image samples; inputting the image samples into a region-based adaptive model which is based on Faster-RCNN and contains initial parameters; performing image conversion on the image samples by means of a preprocessing model so as to obtain preprocessed image samples; obtaining a feature vector map by means of a feature extraction model; obtaining a regional feature vector map by means of a region extraction model; obtaining a local feature alignment loss value by means of a local feature model; performing regularization and global feature recognition processing by means of a global feature model so as to obtain a feature regular loss value and a global feature alignment loss value; obtaining a detection loss value by means of a detection model; obtaining a total loss value; and iteratively updating the initial parameters until convergence so as to obtain a trained region-based adaptive model. The present method enables cross-domain image recognition, improving the accuracy and reliability of image recognition.
PCT/CN2020/116742 2020-07-28 2020-09-22 Domain-adaptive model training method and device, image detection method and device, and apparatus and medium WO2021120752A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010737198.7A CN111860670B (zh) 2020-07-28 2020-07-28 Domain-adaptive model training and image detection method, apparatus, device and medium
CN202010737198.7 2020-07-28

Publications (1)

Publication Number Publication Date
WO2021120752A1 true WO2021120752A1 (fr) 2021-06-24

Family

ID=72948336

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/116742 WO2021120752A1 (fr) 2020-09-22 Domain-adaptive model training method and device, image detection method and device, and apparatus and medium

Country Status (2)

Country Link
CN (1) CN111860670B (fr)
WO (1) WO2021120752A1 (fr)

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113343989A (zh) * 2021-07-09 2021-09-03 中山大学 Target detection method and system based on foreground-selection domain adaptation
CN113392933A (zh) * 2021-07-06 2021-09-14 湖南大学 Uncertainty-guided adaptive cross-domain target detection method
CN113469092A (zh) * 2021-07-13 2021-10-01 深圳思谋信息科技有限公司 Character recognition model generation method and apparatus, computer device and storage medium
CN113658036A (zh) * 2021-08-23 2021-11-16 平安科技(深圳)有限公司 Data augmentation method and apparatus based on generative adversarial networks, computer and medium
CN113705685A (zh) * 2021-08-30 2021-11-26 平安科技(深圳)有限公司 Disease feature recognition model training and disease feature recognition method, apparatus and device
CN113792576A (zh) * 2021-07-27 2021-12-14 北京邮电大学 Human behavior recognition method based on supervised domain adaptation, and electronic device
CN113807420A (zh) * 2021-09-06 2021-12-17 湖南大学 Domain-adaptive target detection method and system considering category semantic matching
CN113887538A (zh) * 2021-11-30 2022-01-04 北京的卢深视科技有限公司 Model training and face recognition method, electronic device and storage medium
CN114065852A (zh) * 2021-11-11 2022-02-18 合肥工业大学 Multi-source joint adaptation and cohesive feature extraction method based on dynamic weights
CN114119585A (zh) * 2021-12-01 2022-03-01 昆明理工大学 Transformer-based key-feature-enhanced gastric cancer image recognition method
CN114694146A (zh) * 2022-03-25 2022-07-01 北京世纪好未来教育科技有限公司 Text recognition model training method, text recognition method, apparatus and device
CN114898111A (zh) * 2022-04-26 2022-08-12 北京百度网讯科技有限公司 Pre-trained model generation method and apparatus, and target detection method and apparatus
CN115456917A (zh) * 2022-11-11 2022-12-09 中国石油大学(华东) Image enhancement method, apparatus, device and medium conducive to accurate target detection
CN115641584A (zh) * 2022-12-26 2023-01-24 武汉深图智航科技有限公司 Foggy-weather image recognition method and apparatus
CN116883735A (zh) * 2023-07-05 2023-10-13 江南大学 Domain-adaptive wheat seed classification method based on shared and private features
CN116894839A (zh) * 2023-09-07 2023-10-17 深圳市谱汇智能科技有限公司 Chip wafer defect detection method and apparatus, terminal device and storage medium
CN117297554A (zh) * 2023-11-16 2023-12-29 哈尔滨海鸿基业科技发展有限公司 Lymphatic imaging apparatus control system and method
CN117372791A (zh) * 2023-12-08 2024-01-09 齐鲁空天信息研究院 Fine-grained directed-energy damage area detection method, apparatus and storage medium
CN117648576A (zh) * 2024-01-24 2024-03-05 腾讯科技(深圳)有限公司 Data augmentation model training and data processing method, apparatus, device and medium
CN117830882A (zh) * 2024-03-04 2024-04-05 广东泰一高新技术发展有限公司 Deep-learning-based aerial image recognition method and related products

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112183788B (zh) * 2020-11-30 2021-03-30 华南理工大学 Domain-adaptive equipment operation-inspection system and method
CN113160135A (zh) * 2021-03-15 2021-07-23 华南理工大学 Intelligent colon lesion recognition method, system and medium based on unsupervised transfer image classification
CN112710632A (zh) * 2021-03-23 2021-04-27 四川京炜交通工程技术有限公司 Method and system for detecting high and low refractive indices of glass microbeads
CN113222997A (zh) * 2021-03-31 2021-08-06 上海商汤智能科技有限公司 Neural network generation and image processing method, apparatus, electronic device and medium
CN113033557A (zh) * 2021-04-16 2021-06-25 北京百度网讯科技有限公司 Method and apparatus for training an image processing model and detecting images
CN113177525A (zh) * 2021-05-27 2021-07-27 杭州有赞科技有限公司 AI electronic scale system and weighing method
CN113392886A (zh) * 2021-05-31 2021-09-14 北京达佳互联信息技术有限公司 Method and apparatus for obtaining an image recognition model, electronic device and storage medium
CN113469190B (zh) * 2021-06-10 2023-09-15 电子科技大学 Single-stage target detection algorithm based on domain adaptation
CN113269278B (zh) * 2021-07-16 2021-11-09 广东众聚人工智能科技有限公司 Robot cruising target recognition method and system based on domain flipping
CN113591639A (zh) * 2021-07-20 2021-11-02 北京爱笔科技有限公司 Alignment framework training method and apparatus, computer device and storage medium
CN113269190B (zh) * 2021-07-21 2021-10-12 中国平安人寿保险股份有限公司 Artificial-intelligence-based data classification method and apparatus, computer device and medium
CN113554013B (zh) * 2021-09-22 2022-03-29 华南理工大学 Cross-scene recognition model training method, cross-scene road recognition method and apparatus
CN115131590B (zh) * 2022-09-01 2022-12-06 浙江大华技术股份有限公司 Target detection model training method, target detection method and related devices
CN116028821B (zh) * 2023-03-29 2023-06-13 中电科大数据研究院有限公司 Pre-trained model training method and data processing method incorporating domain knowledge
CN117078985B (zh) * 2023-10-17 2024-01-30 之江实验室 Scene matching method and apparatus, storage medium and electronic device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109241321A (zh) * 2018-07-19 2019-01-18 杭州电子科技大学 Joint image and model analysis method based on deep domain adaptation
CN110399856A (zh) * 2019-07-31 2019-11-01 上海商汤临港智能科技有限公司 Feature extraction network training method, image processing method, apparatus and device
CN110516671A (zh) * 2019-08-27 2019-11-29 腾讯科技(深圳)有限公司 Neural network model training method, image detection method and apparatus
US20200143248A1 (en) * 2017-07-12 2020-05-07 Tencent Technology (Shenzhen) Company Limited Machine learning model training method and device, and expression image classification method and device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105930841B (zh) * 2016-05-13 2018-01-26 百度在线网络技术(北京)有限公司 Method, apparatus and computer device for automatic semantic annotation of images
US10402701B2 (en) * 2017-03-17 2019-09-03 Nec Corporation Face recognition system for face recognition in unlabeled videos with domain adversarial learning and knowledge distillation
CN109145766B (zh) * 2018-07-27 2021-03-23 北京旷视科技有限公司 Model training method and apparatus, recognition method, electronic device and storage medium
CN109086437B (zh) * 2018-08-15 2021-06-01 重庆大学 Image retrieval method combining Faster-RCNN and a Wasserstein autoencoder
CN110852368B (zh) * 2019-11-05 2022-08-26 南京邮电大学 Sentiment analysis method and system based on global and local feature embedding and image-text fusion
CN111368886B (zh) * 2020-02-25 2023-03-21 华南理工大学 Unlabeled vehicle image classification method based on sample screening


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHANG XIAOWEI, LYU MINGQIANG, LI HUI: "Cross-Domain Person Re-Identification Based on Partial Semantic Feature Invariance", JOURNAL OF BEIJING UNIVERSITY OF AERONAUTICS AND ASTRONAUTICS, vol. 46, no. 9, 15 April 2020 (2020-04-15), pages 1682 - 1690, XP055823405, ISSN: 1001-5965, DOI: 10.13700/j.bh.1001-5965.2020.0072 *

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113392933A (zh) * 2021-07-06 2021-09-14 湖南大学 Uncertainty-guided adaptive cross-domain target detection method
CN113343989B (zh) * 2021-07-09 2022-09-27 中山大学 Target detection method and system based on foreground-selection domain adaptation
CN113343989A (zh) * 2021-07-09 2021-09-03 中山大学 Target detection method and system based on foreground-selection domain adaptation
CN113469092B (zh) * 2021-07-13 2023-09-08 深圳思谋信息科技有限公司 Character recognition model generation method and apparatus, computer device and storage medium
CN113469092A (zh) * 2021-07-13 2021-10-01 深圳思谋信息科技有限公司 Character recognition model generation method and apparatus, computer device and storage medium
CN113792576A (zh) * 2021-07-27 2021-12-14 北京邮电大学 Human behavior recognition method based on supervised domain adaptation, and electronic device
CN113658036A (zh) * 2021-08-23 2021-11-16 平安科技(深圳)有限公司 Data augmentation method and apparatus based on generative adversarial networks, computer and medium
CN113705685B (zh) * 2021-08-30 2023-08-01 平安科技(深圳)有限公司 Disease feature recognition model training and disease feature recognition method, apparatus and device
CN113705685A (zh) * 2021-08-30 2021-11-26 平安科技(深圳)有限公司 Disease feature recognition model training and disease feature recognition method, apparatus and device
CN113807420A (zh) * 2021-09-06 2021-12-17 湖南大学 Domain-adaptive target detection method and system considering category semantic matching
CN113807420B (zh) * 2021-09-06 2024-03-19 湖南大学 Domain-adaptive target detection method and system considering category semantic matching
CN114065852B (zh) * 2021-11-11 2024-04-16 合肥工业大学 Multi-source joint adaptation and cohesive feature extraction method based on dynamic weights
CN114065852A (zh) * 2021-11-11 2022-02-18 合肥工业大学 Multi-source joint adaptation and cohesive feature extraction method based on dynamic weights
CN113887538B (zh) * 2021-11-30 2022-03-25 北京的卢深视科技有限公司 Model training and face recognition method, electronic device and storage medium
CN113887538A (zh) * 2021-11-30 2022-01-04 北京的卢深视科技有限公司 Model training and face recognition method, electronic device and storage medium
CN114119585B (zh) * 2021-12-01 2022-11-29 昆明理工大学 Transformer-based key-feature-enhanced gastric cancer image recognition method
CN114119585A (zh) * 2021-12-01 2022-03-01 昆明理工大学 Transformer-based key-feature-enhanced gastric cancer image recognition method
CN114694146A (zh) * 2022-03-25 2022-07-01 北京世纪好未来教育科技有限公司 Text recognition model training method, text recognition method, apparatus and device
CN114694146B (zh) * 2022-03-25 2024-04-02 北京世纪好未来教育科技有限公司 Text recognition model training method, text recognition method, apparatus and device
CN114898111A (zh) * 2022-04-26 2022-08-12 北京百度网讯科技有限公司 Pre-trained model generation method and apparatus, and target detection method and apparatus
CN114898111B (zh) * 2022-04-26 2023-04-07 北京百度网讯科技有限公司 Pre-trained model generation method and apparatus, and target detection method and apparatus
CN115456917A (zh) * 2022-11-11 2022-12-09 中国石油大学(华东) Image enhancement method, apparatus, device and medium conducive to accurate target detection
CN115456917B (zh) * 2022-11-11 2023-02-17 中国石油大学(华东) Image enhancement method, apparatus, device and medium conducive to accurate target detection
CN115641584A (zh) * 2022-12-26 2023-01-24 武汉深图智航科技有限公司 Foggy-weather image recognition method and apparatus
CN116883735B (zh) * 2023-07-05 2024-03-08 江南大学 Domain-adaptive wheat seed classification method based on shared and private features
CN116883735A (zh) * 2023-07-05 2023-10-13 江南大学 Domain-adaptive wheat seed classification method based on shared and private features
CN116894839B (zh) * 2023-09-07 2023-12-05 深圳市谱汇智能科技有限公司 Chip wafer defect detection method and apparatus, terminal device and storage medium
CN116894839A (zh) * 2023-09-07 2023-10-17 深圳市谱汇智能科技有限公司 Chip wafer defect detection method and apparatus, terminal device and storage medium
CN117297554A (zh) * 2023-11-16 2023-12-29 哈尔滨海鸿基业科技发展有限公司 Lymphatic imaging apparatus control system and method
CN117372791A (zh) * 2023-12-08 2024-01-09 齐鲁空天信息研究院 Fine-grained directed-energy damage area detection method, apparatus and storage medium
CN117372791B (zh) * 2023-12-08 2024-03-22 齐鲁空天信息研究院 Fine-grained directed-energy damage area detection method, apparatus and storage medium
CN117648576A (zh) * 2024-01-24 2024-03-05 腾讯科技(深圳)有限公司 Data augmentation model training and data processing method, apparatus, device and medium
CN117648576B (zh) * 2024-01-24 2024-04-12 腾讯科技(深圳)有限公司 Data augmentation model training and data processing method, apparatus, device and medium
CN117830882A (zh) * 2024-03-04 2024-04-05 广东泰一高新技术发展有限公司 Deep-learning-based aerial image recognition method and related products

Also Published As

Publication number Publication date
CN111860670B (zh) 2022-05-17
CN111860670A (zh) 2020-10-30

Similar Documents

Publication Publication Date Title
WO2021120752A1 (fr) Domain-adaptive model training method and device, image detection method and device, and apparatus and medium
WO2020238293A1 (fr) Image classification method, and neural network training method and apparatus
WO2019232853A1 (fr) Chinese model training method, Chinese model recognition method, device, apparatus and medium
CN110909820B (zh) Image classification method and system based on self-supervised learning
WO2016145940A1 (fr) Face authentication method and device
US9928405B2 (en) System and method for detecting and tracking facial features in images
WO2018054283A1 (fr) Face model training method and device, and face authentication method and device
WO2016150240A1 (fr) Identity authentication method and apparatus
WO2014205231A1 (fr) Deep learning framework for generic object detection
CN110909618B (zh) Pet identity recognition method and apparatus
Zhang et al. Road recognition from remote sensing imagery using incremental learning
Abdelsamea et al. A SOM-based Chan–Vese model for unsupervised image segmentation
Wang et al. Hand vein recognition based on multi-scale LBP and wavelet
Liu et al. Finger vein recognition using optimal partitioning uniform rotation invariant LBP descriptor
CN111091129B (zh) Image salient region extraction method based on manifold ranking over multiple color features
Zarbakhsh et al. Low-rank sparse coding and region of interest pooling for dynamic 3D facial expression recognition
Yang et al. Non-rigid point set registration via global and local constraints
Kalam et al. Gaussian Kernel Based Fuzzy CMeans Clustering Algorithm For Image Segmentation
CN113313179A (zh) Noisy image classification method based on the L2P-norm robust least squares method
Ge et al. Active appearance models using statistical characteristics of gabor based texture representation
WO2020247494A1 (fr) Cross-matching of contactless fingerprints to existing contact-based fingerprints
US9659210B1 (en) System and method for detecting and tracking facial features in images
Pathak et al. Entropy based CNN for segmentation of noisy color eye images using color, texture and brightness contour features
CN111488811A (zh) Face recognition method and apparatus, terminal device and computer-readable medium
CN111768436B (zh) Improved image feature-block registration method based on Faster-RCNN

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20900968

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20900968

Country of ref document: EP

Kind code of ref document: A1