WO2021120752A1

WO2021120752A1 - Region-based self-adaptive model training method and device, image detection method and device, and apparatus and medium

Info

Publication number: WO2021120752A1
Application number: PCT/CN2020/116742
Authority: WO
Inventors: 周侠; 吕彬; 高鹏; 吕传峰
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-07-28
Filing date: 2020-09-22
Publication date: 2021-06-24
Also published as: CN111860670A; CN111860670B

Abstract

A region-based self-adaptive model training method and device, an image detection method and device, an apparatus and a medium. The region-based self-adaptive model training method comprises: acquiring an image sample set containing a plurality of image samples; inputting the image samples into a region-based adaptive model that is based on Faster-RCNN and that contains initial parameters, performing image conversion on the image samples by means of a preprocessing model so as to obtain preprocessed image samples; obtaining a feature vector graph by means of a feature extraction model; obtaining a region feature vector graph by means of a region extraction model; obtaining a local feature alignment loss value by means of a local feature model; performing normalization and global feature recognition processing by means of a global feature model so as to obtain a feature normal loss value and a global feature alignment loss value; obtaining a detection loss value by means of a detection model; obtaining a total loss value; and iteratively updating the initial parameters until achieving convergence so as to obtain a trained region-based self-adaptive model. The present method allows for cross-region image recognition, improving image recognition accuracy and reliability.

Description

Domain adaptive model training, image detection method, device, equipment and medium

This application claims the priority of the Chinese patent application filed with the Chinese Patent Office on July 28, 2020, the application number is 202010737198.7, and the invention title is "domain adaptive model training, image detection methods, devices, equipment and media", all of which The content is incorporated in this application by reference.

Technical field

This application relates to the field of artificial intelligence classification models, and in particular to a domain adaptive model training, image detection methods, devices, computer equipment, and storage media.

Background technique

The inventor found that the current deep learning method has been widely used in artificial intelligence, but the deep learning method is very dependent on the distribution of training data. If the distribution of the collected training data is different, the detection accuracy of the model finally trained by the deep learning method will be low. For example, OCT (Optical Coherence Tomography) lesion detection is a very important part of medical diagnosis. Researchers have begun to detect lesions through OCT based on deep learning. However, due to the differences in the collection parameters and collection methods of different OCT collection devices, there are differences in the distribution of data collected by different devices, which will seriously affect the detection results , Resulting in deviations in the detection results, which in turn makes the detection accuracy rate low.

Summary of the invention

This application provides a domain adaptive model training, image detection method, device, computer equipment, and storage medium, which realizes that there is no need to manually label images in the target domain, and by adaptively aligning the distribution differences of image data from different domain sources, Improve the training efficiency of the domain adaptive model, and introduce the feature regular loss value, improve the robustness and accuracy of the domain adaptive model, and automatically identify the category of the target domain image through the domain adaptive model, realizing cross-domain Image detection improves the reliability of recognition and saves costs.

A domain adaptive model training method, including:

Obtain an image sample set; the image sample set includes a plurality of image samples; the image sample includes a source domain image sample and a target domain image sample; one source domain image sample is associated with a category label and a domain label; The target domain image sample is associated with a domain label;

The image sample is input to a Faster-RCNN-based domain adaptive model containing initial parameters, and the image sample is image converted through the preprocessing model to obtain a preprocessed image sample; the domain adaptive model includes the preprocessing Model, feature extraction model, area extraction model, detection model, global feature model and local feature model;

Performing image feature extraction on the preprocessed image by using the feature extraction model to obtain a feature vector diagram;

Perform region extraction and equal sampling on the feature vector map through the region extraction model to obtain a region feature vector map; use the local feature model to perform local feature extraction processing and two steps on the region feature vector map corresponding to the image sample Classification and recognition, obtain the local domain classification result, and obtain the local feature alignment loss value according to the local domain classification result and the domain label corresponding to the image sample; at the same time, the feature vector graph is regularized by the global feature model And global feature recognition processing to obtain a feature regular loss value and a global domain classification result, and obtain a global feature alignment loss value according to the global domain classification result and the domain label corresponding to the feature vector graph;

Perform boundary regression and source domain classification and recognition on the region feature vector map corresponding to the source domain image sample through the detection model to obtain a recognition result, and according to the recognition result and the category label corresponding to the source domain image sample Obtain a detection loss value; obtain a total loss value according to the global feature alignment loss value, the detection loss value, the local feature alignment loss value, and the feature regular loss value;

When the total loss value does not reach the preset convergence condition, iteratively update the initial parameters of the domain adaptive model, until the total loss value reaches the preset convergence condition, the domain after convergence The adaptive model is recorded as a domain adaptive model that has been trained.

An image detection method, including:

The image detection instruction is received, and the image of the target area to be detected is obtained;

The image of the target domain to be detected is input into the image detection model trained as the above-mentioned domain adaptive model training method, the image features in the image of the target domain to be detected are extracted through the image detection model, and the image detection model is obtained according to The source domain category result of the image feature output.

A domain adaptive model training device, including:

The acquisition module is used to acquire an image sample set; the image sample set includes a plurality of image samples; the image sample includes a source domain image sample and a target domain image sample; one source domain image sample, one category label, and one domain Label association; one image sample of the target domain is associated with one domain label;

The input module is used to input the image sample into a Faster-RCNN-based domain adaptive model containing initial parameters, and perform image conversion on the image sample through the preprocessing model to obtain a preprocessed image sample; the domain adaptive model Including the preprocessing model, feature extraction model, region extraction model, detection model, global feature model, and local feature model;

An extraction module, configured to perform image feature extraction on the preprocessed image through the feature extraction model to obtain a feature vector image;

The first loss module is used to perform region extraction and equal sampling of the feature vector map through the region extraction model to obtain a region feature vector map; use the local feature model to perform region feature vector maps corresponding to the image samples Perform local feature extraction processing and binary classification recognition to obtain a local domain classification result, and obtain a local feature alignment loss value according to the local domain classification result and the domain label corresponding to the image sample; at the same time, the global feature model is used to The feature vector graph performs regularization and global feature recognition processing to obtain a feature regular loss value and a global domain classification result, and according to the global domain classification result and the domain label corresponding to the feature vector graph, a global feature alignment loss is obtained value;

The second loss module is used to perform boundary regression and source domain classification and recognition on the region feature vector map corresponding to the source domain image sample through the detection model to obtain the recognition result, and according to the recognition result and the source domain The category label corresponding to the domain image sample obtains the detection loss value; according to the global feature alignment loss value, the detection loss value, the local feature alignment loss value, and the feature regular loss value, a total loss value is obtained;

The training module is configured to iteratively update the initial parameters of the domain adaptive model when the total loss value does not reach the preset convergence condition, until the total loss value reaches the preset convergence condition, it will converge The subsequent domain adaptive model is recorded as a trained domain adaptive model.

An image detection device includes:

The receiving module is used to receive the image detection instruction and obtain the image of the target area to be detected;

The detection module is used to input the image of the target domain to be detected into the image detection model trained as the above-mentioned domain adaptive model training method, extract the image features in the image of the target domain to be detected through the image detection model, and obtain all the images of the target domain. The image detection model outputs a source domain category result according to the image feature.

A computer device includes a memory, a processor, and computer-readable instructions that are stored in the memory and can run on the processor, and the processor implements the following steps when the processor executes the computer-readable instructions:

Perform region extraction and balanced sampling on the feature vector map through the region extraction model to obtain a region feature vector map; use the local feature model to perform local feature extraction processing and two steps on the region feature vector map corresponding to the image sample Classification and recognition, obtain the local domain classification result, and obtain the local feature alignment loss value according to the local domain classification result and the domain label corresponding to the image sample; at the same time, the feature vector graph is regularized by the global feature model And global feature recognition processing to obtain a feature regular loss value and a global domain classification result, and obtain a global feature alignment loss value according to the global domain classification result and the domain label corresponding to the feature vector graph;

A computer device includes a memory, a processor, and computer-readable instructions that are stored in the memory and can run on the processor, and the processor further implements the following steps when the processor executes the computer-readable instructions:

One or more readable storage media storing computer readable instructions, when the computer readable instructions are executed by one or more processors, the one or more processors execute the following steps:

One or more readable storage media storing computer readable instructions, when the computer readable instructions are executed by one or more processors, the one or more processors further execute the following steps:

The domain adaptive model training method, device, computer equipment, and storage medium provided in this application obtain an image sample set containing multiple image samples; the image samples include source domain image samples and target domain image samples; The sample input contains initial parameters and is based on a Faster-RCNN-based domain adaptive model. The image sample is image converted by the preprocessing model to obtain the preprocessed image sample; the preprocessed image is imaged by the feature extraction model Extract and obtain a feature vector map; perform region extraction and equal sampling on the feature vector map through the region extraction model to obtain a region feature vector map; use the local feature model to perform a regional feature vector map corresponding to the image sample Perform local feature extraction processing and binary classification recognition to obtain local feature alignment loss values; at the same time, perform regularization and global feature recognition processing on the feature vector graph through the global feature model to obtain feature regularization loss values and global feature alignment loss values Perform boundary regression and source domain classification and recognition on the regional feature vector map corresponding to the source domain image sample through the detection model to obtain the detection loss value; according to the global feature alignment loss value, the detection loss value, and the The local feature alignment loss value and the feature regular loss value are used to obtain the total loss value; when the total loss value does not reach the preset convergence condition, the initial parameters are iteratively updated until convergence, and the trained domain adaptive model is obtained, Therefore, this application provides a domain adaptive model training method, which is trained by acquiring image samples of the source domain and the target domain, without artificial labeling of the image samples of the target domain, and the method of global feature alignment and local feature alignment The distribution differences of image data from different domain sources are adapted to improve the training efficiency of the domain adaptive model, and the feature regular loss value is introduced to improve the robustness and accuracy of the domain adaptive model. The image samples of the domain are trained, and the domain adaptive model based on Faster-RCNN is converged according to the total loss value containing the global feature alignment loss value, detection loss value, local feature alignment loss value and feature regular loss value. Cross-domain image recognition improves the accuracy and reliability of image recognition and saves labor costs.

The image detection method, device, computer equipment and storage medium provided in this application receive the image detection instruction and obtain the image of the target domain to be detected; input the image of the target domain to be detected into the training completed by the above-mentioned domain adaptive model training method The image detection model extracts the image features in the image of the target domain to be detected through the image detection model, and obtains the source domain category results output by the image detection model according to the image features. In this way, this application adopts the domain adaptive model The category of the target domain image is automatically recognized, cross-domain image detection is realized, the recognition reliability is improved, and the cost is saved.

The details of one or more embodiments of the present application are presented in the following drawings and description, and other features and advantages of the present application will become apparent from the description, drawings and claims.

Description of the drawings

In order to explain the technical solutions of the embodiments of the present application more clearly, the following will briefly introduce the drawings that need to be used in the description of the embodiments of the present application. Obviously, the drawings in the following description are only some embodiments of the present application. For those of ordinary skill in the art, other drawings can be obtained based on these drawings without creative labor.

FIG. 1 is a schematic diagram of an application environment of a domain adaptive model training method or an image detection method in an embodiment of the present application;

FIG. 2 is a flowchart of a method for training a domain adaptive model in an embodiment of the present application;

FIG. 3 is a flowchart of step S20 of a domain adaptive model training method in an embodiment of the present application;

4 is a flowchart of step S40 of the method for training a domain adaptive model in an embodiment of the present application;

FIG. 5 is a flowchart of step S40 of a method for training a domain adaptive model in another embodiment of the present application;

FIG. 6 is a flowchart of step S40 of the domain adaptive model training method in another embodiment of the present application;

Fig. 7 is a flowchart of an image detection method in an embodiment of the present application;

Fig. 8 is a functional block diagram of a domain adaptive model training device in an embodiment of the present application;

Fig. 9 is a functional block diagram of an image detection device in an embodiment of the present application;

Fig. 10 is a schematic diagram of a computer device in an embodiment of the present application.

Detailed ways

The technical solutions in the embodiments of the present application will be described clearly and completely in conjunction with the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, rather than all of them. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of this application.

The domain adaptive model training method provided by this application can be applied in an application environment as shown in Fig. 1, where a client (computer device) communicates with a server through a network. Among them, the client (computer equipment) includes, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, cameras, and portable wearable devices. The server can be implemented as an independent server or a server cluster composed of multiple servers.

In an embodiment, as shown in FIG. 2, a method for training a domain adaptive model is provided, and its technical solution mainly includes the following steps S10-S60:

S10. Obtain an image sample set; the image sample set includes a plurality of image samples; the image sample includes a source domain image sample and a target domain image sample; one source domain image sample is associated with a category label and a domain label; An image sample of the target domain is associated with a domain label.

Understandably, receiving a training request for a domain adaptive model triggers the acquisition of the image sample set used to train the domain adaptive model, the image sample set being a collection of the collected image samples, and the image samples include There are the source domain image sample and the target domain image sample, the source domain image sample is an image collected in a known field or by a known device and marked with a category label, and the category label indicates all The category of the source domain image sample, for example, an OCT image sample is collected on a known OCT acquisition device, and the OCT image sample has been marked with the category label of the area contained in the OCT image sample (choroidal area, macular hole) Region, etc.), one of the source domain image samples is associated with a category label and a domain label, the domain label serves as a distinguishing identifier between the source domain image sample and the target domain image sample, and the domain label includes the source The domain label and the target domain label, for example, the label includes a model label (source domain label) corresponding to a known device and a model label (target domain label) corresponding to another device similar to the known device, the source domain image The sample is associated with the source domain label, the target domain image sample is associated with the target domain label, one target domain image sample is associated with one domain label, and the target domain image sample is associated with the known domain Related fields or images that are collected by another device similar to the known device and are not marked with a category label.

S20. Input the image sample into a Faster-RCNN-based domain adaptive model containing initial parameters, and perform image conversion on the image sample through the preprocessing model to obtain a preprocessed image sample; the domain adaptive model includes the Preprocessing model, feature extraction model, region extraction model, detection model, global feature model and local feature model.

Understandably, the domain adaptive model is a neural network model based on Faster-RCNN for image target detection, the domain adaptive model includes the initial parameters, and the initial parameters include the network structure and individual parameters of the domain adaptive model. The parameters of the model, the network structure of the domain adaptive model includes the network structure of Faster-RCNN, the preprocessing model is to perform image conversion on the input image sample to convert the preprocessed image sample, the image The conversion is an image processing process that performs size parameter and pixel enhancement on the image. The process can be set according to requirements. For example, image conversion includes scaling the input image into an image with preset size parameters, and performing image enhancement on the scaled image , And perform pixel enhancement operations on the converted image samples. The domain adaptive model includes a preprocessing model, a feature extraction model, a region extraction model, a detection model, a global feature model, and a local feature model.

In one embodiment, as shown in FIG. 3, in the step S20, that is, the image conversion is performed on the image sample through the preprocessing model to obtain the preprocessed image sample, including:

S201: Perform size matching on the image sample through the preprocessing model according to a preset size parameter to obtain a matching image sample.

Understandably, the size of the image sample varies according to different collection devices, and the image sample needs to be image converted through the preprocessing model to obtain an image in a unified format, and the size parameters are set according to requirements, The size parameter includes the length, width, and number of channels of the image, the number of channels is the number of channels after the image sample is converted, and the preprocessing model converts the image sample into the Matching an image sample, the size matching is an image that performs scaling processing, merging processing, or scaling and merging processing on the image sample to meet the requirements of the size parameter, for example, the size of the image sample is 600×800 with three channels The size parameter of the image is (600×600, 3), then the size of the matched image sample obtained after the size matching process is a 600×600 image with three channels.

S202: Perform denoising and image enhancement processing on the matched image sample through the preprocessing model according to the gamma transformation algorithm to obtain the preprocessed image sample.

Understandably, the preprocessing model reduces image noise on the matched image samples, and the denoising processing can be set according to requirements. For example, the denoising processing can be median filtering denoising, Gaussian filtering denoising, and mean filtering. Denoising, Wiener filtering denoising or Fourier filtering denoising, etc., through the preprocessing model, perform image enhancement processing on the matched image samples after denoising, and finally obtain the preprocessed image samples. The image enhancement processing is the processing operation of using the gamma transformation algorithm to enhance each pixel in the matched image sample after denoising. The gamma transformation algorithm is to correct the image and transform the gray scale of the image. An algorithm for correcting images with high or low gray levels to enhance contrast.

S30: Perform image feature extraction on the preprocessed image by using the feature extraction model to obtain a feature vector diagram.

Understandably, the image features are extracted from the preprocessed image, the image features are color features, texture features, shape features, and spatial relationship features in the image, and the feature extraction model includes the network structure of Faster-RCNN The 13 convolutional layers, 13 activation layers, and 4 pooling layers of, input the feature extraction model through the pre-processed image, and pass the image of each convolution layer, each activation layer, and each pooling layer The feature is extracted to obtain the feature vector graph, which is a matrix of multi-channel (also multi-dimensional) and containing vectors of the image features.

S40: Perform region extraction and equal sampling on the feature vector map through the region extraction model to obtain a region feature vector map; perform local feature extraction processing on the region feature vector map corresponding to the image sample through the local feature model And two-class recognition to obtain a local domain classification result, and obtain a local feature alignment loss value according to the local domain classification result and the domain label corresponding to the image sample; at the same time, the global feature model is used to compare the feature vector map Perform regularization and global feature recognition processing to obtain a feature regular loss value and a global domain classification result, and obtain a global feature alignment loss value according to the global domain classification result and the domain label corresponding to the feature vector graph.

Understandably, the region extraction model is also called a region generation network (RPN, Region Proposal Network). The region extraction model performs region extraction and equalization sampling processing on the feature vector graph, and the region extraction is derived from the features A plurality of candidate area frames are extracted from the vector graph. The candidate area frame is a target area (also a target area of interest) that contains an anchor that meets the preset requirements, and the equalized sampling is performed for all the candidates The region box is mapped to the feature vector map, and the candidate region box is mapped to the region in the feature vector map to perform ROI Pooling processing to obtain the region feature vector map of the same size The purpose of the balanced sampling is to pool candidate regions of different sizes into the same size region feature vector map.

Wherein, the local feature model extracts local features in the region feature vector graph, and the local feature extraction process is to extract features of the same nature in the information hidden in the local region, such as edge points or lines, to obtain A plurality of local feature vector graphs, and then perform binary classification recognition on all the local feature vector graphs through the local feature model, that is, to identify whether the local feature vector graph is the source domain label result or the target domain label result through the binary classification method, the The local domain classification result includes a local source domain label result and a local target domain label result, and the local domain classification result also includes identifying a probability value corresponding to the local source domain label result and a probability value corresponding to the local target domain label result, Calculate according to the local domain classification result and the domain label corresponding to the input image sample corresponding to the local domain classification result to obtain a local feature alignment loss value, and perform back propagation through the local feature alignment loss value, Adjust the parameters in the local feature model, and continuously align the local features in the source domain image sample and the local feature in the target domain image sample through the local feature alignment loss value to reduce the difference between the local features, That is, by extracting from the local features in the source domain image samples the effective local features for the two-class recognition of the target domain image samples, and extracting the source domain image samples from the local features in the target domain image samples. Classification and recognition of effective local features.

Wherein, the global feature model performs a regularization process on the feature vector graph, and performs a global feature recognition process on the regularized feature vector graph, and the global feature recognition process is the regularized feature Perform global feature extraction on the vector graph, classify and recognize the regularized feature vector graph according to the extracted global feature, the classification recognition is binary classification recognition, and obtain the global domain classification result, according to the regularization The feature vector graph of, obtains the feature regular loss value, the feature regular loss value can minimize the loss processing on the extracted global feature to prevent overfitting, and the global domain classification result includes the global source domain label result And the global target domain label result, the global domain classification result further includes identifying the probability value corresponding to the global domain label result and the probability value corresponding to the global target domain label result, according to the global domain classification result and the feature The gap between the domain labels corresponding to the vector graph is used to obtain the global feature alignment loss value. The global feature alignment loss value is used for backpropagation, the parameters in the global feature model are adjusted, and the global feature alignment loss value is continuously updated. The global features in the image samples of the source domain and the global features in the target domain image samples are aligned with each other to reduce the difference between the global features. That is, by extracting the global features from the source domain image samples, it is used to compare the target domain image samples. Perform two-class recognition and effective global features, extract from the global features in the target domain image samples the effective global features for the source domain image samples for classification and recognition, the global features are in the regularized feature vector image The embodied color feature, texture feature, and shape feature can represent the relevant features of the overall object.

In an embodiment, as shown in FIG. 4, the step S40, that is, the performing region extraction and equalization sampling on the feature vector map through the region extraction model to obtain a region feature vector map includes:

S401: Perform region extraction on the feature vector graph through the region extraction network layer in the region extraction model to obtain at least one candidate region frame.

Understandably, the region extraction model includes a region extraction network layer and a region of interest pooling layer, and the region extraction network layer includes a 3×3 convolution layer, an activation layer, and two 1×3 layers with different dimensional parameters. 1 convolutional layer, a softmax layer, and a fully connected layer, the region is extracted as the first feature map obtained by convolution of the feature vector map through a 3×3 convolutional layer, and the first feature map is respectively Input the first 1×1 convolutional layer and the second 1×1 convolutional layer to obtain the second feature map and the third feature map of different dimensions, and the second feature map is anchored through the softmax layer. ), the third feature map and the second feature map after passing through the softmax layer are classified through the fully connected layer, and locked through bbox regression, and finally at least one feature map is output The candidate area frame.

S402: Perform equal sampling processing on the feature vector map and all the candidate region frames through the region of interest pool layer in the region extraction model to obtain a region feature vector map.

Understandably, the region of interest pooling layer is also called ROI pooling, and the region of interest pooling layer maps all the candidate region boxes to the target in the feature vector diagram corresponding to the target in the candidate region box The position of, that is, the region position that is the same as the vector value of the candidate region frame is queried from the feature vector map, and the region corresponding to the candidate region frame in the mapped feature vector map is preset The fixed-size pooling process can pool the regions corresponding to each of the candidate region frames to obtain the region feature vector map of the same size. In this way, it is possible to obtain all the regions of the same size corresponding to each candidate region frame. The area feature vector diagram.

This application realizes the area extraction of the feature vector graph through the area extraction network layer to obtain the candidate area frame; the area of interest pool layer is used to perform equal sampling processing on the feature vector graph and all the candidate area frames to obtain the area The feature vector map, in this way, realizes that the region extraction network layer and the region of interest pool layer can automatically identify interesting or useful regions from the feature vector map, and convert them into the same size that is convenient for subsequent feature extraction The regional feature vector map improves the recognition efficiency and accuracy, and avoids the interference of uninteresting or useless regions on feature extraction.

In one embodiment, as shown in FIG. 5, in the step S40, the local feature extraction process and the two-class recognition are performed on the regional feature vector map corresponding to the image sample through the local feature model to obtain The local domain classification result, and the local feature alignment loss value obtained according to the local domain classification result and the domain label corresponding to the image sample, including:

S403: Perform local feature extraction on the regional feature vector map by using a feature extractor in the local feature model to obtain a local feature vector map.

Understandably, the local feature model includes a feature extractor, a domain classifier, a gradient reversal layer, and a domain difference measurer. The feature extractor performs the local feature extraction for each of the regional feature vector graphs, and the extraction method It can be set according to requirements, such as SIFT (Scale Invariant Feature Transform) method, SURF (Speeded Up Robust Features) method, Harris Corner method, and LBP (Local Binary Pattern) method. Preferably, the local feature The extraction method is the LBP method. Because the LBP method has the characteristics of rotation invariance and gray level invariance, it can extract the local features more effectively, and after processing by the feature extractor, multiple local feature vector images are obtained.

S404: Perform two-class recognition on the local feature vector graph by using a domain classifier in the local feature model to obtain the local domain classification result.

Understandably, the goal of the domain classifier is to maximize the loss of the domain classifier, confuse the domain label recognition results of the target domain image sample and the source domain image sample, and let the local feature vector map corresponding to the source domain image sample The two-class recognition output of the domain classifier is a local target domain label result, so that the local features can be aligned with each other.

S405: Perform reverse alignment on the local domain classification result through the gradient reversal layer in the local feature model to obtain a reverse domain label.

Understandably, the gradient reversal layer is also referred to as a GRL (Gradient Reversal Layer) layer. The reverse alignment means that the gradient direction is automatically reversed during the backward propagation process, and no processing is performed during the forward propagation process. Before performing back propagation to calculate the local feature alignment loss value, the local domain classification result is automatically inverted through the gradient reversal layer to obtain the reverse domain label opposite to the local domain classification result.

S406: Perform a difference comparison between the reverse domain label and the domain label corresponding to the regional feature vector graph by the domain difference metric in the local feature model to obtain the local feature alignment loss value.

Understandably, the domain difference metric includes the local feature alignment loss function, and the difference comparison is a loss value obtained through calculation of the local feature alignment loss function, and the reverse domain label and the The domain label corresponding to the regional feature vector graph is input into the local feature alignment loss function to obtain the local feature alignment loss value, and the local feature alignment loss value is:

Wherein, n is the total number of the image samples in the image sample set; m is the total number of the regional feature vector maps corresponding to the same image sample; D _i is the first image sample in the image sample set The domain labels associated with the i image samples (for example, D _i =0 indicates the source domain label, and D _i =1 indicates the target domain label); p _i,j is the pair corresponding to the same i-th image sample The reverse domain label corresponding to the j-th regional feature vector graph.

The present application realizes the local feature extraction of the regional feature vector graph by the feature extractor in the local feature model to obtain the local feature vector graph; the two-class recognition of the local feature vector graph by the domain classifier , Obtain the local domain classification result; perform inverse alignment on the local domain classification result through the gradient reversal layer to obtain a reverse domain label; align the loss function through the local feature in the domain difference measurer Calculate the loss of the reverse domain label and the domain label corresponding to the regional feature vector graph to obtain the local feature alignment loss value. In this way, the local feature of the source domain image sample and the target domain image sample are automatically aligned The said local features can effectively extract useful local features to identify the source domain image samples and target domain image samples, and reflect the local features of the source domain image samples and the target domain image samples through the local feature alignment loss value The gap between the features, the local feature alignment loss value is continuously reduced in the process of iterating the initial parameters, which can improve the training efficiency of the model and improve the recognition accuracy and reliability.

In an embodiment, as shown in FIG. 6, in the step S40, the feature vector graph is regularized and the global feature recognition process is performed on the feature vector graph through the global feature model to obtain the feature regularization loss value and the global domain. The classification result, and according to the global domain classification result and the domain label corresponding to the feature vector graph, the global feature alignment loss value is obtained, including:

S407: Perform regularization processing on the feature vector graph through the feature regular model in the global feature model to obtain a global regular feature map, and at the same time calculate the feature regular loss value through the regular loss function in the feature regular model .

Understandably, the feature regular model performs regularization processing on each of the feature vector graphs, and the regularization processing is to square and sum the feature vectors corresponding to each pixel in the feature vector graph, and then open the whole. The processing operation of the square finally obtains the global regular feature map one-to-one corresponding to the feature vector map. In this way, the feature vector corresponding to each pixel in each feature vector map can be made small, close to zero. , Reduce the difference between pixels, prevent over-fitting, and calculate the feature regular loss value through the regular loss function, introduce the feature regular loss value, and improve the robustness and accuracy of the domain adaptive model, The characteristic regular loss value is:

Wherein, n is the total number of the image samples in the image sample set; E _i is the global canonical feature map corresponding to the i-th image sample (0≤i≤n); R is a preset Distance constant.

S408: Perform global feature extraction processing and classification recognition on the global regular feature map through the global feature model to obtain the global domain classification result.

Understandably, the global feature extraction process is to perform histogram feature extraction on the feature vector corresponding to each pixel in the global regular feature map, and perform classification recognition based on the extracted global feature, and the classification recognition is a two-category Method recognition, that is, the recognition result has only two classification results, the global domain classification result includes the global source domain label result and the global target domain label result, and the global domain classification result also includes the probability value corresponding to the global domain label result. And the probability value corresponding to the global target domain label result.

S409. Input the global domain classification result and the domain label corresponding to the feature vector graph into a global loss model, and calculate the global domain classification result and the domain corresponding to the feature vector graph through the global loss model The difference between the labels is used to obtain the global feature alignment loss value.

Understandably, the global loss model includes a global feature alignment loss function, and the global feature alignment loss function is calculated by inputting the global domain classification result and the domain label corresponding to the feature vector graph into the global feature alignment loss function. The feature alignment loss value, by continuously reducing the global feature alignment loss value, so that the gap between the global feature of the source domain image sample and the global feature of the target domain image sample can improve the training efficiency of the model, the global feature alignment loss value is :

Wherein, n is the total number of sample images of the image sample set; D _i is set in the sample image associated with the i-th sample image domain label (for example: D _i = 0 denotes the source domain tag , D _i = 1 represents the target domain label); p _i is the global domain classification result corresponding to the feature vector graph corresponding to the i-th image sample.

In this way, this application realizes that by performing regularization processing, global feature extraction processing, and classification recognition on the feature vector graph, the global domain classification result corresponding to the feature vector graph is obtained, and the feature regular loss value is obtained The loss value is aligned with the global feature. Therefore, the introduction of the feature regular loss value and the global feature alignment loss value can improve the robustness and accuracy of the domain adaptive model, and improve the training efficiency of the model.

S50: Perform boundary regression and source domain classification and recognition on the region feature vector map corresponding to the source domain image sample through the detection model to obtain a recognition result, and according to the recognition result and the source domain image sample corresponding The category label obtains the detection loss value; and the total loss value is obtained according to the global feature alignment loss value, the detection loss value, the local feature alignment loss value and the feature regular loss value.

Understandably, the detection model only performs boundary regression and source domain classification and recognition on the area feature vector map corresponding to the source domain image sample, and the boundary regression is to locate the area feature corresponding to the source domain image sample The target image area in the vector graph, the target image area is the area that needs to be image recognized, that is, the target image area can reflect the category characteristics of the source domain image sample, and the source domain classification is recognized as The image feature is extracted from the target image area in the area feature vector diagram corresponding to the source domain image sample, and the image feature also includes the image feature related to the category of the source domain image sample, and the process is performed based on the extracted image feature Predictive recognition, to identify the category of the source domain image sample, and the method of extracting image features related to the source domain image sample category can be set according to requirements, preferably the extraction method of the neural network model of VGG16, so as to obtain the recognition As a result, the recognition result characterizes the category contained in the source domain image sample, and the loss value between the recognition result and the category label corresponding to the source domain image sample is calculated through the cross-entropy algorithm, that is, the loss value between the recognition result and the category label corresponding to the source domain image sample is calculated. The detection loss value.

Wherein, the global feature alignment loss value, the detection loss value, the local feature alignment loss value, and the feature regular loss value are input into a total loss function, and the total loss is calculated by the total loss function Value; the median loss value is:

L=λ ₁ L _global +λ ₂ L _local +λ ₃ L _detection +λ ₄ L _norm

Where λ ₁ is the weight of the global feature alignment loss value; L _global is the global feature alignment loss value; λ ₂ is the weight of the local feature alignment loss value; _{L local is the local feature alignment loss value;} λ ₃ is the weight of the detection loss value; L _detection is the detection loss value; λ ₄ is the weight _{of the feature regular loss value; L norm} is the feature regular loss value.

S60: When the total loss value does not reach a preset convergence condition, iteratively update the initial parameters of the domain adaptive model, until the total loss value reaches the preset convergence condition, converge all subsequent parameters. The domain adaptive model is recorded as the domain adaptive model after training.

Understandably, the convergence condition may be a condition that the value of the total loss value is small and will not drop after 50,000 calculations, that is, the value of the total loss value is small and will not decrease after 50,000 calculations. When it will no longer drop, stop training, and record the domain adaptive model after convergence as the completed domain adaptive model; the convergence condition can also be the condition that the total loss value is less than the set threshold, that is, When the total loss value is less than the set threshold, the training is stopped, and the domain adaptive model after convergence is recorded as the trained domain adaptive model.

In this way, when the total loss value does not reach the preset convergence condition, the initial parameters of the iterative domain adaptive model can be continuously updated, and the data between the source domain image samples and the target domain image samples can be gradually reduced. Distribution difference, and then realize the transfer of the knowledge of the source domain image sample to learn the knowledge of the target domain image sample, use the existing source domain image sample knowledge to learn the knowledge of the target domain image sample through the algorithm, that is, find the source domain image sample The similarity between the knowledge and the knowledge of the target domain image sample can realize the recognition of the target domain image sample based on the category of the source domain image sample, and make the recognition accuracy rate higher and higher.

In this way, this application realizes that by acquiring an image sample set containing multiple image samples; the image samples include source domain image samples and target domain image samples; inputting the image samples into a Faster-RCNN-based domain automaton containing initial parameters An adaptive model is used to perform image conversion on the image sample through the preprocessing model to obtain a preprocessed image sample; perform image feature extraction on the preprocessed image through the feature extraction model to obtain a feature vector map; and use the region extraction model to perform image feature extraction on the preprocessed image Perform region extraction and equalization sampling on the feature vector image to obtain a regional feature vector image; perform local feature extraction processing and binary classification recognition on the region feature vector image corresponding to the image sample through the local feature model to obtain local features Alignment loss value; At the same time, the feature vector graph is regularized and global feature recognition processing is performed through the global feature model to obtain the feature regularization loss value and the global feature alignment loss value; the detection model is used to compare with the source domain image The regional feature vector map corresponding to the sample performs boundary regression and source domain classification and identification to obtain a detection loss value; according to the global feature alignment loss value, the detection loss value, the local feature alignment loss value and the feature regular loss value , The total loss value is obtained; when the total loss value does not reach the preset convergence condition, the initial parameters are updated iteratively until convergence, and the trained domain adaptive model is obtained. Therefore, this application provides a domain adaptive model training The method is to obtain image samples of the source and target domains for training, without artificial labeling of the image samples of the target domain, and adaptively adapt to the distribution differences of the image data of different domain sources through global feature alignment and local feature alignment. The training efficiency of the domain adaptive model is improved, and the feature regular loss value is introduced, which improves the robustness and accuracy of the domain adaptive model. In this way, it is possible to train through image samples in different domains and align them according to the global features. The total loss value of loss value, detection loss value, local feature alignment loss value and feature regular loss value, converges the domain adaptive model based on Faster-RCNN, realizes cross-domain image recognition, and improves the accuracy of image recognition And reliability, saving labor costs.

The image detection method provided by this application can be applied in the application environment as shown in Fig. 1, in which the client (computer equipment) communicates with the server through the network. Among them, the client (computer equipment) includes, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, cameras, and portable wearable devices. The server can be implemented as an independent server or a server cluster composed of multiple servers.

In an embodiment, as shown in FIG. 7, an image detection method is provided, and the technical solution mainly includes the following steps S100-S200:

S100. An image detection instruction is received, and an image of a target area to be detected is acquired.

Understandably, the image of the target area to be detected is acquired on the same device as the image sample of the target area, and the image of the target area to be detected is the same device that needs to be identified and is the same as the device that collects the target area image sample The image detection instruction is triggered when the image of the target area to be detected is recognized. The trigger mode of the image detection instruction can be set according to requirements, for example, automatically after the image of the target area to be detected is collected Trigger, or by clicking the OK button after collecting the image of the target area to be detected. The method of acquiring the image of the target area to be detected can also be set according to requirements. For example, it can be triggered by The path of storing the image of the target area to be detected for acquisition may also be obtained in the image detection instruction of the image of the target area to be detected, and so on.

S200. Input the image of the target domain to be detected into the image detection model trained as described in the above-mentioned domain adaptive model training method, and extract the image features in the image of the target domain to be detected through the image detection model to obtain the image detection The model outputs the source domain category result according to the image feature.

Understandably, it is only necessary to input the image of the target area to be detected into the trained image detection model, and extract the image feature through the image detection model. The image feature includes the image feature being the color in the image. Features, texture features, shape features, and spatial relationship features, as well as image features related to the category of source domain image samples. The image detection model is trained and completed by the above-mentioned domain adaptive model training method. According to the extracted The image feature outputs the source domain category result, the category of the source domain category result is the same as the full set of category labels, and the source domain category result represents the category of the target domain image to be detected.

In this way, this application realizes that by acquiring the image of the target domain to be detected; inputting the image of the target domain to be detected into the image detection model trained as the above-mentioned domain adaptive model training method, and extracting the target to be detected through the image detection model The image feature in the domain image is obtained, and the source domain category result output by the image detection model according to the image feature is obtained. Therefore, this application automatically recognizes the category of the target domain image to be detected through the domain adaptive model, realizing cross-device or Cross-domain image detection improves the accuracy and reliability of cross-domain recognition results and saves costs.

In one embodiment, a domain adaptive model training device is provided, and the domain adaptive model training device corresponds to the domain adaptive model training method in the above-mentioned embodiment in a one-to-one correspondence. As shown in FIG. 8, the domain adaptive model training device includes an acquisition module 11, an input module 12, an extraction module 13, a first loss module 14, a second loss module 15 and a training module 16. The detailed description of each functional module is as follows:

The obtaining module 11 is used to obtain an image sample set; the image sample set includes a plurality of image samples; the image sample includes a source domain image sample and a target domain image sample; one source domain image sample, one category label, and one Domain label association; one of the target domain image samples is associated with one domain label;

The input module 12 is configured to input the image sample into a Faster-RCNN-based domain adaptation model containing initial parameters, and perform image conversion on the image sample through the preprocessing model to obtain a preprocessed image sample; the domain adaptation The model includes the preprocessing model, feature extraction model, region extraction model, detection model, global feature model, and local feature model;

The extraction module 13 is configured to perform image feature extraction on the preprocessed image through the feature extraction model to obtain a feature vector image;

The first loss module 14 is configured to perform region extraction and equal sampling of the feature vector map through the region extraction model to obtain a region feature vector map; and use the local feature model to perform regional feature vectors corresponding to the image samples The map performs local feature extraction processing and binary classification recognition to obtain a local domain classification result, and according to the local domain classification result and the domain label corresponding to the image sample, the local feature alignment loss value is obtained; at the same time, the global feature model is passed Perform regularization and global feature recognition processing on the feature vector graph to obtain a feature regular loss value and a global domain classification result, and obtain a global feature alignment according to the global domain classification result and the domain label corresponding to the feature vector graph Loss value

The second loss module 15 is configured to perform boundary regression and source domain classification and recognition on the region feature vector map corresponding to the source domain image sample through the detection model, to obtain the recognition result, and to obtain the recognition result according to the recognition result and the comparison with the The category label corresponding to the source domain image sample obtains the detection loss value; according to the global feature alignment loss value, the detection loss value, the local feature alignment loss value, and the feature regular loss value, a total loss value is obtained;

The training module 16 is configured to iteratively update the initial parameters of the domain adaptive model when the total loss value does not reach the preset convergence condition, until the total loss value reaches the preset convergence condition, The domain adaptive model after convergence is recorded as a trained domain adaptive model.

In an embodiment, the input module 12 includes:

The matching sub-module is configured to perform size matching on the image sample through the preprocessing model according to preset size parameters to obtain a matching image sample;

The conversion sub-module is configured to perform denoising and image enhancement processing on the matched image samples through the preprocessing model according to the gamma transformation algorithm to obtain the preprocessed image samples.

In an embodiment, the first loss module 14 includes:

An extraction sub-module, configured to perform region extraction on the feature vector graph through the region extraction network layer in the region extraction model to obtain at least one candidate region frame;

The pooling sub-module is used to perform balanced sampling processing on the feature vector map and all the candidate region frames through the region of interest pool layer in the region extraction model to obtain a region feature vector map.

In an embodiment, the first loss module 14 further includes:

The local extraction sub-module is used to perform local feature extraction on the regional feature vector map by the feature extractor in the local feature model to obtain a local feature vector map;

The local classification sub-module is used to perform two-class recognition on the local feature vector graph through the domain classifier in the local feature model to obtain the local domain classification result;

The local inversion sub-module is used to reverse and align the local domain classification results through the gradient inversion layer in the local feature model to obtain a reverse domain label;

The local loss sub-module is used to compare the difference between the reverse domain label and the domain label corresponding to the regional feature vector graph by the domain difference measurer in the local feature model to obtain the local feature alignment loss value.

In an embodiment, the first loss module 14 further includes:

The global regularization sub-module is used to regularize the feature vector graph through the feature regular model in the global feature model to obtain a global regular feature map, and at the same time, calculate all the features through the regular loss function in the feature regular model. The regular loss value of the characteristic;

A global classification sub-module, configured to perform global feature extraction processing and classification recognition on the global regular feature map through the global feature model to obtain the global domain classification result;

The global loss sub-module is used to input the global domain classification result and the domain label corresponding to the feature vector graph into a global loss model, and calculate the global domain classification result and the feature by the global loss model The difference between the domain labels corresponding to the vector graph is used to obtain the global feature alignment loss value.

For the specific definition of the domain adaptive model training device, please refer to the above definition of the domain adaptive model training method, which will not be repeated here. Each module in the above-mentioned domain adaptive model training device can be implemented in whole or in part by software, hardware, and a combination thereof. The above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.

In an embodiment, an image detection device is provided, and the image detection device corresponds to the image detection method in the above-mentioned embodiment one-to-one. As shown in FIG. 9, the image detection device includes a receiving module 101 and a detection module 102. The detailed description of each functional module is as follows:

The receiving module 101 is configured to receive an image detection instruction and obtain an image of a target area to be detected;

The detection module 102 is configured to input the image of the target domain to be detected into the image detection model trained by the domain adaptive model training method according to any one of claims 1 to 5, and extract the image to be detected from the image detection model The image feature in the target domain image is obtained, and the source domain classification result output by the image detection model according to the image feature is obtained.

For the specific definition of the image detection device, please refer to the above definition of the image detection method, which will not be repeated here. Each module in the above-mentioned image detection device can be implemented in whole or in part by software, hardware, and a combination thereof. The above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.

In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure diagram may be as shown in FIG. 10. The computer equipment includes a processor, a memory, a network interface, and a database connected through a system bus. Among them, the processor of the computer device is used to provide calculation and control capabilities. The memory of the computer device includes a readable storage medium and an internal memory. The readable storage medium stores an operating system, computer readable instructions, and a database. The internal memory provides an environment for the operation of the operating system and computer readable instructions in the readable storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer-readable instructions are executed by the processor to implement a domain adaptive model training method or image detection method. The readable storage medium provided in this embodiment includes a non-volatile readable storage medium and a volatile readable storage medium.

In one embodiment, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and capable of running on the processor. When the processor executes the computer-readable instructions, the domains in the foregoing embodiments are implemented. The adaptive model training method, or the processor executes the computer program to implement the image detection method in the above embodiment.

In one embodiment, one or more readable storage media storing computer readable instructions are provided. The readable storage media provided in this embodiment include non-volatile readable storage media and volatile readable storage. Medium; the readable storage medium stores computer readable instructions, and when the computer readable instructions are executed by one or more processors, the one or more processors implement the image detection method in the above-mentioned embodiments.

A person of ordinary skill in the art can understand that all or part of the processes in the above-mentioned embodiment methods can be implemented by instructing relevant hardware through computer-readable instructions. The computer-readable instructions can be stored in a non-volatile computer. In a readable storage medium or a volatile readable storage medium, when the computer readable instruction is executed, it may include the processes of the above-mentioned method embodiments. Wherein, any reference to memory, storage, database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. As an illustration and not a limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous chain Channel (Synchlink) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), etc.

Those skilled in the art can clearly understand that, for the convenience and conciseness of description, only the division of the above functional units and modules is used as an example. In practical applications, the above functions can be allocated to different functional units and modules as needed. Module completion, that is, the internal structure of the device is divided into different functional units or modules to complete all or part of the functions described above.

The above-mentioned embodiments are only used to illustrate the technical solutions of the present application, not to limit them; although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that it can still implement the foregoing The technical solutions recorded in the examples are modified, or some of the technical features are equivalently replaced; these modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the application, and should be included in Within the scope of protection of this application.

Claims

A domain adaptive model training method, which includes:

Obtain an image sample set; the image sample set includes a plurality of image samples; the image sample includes a source domain image sample and a target domain image sample; one source domain image sample is associated with a category label and a domain label; The target domain image sample is associated with a domain label;

The image sample is input to a Faster-RCNN-based domain adaptive model containing initial parameters, and the image sample is image converted through the preprocessing model to obtain a preprocessed image sample; the domain adaptive model includes the preprocessing Model, feature extraction model, area extraction model, detection model, global feature model and local feature model;

Performing image feature extraction on the preprocessed image by using the feature extraction model to obtain a feature vector diagram;

Perform region extraction and equal sampling on the feature vector map through the region extraction model to obtain a region feature vector map; use the local feature model to perform local feature extraction processing and two steps on the region feature vector map corresponding to the image sample Classification and recognition, obtain the local domain classification result, and obtain the local feature alignment loss value according to the local domain classification result and the domain label corresponding to the image sample; at the same time, the feature vector graph is regularized by the global feature model And global feature recognition processing to obtain a feature regular loss value and a global domain classification result, and obtain a global feature alignment loss value according to the global domain classification result and the domain label corresponding to the feature vector graph;

Perform boundary regression and source domain classification and recognition on the region feature vector map corresponding to the source domain image sample through the detection model to obtain a recognition result, and according to the recognition result and the category label corresponding to the source domain image sample Obtain a detection loss value; obtain a total loss value according to the global feature alignment loss value, the detection loss value, the local feature alignment loss value, and the feature regular loss value;

When the total loss value does not reach the preset convergence condition, iteratively update the initial parameters of the domain adaptive model, until the total loss value reaches the preset convergence condition, the domain after convergence The adaptive model is recorded as a domain adaptive model that has been trained.
The method for training a domain adaptive model according to claim 1, wherein said performing image conversion on said image samples through a preprocessing model to obtain preprocessed image samples comprises:

Performing size matching on the image sample through the preprocessing model according to preset size parameters to obtain a matching image sample;

According to the gamma transform algorithm, the matched image samples are processed for denoising and image enhancement through the preprocessing model to obtain the preprocessed image samples.
5. The domain adaptive model training method according to claim 1, wherein said performing region extraction and equal sampling on said feature vector map by said region extraction model to obtain a region feature vector map comprises:

Performing region extraction on the feature vector graph through the region extraction network layer in the region extraction model to obtain at least one candidate region frame;

Perform balanced sampling processing on the feature vector map and all the candidate region frames through the region of interest pool layer in the region extraction model to obtain a region feature vector map.
The domain adaptive model training method according to claim 1, wherein the local feature extraction process and the two-class recognition are performed on the regional feature vector map corresponding to the image sample through the local feature model to obtain the local domain classification As a result, and according to the local domain classification result and the domain label corresponding to the image sample, the local feature alignment loss value is obtained, including:

Performing local feature extraction on the regional feature vector map by a feature extractor in the local feature model to obtain a local feature vector map;

Performing two-class recognition on the local feature vector graph by using a domain classifier in the local feature model to obtain the local domain classification result;

Inverting and aligning the local domain classification result by using the gradient reversal layer in the local feature model to obtain a reverse domain label;

The difference comparison between the reverse domain label and the domain label corresponding to the regional feature vector graph by the domain difference metric in the local feature model is used to obtain the local feature alignment loss value.
3. The domain adaptive model training method according to claim 1, wherein said regularization and global feature recognition processing are performed on said feature vector graph through said global feature model to obtain a feature regularization loss value and a global domain classification result, And according to the global domain classification result and the domain label corresponding to the feature vector graph, the global feature alignment loss value is obtained, including:

Performing regularization processing on the feature vector graph through the feature regular model in the global feature model to obtain a global regular feature graph, and at the same time calculating the feature regular loss value through the regular loss function in the feature regular model;

Performing global feature extraction processing and classification recognition on the global regular feature map through the global feature model to obtain the global domain classification result;

The global domain classification result and the domain label corresponding to the feature vector graph are input into a global loss model, and the global domain classification result and the domain label corresponding to the feature vector graph are calculated through the global loss model. And obtain the global feature alignment loss value.
An image detection method, which includes:

The image detection instruction is received, and the image of the target area to be detected is obtained;

The image of the target domain to be detected is input into the image detection model trained by the domain adaptive model training method according to any one of claims 1 to 5, and the image in the image of the target domain to be detected is extracted through the image detection model Feature, acquiring a source domain category result output by the image detection model according to the image feature.
A domain adaptive model training device, which includes:

The acquisition module is used to acquire an image sample set; the image sample set includes a plurality of image samples; the image sample includes a source domain image sample and a target domain image sample; one source domain image sample, one category label, and one domain Label association; one image sample of the target domain is associated with one domain label;

The input module is used to input the image sample into a Faster-RCNN-based domain adaptive model containing initial parameters, and perform image conversion on the image sample through the preprocessing model to obtain a preprocessed image sample; the domain adaptive model Including the preprocessing model, feature extraction model, region extraction model, detection model, global feature model, and local feature model;

An extraction module, configured to perform image feature extraction on the preprocessed image through the feature extraction model to obtain a feature vector image;

The first loss module is used to perform region extraction and equal sampling of the feature vector map through the region extraction model to obtain a region feature vector map; use the local feature model to perform region feature vector maps corresponding to the image samples Perform local feature extraction processing and binary classification recognition to obtain a local domain classification result, and obtain a local feature alignment loss value according to the local domain classification result and the domain label corresponding to the image sample; at the same time, the global feature model is used to The feature vector graph performs regularization and global feature recognition processing to obtain a feature regular loss value and a global domain classification result, and according to the global domain classification result and the domain label corresponding to the feature vector graph, a global feature alignment loss is obtained value;

The second loss module is used to perform boundary regression and source domain classification and recognition on the region feature vector map corresponding to the source domain image sample through the detection model to obtain the recognition result, and according to the recognition result and the source domain The category label corresponding to the domain image sample obtains the detection loss value; according to the global feature alignment loss value, the detection loss value, the local feature alignment loss value, and the feature regular loss value, a total loss value is obtained;

The training module is configured to iteratively update the initial parameters of the domain adaptive model when the total loss value does not reach the preset convergence condition, until the total loss value reaches the preset convergence condition, it will converge The subsequent domain adaptive model is recorded as a trained domain adaptive model.
An image detection device, which includes:

The receiving module is used to receive the image detection instruction and obtain the image of the target area to be detected;

The detection module is configured to input the image of the target domain to be detected into the image detection model trained by the domain adaptive model training method according to any one of claims 1 to 5, and extract the target to be detected through the image detection model The image feature in the domain image is obtained, and the source domain classification result output by the image detection model according to the image feature is obtained.
A computer device includes a memory, a processor, and computer-readable instructions that are stored in the memory and can run on the processor, wherein the processor implements the following steps when the processor executes the computer-readable instructions:

Obtain an image sample set; the image sample set includes a plurality of image samples; the image sample includes a source domain image sample and a target domain image sample; one source domain image sample is associated with a category label and a domain label; The target domain image sample is associated with a domain label;

The image sample is input to a Faster-RCNN-based domain adaptive model containing initial parameters, and the image sample is image converted through the preprocessing model to obtain a preprocessed image sample; the domain adaptive model includes the preprocessing Model, feature extraction model, area extraction model, detection model, global feature model and local feature model;

Performing image feature extraction on the preprocessed image by using the feature extraction model to obtain a feature vector diagram;

Perform region extraction and equal sampling on the feature vector map through the region extraction model to obtain a region feature vector map; use the local feature model to perform local feature extraction processing and two steps on the region feature vector map corresponding to the image sample Classification and recognition, obtain the local domain classification result, and obtain the local feature alignment loss value according to the local domain classification result and the domain label corresponding to the image sample; at the same time, the feature vector graph is regularized by the global feature model And global feature recognition processing to obtain a feature regular loss value and a global domain classification result, and obtain a global feature alignment loss value according to the global domain classification result and the domain label corresponding to the feature vector graph;

Perform boundary regression and source domain classification and recognition on the region feature vector map corresponding to the source domain image sample through the detection model to obtain a recognition result, and according to the recognition result and the category label corresponding to the source domain image sample Obtain a detection loss value; obtain a total loss value according to the global feature alignment loss value, the detection loss value, the local feature alignment loss value, and the feature regular loss value;

When the total loss value does not reach the preset convergence condition, iteratively update the initial parameters of the domain adaptive model, until the total loss value reaches the preset convergence condition, the domain after convergence The adaptive model is recorded as a domain adaptive model that has been trained.
9. The computer device according to claim 9, wherein said performing image conversion on said image sample through a preprocessing model to obtain a preprocessed image sample comprises:

Performing size matching on the image sample through the preprocessing model according to preset size parameters to obtain a matching image sample;

According to the gamma transform algorithm, the matched image samples are processed for denoising and image enhancement through the preprocessing model to obtain the preprocessed image samples.
9. The computer device according to claim 9, wherein said performing area extraction and equal sampling on said feature vector map through said area extraction model to obtain a regional feature vector map comprises:

Performing region extraction on the feature vector graph through the region extraction network layer in the region extraction model to obtain at least one candidate region frame;

Perform balanced sampling processing on the feature vector map and all the candidate region frames through the region of interest pool layer in the region extraction model to obtain a region feature vector map.
The computer device according to claim 9, wherein the local feature extraction process and the two-class recognition are performed on the regional feature vector map corresponding to the image sample through the local feature model to obtain a local domain classification result, and according to The local domain classification result and the domain label corresponding to the image sample to obtain a local feature alignment loss value includes:

Performing local feature extraction on the regional feature vector map by a feature extractor in the local feature model to obtain a local feature vector map;

Performing two-class recognition on the local feature vector graph by using a domain classifier in the local feature model to obtain the local domain classification result;

Performing inverse alignment on the local domain classification result through the gradient reversal layer in the local feature model to obtain a reverse domain label;

The difference comparison between the reverse domain label and the domain label corresponding to the regional feature vector graph by the domain difference metric in the local feature model is used to obtain the local feature alignment loss value.
The computer device according to claim 9, wherein the regularization and global feature recognition processing is performed on the feature vector graph through the global feature model to obtain the feature regularization loss value and the global domain classification result, and according to the The global domain classification result and the domain label corresponding to the feature vector graph are used to obtain the global feature alignment loss value, including:

Performing regularization processing on the feature vector graph through the feature regular model in the global feature model to obtain a global regular feature graph, and at the same time calculating the feature regular loss value through the regular loss function in the feature regular model;

Performing global feature extraction processing and classification recognition on the global regular feature map through the global feature model to obtain the global domain classification result;

The global domain classification result and the domain label corresponding to the feature vector graph are input into a global loss model, and the global domain classification result and the domain label corresponding to the feature vector graph are calculated through the global loss model. And obtain the global feature alignment loss value.
A computer device includes a memory, a processor, and computer readable instructions stored in the memory and capable of running on the processor, wherein the processor further implements the following steps when executing the computer readable instructions :

The image detection instruction is received, and the image of the target area to be detected is obtained;

The image of the target area to be detected is input into the image detection model trained by the domain adaptive model training method, the image features in the image of the target area to be detected are extracted through the image detection model, and the image detection model is obtained according to the image detection model. The source domain category result of the image feature output.
One or more readable storage media storing computer readable instructions, where when the computer readable instructions are executed by one or more processors, the one or more processors execute the following steps:

Obtain an image sample set; the image sample set includes a plurality of image samples; the image sample includes a source domain image sample and a target domain image sample; one source domain image sample is associated with a category label and a domain label; The target domain image sample is associated with a domain label;

The image sample is input to a Faster-RCNN-based domain adaptive model containing initial parameters, and the image sample is image converted through the preprocessing model to obtain a preprocessed image sample; the domain adaptive model includes the preprocessing Model, feature extraction model, area extraction model, detection model, global feature model and local feature model;

Performing image feature extraction on the preprocessed image by using the feature extraction model to obtain a feature vector diagram;

Perform region extraction and equal sampling on the feature vector map through the region extraction model to obtain a region feature vector map; use the local feature model to perform local feature extraction processing and two steps on the region feature vector map corresponding to the image sample Classification and recognition, obtain the local domain classification result, and obtain the local feature alignment loss value according to the local domain classification result and the domain label corresponding to the image sample; at the same time, the feature vector graph is regularized by the global feature model And global feature recognition processing to obtain a feature regular loss value and a global domain classification result, and obtain a global feature alignment loss value according to the global domain classification result and the domain label corresponding to the feature vector graph;

Perform boundary regression and source domain classification and recognition on the region feature vector map corresponding to the source domain image sample through the detection model to obtain a recognition result, and according to the recognition result and the category label corresponding to the source domain image sample Obtain a detection loss value; obtain a total loss value according to the global feature alignment loss value, the detection loss value, the local feature alignment loss value, and the feature regular loss value;

When the total loss value does not reach the preset convergence condition, iteratively update the initial parameters of the domain adaptive model, until the total loss value reaches the preset convergence condition, the domain after convergence The adaptive model is recorded as a domain adaptive model that has been trained.
15. The readable storage medium according to claim 15, wherein said performing image conversion on said image sample through a preprocessing model to obtain a preprocessed image sample comprises:

Performing size matching on the image sample through the preprocessing model according to preset size parameters to obtain a matching image sample;

According to the gamma transform algorithm, the matched image samples are processed for denoising and image enhancement through the preprocessing model to obtain the preprocessed image samples.
15. The readable storage medium according to claim 15, wherein said performing region extraction and equal sampling on said feature vector map by said region extraction model to obtain a region feature vector map comprises:

Performing region extraction on the feature vector graph through the region extraction network layer in the region extraction model to obtain at least one candidate region frame;

Perform balanced sampling processing on the feature vector map and all the candidate region frames through the region of interest pool layer in the region extraction model to obtain a region feature vector map.
15. The readable storage medium according to claim 15, wherein the local feature extraction process and the two-class recognition are performed on the regional feature vector map corresponding to the image sample through the local feature model to obtain a local domain classification result, And according to the local domain classification result and the domain label corresponding to the image sample, the local feature alignment loss value is obtained, including:

Performing local feature extraction on the regional feature vector map by a feature extractor in the local feature model to obtain a local feature vector map;

Performing two-class recognition on the local feature vector graph by using a domain classifier in the local feature model to obtain the local domain classification result;

Inverting and aligning the local domain classification result by using the gradient reversal layer in the local feature model to obtain a reverse domain label;

The difference comparison between the reverse domain label and the domain label corresponding to the regional feature vector graph by the domain difference metric in the local feature model is used to obtain the local feature alignment loss value.
The readable storage medium according to claim 15, wherein the regularization and global feature recognition processing is performed on the feature vector graph through the global feature model to obtain a feature regularization loss value and a global domain classification result, and according to The global domain classification result and the domain label corresponding to the feature vector graph to obtain the global feature alignment loss value includes:

Performing regularization processing on the feature vector graph through the feature regular model in the global feature model to obtain a global regular feature graph, and at the same time calculating the feature regular loss value through the regular loss function in the feature regular model;

Performing global feature extraction processing and classification recognition on the global regular feature map through the global feature model to obtain the global domain classification result;

The global domain classification result and the domain label corresponding to the feature vector graph are input into a global loss model, and the global domain classification result and the domain label corresponding to the feature vector graph are calculated through the global loss model. And obtain the global feature alignment loss value.
One or more readable storage media storing computer readable instructions, where when the computer readable instructions are executed by one or more processors, the one or more processors further execute the following steps:

The image detection instruction is received, and the image of the target area to be detected is obtained;

The image of the target domain to be detected is input into the image detection model trained by the domain adaptive model training method, the image features in the image of the target domain to be detected are extracted through the image detection model, and the image detection model is acquired according to the image detection model. The source domain category result of the image feature output.