CN111723823B - Underwater target detection method based on third party transfer learning - Google Patents


Publication number
CN111723823B
Authority
CN
China
Prior art keywords
matrix
target
representing
target domain
party
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010589644.4A
Other languages
Chinese (zh)
Other versions
CN111723823A (en)
Inventor
徐涛
周纪勇
马玉琨
蔡磊
吴韶华
Current Assignee
Henan Institute of Science and Technology
Original Assignee
Henan Institute of Science and Technology
Priority date
Filing date
Publication date
Application filed by Henan Institute of Science and Technology
Priority to CN202010589644.4A
Publication of CN111723823A
Application granted
Publication of CN111723823B
Legal status: Active

Classifications

    • G06V 10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; salient regional features
    • G06V 10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06F 18/2136 Feature extraction by transforming the feature space, based on sparsity criteria, e.g. with an overcomplete basis
    • G06N 3/045 Combinations of networks
    • G06V 10/513 Sparse representations
    • G06V 2201/07 Target detection


Abstract

The invention provides an underwater target detection method based on third party transfer learning, which comprises the following steps: first, an underwater image is acquired from a picture database to obtain an original target domain; second, the original target domain and a third party target domain are processed with the 2D-SPCCA to obtain a new target domain; then, the new target domain and the source domain are processed with an integrated transfer learning algorithm based on feature difference adaptation to obtain a mapping matrix; next, the YOLOv3 network model is improved and transferred to the underwater scene for training, yielding an underwater target detection model; finally, the underwater image is detected with the underwater target detection model and the detection result is output. According to the invention, combining the third party feature information with the original target feature information enlarges the original data and improves the accuracy of image recognition; introducing the feature-distribution-difference adaptation principle into multi-feature integrated transfer learning reduces the time of feature mapping while ensuring the accuracy of transfer learning.

Description

Underwater target detection method based on third party transfer learning
Technical Field
The invention relates to the technical field of image recognition, in particular to an underwater target detection method based on third party transfer learning.
Background
Seawater has a strong attenuation effect on light: in a deepwater environment about 70% of the incident light is absorbed by the water body, so the deepwater environment is dim. Large amounts of algae, plankton and impurities exist in the deepwater environment, and the underwater environment is harsh, complex, changeable and difficult to predict. These factors seriously affect underwater imaging and result in poor underwater image quality.
The complex and changeable underwater environment poses challenges to the quality of underwater imaging. For example, the contrast between object and background in an underwater image is low, the texture of the image object is unclear, the edge of the object easily blends into the background under weak light, a large amount of noise exists in the image, and the non-uniform underwater light field makes the image brightness uneven. The underwater image therefore suffers from blurred, occluded and hard-to-discriminate objects. These factors can leave image object information missing, resulting in low or failed recognition of the object. The missing information in underwater images may itself be important: detecting and accurately identifying targets in such images can help an autonomous underwater vehicle (AUV) achieve environmental awareness, scene planning, autonomous control and mission allocation. Therefore, recognizing such images is of great significance for marine exploration and AUV underwater operation.
Target recognition is an important research direction in robot vision and attracts wide attention. Current target recognition technology has achieved great results in artificial intelligence, for example in unmanned driving, face recognition and scene classification. Deepwater operations and ocean exploration by autonomous underwater vehicles (AUVs) are likewise inseparable from target recognition, so the ability to identify the target image more clearly is necessary for many fields. The traditional target recognition approach has a low recognition rate, and the constructed models lack generalization capability. Transfer learning has attracted attention as an important method in machine learning. Its mechanism is to apply knowledge and models from a mature field to a new field: a model suitable for the problem in the new field is solved by means of prior knowledge, thereby solving the new problem. Owing to this feature, transfer learning is widely used in robot vision, with great success at both the theoretical research level and the technical practice level. The principle of the feature-based transfer learning algorithm is to map the features of the source domain image and the target domain image into a shared domain under a certain mapping condition, increasing the shared data of the two, so that the target domain image is processed and recognized with the model of the source domain. However, a single transfer learning model is not ideal when processing complex images, and heterogeneous transfer has feature differences.
Disclosure of Invention
Aiming at the defects in the background art, the invention provides an underwater target detection method based on third party transfer learning, which solves the technical problems of poor generalization capability and low recognition rate of existing target recognition models.
The technical scheme of the invention is realized as follows:
an underwater target detection method based on third party transfer learning comprises the following steps:
S1, acquiring an underwater image from a picture database and randomly occluding a target area in the underwater image, the occluded area being smaller than 40% of the area of the underwater image, to obtain an underwater missing image, and extracting target features of the underwater missing image by using the SIFT feature extraction method to obtain an original target domain;
s2, acquiring a third party target domain containing the underwater image in the step S1 from an expert knowledge base;
s3, mapping the original target domain in the step S1 and the third party target domain in the step S2 into the shared space domain respectively to obtain a target domain matrix and a third party target domain matrix;
s4, performing association processing on data in the target domain matrix and the third party target domain matrix by using the 2D-SPCCA to obtain a new target domain;
S5, acquiring a large number of images from the COCO database, and extracting target features by using the SIFT feature extraction method to obtain a source domain;
S6, respectively processing the new target domain and the source domain by utilizing an integrated transfer learning algorithm based on feature difference adaptation to obtain a mapping matrix, wherein the mapping matrix comprises a target matrix and a source domain matrix;
S7, training the YOLOv3 network by utilizing data in the source domain matrix to obtain a YOLOv3 network model and network weights;
S8, removing the weights of the detection layer of the YOLOv3 network model to obtain an improved YOLOv3 network model;
s9, acquiring an underwater image from a picture database, inputting the underwater image into the improved YOLOv3 network model in the step S8, and obtaining detection layer weight, thereby obtaining an underwater target detection model;
s10, inputting the underwater image in the step S1 into the underwater target detection model in the step S9, and outputting the target and the region where the target is located.
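As a rough illustration of the occlusion step S1, the sketch below randomly masks one rectangular region covering less than 40% of the image area. This is a minimal NumPy sketch under stated assumptions (rectangular mask, zero fill, fixed RNG seed); it is not the patented procedure itself.

```python
import numpy as np

def occlude(image, max_area_ratio=0.4, rng=None):
    """Zero out one random rectangle covering less than max_area_ratio of
    the image, simulating the missing-information images of step S1."""
    if rng is None:
        rng = np.random.default_rng(0)
    h, w = image.shape[:2]
    max_area = max_area_ratio * h * w
    # pick a rectangle whose area stays strictly below the cap
    rh = int(rng.integers(1, h))
    rw = min(int(rng.integers(1, w)), max(1, int(max_area // rh)))
    y = int(rng.integers(0, h - rh + 1))
    x = int(rng.integers(0, w - rw + 1))
    out = image.copy()
    out[y:y + rh, x:x + rw] = 0
    return out
```

The masked fraction of an all-ones image can then be read directly from the mean pixel value.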
In the step S4, the method for obtaining the new target domain by performing association processing on the data in the target domain matrix and the third party target domain matrix by using the 2D-SPCCA comprises the following steps:
S4.1, for the target domain matrix X_To, constructing a target matrix M from k data in the target domain matrix, wherein the feature information in the target domain matrix M is real-valued, the dimension of the target domain matrix M is D_x × d_l, d_l denotes the feature dimension of the projected data, D_x denotes the feature space dimension of the target domain matrix, k = min(k_To, k_Tt), k_To denotes the number of features of the target domain matrix, k_Tt denotes the number of features of the third party target domain matrix, and the k-th column of M carries the feature information of the k-th target matrix;
S4.2, for the third party target domain matrix X_Tt, constructing a third party matrix N from k data in the third party target domain matrix, wherein the feature information in the third party target domain matrix N is real-valued, the dimension of the third party target domain matrix N is D_y × d_l, D_y denotes the feature space dimension of the third party target domain matrix, and the k-th column of N carries the feature information of the k-th third party matrix;
S4.3, constructing a correlation function between the target domain matrix M and the third party target domain matrix N by using the correlation analysis method:

ρ = (M^T X_To X_Tt^T N) / sqrt((M^T X_To X_To^T M)(N^T X_Tt X_Tt^T N)) (3),

wherein ρ denotes the correlation coefficient of the feature data, and M^T, N^T, X_To^T, X_Tt^T are the transposed matrices of M, N, X_To, X_Tt respectively;
S4.4, respectively constructing the sparse reconstruction matrix of the target domain matrix M and the sparse reconstruction matrix of the third party target domain matrix N:

X_Tt S_Tt X_Tt^T M = λ X_Tt X_Tt^T M (4),

X_To S_To X_To^T N = λ X_To X_To^T N (5),

wherein S_Tt denotes the sparse matrix of the third party target domain matrix, S_To denotes the sparse matrix of the target domain matrix, and λ denotes an eigenvalue;
S4.5, converting the correlation function between the target domain matrix M and the third party target domain matrix N into an objective function (6) with its constraint by utilizing the 2D-SPCCA, wherein matrix A is an identity matrix, the constraint involves the sparse matrix of the target domain matrix and the sparse matrix of the third party target domain matrix, X_Toi denotes the i-th eigenvector of the target matrix, X_Toj denotes the j-th eigenvector of the target matrix, X_Tti denotes the i-th eigenvector of the third party matrix, X_Ttj denotes the j-th eigenvector of the third party matrix, and i, j ∈ [1, k];
S4.6, carrying out mathematical operations on the objective function (6) to obtain formulas (7) and (8), wherein the i-th and j-th eigenvectors of the target weighting matrix and of the third party weighting matrix appear together with their transposed matrices, A_m denotes an identity matrix, and S_ToTt = S_To · S_Tt denotes the product of the target sparse matrix and the third party sparse matrix;
S4.7, respectively performing mathematical operations on formula (7) and formula (8) to obtain the matrices S_ToTo = S_To · S_To and S_TtTt = S_Tt · S_Tt;
S4.8, letting H_TtTo = D_TtTo − S_TtTo, H_TtTt = D_TtTt − S_TtTt and H_ToTo = D_ToTo − S_ToTo, and combining step S4.6 and step S4.7, the objective function in step S4.5 can be expressed as an objective optimization function with its constraint, wherein the matrices H_TtTo, H_TtTt and H_ToTo are all symmetric matrices;
S4.9, letting F_x, F_y, F_xy and F_yx denote the corresponding combined matrices, and converting the objective optimization function in step S4.8 by using the Lagrangian multiplier algorithm to obtain:

F_xy (F_y)^{-1} F_yx α = λ² F_x α (13),

F_yx (F_x)^{-1} F_xy β = λ² F_y β (14),
wherein α = [α_1, …, α_i'] are the eigenvectors corresponding to the i' eigenvalues and β = [β_1, …, β_i'] are the eigenvectors corresponding to the i' eigenvalues;
S4.10, letting matrix C = [α_1, …, α_i'] and matrix D = [β_1, …, β_i'], the new target domain is obtained as:

X_T = C X_To + D X_Tt (15).
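Assuming C and D have already been obtained from the eigenvectors of (13) and (14), the fusion in (15) is a pair of matrix products. A toy NumPy sketch follows; the shapes and the stand-in C and D are illustrative assumptions, not values computed by the patented algorithm.

```python
import numpy as np

# Illustrative shapes: k feature vectors of dimension d in each domain.
k, d = 5, 8
rng = np.random.default_rng(1)
X_To = rng.standard_normal((k, d))   # target domain matrix
X_Tt = rng.standard_normal((k, d))   # third party matrix

# C and D would come from the eigenvectors alpha_i, beta_i of (13)-(14);
# here they are stand-in projection matrices of matching shape.
C = np.eye(k)
D = 0.5 * np.eye(k)

X_T = C @ X_To + D @ X_Tt            # equation (15): fused new target domain
```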
In the step S6, the method for obtaining the mapping matrix by respectively processing the new target domain and the source domain by utilizing the integrated transfer learning algorithm based on feature difference adaptation comprises the following steps:
S6.1, constructing the objective function (16)-(18) of transfer learning based on feature difference adaptation, subject to

Z^T P B P^T Z = A (19),

wherein P = [X_S, X_T] is the sample matrix obtained by splicing the source domain X_S and the new target domain X_T, B denotes the center matrix, A denotes the identity matrix, γ denotes the distribution balance factor, Q_d denotes the MMD matrix, Q_k' denotes the MMD matrices adapted to the respective classes, X_h denotes the intra-class distance, a degradation prevention term is included, Z denotes the mapping matrix, and K denotes the number of feature vectors;
S6.2, converting the objective function of transfer learning based on feature difference adaptation into the objective optimization function (20) of transfer learning based on feature difference adaptation by using the Lagrangian function;
S6.3, setting the partial derivative of formula (20) to zero, the objective optimization function of transfer learning based on feature difference adaptation is converted into formula (21);
S6.4, solving the eigenvalues of formula (21), the eigenvectors corresponding to the eigenvalues form the mapping matrix Z, and the source domain X_S and the new target domain X_T are mapped with the mapping matrix Z to obtain the target matrix Z(T) and the source domain matrix Z(S).
The technical scheme has the beneficial effects that:
(1) According to the invention, correlation analysis is performed on the acquired third party feature information and the original target feature information to judge the correlation between them, and the correlated data are fused together to form a new target domain, so that the original data is enlarged and the accuracy of image recognition is improved;
(2) The invention introduces the feature-distribution-difference adaptation principle into multi-feature integrated transfer learning, which reduces the time of feature mapping, ensures the accuracy of transfer learning and improves the learning capacity of transfer learning.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the present invention;
FIG. 2 shows the original images: (a)-(c) show the complete images of a goldfish, a submarine and a frogman, and (d)-(f) show the images of the goldfish, submarine and frogman after occlusion;
FIG. 3 shows the result of detecting FIG. 2 by SSD algorithm;
FIG. 4 shows the detection result of FIG. 2 using the YOLOv3 algorithm;
FIG. 5 shows the results of the test of FIG. 2 using the method of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without any inventive effort, are intended to be within the scope of the invention.
The invention relates to an underwater target detection method based on third party transfer learning. First, third party information and target information are fused to form a new target domain: common sense knowledge from a knowledge base is used as the third party information, which enlarges the original sample data and solves the problem of low recognition accuracy caused by insufficient target feature information. Second, in view of the non-uniform feature distribution of underwater images, the method introduces the feature-distribution-difference adaptation principle into integrated transfer learning, which reduces the time consumed by feature mapping while ensuring the accuracy of transfer learning. The transfer learning proposed by the invention transfers a trained target detection model, as the source domain, to different scenes through a certain transfer mode to realize target detection.
The complex underwater environment causes a large amount of image target information to be missing, so that underwater targets are detected at a low rate, or not at all, due to insufficient feature information. The invention therefore provides a weighted two-dimensional sparse canonical correlation analysis (2D-WSPCCA). The image data obtained by feature extraction is set as the original target domain, and the projection of the third party feature information obtained from the knowledge base is set as the third party target domain. Then, in the feature space onto which the feature information of the two domains is projected, the correlation between the two data sets is judged through the 2D-WSPCCA, and the correlated data are fused together to form a new target domain, which solves the problem of insufficient target information.
The principle of the feature-based transfer learning algorithm is to map source domain data and target domain data into a shared domain under a certain mapping condition, so that the problem of the target domain is solved with a model of the source domain. However, a single transfer learning model is not ideal for processing complex images, and heterogeneous transfer has feature differences. The invention introduces the feature difference adaptation principle into multi-feature integrated transfer learning and proposes an adaptive multi-feature integrated transfer learning method (AMI-TL), which not only reduces the time required to solve for the feature mapping but also ensures the accuracy of transfer learning, thereby improving the learning capacity of transfer learning.
As shown in fig. 1, the embodiment of the invention provides an underwater target detection method based on third party transfer learning, which comprises the following specific steps:
S1, acquiring an underwater image from a picture database and randomly occluding a target area in the underwater image, the occluded area being smaller than 40% of the area of the underwater image, to obtain an underwater missing image, and extracting target features of the underwater missing image by using the SIFT feature extraction method to obtain an original target domain;
S2, acquiring a third party target domain containing the underwater image of step S1 from an expert knowledge base; the expert knowledge base contains basic feature information of objects. For example, the target area of the underwater image in FIG. 2(a) is a goldfish, and the basic feature information of the goldfish (its color, the shape of its mouth, etc.) is included in the expert knowledge base. The invention adopts common sense descriptions of object features retrieved from the expert knowledge base and ensures that the features of the object match the corresponding feature descriptions in the expert knowledge base; finally, the common sense knowledge retrieved from the expert knowledge base can be expressed as formula (1):
wherein the expression in formula (1) is the common sense description of the object a retrieved from the expert knowledge base, and k_1 denotes the number of feature descriptions. In order to encode the retrieved common sense knowledge, the invention converts the common sense description of the object a into a word sequence and maps each word of the sentence into a continuous vector space through the mapping relation shown in formula (2):
wherein w_e denotes the feature mapping parameters and the result of formula (2) is the feature vector corresponding to the i_1-th sentence feature description. Finally, a set of out-of-domain feature vectors about the target is obtained.
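The word-to-vector mapping of formula (2) can be sketched as follows. This is a deterministic toy: the function name `encode_description` and the hash-seeded embedding table are illustrative stand-ins for the learned mapping parameters w_e, not a trained embedding.

```python
import hashlib
import numpy as np

def encode_description(words, embed_dim=16):
    """Map each word of a common-sense feature description to a continuous
    vector (a stand-in for the mapping of formula (2)); the description
    vector is the mean of its word vectors."""
    def word_vec(w):
        # deterministic per-word seed, so the same word always maps
        # to the same vector (a toy replacement for learned embeddings)
        seed = int(hashlib.md5(w.encode()).hexdigest(), 16) % (2 ** 32)
        return np.random.default_rng(seed).standard_normal(embed_dim)
    return np.stack([word_vec(w) for w in words]).mean(axis=0)

vec = encode_description(["goldfish", "orange", "round", "mouth"])
```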
S3, mapping the original target domain in the step S1 and the third party target domain in the step S2 into the shared space domain respectively to obtain a target domain matrix and a third party target domain matrix;
S4, performing association processing on the data in the target domain matrix and the third party target domain matrix by using the 2D-SPCCA to obtain a new target domain. After the n objects are encoded with the feature semantic descriptions corresponding to the expert knowledge base, the vector space contains n × k_1 associated feature vectors. However, when the vectors in the storage space are sufficiently numerous, extracting useful information from the candidate knowledge becomes more difficult. The invention therefore proposes a feature-information-weighted two-dimensional sparse canonical correlation analysis (2D-WSPCCA) to judge the correlation between data and fuse the correlated features. The specific method comprises the following steps:
S4.1, for the target domain matrix X_To, constructing a target matrix M from k data in the target domain matrix, wherein the feature information in the target domain matrix M is real-valued, the dimension of the target domain matrix M is D_x × d_l, d_l denotes the feature dimension of the projected data, D_x denotes the feature space dimension of the target domain matrix, k = min(k_To, k_Tt), k_To denotes the number of features of the target domain matrix, k_Tt denotes the number of features of the third party target domain matrix, and the k-th column of M carries the feature information of the k-th target domain matrix.
S4.2, for the third party target domain matrix X_Tt, constructing a third party matrix N from k data in the third party target domain matrix, wherein the feature information in the third party target domain matrix N is real-valued, the dimension of the third party target domain matrix N is D_y × d_l, D_y denotes the feature space dimension of the third party target domain matrix, and the k-th column of N carries the feature information of the k-th third party target domain matrix;
S4.3, constructing a correlation function between the target domain matrix M and the third party target domain matrix N by using the correlation analysis method, judging the degree of correlation of the data in the two matrices through the correlation of the data, and fusing the correlated data:

ρ = (M^T X_To X_Tt^T N) / sqrt((M^T X_To X_To^T M)(N^T X_Tt X_Tt^T N)) (3),

wherein ρ denotes the correlation coefficient of the feature data, and M^T, N^T, X_To^T, X_Tt^T are the transposed matrices of M, N, X_To, X_Tt respectively;
S4.4, in order to improve the capability of CCA in high-dimensional data analysis, respectively constructing the sparse reconstruction matrices of the target domain matrix M and of the third party target domain matrix N:

X_Tt S_Tt X_Tt^T M = λ X_Tt X_Tt^T M (4),

X_To S_To X_To^T N = λ X_To X_To^T N (5),

wherein S_Tt denotes the sparse matrix of the third party target domain matrix, S_To denotes the sparse matrix of the target domain matrix, and λ denotes an eigenvalue; the sparse matrices S_To and S_Tt can be obtained by solving equations (4) and (5). Meanwhile, the original feature matrices X_To and X_Tt are weighted by utilizing the variance of the feature matrix, and the weighted source domain and target domain feature data are used in what follows.
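Equation (4) is a generalized eigenproblem; when the right-hand matrix is made invertible with a small ridge term, it can be solved as an ordinary eigenproblem. The NumPy sketch below is a toy under stated assumptions: random data stands in for X_Tt, an identity matrix stands in for the sparse matrix S_Tt, and the ridge term is added for numerical stability.

```python
import numpy as np

rng = np.random.default_rng(2)
d_x, k = 6, 4
X = rng.standard_normal((d_x, k))        # stand-in for X_Tt
S = np.eye(k)                            # stand-in for the sparse matrix S_Tt
rhs = X @ X.T + 1e-6 * np.eye(d_x)       # X X^T with a small ridge term
lhs = X @ S @ X.T                        # X S X^T

# (X S X^T) M = lambda (X X^T) M  ->  ordinary eigenproblem on rhs^{-1} lhs
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(rhs, lhs))
M = eigvecs[:, np.argsort(-eigvals.real)]    # eigenvectors, largest lambda first
```

Each eigenpair then satisfies lhs·v = λ·rhs·v up to numerical precision.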
S4.5, converting the correlation function (formula (3)) between the target domain matrix M and the third party target domain matrix N into an objective function (6) with its constraint by utilizing the 2D-SPCCA, wherein matrix A is an identity matrix, the constraint involves the sparse matrix of the target domain matrix and the sparse matrix of the third party target domain matrix, X_Toi denotes the i-th eigenvector of the target matrix, X_Toj denotes the j-th eigenvector of the target matrix, X_Tti denotes the i-th eigenvector of the third party matrix, X_Ttj denotes the j-th eigenvector of the third party matrix, and i, j ∈ [1, k];
S4.6, carrying out mathematical operations on the objective function (6) to obtain formulas (7) and (8), wherein the i-th and j-th eigenvectors of the target weighting matrix and of the third party weighting matrix appear together with their transposed matrices, A_m denotes an identity matrix, and S_ToTt = S_To · S_Tt denotes the product of the target sparse matrix and the third party sparse matrix;
S4.7, respectively performing mathematical operations on formula (7) and formula (8) to obtain the matrices S_ToTo = S_To · S_To and S_TtTt = S_Tt · S_Tt;
S4.8, letting H_TtTo = D_TtTo − S_TtTo, H_TtTt = D_TtTt − S_TtTt and H_ToTo = D_ToTo − S_ToTo, and combining step S4.6 and step S4.7, the objective function in step S4.5 can be expressed as an objective optimization function with its constraint, wherein the matrices H_TtTo, H_TtTt and H_ToTo are all symmetric matrices;
S4.9, letting F_x, F_y, F_xy and F_yx denote the corresponding combined matrices, and converting the objective optimization function in step S4.8 by using the Lagrangian multiplier algorithm to obtain:

F_xy (F_y)^{-1} F_yx α = λ² F_x α (13),

F_yx (F_x)^{-1} F_xy β = λ² F_y β (14),
wherein α = [α_1, …, α_i'] are the eigenvectors corresponding to the i' eigenvalues and β = [β_1, …, β_i'] are the eigenvectors corresponding to the i' eigenvalues; the eigenvalues can be obtained by solving equation (13) and equation (14).
S4.10, letting matrix C = [α_1, …, α_i'] and matrix D = [β_1, …, β_i'], the new target domain is obtained as:

X_T = C X_To + D X_Tt (15).
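The paired generalized eigenproblems (13) and (14) can be reduced to ordinary eigenproblems when F_x and F_y are invertible. The NumPy sketch below is a toy under stated assumptions: the symmetric positive-definite F_x and F_y and the coupling F_xy are random stand-ins, not matrices derived from real features.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
def rand_spd(g, size):
    m = g.standard_normal((size, size))
    return m @ m.T + size * np.eye(size)   # symmetric positive definite

F_x, F_y = rand_spd(rng, n), rand_spd(rng, n)
F_xy = rng.standard_normal((n, n))
F_yx = F_xy.T

# (13): F_xy F_y^{-1} F_yx alpha = lambda^2 F_x alpha
lam2_a, alphas = np.linalg.eig(np.linalg.solve(F_x, F_xy @ np.linalg.solve(F_y, F_yx)))
# (14): F_yx F_x^{-1} F_xy beta = lambda^2 F_y beta
lam2_b, betas = np.linalg.eig(np.linalg.solve(F_y, F_yx @ np.linalg.solve(F_x, F_xy)))

C = alphas[:, np.argsort(-lam2_a.real)].real   # columns alpha_1 ... alpha_i'
D = betas[:, np.argsort(-lam2_b.real)].real    # columns beta_1 ... beta_i'
```

The columns of C and D are then the matrices of step S4.10, to be used in the fusion of formula (15).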
S5, acquiring a large number of images from the COCO database, and extracting target features by using the SIFT feature extraction method to obtain a source domain;
S6, respectively processing the new target domain and the source domain by utilizing an integrated transfer learning algorithm based on feature difference adaptation to obtain a mapping matrix, wherein the mapping matrix comprises a target matrix and a source domain matrix;
This addresses the problems that the effect of a single transfer learning model is not ideal when processing complex images, and that heterogeneous transfer has feature differences. The invention provides an adaptive multi-feature integrated transfer learning method. The COCO data set is widely applied in the field of target detection, with 92 target classes, 82,783 training images, 40,504 validation images and 40,775 test images. The invention extracts a large number of pictures from the COCO data set, and the target features of the extracted images form a feature matrix, which is set as the source domain D_S:
wherein D_S denotes the source domain, each element of D_S is one sample of the source domain, k_2 indexes the k_2-th feature vector, X_S denotes the feature space of the source domain, Y_S denotes the label space of the source domain, and m denotes the number of feature vectors.
The ImageNet dataset is widely applied in the current deep learning field, mainly in research on image classification, localization and detection. The ImageNet dataset has about 14 million photos in as many as 20,000 categories; its training set contains 120,000 pictures, its validation set contains 50,000 pictures, and its detection set contains 100,000 pictures. The Underwater Target data set is a data set established mainly for underwater images and is of important significance for research in the field of underwater image processing; it includes torpedoes, submarines, frogmen, AUVs and the like. The invention extracts underwater pictures from the ImageNet dataset and the Underwater Target dataset, applies occlusion to the underwater images to simulate the underwater image-missing phenomenon, and obtains a new target domain D_T by the method of step S1:
wherein D_T represents the target domain, x represents one sample of the target domain, X_T represents the feature space of the target domain, and Y_T represents the label space of the target domain.
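The domain construction described above can be sketched as follows. This is a minimal illustration, not the patent's actual pipeline: the fixed-length random vectors stand in for SIFT features, and the array shapes, occlusion scheme and names are assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

def build_domain(feature_vectors, labels):
    """Stack per-image feature vectors into a feature matrix.

    Stands in for the SIFT-based extraction of steps S1/S5; each
    'feature vector' here is already a fixed-length array.
    """
    X = np.stack(feature_vectors)   # feature space: one row per sample
    Y = np.asarray(labels)          # label space
    return {"X": X, "Y": Y}

def occlude(x, max_ratio=0.4):
    """Zero out a random contiguous block of features to mimic the
    missing-information (occlusion) phenomenon of underwater images;
    the occluded area stays below 40% of the image, as in step S1."""
    n = x.size
    width = int(rng.integers(1, int(n * max_ratio)))
    start = int(rng.integers(0, n - width))
    x = x.copy()
    x[start:start + width] = 0.0
    return x

# Source domain D_S from 'COCO-like' features, new target domain D_T
# from occluded 'underwater' features (dimensions are illustrative).
src_feats = [rng.normal(size=128) for _ in range(10)]
D_S = build_domain(src_feats, labels=range(10))
tgt_feats = [occlude(rng.normal(size=128)) for _ in range(6)]
D_T = build_domain(tgt_feats, labels=range(6))
print(D_S["X"].shape, D_T["X"].shape)  # (10, 128) (6, 128)
```

The 40% bound mirrors the occlusion constraint of step S1; everything else (128-dimensional features, sample counts) is arbitrary.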
The construction method of the mapping matrix comprises the following steps:
S6.1, constructing the objective function of transfer learning based on feature-difference adaptation:
min_Z tr(Z^T (P((1−γ)Q_d + γ Σ_{k'=1}^K Q_{k'})P^T + μX_h + λI) Z) (18),
s.t. Z^T P B P^T Z = A (19),
wherein P = [X_S, X_T] is the sample matrix formed by splicing the source domain X_S and the new target domain X_T, B represents the centering matrix, A represents the identity matrix, γ represents the distribution balance factor, Q_d represents the marginal MMD matrix, Q_{k'} represents the MMD matrix adapted to the k'-th class, X_h represents the intra-class distance matrix, λI represents a degradation-prevention term that avoids a degenerate solution of formula (18), Z represents the mapping matrix, and K represents the number of classes;
S6.2, converting the objective function of transfer learning based on feature-difference adaptation into the target optimization function of transfer learning based on feature-difference adaptation by the Lagrangian function:
L = tr(Z^T (P((1−γ)Q_d + γ Σ_{k'=1}^K Q_{k'})P^T + μX_h + λI) Z) + tr((A − Z^T P B P^T Z)Φ) (20),
wherein Φ represents the Lagrange multiplier matrix;
S6.3, taking the partial derivative of formula (20) and setting ∂L/∂Z = 0, the target optimization function of transfer learning based on feature-difference adaptation is converted into:
(P((1−γ)Q_d + γ Σ_{k'=1}^K Q_{k'})P^T + μX_h + λI) Z = P B P^T Z Φ (21),
S6.4, solving the eigenvalues of formula (21); the eigenvectors corresponding to the eigenvalues form the mapping matrix Z, and the source domain X_S and the new target domain X_T are mapped by the mapping matrix Z to obtain the target matrix Z(T) and the source domain matrix Z(S).
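As a rough numerical illustration of steps S6.3-S6.4, the sketch below solves the generalized eigenproblem of formula (21) keeping only the marginal MMD matrix Q_d (the class-wise MMD terms and the intra-class distance are omitted for brevity); the dimensions, the regularization weight and the small ridge on the right-hand side are all assumptions for the demo, not values from the patent.

```python
import numpy as np

rng = np.random.default_rng(1)
d, ns, nt, dim_out = 16, 20, 15, 5
Xs = rng.normal(size=(d, ns))            # source features (one column per sample)
Xt = rng.normal(size=(d, nt))            # new target features
P = np.hstack([Xs, Xt])                  # spliced sample matrix P = [X_S, X_T]
n = ns + nt

# Marginal MMD matrix Q_d (standard TCA/JDA-style construction).
e = np.concatenate([np.full(ns, 1.0 / ns), np.full(nt, -1.0 / nt)])
Qd = np.outer(e, e)

B = np.eye(n) - np.ones((n, n)) / n      # centering matrix B
lam = 1e-1                               # degradation-prevention weight (assumed)

# (P Q_d P^T + lam I) Z = P B P^T Z Phi  ->  generalized eigenproblem.
left = P @ Qd @ P.T + lam * np.eye(d)
right = P @ B @ P.T + 1e-6 * np.eye(d)   # tiny ridge keeps 'right' invertible
vals, vecs = np.linalg.eig(np.linalg.solve(right, left))
order = np.argsort(vals.real)            # keep the smallest eigenvalues
Z = vecs[:, order[:dim_out]].real        # mapping matrix Z

Zs, Zt = Z.T @ Xs, Z.T @ Xt              # source matrix Z(S), target matrix Z(T)
print(Z.shape, Zs.shape, Zt.shape)
```

Minimizing the trace objective under the constraint of formula (19) corresponds to keeping the eigenvectors with the smallest eigenvalues, which is why the sort is ascending.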
S7, training the YOLOv3 network with the data in the source domain matrix to obtain a YOLOv3 network model and network weights. Among target detection models based on deep neural networks, most cannot balance operation speed and detection accuracy, whereas the YOLOv3 target detection model has clear advantages in both. YOLOv3 contains convolutional layers, residual layers, a YOLO layer and a detection layer, and can detect targets at three different scales, which greatly improves the detection rate. The invention therefore adopts YOLOv3 as the target detection model for transfer. The YOLOv3 target detection model is trained with the data of the mapped source domain matrix to obtain the network weights of the model.
S8, in order to make the YOLOv3 network model more suitable for underwater images, the model needs to be improved: the weights of the detection layer of the YOLOv3 network model are removed and the weights of the other layers are retained, yielding an improved YOLOv3 network model;
S9, acquiring an underwater image from a picture database and inputting it into the improved YOLOv3 network model of step S8 to obtain detection-layer weights, thereby obtaining an underwater target detection model. The improved YOLOv3 network model is transferred to the field of underwater target detection: a small number of pictures are acquired from the Underwater Target dataset to train the improved YOLOv3 network model, new detection-layer weights are generated, and an underwater target detection model suitable for underwater image detection is finally obtained.
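Steps S8-S9 amount to dropping the detection-head weights of a pretrained network and relearning them on underwater data. A toy sketch with a plain dictionary standing in for a trained state dict follows; the layer names (e.g. "detect.weight") are hypothetical, and real YOLOv3 implementations name and shape their layers differently.

```python
import numpy as np

rng = np.random.default_rng(2)

# A toy stand-in for a trained YOLOv3 state dict: backbone, residual and
# YOLO layers plus a detection head (names and shapes are illustrative).
pretrained = {
    "conv1.weight": rng.normal(size=(32, 3, 3, 3)),
    "residual1.weight": rng.normal(size=(64, 32, 3, 3)),
    "yolo1.weight": rng.normal(size=(128, 64, 1, 1)),
    "detect.weight": rng.normal(size=(255, 128, 1, 1)),
}

def strip_detection_head(state, head_prefix="detect"):
    """Step S8: drop the detection-layer weights, keep all other layers."""
    return {k: v for k, v in state.items() if not k.startswith(head_prefix)}

improved = strip_detection_head(pretrained)

# Step S9: retraining on a few underwater images would regenerate the
# detection-layer weights; here we just attach freshly initialised ones.
improved["detect.weight"] = rng.normal(size=(255, 128, 1, 1)) * 0.01
print(sorted(improved))
```

The retained layers transfer the source-domain knowledge, while only the re-initialised head is learned from the small underwater dataset, which is what makes the few-shot step S9 feasible.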
S10, inputting the underwater image in the step S1 into the underwater target detection model in the step S9, and outputting the target and the region where the target is located.
Simulation experiment and result analysis
In order to verify the recognition capability of the invention on underwater pictures with severely missing information, the SSD target detection algorithm and the YOLOv3 target detection algorithm are compared with the proposed method. Goldfish, submarines and frogmen are selected from the COCO dataset and the Underwater Target dataset as the experimental recognition targets. Goldfish live in shallow water, with simple background information and sufficient light; submarines and frogmen are in deep-water environments, with complex background information, dim conditions and susceptibility to the environment, as shown in figs. 2-5. Comparing pictures with simple backgrounds against pictures with complex backgrounds clearly shows the superiority of the method for underwater target detection. The invention occludes the target information of the images to simulate the missing-target-information phenomenon of underwater images; comparing the recognition of targets with complete information against targets with missing information shows the advantage of the method more intuitively. Figs. 2(a)-(c) are the complete images of the goldfish, submarine and frogman, and figs. 2(d)-(f) are the images of the goldfish, submarine and frogman after occlusion; figs. 3(a)-(f) show the detection results of figs. 2(a)-(f) using the SSD algorithm; figs. 4(a)-(f) show the detection results of figs. 2(a)-(f) using the YOLOv3 algorithm; figs. 5(a)-(f) show the detection results of figs. 2(a)-(f) using the method of the invention. The three methods are trained on the ImageNet dataset and the Underwater Target dataset under a Python 3.7 operating environment, and the target detection recognition rates are shown in Table 1.
TABLE 1 detection rate of each target detection model
For fig. 2(a) and fig. 2(d), the image background is simple and the target is obvious and easy to detect, so the recognition rates of all three methods exceed 90%. However, SSD and YOLOv3 show a significant disadvantage when facing target detection in deep-water environments. Figs. 3(b)-(c), 4(b)-(c) and 5(b)-(c) are all target detection on complete underwater images; from the detection rates in Table 1, the detection rate of the detection model proposed by the invention is significantly better than that of the other two. Figs. 3(e)-(f), 4(e)-(f) and 5(e)-(f) are all target detection on information-deficient submarines and frogmen. From the detection results in figs. 3, 4 and 5, the detection model proposed by the invention can still detect targets with missing information, with a detection rate significantly higher than that of the other two models, overcoming the problems of low detection rate or failure to detect caused by insufficient information in SSD and YOLOv3. In conclusion, the experimental results verify the effectiveness of the method; compared with existing algorithms, the target detection algorithm proposed by the invention is more suitable for detection tasks under missing underwater target information.
The above description is only of the preferred embodiments of the present invention, and is not intended to limit the invention, but any modifications, equivalent substitutions, improvements, etc. within the spirit and scope of the present invention should be included in the scope of the present invention.

Claims (1)

1. The underwater target detection method based on third party transfer learning is characterized by comprising the following steps:
S1, acquiring an underwater image from a picture database and randomly occluding a target area in the underwater image, wherein the occluded area is smaller than 40% of the area of the underwater image, to obtain an underwater missing image; extracting target features of the underwater missing image by the SIFT feature extraction method to obtain an original target domain;
S2, acquiring a third-party target domain containing the underwater image of step S1 from an expert knowledge base; the expert knowledge base contains basic feature information of objects; common-sense descriptions of the target's features are retrieved from the expert knowledge base, and the target's features are ensured to match the corresponding feature descriptions in the expert knowledge base; finally, the common-sense knowledge retrieved from the expert knowledge base can be expressed as:
T_a = {t_1, t_2, ..., t_{k_1}} (1),
wherein T_a is the common-sense description of the object a retrieved from the expert knowledge base, and k_1 represents the number of feature descriptions; each common-sense description t_{i_1} of the object a is converted into a word sequence s_{i_1}, and each word in the sentence is mapped into a continuous vector space by the mapping relation shown in formula (2):
v_{i_1} = w_e(s_{i_1}) (2),
wherein w_e represents the feature mapping parameters, and v_{i_1} represents the feature vector corresponding to the i_1-th sentence feature description;
finally, a group of third-party feature vectors about the target, V_a = {v_1, v_2, ..., v_{k_1}}, is obtained.
S3, mapping the original target domain in the step S1 and the third party target domain in the step S2 into the shared space domain respectively to obtain a target domain matrix and a third party target domain matrix;
s4, performing association processing on data in the target domain matrix and the third party target domain matrix by using the 2D-WSPCCA to obtain a new target domain;
S4.1, for the target domain matrix X_To, constructing a target matrix M using k data in the target domain matrix, wherein M ∈ R^{D_x×d_l}, that is, the feature information in the target matrix M is real-valued and the dimension of the target matrix M is D_x × d_l; d_l represents the feature dimension of the projected data, D_x represents the feature space dimension of the target domain matrix, k = min(k_To, k_Tt), k_To represents the feature quantity of the target domain matrix, k_Tt represents the feature quantity of the third-party target domain matrix, and m_k represents the feature information of the k-th column of the target matrix M;
S4.2, for the third-party target domain matrix X_Tt, constructing a third-party matrix N using k data in the third-party target domain matrix, wherein N ∈ R^{D_y×d_l}, that is, the feature information in the third-party matrix N is real-valued and the dimension of the third-party matrix N is D_y × d_l; D_y represents the feature space dimension of the third-party target domain matrix, and n_k represents the feature information of the k-th column of the third-party matrix N;
S4.3, constructing the correlation function between the target domain matrix M and the third-party target domain matrix N by the correlation analysis method:
ρ = (M^T X_To X_Tt^T N) / √((M^T X_To X_To^T M)(N^T X_Tt X_Tt^T N)) (3),
wherein ρ represents the correlation coefficient of the feature data, and M^T, N^T, X_To^T and X_Tt^T are the transposed matrices of the matrices M, N, X_To and X_Tt respectively;
S4.4, respectively constructing the sparse reconstruction matrix of the target domain matrix M and the sparse reconstruction matrix of the third-party target domain matrix N:
X_Tt S_Tt X_Tt^T M = λ X_Tt X_Tt^T M (4),
X_To S_To X_To^T N = λ X_To X_To^T N (5),
wherein S_Tt represents the sparse matrix of the third-party target domain matrix, S_To represents the sparse matrix of the target domain matrix, and λ represents the eigenvalue; the original feature matrices X_To and X_Tt are weighted using the variance of the feature matrix, and the weighted target domain and third-party feature data are denoted X̃_To and X̃_Tt;
S4.5, converting the correlation function between the target domain matrix M and the third-party target domain matrix N into an objective function by 2D-WSPCCA:
wherein the matrix A is an identity matrix, S̃_To represents the sparse matrix of the target domain matrix, S̃_Tt represents the sparse matrix of the third-party target domain matrix, X_Toi represents the i-th feature vector of the target matrix, X_Toj represents the j-th feature vector of the target matrix, X_Tti represents the i-th feature vector of the third-party matrix, X_Ttj represents the j-th feature vector of the third-party matrix, and i, j ∈ [1, k];
S4.6, carrying out mathematical operation on the objective function (6):
wherein,,an i-th eigenvector representing a target weighting matrix,/>The j-th eigenvector representing the target weighting matrix,/>Representing the ith eigenvector of the third party weighting matrix, < ->Represents the j-th eigenvector of the third party weighting matrix,> respectively indicate->Transposed matrix of A m Representing an identity matrix, S ToTt =S To ·S TtRepresenting the product of the target sparse matrix and the third party sparse matrix;
S4.7, performing mathematical operations on formula (7) and formula (8) respectively to obtain the matrices S_ToTo = S_To · S_To and S_TtTt = S_Tt · S_Tt;
S4.8, let H TtTo =D TtTo -S TtTo 、H TtTt =D TtTt -S TtTt 、H ToTo =D ToTo -S ToTo In combination with step S4.6 and step S4.7, the objective function in step S4.5 may be expressed as an objective optimization function:
wherein matrix H TtTo 、H TtTt 、H ToTo Are symmetrical matrixes;
S4.9, converting the target optimization function in step S4.8 by the Lagrange multiplier algorithm to obtain:
F_xy (F_y)^{-1} F_yx α = λ^2 F_x α (13),
F_yx (F_x)^{-1} F_xy β = λ^2 F_y β (14),
wherein α = α_1, ..., α_{i'} are the eigenvectors corresponding to the eigenvalues, and β = β_1, ..., β_{i'} are the eigenvectors corresponding to the eigenvalues;
S4.10, letting the matrix C = [α_1, ..., α_{i'}] and the matrix D = [β_1, ..., β_{i'}], the new target domain is obtained as:
X_T = C X_To + D X_Tt (15);
S5, acquiring a large number of images from the COCO database, and extracting target features by the SIFT feature extraction method to obtain a source domain;
S6, processing the new target domain and the source domain respectively by the integrated transfer learning algorithm based on feature-difference adaptation to obtain a mapping matrix, wherein the mapping matrix comprises a target matrix and a source domain matrix;
S6.1, constructing the objective function of transfer learning based on feature-difference adaptation:
min_Z tr(Z^T (P((1−γ)Q_d + γ Σ_{k'=1}^K Q_{k'})P^T + μX_h + λI) Z) (18),
s.t. Z^T P B P^T Z = A (19),
wherein P = [X_S, X_T] is the sample matrix formed by splicing the source domain X_S and the new target domain X_T, B represents the centering matrix, A represents the identity matrix, γ represents the distribution balance factor, Q_d represents the marginal MMD matrix, Q_{k'} represents the MMD matrix adapted to the k'-th class, X_h represents the intra-class distance matrix, λI represents a degradation-prevention term, Z represents the mapping matrix, and K represents the number of classes;
S6.2, converting the objective function of transfer learning based on feature-difference adaptation into the target optimization function of transfer learning based on feature-difference adaptation by the Lagrangian function:
L = tr(Z^T (P((1−γ)Q_d + γ Σ_{k'=1}^K Q_{k'})P^T + μX_h + λI) Z) + tr((A − Z^T P B P^T Z)Φ) (20),
wherein Φ represents the Lagrange multiplier matrix;
S6.3, taking the partial derivative of formula (20) and setting ∂L/∂Z = 0, the target optimization function of transfer learning based on feature-difference adaptation is converted into:
(P((1−γ)Q_d + γ Σ_{k'=1}^K Q_{k'})P^T + μX_h + λI) Z = P B P^T Z Φ (21),
S6.4, solving the eigenvalues of formula (21); the eigenvectors corresponding to the eigenvalues form the mapping matrix Z, and the source domain X_S and the new target domain X_T are mapped by the mapping matrix Z to obtain the target matrix Z(T) and the source domain matrix Z(S);
S7, training the YOLOv3 network with the data in the source domain matrix to obtain a YOLOv3 network model and network weights;
s8, removing the weight of a detection layer of the YOLOv3 network model to obtain an improved YOLOv3 network model;
S9, acquiring an underwater image from a picture database and inputting it into the improved YOLOv3 network model of step S8 to obtain detection-layer weights, thereby obtaining an underwater target detection model; transferring the improved YOLOv3 network model to the field of underwater target detection, acquiring a small number of pictures through the Underwater Target dataset to train the improved YOLOv3 network model, generating new detection-layer weights, and finally obtaining an underwater target detection model suitable for underwater image detection;
s10, inputting the underwater image in the step S1 into the underwater target detection model in the step S9, and outputting the target and the region where the target is located.
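The association of claim steps S4.9-S4.10 reduces to a pair of coupled eigenproblems. The sketch below solves plain CCA as a simplified stand-in for 2D-WSPCCA (the sparse weighting matrices are omitted), following the shape of formulas (13)-(15); the matrix sizes, the ridge term and the use of transposed projections are assumptions for the demo, not the patent's exact operations.

```python
import numpy as np

rng = np.random.default_rng(3)
k, dx, dy = 30, 8, 6
X_To = rng.normal(size=(dx, k))          # target domain matrix (features x samples)
X_Tt = rng.normal(size=(dy, k))          # third-party target domain matrix

def cca_projections(X, Y, n_comp=3, ridge=1e-6):
    """Solve F_xy F_y^{-1} F_yx a = lambda^2 F_x a, the plain-CCA
    analogue of formulas (13)-(14), and derive the paired directions."""
    Xc = X - X.mean(axis=1, keepdims=True)
    Yc = Y - Y.mean(axis=1, keepdims=True)
    Fx = Xc @ Xc.T + ridge * np.eye(X.shape[0])
    Fy = Yc @ Yc.T + ridge * np.eye(Y.shape[0])
    Fxy = Xc @ Yc.T
    vals, A = np.linalg.eig(np.linalg.solve(Fx, Fxy @ np.linalg.solve(Fy, Fxy.T)))
    order = np.argsort(-vals.real)                  # largest correlations first
    C = A[:, order[:n_comp]].real                   # matrix C = [alpha_1, ...]
    D = np.linalg.solve(Fy, Fxy.T @ C)              # paired matrix D = [beta_1, ...]
    D /= np.linalg.norm(D, axis=0, keepdims=True)
    return C, D

C, D = cca_projections(X_To, X_Tt)
X_T = C.T @ X_To + D.T @ X_Tt            # new target domain, cf. formula (15)
print(X_T.shape)
```

Combining the two projected views into one matrix is what lets the third-party (expert-knowledge) features compensate for the missing information in the occluded target domain.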
CN202010589644.4A 2020-06-24 2020-06-24 Underwater target detection method based on third party transfer learning Active CN111723823B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010589644.4A CN111723823B (en) 2020-06-24 2020-06-24 Underwater target detection method based on third party transfer learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010589644.4A CN111723823B (en) 2020-06-24 2020-06-24 Underwater target detection method based on third party transfer learning

Publications (2)

Publication Number Publication Date
CN111723823A CN111723823A (en) 2020-09-29
CN111723823B true CN111723823B (en) 2023-07-18

Family

ID=72568871

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010589644.4A Active CN111723823B (en) 2020-06-24 2020-06-24 Underwater target detection method based on third party transfer learning

Country Status (1)

Country Link
CN (1) CN111723823B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112200241A (en) * 2020-10-09 2021-01-08 山东大学 Automatic sorting method for fish varieties based on ResNet transfer learning
CN113657541B (en) * 2021-08-26 2023-10-10 电子科技大学长三角研究院(衢州) Domain self-adaptive target recognition method based on depth knowledge integration
CN114092793B (en) * 2021-11-12 2024-05-17 杭州电子科技大学 End-to-end biological target detection method suitable for complex underwater environment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109816040A (en) * 2019-02-01 2019-05-28 四创科技有限公司 The method of urban waterlogging depth of water detection based on deep learning
CN110232186A (en) * 2019-05-20 2019-09-13 浙江大学 The knowledge mapping for merging entity description, stratification type and text relation information indicates learning method
CN110866476A (en) * 2019-11-06 2020-03-06 南京信息职业技术学院 Dense stacking target detection method based on automatic labeling and transfer learning

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109389174B (en) * 2018-10-23 2021-04-13 四川大学 Crowd gathering sensitive image detection method
CN109740665B (en) * 2018-12-29 2020-07-17 珠海大横琴科技发展有限公司 Method and system for detecting ship target with occluded image based on expert knowledge constraint
CN110705591A (en) * 2019-03-09 2020-01-17 华南理工大学 Heterogeneous transfer learning method based on optimal subspace learning
CN110390273A (en) * 2019-07-02 2019-10-29 重庆邮电大学 A kind of indoor occupant intrusion detection method based on multicore transfer learning
CN110427875B (en) * 2019-07-31 2022-11-11 天津大学 Infrared image target detection method based on deep migration learning and extreme learning machine
CN110717526B (en) * 2019-09-23 2023-06-02 华南理工大学 Unsupervised migration learning method based on graph convolution network
CN111209952B (en) * 2020-01-03 2023-05-30 西安工业大学 Underwater target detection method based on improved SSD and migration learning
CN111241970B (en) * 2020-01-06 2023-06-27 电子科技大学 SAR image sea surface ship detection method based on yolov3 algorithm and sliding window strategy
AU2020100705A4 (en) * 2020-05-05 2020-06-18 Chang, Jiaying Miss A helmet detection method with lightweight backbone based on yolov3 network

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109816040A (en) * 2019-02-01 2019-05-28 四创科技有限公司 The method of urban waterlogging depth of water detection based on deep learning
CN110232186A (en) * 2019-05-20 2019-09-13 浙江大学 The knowledge mapping for merging entity description, stratification type and text relation information indicates learning method
CN110866476A (en) * 2019-11-06 2020-03-06 南京信息职业技术学院 Dense stacking target detection method based on automatic labeling and transfer learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DeepVoting: A Robust and Explainable Deep Network for Semantic Part Detection Under Partial Occlusion;Zhishuai Zhang 等;《2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition》;1372-1380 *

Also Published As

Publication number Publication date
CN111723823A (en) 2020-09-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant