CN115578631A - Image tampering detection method based on multi-scale interaction and cross-feature contrast learning - Google Patents

Image tampering detection method based on multi-scale interaction and cross-feature contrast learning

Info

Publication number
CN115578631A
CN115578631A (application CN202211421544.6A)
Authority
CN
China
Prior art keywords
image
feature
features
scale
cross
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211421544.6A
Other languages
Chinese (zh)
Other versions
CN115578631B (en)
Inventor
高赞
陈圣灏
李传森
张蕊
李华刚
郝敬全
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Zhonglian Audio Visual Information Technology Co ltd
Qingdao Haier Smart Technology R&D Co Ltd
Taihua Wisdom Industry Group Co Ltd
Shandong Institute of Artificial Intelligence
Original Assignee
Shandong Zhonglian Audio Visual Information Technology Co ltd
Qingdao Haier Smart Technology R&D Co Ltd
Taihua Wisdom Industry Group Co Ltd
Shandong Institute of Artificial Intelligence
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Zhonglian Audio Visual Information Technology Co ltd, Qingdao Haier Smart Technology R&D Co Ltd, Taihua Wisdom Industry Group Co Ltd, Shandong Institute of Artificial Intelligence filed Critical Shandong Zhonglian Audio Visual Information Technology Co ltd
Priority to CN202211421544.6A priority Critical patent/CN115578631B/en
Publication of CN115578631A publication Critical patent/CN115578631A/en
Application granted granted Critical
Publication of CN115578631B publication Critical patent/CN115578631B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/95Pattern authentication; Markers therefor; Forgery detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention belongs to the technical field of image detection and provides an image tampering detection method based on multi-scale interaction and cross-feature contrast learning, capable of localization on both image forgery datasets and image inharmony datasets. The method comprises the following steps: constructing the input images and feeding the image to be localized into a backbone network to extract features; interacting the multi-scale features to obtain multi-stage features; setting pixels of the inharmonious region as positive examples and background pixels as negative examples, selecting positive and negative feature vectors at each feature size, randomly sampling negative examples according to the number of positive examples, applying a contrastive learning loss constraint to the sampled feature vectors, and at the same time applying a contrastive learning loss constraint across the multi-stage features with mixed positive and negative examples; fusing adjacent features pairwise, completing the fusion through shrink attention over the features; and jointly training with multiple loss functions.

Description

Image tampering detection method based on multi-scale interaction and cross-feature contrast learning
Technical Field
The invention relates to an image tampering detection method based on multi-scale interaction and cross-feature contrast learning, and belongs to the technical field of image detection.
Background
With the improvement of living standards, multimedia has penetrated many fields, and digital images have become an important carrier of media propagation. However, with the advent of more and more image editing tools, manipulating images has become easier. Compared with complex tampering methods, simple forgery techniques such as image splicing are far more numerous and the most widely used. Simple splicing often introduces an inconsistency in illumination statistics, caused by the capturing camera, between the forged region and the rest of the image; such regions are called inharmonious regions. With the development of the Internet, inharmonious images are produced in ever greater numbers and spread rapidly, making them the dominant case. In current detection methods for image forgery, because tampering types are diverse and tampering techniques keep iterating, it is difficult to effectively define common prior information shared by general tampering. The invention therefore simplifies the problem: it localizes only inharmonious images exhibiting obvious color-difference clues, and searches for the common information of inharmonious forged regions through color-forgery difference information, which is currently the most widespread and most easily produced cue. This in turn assists image tampering localization in achieving excellent performance; image inharmony localization can be regarded as one of the subtasks of image tampering localization. Existing methods use multi-scale information to mine the inharmonious region, but they are merely extensions of semantic segmentation and are not designed for the inharmony localization task. Other methods enlarge the difference between the image foreground and background according to the illumination inconsistency, but make no improvement to the feature-extraction and localization network itself. Moreover, the inharmonious region is still produced by a prior operation and is in essence a forged region, and certain common information exists among such regions; the invention aims to exploit this information for localization.
Disclosure of Invention
The invention aims to provide an image tampering detection method based on multi-scale interaction and cross-feature contrast learning, which can effectively localize tampered regions on both image forgery datasets and image inharmony datasets.
In order to achieve the purpose, the invention is realized by the following technical scheme:
an image tampering detection method based on multi-scale interaction and cross-feature contrast learning comprises the following steps:
s1, constructing an input image, inputting an image to be positioned into a backbone network, and extracting characteristics:
randomly adding image jitter to an original image at the same time, using the image jitter as the input of a backbone network, sharing images with different scales and network weights for feature extraction, putting the images into the backbone network, and extracting image features in four stages respectively, wherein each stage generates three large and small features under the condition of three inputs;
s2, multi-scale feature interaction to obtain multi-stage features:
performing multi-scale feature interaction on the three features of each stage: downsampling the large features and upsampling the small features to the same size, then adding them under a learned multi-scale weight constraint to obtain the final feature;
s3, cross-feature comparison learning:
setting pixels of an inharmonic region in the group Truth as a positive example, setting background pixels as a negative example, simultaneously down-sampling the group Truth to each feature size to select positive and negative feature vectors, randomly sampling negative samples according to the number of the positive samples, performing contrast learning loss constraint on the feature vectors obtained by sampling, and performing contrast learning loss constraint on the feature vectors obtained by sampling while mixing the positive and negative samples on multi-stage features;
S4, feature shrink fusion decoding:
fusing every two adjacent features and completing the fusion through shrink attention over the features;
S5, multi-loss function joint training:
finally, decoding to obtain the final predicted image, applying pixel-level loss supervision between the final predicted image and the Ground Truth, and jointly training and optimizing the network together with the contrastive learning loss.
On the basis of the image tampering detection method based on multi-scale interaction and cross-feature contrast learning, the image extraction feature construction is specifically as follows:
the method comprises the steps of adjusting the size of an image, randomly turning over the image, randomly rotating the image, adjusting the contrast ratio of the image, and then using the image as the input of a network, wherein the input sizes are respectively H multiplied by W, W is the width of a picture, H is the height of the picture, and the unit is a pixel.
On the basis of the image tampering detection method based on multi-scale interaction and cross-feature contrast learning, the multi-scale feature interaction specifically comprises the following steps:
and performing the same operation on the three characteristics of each stage, wherein the characteristics of the 1.5x image are respectively downsampled to the size of the characteristics of the input image in an average pooling mode and a maximum pooling mode, the characteristics of the 0.5x image are upsampled to the size of the characteristics of the input image in a bilinear interpolation mode, then the characteristics are spliced, two layers of rolling blocks are performed, a softmax function is performed to automatically learn the weight of each scale characteristic, and the weighted sum is performed to obtain the characteristics after the three scales are fused.
On the basis of the image tampering detection method based on multi-scale interaction and cross-feature contrast learning, the cross-feature contrast learning specifically comprises the following steps:
sampling the Ground Truth to the size of each feature by the nearest-neighbor method; finding the feature vectors of inharmonious forged pixels and of background pixels on the feature map by mapping the Ground Truth onto it, and then randomly sampling the two categories within each image of each batch; setting 5 as a threshold: when the number of feature vectors of a category is less than 5, that category of the image is discarded, and when it is greater than 5, 5 feature vectors are randomly sampled; merging all inharmonious-pixel feature vectors and all background-pixel feature vectors of the batch per category, finally obtaining four feature sets $A_1, A_2, A_3, A_4$ from the 4 stage features.

Cross-image contrast learning is implemented in each feature set, with the contrastive loss:

$$\mathcal{L}_{con} = -\frac{1}{|P|}\sum_{i \in P}\frac{1}{|P|-1}\sum_{j \in P,\ j \neq i}\log\frac{\exp(v_i \cdot v_j/\tau)}{\exp(v_i \cdot v_j/\tau)+\sum_{k \in N}\exp(v_i \cdot v_k/\tau)}$$

wherein $v_i$, $v_j$ and $v_k$ denote pixel-wise feature vectors of the feature map, $\tau$ is a fixed temperature coefficient, $P$ is the set of pixels belonging to the inharmonious region, and $N$ is the set of pixels belonging to the background region. Next, cross-scale contrast learning is performed on the pairs $A_1$/$A_4$ and $A_1$/$A_3$, respectively.
On the basis of the image tampering detection method based on multi-scale interaction and cross-feature contrast learning, the feature shrink fusion decoding is specifically as follows: among the four features F1, F2, F3 and F4, F1 and F2, F2 and F3, and F3 and F4 are shrink-fused pairwise, and the features are continuously fused until one feature remains, from which the final result is output through convolution and upsampling.
On the basis of the image tampering detection method based on multi-scale interaction and cross-feature contrast learning, the objective function of the multi-scale interaction and cross-feature contrast learning network is constructed as follows:
since the samples in inharmony localization are imbalanced, the inharmonious region being smaller than the harmonious region, the pixel supervision loss consists of a dice loss and a focal loss; the pixel segmentation loss is combined with the contrastive loss, so that the total loss function is:

$$\mathcal{L}_{dice}(P_i, G) = 1 - \frac{2\,|P_i \cap G|}{|P_i| + |G|}, \qquad \mathcal{L}_{focal}(P_i, G) = -(1 - p)^{\gamma}\log p$$

$$\mathcal{L}_{total} = \sum_{i}\big[\lambda_1\,\mathcal{L}_{dice}(P_i, G) + \lambda_2\,\mathcal{L}_{focal}(P_i, G)\big] + \mathcal{L}_{con}$$

wherein G represents the Ground Truth, $P_i$ represents the predicted image of each stage, $p$ represents the predicted probability of the true class of a pixel, $\lambda_1$ and $\lambda_2$ represent user-defined parameters, $\lambda_1$ is set to 1, $\lambda_2$ is set to 0.3, $\cap$ represents the intersection, $|\cdot|$ represents the number of pixels, $\gamma$ is the focusing parameter of the focal loss, and $\mathcal{L}_{con}$ is the contrastive loss.
The invention has the advantages that:
the multi-scale information is combined together, common characteristics of illumination forgery are explored through contrast learning, effective fusion is carried out on the characteristics through layer-by-layer scaling in decoding, and the potential relation between the inharmonious region and the inharmonious region is fully excavated.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention.
FIG. 1 is a block diagram of the present invention;
FIG. 2 is a model performance display of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
As shown in FIG. 1, the operation flowchart of the image tampering detection method based on multi-scale interaction and cross-feature contrast learning according to the present invention, the method is implemented by the following steps:
step one, constructing an input image, inputting the image to be positioned into a backbone network and extracting characteristics
The method comprises the following specific operations of constructing multiple scales, wherein an input image is input as a network after being subjected to size adjustment, random overturning, random rotation and contrast adjustment, the input size is respectively H multiplied by W, the W is the picture width, the H is the picture height, the unit is a pixel, in the input process, in order to find the difference between the scales, the input image is divided by 0.5 to obtain a low-resolution image, meanwhile, the input image is multiplied by 1.5 to obtain a high-resolution image, the three different pixels and the different scale images are jointly input into a backbone network swantransformer, feature extraction is carried out through an attention layer and an MLP layer, the method is different from the mode of extracting features by convolution of a predecessor, global modeling can be achieved, in the process of inharmonious localization, the fact that local inconsistent information is found through the global modeling is important, the features of different stages are extracted, parameters are shared at the same time, and finally, the features with three different scales are generated in four extraction stages.
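A minimal PyTorch sketch of this step is given below, assuming a Swin-style `backbone` callable that returns the four stage feature maps for one input; the backbone itself, the jitter parameters and the tensor layout are illustrative assumptions, not the patent's reference implementation.

```python
import torch.nn.functional as F

def extract_multiscale_features(backbone, image):
    """image: (B, 3, H, W) tensor after resize / flip / rotate / contrast jitter."""
    scales = [0.5, 1.0, 1.5]
    per_scale = []
    for s in scales:
        x = F.interpolate(image, scale_factor=s, mode="bilinear",
                          align_corners=False)
        # The same backbone module processes all three inputs,
        # so the network weights are shared across scales.
        per_scale.append(backbone(x))  # -> [f1, f2, f3, f4]
    # Regroup so that stages[i] holds the (0.5x, 1.0x, 1.5x) features of stage i.
    stages = list(zip(*per_scale))
    return stages
```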
Step two, obtaining multi-stage features through multi-scale feature interaction
The same operation is performed on the three features of each stage: the features of the 1.5× image are downsampled to the size of the input-image features by adding average pooling and max pooling, while the features of the 0.5× image are upsampled to that size by bilinear interpolation; the features are then concatenated and passed through two convolution blocks and a softmax function that automatically learns the weight of each scale, and a weighted sum of the per-scale features yields the feature after the three scales are fused. In training, this feature fuses the effective information across the three scales.
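The interaction can be sketched as the following module, assuming all three inputs share the stage channel count `channels`; the two-convolution weight head is one plausible reading of "two convolution blocks followed by softmax", not the exact claimed layer configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleInteraction(nn.Module):
    """Per-stage fusion of 0.5x / 1x / 1.5x features (illustrative sketch)."""
    def __init__(self, channels):
        super().__init__()
        # Two convolution blocks ending in one logit per scale.
        self.weight_net = nn.Sequential(
            nn.Conv2d(3 * channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, f_small, f_mid, f_large):
        size = f_mid.shape[-2:]
        # 1.5x features: average pooling plus max pooling down to the mid size.
        f_large = (F.adaptive_avg_pool2d(f_large, size)
                   + F.adaptive_max_pool2d(f_large, size))
        # 0.5x features: bilinear upsampling to the mid size.
        f_small = F.interpolate(f_small, size=size, mode="bilinear",
                                align_corners=False)
        # Concatenate, learn per-scale weights with softmax, weighted sum.
        w = torch.softmax(
            self.weight_net(torch.cat([f_small, f_mid, f_large], dim=1)), dim=1)
        return w[:, 0:1] * f_small + w[:, 1:2] * f_mid + w[:, 2:3] * f_large
```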
Step three, cross-feature contrast learning
The significance of contrast learning is that feature vectors of the same class are pulled closer together while feature vectors of different classes are pushed apart. In inharmony localization there are two categories, forged inharmonious pixels and background pixels, and the aim is to find the common key information of color-forged pixels. Specifically, the Ground Truth G is downsampled to the size of each feature by the nearest-neighbor method; then, by mapping G onto the feature map, the feature vectors of the inharmonious forged pixels and of the background pixels are located, and the two categories are randomly sampled within each image of each batch. A threshold of 5 is set: when the number of feature vectors of a category is less than 5, that category of the image is discarded; when it is greater than 5, 5 feature vectors are randomly sampled. Finally, all inharmonious-pixel feature vectors and all background-pixel feature vectors in the batch are merged per category, and four feature sets $A_1, A_2, A_3, A_4$ are obtained from the 4 stage features.
Cross-image contrast learning is implemented in each feature set, with the contrastive loss:

$$\mathcal{L}_{con} = -\frac{1}{|P|}\sum_{i \in P}\frac{1}{|P|-1}\sum_{j \in P,\ j \neq i}\log\frac{\exp(v_i \cdot v_j/\tau)}{\exp(v_i \cdot v_j/\tau)+\sum_{k \in N}\exp(v_i \cdot v_k/\tau)}$$

wherein $v_i$, $v_j$ and $v_k$ denote pixel-wise feature vectors of the feature map, $\tau$ is a fixed temperature coefficient, $P$ is the set of pixels belonging to the inharmonious region, and $N$ is the set of pixels belonging to the background region.
Next, cross-scale contrast learning is performed on the pairs A1/A4 and A1/A3. Taking A1 and A4 as an example, the positive-sample and negative-sample feature vectors of the two sets are merged per category and contrast learning is applied to the merged sets, which realizes the cross-scale contrast learning.
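A hedged sketch of the per-feature contrastive term follows. It samples 5 vectors per category from a single feature map, whereas the method merges vectors across the whole batch (and, for the cross-scale term, across feature sets) before contrasting; `tau` is an assumed temperature value.

```python
import torch
import torch.nn.functional as F

def pixel_contrastive_loss(feat, mask, tau=0.1, n_sample=5):
    """feat: (B, C, h, w) stage feature; mask: (B, h, w) Ground Truth resized
    by nearest neighbor (1 = inharmonious pixel, 0 = background pixel)."""
    _, C, _, _ = feat.shape
    v = F.normalize(feat.permute(0, 2, 3, 1).reshape(-1, C), dim=1)
    m = mask.reshape(-1).bool()
    pos, neg = v[m], v[~m]
    # Discard the category when fewer than 5 vectors are available.
    if pos.shape[0] < n_sample or neg.shape[0] < n_sample:
        return feat.new_zeros(())
    pos = pos[torch.randperm(pos.shape[0])[:n_sample]]
    neg = neg[torch.randperm(neg.shape[0])[:n_sample]]
    logits_pos = pos @ pos.t() / tau  # anchor-positive similarities
    logits_neg = pos @ neg.t() / tau  # anchor-negative similarities
    loss = feat.new_zeros(())
    for i in range(n_sample):
        # Each anchor attracts the other positives and repels the negatives.
        others = torch.cat([logits_pos[i, :i], logits_pos[i, i + 1:]])
        denom = torch.logsumexp(torch.cat([others, logits_neg[i]]), dim=0)
        loss = loss - (others - denom).mean()
    return loss / n_sample
```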
Step four, feature shrink fusion decoding
Among the four features F1, F2, F3 and F4, F1 and F2, F2 and F3, and F3 and F4 are shrink-fused pairwise, and the features are continuously fused until one feature remains, from which the final result is output through convolution and upsampling. Taking F4 and F3 as an example: F4 is first upsampled to the same size as F3 and multiplied element-wise with it to obtain F34; F34 is added to F3 and to F4 respectively to obtain F3' and F4'; the two features are then concatenated, and channel information is mined through channel attention to obtain the fused feature. The fused feature of each stage also receives an auxiliary supervision loss, and the final predicted image is output through convolution and bilinear upsampling.
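The pairwise fusion of F4 and F3 described above can be sketched as follows; a squeeze-excitation block stands in for the channel attention, whose exact form the record does not specify, and the channel counts are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ShrinkFusion(nn.Module):
    """Shrink-fuse a deep feature (e.g. F4) into its neighbour (e.g. F3)."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.channel_attn = nn.Sequential(  # SE-style channel attention
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, 2 * channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(2 * channels // reduction, 2 * channels, 1),
            nn.Sigmoid(),
        )
        self.proj = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, f3, f4):
        # Upsample F4 to F3's size, multiply element-wise to get F34.
        f4 = F.interpolate(f4, size=f3.shape[-2:], mode="bilinear",
                           align_corners=False)
        f34 = f3 * f4
        f3p, f4p = f3 + f34, f4 + f34          # F3' and F4'
        cat = torch.cat([f3p, f4p], dim=1)
        fused = cat * self.channel_attn(cat)   # mine channel information
        return self.proj(fused)
```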
Step five, multi-loss function joint training
Since the samples in inharmony localization are imbalanced, the inharmonious region being smaller than the harmonious region, the pixel supervision loss consists of a dice loss and a focal loss; the pixel segmentation loss is combined with the contrastive loss, so that the total loss function is:

$$\mathcal{L}_{dice}(P_i, G) = 1 - \frac{2\,|P_i \cap G|}{|P_i| + |G|}, \qquad \mathcal{L}_{focal}(P_i, G) = -(1 - p)^{\gamma}\log p$$

$$\mathcal{L}_{total} = \sum_{i}\big[\lambda_1\,\mathcal{L}_{dice}(P_i, G) + \lambda_2\,\mathcal{L}_{focal}(P_i, G)\big] + \mathcal{L}_{con}$$

wherein G represents the Ground Truth, $P_i$ represents the predicted image of each stage, $p$ represents the predicted probability of the true class of a pixel, $\lambda_1$ and $\lambda_2$ are user-defined parameters set to 1 and 0.3 respectively, $\cap$ represents the intersection, $|\cdot|$ represents the number of pixels, $\gamma$ is the focusing parameter of the focal loss, and $\mathcal{L}_{con}$ is the contrastive loss; the final prediction map is constrained by this total loss function.
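A sketch of the pixel supervision term, with the 1 / 0.3 weighting stated above; `gamma` and the soft dice smoothing `eps` are assumed defaults, not values taken from the record.

```python
import torch
import torch.nn.functional as F

def dice_focal_loss(pred, target, lambda_focal=0.3, gamma=2.0, eps=1e-6):
    """pred: (B, 1, H, W) logits; target: (B, 1, H, W) binary Ground Truth."""
    p = torch.sigmoid(pred).flatten(1)
    g = target.flatten(1)
    # Soft dice: 1 - 2|P ∩ G| / (|P| + |G|), computed per image.
    dice = 1 - (2 * (p * g).sum(1) + eps) / (p.sum(1) + g.sum(1) + eps)
    # Focal: cross-entropy modulated to down-weight easy pixels.
    bce = F.binary_cross_entropy_with_logits(pred.flatten(1), g,
                                             reduction="none")
    pt = torch.where(g > 0.5, p, 1 - p)
    focal = ((1 - pt) ** gamma * bce).mean(1)
    return (dice + lambda_focal * focal).mean()
```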
In order to verify the effectiveness of the present invention, evaluation is performed on the iHarmony4 image inharmony dataset, which consists of four sub-datasets: HCOCO, HAdobe5k, HFlickr and HDay2Night. On the HCOCO and HFlickr datasets, inharmonious images are obtained by adjusting the color and illumination statistics of the foreground. For the HAdobe5k and HDay2Night datasets, inharmonious images are obtained by applying modifications of different styles to the foreground or by capturing the corresponding part of the same scene under different conditions. For the inharmonious images in all four sub-datasets, the foreground region appears incompatible with the background mainly because of color and illumination inconsistencies, which makes the dataset suitable for focusing on the localization of inharmonious regions. In this work only the inharmonious images are used, not the paired harmonious images. One problem is that the inharmonious region in an inharmonious image may be ambiguous, since the background could also be regarded as the inharmonious region. To avoid this ambiguity, images whose foreground area exceeds 50% are discarded directly; they account for only around 2% of the entire dataset. This strategy is similar to traditional image-manipulation localization methods, the task being to localize inharmonious regions that occupy less than 50% of the image area. The training set and test set are cut to 64255 and 7237 images respectively. For quantitative evaluation, following previous related methods, the average precision AP, the F1 score and the IoU are computed as evaluation indices from the predicted mask M and the ground-truth mask.
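A sketch of the F1 and IoU computation on one predicted mask follows (AP additionally sweeps thresholds and averages precision, which is omitted here); the binarization thresholds are illustrative assumptions.

```python
import numpy as np

def mask_metrics(pred, gt, thresh=0.5):
    """pred: predicted mask M with values in [0, 1]; gt: binary ground-truth mask."""
    p = (pred >= thresh).astype(np.float64)
    g = (gt >= 0.5).astype(np.float64)
    tp = (p * g).sum()
    precision = tp / max(p.sum(), 1.0)
    recall = tp / max(g.sum(), 1.0)
    f1 = 2 * precision * recall / max(precision + recall, 1e-12)
    iou = tp / max(p.sum() + g.sum() - tp, 1.0)
    return f1, iou
```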
The performance comparison between classical image inharmony localization algorithms and this method is shown in the table below. In the experiments, 30 epochs are set; the Adam optimization method is adopted with a default learning rate of 1e-4 and a poly learning-rate decay strategy; the loss-function hyperparameters are set to $\lambda_1 = 1$ and $\lambda_2 = 0.3$. In order to enhance the model's fitting ability on the target-domain data, random contrast enhancement, illumination enhancement, saturation enhancement and flipping operations are adopted; when testing the trained model, the image is restored to its original size.
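The stated optimization setup (Adam at 1e-4 with poly decay over 30 epochs) can be sketched as below; the poly `power` of 0.9 and the per-step decay granularity are assumptions, as the record only names the strategy.

```python
import torch

def make_optimizer(model, total_steps, power=0.9):
    # Adam with the default learning rate 1e-4 and a poly decay schedule.
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    sched = torch.optim.lr_scheduler.LambdaLR(
        opt, lambda step: max(0.0, 1.0 - step / total_steps) ** power)
    return opt, sched
```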
The following table compares the performance of classical image inharmony localization algorithms with the present invention on the different datasets:
[Table omitted: AP, F1 and IoU comparison of the methods on the different datasets.]
from the table, it can be found that the model effect of the model obtains excellent performance in AP, F1 and IOU, and the optimal MadisNet is exceeded by improving the positioning network, so that a very high-efficiency positioning effect is obtained.
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (6)

1. An image tampering detection method based on multi-scale interaction and cross-feature contrast learning is characterized by comprising the following steps:
s1, constructing an input image, inputting an image to be positioned into a backbone network, and extracting characteristics:
randomly adding image jitter to an original image at the same time, using the image jitter as the input of a backbone network, sharing images with different scales and network weights for feature extraction, putting the images into the backbone network, and extracting image features in four stages respectively, wherein each stage generates three large and small features under the condition of three inputs;
s2, multi-scale feature interaction to obtain multi-stage features:
performing multi-scale feature interaction on the three features of each stage: downsampling the large features and upsampling the small features to the same size, then adding them under a learned multi-scale weight constraint to obtain the final feature;
s3, cross-feature comparison learning:
setting pixels of an inharmonic region in the group Truth as positive examples, setting background pixels as negative examples, simultaneously sampling the group Truth to each feature size for selecting positive and negative feature vectors, randomly sampling negative samples according to the number of the positive samples, performing comparative learning loss constraint on the feature vectors obtained by sampling, and performing comparative learning loss constraint on the mixture of the positive and negative samples on multi-stage features;
S4, feature shrink fusion decoding:
fusing every two adjacent features and completing the fusion through shrink attention over the features;
S5, multi-loss function joint training:
finally, decoding to obtain the final predicted image, applying pixel-level loss supervision between the predicted image and the Ground Truth, and jointly training and optimizing the network together with the contrastive learning loss.
2. The image tampering detection method based on multi-scale interaction and cross-feature contrast learning according to claim 1, characterized in that: the image extraction feature construction is specifically as follows:
the method comprises the steps of adjusting the size of an image, randomly turning over the image, randomly rotating the image, adjusting the contrast ratio of the image, using the image as the input of a network, wherein the input size is H multiplied by W, the W is the width of the image, the H is the height of the image, and the unit is a pixel.
3. The image tampering detection method based on multi-scale interaction and cross-feature contrast learning according to claim 1, wherein the multi-scale feature interaction specifically comprises the following steps:
and performing the same operation on the three characteristics of each stage, wherein the characteristics of the 1.5x image are respectively downsampled to the size of the characteristics of the input image in an average pooling and maximum pooling addition mode, the characteristics of the 0.5x image are upsampled to the size of the characteristics of the input image in a bilinear interpolation mode, then the characteristics are spliced, two layers of convolution blocks are performed, a softmax function is performed to automatically learn the weight of each scale characteristic, and the characteristics after three scales are fused are obtained through weighted summation.
4. The image tampering detection method based on multi-scale interaction and cross-feature contrast learning according to claim 1, wherein the cross-feature contrast learning specifically comprises the following steps:
sampling the Ground Truth to the size of each feature by the nearest-neighbor method; finding the feature vectors of inharmonious forged pixels and of background pixels on the feature map by mapping the Ground Truth onto it, and then randomly sampling the two categories within each image of each batch; setting 5 as a threshold: when the number of feature vectors of a category is less than 5, that category of the image is discarded, and when it is greater than 5, 5 feature vectors are randomly sampled; merging all inharmonious-pixel feature vectors and background-pixel feature vectors of the batch per category, finally obtaining four feature sets $A_1, A_2, A_3, A_4$ from the 4 features;
cross-image contrast learning is implemented in each feature set, with the contrastive loss:

$$\mathcal{L}_{con} = -\frac{1}{|P|}\sum_{i \in P}\frac{1}{|P|-1}\sum_{j \in P,\ j \neq i}\log\frac{\exp(v_i \cdot v_j/\tau)}{\exp(v_i \cdot v_j/\tau)+\sum_{k \in N}\exp(v_i \cdot v_k/\tau)}$$

wherein $v_i$, $v_j$ and $v_k$ denote pixel-wise feature vectors of the feature map, $\tau$ is a fixed temperature coefficient, $P$ is the set of pixels belonging to the inharmonious region, and $N$ is the set of pixels belonging to the background region; next, cross-scale contrast learning is performed on $A_1$ and $A_4$ and on $A_1$ and $A_3$, respectively.
5. The image tampering detection method based on multi-scale interaction and cross-feature contrast learning according to claim 1, wherein the feature shrink fusion decoding is specifically as follows: among the four features F1, F2, F3 and F4, F1 and F2, F2 and F3, and F3 and F4 are shrink-fused pairwise, and the features are continuously fused until one feature remains, from which the final result is output through convolution and upsampling.
6. The image tampering detection method based on multi-scale interaction and cross-feature contrast learning according to claim 1, wherein the objective function construction based on the multi-scale interaction and cross-feature contrast learning network is specifically as follows:
since the samples in inharmony localization are imbalanced, the inharmonious region being smaller than the harmonious region, the pixel supervision loss consists of a dice loss and a focal loss; the pixel segmentation loss is combined with the contrastive loss, so that the total loss function is:

$$\mathcal{L}_{dice}(P_i, G) = 1 - \frac{2\,|P_i \cap G|}{|P_i| + |G|}, \qquad \mathcal{L}_{focal}(P_i, G) = -(1 - p)^{\gamma}\log p$$

$$\mathcal{L}_{total} = \sum_{i}\big[\lambda_1\,\mathcal{L}_{dice}(P_i, G) + \lambda_2\,\mathcal{L}_{focal}(P_i, G)\big] + \mathcal{L}_{con}$$

wherein G represents the Ground Truth, $P_i$ represents the predicted image of each stage, $p$ represents the predicted probability of the true class of a pixel, $\lambda_1$ and $\lambda_2$ represent user-defined parameters, $\lambda_1$ is set to 1, $\lambda_2$ is set to 0.3, $\cap$ represents the intersection, $|\cdot|$ represents the number of pixels, $\gamma$ is the focusing parameter of the focal loss, and $\mathcal{L}_{con}$ is the contrastive loss.
CN202211421544.6A 2022-11-15 2022-11-15 Image tampering detection method based on multi-scale interaction and cross-feature contrast learning Active CN115578631B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211421544.6A CN115578631B (en) 2022-11-15 2022-11-15 Image tampering detection method based on multi-scale interaction and cross-feature contrast learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211421544.6A CN115578631B (en) 2022-11-15 2022-11-15 Image tampering detection method based on multi-scale interaction and cross-feature contrast learning

Publications (2)

Publication Number Publication Date
CN115578631A true CN115578631A (en) 2023-01-06
CN115578631B CN115578631B (en) 2023-08-18

Family

ID=84588667

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211421544.6A Active CN115578631B (en) 2022-11-15 2022-11-15 Image tampering detection method based on multi-scale interaction and cross-feature contrast learning

Country Status (1)

Country Link
CN (1) CN115578631B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116091907A (en) * 2023-04-12 2023-05-09 四川大学 Image tampering positioning model and method based on non-mutually exclusive ternary comparison learning

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112102261A (en) * 2020-08-28 2020-12-18 国网甘肃省电力公司电力科学研究院 Multi-scale generation-based tamper image detection method for anti-network
CN112150450A (en) * 2020-09-29 2020-12-29 武汉大学 Image tampering detection method and device based on dual-channel U-Net model
CN112381775A (en) * 2020-11-06 2021-02-19 厦门市美亚柏科信息股份有限公司 Image tampering detection method, terminal device and storage medium
DE102020215860A1 (en) * 2020-12-15 2022-06-15 Conti Temic Microelectronic Gmbh Correction of images from an all-round view camera system in the event of rain, light and dirt
CN114821665A (en) * 2022-05-24 2022-07-29 浙江工业大学 Urban pedestrian flow small target detection method based on convolutional neural network
CN115063373A (en) * 2022-06-24 2022-09-16 山东省人工智能研究院 Social network image tampering positioning method based on multi-scale feature intelligent perception

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112102261A (en) * 2020-08-28 2020-12-18 国网甘肃省电力公司电力科学研究院 Multi-scale generation-based tamper image detection method for anti-network
CN112150450A (en) * 2020-09-29 2020-12-29 武汉大学 Image tampering detection method and device based on dual-channel U-Net model
CN112381775A (en) * 2020-11-06 2021-02-19 厦门市美亚柏科信息股份有限公司 Image tampering detection method, terminal device and storage medium
DE102020215860A1 (en) * 2020-12-15 2022-06-15 Conti Temic Microelectronic Gmbh Correction of images from an all-round view camera system in the event of rain, light and dirt
CN114821665A (en) * 2022-05-24 2022-07-29 浙江工业大学 Urban pedestrian flow small target detection method based on convolutional neural network
CN115063373A (en) * 2022-06-24 2022-09-16 山东省人工智能研究院 Social network image tampering positioning method based on multi-scale feature intelligent perception

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHAO-MING WU: "Multi-level tamper detection and recovery with tamper type identification", pages 4512 - 4516 *
ZHANG Jingyuan et al.: "Multi-task image splicing tamper detection algorithm based on Transformer", vol. 50, no. 1, pages 114 - 122 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116091907A (en) * 2023-04-12 2023-05-09 四川大学 Image tampering positioning model and method based on non-mutually exclusive ternary comparison learning
CN116091907B (en) * 2023-04-12 2023-08-15 四川大学 Image tampering positioning model and method based on non-mutually exclusive ternary comparison learning

Also Published As

Publication number Publication date
CN115578631B (en) 2023-08-18

Similar Documents

Publication Publication Date Title
CN111047551B (en) Remote sensing image change detection method and system based on U-net improved algorithm
CN112396607B (en) Deformable convolution fusion enhanced street view image semantic segmentation method
CN110136062B (en) Super-resolution reconstruction method combining semantic segmentation
CN108596818B (en) Image steganalysis method based on multitask learning convolutional neural network
CN115063373A (en) Social network image tampering positioning method based on multi-scale feature intelligent perception
Zuo et al. HF-FCN: Hierarchically fused fully convolutional network for robust building extraction
CN112767418A (en) Mirror image segmentation method based on depth perception
CN111797841A (en) Visual saliency detection method based on depth residual error network
CN115346037B (en) Image tampering detection method
CN115578631A (en) Image tampering detection method based on multi-scale interaction and cross-feature contrast learning
Zheng et al. T-net: Deep stacked scale-iteration network for image dehazing
CN112819837A (en) Semantic segmentation method based on multi-source heterogeneous remote sensing image
CN114092824A (en) Remote sensing image road segmentation method combining intensive attention and parallel up-sampling
CN115393718A (en) Optical remote sensing image change detection method based on self-adaptive fusion NestedUNet
CN115713462A (en) Super-resolution model training method, image recognition method, device and equipment
CN114565508A (en) Virtual reloading method and device
CN116343052B (en) Attention and multiscale-based dual-temporal remote sensing image change detection network
CN112991239B (en) Image reverse recovery method based on deep learning
Gan et al. Highly accurate end-to-end image steganalysis based on auxiliary information and attention mechanism
CN112488115B (en) Semantic segmentation method based on two-stream architecture
CN116188652A (en) Face gray image coloring method based on double-scale circulation generation countermeasure
CN115457385A (en) Building change detection method based on lightweight network
CN111539922B (en) Monocular depth estimation and surface normal vector estimation method based on multitask network
Xu et al. ESNet: An Efficient Framework for Superpixel Segmentation
Yu et al. Dual-branch feature learning network for single image super-resolution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant