CN114078132A - Image copy-paste tamper detection algorithm based on an autocorrelation feature pyramid network - Google Patents


Info

Publication number
CN114078132A
CN114078132A (application CN202010764142.0A)
Authority
CN
China
Prior art keywords: feature, autocorrelation, image, pyramid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010764142.0A
Other languages
Chinese (zh)
Inventor
吴玉婷 (Wu Yuting)
梁鹏 (Liang Peng)
赵慧民 (Zhao Huimin)
郝刚 (Hao Gang)
何娃 (He Wa)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Polytechnic Normal University
Original Assignee
Guangdong Polytechnic Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Polytechnic Normal University
Priority to CN202010764142.0A
Publication of CN114078132A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
        • G06T 7/00 Image analysis; G06T 7/10 Segmentation; edge detection; G06T 7/11 Region-based segmentation
        • G06T 3/40 Scaling of whole images or parts thereof; G06T 3/4007 Scaling based on interpolation, e.g. bilinear interpolation
        • G06T 2207/20016 Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; pyramid transform
        • G06T 2207/20081 Training; learning
        • G06T 2207/20084 Artificial neural networks [ANN]
    • G06F ELECTRIC DIGITAL DATA PROCESSING
        • G06F 18/00 Pattern recognition; G06F 18/22 Matching criteria, e.g. proximity measures
        • G06F 18/24 Classification techniques
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
        • G06N 3/02 Neural networks; G06N 3/04 Architecture, e.g. interconnection topology; G06N 3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image copy-paste tamper detection algorithm based on an autocorrelation feature pyramid network. By constructing the autocorrelation feature pyramid network, the algorithm obtains rich feature spatial-distribution information and combines global and local features to obtain a more accurate matching result. It mainly comprises three parts: feature extraction, construction of the autocorrelation feature pyramid, and generation of the tampered-region mask. The main contributions of the invention are as follows: (1) feature spatial-distribution information is incorporated into the feature-extraction process, and global and local features are combined to obtain a more accurate matching result, improving the detection performance of the model; (2) a neighborhood-comparison strategy is adopted to eliminate false matches from the preliminary matching features obtained from the feature pyramid, further improving the model's performance.

Description

Image copy-paste tamper detection algorithm based on an autocorrelation feature pyramid network
Technical Field
The invention relates to an image copy-paste tamper detection algorithm based on an autocorrelation feature pyramid network.
Background
Image copy-paste tampering, one of the most common and most easily performed image tampering operations, copies a region of an image and pastes copies of it elsewhere in the same image in order to hide certain objects or forge false ones and mislead viewers. By detection granularity, image copy-paste tamper detection is divided into three levels: image level, region level, and pixel level. The image level distinguishes whether an image has been tampered with, the region level locates the tampered region within the image, and the pixel level judges whether each individual pixel has been tampered with.
Generally, image copy-paste tamper detection judges the authenticity of an image by computing feature similarity, and the detection process comprises four steps. 1) Preprocessing: common techniques include converting the color image to grayscale and dividing the image into blocks. 2) Feature extraction: after preprocessing, a set of correlated information is extracted from the image to represent its features. 3) Feature matching: the similarity between features is computed; if the distance between two features is smaller than a certain threshold, the image regions they represent are considered to have possibly undergone copy-paste tampering. 4) Post-processing: the matching result is refined, false matching pairs are eliminated as far as possible, and the refined result is stored.
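As an illustration of the four-step pipeline above, the following is a minimal, self-contained sketch of a naive block-based copy-move detector (not the patented method): it uses exact block bytes as the "feature", whereas practical systems use robust descriptors (DCT, PCA, Zernike moments) and similarity thresholds.

```python
# A toy illustration of the four-step pipeline: (1) preprocessing is
# assumed done (grayscale input), (2) features are extracted per block,
# (3) blocks are matched, (4) only duplicated blocks are kept in the mask.
import numpy as np

def detect_copy_move(gray, block=4):
    """Return a boolean mask marking blocks that occur more than once."""
    h, w = gray.shape
    feats = {}  # block "feature" (raw bytes) -> list of block coordinates
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            key = gray[y:y+block, x:x+block].tobytes()
            feats.setdefault(key, []).append((y, x))
    mask = np.zeros_like(gray, dtype=bool)
    for locs in feats.values():
        if len(locs) > 1:                    # duplicated -> suspected tamper
            for (y, x) in locs:
                mask[y:y+block, x:x+block] = True
    return mask

# Forge an image: copy an 8x8 patch to another location of the same image.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
img[20:28, 20:28] = img[4:12, 4:12]          # the copy-paste tamper
mask = detect_copy_move(img, block=4)
print(int(mask.sum()))                       # 128: two 8x8 regions flagged
```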
Most existing image copy-paste tamper detection methods rely mainly on hand-crafted features, and these algorithms lack robustness against the various post-processing attacks applied to tampered images, which greatly increases the difficulty of detection. In recent years, deep learning has made major breakthroughs in various computer vision tasks owing to its powerful feature-learning capability, providing another effective solution for image copy-paste tamper detection. For example, Wu Y et al.[1] proposed an end-to-end trainable deep learning framework (BusterNet): a VGG16 convolutional neural network extracts a set of feature vectors from the input image, and the features are then compared pairwise using the Pearson correlation coefficient to obtain a set of similarity feature vectors for tamper detection. On two public data sets (CASIA and CoMoFoD), this method far outperforms traditional state-of-the-art copy-paste tamper detection algorithms. However, because it extracts features by reducing the original input image to a lower dimension, the regularity of image composition and the spatial-arrangement information of the features are lost, causing large deviations in the subsequent feature-matching task.
At present, image copy-paste tamper detection techniques fall into two categories according to the feature-extraction approach: detection based on hand-crafted features and detection based on features extracted by deep neural networks. Methods based on hand-crafted features include keypoint-based methods and block-based methods. Keypoint-based methods typically employ two feature descriptors, SIFT[2-5] and SURF[6-7]. Silva E et al.[7] used SURF features to detect keypoints and matched features by the nearest-neighbor distance ratio; this method can effectively detect tampered images under post-processing attacks, but its drawback is that it cannot detect small or homogeneous tampered regions. Block-based methods describe overlapping blocks with a variety of features, such as discrete cosine transform (DCT) coefficients[8], principal component analysis (PCA)[9], discrete wavelet transform (DWT) with singular value decomposition (SVD)[10], Zernike moments[11], the Fourier-Mellin transform (FMT)[12], and local binary patterns (LBP)[13]. Features such as DCT, PCA, and SVD are highly robust to JPEG compression, additive noise, and image blurring, while features such as FMT and LBP are rotation-invariant and simultaneously robust to scaling, compression, and rotation operations. The literature[14] proposed a detection algorithm combining keypoint features and block features, which is more robust to post-processing attacks such as geometric transformation, JPEG compression, and downsampling.
Experiments show that the precision and recall of that method on the CMFDA database are 96.6% and 100% respectively, but its precision drops significantly when the tampered image undergoes severe post-processing attacks. The main drawback of tamper detection based on hand-crafted features is that when the tampered region is small or its texture is homogeneous, such methods cannot guarantee that enough interest points are extracted to identify the region.
In recent years, with the great progress of deep learning in computer vision tasks, some researchers have proposed using deep neural networks instead of traditional hand-crafted-feature methods for image copy-paste tamper detection. Rao Y et al.[15] proposed a tamper detection method based on deep learning that uses a convolutional neural network (CNN) to automatically learn hierarchical representations from an input RGB color image. Barni M et al.[16] designed a multi-branch CNN framework that correctly localizes the source and target regions of a copy-paste tamper by learning a set of features reflecting interpolation artifacts and boundary inconsistencies. Wu Y et al.[17] proposed an end-to-end deep learning framework for localizing copy-paste tampered regions, using convolution and deconvolution modules to generate a tampered-region mask directly from the input image. Experimental results show that tamper detection algorithms based on deep-network features achieve good results on the relevant data sets and are more robust against various post-processing attacks. Problems remain, however: for example, the study of Wu Y et al.[1] neglects the regularity of image composition and the spatial arrangement of features.
[1] Wu Y, Abd-Almageed W, Natarajan P. BusterNet: Detecting copy-move image forgery with source/target localization[C]//Proc. of ECCV 2018, 2018: 170-186.
[2] Pan X, Lyu S. Region duplication detection using image feature matching[J]. IEEE Transactions on Information Forensics and Security, 2010, 5(4): 857-867.
[3] Amerini I, Ballan L, Caldelli R, et al. A SIFT-based forensic method for copy-move attack detection and transformation recovery[J]. IEEE Transactions on Information Forensics & Security, 2011, 6(3): 1099-1110.
[4] Li J, Li X, Yang B, et al. Segmentation-based image copy-move forgery detection scheme[J]. IEEE Transactions on Information Forensics & Security, 2015, 10(3): 507-518.
[5] Pun C M, Yuan X C, Bi X L. Image forgery detection using adaptive oversegmentation and feature point matching[J]. IEEE Transactions on Information Forensics and Security, 2015, 10(8): 1705-1716.
[6] Shivakumar B L, Santhosh Baboo S. Detection of region duplication forgery in digital images using SURF[J]. International Journal of Computer Science Issues (IJCSI), 2011, 8(4): 199-205.
[7] Silva E, Carvalho T, Ferreira A, et al. Going deeper into copy-move forgery detection: Exploring image telltales via multi-scale analysis and voting processes[J]. Journal of Visual Communication & Image Representation, 2015, 29.
[8] Fridrich J, Soukal D, Lukáš J. Detection of copy-move forgery in digital images[C]//Proceedings of Digital Forensic Research Workshop, 2003.
[9] Mahdian B, Saic S. Detection of copy-move forgery using a method based on blur moment invariants[J]. Forensic Science International, 2007, 171(2-3): 180-189.
[10] Li G, Wu Q, Tu D, et al. A sorted neighborhood approach for detecting duplicated regions in image forgeries based on DWT and SVD[C]//Proceedings of the 2007 IEEE International Conference on Multimedia and Expo (ICME 2007), Beijing, China, July 2-5, 2007.
[11] Ryu S J, Kirchner M, Lee M J, et al. Rotation invariant localization of duplicated image regions based on Zernike moments[J]. IEEE Transactions on Information Forensics & Security, 2013, 8(8): 1355-1370.
[12] Cozzolino D, Poggi G, Verdoliva L. Efficient dense-field copy-move forgery detection[J]. IEEE Transactions on Information Forensics and Security, 2015, 10(11): 2284-2297. doi: 10.1109/TIFS.2015.2455334.
[13] Li L, Li S, Zhu H, et al. An efficient scheme for detecting copy-move forged images by local binary patterns[J]. Journal of Information Hiding & Multimedia Signal Processing, 2013, 4(1): 46-56.
[14] Pun C M, Yuan X C, Bi X L. Image forgery detection using adaptive oversegmentation and feature point matching[J]. IEEE Transactions on Information Forensics and Security, 2015, 10(8): 1705-1716.
[15] Rao Y, Ni J. A deep learning approach to detection of splicing and copy-move forgeries in images[C]//2016 IEEE International Workshop on Information Forensics and Security (WIFS). IEEE, 2016.
[16] Barni M, Phan Q T, Tondi B. Copy move source-target disambiguation through multi-branch CNNs[J]. 2019.
[17] Wu Y, Abd-Almageed W, Natarajan P. Image copy-move forgery detection via an end-to-end deep neural network[C]//2018 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2018.
[18] Tralic D, Zupancic I, Grgic S, Grgic M. CoMoFoD: new database for copy-move forgery detection[C]//Proceedings ELMAR-2013, Zadar, 2013: 49-54.
[19] Zandi M, Mahmoudi-Aznaveh A, Talebpour A. Iterative copy-move forgery detection based on a new interest point detector[J]. IEEE Transactions on Information Forensics and Security, 2016, 11(11): 2499-2512.
Disclosure of Invention
In view of the above, the invention provides an image copy-paste tamper detection algorithm based on an autocorrelation feature pyramid network, referred to as SFPNet. The invention acquires rich feature spatial-distribution information by constructing the autocorrelation feature pyramid network, and combines global and local features to obtain a more accurate matching result. Experiments show that on public image tamper detection benchmarks the method outperforms the comparison algorithms on every evaluation index, with Precision, Recall, and F1 (PRF) improved by 14.85%, 15.04%, and 12.81% respectively; the detection performance is especially strong on small-area tampered samples.
In order to solve the technical problems, the invention adopts the following technical scheme:
an image copying-pasting tampering detection algorithm based on an autocorrelation characteristic pyramid network is characterized in that a characteristic pyramid is constructed to obtain more characteristic space arrangement information, and the algorithm mainly comprises three parts: and (4) feature extraction, namely generating a tampered area mask by using an autocorrelation feature pyramid. Experiments show that the evaluation indexes under the two groups of reference data sets in the model are superior to those of a comparison algorithm, and various post-processing attacks can be effectively resisted.
The main contributions of the present invention are as follows:
(1) Feature spatial-distribution information is incorporated into the feature-extraction process, and global and local features are combined to obtain a more accurate matching result, improving the detection performance of the model.
(2) A neighborhood-comparison strategy is adopted to eliminate false matches from the preliminary matching features obtained from the feature pyramid, further improving the model's performance. The model not only effectively detects tampered regions of large area; thanks to its richer feature spatial-distribution information and an appropriate post-processing strategy, it also performs outstandingly on small- and medium-area tamper detection, which is one of the important reasons the method outperforms comparable algorithms.
The model focuses on the problem of pixel-level image copy-paste tamper detection based on deep learning, taking into account the regularity of image composition and the spatial arrangement of features: feature maps of different dimensions are obtained and given different weights according to the richness of the semantic information they contain, which in turn influences the feature-matching result.
Drawings
FIG. 1 is a model block diagram of the present invention;
FIG. 2 is a graph comparing AUC performance on the CASIA CMFD data set.
Detailed Description
In order to make the present invention more clear and intuitive for those skilled in the art, the present invention will be further described with reference to the accompanying drawings.
Image copy-paste tamper detection based on the autocorrelation feature pyramid network
As shown in FIG. 1, the image copy-paste tamper detection model based on the autocorrelation feature pyramid network (SFPNet) mainly comprises three parts: feature extraction, construction of the autocorrelation feature pyramid, and generation of the tampered-region mask. The proposed method can be applied to images of various resolutions; to simplify the discussion, we assume the input image is of size 512 × 512 × 3 and describe the implementation of each part in turn.
1.1 feature extraction
Based on the network structure in the literature[1], the first four blocks of VGG16 are used to extract features from the original input image (of size 512 × 512 × 3), yielding respectively 64 feature maps of 256 × 256, 128 feature maps of 128 × 128, 256 feature maps of 64 × 64, and 512 feature maps of 32 × 32. The network continuously extracts and compresses features, so the higher levels finally yield more reliable global features, while the lower-level features contain more local spatial information. These features at different levels are used in the subsequent task of building the autocorrelation feature pyramid.
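The multi-scale shapes quoted above follow directly from VGG16's design (each block ends in 2 × 2 max pooling, with channel widths 64, 128, 256, 512); a small sketch reproduces the schedule:

```python
# Reproduce the feature-map shape schedule of VGG16's first four blocks
# for a 512x512x3 input: each block halves the spatial resolution via
# 2x2 max pooling, and the channel counts are VGG16's (64, 128, 256, 512).
def vgg16_first_four_shapes(size=512):
    channels = [64, 128, 256, 512]
    shapes = []
    for c in channels:
        size //= 2               # effect of the block's 2x2 max pooling
        shapes.append((size, size, c))
    return shapes

print(vgg16_first_four_shapes(512))
# -> [(256, 256, 64), (128, 128, 128), (64, 64, 256), (32, 32, 512)]
```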
1.2 Autocorrelation feature pyramid network (SFPNet)
The essence of image copy-paste tamper detection is to compare the similarity between different blocks of the same image, so the image features extracted by the convolutional neural network (VGG16) need to be converted into correlation representations between image blocks. Unlike the single-layer autocorrelation features used in the literature[1], we construct an autocorrelation feature pyramid on top of the convolutional neural network: from the bottom level to the top, the pyramid levels correspond to autocorrelation features of decreasing dimension, and levels of the pyramid carry different semantic information.
In particular, for an $n$-layer autocorrelation feature pyramid $P_n$, let $f_k$ denote the feature tensor generated by the $k$-th layer ($k = 1, \ldots, n$) of the feature extraction network, of size $w_k \times h_k \times d_k$ ($w_k, h_k, d_k \in \mathbb{N}$). This feature tensor can also be regarded as $w_k \times h_k$ block features, i.e.

$$f_k = \{\, f_k[i_r, i_c] \mid 0 \le i_r < h_k,\ 0 \le i_c < w_k \,\},$$

where each $f_k[i_r, i_c]$ has $d_k$ dimensions. The feature extraction network generates $n$ feature tensors of different scales in total, i.e. $F = \{f_1, \ldots, f_n\}$.
For the $k$-th layer feature tensor $f_k$, given two block features $f_k[i]$ ($i = (i_r, i_c)$) and $f_k[j]$ ($j = (j_r, j_c)$), the correlation of the two features is quantified by cosine similarity:

$$\cos(\theta_{i,j}) = \frac{\hat{f}_k[i] \cdot \hat{f}_k[j]}{\lVert \hat{f}_k[i] \rVert\, \lVert \hat{f}_k[j] \rVert}, \qquad \hat{f}_k[i] = \frac{f_k[i] - \mu_k[i]}{\sigma_k[i]},$$

where $\hat{f}_k[i]$ is the normalized $f_k[i]$, $\mu_k[i]$ is the mean of $f_k[i]$, and $\sigma_k[i]$ is its standard deviation. For a given $f_k[i]$ in $f_k$, repeating the above computation over the $m_k$ ($m_k = w_k \times h_k$) block features $f_k[j]$ forms an autocorrelation vector

$$s_k[i] = [\cos(\theta_{i,0}), \ldots, \cos(\theta_{i,j}), \ldots, \cos(\theta_{i,m_k-1})],$$

whose components are sorted in descending order to obtain $\tilde{s}_k[i]$. Finally, the $k$-th layer feature tensor $f_k$ yields through this computation a similarity matrix of size $w_k \times h_k \times m_k$:

$$S_k = \{\tilde{s}_k[i]\} \in \mathbb{R}^{w_k \times h_k \times m_k}.$$
The components of each element of the similarity matrix are arranged in descending order; the earlier a component appears, the higher the correlation between the corresponding pixel points and the higher the probability of tampering. In practice, to reduce computational complexity, only the leading components are kept as the autocorrelation features of the layer. Meanwhile, to improve the accuracy of the model, a neighborhood convolution operation is further applied to the similarity matrix; the core idea of this step is to compare the neighborhoods of matched features on top of the preliminary matching so as to eliminate false matches. The convolution enhances the reliability of the data without changing the size of the similarity matrix.
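The neighborhood-comparison idea can be illustrated as follows. The patent does not disclose the exact convolution kernel, so a 3 × 3 mean filter is assumed here purely for illustration: an isolated (likely false) match is suppressed, while a spatially coherent match survives a threshold.

```python
# Neighborhood comparison as a smoothing convolution over a similarity
# map: a match is trusted only if its spatial neighborhood also matches.
# The 3x3 mean kernel is an assumption, not the patent's disclosed kernel.
import numpy as np

def neighborhood_filter(sim):                 # sim: (h, w) similarity map
    h, w = sim.shape
    out = np.zeros_like(sim)
    p = np.pad(sim, 1, mode="edge")
    for y in range(h):
        for x in range(w):
            out[y, x] = p[y:y+3, x:x+3].mean()   # 3x3 neighborhood mean
    return out                                # same size as the input

sim = np.zeros((5, 5))
sim[2, 2] = 1.0                               # isolated (false) match
sim[0:3, 0] = 1.0                             # coherent (true) match run
smoothed = neighborhood_filter(sim)
# The isolated peak is suppressed; the coherent run survives a threshold:
print(smoothed[2, 2] < 0.5 < smoothed[1, 0])  # True
```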
The whole feature extraction network generates $n$ similarity matrices of different scales. In the autocorrelation feature pyramid, a feature tensor of large scale contains the information of the smaller-scale ones, so to suppress this redundancy a smaller weight is assigned to feature tensors of larger scale. The weight of the $k$-th pyramid layer is accordingly set to a value $\alpha_k$ that decreases with the layer's scale [weight formula given only as an image in the original]. The $n$-layer autocorrelation feature pyramid $P_n$ is then expressed as the weighted collection

$$P_n = \{\alpha_k S_k\}_{k=1}^{n}.$$
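A small sketch of assembling the weighted pyramid. Since the source shows the weight formula only as an image, the weight form used here (inverse of the layer's spatial area, normalized to sum to 1) is an assumption that merely realizes "larger scale gets smaller weight":

```python
# Assemble {alpha_k * S_k}: similarity tensors of different scales are
# each scaled by a layer weight. The inverse-area weight is an assumed
# stand-in for the patent's undisclosed formula.
import numpy as np

def pyramid_weights(scales):                  # scales: [(w_k, h_k), ...]
    raw = np.array([1.0 / (w * h) for (w, h) in scales])  # assumed form
    return raw / raw.sum()                    # normalize to sum to 1

def weighted_pyramid(sim_mats):               # sim_mats: list of (h, w, m)
    scales = [(s.shape[1], s.shape[0]) for s in sim_mats]
    alphas = pyramid_weights(scales)
    return [a * s for a, s in zip(alphas, sim_mats)]

mats = [np.ones((32, 32, 1024)), np.ones((16, 16, 256))]
P = weighted_pyramid(mats)
w = pyramid_weights([(32, 32), (16, 16)])
print(w.round(3))                             # larger scale, smaller weight
```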
1.3 Generating the tampered-region mask
In the preceding feature extraction and autocorrelation-pyramid computation, the resolution of the features decreases continuously. To generate a tampered-region mask consistent with the original image size, the features must be decoded back to the original resolution. The decoding module of the invention restores the image size by convolution and upsampling. Specifically, assuming the decoding module's input feature map is 30 × 30, it is convolved by three branches (with kernel sizes 1 × 1, 3 × 3, and 5 × 5 respectively); after convolution, the feature maps produced by the three branches are merged by concatenation. Finally, the merged feature maps are upsampled by bilinear interpolation, producing an output feature map of size 60 × 60, i.e. the input feature map is doubled in size. By applying convolution and upsampling alternately to the feature maps generated in the autocorrelation-pyramid computation, a 512 × 512 × 1 tampered-region mask is finally produced.
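The 2× bilinear upsampling used by the decoder can be sketched explicitly in NumPy (an illustrative implementation, not the patent's code), taking a 30 × 30 map to 60 × 60:

```python
# Explicit 2x bilinear upsampling with half-pixel-center sampling: each
# output pixel is a weighted average of its four nearest input pixels.
import numpy as np

def bilinear_upsample_2x(f):                  # f: (h, w) feature map
    h, w = f.shape
    ys = (np.arange(2 * h) + 0.5) / 2 - 0.5   # sample positions in f
    xs = (np.arange(2 * w) + 0.5) / 2 - 0.5
    y0 = np.clip(np.floor(ys).astype(int), 0, h - 1)
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 1)
    y1 = np.clip(y0 + 1, 0, h - 1)
    x1 = np.clip(x0 + 1, 0, w - 1)
    wy = np.clip(ys - y0, 0, 1)[:, None]      # interpolation weights
    wx = np.clip(xs - x0, 0, 1)[None, :]
    top = f[y0][:, x0] * (1 - wx) + f[y0][:, x1] * wx
    bot = f[y1][:, x0] * (1 - wx) + f[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

f = np.arange(900, dtype=float).reshape(30, 30)
up = bilinear_upsample_2x(f)
print(up.shape)                               # (60, 60)
```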
2 experiment
Experiment simulation environment: the CPU processor is i78700, the memory is 16GB, the graphics card is GTX 1080Ti X2, the hard disk is 2T, and the experimental platform is Ubuntu 18.04. The open source architecture selected for the experiment is the Tensorflow deep learning framework.
2.1 data set
USCISI dataset: the data set is composed of 105An image composition, wherein all images are taken from the SUN2012 dataset and the Microsoft COCO dataset provided with the source object segmentation mask, and a copy-paste tampered image is obtained by means of geometric transformation.
CASIA CMFD data set: CASIA TIDE v2.0 is currently the largest public image-tampering detection benchmark; it contains 7491 real images and 5123 tampered images. The authors of the literature[1] selected 1313 of the tampered images produced by copy-paste operations to form the CASIA CMFD data set.
Comofid dataset: the comofid contains 5000 images, 25 categories in total, including 200 basic tamper images, and 24 images that apply various post-processing attacks to the basic tamper images.
2.2 network training details
Training samples of different scales are divided into a training set and a test set at a ratio of 9:1; the training set is used to train the network and the test set to evaluate its performance. The network uses binary cross-entropy as the loss function and ReLU as the activation function. All convolutions in the network use boundary padding to keep the input and output dimensions consistent, with a stride of 1. During training, errors are updated with the Adam optimizer; the learning rate is set to 10^-4, the batch size to 2, and the number of iterations to 50.
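The two-class cross-entropy loss mentioned above, applied to a predicted pixel mask (tampered vs. authentic), can be sketched as:

```python
# Binary cross-entropy over a predicted tamper mask; predictions are
# clipped away from 0 and 1 to avoid log(0).
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    p = np.clip(y_pred, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

mask_true = np.array([[1.0, 0.0], [0.0, 1.0]])   # ground-truth mask
mask_pred = np.array([[0.9, 0.1], [0.2, 0.8]])   # network output
loss = binary_cross_entropy(mask_true, mask_pred)
print(round(loss, 4))
```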
2.3 Evaluation indexes
The invention uses the PRF values from machine learning, i.e. Precision, Recall, and their harmonic mean $F_1$, as three pixel-level experimental indexes to evaluate the performance of the model, computed as in formulas (1) to (3):

$$\text{Precision} = \frac{TP}{TP + FP} \tag{1}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{2}$$

$$F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{3}$$
In the formulas, TP is the number of pixels of the tampered region that are correctly detected, FP the number of pixels falsely detected as tampered, and FN the number of tampered pixels that are missed. The model is also evaluated at the image level: if any pixel of a test image is detected as forged, the image is marked as tampered. The image-level TP, FP, and FN (where TP is the number of correctly detected tampered images, FP the number of images falsely detected as tampered, and FN the number of missed tampered images) are counted and the PRF values computed. In addition, the area under the ROC curve (AUC) is used to measure model performance, where the ROC curve plots the true-positive rate against the false-positive rate of the samples under test.
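The pixel-level PRF computation of formulas (1) to (3) can be sketched as:

```python
# Pixel-level Precision/Recall/F1 from predicted and ground-truth masks:
# TP, FP, FN are counted per pixel exactly as defined in the text.
import numpy as np

def prf(pred, truth):                         # boolean masks, same shape
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

truth = np.array([[1, 1, 0, 0]], dtype=bool)
pred  = np.array([[1, 0, 1, 0]], dtype=bool)
p, r, f = prf(pred, truth)
print(p, r, round(f, 3))                      # 0.5 0.5 0.5
```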
2.4 Experimental results and analysis
2.4.1 SFPNet validation results
To verify the effectiveness of the proposed SFPNet, the experiment trains SFPNets with three different numbers of layers on the CASIA CMFD data set and then tests them on the CoMoFoD data set. Table 1 lists the PRF values of the three SFPNets on the base-class (non-attacked) images (200 images) for performance comparison. Using the detection criterion defined in the literature[18], the PRF values are counted on the subset of images whose single-sample pixel-level F1 score exceeds 0.5. The experimental results are as follows:
TABLE 1 comparison of Performance data for three SFPNets
[Table 1 data given only as images in the original]
As can be seen from the table, the PRF values improve markedly as the number of pyramid layers increases. The improvement from the single-layer to the two-layer SFPNet is most pronounced: Precision rises by 36.13%, Recall by 29.71%, and F1 by 28.09%, and the number of images detected with a single-sample pixel-level F1 score above 0.5 increases by 74. This shows that by adding autocorrelation features of different scales, SFPNet effectively acquires more information about the tampered image.
2.4.2 Analysis of the results of the two methods on the CASIA CMFD data set
To verify the effectiveness of the method and its parameter choices, the experiment compares the detection results of the two-layer SFPNet and BusterNet. The networks were trained on the USCISI data set and evaluated on the CASIA CMFD data set; the comparison results are as follows:
TABLE 2 comparison of experimental data for the two-layer SFPNet with BusterNet at CASIA CMFD dataset
[Table 2 data given only as an image in the original]
The experimental data show that SFPNet outperforms the BusterNet model on the CASIA CMFD data set. By constructing a feature pyramid, SFPNet acquires more feature detail, improving the model's ability to correctly detect tampered regions; its F1 index is better than BusterNet's whether evaluated at the pixel level or at the image level. In addition, as can be seen from FIG. 2, the AUC of SFPNet improves on BusterNet by 5% under image-level evaluation and by 7% under pixel-level evaluation.
The embodiments described above are presented to enable those skilled in the art to make and use the invention. It will be readily apparent to those skilled in the art that various modifications to these embodiments may be made, and the generic principles described herein may be applied to other embodiments without the exercise of inventive faculty. Therefore, the present invention is not limited to the embodiments described herein; improvements and modifications made by those skilled in the art based on the disclosure of the present invention shall fall within the protection scope of the present invention.

Claims (8)

1. An image copying-pasting tampering detection algorithm based on an autocorrelation feature pyramid network, characterized in that:
rich feature spatial-distribution information is obtained by constructing an autocorrelation feature pyramid network, and global features and local features are combined to obtain a more accurate matching result; the method mainly comprises the following three parts: feature extraction, construction of the autocorrelation feature pyramid, and generation of the tampered-region mask.
2. The algorithm of claim 1, wherein:
in the feature extraction part, the first four layers of a convolutional neural network are adopted to extract features from the original input image; the network continuously extracts and compresses the features, so that the higher layers finally yield more reliable global features while the lower-layer features contain more local spatial information, and the features of different layers are used in the subsequent task of constructing the autocorrelation feature pyramid.
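As a purely illustrative sketch (not part of the claimed method), the idea of collecting feature maps at several scales from the early stages of a network can be mimicked in numpy, using 2x2 average pooling as a toy stand-in for the convolution-and-compression stages; all function names here are assumptions:

```python
import numpy as np

def pool2x2(x):
    """2x2 average pooling -- a toy stand-in for one conv + downsampling stage."""
    h, w = (x.shape[0] // 2) * 2, (x.shape[1] // 2) * 2
    x = x[:h, :w]  # crop to even dimensions
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def multi_scale_features(image, n_layers=4):
    """Collect the progressively compressed feature map after each stage.

    Early (large) maps keep local spatial detail; later (small) maps are
    closer to a global summary -- mirroring the low-/high-layer split above.
    """
    feats, x = [], image
    for _ in range(n_layers):
        x = pool2x2(x)
        feats.append(x)
    return feats
```

Each element of the returned list then plays the role of one level's input when building the autocorrelation feature pyramid.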
3. The algorithm according to claim 1 or 2, characterized in that:
in the autocorrelation feature pyramid part, the image features extracted by the convolutional neural network are converted into correlation representations among different image blocks, and an autocorrelation feature pyramid based on the convolutional neural network is constructed; the pyramid layers respectively correspond to autocorrelation features of scales from low order to high order, and different pyramid layers contain different semantic information.
4. The algorithm of claim 3, wherein:
the specific process of the autocorrelation characteristic pyramid part is as follows:
for an n-layer autocorrelation feature pyramid $P_n$, let $f_k$ denote the feature tensor generated by the k-th layer (k = 0, …, n−1) of the feature extraction network, of size $w_k \times h_k \times d_k$ ($w_k, h_k, d_k \in \mathbb{N}$); the feature tensor can also be regarded as $w_k \times h_k$ block features, i.e.

$$f_k = \{\, f_k[i_r, i_c] \mid 0 \le i_r < h_k,\; 0 \le i_c < w_k \,\}$$

wherein each $f_k[i_r, i_c]$ has $d_k$ dimensions; the feature extraction network as a whole generates n feature tensors of different scales, i.e. $F_k = \{f_1, \ldots, f_n\}$;
for the k-th layer feature tensor $f_k$, given two block features $f_k[i]$ ($i = (i_r, i_c)$) and $f_k[j]$ ($j = (j_r, j_c)$), the correlation of the two features is quantified by cosine similarity, calculated as:

$$\cos(\theta_{i,j}) = \hat{f}_k[i] \cdot \hat{f}_k[j]$$

$$\hat{f}_k[i] = \frac{f_k[i] - \mu_k[i]}{\sigma_k[i]}$$

wherein $\hat{f}_k[i]$ is the normalized result of $f_k[i]$, $\mu_k[i]$ is the mean of $f_k[i]$, and $\sigma_k[i]$ is the standard deviation of $f_k[i]$; for a given $f_k[i]$ in $f_k$, the above calculation is repeated over the $m_k$ ($m_k = w_k \times h_k$) block features $f_k[j]$ to form an autocorrelation vector $s_k[i] = [\cos(\theta_{i,0}), \ldots, \cos(\theta_{i,j}), \ldots, \cos(\theta_{i,m-1})]$, and the autocorrelation vector is sorted in descending order to obtain $s'_k[i] = \mathrm{sort}(s_k[i])$;

finally, the k-th layer feature tensor $f_k$ forms, through the above feature calculation, a similarity matrix $s'_k$ of size $w_k \times h_k \times m_k$; the components of all elements in the similarity matrix are arranged in descending order, and the earlier a component ranks, the higher the correlation between the corresponding pixel points and the higher the possibility that they have been tampered.
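The per-block cosine-similarity computation and descending sort described above can be sketched in numpy as follows. This is an illustrative reading of the claim, not the patented implementation: the function name, the small epsilon terms, and the extra unit-length normalization (which makes the dot product a true cosine after mean/std standardization) are assumptions.

```python
import numpy as np

def autocorrelation_features(f_k):
    """Compute the sorted self-similarity tensor s'_k for one pyramid layer.

    f_k : feature tensor of shape (h_k, w_k, d_k).
    Returns an array of shape (h_k, w_k, m_k) with m_k = h_k * w_k, where
    each length-m_k vector holds the cosine similarities of one block
    feature against all blocks, sorted in descending order.
    """
    h, w, d = f_k.shape
    m = h * w
    flat = f_k.reshape(m, d).astype(np.float64)
    # Standardize each block feature (the f_hat normalization above).
    mu = flat.mean(axis=1, keepdims=True)
    sigma = flat.std(axis=1, keepdims=True)
    norm = (flat - mu) / (sigma + 1e-8)
    # Scale to unit length so the dot product equals cos(theta_{i,j}).
    norm /= np.linalg.norm(norm, axis=1, keepdims=True) + 1e-8
    sims = norm @ norm.T                   # (m, m) pairwise similarities
    sims_sorted = -np.sort(-sims, axis=1)  # descending sort per row
    return sims_sorted.reshape(h, w, m)
```

Because every block is maximally similar to itself, the first sorted component is always ~1; duplicated (copy-pasted) regions show up as additional near-1 components early in each vector.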
5. The algorithm of claim 4, wherein:
the weight of the k-th layer of the autocorrelation feature pyramid is set as

(weight formula reproduced as image FDA0002611810870000021 in the original; not available in text)

and the n-layer autocorrelation feature pyramid $P_n$ is then expressed as:

(formula reproduced as image FDA0002611810870000022 in the original; not available in text)
6. the algorithm of claim 1, wherein:
the features are decoded and restored to the original resolution to generate a tampered-region mask consistent with the original image size; the decoding module implements image-size restoration by convolution plus upsampling.
7. The algorithm of claim 6, wherein:
the convolution branches perform convolution operations on the feature map input to the decoding module; after convolution is finished, the feature maps generated by the convolution branches are combined by vector concatenation, the combined feature map is then upsampled, and finally the output feature map is generated.
8. The algorithm of claim 7, wherein:
the upsampling is performed in a bilinear interpolation mode.
CN202010764142.0A 2020-08-02 2020-08-02 Image copying-pasting tampering detection algorithm based on autocorrelation characteristic pyramid network Pending CN114078132A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010764142.0A CN114078132A (en) 2020-08-02 2020-08-02 Image copying-pasting tampering detection algorithm based on autocorrelation characteristic pyramid network


Publications (1)

Publication Number Publication Date
CN114078132A true CN114078132A (en) 2022-02-22

Family

ID=80279321

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010764142.0A Pending CN114078132A (en) 2020-08-02 2020-08-02 Image copying-pasting tampering detection algorithm based on autocorrelation characteristic pyramid network

Country Status (1)

Country Link
CN (1) CN114078132A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114511576A (en) * 2022-04-19 2022-05-17 山东建筑大学 Image segmentation method and system for scale self-adaptive feature enhanced deep neural network
CN116563565A (en) * 2023-04-27 2023-08-08 贵州财经大学 Digital image tampering identification and source region and target region positioning method based on field adaptation, computer equipment and storage medium
CN116563565B (en) * 2023-04-27 2024-04-02 贵州财经大学 Digital image tampering identification and source region and target region positioning method based on field adaptation, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
Xiao et al. Image splicing forgery detection combining coarse to refined convolutional neural network and adaptive clustering
Lee Copy-move image forgery detection based on Gabor magnitude
Kadam et al. [Retracted] Efficient Approach towards Detection and Identification of Copy Move and Image Splicing Forgeries Using Mask R‐CNN with MobileNet V1
Rani et al. Identification of copy-move and splicing based forgeries using advanced SURF and revised template matching
Wang et al. A new keypoint-based copy-move forgery detection for color image
Steinebach et al. Fake news detection by image montage recognition
Fang et al. Image classification with an RGB-channel nonsubsampled contourlet transform and a convolutional neural network
Amiri et al. Copy-Move Forgery Detection by an Optimal Keypoint on SIFT (OKSIFT) Method.
Su et al. Hierarchical image resampling detection based on blind deconvolution
Ahmed et al. A comparative analysis of image copy-move forgery detection algorithms based on hand and machine-crafted features
CN114078132A (en) Image copying-pasting tampering detection algorithm based on autocorrelation characteristic pyramid network
Liu et al. Image deblocking detection based on a convolutional neural network
Isaac et al. Multiscale local gabor phase quantization for image forgery detection
CN115439325A (en) Low-resolution hyperspectral image processing method and device and computer program product
Sabeena et al. Convolutional block attention based network for copy-move image forgery detection
Kaur et al. A deep learning framework for copy-move forgery detection in digital images
Yang et al. Robust copy-move forgery detection based on multi-granularity superpixels matching
Vijayalakshmi K et al. Copy-paste forgery detection using deep learning with error level analysis
Ye et al. A super-resolution method of remote sensing image using transformers
El_Tokhy Development of precise forgery detection algorithms in digital radiography images using convolution neural network
Ernawati et al. Image Splicing Forgery Approachs: A Review and Future Direction
Isnanto et al. Determination of the optimal threshold value and number of keypoints in scale invariant feature transform-based copy-move forgery detection
Guorui et al. Image forgery detection based on the convolutional neural network
CN118155261B (en) Fake image detection method based on double-flow feature extraction and multi-scale feature enhancement
Archana et al. Image forgery detection in forensic science using optimization based deep learning models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination