CN116912184B - Weak supervision depth restoration image tampering positioning method and system based on tampering area separation and area constraint loss - Google Patents
- Publication number: CN116912184B (application CN202310800166.0A)
- Authority: CN (China)
- Prior art keywords: image, tampered, tampering, training, level
- Prior art date
- Legal status: Active (the status listed is an assumption and not a legal conclusion; Google has not performed a legal analysis and makes no representation as to its accuracy)
Classifications
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- G06N3/084—Backpropagation, e.g. using gradient descent
- G06N3/0895—Weakly supervised learning, e.g. semi-supervised or self-supervised learning
- G06N3/096—Transfer learning
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06V10/764—Image or video recognition using classification, e.g. of video objects
- G06V10/774—Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06V10/82—Image or video recognition using neural networks
- G06T2207/20081—Training; Learning
- G06T2207/20084—Artificial neural networks [ANN]
- Y02T10/40—Engine management systems
Abstract
The invention discloses a weak supervision depth restoration image tampering positioning method and system based on tampering area separation and area constraint loss, relating to the technical field of restoration image detection and positioning. The key technical points of the invention include: acquiring pre-training set image data and training set image data; preprocessing the image data; inputting the preprocessed pre-training set image data into a neural-network-based pre-training detection model for pre-training, and obtaining trained pre-training model parameters; and, taking the pre-training model parameters as initial values of the network parameters, inputting the preprocessed training set image data into a neural-network-based weak supervision tampering positioning model for training, and obtaining a trained weak supervision tampering positioning model. The invention requires no accurate tamper labels in advance and learns from only a small number of training samples with image-level label information, which reduces training difficulty and improves positioning efficiency.
Description
Technical Field
The invention relates to the technical field of repair image detection and positioning, in particular to a weak supervision depth repair image tampering positioning method and system based on tampering area separation and area constraint loss.
Background
With the rapid development of Internet technology and social networks, digital media, mainly images, videos and audio, have gradually become the main carriers of data. Among these information carriers, images are the most expressive: they store rich visual information and are the most readily accepted by human beings. Images usually contain important information, so maintaining image integrity is essential. Image restoration (inpainting) is the process of recovering damaged or missing areas of an image from the information in its undamaged areas. It has wide real-world applications, such as repairing damaged images and removing unwanted regions.
Weakly supervised algorithms are among the popular research directions in image processing in recent years. Traditional image processing algorithms need large amounts of pixel-level annotation data as training sets, consuming substantial manpower, material resources and time. Weakly supervised algorithms can instead perform accurate image segmentation or localization using training data with only partial supervision (e.g., image-level labels or bounding boxes). This effectively reduces dependence on annotated data and lowers cost and time overhead. In practical applications, because reality contains various kinds of noise and interference, acquiring high-quality pixel-level annotations of tampered images is often very difficult, and traditional tamper localization algorithms are easily biased by these factors. The advantages of weakly supervised tamper localization are therefore especially evident: using more information and prior knowledge improves the robustness and generalization of the algorithm, and using weak label information helps avoid overfitting and class imbalance, further improving robustness. In addition, weakly supervised methods can exploit image context, tampering characteristics, texture information and the like to improve the generalization of the algorithm.
Existing depth restoration image tamper localization methods train a neural network to localize the tampered region of a given restored image by approximating the restoration mask, thereby ignoring the contrastive information of high-dimensional features between the depth restoration image and the original image. Weakly supervised tamper localization means that no accurate annotation data are needed when training the model; learning is guided only by weak supervision information, namely whether an image has been tampered with by a depth restoration technique, so that tampered regions of the image are identified automatically. In real-world scenarios, depth restoration is often used to remove one or more objects from an image, i.e., it appears as the removal detection task in image forensics. Pixel-level approaches to this task require large amounts of manual labeling and data generation, consuming substantial labor and material costs.
Disclosure of Invention
Therefore, the invention provides a weak supervision depth restoration image tampering positioning method and system based on tampering area separation and area constraint loss, which brings the perspective of depth restoration image tamper localization into the weakly supervised setting so as to reduce labeling cost and improve efficiency and scalability.
According to an aspect of the invention, there is provided a weak supervision depth restoration image tampering localization method based on tampering region separation and region constraint loss, the method comprising the steps of:
step one, acquiring pre-training set image data and training set image data;
step two, preprocessing the image data;
step three, inputting the preprocessed pre-training set image data into a pre-training detection model based on a neural network for pre-training, and obtaining trained pre-training model parameters;
step four, taking the pre-training model parameters as initial values of network parameters, inputting the preprocessed training set image data into a weak supervision and tampering positioning model based on a neural network for training, and obtaining a trained weak supervision and tampering positioning model;
step five, inputting the image to be detected into the trained weak supervision tampering positioning model, and determining the tampering area in the image to be detected.
Further, the pre-training set image data in step one comprise real images, depth restoration tampered images with large tampered areas, and the corresponding first-level image-level labels; the training set image data comprise depth restoration tampered images with small tampered areas and the corresponding second-level image-level labels.
Further, the preprocessing in step two includes: converting a real image or depth restoration tampered image from the RGB color space to a noise space using a steganalysis rich model (SRM) layer to obtain the corresponding noise image; and concatenating the real image or depth restoration tampered image with its corresponding noise image.
Further, the training process of the weak supervision tampering positioning model based on the neural network in the fourth step includes:
inputting the concatenated image and noise image into a neural network to obtain depth features of the image; reducing the number of channels of the depth features to 2 through a convolution layer; and selecting the class-1 feature as the class activation map of the initial tampered class;
separating the tampered-region features and non-tampered-region features in the depth restoration tampered image, calculating the image-level label loss function and the region constraint loss function, and updating the class activation map;
the first-level image-level label loss function L cls Second-level image-level tag loss function L 0 And L 1 Region constraint loss function L ac Weighting is carried out to be used as a total loss function of the weak supervision tampering positioning model; the total loss function is expressed as: l=l cls +λ(L 0 +L 1 )+μL ac λ and μ are weight parameters;
and optimizing network parameters of the neural network by adopting a back propagation algorithm, and obtaining a trained weak supervision tampering positioning model.
Further, in step four, the process of separating tampered-region features and non-tampered-region features in the depth restoration tampered image during training of the neural-network-based weak supervision tampering positioning model includes:
for the depth feature of the depth restoration tampered image, generating a coarse real-pixel prior A_0 and a tampered-pixel prior A_1 according to its spatial positions,
where Z represents the depth feature and P_i represents the class activation map of class 1, i.e., the tampered class;
passing the coarse real-pixel prior A_0 and the tampered-pixel prior A_1 each through a global max-pooling layer to generate image-level features Z_0 and Z_1, which serve as image-level labels of the non-tampered-region and tampered-region features, respectively.
Further, in step four, during training of the neural-network-based weak supervision tampering positioning model, the first-level image-level label loss function L_cls and the second-level image-level label loss functions L_0 and L_1 all adopt the cross-entropy function, calculated as:

L = -(1/N) Σ_{i=1}^{N} Σ_{c=1}^{M} y_ic · log(p_ic)

where N represents the batch size and M the number of classes; y_ic is an indicator function taking the value 0 or 1: y_ic is 1 if the true class of sample i equals c, and 0 otherwise; p_ic represents the predicted probability that sample i belongs to class c;
the calculation formula of the region constraint loss function is as follows:
wherein,indicating tampered region featuresProportion of the whole image->Representing the proportion of the non-tampered region features to the whole image; h and W represent the height and width of the image.
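The cross-entropy term and the weighted total loss can be sketched in numpy as follows. This is a minimal illustration, not the patent's implementation: the λ and μ values are arbitrary, and the region-constraint term is passed in as a precomputed scalar since its exact formula is not reproduced here.

```python
import numpy as np

def cross_entropy(y_true: np.ndarray, p_pred: np.ndarray) -> float:
    """L = -(1/N) * sum_i sum_c y_ic * log(p_ic).
    y_true: one-hot labels (N, M); p_pred: predicted probabilities (N, M)."""
    N = y_true.shape[0]
    return float(-(y_true * np.log(p_pred + 1e-12)).sum() / N)

def total_loss(L_cls: float, L_0: float, L_1: float, L_ac: float,
               lam: float = 0.5, mu: float = 0.5) -> float:
    """L = L_cls + lambda * (L_0 + L_1) + mu * L_ac (weights illustrative)."""
    return L_cls + lam * (L_0 + L_1) + mu * L_ac

# A perfect one-hot prediction gives (near-)zero cross-entropy:
y = np.array([[1.0, 0.0], [0.0, 1.0]])
p = np.array([[1.0, 0.0], [0.0, 1.0]])
L_cls = cross_entropy(y, p)   # ~0
```

A uniform prediction instead gives cross-entropy log(2) per sample, which is the usual sanity check for a two-class head.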
According to another aspect of the present invention, there is provided a weak supervision depth repair image tamper localization system based on tamper zone separation and zone constraint loss, the system comprising:
an image acquisition module configured to acquire pre-training set image data and training set image data;
a preprocessing module configured to preprocess image data;
the pre-training model training module is configured to input the preprocessed pre-training set image data into a pre-training detection model based on a neural network for pre-training, and obtain trained pre-training model parameters;
the tampering positioning model training module is configured to take the pre-training model parameters as initial values of network parameters, input the preprocessed training set image data into a weak supervision tampering positioning model based on a neural network for training, and acquire a trained weak supervision tampering positioning model;
the tampering locating module is configured to input the image to be detected into the trained weak supervision tampering locating model, and determine a tampering area in the image to be detected.
Further, the pre-training set image data in the image acquisition module comprise real images, depth restoration tampered images with large tampered areas, and the corresponding first-level image-level labels; the training set image data comprise depth restoration tampered images with small tampered areas and the corresponding second-level image-level labels.
Further, the preprocessing in the preprocessing module includes: converting a real image or depth restoration tampered image from the RGB color space to a noise space using a steganalysis rich model (SRM) layer to obtain the corresponding noise image; and concatenating the real image or depth restoration tampered image with its corresponding noise image.
Further, the training process of the weak supervision tampering positioning model based on the neural network in the tampering positioning model training module comprises the following steps:
inputting the concatenated image and noise image into a neural network to obtain depth features of the image; reducing the number of channels of the depth features to 2 through a convolution layer; and selecting the class-1 feature as the class activation map of the initial tampered class;
separating the tampered-region features and non-tampered-region features in the depth restoration tampered image, calculating the image-level label loss function and the region constraint loss function, and updating the class activation map; the process of separating the tampered-region and non-tampered-region features includes: for the depth feature of the depth restoration tampered image, generating a coarse real-pixel prior A_0 and a tampered-pixel prior A_1 according to its spatial positions,
where Z represents the depth feature and P_i represents the class activation map of class 1, i.e., the tampered class;
passing the coarse real-pixel prior A_0 and the tampered-pixel prior A_1 each through a global max-pooling layer to generate image-level features Z_0 and Z_1, which serve as image-level labels of the non-tampered-region and tampered-region features, respectively;
the first-level image-level label loss function L_cls, the second-level image-level label loss functions L_0 and L_1, and the region constraint loss function L_ac are weighted to form the total loss function of the weak supervision tampering positioning model, expressed as: L = L_cls + λ(L_0 + L_1) + μL_ac, where λ and μ are weight parameters; the first-level image-level label loss function L_cls and the second-level image-level label loss functions L_0 and L_1 all adopt the cross-entropy function, calculated as: L = -(1/N) Σ_{i=1}^{N} Σ_{c=1}^{M} y_ic · log(p_ic), where N represents the batch size and M the number of classes; y_ic is an indicator function taking the value 0 or 1: y_ic is 1 if the true class of sample i equals c, and 0 otherwise; p_ic represents the predicted probability that sample i belongs to class c;
the region constraint loss function is calculated from the proportion of the whole image occupied by the tampered-region features and the proportion occupied by the non-tampered-region features, where H and W represent the height and width of the image;
and optimizing network parameters of the neural network by adopting a back propagation algorithm, and obtaining a trained weak supervision tampering positioning model.
The beneficial technical effects of the invention are as follows:
conventional approaches typically rely on supervised learning, requiring a large number of images and corresponding tamper location annotations to train the model, however obtaining sufficient annotation data is very difficult. In order to solve the problem, the invention provides a weak supervision depth restoration image tampering positioning method and a weak supervision depth restoration image tampering positioning system based on tampering area separation and area constraint loss, which separate tampering area characteristics and real image area characteristics and respectively carry out classification training, and build a combined loss containing tampering area real area classification loss and area constraint loss to supervise learning. Experimental results show that the method can well position the tampered area in the input image, and obtains higher performance compared with the traditional repairing image tampering positioning method, especially for concentrated and smaller tampered objects.
Drawings
The invention may be better understood by reference to the following description taken in conjunction with the accompanying drawings, which are included to provide a further illustration of the preferred embodiments of the invention and to explain the principles and advantages of the invention, together with the detailed description below.
Fig. 1 is a flowchart of a weak supervision depth restoration image tampering positioning method based on tampering region separation and region constraint loss according to an embodiment of the invention.
Fig. 2 is a schematic diagram of a training process of a weak supervision tampering localization model based on a neural network in an embodiment of the invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, exemplary embodiments or examples of the present invention will be described below with reference to the accompanying drawings. It is apparent that the described embodiments or examples are only implementations or examples of a part of the invention, not all. All other embodiments or examples, which may be made by one of ordinary skill in the art without undue burden, are intended to be within the scope of the present invention based on the embodiments or examples herein.
The embodiment of the invention provides a weak supervision depth restoration image tampering positioning method based on tampering area separation and area constraint loss, which comprises the following steps as shown in figures 1-2:
step one, acquiring pre-training set image data and training set image data;
according to an embodiment of the invention, the image data comprises a depth repair tampered image, a real image and a corresponding image-level tag. The true images are randomly extracted from Places, and the corresponding first-level image level labels are 0; the depth restoration tampered image is divided into two types, wherein one type is that the tampered area is large, tampered pixels account for more than 20% of the pixels of the image, and the depth restoration tampered image with irregular tampered areas is used as pre-training set data; secondly, the tampered area is smaller, tampered pixels account for less than 10% of the pixels of the image, and the depth restoration tampered image in the tampered area is used as training set data. The depth restoration tampered image corresponds to a first-level image level tag of 1. The original images of the depth restoration tampered images are randomly extracted from the Places. In the depth restoration tampered image, the tampered area corresponds to the second-level image-level label to be 1, and the non-tampered area corresponds to the second-level image-level label to be 0.
Step two, preprocessing the image data;
according to the embodiment of the invention, image noise extraction is carried out on each picture; the image noise extraction is to convert a picture from an RGB color space to a noise space by using a steganography analysis rich model layer, and the calculation formula is as follows:
wherein W is s Representing a fixed convolution kernel, X represents the input depth-restored tampered image,representing a convolution operation. The image and noise are then cascade-operated as inputs to the neural network.
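As a sketch, the noise-space conversion W_s ∗ X amounts to a fixed high-pass convolution per channel followed by concatenation. The 3×3 kernel below is one common SRM-style second-order filter chosen for illustration; the patent does not specify which fixed kernels it uses.

```python
import numpy as np

# One fixed SRM-style high-pass kernel (second-order derivative filter);
# an assumption for illustration, not the patent's exact kernel set.
W_s = np.array([[0,  0, 0],
                [1, -2, 1],
                [0,  0, 0]], dtype=np.float32)

def conv2d_same(x: np.ndarray, k: np.ndarray) -> np.ndarray:
    """Plain 'same'-padded 2-D convolution of a single-channel image."""
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros_like(x, dtype=np.float32)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def to_noise_and_concat(img: np.ndarray) -> np.ndarray:
    """Convert an H x W x C image to noise space channel-wise and
    concatenate image and noise along the channel axis (H x W x 2C)."""
    noise = np.stack([conv2d_same(img[..., c], W_s)
                      for c in range(img.shape[-1])], axis=-1)
    return np.concatenate([img, noise], axis=-1)

x = np.random.rand(16, 16, 3).astype(np.float32)
net_input = to_noise_and_concat(x)   # shape (16, 16, 6)
```

Note that a constant (noise-free) image maps to an all-zero noise channel, since the kernel's coefficients sum to zero; that is the sense in which the filter isolates local noise residuals.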
The method focuses on local noise distribution of an image, adopts a plurality of submodels to extract different types of features, can better characterize damage of tampering to various correlations of neighborhood pixels, lightens the burden of subsequent tampering detection, and belongs to the preprocessing step of the subsequent tampering detection.
Inputting the preprocessed pre-training set image data into a pre-training detection model based on a neural network for pre-training, and obtaining trained pre-training model parameters;
according to the embodiment of the invention, a neural network is trained by utilizing a certain amount of tampered images and unrepaired image data sets to detect whether the images are tampered, so that the network is self-adaptive and focuses on the difference between the real images and the tampered images. Firstly, using a cascade image and noise obtained by processing a pre-training image data set to input the cascade image and noise into a neural network training neural network so as to detect whether an input image is a tampered image or not, and obtaining a neural network model capable of detecting whether the image is tampered or not.
The neural network model employs a neural network for feature extraction, such as any one or a combination of the ResNet series, Xception series, or VGG series. The model contains only an encoder and a classifier, used to extract image features and detect whether an image has been tampered with. The encoder uses, for example, ResNet, a convolutional neural network well suited to image feature extraction. The classification loss uses the cross-entropy function.
The pre-training set image data guide the detection network to better distinguish tampered images from real images; pre-training on tampered images with irregular tampered areas lets the neural network adapt to a wider variety of tampering conditions, improving the robustness and reliability of detection.
Step four, taking the pre-training model parameters as initial values of network parameters, inputting the preprocessed training set image data into a weak supervision and tampering positioning model based on a neural network for training, and obtaining a trained weak supervision and tampering positioning model;
according to the embodiment of the invention, the weak supervision tampering positioning model based on the neural network is used for calculating the probability that an input picture is a tampered image, so as to obtain an image-level label of each image; and the probability value of each pixel as a tampered pixel can be calculated, an output probability image is obtained for each picture, the image-level label is a numerical value between 0 and 1, the label approaching 0 represents that the image is a real image, and the label approaching 1 represents that the image is a tampered image. Wherein the shape and size of each output probability map are equal to the shape and size of the input picture. In each output probability map, the value of each pixel represents the probability that the pixel in the same source picture is a tampered pixel. For example, a pixel value greater than a threshold value indicates that the pixel is a tampered pixel, and a pixel value less than the threshold value indicates that the pixel is a non-tampered pixel.
The process of training the weak supervision tamper localization model is approximately as follows:
1) Locating the initial tampered area: the tampered image dataset is trained again with the pre-trained classification network; because this network carries prior knowledge of the differences between real and tampered images, it can roughly locate the initial tampered area, yielding the class activation map (also called the seed region) of the initial tampered class. The neural network is trained by transfer learning on a training dataset generated using a set of masks and rules. The concatenated image and noise are input into the neural network to obtain depth features of the image of size 2048×8×8; the number of channels is reduced from 2048 to 2 through a convolution layer, and the class-1 feature is selected as the class activation map of the initial tampered class.
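The channel-reduction step can be sketched as a 1×1 convolution implemented as a per-pixel matrix product. Shapes follow the 2048×8×8 feature stated above; random weights stand in for the learned convolution.

```python
import numpy as np

C, H, W = 2048, 8, 8
rng = np.random.default_rng(0)

Z = rng.standard_normal((C, H, W)).astype(np.float32)    # depth features
W_conv = rng.standard_normal((2, C)).astype(np.float32)  # 1x1 conv: 2048 -> 2

# A 1x1 convolution is a linear map over channels at every spatial position.
feat2 = np.einsum("kc,chw->khw", W_conv, Z)              # shape (2, 8, 8)

cam = feat2[1]    # class-1 map = class activation map of the initial tampered class
```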
This is because, in the class activation map (CAM), the image-level tamper features used to estimate the classification score are vulnerable to contamination by non-tamper features. In addition, some real-image pixel cues associated with the tampered-region object may also help the classifier identify the image class. The image-level labels in the weakly supervised tamper localization task include samples satisfying label y = 0; and in a tampered image both background and foreground are present, i.e., samples with y = 1 contain regions with y = 0, and these y = 0 regions can help the network identify noise and pixel differences between the tampered and non-tampered regions of the image.
2) The invention therefore proposes a class activation map based on tampered-region separation: tampering features and non-tampering features are separated, the image-level label loss function and the region constraint loss function are computed, and the class activation map is updated.
By aggregating the features at the potential positions of the tampered-region part and the non-tampered-region part of an image whose label is 1, two image-level features (Z_0 and Z_1) are obtained. To achieve this, a coarse real-pixel prior A_0 and a tampered-pixel prior A_1 are first generated for each spatial position of the depth feature Z. Specifically, by feeding the extracted image features Z into a convolution with softmax activation, a location prior A_1 representing the predicted tampered pixels and a prior A_0 representing the real pixels can be obtained:
wherein P_i ∈ R^{1×HW} is the class activation map of class 1, i.e., the tampered class, and Z ∈ R^{C×HW} is the neural network depth feature. P_i is calculated as:
P_i = relu(W_2(sigmoid(Z·W_1 + b_1)))
wherein W_1 ∈ R^{HW×2} is the weight of the activation head and b_1 ∈ R^{C×2} is its bias, activated with the sigmoid(·) function; the selected class activation map is then smoothed by a convolution layer whose weight W_2 ∈ R^{C×C} is activated with the relu(·) function.
Then, according to the two priors A_0 and A_1, a global max-pooling layer is applied to Z to generate image-level features Z_0 and Z_1, which serve as the image-level representations of the real and tampered features, respectively.
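The separation step can be sketched as follows. This assumes, as one plausible reading not spelled out in the text, that the two priors come from a softmax over a two-channel 1×1 convolution and that each prior weights the depth feature before global max pooling; the weights here are random placeholders.

```python
import numpy as np

def separate_features(z: np.ndarray, w: np.ndarray):
    """z: (C, HW) depth features; w: (C, 2) 1x1-conv weights.
    Returns image-level features (Z0, Z1), each of shape (C,)."""
    logits = w.T @ z                                   # (2, HW)
    e = np.exp(logits - logits.max(axis=0, keepdims=True))
    a = e / e.sum(axis=0, keepdims=True)               # priors A0, A1, each (HW,)
    z0 = (z * a[0]).max(axis=1)                        # real-region feature
    z1 = (z * a[1]).max(axis=1)                        # tampered-region feature
    return z0, z1

rng = np.random.default_rng(1)
z0, z1 = separate_features(rng.normal(size=(16, 64)), rng.normal(size=(16, 2)))
assert z0.shape == (16,) and z1.shape == (16,)
```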
Thanks to these two priors, the image-level tampered target feature Z_1 is less contaminated by real-pixel features than the image-level feature of the original CAM; at the same time, the additional image-level real-pixel feature Z_0 simulates features aggregated from real images. The separated tampered-region and non-tampered-region features are fed back into the classification layer of the detection network for classification: in theory, the image-level label of the tampered-region feature should be 1 and that of the non-tampered-region feature should be 0. Their loss functions use the cross entropy function and are denoted L_1 and L_0, respectively.
However, such a design tends to make the partition of the non-tampered area smaller and smaller, i.e., the number of pixels contained in A_0 gradually decreases. A region constraint function is therefore added to ensure that the background partition is both small enough to contain no tampered region and large enough to contain all non-tampered regions. This guarantees that the neural network learns adaptively so that the class activation map approaches the true mask of the tampered region. The region constraint loss function is calculated as follows; its numerator represents the proportion of the tampered area in the CAM and its denominator the proportion of the real-image area:
L_ac = (ΣA_1/(H·W)) / (ΣA_0/(H·W))

wherein ΣA_1/(H·W) represents the proportion of tampering features in the whole image and ΣA_0/(H·W) the proportion of non-tampering features; H and W represent the height and width of the image. This arrangement ensures that the numerator gradually decreases and the denominator gradually increases during training, i.e., the CAM is optimized to gradually approach the tamper mask.
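A minimal sketch of the region constraint loss as described: the ratio of the image proportion covered by the tampered prior A_1 to that covered by the real prior A_0, so minimizing it shrinks the predicted tampered region and grows the real region. The elementwise prior maps and the small epsilon are assumptions for illustration.

```python
import numpy as np

def region_constraint_loss(a0: np.ndarray, a1: np.ndarray, eps: float = 1e-8) -> float:
    """a0, a1: (H, W) real / tampered priors with values in [0, 1]."""
    hw = a0.size  # H * W
    return float((a1.sum() / hw) / (a0.sum() / hw + eps))

# Toy priors: tampered prior covers 25% of the image, real prior 75%.
a1 = np.full((8, 8), 0.25)
a0 = np.full((8, 8), 0.75)
loss = region_constraint_loss(a0, a1)  # ratio close to 1/3
```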
The first-level image-level label loss function L_cls, corresponding to real and tampered images, and the second-level image-level label loss functions L_0 and L_1, corresponding to the tampered and non-tampered regions within a tampered image, all adopt the cross entropy function, calculated as:
L = -(1/N) Σ_{i=1}^{N} Σ_{c=1}^{M} y_ic·log(p_ic)

wherein N represents the batch size and M the number of categories; y_ic is a sign function with value 0 or 1: if the true class of sample i equals c, y_ic is 1, otherwise 0; p_ic represents the predicted probability that sample i belongs to category c.
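The cross entropy function above can be sketched directly in NumPy; the one-hot labels and probabilities below are illustrative values, not data from the patent.

```python
import numpy as np

def cross_entropy(y: np.ndarray, p: np.ndarray, eps: float = 1e-12) -> float:
    """y: (N, M) one-hot labels; p: (N, M) predicted class probabilities.
    Returns the mean negative log-likelihood over the batch."""
    return float(-(y * np.log(p + eps)).sum(axis=1).mean())

y = np.array([[1.0, 0.0], [0.0, 1.0]])
p = np.array([[0.9, 0.1], [0.2, 0.8]])
ce = cross_entropy(y, p)  # -(log 0.9 + log 0.8) / 2
```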
3) The trained weakly supervised tampering localization model is obtained by fine-tuning the neural network through back propagation. The four loss functions are weighted for supervised training of the neural network and cooperate to improve tamper detection performance. The network parameters of the neural network model are optimized with a back propagation algorithm according to the loss values calculated with the deep supervision method. The total loss function is:
L = L_cls + λ(L_0 + L_1) + μL_ac
wherein λ and μ are weight coefficients, both set to 0.5.
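Combining the four losses as in the total loss function is straightforward; the sketch below simply applies the weighting with λ = μ = 0.5 as stated.

```python
def total_loss(l_cls: float, l0: float, l1: float, l_ac: float,
               lam: float = 0.5, mu: float = 0.5) -> float:
    """L = L_cls + lambda * (L0 + L1) + mu * L_ac."""
    return l_cls + lam * (l0 + l1) + mu * l_ac

# e.g. total_loss(1.0, 0.4, 0.6, 0.2) -> 1.0 + 0.5*1.0 + 0.5*0.2 = 1.6
```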
Inputting the image to be detected into a trained weak supervision tampering positioning model, and determining a tampering area in the image to be detected.
Experiments further demonstrate the technical effect of the invention.
The method of the present invention is compared with existing CAM acquisition methods and conventional forensic methods. Five classical depth-restoration image tampering localization algorithms are used as baselines: (1) the traditional forensic method LDI, (2) the ELA method, (3) the CAM method, (4) the GradCAM method, and (5) the LayerCAM method. The test aims to detect and locate the repaired area in the image under inspection, and the detection performance of the network is measured by the intersection-over-union (IOU). The evaluation results are shown in Table 1. Compared with the other existing methods, the proposed method performs better on the data set, showing that it captures tampered regions in the image better using only image-level labels: it relies solely on weak supervision signals, requires no manual pixel-level annotation of tampered regions, and thus reduces the labeling burden. The experimental results show that even without pixel-level annotation the method achieves strong performance, significantly reducing the difficulty of acquiring large-scale data.
TABLE 1

Model | Repair method | Cross ratio IOU
---|---|---
LDI | GC | 7.89
ELA | GC | 36.09
CAM | GC | 22.69
GradCAM | GC | 14.16
LayerCAM | GC | 14.23
The invention | GC | 50.83
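The IOU metric used in Table 1 can be sketched as follows for binary tamper masks; the toy masks are illustrative, not evaluation data.

```python
import numpy as np

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over union of two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter / union) if union else 1.0

# Prediction covers two pixels, ground truth one of them: IOU = 1/2.
pred = np.array([[1, 1], [0, 0]])
gt = np.array([[1, 0], [0, 0]])
val = iou(pred, gt)
```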
In addition, diverse Inpainting Dataset was chosen as the generalization test set. And selecting an image in which the image falsification area occupies pixels of 10% or less of the image, a total of 2258 images, each of 5% -10%, 2% -5% and 2% or less occupying proportions of 1:3:1. in order to evaluate the generalization performance of the method, the method is compared with three methods of ELA, CAM and GradCAM, and the generalization capability of ELA is very strong, so that the method is not limited to a certain repair algorithm and has very strong universality; only CAM and GradCAM were chosen because other CAM methods such as GradCAM++ and LayerCAM have significantly different effects. The test results are shown in Table 2.
As can be seen from Table 2, the method of the present invention exhibits a certain generalization ability when handling different image restoration methods and performs better in tampering localization. In particular, it still localizes tampering better than CAM and GradCAM even when applied to images processed with different repair methods.
TABLE 2
Another embodiment of the present invention proposes a weak supervision depth restoration image tampering localization system based on tampering region separation and region constraint loss, the system comprising:
an image acquisition module configured to acquire pre-training set image data and training set image data;
a preprocessing module configured to preprocess image data;
the pre-training model training module is configured to input the preprocessed pre-training set image data into a pre-training detection model based on a neural network for pre-training, and obtain trained pre-training model parameters;
the tampering positioning model training module is configured to take the pre-training model parameters as initial values of network parameters, input the preprocessed training set image data into a weak supervision tampering positioning model based on a neural network for training, and acquire a trained weak supervision tampering positioning model;
the tampering locating module is configured to input the image to be detected into the trained weak supervision tampering locating model, and determine a tampering area in the image to be detected.
In this embodiment, preferably, the pre-training set image data in the image acquisition module includes a real image, a depth repair tampered image with a large tampered area, and a corresponding first-level image-level tag; the training set image data comprises depth restoration tampered images with small tampered areas and corresponding second-level image-level labels.
In this embodiment, preferably, the preprocessing in the preprocessing module includes: converting a real image or a depth restoration tampered image from an RGB color space to a noise space by using a steganography analysis rich model layer method, and obtaining a corresponding noise image; and carrying out cascading operation on the real image or the depth restoration tampered image and the corresponding noise image.
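A hedged sketch of this preprocessing step: the steganalysis rich model (SRM) layer is a bank of fixed high-pass filters; here a single illustrative second-order residual filter stands in for the full bank, and the noise map is concatenated with the RGB image along the channel axis.

```python
import numpy as np

def highpass_noise(channel: np.ndarray) -> np.ndarray:
    """Horizontal second-order residual [1, -2, 1], zero-padded to keep shape."""
    out = np.zeros_like(channel, dtype=np.float64)
    out[:, 1:-1] = channel[:, :-2] - 2 * channel[:, 1:-1] + channel[:, 2:]
    return out

def preprocess(image: np.ndarray) -> np.ndarray:
    """image: (3, H, W) RGB. Returns (6, H, W): RGB concatenated with noise."""
    noise = np.stack([highpass_noise(c) for c in image])
    return np.concatenate([image.astype(np.float64), noise], axis=0)

# A linear ramp has zero second-order residual, so its noise map is all zeros.
img = np.arange(3 * 4 * 4, dtype=np.float64).reshape(3, 4, 4)
out = preprocess(img)
assert out.shape == (6, 4, 4)
```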
In this embodiment, preferably, the training process of the weak supervision tamper localization model based on the neural network in the tamper localization model training module includes:
inputting the concatenated image and noise image into the neural network to obtain depth features of the image; reducing the number of channels of the depth features to 2 through a convolution layer; and selecting the feature of class 1 as the class activation map of the initial tampered class;
separating tampered region features and non-tampered region features in the depth restoration tampered image, calculating an image-level label loss function and a region constraint loss function, and updating the class activation map; the process of separating the tampered region features and the non-tampered region features in the depth restoration tampered image comprises: for the depth feature of the depth restoration tampered image, generating a coarse real-pixel prior A_0 and a tampered-pixel prior A_1 according to its spatial positions:
wherein Z represents the depth feature; P_i represents the class activation map of class 1, i.e., the tampered image;
generating image-level features Z_0 and Z_1 from the coarse real-pixel prior A_0 and the tampered-pixel prior A_1, respectively, through global max-pooling layers, as the image-level labels of the non-tampered region feature and the tampered region feature, respectively;
weighting the first-level image-level label loss function L_cls, the second-level image-level label loss functions L_0 and L_1, and the region constraint loss function L_ac as the total loss function of the weakly supervised tampering localization model; the total loss function is expressed as: L = L_cls + λ(L_0 + L_1) + μL_ac, where λ and μ are weight coefficients; the first-level image-level label loss function L_cls and the second-level image-level label loss functions L_0 and L_1 all adopt the cross entropy function, calculated as:
L = -(1/N) Σ_{i=1}^{N} Σ_{c=1}^{M} y_ic·log(p_ic)

wherein N represents the batch size and M the number of categories; y_ic is a sign function with value 0 or 1: if the true class of sample i equals c, y_ic is 1, otherwise 0; p_ic represents the predicted probability that sample i belongs to category c;
the calculation formula of the region constraint loss function is:

L_ac = (ΣA_1/(H·W)) / (ΣA_0/(H·W))

wherein ΣA_1/(H·W) represents the proportion of tampered-region features in the whole image and ΣA_0/(H·W) the proportion of non-tampered-region features; H and W represent the height and width of the image;
and optimizing network parameters of the neural network by adopting a back propagation algorithm, and obtaining a trained weak supervision tampering positioning model.
The function of the weak supervision depth restoration image tampering positioning system based on tampered region separation and region constraint loss can be explained by the weak supervision depth restoration image tampering positioning method based on tampered region separation and region constraint loss described above; the system embodiment is therefore not described in detail here, and reference may be made to the corresponding description of the method embodiment.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of the above description, will appreciate that other embodiments are contemplated within the scope of the invention as described herein. The disclosure of the present invention is intended to be illustrative, but not limiting, of the scope of the invention, which is defined by the appended claims.
Claims (6)
1. The weak supervision depth restoration image tampering positioning method based on tampering area separation and area constraint loss is characterized by comprising the following steps of:
step one, acquiring pre-training set image data and training set image data;
step two, preprocessing the image data;
inputting the preprocessed pre-training set image data into a pre-training detection model based on a neural network for pre-training, and obtaining trained pre-training model parameters;
step four, taking the pre-training model parameters as initial values of network parameters, inputting the preprocessed training set image data into a weak supervision and tampering positioning model based on a neural network for training, and obtaining a trained weak supervision and tampering positioning model; the training process of the weak supervision tampering positioning model based on the neural network comprises the following steps:
inputting the concatenated image and noise image into the neural network to obtain depth features of the image; reducing the number of channels of the depth features to 2 through a convolution layer; and selecting the feature of class 1 as the class activation map of the initial tampered class;
separating tampered region features and non-tampered region features in the depth restoration tampered image, calculating an image-level label loss function and a region constraint loss function, and updating the class activation map, specifically: for the depth feature of the depth restoration tampered image, generating a coarse real-pixel prior A_0 and a tampered-pixel prior A_1 according to its spatial positions:
wherein Z represents the depth feature; P_i represents the class activation map of class 1, i.e., the tampered image; generating image-level features Z_0 and Z_1 from the coarse real-pixel prior A_0 and the tampered-pixel prior A_1, respectively, through global max-pooling layers, as the image-level labels of the non-tampered region feature and the tampered region feature, respectively;
weighting the first-level image-level label loss function L_cls, the second-level image-level label loss functions L_0 and L_1, and the region constraint loss function L_ac as the total loss function of the weakly supervised tampering localization model; the total loss function is expressed as: L = L_cls + λ(L_0 + L_1) + μL_ac, where λ and μ are weight coefficients; the first-level image-level label loss function L_cls and the second-level image-level label loss functions L_0 and L_1 all adopt the cross entropy function, calculated as:
L = -(1/N) Σ_{i=1}^{N} Σ_{c=1}^{M} y_ic·log(p_ic)

wherein N represents the batch size and M the number of categories; y_ic is a sign function with value 0 or 1: if the true class of sample i equals c, y_ic is 1, otherwise 0; p_ic represents the predicted probability that sample i belongs to category c;
the calculation formula of the region constraint loss function is:

L_ac = (ΣA_1/(H·W)) / (ΣA_0/(H·W))

wherein ΣA_1/(H·W) represents the proportion of tampered-region features in the whole image and ΣA_0/(H·W) the proportion of non-tampered-region features; H and W represent the height and width of the image;
optimizing network parameters of the neural network by adopting a back propagation algorithm, and acquiring a trained weak supervision tampering positioning model;
inputting the image to be detected into a trained weak supervision tampering positioning model, and determining a tampering area in the image to be detected.
2. The weak supervision depth restoration image tampering localization method based on tampered area separation and area constraint loss according to claim 1, wherein the pre-training set image data in the first step comprises a real image, a depth restoration tampered image with a large tampered area and a corresponding first-level image-level tag; the training set image data comprises depth restoration tampered images with small tampered areas and corresponding second-level image-level labels.
3. The method for tamper localization of weakly supervised depth restoration image based on tamper zone separation and zone constraint loss of claim 2, wherein the preprocessing in step two comprises: converting a real image or a depth restoration tampered image from an RGB color space to a noise space by using a steganography analysis rich model layer method, and obtaining a corresponding noise image; and carrying out cascading operation on the real image or the depth restoration tampered image and the corresponding noise image.
4. A weak supervision depth restoration image tampering localization system based on tampered region separation and region constraint loss, characterized by comprising:
an image acquisition module configured to acquire pre-training set image data and training set image data;
a preprocessing module configured to preprocess image data;
the pre-training model training module is configured to input the preprocessed pre-training set image data into a pre-training detection model based on a neural network for pre-training, and obtain trained pre-training model parameters;
the tampering positioning model training module is configured to take the pre-training model parameters as initial values of network parameters, input the preprocessed training set image data into a weak supervision tampering positioning model based on a neural network for training, and acquire a trained weak supervision tampering positioning model; the training process of the weak supervision tampering positioning model based on the neural network comprises: inputting the concatenated image and noise image into the neural network to obtain depth features of the image; reducing the number of channels of the depth features to 2 through a convolution layer; and selecting the feature of class 1 as the class activation map of the initial tampered class;
separating tampered region features and non-tampered region features in the depth restoration tampered image, calculating an image-level label loss function and a region constraint loss function, and updating the class activation map; the process of separating the tampered region features and the non-tampered region features in the depth restoration tampered image comprises: for the depth feature of the depth restoration tampered image, generating a coarse real-pixel prior A_0 and a tampered-pixel prior A_1 according to its spatial positions:
wherein Z represents the depth feature; P_i represents the class activation map of class 1, i.e., the tampered image;
generating image-level features Z_0 and Z_1 from the coarse real-pixel prior A_0 and the tampered-pixel prior A_1, respectively, through global max-pooling layers, as the image-level labels of the non-tampered region feature and the tampered region feature, respectively;
weighting the first-level image-level label loss function L_cls, the second-level image-level label loss functions L_0 and L_1, and the region constraint loss function L_ac as the total loss function of the weakly supervised tampering positioning model; the total loss function is expressed as: L = L_cls + λ(L_0 + L_1) + μL_ac, where λ and μ are weight coefficients; the first-level image-level label loss function L_cls and the second-level image-level label loss functions L_0 and L_1 all adopt the cross entropy function, calculated as:
L = -(1/N) Σ_{i=1}^{N} Σ_{c=1}^{M} y_ic·log(p_ic)

wherein N represents the batch size and M the number of categories; y_ic is a sign function with value 0 or 1: if the true class of sample i equals c, y_ic is 1, otherwise 0; p_ic represents the predicted probability that sample i belongs to category c;
the calculation formula of the region constraint loss function is:

L_ac = (ΣA_1/(H·W)) / (ΣA_0/(H·W))

wherein ΣA_1/(H·W) represents the proportion of tampered-region features in the whole image and ΣA_0/(H·W) the proportion of non-tampered-region features; H and W represent the height and width of the image;
optimizing network parameters of the neural network by adopting a back propagation algorithm, and acquiring a trained weak supervision tampering positioning model;
the tampering locating module is configured to input the image to be detected into the trained weak supervision tampering locating model, and determine a tampering area in the image to be detected.
5. The weak supervision depth restoration image tampering localization system based on tampered area separation and area constraint loss according to claim 4, wherein the pre-training set image data in the image acquisition module comprises a real image, a depth restoration tampered image with a large tampered area and a corresponding first-level image-level tag; the training set image data comprises depth restoration tampered images with small tampered areas and corresponding second-level image-level labels.
6. The weak supervision depth repair image tamper localization system based on tamper zone separation and zone constraint loss of claim 5, wherein the preprocessing in the preprocessing module comprises: converting a real image or a depth restoration tampered image from an RGB color space to a noise space by using a steganography analysis rich model layer method, and obtaining a corresponding noise image; and carrying out cascading operation on the real image or the depth restoration tampered image and the corresponding noise image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310800166.0A CN116912184B (en) | 2023-06-30 | 2023-06-30 | Weak supervision depth restoration image tampering positioning method and system based on tampering area separation and area constraint loss |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310800166.0A CN116912184B (en) | 2023-06-30 | 2023-06-30 | Weak supervision depth restoration image tampering positioning method and system based on tampering area separation and area constraint loss |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116912184A CN116912184A (en) | 2023-10-20 |
CN116912184B true CN116912184B (en) | 2024-02-23 |
Family
ID=88359401
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310800166.0A Active CN116912184B (en) | 2023-06-30 | 2023-06-30 | Weak supervision depth restoration image tampering positioning method and system based on tampering area separation and area constraint loss |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116912184B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117407562B (en) * | 2023-12-13 | 2024-04-05 | 杭州海康威视数字技术股份有限公司 | Image recognition method, system and electronic equipment |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111212291A (en) * | 2020-01-14 | 2020-05-29 | 广东工业大学 | DFL-CNN network-based video intra-frame object removal tamper detection method |
CN111899251A (en) * | 2020-08-06 | 2020-11-06 | 中国科学院深圳先进技术研究院 | Copy-move type forged image detection method for distinguishing forged source and target area |
CN112614111A (en) * | 2020-12-24 | 2021-04-06 | 南开大学 | Video tampering operation detection method and device based on reinforcement learning |
WO2021190451A1 (en) * | 2020-03-24 | 2021-09-30 | 华为技术有限公司 | Method and apparatus for training image processing model |
CN113505670A (en) * | 2021-06-29 | 2021-10-15 | 西南交通大学 | Remote sensing image weak supervision building extraction method based on multi-scale CAM and super-pixels |
CN114418840A (en) * | 2021-12-15 | 2022-04-29 | 深圳先进技术研究院 | Image splicing positioning detection method based on attention mechanism |
CN114913183A (en) * | 2021-02-07 | 2022-08-16 | 上海交通大学 | Image segmentation method, system, apparatus and medium based on constraint |
WO2023076467A1 (en) * | 2021-10-27 | 2023-05-04 | Monovo, LLC | Encrypting data generated from medical devices |
CN116152575A (en) * | 2023-04-18 | 2023-05-23 | 之江实验室 | Weak supervision target positioning method, device and medium based on class activation sampling guidance |
CN116342857A (en) * | 2023-03-28 | 2023-06-27 | 武汉大学 | Weak supervision target positioning method based on category correction |
-
2023
- 2023-06-30 CN CN202310800166.0A patent/CN116912184B/en active Active
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111212291A (en) * | 2020-01-14 | 2020-05-29 | 广东工业大学 | DFL-CNN network-based video intra-frame object removal tamper detection method |
WO2021190451A1 (en) * | 2020-03-24 | 2021-09-30 | 华为技术有限公司 | Method and apparatus for training image processing model |
CN111899251A (en) * | 2020-08-06 | 2020-11-06 | 中国科学院深圳先进技术研究院 | Copy-move type forged image detection method for distinguishing forged source and target area |
CN112614111A (en) * | 2020-12-24 | 2021-04-06 | 南开大学 | Video tampering operation detection method and device based on reinforcement learning |
CN114913183A (en) * | 2021-02-07 | 2022-08-16 | 上海交通大学 | Image segmentation method, system, apparatus and medium based on constraint |
CN113505670A (en) * | 2021-06-29 | 2021-10-15 | 西南交通大学 | Remote sensing image weak supervision building extraction method based on multi-scale CAM and super-pixels |
WO2023076467A1 (en) * | 2021-10-27 | 2023-05-04 | Monovo, LLC | Encrypting data generated from medical devices |
CN114418840A (en) * | 2021-12-15 | 2022-04-29 | 深圳先进技术研究院 | Image splicing positioning detection method based on attention mechanism |
CN116342857A (en) * | 2023-03-28 | 2023-06-27 | 武汉大学 | Weak supervision target positioning method based on category correction |
CN116152575A (en) * | 2023-04-18 | 2023-05-23 | 之江实验室 | Weak supervision target positioning method, device and medium based on class activation sampling guidance |
Also Published As
Publication number | Publication date |
---|---|
CN116912184A (en) | 2023-10-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111311563B (en) | Image tampering detection method based on multi-domain feature fusion | |
Korus et al. | Multi-scale fusion for improved localization of malicious tampering in digital images | |
JP3740065B2 (en) | Object extraction device and method based on region feature value matching of region-divided video | |
Qin et al. | Moving cast shadow removal based on local descriptors | |
CN110598609A (en) | Weak supervision target detection method based on significance guidance | |
Varnousfaderani et al. | Weighted color and texture sample selection for image matting | |
CN112818862A (en) | Face tampering detection method and system based on multi-source clues and mixed attention | |
CN101971190A (en) | Real-time body segmentation system | |
Yang et al. | Spatiotemporal trident networks: detection and localization of object removal tampering in video passive forensics | |
Zhang et al. | A novel text detection system based on character and link energies | |
CN104657980A (en) | Improved multi-channel image partitioning algorithm based on Meanshift | |
Hussain et al. | Robust pre-processing technique based on saliency detection for content based image retrieval systems | |
CN107622280B (en) | Modularized processing mode image saliency detection method based on scene classification | |
CN103530638A (en) | Method for matching pedestrians under multiple cameras | |
CN116912184B (en) | Weak supervision depth restoration image tampering positioning method and system based on tampering area separation and area constraint loss | |
Chen et al. | SNIS: A signal noise separation-based network for post-processed image forgery detection | |
CN104408728A (en) | Method for detecting forged images based on noise estimation | |
Wan et al. | A new technique for summarizing video sequences through histogram evolution | |
CN111274964B (en) | Detection method for analyzing water surface pollutants based on visual saliency of unmanned aerial vehicle | |
Tralic et al. | Video frame copy-move forgery detection based on cellular automata and local binary patterns | |
CN114863464A (en) | Second-order identification method for PID drawing picture information | |
CN111832497B (en) | Text detection post-processing method based on geometric features | |
CN109902690A (en) | Image recognition technology | |
CN112308040A (en) | River sewage outlet detection method and system based on high-definition images | |
Vijayalakshmi K et al. | Copy-paste forgery detection using deep learning with error level analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |