CN110942456A - Tampered image detection method, device, equipment and storage medium

Tampered image detection method, device, equipment and storage medium

Info

Publication number
CN110942456A
CN110942456A
Authority
CN
China
Prior art keywords
image
detected
tampered
dimensional semantic
pixel
Prior art date
Legal status
Granted
Application number
CN201911189124.8A
Other languages
Chinese (zh)
Other versions
CN110942456B (en)
Inventor
廖红虹
邹雨晗
章放
杨海军
徐倩
杨强
Current Assignee
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date
Filing date
Publication date
Application filed by WeBank Co Ltd filed Critical WeBank Co Ltd
Priority to CN201911189124.8A priority Critical patent/CN110942456B/en
Publication of CN110942456A publication Critical patent/CN110942456A/en
Application granted granted Critical
Publication of CN110942456B publication Critical patent/CN110942456B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/40 Analysis of texture
    • G06T 7/41 Analysis of texture based on statistical description of texture
    • G06T 7/44 Analysis of texture based on statistical description of texture using image operators, e.g. filters, edge density metrics or local histograms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a tampered image detection method, device, equipment and storage medium. The method includes: acquiring a first multi-dimensional semantic feature of an image to be detected based on a preset multi-dimensional semantic detection model; performing pixel consistency detection on the image to be detected according to the first multi-dimensional semantic feature; and, if a region with inconsistent pixels exists in the image to be detected, determining that the image to be detected is a tampered image, thereby improving the accuracy of image tampering detection.

Description

Tampered image detection method, device, equipment and storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for detecting a tampered image.
Background
With the development of computer technology, more and more technologies (big data, distributed computing, blockchain, artificial intelligence, etc.) are being applied in the financial field, and the traditional financial industry is gradually shifting toward financial technology (Fintech). However, the security and real-time requirements of the financial industry also place higher demands on these technologies.
With the popularization of image editing software and its lower barrier to use, more and more tampered images appear in everyday life. At present, tampered images are mainly detected with traditional image processing approaches, most of which operate only at the image level and can hardly perform detection at the pixel level, so the detection accuracy for tampered images is low.
Disclosure of Invention
The invention provides a method, a device, equipment and a storage medium for detecting a tampered image, and aims to improve the accuracy of detecting the tampered image.
In order to achieve the above object, the present invention provides a method for detecting a tampered image, the method including:
acquiring a first multi-dimensional semantic feature of an image to be detected based on a preset multi-dimensional semantic detection model;
carrying out pixel consistency detection on the image to be detected according to the first multi-dimensional semantic features;
and if the to-be-detected image has an area with inconsistent pixels, judging that the to-be-detected image is a tampered image.
Preferably, after the step of determining that the image to be detected is a tampered image if there is a region with inconsistent pixels in the image to be detected, the method further includes:
respectively searching a corresponding tampered pixel position and a corresponding tampered edge position in the first multi-dimensional semantic feature according to a first pre-generated predicted pixel mask and a first pre-generated predicted edge mask;
and predicting a tampering mode and/or a tampering area of the image to be detected by combining the feature sampling.
Preferably, before the step of searching for corresponding tampered pixel locations and tampered edge locations in the first multi-dimensional semantic feature according to the pre-generated first predicted pixel mask and the first predicted edge mask, respectively, the method further includes:
generating the first predicted pixel mask according to a result of the pixel consistency detection;
and carrying out edge consistency detection on the image to be detected according to the first multi-dimensional semantic features, and generating the first predicted edge mask according to the result of the edge consistency detection.
Preferably, the feature sampling includes global sampling and local sampling, and the step of predicting the tampering mode and/or tampering region of the image to be detected by combining the feature sampling includes:
acquiring integral first multi-dimensional semantic features of the image to be detected through the global sampling;
acquiring a first local multi-dimensional semantic feature of a region corresponding to the first prediction pixel mask and the first prediction edge mask through the local sampling;
combining the whole first multi-dimensional semantic features and the local first multi-dimensional semantic features to obtain combined features;
and analyzing the combined features to predict a tampering mode and a tampering area of the image to be detected.
Preferably, the feature sampling includes global sampling and local sampling, and the step of predicting the tampering mode and/or tampering region of the image to be detected by combining the feature sampling includes:
acquiring integral first multi-dimensional semantic features of the image to be detected through the global sampling;
acquiring a local pixel first multi-dimensional semantic feature of a region corresponding to the first prediction pixel mask through the local sampling, and predicting a tampering mode of the image to be detected according to the whole first multi-dimensional semantic feature and the local pixel first multi-dimensional semantic feature; or
And acquiring a local edge first multi-dimensional semantic feature of a region corresponding to the first prediction edge mask through the local sampling, and predicting a tampered region of the image to be detected according to the whole first multi-dimensional semantic feature and the local edge first multi-dimensional semantic feature.
Preferably, the first multi-dimensional semantic feature model includes a fully convolutional network (FCN) model, and the step of obtaining the first multi-dimensional semantic feature of the image to be detected based on the preset multi-dimensional semantic detection model further includes:
inputting a training image into a pre-constructed FCN model for training to obtain an initial FCN model;
and inputting a verification image into the initial FCN model for verification, and saving the initial FCN model as the FCN model after the verification passes.
Preferably, the step of inputting the verification image into the initial FCN model for verification and saving the initial FCN model as the FCN model after the verification passes includes:
inputting a check image into the initial FCN model to obtain a second multi-dimensional semantic feature of the check image;
acquiring a corresponding second prediction mask from the second multi-dimensional semantic features;
calculating a loss value of the check image according to the second prediction mask;
and if the loss value is within a preset range, judging that the verification is passed, and storing the corresponding initial FCN model as the FCN model.
In addition, to achieve the above object, an embodiment of the present invention further provides a tampered image detecting device, including:
the acquisition module is used for acquiring a first multi-dimensional semantic feature of an image to be detected based on a preset multi-dimensional semantic detection model;
the detection module is used for carrying out pixel consistency detection on the image to be detected according to the first multi-dimensional semantic features;
and the judging module is used for judging that the image to be detected is a tampered image if the image to be detected has an area with inconsistent pixels.
In addition, in order to achieve the above object, an embodiment of the present invention further provides a tampered image detecting device, where the tampered image detecting device includes a processor, a memory, and a tampered image detecting program stored in the memory, and when the tampered image detecting program is executed by the processor, the steps of the tampered image detecting method are implemented.
In addition, in order to achieve the above object, an embodiment of the present invention further provides a computer storage medium, where a tampered image detection program is stored on the computer storage medium, and when the tampered image detection program is executed by a processor, the steps of the tampered image detection method are implemented.
Compared with the prior art, the invention provides a tampered image detection method, device, equipment and storage medium. A first multi-dimensional semantic feature of an image to be detected is acquired based on a preset multi-dimensional semantic detection model; pixel consistency detection is performed on the image to be detected according to the first multi-dimensional semantic feature; and, if a region with inconsistent pixels exists in the image to be detected, the image to be detected is determined to be a tampered image, thereby improving the accuracy of image tampering detection.
Drawings
Fig. 1 is a schematic diagram of a hardware configuration of a tamper image detection device according to embodiments of the present invention;
FIG. 2 is a schematic flow chart of a tampered image detection method according to a first embodiment of the present invention;
FIG. 3 is a schematic view of a scene of an embodiment of a method for detecting a tampered image according to the present invention
FIG. 4 is a schematic flow chart diagram illustrating a tampered image detection method according to a second embodiment of the present invention;
FIG. 5 is a block flow diagram of one embodiment of a tamper image detection method of the present invention;
fig. 6 is a functional block diagram of the tamper image detection apparatus according to the first embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The tampered image detection device involved in the embodiments of the present invention is a device capable of network connection, such as a server or a cloud platform. In addition, the mobile terminal involved in the embodiments of the present invention may be a mobile network device such as a mobile phone or a tablet computer.
Referring to fig. 1, fig. 1 is a schematic diagram of a hardware structure of a tampered image detection device according to embodiments of the present invention. In this embodiment of the present invention, the tampered image detection device may include a processor 1001 (e.g., a Central Processing Unit, CPU), a communication bus 1002, an input port 1003, an output port 1004, and a memory 1005. The communication bus 1002 is used for realizing connection communication among these components; the input port 1003 is used for data input; the output port 1004 is used for data output; the memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as a magnetic disk memory, and may optionally be a storage device independent of the processor 1001. Those skilled in the art will appreciate that the hardware configuration depicted in FIG. 1 is not intended to limit the present invention, and the device may include more or fewer components than those shown, combine some components, or arrange the components differently.
With continued reference to fig. 1, the memory 1005 of fig. 1, which is one of the readable storage media, may include an operating system, a network communication module, an application program module, and a tamper image detection program. In fig. 1, the network communication module is mainly used for connecting to a server and performing data communication with the server; and the processor 1001 may call the falsified image detection program stored in the memory 1005 and execute the falsified image detection method provided by the embodiment of the present invention.
The embodiment of the invention provides a method for detecting a tampered image.
With the popularization of image editing software and its lower barrier to use, more and more tampered images appear in everyday production and daily life. Image editing software such as Photoshop and similar retouching tools make it easy to tamper with images, so what an image shows is not necessarily real. Tampered images created for particular purposes spread across the network and can easily propagate false information or rumors, undermining public trust and unsettling people who cannot tell whether the images are genuine.
The prior art mainly proposes solutions for specific kinds of tampering, such as image splicing, removing an object from an image, copying an object within an image multiple times, image enhancement, and the like. The methods adopted are also mainly image-level ones: for example, exploiting the fact that double compression of a JPEG (Joint Photographic Experts Group) image tends to produce compression artifacts, or checking the consistency of image edges and colors, the consistency of the camera imaging model, the consistency of imaging noise, and so on, to determine whether an image has been tampered with.
However, these methods cover too few tampering types: they usually target one specific type or a few common types and are often ineffective when multiple tampering modes are combined. Moreover, localization of the tampered position is not accurate enough. In the prior art, tamper localization is mainly based on image blocks, so localization accuracy depends on the block size, and a trade-off must be made between localization accuracy and category accuracy. That is, smaller image blocks yield higher localization accuracy but may reduce the detection accuracy of the tampering category, causing some normal images to be judged as tampered images.
In short, the currently detectable tampering types are too few and accurate localization is difficult, so the detection accuracy for tampered images is not high. To solve the problems of low image-tampering detection accuracy and inaccurate detection and localization of the image tampering mode, this embodiment provides a tampered image detection method.
Referring to fig. 2, fig. 2 is a schematic flowchart of a tampered image detection method according to a first embodiment of the present invention.
In this embodiment, the tampered image detection method is applied to a tampered image detection device, and the method includes:
step S101, acquiring a first multi-dimensional semantic feature of an image to be detected based on a preset multi-dimensional semantic detection model;
the first multidimensional semantic feature model includes a full convolution Network FCN model, and it is understood that the multidimensional semantic feature model may also be other Deep learning models, for example, models including CNN (Convolutional neural networks), DBN (Deep Belief networks), and the like. FCN (full Convolutional network) converts the full link layer in conventional CNN (Convolutional Neural network) into Convolutional layers one by one, and after many convolutions, the obtained image becomes smaller and smaller, and the resolution becomes lower and lower. The FCN may identify different objects/objects in the image with different identified pixels. In this embodiment, the FCN model adopts a VGG (Visual Geometry Group Network) to extract the multidimensional semantic features of the image. The result of the multi-dimensional semantic features is that each pixel belongs to some specific object background.
Further, the multidimensional semantic feature model may also be a machine learning model, or other intelligent decision making model.
Specifically, the image to be detected is input into the FCN model. The FCN model extracts features of the image to be detected through a multi-layer convolutional neural network, fuses and processes the obtained features through techniques such as upsampling and cross-channel information fusion, and finally obtains the first multi-dimensional semantic features of the image to be detected.
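As a rough illustration of this pipeline (a minimal sketch under assumptions, not the patent's actual network), the following Python/PyTorch code shows a small VGG-like backbone whose deep features are fused with shallower ones via 1x1 convolutions and upsampled back to the input resolution, yielding one feature vector per pixel. All layer counts, channel widths, and the class name FCNSemanticFeatures are illustrative assumptions.

```python
# Minimal sketch only: an FCN-style extractor with a small VGG-like backbone,
# upsampling and cross-channel (skip) fusion. Layer sizes are illustrative
# assumptions, not the patent's actual architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FCNSemanticFeatures(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        def block(cin, cout):  # two 3x3 convs followed by 2x2 max pooling
            return nn.Sequential(
                nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True),
                nn.MaxPool2d(2))
        self.stage1 = block(3, 64)     # 1/2 resolution
        self.stage2 = block(64, 128)   # 1/4 resolution
        self.stage3 = block(128, 256)  # 1/8 resolution
        # 1x1 convs for cross-channel fusion of features from different depths
        self.lat2 = nn.Conv2d(128, feat_dim, 1)
        self.lat3 = nn.Conv2d(256, feat_dim, 1)
        self.out = nn.Conv2d(feat_dim, feat_dim, 3, padding=1)

    def forward(self, x):
        h, w = x.shape[-2:]
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        # upsample the deepest features and fuse them with shallower ones
        fused = self.lat2(f2) + F.interpolate(self.lat3(f3), size=f2.shape[-2:],
                                              mode="bilinear", align_corners=False)
        # upsample back to the input resolution: one feature vector per pixel
        return self.out(F.interpolate(fused, size=(h, w),
                                      mode="bilinear", align_corners=False))

# features = FCNSemanticFeatures()(torch.rand(1, 3, 256, 256))  # shape (1, 64, 256, 256)
```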
The method for acquiring the first multi-dimensional semantic features of the image to be detected based on the preset multi-dimensional semantic detection model comprises the following steps:
s1001, inputting a training image into a pre-constructed FCN model for training to obtain an initial FCN model;
in this embodiment, a large number of training images are collected in advance, and the training images include non-tampered images and tampered images that have been tampered in various ways. And marking the training image and marking the tampering mode of the training image.
And (3) requiring a professional to classify the training images according to experience, firstly classifying the training images into tampered images and non-tampered images, and further classifying the tampered images secondarily according to the specific tampering mode of the tampered images.
The tampering modes comprise 7 basic modes, namely contrast enhancement, quantization, noise addition, morphology, blurring, resampling and compression, as well as 24 fine modes under these 7 basic modes: automatic enhancement, histogram equalization, quantization, dithering, Gaussian noise, salt-and-pepper noise, Poisson noise, uniform noise, closing operation, opening operation, dilation operation, erosion operation, box blur, Gaussian blur, median blur, wavelet blur, area-based resampling, cubic resampling, Lanczos resampling (named after the Hungarian mathematician), bilinear resampling, nearest-neighbor resampling, JPEG compression, JPEG double compression and WEBP compression. The tampering-mode annotation of a training image covers both the basic mode and the fine mode. The tampered images are first classified into the 7 basic-mode classes, and a secondary classification is then performed within each basic mode, i.e., the tampered training images are sub-classified according to the fine mode.
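For reference, the taxonomy above can be written down as a simple mapping; the grouping of the 24 fine modes under the 7 basic modes is inferred from the list and is an assumption where the text does not state the grouping explicitly.

```python
# Sketch of the tampering-mode taxonomy described above (7 basic modes, 24 fine
# modes). The grouping of fine modes under basic modes is an inferred assumption.
TAMPER_MODES = {
    "contrast_enhancement": ["auto_enhancement", "histogram_equalization"],
    "quantization":         ["quantization", "dithering"],
    "noise":                ["gaussian_noise", "salt_and_pepper_noise",
                             "poisson_noise", "uniform_noise"],
    "morphology":           ["closing", "opening", "dilation", "erosion"],
    "blur":                 ["box_blur", "gaussian_blur", "median_blur", "wavelet_blur"],
    "resampling":           ["area_resampling", "cubic_resampling", "lanczos_resampling",
                             "bilinear_resampling", "nearest_neighbor_resampling"],
    "compression":          ["jpeg_compression", "jpeg_double_compression", "webp_compression"],
}
assert sum(len(v) for v in TAMPER_MODES.values()) == 24  # matches the 24 fine modes
```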
After the training images are classified, a labeling tool is used to annotate them at the pixel level. The pixel-level label of each pixel is either untampered, denoted P0, or tampered, denoted P1. For training images that contain tampered pixels, the corresponding tampering mode is further annotated according to the secondary classification result. Both the basic mode and the fine mode of the tampering can be annotated at the same time; in other embodiments, only the basic mode or only the fine mode may be annotated.
After the training images are labeled with their tampering modes, they are input into the pre-constructed FCN, which extracts features through a multi-layer convolutional neural network and fuses and processes these features through techniques such as upsampling and cross-channel information fusion, finally obtaining the multi-dimensional semantic features of the training images. This completes the construction of the initial FCN model, yielding the initial FCN model.
And S1002, inputting a verification image into the initial FCN model for verification, and saving the initial FCN model as the FCN model after the verification passes.
Specifically, the step S1002 includes:
step a, inputting a check image into the initial FCN model to obtain a second multi-dimensional semantic feature of the check image;
b, acquiring a corresponding second prediction mask from the second multi-dimensional semantic features;
in particular, the second prediction mask includes a second prediction pixel mask and a second prediction edge mask. The second predicted pixel mask is a two-dimensional pixel mask matrix having the same size as the verification image, each element position of the two-dimensional pixel mask matrix corresponds to a pixel position in the verification image, and in the two-dimensional pixel mask matrix, a matrix value of a tampered pixel is set to 1, and a matrix value of an untampered pixel is set to 0. That is, the second predicted pixel mask is a matrix consisting of 0 and 1.
The second multi-dimensional semantic features are analyzed to judge whether a region with inconsistent pixels exists in the verification image. If no region with inconsistent pixels exists in the verification image, the verification image is not a tampered image; if a region with inconsistent pixels exists in the verification image, the region is located according to the second multi-dimensional semantic features and the second predicted pixel mask is generated.
And if the matrix values in the second prediction pixel mask are all 0, indicating that the verification image is not a tampered image. If the second prediction pixel mask has a matrix value with a value of 1, the verification image is a tampered image, and whether an area with inconsistent edges exists in the verification image is continuously judged.
The regions with inconsistent edges in the verification image are extracted according to the second multi-dimensional semantic features, and the corresponding second predicted edge mask is generated. In particular, the second predicted edge mask is similar to the second predicted pixel mask: it is a two-dimensional edge-mask matrix of the same size as the verification image, each element position of which corresponds to a pixel position in the verification image; in this matrix, the value for a tampered pixel is set to 1 and the value for an untampered pixel is set to 0. Therefore, after the second predicted pixel mask is obtained, the second multi-dimensional semantic features of the verification image are further analyzed, the regions with inconsistent edges are extracted, and the corresponding predicted edge mask is generated.
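As a minimal sketch (under assumptions) of how per-pixel predictions become the 0/1 masks described here, the code below thresholds a map of per-pixel tampering probabilities into a predicted pixel mask and derives the image-level decision. The prediction head producing tamper_prob and the 0.5 threshold are assumptions, not details given by the patent.

```python
# Illustrative sketch: per-pixel tamper scores (e.g. from a prediction head on
# the semantic features) thresholded into the 0/1 predicted pixel mask.
import numpy as np

def predicted_pixel_mask(tamper_prob: np.ndarray, threshold: float = 0.5):
    """tamper_prob: (H, W) array of per-pixel tampering probabilities."""
    pixel_mask = (tamper_prob > threshold).astype(np.uint8)  # 1 = tampered, 0 = untampered
    is_tampered = bool(pixel_mask.any())  # any value of 1 => the image is tampered
    return pixel_mask, is_tampered
```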
C, calculating the loss value of the check image according to the second prediction mask;
specifically, a second pixel mask of the verification image is extracted according to the verification pixel level label of the verification image; extracting edge information according to the second pixel mask, and taking the boundary of the tampered pixel and the non-tampered pixel as a second edge mask; calculating loss values of the second prediction pixel mask and the second pixel mask to obtain a pixel mask loss value, and calculating loss values of the second prediction edge mask and the second edge mask to obtain an edge mask loss value of the verification image;
in this embodiment, a second pixel mask of the verification image is extracted in advance according to the verification pixel level label of the verification image; and extracting edge information according to the second pixel mask, and taking the boundary of the tampered pixel and the non-tampered pixel as a second edge mask.
A pixel-mask loss value is obtained by computing the loss between the second predicted pixel mask and the second pixel mask, and an edge-mask loss value of the verification image is obtained by computing the loss between the second predicted edge mask and the second edge mask. In particular, a loss value measures the difference between the result predicted by the FCN model and the ground truth obtained from the annotations. The loss value may be an information-entropy (cross-entropy) distance or a Euclidean distance.
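The two distances mentioned here could be computed roughly as follows; this is a sketch only, and the epsilon clamp and the use of soft probabilities on the prediction side are assumptions.

```python
# Sketch of the two loss terms: a pixel-mask loss and an edge-mask loss between
# predicted masks and ground-truth masks, using either a cross-entropy
# (information-entropy) distance or a Euclidean distance.
import numpy as np

def cross_entropy_loss(pred_prob: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    p = np.clip(pred_prob, eps, 1.0 - eps)
    return float(-(target * np.log(p) + (1 - target) * np.log(1 - p)).mean())

def euclidean_loss(pred_prob: np.ndarray, target: np.ndarray) -> float:
    return float(np.sqrt(((pred_prob - target) ** 2).sum()))

# pixel_mask_loss = cross_entropy_loss(pred_pixel_prob, gt_pixel_mask)
# edge_mask_loss  = cross_entropy_loss(pred_edge_prob,  gt_edge_mask)
```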
And d, if the loss value is within a preset range, judging that the verification is passed, and storing the corresponding initial FCN model as the FCN model.
In this embodiment, the preset range of the pixel mask loss value and/or the edge mask loss value may be set empirically. If the pixel mask loss value and the edge mask loss value are both within a preset range, the accuracy of the FCN model can reach an ideal value, and the FCN model is judged to meet the requirements.
Further, if one or both of the pixel mask loss value and the edge mask loss value are not within the corresponding preset loss value range, it is determined that the FCN model is not satisfactory.
When the FCN model does not meet the requirements, the FCN model continues to be trained until the FCN model meets the requirements.
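As a small illustrative sketch of this acceptance rule, the check might look like the following; the concrete threshold values are assumptions, since the patent only says the preset ranges can be set empirically.

```python
# Sketch of the acceptance check: keep the model only when both loss values fall
# within their preset ranges; otherwise continue training. Thresholds are assumed.
MAX_PIXEL_MASK_LOSS = 0.05  # assumed preset range: [0, 0.05]
MAX_EDGE_MASK_LOSS = 0.10   # assumed preset range: [0, 0.10]

def verification_passed(pixel_mask_loss: float, edge_mask_loss: float) -> bool:
    return pixel_mask_loss <= MAX_PIXEL_MASK_LOSS and edge_mask_loss <= MAX_EDGE_MASK_LOSS

# if verification_passed(p_loss, e_loss): save the initial FCN model as the FCN model
# else: continue training the initial FCN model
```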
Further, before the step of inputting the verification image into the FCN model and obtaining the second multidimensional semantic feature of the verification image, the method further includes:
and labeling the check image to obtain check pixel level labeling and check image level labeling of the check image. And specifically, marking the tampering mode of the verification image by using a marking tool. The step of labeling the verification image is consistent with the step of labeling the training image, and is not repeated here.
The image-level annotation has two major categories: normal image, denoted M0, and tampered image, denoted M1. The tampering mode of a tampered image is further annotated.
The image-level annotation of the verification image can be compared with the predicted tampering mode to further verify the accuracy of the FCN model.
The FCN model can detect multiple tampering modes. Specifically, referring to fig. 3, fig. 3 is a scene schematic diagram of an embodiment of the tampered image detection method of the present invention. The detectable tampering modes include the 7 basic modes of contrast enhancement, quantization, noise addition, morphology, blurring, resampling and compression, as well as the 24 fine modes under these 7 basic modes: automatic enhancement, histogram equalization, quantization, dithering, Gaussian noise, salt-and-pepper noise, Poisson noise, uniform noise, closing operation, opening operation, dilation operation, erosion operation, box blur, Gaussian blur, median blur, wavelet blur, area-based resampling, cubic resampling, Lanczos resampling (named after the Hungarian mathematician), bilinear resampling, nearest-neighbor resampling, JPEG compression, JPEG double compression and WEBP compression.
Step S102, carrying out pixel consistency detection on the image to be detected according to the first multi-dimensional semantic features;
and analyzing the first multi-dimensional semantic features to obtain pixels of the image to be detected, and detecting the pixel consistency.
And S103, if the to-be-detected image has an area with inconsistent pixels, judging that the to-be-detected image is a tampered image.
Further, if the to-be-detected image does not have the area with inconsistent pixels, the to-be-detected image is judged not to be a tampered image.
According to the scheme, the first multi-dimensional semantic features of the image to be detected are obtained based on the preset multi-dimensional semantic detection model; carrying out pixel consistency detection on the image to be detected according to the first multi-dimensional semantic features; if the image to be detected has the area with inconsistent pixels, the image to be detected is judged to be a tampered image, and therefore the accuracy of image tampering detection is improved.
As shown in fig. 4, a second embodiment of the present invention provides a method for detecting a tampered image, based on the first embodiment shown in fig. 2, wherein if there is an area with inconsistent pixels in the to-be-detected image, the step of determining that the to-be-detected image is the tampered image further includes:
step S104: respectively searching a corresponding tampered pixel position and a corresponding tampered edge position in the first multi-dimensional semantic feature according to a first pre-generated predicted pixel mask and a first pre-generated predicted edge mask;
further, before the step of searching for a corresponding tampered pixel position and a corresponding tampered edge position in the first multi-dimensional semantic feature according to a first pre-generated predicted pixel mask and a first pre-generated predicted edge mask, respectively, the method further includes:
generating the first predicted pixel mask according to a result of the pixel consistency detection;
specifically, whether an area with inconsistent pixels exists in the image to be detected is judged according to the first multi-dimensional semantic features; if the image to be detected has an area with inconsistent pixels, judging that the image to be detected is a tampered image, and generating a corresponding first prediction pixel mask;
specifically, the first prediction pixel mask is a pixel mask two-dimensional matrix with the same size as the image to be detected, each element position of the pixel mask two-dimensional matrix corresponds to a pixel position in the image to be detected, and in the pixel mask two-dimensional matrix, the matrix value of the tampered pixel is set to 1, and the matrix value of the untampered pixel is set to 0. That is, the first predicted pixel mask is a matrix consisting of 0 and 1. Analyzing the first multi-dimensional semantic features, and judging whether an area with inconsistent pixels exists in the image to be detected; if the image to be detected does not have the area with inconsistent pixels, the image to be detected is not a tampered image; if the image to be detected has an area with inconsistent pixels, the area with inconsistent pixels is found out according to the first multi-dimensional semantic features, and the first prediction pixel mask is generated.
And if the matrix values in the first prediction pixel mask are all 0, the image to be detected is not a tampered image. If the first prediction pixel mask has a matrix value with a numerical value of 1, the image to be detected is a tampered image, and whether an area with inconsistent edges exists in the image to be detected is continuously judged.
And carrying out edge consistency detection on the image to be detected according to the first multi-dimensional semantic features, and generating the first predicted edge mask according to the result of the edge consistency detection.
Specifically, according to the first multi-dimensional semantic features, regions with inconsistent edges in the image to be detected are extracted, and then corresponding first predicted edge masks are generated. In particular, the first predicted edge mask is similar to the first predicted pixel mask. The first predicted edge mask is an edge mask two-dimensional matrix with the same size as the image to be detected, each element position of the edge mask two-dimensional matrix corresponds to a pixel position in the image to be detected, in the edge mask two-dimensional matrix, the matrix value of a tampered pixel is set to be 1, and the matrix value of an untampered pixel is set to be 0. Therefore, after the first prediction pixel mask is obtained, the first multi-dimensional semantic features of the image to be detected are continuously analyzed, areas with inconsistent edges in the image to be detected are extracted, and a corresponding prediction edge mask is generated.
Therefore, the corresponding tampered pixel positions and tampered edge positions can be found in the first multi-dimensional semantic feature according to the pre-generated first predicted pixel mask and first predicted edge mask, respectively.
Step S105: predicting a tampering mode and/or a tampering area of the image to be detected by combining the feature sampling.
Specifically, the feature sampling includes global sampling and local sampling, and the step of predicting the tampering mode and/or tampering region of the image to be detected by combining the feature sampling includes:
acquiring integral first multi-dimensional semantic features of the image to be detected through the global sampling;
acquiring a first local multi-dimensional semantic feature of a region corresponding to the first prediction pixel mask and the first prediction edge mask through the local sampling;
combining the whole first multi-dimensional semantic features and the local first multi-dimensional semantic features to obtain combined features;
and analyzing the combined features to predict a tampering mode and a tampering area of the image to be detected.
Searching the position with the matrix value of 1 in the first prediction pixel mask in the first multi-dimensional semantic feature, and obtaining the corresponding tampered pixel position; and searching the position with the matrix value of 1 in the first prediction edge mask in the first multi-dimensional semantic feature, so as to obtain the corresponding tampering edge position.
The feature sampling comprises global sampling and local sampling: global sampling refers to sampling the multi-dimensional semantic features of the entire image, while local sampling refers to sampling the multi-dimensional semantic features of a target region. In this embodiment, the target region refers to the region corresponding to the first predicted pixel mask and the region corresponding to the first predicted edge mask.
The process of obtaining the combined features involves operations such as matrix extraction, matrix scaling and merging. The combined features are analyzed to predict the tampering mode and tampering area of the image to be detected. After the combined features are obtained, a classification algorithm is applied to determine the tampering type of each tampered region. The classification algorithm may be one or more of softmax, linear, MLP (Multi-Layer Perceptron) and convolution.
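A rough sketch of this combination-and-classification step is given below: global sampling is approximated by average pooling over the whole feature map, local sampling by mask-weighted pooling over the predicted tampered region, and the merged vector is classified with a small MLP and softmax over the 7 basic modes. The pooling choices, layer sizes, and the class name TamperClassifier are assumptions, not the patent's exact design.

```python
# Sketch: combine global and local feature sampling, then classify the tampering mode.
import torch
import torch.nn as nn

class TamperClassifier(nn.Module):
    def __init__(self, feat_dim=64, num_modes=7):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * feat_dim, 128), nn.ReLU(),
                                 nn.Linear(128, num_modes))

    def forward(self, features, pixel_mask):
        # features: (B, C, H, W) semantic features; pixel_mask: (B, 1, H, W) in {0, 1}
        global_feat = features.mean(dim=(2, 3))                       # global sampling
        mask = pixel_mask.float()
        local_feat = (features * mask).sum(dim=(2, 3)) / mask.sum(dim=(2, 3)).clamp(min=1.0)
        combined = torch.cat([global_feat, local_feat], dim=1)        # merged features
        return self.mlp(combined).softmax(dim=1)                      # tampering-mode scores
```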
Specifically, referring to fig. 5, fig. 5 is a flow chart of an embodiment of the tampered image detection method of the present invention. As shown in the figure, the tampered image detection process of the present invention is as follows: firstly, inputting the image to be detected, and outputting the first multi-dimensional semantic feature by the FCN model; then, consistency detection is carried out according to the first multi-dimensional semantic features to obtain the first prediction pixel mask and the first prediction edge mask; and predicting a tampering mode and a tampering area of the image to be detected by combining the global characteristic sampling and the local characteristic sampling.
Further, the feature sampling includes global sampling and local sampling, and the step of predicting the tampering mode and/or tampering region of the image to be detected by combining the feature sampling includes:
acquiring integral first multi-dimensional semantic features of the image to be detected through the global sampling;
acquiring a local pixel first multi-dimensional semantic feature of a region corresponding to the first prediction pixel mask through the local sampling, and predicting a tampering mode of the image to be detected according to the whole first multi-dimensional semantic feature and the local pixel first multi-dimensional semantic feature; or
And acquiring a local edge first multi-dimensional semantic feature of a region corresponding to the first prediction edge mask through the local sampling, and predicting a tampered region of the image to be detected according to the whole first multi-dimensional semantic feature and the local edge first multi-dimensional semantic feature.
In addition, the localization of the tampered region can be further corrected using a convolution + regression operation, making the localization more accurate.
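One way such a refinement could look, sketched under assumptions (the bounding-box parameterisation and layer sizes are not specified by the patent), is a small convolutional head that regresses offsets for the coarse tampered-region box.

```python
# Minimal sketch of the "convolution + regression" refinement mentioned above.
import torch
import torch.nn as nn

class RegionRefiner(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(feat_dim, 32, 3, padding=1), nn.ReLU(),
                                  nn.AdaptiveAvgPool2d(1))
        self.regress = nn.Linear(32, 4)  # assumed offsets for (x, y, w, h) of the region box

    def forward(self, local_features):
        # local_features: (B, C, h, w) semantic features cropped around the coarse region
        return self.regress(self.conv(local_features).flatten(1))
```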
According to this scheme, the first multi-dimensional semantic features of the image to be detected are acquired based on the preset multi-dimensional semantic detection model; pixel consistency detection is performed on the image to be detected according to the first multi-dimensional semantic features; if a region with inconsistent pixels exists in the image to be detected, the image to be detected is determined to be a tampered image; the corresponding tampered pixel positions and tampered edge positions are found in the first multi-dimensional semantic feature according to the pre-generated first predicted pixel mask and first predicted edge mask, respectively; and the tampering mode and/or tampering area of the image to be detected is predicted in combination with feature sampling. Thus, based on a fully convolutional network in deep learning, the accuracy of image tampering detection is improved and the problem of inaccurate detection and localization of the image tampering mode is solved.
In addition, this embodiment further provides a tampered image detection device. Referring to fig. 6, fig. 6 is a functional block diagram of the tampered image detection device according to the first embodiment of the present invention.
In this embodiment, the tampered image detection device is a virtual device stored in the memory 1005 of the tampered image detection equipment shown in fig. 1, and implements all functions of the tampered image detection program: acquiring a first multi-dimensional semantic feature of an image to be detected based on a preset multi-dimensional semantic detection model; performing pixel consistency detection on the image to be detected according to the first multi-dimensional semantic feature; and, if a region with inconsistent pixels exists in the image to be detected, determining that the image to be detected is a tampered image.
Specifically, the falsified image detection apparatus includes:
the acquisition module 10 is used for acquiring a first multi-dimensional semantic feature of an image to be detected based on a preset multi-dimensional semantic detection model;
the detection module 20 is configured to perform pixel consistency detection on the image to be detected according to the first multi-dimensional semantic features;
and the judging module 30 is configured to judge that the image to be detected is a tampered image if an area with inconsistent pixels exists in the image to be detected.
Further, the falsified image detection apparatus further includes:
the generating module is used for respectively searching a corresponding tampered pixel position and a corresponding tampered edge position in the first multi-dimensional semantic feature according to a first predicted pixel mask and a first predicted edge mask which are generated in advance;
and the prediction module is used for predicting the tampering mode and/or tampering area of the image to be detected by combining the feature sampling.
Further, the generating module further comprises:
a first generation unit configured to generate the first predicted pixel mask according to a result of the pixel consistency detection;
and the second generation unit is used for carrying out edge consistency detection on the image to be detected according to the first multi-dimensional semantic features and generating the first predicted edge mask according to the result of the edge consistency detection.
Further, the prediction module further comprises:
the first acquisition unit is used for acquiring the integral first multi-dimensional semantic features of the image to be detected through the global sampling;
a second obtaining unit, configured to obtain, through the local sampling, a local first multi-dimensional semantic feature of a region corresponding to the first prediction pixel mask and the first prediction edge mask;
a third obtaining unit, configured to combine the overall first multidimensional semantic feature and the local first multidimensional semantic feature to obtain a combined feature;
and the second prediction unit is used for analyzing the combined features and predicting a tampering mode and a tampering area of the image to be detected.
Further, the prediction module further comprises:
the fourth acquisition unit is used for acquiring the integral first multi-dimensional semantic features of the image to be detected through the global sampling;
the third prediction unit is used for acquiring a first multi-dimensional semantic feature of a local pixel in a region corresponding to the first prediction pixel mask through the local sampling, and predicting a tampering mode of the image to be detected according to the whole first multi-dimensional semantic feature and the first multi-dimensional semantic feature of the local pixel; or
And the fourth prediction unit is used for acquiring the first multi-dimensional semantic feature of the local edge of the region corresponding to the first prediction edge mask through the local sampling, and predicting the tampered region of the image to be detected according to the whole first multi-dimensional semantic feature and the first multi-dimensional semantic feature of the local edge.
Further, the obtaining module further comprises:
the training unit is used for inputting a training image into a pre-constructed FCN model for training to obtain an initial FCN model;
and the verification unit is used for inputting the verification picture into the initial FCN model for verification, and after the verification is passed, the verification picture is stored as the FCN model.
Further, the authentication unit includes:
the input subunit is used for inputting the check image into the initial FCN model to obtain a second multi-dimensional semantic feature of the check image;
an obtaining subunit, configured to obtain a corresponding second prediction mask from the second multidimensional semantic feature;
a calculating subunit, configured to calculate a loss value of the verification image according to the second prediction mask;
and the saving subunit is used for judging that the verification is passed if the loss value is within a preset range, and saving the corresponding initial FCN model as the FCN model.
In addition, an embodiment of the present invention further provides a computer storage medium, where a tampered image detection program is stored on the computer storage medium, and when the tampered image detection program is executed by a processor, the steps of the tampered image detection method are implemented, which are not described herein again.
Compared with the prior art, the invention provides a tampered image detection method, device, equipment and storage medium. A first multi-dimensional semantic feature of an image to be detected is acquired based on a preset multi-dimensional semantic detection model; pixel consistency detection is performed on the image to be detected according to the first multi-dimensional semantic feature; and, if a region with inconsistent pixels exists in the image to be detected, the image to be detected is determined to be a tampered image, thereby improving the accuracy of image tampering detection.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for causing a terminal device to execute the method according to the embodiments of the present invention.
The above description is only for the preferred embodiment of the present invention and is not intended to limit the scope of the present invention, and all equivalent structures or flow transformations made by the present specification and drawings, or applied directly or indirectly to other related arts, are included in the scope of the present invention.

Claims (10)

1. A method of tamper image detection, the method comprising:
acquiring a first multi-dimensional semantic feature of an image to be detected based on a preset multi-dimensional semantic detection model;
carrying out pixel consistency detection on the image to be detected according to the first multi-dimensional semantic features;
and if the to-be-detected image has an area with inconsistent pixels, judging that the to-be-detected image is a tampered image.
2. The method according to claim 1, wherein the step of determining that the image to be detected is a tampered image if there is an area with inconsistent pixels in the image to be detected further comprises:
respectively searching a corresponding tampered pixel position and a corresponding tampered edge position in the first multi-dimensional semantic feature according to a first pre-generated predicted pixel mask and a first pre-generated predicted edge mask;
and predicting a tampering mode and/or a tampering area of the image to be detected by combining the feature sampling.
3. The method according to claim 2, wherein the step of finding corresponding tampered pixel locations and tampered edge locations in the first multi-dimensional semantic feature according to the pre-generated first predicted pixel mask and first predicted edge mask further comprises:
generating the first predicted pixel mask according to a result of the pixel consistency detection;
and carrying out edge consistency detection on the image to be detected according to the first multi-dimensional semantic features, and generating the first predicted edge mask according to the result of the edge consistency detection.
4. The method according to claim 2, wherein the feature samples comprise global samples and local samples, and the step of predicting the falsification mode and/or falsification area of the image to be detected by combining the feature samples comprises:
acquiring integral first multi-dimensional semantic features of the image to be detected through the global sampling;
acquiring a first local multi-dimensional semantic feature of a region corresponding to the first prediction pixel mask and the first prediction edge mask through the local sampling;
combining the whole first multi-dimensional semantic features and the local first multi-dimensional semantic features to obtain combined features;
and analyzing the combined features to predict a tampering mode and a tampering area of the image to be detected.
5. The method according to claim 2, wherein the feature samples comprise global samples and local samples, and the step of predicting the falsification mode and/or falsification area of the image to be detected by combining the feature samples comprises:
acquiring integral first multi-dimensional semantic features of the image to be detected through the global sampling;
acquiring a local pixel first multi-dimensional semantic feature of a region corresponding to the first prediction pixel mask through the local sampling, and predicting a tampering mode of the image to be detected according to the whole first multi-dimensional semantic feature and the local pixel first multi-dimensional semantic feature; or
And acquiring a local edge first multi-dimensional semantic feature of a region corresponding to the first prediction edge mask through the local sampling, and predicting a tampered region of the image to be detected according to the whole first multi-dimensional semantic feature and the local edge first multi-dimensional semantic feature.
6. The method according to claim 1, wherein the first multidimensional semantic feature model comprises a Full Convolution Network (FCN) model, and the step of obtaining the first multidimensional semantic feature of the image to be detected based on the preset multidimensional semantic detection model further comprises:
inputting a training image into a pre-constructed FCN model for training to obtain an initial FCN model;
and inputting a verification image into the initial FCN model for verification, and saving the initial FCN model as the FCN model after the verification passes.
7. The method according to claim 6, wherein the step of inputting a verification image into the initial FCN model for verification and saving the initial FCN model as the FCN model after the verification passes comprises:
inputting a check image into the initial FCN model to obtain a second multi-dimensional semantic feature of the check image;
acquiring a corresponding second prediction mask from the second multi-dimensional semantic features;
calculating a loss value of the check image according to the second prediction mask;
and if the loss value is within a preset range, judging that the verification is passed, and storing the corresponding initial FCN model as the FCN model.
8. A falsified image detection apparatus characterized by comprising:
the acquisition module is used for acquiring a first multi-dimensional semantic feature of an image to be detected based on a preset multi-dimensional semantic detection model;
the detection module is used for carrying out pixel consistency detection on the image to be detected according to the first multi-dimensional semantic features;
and the judging module is used for judging that the image to be detected is a tampered image if the image to be detected has an area with inconsistent pixels.
9. A tampered image detection device, characterized in that it comprises a processor, a memory and a tampered image detection program stored in said memory, which when run by said processor implements the steps of the tampered image detection method according to any of claims 1-7.
10. A computer storage medium having a tamper image detection program stored thereon, the tamper image detection program when executed by a processor implementing the steps of the tamper image detection method according to any one of claims 1-7.
CN201911189124.8A 2019-11-25 2019-11-25 Tamper image detection method, device, equipment and storage medium Active CN110942456B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911189124.8A CN110942456B (en) 2019-11-25 2019-11-25 Tamper image detection method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911189124.8A CN110942456B (en) 2019-11-25 2019-11-25 Tamper image detection method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN110942456A true CN110942456A (en) 2020-03-31
CN110942456B CN110942456B (en) 2024-01-23

Family

ID=69908230

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911189124.8A Active CN110942456B (en) 2019-11-25 2019-11-25 Tamper image detection method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110942456B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112465768A (en) * 2020-11-25 2021-03-09 公安部物证鉴定中心 Blind detection method and system for splicing and tampering of digital images
CN112508039A (en) * 2020-12-08 2021-03-16 中国银联股份有限公司 Image detection method and device
CN112561907A (en) * 2020-12-24 2021-03-26 南开大学 Video tampering operation detection method and device based on double-current network
CN115346037A (en) * 2022-10-19 2022-11-15 北京航空航天大学 Image tampering detection method

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103426140A (en) * 2012-05-25 2013-12-04 索尼公司 Method and apparatus for image tamper-proofing verification, and image tamper-proofing method and system
CN107133951A (en) * 2017-05-22 2017-09-05 中国科学院自动化研究所 Distorted image detection method and device
CN107464230A (en) * 2017-08-23 2017-12-12 京东方科技集团股份有限公司 Image processing method and device
CN107657259A (en) * 2017-09-30 2018-02-02 平安科技(深圳)有限公司 Distorted image detection method, electronic installation and readable storage medium storing program for executing
WO2018050644A1 (en) * 2016-09-13 2018-03-22 Davantis Technologies, S.L. Method, computer system and program product for detecting video surveillance camera tampering
CN109376631A (en) * 2018-10-12 2019-02-22 中国人民公安大学 A kind of winding detection method and device neural network based
US20190156486A1 (en) * 2016-12-30 2019-05-23 Ping An Technology (Shenzhen) Co., Ltd. Method and system of detecting image tampering, electronic device and storage medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103426140A (en) * 2012-05-25 2013-12-04 索尼公司 Method and apparatus for image tamper-proofing verification, and image tamper-proofing method and system
WO2018050644A1 (en) * 2016-09-13 2018-03-22 Davantis Technologies, S.L. Method, computer system and program product for detecting video surveillance camera tampering
US20190156486A1 (en) * 2016-12-30 2019-05-23 Ping An Technology (Shenzhen) Co., Ltd. Method and system of detecting image tampering, electronic device and storage medium
CN107133951A (en) * 2017-05-22 2017-09-05 中国科学院自动化研究所 Distorted image detection method and device
CN107464230A (en) * 2017-08-23 2017-12-12 京东方科技集团股份有限公司 Image processing method and device
CN107657259A (en) * 2017-09-30 2018-02-02 平安科技(深圳)有限公司 Distorted image detection method, electronic installation and readable storage medium storing program for executing
CN109376631A (en) * 2018-10-12 2019-02-22 中国人民公安大学 A kind of winding detection method and device neural network based

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112465768A (en) * 2020-11-25 2021-03-09 公安部物证鉴定中心 Blind detection method and system for splicing and tampering of digital images
CN112508039A (en) * 2020-12-08 2021-03-16 中国银联股份有限公司 Image detection method and device
CN112508039B (en) * 2020-12-08 2024-04-02 中国银联股份有限公司 Image detection method and device
CN112561907A (en) * 2020-12-24 2021-03-26 南开大学 Video tampering operation detection method and device based on double-current network
CN115346037A (en) * 2022-10-19 2022-11-15 北京航空航天大学 Image tampering detection method
CN115346037B (en) * 2022-10-19 2023-02-03 北京航空航天大学 Image tampering detection method

Also Published As

Publication number Publication date
CN110942456B (en) 2024-01-23

Similar Documents

Publication Publication Date Title
CN112560999B (en) Target detection model training method and device, electronic equipment and storage medium
CN110942456B (en) Tamper image detection method, device, equipment and storage medium
CN111738244B (en) Image detection method, image detection device, computer equipment and storage medium
CN112381775B (en) Image tampering detection method, terminal device and storage medium
KR101896357B1 (en) Method, device and program for detecting an object
CN112560831B (en) Pedestrian attribute identification method based on multi-scale space correction
CN112329702B (en) Method and device for rapid face density prediction and face detection, electronic equipment and storage medium
GB2595558A (en) Exposure defects classification of images using a neural network
CN109063626B (en) Dynamic face recognition method and device
Singh et al. SiteForge: Detecting and localizing forged images on microblogging platforms using deep convolutional neural network
US20210295155A1 (en) Method and system for transfer learning based object detection
CN112434612A (en) Smoking detection method and device, electronic equipment and computer readable storage medium
CN114067431A (en) Image processing method, image processing device, computer equipment and storage medium
CN111882525A (en) Image reproduction detection method based on LBP watermark characteristics and fine-grained identification
Zhang et al. Multi-scale segmentation strategies in PRNU-based image tampering localization
Meena et al. Image splicing forgery detection using noise level estimation
Aldhaheri et al. MACC Net: Multi-task attention crowd counting network
Oraibi et al. Enhancement digital forensic approach for inter-frame video forgery detection using a deep learning technique
CN114005019A (en) Method for identifying copied image and related equipment thereof
Vijayalakshmi K et al. Copy-paste forgery detection using deep learning with error level analysis
CN112016592A (en) Domain adaptive semantic segmentation method and device based on cross domain category perception
Chen et al. Color image splicing localization algorithm by quaternion fully convolutional networks and superpixel-enhanced pairwise conditional random field
CN115457620A (en) User expression recognition method and device, computer equipment and storage medium
Yogameena et al. SpyGAN sketch: heterogeneous face matching in video for crime investigation
CN115018783A (en) Video watermark detection method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant