CN117496338A - Method, device and system for defending against website picture tampering and picture-horse upload

Info

Publication number
CN117496338A
CN117496338A (application CN202311535721.8A)
Authority
CN
China
Prior art keywords
picture
image
tampering
encoder
horse
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311535721.8A
Other languages
Chinese (zh)
Inventor
赵文波
李昱甫
朱磊
王立业
马立艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 30 Research Institute
Original Assignee
CETC 30 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 30 Research Institute filed Critical CETC 30 Research Institute
Priority to CN202311535721.8A
Publication of CN117496338A
Legal status: Pending

Classifications

    • G06V 20/95 — Pattern authentication; markers therefor; forgery detection
    • G06N 3/0455 — Auto-encoder networks; encoder-decoder networks
    • G06N 3/0464 — Convolutional networks [CNN, ConvNet]
    • G06N 3/088 — Non-supervised learning, e.g. competitive learning
    • G06V 10/454 — Integrating filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V 10/764 — Recognition using classification, e.g. of video objects
    • G06V 10/806 — Fusion of extracted features
    • G06V 10/82 — Recognition using neural networks
    • H04L 63/123 — Verification of received data contents, e.g. message integrity
    • H04L 63/145 — Countermeasures against malicious traffic propagating malware, e.g. viruses, trojans or worms
    • H04L 9/40 — Network security protocols
    • G06N 3/048 — Activation functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Virology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method, device and system for defending against website picture tampering and picture-horse upload, belonging to the intersection of computer vision and deep learning, and comprising the following steps: inputting a picture; preprocessing it; generating a hash value; extracting picture features with a convolutional neural network (CNN); further extracting features and reconstructing the picture with an autoencoder; fusing the outputs of the perceptual hash algorithm, the CNN and the autoencoder; and analyzing the fused features to judge whether the picture has been tampered with or contains a picture horse (an image file carrying hidden malicious code), before outputting the detection result. The invention provides an efficient and accurate scheme for defending against website picture tampering and picture-horse upload.

Description

Method, device and system for defending against website picture tampering and picture-horse upload
Technical Field
The invention relates to the intersection of computer vision and deep learning, and in particular to a method, device and system for defending against website picture tampering and picture-horse upload.
Background
Computer vision is the science of enabling machines to "see" and understand digital images or video. Deep learning is a branch of machine learning that automatically learns and extracts useful patterns and features from large amounts of data.
Existing methods for defending against website picture tampering and picture-horse upload, whether rule-based, machine-learning-based or image-signature-based, have common or individual shortcomings. Continued research and improvement by those skilled in the art are needed to address the evolving threats of image tampering and picture horses.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a method, device and system that defend against website picture tampering and picture-horse upload efficiently and accurately.
The invention achieves this aim through the following scheme:
A method for defending against website picture tampering and picture-horse upload, comprising the steps of:
S1: picture preprocessing: the input image is first opened, resized to a set size, and then converted into a grayscale image;
S2: perceptual hash computation: after preprocessing, a perceptual hash algorithm performs global-characteristic analysis on the picture and hashes its global characteristics (color, texture and shape) into a hash value; this hash value is then compared with the hash value of the original picture stored in a database, and if the similarity between the two is below a set threshold, the picture is considered possibly tampered with;
S3: CNN image classification: a CNN model classifies the preprocessed picture, first extracting image features through convolutional and pooling layers and then outputting a classification result through fully connected layers;
S4: autoencoder-based anomaly detection: an autoencoder model performs anomaly detection on the image, first compressing the input image into a low-dimensional feature vector through its encoding layers and then restoring the feature vector to an image through its decoding layers; during training the autoencoder learns how to reconstruct normal images, and in the test stage, if an input image cannot be correctly reconstructed by the autoencoder, the image is considered anomalous;
S5: fusion analysis: the outputs of the perceptual hash computation, the CNN image classification and the autoencoder anomaly detection are combined, and tamper detection and picture-horse detection judge whether the picture has been tampered with or contains a picture horse;
S6: the detection result of step S5 is output.
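The fusion step S5 can be illustrated by a minimal sketch. The weighting scheme, the error scale and the 0.5 threshold below are illustrative assumptions, not values fixed by the patent; real systems would tune them on validation data:

```python
# Hypothetical sketch of the S5 fusion step: combine the three detector
# outputs (hash similarity from S2, tamper probability from S3, and
# reconstruction error from S4) into a single verdict.
def fuse_and_judge(hash_similarity, cnn_tamper_prob, ae_recon_error,
                   ae_error_scale=1.0, threshold=0.5):
    """Return (score, verdict); verdict True means tampered / picture horse."""
    # Low hash similarity, high CNN probability, and high reconstruction
    # error all push the fused score toward "suspicious".
    ae_score = min(ae_recon_error / ae_error_scale, 1.0)  # clamp to [0, 1]
    score = (1.0 - hash_similarity) / 3 + cnn_tamper_prob / 3 + ae_score / 3
    return score, score > threshold
```

For example, `fuse_and_judge(0.95, 0.1, 0.05)` yields a low score and a clean verdict, while `fuse_and_judge(0.2, 0.9, 2.0)` flags the picture.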
Further, in step S1, the preprocessing also includes color-space conversion, size normalization and noise removal, to improve the effect of subsequent processing.
Further, in step S2, the global-characteristic analysis and hash comparison specifically comprise the following sub-steps: the preprocessed image is converted into a NumPy array and the discrete cosine transform (DCT) of the array is computed; of the DCT result, only the first 32 coefficients are kept and their average value is computed; finally the hash value is generated: for each retained DCT coefficient, the corresponding bit of the hash value is '1' if the coefficient exceeds the average, otherwise '0'.
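The sub-steps above can be sketched in plain Python. This is a minimal, dependency-free illustration: the naive O(n²) DCT stands in for an optimized library transform, and the 8×8 input size and 32-coefficient count follow the claim's numbers, not a fixed implementation:

```python
import math

def dct_1d(vec):
    """Naive DCT-II of a 1-D sequence (O(n^2), fine for small inputs)."""
    n = len(vec)
    return [sum(vec[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i in range(n)) for k in range(n)]

def dct_2d(img):
    """Separable 2-D DCT: transform rows, then columns."""
    rows = [dct_1d(r) for r in img]
    cols = [dct_1d(c) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def phash(gray, n_coeffs=32):
    """One hash bit per low-frequency DCT coefficient vs. their mean."""
    d = dct_2d(gray)
    flat = [d[y][x] for y in range(len(d)) for x in range(len(d[0]))]
    coeffs = flat[:n_coeffs]
    avg = sum(coeffs) / len(coeffs)
    return ''.join('1' if c > avg else '0' for c in coeffs)

def hamming(h1, h2):
    """Bit distance between two hash strings; small distance = similar."""
    return sum(a != b for a, b in zip(h1, h2))
```

Comparing `hamming(phash(a), phash(b))` against a threshold then implements the similarity test described in S2.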
Further, in step S4, the autoencoder-based anomaly detection comprises the sub-step of training the autoencoder to reconstruct the input picture; if the reconstruction error exceeds a set threshold, the picture is considered likely to contain a picture horse.
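The reconstruction-error test in this claim reduces to a simple comparison. In the sketch below, the reconstructed pixels would come from a trained autoencoder's encode/decode pass; the mean-squared-error metric and the 0.01 threshold are illustrative assumptions:

```python
def mse(original, reconstructed):
    """Mean squared error between two equal-length pixel sequences."""
    n = len(original)
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / n

def contains_picture_horse(original, reconstructed, threshold=0.01):
    """Flag the image when the autoencoder fails to reconstruct it well."""
    return mse(original, reconstructed) > threshold
```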
Further, in step S5, the tamper detection specifically comprises the following sub-steps:
tamper detection combining a perceptual hash algorithm and a convolutional neural network: the perceptual hash algorithm generates a hash value from the global features of the input image, so that differences between hash values reflect differences between images; the generated hash value is compared with the image's original hash value, and if the difference exceeds a set threshold the image is considered possibly tampered with;
tamper detection using a convolutional neural network: the image is fed into a pre-trained convolutional neural network, and whether it has been tampered with is evaluated by comparing the network's output with the expected output; furthermore, the output of the network's intermediate layers is used to locate the tampered region.
Further, in step S5, the picture-horse detection specifically comprises the following sub-steps:
a deep autoencoder learns to reconstruct its input data and thereby learns the data's implicit structure; during training, the deep autoencoder is trained on normal images to learn their distribution; during testing, the image under inspection is fed into the deep autoencoder, and if its reconstruction error exceeds a set threshold the image is considered to possibly contain a picture horse; in addition, the output of the autoencoder's intermediate layers is used to locate the picture horse.
Further, in step S6, outputting the detection result of step S5 specifically includes outputting the location and type of the tampering or picture horse and, where possible, the likely tamperer, presented in graphical form.
Further, outputting the location, type and likely tamperer in graphical form comprises: for the presence judgment, the model outputs a value between 0 and 1 representing the probability that the image has been tampered with or contains a picture horse; a threshold is set, and when the output value exceeds it the image is judged tampered with or to contain a picture horse; for the specific location, a heat map of the same size as the original image is output, in which each pixel value represents the probability that that position has been tampered with or contains a picture horse; alternatively, a report is output describing the likelihood of a picture horse together with possible source and threat information.
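The heat-map output described in this claim can be sketched as a thresholding pass over per-pixel probabilities. The 0.5 threshold and the report fields below are illustrative assumptions:

```python
def summarize(heatmap, threshold=0.5):
    """Turn a per-pixel probability heat map into a verdict and a report."""
    hits = [(y, x) for y, row in enumerate(heatmap)
            for x, p in enumerate(row) if p > threshold]
    peak = max(p for row in heatmap for p in row)
    return {
        "tampered": peak > threshold,   # overall 0-1 judgment vs. threshold
        "suspicious_pixels": hits,      # locations likely tampered / picture horse
        "peak_probability": peak,
    }
```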
A device for defending against website picture tampering and picture-horse upload, comprising a processor and a memory, the memory storing a computer program which, when loaded by the processor, performs any of the methods above.
A system for defending against website picture tampering and picture-horse upload, comprising a device as described above.
The beneficial effects of the invention include:
The invention combines a perceptual hash algorithm with the deep-learning methods of a convolutional neural network (CNN) and an autoencoder, providing an efficient and accurate technical scheme for defending against website picture tampering and picture-horse upload.
The scheme improves detection of complex and unknown types of image tampering and picture horses: by combining the perceptual hash algorithm with deep learning, a rule-based component (the perceptual hash) captures the picture's global features while deep-learning models (the CNN and autoencoder) learn its fine-grained features, improving detection of complex and unknown types of picture tampering and picture horses.
The scheme reduces dependence on the quality and quantity of training data: the autoencoder can be trained on unlabeled data, reducing the need for large amounts of labeled data. Moreover, the deep-learning models (the CNN and autoencoder) have strong feature-learning ability and can extract rich features from limited training data, further reducing this dependence.
The scheme improves detection of minor tampering: compared with traditional image-signature-based methods, the perceptual hash algorithm captures the picture's global features while the deep-learning models learn its fine-grained features, so the method better detects subtle picture tampering.
The scheme reduces storage and computing requirements: the representations used (the perceptual hash and the intermediate-layer outputs of the deep-learning models) are much smaller than the original picture, greatly reducing storage-space requirements. In addition, with an efficient perceptual hash algorithm and deep-learning models, the method achieves fast and accurate picture-tampering and picture-horse detection under limited computing resources.
Using the perceptual hash algorithm for preliminary feature extraction gives the method good robustness to minor changes, so it can effectively detect small modifications to a picture. Further feature learning by the convolutional neural network (CNN) then lets the method automatically learn and extract picture features without prior knowledge of the original picture's features. Finally, feature reconstruction by the autoencoder lets the method effectively detect tampering, protecting against both picture tampering and picture horses while increasing the system's ability to detect picture horses of unknown types.
The invention improves the efficiency and accuracy of defending against website picture tampering and picture-horse upload, and provides new ideas and tools for research and application in related fields.
Compared with other technical schemes, the scheme of the invention has the following main advantages:
1) Higher detection capability for complex and unknown types of picture tampering and picture horses: the method combines a perceptual hash algorithm with deep learning, using the perceptual hash to capture global picture features and the deep-learning model to learn fine-grained features. Compared with methods relying only on traditional image processing or on a single deep-learning model, it understands image content more comprehensively and detects tampering more accurately.
2) Reduced dependence on the quality and quantity of training data: the autoencoder can be trained on unlabeled data, reducing the need for large amounts of labeled data, so good performance is maintained even when data are hard to obtain or labeling is expensive. Meanwhile, the strong feature-learning ability of the deep-learning model extracts rich features from limited training data, further reducing this dependence.
3) Strong detection of minor tampering: the perceptual hash algorithm captures the picture's global features and the deep-learning model learns its fine-grained features, so the method effectively detects subtle picture tampering. This strategy of combining global and local information is more likely to capture subtle tampering behavior than traditional image-signature-based approaches.
4) Reduced storage and computing requirements: the representations used (the perceptual hash and the intermediate-layer outputs of the deep-learning models) are much smaller than the original picture, greatly reducing storage-space requirements. In addition, with an efficient perceptual hash algorithm and deep-learning models, the method achieves fast and accurate picture-tampering and picture-horse detection under limited computing resources.
5) Good adaptability and extensibility: combining a rule-based algorithm (perceptual hashing) with a data-driven algorithm (deep learning) captures both global and local picture features, so the method not only detects known types of picture tampering and picture horses effectively, but also handles complex and unknown types better.
Drawings
To illustrate the embodiments of the invention and the technical solutions of the prior art more clearly, the drawings used in their description are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention, and a person skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 is a flow chart of steps of a method according to an embodiment of the present invention.
Detailed Description
All of the features disclosed in the embodiments of this specification, and all of the steps in any method or process disclosed herein, may be combined, expanded and substituted in any way, except for mutually exclusive features and/or steps.
Starting from the situation described in the background, the inventive concept identified the following technical problems after in-depth analysis of the prior art:
1. Rule-based methods
A common current approach to defending against website picture tampering and picture horses is a rule-based detection system. This approach defines a series of rules, derived from known attack patterns, to detect possible picture tampering and picture horses. For example, a rule may check whether attributes of the image such as its color distribution, size and format meet expectations.
These rules are typically based on various attributes of the picture. If certain attributes of a picture do not match the predefined rules, the system considers the picture possibly tampered with or possibly containing a picture horse.
For example, one simple rule might check whether the picture's color distribution meets expectations; if it does not, the picture may have been tampered with. Another rule might check whether the picture contains unusual metadata, which may indicate hidden code, i.e. a picture horse.
This rule-based approach relies mainly on expert experience and knowledge. Setting up and maintaining rules is typically manual work, involving analysis of large amounts of historical data to find possible attack patterns which are then translated into rules, and the number and quality of the rules directly determine the method's detection performance. For known attack patterns, well-designed rules allow fast and accurate detection; for unknown or complex attack patterns, however, manual rules are hard to make comprehensive, and the method's effectiveness is greatly compromised.
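Two rules of the kind described above can be sketched as follows. The magic-byte table and the script-marker list are illustrative assumptions (real rule sets are far larger); the example checks that a file's leading bytes match its claimed format and scans for embedded script markers, a common picture-horse payload:

```python
# Known file-format signatures ("magic bytes") for a few image formats.
MAGIC = {"png": b"\x89PNG\r\n\x1a\n", "jpeg": b"\xff\xd8\xff", "gif": b"GIF8"}
# Byte patterns that often indicate hidden code inside an image file.
SCRIPT_MARKERS = (b"<?php", b"<script", b"eval(")

def rule_check(data, claimed_format):
    """Return the list of triggered rule names (empty means 'looks clean')."""
    violations = []
    magic = MAGIC.get(claimed_format)
    if magic is None or not data.startswith(magic):
        violations.append("format-mismatch")
    if any(m in data for m in SCRIPT_MARKERS):
        violations.append("embedded-script")
    return violations
```

As the surrounding text notes, such rules catch known patterns well but say nothing about payloads they were not written for.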
2. Machine-learning-based methods
Another approach is machine-learning-based defense. It trains a machine-learning model, such as a support vector machine (SVM) or a decision tree, on labeled tampered-picture and picture-horse data, so the model automatically learns and identifies their characteristics. The trained model is then applied to new pictures to detect whether they have been tampered with or contain a picture horse.
With the development of machine-learning technology, this kind of defense has become prevalent: it detects tampered pictures and picture horses automatically by training a model rather than by hand-written rules.
In this method, large amounts of tampered-picture and picture-horse data are first collected and labeled as training data. A machine-learning model such as an SVM or decision tree is then trained to learn and recognize the features of tampered pictures and picture horses. Once trained, the model can be applied to new pictures to detect tampering or picture horses.
Although the machine-learning-based approach partly solves the rule-based approach's inability to cope with new attack patterns, it still has significant drawbacks. First, it relies on large amounts of labeled data, which take substantial time and labor to collect and annotate. Second, due to the nature of machine-learning models, certain types of picture horses or more sophisticated tampering techniques may not be identified effectively. Furthermore, training the model requires substantial computing resources, which raises the barrier for resource-limited environments.
3. Image-signature-based methods
Besides rule-based and machine-learning methods, image-signature-based techniques are also widely used to defend against website picture tampering and picture-horse upload.
Image-signature technology converts an image into a specific set of values (a "signature") that represents the image's characteristics and can be used to compare and identify images. The conversion generally comprises two steps: feature extraction, which extracts information representing the image's characteristics such as color, texture and shape; and feature encoding, which converts the extracted feature information into a set of values.
In defending against website picture tampering and picture-horse upload, the image-signature workflow is typically as follows: first, compute and store the image signatures of the website's normal pictures; then, when a new picture is uploaded, compute its signature and compare it with the stored signature of the corresponding normal picture; if the difference exceeds a set threshold, the picture is considered possibly tampered with or a picture horse.
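The register-then-compare workflow above can be sketched as follows. The toy `signature` function (thresholding pixels against their mean) and the distance threshold are hypothetical stand-ins for real feature extraction and encoding:

```python
def signature(pixels, n_bits=16):
    """Toy signature: one bit per pixel, thresholded against the global mean."""
    avg = sum(pixels) / len(pixels)
    return ''.join('1' if p > avg else '0' for p in pixels[:n_bits])

class SignatureStore:
    """Stores signatures of known-good pictures and checks new uploads."""
    def __init__(self, max_distance=3):
        self.known = {}                # picture name -> stored signature
        self.max_distance = max_distance

    def register(self, name, pixels):
        self.known[name] = signature(pixels)

    def check_upload(self, name, pixels):
        """True if the upload's signature drifted past the threshold."""
        stored = self.known.get(name)
        if stored is None:
            return True                # unknown picture: treat as suspicious
        sig = signature(pixels)
        dist = sum(a != b for a, b in zip(sig, stored))
        return dist > self.max_distance
```

The limitation discussed next follows directly from this design: a modification that flips few signature bits stays under `max_distance` and goes undetected.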
Although image-signature techniques can effectively detect picture tampering and picture horses in certain scenarios, they have problems. First, small modifications to a picture, particularly high-quality tampering, may go undetected, since a small modification may not significantly alter the image signature. Second, storing large numbers of image signatures puts pressure on storage space. Finally, computing and comparing image signatures can require considerable computing resources, which is a challenge for resource-constrained environments.
In summary, the existing methods for defending website picture tampering and picture immediate transmission, whether based on rule or machine learning, and continuing image signature methods have some common or respective specific problems.
1) The problem with rule-based approaches is that: rule-based picture tampering and picture horse detection methods are widely used, but such methods can be significantly compromised when complex or unknown types of picture tampering and picture horses are identified. They rely mostly on predefined rules and human experience, which can be effectively addressed for common and known attack means, but are struggled against new or more complex tamper means. The formulation and optimization of rules require a great deal of manual participation, consume a great deal of time and manpower resources, and are difficult to adapt to the rapid development of image tampering technology.
2) The problem with the machine learning based approach is that: although the machine learning-based method can effectively improve the recognition capability of various types of picture tampering and picture horses through self learning, the performance of such a method depends largely on the quality and quantity of training data. If the training data is not comprehensive and rich enough to cover various situations which may occur in the practical application, the recognition effect of the model is greatly reduced. In addition, machine learning methods generally require a large amount of labeled data for training, and in a practical environment, it is not easy to obtain a large amount of labeled data with high quality, which causes great trouble to training of a model.
3) The problem with image signature based methods: image signature based methods can effectively detect picture tampering and picture horses in certain scenarios, but they may fail to detect small-scale, high-quality image modifications, as such minor modifications may not significantly alter the image signature. Furthermore, this approach requires the storage and management of a large number of image signatures, which puts pressure on storage space and computing resources. The process of computing and comparing image signatures requires a significant amount of computing resources, which can be a significant challenge for resource-limited environments.
Having identified the above technical problems, the invention discloses a technical scheme for defending website picture tampering and picture immediate transmission based on perceptual hashing and deep learning. The scheme comprehensively utilizes a perceptual hash algorithm, a convolutional neural network and a self-encoder, and aims to solve the technical problems identified in the prior art. Specifically, in a further concept, the present invention combines a perceptual hash algorithm with deep learning techniques. In particular, the present invention contemplates a scheme for protecting against website picture tampering and picture immediate transmission that combines Convolutional Neural Networks (CNNs) and self-encoders. By this method, the system can efficiently and accurately identify and prevent tampered pictures and picture horses from being uploaded.
Perceptual hashing is a common means of evaluating picture similarity when comparing pictures; it converts a picture into a relatively short, fixed-length binary string. Compared with traditional hash algorithms, it is more robust to small changes in the picture, such as scaling and cropping. It is therefore well suited to detecting picture tampering.
Deep learning is one of the most popular techniques in the field of artificial intelligence. In this method, deep learning is applied in two main ways: feature extraction of the picture using a Convolutional Neural Network (CNN), and feature reconstruction by a self-encoder to expose the difference between an original picture and a tampered picture. A Convolutional Neural Network (CNN) is a feed-forward neural network whose artificial neurons respond to stimuli within their local receptive fields; it is currently one of the best performing models for image and speech recognition, and automatically extracts picture features through convolution computations. The self-encoder is an unsupervised neural network model that obtains a compressed representation of the original data by encoding and decoding the input. The goal of the self-encoder is to make its output as close as possible to its input; therefore, when the input data has been tampered with, the output will differ from the original picture, allowing tampering of the picture to be detected.
In this method, initial feature extraction is first performed on the input picture through a perceptual hash algorithm; further feature learning is then performed through the CNN; the features are next input into a self-encoder for reconstruction; and finally, whether to prevent the picture from being uploaded is determined according to the difference between the reconstruction result and the original picture.
The method has the following advantages. The perceptual hash algorithm performs preliminary feature extraction, giving good robustness to minor changes in the picture and enabling effective detection of small modifications. Further feature learning through a Convolutional Neural Network (CNN) lets the method automatically learn and extract picture features without prior knowledge of the original picture's characteristics. Finally, feature reconstruction by the self-encoder effectively detects picture tampering, realizing protection against picture tampering and picture horses while increasing the system's ability to detect picture horses of unknown type. The implementation flow is approximately as follows, executed in order: 1. Input: website pictures; 2. Preprocessing: scaling, normalization, etc.; 3. Perceptual hash algorithm: generating a hash value; 4. Convolutional Neural Network (CNN): extracting picture features; 5. Self-encoder: further extracting features and reconstructing the picture; 6. Feature fusion: combining the outputs of the perceptual hash algorithm, the CNN and the self-encoder; 7. Tampering and picture horse detection: analyzing the fused features and judging whether the picture is tampered with or contains a picture horse; 8. Output: the detection results, including whether tampered, the type of tampering, the location of tampering, etc.
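The eight-step flow above can be sketched as a minimal Python skeleton. Every helper below is a simplified stand-in chosen purely for illustration (the average-hash, the identity reconstruction, and both thresholds are assumptions, not the patent's trained models):

```python
import numpy as np

# Skeleton of the eight-step flow; all helpers are illustrative stand-ins.
def preprocess(picture):                        # step 2: scale/normalize
    return np.asarray(picture, dtype=float).reshape(8, 8)

def perceptual_hash(img):                       # step 3: average-hash stand-in
    return (img.flatten() > img.mean()).astype(int)

def autoencoder_reconstruction(img):            # step 5: identity placeholder
    return img

def detect(picture, reference_hash):
    img = preprocess(picture)
    fused = {                                   # step 6: feature fusion
        'hash_distance': int(np.sum(perceptual_hash(img) != reference_hash)),
        'recon_error': float(np.mean((img - autoencoder_reconstruction(img)) ** 2)),
    }
    # step 7: judge tampering / picture horse from the fused features
    tampered = fused['hash_distance'] > 10 or fused['recon_error'] > 0.01
    return {'tampered': tampered, **fused}      # step 8: output
```

An unmodified picture matches its stored reference hash and reconstructs with zero error, so `detect` reports it as untampered.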
Furthermore, the method of the invention adopts a staged strategy, combining a perceptual hash algorithm and deep learning to realize image tampering and picture horse detection. It first preprocesses the input picture, and then detects whether the picture is tampered with or contains malicious code through the perceptual hash algorithm and the deep learning models. The method mainly comprises the following steps:
step 1, pretreatment: for an input picture, preprocessing is firstly performed, including color space conversion, size standardization, noise removal and the like, so as to improve the effect of subsequent processing.
Step 2, perceptual hash algorithm: after preprocessing, global characteristic analysis is performed on the picture using a perceptual hash algorithm. The algorithm obtains a unique hash value by hashing the global characteristics of the picture (e.g., color, texture, shape). This hash value is then compared with the hash value of the original picture stored in the database; if their similarity is below a set threshold, the picture is considered possibly tampered with.
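The hash comparison in this step can be sketched as bitwise similarity between two fixed-length binary strings; the 0.9 similarity threshold below is an illustrative assumption, not a value fixed by the patent:

```python
# Sketch of the step-2 hash comparison: similarity is the fraction of
# matching bits; a picture is flagged when similarity drops below threshold.
def hash_similarity(h1: str, h2: str) -> float:
    assert len(h1) == len(h2)
    return sum(a == b for a, b in zip(h1, h2)) / len(h1)

def possibly_tampered(h1: str, h2: str, threshold: float = 0.9) -> bool:
    return hash_similarity(h1, h2) < threshold

stored = '1100110011001100'
same   = '1100110011001101'   # one bit differs: similarity 15/16 = 0.9375
other  = '0011001100110011'   # every bit differs: similarity 0.0
print(possibly_tampered(stored, same))   # False
print(possibly_tampered(stored, other))  # True
```

Because the perceptual hash is robust to scaling and cropping, a high similarity suggests the picture is intact even after benign transformations.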
Step 3, deep learning model: if the perceptual hash algorithm detects possible picture tampering, further detailed analysis is performed using a deep learning model. Here, the inventive method employs two models: a Convolutional Neural Network (CNN) and a self-encoder.
Step 3.1, CNN: the CNN extracts and classifies the features of the picture. The CNN can learn local characteristics of the picture and is robust to local translation, rotation, scaling and other transformations, making it well suited to image tampering detection. Here, the CNN is trained to distinguish normal pictures from tampered pictures; if the output of the CNN indicates that the picture may be tampered with, the next stage of processing is performed.
Step 3.2, self-encoder: if the CNN detects possible picture tampering, it is further analyzed using the self-encoder. The self-encoder is a deep learning model of unsupervised learning, can learn advanced characteristics of pictures, and can perform anomaly detection. Here, the method uses the self-encoder to determine whether the picture contains a picture horse. In particular, the self-encoder is trained to reconstruct an input picture, which is considered likely to contain a picture horse if the error of the reconstruction exceeds a set threshold.
In a further specific embodiment, the method further comprises the following implementation steps:
s1: and (5) preprocessing the picture. In this step S1, the input image is first opened and adjusted to a size of 8x8, and then converted into a gray image. The purpose of this step is to reduce the complexity of the subsequent processing and the greyscale image contains the main information of the image, which is sufficient for the subsequent processing.
The pseudo code is as follows:
from PIL import Image

# Open the picture, scale it to 8x8 and convert it to grayscale.
# Image.LANCZOS is the resampling filter formerly named Image.ANTIALIAS.
image = Image.open('input.jpg')
image = image.resize((8, 8), Image.LANCZOS).convert('L')
In this section of pseudo code, the picture is resized to 8x8 and then converted to a grayscale image.
image: the input picture.
resize((8, 8), Image.LANCZOS): (8, 8) is the target size; Image.LANCZOS (formerly Image.ANTIALIAS) is the image resampling filter.
convert('L'): converts the image into grayscale mode.
S2: a perceptual hash is calculated. In this step S2, the preprocessed image is converted into a Numpy array, and then the discrete cosine transform (Discrete Cosine Transform, DCT) of the array is computed. In the result of the DCT, only the first 32 coefficients are taken and then the average of these coefficients is calculated. Finally, a hash value is generated: for each DCT coefficient, the corresponding bit of the hash value is '1' if its value is greater than the average value, otherwise '0'. The pseudo code is as follows:
import numpy as np
import scipy.fftpack

# Convert the picture into a NumPy array
pixels = np.array(image, dtype=np.float64)
# Compute the two-dimensional DCT (a 1-D DCT along each axis);
# note: NumPy has no np.fft.dct, so SciPy's DCT is used here
dct = scipy.fftpack.dct(scipy.fftpack.dct(pixels, axis=0, norm='ortho'), axis=1, norm='ortho')
# Take the first 32 coefficients (the low-frequency rows, flattened)
dct_low_freq = dct.flatten()[:32]
# Compute the average value, excluding the DC coefficient
mean_value = np.mean(dct_low_freq[1:])
# Generate the hash value
hash_value = ''.join('1' if c > mean_value else '0' for c in dct_low_freq[1:])
The pseudo code converts the picture into a NumPy array, then calculates its Discrete Cosine Transform (DCT), takes the first 32 coefficients, calculates the average of these coefficients, and generates a hash value.
pixels: the pixel-value matrix obtained by converting the image into a NumPy array.
dct: the result of the discrete cosine transform, of size 8x8.
dct_low_freq: the first 32 coefficients of the DCT result.
mean_value: the mean of dct_low_freq (excluding the DC coefficient).
hash_value: the generated hash value.
In this code, the two-dimensional Discrete Cosine Transform (DCT) can be represented by the following formula:

F(u, v) = (2/N) · C(u) · C(v) · Σ_{x=0..N−1} Σ_{y=0..N−1} f(x, y) · cos[(2x + 1)uπ / (2N)] · cos[(2y + 1)vπ / (2N)]

wherein:
f(x, y) is the pixel value of the input image I.
C(u) and C(v) are normalization factors: if u (respectively v) equals 0, the value is 1/√2; otherwise, 1.
N is the size of the image; for example, if the image is 8x8 in size, N = 8, as here.
F(u, v) is the coefficient at row u, column v obtained by the Discrete Cosine Transform (DCT).
π is the circumference ratio.
This two-dimensional formula can be broken down into a 1-D DCT along each row followed by a 1-D DCT along each column.
Then, the first 32 DCT coefficients are extracted in the code and their average value is calculated. Finally, a hash value is generated based on whether each coefficient is greater than the average.
S3: CNN image classification. In this step S3, the perceptual hash is classified using a CNN model. The CNN model firstly extracts the features of the image through a series of convolution layers and pooling layers, and then outputs classification results through a full connection layer. The pseudo code is as follows:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Build a small CNN for the 8x8 single-channel input
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(8, 8, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(10, activation='softmax'))

# Compile and train on labeled pictures
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=10, batch_size=32)
this code defines a convolutional neural network model and trains the model.
Conv2D(32, (3, 3), activation='relu', input_shape=(8, 8, 1)): a convolution layer; 32 is the number of convolution kernels, (3, 3) the size of each kernel, 'relu' the activation function, and (8, 8, 1) the input shape (width, height and number of channels of the grayscale image).
MaxPooling2D((2, 2)): a maximum pooling layer; (2, 2) is the size of the pooling window.
Flatten(): a flattening layer that turns the multi-dimensional input into a vector, used for the transition from the convolutional layers to the fully connected layer.
Dense(10, activation='softmax'): a fully connected layer; 10 is the number of neurons and 'softmax' the activation function.
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy']): compiles the model; 'adam' is the optimizer, 'categorical_crossentropy' the loss function, and ['accuracy'] the evaluation metric.
In this section of pseudo code, the operation of each layer can be represented by the following formulas:

Convolution layer:

(I ∗ W)[i, j] = Σ_m Σ_n I[m, n] · W[i − m, j − n] + b

where b is the bias term. This operation is performed on each local area of the image, outputting a feature map.
(I ∗ W)[i, j]: the value at row i, column j of the convolution of kernel W with input image I.
I[m, n]: the pixel value at row m, column n of the input image.
W[i − m, j − n]: the value at row (i − m), column (j − n) of the convolution kernel.
b: the bias term.
This formula can be broken down into the linear part S[i, j] = Σ_m Σ_n I[m, n] · W[i − m, j − n] followed by adding the bias: (I ∗ W)[i, j] = S[i, j] + b.
Maximum pooling layer:

MaxPool(I)[i, j] = max_{0 ≤ m, n < p} I[i·p + m, j·p + n]

where (i, j) is a position on the output feature map and p is the size of the pooling window.
MaxPool(I)[i, j]: the result of max pooling, i.e. the maximum value over the pooling region at row i, column j of the input image I.
p: the size of the pooling kernel.
Fully connected layer:

f(x) = activation(W·x + b)

where activation is an activation function, such as softmax, ReLU or sigmoid; f(x) is the output of the fully connected layer; W is the weight matrix of the fully connected layer; x is the input vector of the fully connected layer; and b is the bias vector.
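As a sanity check on the max-pooling formula above, a direct NumPy implementation can be sketched; this is an illustrative helper, not the patent's code:

```python
import numpy as np

# Direct implementation of MaxPool(I)[i, j] = max over 0 <= m, n < p
# of I[i*p + m, j*p + n], with non-overlapping p x p windows.
def max_pool(feature_map: np.ndarray, p: int) -> np.ndarray:
    h, w = feature_map.shape
    out = np.empty((h // p, w // p))
    for i in range(h // p):
        for j in range(w // p):
            # Maximum over the p x p window whose top-left corner is (i*p, j*p)
            out[i, j] = feature_map[i * p:(i + 1) * p, j * p:(j + 1) * p].max()
    return out

fm = np.array([[1, 2, 5, 6],
               [3, 4, 7, 8],
               [9, 1, 2, 3],
               [1, 1, 4, 0]], dtype=float)
print(max_pool(fm, 2))  # [[4. 8.] [9. 4.]]
```

Each 2x2 block of the 4x4 feature map collapses to its maximum, halving both spatial dimensions, exactly as the formula specifies.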
S4: based on anomaly detection from the encoder. In this step S4, anomaly detection is performed on the image using a self-encoder model. The self-encoder model first compresses the input image into a low-dimensional feature vector through the encoding layer, and then restores the feature vector to the original image through the decoding layer. During training, the self-encoder model learns how to reconstruct a normal image, so that during the test phase, if an input image cannot be properly reconstructed from the self-encoder model, this image is considered to be abnormal.
from keras.layers import Input, Dense
from keras.models import Model

# 64-dimensional input: the flattened 8x8 grayscale picture
input_img = Input(shape=(64,))
encoded = Dense(32, activation='relu')(input_img)      # encoding layer
decoded = Dense(64, activation='sigmoid')(encoded)     # decoding layer

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
# Train on normal images only, so the model learns to reconstruct them
autoencoder.fit(normal_images, normal_images, epochs=50, batch_size=256)
This code defines a self-encoder model and trains the model.
Dense(32, activation='relu')(input_img): the encoding layer; 32 is the number of neurons, 'relu' the activation function, and input_img the input picture.
Dense(64, activation='sigmoid')(encoded): the decoding layer; 64 is the number of neurons, 'sigmoid' the activation function, and encoded the output of the encoding layer.
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy'): compiles the model; 'adadelta' is the optimizer and 'binary_crossentropy' the loss function.
In this code, the encoder and decoder computations of the self-encoder can be represented by the following formulas:

Encoder:

h = activation(W_e·x + b_e)

h: the output of the encoder, i.e. a low-dimensional representation of the input data.
W_e: the weight matrix of the encoder.
b_e: the bias vector of the encoder.

Decoder:

x' = activation(W_d·h + b_d)

x': the output of the decoder, i.e. the reconstructed input data.
W_d: the weight matrix of the decoder.
b_d: the bias vector of the decoder.

Wherein activation is an activation function, such as ReLU or sigmoid.
S5: tampering and picture horse detection. In this step S5, tamper detection and picture horse detection are included. The specific descriptions are as follows:
a. Tamper detection
The object of tamper detection is to detect whether an image is tampered with and locate the tampered area. The combination method of the perceptual hash algorithm and the convolutional neural network is adopted for tamper detection.
Perceptual hash algorithm: first, a hash value is generated on an input image using a perceptual hash algorithm. The perceptual hash algorithm generates a shorter hash value by extracting global features (e.g., brightness, color, texture, etc.) of the image, and the difference in hash values may reflect the difference in the image. The generated hash value is then compared to the original hash value of the image (if any), and if the difference between the hash values exceeds a certain threshold, the image is considered likely to be tampered with.
Convolutional neural network: however, perceptual hashing algorithms may be insensitive to some subtle tampering, so convolutional neural networks are further used for tamper detection. The images are input into a pre-trained convolutional neural network, and by comparing the output of the network with the expected output (e.g., class labels of the images, etc.), it can be assessed whether the images were tampered with. In addition, the tampered region can be located by using the output of the middle layer of the network.
b. Picture horse detection
Picture horse is a type of malware that propagates by hiding in normal images to evade security detection. The object of the picture horse detection is to detect whether the image contains a picture horse or not and to locate the position of the picture horse. Here a depth self-encoder is used for picture horse detection.
A depth self-encoder is an unsupervised deep learning model that learns the implicit structure of the data by learning to reconstruct the input data. During training, the self-encoder is trained using a large number of normal images, allowing it to learn the distribution of the normal images. At the time of testing, an image to be detected is input into the self-encoder, and if the self-encoder cannot reconstruct the image well (i.e., the reconstruction error is greater than a certain threshold), then the image may be considered to contain a picture horse. In addition, the output from the intermediate layer of the encoder can also be used to locate the position of the picture horse.
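The reconstruction-error test described here can be sketched as follows. The mean-squared-error measure and the 0.01 threshold are illustrative assumptions, and the "reconstructed" arrays below stand in for the output of a trained self-encoder:

```python
import numpy as np

# Sketch of reconstruction-error based picture horse detection.
def reconstruction_error(x: np.ndarray, x_reconstructed: np.ndarray) -> float:
    # Mean squared error between the input and its reconstruction
    return float(np.mean((x - x_reconstructed) ** 2))

def contains_picture_horse(x, x_reconstructed, threshold: float = 0.01) -> bool:
    # Flag the image when the self-encoder fails to reconstruct it well
    return reconstruction_error(x, x_reconstructed) > threshold

x = np.zeros(64)
good = x + 0.001   # reconstruction close to the input (normal image)
bad = x + 0.5      # reconstruction far from the input (possible picture horse)
print(contains_picture_horse(x, good))  # False
print(contains_picture_horse(x, bad))   # True
```

Because the self-encoder was trained only on normal images, a large reconstruction error signals an input outside the learned distribution, which is treated as a possible picture horse.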
S6: and outputting a result. In this step S6, the output result of the method of the present invention is mainly a prediction result concerning image tampering and picture horse detection. This prediction is typically presented in two levels of information:
judging the existence of image tampering or picture horse: first, the method outputs a determination as to whether the image is tampered with or contains a picture horse. This is a binary decision, typically expressed as "yes" or "no", or represented by the numbers 0 and 1.
Specific location of image tampering or picture horse: if it is detected that the image is tampered with or contains a picture horse, the method also outputs more specific information, such as the location, type, and possibly a tamperer of the tampered with or picture horse. Such information may be presented in the form of a graphic, such as highlighting or a box line on the original image to mark the location of the tamper or picture horse.
For the deep learning model, both levels of output information are computed. For example, for the presence determination of tampering or a picture horse, the model may output a value between 0 and 1, indicating the probability that the model considers the image tampered with or containing a picture horse. A threshold (e.g., 0.5) may then be set; when the output value exceeds this threshold, the image is determined to be tampered with or to contain a picture horse. For the specific location of the tampering or picture horse, the model may output a heat map of the same size as the original image, each pixel value representing the probability that the location was tampered with or contains a picture horse. A detailed report may also be output describing the likelihood of a picture horse, as well as possible sources and threats.
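The two-level output described above can be sketched as follows; the score values, the 0.5 threshold and the heat-map handling are illustrative assumptions rather than the patent's fixed choices:

```python
import numpy as np

# Sketch of the two-level detection output: a binary verdict plus,
# when flagged, the location with the highest tamper probability.
def format_detection_result(score: float, heatmap: np.ndarray, threshold: float = 0.5):
    tampered = score > threshold                 # level 1: yes/no decision
    result = {'tampered': tampered, 'probability': score}
    if tampered:
        # level 2: pixel with the highest tamper / picture horse probability
        i, j = np.unravel_index(np.argmax(heatmap), heatmap.shape)
        result['location'] = (int(i), int(j))
    return result

heat = np.zeros((8, 8))
heat[2, 5] = 0.9
print(format_detection_result(0.8, heat))  # flagged, location (2, 5)
print(format_detection_result(0.2, heat))  # not flagged, no location
```

In a full system the heat map would be rendered over the original image (highlighting or a box line), while the dictionary above corresponds to the report form of the output.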
It should be noted that, within the scope of protection defined by the claims of the present invention, the following embodiments may be combined, expanded and/or substituted in any logically consistent manner on the basis of the specific embodiments above, such as the disclosed technical principles and the explicitly or implicitly disclosed technical features.
Example 1
As shown in fig. 1, a method for defending website picture tampering and picture immediate transmission comprises the following steps:
s1: preprocessing a picture: the input image is first opened and adjusted to a set size, and then converted into a gray image;
s2: calculating a perceptual hash: after preprocessing, carrying out global characteristic analysis on the picture by using a perception hash algorithm, and obtaining a hash value by carrying out hash on the global characteristic of the picture, wherein the global characteristic comprises color, texture and shape; then, comparing the hash value with the hash value of the original picture stored in the database, and if the similarity of the hash value and the hash value is lower than a set threshold value, considering that the picture is possibly tampered;
s3: CNN image classification: classifying the perceived hash by using a CNN model, wherein the CNN model firstly extracts the characteristics of the image through a convolution layer and a pooling layer, and then outputs a classification result through a full connection layer;
S4: anomaly detection based on self-encoder: performing anomaly detection on an image by using a self-encoder model, wherein the self-encoder model firstly compresses the input image into a low-dimensional feature vector through an encoding layer, and then restores the feature vector into an original image through a decoding layer; during the training process, the self-encoder model learns how to reconstruct a normal image, and in the test stage, if an input image cannot be correctly reconstructed by the self-encoder model, the image is considered to be abnormal;
s5: combining the calculated perceptual hash, CNN image classification and abnormal detection output of the self-encoder, performing fusion analysis, and judging whether the picture is tampered or contains a picture horse through tampering detection and picture horse detection;
s6: and outputting the detection result of the step S5.
Example 2
On the basis of embodiment 1, in step S1, the preprocessing further includes: color space conversion, size normalization, and noise removal for improving the effect of subsequent processing.
Example 3
Based on embodiment 1, in step S2, the global characteristic analysis is performed on the picture by using a perceptual hash algorithm, and a hash value is obtained by hashing the global characteristic of the picture, where the global characteristic includes color, texture and shape; then, comparing the hash value with the hash value of the original picture stored in the database, and if the similarity of the hash value and the hash value is lower than a set threshold value, considering that the picture is possibly tampered, specifically comprising the following substeps:
Converting the preprocessed image into a Numpy array, and then calculating Discrete Cosine Transform (DCT) values of the array; in the DCT result, taking only the first 32 coefficients, and then calculating the average value of the coefficients; finally, a hash value is generated: for each DCT coefficient, the corresponding bit of the hash value is '1' if its value is greater than the average value, otherwise '0'.
Example 4
On the basis of embodiment 1, in step S4, the self-encoder based anomaly detection further includes the sub-steps of: the self-encoder is trained to reconstruct the input picture, which is considered likely to contain a picture horse if the error of the reconstruction exceeds a set threshold.
Example 5
On the basis of embodiment 1, in step S5, the tamper detection specifically includes the following sub-steps:
and tamper detection is carried out by adopting a combination method of a perceptual hash algorithm and a convolutional neural network: generating a hash value on the input image by using a perceptual hash algorithm, wherein the perceptual hash algorithm generates the hash value by extracting global features of the image, and the difference of the hash values reflects the difference of the image; then, comparing the generated hash value with the original hash value of the image, and if the difference between the hash values exceeds a certain threshold value, considering that the image is possibly tampered;
Tamper detection is carried out by utilizing a convolutional neural network, an image is input into the pre-trained convolutional neural network, and whether the image is tampered or not is evaluated by comparing the output of the network with the expected output; furthermore, the tampered region is located using the output of the middle layer of the network.
Example 6
On the basis of embodiment 1, in step S5, the picture horse detection specifically includes the following sub-steps:
learning, by means of a depth self-encoder, the reconstructed input data to learn an implicit structure of the data; during training, a normal image training depth self-encoder is used for learning the distribution of the normal image; during testing, inputting an image to be detected into a depth self-encoder, and if the reconstruction error of the depth self-encoder is larger than a set threshold value, considering that the image possibly contains a picture horse; and, the position of the picture horse is located using the output from the intermediate layer of the encoder.
Example 7
On the basis of embodiment 1, in step S6, the output of the detection result of step S5 specifically includes outputting the location, type and possible tamperer of the tampered or imaged horse, and presenting it in a graphic form.
Example 8
On the basis of embodiment 7, the output tampering or location, type of picture horse and possible tampering person are presented in the form of a graph, comprising: for tampering or presence judgment of a picture horse, the model may output a value between 0 and 1, indicating the probability that the model considers that the image is tampered or contains the picture horse; setting a threshold value, and judging that the image is tampered or contains a picture horse when the output value exceeds the threshold value; for a specific position of the tampered or picture horse, outputting a heat map with the same size as the original image, wherein each pixel value represents the probability that the position is tampered or contains the picture horse; or output a report describing the likelihood of the picture horse, as well as possible source and threat information.
Example 9
An apparatus for protecting against website picture tampering and picture immediate transmission comprising a processor and a memory, the memory having stored therein a computer program which when loaded by the processor performs the method of any of embodiments 1-8.
Example 10
A system for protecting against website picture tampering and picture immediate transmission comprising the apparatus of embodiment 9.
The units involved in the embodiments of the present invention may be implemented by software, or may be implemented by hardware, and the described units may also be provided in a processor. Wherein the names of the units do not constitute a limitation of the units themselves in some cases.
According to an aspect of embodiments of the present invention, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions are read from the computer-readable storage medium by a processor of a computer device, and executed by the processor, cause the computer device to perform the methods provided in the various alternative implementations described above.
As another aspect, the embodiment of the present invention also provides a computer-readable medium that may be contained in the electronic device described in the above embodiment; or may exist alone without being incorporated into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the methods described in the above embodiments.

Claims (10)

1. A method for defending website picture tampering and picture immediate transmission, comprising the steps of:
s1: preprocessing a picture: the input image is first opened and adjusted to a set size, and then converted into a gray image;
s2: calculating a perceptual hash: after preprocessing, carrying out global characteristic analysis on the picture by using a perception hash algorithm, and obtaining a hash value by carrying out hash on the global characteristic of the picture, wherein the global characteristic comprises color, texture and shape; then, comparing the hash value with the hash value of the original picture stored in the database, and if the similarity of the hash value and the hash value is lower than a set threshold value, considering that the picture is possibly tampered;
S3: CNN image classification: classifying the perceived hash by using a CNN model, wherein the CNN model firstly extracts the characteristics of the image through a convolution layer and a pooling layer, and then outputs a classification result through a full connection layer;
s4: anomaly detection based on self-encoder: performing anomaly detection on an image by using a self-encoder model, wherein the self-encoder model firstly compresses the input image into a low-dimensional feature vector through an encoding layer, and then restores the feature vector into an original image through a decoding layer; during the training process, the self-encoder model learns how to reconstruct a normal image, and in the test stage, if an input image cannot be correctly reconstructed by the self-encoder model, the image is considered to be abnormal;
s5: combining the calculated perceptual hash, CNN image classification and abnormal detection output of the self-encoder, performing fusion analysis, and judging whether the picture is tampered or contains a picture horse through tampering detection and picture horse detection;
s6: and outputting the detection result of the step S5.
2. The method for protecting against web site picture tampering and picture immediate transmission according to claim 1, wherein in step S1, the preprocessing further comprises: color space conversion, size normalization, and noise removal for improving the effect of subsequent processing.
3. The method for preventing website picture tampering and immediate transmission according to claim 1, wherein in step S2, the global characteristic analysis is performed on the picture by using a perceptual hash algorithm, and a hash value is obtained by hashing the global characteristic of the picture, wherein the global characteristic includes color, texture and shape; then, comparing the hash value with the hash value of the original picture stored in the database, and if the similarity of the hash value and the hash value is lower than a set threshold value, considering that the picture is possibly tampered, specifically comprising the following substeps:
converting the preprocessed image into a Numpy array, and then calculating Discrete Cosine Transform (DCT) values of the array; in the DCT result, taking only the first 32 coefficients, and then calculating the average value of the coefficients; finally, a hash value is generated: for each DCT coefficient, the corresponding bit of the hash value is '1' if its value is greater than the average value, otherwise '0'.
4. The method for defending against website picture tampering and picture-horse upload according to claim 1, wherein in step S4 the autoencoder-based anomaly detection further comprises the following sub-step: the autoencoder is trained to reconstruct the input picture, and if the reconstruction error exceeds a set threshold, the picture is considered likely to contain a picture horse.
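An illustrative toy version of this sub-step. A real system would use a trained neural autoencoder; here the "encoder" is average-pooling and the "decoder" repetition, so smooth "normal" signals reconstruct well while spiky "anomalous" ones do not. The threshold is an assumed value:

```python
def encode(x, k=4):
    # Toy encoder: average-pool groups of k values into a low-dimensional code.
    return [sum(x[i:i + k]) / k for i in range(0, len(x), k)]

def decode(code, k=4):
    # Toy decoder: upsample the code by repetition.
    return [c for c in code for _ in range(k)]

def reconstruction_error(x, k=4):
    # Mean-squared error between the input and its reconstruction.
    rec = decode(encode(x, k), k)
    return sum((a - b) ** 2 for a, b in zip(x, rec)) / len(x)

def likely_contains_picture_horse(x, threshold=1.0, k=4):
    # Flag the picture if the reconstruction error exceeds the set threshold.
    return reconstruction_error(x, k) > threshold
```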
5. The method for defending against website picture tampering and picture-horse upload according to claim 1, wherein in step S5 the tamper detection specifically comprises the following sub-steps:
performing tamper detection with a combination of a perceptual hash algorithm and a convolutional neural network: a hash value is generated from the input image using the perceptual hash algorithm, which derives the hash from the image's global features, so that differences between hash values reflect differences between images; the generated hash value is then compared with the image's original hash value, and if the difference between the two exceeds a certain threshold, the image is considered possibly tampered with;
performing tamper detection with a convolutional neural network: the image is input into a pre-trained convolutional neural network, and whether the image has been tampered with is evaluated by comparing the network's output with the expected output; furthermore, the tampered region is located using the output of an intermediate layer of the network.
6. The method for defending against website picture tampering and picture-horse upload according to claim 1, wherein in step S5 the picture-horse detection specifically comprises the following sub-steps:
a deep autoencoder learns to reconstruct the input data and thereby learns the implicit structure of the data; during training, the deep autoencoder is trained on normal images to learn their distribution; during testing, the image to be inspected is input into the deep autoencoder, and if the reconstruction error of the deep autoencoder is greater than a set threshold, the image is considered likely to contain a picture horse; in addition, the position of the picture horse is located using the output of an intermediate layer of the autoencoder.
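The localization idea can be sketched with a per-pixel reconstruction-error map. Note the claim localizes via an intermediate-layer output; the error map below is a simplified stand-in for illustration:

```python
def error_map(original, reconstructed):
    # Per-pixel squared error between the input image and the autoencoder's
    # reconstruction; large values mark regions the autoencoder failed to
    # reconstruct, i.e. candidate picture-horse locations.
    return [[(o - r) ** 2 for o, r in zip(orow, rrow)]
            for orow, rrow in zip(original, reconstructed)]

def localize(err, threshold):
    # Coordinates of all pixels whose error exceeds the threshold.
    return [(i, j) for i, row in enumerate(err)
            for j, v in enumerate(row) if v > threshold]
```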
7. The method for defending against website picture tampering and picture-horse upload according to claim 1, wherein in step S6 the detection result of step S5 is output, specifically including outputting the location and type of the tampering or picture horse and the possible tamperer, and presenting them in graphical form.
8. The method for defending against website picture tampering and picture-horse upload according to claim 7, wherein said outputting of the location, type, and possible tamperer of the tampering or picture horse, presented in graphical form, specifically comprises: for the judgment of whether tampering or a picture horse is present, the model outputs a value between 0 and 1 indicating the probability that the model considers the image tampered with or containing a picture horse; a threshold is set, and when the output value exceeds the threshold, the image is judged to have been tampered with or to contain a picture horse; for the specific position of the tampering or picture horse, a heat map of the same size as the original image is output, in which each pixel value represents the probability that that position has been tampered with or contains a picture horse; alternatively, a report is output describing the likelihood of a picture horse together with possible source and threat information.
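The output logic of this claim can be sketched as follows; the 0.5 threshold and the report fields are illustrative assumptions:

```python
def verdict(prob, threshold=0.5):
    # Model output in [0, 1]; above the threshold the image is judged
    # tampered with / to contain a picture horse.
    return prob > threshold

def binarize_heatmap(heatmap, threshold=0.5):
    # Heat map of the same size as the original image; each pixel value is
    # the probability that that position is tampered with or contains a horse.
    return [[1 if p > threshold else 0 for p in row] for row in heatmap]

def report(prob, sources=None):
    # Simple textual report of picture-horse likelihood and possible sources.
    lines = [f"picture-horse probability: {prob:.2f}",
             f"judged malicious: {verdict(prob)}"]
    if sources:
        lines.append("possible sources: " + ", ".join(sources))
    return "\n".join(lines)
```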
9. An apparatus for defending against website picture tampering and picture-horse upload, comprising a processor and a memory, the memory having stored therein a computer program which, when loaded by the processor, performs the method of any one of claims 1-8.
10. A system for defending against website picture tampering and picture-horse upload, comprising the apparatus of claim 9.
CN202311535721.8A 2023-11-16 2023-11-16 Method, equipment and system for defending against website picture tampering and picture-horse upload Pending CN117496338A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311535721.8A CN117496338A (en) 2023-11-16 2023-11-16 Method, equipment and system for defending website picture tampering and picture immediate transmission


Publications (1)

Publication Number Publication Date
CN117496338A true CN117496338A (en) 2024-02-02

Family

ID=89672467

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311535721.8A Pending CN117496338A (en) 2023-11-16 2023-11-16 Method, equipment and system for defending website picture tampering and picture immediate transmission

Country Status (1)

Country Link
CN (1) CN117496338A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118097386A (en) * 2024-04-24 2024-05-28 四川众力佳华信息技术有限公司 Ammeter photo credibility verification method



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination