WO2022237397A1 - Image authenticity detection method and apparatus, computer device, and storage medium

Info

Publication number
WO2022237397A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
sample
noise
detected
frequency components
Prior art date
Application number
PCT/CN2022/085430
Other languages
English (en)
French (fr)
Inventor
韩周
张壮
董志强
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Priority to EP22806367.3A (published as EP4300417A1)
Priority to US17/979,883 (published as US20230056564A1)
Publication of WO2022237397A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/771 Feature selection, e.g. selecting representative features from a multi-dimensional feature space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/42 Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • G06V10/431 Frequency domain transformation; Autocorrelation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/018 Certifying business or products
    • G06Q30/0185 Product, service or business identity fraud
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/40 Spoof detection, e.g. liveness detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Definitions

  • The present application relates to the fields of computer technology and image processing technology, and in particular to an image authenticity detection method and apparatus, a computer device, and a storage medium.
  • Features related to the image content are generally extracted to detect the authenticity of an image.
  • However, content-related features are easily affected by changes in image background or illumination, occlusion of image content, facial expression movements, and the like, so it is difficult to make such a detection method robust, and the detection results are inaccurate.
  • The inherent pattern noise is the inherent noise introduced by the camera sensor and is not disturbed by the image content.
  • An image authenticity detection apparatus, comprising:
  • an image acquisition module configured to acquire an image to be detected;
  • a high-frequency information acquisition module configured to remove low-frequency information from the image to be detected to obtain first image information;
  • a noise reduction module configured to perform noise reduction processing on the first image information to obtain noise-reduced second image information;
  • a pattern noise determination module configured to obtain an inherent pattern noise feature map corresponding to the image to be detected according to the difference between the first image information and the second image information; and
  • an authenticity identification module configured to analyze the distribution of the inherent pattern noise in the inherent pattern noise feature map and to identify the authenticity of the image to be detected according to the distribution, so as to obtain an authenticity detection result for the image to be detected; the inherent pattern noise is the inherent noise introduced by the camera sensor and is not disturbed by the image content.
  • A computer device comprising a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the steps of the image authenticity detection method described in the embodiments of the present application.
  • One or more computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the image authenticity detection method described in the embodiments of the present application.
  • A computer program product comprising computer-readable instructions stored in a computer-readable storage medium; one or more processors of a computer device read the computer-readable instructions from the computer-readable storage medium and execute them, so that the computer device performs the steps of the image authenticity detection method described in the embodiments of the present application.
  • Fig. 1 is a diagram of the application environment of the image authenticity detection method in one embodiment.
  • Fig. 2 is a schematic flowchart of the image authenticity detection method in one embodiment.
  • Fig. 3 is a schematic diagram of how pattern noise is introduced during camera shooting in one embodiment.
  • Fig. 4 is a schematic diagram of the overall process of extracting the inherent pattern noise feature map in one embodiment.
  • Fig. 5 is a schematic diagram of the network architecture of a convolutional neural network in one embodiment.
  • Fig. 6 is a schematic diagram of the overall process of identifying the authenticity of an image to be detected in one embodiment.
  • Fig. 7 is a schematic diagram of the overall process of the model training step in one embodiment.
  • Fig. 8 is a structural block diagram of an image authenticity detection apparatus in one embodiment.
  • Fig. 9 is a structural block diagram of an image authenticity detection apparatus in another embodiment.
  • Fig. 10 is a diagram of the internal structure of a computer device in one embodiment.
  • the image authenticity detection method provided in this application can be applied to the application environment shown in FIG. 1 .
  • the terminal 102 communicates with the server 104 through the network.
  • the terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers and portable wearable devices.
  • The server 104 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms.
  • the user can use the terminal 102 to input the image to be detected, and the terminal 102 can send the image to be detected to the server 104 .
  • the server 104 may execute the image authenticity detection method in each embodiment of the present application to obtain the authenticity detection result of the image to be detected.
  • the server 104 may determine whether to further process the image to be detected according to the authenticity detection result, and return the processing result to the terminal 102 .
  • For example, when the image to be detected is a face image: if the authenticity detection result indicates that the image to be detected is a real image, the server 104 can perform face recognition on the image and return the recognition result to the terminal 102; if the authenticity detection result indicates that the image to be detected is a forged image, the server 104 may skip face recognition and return to the terminal 102 the result that the image to be detected is a forged image.
  • the image authenticity detection method in each embodiment of the present application may be implemented by using a machine learning method in the field of artificial intelligence.
  • For example, the step of analyzing the distribution of the inherent pattern noise in the inherent pattern noise feature map and identifying the authenticity of the image to be detected according to the distribution, to obtain the authenticity detection result of the image to be detected, can be implemented using a machine learning method.
  • The image authenticity detection method in the embodiments of the present application also involves computer vision technology. For example, the steps of removing the low-frequency information from the image to be detected to obtain the first image information, performing noise reduction processing on the first image information to obtain the noise-reduced second image information, and obtaining the inherent pattern noise feature map according to the difference between the first image information and the second image information can be implemented using computer vision technology.
  • the image authenticity detection method in each embodiment of the present application may also involve blockchain technology.
  • the server or terminal that executes the image authenticity detection method in each embodiment of the present application may be a node in the blockchain system.
  • an image authenticity detection method is provided.
  • The image authenticity detection method can be executed by a computer device, which may include a server and a terminal. It can be understood that the image authenticity detection method can be executed by the server or the terminal alone, or jointly by the terminal and the server.
  • the application of the method to the server in FIG. 1 is used as an example for illustration, including the following steps:
  • Step 202 acquiring an image to be detected.
  • the image to be detected is an image whose authenticity needs to be detected.
  • the image to be detected may be a human face image, ie, an image containing a human face.
  • the image to be detected can also be an image containing other content, such as: a person image, an object image or a landscape image, etc. Any image that needs to be detected for authenticity can be used as the image to be detected without limitation.
  • the image to be detected may be an image extracted from a video to be detected.
  • The server can sample the video to be detected to obtain multiple video frames, use these video frames as multiple images to be detected, and execute the method in the embodiments of the present application on each image to be detected, thereby obtaining an authenticity detection result for each image to be detected.
  • the server may extract a frame of image from the video to be detected as the image to be detected, and execute the method in each embodiment of the present application on the image to be detected, so as to obtain the authenticity detection result of the image to be detected.
  • The server may first crop the image to be detected to a preset size, and then perform step 204 and the subsequent steps on the cropped image to be detected.
  • The server may perform center cropping on the image to be detected according to the preset size.
  • Center cropping crops the image to be detected so that the center of the image to be detected is also the center of the cropped image.
  • That is, the server may take the center of the image to be detected as the center of the crop and cut out of the image to be detected a region that matches the preset size.
  • For example, if the preset size is [224,224], the server can cut at positions 112 pixels from the center of the image to be detected in each of the four directions (up, down, left, and right); the resulting image, which is centered on the center of the image to be detected and has size [224,224], is the cropped image to be detected.
  • For a multi-channel image, the preset size has one additional dimension for the number of channels. For example, if the image to be detected is a three-channel image, the preset size can be [224,224,3].
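The center-cropping step described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `center_crop`, the nested-list image representation, and the choice to raise an error for images smaller than the preset size are all assumptions made for the example.

```python
def center_crop(image, size=224):
    """Crop a size x size window around the center of a row-major image.

    `image` is a list of rows; each row is a list of pixels. A pixel may be a
    scalar or a per-channel list, so a 3-channel image stays 3-channel.
    """
    height, width = len(image), len(image[0])
    if height < size or width < size:
        # Assumption: the source does not say how undersized images are handled.
        raise ValueError("image is smaller than the preset crop size")
    top = (height - size) // 2    # rows discarded above the crop window
    left = (width - size) // 2    # columns discarded left of the crop window
    return [row[left:left + size] for row in image[top:top + size]]


# Usage: a synthetic 300 x 400 single-channel image.
image = [[r * 400 + c for c in range(400)] for r in range(300)]
cropped = center_crop(image, 224)
```

With a 224-pixel preset size, the window extends 112 pixels from the center in each direction, which matches the example in the text.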
  • Step 204 removing low-frequency information in the image to be detected to obtain first image information.
  • the low-frequency information is an image signal in a low-frequency segment in the image.
  • the first image information is image information that does not contain low-frequency information obtained by removing low-frequency information from the image to be detected.
  • the server may decompose and remove low-frequency information in the image to be detected by performing domain transformation on the image to be detected to obtain the first image information.
  • the domain transform may be any one of wavelet transform, Fourier transform and the like.
  • the server may filter the image to be detected by using a high-pass filter to remove low-frequency information in the image to be detected to obtain the first image information.
  • the server may also use other methods to remove low-frequency information in the image to be detected to obtain the first image information, which is not limited.
  • Step 206 performing noise reduction processing on the first image information to obtain second image information after noise reduction.
  • the noise reduction processing is processing for reducing noise in an image.
  • the second image information is noise-reduced image information obtained by performing noise reduction processing on the first image information.
  • the server may perform noise reduction filtering on the image corresponding to the first image information in the spatial domain to obtain the second image information after noise reduction. In another embodiment, the server may perform noise reduction filtering on the first image information in the transform domain to obtain the second image information in the transform domain.
  • the transform domain may be any one of wavelet domain, frequency domain and the like.
  • the server may perform noise reduction filtering on the first image information through a Wiener filter.
  • the server may also use other filters to perform noise reduction filtering on the first image information, such as an average filter or a median filter.
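As an illustration of the noise reduction step, the sketch below implements a simple locally adaptive Wiener filter in the spatial domain using only the Python standard library. The source names the Wiener filter but gives no parameters, so the window radius, the border handling (windows truncated at the image edge), and the default noise estimate (the mean of the local variances) are assumptions chosen for the example.

```python
def local_stats(img, radius):
    """Per-pixel local mean and variance over a (2*radius+1)^2 window.

    Windows are truncated at the image border (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    means = [[0.0] * w for _ in range(h)]
    variances = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [img[a][b]
                    for a in range(max(0, i - radius), min(h, i + radius + 1))
                    for b in range(max(0, j - radius), min(w, j + radius + 1))]
            m = sum(vals) / len(vals)
            means[i][j] = m
            variances[i][j] = sum((v - m) ** 2 for v in vals) / len(vals)
    return means, variances


def wiener_denoise(img, radius=1, noise_var=None):
    """Shrink each pixel toward its local mean where local variance is low."""
    h, w = len(img), len(img[0])
    means, variances = local_stats(img, radius)
    if noise_var is None:
        # Assumed default: estimate the noise power as the mean local variance.
        noise_var = sum(map(sum, variances)) / (h * w)
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            v = variances[i][j]
            denom = max(v, noise_var)
            gain = max(v - noise_var, 0.0) / denom if denom > 0 else 0.0
            out[i][j] = means[i][j] + gain * (img[i][j] - means[i][j])
    return out


# Usage: denoise a tiny image containing one outlier pixel.
noisy = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
denoised = wiener_denoise(noisy, radius=1)
```

Where the local variance is at or below the noise estimate the output is just the local mean; where it is much larger the pixel is left nearly unchanged, which is the behavior that lets the later subtraction isolate a noise residual.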
  • Step 208 obtaining, according to the difference between the first image information and the second image information, an inherent pattern noise feature map corresponding to the image to be detected.
  • The inherent pattern noise feature map is an image containing the inherent pattern noise of the image to be detected.
  • Inherent pattern noise is the noise in an image caused by inherent defects introduced during the manufacture of the camera sensor.
  • The main component of the inherent pattern noise is the Photo Response Non-Uniformity (PRNU) feature.
  • The output of the camera sensor is commonly modeled as $y_{ij} = f_{ij}(x_{ij} + \eta_{ij}) + c_{ij} + \epsilon_{ij}$, where:
  • $y_{ij}$ is the image output by the camera sensor
  • $x_{ij}$ is the incident light received by the camera sensor
  • $f_{ij}$ is the photoresponse non-uniformity multiplicative noise factor
  • $\eta_{ij}$ is the shot noise
  • $c_{ij}$ is the dark current noise
  • $\epsilon_{ij}$ is additional random noise.
  • the photoresponse non-uniformity feature (that is, the PRNU feature) is implicit in the image captured by the camera in the form of the photoresponse non-uniformity multiplicative noise factor in the above formula.
  • pattern noise 304 (i.e., inherent pattern noise)
  • The photoresponse non-uniformity feature, the main component of the pattern noise, is a weak signal that is concentrated mainly in high-frequency details and is not affected by temperature or humidity; it is an inherent property of the sensor and differs from device to device.
  • The server may subtract the second image information from the first image information and obtain the inherent pattern noise feature map corresponding to the image to be detected from the subtraction result.
  • The inherent pattern noise feature map obtained in this way is a residual: it contains not only the inherent pattern noise but also some other noise, such as dark current noise.
  • The image authenticity detection method in the embodiments of the present application does not need to accurately extract the photoresponse non-uniformity feature from the image; it only needs to extract the inherent pattern noise residual, which improves computational efficiency.
  • For a multi-channel image, the server can obtain a single-channel inherent pattern noise feature map for each channel and then combine the single-channel maps to obtain the multi-channel inherent pattern noise feature map corresponding to the image to be detected.
  • The server may first perform image enhancement processing on the inherent pattern noise feature map, and then perform step 210.
  • The image enhancement processing may include at least one of zero-mean filtering, Fourier peak suppression, and the like. Zero-mean filtering can remove linear patterns from the inherent pattern noise feature map and prevent bright lines from appearing in it. Fourier peak suppression suppresses regions with large peaks in the frequency domain of the inherent pattern noise feature map, thereby removing blocking artifacts from the feature map and avoiding overly prominent edges.
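The source does not specify which zero-mean filtering variant is used. The sketch below shows one common form, assumed here for illustration: subtract each column's mean, then each row's mean, so that every row and column of the residual averages to zero. The function name `zero_mean` and the nested-list representation are hypothetical.

```python
def zero_mean(residual):
    """Remove row-wise and column-wise offsets from a noise residual."""
    h, w = len(residual), len(residual[0])
    # Subtract the mean of every column first.
    col_means = [sum(residual[i][j] for i in range(h)) / h for j in range(w)]
    centered = [[residual[i][j] - col_means[j] for j in range(w)]
                for i in range(h)]
    # Then subtract the mean of every row of the column-centered residual.
    row_means = [sum(row) / w for row in centered]
    return [[centered[i][j] - row_means[i] for j in range(w)]
            for i in range(h)]


# Usage: the middle row is a constant "bright line" offset of +10.
striped = [[1, 2, 3], [11, 12, 13], [1, 2, 3]]
cleaned = zero_mean(striped)
```

In this example the bright row and the per-column ramp are both linear patterns, so the cleaned residual is zero everywhere up to floating-point rounding.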
  • Step 210 analyzing the distribution of the inherent pattern noise in the inherent pattern noise feature map, and identifying the authenticity of the image to be detected according to the distribution, to obtain an authenticity detection result for the image to be detected; the inherent pattern noise is the inherent noise that is introduced by the camera sensor and is not disturbed by the image content.
  • The server can use a machine learning method to analyze the distribution of the inherent pattern noise in the inherent pattern noise feature map and identify the authenticity of the image to be detected according to the distribution, obtaining the authenticity detection result of the image to be detected.
  • The server may also analyze the distribution of the inherent pattern noise in the inherent pattern noise feature map in other ways. For example, the server can compare the inherent pattern noise feature map with preset inherent pattern noise feature map templates corresponding to different authenticity detection results, and take the authenticity detection result corresponding to the matching template as the authenticity detection result of the image to be detected.
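The template comparison mentioned above could look like the following sketch. The source only says the feature map is compared with preset templates; using normalized cross-correlation as the similarity measure, and the names `ncc` and `match_template`, are assumptions introduced for this example.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation between two equally sized 2-D maps."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    mean_a, mean_b = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in fa)
                    * sum((y - mean_b) ** 2 for y in fb))
    return num / den if den else 0.0


def match_template(noise_map, templates):
    """Return the label of the template most correlated with the noise map."""
    return max(templates, key=lambda label: ncc(noise_map, templates[label]))


# Usage with two hypothetical 2 x 2 templates.
templates = {
    "real": [[1, -1], [-1, 1]],
    "forged": [[1, 1], [-1, -1]],
}
label = match_template([[0.9, -1.1], [-0.8, 1.0]], templates)
```

Here the example map correlates positively with the "real" template and negatively with the "forged" one, so the "real" label is selected.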
  • The authenticity detection result may include at least that the image to be detected is a real image or that the image to be detected is a forged image.
  • The authenticity detection result may also include at least that the image to be detected is a real image or that the image to be detected is a forged image of a particular forgery type.
  • the type of forgery may include at least one of a tampering type, a synthesis type, and an artificial intelligence generation type. That is, the forged image under the forgery type may include at least one of a forged image of a tampering type, a forged image of a synthesis type, a forged image of an artificial intelligence generation type, and the like.
  • a forged image of a tampering type refers to a forged image obtained by tampering with a real image.
  • A forged image of the synthesis type is a forged image synthesized by a computer.
  • A forged image of the artificial intelligence generation type is a forged image generated by an artificial intelligence machine learning model, for example by Generative Adversarial Nets (GANs).
  • When a single image to be detected is extracted from the video to be detected, the server can directly take the authenticity detection result of that image as the authenticity detection result of the video to be detected.
  • When multiple images to be detected are extracted from the video to be detected, the server can determine the authenticity detection result of the video to be detected from the authenticity detection results of the individual images.
  • The server may determine the authenticity detection result of the video to be detected according to the proportion of each authenticity detection result among the images to be detected. For example, the server may select the authenticity detection result with the largest proportion among the results of the images to be detected as the authenticity detection result of the video to be detected.
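Selecting the result with the largest proportion amounts to a majority vote over the per-frame results, which can be sketched as below. The function name `video_result` and the string labels are hypothetical; the source does not say how ties are resolved, so this sketch simply keeps the result that was seen first.

```python
from collections import Counter


def video_result(frame_results):
    """Return the most frequent per-frame result as the video-level result."""
    if not frame_results:
        raise ValueError("at least one frame result is required")
    # most_common(1) returns the label with the highest count; among equal
    # counts, the label encountered first is returned.
    return Counter(frame_results).most_common(1)[0][0]


# Usage: three sampled frames, two of which were judged forged.
result = video_result(["real", "forged", "forged"])
```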
  • The above image authenticity detection method removes the low-frequency information from the image to be detected to obtain the first image information, performs noise reduction processing on the first image information to obtain the noise-reduced second image information, and obtains the inherent pattern noise feature map corresponding to the image to be detected according to the difference between the first image information and the second image information. Because the inherent pattern noise in the feature map is the inherent noise introduced by the camera sensor, it is not disturbed by the image content. The distribution of the inherent pattern noise in a real image differs from that in a forged image, and this difference is not obscured by the image content. Analyzing the distribution of the inherent pattern noise in the inherent pattern noise feature map therefore yields an accurate authenticity detection result for the image to be detected, improving the accuracy of detection.
  • Removing the low-frequency information in the image to be detected to obtain the first image information includes: performing domain transformation on the image to be detected in the spatial domain to obtain low-frequency information and high-frequency information in the transform domain; and removing the low-frequency information and obtaining the first image information in the transform domain from the retained high-frequency information.
  • The first image information may include high-frequency information of the image to be detected in the frequency domain.
  • The server can perform a Fourier transform on the image to be detected in the spatial domain, transforming it from the spatial domain to the frequency domain, to obtain the low-frequency information and high-frequency information of the image to be detected in the frequency domain; the server can then remove the low-frequency information to obtain the high-frequency information in the frequency domain.
  • The first image information may include wavelet coefficients of the high-frequency components of the image to be detected in the wavelet domain.
  • The server can perform a wavelet transform on the image to be detected in the spatial domain, transforming it from the spatial domain to the wavelet domain, to obtain the low-frequency component and high-frequency components of the image to be detected in the wavelet domain; the server can then set the wavelet coefficients of the low-frequency component to zero to obtain the wavelet coefficients of the high-frequency components in the wavelet domain.
  • The server may also perform other domain transformations on the image to be detected in the spatial domain; any transformation that can separate out the low-frequency components may be used, without limitation.
  • In this way, the server performs domain transformation on the image to be detected in the spatial domain to obtain the low-frequency information and high-frequency information in the transform domain, removes the low-frequency information, and obtains the first image information in the transform domain from the retained high-frequency information, so that the low-frequency information in the image to be detected is removed accurately and first image information containing the high-frequency information is obtained. Because the inherent pattern noise is concentrated mainly in the high-frequency information, accurately determining the first image information containing the high-frequency information paves the way for accurately determining the inherent pattern noise feature map later.
  • The first image information includes wavelet coefficients of the high-frequency components of the image to be detected in each direction in the wavelet domain.
  • Removing the low-frequency information in the image to be detected to obtain the first image information includes: performing wavelet transformation on the image to be detected in the spatial domain to decompose it into a low-frequency component and high-frequency components in multiple directions in the wavelet domain; and setting the wavelet coefficients of the low-frequency component to zero to obtain the wavelet coefficients of the high-frequency components in each direction.
  • The server can perform wavelet transformation on the image to be detected in the spatial domain, decompose it into a low-frequency component and high-frequency components in multiple directions in the wavelet domain, and then set the wavelet coefficients of the low-frequency component to zero, leaving the wavelet coefficients of the high-frequency components in each direction, which are not set to zero.
  • The high-frequency components in multiple directions may include at least one of horizontal, vertical, and diagonal high-frequency components.
  • The server can perform a multi-scale wavelet transform on the image to be detected in the spatial domain and decompose it into a low-frequency component and high-frequency components at each scale, where the high-frequency components at the same scale include high-frequency components in multiple directions.
  • The server may then set the wavelet coefficients of the low-frequency component to zero to obtain the wavelet coefficients, which have not been set to zero, of the high-frequency components at each scale and in each direction.
  • The number of levels of the multi-scale wavelet transform can be set according to the actual situation; for example, a four-level wavelet transform can be used.
  • The wavelet basis used by the wavelet transform can be db4 (Daubechies 4); other wavelet functions can also be selected as the wavelet basis, without limitation.
  • In this way, the server decomposes the image to be detected into a low-frequency component and high-frequency components in multiple directions through the wavelet transform, and then sets the wavelet coefficients of the low-frequency component to zero to obtain accurate wavelet coefficients of the high-frequency components in each direction. Because the inherent pattern noise is concentrated mainly in the high-frequency information, accurately determining the wavelet coefficients of the high-frequency components paves the way for accurately determining the inherent pattern noise feature map later.
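The decomposition and low-frequency zeroing can be illustrated with a single-level Haar transform written in plain Python. This is a deliberately simplified stand-in: the text suggests a db4 basis and a multi-scale (for example four-level) transform, for which a wavelet library such as PyWavelets would normally be used. The band names and function names below are assumptions made for the example.

```python
def haar_dwt2(img):
    """Single-level 2-D Haar transform of an image with even height and width.

    Returns a dict with one low-frequency band and three directional
    high-frequency bands, each half the size of the input.
    """
    h, w = len(img) // 2, len(img[0]) // 2
    bands = {name: [[0.0] * w for _ in range(h)]
             for name in ("low", "horizontal", "vertical", "diagonal")}
    for i in range(h):
        for j in range(w):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            bands["low"][i][j] = (a + b + c + d) / 2
            bands["horizontal"][i][j] = (a + b - c - d) / 2  # row differences
            bands["vertical"][i][j] = (a - b + c - d) / 2    # column differences
            bands["diagonal"][i][j] = (a - b - c + d) / 2
    return bands


def remove_low_frequency(bands):
    """Return a copy of the bands with the low-frequency coefficients zeroed."""
    zeroed = dict(bands)
    zeroed["low"] = [[0.0] * len(row) for row in bands["low"]]
    return zeroed


# Usage on a 2 x 2 image: one coefficient per band.
coeffs = haar_dwt2([[1, 2], [3, 4]])
first_image_info = remove_low_frequency(coeffs)
```

After zeroing, only the horizontal, vertical, and diagonal coefficients carry information, which corresponds to the first image information described in the text.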
  • the second image information includes high-frequency noise reduction wavelet coefficients corresponding to high-frequency components of the image to be detected in each direction in the wavelet domain.
  • performing noise reduction processing on the first image information to obtain the second image information after noise reduction includes: performing noise reduction processing on the wavelet coefficients of the high frequency components in each direction to obtain the corresponding High frequency denoising wavelet coefficients.
  • the intrinsic mode noise feature map corresponding to the image to be detected is obtained, including: according to the wavelet coefficient and the high frequency component corresponding to the same direction
  • the difference between the denoising wavelet coefficients is used to obtain the high-frequency noise wavelet coefficients corresponding to the high-frequency components in each direction; the inverse Wavelet transform to obtain the intrinsic pattern noise feature map corresponding to the image to be detected.
  • the high-frequency denoising wavelet coefficients are wavelet coefficients of non-noise information in the high-frequency components obtained after denoising the wavelet coefficients of the high-frequency components.
  • the high frequency noise wavelet coefficient is the wavelet coefficient of the noise information in the high frequency component.
  • the server may perform denoising filtering on wavelet coefficients of high-frequency components in each direction by using a Wiener filter to obtain high-frequency denoising wavelet coefficients corresponding to high-frequency components in each direction.
  • the server may also use other filters to perform noise reduction filtering on the wavelet coefficients of the high-frequency components in each direction, such as an average filter or a median filter.
  • The server may subtract the high-frequency denoising wavelet coefficients from the wavelet coefficients of the high-frequency components in the same direction to obtain the high-frequency noise wavelet coefficients corresponding to the high-frequency components in each direction.
  • The image to be detected is decomposed into low-frequency components and high-frequency components at each scale, and the high-frequency components at the same scale include high-frequency components in multiple directions.
  • the server can perform denoising processing on the wavelet coefficients of the high-frequency components in each direction at each scale, and obtain the high-frequency denoising wavelet coefficients corresponding to the high-frequency components at each scale and in each direction. Then, the server can obtain the high-frequency noise wavelet coefficients corresponding to the high-frequency components in each direction at each scale according to the difference between the wavelet coefficients corresponding to the high-frequency components in the same direction at the same scale and the high-frequency noise reduction wavelet coefficients.
  • the server can perform an inverse wavelet transform according to the wavelet coefficients of the low-frequency components set to zero and the high-frequency noise wavelet coefficients corresponding to the high-frequency components in each direction at each scale, to obtain the intrinsic pattern noise feature map corresponding to the image to be detected.
  • The server obtains the high-frequency noise wavelet coefficients corresponding to the high-frequency components in each direction from the difference between the wavelet coefficients of the high-frequency components in the same direction and the high-frequency denoising wavelet coefficients produced by the noise reduction processing. It then performs an inverse wavelet transform according to the wavelet coefficients of the low-frequency components set to zero and the high-frequency noise wavelet coefficients corresponding to the high-frequency components in each direction, to obtain the intrinsic mode noise feature map corresponding to the image to be detected. In this way, a feature map of the intrinsic mode noise can be obtained accurately, which improves the accuracy of image authenticity detection.
  • Because the intrinsic mode noise feature map already contains the intrinsic mode noise, image authenticity detection only requires extracting the noise residual and does not require accurately extracting the photoresponse non-uniformity characteristic, which improves the efficiency and convenience of image authenticity detection.
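  • The subtraction and inverse transform described above can be sketched as follows. This is only an illustration under stated assumptions: it assumes a single-level Haar wavelet (the text does not name a basis), and the helper names `subtract_bands` and `haar_idwt2` as well as the example coefficient values are hypothetical.

```python
def subtract_bands(coeffs, denoised):
    """Noise coefficients = wavelet coefficients minus denoised coefficients."""
    return [[c - d for c, d in zip(c_row, d_row)]
            for c_row, d_row in zip(coeffs, denoised)]


def haar_idwt2(low, horizontal, vertical, diagonal):
    """Inverse of a single-level 2D Haar transform (assumed basis).

    Each 4-tuple of coefficients is expanded back into one 2x2 pixel block.
    """
    out = []
    for lo_row, h_row, v_row, d_row in zip(low, horizontal, vertical, diagonal):
        top, bottom = [], []
        for lo, h, v, d in zip(lo_row, h_row, v_row, d_row):
            top += [(lo + h + v + d) / 2.0, (lo + h - v - d) / 2.0]
            bottom += [(lo - h + v - d) / 2.0, (lo - h - v + d) / 2.0]
        out += [top, bottom]
    return out


# Hypothetical coefficients of one 2x2 image and their denoised versions.
horizontal, vertical, diagonal = [[-2.0]], [[-1.0]], [[0.5]]
horizontal_dn, vertical_dn, diagonal_dn = [[-1.5]], [[-1.0]], [[0.0]]

noise_h = subtract_bands(horizontal, horizontal_dn)
noise_v = subtract_bands(vertical, vertical_dn)
noise_d = subtract_bands(diagonal, diagonal_dn)

# The low-frequency coefficients are zero, so only noise reaches the output.
noise_map = haar_idwt2([[0.0]], noise_h, noise_v, noise_d)
```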
  • Performing wavelet transformation on the image to be detected in the spatial domain, and decomposing the image to be detected into low-frequency components in the wavelet domain and high-frequency components in multiple directions, includes: performing a multi-scale wavelet transform on the image to be detected in the spatial domain, and decomposing the image to be detected into low-frequency components and high-frequency components at each scale, where the high-frequency components at the same scale include high-frequency components in multiple directions.
  • Performing the inverse wavelet transform according to the wavelet coefficients of the low-frequency components set to zero and the high-frequency noise wavelet coefficients corresponding to the high-frequency components in each direction, to obtain the intrinsic mode noise feature map corresponding to the image to be detected, includes: performing an inverse wavelet transform according to the wavelet coefficients of the low-frequency components set to zero and the high-frequency noise wavelet coefficients corresponding to the high-frequency components in each direction at each scale, to obtain the intrinsic mode noise feature map corresponding to the image to be detected.
  • the server can perform multi-scale wavelet transform on the image to be detected in the spatial domain, and decompose the image to be detected into low-frequency components and high-frequency components at each scale.
  • The high-frequency components at the same scale include high-frequency components in multiple directions.
  • the server can set the wavelet coefficients of the low-frequency components to 0, and obtain the wavelet coefficients of the high-frequency components in each direction at each scale that are not set to zero.
  • the server may perform denoising processing on the wavelet coefficients of the high-frequency components in each direction at each scale to obtain high-frequency denoising wavelet coefficients corresponding to the high-frequency components at each scale and in each direction.
  • the server can obtain the high-frequency noise wavelet coefficients corresponding to the high-frequency components in each direction at each scale according to the difference between the wavelet coefficients corresponding to the high-frequency components in the same direction at the same scale and the high-frequency noise reduction wavelet coefficients. Finally, the server can perform an inverse wavelet transform based on the wavelet coefficients of the zeroed low frequency components and the high frequency noise wavelet coefficients corresponding to the high frequency components in each direction at each scale, to obtain the intrinsic pattern noise feature map corresponding to the image to be detected.
  • The server may perform noise reduction processing on the high-frequency components in each direction at the same scale in turn and, for each direction, obtain the high-frequency noise wavelet coefficients corresponding to the high-frequency component in that direction at that scale from the difference between the wavelet coefficients of that high-frequency component and its high-frequency denoising wavelet coefficients. This yields the high-frequency noise wavelet coefficients corresponding to the high-frequency components in each direction at that scale. The server may then move to the next scale and perform noise reduction processing on the high-frequency components in each direction at that scale in turn, and so on, until the high-frequency components at every scale and in every direction have been processed.
  • The server can perform a multi-scale wavelet transform on the image to be detected in the spatial domain so as to determine the high-frequency information in the image more accurately. Performing noise reduction filtering on this accurate high-frequency information improves the accuracy of the noise reduction processing, which lays the groundwork for accurately determining the intrinsic mode noise feature map.
  • Performing noise reduction processing on the wavelet coefficients of the high-frequency components in each direction to obtain the high-frequency denoising wavelet coefficients corresponding to the high-frequency components in each direction includes: estimating the local variance of the non-noise information in the high-frequency components in each direction; and, for the high-frequency component in each direction, performing noise reduction filtering on the wavelet coefficients of the high-frequency component according to the local variance of the non-noise information in that high-frequency component, to obtain the high-frequency denoising wavelet coefficients corresponding to the high-frequency components in each direction.
  • non-noise information is information other than noise information.
  • The local variance characterizes the variance of the pixel values of local pixels in the image.
  • the server may use a window of a preset size to perform filtering processing based on the high-frequency component to obtain a filtering result. Then the server can determine the local variance of the non-noise information in the high-frequency component according to the difference between the filtering result and the preset noise variance.
  • The server can use multiple windows of different sizes to determine the initial local variance of the non-noise information in the high-frequency component under each window, and then determine the final local variance of the non-noise information in the high-frequency component from these initial local variances.
  • For the high-frequency component in each direction, the server can perform noise reduction filtering on the wavelet coefficients of the high-frequency component through a Wiener filter, according to the local variance of the non-noise information in that high-frequency component, to obtain the high-frequency denoising wavelet coefficients corresponding to the high-frequency component in that direction.
  • The server can estimate the local variance of the non-noise information in the high-frequency components in each direction, and then perform noise reduction filtering on the wavelet coefficients of the high-frequency components according to that local variance to obtain the high-frequency denoising wavelet coefficients corresponding to the high-frequency components. This denoises the high-frequency information accurately and separates the noise from the high-frequency information accurately, which lays the groundwork for obtaining an accurate intrinsic mode noise feature map.
  • Estimating the local variance of the non-noise information in the high-frequency components in each direction includes: for the high-frequency component in each direction, performing filtering processing based on the high-frequency component through multiple windows of different sizes to obtain the filtering result of the high-frequency component under each window; determining, according to the difference between the filtering result and the preset noise variance for the same window, the initial local variance of the non-noise information in the high-frequency component under each window; and selecting, from the initial local variances, the final local variance of the non-noise information in the high-frequency component.
  • The preset noise variance is the preset variance of the noise information in the high-frequency component.
  • the initial local variance refers to the local variance of the non-noise information in the high-frequency component under a window.
  • the final local variance refers to the local variance of the non-noise information in the high-frequency component finally determined according to the initial local variance of the non-noise information in the high-frequency component under multiple windows of different sizes.
  • The server may square the pixel value of each pixel in the image corresponding to the high-frequency component, and then filter the resulting squared high-frequency component map through multiple windows of different sizes to obtain the filtering result of the high-frequency component under each window.
  • The server may subtract the preset noise variance from the filtering result for the same window to obtain the initial local variance of the non-noise information in the high-frequency component under each window.
  • Squaring the pixel value of each pixel in the image corresponding to the high-frequency component, and then filtering the squared high-frequency component map through the window, yields a filtering result that approximates the local variance of the high-frequency component. Subtracting the preset noise variance from the filtering result then gives the local variance of the non-noise information.
  • the server may select the minimum value of the initial local variance corresponding to the non-noise information in the high-frequency component under different windows as the final local variance of the non-noise information in the high-frequency component.
  • The size of each window and the number of windows can be set arbitrarily according to actual needs. For example, four windows of sizes 3×3, 5×5, 7×7 and 9×9 may be selected.
  • the preset noise variance can be set according to actual requirements.
  • the preset noise standard deviation can be set to 5, that is, the preset noise variance is 25.
  • The local variance of the non-noise information in the high-frequency component is determined through multiple windows of different sizes, which avoids the inaccuracy of a local variance obtained with a single window. Using multiple windows of different sizes gives a more accurate local variance, which improves the accuracy of the noise reduction filtering applied to the wavelet coefficients of the high-frequency components.
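  • The multi-window variance estimate and the variance-driven filtering can be sketched as follows. The window sizes (3, 5, 7, 9), the preset noise variance of 25 and the choice of the minimum come from the text. The clamp at zero, the truncation of windows at the image border, and the gain formula variance / (variance + noise variance) are assumptions of this sketch (a common form of Wiener attenuation), and the function names are hypothetical.

```python
def box_mean(values, size):
    """Mean over a size x size window centered on each position.

    Windows are truncated at the border (an assumption of this sketch).
    """
    rows, cols = len(values), len(values[0])
    half = size // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            r0, r1 = max(0, r - half), min(rows, r + half + 1)
            c0, c1 = max(0, c - half), min(cols, c + half + 1)
            window = [values[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            out[r][c] = sum(window) / len(window)
    return out


def local_signal_variance(coeffs, noise_var=25.0, window_sizes=(3, 5, 7, 9)):
    """Final local variance of the non-noise information in one sub-band."""
    squared = [[v * v for v in row] for row in coeffs]
    final = None
    for size in window_sizes:
        filtered = box_mean(squared, size)
        # Initial local variance under this window; clamped at zero (assumed).
        initial = [[max(0.0, f - noise_var) for f in row] for row in filtered]
        if final is None:
            final = initial
        else:
            # Keep the minimum over all windows as the final local variance.
            final = [[min(a, b) for a, b in zip(f_row, i_row)]
                     for f_row, i_row in zip(final, initial)]
    return final


def wiener_denoise(coeffs, noise_var=25.0, window_sizes=(3, 5, 7, 9)):
    """Attenuate each coefficient by variance / (variance + noise_var)."""
    variance = local_signal_variance(coeffs, noise_var, window_sizes)
    return [[c * v / (v + noise_var) for c, v in zip(c_row, v_row)]
            for c_row, v_row in zip(coeffs, variance)]


band = [[10.0] * 4 for _ in range(4)]   # hypothetical high-frequency sub-band
denoised = wiener_denoise(band)
noise = [[c - d for c, d in zip(c_row, d_row)]
         for c_row, d_row in zip(band, denoised)]
```

  • Where the estimated local variance is zero, the coefficient is attributed entirely to noise and the denoised coefficient becomes zero; where it is large relative to the preset noise variance, the coefficient passes almost unchanged.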
  • As shown in FIG. 4, which is a schematic diagram of the overall flow of determining the intrinsic mode noise feature map corresponding to the image to be detected in the above embodiments, the flow specifically includes the following steps:
  • Step 402 acquiring an image to be detected.
  • Step 404 perform multi-scale wavelet transform on the single-channel image of the image to be detected, and decompose it into low-frequency components and high-frequency components in each direction at each scale.
  • Step 406 Set the wavelet coefficients of the low-frequency components to zero to obtain the wavelet coefficients of the high-frequency components in each direction at each scale.
  • Step 408 Estimate the local variance of the non-noise information in the horizontal high-frequency component at a single scale, perform noise reduction filtering according to the local variance to obtain the high-frequency denoising wavelet coefficients, and then subtract the high-frequency denoising wavelet coefficients from the wavelet coefficients of the horizontal high-frequency component to obtain the corresponding high-frequency noise wavelet coefficients.
  • Step 410 Estimate the local variance of the non-noise information in the vertical high-frequency component at a single scale, perform noise reduction filtering according to the local variance to obtain the high-frequency denoising wavelet coefficients, and then subtract the high-frequency denoising wavelet coefficients from the wavelet coefficients of the vertical high-frequency component to obtain the corresponding high-frequency noise wavelet coefficients.
  • Step 412 Estimate the local variance of the non-noise information in the diagonal high-frequency component at a single scale, perform noise reduction filtering according to the local variance to obtain the high-frequency denoising wavelet coefficients, and then subtract the high-frequency denoising wavelet coefficients from the wavelet coefficients of the diagonal high-frequency component to obtain the corresponding high-frequency noise wavelet coefficients.
  • step 414 It is judged whether the high-frequency components at each scale have been denoised, and if not, return to step 408 to continue execution. If yes, execute step 414 .
  • Step 414 Perform inverse wavelet transform according to the wavelet coefficients of the zero-set low-frequency components in a single channel and the high-frequency noise wavelet coefficients corresponding to high-frequency components in each direction in each scale, to obtain the intrinsic mode noise feature map.
  • step 416 It is judged whether all channels have been processed, if not, return to step 404 to continue execution. If yes, execute step 416 .
  • Step 416 according to the intrinsic mode noise characteristic map of each single channel, obtain the intrinsic mode noise characteristic map corresponding to the image to be detected.
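  • The loop structure of the flow above (over channels, then scales, then the three directions) can be sketched as a skeleton. The callables `decompose`, `denoise` and `reconstruct` are hypothetical placeholders for the wavelet transform, the noise reduction filter and the inverse transform; coefficients are kept as flat lists purely to keep the sketch short.

```python
DIRECTIONS = ("horizontal", "vertical", "diagonal")


def extract_noise_maps(channels, decompose, denoise, reconstruct):
    """Return one noise map per channel, following the flow of steps 404-414.

    decompose(channel)      -> (low, scales); scales is a list of dicts that
                               map each direction to its coefficients
    denoise(coeffs)         -> denoised coefficients of the same length
    reconstruct(low, noise) -> noise map of one channel
    """
    noise_maps = []
    for channel in channels:                      # repeat until all channels are done
        low, scales = decompose(channel)          # step 404
        low = [0.0] * len(low)                    # step 406
        noise_scales = []
        for scale in scales:                      # repeat until all scales are done
            noise = {}
            for direction in DIRECTIONS:          # steps 408, 410, 412
                coeffs = scale[direction]
                denoised = denoise(coeffs)
                noise[direction] = [c - d for c, d in zip(coeffs, denoised)]
            noise_scales.append(noise)
        noise_maps.append(reconstruct(low, noise_scales))   # step 414
    return noise_maps                             # combined in step 416
```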
  • Analyzing the distribution of the intrinsic mode noise in the intrinsic mode noise feature map, identifying the authenticity of the image to be detected according to the distribution, and obtaining the authenticity detection result of the image to be detected includes: inputting the intrinsic mode noise feature map into the pre-trained authenticity detection model; and, through the authenticity detection model, analyzing the distribution of the intrinsic mode noise in the intrinsic mode noise feature map and identifying the authenticity of the image to be detected according to the distribution, to obtain the authenticity detection result of the image to be detected.
  • the authenticity detection model may be a machine learning model.
  • the authenticity detection model may be a neural network model. In one embodiment, the authenticity detection model may be a convolutional neural network model.
  • the authenticity detection model may be an EfficientNet convolutional neural network model.
  • the authenticity detection model may be an EfficientNet-B4 (one of the versions of the EfficientNet series) convolutional neural network model.
  • the convolutional neural network of the EfficientNet series performs well in the application of image classification. Therefore, the convolutional neural network can be selected to classify and identify the authenticity detection results of the image to be detected, and accurate authenticity detection results can be obtained.
  • the photoresponse non-uniformity characteristic which is the main component of intrinsic mode noise, is an inherent property of the camera sensor and is not very sensitive to the image content.
  • the server can determine the authenticity detection result of the image to be detected by analyzing the distribution of the intrinsic mode noise in the intrinsic mode noise feature map through the pre-trained authenticity detection model.
  • During model training, the authenticity detection model can learn the difference between the distributions of the inherent pattern noise in fake images and in real images, so that the authenticity detection result of the image to be detected can be obtained accurately through the pre-trained authenticity detection model.
  • the server may also input the intrinsic mode noise feature map and features related to the image content into the neural network model to obtain the authenticity detection result of the image.
  • the feature related to the image content in the face image may be at least one of the information of facial features, skin color, and face contour.
  • The server can analyze the distribution of the intrinsic mode noise in the intrinsic mode noise feature map through the authenticity detection model and identify the authenticity of the image to be detected according to the distribution. This yields an accurate authenticity detection result for the image to be detected and improves the accuracy of image authenticity detection.
  • the EfficientNet-B4 convolutional neural network is used for authenticity identification, which improves the accuracy and improves the efficiency of model training.
  • As shown in FIG. 6, which is a schematic diagram of the overall process of obtaining the authenticity detection result of the image to be detected in the above embodiments, the process specifically includes the following steps:
  • Step 602 acquiring an image to be detected.
  • Step 604 performing cropping processing on the image to be detected.
  • Step 606 performing intrinsic pattern noise extraction on the cropped image to be detected to obtain an intrinsic pattern noise feature map.
  • Step 608 inputting the intrinsic mode noise feature map into the pre-trained authenticity detection model.
  • Step 610 analyzing the distribution of intrinsic mode noise through the authenticity detection model to obtain the authenticity detection result of the image to be detected.
  • step 606 specifically includes the following steps:
  • Step 6061 perform multi-scale wavelet transform on the image to be detected.
  • Step 6062 set the wavelet coefficients of the low frequency components to zero.
  • Step 6063 estimating the local variance of non-noise information in each high-frequency component.
  • Step 6064 perform denoising filtering on the wavelet coefficients of each high-frequency component according to the local variance, to obtain high-frequency denoising wavelet coefficients.
  • Step 6065 according to the difference between the wavelet coefficient of the high frequency component and the wavelet coefficient of the high frequency noise reduction, obtain the high frequency noise wavelet coefficient.
  • Step 6066 Perform inverse wavelet transformation according to the wavelet coefficients of the low-frequency components set to zero and the corresponding high-frequency noise wavelet coefficients of each high-frequency component to obtain a characteristic map of intrinsic mode noise.
  • Step 6067 perform image enhancement processing on the intrinsic pattern noise feature map.
  • The authenticity detection model is obtained through a model training step. The model training step includes: obtaining a sample image and the authenticity label corresponding to the sample image; performing intrinsic pattern noise extraction on the sample image to obtain a sample intrinsic pattern noise feature map; inputting the sample intrinsic pattern noise feature map into the authenticity detection model to be trained, analyzing the distribution of the intrinsic pattern noise in the sample intrinsic pattern noise feature map through the authenticity detection model, and identifying, according to the distribution, the authenticity of the sample image corresponding to the sample intrinsic pattern noise feature map, to obtain the authenticity detection result of the sample image; and iteratively adjusting the model parameters of the authenticity detection model to be trained according to the difference between the authenticity detection result of the sample image and the authenticity label corresponding to the sample image, until an iteration stop condition is met, to obtain the trained authenticity detection model.
  • the authenticity label is used to represent the actual authenticity of the sample image.
  • the sample intrinsic pattern noise feature map is an image including intrinsic pattern noise in the sample image.
  • the authenticity detection result of the sample image is the authenticity of the sample image identified by the authenticity detection model to be trained during the model training process.
  • the authenticity label may include that the sample image is a real image and that the sample image is a fake image.
  • the authenticity detection result of the sample image may include that the sample image is a real image and that the sample image is a forged image.
  • the authenticity label may include that the sample image is a real image, and the sample image is a forged image under the forgery type.
  • the authenticity detection result of the sample image may include that the sample image is a real image, and that the sample image is a forged image under a forgery type.
  • the authenticity label corresponding to the sample image may be obtained by manually marking the sample image.
  • The sample image can be an image extracted from a sample video, the authenticity label of the sample image can be the authenticity label of the sample video from which the sample image is derived, and the authenticity label of the sample video can be obtained by manually labeling the sample video.
  • The server can adjust the model parameters of the authenticity detection model to be trained according to the difference between the authenticity detection result of the sample image and the authenticity label corresponding to the sample image, and then enter the next round of iteration, until the difference satisfies the iteration stop condition, to obtain the trained authenticity detection model.
  • the server may use the cross-entropy loss function as the objective function in the model training process to measure the difference.
  • The iteration stop condition may be that the loss value of the objective function converges.
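  • A minimal sketch of a cross-entropy objective and a convergence check is given below for illustration. The class indexing (0 for a real image, 1 for a forged image), the tolerance, the patience value and the function names are assumptions of this sketch and are not specified in the text.

```python
import math


def softmax(logits):
    """Convert raw model outputs into class probabilities."""
    peak = max(logits)                       # subtract the maximum for stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy between the predicted distribution and the true label index."""
    return -math.log(softmax(logits)[label])


def has_converged(losses, tol=1e-4, patience=3):
    """True when the last `patience` loss changes are all smaller than `tol`."""
    if len(losses) <= patience:
        return False
    recent = losses[-(patience + 1):]
    return all(abs(recent[i + 1] - recent[i]) < tol for i in range(patience))


# Hypothetical model output for one sample whose label is "forged" (index 1).
loss = cross_entropy([0.2, 1.7], label=1)
```

  • The loss is small when the model assigns high probability to the labeled class and grows as the prediction moves away from the label, which is the difference that the parameter updates reduce.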
  • The server can iteratively train the authenticity detection model according to the sample images and the corresponding authenticity labels, so as to obtain an authenticity detection model that analyzes the distribution of the intrinsic mode noise in the sample intrinsic mode noise feature map and produces accurate authenticity detection results, which improves the accuracy of image authenticity detection.
  • Extracting the intrinsic mode noise from the sample image to obtain the sample intrinsic mode noise feature map includes: removing the low-frequency information in the sample image to obtain sample high-frequency information; performing noise reduction processing on the sample high-frequency information to obtain the sample denoising information after noise reduction; and obtaining the sample intrinsic mode noise feature map corresponding to the sample image according to the difference between the sample high-frequency information and the sample denoising information.
  • the sample high-frequency information is image information that does not contain low-frequency information obtained by removing low-frequency information from the sample image.
  • the sample denoising information is denoised image information obtained by performing denoising processing on the high-frequency information of the samples.
  • the server may decompose and remove low-frequency information in the sample image to obtain high-frequency information of the sample by performing domain transformation on the sample image.
  • The server can perform a wavelet transform on the sample image, decompose the sample image into a sample low-frequency component and a plurality of sample high-frequency components, and then set the wavelet coefficients of the sample low-frequency component to zero to obtain the wavelet coefficients of each sample high-frequency component.
  • the server may filter the sample image through a high-pass filter to remove low-frequency information in the sample image to obtain sample high-frequency information.
  • the server may also use other methods to remove low-frequency information in the sample image to obtain sample high-frequency information, which is not limited.
  • the server may perform denoising filtering on the high-frequency information of the samples in the spatial domain to obtain denoising information of the samples after denoising.
  • the server may perform denoising filtering on the high-frequency information of the samples in the transform domain to obtain denoising information of the samples in the transform domain.
  • the transform domain may be any one of wavelet domain, frequency domain and the like.
  • the server may perform noise reduction filtering on the sample high-frequency information through a Wiener filter.
  • the server may also use other filters to perform noise reduction filtering on the sample high-frequency information, such as an average filter or a median filter.
  • the server may respectively perform denoising filtering on the wavelet coefficients of the high-frequency components of each sample in the wavelet domain of the sample image to obtain the high-frequency noise-reducing wavelet coefficients of the samples corresponding to the high-frequency components of each sample.
  • The server can estimate the local variance of the non-noise information in each sample high-frequency component, and then perform noise reduction filtering on the wavelet coefficients of each sample high-frequency component according to the local variance to obtain the sample high-frequency denoising wavelet coefficients.
  • The server may subtract the sample denoising information from the sample high-frequency information to obtain the sample intrinsic mode noise feature map corresponding to the sample image.
  • the server may obtain the high-frequency noise wavelet coefficients of the sample according to the difference between the wavelet coefficients of the high-frequency components of the sample and the high-frequency noise reduction wavelet coefficients of the sample. Then the server can perform inverse wavelet transformation according to the wavelet coefficients of the low-frequency components of the samples set to zero and the high-frequency noise wavelet coefficients of the samples corresponding to the high-frequency components of each sample, to obtain the characteristic map of the inherent mode noise of the samples.
  • The server may subtract the sample high-frequency denoising wavelet coefficients from the wavelet coefficients of the sample high-frequency component to obtain the sample high-frequency noise wavelet coefficients.
  • The specific implementation of the step of extracting the intrinsic mode noise from the sample image to obtain the sample intrinsic mode noise feature map is similar to the specific implementation of obtaining the intrinsic mode noise feature map corresponding to the image to be detected in the embodiments of the present application. The foregoing embodiments are merely examples, and details are not repeated here.
  • The server can remove the low-frequency information in the sample image to obtain the sample high-frequency information, and then perform noise reduction processing on the sample high-frequency information to obtain the sample denoising information. Because the intrinsic mode noise exists mainly in the high-frequency information, the sample intrinsic mode noise feature map corresponding to the sample image can be obtained accurately from the difference between the sample high-frequency information and the sample denoising information.
  • Obtaining the sample image and the authenticity label corresponding to the sample image includes: acquiring a sample video carrying an authenticity label; sampling the sample video to obtain a plurality of sample video frames corresponding to the sample video; cropping each sample video frame according to a preset size to obtain a sample image; and using the authenticity label carried by the sample video from which the sample image originates as the authenticity label corresponding to the sample image.
  • the sample video frame is a single frame image in the sample video.
  • the server may evenly and randomly divide the acquired videos into a sample set and a test set.
  • the videos in the sample set are used as sample videos for model training.
  • the videos in the test set are used as test videos to test the effect of the trained authenticity detection model after the model training is completed.
  • Videos can be divided into different video sets according to video type. The server can randomly select a first preset number of videos from the video set of each type and assign them to the sample set, and then randomly select a second preset number of videos from the video set of each type and assign them to the test set.
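  • The per-type random split can be sketched as follows. The function name `split_by_type`, the fixed seed, and the choice to keep the sample set and the test set disjoint are assumptions of this sketch; the text only states that a preset number of videos is drawn at random from each type for each set.

```python
import random


def split_by_type(videos_by_type, n_sample, n_test, seed=0):
    """Draw n_sample training videos and n_test test videos from every type.

    videos_by_type maps a video type (for example "real" or a forgery type)
    to a list of video identifiers. Returns two lists of (video, type) pairs.
    """
    rng = random.Random(seed)
    sample_set, test_set = [], []
    for video_type, videos in videos_by_type.items():
        if len(videos) < n_sample + n_test:
            raise ValueError("not enough videos of type %r" % video_type)
        shuffled = list(videos)
        rng.shuffle(shuffled)
        # Disjoint slices, so no video appears in both sets (assumed).
        sample_set += [(v, video_type) for v in shuffled[:n_sample]]
        test_set += [(v, video_type) for v in shuffled[n_sample:n_sample + n_test]]
    return sample_set, test_set


# Hypothetical video identifiers for one real type and one forgery type.
videos = {
    "real": ["real_%d" % i for i in range(10)],
    "face_swap": ["swap_%d" % i for i in range(10)],
}
sample_set, test_set = split_by_type(videos, n_sample=6, n_test=2)
```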
  • the video types may include real video types, and multiple fake types.
  • authenticity labels may be manually marked on each sample video according to the actual authenticity of each sample video.
  • the server can obtain the sample video carrying the authenticity label, and then sample the sample video according to a preset sampling rule to obtain a plurality of sample video frames corresponding to the sample video. Next, the server may crop each sample video frame according to a preset size to obtain a sample image, and use the authenticity label carried by the sample video from which the sample image is derived as the authenticity label corresponding to the sample image.
  • The authenticity label may indicate either that the sample video is a real video or that the sample video is a forged video. Specifically, if the authenticity label carried by the sample video from which the sample image is derived indicates a real video, the authenticity label of the sample image is that the sample image is a real image. If that label indicates a forged video, the authenticity label of the sample image is that the sample image is a forged image.
  • The authenticity label may also indicate either that the sample video is a real video or that the sample video is a forged video of a particular forgery type. Specifically, if the authenticity label carried by the sample video from which the sample image is derived indicates a real video, the authenticity label of the sample image is that the sample image is a real image. If that label indicates a forged video of a given forgery type, the authenticity label of the sample image is that the sample image is a forged image of that forgery type.
  • The server may perform center cropping on the sample video frame according to a preset size. Specifically, the server may take the center of the sample video frame as the center of the cropped sample image and crop out a sample image of the preset size. For example, if the preset size is [224,224], the server can cut at positions 112 pixels from the center of the sample video frame in each of the four directions (up, down, left, and right); the resulting image, centered on the center of the sample video frame and of size [224,224], is the sample image.
  • If the sample video frame is a multi-channel image, the preset size has one additional dimension for the number of channels. For example, if the sample video frame is a three-channel image, the preset size may be [224,224,3].
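Center cropping as described above can be sketched in a few lines. The function name `center_crop` and the nested-list image representation are assumptions made for illustration; a real pipeline would more likely operate on array types.

```python
def center_crop(frame, size):
    """Crop `frame` (a list of pixel rows) to size = (height, width) about its center.

    Pixels may be plain numbers (single channel) or tuples such as (r, g, b);
    the channel dimension is left untouched, and no pixel value is rescaled.
    """
    crop_h, crop_w = size
    height, width = len(frame), len(frame[0])
    if crop_h > height or crop_w > width:
        raise ValueError("frame is smaller than the preset size")
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return [row[left:left + crop_w] for row in frame[top:top + crop_h]]
```

Because the crop only selects pixels and never interpolates them, the pixel-level noise that the later steps rely on is passed through unchanged.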
  • The server can extract sample images from the sample video and use the authenticity label carried by the sample video as the authenticity label of each sample image, so that sample images can be obtained from videos.
  • Cropping the sample video frames to obtain the sample images keeps the size of all sample images consistent during training and reduces their size, which lowers time overhead and improves training efficiency. It also avoids scaling, which would destroy the intrinsic pattern noise between pixels of the sample image.
  • Sampling the sample video to obtain a plurality of sample video frames corresponding to the sample video includes: for each sample video, if the specified number of sampling frames is greater than or equal to the total number of frames of the sample video, using each frame of the sample video as a sample video frame.
  • The server may determine the total number of frames of the sample video and compare it with the specified number of sampling frames. If the specified number of sampling frames is greater than or equal to the total number of frames of the sample video, the server may use each frame in the sample video as a sample video frame.
  • The server can sample according to the rule that corresponds to the number of sampling frames and the total number of frames of the sample video, so that an appropriate number of evenly spaced sample images is obtained from the sample video. Training the model on the sampled images avoids the large time overhead of training directly on every frame of the sample video and improves the efficiency of model training.
  • Sampling the sample video to obtain a plurality of sample video frames corresponding to the sample video includes: for each sample video, if the specified number of sampling frames is less than the total number of frames of the sample video, determining a sampling interval according to the total number of frames and the number of sampling frames, and extracting sample video frames from the sample video according to the sampling interval.
  • The server may determine the total number of frames of the sample video and compare it with the specified number of sampling frames. If the specified number of sampling frames is less than the total number of frames of the sample video, the server can determine the sampling interval according to the total number of frames and the number of sampling frames, and extract sample video frames from the sample video according to the sampling interval.
  • The server may take the ratio of the total number of frames to the number of sampling frames as the sampling interval, and extract sample video frames from the sample video according to that interval. In one embodiment, the server samples the sample video according to a piecewise sampling rule, in which:
  • frame_num represents the total number of frames of the sample video;
  • sample_num represents the specified number of sampling frames;
  • k represents the index of an extracted sample video frame.
  • When frame_num ≤ 0 or sample_num ≤ 0, the sampling rule yields [], which means that the sample video is not sampled.
  • When frame_num ≤ sample_num, that is, when the specified number of sampling frames is greater than or equal to the total number of frames of the sample video, the 0th frame, ..., the k-th frame, ..., and the (frame_num-1)-th frame of the sample video are extracted in sequence, so that every frame in the sample video is used as a sample video frame.
  • When frame_num > sample_num, that is, when the specified number of sampling frames is less than the total number of frames of the sample video, the ratio frame_num/sample_num of the total number of frames to the number of sampling frames is taken as the sampling interval, and sample video frames are extracted from the sample video at that interval.
  • the specified number of sampling frames can be set arbitrarily according to actual needs. For example: You can set the number of sampling frames to 10 frames.
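The three cases of the sampling rule can be sketched as below. The names `frame_num`, `sample_num`, and `k` come from the text; the function name `sample_frame_indices` is hypothetical, and the way a fractional interval is rounded is not stated in the text, so flooring k × frame_num / sample_num is an assumption.

```python
def sample_frame_indices(frame_num, sample_num):
    """Return the indices k of the frames to extract from one sample video."""
    if frame_num <= 0 or sample_num <= 0:
        return []  # nothing to sample
    if sample_num >= frame_num:
        return list(range(frame_num))  # use every frame
    # Sampling interval is frame_num / sample_num; integer arithmetic floors
    # k * interval exactly (the rounding mode is an assumption).
    return [(k * frame_num) // sample_num for k in range(sample_num)]
```

With 25 frames and 10 sampling frames the interval is 2.5, and the sketch picks frames 0, 2, 5, 7, 10, 12, 15, 17, 20, and 22, which are spread evenly over the video.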
  • The server can sample according to the rule that corresponds to the number of sampling frames and the total number of frames of the sample video, so that an appropriate number of evenly spaced sample images is obtained from the sample video. Training the model on the sampled images avoids the large time overhead of training directly on every frame of the sample video and improves the efficiency of model training.
  • FIG. 7 is a schematic diagram of the overall flow of the model training steps of the authenticity detection model in the above embodiments, which specifically includes the following steps:
  • Step 702: acquire sample videos carrying authenticity labels.
  • Step 704: sample each sample video to obtain a plurality of sample video frames.
  • Step 706: crop the sample video frames to obtain sample images.
  • Step 708: extract intrinsic pattern noise from the sample image to obtain a sample intrinsic pattern noise feature map.
  • Step 710: input the sample intrinsic pattern noise feature map into the authenticity detection model to be trained.
  • Step 712: iteratively adjust the model parameters according to the difference between the authenticity detection result of the sample image output by the model and the authenticity label, until the iteration stop condition is satisfied, to obtain the trained authenticity detection model.
  • Step 708 specifically includes the following steps:
  • Step 7081: perform a multi-scale wavelet transform on the sample image.
  • Step 7082: set the wavelet coefficients of the low-frequency components to zero.
  • Step 7083: estimate the local variance of the non-noise information in each high-frequency component.
  • Step 7084: perform denoising filtering on the wavelet coefficients of each high-frequency component according to the local variance, to obtain high-frequency denoised wavelet coefficients.
  • Step 7085: obtain high-frequency noise wavelet coefficients from the difference between the wavelet coefficients of each high-frequency component and the corresponding high-frequency denoised wavelet coefficients.
  • Step 7086: perform an inverse wavelet transform on the zeroed low-frequency wavelet coefficients and the high-frequency noise wavelet coefficients of each high-frequency component, to obtain the sample intrinsic pattern noise feature map.
  • Step 7087: perform image enhancement on the sample intrinsic pattern noise feature map.
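Steps 7081 to 7086 can be illustrated with a small, dependency-free sketch. Several simplifications are assumptions rather than details from the text: a single-level Haar transform stands in for the multi-scale wavelet transform, the local variance uses one 3×3 window, the default noise variance `sigma0_sq` is an arbitrary positive value, and all function names are hypothetical. Step 7087 (image enhancement) is not shown.

```python
def haar_dwt2(img):
    """One-level orthonormal 2-D Haar transform of an even-sized image.

    Returns the low-frequency band and three directional high-frequency bands.
    """
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(img), 2):
        rows = ([], [], [], [])
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            rows[0].append((a + b + d + e) / 2)
            rows[1].append((a + b - d - e) / 2)
            rows[2].append((a - b + d - e) / 2)
            rows[3].append((a - b - d + e) / 2)
        for band, row in zip((ll, lh, hl, hh), rows):
            band.append(row)
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2."""
    out = []
    for r in range(len(ll)):
        top, bottom = [], []
        for c in range(len(ll[0])):
            s, v, h, d = ll[r][c], lh[r][c], hl[r][c], hh[r][c]
            top += [(s + v + h + d) / 2, (s + v - h - d) / 2]
            bottom += [(s - v + h - d) / 2, (s - v - h + d) / 2]
        out += [top, bottom]
    return out


def local_variance(band, r, c, sigma0_sq, radius=1):
    """Step 7083: window mean of squared coefficients minus the noise variance, floored at 0."""
    vals = [band[i][j] ** 2
            for i in range(max(0, r - radius), min(len(band), r + radius + 1))
            for j in range(max(0, c - radius), min(len(band[0]), c + radius + 1))]
    return max(0.0, sum(vals) / len(vals) - sigma0_sq)


def noise_band(band, sigma0_sq):
    """Steps 7084-7085 for one band: Wiener-style denoising, then coefficient minus denoised."""
    out = []
    for r, row in enumerate(band):
        out_row = []
        for c, coef in enumerate(row):
            var = local_variance(band, r, c, sigma0_sq)
            denoised = coef * var / (var + sigma0_sq)
            out_row.append(coef - denoised)
        out.append(out_row)
    return out


def extract_pattern_noise(img, sigma0_sq=9.0):
    """Return a noise residual map with the same size as img (sigma0_sq must be > 0)."""
    ll, lh, hl, hh = haar_dwt2(img)                              # step 7081
    zero_ll = [[0.0] * len(row) for row in ll]                   # step 7082
    noise = [noise_band(b, sigma0_sq) for b in (lh, hl, hh)]     # steps 7083-7085
    return haar_idwt2(zero_ll, *noise)                           # step 7086
```

On a flat image the sketch returns an all-zero map, while a weak pixel-level pattern on top of a flat background is returned with the background removed, which is the behavior the steps are meant to achieve.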
  • Performing authenticity identification on the image to be detected according to the distribution, to obtain the authenticity detection result of the image to be detected, includes: identifying the image to be detected as a real image if the distribution conforms to the intrinsic pattern noise distribution of real images.
  • The server may analyze the distribution of intrinsic pattern noise in the intrinsic pattern noise feature map through the pre-trained authenticity detection model and determine which intrinsic pattern noise distribution it conforms to. If the distribution conforms to the intrinsic pattern noise distribution of real images, the authenticity detection model can identify the image to be detected as a real image.
  • The server can thus accurately identify whether the image to be detected is a real image.
  • Performing authenticity identification on the image to be detected according to the distribution, to obtain the authenticity detection result of the image to be detected, includes: if the distribution matches the preset intrinsic pattern noise distribution of a forgery type, identifying the image to be detected as a forged image of that forgery type according to the forgery type corresponding to the matching intrinsic pattern noise distribution.
  • The server may analyze the distribution of intrinsic pattern noise in the intrinsic pattern noise feature map through the pre-trained authenticity detection model and determine which intrinsic pattern noise distribution it conforms to. If the distribution matches the preset intrinsic pattern noise distribution of a forgery type, the authenticity detection model can identify the image to be detected as a forged image of the forgery type corresponding to the matching intrinsic pattern noise distribution.
  • For example, when the matching distribution corresponds to a synthesis forgery type, the authenticity detection model can identify the image to be detected as a forged image of the synthesis type.
  • The server can thus accurately identify whether the image to be detected is a forged image of a given forgery type.
  • The server can not only identify the image to be detected as a forged image but also identify its forgery type, which makes the detection result more informative.
  • the present application also provides an application scenario, which is an application scenario of face image security detection, and the application scenario applies the above-mentioned image authenticity detection method.
  • the application of the image authenticity detection method in this application scenario is as follows:
  • the terminal can collect the user's face image, and send the user's face image to the server for face recognition.
  • the server may execute the image authenticity detection method in each embodiment of the present application, perform authenticity recognition on the face image to be detected, and obtain the authenticity detection result. If the face image to be detected is identified as a real face image, the server may continue to perform face recognition on the face image.
  • If the face image to be detected is identified as a forged face image, the server may skip face recognition on that image and report to the terminal that a forged image was detected, which effectively protects the user's personal safety, property, and reputation and improves the security of the face recognition process.
  • the present application further provides an application scenario, which is an application scenario of image identification.
  • This application scenario applies the above-mentioned image authenticity detection method.
  • the application of the image authenticity detection method in this application scenario is as follows:
  • The user can input the image to be detected through the terminal, and the terminal can send the image to be detected to the server. The server can execute the image authenticity detection method in the embodiments of the present application to obtain the authenticity detection result of the image to be detected, and return the result to the terminal for display, so that the user can easily check the authenticity of the image.
  • Although the steps in each flowchart are displayed in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless otherwise specified herein, there is no strict restriction on the order of execution, and the steps can be executed in other orders. Moreover, at least some of the steps in each flowchart may include multiple sub-steps or stages. These sub-steps or stages are not necessarily executed at the same time and may be executed at different times; nor are they necessarily executed sequentially, as they may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
  • An image authenticity detection device 800 is provided.
  • The device may be implemented as software modules, hardware modules, or a combination of the two, forming part of a computer device.
  • The device specifically includes an image acquisition module 802, a high-frequency information acquisition module 804, a noise reduction module 806, a pattern noise determination module 808, and an authenticity identification module 810, wherein:
  • An image acquisition module 802 configured to acquire an image to be detected.
  • the high-frequency information acquisition module 804 is configured to remove low-frequency information in the image to be detected to obtain first image information.
  • the noise reduction module 806 is configured to perform noise reduction processing on the first image information to obtain second image information after noise reduction.
  • the pattern noise determining module 808 is configured to obtain an intrinsic pattern noise feature map corresponding to the image to be detected according to the difference between the first image information and the second image information.
  • The authenticity identification module 810 is configured to analyze the distribution of intrinsic pattern noise in the intrinsic pattern noise feature map and perform authenticity identification on the image to be detected according to the distribution, to obtain the authenticity detection result of the image to be detected. The intrinsic pattern noise is inherent noise that is introduced by the camera sensor and is not disturbed by image content.
  • the first image information includes wavelet coefficients of high-frequency components in each direction of the image to be detected in the wavelet domain.
  • The high-frequency information acquisition module 804 is also used to perform a wavelet transform on the image to be detected in the spatial domain, decomposing it into a low-frequency component and high-frequency components in multiple directions in the wavelet domain, and to set the wavelet coefficients of the low-frequency component to zero, obtaining the wavelet coefficients of the high-frequency components in each direction.
  • the second image information includes high-frequency noise reduction wavelet coefficients corresponding to high-frequency components of the image to be detected in each direction in the wavelet domain.
  • the denoising module 806 is further configured to perform denoising processing on wavelet coefficients of high frequency components in each direction to obtain high frequency denoising wavelet coefficients corresponding to high frequency components in each direction.
  • The pattern noise determination module 808 is also used to obtain the high-frequency noise wavelet coefficients corresponding to the high-frequency component in each direction from the difference between the wavelet coefficients and the high-frequency denoised wavelet coefficients of the high-frequency component in the same direction, and to perform an inverse wavelet transform on the zeroed wavelet coefficients of the low-frequency component and the high-frequency noise wavelet coefficients corresponding to the high-frequency components in each direction, obtaining the intrinsic pattern noise feature map corresponding to the image to be detected.
  • The high-frequency information acquisition module 804 is also used to perform a multi-scale wavelet transform on the image to be detected in the spatial domain, decomposing it into a low-frequency component and a high-frequency component at each scale.
  • The high-frequency component at each scale includes high-frequency components in multiple directions.
  • The pattern noise determination module 808 is also used to perform an inverse wavelet transform on the zeroed wavelet coefficients of the low-frequency components and the high-frequency noise wavelet coefficients corresponding to the high-frequency components in each direction at each scale, obtaining the intrinsic pattern noise feature map corresponding to the image to be detected.
  • The noise reduction module 806 is also used to estimate the local variance of the non-noise information in the high-frequency component in each direction and, for the high-frequency component in each direction, to perform denoising filtering on its wavelet coefficients according to the local variance of its non-noise information, obtaining the high-frequency denoised wavelet coefficients corresponding to the high-frequency component in each direction.
  • The noise reduction module 806 is also used to filter the high-frequency component in each direction through a plurality of windows of different sizes, obtaining the filtering result of the high-frequency component under each window; to determine, from the difference between the filtering result for a given window and a preset noise variance, the initial local variance of the non-noise information in the high-frequency component under that window; and to select, from the initial local variances, the final local variance of the non-noise information in the high-frequency component.
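The multi-window variance estimate can be sketched for a single high-frequency band as follows. The text only states that several windows of different sizes are used, that the filtering result is compared with a preset noise variance, and that one of the initial estimates is selected. The window sizes (3, 5, 7, 9), the mean-of-squares filter, and taking the minimum as the final value are assumptions borrowed from common PRNU extraction practice, and the function names are hypothetical.

```python
def window_mean_sq(band, r, c, size):
    """Mean of squared coefficients in a size x size window centered on (r, c), clipped at borders."""
    half = size // 2
    vals = [band[i][j] ** 2
            for i in range(max(0, r - half), min(len(band), r + half + 1))
            for j in range(max(0, c - half), min(len(band[0]), c + half + 1))]
    return sum(vals) / len(vals)


def final_local_variance(band, r, c, sigma0_sq, sizes=(3, 5, 7, 9)):
    """Initial variance per window = max(0, mean of squares - sigma0_sq); keep the smallest (assumed rule)."""
    initial = [max(0.0, window_mean_sq(band, r, c, s) - sigma0_sq) for s in sizes]
    return min(initial)


def wiener_denoise(band, sigma0_sq, sizes=(3, 5, 7, 9)):
    """Shrink each coefficient by var / (var + sigma0_sq); sigma0_sq must be positive."""
    out = []
    for r, row in enumerate(band):
        out_row = []
        for c, coef in enumerate(row):
            var = final_local_variance(band, r, c, sigma0_sq, sizes)
            out_row.append(coef * var / (var + sigma0_sq))
        out.append(out_row)
    return out
```

Where the estimated content variance is zero, the coefficient is attributed entirely to noise and shrunk to zero; where the content variance is large, the coefficient is left almost unchanged.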
  • the authenticity identification module 810 is also used to input the intrinsic pattern noise feature map into the pre-trained authenticity detection model; analyze the distribution of the intrinsic pattern noise in the intrinsic pattern noise feature map through the authenticity detection model, And according to the distribution, the authenticity of the image to be detected is identified, and the authenticity detection result of the image to be detected is obtained.
  • the image authenticity detection device also includes:
  • The model training module 812 is used to obtain a sample image and the authenticity label corresponding to the sample image; extract intrinsic pattern noise from the sample image to obtain a sample intrinsic pattern noise feature map; input the sample intrinsic pattern noise feature map into the authenticity detection model to be trained; analyze, through the authenticity detection model, the distribution of intrinsic pattern noise in the sample intrinsic pattern noise feature map, and identify the authenticity of the corresponding sample image according to the distribution, obtaining the authenticity detection result of the sample image; and iteratively adjust the model parameters of the authenticity detection model to be trained according to the difference between the authenticity detection result of the sample image and the authenticity label corresponding to the sample image, until the iteration stop condition is met, obtaining the trained authenticity detection model.
  • The model training module 812 is also used to remove low-frequency information from the sample image to obtain sample high-frequency information; denoise the sample high-frequency information to obtain sample denoised information; and obtain the sample intrinsic pattern noise feature map corresponding to the sample image from the difference between the sample high-frequency information and the sample denoised information.
  • The model training module 812 is also used to obtain sample videos carrying authenticity labels; sample each sample video to obtain a plurality of sample video frames corresponding to it; crop each sample video frame to a preset size to obtain a sample image; and use the authenticity label carried by the sample video from which the sample image is derived as the authenticity label corresponding to the sample image.
  • The model training module 812 is also used, for each sample video, to use each frame in the sample video as a sample video frame if the specified number of sampling frames is greater than or equal to the total number of frames of the sample video; and, if the specified number of sampling frames is less than the total number of frames of the sample video, to determine a sampling interval according to the total number of frames and the number of sampling frames and extract sample video frames from the sample video according to the sampling interval.
  • The authenticity identification module 810 is also used to identify the image to be detected as a real image if the distribution conforms to the intrinsic pattern noise distribution of real images; and, if the distribution matches the preset intrinsic pattern noise distribution of a forgery type, to identify the image to be detected as a forged image of that forgery type according to the forgery type corresponding to the matching intrinsic pattern noise distribution.
  • The above image authenticity detection device removes the low-frequency information from the image to be detected to obtain the first image information, denoises the first image information to obtain the second image information, and obtains the intrinsic pattern noise feature map corresponding to the image to be detected from the difference between the first image information and the second image information. Because the intrinsic pattern noise in the feature map is inherent noise introduced by the camera sensor, it is not disturbed by image content. The distribution of pattern noise in a real image therefore differs from that in a forged image, and the two do not become hard to tell apart because of interference from image content. Analyzing the distribution of intrinsic pattern noise in the feature map thus yields an accurate authenticity detection result and improves the accuracy of detection.
  • Each module in the above image authenticity detection device can be implemented in whole or in part by software, hardware, or a combination thereof.
  • The above modules can be embedded in, or be independent of, one or more processors of the computer device in the form of hardware, or can be stored in the memory of the computer device in the form of software, so that the one or more processors can invoke and execute the operations corresponding to each module.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure may be as shown in FIG. 10 .
  • the computer device includes one or more processors, memory, and network interfaces connected by a system bus. Wherein, one or more processors of the computer device are used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions and a database.
  • the internal memory provides an environment for the execution of the operating system and computer readable instructions in the non-volatile storage medium.
  • the database of the computer device is used to store model data.
  • the network interface of the computer device is used to communicate with an external terminal via a network connection. When the computer readable instructions are executed by one or more processors, an image authenticity detection method is realized.
  • FIG. 10 is only a block diagram of the part of the structure related to the solution of this application and does not limit the computer device to which the solution is applied.
  • A specific computer device may include more or fewer components than shown in the figure, may combine some components, or may have a different arrangement of components.
  • A computer device is provided, including a memory and one or more processors. Computer-readable instructions are stored in the memory, and the steps in the above method embodiments are implemented when the one or more processors execute the computer-readable instructions.
  • One or more computer-readable storage media are provided, storing computer-readable instructions; when the computer-readable instructions are executed by one or more processors, the steps in the foregoing method embodiments are implemented.
  • A computer program product is provided, comprising computer-readable instructions stored in a computer-readable storage medium.
  • One or more processors of the computer device read the computer-readable instructions from the computer-readable storage medium, and one or more processors execute the computer-readable instructions, so that the computer device executes the steps in the foregoing method embodiments.
  • Non-volatile memory may include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory or optical memory, etc.
  • Volatile memory can include Random Access Memory (RAM) or external cache memory.
  • RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Bioethics (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Finance (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Quality & Reliability (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

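An image authenticity detection method, comprising: acquiring an image to be detected; removing low-frequency information from the image to be detected to obtain first image information; performing noise reduction on the first image information to obtain denoised second image information; obtaining an intrinsic pattern noise feature map corresponding to the image to be detected according to the difference between the first image information and the second image information; and analyzing the distribution of intrinsic pattern noise in the intrinsic pattern noise feature map and performing authenticity identification on the image to be detected according to the distribution, to obtain an authenticity detection result of the image to be detected. The intrinsic pattern noise is inherent noise that is introduced by the camera sensor and is not disturbed by image content.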

Description

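Image authenticity detection method and device, computer device, and storage medium
This application claims priority to Chinese Patent Application No. 202110512723X, filed with the China Patent Office on May 11, 2021 and entitled "Image authenticity detection method and device, computer device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the fields of computer technology and image processing technology, and in particular to an image authenticity detection method and device, a computer device, and a storage medium.
Background
The rapid development of science and technology, and of image processing technology in particular, has made people's lives more convenient, but it has also given rise to criminal practices in which forged images are produced through image tampering or image synthesis in order to seek profit or fabricate fake news. For example, a synthesized face video may be used to pass identity authentication under another person's identity in order to register for software. Therefore, to safeguard people's personal safety, property, and reputation, it is very important to detect the authenticity of images.
Conventional methods generally extract features related to image content to detect image authenticity. However, such features are easily affected by changes in image background, changes in illumination, occlusion of image content, or facial expressions and movements, so it is difficult to ensure that the detection method is robust, and the detection results are not accurate enough.
Summary
In view of this, it is necessary to provide an image authenticity detection method and device, a computer device, and a storage medium that address the above technical problem.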
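An image authenticity detection method, performed by a computer device, the method comprising:
acquiring an image to be detected;
removing low-frequency information from the image to be detected to obtain first image information;
performing noise reduction on the first image information to obtain denoised second image information;
obtaining an intrinsic pattern noise feature map corresponding to the image to be detected according to the difference between the first image information and the second image information; and
analyzing the distribution of intrinsic pattern noise in the intrinsic pattern noise feature map, and performing authenticity identification on the image to be detected according to the distribution, to obtain an authenticity detection result of the image to be detected; the intrinsic pattern noise being inherent noise that is introduced by a camera sensor and is not disturbed by image content.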
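An image authenticity detection device, the device comprising:
an image acquisition module, configured to acquire an image to be detected;
a high-frequency information acquisition module, configured to remove low-frequency information from the image to be detected to obtain first image information;
a noise reduction module, configured to perform noise reduction on the first image information to obtain denoised second image information;
a pattern noise determination module, configured to obtain an intrinsic pattern noise feature map corresponding to the image to be detected according to the difference between the first image information and the second image information; and
an authenticity identification module, configured to analyze the distribution of intrinsic pattern noise in the intrinsic pattern noise feature map and perform authenticity identification on the image to be detected according to the distribution, to obtain an authenticity detection result of the image to be detected; the intrinsic pattern noise being inherent noise that is introduced by a camera sensor and is not disturbed by image content.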
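A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the steps of the image authenticity detection method described in the embodiments of this application.
One or more computer-readable storage media storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the image authenticity detection method described in the embodiments of this application.
A computer program product, comprising computer-readable instructions stored in a computer-readable storage medium; one or more processors of a computer device read the computer-readable instructions from the computer-readable storage medium and execute them, causing the computer device to perform the steps of the image authenticity detection method described in the embodiments of this application.
Details of one or more embodiments of this application are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of this application will become apparent from the specification, the drawings, and the claims.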
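Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application more clearly, the accompanying drawings needed for the description of the embodiments are briefly introduced below. The drawings described below are merely some embodiments of this application, and a person of ordinary skill in the art can derive other drawings from them without creative effort:
FIG. 1 is a diagram of the application environment of the image authenticity detection method in one embodiment;
FIG. 2 is a schematic flowchart of the image authenticity detection method in one embodiment;
FIG. 3 is a schematic diagram of how pattern noise is introduced during camera capture in one embodiment;
FIG. 4 is a schematic diagram of the overall flow of extracting the intrinsic pattern noise feature map in one embodiment;
FIG. 5 is a schematic diagram of the network architecture of the convolutional neural network in one embodiment;
FIG. 6 is a schematic diagram of the overall flow of authenticity identification of an image to be detected in one embodiment;
FIG. 7 is a schematic diagram of the overall flow of the model training steps in one embodiment;
FIG. 8 is a structural block diagram of the image authenticity detection device in one embodiment;
FIG. 9 is a structural block diagram of the image authenticity detection device in another embodiment;
FIG. 10 is a diagram of the internal structure of a computer device in one embodiment.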
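Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, this application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain this application and are not intended to limit it.
The image authenticity detection method provided in this application can be applied in the application environment shown in FIG. 1, in which a terminal 102 communicates with a server 104 over a network. The terminal 102 may be, but is not limited to, a personal computer, laptop, smartphone, tablet, or portable wearable device. The server 104 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms.
In one embodiment, a user can input an image to be detected using the terminal 102, and the terminal 102 can send the image to the server 104. The server 104 can execute the image authenticity detection method in the embodiments of this application to obtain the authenticity detection result of the image to be detected. Based on that result, the server 104 can decide whether to process the image further and return the processing result to the terminal 102. For example, when the image to be detected is a face image: if the authenticity detection result is that the image is a real image, the server 104 can perform face recognition on it and return the recognition result to the terminal 102; if the result is that the image is a forged image, the server 104 can skip face recognition and return to the terminal 102 the result that the image is forged.
In one embodiment, the image authenticity detection method in the embodiments of this application can be implemented using machine learning methods from the field of artificial intelligence. For example, the step of analyzing the distribution of intrinsic pattern noise in the intrinsic pattern noise feature map and performing authenticity identification on the image to be detected according to the distribution, to obtain the authenticity detection result, can be implemented using a machine learning method.
In one embodiment, the image authenticity detection method in the embodiments of this application also involves computer vision technology. For example, removing low-frequency information from the image to be detected to obtain the first image information, denoising the first image information to obtain the second image information, and obtaining the intrinsic pattern noise feature map corresponding to the image to be detected from the difference between the first and second image information all involve computer vision technology.
In one embodiment, the image authenticity detection method in the embodiments of this application may also involve blockchain technology. For example, the server or terminal that executes the method may be a node in a blockchain system.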
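In one embodiment, as shown in FIG. 2, an image authenticity detection method is provided. The method can be executed by a computer device, where the computer device may include a server and a terminal. It can be understood that the method can be executed by the server or the terminal alone, or jointly by the terminal and the server. The embodiments of this application are described using the example of the method being applied to the server in FIG. 1, and the method includes the following steps:
Step 202: acquire an image to be detected.
The image to be detected is an image whose authenticity needs to be checked.
In one embodiment, the image to be detected may be a face image, that is, an image containing a human face. In other embodiments, the image to be detected may contain other content, such as an image of a person, an object, or a landscape. Any image whose authenticity needs to be checked can serve as the image to be detected, without limitation.
In one embodiment, the image to be detected may be an image extracted from a video to be detected. In one embodiment, the server can sample the video to be detected to obtain multiple video frames, treat them as multiple images to be detected, and execute the method in the embodiments of this application on each of them to obtain the authenticity detection result of each image. In another embodiment, the server can extract a single frame from the video to be detected as the image to be detected and execute the method on it to obtain its authenticity detection result.
In one embodiment, the server may first crop the image to be detected to a preset size and then perform step 204 and the subsequent steps on the cropped image.
In one embodiment, the server may perform center cropping on the image to be detected according to the preset size. Center cropping crops the image to be detected so that the center of the cropped image coincides with the center of the image to be detected.
Specifically, the server may take the center of the image to be detected as the center of the cropped image and crop out an image of the preset size. For example, if the preset size is [224,224], the server can cut at positions 112 pixels from the center of the image to be detected in each of the four directions (up, down, left, and right); the resulting image, centered on the center of the image to be detected and of size [224,224], is the cropped image to be detected. It can be understood that if the image to be detected is a multi-channel image, the preset size has one additional dimension for the number of channels; for example, for a three-channel image the preset size may be [224,224,3].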
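Step 204: remove low-frequency information from the image to be detected to obtain first image information.
Low-frequency information is the part of the image signal that lies in the low-frequency band. The first image information is the image information that remains after the low-frequency information has been removed from the image to be detected.
In one embodiment, the server may apply a domain transform to the image to be detected, separate out the low-frequency information, and remove it to obtain the first image information. In one embodiment, the domain transform may be any one of a wavelet transform, a Fourier transform, and the like.
In another embodiment, the server may filter the image to be detected with a high-pass filter to remove the low-frequency information and obtain the first image information.
In other embodiments, the server may remove the low-frequency information in other ways to obtain the first image information, without limitation.
Step 206: denoise the first image information to obtain denoised second image information.
Denoising is processing that reduces noise in an image. The second image information is the image information obtained by denoising the first image information.
In one embodiment, the server may apply denoising filtering in the spatial domain to the image corresponding to the first image information to obtain the second image information. In another embodiment, the server may apply denoising filtering to the first image information in a transform domain to obtain second image information in that transform domain. In one embodiment, the transform domain may be any one of the wavelet domain, the frequency domain, and the like.
In one embodiment, the server may denoise the first image information with a Wiener filter. In other embodiments, the server may use other filters, such as a mean filter or a median filter.
Step 208: obtain an inherent pattern noise feature map corresponding to the image to be detected according to the difference between the first image information and the second image information.
The inherent pattern noise feature map is an image containing the inherent pattern noise of the image to be detected. Inherent pattern noise is noise that appears in images because of inherent defects in the manufacturing process of the camera sensor. Its main component is the photo response non-uniformity (PRNU) feature.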
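It can be understood that manufacturing imperfections, such as differences in the thickness of the silicon coating between camera sensor pixels, cause small differences in the photosensitivity of the imaging device's photosensitive elements. These differences give the whole sensor array a fixed error distribution, which is implicitly embedded in captured images as a multiplicative factor. The image can be modeled as the combination of a noise-free image and noise:
$$y_{ij} = f_{ij}\,(x_{ij} + \eta_{ij}) + c_{ij} + \varepsilon_{ij}$$
where $y_{ij}$ is the image output by the camera sensor, $x_{ij}$ is the incident light received by the sensor, $f_{ij}$ is the multiplicative photo response non-uniformity noise factor, $\eta_{ij}$ is shot noise, $c_{ij}$ is dark current noise, and $\varepsilon_{ij}$ is additive random noise, with $i = 1, \dots, m$, $j = 1, \dots, n$, and $m \times n$ the resolution of the camera sensor. The photo response non-uniformity feature (the PRNU feature) is implicitly embedded in captured images in the form of the multiplicative factor $f_{ij}$ in this model.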
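As a worked illustration of the sensor model above, the following sketch evaluates $y_{ij} = f_{ij}(x_{ij} + \eta_{ij}) + c_{ij} + \varepsilon_{ij}$ element by element. The function name `sensor_output` and all numeric values are hypothetical, chosen only to show that a uniformly lit scene still yields a non-uniform output once the PRNU factor is applied.

```python
def sensor_output(x, f, eta, c, eps):
    """Evaluate y_ij = f_ij * (x_ij + eta_ij) + c_ij + eps_ij element-wise."""
    return [
        [f_ij * (x_ij + eta_ij) + c_ij + eps_ij
         for x_ij, f_ij, eta_ij, c_ij, eps_ij in zip(*rows)]
        for rows in zip(x, f, eta, c, eps)
    ]


# Hypothetical 2x2 sensor under perfectly uniform illumination.
x = [[100.0, 100.0], [100.0, 100.0]]      # incident light
f = [[1.01, 0.99], [1.00, 1.02]]          # per-pixel PRNU factor (made up)
zeros = [[0.0, 0.0], [0.0, 0.0]]          # other noise terms switched off

y = sensor_output(x, f, zeros, zeros, zeros)
# y is no longer uniform: the sensor's own pattern is imprinted on the image.
```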
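As shown in FIG. 3, when a camera captures an image, the incident light from the real scene passes through the camera sensor 302, which introduces pattern noise 304 (that is, inherent pattern noise), so the final image contains pattern noise. The photo response non-uniformity feature, the main component of pattern noise, is a weak signal concentrated mainly in the high-frequency detail of the image. It is not affected by temperature or humidity, is an inherent property of the sensor, and differs distinguishably from one device to another.
In one embodiment, the server may subtract the second image information from the first image information and obtain the inherent pattern noise feature map corresponding to the image to be detected from the result.
It can be understood that the inherent pattern noise feature map obtained from the difference between the first and second image information is a residual. It contains not only inherent pattern noise but also other noise, such as dark current noise. The method of the embodiments does not need to extract the photo response non-uniformity feature precisely; extracting the inherent pattern noise residual is sufficient, which improves computational efficiency.
In one embodiment, if the image to be detected is a multi-channel image, the server may obtain a single-channel inherent pattern noise feature map for each channel and then merge them into a multi-channel inherent pattern noise feature map corresponding to the image to be detected.
In one embodiment, the server may apply image enhancement to the inherent pattern noise feature map before performing step 210. In one embodiment, the image enhancement may include at least one of zero-mean filtering and Fourier peak suppression. Zero-mean filtering removes linear patterns from the feature map and so prevents bright lines from appearing in it. Fourier peak suppression attenuates regions with large peaks in the frequency-domain representation of the feature map, which removes blocking artifacts and prevents edges from standing out.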
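The text does not spell out how the zero-mean filtering is computed. A common form in the sensor-noise literature, assumed here, subtracts each column's mean and then each row's mean, so that constant offsets along rows and columns (the "linear patterns") vanish. The name `zero_mean` and the sample values are hypothetical.

```python
def zero_mean(noise_map):
    """Subtract every column's mean, then every row's mean.

    Afterwards each row and each column of the result averages to zero.
    """
    rows, cols = len(noise_map), len(noise_map[0])
    col_means = [sum(noise_map[r][c] for r in range(rows)) / rows
                 for c in range(cols)]
    centred = [[noise_map[r][c] - col_means[c] for c in range(cols)]
               for r in range(rows)]
    row_means = [sum(row) / cols for row in centred]
    return [[value - row_means[r] for value in centred[r]]
            for r in range(rows)]


noise_map = [[1.0, 2.0, 3.0],
             [4.0, 6.0, 8.0],
             [0.0, 0.0, 9.0]]
cleaned = zero_mean(noise_map)
```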
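Step 210: analyze the distribution of inherent pattern noise in the inherent pattern noise feature map, and identify the authenticity of the image to be detected according to that distribution to obtain its authenticity detection result. Inherent pattern noise is inherent noise introduced by the camera sensor that is not disturbed by image content.
In one embodiment, the server may use a machine learning method to analyze the distribution of inherent pattern noise in the feature map and identify the authenticity of the image to be detected according to that distribution.
In other embodiments, the server may analyze the distribution in other ways. For example, the server may compare the inherent pattern noise feature map with preset feature map templates, each corresponding to a different authenticity detection result, and take the result associated with the matching template as the authenticity detection result of the image to be detected.
In one embodiment, the authenticity detection result may include at least that the image to be detected is a real image and that the image to be detected is a forged image.
In another embodiment, the authenticity detection result may include at least that the image to be detected is a real image and that the image to be detected is a forged image of a particular forgery type.
In one embodiment, the forgery type may include at least one of a tampering type, a synthesis type, and an artificial-intelligence-generated type. That is, forged images of a forgery type may include at least one of tampered forged images, synthesized forged images, and AI-generated forged images.
A tampered forged image is a forged image obtained by tampering with a real image. A synthesized forged image is a forged image synthesized by computer. An AI-generated forged image is a forged image generated by a machine learning model, for example by generative adversarial networks (GANs).
In one embodiment, if the image to be detected was extracted from a video to be detected and only one frame was extracted, the server may directly use the authenticity detection result of that image as the authenticity detection result of the video.
In another embodiment, if multiple frames were extracted from the video as multiple images to be detected, the server may determine the authenticity detection result of the video from the results of the individual images. In one embodiment, the server may decide according to the proportion of each result among the images. For example, the server may select the result with the largest proportion as the authenticity detection result of the video.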
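Selecting the result with the largest proportion among the frames amounts to a majority vote, which could be sketched as below. The function name `video_result` and the string labels are illustrative assumptions; the text does not say how ties are broken, so this sketch simply keeps the label that was seen first.

```python
from collections import Counter


def video_result(frame_results):
    """Return the most frequent per-frame authenticity result."""
    if not frame_results:
        raise ValueError("no frame results to aggregate")
    # most_common(1) yields [(label, count)] for the top label; on a tie
    # Counter keeps the label that was encountered first.
    return Counter(frame_results).most_common(1)[0][0]


frames = ["real", "forged", "forged", "real", "forged"]
decision = video_result(frames)
```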
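In the image authenticity detection method above, low-frequency information is removed from the image to be detected to obtain first image information, the first image information is denoised to obtain second image information, and the inherent pattern noise feature map is obtained from the difference between the two. Because the inherent pattern noise in the feature map is introduced by the camera sensor and is not disturbed by image content, the pattern noise distributions of real and forged images differ, and the difference is not obscured by image content. Analyzing the distribution of inherent pattern noise in the feature map therefore yields an accurate authenticity detection result for the image to be detected.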
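In one embodiment, removing low-frequency information from the image to be detected to obtain first image information includes: applying a domain transform to the image to be detected in the spatial domain to obtain low-frequency information and high-frequency information in the transform domain; and removing the low-frequency information and obtaining the first image information in the transform domain from the retained high-frequency information.
In one embodiment, the first image information includes the high-frequency information of the image to be detected in the frequency domain. The server may apply a Fourier transform to the image in the spatial domain to obtain its low-frequency and high-frequency information in the frequency domain, and then remove the low-frequency information to obtain the high-frequency information in the frequency domain.
In another embodiment, the first image information includes the wavelet coefficients of the high-frequency components of the image to be detected in the wavelet domain. The server may apply a wavelet transform to the image in the spatial domain to obtain its low-frequency and high-frequency components in the wavelet domain, and then set the wavelet coefficients of the low-frequency component to zero to obtain the wavelet coefficients of the high-frequency components.
In other embodiments, the server may apply other domain transforms to the image in the spatial domain; any transform that can separate out the low-frequency component may be used, without limitation.
In the above embodiments, the server transforms the image to be detected, removes the low-frequency information, and obtains the first image information from the retained high-frequency information, so the low-frequency information is removed accurately. Because inherent pattern noise is concentrated mainly in high-frequency information, accurately determining first image information that contains the high-frequency information lays the groundwork for accurately determining the inherent pattern noise feature map.
In one embodiment, the first image information includes the wavelet coefficients of the high-frequency components in each direction of the image to be detected in the wavelet domain. In this embodiment, removing low-frequency information from the image to be detected to obtain first image information includes: applying a wavelet transform to the image in the spatial domain to decompose it into a low-frequency component and high-frequency components in multiple directions in the wavelet domain; and setting the wavelet coefficients of the low-frequency component to zero to obtain the wavelet coefficients of the high-frequency components in each direction.
Specifically, the server may apply a wavelet transform to the image in the spatial domain to decompose it into a low-frequency component and high-frequency components in multiple directions, set the wavelet coefficients of the low-frequency component to zero, and keep the wavelet coefficients of the high-frequency components in each direction.
In one embodiment, the high-frequency components in multiple directions may include at least one of a horizontal high-frequency component, a vertical high-frequency component, and a diagonal high-frequency component.
In one embodiment, the server may apply a multi-scale wavelet transform to the image in the spatial domain to decompose it into a low-frequency component and high-frequency components at each scale, where the high-frequency components at one scale include high-frequency components in multiple directions. The server may set the wavelet coefficients of the low-frequency component to zero and keep the wavelet coefficients of the high-frequency components in each direction at each scale.
In one embodiment, the number of levels of the multi-scale wavelet transform may be set as needed; for example, a 4-level wavelet transform may be used.
In one embodiment, the wavelet basis used for the wavelet transform may be db4 (Daubechies 4); other wavelet functions may also be chosen as the basis, without limitation.
In the above embodiments, the server uses a wavelet transform to decompose the image to be detected into a low-frequency component and high-frequency components in multiple directions, and then sets the wavelet coefficients of the low-frequency component to zero, which yields accurate wavelet coefficients for the high-frequency components. Because inherent pattern noise is concentrated mainly in high-frequency information, this lays the groundwork for accurately determining the inherent pattern noise feature map.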
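To make the decomposition concrete, the sketch below performs a single-level 2-D wavelet transform, zeroes the low-frequency band, and inverts the transform. It deliberately uses the Haar wavelet rather than the db4 basis mentioned above, because Haar fits in a few lines of plain Python; the function names and the sample image are assumptions, and a practical implementation would more likely rely on a wavelet library and several decomposition levels.

```python
def haar_dwt2(image):
    """One-level 2-D Haar transform (image height and width must be even).

    Returns (LL, LH, HL, HH): the low-frequency band and the horizontal,
    vertical and diagonal high-frequency bands.
    """
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(image), 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            ll_row.append((a + b + d + e) / 2)   # block average (scaled)
            lh_row.append((a + b - d - e) / 2)   # top minus bottom
            hl_row.append((a - b + d - e) / 2)   # left minus right
            hh_row.append((a - b - d + e) / 2)   # diagonal difference
        ll.append(ll_row); lh.append(lh_row)
        hl.append(hl_row); hh.append(hh_row)
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2."""
    image = []
    for r in range(len(ll)):
        top, bottom = [], []
        for c in range(len(ll[0])):
            s, h, v, d = ll[r][c], lh[r][c], hl[r][c], hh[r][c]
            top += [(s + h + v + d) / 2, (s + h - v - d) / 2]
            bottom += [(s - h + v - d) / 2, (s - h - v + d) / 2]
        image += [top, bottom]
    return image


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
ll, lh, hl, hh = haar_dwt2(image)
zero_ll = [[0.0] * len(row) for row in ll]         # discard low frequencies
high_only = haar_idwt2(zero_ll, lh, hl, hh)        # "first image information"
```

Zeroing the low-frequency band and inverting leaves only the detail of each 2x2 block, so a perfectly flat image maps to all zeros.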
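In one embodiment, the second image information includes the high-frequency denoised wavelet coefficients corresponding to the high-frequency components in each direction of the image to be detected in the wavelet domain. In this embodiment, denoising the first image information to obtain second image information includes: denoising the wavelet coefficients of the high-frequency components in each direction to obtain the corresponding high-frequency denoised wavelet coefficients. Obtaining the inherent pattern noise feature map from the difference between the first and second image information includes: obtaining the high-frequency noise wavelet coefficients for each direction from the difference between the wavelet coefficients of the high-frequency component and the high-frequency denoised wavelet coefficients for the same direction; and performing an inverse wavelet transform on the zeroed wavelet coefficients of the low-frequency component together with the high-frequency noise wavelet coefficients for each direction to obtain the inherent pattern noise feature map corresponding to the image to be detected.
High-frequency denoised wavelet coefficients are the wavelet coefficients of the non-noise information in a high-frequency component, obtained by denoising the wavelet coefficients of that component. High-frequency noise wavelet coefficients are the wavelet coefficients of the noise information in a high-frequency component.
In one embodiment, the server may denoise the wavelet coefficients of the high-frequency components in each direction with a Wiener filter to obtain the corresponding high-frequency denoised wavelet coefficients. In other embodiments, the server may use other filters, such as a mean filter or a median filter.
In one embodiment, the server may subtract the high-frequency denoised wavelet coefficients from the wavelet coefficients of the high-frequency component for the same direction to obtain the high-frequency noise wavelet coefficients for each direction.
In one embodiment, where a multi-scale wavelet transform decomposes the image to be detected into a low-frequency component and high-frequency components at each scale, with the components at one scale including multiple directions, the server may denoise the wavelet coefficients of the high-frequency components in each direction at each scale to obtain the corresponding high-frequency denoised wavelet coefficients. The server may then obtain the high-frequency noise wavelet coefficients for each direction at each scale from the difference between the wavelet coefficients and the denoised coefficients for the same scale and direction. Next, the server may perform an inverse wavelet transform on the zeroed low-frequency wavelet coefficients and the high-frequency noise wavelet coefficients for each direction at each scale to obtain the inherent pattern noise feature map.
In the above embodiments, the server obtains the high-frequency noise wavelet coefficients for each direction from the difference between the wavelet coefficients of the high-frequency component and the denoised coefficients, and then performs an inverse wavelet transform together with the zeroed low-frequency coefficients. This accurately yields a feature map containing the inherent pattern noise and so improves detection accuracy. In addition, only the residual needs to be extracted, not the photo response non-uniformity feature itself, which makes image authenticity detection more efficient and convenient.
In one embodiment, applying a wavelet transform to the image to be detected in the spatial domain to decompose it into a low-frequency component and high-frequency components in multiple directions includes: applying a multi-scale wavelet transform to the image to decompose it into a low-frequency component and high-frequency components at each scale, where the high-frequency components at one scale include high-frequency components in multiple directions. In this embodiment, performing the inverse wavelet transform includes: performing an inverse wavelet transform on the zeroed wavelet coefficients of the low-frequency component and the high-frequency noise wavelet coefficients for each direction at each scale to obtain the inherent pattern noise feature map corresponding to the image to be detected.
Specifically, the server may apply a multi-scale wavelet transform to the image in the spatial domain to decompose it into a low-frequency component and high-frequency components at each scale, each scale including multiple directions. The server may then set the wavelet coefficients of the low-frequency component to zero and keep the wavelet coefficients of the high-frequency components in each direction at each scale. Next, the server may denoise those coefficients to obtain the corresponding high-frequency denoised wavelet coefficients, and obtain the high-frequency noise wavelet coefficients from the difference between the two for the same scale and direction. Finally, the server may perform an inverse wavelet transform on the zeroed low-frequency coefficients and the high-frequency noise wavelet coefficients for each direction at each scale to obtain the inherent pattern noise feature map.
In one embodiment, the server may first work within one scale, denoising the high-frequency component in each direction in turn and obtaining the high-frequency noise wavelet coefficients for that direction from the difference between the wavelet coefficients and the denoised coefficients, until every direction at that scale has been processed. The server may then move to the next scale and repeat, and so on, until the high-frequency components in every direction at every scale have been processed.
In the above embodiments, applying a multi-scale wavelet transform to the image in the spatial domain determines the high-frequency information of the image more accurately. Denoising based on accurate high-frequency information improves the accuracy of the denoising and so lays the groundwork for accurately determining the inherent pattern noise feature map.
In one embodiment, denoising the wavelet coefficients of the high-frequency components in each direction includes: estimating the local variance of the non-noise information in the high-frequency component in each direction; and, for the high-frequency component in each direction, applying denoising filtering to its wavelet coefficients according to that local variance to obtain the corresponding high-frequency denoised wavelet coefficients.
Non-noise information is information other than noise information. The local variance characterizes the variance of the pixel values of the pixels in a local region of the image.
In one embodiment, for the high-frequency component in each direction, the server may filter the component with a window of a preset size to obtain a filtering result, and then determine the local variance of the non-noise information from the difference between the filtering result and a preset noise variance.
In another embodiment, for the high-frequency component in each direction, the server may use multiple windows of different sizes to determine an initial local variance of the non-noise information for each window, and then determine the final local variance from those initial local variances.
In one embodiment, for the high-frequency component in each direction, the server may apply denoising filtering to its wavelet coefficients with a Wiener filter according to the local variance of the non-noise information to obtain the corresponding high-frequency denoised wavelet coefficients.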
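The text names a Wiener filter but gives no formula. A common wavelet-domain form, assumed here, scales each coefficient $w$ by $\sigma^2 / (\sigma^2 + \sigma_0^2)$, where $\sigma^2$ is the estimated local variance of the non-noise information and $\sigma_0^2$ is the preset noise variance. The function names and the numbers below are hypothetical.

```python
def wiener_denoise(coeffs, local_var, noise_var):
    """Scale each coefficient by var / (var + noise_var).

    `noise_var` must be positive so the denominator never reaches zero.
    """
    return [[w * v / (v + noise_var) for w, v in zip(w_row, v_row)]
            for w_row, v_row in zip(coeffs, local_var)]


def noise_coefficients(coeffs, denoised):
    """High-frequency noise coefficients: original minus denoised."""
    return [[w - d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(coeffs, denoised)]


coeffs = [[10.0, -4.0]]        # wavelet coefficients of one sub-band
local_var = [[75.0, 0.0]]      # estimated variance of non-noise content
denoised = wiener_denoise(coeffs, local_var, noise_var=25.0)
residual = noise_coefficients(coeffs, denoised)
```

Where the estimated signal variance is zero, the whole coefficient is treated as noise and passes into the residual unchanged; where the signal dominates, only a small share does.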
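It can be understood that the above embodiments are designed on the assumption that the noise component of the image to be detected is zero-mean stationary white Gaussian noise with spatially varying variance.
In the above embodiments, the server estimates the local variance of the non-noise information in the high-frequency component in each direction and then applies denoising filtering to the wavelet coefficients according to that variance to obtain the high-frequency denoised wavelet coefficients. The high-frequency information is thus denoised accurately and the noise is accurately separated from it, which lays the groundwork for obtaining an accurate inherent pattern noise feature map.
In one embodiment, estimating the local variance of the non-noise information in the high-frequency component in each direction includes: for the high-frequency component in each direction, filtering the component through multiple windows of different sizes to obtain a filtering result for each window; determining, from the difference between the filtering result for a window and a preset noise variance, the initial local variance of the non-noise information for that window; and selecting the final local variance of the non-noise information from the initial local variances.
The preset noise variance is a preset variance of the noise information in the high-frequency component. An initial local variance is the local variance of the non-noise information in the high-frequency component under one window. The final local variance is the local variance of the non-noise information that is ultimately determined from the initial local variances under the multiple windows of different sizes.
In one embodiment, for the high-frequency component in each direction, the server may square the value of each pixel in the image corresponding to the component, and then filter the resulting squared map through multiple windows of different sizes to obtain a filtering result for each window.
In one embodiment, the server may subtract the preset noise variance from the filtering result for each window to obtain the initial local variance of the non-noise information for that window.
It can be understood that squaring the pixel values of the high-frequency component and then filtering the squared map through a window gives a filtering result that approximates the local variance of the component, which includes the noise variance. Subtracting the preset noise variance from the filtering result therefore gives the local variance of the non-noise information.
In one embodiment, the server may select the minimum of the initial local variances under the different windows as the final local variance of the non-noise information in the high-frequency component.
In one embodiment, the sizes and the number of the windows may be set as needed. For example, four windows of sizes 3×3, 5×5, 7×7, and 9×9 may be used.
In one embodiment, the preset noise variance may be set as needed. For example, the preset noise standard deviation may be set to 5, that is, a preset noise variance of 25.
In the above embodiments, determining the local variance of the non-noise information through multiple windows of different sizes avoids the inaccuracy of a single window and yields a more accurate local variance, which improves the accuracy of the denoising filtering applied to the wavelet coefficients of the high-frequency component.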
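The multi-window estimate described above could look like the following sketch, which averages the squared coefficients over each window, subtracts the preset noise variance, and keeps the minimum over the window sizes. Two details are assumptions not stated in the text: negative differences are clamped to zero (a variance cannot be negative), and windows are clipped at the sub-band border. The function name is hypothetical; the default window sizes and noise variance are the example values from the text.

```python
def local_variance(coeffs, window_sizes=(3, 5, 7, 9), noise_var=25.0):
    """Estimate the local variance of the non-noise part of a sub-band."""
    rows, cols = len(coeffs), len(coeffs[0])
    squared = [[w * w for w in row] for row in coeffs]
    final = [[float("inf")] * cols for _ in range(rows)]
    for size in window_sizes:
        half = size // 2
        for r in range(rows):
            for c in range(cols):
                # Window bounds, clipped at the border (assumption).
                r0, r1 = max(0, r - half), min(rows, r + half + 1)
                c0, c1 = max(0, c - half), min(cols, c + half + 1)
                window = [squared[i][j]
                          for i in range(r0, r1) for j in range(c0, c1)]
                # Mean of squares minus noise variance, clamped at zero
                # (assumption), gives the initial local variance.
                initial = max(0.0, sum(window) / len(window) - noise_var)
                # The final estimate is the minimum over all windows.
                final[r][c] = min(final[r][c], initial)
    return final


flat_band = [[10.0] * 5 for _ in range(5)]   # every coefficient equals 10
variance = local_variance(flat_band)
```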
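FIG. 4 shows the overall flow of determining the inherent pattern noise feature map corresponding to the image to be detected in the above embodiments, which includes the following steps:
Step 402: obtain the image to be detected.
Step 404: apply a multi-scale wavelet transform to a single-channel image of the image to be detected, decomposing it into a low-frequency component and high-frequency components in each direction at each scale.
Step 406: set the wavelet coefficients of the low-frequency component to zero to obtain the wavelet coefficients of the high-frequency components in each direction at each scale.
Step 408: for the horizontal high-frequency component at a single scale, estimate the local variance of the non-noise information, apply denoising filtering according to the local variance to obtain the high-frequency denoised wavelet coefficients, and subtract those coefficients from the horizontal high-frequency component to obtain the corresponding high-frequency noise wavelet coefficients.
Step 410: for the vertical high-frequency component at a single scale, estimate the local variance of the non-noise information, apply denoising filtering according to the local variance to obtain the high-frequency denoised wavelet coefficients, and subtract those coefficients from the vertical high-frequency component to obtain the corresponding high-frequency noise wavelet coefficients.
Step 412: for the diagonal high-frequency component at a single scale, estimate the local variance of the non-noise information, apply denoising filtering according to the local variance to obtain the high-frequency denoised wavelet coefficients, and subtract those coefficients from the diagonal high-frequency component to obtain the corresponding high-frequency noise wavelet coefficients.
Determine whether the high-frequency components at all scales have been denoised. If not, return to step 408 and continue. If so, perform step 414.
Step 414: perform an inverse wavelet transform on the zeroed wavelet coefficients of the low-frequency component for the single channel and the high-frequency noise wavelet coefficients for each direction at each scale to obtain an inherent pattern noise feature map.
Determine whether all channels have been processed. If not, return to step 404 and continue. If so, perform step 416.
Step 416: obtain the inherent pattern noise feature map corresponding to the image to be detected from the single-channel inherent pattern noise feature maps.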
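In one embodiment, analyzing the distribution of inherent pattern noise in the feature map and identifying the authenticity of the image to be detected according to that distribution includes: inputting the inherent pattern noise feature map into a pre-trained authenticity detection model; and, through the authenticity detection model, analyzing the distribution of inherent pattern noise in the feature map and identifying the authenticity of the image to be detected according to that distribution to obtain its authenticity detection result.
In one embodiment, the authenticity detection model may be a machine learning model.
In one embodiment, the authenticity detection model may be a neural network model. In one embodiment, it may be a convolutional neural network model.
In one embodiment, the authenticity detection model may be an EfficientNet convolutional neural network model. In one embodiment, taking into account model accuracy, storage footprint, and training time, it may be an EfficientNet-B4 model (one version in the EfficientNet family). Convolutional neural networks of the EfficientNet family perform well in image classification, so such a network can be chosen to classify the authenticity of the image to be detected and obtain an accurate authenticity detection result.
FIG. 5 shows the network architecture of the EfficientNet-B4 convolutional neural network model. In the figure, "×2" means the module in the corresponding dashed box is repeated 2 times, "×4" means it is repeated 4 times, and "×6" means it is repeated 6 times. In other embodiments, the authenticity detection model may be a convolutional neural network model with another network structure.
It can be understood that the photo response non-uniformity feature, the main component of inherent pattern noise, is an inherent property of the camera sensor and is not very sensitive to image content. Producing a forged image by tampering, synthesis, AI generation, or similar means therefore damages, to some extent, the photo response non-uniformity feature of the original real image, so the distribution of this feature differs between forged and real images. The server can accordingly use a pre-trained authenticity detection model to analyze the distribution of inherent pattern noise in the feature map and determine the authenticity detection result. The model can learn the difference between the inherent pattern noise distributions of forged and real images during training, so the pre-trained model can produce an accurate authenticity detection result.
In one embodiment, the server may also input the inherent pattern noise feature map together with features related to image content into the neural network model to obtain the authenticity detection result of the image. For example, in a face image, the content-related features may be at least one of the positions of facial features, skin color, and facial contour.
In the above embodiments, the server uses the authenticity detection model to analyze the distribution of inherent pattern noise in the feature map and identify the authenticity of the image accordingly, which improves detection accuracy. Using the EfficientNet-B4 convolutional neural network for authenticity identification improves accuracy and also improves the efficiency of model training.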
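FIG. 6 shows the overall flow of obtaining the authenticity detection result of the image to be detected in the above embodiments, which includes the following steps:
Step 602: obtain the image to be detected.
Step 604: crop the image to be detected.
Step 606: extract inherent pattern noise from the cropped image to obtain an inherent pattern noise feature map.
Step 608: input the inherent pattern noise feature map into the pre-trained authenticity detection model.
Step 610: analyze the distribution of inherent pattern noise through the authenticity detection model to obtain the authenticity detection result of the image to be detected.
Step 606 includes the following steps:
Step 6061: apply a multi-scale wavelet transform to the image to be detected.
Step 6062: set the wavelet coefficients of the low-frequency component to zero.
Step 6063: estimate the local variance of the non-noise information in each high-frequency component.
Step 6064: apply denoising filtering to the wavelet coefficients of each high-frequency component according to the local variance to obtain the high-frequency denoised wavelet coefficients.
Step 6065: obtain the high-frequency noise wavelet coefficients from the difference between the wavelet coefficients of the high-frequency component and the high-frequency denoised wavelet coefficients.
Step 6066: perform an inverse wavelet transform on the zeroed wavelet coefficients of the low-frequency component and the high-frequency noise wavelet coefficients of each high-frequency component to obtain the inherent pattern noise feature map.
Step 6067: apply image enhancement to the inherent pattern noise feature map.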
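In one embodiment, the authenticity detection model is obtained through model training steps, which include: obtaining sample images and the authenticity labels corresponding to them; extracting inherent pattern noise from each sample image to obtain a sample inherent pattern noise feature map; inputting the sample feature map into the authenticity detection model to be trained, analyzing through the model the distribution of inherent pattern noise in the sample feature map, and identifying the authenticity of the corresponding sample image according to that distribution to obtain its authenticity detection result; and iteratively adjusting the model parameters of the model to be trained according to the difference between the authenticity detection result of the sample image and its authenticity label until an iteration stop condition is met, yielding the trained authenticity detection model.
The authenticity label represents the actual authenticity of the sample image. The sample inherent pattern noise feature map is an image containing the inherent pattern noise of the sample image. The authenticity detection result of the sample image is the authenticity identified by the model to be trained during training.
In one embodiment, the authenticity label may include that the sample image is a real image and that the sample image is a forged image. The authenticity detection result of the sample image may likewise include that the sample image is a real image and that it is a forged image.
In another embodiment, the authenticity label may include that the sample image is a real image and that the sample image is a forged image of a particular forgery type. The authenticity detection result of the sample image may likewise include that the sample image is a real image and that it is a forged image of a particular forgery type.
In one embodiment, the authenticity label of a sample image may be obtained by manually labeling the sample image. In another embodiment, the sample image may be extracted from a sample video, and its authenticity label may be the authenticity label of the sample video it comes from, which may be obtained by manually labeling the sample video.
Specifically, in each iteration, the server may adjust the model parameters of the model to be trained according to the difference between the authenticity detection result of the sample image and its authenticity label, and then enter the next iteration, until the difference meets the iteration stop condition, yielding the trained authenticity detection model.
In one embodiment, the server may use a cross-entropy loss function as the objective function during training to measure the difference. The iteration stop condition may be that the loss value of the objective function converges.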
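For a single sample, the cross-entropy objective mentioned above can be computed as in the sketch below. It assumes the model emits one raw score (logit) per class and that the label is a class index, neither of which is specified in the text; the function names are illustrative.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)                        # subtract max for stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability assigned to the true class index."""
    return -math.log(softmax(logits)[label])


# Two classes: index 0 = real, index 1 = forged (an assumed encoding).
undecided = cross_entropy([0.0, 0.0], label=0)    # equals log(2)
confident_right = cross_entropy([5.0, 0.0], label=0)
confident_wrong = cross_entropy([5.0, 0.0], label=1)
```

The loss is small when the model assigns high probability to the labeled class and grows as that probability falls, which is what drives the parameter updates in each iteration.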
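In the above embodiments, the server iteratively trains the authenticity detection model on sample images and their authenticity labels, producing a model that can analyze the distribution of inherent pattern noise in a sample feature map and accurately obtain an authenticity detection result, which improves the accuracy of image authenticity detection.
In one embodiment, extracting inherent pattern noise from the sample image to obtain the sample inherent pattern noise feature map includes: removing low-frequency information from the sample image to obtain sample high-frequency information; denoising the sample high-frequency information to obtain sample denoised information; and obtaining the sample inherent pattern noise feature map corresponding to the sample image from the difference between the sample high-frequency information and the sample denoised information.
The sample high-frequency information is the image information that remains after the low-frequency information has been removed from the sample image. The sample denoised information is the image information obtained by denoising the sample high-frequency information.
In one embodiment, the server may apply a domain transform to the sample image, separate out the low-frequency information, and remove it to obtain the sample high-frequency information. In one embodiment, the server may apply a wavelet transform to the sample image to decompose it into a sample low-frequency component and multiple sample high-frequency components, and then set the wavelet coefficients of the sample low-frequency component to zero to obtain the wavelet coefficients of each sample high-frequency component.
In another embodiment, the server may filter the sample image with a high-pass filter to remove the low-frequency information and obtain the sample high-frequency information.
In other embodiments, the server may remove the low-frequency information from the sample image in other ways, without limitation.
In one embodiment, the server may apply denoising filtering to the sample high-frequency information in the spatial domain to obtain the sample denoised information. In another embodiment, the server may do so in a transform domain to obtain sample denoised information in that domain. In one embodiment, the transform domain may be any one of the wavelet domain, the frequency domain, and the like.
In one embodiment, the server may denoise the sample high-frequency information with a Wiener filter. In other embodiments, the server may use other filters, such as a mean filter or a median filter.
In one embodiment, the server may apply denoising filtering separately to the wavelet coefficients of each sample high-frequency component of the sample image in the wavelet domain to obtain the corresponding sample high-frequency denoised wavelet coefficients. In one embodiment, the server may estimate the local variance of the non-noise information in each sample high-frequency component and then apply denoising filtering to its wavelet coefficients according to that local variance.
In one embodiment, the server may subtract the sample denoised information from the sample high-frequency information to obtain the sample inherent pattern noise feature map corresponding to the sample image.
In one embodiment, the server may obtain the sample high-frequency noise wavelet coefficients from the difference between the wavelet coefficients of the sample high-frequency component and the sample high-frequency denoised wavelet coefficients. The server may then perform an inverse wavelet transform on the zeroed wavelet coefficients of the sample low-frequency component and the sample high-frequency noise wavelet coefficients of each sample high-frequency component to obtain the sample inherent pattern noise feature map.
In one embodiment, the server may subtract the sample high-frequency denoised wavelet coefficients from the wavelet coefficients of the sample high-frequency component to obtain the sample high-frequency noise wavelet coefficients.
It can be understood that the specific embodiments of extracting inherent pattern noise from a sample image are similar to those of obtaining the inherent pattern noise feature map for an image to be detected. The embodiments above are brief examples only, and the details are not repeated.
In the above embodiments, the server removes the low-frequency information from the sample image to obtain the sample high-frequency information and then denoises it to obtain the sample denoised information. Because inherent pattern noise lies mainly in high-frequency information, the difference between the sample high-frequency information and the sample denoised information accurately yields the sample inherent pattern noise feature map.
In one embodiment, obtaining sample images and their authenticity labels includes: obtaining sample videos that carry authenticity labels; sampling each sample video to obtain multiple sample video frames; and cropping each sample video frame to a preset size to obtain a sample image, using the authenticity label carried by the source sample video as the authenticity label of the sample image.
A sample video frame is a single frame of a sample video.
In one embodiment, the server may divide the obtained videos evenly and randomly into a sample set and a test set. Videos in the sample set serve as sample videos for model training. Videos in the test set serve as test videos for evaluating the trained authenticity detection model after training. Specifically, the videos may be grouped into video sets by video type; the server may randomly draw a first preset number of videos from the set of each type into the sample set, and then randomly draw a second preset number of videos from the set of each type into the test set. In one embodiment, the video types may include a real video type and multiple forgery types.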
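The per-type random split described above could be sketched as follows. The function name, the fixed seed, and the video identifiers are hypothetical; shuffling each type's list once and slicing it is one simple way to guarantee that the two draws do not overlap.

```python
import random


def split_videos(videos_by_type, n_train, n_test, seed=0):
    """Draw n_train and n_test videos from every type without overlap."""
    rng = random.Random(seed)           # seeded only for reproducibility
    train, test = [], []
    for video_type, videos in sorted(videos_by_type.items()):
        if len(videos) < n_train + n_test:
            raise ValueError(f"not enough videos of type {video_type!r}")
        shuffled = list(videos)
        rng.shuffle(shuffled)
        train += shuffled[:n_train]                    # sample (training) set
        test += shuffled[n_train:n_train + n_test]     # test set
    return train, test


videos = {
    "real": ["r1", "r2", "r3", "r4"],
    "synthesized": ["s1", "s2", "s3", "s4"],
}
train_set, test_set = split_videos(videos, n_train=2, n_test=1)
```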
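Specifically, each sample video may be manually given an authenticity label according to its actual authenticity. The server may obtain the labeled sample videos and sample each one according to a preset sampling rule to obtain multiple sample video frames. The server may then crop each sample video frame to a preset size to obtain a sample image and use the authenticity label carried by the source sample video as the authenticity label of the sample image.
In one embodiment, there are multiple sample videos.
In one embodiment, the authenticity label may include that the sample video is a real video and that the sample video is a forged video. Specifically, if the label carried by the source sample video is that the video is real, the label of the sample image is that the image is real. If the label carried by the source sample video is that the video is forged, the label of the sample image is that the image is forged.
In another embodiment, the authenticity label may include that the sample video is a real video and that the sample video is a forged video of a particular forgery type. Specifically, if the label carried by the source sample video is that the video is real, the label of the sample image is that the image is real. If the label carried by the source sample video is that the video is a forged video of a forgery type, the label of the sample image is that the image is a forged image of that forgery type.
In one embodiment, the server may center-crop each sample video frame to a preset size. Specifically, the server may take the center of the sample video frame as the center of the cropped sample image and crop out a sample image of the preset size. For example, if the preset size is [224, 224], the server may crop at positions 112 pixels from the center in each of the up, down, left, and right directions; the resulting image, centered on the center of the frame and of size [224, 224], is the sample image. If the sample video frame is a multi-channel image, the preset size has one additional dimension for the number of channels; for a three-channel frame, the preset size may be [224, 224, 3].
In the above embodiments, the server extracts sample images from sample videos and uses the authenticity label carried by each video as the label of its sample images, so sample images can be obtained from videos. In addition, cropping the sample video frames ensures that all sample images used in training have the same size and reduces their size, which lowers the time cost and improves training efficiency. Cropping also avoids the damage that scaling would do to the inherent pattern noise between pixels of the sample image.
In one embodiment, sampling the sample video to obtain multiple sample video frames includes: for each sample video, if the specified number of sampled frames is greater than or equal to the total number of frames of the sample video, using every frame of the sample video as a sample video frame.
Specifically, for each sample video, the server may determine its total number of frames and compare it with the specified number of sampled frames. If the specified number of sampled frames is greater than or equal to the total number of frames, the server may use every frame of the sample video as a sample video frame.
In the above embodiments, for each sample video, the server samples according to the rule that corresponds to the number of sampled frames and the total number of frames, so an appropriate number of sample images is taken evenly from the sample video. Training on the sampled images avoids the large time cost of training directly on every frame of the sample video and improves training efficiency.
In one embodiment, sampling the sample video to obtain multiple sample video frames includes: for each sample video, if the specified number of sampled frames is less than the total number of frames of the sample video, determining a sampling interval from the total number of frames and the number of sampled frames, and extracting sample video frames from the sample video at that interval.
Specifically, for each sample video, the server may determine its total number of frames and compare it with the specified number of sampled frames. If the specified number of sampled frames is less than the total number of frames, the server may determine a sampling interval from the two numbers and extract sample video frames from the sample video at that interval.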
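In one embodiment, the server may take the ratio of the total number of frames to the number of sampled frames as the sampling interval and extract sample video frames from the sample video at that interval. In one embodiment, the server may sample the sample video according to the following sampling rule:
[], if frame_num ≤ 0 or sample_num ≤ 0;
[0, …, k, …, frame_num-1], if 0 < frame_num ≤ sample_num;
[0, …, int(k*(frame_num/sample_num)), …, int((sample_num-1)*(frame_num/sample_num))], if frame_num > sample_num;
where frame_num is the total number of frames of the sample video, sample_num is the specified number of sampled frames, and k is the index of an extracted sample video frame.
It can be understood that, in this sampling rule, when frame_num ≤ 0 or sample_num ≤ 0 the rule is [], meaning that the sample video is not sampled when the total number of frames or the number of sampled frames is less than or equal to 0. When frame_num ≤ sample_num, that is, when the specified number of sampled frames is greater than or equal to the total number of frames, frame 0, …, frame k, …, and frame frame_num-1 are extracted in order; in other words, every frame of the sample video is used as a sample video frame. When frame_num > sample_num, that is, when the specified number of sampled frames is less than the total number of frames, the ratio frame_num/sample_num of the total number of frames to the number of sampled frames is taken as the sampling interval, and sample video frames are extracted from the sample video at that interval.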
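The three cases of the sampling rule can be sketched in a few lines. One point is an assumption: in the last case k is taken to run from 0 to sample_num-1, so that exactly sample_num frames are chosen and every index stays within 0 … frame_num-1. The function name is illustrative.

```python
def sample_indices(frame_num, sample_num):
    """Return the indices of the frames to extract from one video."""
    if frame_num <= 0 or sample_num <= 0:
        return []                              # nothing to sample
    if frame_num <= sample_num:
        return list(range(frame_num))          # keep every frame
    interval = frame_num / sample_num          # fractional sampling interval
    return [int(k * interval) for k in range(sample_num)]
```

For instance, `sample_indices(10, 3)` gives `[0, 3, 6]`, while a 5-frame video sampled with `sample_num = 10` keeps all five frames.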
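In one embodiment, the specified number of sampled frames may be set as needed. For example, it may be set to 10 frames.
In the above embodiments, for each sample video, the server samples according to the rule that corresponds to the number of sampled frames and the total number of frames, so an appropriate number of sample images is taken evenly from the sample video. Training on the sampled images avoids the large time cost of training directly on every frame of the sample video and improves training efficiency.
FIG. 7 shows the overall flow of the model training steps for the authenticity detection model in the above embodiments, which includes the following steps:
Step 702: obtain sample videos that carry authenticity labels.
Step 704: sample each sample video to obtain multiple sample video frames.
Step 706: crop the sample video frames to obtain sample images.
Step 708: extract inherent pattern noise from each sample image to obtain a sample inherent pattern noise feature map.
Step 710: input the sample inherent pattern noise feature map into the authenticity detection model to be trained.
Step 712: iteratively adjust the model parameters according to the difference between the authenticity detection result output by the model for the sample image and the authenticity label, until the iteration stop condition is met, yielding the trained authenticity detection model.
Step 708 includes the following steps:
Step 7081: apply a multi-scale wavelet transform to the sample image.
Step 7082: set the wavelet coefficients of the low-frequency component to zero.
Step 7083: estimate the local variance of the non-noise information in each high-frequency component.
Step 7084: apply denoising filtering to the wavelet coefficients of each high-frequency component according to the local variance to obtain the high-frequency denoised wavelet coefficients.
Step 7085: obtain the high-frequency noise wavelet coefficients from the difference between the wavelet coefficients of the high-frequency component and the high-frequency denoised wavelet coefficients.
Step 7086: perform an inverse wavelet transform on the zeroed wavelet coefficients of the low-frequency component and the high-frequency noise wavelet coefficients of each high-frequency component to obtain the sample inherent pattern noise feature map.
Step 7087: apply image enhancement to the sample inherent pattern noise feature map.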
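In one embodiment, identifying the authenticity of the image to be detected according to the distribution includes: if the distribution conforms to the inherent pattern noise distribution of real images, identifying the image to be detected as a real image.
Specifically, the server may use the pre-trained authenticity detection model to analyze the distribution of inherent pattern noise in the feature map and determine which inherent pattern noise distribution it conforms to. If the distribution conforms to the inherent pattern noise distribution of real images, the model may identify the image to be detected as a real image.
In the above embodiments, because the pattern noise distributions of real and forged images differ, and the difference is not obscured by image content, the server can accurately identify whether the image to be detected is a real image from the distribution of inherent pattern noise in the feature map.
In one embodiment, identifying the authenticity of the image to be detected according to the distribution includes: if the distribution matches the inherent pattern noise distribution of a preset forgery type, identifying the image to be detected as a forged image of the forgery type that corresponds to the matching distribution.
Specifically, the server may use the pre-trained authenticity detection model to analyze the distribution of inherent pattern noise in the feature map and determine which inherent pattern noise distribution it conforms to. If the distribution matches the inherent pattern noise distribution of a preset forgery type, the model may identify the image to be detected as a forged image of the corresponding forgery type.
For example, if the distribution matches the inherent pattern noise distribution of the synthesis type, the authenticity detection model may identify the image to be detected as a synthesized forged image.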
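When the model is a classifier, "matching" a distribution typically reduces to picking the class with the highest score. The sketch below assumes the model outputs one score per class in a fixed order; the label strings and the function name `decode_result` are hypothetical, with the forgery types taken from the categories listed earlier in the description.

```python
# Assumed class order of the model output (hypothetical).
LABELS = ("real", "tampered", "synthesized", "ai_generated")


def decode_result(scores, labels=LABELS):
    """Map per-class scores to an authenticity detection result."""
    if len(scores) != len(labels):
        raise ValueError("one score per label is required")
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best]


result = decode_result([0.10, 0.20, 0.60, 0.10])
is_forged = result != "real"
```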
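In the above embodiments, because the pattern noise distributions of real and forged images differ, and the difference is not obscured by image content, the server can accurately identify whether the image to be detected is a forged image of a forgery type from the distribution of inherent pattern noise in the feature map. Moreover, when the image to be detected is forged, the server can identify not only that it is forged but also its forgery type, which makes the result more informative.
This application also provides an application scenario for face image security checks, in which the image authenticity detection method above is applied as follows:
When a user performs face recognition through software or an application (app) on a terminal, the terminal may capture the user's face image and send it to the server for face recognition. However, some illegal activities replace the user's face image with a forged image, so the image that finally reaches the server for face recognition may be a forged face image rather than the user's real one. Therefore, before performing face recognition, the server may first perform the image authenticity detection method of the embodiments on the face image to be detected to obtain an authenticity detection result. If the face image is identified as real, the server may go on to perform face recognition on it. If the face image is identified as forged, the server may skip face recognition and report to the terminal that a forged image was identified. This effectively protects the user's personal safety, property, and reputation and improves the security of the face recognition process.
This application further provides an application scenario for image authentication, in which the image authenticity detection method above is applied as follows:
Where the authenticity of an image needs to be verified, for example for pictures in the news or pictures on illegal websites, a user may input the image to be detected through a terminal, and the terminal may send it to the server. The server may perform the image authenticity detection method of the embodiments to obtain the authenticity detection result and return it to the terminal for display, making it easier for the user to verify the image.
It should be understood that although the steps in the flowcharts are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, there is no strict order restriction on these steps, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.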
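In one embodiment, as shown in FIG. 8, an image authenticity detection apparatus 800 is provided. The apparatus may be part of a computer device in the form of a software module, a hardware module, or a combination of the two, and includes an image obtaining module 802, a high-frequency information obtaining module 804, a denoising module 806, a pattern noise determining module 808, and an authenticity identification module 810, where:
The image obtaining module 802 is configured to obtain an image to be detected.
The high-frequency information obtaining module 804 is configured to remove low-frequency information from the image to be detected to obtain first image information.
The denoising module 806 is configured to denoise the first image information to obtain denoised second image information.
The pattern noise determining module 808 is configured to obtain an inherent pattern noise feature map corresponding to the image to be detected according to the difference between the first image information and the second image information.
The authenticity identification module 810 is configured to analyze the distribution of inherent pattern noise in the inherent pattern noise feature map and identify the authenticity of the image to be detected according to that distribution to obtain its authenticity detection result. Inherent pattern noise is inherent noise introduced by the camera sensor that is not disturbed by image content.
In one embodiment, the first image information includes the wavelet coefficients of the high-frequency components in each direction of the image to be detected in the wavelet domain. In this embodiment, the high-frequency information obtaining module 804 is further configured to apply a wavelet transform to the image in the spatial domain to decompose it into a low-frequency component and high-frequency components in multiple directions in the wavelet domain, and to set the wavelet coefficients of the low-frequency component to zero to obtain the wavelet coefficients of the high-frequency components in each direction.
In one embodiment, the second image information includes the high-frequency denoised wavelet coefficients corresponding to the high-frequency components in each direction of the image to be detected in the wavelet domain. In this embodiment, the denoising module 806 is further configured to denoise the wavelet coefficients of the high-frequency components in each direction to obtain the corresponding high-frequency denoised wavelet coefficients. The pattern noise determining module 808 is further configured to obtain the high-frequency noise wavelet coefficients for each direction from the difference between the wavelet coefficients of the high-frequency component and the high-frequency denoised wavelet coefficients for the same direction, and to perform an inverse wavelet transform on the zeroed wavelet coefficients of the low-frequency component and the high-frequency noise wavelet coefficients for each direction to obtain the inherent pattern noise feature map corresponding to the image to be detected.
In one embodiment, the high-frequency information obtaining module 804 is further configured to apply a multi-scale wavelet transform to the image in the spatial domain to decompose it into a low-frequency component and high-frequency components at each scale, where the high-frequency components at one scale include high-frequency components in multiple directions. In this embodiment, the pattern noise determining module 808 is further configured to perform an inverse wavelet transform on the zeroed wavelet coefficients of the low-frequency component and the high-frequency noise wavelet coefficients for each direction at each scale to obtain the inherent pattern noise feature map.
In one embodiment, the denoising module 806 is further configured to estimate the local variance of the non-noise information in the high-frequency component in each direction and, for the high-frequency component in each direction, to apply denoising filtering to its wavelet coefficients according to that local variance to obtain the corresponding high-frequency denoised wavelet coefficients.
In one embodiment, the denoising module 806 is further configured to: for the high-frequency component in each direction, filter the component through multiple windows of different sizes to obtain a filtering result for each window; determine, from the difference between the filtering result for a window and a preset noise variance, the initial local variance of the non-noise information for that window; and select the final local variance of the non-noise information from the initial local variances.
In one embodiment, the authenticity identification module 810 is further configured to input the inherent pattern noise feature map into a pre-trained authenticity detection model and, through the model, to analyze the distribution of inherent pattern noise in the feature map and identify the authenticity of the image to be detected according to that distribution to obtain its authenticity detection result.
In one embodiment, the image authenticity detection apparatus further includes:
A model training module 812, configured to: obtain sample images and the authenticity labels corresponding to them; extract inherent pattern noise from each sample image to obtain a sample inherent pattern noise feature map; input the sample feature map into the authenticity detection model to be trained, analyze through the model the distribution of inherent pattern noise in the sample feature map, and identify the authenticity of the corresponding sample image according to that distribution to obtain its authenticity detection result; and iteratively adjust the model parameters of the model to be trained according to the difference between the authenticity detection result of the sample image and its authenticity label until an iteration stop condition is met, yielding the trained authenticity detection model.
In one embodiment, the model training module 812 is further configured to remove low-frequency information from the sample image to obtain sample high-frequency information; denoise the sample high-frequency information to obtain sample denoised information; and obtain the sample inherent pattern noise feature map from the difference between the sample high-frequency information and the sample denoised information.
In one embodiment, the model training module 812 is further configured to obtain sample videos that carry authenticity labels; sample each sample video to obtain multiple sample video frames; and crop each sample video frame to a preset size to obtain a sample image, using the authenticity label carried by the source sample video as the authenticity label of the sample image.
In one embodiment, the model training module 812 is further configured to: for each sample video, if the specified number of sampled frames is greater than or equal to the total number of frames of the sample video, use every frame of the sample video as a sample video frame; and if the specified number of sampled frames is less than the total number of frames, determine a sampling interval from the total number of frames and the number of sampled frames and extract sample video frames from the sample video at that interval.
In one embodiment, the authenticity identification module 810 is further configured to: if the distribution conforms to the inherent pattern noise distribution of real images, identify the image to be detected as a real image; and if the distribution matches the inherent pattern noise distribution of a preset forgery type, identify the image to be detected as a forged image of the forgery type that corresponds to the matching distribution.
The image authenticity detection apparatus above removes low-frequency information from the image to be detected to obtain first image information, denoises the first image information to obtain second image information, and obtains the inherent pattern noise feature map from the difference between the two. Because the inherent pattern noise in the feature map is introduced by the camera sensor and is not disturbed by image content, the pattern noise distributions of real and forged images differ, and the difference is not obscured by image content. Analyzing the distribution of inherent pattern noise in the feature map therefore yields an accurate authenticity detection result for the image to be detected.
For the specific limitations of the image authenticity detection apparatus, refer to the limitations of the image authenticity detection method above, which are not repeated here. Each module of the apparatus may be implemented wholly or partly in software, hardware, or a combination of the two. The modules may be embedded in or independent of one or more processors of a computer device in hardware form, or stored in a memory of the computer device in software form, so that the one or more processors can invoke them to perform the operations corresponding to each module.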
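In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 10. The computer device includes one or more processors, a memory, and a network interface connected through a system bus. The one or more processors provide computing and control capabilities. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database stores model data. The network interface is used to communicate with external terminals over a network connection. When executed by the one or more processors, the computer-readable instructions implement an image authenticity detection method.
Those skilled in the art will understand that the structure shown in FIG. 10 is only a block diagram of the part of the structure related to the solution of this application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown, combine certain components, or arrange the components differently.
In one embodiment, a computer device is also provided, including a memory and one or more processors. The memory stores computer-readable instructions, and the one or more processors implement the steps of the method embodiments above when executing the computer-readable instructions.
In one embodiment, one or more computer-readable storage media are provided, storing computer-readable instructions that, when executed by one or more processors, implement the steps of the method embodiments above.
In one embodiment, a computer program product is provided. The computer program product includes computer-readable instructions stored in a computer-readable storage medium. One or more processors of a computer device read the computer-readable instructions from the storage medium and execute them, causing the computer device to perform the steps of the method embodiments above.
Those of ordinary skill in the art will understand that all or part of the flows of the method embodiments above may be completed by computer-readable instructions directing the relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium and, when executed, may include the flows of the method embodiments above. Any reference to memory, storage, database, or other media in the embodiments provided in this application may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, and the like. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM may take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations are described. However, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of this application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make several variations and improvements without departing from the concept of this application, all of which fall within its scope of protection. The scope of protection of this patent application shall therefore be subject to the appended claims.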

Claims (18)

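  1. An image authenticity detection method, characterized in that it is performed by a computer device, the method comprising:
    obtaining an image to be detected;
    removing low-frequency information from the image to be detected to obtain first image information;
    denoising the first image information to obtain denoised second image information;
    obtaining an inherent pattern noise feature map corresponding to the image to be detected according to a difference between the first image information and the second image information; and
    analyzing a distribution of inherent pattern noise in the inherent pattern noise feature map, and identifying the authenticity of the image to be detected according to the distribution to obtain an authenticity detection result of the image to be detected; the inherent pattern noise being inherent noise that is introduced by a camera sensor and is not disturbed by image content.
  2. The method according to claim 1, characterized in that the first image information comprises wavelet coefficients of high-frequency components in each direction of the image to be detected in a wavelet domain; and the removing low-frequency information from the image to be detected to obtain first image information comprises:
    applying a wavelet transform to the image to be detected in a spatial domain to decompose the image to be detected into a low-frequency component and high-frequency components in multiple directions in the wavelet domain; and
    setting wavelet coefficients of the low-frequency component to zero to obtain the wavelet coefficients of the high-frequency components in each direction.
  3. The method according to claim 2, characterized in that the second image information comprises high-frequency denoised wavelet coefficients corresponding to the high-frequency components in each direction of the image to be detected in the wavelet domain; the denoising the first image information to obtain denoised second image information comprises:
    denoising the wavelet coefficients of the high-frequency components in each direction to obtain the high-frequency denoised wavelet coefficients corresponding to the high-frequency components in each direction;
    and the obtaining an inherent pattern noise feature map corresponding to the image to be detected according to a difference between the first image information and the second image information comprises:
    obtaining high-frequency noise wavelet coefficients corresponding to the high-frequency components in each direction according to a difference between the wavelet coefficients of the high-frequency component and the high-frequency denoised wavelet coefficients for the same direction; and
    performing an inverse wavelet transform on the zeroed wavelet coefficients of the low-frequency component and the high-frequency noise wavelet coefficients corresponding to the high-frequency components in each direction to obtain the inherent pattern noise feature map corresponding to the image to be detected.
  4. The method according to claim 3, characterized in that the applying a wavelet transform to the image to be detected in a spatial domain to decompose the image to be detected into a low-frequency component and high-frequency components in multiple directions in the wavelet domain comprises:
    applying a multi-scale wavelet transform to the image to be detected in the spatial domain to decompose the image to be detected into a low-frequency component and high-frequency components at each scale, the high-frequency components at one scale comprising high-frequency components in multiple directions;
    and the performing an inverse wavelet transform on the zeroed wavelet coefficients of the low-frequency component and the high-frequency noise wavelet coefficients corresponding to the high-frequency components in each direction comprises:
    performing an inverse wavelet transform on the zeroed wavelet coefficients of the low-frequency component and the high-frequency noise wavelet coefficients corresponding to the high-frequency components in each direction at each scale to obtain the inherent pattern noise feature map corresponding to the image to be detected.
  5. The method according to claim 3, characterized in that the denoising the wavelet coefficients of the high-frequency components in each direction to obtain the high-frequency denoised wavelet coefficients corresponding to the high-frequency components in each direction comprises:
    estimating a local variance of non-noise information in the high-frequency component in each direction; and
    for the high-frequency component in each direction, applying denoising filtering to the wavelet coefficients of the high-frequency component according to the local variance of the non-noise information in the high-frequency component to obtain the high-frequency denoised wavelet coefficients corresponding to the high-frequency components in each direction.
  6. The method according to claim 5, characterized in that the estimating a local variance of non-noise information in the high-frequency component in each direction comprises:
    for the high-frequency component in each direction, filtering the high-frequency component through multiple windows of different sizes to obtain a filtering result of the high-frequency component for each window;
    determining, according to a difference between the filtering result for a window and a preset noise variance, an initial local variance of the non-noise information in the high-frequency component for each window; and
    selecting a final local variance of the non-noise information in the high-frequency component from the initial local variances.
  7. The method according to claim 1, characterized in that the analyzing a distribution of inherent pattern noise in the inherent pattern noise feature map, and identifying the authenticity of the image to be detected according to the distribution to obtain an authenticity detection result of the image to be detected comprises:
    inputting the inherent pattern noise feature map into a pre-trained authenticity detection model; and
    analyzing, through the authenticity detection model, the distribution of inherent pattern noise in the inherent pattern noise feature map, and identifying the authenticity of the image to be detected according to the distribution to obtain the authenticity detection result of the image to be detected.
  8. The method according to claim 7, characterized in that the authenticity detection model is obtained through model training steps, the model training steps comprising:
    obtaining a sample image and an authenticity label corresponding to the sample image;
    extracting inherent pattern noise from the sample image to obtain a sample inherent pattern noise feature map;
    inputting the sample inherent pattern noise feature map into an authenticity detection model to be trained, analyzing through the authenticity detection model a distribution of inherent pattern noise in the sample inherent pattern noise feature map, and identifying the authenticity of the sample image corresponding to the sample inherent pattern noise feature map according to the distribution to obtain an authenticity detection result of the sample image; and
    iteratively adjusting model parameters of the authenticity detection model to be trained according to a difference between the authenticity detection result of the sample image and the authenticity label corresponding to the sample image until an iteration stop condition is met, to obtain a trained authenticity detection model.
  9. The method according to claim 8, characterized in that the extracting inherent pattern noise from the sample image to obtain a sample inherent pattern noise feature map comprises:
    removing low-frequency information from the sample image to obtain sample high-frequency information;
    denoising the sample high-frequency information to obtain denoised sample denoised information; and
    obtaining the sample inherent pattern noise feature map corresponding to the sample image according to a difference between the sample high-frequency information and the sample denoised information.
  10. The method according to claim 8, characterized in that the obtaining a sample image and an authenticity label corresponding to the sample image comprises:
    obtaining a sample video carrying an authenticity label;
    sampling the sample video to obtain multiple sample video frames corresponding to the sample video; and
    cropping each sample video frame to a preset size to obtain a sample image, and using the authenticity label carried by the sample video from which the sample image comes as the authenticity label corresponding to the sample image.
  11. The method according to claim 10, characterized in that the sampling the sample video to obtain multiple sample video frames corresponding to the sample video comprises:
    for each sample video, if a specified number of sampled frames is greater than or equal to a total number of frames of the sample video, using each frame of the sample video as a sample video frame.
  12. The method according to claim 10, characterized in that the sampling the sample video to obtain multiple sample video frames corresponding to the sample video comprises:
    for each sample video, if a specified number of sampled frames is less than a total number of frames of the sample video, determining a sampling interval according to the total number of frames and the number of sampled frames, and extracting sample video frames from the sample video at the sampling interval.
  13. The method according to any one of claims 1 to 12, characterized in that the identifying the authenticity of the image to be detected according to the distribution to obtain an authenticity detection result of the image to be detected comprises:
    if the distribution conforms to an inherent pattern noise distribution of real images, identifying the image to be detected as a real image.
  14. The method according to any one of claims 1 to 12, characterized in that the identifying the authenticity of the image to be detected according to the distribution to obtain an authenticity detection result of the image to be detected comprises:
    if the distribution matches an inherent pattern noise distribution of a preset forgery type, identifying the image to be detected as a forged image of the forgery type corresponding to the matching inherent pattern noise distribution.
  15. An image authenticity detection apparatus, characterized in that the apparatus comprises:
    an image obtaining module, configured to obtain an image to be detected;
    a high-frequency information obtaining module, configured to remove low-frequency information from the image to be detected to obtain first image information;
    a denoising module, configured to denoise the first image information to obtain denoised second image information;
    a pattern noise determining module, configured to obtain an inherent pattern noise feature map corresponding to the image to be detected according to a difference between the first image information and the second image information; and
    an authenticity identification module, configured to analyze a distribution of inherent pattern noise in the inherent pattern noise feature map, and identify the authenticity of the image to be detected according to the distribution to obtain an authenticity detection result of the image to be detected; the inherent pattern noise being inherent noise that is introduced by a camera sensor and is not disturbed by image content.
  16. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions, characterized in that the one or more processors implement the steps of the method according to any one of claims 1 to 14 when executing the computer-readable instructions.
  17. One or more computer-readable storage media storing computer-readable instructions, characterized in that the computer-readable instructions, when executed by one or more processors, implement the steps of the method according to any one of claims 1 to 14.
  18. A computer program product, comprising computer-readable instructions, characterized in that the computer-readable instructions, when executed by one or more processors, implement the steps of the method according to any one of claims 1 to 14.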
WO2022237397A1 (zh): Image authenticity detection method and apparatus, computer device, and storage medium. Application PCT/CN2022/085430, priority date 2021-05-11, filing date 2022-04-07.

Priority Applications (2)

Application Number; Publication; Priority Date; Filing Date; Title
EP22806367.3A; EP4300417A1 (en); 2021-05-11; 2022-04-07; Method and apparatus for evaluating image authenticity, computer device, and storage medium
US17/979,883; US20230056564A1 (en); 2021-05-11; 2022-11-03; Image authenticity detection method and apparatus

Applications Claiming Priority (2)

Application Number; Publication; Priority Date; Filing Date; Title
CN202110512723.XA; CN112991345B (zh); 2021-05-11; 2021-05-11; Image authenticity detection method and apparatus, computer device, and storage medium
CN202110512723.X; priority date 2021-05-11

Related Child Applications (1)

Application Number; Relation; Publication; Priority Date; Filing Date; Title
US17/979,883; Continuation; US20230056564A1 (en); 2021-05-11; 2022-11-03; Image authenticity detection method and apparatus

Publications (1)

Publication Number; Publication Date
WO2022237397A1; 2022-11-17

Family

ID=76337542

Family Applications (1)

Application Number; Publication; Priority Date; Filing Date; Title
PCT/CN2022/085430; WO2022237397A1 (zh); 2021-05-11; 2022-04-07; Image authenticity detection method and apparatus, computer device, and storage medium

Country Status (4)

Country; Publication
US; US20230056564A1 (zh)
EP; EP4300417A1 (zh)
CN; CN112991345B (zh)
WO; WO2022237397A1 (zh)

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22806367

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2022806367

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2022806367

Country of ref document: EP

Effective date: 20230926

NENP Non-entry into the national phase

Ref country code: DE