CN117788906A - Large model generation image identification method and system - Google Patents

Large model generation image identification method and system

Info

Publication number
CN117788906A
CN117788906A
Authority
CN
China
Prior art keywords
residual
features
inputting
classification
processing module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311804911.5A
Other languages
Chinese (zh)
Other versions
CN117788906B (en)
Inventor
郑威
云剑
郑晓玲
凌霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Academy of Information and Communications Technology CAICT
Original Assignee
China Academy of Information and Communications Technology CAICT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Academy of Information and Communications Technology CAICT filed Critical China Academy of Information and Communications Technology CAICT
Priority to CN202311804911.5A
Publication of CN117788906A
Application granted
Publication of CN117788906B
Legal status: Active


Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a large model generation image identification method and system. The method comprises the following steps: inputting the generated image into a first processing module based on residual filtering to obtain original features; inputting the original features into a second processing module based on a self-attention mechanism and a residual structure to obtain classification features; and inputting the classification features into a classification network, and outputting a result that is only "true" or "false". The scheme provided by the invention solves the problems that the prior art cannot utilize shallow texture information of the input picture, and that its loss function is too simple to change dynamically with the input data.

Description

Large model generation image identification method and system
Technical Field
The invention belongs to the field of image identification, and particularly relates to a large model generation image identification method and system.
Background
With the development of artificial intelligence technology, large models have gradually matured and now play a variety of roles in everyday life. Among them, AI-generated content (AIGC) is a popular large model direction. A large number of image-generation large models using diffusion models have entered the public's field of view. For example, Stable Diffusion, DreamBooth and Midjourney only require the user to input a prompt for the large model to automatically generate a corresponding image; such large model services also enable people who are not skilled at drawing to map the ideas in their heads onto images.
However, image-generation large models are a double-edged sword and bring a number of drawbacks. For example, in ordinary commercial activities, investors or purchasers often wish to buy images personally designed and drawn by a painter rather than images generated by a large model, and the indiscriminate use of large models to generate images can also lead to copyright disputes. Moreover, since an image-generation large model can render great detail from a prompt, it may be misused by malicious actors to fabricate rumors, with consequences that are hard to measure.
Society therefore has an urgent need for technology capable of distinguishing large model generated images from real images.
Although prior researchers have made some progress on distinguishing real pictures from fake ones, existing work in the field of separating computer-generated pictures from real pictures tends to focus on images produced by conventional neural networks such as generative adversarial networks (GANs) or variational autoencoders (VAEs), and methods for distinguishing such images from real pictures have been proposed from various angles, including the spatial domain and the frequency domain.
However, image-generation large models that generate images with a diffusion model differ greatly in principle from conventional image-generation networks, and it is difficult to apply existing techniques directly to the forgery-discrimination task for pictures generated by large models. Some researchers have applied existing image identification techniques to distinguish large model images from real images, and the resulting models performed very poorly, failing to meet current demands and expectations. With the rapid development of diffusion-based image-generation large models, existing identification technology will find it ever harder to distinguish images generated by large models from real images.
Prior Art
DIRE, proposed in the paper "DIRE for Diffusion-Generated Image Detection", is an abbreviation of DIffusion Reconstruction Error. DIRE measures the error between an input image and its reconstruction by a pre-trained diffusion model. The authors of the paper found that images generated by diffusion models are more easily reconstructed by a pre-trained diffusion model than real images, which are difficult to reconstruct because of the many complexities of reality. DIRE reconstructs the input image through DDIM, computes the difference between the reconstructed image and the original image, and finally performs binary classification with that difference as the feature, judging whether the image is a large-model-forged image.
Shortcomings of the prior art
The first shortcoming of the existing DIRE method is that taking the difference between the original image and the reconstructed image loses the shallow texture characteristics of the original image, so sufficient information cannot be extracted from it.
A second shortcoming of the existing DIRE method is that it pays no attention to the relationships between individual pixels within a large-model-generated image.
A third shortcoming of the existing DIRE method is that its loss function is too simple to dynamically adjust the learning stride according to different input data.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a large model generation image identification method.
The first aspect of the invention discloses a large model generation image identification method, which comprises the following steps:
S1, inputting a generated image into a first processing module based on residual filtering to obtain original features;
S2, inputting the original features into a second processing module based on a self-attention mechanism and a residual structure to obtain classification features;
and S3, inputting the classification features into a classification network, and outputting a result that is only "true" or "false".
According to the method of the first aspect of the present invention, in the step S1, the method for inputting the generated image into the first processing module based on residual filtering to obtain the original features comprises:
respectively inputting the generated image into residual filters and convolution kernels, combining the processing results of the residual filters and the convolution kernels, and finally inputting the combined result into a first convolution pooling layer to obtain the original features.
According to the method of the first aspect of the present invention, in the step S1, there are seventeen residual filters and eight convolution kernels; the values of the seventeen residual filters are fixed and do not change during learning, whereas the parameters of the eight convolution kernels are learned during training.
According to the method of the first aspect of the present invention, in the step S1, the first convolution pooling layer is a convolution pooling layer comprising a convolution layer and a pooling layer, with a residual mechanism applied to the convolution layer.
According to the method of the first aspect of the present invention, in the step S2, the method for inputting the original features into a second processing module based on a self-attention mechanism and a residual structure to obtain classification features comprises:
inputting the original features into a second convolution pooling layer to obtain processing features, inputting the processing features into a self-attention operation, and numerically adding the result of the self-attention operation to the processing features to obtain the classification features.
According to the method of the first aspect of the present invention, in said step S2, the value V of said self-attention operation is said processing feature; the weights assigned to V are obtained from Q and K through a softmax layer, and the weighted average then yields the shallow texture features captured by the self-attention mechanism.
According to the method of the first aspect of the present invention, in said step S3, classification is optimized using a high-dimensional spherical boundary objective function in said classification network.
In a second aspect, the present invention discloses a large model generation image identification system, the system comprising:
a first processing module configured to input the generated image into the first processing module based on residual filtering to obtain original features;
a second processing module configured to input the original features into the second processing module based on a self-attention mechanism and a residual structure to obtain classification features;
and a third processing module configured to input the classification features into a classification network and output a result that is only "true" or "false".
A third aspect of the invention discloses an electronic device. The electronic device comprises a memory and a processor; the memory stores a computer program, and when the processor executes the computer program, it implements the steps of the large model generation image identification method of any one of the first aspects of the present disclosure.
A fourth aspect of the invention discloses a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the large model generation image identification method of any one of the first aspects of the present disclosure.
In summary, the scheme provided by the invention combines a self-attention mechanism with a residual network structure: the self-attention mechanism enhances the network's ability to extract and analyze shallow texture features invisible to the naked eye, while the residual connection supplements lost feature information, further enriching the basis for identification. The training mode based on the high-dimensional spherical boundary objective function, which increases the intra-group similarity training stride, helps the model pay more attention to the intra-group similarity threshold. The scheme solves the problems that the prior art cannot utilize shallow texture information of the input picture and that its loss function is too simple to change dynamically with the input data.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a large model generation image identification method according to an embodiment of the present invention;
FIG. 2 is a block flow diagram according to an embodiment of the invention;
FIG. 3 is a block diagram of residual filtering according to an embodiment of the present invention;
FIG. 4 is an exemplary value of the initialization of seventeen residual filters according to an embodiment of the present invention;
FIG. 5 is a self-attention module diagram according to an embodiment of the present invention;
FIG. 6 is a block diagram of a large model generation image identification system according to an embodiment of the present invention;
fig. 7 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments are described below clearly and completely with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
The first aspect of the invention discloses a large model generation image identification method. Fig. 1 is a flowchart of a large model generation image identification method according to an embodiment of the present invention; as shown in fig. 1, the method includes:
S1, inputting a generated image into a first processing module based on residual filtering to obtain original features;
S2, inputting the original features into a second processing module based on a self-attention mechanism and a residual structure to obtain classification features;
and S3, inputting the classification features into a classification network, and outputting a result that is only "true" or "false".
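The three steps above can be sketched end to end as follows. The wrapper function and its parameter names are illustrative assumptions, not the patent's own code; each processing module is passed in as an opaque callable:

```python
def identify_image(image, s1_module, s2_module, classifier):
    """Sketch of the overall flow of FIG. 1. The three modules are supplied
    as callables (names are assumptions): s1_module is the residual
    filtering module, s2_module the self-attention module with residual
    structure, and classifier the classification network, which returns a
    boolean so that the output is only "true" or "false"."""
    original_features = s1_module(image)                     # S1
    classification_features = s2_module(original_features)   # S2
    is_real = classifier(classification_features)            # S3
    return "true" if is_real else "false"
```

With identity stand-ins for the modules, the wrapper simply relays the classifier's verdict as the two-valued result the method requires.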
In step S1, as shown in fig. 2, the generated image is input into the first processing module (the residual filtering module in fig. 2) based on residual filtering to obtain the original features.
In some embodiments, in the step S1, the method for inputting the generated image into the first processing module based on residual filtering to obtain the original features includes:
as shown in fig. 3, the generated image is input into the residual filters and the convolution kernels respectively, then the processing results of the residual filters and the convolution kernels are combined, and finally the combined result is input into a first convolution pooling layer to obtain the original features.
There are seventeen residual filters and eight convolution kernels; the values of the seventeen residual filters are fixed and do not change during learning, whereas the parameters of the eight convolution kernels are learned during training.
The first convolution pooling layer is a convolution pooling layer comprising a convolution layer and a pooling layer, with a residual mechanism applied to the convolution layer.
Specifically, the convolution pooling layer here comprises a 3×3 convolution layer, a regularization layer, a ReLU layer and a max pooling layer.
As shown in fig. 4, the seventeen residual filters are initialized with the specific values given there; these filters can efficiently extract the residual information of an image.
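A minimal sketch of this residual filtering module follows, with one fixed filter and one learnable kernel standing in for the seventeen and eight of the embodiment. The particular high-pass kernel shown is an assumption for illustration, since the actual initial values are those given in FIG. 4:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D cross-correlation of a single-channel image with a kernel."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# One example of a fixed residual (high-pass) filter; its entries sum to
# zero, so flat regions produce zero response and only residual detail
# survives. This specific 3x3 kernel is an illustrative assumption.
FIXED_FILTER = np.array([[-1.,  2., -1.],
                         [ 2., -4.,  2.],
                         [-1.,  2., -1.]]) / 4.0

def residual_filter_module(img, learned_kernels):
    """Sketch of the first processing module: the image passes through the
    fixed residual filters (non-trainable; seventeen in the embodiment) and
    the learnable convolution kernels (eight in the embodiment) in parallel,
    and the two groups of feature maps are combined along the channel axis
    before entering the first convolution pooling layer."""
    fixed_maps = [conv2d(img, FIXED_FILTER)]                  # 17 in full
    learned_maps = [conv2d(img, k) for k in learned_kernels]  # 8 in full
    return np.stack(fixed_maps + learned_maps, axis=0)        # (C, H', W')
```

On a flat image the fixed filter's output is identically zero, which illustrates why these filters isolate residual texture rather than image content.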
In step S2, as shown in fig. 2, the original features are input into the second processing module (the self-attention module in fig. 2) based on a self-attention mechanism and a residual structure to obtain the classification features. The self-attention mechanism enhances the network's ability to extract and analyze shallow texture features invisible to the naked eye, while the residual connection supplements lost feature information, further enriching the basis for identification.
In some embodiments, in the step S2, the method for inputting the original features into the second processing module based on a self-attention mechanism and a residual structure to obtain the classification features includes:
as shown in fig. 5, the original features are input into a second convolution pooling layer to obtain processing features, the processing features are input into a self-attention operation, and the result of the self-attention operation is numerically added to the processing features to obtain the classification features. The information lost in the operation is supplemented through the residual mechanism, which enhances the identification effect.
The value V of the self-attention operation is the processing feature; the weights assigned to V are obtained from Q and K through a softmax layer, and the weighted average then yields the shallow texture features captured by the self-attention mechanism.
Specifically, this convolution pooling layer comprises, in order, a 3×3 convolution layer, a regularization layer, a ReLU layer, a 3×3 convolution layer, a regularization layer, a ReLU layer and a max pooling layer.
The processing features then pass through the self-attention mechanism, which captures the correlation of shallow texture features at the pixel level.
Before self-attention is performed, the data must be reshaped into the form (number of pixels, number of channels); the self-attention operation is then performed on the data.
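The self-attention operation with the residual addition described above can be sketched as follows. The learned projections Wq and Wk producing Q and K are assumptions, since the embodiment does not specify how Q and K are computed; V is the processing feature itself, as stated:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_with_residual(feat, Wq, Wk):
    """Sketch of the second processing module. `feat` has the reshaped form
    (number of pixels, number of channels). V is the processing feature; the
    weights on V come from Q and K through a softmax layer, and the weighted
    average captures pixel-level shallow texture correlations. The result is
    numerically added to the input (the residual connection that restores
    information lost in the operation)."""
    Q = feat @ Wq                       # (P, d), projection is an assumption
    K = feat @ Wk                       # (P, d), projection is an assumption
    V = feat                            # the processing feature itself
    weights = softmax(Q @ K.T / np.sqrt(Q.shape[-1]), axis=-1)  # (P, P)
    attended = weights @ V              # weighted average over pixels
    return attended + feat              # numerical (residual) addition
```

With zero projections the attention weights become uniform, so the output is the input plus its per-channel mean over all pixels, which makes the residual addition easy to verify by hand.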
In step S3, the classification features are input into the classification network (the spherical loss classification module in fig. 2), and only a "true" or "false" result is output.
In some embodiments, in said step S3, classification is optimized using a high-dimensional spherical boundary objective function in said classification network.
Specifically, the image forgery-discrimination problem can be regarded as a binary classification problem, outputting only a "true" or "false" result. To improve the accuracy and effectiveness of classification, the invention introduces a high-dimensional spherical boundary objective function in the classification layer. The high-dimensional spherical boundary objective function is built mainly around an intra-group similarity stride and an inter-group similarity stride. The invention emphasizes intra-group similarity and designs a special weight w for it.
The high-dimensional spherical boundary objective function adaptively changes the optimization stride during training. The stride update rule is as follows:
At the start, separate thresholds are set for the intra-group similarity and the inter-group similarity.
When a sample is input, the intra-group similarity and inter-group similarity of the current sample are computed first, and then the intra-group similarity stride and the inter-group similarity stride are calculated. The intra-group similarity stride is the weight w multiplied by the difference obtained by subtracting the computed intra-group similarity from the intra-group similarity threshold. The inter-group similarity stride is the computed inter-group similarity minus the inter-group similarity threshold. If either stride is less than 0, it is set to 0, so both strides always remain non-negative.
When the metrics are computed per group, the intra-group similarity loss and the inter-group similarity loss are each multiplied by the stride of their group and then summed. Thus, during gradient descent, different data are updated with different strides.
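The stride update rule above can be sketched as follows. The function and variable names are ours; the two thresholds and the intra-group weight w are hyperparameters set at the start of training, as described:

```python
def spherical_boundary_strides(intra_sim, inter_sim,
                               intra_thresh, inter_thresh, w):
    """Sketch of the stride update rule of the high-dimensional spherical
    boundary objective. Both strides are clamped at zero so they remain
    non-negative."""
    # Intra-group stride: w times how far the intra-group similarity still
    # falls short of its threshold.
    intra_stride = max(0.0, w * (intra_thresh - intra_sim))
    # Inter-group stride: how far the inter-group similarity exceeds its
    # threshold.
    inter_stride = max(0.0, inter_sim - inter_thresh)
    return intra_stride, inter_stride

def spherical_boundary_loss(intra_loss, inter_loss,
                            intra_stride, inter_stride):
    """The per-group losses are scaled by their strides and summed, so
    gradient descent effectively takes a different step per sample."""
    return intra_stride * intra_loss + inter_stride * inter_loss
```

Once the intra-group similarity exceeds its threshold (and the inter-group similarity stays below its own), both strides fall to zero and that sample no longer drives the update, which is how the objective focuses attention on the intra-group similarity threshold.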
After the model is trained with the high-dimensional spherical boundary objective function, it can perform forgery discrimination on an input picture, identifying whether the picture was generated by a large model.
In summary, the scheme provided by the invention combines a self-attention mechanism with a residual network structure, enhancing the network's ability to extract and analyze shallow texture features invisible to the naked eye, and further enriches the basis for identification by supplementing lost feature information through the residual connection.
The training mode based on the high-dimensional spherical boundary objective function, which increases the intra-group similarity training stride, helps the model pay more attention to the intra-group similarity threshold. This solves the problems that the prior art cannot utilize shallow texture information of the input picture and that its loss function is too simple to change dynamically with the input data.
A second aspect of the invention discloses a large model generation image identification system. FIG. 6 is a block diagram of a large model generation image identification system according to an embodiment of the present invention; as shown in fig. 6, the system 100 includes:
a first processing module 101 configured to input the generated image into the first processing module based on residual filtering to obtain original features;
a second processing module 102 configured to input the original features into the second processing module based on a self-attention mechanism and a residual structure to obtain classification features;
and a third processing module 103 configured to input the classification features into a classification network and output a result that is only "true" or "false".
According to the system of the second aspect of the present invention, the first processing module 101 is specifically configured so that the method of inputting the generated image into the first processing module based on residual filtering to obtain the original features comprises:
as shown in fig. 3, the generated image is input into the residual filters and the convolution kernels respectively, then the processing results of the residual filters and the convolution kernels are combined, and finally the combined result is input into a first convolution pooling layer to obtain the original features.
There are seventeen residual filters and eight convolution kernels; the values of the seventeen residual filters are fixed and do not change during learning, whereas the parameters of the eight convolution kernels are learned during training.
The first convolution pooling layer is a convolution pooling layer comprising a convolution layer and a pooling layer, with a residual mechanism applied to the convolution layer.
Specifically, the convolution pooling layer here comprises a 3×3 convolution layer, a regularization layer, a ReLU layer and a max pooling layer.
As shown in fig. 4, the seventeen residual filters are initialized with the specific values given there; these filters can efficiently extract the residual information of an image.
According to the system of the second aspect of the present invention, the second processing module 102 is specifically configured so that the method of inputting the original features into the second processing module based on a self-attention mechanism and a residual structure to obtain the classification features comprises:
as shown in fig. 5, the original features are input into a second convolution pooling layer to obtain processing features, the processing features are input into a self-attention operation, and the result of the self-attention operation is numerically added to the processing features to obtain the classification features. The information lost in the operation is supplemented through the residual mechanism, which enhances the identification effect.
The value V of the self-attention operation is the processing feature; the weights assigned to V are obtained from Q and K through a softmax layer, and the weighted average then yields the shallow texture features captured by the self-attention mechanism.
Specifically, this convolution pooling layer comprises, in order, a 3×3 convolution layer, a regularization layer, a ReLU layer, a 3×3 convolution layer, a regularization layer, a ReLU layer and a max pooling layer.
The processing features then pass through the self-attention mechanism, which captures the correlation of shallow texture features at the pixel level.
Before self-attention is performed, the data must be reshaped into the form (number of pixels, number of channels); the self-attention operation is then performed on the data.
According to the system of the second aspect of the present invention, the third processing module 103 is specifically configured to optimize classification using a high-dimensional spherical boundary objective function in the classification network.
Specifically, the image forgery-discrimination problem can be regarded as a binary classification problem, outputting only a "true" or "false" result. To improve the accuracy and effectiveness of classification, the invention introduces a high-dimensional spherical boundary objective function in the classification layer. The high-dimensional spherical boundary objective function is built mainly around an intra-group similarity stride and an inter-group similarity stride. The invention emphasizes intra-group similarity and designs a special weight w for it.
The high-dimensional spherical boundary objective function adaptively changes the optimization stride during training. The stride update rule is as follows:
At the start, separate thresholds are set for the intra-group similarity and the inter-group similarity.
When a sample is input, the intra-group similarity and inter-group similarity of the current sample are computed first, and then the intra-group similarity stride and the inter-group similarity stride are calculated. The intra-group similarity stride is the weight w multiplied by the difference obtained by subtracting the computed intra-group similarity from the intra-group similarity threshold. The inter-group similarity stride is the computed inter-group similarity minus the inter-group similarity threshold. If either stride is less than 0, it is set to 0, so both strides always remain non-negative.
When the metrics are computed per group, the intra-group similarity loss and the inter-group similarity loss are each multiplied by the stride of their group and then summed. Thus, during gradient descent, different data are updated with different strides.
After the model is trained with the high-dimensional spherical boundary objective function, it can perform forgery discrimination on an input picture, identifying whether the picture was generated by a large model.
A third aspect of the invention discloses an electronic device. The electronic device comprises a memory and a processor; the memory stores a computer program, and when the processor executes the computer program, it implements the steps of the large model generation image identification method of any one of the first aspects of the present disclosure.
Fig. 7 is a block diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 7, the electronic device includes a processor, a memory, a communication interface, a display screen and an input device connected through a system bus. The processor of the electronic device provides computing and control capabilities. The memory of the electronic device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The communication interface of the electronic device is used for wired or wireless communication with an external terminal; wireless communication can be achieved through WIFI, an operator network, near-field communication (NFC) or other technologies. The display screen of the electronic device can be a liquid crystal display or an electronic ink display, and the input device can be a touch layer covering the display screen, keys, a trackball or a touchpad arranged on the housing of the electronic device, or an external keyboard, touchpad or mouse.
It will be appreciated by those skilled in the art that the structure shown in fig. 7 is merely a diagram of the portion relevant to the technical solution of the present disclosure and does not limit the electronic devices to which the solution of the present application applies; a specific electronic device may include more or fewer components than shown, combine certain components, or arrange components differently.
A fourth aspect of the invention discloses a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the large model generation image identification method of any one of the first aspects of the present disclosure.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not every possible combination of these technical features has been described, but any combination of them that contains no contradiction should be regarded as within the scope of this description. The foregoing examples express only a few embodiments of the present application and are described in relative detail, but they are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the spirit of the present application, and all of these fall within the protection scope of the present application. Accordingly, the scope of protection of the present application is determined by the appended claims.

Claims (10)

1. A method for large model generation image authentication, the method comprising:
s1, inputting a generated image into a first processing module based on residual filtering to obtain original characteristics;
s2, inputting the original features into a second processing module based on a self-attention mechanism and a residual error structure to obtain classification features;
and S3, inputting the classification features into a binary classification network, and outputting a result that is either true or false.
2. The large model generated image identification method according to claim 1, wherein in step S1, inputting the generated image into the first processing module based on residual filtering to obtain the original features comprises:
inputting the generated image into residual filters and convolution kernels respectively, merging the processing results of the residual filters and the convolution kernels, and finally inputting the merged result into a first convolution pooling layer to obtain the original features.
3. The large model generated image identification method according to claim 2, wherein in step S1 there are seventeen of the residual filters and eight of the convolution kernels; the values of the seventeen residual filters are fixed and do not change during learning, whereas the parameters of the eight convolution kernels are learned during training.
4. The large model generated image identification method according to claim 2, wherein in step S1 the first convolution pooling layer is a convolution layer and a pooling layer connected by a residual mechanism.
5. The large model generated image identification method according to claim 1, wherein in step S2, inputting the original features into the second processing module based on a self-attention mechanism and a residual structure to obtain the classification features comprises:
inputting the original features into a second convolution pooling layer to obtain processing features, feeding the processing features into a self-attention operation, and numerically adding the self-attention result to the processing features to obtain the classification features.
6. The large model generated image identification method according to claim 5, wherein in step S2, V of the self-attention operation is the processing feature; the weights assigned to V are computed from Q and K through a softmax layer, and the weighted average then yields the shallow texture features captured by the self-attention mechanism.
7. The large model generated image identification method according to claim 1, wherein in step S3, classification in the binary classification network is optimized using a high-dimensional spherical boundary objective function.
8. A large model generated image identification system, the system comprising:
a first processing module configured to input the generated image into the first processing module based on residual filtering to obtain original features;
a second processing module configured to input the original features into the second processing module based on a self-attention mechanism and a residual structure to obtain classification features;
and a third processing module configured to input the classification features into a binary classification network and output a result that is either true or false.
9. An electronic device, comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the large model generated image identification method according to any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the large model generated image identification method according to any one of claims 1 to 7.
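The pipeline recited in claims 1 to 6 (fixed residual filters alongside learnable convolution kernels, then a self-attention operation whose output is numerically added back to the processing features) can be sketched outside the claims as follows. This is a minimal NumPy illustration, not the patent's implementation: the high-pass kernel value, feature shapes, and projection matrices `Wq`/`Wk` are illustrative assumptions. It follows claim 6 in taking V to be the processing feature itself and claim 5 in adding the attention result back to that feature.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def residual_filter(img, kernel):
    """'Valid' 2-D correlation with a fixed high-pass kernel.
    Illustrates one of the fixed (non-learned) residual filters of claim 3."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def self_attention_residual(feat, Wq, Wk):
    """Claims 5-6: V is the processing feature; the weights for V come from
    Q and K via a softmax; the attention output is added back to feat."""
    Q = feat @ Wq
    K = feat @ Wk
    V = feat                                            # claim 6: V is the processing feature
    weights = softmax(Q @ K.T / np.sqrt(Q.shape[-1]))   # each row sums to 1
    attended = weights @ V                              # weighted average: shallow texture features
    return attended + feat                              # claim 5: numerical (residual) addition

# One illustrative fixed high-pass residual kernel (an assumed value,
# not one of the patent's seventeen filters).
hp = np.array([[0., -1., 0.],
               [-1., 4., -1.],
               [0., -1., 0.]])
```

A flat region passed through the high-pass kernel produces an all-zero residual, which is why such filters suppress image content and keep generation artifacts; the residual addition keeps the processing features flowing past the attention stage unchanged in shape.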
CN202311804911.5A 2023-12-26 2023-12-26 Large model generation image identification method and system Active CN117788906B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311804911.5A CN117788906B (en) 2023-12-26 2023-12-26 Large model generation image identification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311804911.5A CN117788906B (en) 2023-12-26 2023-12-26 Large model generation image identification method and system

Publications (2)

Publication Number Publication Date
CN117788906A true CN117788906A (en) 2024-03-29
CN117788906B CN117788906B (en) 2024-07-05

Family

ID=90388665

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311804911.5A Active CN117788906B (en) 2023-12-26 2023-12-26 Large model generation image identification method and system

Country Status (1)

Country Link
CN (1) CN117788906B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108830407A (en) * 2018-05-30 2018-11-16 华东交通大学 Sensor distribution optimization method under the conditions of multi-state in monitoring structural health conditions
US10803387B1 (en) * 2019-09-27 2020-10-13 The University Of Stavanger Deep neural architectures for detecting false claims
CN113094871A (en) * 2021-03-08 2021-07-09 国网湖北省电力有限公司电力科学研究院 Wind power area boundary accurate modeling method based on diamond convex hull set theory
CN114973364A (en) * 2022-05-23 2022-08-30 北京影数科技有限公司 Depth image false distinguishing method and system based on face region attention mechanism
CN115082322A (en) * 2022-07-26 2022-09-20 腾讯科技(深圳)有限公司 Image processing method and device, and training method and device of image reconstruction model
CN115100516A (en) * 2022-06-07 2022-09-23 北京科技大学 Relation learning-based remote sensing image target detection method
CN115116092A (en) * 2022-06-30 2022-09-27 中原工学院 Intelligent true and false pedestrian identification method based on human eye stereoscopic vision and bionic model
CN115170933A (en) * 2022-08-18 2022-10-11 西安电子科技大学 Digital image forged area positioning method based on double-current deep neural network
CN116704580A (en) * 2023-06-09 2023-09-05 成都信息工程大学 Face counterfeiting detection method based on depth information decoupling
CN116739071A (en) * 2023-05-16 2023-09-12 华为技术有限公司 Model training method and related device
CN116935253A (en) * 2022-03-29 2023-10-24 上海电力大学 Human face tampering detection method based on residual error network combined with space-time attention mechanism
CN117079355A (en) * 2023-08-29 2023-11-17 中国信息通信研究院 Object image fake identifying method and device and electronic equipment

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
IYYAKUTTI IYAPPAN GANAPATHI et al.: "Learning to localize image forgery using end-to-end attention network", Neurocomputing, vol. 512, 1 November 2022 (2022-11-01), pages 25-39 *
MD. TANVIR HASSAN et al.: "Regular Splitting Graph Network for 3D Human Pose Estimation", IEEE Transactions on Image Processing, vol. 32, 11 July 2023 (2023-07-11), pages 4212-4222 *
ZIYI XI et al.: "AI-Generated Image Detection using a Cross-Attention Enhanced Dual-Stream Network", 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 20 November 2023 (2023-11-20), pages 1463-1470 *
ZHANG YI: "Research on Key Technologies for Multi-Pose Face Recognition", China Doctoral Dissertations Full-text Database (Information Science and Technology), no. 04, 15 April 2022 (2022-04-15), pages 138-72 *
DENG JIANGUO et al.: "Loss Functions in Supervised Learning and Their Applications", Big Data, vol. 6, no. 01, 15 January 2020 (2020-01-15), pages 60-80 *
QIU HONG: "Video Forgery Identification Technology and Supervision System for Specific Objects", Radio & Television Information, vol. 30, no. 07, 11 July 2023 (2023-07-11), pages 19-22 *

Also Published As

Publication number Publication date
CN117788906B (en) 2024-07-05

Similar Documents

Publication Publication Date Title
Rahmouni et al. Distinguishing computer graphics from natural images using convolution neural networks
Yang et al. MTD-Net: Learning to detect deepfakes images by multi-scale texture difference
Singh et al. Image classification: a survey
Che et al. How is gaze influenced by image transformations? dataset and model
CN109146856A (en) Picture quality assessment method, device, computer equipment and storage medium
Han et al. Two-stage learning to predict human eye fixations via SDAEs
CN111738243B (en) Method, device and equipment for selecting face image and storage medium
CN111448581A (en) System and method for image processing using deep neural networks
Do et al. Deep neural network-based fusion model for emotion recognition using visual data
CN111310705A (en) Image recognition method and device, computer equipment and storage medium
US11809519B2 (en) Semantic input sampling for explanation (SISE) of convolutional neural networks
CN115620384B (en) Model training method, fundus image prediction method and fundus image prediction device
CN111814682A (en) Face living body detection method and device
CN112529040A (en) Model generation method and device, electronic equipment and medium
CN112818774A (en) Living body detection method and device
Zaman et al. A novel driver emotion recognition system based on deep ensemble classification
CN116958637A (en) Training method, device, equipment and storage medium of image detection model
CN117636131A (en) Yolo-I model-based small target identification method and related device
Liang et al. Fixation prediction for advertising images: Dataset and benchmark
Willoughby et al. DrunkSelfie: intoxication detection from smartphone facial images
Chao et al. Instance-aware image dehazing
CN116469177A (en) Living body target detection method with mixed precision and training method of living body detection model
Hepburn et al. Enforcing perceptual consistency on generative adversarial networks by using the normalised laplacian pyramid distance
CN117788906B (en) Large model generation image identification method and system
CN116823983A (en) One-to-many style handwriting picture generation method based on style collection mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant