CN117788906B - Large model generation image identification method and system - Google Patents
- Publication number
- CN117788906B (CN202311804911.5A)
- Authority
- CN
- China
- Prior art keywords
- stride
- similarity
- group similarity
- features
- group
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Image Analysis (AREA)
Abstract
The invention provides a method and a system for identifying images generated by large models. The method comprises the following steps: inputting the generated image into a first processing module based on residual filtering to obtain original features; inputting the original features into a second processing module based on a self-attention mechanism and a residual structure to obtain classification features; and inputting the classification features into a classification network, which outputs only a "true" or "false" result. The scheme addresses two problems of the prior art: it cannot use the shallow texture information of the input picture, and its loss function is too simple to change dynamically with the input data.
Description
Technical Field
The invention belongs to the field of image identification, and particularly relates to a method and a system for identifying images generated by large models.
Background
With the development of artificial intelligence technology, large models have gradually matured and now play a variety of roles in daily life. AI-generated content (AIGC) is one of the most popular directions for large models, and large image-generation models based on diffusion models have entered the public eye. For example, with large model services such as Stable Diffusion, DreamBooth, and Midjourney, a user only needs to enter prompt words (a prompt) for the model to automatically generate a corresponding image, which allows people who are not good at drawing to turn the ideas in their minds into images.
However, large image-generation models are a double-edged sword and also bring a number of drawbacks. For example, in ordinary commercial activities, investors or purchasers often wish to buy images that were designed and drawn by the painter in person rather than images generated by a large model, and careless use of large models to generate images can also lead to copyright disputes. As another example, a large model can render an image in great detail from prompt words alone, so people with malicious intent may use it to fabricate and spread rumors, with immeasurable consequences.
Society therefore urgently needs technology that can distinguish large-model-generated images from real images.
Although there has been prior research on distinguishing real pictures from fake ones, existing work on distinguishing computer-generated pictures from real pictures tends to focus on images generated by conventional neural networks such as generative adversarial networks (GAN) or variational autoencoders (VAE), and proposes methods for distinguishing such images from real pictures from several perspectives, such as the spatial domain and the frequency domain.
However, the principle by which a large image-generation model produces an image with a diffusion model is very different from that of conventional image-generation networks, so the conventional techniques are difficult to apply directly to identifying pictures generated by large models. Some researchers have applied existing image identification techniques to distinguish large-model images from real images, and the resulting models performed very poorly and could not meet current demands and expectations. With the rapid development of diffusion-based large image-generation models, existing identification techniques will find it increasingly difficult to distinguish large-model-generated images from real images.
Prior Art
The DIRE technique comes from the paper "DIRE for Diffusion-Generated Image Detection"; DIRE is an abbreviation of DIffusion Reconstruction Error. DIRE measures the error between an input image and its reconstruction by a pre-trained diffusion model. The authors of the paper found that images generated by diffusion models are more easily reconstructed by a pre-trained diffusion model than real images, which are harder to reconstruct because of the complexity of the real world. The method reconstructs the input image through DDIM, computes the difference between the reconstructed image and the original image, and finally uses that difference as the feature for binary classification to judge whether the image was forged by a large model.
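To make the reconstruction-error idea concrete, the following is a minimal sketch in plain Python, not the DIRE authors' implementation. The `reconstruct` callable stands in for DDIM inversion and reconstruction with a pre-trained diffusion model; `toy_reconstruct` is a hypothetical placeholder used only so the example runs, and the image is assumed to be a grayscale grid of floats.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel values


def mean_value(grid: Image) -> float:
    """Mean of all values in a 2D grid."""
    total = sum(sum(row) for row in grid)
    count = sum(len(row) for row in grid)
    return total / count


def reconstruction_error(image: Image,
                         reconstruct: Callable[[Image], Image]) -> Image:
    """Per-pixel absolute error between an image and its reconstruction."""
    recon = reconstruct(image)
    return [
        [abs(p - q) for p, q in zip(row, recon_row)]
        for row, recon_row in zip(image, recon)
    ]


def toy_reconstruct(image: Image) -> Image:
    """Hypothetical stand-in for a diffusion-model reconstruction:
    pulls every pixel halfway toward the image mean."""
    mean = mean_value(image)
    return [[0.5 * (p + mean) for p in row] for row in image]
```

In this toy setting a flat image is reconstructed exactly (zero error), while a high-contrast image leaves a larger error map; the error map, not the image itself, would then be passed to a binary classifier.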
Defects of the prior art
The first disadvantage of the existing DIRE method is that taking the difference between the original image and the reconstructed image loses the shallow texture features of the original image, so sufficient information cannot be extracted from the original image.
The second disadvantage of the existing DIRE method is that it pays no attention to the relationships between individual pixels within a large-model-generated image.
The third disadvantage of the existing DIRE method is that its loss function is too simple to adjust the learning stride dynamically according to different input data.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a technical scheme for a large model generation image identification method.
The first aspect of the invention discloses a large model generation image identification method, which comprises the following steps:
S1, inputting a generated image into a first processing module based on residual filtering to obtain original features;
S2, inputting the original features into a second processing module based on a self-attention mechanism and a residual structure to obtain classification features;
S3, inputting the classification features into a classification network, and outputting only a "true" or "false" result.
According to the method of the first aspect of the present invention, in the step S1, inputting the generated image into the first processing module based on residual filtering to obtain the original features includes:
inputting the generated image into the residual filters and the convolution kernels respectively, combining the processing results of the residual filters and the convolution kernels, and finally inputting the combined result into a first convolution pooling layer to obtain the original features.
According to the method of the first aspect of the present invention, in the step S1, there are seventeen residual filters and eight convolution kernels; the values of the seventeen residual filters are fixed and do not change during learning, whereas the parameters of the eight convolution kernels are learned during training.
According to the method of the first aspect of the present invention, in the step S1, the first convolution pooling layer is a convolution pooling layer having a convolution layer and a pooling layer, wherein a residual mechanism is applied to the convolution layer.
According to the method of the first aspect of the present invention, in the step S2, inputting the original features into the second processing module based on a self-attention mechanism and a residual structure to obtain the classification features includes:
inputting the original features into a second convolution pooling layer to obtain processing features, inputting the processing features into a self-attention operation, and adding the result of the self-attention operation element-wise to the processing features to obtain the classification features.
According to the method of the first aspect of the present invention, in the step S2, V of the self-attention operation is the processing features; the weights assigned to V are computed from Q and K using a softmax layer, and the weighted average then yields the shallow texture features captured by the self-attention mechanism.
According to the method of the first aspect of the present invention, in the step S3, classification is optimized in the classification network using a high-dimensional spherical boundary objective function.
In a second aspect, the present invention discloses a large model generation image authentication system, the system comprising:
a first processing module, configured to input the generated image into the first processing module based on residual filtering to obtain original features;
a second processing module, configured to input the original features into the second processing module based on a self-attention mechanism and a residual structure to obtain classification features;
and a third processing module, configured to input the classification features into a classification network and output only a "true" or "false" result.
A third aspect of the invention discloses an electronic device. The electronic device comprises a memory storing a computer program and a processor, and the processor implements the steps of the large model generation image identification method of any implementation of the first aspect of the present disclosure when executing the computer program.
A fourth aspect of the invention discloses a computer-readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the large model generation image identification method of any implementation of the first aspect of the present disclosure.
In summary, the scheme provided by the application uses a network structure that combines a self-attention mechanism with a residual structure. The self-attention mechanism enhances the network's ability to refine and analyze shallow texture features that are invisible to the naked eye, and the residual connection supplements feature information that would otherwise be lost, further enriching the basis for identification. In the classification based on the high-dimensional spherical boundary objective function, enlarging the training stride for intra-group similarity helps the model pay more attention to the intra-group similarity threshold. The application thereby addresses the problems that the prior art cannot use the shallow texture information of an input picture and that its loss function is too simple to change dynamically with the input data.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a large model generation image authentication method according to an embodiment of the present invention;
FIG. 2 is a block flow diagram according to an embodiment of the invention;
FIG. 3 is a block diagram of residual filtering according to an embodiment of the present invention;
FIG. 4 is an exemplary value of the initialization of seventeen residual filters according to an embodiment of the present invention;
FIG. 5 is a self-attention module diagram according to an embodiment of the present invention;
FIG. 6 is a block diagram of a large model generation image authentication system according to an embodiment of the present invention;
Fig. 7 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The first aspect of the invention discloses a large model generation image identification method. Fig. 1 is a flowchart of a large model generation image authentication method according to an embodiment of the present invention, as shown in fig. 1, the method includes:
S1, inputting a generated image into a first processing module based on residual filtering to obtain original features;
S2, inputting the original features into a second processing module based on a self-attention mechanism and a residual structure to obtain classification features;
S3, inputting the classification features into a classification network, and outputting only a "true" or "false" result.
In step S1, as shown in fig. 2, the generated image is input to a first processing module (residual filtering module in fig. 2) based on residual filtering, to obtain an original feature.
In some embodiments, in the step S1, the method for obtaining the original feature by inputting the generated image into a first processing module based on residual filtering includes:
As shown in fig. 3, the generated image is input into a residual filter and a convolution kernel respectively, then the processing results of the residual filter and the convolution kernel are combined, and finally the combined result is input into a first convolution pooling layer to obtain the original feature.
There are seventeen residual filters and eight convolution kernels; the values of the seventeen residual filters are fixed and do not change during learning, whereas the parameters of the eight convolution kernels are learned during training.
The first convolution pooling layer is a convolution pooling layer having a convolution layer and a pooling layer, wherein a residual mechanism is applied to the convolution layer.
Specifically, the convolution pooling layer here comprises a 3×3 convolution layer, a regularization layer, a ReLU layer, and a max pooling layer.
FIG. 4 shows exemplary initialization values of the seventeen residual filters; these filters can efficiently extract residual information from an image.
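The two-branch structure of step S1 can be sketched as follows. This is an illustrative plain-Python sketch under several assumptions: the three kernels in `FIXED_FILTERS` are generic high-pass difference filters chosen as stand-ins (they are not the seventeen values of fig. 4), "combining" is taken to mean channel-wise concatenation, the learnable kernels are merely randomly initialized (no training loop is shown), and the first convolution pooling layer is omitted.

```python
import random
from typing import List

Matrix = List[List[float]]


def conv2d_same(image: Matrix, kernel: Matrix) -> Matrix:
    """Zero-padded 2D cross-correlation; output has the input's size."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + i - ph, x + j - pw
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[i][j]
            out[y][x] = acc
    return out


# Stand-in fixed filters (assumption): each sums to zero, so smooth
# regions map to zero and only local residuals remain.
FIXED_FILTERS: List[Matrix] = [
    [[0.0, 0.0, 0.0], [0.0, -1.0, 1.0], [0.0, 0.0, 0.0]],   # horizontal difference
    [[0.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 1.0, 0.0]],   # vertical difference
    [[0.0, 0.0, 0.0], [1.0, -2.0, 1.0], [0.0, 0.0, 0.0]],   # second-order difference
]


def init_learnable_kernels(count: int = 8, size: int = 3,
                           seed: int = 0) -> List[Matrix]:
    """Randomly initialized kernels; in training these would be updated."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-0.1, 0.1) for _ in range(size)] for _ in range(size)]
        for _ in range(count)
    ]


def residual_filter_module(image: Matrix,
                           learnable_kernels: List[Matrix]) -> List[Matrix]:
    """Run both branches and concatenate their outputs along the channel axis."""
    fixed_maps = [conv2d_same(image, k) for k in FIXED_FILTERS]
    learned_maps = [conv2d_same(image, k) for k in learnable_kernels]
    return fixed_maps + learned_maps
```

With the seventeen fixed filters and eight learnable kernels described above, the concatenated output would have twenty-five channels; the sketch produces eleven because it uses only three stand-in filters.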
In step S2, as shown in fig. 2, the original features are input to a second processing module (the self-attention module in fig. 2) based on a self-attention mechanism and a residual structure to obtain classification features. The self-attention mechanism enhances the network's ability to refine and analyze shallow texture features that are invisible to the naked eye, and the residual connection supplements lost feature information, further enriching the basis for identification.
In some embodiments, in the step S2, the method for inputting the original feature into a second processing module based on a self-attention mechanism and a residual structure to obtain the classification feature includes:
As shown in fig. 5, the original features are input into a second convolution pooling layer to obtain processing features, the processing features are input into a self-attention operation, and the result of the self-attention operation is added element-wise to the processing features to obtain the classification features. The residual mechanism supplements information lost during the operation, which enhances the identification effect.
V of the self-attention operation is the processing features; the weights assigned to V are computed from Q and K using a softmax layer, and the weighted average then yields the shallow texture features captured by the self-attention mechanism.
Specifically, the second convolution pooling layer comprises, in order, a 3×3 convolution layer, a regularization layer, a ReLU layer, a 3×3 convolution layer, a regularization layer, a ReLU layer, and a max pooling layer.
The self-attention mechanism is applied to the processing features to capture the correlations among shallow texture features at the pixel level.
Before self-attention is performed, the data must be reshaped into the form (number of pixels, number of information channels). The self-attention operation is then performed on the reshaped data.
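The reshaping and the residual self-attention of step S2 can be sketched in plain Python as below. The text does not say how Q and K are produced, so this sketch assumes they are linear projections of the processing features through hypothetical matrices `w_q` and `w_k`; the division by the square root of the projection width is the usual scaled dot-product convention and is likewise an assumption. V is the processing features themselves, as stated above.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x m) and b (m x p)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def flatten_pixels(feature_map: List[Matrix]) -> Matrix:
    """Reshape (channels, height, width) into (pixels, channels)."""
    channels = len(feature_map)
    h, w = len(feature_map[0]), len(feature_map[0][0])
    return [[feature_map[c][y][x] for c in range(channels)]
            for y in range(h) for x in range(w)]


def self_attention_residual(tokens: Matrix, w_q: Matrix, w_k: Matrix) -> Matrix:
    """Attention over pixels with V = tokens, plus a residual addition."""
    q = matmul(tokens, w_q)
    k = matmul(tokens, w_k)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    attended = matmul(weights, tokens)  # weighted average of V
    # Residual connection: add the attention result back onto the input.
    return [[a + t for a, t in zip(a_row, t_row)]
            for a_row, t_row in zip(attended, tokens)]
```

Because V is not projected, the attention output has the same shape as the processing features, which is what allows the element-wise residual addition without any further reshaping.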
In step S3, the classification features are input into a classification network (the spherical loss classification module in fig. 2), which outputs only a "true" or "false" result.
In some embodiments, in said step S3, classification is optimized using a high-dimensional spherical boundary objective function in said classification network.
Specifically, identifying forged images can be regarded as a binary classification problem in which only a "true" or "false" result is output. To improve the accuracy and effectiveness of classification, the invention introduces a high-dimensional spherical boundary objective function in the classification layer. The high-dimensional spherical boundary objective function is built mainly around an intra-group similarity stride and an inter-group similarity stride. The invention places more emphasis on intra-group similarity and designs a dedicated weight w for it.
The high-dimensional spherical boundary objective function is an objective function that adaptively changes the optimization stride during training. The stride update rule is as follows:
At the beginning, separate thresholds are set for the intra-group similarity and the inter-group similarity.
When a sample is input, the intra-group similarity and the inter-group similarity of the current sample are calculated first. The intra-group similarity stride and the inter-group similarity stride are then calculated. The intra-group similarity stride is the weight w multiplied by the difference obtained by subtracting the calculated intra-group similarity from the intra-group similarity threshold. The inter-group similarity stride is the difference obtained by subtracting the inter-group similarity threshold from the calculated inter-group similarity. If either stride is less than 0, it is set to 0, so that both strides always remain non-negative.
When the metrics are calculated per group, the intra-group similarity loss and the inter-group similarity loss are each multiplied by their respective stride before being accumulated. Thus, during gradient descent, different data are updated with different strides.
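The stride update rule above can be expressed as a short Python sketch. The function and parameter names are chosen here for illustration only; the text does not define the exact form of the intra-group and inter-group loss terms, so they are taken as plain inputs, and accumulating the two weighted terms is assumed to mean summing them.

```python
from typing import Tuple


def similarity_strides(intra_sim: float, inter_sim: float,
                       intra_threshold: float, inter_threshold: float,
                       w: float) -> Tuple[float, float]:
    """Return (intra-group stride, inter-group stride), both clamped at 0."""
    # Intra-group: weight w times (threshold minus calculated similarity).
    intra_stride = max(0.0, w * (intra_threshold - intra_sim))
    # Inter-group: calculated similarity minus threshold.
    inter_stride = max(0.0, inter_sim - inter_threshold)
    return intra_stride, inter_stride


def stride_weighted_loss(intra_loss: float, inter_loss: float,
                         intra_stride: float, inter_stride: float) -> float:
    """Scale each loss term by its stride, then sum (assumed accumulation)."""
    return intra_stride * intra_loss + inter_stride * inter_loss
```

A sample whose intra-group similarity already exceeds its threshold and whose inter-group similarity is already below its threshold gets two zero strides and so contributes nothing to the gradient, while a sample far from both thresholds is weighted more heavily.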
After the model is trained with the high-dimensional spherical boundary objective function, it can perform forgery identification on an input picture to determine whether the picture was generated by a large model.
In summary, the scheme provided by the invention uses a network structure that combines a self-attention mechanism with a residual structure, which enhances the network's ability to refine and analyze shallow texture features that are invisible to the naked eye; the residual connection supplements lost feature information and further enriches the basis for identification.
In the classification based on the high-dimensional spherical boundary objective function, enlarging the training stride for intra-group similarity helps the model pay more attention to the intra-group similarity threshold. The method thereby addresses the problems that the prior art cannot use the shallow texture information of an input picture and that its loss function is too simple to change dynamically with the input data.
A second aspect of the invention discloses a large model generation image authentication system. FIG. 6 is a block diagram of a large model generation image authentication system according to an embodiment of the present invention; as shown in fig. 6, the system 100 includes:
A first processing module 101 configured to input the generated image into a first processing module based on residual filtering, resulting in an original feature;
A second processing module 102 configured to input the original features into a second processing module based on a self-attention mechanism and a residual structure, resulting in classification features;
The third processing module 103 is configured to input the classification feature into a classification network and output only a result of "true" or "false".
According to the system of the second aspect of the present invention, the first processing module 101 is specifically configured to input the generated image into the first processing module based on residual filtering to obtain the original features as follows:
As shown in fig. 3, the generated image is input into the residual filters and the convolution kernels respectively, then the processing results of the residual filters and the convolution kernels are combined, and finally the combined result is input into a first convolution pooling layer to obtain the original features.
There are seventeen residual filters and eight convolution kernels; the values of the seventeen residual filters are fixed and do not change during learning, whereas the parameters of the eight convolution kernels are learned during training.
The first convolution pooling layer is a convolution pooling layer having a convolution layer and a pooling layer, wherein a residual mechanism is applied to the convolution layer.
Specifically, the convolution pooling layer here comprises a 3×3 convolution layer, a regularization layer, a ReLU layer, and a max pooling layer.
FIG. 4 shows exemplary initialization values of the seventeen residual filters; these filters can efficiently extract residual information from an image.
According to the system of the second aspect of the present invention, the second processing module 102 is specifically configured to input the original features into the second processing module based on a self-attention mechanism and a residual structure to obtain the classification features as follows:
As shown in fig. 5, the original features are input into a second convolution pooling layer to obtain processing features, the processing features are input into a self-attention operation, and the result of the self-attention operation is added element-wise to the processing features to obtain the classification features. The residual mechanism supplements information lost during the operation, which enhances the identification effect.
V of the self-attention operation is the processing features; the weights assigned to V are computed from Q and K using a softmax layer, and the weighted average then yields the shallow texture features captured by the self-attention mechanism.
Specifically, the second convolution pooling layer comprises, in order, a 3×3 convolution layer, a regularization layer, a ReLU layer, a 3×3 convolution layer, a regularization layer, a ReLU layer, and a max pooling layer.
The self-attention mechanism is applied to the processing features to capture the correlations among shallow texture features at the pixel level.
Before self-attention is performed, the data must be reshaped into the form (number of pixels, number of information channels). The self-attention operation is then performed on the reshaped data.
According to the system of the second aspect of the present invention, the third processing module 103 is specifically configured to optimize classification using a high-dimensional spherical boundary objective function in the classification network.
Specifically, identifying forged images can be regarded as a binary classification problem in which only a "true" or "false" result is output. To improve the accuracy and effectiveness of classification, the invention introduces a high-dimensional spherical boundary objective function in the classification layer. The high-dimensional spherical boundary objective function is built mainly around an intra-group similarity stride and an inter-group similarity stride. The invention places more emphasis on intra-group similarity and designs a dedicated weight w for it.
The high-dimensional spherical boundary objective function is an objective function that adaptively changes the optimization stride during training. The stride update rule is as follows:
At the beginning, separate thresholds are set for the intra-group similarity and the inter-group similarity.
When a sample is input, the intra-group similarity and the inter-group similarity of the current sample are calculated first. The intra-group similarity stride and the inter-group similarity stride are then calculated. The intra-group similarity stride is the weight w multiplied by the difference obtained by subtracting the calculated intra-group similarity from the intra-group similarity threshold. The inter-group similarity stride is the difference obtained by subtracting the inter-group similarity threshold from the calculated inter-group similarity. If either stride is less than 0, it is set to 0, so that both strides always remain non-negative.
When the metrics are calculated per group, the intra-group similarity loss and the inter-group similarity loss are each multiplied by their respective stride before being accumulated. Thus, during gradient descent, different data are updated with different strides.
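The stride rule can also be written compactly in mathematical form. The symbols below are notation assumed here for illustration and do not appear in the original text: $s_{\text{intra}}$ and $s_{\text{inter}}$ are the calculated similarities, $\tau_{\text{intra}}$ and $\tau_{\text{inter}}$ are the thresholds, $\Delta_{\text{intra}}$ and $\Delta_{\text{inter}}$ are the strides, and $\ell_{\text{intra}}$ and $\ell_{\text{inter}}$ are the two loss terms, whose exact form the text leaves unspecified. Summation of the weighted terms is an assumed reading of the accumulation step.

```latex
\begin{aligned}
\Delta_{\text{intra}} &= \max\bigl(0,\; w\,(\tau_{\text{intra}} - s_{\text{intra}})\bigr), \\
\Delta_{\text{inter}} &= \max\bigl(0,\; s_{\text{inter}} - \tau_{\text{inter}}\bigr), \\
\mathcal{L} &= \Delta_{\text{intra}}\,\ell_{\text{intra}} + \Delta_{\text{inter}}\,\ell_{\text{inter}}.
\end{aligned}
```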
After the model is trained with the high-dimensional spherical boundary objective function, it can perform forgery identification on an input picture to determine whether the picture was generated by a large model.
A third aspect of the invention discloses an electronic device. The electronic device comprises a memory and a processor, the memory stores a computer program, and the processor implements the steps in a large model generation image authentication method according to any one of the first aspect of the disclosure when executing the computer program.
Fig. 7 is a block diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 7, the electronic device includes a processor, a memory, a communication interface, a display screen, and an input device connected through a system bus. The processor of the electronic device provides computing and control capabilities. The memory of the electronic device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The communication interface of the electronic device is used for wired or wireless communication with an external terminal; the wireless communication can be achieved through Wi-Fi, an operator network, Near Field Communication (NFC), or other technologies. The display screen of the electronic device can be a liquid crystal display screen or an electronic ink display screen. The input device of the electronic device can be a touch layer covering the display screen; keys, a track ball, or a touch pad arranged on the housing of the electronic device; or an external keyboard, touch pad, mouse, or the like.
It will be appreciated by those skilled in the art that the structure shown in fig. 7 is merely a block diagram of a portion related to the technical solution of the present disclosure, and does not constitute a limitation of the electronic device to which the technical solution of the present disclosure is applied, and that a specific electronic device may include more or less components than those shown in the drawings, or may combine some components, or have different component arrangements.
A fourth aspect of the invention discloses a computer-readable storage medium. The computer-readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the large model generation image identification method of any implementation of the first aspect of the present disclosure.
Note that the technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be regarded as falling within the scope of this description. The foregoing examples illustrate only a few embodiments of the application; although they are described in detail, they are not to be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of the application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.
Claims (8)
1. A method for large model generation image authentication, the method comprising:
S1, inputting a generated image into a first processing module based on residual filtering to obtain original features;
S2, inputting the original features into a second processing module based on a self-attention mechanism and a residual structure to obtain classification features;
inputting the original features into a second convolution pooling layer to obtain processing features, inputting the processing features into a self-attention operation, and numerically adding the result of the self-attention operation to the processing features to obtain the classification features;
S3, inputting the classification features into a binary classification network, and outputting a result that is only "true" or "false";
optimizing the classification by using a high-dimensional spherical boundary objective function in the binary classification network;
The high-dimensional spherical boundary objective function is an objective function for adaptively changing the optimization stride in the training process;
the stride updating rule is as follows:
at the beginning, respective thresholds are set for the intra-group similarity and the inter-group similarity; when a sample is input, the intra-group similarity and the inter-group similarity of the current sample are first calculated; the intra-group similarity stride and the inter-group similarity stride are then calculated; the intra-group similarity stride is the product of a weight w and the difference obtained by subtracting the calculated intra-group similarity from the intra-group similarity threshold; the inter-group similarity stride is the difference obtained by subtracting the inter-group similarity threshold from the calculated inter-group similarity;
if the intra-group similarity stride or the inter-group similarity stride is smaller than 0, that stride is set to 0, so that both strides are always kept non-negative; when the measurement indexes are calculated by group, the intra-group similarity loss and the inter-group similarity loss are each multiplied by their respective stride and then accumulated; when solving by gradient descent, different data can thus be updated with different strides.
2. The large model generation image authentication method according to claim 1, wherein in said step S1, inputting the generated image into the first processing module based on residual filtering to obtain the original features includes:
inputting the generated image into residual filters and convolution kernels respectively, combining the processing results of the residual filters and the convolution kernels, and finally inputting the combined result into a first convolution pooling layer to obtain the original features.
3. The large model generation image authentication method according to claim 2, wherein in said step S1, there are seventeen residual filters and eight convolution kernels; the values of the seventeen residual filters are fixed and are not changed during learning, whereas the parameters of the eight convolution kernels are learned during training.
4. The large model generation image authentication method according to claim 2, wherein in said step S1, said first convolution pooling layer is a convolution pooling layer comprising a convolution layer and a pooling layer with a residual mechanism.
5. The large model generation image authentication method according to claim 1, wherein in said step S2, V of the self-attention operation is the processing features; the weights assigned to V are obtained by applying a softmax layer to the result calculated from Q and K, and the weighted average then yields the shallow texture features captured by the self-attention mechanism.
6. A system for large model generation image authentication, the system comprising:
a first processing module, configured to input a generated image into the first processing module based on residual filtering to obtain original features;
a second processing module, configured to input the original features into the second processing module based on a self-attention mechanism and a residual structure to obtain classification features;
inputting the original features into a second convolution pooling layer to obtain processing features, inputting the processing features into a self-attention operation, and numerically adding the result of the self-attention operation to the processing features to obtain the classification features;
a third processing module, configured to input the classification features into a binary classification network and output a result that is only "true" or "false";
optimizing the classification by using a high-dimensional spherical boundary objective function in the binary classification network;
The high-dimensional spherical boundary objective function is an objective function for adaptively changing the optimization stride in the training process;
the stride updating rule is as follows:
at the beginning, respective thresholds are set for the intra-group similarity and the inter-group similarity; when a sample is input, the intra-group similarity and the inter-group similarity of the current sample are first calculated; the intra-group similarity stride and the inter-group similarity stride are then calculated; the intra-group similarity stride is the product of a weight w and the difference obtained by subtracting the calculated intra-group similarity from the intra-group similarity threshold; the inter-group similarity stride is the difference obtained by subtracting the inter-group similarity threshold from the calculated inter-group similarity;
if the intra-group similarity stride or the inter-group similarity stride is smaller than 0, that stride is set to 0, so that both strides are always kept non-negative; when the measurement indexes are calculated by group, the intra-group similarity loss and the inter-group similarity loss are each multiplied by their respective stride and then accumulated; when solving by gradient descent, different data can thus be updated with different strides.
7. An electronic device comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the large model generation image authentication method according to any one of claims 1 to 5.
8. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of a large model generation image authentication method according to any of claims 1 to 5.
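The stride updating rule recited in claims 1 and 6 can be illustrated with a minimal sketch. The function names, argument names, and example values below are hypothetical and are not taken from the patent; the sketch also assumes that the weight w is non-negative and that "accumulated" means a plain sum of the two stride-weighted loss terms. How the similarities and the per-group losses themselves are computed is not specified in the claims and is left to the caller.

```python
def adaptive_strides(intra_sim, inter_sim, intra_thr, inter_thr, w):
    """Return (intra_stride, inter_stride) for one sample.

    intra_stride = w * (intra_thr - intra_sim), clamped at 0
    inter_stride = inter_sim - inter_thr,       clamped at 0
    All names are illustrative; w is assumed to be non-negative.
    """
    intra_stride = max(0.0, w * (intra_thr - intra_sim))
    inter_stride = max(0.0, inter_sim - inter_thr)
    return intra_stride, inter_stride


def weighted_loss(intra_loss, inter_loss, intra_stride, inter_stride):
    """Scale each loss term by its own stride and sum them.

    Summation is an assumption; the claims only say the scaled
    terms are accumulated.
    """
    return intra_stride * intra_loss + inter_stride * inter_loss


# Usage with made-up numbers: a sample whose intra-group similarity is
# below its threshold and whose inter-group similarity is above its
# threshold receives two positive strides.
s_intra, s_inter = adaptive_strides(0.6, 0.5, intra_thr=0.9, inter_thr=0.2, w=2.0)
loss = weighted_loss(1.0, 2.0, s_intra, s_inter)
```

Under these assumptions, a sample that already satisfies both thresholds gets zero strides and contributes nothing to the loss, while samples far from the thresholds are updated with larger strides.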
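The second processing module of claims 1 and 5 (self-attention over the processing features followed by a residual addition) might look roughly like the NumPy sketch below. The claims state only that V is the processing features and that the weights come from Q and K through a softmax; the linear projections `w_q` and `w_k`, the scaling by the square root of the projection width, and the flattening of the features into an (n, d) matrix are assumptions made for illustration, and the second convolution pooling layer is omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention_with_residual(features, w_q, w_k):
    """features: (n, d) processing features; w_q, w_k: (d, d_k).

    w_q and w_k are hypothetical projection matrices; the claims do
    not say how Q and K are derived.
    """
    q = features @ w_q
    k = features @ w_k
    v = features                                   # claim 5: V is the processing features
    scores = q @ k.T / np.sqrt(q.shape[-1])        # scaling is an assumption
    weights = softmax(scores, axis=-1)             # weights assigned to V
    attended = weights @ v                         # weighted average of V
    return attended + features                     # residual (numerical) addition


# Usage with random stand-in data.
rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 8))
out = attention_with_residual(feats, rng.normal(size=(8, 4)), rng.normal(size=(8, 4)))
```

Because V is the unprojected processing features, the attended output has the same shape as the input, which is what allows the element-wise residual addition without any extra projection.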
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311804911.5A CN117788906B (en) | 2023-12-26 | 2023-12-26 | Large model generation image identification method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117788906A (en) | 2024-03-29 |
CN117788906B (en) | 2024-07-05 |
Family ID=90388665
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311804911.5A Active CN117788906B (en) | 2023-12-26 | 2023-12-26 | Large model generation image identification method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117788906B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115082322A (en) * | 2022-07-26 | 2022-09-20 | 腾讯科技(深圳)有限公司 | Image processing method and device, and training method and device of image reconstruction model |
CN117079355A (en) * | 2023-08-29 | 2023-11-17 | 中国信息通信研究院 | Object image fake identifying method and device and electronic equipment |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108830407B (en) * | 2018-05-30 | 2021-07-30 | 华东交通大学 | Sensor distribution optimization method in structure health monitoring under multi-working condition |
US10803387B1 (en) * | 2019-09-27 | 2020-10-13 | The University Of Stavanger | Deep neural architectures for detecting false claims |
CN113094871A (en) * | 2021-03-08 | 2021-07-09 | 国网湖北省电力有限公司电力科学研究院 | Wind power area boundary accurate modeling method based on diamond convex hull set theory |
CN116935253A (en) * | 2022-03-29 | 2023-10-24 | 上海电力大学 | Human face tampering detection method based on residual error network combined with space-time attention mechanism |
CN114973364A (en) * | 2022-05-23 | 2022-08-30 | 北京影数科技有限公司 | Depth image false distinguishing method and system based on face region attention mechanism |
CN115100516A (en) * | 2022-06-07 | 2022-09-23 | 北京科技大学 | Relation learning-based remote sensing image target detection method |
CN115116092A (en) * | 2022-06-30 | 2022-09-27 | 中原工学院 | Intelligent true and false pedestrian identification method based on human eye stereoscopic vision and bionic model |
CN115170933A (en) * | 2022-08-18 | 2022-10-11 | 西安电子科技大学 | Digital image forged area positioning method based on double-current deep neural network |
CN116739071A (en) * | 2023-05-16 | 2023-09-12 | 华为技术有限公司 | Model training method and related device |
CN116704580A (en) * | 2023-06-09 | 2023-09-05 | 成都信息工程大学 | Face counterfeiting detection method based on depth information decoupling |
- 2023-12-26: CN application CN202311804911.5A, patent CN117788906B, status Active
Also Published As
Publication number | Publication date |
---|---|
CN117788906A (en) | 2024-03-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Rahmouni et al. | Distinguishing computer graphics from natural images using convolution neural networks | |
Singh et al. | Image classification: a survey | |
Niu et al. | Facial expression recognition with LBP and ORB features | |
Che et al. | How is gaze influenced by image transformations? dataset and model | |
Zhou et al. | A lightweight convolutional neural network for real-time facial expression detection | |
Han et al. | Two-stage learning to predict human eye fixations via SDAEs | |
CN109146856A (en) | Picture quality assessment method, device, computer equipment and storage medium | |
Do et al. | Deep neural network-based fusion model for emotion recognition using visual data | |
CN111738243B (en) | Method, device and equipment for selecting face image and storage medium | |
US11809519B2 (en) | Semantic input sampling for explanation (SISE) of convolutional neural networks | |
CN115620384B (en) | Model training method, fundus image prediction method and fundus image prediction device | |
CN112818774A (en) | Living body detection method and device | |
CN111814682A (en) | Face living body detection method and device | |
CN111860056B (en) | Blink-based living body detection method, blink-based living body detection device, readable storage medium and blink-based living body detection equipment | |
Liang et al. | Fixation prediction for advertising images: Dataset and benchmark | |
Balaji et al. | Temporally coherent video anonymization through GAN inpainting | |
CN117788906B (en) | Large model generation image identification method and system | |
CN116469177A (en) | Living body target detection method with mixed precision and training method of living body detection model | |
CN116823983A (en) | One-to-many style handwriting picture generation method based on style collection mechanism | |
CN104598866B (en) | A kind of social feeling quotrient based on face promotes method and system | |
CN115731620A (en) | Method for detecting counter attack and method for training counter attack detection model | |
JP6947460B1 (en) | Programs, information processing equipment, and methods | |
Tiwari et al. | Personality prediction from Five-Factor Facial Traits using Deep learning | |
Dhar et al. | Detecting deepfake images using deep convolutional neural network | |
CN114049676A (en) | Fatigue state detection method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||