CN116778306A - Fake object detection method, related device and storage medium

Fake object detection method, related device and storage medium

Info

Publication number
CN116778306A
CN116778306A
Authority
CN
China
Prior art keywords
feature
prototype
counterfeit
target
sample
Prior art date
Legal status
Pending
Application number
CN202310768511.7A
Other languages
Chinese (zh)
Inventor
Not published at the inventor's request
Current Assignee
Beijing Real AI Technology Co Ltd
Original Assignee
Beijing Real AI Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Real AI Technology Co Ltd filed Critical Beijing Real AI Technology Co Ltd
Priority to CN202310768511.7A priority Critical patent/CN116778306A/en
Publication of CN116778306A publication Critical patent/CN116778306A/en
Pending legal-status Critical Current


Abstract

The embodiments of the present application disclose a fake object detection method, a related device and a storage medium. The method comprises the following steps: acquiring a target object feature of a target object in data to be detected; determining a first feature distance set, a second feature distance set and a third feature distance set, wherein the first feature distance set contains the feature distances between the target object feature and each of at least one real prototype, the second feature distance set contains the feature distances between the target object feature and each of at least two unique counterfeit prototypes, and the third feature distance set contains the feature distances between the target object feature and each of at least one common counterfeit prototype; and if the target object is determined to be a fake object based on the first, second and third feature distance sets, determining the counterfeit type of the target object according to the second feature distance set. This scheme can improve both the accuracy and the processing efficiency of fake object detection.

Description

Fake object detection method, related device and storage medium
Technical Field
The embodiment of the application relates to the technical field of data transmission, in particular to a fake object detection method, a related device and a storage medium.
Background
With the rapid development of artificial intelligence generated content (Artificial Intelligence Generated Content, AIGC) technology, and in particular the continuous iteration of generative adversarial networks (Generative Adversarial Networks, GANs) and diffusion models (Diffusion Model), the results of deep face forgery technology have become increasingly realistic, making forged content difficult to distinguish and bringing many potential risks and threats, such as fake news, network fraud and privacy infringement.
In view of the abuse risk of deep face forgery technologies, there is an increasing need for detection of and defense against deep forged faces. For example, social media platforms need to detect and curb the spread of deep forged content, financial institutions need to detect and defend against the use of deep face forgery in network fraud, and government and legal departments need to identify falsified evidence.
Many fake face detection techniques have been proposed to identify whether a face in an image or video has been tampered with and to output a real/fake classification label. Most of these methods are data-driven deep learning approaches that design more advanced network structures or loss functions to improve detection performance. However, outputting only a binary real/fake label cannot meet the requirements of some scenarios; for example, for maliciously or illegally forged faces, an administrator needs to determine how the forged content was produced.
In the prior art, the real/fake classification task and the forgery-method tracing task are mostly studied separately, and the forgery-method tracing model has to be cascaded after the real/fake classification model: the face to be detected is first fed into the deep forgery classification model to judge whether it is forged, and is then fed into the forgery-method tracing model to judge with which forgery method it was produced.
Disclosure of Invention
The embodiment of the application provides a fake object detection method, a related device and a storage medium, which can improve the accuracy of fake object detection and the processing efficiency of fake object detection.
In a first aspect, an embodiment of the present application provides a method for detecting a counterfeit object, including:
acquiring target object characteristics of a target object in data to be detected;
determining a first feature distance set, a second feature distance set and a third feature distance set, wherein the first feature distance set is the feature distance between the target object feature and each of at least one real prototype, the second feature distance set is the feature distance between the target object feature and each of at least two unique counterfeit prototypes, and the third feature distance set is the feature distance between the target object feature and each of at least one common counterfeit prototype;
And if the target object is determined to be a fake object based on the first characteristic distance set, the second characteristic distance set and the third characteristic distance set, determining the fake type of the target object according to the second characteristic distance set.
In some embodiments, the fourth loss value is determined according to a fourth formula:
where the left-hand side of the fourth formula is the fourth loss value, d(z_cls, m_ij) denotes the feature distance between z_cls and m_ij, z_cls denotes the second feature, m_ij denotes a prototype in the prototype set, i=0 indicates that m_ij is a real prototype, i=1 indicates that m_ij is a unique counterfeit prototype, j denotes the prototype index of a prototype in the prototype set, x is the target sample, γ denotes a coefficient controlling the difficulty of the learning task, and p(y_c | x) denotes the probability that the target sample belongs to y_c, where y_c=0 when the binary label of the target sample is a real sample and y_c=y_m when the binary label of the target sample is a counterfeit sample, y_m denoting the label of the counterfeit type.
In some embodiments, the fifth loss value is determined according to a fifth formula:
where the left-hand side of the fifth formula is the fifth loss value, d(z_cls, m_ij) denotes the feature distance between z_cls and m_ij, z_cls denotes the second feature, m_ij denotes a prototype in the prototype set, i=0 indicates that m_ij is a real prototype, i=2 indicates that m_ij is a common counterfeit prototype, j denotes the prototype index of a prototype in the prototype set, x is the target sample, γ denotes a coefficient controlling the difficulty of the learning task, p(y_b | x) denotes the probability that the target sample belongs to label y_b, and y_b denotes the binary label of the target sample.
In some embodiments, the data to be detected includes at least one of audio data, video data, and image data.
In some embodiments, the data to be detected includes at least one of video data and image data, and the target object is a human face.
In a second aspect, an embodiment of the present application further provides a counterfeit object detection device, including:
the receiving and transmitting module is used for acquiring target object characteristics of a target object in the data to be detected;
the processing module is used for determining a first characteristic distance set, a second characteristic distance set and a third characteristic distance set, wherein the first characteristic distance set is the characteristic distance between the target object characteristic and each real prototype in at least one real prototype, the second characteristic distance set is the characteristic distance between the target object characteristic and each unique forged prototype in at least two unique forged prototypes, and the third characteristic distance set is the characteristic distance between the target object characteristic and each common forged prototype in at least one common forged prototype; and if the target object is determined to be a fake object based on the first characteristic distance set, the second characteristic distance set and the third characteristic distance set, determining the fake type of the target object according to the second characteristic distance set.
In some embodiments, the processing module determines whether the target object is a counterfeit object, in particular, according to the following:
determining an average feature distance of feature distances in the first feature distance set, and determining a minimum feature distance from the second feature distance set and the third feature set; and determining whether the target object is a fake object according to the average characteristic distance and the minimum characteristic distance.
In some embodiments, the processing module is specifically configured to, when performing the step of determining whether the target object is a counterfeit object according to the average feature distance and the minimum feature distance:
determining a first probability that the target object is a real object according to the average feature distance, and determining a second probability that the target object is a counterfeit object according to the minimum feature distance; if the first probability is larger than the second probability, determining that the target object is a real object; and if the first probability is smaller than or equal to the second probability, determining that the target object is a fake object.
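For illustration only, the following is a minimal sketch of how the decision rule above could be realized, assuming Euclidean feature distances and a two-way softmax over the average real-prototype distance and the minimum counterfeit-prototype distance; none of these choices are stated by the embodiment itself.

```python
import numpy as np

def classify_real_or_fake(target_feat, real_protos, unique_protos, common_protos, gamma=1.0):
    """Toy sketch of the decision rule above (assumed Euclidean distances and softmax conversion)."""
    d_real = np.linalg.norm(real_protos - target_feat, axis=1)      # first feature distance set
    d_unique = np.linalg.norm(unique_protos - target_feat, axis=1)  # second feature distance set
    d_common = np.linalg.norm(common_protos - target_feat, axis=1)  # third feature distance set

    avg_real = d_real.mean()                                        # average distance to the real prototypes
    min_fake = np.concatenate([d_unique, d_common]).min()           # minimum distance over counterfeit prototypes

    # Assumed conversion from distances to probabilities: smaller distance -> larger probability.
    logits = -gamma * np.array([avg_real, min_fake])
    p_real, p_fake = np.exp(logits) / np.exp(logits).sum()

    is_fake = p_real <= p_fake                                      # first probability vs. second probability
    fake_type = int(d_unique.argmin()) if is_fake else None         # counterfeit type traced from the second set
    return is_fake, fake_type, (float(p_real), float(p_fake))
```

Note that the counterfeit type is read off directly from the already computed second feature distance set, which is the efficiency point made in the summary above.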
In some embodiments, the fake object detection apparatus is implemented based on a fake object detection model that includes a feature encoder, a prototype set, a detection tracing module, and a prototype update module;
Wherein the prototype set comprises at least one preset true prototype, at least two preset unique counterfeit prototypes and at least one preset common counterfeit prototype;
the feature encoder is used for acquiring the target object features of the target object;
the detection traceability module is used for determining whether a target object is a fake object according to the target object characteristics and the prototype set, and determining the fake type of the target object when the target object is the fake object;
the prototype updating module is configured to perform prototype updating processing on at least one preset true prototype, at least two preset unique counterfeit prototypes, and at least one preset common counterfeit prototype in a training stage of the counterfeit object detection model, so as to obtain at least one true prototype, at least two counterfeit prototypes, and at least one common counterfeit prototype.
In some embodiments, before the step of obtaining the target object characteristic of the target object in the data to be detected, the processing module is further configured to:
obtaining a target sample through the transceiver module, wherein the target sample is from a real sample set and a fake sample set;
Acquiring a first feature and a second feature of the target sample, wherein the first feature comprises a real feature, a unique counterfeit feature and a common counterfeit feature;
determining a prototype updating loss value according to the first characteristic, and determining a prototype detecting traceability loss value according to the second characteristic;
if the target loss value does not meet the preset convergence condition, acquiring a new sample from the real sample set and the fake sample set, taking the new sample as the target sample until the target loss value meets the preset convergence condition, and obtaining a trained fake object detection model, wherein the target loss value is determined according to the prototype updated loss value and the prototype detection traceability loss value.
In some embodiments, the processing module is specifically configured to, when performing the step of obtaining the first feature and the second feature of the target sample:
determining the first and second features of the target sample from the feature encoder;
the first feature is divided into a plurality of real features, a plurality of unique counterfeit features, and a plurality of common counterfeit features according to a multi-headed cross-attention sub-module and the prototype set, the prototype update module comprising the multi-headed cross-attention sub-module.
In some embodiments, the processing module is specifically configured to, when performing the step of determining a prototype-updated loss value according to the first feature and determining a prototype-detected trace-source loss value according to the second feature:
determining, by the prototype updating module, the prototype updating loss value according to the plurality of real features, the plurality of unique counterfeit features, the plurality of common counterfeit features, and the prototype set, and determining, by the detection tracing module, the detection tracing loss value according to the second feature.
In some embodiments, the prototype-updating module further comprises a fake-method discriminator and a true-false discriminator; the prototype updated loss value includes a first loss value, a second loss value, and a third loss value; the processing module is specifically configured to, when executing the step of determining a prototype update loss value according to the first feature:
determining target genuine features of each genuine prototype from a plurality of the genuine features according to the sample tag of each genuine feature, determining target unique counterfeit features of each unique counterfeit prototype from a plurality of the unique counterfeit features according to the sample tag of each unique counterfeit feature, and determining target common counterfeit features of each common counterfeit prototype from a plurality of the common counterfeit features according to the sample tag of each common counterfeit feature;
Determining the first loss value from the prototype set, the target authentic signature, the target-unique counterfeit signature, and the target-commonality counterfeit signature;
determining the second loss value according to the target commonality counterfeit feature and the counterfeit method discriminator;
the third loss value is determined according to the sample tag of the common counterfeit feature, the common counterfeit feature and the true-false discriminator.
In some embodiments, the first loss value is determined according to a first formula:
where the left-hand side of the first formula is the first loss value, I denotes an indicator function used to select the corresponding z_a and m_ij according to the sample label, z_a is the target real feature, the target unique counterfeit feature or the target common counterfeit feature, m_ij denotes a prototype in the prototype set, i=0 indicates that m_ij is a real prototype, i=1 indicates that m_ij is a unique counterfeit prototype, i=2 indicates that m_ij is a common counterfeit prototype, j denotes the prototype index of a prototype in the prototype set, and d(I(z_a), I(m_ij)) denotes the feature distance between I(z_a) and I(m_ij).
In some embodiments, the second loss value is determined according to a second formula:
where the left-hand side of the second formula is the second loss value, x denotes the target sample, f denotes the feature encoder, g denotes the multi-head cross-attention sub-module, D_m denotes the counterfeiting method discriminator, f(x) denotes the second feature, I_[y_b=1] denotes an indicator function used to select, from the common counterfeit features, the target common counterfeit features whose sample label indicates a counterfeit object, y_b denotes the binary label, y_b=1 indicates that the binary label is a counterfeit object, N denotes the number of target samples, the expectation is taken over x, y_b and y_m, y_m denotes the label of the counterfeit type, θ_f denotes the parameters of the feature encoder, θ_g denotes the parameters of the multi-head cross-attention sub-module, and θ_{D_m} denotes the parameters of the counterfeiting method discriminator.
In some embodiments, the third loss value is determined according to a third formula:
where the left-hand side of the third formula is the third loss value, which is computed from the probability output by the true-false discriminator when the common counterfeit feature is input into it.
In some embodiments, the detected traceable loss value includes a fourth loss value and a fifth loss value; the processing module is specifically configured to, when executing the step of determining a prototype detection traceability loss value according to the second feature:
Determining a first sample feature distance set, a second sample feature distance set and a third sample feature distance set through the detection traceability module, wherein the first sample feature distance set comprises feature distances between the second feature and each of at least one preset true prototype, the second sample feature distance set comprises feature distances between the second feature and each of at least two preset unique counterfeit prototypes, and the third sample feature distance set comprises feature distances between the second feature and each of at least one preset common counterfeit prototypes;
determining the fourth loss value of the target sample from the first sample feature distance set and the second sample feature distance set;
the fifth loss value of the target sample is determined from the first set of sample feature distances and the third set of sample feature distances.
In some embodiments, the fourth loss value is determined according to a fourth formula:
where the left-hand side of the fourth formula is the fourth loss value, d(z_cls, m_ij) denotes the feature distance between z_cls and m_ij, z_cls denotes the second feature, m_ij denotes a prototype in the prototype set, i=0 indicates that m_ij is a real prototype, i=1 indicates that m_ij is a unique counterfeit prototype, j denotes the prototype index of a prototype in the prototype set, x is the target sample, γ denotes a coefficient controlling the difficulty of the learning task, and p(y_c | x) denotes the probability that the target sample belongs to y_c, where y_c=0 when the binary label of the target sample is a real sample and y_c=y_m when the binary label of the target sample is a counterfeit sample, y_m denoting the label of the counterfeit type.
In some embodiments, the fifth loss value is determined according to a fifth formula:
where the left-hand side of the fifth formula is the fifth loss value, d(z_cls, m_ij) denotes the feature distance between z_cls and m_ij, z_cls denotes the second feature, m_ij denotes a prototype in the prototype set, i=0 indicates that m_ij is a real prototype, i=2 indicates that m_ij is a common counterfeit prototype, j denotes the prototype index of a prototype in the prototype set, x is the target sample, γ denotes a coefficient controlling the difficulty of the learning task, p(y_b | x) denotes the probability that the target sample belongs to label y_b, and y_b denotes the binary label of the target sample.
In some embodiments, the data to be detected includes at least one of audio data, video data, and image data.
In some embodiments, the data to be detected includes at least one of video data and image data, and the target object is a human face.
In a third aspect, an embodiment of the present application further provides a computer device, which includes a memory and a processor, where the memory stores a computer program, and the processor implements the method when executing the computer program.
In a fourth aspect, embodiments of the present application also provide a computer readable storage medium storing a computer program comprising program instructions which, when executed by a processor, implement the above-described method.
In a fifth aspect, an embodiment of the present application provides a chip, where the chip includes a transceiver coupled to a terminal device, for executing the technical solution provided in the first aspect of the embodiment of the present application.
In a sixth aspect, an embodiment of the present application provides a chip system, where the chip system includes a processor for supporting a terminal device to implement the functions involved in the first aspect, for example, generating or processing information involved in the counterfeit object detection method provided in the first aspect. In one possible design, the above chip system further includes a memory for holding program instructions and data necessary for the terminal. The chip system may be formed of a chip or may include a chip and other discrete devices.
In a seventh aspect, an embodiment of the present application provides a computer program product including instructions, which when executed on a computer, cause the computer to perform the method for detecting a counterfeit object provided in the first aspect, and also achieve the beneficial effects of the method for detecting a counterfeit object provided in the first aspect.
Compared with the prior art, in the scheme provided by the embodiments of the present application, on the one hand, because the counterfeit prototypes include a plurality of unique counterfeit prototypes representing unique counterfeit features and at least one common counterfeit prototype representing common counterfeit features, both the unique counterfeit features and the common counterfeit features in the target object can be identified, so that the counterfeit features of the target object are identified more comprehensively; the target object can then be classified as real or fake on the basis of these more comprehensive counterfeit features, which improves the classification effect of the real/fake classification. On the other hand, when the target object is identified as a counterfeit object, the counterfeit type of the target object is further determined by reusing the second feature distance set already computed during the preceding classification task, so no additional feature distance calculation is needed, which improves the processing efficiency of counterfeit object detection.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of an application scenario of a fake object detection method according to an embodiment of the present application;
fig. 2 is a schematic diagram of a training flow of a fake object detection model in a fake object detection method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a training sub-process of a fake object detection model in a fake object detection method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of another training sub-process of a fake object detection model in the fake object detection method according to the embodiment of the present application;
FIG. 5 is a schematic diagram of another training sub-process of a fake object detection model in the fake object detection method according to the embodiment of the present application;
fig. 6 is a schematic diagram of a training flow based on a specific framework of a fake object detection model in the fake object detection method according to the embodiment of the present application;
Fig. 7 is a schematic flow chart of a counterfeit object detection method according to an embodiment of the present application;
FIG. 8 is a schematic view of a sample feature space visualization provided by an embodiment of the present application;
fig. 9 is a schematic block diagram of a counterfeit object detection device provided by an embodiment of the present application;
FIG. 10 is a schematic diagram of a server according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of a terminal device in an embodiment of the present application;
fig. 12 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
The terms "first", "second" and the like in the description, the claims and the drawings of the embodiments of the application are used to distinguish similar objects and are not necessarily used to describe a particular sequence or chronological order. It should be understood that data so used may be interchanged where appropriate, so that the embodiments described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms "comprises", "comprising" and any variations thereof are intended to cover a non-exclusive inclusion, so that a process, method, system, article or apparatus that comprises a list of steps or modules is not necessarily limited to those explicitly listed, but may include other steps or modules not expressly listed or inherent to such process, method, article or apparatus. The division of modules in the embodiments of the application is only one kind of logical division, and other divisions are possible in actual implementation; for example, multiple modules may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the coupling, direct coupling or communication connection between modules shown or discussed may be realized through interfaces, and the indirect coupling or communication connection between modules may be electrical or take other similar forms, none of which is limited in the embodiments of the application. The modules or sub-modules described as separate components may or may not be physically separate, may or may not be physical modules, and may be distributed over a plurality of circuit modules; some or all of them may be selected according to actual needs to achieve the purposes of the embodiments of the present application.
The embodiment of the application provides a fake object detection method, a related device and a storage medium, wherein an execution subject of the fake object detection method can be the fake object detection device provided by the embodiment of the application or a computer device integrated with the fake object detection device, wherein the fake object detection device can be realized in a hardware or software mode, and the computer device can be a terminal or a server.
When the computer device is a server, the server may be an independent physical server, or may be a server cluster or a distributed system formed by a plurality of physical servers, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, basic cloud computing services such as big data and artificial intelligence platforms, and the like.
When the computer device is a terminal, the terminal may include devices with multimedia data processing functions (for example, video data playing functions and music data playing functions), such as smart phones, tablet computers, notebook computers, desktop computers, smart televisions, smart speakers, personal digital assistants (PDAs) and smart watches, but is not limited thereto.
The scheme of the embodiment of the application can be realized based on an artificial intelligence technology, and particularly relates to the technical field of computer vision in the artificial intelligence technology and the fields of cloud computing, cloud storage, databases and the like in the cloud technology, and the technical fields are respectively described below.
Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and extend human intelligence, sense the environment, acquire knowledge and use the knowledge to obtain optimal results. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new intelligent machine that can react in a similar way to human intelligence. Artificial intelligence, i.e. research on design principles and implementation methods of various intelligent machines, enables the machines to have functions of sensing, reasoning and decision.
The artificial intelligence technology is a comprehensive subject, and relates to the technology with wide fields, namely the technology with a hardware level and the technology with a software level. Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
Computer Vision (CV) is the science of studying how to make machines "see"; more specifically, it refers to using cameras and computers instead of human eyes to perform machine vision tasks such as recognition, tracking and measurement on targets, and to further perform graphic processing so that the computer produces images more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and technologies in an attempt to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision technology typically includes image processing, model robustness detection, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and map construction, as well as common biometric technologies such as model robustness detection and fingerprint recognition.
With research and advancement of artificial intelligence technology, research and application of artificial intelligence technology is being developed in various fields, such as common smart home, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned, automatic driving, unmanned aerial vehicles, robots, smart medical treatment, smart customer service, etc., and it is believed that with the development of technology, artificial intelligence technology will be applied in more fields and with increasing importance value.
The scheme of the embodiment of the application can be realized based on cloud technology, and particularly relates to the technical fields of cloud computing, cloud storage, databases and the like in the cloud technology, and the technical fields are respectively described below.
Cloud technology refers to a hosting technology that unifies a series of resources such as hardware, software and networks in a wide area network or a local area network to realize the calculation, storage, processing and sharing of data. Cloud technology is a general term for the network technology, information technology, integration technology, management platform technology, application technology and the like applied on the basis of the cloud computing business model; it can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support. Background services of technical network systems, such as video websites, image websites and other portal websites, require a large amount of computing and storage resources. With the rapid development and application of the internet industry, each article may have its own identification mark in the future, which needs to be transmitted to a background system for logical processing; data of different levels will be processed separately, and all kinds of industry data need strong system backing support, which can only be realized through cloud computing. In the embodiments of the present application, the recognition result can be stored through cloud technology.
Cloud storage (cloud storage) is a new concept that extends and develops in the concept of cloud computing, and a distributed cloud storage system (hereinafter referred to as a storage system for short) refers to a storage system that integrates a large number of storage devices (storage devices are also referred to as storage nodes) of various types in a network to work cooperatively through application software or application interfaces through functions such as cluster application, grid technology, and a distributed storage file system, so as to provide data storage and service access functions for the outside. In the embodiment of the application, the information such as network configuration and the like can be stored in the storage system, so that the server can conveniently call the information.
At present, the storage method of the storage system is as follows: when creating logical volumes, each logical volume is allocated a physical storage space, which may be a disk composition of a certain storage device or of several storage devices. The client stores data on a certain logical volume, that is, the data is stored on a file system, the file system divides the data into a plurality of parts, each part is an object, the object not only contains the data but also contains additional information such as a data Identification (ID) and the like, the file system writes each object into a physical storage space of the logical volume, and the file system records storage position information of each object, so that when the client requests to access the data, the file system can enable the client to access the data according to the storage position information of each object.
The process of allocating physical storage space for the logical volume by the storage system specifically includes: physical storage space is divided into stripes in advance according to the set of capacity measures for objects stored on a logical volume (which measures tend to have a large margin with respect to the capacity of the object actually to be stored) and redundant array of independent disks (RAID, redundant Array of Independent Disk), and a logical volume can be understood as a stripe, whereby physical storage space is allocated for the logical volume.
The Database (Database), which can be considered as an electronic filing cabinet, is a place for storing electronic files, and users can perform operations such as adding, inquiring, updating, deleting and the like on the data in the files. A "database" is a collection of data stored together in a manner that can be shared with multiple users, with as little redundancy as possible, independent of the application.
The database management system (Database Management System, abbreviated as DBMS) is a computer software system designed for managing databases, and generally has basic functions of storage, interception, security, backup and the like. The database management system may classify according to the database model it supports, e.g., relational, XML (Extensible Markup Language ); or by the type of computer supported, e.g., server cluster, mobile phone; or by the query language used, e.g., SQL (structured query language ), XQuery; or by performance impact emphasis, such as maximum scale, maximum speed of operation; or other classification schemes. Regardless of the manner of classification used, some DBMSs are able to support multiple query languages across categories, for example, simultaneously. In the embodiment of the application, the identification result can be stored in the database management system, so that the server can conveniently call.
It should be specifically noted that the service terminal according to the embodiments of the present application may be a device that provides voice and/or data connectivity to the user, a handheld device with a wireless connection function, or another processing device connected to a wireless modem, such as a mobile telephone (or "cellular" telephone) or a computer with a mobile terminal, for example a portable, pocket-sized, hand-held, computer-built-in or vehicle-mounted mobile device that exchanges voice and/or data with a radio access network. Examples include Personal Communication Service (PCS) telephones, cordless telephones, Session Initiation Protocol (SIP) phones, Wireless Local Loop (WLL) stations and Personal Digital Assistants (PDAs).
Referring to fig. 1, fig. 1 is a schematic diagram of an application scenario of a counterfeit object detection method according to an embodiment of the present application. The fake object detection method is applied to the computer device 10 in fig. 1, and a feature encoder, a prototype set, a detection traceability module and a prototype updating module (not shown in fig. 1) are arranged in the computer device 10, wherein the prototype set comprises at least one preset true prototype, at least two preset unique fake prototypes and at least one preset common fake prototype; the feature encoder is used for acquiring the target object features of the target object; the detection traceability module is used for determining whether a target object is a fake object according to the target object characteristics and the prototype set, and determining the fake type of the target object when the target object is the fake object; the prototype updating module is configured to perform prototype updating processing on at least one preset true prototype, at least two preset unique counterfeit prototypes, and at least one preset common counterfeit prototype in a training stage of the counterfeit object detection model, so as to obtain at least one true prototype, at least two counterfeit prototypes, and at least one common counterfeit prototype.
The prototype updating module in this embodiment guides the learning of the prototype set during the training stage of the fake object detection model and does not participate in the computation during the inference stage, so it introduces no additional inference overhead.
Specifically, the data corresponding to the target object is first detected from the data to be detected, and the data corresponding to the target object is then input into the feature encoder, which obtains the target object feature of the target object. The feature distances between the target object feature and all prototypes in the prototype set are then calculated to obtain a first feature distance set, a second feature distance set and a third feature distance set, wherein the first feature distance set contains the feature distances between the target object feature and each of the at least one real prototype, the second feature distance set contains the feature distances between the target object feature and each of the at least two unique counterfeit prototypes, and the third feature distance set contains the feature distances between the target object feature and each of the at least one common counterfeit prototype. Whether the target object is a counterfeit object is then determined based on the first, second and third feature distance sets, and if the target object is a counterfeit object, the counterfeit type of the target object is further determined according to the second feature distance set.
It should be noted that, in the embodiment of the present application, the data to be detected includes at least one of audio data, video data and image data, and the specific data type of the data to be detected is not limited herein, and when the data to be detected is video data or image data, the target object may be a human face, and in addition, the target object may also be other object types, such as a seal, etc., and the embodiment of the present application also does not limit the specific type of the target object.
For ease of understanding, the following specific steps are described by taking an example in which the data type of the data to be detected is an image, the object type of the target object is a human face, and the execution subject is a fake object detection terminal.
It should be noted that, the fake object detection method provided by the embodiment of the application is implemented based on a fake object detection model, wherein the fake object detection model comprises a feature encoder, a prototype set, a detection tracing module and a prototype updating module; wherein the prototype set comprises at least one preset true prototype, at least two preset unique counterfeit prototypes, and at least one preset common counterfeit prototype.
Before the fake object detection method provided by the present application is used to detect a fake object, the fake object detection model first needs to be trained to obtain a trained fake object detection model, where the target loss value is determined according to the prototype updating loss value and the prototype detection tracing loss value.
Before training the fake object detection model, a parameter initialization operation is first performed on the model: the feature encoder is initialized with ImageNet pre-training weights, the prototype updating module is randomly initialized, and a set of learnable prototypes P = {m_ij} is predefined, where m_ij denotes a prototype in the prototype set, i ∈ {0, 1, 2}, i=0 indicates that m_ij is a real prototype, i=1 indicates that m_ij is a unique counterfeit prototype, i=2 indicates that m_ij is a common counterfeit prototype, and j denotes the prototype index of a prototype in the prototype set.
In the embodiment of the present application, the number of true prototypes is W, the number of unique counterfeit prototypes is K, the number of common counterfeit prototypes is L, where W and L are integers not less than 1, K is an integer greater than 1, and specific values of W, K and L are obtained according to the actual number of corresponding prototypes, and the specific values are not limited herein.
The embodiment of the application takes W = 1, K = 4 and L = 1 as an example for illustration. In this case, the one real prototype can be denoted {m_01}, the four unique counterfeit prototypes can be denoted {m_11}, {m_12}, {m_13} and {m_14} respectively, and the one common counterfeit prototype can be denoted {m_21}.
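For illustration only, a learnable prototype set with W = 1, K = 4 and L = 1 could be laid out as in the following sketch; the feature dimension and the use of PyTorch nn.Parameter are assumptions rather than details given by the embodiment.

```python
import torch
import torch.nn as nn

class PrototypeSet(nn.Module):
    """Assumed container for the learnable prototype set P = {m_ij} (W=1 real, K=4 unique, L=1 common)."""
    def __init__(self, dim=768, num_real=1, num_unique=4, num_common=1):
        super().__init__()
        self.real = nn.Parameter(torch.randn(num_real, dim))      # m_01
        self.unique = nn.Parameter(torch.randn(num_unique, dim))  # m_11, m_12, m_13, m_14
        self.common = nn.Parameter(torch.randn(num_common, dim))  # m_21

    def all_prototypes(self):
        # Concatenated in the order (real, unique, common) so that the index i in {0, 1, 2} stays recoverable.
        return torch.cat([self.real, self.unique, self.common], dim=0)
```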
The following describes the training process of the fake object detection model in the embodiment of the present application in detail:
Referring to fig. 2, fig. 2 is a schematic diagram of a training flow of a fake object detection model in the fake object detection method provided by the present application, which includes steps S110 to S140.
S110, acquiring target samples, wherein the target samples are from a real sample set and a fake sample set.
For example, the real sample set comprises W given data sources of real types that each contain the target object, and the fake sample set comprises K given data sources M = {M_j | j = 1, …, K} of fake types that all contain the target object, where W and K are not necessarily equal. Each data source represents one sample type, and each data source corresponds to a plurality of samples. This embodiment uses X to denote the input space and Y to denote the output space. For a real sample, its binary (real/fake) label is y_b = 0 and its multi-class (type) label y_m is 0 or a specific real-type label; for x ∈ M, its binary label is y_b = 1 and its multi-class label is y_m ∈ {1, …, K}.
In this embodiment, each training round needs to randomly select at least one target sample from the real sample set and/or the fake sample set as an input of the current fake object detection model.
S120, acquiring a first characteristic and a second characteristic of the target sample.
Wherein the first features include genuine features, unique counterfeit features, and common counterfeit features.
In some embodiments, image feature extraction may be performed on the target sample using feature encoders in the counterfeit object detection model.
Specifically, the feature encoder may be any deep learning network, such as a convolutional neural network or a vision Transformer. The embodiment of the present application does not limit the specific type of the feature encoder, and may, for example, use a basic vision Transformer as the feature encoder.
As shown in fig. 3, step S120 includes steps S1201 to S1202:
S1201, determining the first feature and the second feature of the target sample according to the feature encoder.
The target sample in this embodiment is an image x. In some embodiments, the feature encoder determines the first feature and the second feature of the target image as follows.
First, for a 2D image x ∈ R^(H×W×C), the feature encoder splits it into a sequence of 2D image blocks, where (H, W) is the original resolution of the image, C is the number of image channels, (S, S) is the resolution of each image block and N = HW/S^2 is the number of image blocks, and it additionally outputs a classification token used for the final classification. Let C' = S^2·C and let z ∈ R^((1+N)×C') denote the output of the feature encoder. The feature output of a target sample is mainly composed of a number of patch tokens z_p ∈ R^(N×C') and a class token z_cls ∈ R^(1×C').
Wherein, the block token is the first feature in the present embodiment, the classification token is the second feature in the present embodiment, one block token includes the feature of one image block in the target sample, and the classification token includes the image feature of the target sample.
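For illustration only, the sketch below uses a generic ViT backbone from the timm library to obtain the patch tokens z_p and the class token z_cls; the backbone name, the token ordering returned by forward_features and the feature dimension C' = 768 are assumptions, not part of the embodiment.

```python
import torch
import timm

# Assumed backbone; any ViT-style encoder that exposes its token sequence would do.
encoder = timm.create_model("vit_base_patch16_224", pretrained=False)

x = torch.randn(2, 3, 224, 224)        # a batch of 2D images x ∈ R^(H×W×C)
tokens = encoder.forward_features(x)   # assumed shape (B, 1+N, C'): class token followed by N = HW/S^2 patch tokens

z_cls = tokens[:, 0]                   # class token z_cls -> the "second feature"
z_p = tokens[:, 1:]                    # patch tokens z_p  -> the "first feature"
print(z_cls.shape, z_p.shape)          # e.g. torch.Size([2, 768]) torch.Size([2, 196, 768])
```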
S1202, dividing the first feature into a plurality of real features, a plurality of unique counterfeit features and a plurality of common counterfeit features according to a multi-head cross-attention sub-module and the prototype set.
Wherein the prototype-updating module includes the multi-headed cross-attention sub-module g (z; θ).
Specifically, each z_p (first feature) is input into the multi-head cross-attention sub-module g(z; θ) as the Key and the Value, and the prototype set P is input into the multi-head cross-attention as the Query. Through the cross-attention operation, the z_p related to each prototype can be queried, thereby obtaining the real features corresponding to each real prototype, the unique counterfeit features corresponding to each unique counterfeit prototype, and the common counterfeit features corresponding to each common counterfeit prototype.
In some embodiments, after each z_p is divided, a feature map can be obtained, and the prototype set may then be updated based on the feature map.
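For illustration only, the multi-head cross-attention sub-module g(z; θ) described above could be sketched as follows, with the prototype set as the Query and the patch tokens as the Key and Value; the use of a single nn.MultiheadAttention layer and the head count are assumptions.

```python
import torch
import torch.nn as nn

class PrototypeCrossAttention(nn.Module):
    """Assumed sketch of g(z; θ): each prototype attends over the patch tokens z_p to pool its own feature."""
    def __init__(self, dim=768, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, prototypes, patch_tokens):
        # prototypes: (P, C') used as Query; patch_tokens: (B, N, C') used as Key and Value.
        query = prototypes.unsqueeze(0).expand(patch_tokens.size(0), -1, -1)  # (B, P, C')
        pooled, _ = self.attn(query, patch_tokens, patch_tokens)
        return pooled  # (B, P, C'): one feature per prototype (real / unique counterfeit / common counterfeit)
```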
S130, determining a prototype updating loss value according to the first characteristic, and determining a prototype detecting traceability loss value according to the second characteristic.
Specifically, in this embodiment, the prototype updating module determines the prototype updating loss value according to the plurality of real features, the plurality of unique counterfeit features, the plurality of common counterfeit features, and the prototype set, and the detecting tracing loss value according to the second feature.
In some embodiments, in order to enable the common forgery features learned by the prototypes in the prototypes to learn the features of commonality of each forgery method (forgery type), and enable the common forgery features to have a true-false classification function, the prototypes updating module further includes a forgery method discriminator and a true-false discriminator; the prototype updated loss value includes a first loss value, a second loss value, and a third loss value; referring to fig. 4, the present application determines a first loss value, a second loss value and a third loss value by the steps of:
a1301 determining target real features of each real prototype from a plurality of said real features based on sample tags of each said real feature, determining target unique counterfeit features of each unique counterfeit prototype from a plurality of said unique counterfeit features based on sample tags of each said unique counterfeit feature, and determining target common counterfeit features of each common counterfeit prototype from a plurality of said common counterfeit features based on sample tags of each said common counterfeit feature.
In the present embodiment, although the first feature is divided into a plurality of real features, a plurality of unique counterfeit features and a plurality of common counterfeit features by the multi-head cross-attention sub-module and the prototype set, the first feature queried by each prototype does not necessarily correspond to that prototype. At this time, it is necessary to further acquire the sample label of each first feature and find the first feature corresponding to each prototype: the real feature corresponding to a real prototype is determined as the target real feature, the unique counterfeit feature corresponding to a unique counterfeit prototype as the target unique counterfeit feature, and the common counterfeit feature corresponding to a common counterfeit prototype as the target common counterfeit feature (whose corresponding target sample has the sample label y_b = 1).
a1302, determining the first loss value according to the prototype set, the target real feature, the target unique counterfeit feature, and the target common counterfeit feature.
Specifically, feature distances between the individual true prototypes and the corresponding target true features are calculated, feature distances between the corresponding individual counterfeit features of the individual counterfeit prototypes are calculated, and feature distances between the common counterfeit prototypes and the corresponding target common counterfeit features are calculated.
In some embodiments, the first loss value is a prototype-center loss, which encourages the distance from each prototype to the first feature corresponding to that prototype to be as small as possible. The first loss value is determined according to a first formula:
where the left-hand side of the first formula is the first loss value, I denotes an indicator function used to select the corresponding z_a and m_ij according to the sample label, z_a is the target real feature, the target unique counterfeit feature or the target common counterfeit feature, m_ij denotes a prototype in the prototype set, i=0 indicates that m_ij is a real prototype, i=1 indicates that m_ij is a unique counterfeit prototype, i=2 indicates that m_ij is a common counterfeit prototype, j denotes the prototype index of a prototype in the prototype set, and d(I(z_a), I(m_ij)) denotes the feature distance between I(z_a) and I(m_ij).
The feature distance in this embodiment may be a euclidean feature distance.
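For illustration only, reading the first formula as a prototype-center objective over matched feature–prototype pairs gives the sketch below; the pairing produced by the indicator function I and the mean reduction are assumptions based on the description above.

```python
import torch

def prototype_center_loss(selected_feats: torch.Tensor, matched_protos: torch.Tensor) -> torch.Tensor:
    """
    Assumed prototype-center loss: mean Euclidean distance between each prototype m_ij and the target
    feature z_a that the sample labels (indicator function I) select for it.
    selected_feats, matched_protos: (M, C') tensors aligned pairwise.
    """
    return torch.norm(selected_feats - matched_protos, dim=-1).mean()
```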
a1303, determining the second loss value according to the target commonality counterfeit feature and the counterfeit method discriminator.
In this embodiment, adversarial processing needs to be performed on the target common counterfeit feature, and the adversarially processed target common counterfeit feature is then input into the counterfeiting method discriminator; if the counterfeiting method discriminator cannot distinguish which counterfeiting method the target common counterfeit feature comes from, it indicates that the target common counterfeit feature has learned the commonality of all counterfeiting methods (counterfeit types).
In some embodiments, the second loss value is determined according to a second formula:
where the left-hand side of the second formula is the second loss value, x denotes the target sample, f denotes the feature encoder, g denotes the multi-head cross-attention sub-module, D_m denotes the counterfeiting method discriminator, f(x) denotes the second feature, I_[y_b=1] denotes an indicator function used to select, from the common counterfeit features, the target common counterfeit features whose sample label indicates a counterfeit object, y_b denotes the binary label, y_b=1 indicates that the binary label is a counterfeit object, N denotes the number of target samples, the expectation is taken over x, y_b and y_m, y_m denotes the label of the counterfeit type, θ_f denotes the parameters of the feature encoder, θ_g denotes the parameters of the multi-head cross-attention sub-module, and θ_{D_m} denotes the parameters of the counterfeiting method discriminator.
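For illustration only, one common way to realize such an adversarial objective is a gradient-reversal layer in front of the counterfeiting method discriminator D_m, so that D_m tries to recognize the counterfeit type from the target common counterfeit features while the encoder and cross-attention parameters are pushed to make that impossible; this construction is an assumption, not necessarily the patent's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Assumed adversarial mechanism: identity in the forward pass, negated gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -grad

def forgery_method_adv_loss(common_feats, y_m, y_b, method_discriminator: nn.Module):
    """Second-loss sketch: only fake samples (y_b == 1) contribute, per the indicator I[y_b = 1]."""
    fake_mask = y_b == 1
    if not fake_mask.any():
        return common_feats.new_zeros(())
    reversed_feats = GradReverse.apply(common_feats[fake_mask])
    logits = method_discriminator(reversed_feats)      # D_m predicts the counterfeit-type label y_m
    # Assumes y_m is already indexed from 0 for the counterfeit types of the fake samples.
    return F.cross_entropy(logits, y_m[fake_mask])
```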
a1304 determining the third loss value from the sample tag of the common counterfeit feature, and the true-false discriminator.
In this embodiment, specifically, the common counterfeit feature is input into the true-false discriminator to obtain a corresponding classification result, and the third loss value corresponding to the common counterfeit feature is then determined from the classification result and the sample label of the common counterfeit feature; training the model with the third loss value enables the features learned by the common counterfeit prototype to have a real/fake classification function.
In some embodiments, the third formula is:
where the left-hand side of the third formula is the third loss value, which is computed from the probability output by the true-false discriminator when the common counterfeit feature is input into it.
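For illustration only, assuming the true-false discriminator outputs a single logit per feature and the third loss is binary cross-entropy against the binary label y_b, a sketch could look like this (the extracted text does not show the exact formula):

```python
import torch
import torch.nn.functional as F

def real_fake_disc_loss(common_feats, y_b, true_false_discriminator):
    """Third-loss sketch: BCE between the discriminator's output probability and the binary label y_b."""
    prob = torch.sigmoid(true_false_discriminator(common_feats)).squeeze(-1)  # probability output by the discriminator
    return F.binary_cross_entropy(prob, y_b.float())
```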
In some embodiments, the detection tracing loss value includes a fourth loss value and a fifth loss value; the fourth and fifth loss values are used to train the detection tracing module so that, during the real/fake classification task, the detection tracing module can also make use of the common counterfeit features. Referring to fig. 5, the present application determines the fourth loss value and the fifth loss value through the following steps:
b1301, determining a first sample characteristic distance set, a second sample characteristic distance set and a third sample characteristic distance set through the detection traceability module.
The first sample feature distance set comprises feature distances between the second feature and each of at least one preset true prototype, the second sample feature distance set comprises feature distances between the second feature and each of at least two preset unique counterfeit prototypes, and the third sample feature distance set comprises feature distances between the second feature and each of at least one preset common counterfeit prototypes.
In some embodiments, the euclidean distance of the second feature from each prototype in the set of prototypes is determined: wherein the probability that a target sample belongs to a prototype is proportional to the negative of the euclidean distance between the second feature of the target sample and the prototype, i.e.:
at this time, the target sample belongs to prototype m ij The probability of (2) can be expressed as:
wherein, gamma is used for controlling the difficulty of learning tasks.
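One way to realise a probability proportional to the negative Euclidean distance, with γ acting as a temperature that controls task difficulty, is a softmax over the negated distances; the patent does not spell out the normalization, so the softmax form below is an assumption.

```python
import torch

def prototype_probabilities(z_cls, prototypes, gamma=1.0):
    """Probability that each sample belongs to each prototype (a sketch).

    z_cls:      (N, D) second features (class tokens)
    prototypes: (P, D) flattened prototype set (real, unique counterfeit, common counterfeit)
    """
    d = torch.cdist(z_cls, prototypes)          # (N, P) Euclidean distances d(z_cls, m_ij)
    return torch.softmax(-gamma * d, dim=-1)    # probability proportional to exp(-gamma * distance)
```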
b1302, determining the fourth loss value for the target sample from the first set of sample feature distances and the second set of sample feature distances.
In this embodiment, the first sample feature distance set includes feature distances between the second feature and each of at least one preset true prototype, and the second sample feature distance set includes feature distances between the second feature and each of at least two preset unique counterfeit prototypes.
In order that the features learned by the unique counterfeit prototypes support both true-false classification and forgery-type tracing, while the features learned by the common counterfeit prototype support true-false classification, this embodiment determines the fourth loss value based on the true prototypes and the unique counterfeit prototypes, and determines the fifth loss value based on the true prototypes and the common counterfeit prototype. That is, the unique counterfeit prototypes and the common counterfeit prototype each compute a loss together with the true prototypes, so that the two kinds of counterfeit prototypes take on different roles.
The embodiment calculates a fourth loss value for the true prototype and the unique counterfeit prototype, wherein the fourth loss value is determined according to a fourth formula, and the fourth formula is:
where L_4 is the fourth loss value, d(z_cls, m_ij) denotes the feature distance between z_cls and m_ij, z_cls denotes the second feature, m_ij denotes a prototype in the prototype set, i = 0 indicates that m_ij is a true prototype, i = 1 indicates that m_ij is a unique counterfeit prototype, j denotes the prototype index of the prototype within the prototype set, x is the target sample, γ denotes the coefficient controlling the difficulty of the learning task, p(y_c | x) denotes the probability that the target sample belongs to y_c, y_c = 0 when the binary classification label of the target sample is a real sample, y_c = y_m when the binary classification label of the target sample is a counterfeit sample, and y_m denotes the forgery type label.
b1303 determining the fifth loss value of the target sample according to the first sample feature distance set and the third sample feature distance set.
In this embodiment, the first sample feature distance set includes feature distances between the second feature and each of at least one preset true prototype, and the third sample feature distance set includes feature distances between the second feature and each of at least one preset common counterfeit prototype.
The embodiment calculates a fifth loss value for the true prototype and the common forgery prototype, wherein the fifth loss value is determined according to a fifth formula, and the fifth formula is as follows:
where L_5 is the fifth loss value, d(z_cls, m_ij) denotes the feature distance between z_cls and m_ij, z_cls denotes the second feature, m_ij denotes a prototype in the prototype set, i = 0 indicates that m_ij is a true prototype, i = 2 indicates that m_ij is a common counterfeit prototype, j denotes the prototype index of the prototype within the prototype set, x is the target sample, γ denotes the coefficient controlling the difficulty of the learning task, p(y_b | x) denotes the probability that the target sample belongs to label y_b, and y_b denotes the binary classification label of the target sample.
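The fourth and fifth loss values can be sketched as prototype-based classification losses: the fourth over the true and unique counterfeit prototypes with target y_c, the fifth over the true and common counterfeit prototypes with target y_b. In the sketch below, summing the probability mass of all prototypes of a class, mapping one unique counterfeit prototype to one forgery type, and the softmax form itself are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def class_probs_from_distances(z_cls, protos, class_ids, gamma=1.0):
    """Softmax over negative distances to a subset of prototypes, aggregated per class.

    z_cls:     (N, D) second features; protos: (P, D) prototypes used by this loss
    class_ids: (P,) class label attached to each prototype row
    """
    d = torch.cdist(z_cls, protos)                       # (N, P)
    p_proto = torch.softmax(-gamma * d, dim=-1)          # per-prototype probability
    n_classes = int(class_ids.max().item()) + 1
    p_class = torch.zeros(z_cls.size(0), n_classes, device=z_cls.device)
    p_class.index_add_(1, class_ids, p_proto)            # sum prototypes belonging to each class
    return p_class

def fourth_and_fifth_losses(z_cls, real_protos, unique_protos, common_protos, y_b, y_m, gamma=1.0):
    # Fourth loss: real prototypes vs unique counterfeit prototypes, target y_c (formula (6))
    protos_4 = torch.cat([real_protos, unique_protos], dim=0)
    ids_4 = torch.cat([torch.zeros(len(real_protos), dtype=torch.long),
                       torch.arange(1, len(unique_protos) + 1)])      # prototype j -> forgery type j
    y_c = torch.where(y_b == 0, torch.zeros_like(y_m), y_m)
    p4 = class_probs_from_distances(z_cls, protos_4, ids_4.to(z_cls.device), gamma)
    loss4 = F.nll_loss(torch.log(p4 + 1e-8), y_c)

    # Fifth loss: real prototypes vs common counterfeit prototypes, target y_b (formula (7))
    protos_5 = torch.cat([real_protos, common_protos], dim=0)
    ids_5 = torch.cat([torch.zeros(len(real_protos), dtype=torch.long),
                       torch.ones(len(common_protos), dtype=torch.long)])
    p5 = class_probs_from_distances(z_cls, protos_5, ids_5.to(z_cls.device), gamma)
    loss5 = F.nll_loss(torch.log(p5 + 1e-8), y_b)
    return loss4, loss5
```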
S140, determining whether the target loss value meets a preset convergence condition; if not, execute step S150; if yes, execute step S160.
In this embodiment, the target loss value is determined according to the prototype updated loss value and the detection trace-source loss value, where the prototype updated loss value includes the first loss value L_1, the second loss value L_2 and the third loss value L_3, and the detection trace-source loss value includes the fourth loss value L_4 and the fifth loss value L_5.
At this time, in some embodiments, the target loss value is:
where L is the target loss value.
Specifically, the preset convergence condition is that, within a preset number of training iterations, the target loss value becomes smaller than a preset loss threshold.
And S150, acquiring a new sample from the real sample set and the fake sample set, taking the new sample as the target sample, and returning to the step S120.
In this embodiment, if it is determined that the target loss value does not meet the preset convergence condition, at least one new sample is randomly obtained from the real sample set and the fake sample set, and the obtained new sample is used as the target sample, so as to train and update the fake object detection model.
S160, obtaining a trained fake object detection model.
In this embodiment, if it is determined that the target loss value meets a preset convergence condition, a trained fake object detection model is obtained, and then the trained fake object detection model is used to detect a fake object in an inference stage.
To further clarify the training flow of the counterfeit object detection model in this embodiment, please refer to fig. 6, which is a schematic diagram of the training flow based on the specific framework of the counterfeit object detection model provided by the present application. First, an image feature extraction step is executed; then a prototype update step and a detection tracing module update step are executed; finally, gradient back-propagation is performed according to the loss values obtained in the prototype update step and the detection tracing module update step, so as to update the counterfeit object detection model. Each step is detailed as follows:
Extracting image features: each target sample is input to the feature encoder, which outputs a plurality of first features (block tokens) and a second feature (class token) for each target sample.
Prototype updating: each first feature is input into the multi-head cross-attention sub-module as Key and Value, and the prototype set P is input as Query; the first features related to each prototype are queried through the cross-attention operation to obtain a feature map. Then, according to the ground-truth label of each first feature in the feature map, the first features corresponding to each prototype are determined from the feature map, and the prototype center loss value of each prototype, i.e. the first loss value L_1, is obtained from the first features corresponding to that prototype; the first loss value L_1 is determined specifically according to the above formula (1). Next, the target common counterfeit features in the feature map are adversarially processed and input into the forgery method discriminator, and the second loss value L_2 is determined based on the above formula (2); at the same time, the common counterfeit features in the feature map are input into the true-false discriminator, and the third loss value L_3 is determined based on the above formula (3).
Detection tracing module updating: the fourth loss value L_4 is calculated for the true prototypes and the unique counterfeit prototypes and is determined according to the above formula (6); the fifth loss value L_5 is calculated for the true prototypes and the common counterfeit prototype and is determined according to the above formula (7).
Gradient back-propagation: gradients are propagated back through the counterfeit object detection model according to the loss values obtained in the prototype updating and detection tracing module updating steps; specifically, the target loss value is determined according to formula (8), and the gradients of the target loss value are then back-propagated through the model to update the model parameters.
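Putting the steps of fig. 6 together, one training iteration might look like the sketch below. The model attributes (feature_encoder, cross_attention, prototype_center_loss, the prototype tensors and the two discriminators) are illustrative names, the unweighted sum standing in for formula (8) is an assumption, and second_loss, third_loss and fourth_and_fifth_losses refer to the earlier sketches.

```python
def training_step(batch, model, optimizer):
    x, y_b, y_m = batch
    # 1. Image feature extraction: block tokens (first features) and class token (second feature)
    block_tokens, class_token = model.feature_encoder(x)
    # 2. Prototype updating: query the first features with the prototype set P via cross-attention
    feature_map = model.cross_attention(q=model.prototypes, k=block_tokens, v=block_tokens)
    loss1 = model.prototype_center_loss(feature_map, y_b, y_m)                        # formula (1)
    loss2 = second_loss(feature_map.common, y_b, y_m, model.forgery_discriminator)    # formula (2)
    loss3 = third_loss(feature_map.common, y_b, model.true_false_discriminator)       # formula (3)
    # 3. Detection tracing module updating: distance-based losses on the class token
    loss4, loss5 = fourth_and_fifth_losses(class_token, model.real_protos, model.unique_protos,
                                           model.common_protos, y_b, y_m)             # (6), (7)
    # 4. Gradient back-propagation of the target loss, formula (8)
    loss = loss1 + loss2 + loss3 + loss4 + loss5
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```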
In summary, in the first aspect, only one counterfeit object detection model needs to be deployed to perform both the true-false binary classification task and the forgery type tracing task on the target object, without deploying multiple models, which reduces the cost of model training and deployment. In the second aspect, the prototype updating module in the counterfeit object detection model only guides the learning of the prototype set in the training stage and does not participate in computation in the inference stage, so it brings no extra inference overhead. In the third aspect, this embodiment trains the unique counterfeit prototypes and the common counterfeit prototype on the counterfeit features, which allows the various counterfeit features in the counterfeit data to be learned and retained more comprehensively; the counterfeit features can therefore be recognized more comprehensively in the subsequent inference stage, improving the model's performance in true-false classification and forgery type tracing.
The embodiments corresponding to fig. 2 to fig. 6 explain the training of the counterfeit object detection model in the counterfeit object detection method provided by the present application; after the counterfeit object detection model is trained, counterfeit object detection is performed on the target object in the data to be detected by using the trained model in the inference stage. The counterfeit object detection method (inference stage) is described in detail below; please refer to fig. 7, which is a flow chart of the counterfeit object detection method according to an embodiment of the present application.
As shown in fig. 7, the method includes the following steps S210 to S230.
S210, obtaining target object characteristics of a target object in the data to be detected.
In some embodiments, after the data to be detected is obtained, target object detection needs to be performed on the data to be detected, data corresponding to a target object is determined in the data to be detected, and then feature encoding is performed on the data corresponding to the target object, so that target object features are obtained.
For example, an image to be detected containing a face is acquired, the face image in the image to be detected is then detected with a preset target object detector, and the detected face image is cropped and used as the input of the feature encoder; the feature encoder then performs feature encoding on the input face image, and the resulting face feature is used as the target object feature.
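A sketch of this preprocessing path is given below; face_detector and feature_encoder are illustrative callables rather than APIs defined by the present application, and the 224×224 input size is an assumption.

```python
import torch
import torch.nn.functional as F

def extract_target_features(image, face_detector, feature_encoder):
    """Detect the face in the image to be detected, crop it, and encode it (a sketch)."""
    x1, y1, x2, y2 = face_detector(image)               # bounding box of the detected face
    face = image[..., y1:y2, x1:x2]                     # crop the face region (C, H, W layout assumed)
    face = F.interpolate(face.unsqueeze(0), size=(224, 224),
                         mode="bilinear", align_corners=False)
    with torch.no_grad():
        return feature_encoder(face)                    # the target object feature
```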
In other embodiments, the data to be detected only includes the data of the target object, and at this time, the feature encoder may directly perform feature encoding on the data to be detected to obtain the feature of the target object.
The data to be detected in the embodiment may be directly input by a user, may be obtained from another terminal, and may also be obtained from a cloud end through downloading, and the obtaining mode and path of the data to be detected in the embodiment are not limited.
In other embodiments, the counterfeit object detection terminal may directly obtain the target object characteristics.
S220, determining a first feature distance set, a second feature distance set and a third feature distance set.
The first feature distance set is a feature distance between the target object feature and each of at least one true prototype, the second feature distance set is a feature distance between the target object feature and each of at least two unique counterfeit prototypes, and the third feature distance set is a feature distance between the target object feature and each of at least one common counterfeit prototype.
Specifically, the Euclidean distance between the target object feature and each prototype in the prototype set is determined, where the prototype set includes at least one real prototype, at least two unique counterfeit prototypes, and at least one common counterfeit prototype; the Euclidean distances between the target object feature and each real prototype are then grouped as the first feature distance set, the Euclidean distances between the target object feature and each unique counterfeit prototype are grouped as the second feature distance set, and the Euclidean distances between the target object feature and each common counterfeit prototype are grouped as the third feature distance set.
And S230, if the target object is determined to be a fake object based on the first characteristic distance set, the second characteristic distance set and the third characteristic distance set, determining the fake type of the target object according to the second characteristic distance set.
In this embodiment, the true-false classification processing is performed on the target object according to the first feature distance set, the second feature distance set, and the third feature distance set to determine whether the target object is a counterfeit object, if the target object is a counterfeit object, the type of counterfeit of the target object is further determined according to the second feature distance set, and if the target object is determined to be a real object, the classification result that the target object is a real object is output.
In some embodiments, whether the target object is a counterfeit object is determined as follows: an average feature distance of the feature distances in the first feature distance set is determined, and a minimum feature distance is determined from the second feature distance set and the third feature distance set; whether the target object is a counterfeit object is then determined according to the average feature distance and the minimum feature distance.
Specifically, an average feature distance of feature distances in the first feature distance set is determined based on equation (9):
where f(x) is the target object feature, m_r denotes m_ij when i ∈ {0}, N is the number of true prototypes (the number of feature distances in the first feature distance set), d(f(x), m_r) is the average feature distance of the feature distances in the first feature distance set, and d(f(x), m_ij) with i ∈ {0} are the feature distances in the first feature distance set.
The minimum feature distance is determined from the second feature distance set and the third feature distance set based on equation (10):
d(f(x), m_f) = min d(f(x), m_ij), i ∈ {1, 2};  (10)
where d(f(x), m_f) is the minimum feature distance determined from the second feature distance set and the third feature distance set, f(x) is the target object feature, m_f denotes m_ij when i ∈ {1, 2}, and d(f(x), m_ij) with i ∈ {1, 2} are the feature distances in the second feature distance set and the third feature distance set.
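The quantities of formulas (9) and (10) can be computed as in the following sketch, where real_protos, unique_protos and common_protos are illustrative tensors holding the three kinds of prototypes; the sketch also keeps the second feature distance set for the later forgery-type tracing.

```python
import torch

def average_and_min_distance(f_x, real_protos, unique_protos, common_protos):
    d_real   = torch.cdist(f_x, real_protos)     # first feature distance set
    d_unique = torch.cdist(f_x, unique_protos)   # second feature distance set
    d_common = torch.cdist(f_x, common_protos)   # third feature distance set
    d_r = d_real.mean(dim=-1)                                          # formula (9): average over real prototypes
    d_f = torch.cat([d_unique, d_common], dim=-1).min(dim=-1).values   # formula (10): minimum over counterfeit prototypes
    return d_r, d_f, d_unique
```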
In some embodiments, after the average feature distance and the minimum feature distance are determined, a first probability that the target object is a real object is further determined according to the average feature distance, and a second probability that the target object is a counterfeit object is determined according to the minimum feature distance; if the first probability is greater than the second probability, the target object is determined to be a real object; if the first probability is less than or equal to the second probability, the target object is determined to be a counterfeit object.
Specifically, the first probability and the second probability are determined according to formula (11):
where y_b is the binary classification label; d(f(x), m_r) and d(f(x), m_f), calculated by formula (9) and formula (10), are substituted into formula (11) to obtain p(y_b | x); when m takes the value m_r, p(y_b | x) is the first probability, and when m takes the value m_f, p(y_b | x) is the second probability.
In this embodiment, if the first probability is greater than the second probability, it is determined that y_b = 0, i.e. the target object is a real object; otherwise it is determined that y_b = 1, i.e. the target object is a counterfeit object. When y_b = 1, this embodiment further predicts the forgery type of the target object, specifically by determining the probability that the target object corresponds to each unique counterfeit prototype according to the feature distances in the second feature distance set, as given by formula (12):
where y_m ∈ {1, …, K}, K is the number of unique counterfeit prototypes, and each unique counterfeit prototype represents one forgery type.
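Combining formulas (9) to (12), the real/fake decision and the forgery-type tracing can be sketched as follows; using a softmax with temperature γ for both p(y_b | x) and the forgery-type distribution is an assumption consistent with the distance-based probabilities above.

```python
import torch

def classify_and_trace(d_r, d_f, d_unique, gamma=1.0):
    # First vs second probability: the smaller distance gets the larger probability (formula (11))
    p_real, p_fake = torch.softmax(-gamma * torch.stack([d_r, d_f], dim=-1), dim=-1).unbind(-1)
    is_fake = p_real <= p_fake
    # Forgery type: distribution over the K unique counterfeit prototypes (formula (12))
    p_type = torch.softmax(-gamma * d_unique, dim=-1)
    forgery_type = p_type.argmax(dim=-1) + 1             # y_m in {1, ..., K}, one type per unique prototype
    return is_fake, forgery_type
```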
In summary, in the scheme provided by this embodiment of the present application, on the one hand, since the counterfeit prototypes provided by the present application include a plurality of unique counterfeit prototypes representing unique counterfeit features and at least one common counterfeit prototype representing common counterfeit features, these prototypes can identify not only the unique counterfeit features but also the common counterfeit features in the target object; the counterfeit features of the target object can thus be identified more comprehensively, and performing true-false classification on the target object with these more comprehensive counterfeit features improves the classification effect. On the other hand, when the target object is identified as a counterfeit object, the forgery type of the target object is further determined using the second feature distance set already computed in the preceding classification task, and the feature distances do not need to be computed again.
To verify the effectiveness of the present application, the inventors conducted experiments on the forged face dataset FaceForensics++. FaceForensics++ contains real face data collected from the Internet and fake face data synthesized by 5 forgery methods. The experiments use a ViT-Small network as the baseline model, and the models ViT-Small-2way and ViT-Small-6way are compared with the model ViT-Small-ours of the method of the present application. For the true-false classification task, the evaluation metrics are the true-false classification accuracy bi_Acc, the area under the ROC curve (AUC), and TPR@FPR (the recall rate at a fixed false-alarm rate). In practical application scenarios, forged faces occur extremely rarely and most of the data are real faces, so the false-alarm rate at which real samples are misidentified as forged samples must be considered; the two cases TPR@FPR = 0.01% and TPR@FPR = 0.1% are therefore compared respectively. For the forgery-method tracing task, the evaluation metric is the forgery-method classification accuracy attr_Acc. The experimental results are shown in Table 1:
TABLE 1
Model bi_Acc AUC TPR@FPR=0.01% TPR@FPR=0.1% attr_Acc
ViT-Small-2way 0.9676 0.9911 0.7159 0.7469 -
ViT-Small-6way 0.9677 0.9912 0.7687 0.8280 0.9873
ViT-Small-ours 0.972 0.9935 0.874 0.9311 0.9892
As can be seen from Table 1, the counterfeit object detection method provided by the present application outperforms the baseline methods on all metrics, which fully demonstrates the superiority of the present application.
In addition, to visualize the learned features, some samples from the FaceForensics++ dataset were also visualized using t-SNE (t-Distributed Stochastic Neighbor Embedding), with the results shown in fig. 8. In fig. 8, the color corresponding to label 0 represents the features corresponding to the real face data, and the colors corresponding to labels 1 to 5 represent the features corresponding to fake face data of different forgery types. (a) shows a t-SNE diagram of the features of ViT-Small-6way before they are fed into the classifier; (b) shows a t-SNE diagram of the features in the present application; (c) shows a t-SNE diagram of the features queried by the true prototypes and the unique counterfeit prototypes in the present application; (d) shows a t-SNE diagram of the features queried by the common counterfeit prototype in the present application. By comparing (a) and (b), it can be observed that the features extracted by the method of the present application are more clearly separated between the real and fake categories. By observing (c) and (d), it can be seen that the proposed prototypes are able to learn the corresponding counterfeit features.
Any technical features mentioned in the embodiments corresponding to any one of fig. 1 to 8 are also applicable to the embodiments corresponding to fig. 9 to 12 of the present application, and will not be repeated below.
A fake object detection method according to an embodiment of the present application is described above, and a fake object detection device (e.g., server, user terminal) that performs the above fake object detection method is described below.
Referring to fig. 9, a schematic structural diagram of a counterfeit object detection device 900 shown in fig. 9 can be applied to a counterfeit object authentication detection scenario. The counterfeit object detection device 900 in the embodiment of the present application is capable of implementing steps corresponding to the counterfeit object detection method performed in the embodiment corresponding to any of fig. 1 to 8 described above. The functions of the counterfeit object detection device 900 may be implemented by hardware, or may be implemented by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions described above, which may be software and/or hardware. The counterfeit object detection device 900 may include a transceiver module 901 and a processing module 902:
the transceiver module 901 is configured to obtain a target object feature of a target object in data to be detected;
a processing module 902, configured to determine a first feature distance set, a second feature distance set, and a third feature distance set, where the first feature distance set is a feature distance between the target object feature and each of at least one real prototype, the second feature distance set is a feature distance between the target object feature and each of at least two unique counterfeit prototypes, and the third feature distance set is a feature distance between the target object feature and each of at least one common counterfeit prototypes; and if the target object is determined to be a fake object based on the first characteristic distance set, the second characteristic distance set and the third characteristic distance set, determining the fake type of the target object according to the second characteristic distance set.
In some embodiments, the processing module 902 determines whether the target object is a counterfeit object in particular according to the following:
determining an average feature distance of the feature distances in the first feature distance set, and determining a minimum feature distance from the second feature distance set and the third feature distance set; and determining whether the target object is a counterfeit object according to the average feature distance and the minimum feature distance.
In some embodiments, the processing module 902 is specifically configured to, when performing the step of determining whether the target object is a counterfeit object according to the average feature distance and the minimum feature distance:
determining a first probability that the target object is a real object according to the average feature distance, and determining a second probability that the target object is a counterfeit object according to the minimum feature distance; if the first probability is larger than the second probability, determining that the target object is a real object; and if the first probability is smaller than or equal to the second probability, determining that the target object is a fake object.
In some embodiments, the fake object detection apparatus is implemented based on a fake object detection model that includes a feature encoder, a prototype set, a detection tracing module, and a prototype update module;
Wherein the prototype set comprises at least one preset true prototype, at least two preset unique counterfeit prototypes and at least one preset common counterfeit prototype;
the feature encoder is used for acquiring the target object features of the target object;
the detection traceability module is used for determining whether a target object is a fake object according to the target object characteristics and the prototype set, and determining the fake type of the target object when the target object is the fake object;
the prototype updating module is configured to perform prototype updating processing on at least one preset true prototype, at least two preset unique counterfeit prototypes, and at least one preset common counterfeit prototype in a training stage of the counterfeit object detection model, so as to obtain at least one true prototype, at least two unique counterfeit prototypes, and at least one common counterfeit prototype.
In some embodiments, before the transceiver module 901 performs the step of obtaining the target object characteristic of the target object in the data to be detected, the processing module 902 is further configured to:
obtaining a target sample through the transceiver module 901, wherein the target sample is from a real sample set and a fake sample set;
Acquiring a first feature and a second feature of the target sample, wherein the first feature comprises a real feature, a unique counterfeit feature and a common counterfeit feature;
determining a prototype updating loss value according to the first feature, and determining a prototype detection traceability loss value according to the second feature;
if the target loss value does not meet the preset convergence condition, acquiring a new sample from the real sample set and the fake sample set, taking the new sample as the target sample until the target loss value meets the preset convergence condition, and obtaining a trained fake object detection model, wherein the target loss value is determined according to the prototype updated loss value and the prototype detection traceability loss value.
In some embodiments, the processing module 902 is specifically configured to, when performing the step of obtaining the first characteristic and the second characteristic of the target sample:
determining the first and second features of the target sample from the feature encoder;
the first feature is divided into a plurality of real features, a plurality of unique counterfeit features, and a plurality of common counterfeit features according to a multi-headed cross-attention sub-module and the prototype set, the prototype update module comprising the multi-headed cross-attention sub-module.
In some embodiments, the processing module 902 is specifically configured to, when performing the step of determining a prototype updated loss value according to the first feature and determining a prototype detected trace-source loss value according to the second feature:
determining, by the prototype updating module, the prototype updating loss value according to the plurality of real features, the plurality of unique counterfeit features, the plurality of common counterfeit features, and the prototype set, and determining, by the detection tracing module, the detection tracing loss value according to the second feature.
In some embodiments, the prototype-updating module further comprises a fake-method discriminator and a true-false discriminator; the prototype updated loss value includes a first loss value, a second loss value, and a third loss value; the processing module 902 is specifically configured to, when executing the step of determining a prototype-updated loss value according to the first feature:
determining target genuine features of each genuine prototype from a plurality of the genuine features according to the sample tag of each genuine feature, determining target unique counterfeit features of each unique counterfeit prototype from a plurality of the unique counterfeit features according to the sample tag of each unique counterfeit feature, and determining target common counterfeit features of each common counterfeit prototype from a plurality of the common counterfeit features according to the sample tag of each common counterfeit feature;
Determining the first loss value from the prototype set, the target genuine features, the target unique counterfeit features, and the target common counterfeit features;
determining the second loss value according to the target commonality counterfeit feature and the counterfeit method discriminator;
the third loss value is determined according to the sample tag of the common counterfeit feature, the common counterfeit feature and the true-false discriminator.
In some embodiments, the first loss value is determined according to a first formula:
where L_1 is the first loss value, I denotes an indication function used to select the corresponding z_a and m_ij according to the sample label, z_a is the target genuine feature, the target unique counterfeit feature or the target common counterfeit feature, m_ij denotes a prototype in the prototype set, i = 0 indicates that m_ij is a true prototype, i = 1 indicates that m_ij is a unique counterfeit prototype, i = 2 indicates that m_ij is a common counterfeit prototype, j denotes the prototype index of the prototype within the prototype set, and d(I(z_a), I(m_ij)) denotes the feature distance between I(z_a) and I(m_ij).
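A sketch of this prototype-center term: each selected feature z_a is pulled toward the prototype m_ij that its sample label assigns it to. The index-based assignment and the use of a plain Euclidean distance averaged over the batch are assumptions made for illustration.

```python
import torch

def first_loss(features, proto_index, prototypes):
    """Prototype-center loss (a sketch).

    features:    (N, D) target genuine / unique counterfeit / common counterfeit features z_a
    proto_index: (N,) index of the prototype m_ij selected for each feature by its sample label
    prototypes:  (P, D) flattened prototype set
    """
    assigned = prototypes[proto_index]                                  # m_ij matched to each z_a
    return (features - assigned).pow(2).sum(dim=-1).sqrt().mean()       # mean Euclidean distance
```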
In some embodiments, the second loss value is determined according to a second formula:
where L_2 denotes the second loss value, x denotes the target sample, f denotes the feature encoder, g denotes the multi-head cross-attention sub-module, D_m denotes the forgery method discriminator, f(x) denotes the second feature, I_[y_b=1] denotes an indication function used to select, from the common counterfeit features, the target common counterfeit features whose sample label is a counterfeit object, y_b denotes the binary classification label, y_b = 1 indicates that the binary label is a counterfeit object, N denotes the number of target samples, E_{x, y_b, y_m} denotes the expectation over x, y_b and y_m, y_m denotes the forgery type label, θ_f denotes the parameters of the feature encoder, θ_g denotes the parameters of the multi-head cross-attention sub-module, and θ_{D_m} denotes the parameters of the forgery method discriminator.
In some embodiments, the third loss value is determined according to a third formula:
where L_3 is the third loss value, and p̂ denotes the probability output by the true-false discriminator when the common counterfeit feature is input into it.
In some embodiments, the detection trace-source loss value includes a fourth loss value and a fifth loss value; the processing module 902 is specifically configured to, when executing the step of determining a prototype detection trace-source loss value according to the second feature:
Determining a first sample feature distance set, a second sample feature distance set and a third sample feature distance set through the detection traceability module, wherein the first sample feature distance set comprises feature distances between the second feature and each of at least one preset true prototype, the second sample feature distance set comprises feature distances between the second feature and each of at least two preset unique counterfeit prototypes, and the third sample feature distance set comprises feature distances between the second feature and each of at least one preset common counterfeit prototype;
determining the fourth loss value of the target sample from the first sample feature distance set and the second sample feature distance set;
the fifth loss value of the target sample is determined from the first set of sample feature distances and the third set of sample feature distances.
In some embodiments, the fourth loss value is determined according to a fourth formula:
where L_4 is the fourth loss value, d(z_cls, m_ij) denotes the feature distance between z_cls and m_ij, z_cls denotes the second feature, m_ij denotes a prototype in the prototype set, i = 0 indicates that m_ij is a true prototype, i = 1 indicates that m_ij is a unique counterfeit prototype, j denotes the prototype index of the prototype within the prototype set, x is the target sample, γ denotes the coefficient controlling the difficulty of the learning task, p(y_c | x) denotes the probability that the target sample belongs to y_c, y_c = 0 when the binary classification label of the target sample is a real sample, y_c = y_m when the binary classification label of the target sample is a counterfeit sample, and y_m denotes the forgery type label.
In some embodiments, the fifth loss value is determined according to a fifth formula:
where L_5 is the fifth loss value, d(z_cls, m_ij) denotes the feature distance between z_cls and m_ij, z_cls denotes the second feature, m_ij denotes a prototype in the prototype set, i = 0 indicates that m_ij is a true prototype, i = 2 indicates that m_ij is a common counterfeit prototype, j denotes the prototype index of the prototype within the prototype set, x is the target sample, γ denotes the coefficient controlling the difficulty of the learning task, p(y_b | x) denotes the probability that the target sample belongs to label y_b, and y_b denotes the binary classification label of the target sample.
In some embodiments, the data to be detected includes at least one of audio data, video data, and image data.
In some embodiments, the data to be detected includes at least one of video data and image data, and the target object is a human face.
In summary, in the solution provided by this embodiment of the present application, on the one hand, since the counterfeit prototypes provided by the counterfeit object detection device 900 include a plurality of unique counterfeit prototypes representing unique counterfeit features and at least one common counterfeit prototype representing common counterfeit features, the counterfeit prototypes in the counterfeit object detection device 900 can identify not only the unique counterfeit features but also the common counterfeit features in the target object; the counterfeit features of the target object can thus be identified more comprehensively, and performing true-false classification on the target object with these more comprehensive counterfeit features improves the classification effect. On the other hand, when the counterfeit object detection device 900 identifies the target object as a counterfeit object, the forgery type of the target object is further determined using the second feature distance set already computed in the preceding classification task, and the feature distances do not need to be computed again.
The fake object detection method in the embodiment of the present application is described above from the viewpoint of the modularized functional entity, and the fake object detection device in the embodiment of the present application is described below from the viewpoint of hardware processing, respectively.
It should be noted that, in each embodiment of the present application (including each embodiment shown in fig. 9), the entity devices corresponding to all the transceiver modules may be transceivers, and the entity devices corresponding to all the processing modules may be processors. When one of the apparatuses has a structure as shown in fig. 9, the processor, the transceiver, and the memory implement the same or similar functions as the transceiver module and the processing module provided in the foregoing apparatus embodiment corresponding to the apparatus, and the memory in fig. 10 stores a computer program to be called when the processor executes the above-described counterfeit object detection method.
The system shown in fig. 9 may have a structure as shown in fig. 10, and when the apparatus shown in fig. 9 has a structure as shown in fig. 10, the processor in fig. 10 can implement the same or similar functions as the processing module provided by the apparatus embodiment corresponding to the apparatus, and the transceiver in fig. 10 can implement the same or similar functions as the transceiver module provided by the apparatus embodiment corresponding to the apparatus, and the memory in fig. 10 stores a computer program to be invoked when the processor performs the above-mentioned counterfeit object detection method. In the embodiment of the present application, the entity device corresponding to the transceiver module in the embodiment shown in fig. 9 may be an input/output interface, and the entity device corresponding to the processing module may be a processor.
The embodiment of the present application further provides a terminal device, as shown in fig. 11, for convenience of explanation, only the portion relevant to the embodiment of the present application is shown, and specific technical details are not disclosed, please refer to the method portion of the embodiment of the present application. The terminal device may be any terminal device including a mobile phone, a tablet computer, a personal digital assistant (Personal Digital Assistant, PDA), a Point of Sales (POS), a vehicle-mounted computer, and the like, taking the mobile phone as an example of the terminal:
fig. 11 is a block diagram showing a part of the structure of a mobile phone related to a terminal device provided by an embodiment of the present application. Referring to fig. 11, the mobile phone includes: radio Frequency (RF) circuit 55, memory 520, input unit 530, display unit 540, sensor 550, audio circuit 560, wireless fidelity (wireless fidelity, wi-Fi) module 570, processor 580, and power supply 590. Those skilled in the art will appreciate that the handset configuration shown in fig. 11 is not limiting of the handset and may include more or fewer components than shown, or may combine certain components, or may be arranged in a different arrangement of components.
The following describes the components of the mobile phone in detail with reference to fig. 11:
the RF circuit 55 may be used for receiving and transmitting signals during the process of receiving and transmitting information or communication, in particular, after receiving downlink information of the base station, the downlink information is processed by the processor 580; in addition, the data of the design uplink is sent to the base station. Generally, RF circuitry 55 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (English full name: low Noise Amplifier; LNA), a duplexer, and the like. In addition, the RF circuitry 55 may also communicate with networks and other devices via wireless communications. The wireless communication may use any communication standard or protocol, including but not limited to global system for mobile communications (english: global System of Mobile communication, english: GSM), general packet radio service (english: general Packet Radio Service, english: GPRS), code division multiple access (english: code Division Multiple Access, CDMA), wideband code division multiple access (english: wideband Code Division Multiple Access, english: WCDMA), long term evolution (english: long Term Evolution, english: LTE), email, short message service (english: short Messaging Service, english: SMS), and the like.
The memory 520 may be used to store software programs and modules, and the processor 580 performs various functional applications and data processing of the cellular phone by executing the software programs and modules stored in the memory 520. The memory 520 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, application programs required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, phonebook, etc.) created according to the use of the handset, etc. In addition, memory 520 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.
The input unit 530 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the handset. In particular, the input unit 530 may include a touch panel 531 and other input devices 532. The touch panel 531, also referred to as a touch screen, may collect touch operations thereon or thereabout by a user (e.g., operations of the user on the touch panel 531 or thereabout by using any suitable object or accessory such as a finger, a stylus, etc.), and drive the corresponding connection device according to a predetermined program. Alternatively, the touch panel 531 may include two parts, a touch detection device and a touch controller. The touch detection device detects the touch azimuth of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device and converts it into touch point coordinates, which are then sent to the processor 580, and can receive commands from the processor 580 and execute them. In addition, the touch panel 531 may be implemented in various types such as resistive, capacitive, infrared, and surface acoustic wave. The input unit 530 may include other input devices 532 in addition to the touch panel 531. In particular, other input devices 532 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, mouse, joystick, etc.
The display unit 540 may be used to display information input by a user or information provided to the user and various menus of the mobile phone. The display unit 540 may include a display panel 541, and optionally, the display panel 541 may be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch panel 531 may cover the display panel 541, and when the touch panel 531 detects a touch operation thereon or thereabout, the touch operation is transferred to the processor 580 to determine the type of the touch event, and then the processor 580 provides a corresponding visual output on the display panel 541 according to the type of the touch event. Although in fig. 11, the touch panel 531 and the display panel 541 are two independent components to implement the input and output functions of the mobile phone, in some embodiments, the touch panel 531 and the display panel 541 may be integrated to implement the input and output functions of the mobile phone.
The handset may also include at least one sensor 550, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor that may adjust the brightness of the display panel 541 according to the brightness of ambient light, and a proximity sensor that may turn off the display panel 541 and/or the backlight when the mobile phone moves to the ear. As one of the motion sensors, the accelerometer sensor can detect the acceleration in all directions (generally three axes), and can detect the gravity and direction when stationary, and can be used for applications of recognizing the gesture of a mobile phone (such as horizontal and vertical screen switching, related games, magnetometer gesture calibration), vibration recognition related functions (such as pedometer and knocking), and the like; other sensors such as gyroscopes, barometers, hygrometers, thermometers, infrared sensors, etc. that may also be configured with the handset are not described in detail herein.
Audio circuitry 560, speakers 561, microphone 562 may provide an audio interface between the user and the handset. The audio circuit 560 may transmit the received electrical signal converted from audio data to the speaker 561, and the electrical signal is converted into a sound signal by the speaker 561 and output; on the other hand, microphone 562 converts the collected sound signals into electrical signals, which are received by audio circuit 560 and converted into audio data, which are processed by audio data output processor 580 for transmission to, for example, another cell phone via RF circuit 55, or for output to memory 520 for further processing.
Wi-Fi belongs to a short-distance wireless transmission technology, and a mobile phone can help a user to send and receive e-mails, browse web pages, access streaming media and the like through a Wi-Fi module 570, so that wireless broadband Internet access is provided for the user. Although fig. 11 shows Wi-Fi module 570, it is understood that it does not belong to the necessary constitution of the cell phone, and can be omitted entirely as needed within the scope of not changing the essence of the application.
Processor 580 is the control center of the handset, connects the various parts of the entire handset using various interfaces and lines, and performs various functions and processes of the handset by running or executing software programs and/or modules stored in memory 520, and invoking data stored in memory 520, thereby performing overall monitoring of the handset. Optionally, processor 580 may include one or more processing units; preferably, processor 580 may integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., with a modem processor that primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 580.
The handset further includes a power supply 590 (e.g., a battery) for powering the various components, which can be logically connected to the processor 580 by a power management system so as to perform functions such as managing charging, discharging, and power consumption by the power management system.
Although not shown, the mobile phone may further include a camera, a bluetooth module, etc., which will not be described herein.
In the embodiment of the present application, the processor 580 included in the mobile phone further has a flowchart for controlling and executing the above fake object detection method shown in fig. 2 and 7.
Fig. 12 is a schematic diagram of a server structure according to an embodiment of the present application. The server 620 may vary considerably in configuration or performance, and may include one or more central processing units (CPUs) 622 (for example, one or more processors), a memory 632, and one or more storage media 630 (for example, one or more mass storage devices) storing application programs 642 or data 644. The memory 632 and the storage medium 630 may be transitory or persistent storage. The program stored on the storage medium 630 may include one or more modules (not shown), each of which may include a series of instruction operations on the server. Still further, the central processor 622 may be configured to communicate with the storage medium 630 and execute, on the server 620, the series of instruction operations in the storage medium 630.
The server 620 may also include one or more power supplies 626, one or more wired or wireless network interfaces 650, one or more input/output interfaces 658, and/or one or more operating systems 641, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, and the like.
The steps performed by the server in the above embodiments, such as the steps shown in fig. 2 and fig. 7, may be based on the structure of the server 620 shown in fig. 12. For example, the central processor 622 performs the following operations by invoking the instructions in the memory 632:
acquiring target object characteristics of a target object in data to be detected;
determining a first feature distance set, a second feature distance set and a third feature distance set, wherein the first feature distance set is the feature distance between the target object feature and each of at least one real prototype, the second feature distance set is the feature distance between the target object feature and each of at least two unique counterfeit prototypes, and the third feature distance set is the feature distance between the target object feature and each of at least one common counterfeit prototype;
And if the target object is determined to be a fake object based on the first characteristic distance set, the second characteristic distance set and the third characteristic distance set, determining the fake type of the target object according to the second characteristic distance set.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to related descriptions of other embodiments.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, apparatuses and modules described above may refer to the corresponding processes in the foregoing method embodiments, which are not repeated herein.
In the embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be additional divisions when actually implemented, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or modules, which may be in electrical, mechanical, or other forms.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules, i.e., may be located in one place, or may be distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in each embodiment of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium.
In the above embodiments, it may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program is loaded and executed on a computer, the flow or functions according to the embodiments of the present application are fully or partially produced. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by a wired (e.g., coaxial cable, fiber optic, digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer readable storage medium may be any available medium that can be stored by a computer or a data storage device such as a server, data center, etc. that contains an integration of one or more available media. The usable medium may be a magnetic medium (e.g., floppy Disk, hard Disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid State Disk (SSD)), etc.
The above description has been made in detail on the technical solutions provided by the embodiments of the present application, and specific examples are applied in the embodiments of the present application to illustrate the principles and implementation manners of the embodiments of the present application, where the above description of the embodiments is only for helping to understand the methods and core ideas of the embodiments of the present application; meanwhile, as for those skilled in the art, according to the idea of the embodiment of the present application, there are various changes in the specific implementation and application scope, and in summary, the present disclosure should not be construed as limiting the embodiment of the present application.

Claims (13)

1. A counterfeit object detection method, comprising:
acquiring target object characteristics of a target object in data to be detected;
determining a first feature distance set, a second feature distance set and a third feature distance set, wherein the first feature distance set is the feature distance between the target object feature and each of at least one real prototype, the second feature distance set is the feature distance between the target object feature and each of at least two unique counterfeit prototypes, and the third feature distance set is the feature distance between the target object feature and each of at least one common counterfeit prototype;
And if the target object is determined to be a fake object based on the first characteristic distance set, the second characteristic distance set and the third characteristic distance set, determining the fake type of the target object according to the second characteristic distance set.
2. The method of claim 1, wherein determining whether the target object is a counterfeit object is based on:
determining an average feature distance of feature distances in the first feature distance set, and determining a minimum feature distance from the second feature distance set and the third feature distance set;
and determining whether the target object is a fake object according to the average feature distance and the minimum feature distance.
3. The method of claim 2, wherein said determining whether the target object is a counterfeit object based on the average feature distance and the minimum feature distance comprises:
determining a first probability that the target object is a real object according to the average feature distance, and determining a second probability that the target object is a counterfeit object according to the minimum feature distance;
if the first probability is larger than the second probability, determining that the target object is a real object;
And if the first probability is smaller than or equal to the second probability, determining that the target object is a fake object.
4. A method according to any one of claims 1 to 3, characterized in that the method is implemented based on a fake object detection model comprising a feature encoder, a prototype set, a detection traceability module and a prototype update module;
wherein the prototype set comprises at least one preset true prototype, at least two preset unique counterfeit prototypes and at least one preset common counterfeit prototype;
the feature encoder is used for acquiring the target object feature of the target object;
the detection traceability module is used for determining whether the target object is a fake object according to the target object feature and the prototype set, and for determining the fake type of the target object when the target object is a fake object;
the prototype updating module is configured to perform prototype updating processing on the at least one preset true prototype, the at least two preset unique counterfeit prototypes and the at least one preset common counterfeit prototype in a training stage of the fake object detection model, so as to obtain the at least one real prototype, the at least two unique counterfeit prototypes and the at least one common counterfeit prototype.
5. The method of claim 4, wherein prior to the obtaining the target object feature of the target object in the data to be detected, the method further comprises:
obtaining a target sample, wherein the target sample is from a real sample set and a fake sample set;
acquiring a first feature and a second feature of the target sample, wherein the first feature comprises a real feature, a unique counterfeit feature and a common counterfeit feature;
determining a prototype update loss value according to the first feature, and determining a prototype detection traceability loss value according to the second feature;
and if the target loss value does not meet a preset convergence condition, acquiring a new sample from the real sample set and the fake sample set and taking the new sample as the target sample, until the target loss value meets the preset convergence condition, so as to obtain a trained fake object detection model, wherein the target loss value is determined according to the prototype update loss value and the prototype detection traceability loss value.
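The training procedure of claim 5 can be pictured as the loop below. This is only a schematic sketch: the stub functions, the sample sets and the particular form of the preset convergence condition are all assumptions standing in for the feature encoder, the prototype updating module and the detection traceability module described in claims 4 to 7.

```python
import random

random.seed(0)

# Hypothetical stand-ins for the model components of claims 4 to 7; none of these
# names come from the application itself.
def encode_features(sample):
    """Feature encoder stub: returns (first_feature, second_feature) for a sample."""
    return [random.random() for _ in range(8)], [random.random() for _ in range(8)]

def prototype_update_loss(first_feature):
    """Prototype updating module stub: loss from the real/unique/common feature split."""
    return sum(first_feature) / len(first_feature)

def detection_traceability_loss(second_feature):
    """Detection traceability module stub: loss from prototype feature distances."""
    return sum(second_feature) / len(second_feature)

real_sample_set = [{"label": "real"}] * 100
fake_sample_set = [{"label": "fake"}] * 100
convergence_threshold = 0.9        # assumed form of the preset convergence condition

target_loss = float("inf")
for step in range(10_000):                              # guard against non-convergence
    # Draw a target sample from the union of the real and fake sample sets.
    target_sample = random.choice(real_sample_set + fake_sample_set)
    first_feature, second_feature = encode_features(target_sample)
    # The target loss combines the prototype update loss and the detection traceability loss.
    target_loss = prototype_update_loss(first_feature) + detection_traceability_loss(second_feature)
    if target_loss <= convergence_threshold:
        break
```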
6. The method of claim 5, wherein said acquiring the first feature and the second feature of the target sample comprises:
determining the first feature and the second feature of the target sample by the feature encoder;
dividing the first feature into a plurality of real features, a plurality of unique counterfeit features and a plurality of common counterfeit features according to a multi-headed cross-attention sub-module and the prototype set, wherein the prototype updating module comprises the multi-headed cross-attention sub-module.
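One way to realize the feature split of claim 6 is to let the prototypes act as queries of a multi-headed cross-attention over the encoder's output tokens, as sketched below with PyTorch; the query/key/value roles, the tensor shapes and the number of heads are assumptions rather than details from the application.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes: 1 real, 4 unique counterfeit and 1 common counterfeit prototype.
feature_dim, num_real, num_unique, num_common = 512, 1, 4, 1
prototypes = torch.randn(num_real + num_unique + num_common, feature_dim)

# Multi-headed cross-attention: the prototypes act as queries over the encoder tokens.
cross_attention = nn.MultiheadAttention(embed_dim=feature_dim, num_heads=8, batch_first=True)

first_feature = torch.randn(1, 49, feature_dim)   # e.g. 7x7 spatial tokens from the feature encoder
queries = prototypes.unsqueeze(0)                  # (1, num_prototypes, feature_dim)
attended, _ = cross_attention(queries, first_feature, first_feature)

# Split the attended features into the three groups, following the prototype order.
real_features = attended[:, :num_real]
unique_counterfeit_features = attended[:, num_real:num_real + num_unique]
common_counterfeit_features = attended[:, num_real + num_unique:]
```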
7. The method of claim 5, wherein said determining a prototype update loss value according to the first feature and determining a prototype detection traceability loss value according to the second feature comprises:
determining, by the prototype updating module, the prototype update loss value according to the plurality of real features, the plurality of unique counterfeit features, the plurality of common counterfeit features and the prototype set, and determining, by the detection traceability module, the prototype detection traceability loss value according to the second feature.
8. The method of claim 5, wherein the prototype updating module further comprises a forgery method discriminator and a true-false discriminator; the prototype update loss value comprises a first loss value, a second loss value and a third loss value; and said determining a prototype update loss value according to the first feature comprises:
determining a target real feature of each real prototype from the plurality of real features according to the sample label of each real feature, determining a target unique counterfeit feature of each unique counterfeit prototype from the plurality of unique counterfeit features according to the sample label of each unique counterfeit feature, and determining a target common counterfeit feature of each common counterfeit prototype from the plurality of common counterfeit features according to the sample label of each common counterfeit feature;
determining the first loss value according to the prototype set, the target real features, the target unique counterfeit features and the target common counterfeit features;
determining the second loss value according to the target common counterfeit features and the forgery method discriminator;
and determining the third loss value according to the sample label of the common counterfeit feature, the common counterfeit feature and the true-false discriminator.
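The selection of target features in claim 8 amounts to grouping features by their sample labels, one group per prototype. A minimal sketch follows, assuming each group is averaged into a single target feature; the application does not state how a group is reduced, so the averaging and the example labels are assumptions.

```python
from collections import defaultdict
import numpy as np

def target_features_per_prototype(features, sample_labels):
    """Group feature vectors by the prototype index carried in their sample label and
    average each group into one target feature for that prototype (averaging is assumed)."""
    buckets = defaultdict(list)
    for feature, label in zip(features, sample_labels):
        buckets[label].append(feature)
    return {label: np.mean(group, axis=0) for label, group in buckets.items()}

# Hypothetical unique counterfeit features, labelled with the forgery-method index.
rng = np.random.default_rng(0)
unique_counterfeit_features = [rng.standard_normal(512) for _ in range(6)]
method_labels = [0, 0, 1, 1, 2, 2]
target_unique_counterfeit = target_features_per_prototype(unique_counterfeit_features, method_labels)
```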
9. The method of claim 8, wherein the first loss value is determined according to a first formula:
wherein L_1 is the first loss value; I denotes an indication function for selecting the corresponding z_a and m_ij according to the sample label; z_a is the target real feature, the target unique counterfeit feature or the target common counterfeit feature; m_ij denotes a prototype in the prototype set, where i=0 indicates that m_ij is a real prototype, i=1 indicates that m_ij is a unique counterfeit prototype, and i=2 indicates that m_ij is a common counterfeit prototype; j denotes the index of the prototype within the prototype set; and d(I(z_a), I(m_ij)) denotes the feature distance between I(z_a) and I(m_ij).
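Since the exact functional form of the first formula is not reproduced above, the sketch below only illustrates the kind of prototype-alignment computation the variable definitions describe, assuming an L2 feature distance averaged over samples; the feature and prototype values are hypothetical.

```python
import numpy as np

def first_loss(selected_features, selected_prototypes):
    """Assumed prototype-alignment form: average feature distance between each
    selected feature z_a and the prototype m_ij selected by the same sample label."""
    distances = [np.linalg.norm(z - m) for z, m in zip(selected_features, selected_prototypes)]
    return float(np.mean(distances))

# Hypothetical selections made by the indication function I for two training samples:
# a real sample matched to a real prototype and a counterfeit sample matched to
# its unique counterfeit prototype.
z_real, m_real = np.array([0.1, 0.0, 0.1, 0.0]), np.zeros(4)
z_fake, m_fake = np.array([0.9, 1.1, 1.0, 0.8]), np.ones(4)
print(first_loss([z_real, z_fake], [m_real, m_fake]))
```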
10. The method of claim 8, wherein the second loss value is determined according to a second formula:
wherein L_2 denotes the second loss value; x denotes the target sample; f denotes the feature encoder; g denotes the multi-headed cross-attention sub-module; D_m denotes the forgery method discriminator; f(x) denotes the second feature; I_[y_b=1] denotes an indication function for selecting, from the common counterfeit features, the target common counterfeit features whose sample label indicates a counterfeit object; y_b denotes a binary label, with y_b=1 indicating that the binary label is a counterfeit object; N denotes the number of target samples; E_{x, y_b, y_m}[·] denotes the expectation over x, y_b and y_m; y_m denotes the counterfeit type label; θ_f denotes the parameters of the feature encoder; θ_g denotes the parameters of the multi-headed cross-attention sub-module; and θ_{D_m} denotes the parameters of the forgery method discriminator.
11. The method of claim 8, wherein the third loss value is determined according to a third formula:
wherein L_3 is the third loss value, and the probability term is the probability output by the true-false discriminator when the common counterfeit feature is input into it.
12. The method of claim 5, wherein the prototype detection traceability loss value comprises a fourth loss value and a fifth loss value; and said determining a prototype detection traceability loss value according to the second feature comprises:
determining, by the detection traceability module, a first sample feature distance set, a second sample feature distance set and a third sample feature distance set, wherein the first sample feature distance set comprises feature distances between the second feature and each of the at least one preset true prototype, the second sample feature distance set comprises feature distances between the second feature and each of the at least two preset unique counterfeit prototypes, and the third sample feature distance set comprises feature distances between the second feature and each of the at least one preset common counterfeit prototype;
determining the fourth loss value of the target sample according to the first sample feature distance set and the second sample feature distance set;
and determining the fifth loss value of the target sample according to the first sample feature distance set and the third sample feature distance set.
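The claims do not give explicit formulas for the fourth and fifth loss values, so the sketch below shows only one plausible distance-based form, in which negative distances serve as logits for a real-versus-counterfeit cross-entropy; the function name and the distance values are hypothetical.

```python
import numpy as np

def distance_based_loss(real_distances, counterfeit_distances, sample_is_counterfeit):
    """Assumed form: negative distances act as two logits (real vs. counterfeit) for a
    cross-entropy loss, pulling the second feature toward the correct prototype group."""
    logits = np.array([-real_distances.mean(), -counterfeit_distances.min()])
    log_probs = logits - np.log(np.exp(logits).sum())
    return float(-log_probs[1] if sample_is_counterfeit else -log_probs[0])

# Hypothetical sample feature distance sets for one counterfeit training sample.
first_sample_set = np.array([1.4])               # distances to the preset real prototypes
second_sample_set = np.array([0.7, 1.9, 2.2])    # distances to the preset unique counterfeit prototypes
third_sample_set = np.array([0.8])               # distances to the preset common counterfeit prototypes

fourth_loss = distance_based_loss(first_sample_set, second_sample_set, sample_is_counterfeit=True)
fifth_loss = distance_based_loss(first_sample_set, third_sample_set, sample_is_counterfeit=True)
```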
13. A counterfeit object detection device, comprising:
a transceiver module, configured to acquire a target object feature of a target object in data to be detected;
a processing module, configured to determine a first feature distance set, a second feature distance set and a third feature distance set, wherein the first feature distance set is the feature distance between the target object feature and each of at least one real prototype, the second feature distance set is the feature distance between the target object feature and each of at least two unique counterfeit prototypes, and the third feature distance set is the feature distance between the target object feature and each of at least one common counterfeit prototype; and if the target object is determined to be a fake object based on the first feature distance set, the second feature distance set and the third feature distance set, determine the fake type of the target object according to the second feature distance set.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310768511.7A CN116778306A (en) 2023-06-27 2023-06-27 Fake object detection method, related device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310768511.7A CN116778306A (en) 2023-06-27 2023-06-27 Fake object detection method, related device and storage medium

Publications (1)

Publication Number Publication Date
CN116778306A true CN116778306A (en) 2023-09-19

Family

ID=88009635

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310768511.7A Pending CN116778306A (en) 2023-06-27 2023-06-27 Fake object detection method, related device and storage medium

Country Status (1)

Country Link
CN (1) CN116778306A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117523683A (en) * 2024-01-05 2024-02-06 湖北微模式科技发展有限公司 Fraud video detection method based on biological feature recognition
CN117523683B (en) * 2024-01-05 2024-03-29 湖北微模式科技发展有限公司 Fraud video detection method based on biological feature recognition

Similar Documents

Publication Publication Date Title
CN111461089B (en) Face detection method, and training method and device of face detection model
CN110704661B (en) Image classification method and device
CN111401445B (en) Training method of image recognition model, and image recognition method and device
CN114973351B (en) Face recognition method, device, equipment and storage medium
CN111339737B (en) Entity linking method, device, equipment and storage medium
CN111709398A (en) Image recognition method, and training method and device of image recognition model
CN115937638B (en) Model training method, image processing method, related device and storage medium
CN115022098B (en) Artificial intelligence safety target range content recommendation method, device and storage medium
CN114722937A (en) Abnormal data detection method and device, electronic equipment and storage medium
CN116778306A (en) Fake object detection method, related device and storage medium
CN114694226B (en) Face recognition method, system and storage medium
CN114281936A (en) Classification method and device, computer equipment and storage medium
CN116935188B (en) Model training method, image recognition method, device, equipment and medium
CN114821751B (en) Image recognition method, device, system and storage medium
CN116071614A (en) Sample data processing method, related device and storage medium
CN115239941A (en) Confrontation image generation method, related device and storage medium
CN116386647B (en) Audio verification method, related device, storage medium and program product
CN115412726B (en) Video authenticity detection method, device and storage medium
CN115376192B (en) User abnormal behavior determination method, device, computer equipment and storage medium
CN116363490A (en) Fake object detection method, related device and storage medium
CN115565215B (en) Face recognition algorithm switching method and device and storage medium
CN116580268B (en) Training method of image target positioning model, image processing method and related products
CN115525554B (en) Automatic test method, system and storage medium for model
CN116934982A (en) Object processing method, related device and storage medium
CN117373093A (en) Image recognition method, device, equipment and storage medium based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination