WO2023149513A1 - Forgery image detection device, forgery image detection method, and program - Google Patents

Forgery image detection device, forgery image detection method, and program Download PDF

Info

Publication number
WO2023149513A1
WO2023149513A1 (PCT/JP2023/003431)
Authority
WO
WIPO (PCT)
Prior art keywords
image data
image
forged
counterfeit
machine learning
Prior art date
Application number
PCT/JP2023/003431
Other languages
French (fr)
Japanese (ja)
Inventor
Toshihiko Yamasaki
Kaede Shiohara
Original Assignee
The University of Tokyo
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The University of Tokyo
Publication of WO2023149513A1 publication Critical patent/WO2023149513A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis

Definitions

  • the present invention relates to a counterfeit image detection device, a counterfeit image detection method, and a program.
  • methods for detecting deepfake forgeries have been studied; for example, deep-learning-based detection methods have been developed (e.g., Non-Patent Document 1).
  • the present invention has been made in view of the above circumstances, and one of its objects is to provide a counterfeit image detection device, a counterfeit image detection method, and a program capable of detecting various counterfeit images.
  • one aspect of the present invention is a counterfeit image detection device that detects counterfeit images, comprising a processor and a memory, wherein the processor: receives original image data to be used for machine learning and stores it in the memory; duplicates the stored original image data to generate first and second duplicated image data; applies a predetermined data augmentation process to at least one of the generated first and second duplicated image data and treats them as source image data and target image data; generates at least one piece of forged image data using the target image data and the source image data; inputs the original image data, as non-forged images, and the generated forged image data, as forged images, into a predetermined machine learning model; and trains the machine learning model so that non-forged images and forged images can be distinguished, the trained machine learning model then being used in the process of detecting forged images.
  • various forged images can be detected regardless of the domain of the dataset used for learning.
  • FIG. 1 is a block diagram showing a configuration example of a forged image detection device according to an embodiment of the present invention
  • FIG. 2 is a functional block diagram showing an example of a control unit that performs machine learning processing of the forged image detection device according to the embodiment of the present invention
  • FIG. 3 is a functional block diagram showing an example of a control unit that performs discrimination processing of the forged image detection device according to the embodiment of the present invention
  • FIG. 4 is a flowchart showing an operation example of the forged image detection device according to the embodiment of the present invention
  • FIG. 5 is an explanatory diagram showing an example of image processing by the operation of the forged image detection device according to the embodiment of the present invention
  • a forged image detection apparatus 1 can be realized by a general computer device including a control unit 11, a storage unit 12, an operation unit 13, a display unit 14, and a communication unit 15, as illustrated in FIG. 1.
  • control unit 11 is a processor such as a CPU, and operates according to a program stored in the storage unit 12.
  • the control unit 11 receives original image data to be machine-learned and duplicates the original image data to generate first and second duplicate image data.
  • the control unit 11 applies a predetermined data augmentation process to at least one of the generated first and second duplicated image data, treats them as source image data and target image data, and generates at least one piece of counterfeit image data using the target image data and the source image data, while the original image data serves as a non-counterfeit image.
  • the control unit 11 inputs the forged image data thus generated, as forged images, into a predetermined machine learning model, and trains the model so that non-forged images and forged images can be distinguished. The control unit 11 then uses the trained machine learning model in the process of detecting forged images. The detailed operation of the control unit 11 will be described later.
  • the storage unit 12 is a memory device, disk device, or the like, and holds programs executed by the control unit 11.
  • This program may be provided stored in a computer-readable, non-transitory recording medium, and may be copied to the storage unit 12.
  • the storage unit 12 also operates as a work memory for the control unit 11, and holds original image data, duplicated image data, machine learning models, and the like.
  • the operation unit 13 is a mouse, a keyboard, or the like, and accepts user operations and outputs information representing the contents of those operations to the control unit 11.
  • the display unit 14 is a display device or the like, and outputs information according to instructions input from the control unit 11.
  • the communication unit 15 is a network interface or the like, and transmits and receives various data via a network according to instructions input from the control unit 11.
  • control unit 11 executes a program stored in the storage unit 12 to execute machine learning processing and discrimination processing.
  • the control unit 11 that performs machine learning processing functionally includes an original image receiving unit 21, a duplicated image generation unit 22, a data augmentation processing unit 23, a forged image data generation unit 24, and a machine learning processing unit 25, as illustrated in FIG. 2.
  • the control unit 11 that performs discrimination processing functionally includes a target image receiving unit 31, a discrimination processing unit 32, and an output unit 33, as illustrated in FIG. 3.
  • the original image receiving unit 21 receives original image data to be machine-learned.
  • the original image receiving unit 21 normally receives a plurality of original image data to be subjected to machine learning via the communication unit 15, for example.
  • the original image data is image data including an image portion to be forged, such as a person's face portion.
  • the original image receiving unit 21 stores the received original image data in the storage unit 12.
  • the original image receiving unit 21 may resize the input image data to a predetermined size and store it in the storage unit 12 as original image data.
  • the resizing process here applies, to the input image data, at least one process such as enlargement/reduction of the image data and cropping of a predetermined region of interest (for example, the portion where a face image is captured).
  • the duplicated image generation unit 22 sequentially selects each piece of original image data stored in the storage unit 12 as a processing target, and duplicates the selected original image data to generate first and second duplicated image data.
  • the data augmentation processing unit 23 applies predetermined data augmentation processing to at least one of the first and second duplicated image data generated by the duplicated image generation unit 22.
  • the data augmentation process is preferably a process of simulating traces of forgery appearing in the forged image when the image is forged.
  • examples of such traces of counterfeiting include: (a) landmark mismatch, a feature-point defect arising when another person's face is synthesized onto the image of the original person; (b) a visible blending boundary; (c) color mismatch due to differences in exposure and skin tone between the original image and the counterfeit image; and (d) frequency inconsistency due to differences in image quality (e.g., frequency characteristics introduced by encoding); and so on.
  • the data augmentation processing unit 23 selects one of the first and second duplicated image data, either by a predetermined method (for example, always the same duplicate) or at random, and applies color conversion processing and frequency conversion processing to the selected duplicate as data augmentation processing.
  • This color conversion processing includes a process of randomly changing the value of each RGB channel (RGB shift), a process of randomly changing hue, saturation, and value (HueSaturationValue), a process of randomly changing brightness and contrast (RandomBrightnessContrast), and the like. The parameters for each process are determined appropriately by experiment.
  • as the frequency conversion process, one operation is randomly selected from, for example, a process of downscaling and then upscaling to degrade image quality (Downscale) and a sharpening process (Sharpen), and the selected operation alone is applied.
  • the data augmentation processing unit 23 outputs one of the first and second duplicated image data, at least one of which has undergone data augmentation processing, as the source image data, and outputs the other as the target image data.
  • which one becomes the source image data and which the target image data may be determined in advance, or may be varied for each sample (each original image data) by generating a uniform random number.
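The duplication, augmentation, and random source/target assignment described above can be sketched as follows. This is a minimal illustrative sketch using NumPy only, not the implementation described in the patent; the function names and parameter ranges are hypothetical stand-ins for the color conversion processes named in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def rgb_shift(img, limit=20):
    # Randomly shift each RGB channel by an integer in [-limit, limit].
    shift = rng.integers(-limit, limit + 1, size=3)
    return np.clip(img.astype(np.int32) + shift, 0, 255).astype(np.uint8)

def brightness_contrast(img, alpha_range=0.2, beta_range=30):
    # Randomly scale contrast (alpha) and shift brightness (beta).
    alpha = 1.0 + rng.uniform(-alpha_range, alpha_range)
    beta = rng.uniform(-beta_range, beta_range)
    return np.clip(alpha * img.astype(np.float32) + beta, 0, 255).astype(np.uint8)

# Duplicate one original image and augment one of the two copies.
original = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
copy1, copy2 = original.copy(), original.copy()
if rng.random() < 0.5:
    copy1 = brightness_contrast(rgb_shift(copy1))
else:
    copy2 = brightness_contrast(rgb_shift(copy2))

# Assign the source/target roles per sample with a uniform random draw.
src, tgt = (copy1, copy2) if rng.random() < 0.5 else (copy2, copy1)
```

In practice the per-process parameters would be drawn within experimentally chosen ranges, as the text notes.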
  • the forged image data generation unit 24 generates at least one piece of forged image data using the source image data and the target image data output by the data augmentation processing unit 23.
  • the source image data and the target image data are denoted Is and It, respectively, for ease of distinction.
  • the forged image data generation unit 24 identifies image portions to be forged for each of the source image data Is and the target image data It.
  • the image portion to be forged may be each part of the human body such as the face (head), hand, body, and leg of the person. Note that these are only examples, and other parts may be the target of forgery. For the sake of explanation, the case where a person's face is to be forged will be taken as an example below.
  • the forged image data generation unit 24 inputs the original image data, which are the sources of the source image data Is and the target image data It, to the facial feature point extractor.
  • as the facial feature point extractor, a widely known extractor such as one using Haar-like features can be used to identify the regions containing the feature points of the face.
  • the forged image data generation unit 24 obtains a convex area of feature points of the face extracted from the original image data that is the source of the source image data Is and the target image data It (ConvexHull).
  • the forged image data generation unit 24 obtains a mask image M in which the pixels in the convex region obtained here have a predetermined pixel value (for example, "1"), and combines the image inside the mask image M extracted from the source image data Is with the area (1 - M) outside the mask image M extracted from the target image data It, to generate the forged image ISB = M * Is + (1 - M) * It.
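The mask-based blending just described reduces to a per-pixel combination of the two duplicates. A minimal NumPy sketch, with toy image sizes and a hypothetical helper name, might look like:

```python
import numpy as np

def blend(source, target, mask):
    # Self-blending: masked region from the source, remainder from the
    # target, i.e. ISB = M * Is + (1 - M) * It applied per pixel.
    m = mask[..., None].astype(np.float32)  # broadcast mask over RGB
    out = m * source.astype(np.float32) + (1.0 - m) * target.astype(np.float32)
    return np.clip(out, 0, 255).astype(np.uint8)

# Toy 4x4 image pair with a 2x2 region standing in for the convex hull
# of the facial feature points.
Is = np.full((4, 4, 3), 200, dtype=np.uint8)  # source image data
It = np.full((4, 4, 3), 50, dtype=np.uint8)   # target image data
M = np.zeros((4, 4), dtype=np.float32)
M[1:3, 1:3] = 1.0
ISB = blend(Is, It, M)
```

Because the mask may later take fractional values (see the mask augmentation variant below in the text), the sketch uses a float mask rather than a boolean one.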
  • as the machine learning model, a widely known classification model such as EfficientNet-b4 (EFNB4: Mingxing Tan and Quoc V. Le, EfficientNet: Rethinking model scaling for convolutional neural networks, In ICML, pp. 6105-6114, 2019) can be used.
  • the label tj is "0" if xj is any of the non-forgery images I_i, and "1" if xj is any of the forgery images ISB_i.
  • the machine learning processing unit 25 uses this image sequence (training image data set) X and the corresponding labels T to optimize the classification model F(·) with respect to a cross-entropy loss L = -(1/N) Σ_j [t_j log F(x_j) + (1 - t_j) log(1 - F(x_j))].
  • F(x) represents the probability that image x is a forged image.
  • as the optimization method, a widely known method such as backpropagation, which is standard in machine learning, can be employed.
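The cross-entropy objective can be illustrated with a small NumPy sketch; the helper below is an assumption for illustration only, not the patent's training code:

```python
import numpy as np

def bce_loss(probs, labels):
    # Binary cross-entropy:
    # L = -(1/N) * sum_j [t_j log p_j + (1 - t_j) log(1 - p_j)],
    # where p_j = F(x_j) is the model's forged-image probability.
    p = np.clip(np.asarray(probs, dtype=np.float64), 1e-7, 1 - 1e-7)
    t = np.asarray(labels, dtype=np.float64)
    return float(-np.mean(t * np.log(p) + (1 - t) * np.log(1 - p)))

# Labels: 0 for non-forged originals I_i, 1 for generated forgeries ISB_i.
loss = bce_loss([0.9, 0.2, 0.8], [1, 0, 1])
```

Minimizing this loss by backpropagation drives F(x) toward 1 on forged images and toward 0 on non-forged ones.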
  • the label tj is "0" if xj is any of the non-forgery images It_i, and is "1" if xj is any of the forgery images ISB_i.
  • the target image receiving unit 31 receives input of image data to be determined.
  • the target image receiving unit 31 may resize the input image data to a predetermined size and use the result as the target image data T.
  • the resizing process here applies, to the input image data, at least one process such as enlargement/reduction of the image data and cropping of a predetermined region of interest (for example, the portion where a face image is captured).
  • the discrimination processing unit 32 inputs the target image data T into the classification model F( ⁇ ), which is a machine learning model that has undergone machine learning through the above-described machine learning process. The determination processing unit 32 then determines whether or not the target image data T is forged image data based on the output of this classification model F(T).
  • the output of the classification model F(T) represents the probability pT that the target image data T is forged image data (0 ≤ pT ≤ 1). Therefore, when the output F(T) exceeds a predetermined threshold value (for example, 1/2), the determination processing unit 32 outputs information indicating that the target image data is forged image data; otherwise, it outputs information indicating that the target image data is not forged image data.
  • This threshold can be set arbitrarily, for example experimentally, within the range of values of the output F(T) (not including the boundaries).
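The thresholding step above amounts to a one-line decision rule; the helper name below is hypothetical:

```python
def is_forged(prob, threshold=0.5):
    # Declare the target image forged when F(T) = pT exceeds the threshold,
    # as described in the text; the threshold is set experimentally.
    return prob > threshold
```

Note the comparison is strict, so an output exactly at the threshold is reported as not forged.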
  • the output unit 33 outputs information indicating whether or not the target image data is forged image data output by the determination processing unit 32 to the display unit 14 or the like, and presents it to the user.
  • the discrimination processing unit 32 may output the raw output of the classification model F(T) to the output unit 33, which outputs it to the display unit 14 or the like and presents it to the user.
  • the forged image detection apparatus 1 of the present embodiment basically has the configuration described above, and operates as follows.
  • the forged image detection apparatus 1 performs machine learning for the purpose of detecting forged images in which another person's face has been synthesized onto an image of a person's face.
  • a plurality of image data in which a person's face is captured are prepared in advance as original image data, and are sequentially input to the counterfeit image detection device 1.
  • the forged image detection device 1 starts the processing illustrated in FIG. 4 and performs the following processing for each original image data I_i. That is, the forged image detection apparatus 1 duplicates the original image data I_i to be processed to generate first and second duplicated image data IR1_i and IR2_i (S11, FIG. 5: S21). To at least one of the first and second duplicated image data IR1_i and IR2_i generated in step S11 (here, to both), the forged image detection apparatus 1 applies data augmentation processing including color conversion processing and frequency conversion processing (S12, FIG. 5: S22).
  • the color conversion process performed in step S12 includes a process of randomly changing the value of each RGB channel (RGB shift), a process of randomly changing hue, saturation, and value (HueSaturationValue), and a process of randomly changing brightness and contrast (RandomBrightnessContrast); here, each of these processes is applied in sequence. The parameters for each process are randomly determined within ranges determined experimentally in advance.
  • the forged image detection device 1 performs color conversion processing and frequency conversion processing on each of the first and second duplicate image data IR1_i and IR2_i based on mutually different parameters.
  • the forged image detection device 1 sets one of the first and second duplicated image data IR1_i and IR2_i after the data augmentation processing as the source image data Is and the other as the target image data It (S13).
  • which is the source image data Is and which is the target image data It is decided for each sample (original image data) by generating a uniform random number.
  • the forged image detection device 1 identifies a face image portion as a part of the imaged area of the original image data I_i processed in step S11 (S14, FIG. 5: S23). This processing can be performed by inputting the original image data I_i into a well-known face feature point extractor and detecting the positions where the feature points of each part of the face are captured.
  • the forged image detection device 1 obtains a polygonal area (convex area) including the identified face image portion as the mask image M (S15, FIG. 5: S24).
  • the forged image detection apparatus 1 extracts the image inside the mask image M obtained in step S15 from the source image data Is, and synthesizes it with the area (1 - M) outside the mask image M extracted from the target image data It, thereby generating a forged image ISB_i (S16, FIG. 5: S25).
  • One of the characteristics of this embodiment is that, when generating a forged image, differing image processing is applied to the two copies of a single piece of original image data (one of the copies may be left unprocessed) to produce a target/source pair, which is then blended to obtain the forged image data. Blending a target and a source derived from the same image in this way generally yields forged image data that is difficult to identify.
  • the counterfeit image detection device 1 performs the above processing for each piece of original image data I_i, obtains at least one counterfeit image ISB_i from each, and stores them in the storage unit 12.
  • the forged image detection device 1 machine-learns a predetermined machine-learning model (for example, EfficientNet-b4 (EFNB4)) using this training image data set (S18).
  • the forged image detection device 1 uses the training image data set X and the corresponding labels T to optimize the classification model F(·) with respect to a cross-entropy loss L.
  • F(x) represents the probability that image x is a forged image.
  • the optimization can be performed using a widely known method such as backpropagation, so a detailed explanation is omitted here.
  • whether or not the image data to be processed is forged image data is determined as follows.
  • the user inputs image data to be determined to the forged image detection device 1 .
  • the counterfeit image detection apparatus 1 receives this image data, performs predetermined preprocessing such as resizing processing, and obtains target image data T.
  • the forged image detection apparatus 1 inputs the target image data T to the classification model F(·), which is a machine learning model that has been trained by the above-described machine learning process. Since the output of this classification model F(T) represents the probability pT that the target image data T is counterfeit image data (0 ≤ pT ≤ 1), the counterfeit image detection device 1 displays this output F(T) as it is on the display unit 14 and presents it to the user. The user refers to this probability to determine whether or not the input image data is forged image data.
  • the forged image detection apparatus 1 of the present embodiment can also perform processing for determining whether or not moving image data is a forged image.
  • the machine learning process may be as described above.
  • the forged image detection device 1 obtains, for each frame, the probability pT that the frame's image data is forged image data, and then generates and displays a statistical calculation result, for example the average of the per-frame probabilities.
  • The user can refer to the statistical calculation result presented by the forged image detection device 1 to determine whether the moving image data is a forged image.
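The per-frame aggregation described above can be sketched as follows (hypothetical helper name, NumPy only):

```python
import numpy as np

def video_forgery_score(frame_probs):
    # Aggregate the per-frame forged-image probabilities pT into a single
    # statistic for the whole clip; here the average, as in the text.
    return float(np.mean(frame_probs))

score = video_forgery_score([0.1, 0.9, 0.8, 0.6])
```

Other statistics (median, maximum, fraction of frames above the threshold) could serve equally well as the displayed summary.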
  • the forged image detection apparatus 1 of the present embodiment may further apply a predetermined data augmentation process to the mask image M generated in the processing of the forged image data generation unit 24 (the processing of step S15 illustrated in FIG. 4).
  • the data augmentation processing for the mask image can be, for example, resizing processing, deformation processing, scaling processing, and the like.
  • the forged image detection apparatus 1 detects facial feature points in the partial area of the original image data I_i where a face is captured, randomly invalidates some of those feature points, and then obtains, as the initial mask image Ms, a polygonal area (convex area) containing the facial feature points excluding the invalidated ones. A predetermined elastic deformation process (Elastic Deformation) is then applied to this initial mask image Ms.
  • This elastic deformation process can adopt the method disclosed in Tianchen Zhao, et al., Learning self-consistency for deepfake detection, In ICCV, pp. 15023-15033, 2021, so a detailed description is omitted here.
  • a plurality of Gaussian filters are then sequentially applied to the mask image that has undergone the elastic deformation processing to perform enlargement/reduction processing (at this point, at least some pixels of the mask image take values between 0 and 1). Furthermore, the pixel values of the mask image are multiplied by a randomly determined constant r to obtain the mask image M that is actually used.
  • the constant r is a value greater than 0 and less than or equal to 1, and may be determined by uniformly sampling from among 0.25, 0.5, 0.75, and 1.
  • This diversifies the blending ratio between the source image and the target image, and thereby diversifies the generated forged image data.
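The mask softening and scaling by the constant r can be sketched as follows; a simple box filter stands in for the sequential Gaussian filtering, and all names are illustrative rather than the patent's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def soften(mask, k=3):
    # Box-filter the binary mask so boundary pixels fall in (0, 1); a
    # simple stand-in for the sequential Gaussian filtering in the text.
    pad = k // 2
    padded = np.pad(mask, pad, mode="edge")
    out = np.zeros_like(mask, dtype=np.float64)
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out

def augment_mask(mask):
    # Soften the mask, then scale it by r drawn uniformly from
    # {0.25, 0.5, 0.75, 1} to vary the source/target blending ratio.
    r = rng.choice([0.25, 0.5, 0.75, 1.0])
    return soften(mask) * r

M0 = np.zeros((6, 6))
M0[2:4, 2:4] = 1.0
M = augment_mask(M0)
```

With r < 1 the source region is only partially blended in, which, as the text notes, diversifies the generated forged image data.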
  • the forged image detection apparatus 1 of the present embodiment may further obtain forged image data by extracting the portion inside the mask image M from source image data Is to which image processing including at least one of resizing, translation, rotation, brightness change, saturation change, and hue change has been applied, and synthesizing it with the portion (1 - M) of the target image data outside the mask image M.
  • beyond the examples above, the forged image detection apparatus 1 may generate the source image data Is (or the target image data It) by applying image processing including at least one of resizing, translation, rotation, color conversion processing (brightness change, saturation change, hue change), and frequency conversion processing.
  • JPEG: Joint Photographic Experts Group
  • 1 forged image detection device, 11 control unit, 12 storage unit, 13 operation unit, 14 display unit, 15 communication unit, 21 original image receiving unit, 22 duplicate image generation unit, 23 data augmentation processing unit, 24 forged image data generation unit, 25 machine learning processing unit, 31 target image receiving unit, 32 discrimination processing unit, 33 output unit.

Abstract

This forgery image detection device detects forged images and comprises a processor and a memory. The processor: (S21) receives original image data for machine learning, stores it in the memory, and duplicates the stored original image data; (S22) applies a predetermined data augmentation process to the pair of duplicated image data items obtained by the duplication, which serve as source image data (Is) and target image data (It); and (S25) generates at least one forged image data item (ISB) using the generated target image data and source image data. The original image data is used as non-forged images and the generated forged image data as forged images to train a predetermined machine learning model into a state capable of distinguishing between non-forged and forged images.

Description

Forgery image detection device, forgery image detection method, and program
 The present invention relates to a forgery image detection device, a forgery image detection method, and a program.
 In recent years, image forgery techniques using artificial intelligence (so-called deepfake techniques) have become a problem. Methods for detecting deepfake forgeries have therefore been studied; for example, deep-learning-based detection methods have been developed (e.g., Non-Patent Document 1).
 However, while the technology of the above conventional example effectively detects forgeries produced by methods similar to those in the training data and in images of similar quality and scene (so-called in-dataset targets), it is known that detection accuracy drops for so-called cross-dataset targets, such as forged images generated by methods different from those in the training data (Ali Khodabakhsh, et al., Fake face detection methods: Can they be generalized?, In BIOSIG, pp. 1-6, 2018, etc.).
 The present invention has been made in view of the above circumstances, and one of its objects is to provide a counterfeit image detection device, a counterfeit image detection method, and a program capable of detecting a wide variety of counterfeit images.
 One aspect of the present invention that solves the problems of the conventional example is a counterfeit image detection device that detects counterfeit images, comprising a processor and a memory, wherein the processor: receives original image data to be used for machine learning and stores it in the memory; duplicates the stored original image data to generate first and second duplicated image data; applies a predetermined data augmentation process to at least one of the generated first and second duplicated image data and treats them as source image data and target image data; generates at least one piece of forged image data using the target image data and the source image data; inputs the original image data, as non-forged images, and the generated forged image data, as forged images, into a predetermined machine learning model; and trains the machine learning model so that non-forged images and forged images can be distinguished, the trained machine learning model then being used in the process of detecting forged images.
 According to the present invention, a wide variety of forged images can be detected regardless of the domain of the dataset used for learning.
FIG. 1 is a block diagram showing a configuration example of the forged image detection device according to an embodiment of the present invention. FIG. 2 is a functional block diagram showing an example of the control unit that performs machine learning processing in the forged image detection device according to the embodiment of the present invention. FIG. 3 is a functional block diagram showing an example of the control unit that performs discrimination processing in the forged image detection device according to the embodiment of the present invention. FIG. 4 is a flowchart showing an operation example of the forged image detection device according to the embodiment of the present invention. FIG. 5 is an explanatory diagram showing an example of image processing by the operation of the forged image detection device according to the embodiment of the present invention.
 An embodiment of the present invention will be described with reference to the drawings. A forged image detection apparatus 1 according to an embodiment of the present invention can be realized by a general computer device including a control unit 11, a storage unit 12, an operation unit 13, a display unit 14, and a communication unit 15, as illustrated in FIG. 1.
 Here, the control unit 11 is a processor such as a CPU, and operates according to a program stored in the storage unit 12. In one example of the present embodiment, the control unit 11 receives original image data to be used for machine learning and duplicates it to generate first and second duplicated image data. The control unit 11 applies a predetermined data augmentation process to at least one of the generated first and second duplicated image data, treats them as source image data and target image data, and generates at least one piece of forged image data using the target image data and the source image data, while the original image data serves as a non-forged image. The control unit 11 inputs the forged image data thus generated, as forged images, into a predetermined machine learning model, and trains the model so that non-forged images and forged images can be distinguished. The control unit 11 then uses the trained machine learning model in the process of detecting forged images. The detailed operation of the control unit 11 will be described later.
 The storage unit 12 is a memory device, a disk device, or the like, and holds the programs executed by the control unit 11. These programs may be provided stored on a computer-readable, non-transitory recording medium and copied into the storage unit 12. The storage unit 12 also operates as a work memory for the control unit 11, and holds the original image data, the duplicate image data, the machine learning model, and the like.
 The operation unit 13 is a mouse, a keyboard, or the like; it accepts user operations and outputs information representing the contents of those operations to the control unit 11. The display unit 14 is a display device or the like, and outputs information according to instructions input from the control unit 11. The communication unit 15 is a network interface or the like, and transmits and receives various data via a network according to instructions input from the control unit 11.
 Next, the operation of the control unit 11 will be described. In one example of the present embodiment, the control unit 11 executes a program stored in the storage unit 12 to perform a machine learning process and a discrimination process. As illustrated in FIG. 2, when performing the machine learning process the control unit 11 functionally includes an original image receiving unit 21, a duplicate image generating unit 22, a data augmentation processing unit 23, a forged image data generating unit 24, and a machine learning processing unit 25.
 As illustrated in FIG. 3, when performing the discrimination process the control unit 11 functionally includes a target image receiving unit 31, a discrimination processing unit 32, and an output unit 33.
[Machine learning processing]
 First, the functional configuration of the control unit 11 when performing the machine learning process will be described. The original image receiving unit 21 accepts original image data to be subjected to machine learning, normally receiving a plurality of such original image data, for example via the communication unit 15. Here, the original image data is image data that includes an image portion subject to forgery, such as a person's face. The original image receiving unit 21 stores the accepted original image data in the storage unit 12.
 Note that the original image receiving unit 21 may resize the input image data to a predetermined size and store the result in the storage unit 12 as the original image data. The resizing here is performed by applying to the input image data at least one process such as enlarging or reducing the image data and cropping a predetermined region of interest (for example, the portion in which a face is captured).
 The duplicate image generating unit 22 sequentially selects each piece of original image data stored in the storage unit 12 as the target of processing, and duplicates the selected original image data to generate first and second duplicate image data.
 The data augmentation processing unit 23 applies a predetermined data augmentation process to at least one of the first and second duplicate image data generated by the duplicate image generating unit 22. The data augmentation process is preferably one that simulates the traces of forgery that appear in a forged image when an image is forged. Specific examples of such traces of forgery include:
(a) landmark mismatch: defects in feature points that occur when another person's face is composited onto the original person's image, caused by differences in face shape between the original person and the other person;
(b) a visible blending boundary;
(c) color mismatch caused by differences in exposure or skin tone between the original image and the forged image; and
(d) frequency inconsistency in image quality (for example, frequency characteristics introduced by encoding).
 Thus, in one example of the present embodiment, the data augmentation processing unit 23 selects one of the first and second duplicate image data by a predetermined method, for example a predetermined one of the two or one chosen at random. It then applies a color conversion process and a frequency conversion process to the selected duplicate image data as the data augmentation process. As the color conversion process, it uses operations such as randomly shifting the value of each RGB channel (RGBShift), randomly changing the hue, saturation, and value (HueSaturationValue), and randomly changing the brightness and contrast (RandomBrightnessContrast). The parameters of each operation are determined experimentally as appropriate.
 As the frequency conversion process, one operation may be selected, for example at random, from among downscaling and then upscaling the image to degrade its quality (Downscale), sharpening it (Sharpen), and so on, and applied alternatively.
 Note that these operations are only examples, and various other color conversion and frequency conversion processes may be performed. Also, although one of the first and second duplicate image data is selected here and subjected to the data augmentation process, mutually different data augmentation processes may instead be applied to both.
 The data augmentation processing unit 23 outputs one of the first and second duplicate image data, at least one of which has undergone the data augmentation process, as the source image data, and the other as the target image data. Which one becomes the source image data and which becomes the target image data may be determined in advance, or may be varied for each sample (original image data) by generating a uniform random number.
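The duplicate-augment-assign flow described above can be illustrated with a minimal, self-contained sketch. This is a toy example rather than the apparatus's actual implementation; in particular, `augment` is a hypothetical stand-in for the RGBShift/HueSaturationValue/Downscale-style transforms named in the text, and simply perturbs the channel values.

```python
import random

def augment(img):
    # Hypothetical stand-in for the color/frequency transforms
    # (RGBShift, HueSaturationValue, Downscale, ...) named above:
    # shift every channel value by a random amount, clamped to [0, 255].
    shift = random.randint(-10, 10)
    return [[tuple(min(255, max(0, c + shift)) for c in px) for px in row]
            for row in img]

def make_source_and_target(original):
    """Duplicate the original image, augment at least one copy, and
    randomly assign the pair to the (source, target) roles."""
    copy1 = [row[:] for row in original]   # first duplicate image data
    copy2 = [row[:] for row in original]   # second duplicate image data
    copy1 = augment(copy1)                 # augment at least one copy
    if random.random() < 0.5:              # uniform random role assignment
        return copy1, copy2                # (source Is, target It)
    return copy2, copy1
```

Because source and target derive from the same original, any later blend of the two differs from the original only by the augmentation traces, which is exactly what the classifier is meant to pick up.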
 The forged image data generating unit 24 generates at least one piece of forged image data using the source image data and the target image data output by the data augmentation processing unit 23. In the following, for ease of distinction, the source image data and the target image data are denoted Is and It, respectively.
 The forged image data generating unit 24 identifies the image portion subject to forgery in each of the source image data Is and the target image data It. The image portion subject to forgery may be any part of the human body, such as the face (head), hands, torso, or legs of a person. These are only examples, and other parts may be the target of forgery. In the following, for the sake of explanation, the case where a person's face is the target of forgery is taken as an example.
 In this example, the forged image data generating unit 24 inputs the original image data from which the source image data Is and the target image data It were derived into a facial feature point extractor. As the facial feature point extractor, a widely known one such as an extractor using Haar-like features can be used; it identifies, for example, a region including at least one of the person's eyes, nose, mouth, ears, and facial contour.
 The forged image data generating unit 24 obtains the convex hull of the facial feature points extracted from the original image data from which the source image data Is and the target image data It were derived (ConvexHull). It then obtains a mask image M in which the pixels inside this convex region have a predetermined pixel value (for example, "1"), and combines the portion of the source image data Is inside the mask image M with the portion of the target image data It outside the mask image M, i.e. in the region (1-M), to generate the forged image ISB:

  ISB = M ⊙ Is + (1 − M) ⊙ It

where ⊙ denotes pixel-wise multiplication.
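The blending formula can be illustrated with a minimal single-channel sketch, in which nested lists of floats in [0, 1] stand in for images; this is a simplification for illustration, not the apparatus's implementation:

```python
def blend(source, target, mask):
    """Self-blending step: ISB = M * Is + (1 - M) * It, computed pixel
    by pixel. A soft mask (values strictly between 0 and 1) mixes the
    two images rather than cutting and pasting."""
    return [[m * s + (1.0 - m) * t
             for m, s, t in zip(mrow, srow, trow)]
            for mrow, srow, trow in zip(mask, source, target)]
```

With a hard mask (values 0 or 1) this reproduces the cut-and-paste behavior; with a softened mask (see the mask deformation section below) the boundary becomes a gradual transition.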
 The machine learning processing unit 25 treats each piece of original image data as a non-forged image I_i (i=1,2,…n) and the forged image data ISB_i (i=1,2,…n) generated from the corresponding original image data as forged images, inputs them into a predetermined machine learning model, and machine-learns the model into a state in which the non-forged images I_i and the forged images ISB_i can be distinguished.
 Here, a widely known classifier such as EfficientNet-b4 (EFNB4: Mingxing Tan and Quoc Le, EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, In ICML, pp. 6105-6114, 2019) can be used as the machine learning model.
 Specifically, the machine learning processing unit 25 selects N mutually different images from the n non-forged images I_i and the n forged images ISB_i, generates their sequence X = xj (j=1,2,…N), and generates the corresponding labels T = tj (j=1,2,…N). The label tj is "0" if xj is one of the non-forged images I_i, and "1" if xj is one of the forged images ISB_i.
 Using this sequence of images (the training image dataset) X and the corresponding labels T, the machine learning processing unit 25 optimizes the classification model F(·) with the cross-entropy loss L:

  L = −(1/N) Σj [ tj log F(xj) + (1 − tj) log(1 − F(xj)) ]

where F(x) represents the probability that the image x is a forged image. The optimization can be performed by a widely known method, such as backpropagation, a standard machine learning procedure.
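The loss above can be written out directly; the following sketch computes it for a list of model outputs (the probabilities F(xj)) and labels, standing in for the actual classifier:

```python
import math

def cross_entropy_loss(probs, labels):
    """Binary cross-entropy over the training set:
    L = -(1/N) * sum_j [ t_j*log F(x_j) + (1-t_j)*log(1-F(x_j)) ].
    `probs` are model outputs F(x_j) in (0, 1); `labels` are
    0 (non-forged) or 1 (forged)."""
    n = len(probs)
    total = 0.0
    for p, t in zip(probs, labels):
        total += t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return -total / n
```

A model that outputs 0.5 everywhere incurs a loss of log 2 per sample, while confident correct predictions drive the loss toward zero, which is what the backpropagation step minimizes.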
 Although the non-forged images are obtained here from the original image data, the present embodiment is not limited to this. The machine learning processing unit 25 may use the target image data It_i (i=1,2,…n), rather than the original image data, as the non-forged images. In this case, the machine learning processing unit 25 selects N mutually different images from the n non-forged images It_i and the n forged images ISB_i, generates their sequence X = xj (j=1,2,…N) and the corresponding labels T = tj (j=1,2,…N), and supplies them to the machine learning process. Here, the label tj is "0" if xj is one of the non-forged images It_i, and "1" if xj is one of the forged images ISB_i.
[Discrimination processing]
 Next, the functional configuration of the control unit 11 when performing the discrimination process will be described. The target image receiving unit 31 accepts input of the image data to be discriminated. The target image receiving unit 31 may resize the input image data to a predetermined size to obtain the target image data T. The resizing here is performed by applying to the input image data at least one process such as enlarging or reducing the image data and cropping a predetermined region of interest (for example, the portion in which a face is captured).
 The discrimination processing unit 32 inputs the target image data T into the classification model F(·), that is, the machine learning model trained by the machine learning process described above. The discrimination processing unit 32 then determines whether the target image data T is forged image data based on the output of the classification model, F(T).
 Here, the output of the classification model, F(T), represents the probability pT that the target image data T is forged image data (0 ≤ pT ≤ 1). The discrimination processing unit 32 therefore outputs information indicating that the target image data is forged image data when this output F(T) exceeds a predetermined threshold (for example, 1/2), and otherwise outputs information indicating that the target image data is not forged image data. This threshold can be set arbitrarily, for example experimentally, within the range of values of the output F(T) (excluding the boundaries).
 The output unit 33 outputs the information produced by the discrimination processing unit 32, indicating whether the target image data is forged image data, to the display unit 14 or the like, and presents it to the user. Alternatively, the discrimination processing unit 32 may pass the output of the classification model F(T) directly to the output unit 33, which outputs it to the display unit 14 or the like for presentation to the user.
[Operation]
 The forged image detection apparatus 1 of the present embodiment basically has the configuration described above, and operates as follows. In the following example, the forged image detection apparatus 1 performs machine learning for the purpose of detecting images forged by compositing another person's face onto an image in which a person's face is captured.
 In this example, a plurality of image data in which a person's face is captured are prepared in advance as the original image data, and these are sequentially input to the forged image detection apparatus 1. The forged image detection apparatus 1 accepts the original image data to be subjected to the machine learning process, applies predetermined preprocessing (for example, resizing), and stores the results in the storage unit 12 as the original image data I_i (i=1,2,…n).
 The forged image detection apparatus 1 starts the processing illustrated in FIG. 4 and performs the following for each piece of original image data I_i. That is, the forged image detection apparatus 1 duplicates the original image data I_i to be processed to generate first and second duplicate image data IR1_i and IR2_i (S11; FIG. 5: S21). The forged image detection apparatus 1 then applies a data augmentation process including a color conversion process and a frequency conversion process to at least one of the first and second duplicate image data IR1_i and IR2_i generated in step S11 (here, to both) (S12; FIG. 5: S22).
 The color conversion process performed in step S12 consists of randomly shifting the value of each RGB channel (RGBShift), randomly changing the hue, saturation, and value (HueSaturationValue), and randomly changing the brightness and contrast (RandomBrightnessContrast); here, each of these operations is applied in sequence. The parameters of each operation are determined at random from within ranges fixed experimentally in advance.
 As the frequency conversion process, either downscaling and then upscaling the image to degrade its quality (Downscale) or sharpening it (Sharpen) is selected at random here, and the selected operation is applied.
 Note that the forged image detection apparatus 1 applies the color conversion process and the frequency conversion process to the first and second duplicate image data IR1_i and IR2_i with mutually different parameters.
 The forged image detection apparatus 1 then sets one of the first and second duplicate image data IR1_i and IR2_i after the data augmentation process as the source image data Is, and the other as the target image data It (S13). Which becomes the source image data Is and which the target image data It is varied for each sample (original image data) by generating a uniform random number.
 Meanwhile, the forged image detection apparatus 1 identifies the face image portion as a partial region captured in the original image data I_i processed in step S11 (S14; FIG. 5: S23). This can be done by inputting the original image data I_i into a widely known facial feature point extractor and detecting the positions at which the feature points of each part of the face are captured. The forged image detection apparatus 1 obtains a polygonal region (convex hull) containing the identified face image portion as the mask image M (S15; FIG. 5: S24).
 The forged image detection apparatus 1 extracts the image inside the mask image M obtained in step S15 from the source image data Is, and composites it with the region (1-M) outside the mask image M extracted from the target image data It, generating the forged image ISB_i (S16; FIG. 5: S25).
 In this way, the image within a partial region of the source image data Is is extracted and composited over the corresponding region of the target image data It to generate the forged image data.
 One characteristic of the present embodiment is that, when generating a forged image, a pair of image data derived from a single piece of original image data by mutually different image processing (or with one of the pair left unprocessed) is used as the target and the source, and these are composited to form the forged image data. Because the target and the source composited in this way are obtained from the same image, the resulting forged image data is generally difficult to identify, yet the traces that inevitably arise in forged image data, such as frequency mismatches, are emphasized in it.
 The forged image detection apparatus 1 performs the above processing for each piece of original image data I_i, obtains at least one forged image ISB_i from each, and accumulates them in the storage unit 12.
 Next, the forged image detection apparatus 1 selects N mutually different images from the n pieces of original image data (non-forged images) I_i and the n forged images ISB_i, generates their sequence X = xj (j=1,2,…N) and the corresponding labels T = tj (j=1,2,…N), and composes the training image dataset (S17).
 The forged image detection apparatus 1 then trains a predetermined machine learning model (for example, EfficientNet-b4 (EFNB4)) using this training image dataset (S18). In doing so, the forged image detection apparatus 1 uses the training image dataset X and the corresponding labels T to optimize the classification model F(·) with the cross-entropy loss L:

  L = −(1/N) Σj [ tj log F(xj) + (1 − tj) log(1 − F(xj)) ]

where F(x) represents the probability that the image x is a forged image. The optimization can be performed by a widely known method such as backpropagation, so its description is omitted here.
 Using the machine learning model thus trained into a state in which non-forged images and forged images can be distinguished, whether the image data to be processed is forged image data is determined as follows.
 That is, the user inputs the image data to be discriminated into the forged image detection apparatus 1. The forged image detection apparatus 1 accepts this image data and performs predetermined preprocessing such as resizing to obtain the target image data T. The forged image detection apparatus 1 then inputs the target image data T into the classification model F(·), the machine learning model trained by the machine learning process described above. Since the output of this classification model, F(T), represents the probability pT that the target image data T is forged image data (0 ≤ pT ≤ 1), the forged image detection apparatus 1 displays this output F(T) as-is on the display unit 14 and presents it to the user. The user refers to this probability to judge whether the input image data is forged image data.
[Application to moving image data]
 The forged image detection apparatus 1 of the present embodiment can also determine whether moving image data is forged. In this example, the machine learning process may be as described above; in the discrimination process, however, the images of the frames composing the moving image data to be judged are input to the forged image detection apparatus 1 sequentially, the forged image detection apparatus 1 obtains the probability pT that the image data of each frame is forged image data, and it then generates and displays a predetermined statistic, for example the average of the per-frame probabilities of being forged image data. The user can refer to the statistic presented by the forged image detection apparatus 1 to judge whether the moving image data is forged.
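The per-frame aggregation can be sketched as follows. The mean is the statistic the text names as an example; the decision threshold is likewise illustrative and not fixed by the embodiment:

```python
def video_forgery_score(frame_probs, threshold=0.5):
    """Aggregate per-frame forgery probabilities pT into one video-level
    statistic (here, the mean) and an illustrative yes/no judgement."""
    if not frame_probs:
        raise ValueError("no frames to aggregate")
    mean_p = sum(frame_probs) / len(frame_probs)
    return mean_p, mean_p > threshold
```

Other statistics (median, maximum over a sliding window, and so on) fit the same interface; the embodiment leaves the choice of statistic open.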
[Mask deformation]
 The forged image detection apparatus 1 of the present embodiment may further apply a predetermined data augmentation process to the mask image M generated in the processing of the forged image data generating unit 24 (the processing of step S15 illustrated in FIG. 4).
 The data augmentation process applied to the mask image may be, for example, resizing, deformation, or scaling. In one example of the present embodiment, the forged image detection apparatus 1 detects the characteristic parts of the face image as partial regions captured in the original image data I_i, randomly invalidates some of those characteristic parts, and then obtains, as the initial mask image Ms, the polygonal region (convex hull) containing the characteristic parts of the face image excluding the invalidated ones. A predetermined elastic deformation process (Elastic Deformation) is then applied to this initial mask image Ms. This elastic deformation process can adopt the method disclosed in Tianchen Zhao, et al., Learning Self-Consistency for Deepfake Detection, In ICCV, pp. 15023-15033, 2021, so a detailed description is omitted here.
 In addition, a plurality of Gaussian filters with randomly chosen kernel sizes are sequentially applied to the elastically deformed mask image to enlarge or shrink it (at this point, at least some pixels of the mask image take values between 0 and 1). Furthermore, the pixel values of the mask image are multiplied by a randomly chosen constant r to obtain the mask image M actually used. The constant r is a value greater than 0 and at most 1, and may be determined by sampling uniformly from among 0.25, 0.5, 0.75, and 1.
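A minimal sketch of this mask softening follows; as a simplifying assumption it substitutes a single 3x3 box blur for the randomly sized Gaussian filters, and it omits the elastic deformation step:

```python
import random

def soften_mask(mask, r_choices=(0.25, 0.5, 0.75, 1.0)):
    """Blur a binary mask so boundary pixels take values between 0 and 1,
    then scale all values by a constant r sampled uniformly from
    r_choices, as in the text. The box blur is an illustrative stand-in
    for the Gaussian filters."""
    h, w = len(mask), len(mask[0])
    blurred = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc, cnt = 0.0, 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        acc += mask[ny][nx]
                        cnt += 1
            blurred[y][x] = acc / cnt
    r = random.choice(r_choices)  # uniform over {0.25, 0.5, 0.75, 1}
    return [[v * r for v in row] for row in blurred]
```

Feeding such a softened mask into the blending step yields partial, spatially varying mixes of source and target, which diversifies the generated forged images.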
 Using the mask image M deformed in this way diversifies the blending ratio between the source image and the target image, and hence diversifies the generated forged image data.
[Image processing of the source image data]
 The forged image detection apparatus 1 of the present embodiment may further extract, from source image data Is to which image processing including at least one of resizing, translation, rotation, brightness change, saturation change, and hue change has been applied, the portion selected by the mask image M, and composite it with the portion (1-M) of the target image data extracted outside the mask image M to obtain the forged image data.
[Another example of data augmentation of the duplicate image data]
 As the data augmentation process applied to the first or second duplicate image data, the forged image detection apparatus 1 may generate the source image data Is (or the target image data It) by applying not only the examples already described but also image processing including at least one of, for example, resizing, translation, rotation, color conversion (including brightness, saturation, and hue changes), and frequency conversion.
 As the frequency conversion process in this data augmentation, a compression process such as JPEG (Joint Photographic Experts Group) compression may be executed to introduce quantization errors.
[Effects of the embodiment]
 According to the present embodiment, traces of forgery can be expected to be machine-learned regardless of what kind of image data the dataset contains, making it possible to detect diverse forged images irrespective of the domain of the dataset used for training.
1 forged image detection apparatus, 11 control unit, 12 storage unit, 13 operation unit, 14 display unit, 15 communication unit, 21 original image receiving unit, 22 duplicate image generating unit, 23 data augmentation processing unit, 24 forged image data generating unit, 25 machine learning processing unit, 31 target image receiving unit, 32 discrimination processing unit, 33 output unit.

Claims (13)

  1.  A forged image detection device for detecting a forged image, comprising a processor and a memory, wherein the processor:
     accepts original image data to be subjected to machine learning and stores it in the memory;
     duplicates the original image data stored in the memory to generate first and second duplicate image data;
     applies predetermined data augmentation processing to at least one of the generated first and second duplicate image data, takes them respectively as source image data and target image data, and generates at least one piece of forged image data using the generated target image data and source image data;
     inputs the original image data as a non-forged image and the generated forged image data as a forged image into a predetermined machine learning model, and machine-learns the machine learning model into a state capable of discriminating between non-forged images and forged images,
     the machine-learned machine learning model being subjected to processing for detecting the forged image.
  2.  A detection device for detecting a forged image, comprising a processor and a memory, wherein the processor:
     accepts a plurality of pieces of original image data to be subjected to machine learning and stores them in the memory;
     sequentially selects each piece of the original image data stored in the memory as target image data;
     duplicates the selected target image data to generate first and second duplicate image data;
     applies predetermined data augmentation processing to at least one of the generated first and second duplicate image data, takes them respectively as source image data and target image data, and generates at least one piece of forged image data using the generated target image data and source image data;
     inputs the target image data as a non-forged image and the generated forged image data as a forged image into a predetermined machine learning model, and machine-learns the machine learning model into a state capable of discriminating between non-forged images and forged images,
     the machine learning model machine-learned on the synthesized image data obtained with each of the plurality of pieces of original image data as target image data being subjected to processing for detecting the forged image.
  3.  The forged image detection device according to claim 1, wherein,
     when generating the forged image data, the target image data and the source image data are synthesized to generate the forged image data.
  4.  The forged image detection device according to claim 3, wherein,
     when generating the forged image data, an image within a partial area of the source image data is extracted and overwritten onto the corresponding area of the target image data to generate the forged image data.
  5.  The forged image detection device according to claim 1, wherein
     image processing including at least one of resizing, translation, rotation, brightness change, saturation change, and hue change is further applied to the source image data.
  6.  The forged image detection device according to claim 1, wherein
     one of the first and second duplicate image data is selected by a predetermined method as the target to which the predetermined data augmentation processing is applied.
  7.  The forged image detection device according to claim 3, wherein
     one of the first and second duplicate image data is randomly selected as the target to which the predetermined data augmentation processing is applied.
  8.  The forged image detection device according to claim 6, wherein
     the predetermined method is a method of randomly selecting one of the first and second duplicate image data.
  9.  The forged image detection device according to claim 1, wherein
     the predetermined data augmentation processing applied to at least one of the first and second duplicate image data is processing that changes at least one of image quality and color tone.
  10.  The forged image detection device according to claim 4, wherein
     the original image data includes an image of a person's face, and
     when the target image data and the source image data are synthesized, a partial image of an area including at least one of the person's eyes, nose, mouth, ears, and facial contour is extracted from the source image data and overwritten onto the corresponding area of the target image data.
  11.  The forged image detection device according to claim 5, wherein
     the original image data includes an image of a person's face, and
     when the target image data and the source image data are synthesized, a partial image of an area including at least one of the person's eyes, nose, mouth, ears, and facial contour is extracted from the source image data to which image processing including at least one of resizing, translation, rotation, brightness change, saturation change, and hue change has been applied, and is overwritten onto the corresponding area of the target image data.
  12.  A method of detecting a forged image, executed by a computer device comprising a processor and a memory, wherein the processor:
     accepts original image data to be subjected to machine learning and stores it in the memory;
     duplicates the original image data stored in the memory to generate first and second duplicate image data;
     applies predetermined data augmentation processing to at least one of the generated first and second duplicate image data, takes them respectively as source image data and target image data, and generates at least one piece of forged image data using the generated target image data and source image data;
     inputs the original image data as a non-forged image and the generated forged image data as a forged image into a predetermined machine learning model, and machine-learns the machine learning model into a state capable of discriminating between non-forged images and forged images,
     the machine-learned machine learning model being subjected to processing for detecting the forged image.
  13.  A program for detecting a forged image, executed by a computer device comprising a processor and a memory, the program causing the processor to:
     accept original image data to be subjected to machine learning and store it in the memory;
     duplicate the original image data stored in the memory to generate first and second duplicate image data;
     apply predetermined data augmentation processing to at least one of the generated first and second duplicate image data, take them respectively as source image data and target image data, and generate at least one piece of forged image data using the generated target image data and source image data;
     input the original image data as a non-forged image and the generated forged image data as a forged image into a predetermined machine learning model, and machine-learn the machine learning model into a state capable of discriminating between non-forged images and forged images,
     the machine-learned machine learning model being subjected to processing for detecting the forged image.

PCT/JP2023/003431 2022-02-05 2023-02-02 Forgery image detection device, forgery image detection method, and program WO2023149513A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202263307109P 2022-02-05 2022-02-05
US63/307,109 2022-02-05

Publications (1)

Publication Number Publication Date
WO2023149513A1 true WO2023149513A1 (en) 2023-08-10

Family

ID=87552549

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/003431 WO2023149513A1 (en) 2022-02-05 2023-02-02 Forgery image detection device, forgery image detection method, and program

Country Status (1)

Country Link
WO (1) WO2023149513A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020241142A1 (en) * 2019-05-27 2020-12-03 昭和電工株式会社 Image analysis device, method, and program
WO2021220343A1 (en) * 2020-04-27 2021-11-04 日本電気株式会社 Data generation device, data generation method, learning device, and recording medium


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
AL-DHABI YUNES; ZHANG SHUANG: "Deepfake Video Detection by Combining Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN)", 2021 IEEE INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE AND ELECTRONIC ENGINEERING (CSAIEE), IEEE, 20 August 2021 (2021-08-20), pages 236 - 241, XP033980577, DOI: 10.1109/CSAIEE54046.2021.9543264 *
CHERNOMYRDINA ANNA: "Investigation of the algorithm for face substitution in a video sequence using deep learning", 2021 INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND NANOTECHNOLOGY (ITNT), IEEE, 20 September 2021 (2021-09-20), pages 1 - 4, XP034059171, DOI: 10.1109/ITNT52450.2021.9649056 *
J. NARUNIEC; L. HELMINGER; C. SCHROERS; R.M. WEBER: "High‐Resolution Neural Face Swapping for Visual Effects", COMPUTER GRAPHICS FORUM : JOURNAL OF THE EUROPEAN ASSOCIATION FOR COMPUTER GRAPHICS, WILEY-BLACKWELL, OXFORD, vol. 39, no. 4, 20 July 2020 (2020-07-20), Oxford , pages 173 - 184, XP071489939, ISSN: 0167-7055, DOI: 10.1111/cgf.14062 *
TJON ERIC; MOH MELODY; MOH TENG-SHENG: "Eff-YNet: A Dual Task Network for DeepFake Detection and Segmentation", 2021 15TH INTERNATIONAL CONFERENCE ON UBIQUITOUS INFORMATION MANAGEMENT AND COMMUNICATION (IMCOM), IEEE, 4 January 2021 (2021-01-04), pages 1 - 8, XP033888433, DOI: 10.1109/IMCOM51814.2021.9377373 *

Similar Documents

Publication Publication Date Title
Khalid et al. Oc-fakedect: Classifying deepfakes using one-class variational autoencoder
Chen et al. Fsrnet: End-to-end learning face super-resolution with facial priors
WO2022001509A1 (en) Image optimisation method and apparatus, computer storage medium, and electronic device
CN110197229B (en) Training method and device of image processing model and storage medium
TW202105238A (en) Image processing method and device, processor, electronic equipment and storage medium
WO2020103700A1 (en) Image recognition method based on micro facial expressions, apparatus and related device
CN111144314B (en) Method for detecting tampered face video
US11386599B2 (en) Feature transfer
CN111062426A (en) Method, device, electronic equipment and medium for establishing training set
Cornejo et al. Emotion recognition from occluded facial expressions using weber local descriptor
CN113240655B (en) Method, storage medium and device for automatically detecting type of fundus image
CN112329586A (en) Client return visit method and device based on emotion recognition and computer equipment
CN116912604B (en) Model training method, image recognition device and computer storage medium
WO2023149513A1 (en) Forgery image detection device, forgery image detection method, and program
Pajot et al. Unsupervised adversarial image inpainting
CN111507279B (en) Palm print recognition method based on UNet + + network
JP2023082065A (en) Method of discriminating objet in image having biometric characteristics of user to verify id of the user by separating portion of image with biometric characteristic from other portion
CN112837318B (en) Ultrasonic image generation model generation method, ultrasonic image synthesis method, medium and terminal
Wang et al. Diverse image inpainting with disentangled uncertainty
CN111753722B (en) Fingerprint identification method and device based on feature point type
CN114694074A (en) Method, device and storage medium for generating video by using image
CN114049303A (en) Progressive bone age assessment method based on multi-granularity feature fusion
Santos et al. A new method based on deep learning to detect lesions in retinal images using YOLOv5
Chao et al. Instance-aware image dehazing
CN111563839A (en) Fundus image conversion method and device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23749829

Country of ref document: EP

Kind code of ref document: A1