JP2023543964A - Image processing method, image processing device, electronic device, storage medium and computer program

Info

Publication number: JP2023543964A
Application number: JP2023509715A
Authority: JP
Inventors: ▲長▼勇束; 家▲銘▼ ▲劉▼; 智▲びん▼ 洪; ▲鈞▼宇 ▲韓▼
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2021-08-25
Filing date: 2022-06-10
Publication date: 2023-10-19
Also published as: CN113962845A; US20240303774A1; WO2023024653A1; CN113962845B

Abstract

The present disclosure provides an image processing method, an image processing device, an electronic device, and a storage medium. It relates to the field of artificial intelligence, in particular to computer vision and deep learning, and can be applied to scenarios such as facial image processing and face recognition. The specific technical solution is as follows: generating an image to be processed based on a first target image and a second target image, wherein identity information of an object in the image to be processed matches identity information of an object in the first target image; generating a decoupling image set based on the second target image and the image to be processed, wherein the decoupling image set includes a head decoupling image corresponding to a head region of the object in the image to be processed and a repair decoupling image corresponding to to-be-repaired information related to the object in the image to be processed; and generating a fused image based on the decoupling image set, wherein identity information and texture information of an object in the fused image match the identity information and texture information, respectively, of the object in the image to be processed, and the to-be-repaired information related to the object in the fused image has been repaired.

Description

Cross reference

This application claims priority to Chinese patent application No. 202110985605.0, filed on August 25, 2021, the entire contents of which are incorporated herein by reference.

The present disclosure relates to the technical field of artificial intelligence, in particular to computer vision and deep learning, and can be applied to scenarios such as facial image processing and face recognition. Specifically, it relates to an image processing method, an image processing device, an electronic device, and a storage medium.

With the development of the Internet and of artificial intelligence technology centered on deep learning, computer vision technology has been widely applied in various fields.

Because an object can reflect inner emotions and convey information through rich facial expressions, research on facial images of objects is one of the important research topics in the field of computer vision. Research on image replacement technology, which combines a facial image of an object into another image and transforms it, has emerged accordingly. Image replacement can be applied in various scenarios, such as film editing or virtual characters.

The present disclosure provides an image processing method, an image processing device, an electronic device, and a storage medium.

According to one aspect of the present disclosure, an image processing method is provided. The image processing method includes: generating an image to be processed based on a first target image and a second target image, wherein identity information of an object in the image to be processed matches identity information of an object in the first target image, and texture information of the object in the image to be processed matches texture information of an object in the second target image; generating a decoupling image set based on the second target image and the image to be processed, wherein the decoupling image set includes a head decoupling image corresponding to a head region of the object in the image to be processed and a repair decoupling image corresponding to to-be-repaired information related to the object in the image to be processed; and generating a fused image based on the decoupling image set, wherein identity information and texture information of an object in the fused image match the identity information and texture information, respectively, of the object in the image to be processed, and the to-be-repaired information related to the object in the fused image has been repaired.

According to another aspect of the present disclosure, an image processing device is provided. The image processing device includes: a first generation module configured to generate an image to be processed based on a first target image and a second target image, wherein identity information of an object in the image to be processed matches identity information of an object in the first target image, and texture information of the object in the image to be processed matches texture information of an object in the second target image; a second generation module configured to generate a decoupling image set based on the second target image and the image to be processed, wherein the decoupling image set includes a head decoupling image corresponding to a head region of the object in the image to be processed and a repair decoupling image corresponding to to-be-repaired information related to the object in the image to be processed; and a third generation module configured to generate a fused image based on the decoupling image set, wherein identity information and texture information of an object in the fused image match the identity information and texture information, respectively, of the object in the image to be processed, and the to-be-repaired information related to the object in the fused image has been repaired.

According to another aspect of the present disclosure, an electronic device is provided. The electronic device includes at least one processor and a memory communicatively connected to the at least one processor. The memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method described above.

According to another aspect of the present disclosure, a non-transitory computer-readable storage medium storing computer instructions is provided, wherein the computer instructions cause a computer to perform the method described above.

According to another aspect of the present disclosure, a computer program product is provided that includes a computer program which, when executed by a processor, implements the method described above.

It should be understood that the content described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor to limit the scope of the present disclosure. Other features of the present disclosure will be readily understood from the following description.

The drawings are provided for a better understanding of the technical solution and do not limit the present disclosure.
FIG. 1 schematically shows an exemplary system architecture to which the image processing method and device according to embodiments of the present disclosure can be applied. FIG. 2 schematically shows a flowchart of an image processing method according to an embodiment of the present disclosure. FIG. 3 schematically shows a process of generating an image to be processed according to an embodiment of the present disclosure. FIG. 4 schematically shows an image processing process according to an embodiment of the present disclosure. FIG. 5 schematically shows a block diagram of an image processing device according to an embodiment of the present disclosure. FIG. 6 schematically shows a block diagram of an electronic device suitable for implementing an image processing method according to an embodiment of the present disclosure.

Exemplary embodiments of the present disclosure are described below with reference to the drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those skilled in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, for clarity and brevity, descriptions of well-known functions and structures are omitted from the following description.

In the process of realizing the concept of the present disclosure, image replacement is implemented by face replacement, that is, the facial features are replaced while information outside the face region, such as head information and skin color information, is ignored. Head information may include hair, head shape, and the like. As a result, the identity similarity of the replaced image is low, which in turn degrades the replacement effect of image replacement.

The following example illustrates how this easily leads to low identity similarity in the replaced image. Suppose the head region of object a in image A needs to be replaced with the head region of object b in image B, where the skin color of object b is black and the skin color of object a is yellow. If the facial features are replaced but the skin color information is ignored, the object in the replaced image will have yellow facial features and black facial skin, so the identity similarity of the replaced image is low.

To this end, embodiments of the present disclosure provide a technical solution that generates a fusion result with high identity similarity through multi-stage head replacement fusion. Specifically, an image to be processed is generated based on a first target image and a second target image; a decoupling image set is generated based on the second target image and the image to be processed; and, based on the decoupling image set, a fused image is generated in which the identity information and texture information of the object match the identity information and texture information, respectively, of the object in the image to be processed, and in which the to-be-repaired information has been repaired. Because the to-be-repaired information related to the object in the fused image has been repaired, the identity similarity of the fused image is improved, which in turn improves the replacement effect of image replacement.

FIG. 1 schematically shows an exemplary system architecture to which the image processing method and device according to embodiments of the present disclosure can be applied.

It should be noted that FIG. 1 is only an example of a system architecture to which embodiments of the present disclosure can be applied. It is intended to help those skilled in the art understand the technical content of the present disclosure and does not mean that the embodiments cannot be used in other devices, systems, environments, or scenarios. For example, in another embodiment, an exemplary system architecture to which the image processing method and device can be applied may include a terminal device, and the terminal device may implement the image processing method and device provided by the embodiments of the present disclosure without interacting with a server.

As shown in FIG. 1, a system architecture 100 according to this embodiment may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 serves as a medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired and/or wireless communication links.

A user can use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 in order to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as knowledge reading applications, web browser applications, search applications, instant messaging tools, mailbox clients, and/or social platform software (these are examples only).

The terminal devices 101, 102, 103 may be various electronic devices that have a display and support web browsing, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers.

The server 105 may be a server that provides various services, for example a background management server (an example only) that supports the content browsed by users on the terminal devices 101, 102, 103. The background management server can analyze and otherwise process received data such as user requests, and feed the processing results (for example, web pages, information, or data obtained or generated in response to the user requests) back to the terminal devices.

The server 105 may be a cloud server, also called a cloud computing server or cloud host. It is a host product in a cloud computing service system that addresses the drawbacks of high management difficulty and weak service scalability found in traditional physical hosts and VPS (Virtual Private Server) services. The server 105 may also be a server of a distributed system or a server combined with a blockchain.

It should be noted that the image processing method provided by the embodiments of the present disclosure can generally be executed by the terminal device 101, 102, or 103. Accordingly, the image processing device provided by the embodiments of the present disclosure may be installed in the terminal device 101, 102, or 103.

Alternatively, the image processing method provided by the embodiments of the present disclosure may generally be executed by the server 105. Accordingly, the image processing device provided by the embodiments of the present disclosure may generally be installed in the server 105. The image processing method may also be executed by a server or server cluster that is different from the server 105 and can communicate with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the image processing device may also be installed in a server or server cluster that is different from the server 105 and can communicate with the terminal devices 101, 102, 103 and/or the server 105.

For example, the server 105 generates an image to be processed based on a first target image and a second target image, wherein the identity information of the object in the image to be processed matches the identity information of the object in the first target image, and the texture information of the object in the image to be processed matches the texture information of the object in the second target image. The server 105 then generates a decoupling image set based on the second target image and the image to be processed, wherein the decoupling image set includes a head decoupling image corresponding to the head region of the object in the image to be processed and a repair decoupling image corresponding to the to-be-repaired information related to the object in the image to be processed. Finally, the server 105 generates a fused image based on the decoupling image set, wherein the identity information and texture information of the object in the fused image match the identity information and texture information, respectively, of the object in the image to be processed, and the to-be-repaired information related to the object in the fused image has been repaired. Alternatively, a server or server cluster that can communicate with the terminal devices 101, 102, 103 and/or the server 105 generates the image to be processed based on the first target image and the second target image, generates the decoupling image set based on the second target image and the image to be processed, and generates the fused image based on the decoupling image set.

It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely exemplary. There may be any number of terminal devices, networks, and servers, as required.

FIG. 2 schematically shows a flowchart of an image processing method according to an embodiment of the present disclosure.

As shown in FIG. 2, the method 200 includes operations S210 to S230.

In operation S210, an image to be processed is generated based on a first target image and a second target image, wherein the identity information of the object in the image to be processed matches the identity information of the object in the first target image, and the texture information of the object in the image to be processed matches the texture information of the object in the second target image.

In operation S220, a decoupling image set is generated based on the second target image and the image to be processed, wherein the decoupling image set includes a head decoupling image corresponding to the head region of the object in the image to be processed and a repair decoupling image corresponding to the to-be-repaired information related to the object in the image to be processed.

In operation S230, a fused image is generated based on the decoupling image set, wherein the identity information and texture information of the object in the fused image match the identity information and texture information, respectively, of the object in the image to be processed, and the to-be-repaired information related to the object in the fused image has been repaired.
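The data flow of operations S210 to S230 can be sketched as follows. This is only an illustrative skeleton: the names `process_image`, `DecouplingSet`, `generate_pending`, `decouple`, and `fuse` are hypothetical, and the three stage functions are passed in as callables because the source does not specify here how each stage is implemented.

```python
from dataclasses import dataclass
from typing import Any, Callable, List

Image = Any  # placeholder type; the image representation is an assumption


@dataclass
class DecouplingSet:
    """Decoupling image set produced in operation S220 (hypothetical container)."""
    head_images: List[Image]    # head decoupling images
    repair_images: List[Image]  # repair decoupling images


def process_image(
    first_target: Image,
    second_target: Image,
    generate_pending: Callable[[Image, Image], Image],
    decouple: Callable[[Image, Image], DecouplingSet],
    fuse: Callable[[DecouplingSet], Image],
) -> Image:
    # S210: identity comes from the first target image, texture from the second.
    pending = generate_pending(first_target, second_target)
    # S220: the decoupling step uses the second target image and the image to be processed.
    decoupling_set = decouple(second_target, pending)
    # S230: the fused image is generated from the decoupling image set alone.
    return fuse(decoupling_set)
```

Note that the second target image is used twice: once to supply texture in S210 and again as a reference in S220.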

According to embodiments of the present disclosure, the first target image can be understood as an image that provides the identity information of a first object, and the second target image can be understood as an image that provides the texture information of a second object. The texture information includes facial texture information, and the facial texture information may include at least one of facial pose information and facial expression information. The object in the first target image can be understood as the first object, and the object in the second target image can be understood as the second object. When the texture information of the object in the first target image needs to be replaced with the texture information of the object in the second target image, the first target image may be called the driven image and the second target image may be called the driving image.

According to embodiments of the present disclosure, there may be one or more first target images. The first target image may be a video frame of a video or a still image. The second target image may be a video frame of a video or a still image. For example, when there are multiple first target images, the identity information of the object is the same in all of them.

According to embodiments of the present disclosure, the image to be processed is an image in which the identity information of the object is consistent with the identity information of the object in the first target image, and the texture information of the object is consistent with the texture information of the object in the second target image. That is, the object in the image to be processed is the first object, and its texture information is the texture information of the second object.

According to embodiments of the present disclosure, the decoupling image set may include a head decoupling image and a repair decoupling image. The head decoupling image can be understood as an image corresponding to the head region of the object in the image to be processed, that is, an image obtained by extracting the relevant features of the head region of the object from the image to be processed. The repair decoupling image can be understood as an image that contains the to-be-repaired information related to the object in the image to be processed. The to-be-repaired information may include at least one of skin color information and missing information. The skin color information may include the skin color of the face.

According to embodiments of the present disclosure, the fused image can be understood as the image obtained after the repair operation on the to-be-repaired information has been completed. The object in the fused image is the same as the object in the image to be processed, that is, the identity information of the object in the fused image is consistent with the identity information of the object in the image to be processed, and the texture information of the object in the fused image is consistent with the texture information of the object in the image to be processed.

According to embodiments of the present disclosure, the first target image and the second target image are acquired and processed to obtain the image to be processed; the second target image and the image to be processed are processed to obtain the decoupling image set; and the decoupling image set is processed to obtain the fused image. Processing the first target image and the second target image to obtain the image to be processed may include extracting the identity information of the object from the first target image, extracting the texture information of the object from the second target image, and obtaining the image to be processed based on the identity information and the texture information.

According to embodiments of the present disclosure, the fused image is generated based on the decoupling image set, and because the to-be-repaired information related to the object in the fused image has been repaired, the identity similarity of the fused image is improved, which in turn improves the replacement effect of image replacement.

According to embodiments of the present disclosure, the repair decoupling image includes a first decoupling image and a second decoupling image. The identity information of the object in the first decoupling image matches the identity information of the object in the image to be processed, and the skin color information of the object in the first decoupling image matches the skin color information of the object in the second target image. The second decoupling image is a difference image between the head region of the object in the image to be processed and the head region of the object in the second target image. That the to-be-repaired information related to the object in the fused image has been repaired means that the skin color information of the object in the fused image matches the skin color information of the object in the second target image, and that the pixel values of the pixels in the difference image satisfy a preset condition.

According to embodiments of the present disclosure, in order to improve the replacement effect of image replacement, the skin color information of the object in the image to be processed needs to be made consistent with the skin color information of the object in the driving image (that is, the second target image), and the missing region between the head region of the object in the image to be processed and the head region of the object in the second target image needs to be repaired.

According to embodiments of the present disclosure, the first decoupled image can serve to align the skin color information of the object in the image to be processed with the skin color information of the object in the second target image. The first decoupled image may be a mask image of the facial features that carries skin color.

According to embodiments of the present disclosure, the second decoupled image can serve to repair the missing region between the head region of the object in the image to be processed and the head region of the object in the second target image. The second decoupled image can be understood as a difference image, which may be the difference image between the head region of the object in the image to be processed and the head region of the object in the second target image. The difference image may be a mask image.

According to embodiments of the present disclosure, the difference image includes a plurality of pixels, each having a corresponding pixel value. The pixel values of the pixels in the difference image meeting a preset condition may include one of the following: the histogram distribution of the pixel values matches a preset histogram distribution; the mean variance of the pixel values is less than or equal to a preset variance threshold; or the sum of the pixel values is less than or equal to a preset threshold.
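The three alternative conditions can be sketched as a single check. This is a minimal illustration, not the disclosed implementation: the function names, the bin count, the tolerance, and the use of the population variance for the "mean variance" are all assumptions made here.

```python
from statistics import pvariance


def flatten(image):
    """Flatten a 2-D list of pixel values into a 1-D list."""
    return [value for row in image for value in row]


def histogram(values, bins=4, max_value=255):
    """Normalized histogram with equal-width bins over [0, max_value]."""
    counts = [0] * bins
    width = (max_value + 1) / bins
    for v in values:
        counts[min(int(v / width), bins - 1)] += 1
    return [c / len(values) for c in counts]


def meets_preset_condition(diff_image, *, sum_threshold=None,
                           variance_threshold=None,
                           target_histogram=None,
                           histogram_tolerance=0.05):
    """Check the difference image against the one condition that is supplied.

    All thresholds are hypothetical parameters chosen for illustration.
    """
    values = flatten(diff_image)
    if sum_threshold is not None:
        # Condition: sum of pixel values <= preset threshold.
        return sum(values) <= sum_threshold
    if variance_threshold is not None:
        # Condition: variance of pixel values <= preset variance threshold.
        return pvariance(values) <= variance_threshold
    if target_histogram is not None:
        # Condition: histogram matches a preset histogram (within a tolerance).
        hist = histogram(values, bins=len(target_histogram))
        return all(abs(h - t) <= histogram_tolerance
                   for h, t in zip(hist, target_histogram))
    raise ValueError("one condition must be supplied")
```

A difference image that is almost entirely zero passes the sum condition easily, which matches the intent that a repaired missing region leaves little residual difference.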

本開示の実施例によれば、ヘッドデカップリング画像は、第三デカップリング画像、第四デカップリング画像及び第五デカップリング画像を含む。第三デカップリング画像は、処理すべき画像におけるオブジェクトのヘッド領域の階調画像を含む。第四デカップリング画像は、処理すべき画像におけるオブジェクトのヘッド領域の二値化画像を含む。第五デカップリング画像は、第二目標画像と第四デカップリング画像から得られる画像を含む。 According to an embodiment of the present disclosure, the head decoupling images include a third decoupling image, a fourth decoupling image, and a fifth decoupling image. The third decoupled image includes a tone image of the head region of the object in the image to be processed. The fourth decoupled image includes a binarized image of the head region of the object in the image to be processed. The fifth decoupled image includes an image obtained from the second target image and the fourth decoupled image.

According to embodiments of the present disclosure, the fourth decoupled image may include a binarized image of the head region of the object in the image to be processed, that is, a binarized mask image of the background and foreground of that head region. The fifth decoupled image may be a difference image between the second target image and the fourth decoupled image. The fifth decoupled image can be understood as the image obtained by removing the head region of the object from the second target image and then placing the head region of the object in the fourth decoupled image into the removed region.

According to embodiments of the present disclosure, generating the decoupled image set based on the second target image and the image to be processed may include the following. A first decoupled image is obtained based on the second target image and the image to be processed. A second decoupled image is obtained based on the second target image and the image to be processed. A third decoupled image is obtained based on the image to be processed. A fourth decoupled image is obtained based on the image to be processed. A fifth decoupled image is obtained based on the second target image and the fourth decoupled image.
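The third, fourth, and fifth decoupled images can be illustrated on tiny list-based images. This is a sketch under stated assumptions: the head scores and head masks are assumed to come from some segmentation step that the text does not specify, the grayscale weights are the common ITU-R BT.601 luma weights, and all function and parameter names are hypothetical.

```python
def to_grayscale(rgb_image):
    """Third decoupled image: grayscale version of the head region.

    Uses ITU-R BT.601 luma weights (an assumption; the text only says
    'grayscale image').
    """
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb_image]


def binarize_head(head_scores, threshold=0.5):
    """Fourth decoupled image: binary foreground/background head mask.

    `head_scores` is a hypothetical per-pixel head probability map.
    """
    return [[1 if s >= threshold else 0 for s in row] for row in head_scores]


def build_fifth_image(second_target_gray, second_target_head_mask,
                      fourth_image, removed_value=0, head_value=255):
    """Fifth decoupled image: remove the head region from the second target
    image, then place the head region of the fourth decoupled image there."""
    result = []
    for target_row, old_head_row, new_head_row in zip(
            second_target_gray, second_target_head_mask, fourth_image):
        row = []
        for pixel, old_head, new_head in zip(
                target_row, old_head_row, new_head_row):
            if new_head:
                row.append(head_value)     # head region from the fourth image
            elif old_head:
                row.append(removed_value)  # removed head area left uncovered
            else:
                row.append(pixel)          # background of the second target
        result.append(row)
    return result
```

Pixels that belonged to the original head but are not covered by the new head keep `removed_value`; these correspond to the missing region that the fusion model is expected to repair.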

本開示の実施例によれば、デカップリング画像セットに基づいて、融合画像を生成することは、以下の操作を含むことができる。 According to embodiments of the present disclosure, generating a fused image based on a decoupled image set may include the following operations.

融合モデルを利用してデカップリング画像セットを処理し、融合画像を取得し、ここで、融合モデルは第一敵対的生成ネットワークモデルにおける生成器を含む。 A fusion model is utilized to process the decoupled image set to obtain a fusion image, where the fusion model includes a generator in a first generative adversarial network model.

According to embodiments of the present disclosure, by repairing the information to be repaired, the fusion model allows the fused image obtained with it to blend more naturally with the background of the virtual person. The fusion model is used to decouple the skin color information of the object in the second target image, the head region of the object in the image to be processed, and the background information in the second target image, thereby achieving skin color alignment and repair of the image of the missing region. Skin color alignment means changing the skin color information of the object in the image to be processed to the skin color information of the object in the second target image. Repairing the image of the missing region means setting the pixel values of the pixels in the difference image between the head region of the object in the image to be processed and the head region of the object in the second target image so that the pixel values meet a preset condition.

According to embodiments of the present disclosure, the fusion model may be a model obtained by training with deep learning. The fusion model may include the generator of a first generative adversarial network model; that is, the generator of the first generative adversarial network model is used to process the decoupled image set to obtain the fused image.

According to embodiments of the present disclosure, the generative adversarial network model may include a deep convolutional generative adversarial network model, an earth mover's (Wasserstein) distance-based generative adversarial network model, a conditional generative adversarial network model, or the like. A generative adversarial network model may include a generator and a discriminator. The generator and the discriminator may include neural network models. The neural network model may include a Unet model. The Unet model may include two symmetric parts. The front part is the same as a general convolutional network model: it includes convolution layers and downsampling layers and can extract context information in the image (i.e., the relationships between pixels). The back part is essentially symmetric to the front part: it includes convolution layers and upsampling layers, and thereby achieves the goal of output image segmentation. In addition, the Unet model uses feature fusion: the features of the downsampling part in the front are fused with the features of the upsampling part in the back to obtain more accurate context information and achieve a better segmentation effect.

本開示の実施例によれば、第一敵対的生成ネットワークモデルの生成器はＵｎｅｔモデルを含むことができる。 According to embodiments of the present disclosure, the generator of the first generative adversarial network model may include a Unet model.

本開示の実施例によれば、融合モデルは以下の方式でトレーニングして得られるものであり、すなわち、第一サンプル画像セットを取得し、第一サンプル画像セットは複数の第一サンプル画像を含む。各第一サンプル画像を処理し、サンプルデカップリング画像セットを取得する。複数のサンプルデカップリング画像セットを利用して第一敵対的生成ネットワークモデルをトレーニングし、トレーニング済みの第一敵対的生成ネットワークモデルを得る。トレーニング済みの第一敵対的生成ネットワークモデルにおける生成器を融合モデルとして決定する。サンプルデカップリング画像セットは、第一サンプル画像におけるオブジェクトのヘッド領域に対応するヘッドデカップリング画像と、第一サンプル画像におけるオブジェクトに関連する修復すべき情報に対応する修復デカップリング画像とを含んでもよい。 According to an embodiment of the present disclosure, the fusion model is obtained by training in the following manner: obtaining a first sample image set, the first sample image set including a plurality of first sample images; . Each first sample image is processed to obtain a sample decoupled image set. A first generative adversarial network model is trained using the plurality of sample decoupled image sets to obtain a trained first generative adversarial network model. A generator in the trained first generative adversarial network model is determined as a fusion model. The sample decoupling image set may include a head decoupling image corresponding to a head region of the object in the first sample image and a repair decoupling image corresponding to information to be repaired related to the object in the first sample image. .

According to embodiments of the present disclosure, training the first generative adversarial network model using the plurality of sample decoupled image sets to obtain the trained first generative adversarial network model may include the following. The generator of the first generative adversarial network model is used to process each of the plurality of sample decoupled image sets to obtain a sample fused image corresponding to each sample decoupled image set. The generator and the discriminator of the first generative adversarial network model are trained alternately based on the plurality of sample fused images and the first sample image set to obtain the trained first generative adversarial network model.
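The alternating schedule can be sketched as a framework-independent loop. This is only a skeleton under assumptions: `generate`, `update_discriminator`, and `update_generator` are hypothetical callables standing in for the real network forward pass and optimizer steps, and the one-to-one alternation per sample is one common choice rather than something the text prescribes.

```python
def train_gan_alternately(samples, generate, update_discriminator,
                          update_generator, epochs=1):
    """Alternate discriminator and generator updates over the sample sets.

    samples: iterable of (sample_decoupled_image_set, first_sample_image).
    generate: hypothetical callable mapping a decoupled set to a fused image.
    update_discriminator / update_generator: hypothetical callables that
        perform one optimization step and return the resulting loss value.
    """
    history = []
    for epoch in range(epochs):
        for decoupled_set, first_sample_image in samples:
            # Step 1: update the discriminator with the generator held fixed.
            fused = generate(decoupled_set)
            d_loss = update_discriminator(real=first_sample_image, fake=fused)
            # Step 2: update the generator with the discriminator held fixed.
            fused = generate(decoupled_set)
            g_loss = update_generator(real=first_sample_image, fake=fused)
            history.append((epoch, d_loss, g_loss))
    return history
```

After training, only the generator would be kept as the fusion model, as described in the preceding paragraphs.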

本開示の実施例によれば、第一サンプル画像におけるオブジェクトのヘッド領域に対応するヘッドデカップリング画像は、第一サンプルデカップリング画像及び第二サンプルデカップリング画像を含むことができる。第一サンプルデカップリング画像におけるオブジェクトのアイデンティティ情報は第一サンプル画像におけるオブジェクトのアイデンティティ情報に対応し、第一サンプルデカップリング画像におけるオブジェクトの肌色情報は、予め設定された肌色情報に対応する。第二サンプルデカップリング画像は、第一サンプル画像におけるオブジェクトのヘッド領域と予め設定されたヘッド領域との差分画像である。 According to embodiments of the present disclosure, the head decoupling image corresponding to the head region of the object in the first sample image may include a first sample decoupling image and a second sample decoupling image. The identity information of the object in the first sample decoupled image corresponds to the identity information of the object in the first sample image, and the skin color information of the object in the first sample decoupled image corresponds to preset skin color information. The second sample decoupled image is a difference image between the head area of the object in the first sample image and a preset head area.

本開示の実施例によれば、第一サンプル画像におけるオブジェクトに関する修復すべき情報に対応する修復デカップリング画像は、第三サンプルデカップリング画像、第四サンプルデカップリング画像及び第五サンプルデカップリング画像を含むことができる。第三サンプルデカップリング画像は、第一サンプル画像におけるオブジェクトのヘッド領域の諧調画像を含んでもよい。第四サンプルデカップリング画像は、第一サンプル画像におけるオブジェクトのヘッド領域の二値化画像を含んでもよい。第五サンプルデカップリング画像は、第四サンプルデカップリング画像から得られた画像を含んでもよい。 According to an embodiment of the present disclosure, the repaired decoupled image corresponding to the information to be repaired about the object in the first sample image includes the third sample decoupled image, the fourth sample decoupled image, and the fifth sample decoupled image. can be included. The third sample decoupled image may include a tone image of the head region of the object in the first sample image. The fourth sample decoupled image may include a binarized image of the head region of the object in the first sample image. The fifth sample decoupled image may include an image obtained from the fourth sample decoupled image.

本開示の実施例によれば、融合モデルは、第一アイデンティティ情報損失関数、第一画像特徴整合損失関数、第一判別特徴整合損失関数及び第一判別器損失関数を利用してトレーニングされたものである。 According to embodiments of the present disclosure, the fusion model is trained using a first identity information loss function, a first image feature matching loss function, a first discriminant feature matching loss function, and a first discriminator loss function. It is.

本開示の実施例によれば、アイデンティティ情報損失関数は、アイデンティティ情報の整合を実現するために用いられる。画像特徴整合損失関数は、テクスチャ情報の整合を実現するために用いることができる。判別特徴整合損失関数は、できるだけ判別器空間のテクスチャ情報の整合に用いることができる。判別器損失関数は、生成された画像が高い解像度を有することをできるだけ保証するために用いられる。 According to embodiments of the present disclosure, an identity information loss function is used to achieve identity information matching. An image feature matching loss function can be used to achieve matching of texture information. The discriminant feature matching loss function can be used to match texture information in the discriminator space as much as possible. The classifier loss function is used to ensure that the generated images have high resolution as much as possible.

本開示の実施例によれば、アイデンティティ情報損失関数は以下の式（１）に基づいて決定することができる。 According to embodiments of the present disclosure, the identity information loss function can be determined based on equation (1) below.

画像特徴整合損失関数は、以下の式（２）に基づいて決定することができる。 The image feature matching loss function can be determined based on equation (2) below.

判別特徴整合損失関数は以下の式（３）に基づいて決定することができる。 The discriminant feature matching loss function can be determined based on the following equation (3).

判別器損失関数は、以下の式（４）に基づいて決定することができる。 The discriminator loss function can be determined based on the following equation (4).

本開示の実施例によれば、第一アイデンティティ情報損失関数は、第一サンプル画像におけるオブジェクトのアイデンティティ情報とサンプル融合画像におけるオブジェクトのアイデンティティ情報との整合を実現するために用いられる。第一画像特徴整合損失関数は、第一サンプル画像におけるオブジェクトのテクスチャ情報とサンプル融合画像におけるオブジェクトのテクスチャ情報との整合を実現するために用いることができる。第一判別特徴整合損失関数は、判別器空間の第一サンプル画像におけるオブジェクトのテクスチャ情報とサンプル融合画像におけるオブジェクトのテクスチャ情報との整合を実現するために用いることができる。第一判別器損失関数は、サンプル融合画像が高い解像度を有することをできるだけ保証するために用いられる。 According to embodiments of the present disclosure, a first identity information loss function is used to achieve matching of the object's identity information in the first sample image and the object's identity information in the sample fused image. The first image feature matching loss function can be used to achieve matching between the texture information of the object in the first sample image and the texture information of the object in the sample fused image. The first discriminant feature matching loss function can be used to achieve matching between the texture information of the object in the first sample image in the classifier space and the texture information of the object in the sample fused image. The first discriminator loss function is used to ensure as much as possible that the sample fused images have high resolution.
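Formulas (1) to (4) are not reproduced in this text, so the following is only an illustrative sketch of loss terms of this kind and not the formulas of the disclosure. It assumes the identity loss is one minus the cosine similarity of identity embeddings and that both feature matching losses are mean absolute differences over feature layers; the function names and the weighting scheme are hypothetical.

```python
import math


def identity_loss(id_real, id_fused):
    """Assumed form: 1 - cosine similarity between identity embeddings."""
    dot = sum(a * b for a, b in zip(id_real, id_fused))
    norm = (math.sqrt(sum(a * a for a in id_real))
            * math.sqrt(sum(b * b for b in id_fused)))
    return 1.0 - dot / norm


def feature_matching_loss(features_real, features_fused):
    """Assumed form: mean absolute difference, averaged over feature layers.

    Usable for image features (e.g., from a fixed encoder) or for
    intermediate discriminator features.
    """
    per_layer = [
        sum(abs(a - b) for a, b in zip(real, fused)) / len(real)
        for real, fused in zip(features_real, features_fused)
    ]
    return sum(per_layer) / len(per_layer)


def weighted_total(terms, weights):
    """Combine named loss terms with hypothetical per-term weights."""
    return sum(weights[name] * value for name, value in terms.items())
```

For example, identical identity embeddings give an identity loss of 0, and orthogonal embeddings give 1.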

本開示の実施例によれば、第一目標画像及び第二目標画像に基づいて、処理すべき画像を生成することは、以下の操作を含むことができる。 According to embodiments of the present disclosure, generating an image to be processed based on the first target image and the second target image may include the following operations.

駆動モデルにおけるアイデンティティ抽出モジュールを利用して第一目標画像を処理し、第一目標画像におけるオブジェクトのアイデンティティ情報を取得する。駆動モデルにおけるテクスチャ抽出モジュールを利用して第二目標画像を処理し、第二目標画像におけるオブジェクトのテクスチャ情報を取得する。駆動モデルにおけるスティッチングモジュールを利用してアイデンティティ情報及びテクスチャ情報を処理し、スティッチング情報を取得する。駆動モデル中の生成器によりスティッチング情報を処理し、処理すべき画像を取得する。 Processing the first target image using an identity extraction module in the driving model to obtain identity information of the object in the first target image. A texture extraction module in the driving model is used to process the second target image to obtain texture information of the object in the second target image. A stitching module in the driving model is used to process identity information and texture information to obtain stitching information. The stitching information is processed by a generator in the driving model to obtain an image to be processed.

本開示の実施例によれば、駆動モデルは、第一目標画像におけるオブジェクトのアイデンティティ情報及び第二目標画像におけるオブジェクトのテクスチャ情報をデカップリングし、第一目標画像におけるオブジェクトと第二目標画像におけるオブジェクトの顔の置換を完了することに用いられる。 According to embodiments of the present disclosure, the driving model decouples the identity information of the object in the first target image and the texture information of the object in the second target image, and decouples the object's identity information in the first target image and the object's texture information in the second target image. used to complete face replacement.

According to embodiments of the present disclosure, the driving model may include an identity extraction module, a texture extraction module, a stitching module, and a generator. The generator of the driving model may be the generator of a second generative adversarial network model. The identity extraction module is used to extract the identity information of an object. The texture extraction module may be used to extract the texture information of an object. The stitching module is used to stitch the identity information and the texture information together. The generator of the driving model is used to generate the image to be processed based on the stitching information.

本開示の実施例によれば、アイデンティティ抽出モジュールは、第一エンコーダであってもよく、テクスチャ抽出モジュールは、第二エンコーダであってもよく、スティッチングモジュールはＭＬＰ（Multilayer Perceptron、多層パーセプトロン）であってもよい。第一エンコーダ及び第二エンコーダはＶＧＧ（Visual Geometry Group、幾何学的視覚グループ）モデルを含むことができる。 According to embodiments of the present disclosure, the identity extraction module may be a first encoder, the texture extraction module may be a second encoder, and the stitching module is an MLP (Multilayer Perceptron). There may be. The first encoder and the second encoder may include a Visual Geometry Group (VGG) model.

According to embodiments of the present disclosure, there are multiple pieces of stitching information, and the generator of the driving model includes N cascaded deep units, where N is an integer greater than 1.

駆動モデルにおける生成器によりスティッチング情報を処理し、処理すべき画像を取得することは、以下の操作を含むことができる。 Processing the stitching information by a generator in the driving model and obtaining an image to be processed may include the following operations.

For the i-th deep unit among the N deep units, the i-th deep unit is used to process the i-th level jump information corresponding to it, and the i-th level feature information is obtained. Here, the i-th level jump information includes the (i-1)-th level feature information and the i-th level stitching information, where i is greater than 1 and less than or equal to N. The image to be processed is generated based on the N-th level feature information.

According to embodiments of the present disclosure, the generator of the driving model may include N cascaded deep units. The deep unit at each level has its own corresponding stitching information. Deep units at different levels are used to extract image features at different depths. The input of the deep unit at each level may include two parts: the feature information corresponding to the deep unit of the preceding level, and the stitching information corresponding to the deep unit of the current level.
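The cascade can be sketched as a simple loop in which each unit receives the previous level's features together with its own stitching information. This is a minimal sketch: the text does not state what the first unit receives as feature input, so the `initial_features` argument is an assumption, and the units are hypothetical callables.

```python
def run_cascade(units, stitching_infos, initial_features):
    """Run N cascaded deep units.

    units: list of N hypothetical callables, unit(prev_features, stitching).
    stitching_infos: list of N pieces of stitching information, one per level.
    initial_features: assumed input to the first level (not specified in the
        text).
    Returns the N-th level feature information.
    """
    if len(units) != len(stitching_infos):
        raise ValueError("each deep unit needs its own stitching information")
    features = initial_features
    for unit, stitching in zip(units, stitching_infos):
        # Level-i jump information: (level i-1 features, level-i stitching).
        jump_info = (features, stitching)
        features = unit(*jump_info)
    return features
```

With toy units that add the stitching vector to the incoming features, three levels with stitching vectors `[1, 1]`, `[2, 2]`, `[3, 3]` and zero initial features accumulate to `[6, 6]`.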

According to embodiments of the present disclosure, the driving model is obtained by training in the following manner. A second sample image set and a third sample image set are obtained; the second sample image set includes a plurality of second sample images, and the third sample image set includes a plurality of third sample images. The identity extraction module is used to process a second sample image to obtain the identity information of the object in the second sample image. The texture extraction module is used to process a third sample image to obtain the texture information of the object in the third sample image. The stitching module is used to process the identity information of the object in the second sample image and the texture information of the object in the third sample image to obtain sample stitching information, and the generator is used to process the sample stitching information to obtain a simulation image. The second sample image set and the simulation image set are used to train the identity extraction module, the texture extraction module, the stitching module, and the second generative adversarial network model to obtain the trained driving model.

According to embodiments of the present disclosure, the driving model is trained using a second identity information loss function, a second image feature matching loss function, a second discriminant feature matching loss function, a second discriminator loss function, and a cycle consistency loss function.

本開示の実施例によれば、第二アイデンティティ情報損失関数は、第二サンプル画像におけるオブジェクトのアイデンティティ情報とシミュレーション画像におけるオブジェクトのアイデンティティ情報との整合を実現するために用いられる。第二画像特徴整合損失関数は、第二サンプル画像におけるオブジェクトのテクスチャ情報とシミュレーション画像におけるオブジェクトのテクスチャ情報との整合を実現するために用いることができる。第二判別特徴整合損失関数は、判別器空間の第二サンプル画像におけるオブジェクトのテクスチャ情報とシミュレーション画像におけるオブジェクトのテクスチャ情報との整合を実現するために用いることができる。第二判別器損失関数は、シミュレーション画像が高い解像度を有することをできるだけ保証するために用いられる。循環一致損失関数は、第三サンプル画像におけるオブジェクトのテクスチャ情報に対する駆動モデルの保持能力を向上させるために用いることができる。 According to embodiments of the present disclosure, a second identity information loss function is used to achieve matching of the object's identity information in the second sample image and the object's identity information in the simulation image. The second image feature matching loss function can be used to achieve matching between the texture information of the object in the second sample image and the texture information of the object in the simulation image. The second discriminant feature matching loss function can be used to achieve matching between the texture information of the object in the second sample image in the discriminator space and the texture information of the object in the simulation image. The second classifier loss function is used to ensure that the simulated images have high resolution as much as possible. A circular matching loss function can be used to improve the driving model's ability to retain texture information of objects in the third sample image.

本開示の実施例によれば、循環一致損失関数は、実際結果と駆動モデルにより生成された予測結果に基づいて決定され、実際結果は、実際画像におけるオブジェクトの実際アイデンティティ情報及び実際テクスチャ情報を含み、予測結果は、シミュレーション画像におけるオブジェクトの予測アイデンティティ情報及び予測テクスチャ情報を含む。 According to embodiments of the present disclosure, the cyclic matching loss function is determined based on the actual result and the predicted result generated by the driving model, where the actual result includes actual identity information and actual texture information of the object in the actual image. , the prediction result includes predicted identity information and predicted texture information of the object in the simulation image.

本開示の実施例によれば、実際画像におけるオブジェクトの実際アイデンティティ情報は、上述した第二サンプル画像におけるオブジェクトのアイデンティティ情報であると理解することができる。実際画像におけるオブジェクトの実際テクスチャ情報は、上述した第三サンプル画像におけるオブジェクトのテクスチャ情報であると理解することができる。 According to an embodiment of the present disclosure, the actual identity information of the object in the actual image can be understood as the identity information of the object in the second sample image mentioned above. The actual texture information of the object in the actual image can be understood as the texture information of the object in the third sample image described above.
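Formulas (5) to (7) are not reproduced in this text. Purely as an illustration of a loss that compares actual and predicted identity and texture information, the sketch below assumes a weighted sum of mean absolute differences; the actual form in the disclosure may differ, and the names and weights are hypothetical.

```python
def mean_abs_difference(a, b):
    """Mean absolute difference between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(actual_identity, actual_texture,
                           predicted_identity, predicted_texture,
                           identity_weight=1.0, texture_weight=1.0):
    """Assumed form: weighted sum of the identity and texture discrepancies.

    actual_*: information extracted from the second / third sample images.
    predicted_*: information re-extracted from the simulation image.
    """
    identity_term = mean_abs_difference(actual_identity, predicted_identity)
    texture_term = mean_abs_difference(actual_texture, predicted_texture)
    return identity_weight * identity_term + texture_weight * texture_term
```

A loss of zero here would mean that the identity and texture re-extracted from the simulation image are identical to those of the source images.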

本開示の実施例によれば、循環一致損失関数は、以下の式（５）～（７）に基づいて決定することができる。 According to embodiments of the present disclosure, the cyclic matching loss function can be determined based on equations (5) to (7) below.

本開示の実施例によれば、上記画像処理方法はさらに以下の操作を含むことができる。 According to embodiments of the present disclosure, the image processing method may further include the following operations.

融合画像に対して強化処理を行い、強化画像を取得する。 Enhancement processing is performed on the fused image to obtain an enhanced image.

本開示の実施例によれば、融合画像の解像度を向上させるために、融合画像に対して解像度強化処理を行い、強化画像を取得することにより、強化画像の解像度が融合画像の解像度より大きい。 According to an embodiment of the present disclosure, in order to improve the resolution of the fused image, resolution enhancement processing is performed on the fused image to obtain the enhanced image, so that the resolution of the enhanced image is greater than the resolution of the fused image.

本開示の実施例によれば、融合画像に対して強化処理を行い、強化画像を取得することは、以下の操作を含むことができる。 According to embodiments of the present disclosure, performing an enhancement process on a fused image and obtaining an enhanced image may include the following operations.

強化モデルを利用して融合画像を処理し、強化画像を取得し、ここで、強化モデルは、第三敵対的生成ネットワークモデルにおける生成器を含む。 The fused image is processed using an enhanced model to obtain an enhanced image, where the enhanced model includes a generator in a third generative adversarial network model.

According to embodiments of the present disclosure, the enhancement model is used to improve the resolution of an image. The enhancement model may include the generator of a third generative adversarial network model. The third generative adversarial network model may include PSFR-GAN (PSFR: Progressive Semantic-Aware Style Transformation).

以下に図３～図４を参照し、具体的な実施例を参照して本開示の実施例に係る画像処理方法をさらに説明する。 The image processing method according to the embodiment of the present disclosure will be further described below with reference to FIGS. 3 to 4 and specific embodiments.

図３は、本開示の実施例に係る処理すべき画像を生成する過程の概略図を概略的に示す。 FIG. 3 schematically shows a schematic diagram of a process of generating an image to be processed according to an embodiment of the present disclosure.

図３に示すように、過程３００において、第一目標画像セット３０１は、第一目標画像３０１０、第一目標画像３０１１、第一目標画像３０１２及び第一目標画像３０１３を含む。駆動モデルは、アイデンティティ抽出モジュール３０３、テクスチャ抽出モジュール３０５、スティッチングモジュール３０７及び生成器３０９を含む。 As shown in FIG. 3, in the process 300, the first target image set 301 includes a first target image 3010, a first target image 3011, a first target image 3012, and a first target image 3013. The driving model includes an identity extraction module 303, a texture extraction module 305, a stitching module 307 and a generator 309.

アイデンティティ抽出モジュール３０３を利用して第一目標画像セット３０１を処理し、第一目標画像３０１０中のオブジェクトのアイデンティティ情報３０４０、第一目標画像３０１１におけるオブジェクトのアイデンティティ情報３０４１、第一目標画像３０１２におけるオブジェクトのアイデンティティ情報３０４２、第一目標画像３０１３におけるオブジェクトのアイデンティティ情報３０４３を取得する。アイデンティティ情報３０４０、アイデンティティ情報３０４１、アイデンティティ情報３０４２及びアイデンティティ情報３０４３に基づいて、平均アイデンティティ情報３０４を取得し、平均アイデンティティ情報３０４を第一目標画像のアイデンティティ情報３０４として確定する。 The identity extraction module 303 is used to process the first target image set 301 and extract identity information 3040 of the object in the first target image 3010, identity information 3041 of the object in the first target image 3011, and object identity information 3041 of the object in the first target image 3012. The object identity information 3042 and the object identity information 3043 in the first target image 3013 are acquired. Based on the identity information 3040, the identity information 3041, the identity information 3042, and the identity information 3043, the average identity information 304 is obtained, and the average identity information 304 is determined as the identity information 304 of the first target image.
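Assuming each piece of identity information is a fixed-length feature vector (the text does not specify the representation), the average identity information can be computed element-wise. The function name is hypothetical.

```python
def average_identity(identity_vectors):
    """Element-wise mean of several identity vectors of equal length."""
    count = len(identity_vectors)
    if count == 0:
        raise ValueError("at least one identity vector is required")
    return [sum(component) / count for component in zip(*identity_vectors)]
```

For the four first target images in FIG. 3, this would be called with the four vectors corresponding to identity information 3040 to 3043.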

The texture extraction module 305 is used to process the second target image 302 to obtain texture information 306 of the object in the second target image 302.

スティッチングモジュール３０７を利用してアイデンティティ情報３０４及びテクスチャ情報３０６を処理し、スティッチング情報セット３０８を取得し、スティッチング情報セット３０８は、スティッチング情報３０８０、スティッチング情報３０８１及びスティッチング情報３０８２を含む。 The stitching module 307 is used to process the identity information 304 and texture information 306 to obtain a stitching information set 308, which includes stitching information 3080, stitching information 3081, and stitching information 3082. include.

生成器３０９を利用してスティッチング情報セット３０８を処理し、処理すべき画像３１０を取得する。処理すべき画像３１０におけるオブジェクトのアイデンティティ情報は、第一目標画像におけるオブジェクトのアイデンティティ情報とマッチングする。処理すべき画像３１０におけるオブジェクトのテクスチャ情報は、第二目標画像３０２におけるオブジェクトのテクスチャ情報とマッチングする。 The stitching information set 308 is processed using the generator 309 to obtain an image 310 to be processed. The identity information of the object in the image to be processed 310 is matched with the identity information of the object in the first target image. The texture information of the object in the image to be processed 310 is matched with the texture information of the object in the second target image 302.

図４は、本開示の実施例に係る画像処理過程の概略図を概略的に示す。 FIG. 4 schematically shows a schematic diagram of an image processing process according to an embodiment of the present disclosure.

図４に示すように、該過程４００において、駆動モデル４０３を利用して第一目標画像４０１及び第二目標画像４０２を処理し、処理すべき画像４０４を取得する。 As shown in FIG. 4, in the process 400, a driving model 403 is used to process a first target image 401 and a second target image 402 to obtain an image 404 to be processed.

第二目標画像４０２及び処理すべき画像４０４に基づいて、デカップリング画像セット４０５における第一デカップリング画像４０５０を取得する。第二目標画像４０２及び処理すべき画像４０４に基づいて、デカップリング画像セット４０５における第二デカップリング画像４０５１を取得する。処理すべき画像４０４に基づいて、デカップリング画像セット４０５における第三デカップリング画像４０５２を取得する。処理すべき画像４０４に基づいて、デカップリング画像セット４０５における第四デカップリング画像４０５３を取得する。第二目標画像４０２及び第四デカップリング画像４０５３に基づいて、デカップリング画像セット４０５における第五デカップリング画像４０５４を取得する。 A first decoupled image 4050 in the decoupled image set 405 is obtained based on the second target image 402 and the image to be processed 404 . A second decoupled image 4051 in the decoupled image set 405 is obtained based on the second target image 402 and the image to be processed 404 . Based on the image 404 to be processed, a third decoupled image 4052 in the decoupled image set 405 is obtained. A fourth decoupled image 4053 in the decoupled image set 405 is obtained based on the image 404 to be processed. A fifth decoupled image 4054 in the decoupled image set 405 is obtained based on the second target image 402 and the fourth decoupled image 4053.

The fusion model 406 is used to process the decoupled image set 405 to obtain the fused image 407.
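The flow of process 400 can be summarized as a composition of three stages, with the optional enhancement step described earlier. This is a structural sketch only: every stage is passed in as a hypothetical callable, and nothing here reflects how the models themselves are implemented.

```python
def run_image_processing(first_target, second_target, driving_model,
                         build_decoupled_set, fusion_model, enhance=None):
    """Compose the stages of process 400.

    driving_model(first_target, second_target) -> image to be processed
    build_decoupled_set(second_target, image_to_process) -> decoupled set
    fusion_model(decoupled_set) -> fused image
    enhance(fused_image) -> enhanced image (optional)
    """
    image_to_process = driving_model(first_target, second_target)
    decoupled_set = build_decoupled_set(second_target, image_to_process)
    fused_image = fusion_model(decoupled_set)
    if enhance is not None:
        return enhance(fused_image)
    return fused_image
```

Keeping the stages as separate callables mirrors the description, in which the driving model, the fusion model, and the enhancement model are trained independently.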

In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and application of the user personal information involved all comply with the relevant laws and regulations, adopt the necessary security measures, and do not violate public order and good morals.

本開示の技術的技術案において、ユーザの個人情報を取得するか又は収集する前に、いずれもユーザの許可又は同意を取得する。 In the technical solution of the present disclosure, the user's permission or consent is obtained before obtaining or collecting the user's personal information.

図５は、本開示の実施例に係る画像処理装置のブロック図を概略的に示す。 FIG. 5 schematically shows a block diagram of an image processing device according to an embodiment of the present disclosure.

図５に示すように、画像処理装置５００は、第一生成モジュール５１０、第二生成モジュール５２０及び第三生成モジュール５３０を含むことができる。 As shown in FIG. 5, the image processing apparatus 500 may include a first generation module 510, a second generation module 520, and a third generation module 530.

第一生成モジュール５１０は、第一目標画像及び第二目標画像に基づいて、処理すべき画像を生成する。ここで、処理すべき画像におけるオブジェクトのアイデンティティ情報は、第一目標画像におけるオブジェクトのアイデンティティ情報とマッチングし、処理すべき画像におけるオブジェクトのテクスチャ情報は、第二目標画像におけるオブジェクトのテクスチャ情報とマッチングする。 The first generation module 510 generates an image to be processed based on the first target image and the second target image. Here, the identity information of the object in the image to be processed is matched with the identity information of the object in the first target image, and the texture information of the object in the image to be processed is matched with the texture information of the object in the second target image. .

The second generation module 520 generates a decoupled image set based on the second target image and the image to be processed. Here, the decoupled image set includes a head decoupled image corresponding to the head region of the object in the image to be processed and a repair decoupled image corresponding to the information to be repaired related to the object in the image to be processed.

第三生成モジュール５３０は、デカップリング画像セットに基づいて、融合画像を生成する。ここで、融合画像におけるオブジェクトのアイデンティティ情報及びテクスチャ情報はそれぞれ処理すべき画像におけるオブジェクトのアイデンティティ情報及びテクスチャ情報とマッチングし、かつ融合画像におけるオブジェクトに関連する修復すべき情報が修復された。 A third generation module 530 generates a fused image based on the decoupled image set. Here, the identity information and texture information of the object in the fused image are matched with the identity information and texture information of the object in the image to be processed, respectively, and the information to be repaired related to the object in the fused image is repaired.
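The module arrangement described above can be sketched as follows. This is only an illustrative outline under stated assumptions: the class name `ImageProcessingApparatus`, the `run` method, and the dictionary used for the decoupled image set are hypothetical and do not come from the disclosure, and the three modules are passed in as plain callables rather than as trained models.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

Image = Any  # placeholder: the disclosure does not fix an image representation


@dataclass
class ImageProcessingApparatus:
    """Illustrative stand-in for apparatus 500 (names are hypothetical)."""
    first_generation: Callable[[Image, Image], Image]              # module 510
    second_generation: Callable[[Image, Image], Dict[str, Image]]  # module 520
    third_generation: Callable[[Dict[str, Image]], Image]          # module 530

    def run(self, first_target: Image, second_target: Image) -> Image:
        # 510: image to be processed from the first and second target images
        to_process = self.first_generation(first_target, second_target)
        # 520: decoupled image set from the second target image and the image to be processed
        decoupled_set = self.second_generation(second_target, to_process)
        # 530: fused image from the decoupled image set
        return self.third_generation(decoupled_set)
```

The second target image is passed to both the first and the second module, which reflects the description: it supplies texture information in the first step and is reused when the decoupled image set is generated.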

According to an embodiment of the present disclosure, the repair decoupled image includes a first decoupled image and a second decoupled image. The identity information of the object in the first decoupled image matches the identity information of the object in the image to be processed, and the skin color information of the object in the first decoupled image matches the skin color information of the object in the second target image. The second decoupled image is a difference image between the head region of the object in the image to be processed and the head region of the object in the second target image. Here, the information to be repaired related to the object in the fused image having been repaired means that the skin color information of the object in the fused image matches the skin color information of the object in the second target image and that the pixel values of the pixels in the difference image meet a preset condition.
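As a rough illustration of the second decoupled image, the sketch below computes a per-pixel difference between two head regions and checks it against a condition. The disclosure does not state how the difference is computed or what the preset condition is; the absolute difference, the grayscale list-of-rows representation, and the threshold `max_value` are all assumptions made for this example.

```python
from typing import List

Gray = List[List[int]]  # assumed representation: rows of 0-255 integers


def difference_image(head_a: Gray, head_b: Gray) -> Gray:
    """Per-pixel absolute difference of two equally sized head regions."""
    return [[abs(pa - pb) for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(head_a, head_b)]


def meets_condition(diff: Gray, max_value: int = 10) -> bool:
    """Hypothetical preset condition: every difference pixel is at most max_value."""
    return all(p <= max_value for row in diff for p in row)
```

For example, `difference_image([[10, 200]], [[12, 190]])` yields `[[2, 10]]`, which satisfies the assumed condition with the default threshold.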

According to an embodiment of the present disclosure, the head decoupled image includes a third decoupled image, a fourth decoupled image, and a fifth decoupled image. The third decoupled image includes a grayscale image of the head region of the object in the image to be processed. The fourth decoupled image includes a binarized image of the head region of the object in the image to be processed. The fifth decoupled image includes an image obtained from the second target image and the fourth decoupled image.
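The three head decoupled images can be pictured with the following sketch. It rests on several assumptions that the disclosure does not confirm: images are lists of RGB tuples, the grayscale conversion uses the common BT.601 luma weights, a simple threshold stands in for whatever produces the binarized head region, and the fifth image is formed by blanking the masked head region of the second target image. All function names are hypothetical.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
ColorImage = List[List[RGB]]
GrayImage = List[List[int]]


def to_gray(img: ColorImage) -> GrayImage:
    # Third decoupled image (sketch): BT.601 luma weights, a common grayscale conversion
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row] for row in img]


def binarize(gray: GrayImage, threshold: int = 128) -> GrayImage:
    # Fourth decoupled image (sketch): 1 marks the head region, 0 everything else.
    # Thresholding is only a stand-in for a real head segmentation step.
    return [[1 if p >= threshold else 0 for p in row] for row in gray]


def remove_head(second_target: ColorImage, mask: GrayImage) -> ColorImage:
    # Fifth decoupled image (assumed combination): blank the masked head region
    # of the second target image and keep the remaining pixels unchanged.
    return [[(0, 0, 0) if m else px for px, m in zip(row, mask_row)]
            for row, mask_row in zip(second_target, mask)]
```

Keeping the mask as a separate image lets a downstream model see both where the head is and what surrounds it in the second target image, which is one plausible reason for decoupling the inputs this way.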

本開示の実施例によれば、第三生成モジュール５３０は第一処理ユニットを含むことができる。 According to embodiments of the present disclosure, the third generation module 530 may include a first processing unit.

第一処理ユニットは、融合モデルを利用してデカップリング画像セットを処理し、融合画像を取得する。ここで、融合モデルは、第一敵対的生成ネットワークモデルにおける生成器を含む。 The first processing unit processes the decoupled image set using the fusion model to obtain a fused image. Here, the fusion model includes the generator in the first generative adversarial network model.

本開示の実施例によれば、第一生成モジュール５１０は第二処理ユニット、第三処理ユニット、第四処理ユニット及び第五処理ユニットを含むことができる。 According to embodiments of the present disclosure, the first generation module 510 may include a second processing unit, a third processing unit, a fourth processing unit, and a fifth processing unit.

第二処理ユニットは、駆動モデルにおけるアイデンティティ抽出モジュールを利用して第一目標画像を処理し、第一目標画像におけるオブジェクトのアイデンティティ情報を取得する。 The second processing unit processes the first target image using the identity extraction module in the driving model to obtain identity information of the object in the first target image.

第三処理ユニットは、駆動モデルにおけるテクスチャ抽出モジュールを利用して第二目標画像を処理し、第二目標画像におけるオブジェクトのテクスチャ情報を取得する。 The third processing unit processes the second target image using a texture extraction module in the driving model to obtain texture information of the object in the second target image.

第四処理ユニットは、駆動モデルにおけるスティッチングモジュールを利用してアイデンティティ情報及びテクスチャ情報を処理し、スティッチング情報を取得する。 A fourth processing unit processes identity information and texture information using a stitching module in the driving model to obtain stitching information.

第五処理ユニットは、駆動モデルにおける生成器を利用してスティッチング情報を処理し、処理すべき画像を取得する。 The fifth processing unit processes the stitching information using a generator in the driving model to obtain an image to be processed.
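The four processing steps of the driving model can be summarized in a short sketch. The function name `drive`, the use of flat feature vectors, and the choice of concatenation as the stitching operation are assumptions for illustration; the disclosure only states that a stitching module combines identity information and texture information.

```python
from typing import Callable, List

Vector = List[float]


def drive(first_target: object,
          second_target: object,
          extract_identity: Callable[[object], Vector],
          extract_texture: Callable[[object], Vector],
          generator: Callable[[Vector], object]) -> object:
    identity = extract_identity(first_target)  # identity extraction module
    texture = extract_texture(second_target)   # texture extraction module
    stitched = identity + texture              # stitching module, assumed to concatenate
    return generator(stitched)                 # generator yields the image to be processed
```

Identity is read only from the first target image and texture only from the second, so the output can carry the identity of one object and the texture of the other.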

第五処理ユニットは、処理サブユニットと、生成サブユニットとを含んでもよい。 The fifth processing unit may include a processing subunit and a generation subunit.

For the i-th deep unit among the N deep units, the processing subunit uses the i-th deep unit to process the i-th level jump information corresponding to that unit and obtains i-th level feature information. Here, the i-th level jump information includes the (i-1)-th level feature information and the i-th level stitching information, and i is greater than 1 and less than or equal to N.

生成サブユニットは、第Ｎレベルの特徴情報に基づいて、処理すべき画像を生成する。 The generation subunit generates an image to be processed based on the Nth level feature information.
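The cascade of deep units can be sketched as a simple loop. This is a minimal sketch under assumptions: feature and stitching information are flat lists of floats, jump information is formed by concatenation, and the first unit is assumed to receive only the first-level stitching information, since the passage above defines the rule only for i greater than 1. The name `run_generator` is hypothetical.

```python
from typing import Callable, List, Sequence

Vector = List[float]
DeepUnit = Callable[[Vector], Vector]


def run_generator(units: Sequence[DeepUnit], stitching: Sequence[Vector]) -> Vector:
    """Return the N-th level feature information of N cascaded deep units."""
    if len(units) != len(stitching) or len(units) < 2:
        raise ValueError("need N > 1 units and one stitching vector per level")
    # Level 1 (assumption): the first unit sees only the first-level stitching information.
    feature = units[0](list(stitching[0]))
    for i in range(1, len(units)):
        # Level i jump information: level (i-1) features joined with level i stitching.
        jump = feature + list(stitching[i])
        feature = units[i](jump)
    return feature
```

The returned N-th level feature information is what the generation subunit would turn into the image to be processed; that final step is left out here because the disclosure does not describe it in this passage.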

According to an embodiment of the present disclosure, the driving model is trained using a second identity information loss function, a second image feature matching loss function, a second discriminant feature matching loss function, a second discriminator loss function, and a cyclic matching loss function.

本開示の実施例によれば、上記画像処理装置５００はさらに処理モジュールを含むことができる。 According to embodiments of the present disclosure, the image processing device 500 may further include a processing module.

処理モジュールは、融合画像に強化処理を行い、強化画像を取得する。 The processing module performs an enhancement process on the fused image and obtains an enhanced image.

本開示の実施例によれば、本開示は、さらに、電子機器、可読記憶媒体及びコンピュータプログラム製品を提供する。 According to embodiments of the disclosure, the disclosure further provides electronic devices, readable storage media, and computer program products.

According to an embodiment of the present disclosure, an electronic device includes at least one processor and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the image processing method described above.

本開示の実施例によれば、コンピュータ命令を記憶した非一時的なコンピュータ可読記憶媒体であって、ここで、コンピュータ命令はコンピュータに前記の画像処理方法を実行させる。 According to embodiments of the present disclosure, a non-transitory computer-readable storage medium having computer instructions stored thereon, the computer instructions causing a computer to perform the image processing method described above.

According to an embodiment of the present disclosure, a computer program product includes a computer program that, when executed by a processor, implements the image processing method described above.

According to an embodiment of the present disclosure, an electronic device includes at least one processor and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method described above.

本開示の実施例によれば、コンピュータ命令を記憶した非一時的なコンピュータ可読記憶媒体であって、ここで、コンピュータ命令はコンピュータに前記の方法を実行させる。 According to an embodiment of the present disclosure, a non-transitory computer-readable storage medium having computer instructions stored thereon, the computer instructions causing a computer to perform the method described above.

According to an embodiment of the present disclosure, a computer program product includes a computer program that, when executed by a processor, implements the method described above.

FIG. 6 schematically shows a block diagram of an electronic device suitable for implementing the image processing method according to an embodiment of the present disclosure. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, mobile phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the present disclosure described and/or claimed herein.

As shown in FIG. 6, the electronic device 600 includes a computing unit 601, which can perform various appropriate actions and processing based on a computer program stored in a read-only memory (ROM) 602 or a computer program loaded from a storage unit 608 into a random access memory (RAM) 603. The RAM 603 can also store various programs and data necessary for the operation of the electronic device 600. The computing unit 601, the ROM 602, and the RAM 603 are connected to one another via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

A plurality of components in the electronic device 600 are connected to the I/O interface 605, including an input unit 606 such as a keyboard or a mouse; an output unit 607 such as various types of displays and speakers; a storage unit 608 such as a magnetic disk or an optical disk; and a communication unit 609 such as a network card, a modem, or a wireless communication transceiver. The communication unit 609 allows the electronic device 600 to exchange information/data with other devices via a computer network such as the Internet and/or various telecommunication networks.

The computing unit 601 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 601 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 601 performs the methods and processing described above, such as the image processing method. For example, in some embodiments, the image processing method may be implemented as a computer software program tangibly contained in a machine-readable medium such as the storage unit 608. In some embodiments, part or all of the computer program may be loaded and/or installed on the electronic device 600 via the ROM 602 and/or the communication unit 609. When the computer program is loaded into the RAM 603 and executed by the computing unit 601, one or more steps of the image processing method described above may be performed. Alternatively, in other embodiments, the computing unit 601 may be configured to perform the image processing method in any other suitable manner (for example, by means of firmware).

Various embodiments of the systems and techniques described herein may be implemented in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor, and can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing device, so that when the program code is executed by the processor or controller, the functions/operations specified in the flowcharts and/or block diagrams are performed. The program code may be executed entirely on a machine, partly on a machine, partly on a machine and partly on a remote machine as a stand-alone software package, or entirely on a remote machine or server.

In the context of the present disclosure, a machine-readable medium may be a tangible medium that contains or stores a program for use by, or in combination with, an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide interaction with a user, the systems and techniques described herein may be implemented on a computer that has a display device (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing device (for example, a mouse or a trackball) through which the user can provide input to the computer. Other kinds of devices may also be used to provide interaction with the user. For example, the feedback provided to the user may be any form of sensory feedback (for example, visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form, including acoustic input, voice input, or tactile input.

The systems and techniques described herein may be implemented in a computing system that includes a back-end component (for example, a data server), or a computing system that includes a middleware component (for example, an application server), or a computing system that includes a front-end component (for example, a user computer having a graphical user interface or a web browser through which a user can interact with embodiments of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system may be connected to one another by digital data communication in any form or medium (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.

コンピュータシステムは、クライアント及びサーバを含んでよい。クライアントとサーバ同士は、一般的に離れており、通常、通信ネットワークを介してインタラクションする。クライアントとサーバとの関係は、該当するコンピュータ上でランニングし、クライアント－サーバの関係を有するコンピュータプログラムによって生成される。サーバは、クラウドサーバであってもよく、分散システムのサーバ、またはブロックチェーンと組み合わせたサーバであってよい。 A computer system may include a client and a server. Clients and servers are generally remote and typically interact via a communications network. The relationship between client and server is created by a computer program running on the relevant computer and having a client-server relationship. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.

It should be understood that the various forms of flows shown above may be used with operations reordered, added, or deleted. For example, the operations described in the present disclosure may be performed in parallel, sequentially, or in a different order, as long as the desired result of the technical solution of the present disclosure can be achieved; no limitation is imposed herein.

前記具体的な実施形態は、本開示の保護範囲を限定するものではない。当業者であれば、設計要件及び他の要因に応じて、様々な修正、組み合わせ、サブコンビネーション及び代替を行うことが可能であると理解すべきである。本開示の精神と原則内で行われた任意の修正、均等置換及び改良などは、いずれも本開示の保護範囲内に含まれるべきである。 The specific embodiments do not limit the protection scope of the present disclosure. Those skilled in the art should appreciate that various modifications, combinations, subcombinations, and substitutions may be made depending on design requirements and other factors. Any modifications, equivalent substitutions, improvements, etc. made within the spirit and principles of this disclosure should be included within the protection scope of this disclosure.

The present disclosure relates to the field of artificial intelligence technology, in particular to the fields of computer vision and deep learning, and can be applied to scenarios such as facial image processing and face identification. Specifically, it relates to an image processing method, an image processing device, an electronic device, a storage medium, and a computer program.

本開示の別の態様によれば、プロセッサにより実行される時に上記の方法を実現するコンピュータプログラムを提供する。According to another aspect of the disclosure, a computer program product is provided that, when executed by a processor, implements the method described above.

According to an embodiment of the present disclosure, the present disclosure further provides an electronic device, a readable storage medium, and a computer program.

According to an embodiment of the present disclosure, there is provided a computer program that, when executed by a processor, implements the image processing method described above.

According to an embodiment of the present disclosure, there is provided a computer program that, when executed by a processor, implements the method described above.

Claims

An image processing method, comprising:
generating an image to be processed based on a first target image and a second target image, wherein identity information of an object in the image to be processed matches identity information of an object in the first target image, and texture information of the object in the image to be processed matches texture information of an object in the second target image;
generating a decoupled image set based on the second target image and the image to be processed, wherein the decoupled image set includes a head decoupled image corresponding to a head region of the object in the image to be processed and a repair decoupled image corresponding to information to be repaired related to the object in the image to be processed; and
generating a fused image based on the decoupled image set, wherein identity information and texture information of an object in the fused image match the identity information and the texture information of the object in the image to be processed, respectively, and the information to be repaired related to the object in the fused image has been repaired.

The method according to claim 1, wherein
the repair decoupled image includes a first decoupled image and a second decoupled image;
the identity information of the object in the first decoupled image matches the identity information of the object in the image to be processed, and the skin color information of the object in the first decoupled image matches the skin color information of the object in the second target image;
the second decoupled image is a difference image between the head region of the object in the image to be processed and the head region of the object in the second target image; and
the information to be repaired related to the object in the fused image having been repaired indicates that the skin color information of the object in the fused image matches the skin color information of the object in the second target image and that the pixel values of the pixels in the difference image meet a preset condition.

The head decoupling image includes a third decoupling image, a fourth decoupling image, and a fifth decoupling image,
The third decoupling image includes a grayscale image of the head area of the object in the image to be processed,
The fourth decoupling image includes a binarized image of the head area of the object in the image to be processed,
The method according to claim 1 or 2, wherein the fifth decoupled image includes an image obtained based on the second target image and the fourth decoupled image.

The method according to any one of claims 1 to 3, wherein generating the fused image based on the decoupled image set comprises:
processing the decoupled image set using a fusion model to obtain the fused image, the fusion model comprising a generator in a first generative adversarial network model.

The fusion model is obtained by training using a first identity information loss function, a first image feature matching loss function, a first discriminant feature matching loss function, and a first discriminator loss function. Method described.

Generating an image to be processed based on the first target image and the second target image includes:
processing the first target image using an identity extraction module in a driving model to obtain identity information of an object in the first target image;
processing the second target image using a texture extraction module in the driving model to obtain texture information of an object in the second target image;
processing the identity information and the texture information using a stitching module in the driving model to obtain stitching information;
The method according to any one of claims 1 to 5, comprising processing the stitching information using a generator in the driving model and obtaining the image to be processed.

The method according to claim 6, wherein
the stitching information includes a plurality of pieces of stitching information, the generator in the driving model includes N cascaded deep units, and N is an integer greater than 1; and
processing the stitching information using the generator in the driving model to obtain the image to be processed comprises:
for the i-th deep unit among the N deep units, processing the i-th level jump information corresponding to the i-th deep unit using the i-th deep unit to obtain i-th level feature information, wherein the i-th level jump information includes the (i-1)-th level feature information and the i-th level stitching information, and i is greater than 1 and less than or equal to N; and
generating the image to be processed based on the N-th level feature information.

The driving model is obtained by training using a second identity information loss function, a second image feature matching loss function, a second discriminant feature matching loss function, a second discriminator loss function, and a cyclic matching loss function. The method according to claim 6 or 7.

The method according to claim 8, wherein the cyclic matching loss function is determined based on an actual result and a predicted result generated by the driving model, the actual result includes actual identity information and actual texture information of an object in an actual image, and the predicted result includes predicted identity information and predicted texture information of an object in a simulated image.

The method according to any one of claims 1 to 8, further comprising performing an enhancement process on the fused image to obtain an enhanced image.

An image processing device, comprising:
a first generation module that generates an image to be processed based on a first target image and a second target image, wherein identity information of an object in the image to be processed matches identity information of an object in the first target image, and texture information of the object in the image to be processed matches texture information of an object in the second target image;
a second generation module that generates a decoupled image set based on the second target image and the image to be processed, wherein the decoupled image set includes a head decoupled image corresponding to a head region of the object in the image to be processed and a repair decoupled image corresponding to information to be repaired related to the object in the image to be processed; and
a third generation module that generates a fused image based on the decoupled image set, wherein identity information and texture information of an object in the fused image match the identity information and the texture information of the object in the image to be processed, respectively, and the information to be repaired related to the object in the fused image has been repaired.

The device according to claim 11, wherein
the repair decoupling image includes a first decoupling image and a second decoupling image,
identity information of an object in the first decoupling image matches the identity information of the object in the image to be processed, and skin color information of the object in the first decoupling image matches skin color information of the object in the second target image,
the second decoupling image is a difference image between the head region of the object in the image to be processed and a head region of the object in the second target image, and
the information to be repaired related to the object in the fused image having been repaired means that skin color information of the object in the fused image matches the skin color information of the object in the second target image and that the pixel values of the pixels satisfy a preset condition.
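
One possible reading of the difference image is a mask that marks where the two head regions disagree. The sketch below assumes that both head regions are available as binary masks of the same size and that the difference is the set of pixels covered by exactly one of them; the claim does not specify either point.

```python
from typing import List

Mask = List[List[int]]  # 1 = pixel belongs to the head region, 0 = it does not


def difference_mask(processed_head: Mask, target_head: Mask) -> Mask:
    """Mark pixels that lie in exactly one of the two head regions."""
    same_shape = len(processed_head) == len(target_head) and all(
        len(p_row) == len(t_row)
        for p_row, t_row in zip(processed_head, target_head)
    )
    if not same_shape:
        raise ValueError("masks must have the same shape")
    return [
        [int(p != t) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(processed_head, target_head)
    ]
```

Under this reading, the marked pixels are the ones where the swapped head and the original head do not overlap, which is where repair would be needed.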

The device according to claim 11 or 12, wherein
the head decoupling image includes a third decoupling image, a fourth decoupling image, and a fifth decoupling image,
the third decoupling image includes a tone image of the head region of the object in the image to be processed,
the fourth decoupling image includes a binarized image of the head region of the object in the image to be processed, and
the fifth decoupling image includes an image obtained based on the second target image and the fourth decoupling image.
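
The three head decoupling images can be illustrated on tiny nested-list images. Several points here are assumptions not stated in the claim: the head region is taken from a hypothetical per-pixel head probability map, the tone image is approximated by BT.601 luma, and the fifth image is formed by blanking the head region out of the second target image.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255
RGBImage = List[List[Pixel]]
Mask = List[List[int]]


def binarize_head(head_probability: List[List[float]], threshold: float = 0.5) -> Mask:
    """Fourth decoupling image (sketch): 1 inside the head region, else 0."""
    return [[1 if p >= threshold else 0 for p in row] for row in head_probability]


def luma(pixel: Pixel) -> int:
    """Integer BT.601 luma, used here as a stand-in for the tone value."""
    r, g, b = pixel
    return (299 * r + 587 * g + 114 * b) // 1000


def tone_image(image: RGBImage, head_mask: Mask) -> List[List[int]]:
    """Third decoupling image (sketch): tone values inside the head, 0 outside."""
    return [
        [luma(px) if m else 0 for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, head_mask)
    ]


def outside_head(second_target: RGBImage, head_mask: Mask) -> RGBImage:
    """Fifth decoupling image (sketch): second target with the head blanked."""
    return [
        [(0, 0, 0) if m else px for px, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(second_target, head_mask)
    ]
```

A real system would obtain the head region from a segmentation or face-parsing step, which this sketch does not model.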

The device described, wherein the third generation module includes a first processing unit that processes the decoupling image set using a fusion model to obtain the fused image, the fusion model comprising a generator in a first generative adversarial network model.

The device described, wherein the fusion model is obtained by training using a first identity information loss function, a first image feature matching loss function, a first discriminant feature matching loss function, and a first discriminator loss function.

The device according to any one of claims 11 to 15, wherein the first generation module includes:
a second processing unit that processes the first target image using an identity extraction module in a driving model to obtain identity information of an object in the first target image;
a third processing unit that processes the second target image using a texture extraction module in the driving model to obtain texture information of an object in the second target image;
a fourth processing unit that processes the identity information and the texture information using a stitching module in the driving model to obtain stitching information; and
a fifth processing unit that processes the stitching information using a generator in the driving model to obtain the image to be processed.
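
The four processing units of the first generation module can be sketched as a small class. The class and attribute names are hypothetical, the extractors and the generator are injected as callables, and stitching is assumed to be plain concatenation of the two feature vectors, which the claim does not state.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vector = List[float]


@dataclass
class DrivingModelSketch:
    extract_identity: Callable[[Sequence[float]], Vector]  # identity extraction module
    extract_texture: Callable[[Sequence[float]], Vector]   # texture extraction module
    generate: Callable[[Vector], Vector]                   # generator

    def stitch(self, identity: Vector, texture: Vector) -> Vector:
        """Stitching module (assumption: concatenation of the two vectors)."""
        return list(identity) + list(texture)

    def __call__(self, first_target: Sequence[float], second_target: Sequence[float]) -> Vector:
        identity = self.extract_identity(first_target)   # second processing unit
        texture = self.extract_texture(second_target)    # third processing unit
        stitched = self.stitch(identity, texture)        # fourth processing unit
        return self.generate(stitched)                   # fifth processing unit
```

Identity is read only from the first target image and texture only from the second, which is what lets the generated image carry one object's identity with the other's texture.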

The device according to claim 16, wherein
the stitching information includes a plurality of pieces of stitching information, the driving model includes N cascaded deep units, and N is an integer greater than 1, and
the fifth processing unit includes:
a processing subunit that, for an i-th deep unit among the N deep units, processes i-th level jump information corresponding to the i-th deep unit using an i-th generator to obtain i-th level feature information, wherein the i-th level jump information includes (i-1)-th level feature information and i-th level stitching information, and i is greater than 1 and less than or equal to N; and
a generation subunit that generates the image to be processed based on N-th level feature information.
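
The cascade can be sketched as a loop over the N levels. The claim defines jump information only for levels 2 to N, so feeding the first generator with the first-level stitching information alone is an assumption of this sketch, as are the function names and the use of concatenation to form the jump information.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Generator = Callable[[Vector], Vector]


def run_cascade(
    generators: Sequence[Generator],
    stitching_levels: Sequence[Vector],
    to_image: Callable[[Vector], Vector],
) -> Vector:
    """Run N cascaded deep units and build the image from level-N features."""
    n = len(generators)
    if n < 2 or n != len(stitching_levels):
        raise ValueError("need N > 1 generators and one stitching vector per level")
    # Level 1 (assumption): only the first-level stitching information is used.
    features = list(generators[0](list(stitching_levels[0])))
    for i in range(1, n):
        # Jump information: previous-level features plus this level's stitching.
        jump = features + list(stitching_levels[i])
        features = list(generators[i](jump))
    # Generation subunit: image from the N-th level feature information.
    return to_image(features)
```

Each level therefore sees both what the previous level produced and its own stitching information, in the manner of a skip connection.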

An electronic device, comprising:
at least one processor; and
a memory communicatively connected to the at least one processor,
wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to carry out the method according to any one of claims 1 to 10.

A non-transitory computer-readable storage medium having computer instructions stored thereon,
wherein the computer instructions cause a computer to perform the method according to any one of claims 1 to 10.

A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 10.