WO2023162131A1 - Image converting device, image converting method, and image converting program - Google Patents

Image converting device, image converting method, and image converting program

Info

Publication number
WO2023162131A1
WO2023162131A1 (application PCT/JP2022/007869)
Authority
WO
WIPO (PCT)
Prior art keywords
feature point
image
feature
point
points
Prior art date
Application number
PCT/JP2022/007869
Other languages
French (fr)
Japanese (ja)
Inventor
雄貴 蔵内
真奈 笹川
直紀 萩山
文香 佐野
隆二 山本
Original Assignee
日本電信電話株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電信電話株式会社
Priority to PCT/JP2022/007869
Publication of WO2023162131A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 1/00 General purpose image data processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis

Definitions

  • Embodiments of the present invention relate to an image conversion device, an image conversion method, and an image conversion program.
  • Non-Patent Document 1 discloses the possibility of manipulating emotional experience through real-time facial expression deformation feedback.
  • a subject's face is tracked in real time and natural facial expression deformation processing is performed.
  • the Rigid MLS (Moving Least Squares) method is used as an image transformation method to transform facial expressions in facial images.
  • the Rigid MLS method is a method of distorting an image by moving each control point using feature points in the image recognized from the image as control points.
  • the face image is an image obtained by photographing the face of the subject, an image obtained by extracting the face of a computer-generated avatar, or the like.
  • for example, there is a feature point recognition method in which only the upper side of the eyebrow is recognized and the lower side is not. If eyebrows recognized only on the upper side are moved upward by this image transformation, the resulting face image has thickened eyebrows, and only a face image with an unnatural expression is obtained.
  • similarly, if only one side of the double-eyelid width, the contour shadow, or the like is recognized, an unnatural expression results when the image is deformed.
  • the present invention seeks to provide an image conversion technique that enables conversion into an image with a natural expression even when only one side of the facial parts is recognized.
  • an image conversion device includes a control point generation section and an expression conversion section.
  • the control point generation unit adds second feature points, which are unrecognized feature points on the other side, based on first feature points, which are feature points on one side of facial parts recognized from an image of a person's face, and sets the first and second feature points as control points.
  • the facial expression transforming unit transforms the control points by a deformation amount according to the transformed facial expression to be transformed, thereby obtaining a transformed image in which the facial expression of the person is transformed.
  • according to one aspect of the invention, feature points on the other side of a facial part are added based on the feature points on the recognized side before the image is converted, so it is possible to provide an image conversion technique that enables conversion into a face image with a natural expression even when only one side of the facial part is recognized.
  • FIG. 1 is a block diagram showing an example of the configuration of an image conversion device according to one embodiment of the invention.
  • FIG. 2 is a diagram showing an example of the hardware configuration of the image conversion device.
  • FIG. 3 is a diagram showing an example of facial feature points.
  • FIG. 4 is a diagram showing an example of a storage form of feature points.
  • FIG. 5 is a diagram showing an example of a storage form of the amount of change.
  • FIG. 6 is a flow chart showing an example of an image conversion processing operation by the image conversion device.
  • FIG. 7 is a schematic diagram for explaining the relationship between the feature points of the eyebrows and the feature points above the eyes.
  • FIG. 8 is a schematic diagram for explaining a method of adding feature points on the lower side of eyebrows.
  • FIG. 9 is a schematic diagram for explaining a method of adding characteristic points of eyelids.
  • FIG. 1 is a block diagram showing an example of the configuration of an image conversion device 1 according to one embodiment of the invention.
  • the image conversion device 1 has an image acquisition section 11, a feature point recognition section 12, a control point generation section 13, a converted facial expression input section 14, a change amount storage section 15, a facial expression conversion section 16 and an image output section 17.
  • the image acquisition unit 11 acquires a face image from a web camera, avatar, or the like.
  • the image acquisition unit 11 outputs the acquired face image to the feature point recognition unit 12 and the facial expression conversion unit 16.
  • the feature point recognition unit 12 receives the face image acquired by the image acquisition unit 11, and recognizes feature points from the face image. A method of recognizing feature points in the feature point recognition unit 12 will be described later.
  • the feature point recognition unit 12 outputs the recognized feature points to the control point generation unit 13.
  • the control point generation unit 13 receives as input the first feature points recognized by the feature point recognition unit 12, and adds second feature points, which are unrecognized feature points, based on the input first feature points. For example, the control point generation unit 13 calculates the distance between an eyebrow feature point, which is a first feature point, and an eye feature point, and adds a second feature point below each eyebrow feature point at half the obtained distance. A method for adding the second feature points will be described in detail later.
  • the control point generator 13 outputs the first and second feature points to the facial expression converter 16 as control points. Which first feature points the second feature points are based on, and how many second feature points are added, are predetermined; therefore, the number of control points is also predetermined.
  • the converted facial expression input unit 14 acquires a converted facial expression, such as a smiling face, which is specified and input by the user from a user interface such as a keyboard.
  • the converted facial expression input unit 14 outputs the acquired converted facial expression to the facial expression conversion unit 16.
  • the change amount storage unit 15 stores in advance the change amount for each control point for each facial expression to be converted.
  • the amount of change is information indicating how much the control point should be moved.
  • the amount of change can be obtained in advance, for example, by having the user apply expression deformation processing to an expressionless face in a specific face image and adjusting it until a natural expression is obtained.
  • the facial expression transforming unit 16 receives the face image acquired by the image acquiring unit 11, the control points output by the control point generating unit 13, and the converted facial expression acquired by the converted facial expression input unit 14. The facial expression transforming unit 16 also reads from the change amount storage unit 15 the change amounts for the target expression indicated by the converted facial expression input from the converted facial expression input unit 14.
  • the facial expression transforming unit 16 obtains a face image with the converted expression by moving each control point in the input face image based on the read change amount for that control point.
  • the facial expression conversion section 16 outputs the converted face image to the image output section 17.
  • the image output unit 17 receives the face image after conversion from the facial expression conversion unit 16, and outputs the input face image.
  • output includes, for example, storing in a storage medium, displaying on a display, transmitting to another device via a communication network, and the like.
  • FIG. 2 is a diagram showing an example of the hardware configuration of the image conversion device 1.
  • the image conversion device 1 is composed of a computer such as a personal computer, a smartphone, or a server computer, for example.
  • the image conversion device 1 has a hardware processor 100 such as a CPU (Central Processing Unit), as shown in FIG. 2. Using a multi-core, multi-threaded CPU makes it possible to execute multiple information processes at the same time, and the processor 100 may include multiple CPUs.
  • a program memory 200, a data memory 300, a communication interface 400, and an input/output interface (input/output IF in FIG. 2) 500 are connected to the processor 100 via a bus 600.
  • the communication interface 400 can include, for example, one or more wired or wireless communication modules.
  • the communication interface 400 can communicate with other computers, web cameras, etc. connected via a cable or a network such as a LAN (Local Area Network) or the Internet.
  • an input unit 700 and a display unit 800 are connected to the input/output interface 500.
  • the input unit 700 includes input devices such as a keyboard, a pointing device such as a mouse, a sensor device such as a camera, and the like.
  • the display unit 800 is a display device such as a liquid crystal display, a CRT (Cathode Ray Tube) display, or the like.
  • the input unit 700 and the display unit 800 can also use what is called a tablet-type input/display device.
  • This type of input/display device is configured by arranging an input detection sheet adopting an electrostatic method or a pressure method on a display screen of a display device using liquid crystal or organic EL (Electro Luminescence), for example.
  • the input/output interface 500 inputs to the processor 100 the operation information input through the input unit 700 and causes the display unit 800 to display display information generated by the processor 100.
  • the input unit 700 and the display unit 800 may not be connected to the input/output interface 500.
  • the input unit 700 and the display unit 800 are provided with a communication unit for connecting to the communication interface 400 directly or via a network, so that information can be exchanged with the processor 100 .
  • the input/output interface 500 may have a read/write function for a recording medium such as a semiconductor memory, e.g., a flash memory, or may have a connection function with a reader/writer that has a read/write function for such a recording medium. Furthermore, the input/output interface 500 may have a connection function with other devices.
  • the program memory 200 is a non-transitory tangible computer-readable storage medium implemented as a combination of a non-volatile memory that can be written and read at any time and a non-volatile memory that can only be read.
  • Non-volatile memories that can be written and read at any time are, for example, HDDs (Hard Disk Drives), SSDs (Solid State Drives), and the like.
  • Non-volatile memory that can only be read at any time is, for example, ROM.
  • the program memory 200 stores a program necessary for the processor 100 to execute various control processes according to one embodiment, such as an image conversion program.
  • the processing function units of the image acquisition unit 11, the feature point recognition unit 12, the control point generation unit 13, the converted expression input unit 14, the change amount storage unit 15, the expression conversion unit 16, and the image output unit 17 can all be realized by causing the processor 100 to read and execute the image conversion program stored in the program memory 200.
  • some or all of these processing functions may instead be implemented in various other forms, including integrated circuits such as application-specific integrated circuits (ASICs) or field-programmable gate arrays (FPGAs).
  • the data memory 300 is used as a tangible computer-readable storage medium, for example, by combining the above nonvolatile memory and a volatile memory such as RAM (Random Access Memory).
  • This data memory 300 is used to store various data acquired and created in the process of performing various processes. That is, in the data memory 300, an area for storing various data is appropriately secured in the process of performing various processes.
  • as such areas, the data memory 300 can be provided with, for example, an acquired image storage unit 301, a feature point storage unit 302, a control point storage unit 303, a converted facial expression designation storage unit 304, a change amount storage unit 305, a converted image storage unit 306, and a temporary storage unit 307.
  • the acquired image storage unit 301 is used to store face images acquired when the processor 100 operates as the image acquisition unit 11 described above.
  • the feature point storage unit 302 is used to store feature points acquired when the processor 100 operates as the feature point recognition unit 12 described above.
  • FIG. 3 is a diagram showing an example of facial feature points.
  • the asterisks in FIG. 3 are feature points recognized by the processor 100, and the numbers attached to each feature point are unique feature point IDs for identifying each feature point.
  • the number of feature point IDs and the portion of the face for each feature point ID are determined by the feature point recognition method employed. For example, the feature point with the feature point ID "18" is predetermined as the left edge of the left eyebrow.
  • FIG. 4 is a diagram showing an example of a storage form of feature points in the feature point storage unit 302.
  • the feature point storage unit 302 stores x-coordinates and y-coordinates in the face image in a table format in association with feature point IDs. Coordinate values are in pixels. Therefore, in the example of FIG. 3, the feature point storage unit 302 stores the xy coordinates of feature points with feature point IDs "1" to "68".
  • the control point storage unit 303 is used to store control points generated when the processor 100 operates as the control point generation unit 13 described above.
  • the storage form of the control points in the control point storage unit 303 is the same as the storage form of the feature points in the feature point storage unit 302 shown in FIG. 4, for example. That is, the control point storage unit 303 can store the x-coordinate and y-coordinate in the face image in association with the control point ID in a table format.
  • the control point storage unit 303 uses the feature point IDs "1" to "68" assigned to the feature points shown in FIG. 3 directly as control point IDs "1" to "68", and stores the xy coordinates of those feature points in association with them.
  • the processor 100 also stores the xy coordinates of the second feature points, which are the added feature points, in association with control point IDs "69" onward.
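  • As a rough sketch (not part of the publication), the tables of FIGS. 3 and 4 could be held in memory as plain dictionaries keyed by ID; the coordinates below are invented for illustration, while the ID layout follows FIG. 3:

```python
# Hypothetical in-memory form of the FIG. 4 table: ID -> (x, y) in pixels.
feature_points = {
    1: (23, 45),    # a contour point (illustrative coordinates)
    18: (55, 60),   # left end of the eyebrow on the viewer's left
    37: (60, 78),   # upper-eye point paired with ID 18
}

# Control points start as a copy of the recognized (first) feature points;
# added (second) feature points are appended from control point ID 69 onward.
control_points = dict(feature_points)
control_points[69] = (55, 69)  # e.g., a point added below eyebrow point 18
```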
  • the converted facial expression specification storage unit 304 is used to store the converted facial expression specified by the user, which is acquired when the processor 100 operates as the above-described converted facial expression input unit 14.
  • the change amount storage unit 305 corresponds to the change amount storage unit 15 described above.
  • FIG. 5 is a diagram showing an example of the storage form of the amount of change in the amount of change storage unit 305.
  • the change amount storage unit 305 can take a table format that stores, for each converted facial expression, the change amount of the x-coordinate and the change amount of the y-coordinate in association with each control point ID.
  • change amounts are expressed in pixels.
  • the amount of change is represented by the direction and amount of movement of the control point. For example, a movement amount of "+1" represents a movement of 1 pixel in the positive direction.
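  • To make the lookup concrete, here is a minimal sketch of such a per-expression change table and its application to the control points; the values and the "smile" label are invented, not taken from FIG. 5:

```python
# Hypothetical change-amount table: expression -> {control point ID: (dx, dy)}.
change_amounts = {
    "smile": {1: (+1, +2), 18: (0, +3), 69: (0, +2)},
}

def move_control_points(control_points, expression):
    """Return the control points shifted by the change amounts for `expression`."""
    deltas = change_amounts[expression]
    moved = {}
    for cp_id, (x, y) in control_points.items():
        dx, dy = deltas.get(cp_id, (0, 0))  # a point without an entry stays put
        moved[cp_id] = (x + dx, y + dy)
    return moved
```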
  • the converted image storage unit 306 is used to store face images converted when the processor 100 operates as the facial expression conversion unit 16 described above.
  • the temporary storage unit 307 is used to store various intermediate data generated during the operation of the processor 100 that are not stored in the acquired image storage unit 301, the feature point storage unit 302, the control point storage unit 303, the converted facial expression designation storage unit 304, the change amount storage unit 305, or the converted image storage unit 306.
  • FIG. 6 is a flowchart showing an example of the image conversion processing operation by the image conversion device 1.
  • the processor 100 of the image conversion device 1 reads and executes the image conversion program stored in the program memory 200, thereby starting the operation of the image conversion device 1 shown in this flowchart. Execution of the image conversion program by the processor 100 is started when an instruction to perform image conversion is given from the input unit 700, via the input/output interface 500 or via the communication interface 400.
  • the processor 100 operates as the converted facial expression input unit 14 and waits for the user's specification input of a converted facial expression, such as a smile, which is the facial expression to be converted (step S1). For example, the processor 100 determines whether or not the input signal from the input unit 700 via the input/output interface 500 or the communication interface 400 includes a specified input of a converted facial expression. If there is an input specifying a converted facial expression, the processor 100 proceeds to the process of step S2.
  • the processor 100 stores the designated converted facial expression in the converted facial expression designation storage section 304 of the data memory 300 (step S2).
  • the processor 100 operates as the image acquisition unit 11 and acquires a face image (step S3).
  • for example, the processor 100 acquires, through the input/output interface 500, an image of the subject's face captured by the camera of the input unit 700.
  • alternatively, the processor 100 acquires, via the communication interface 400, a face image captured by a web camera connected to the network or an avatar face generated by another computer.
  • the processor 100 stores the acquired face image in the acquired image storage section 301 of the data memory 300.
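  • As a minimal illustration of the acquisition step (not the publication's code; it assumes OpenCV is installed and a local camera is available at index 0):

```python
import cv2  # OpenCV, assumed available

def acquire_face_image(camera_index=0):
    """Grab a single frame from a local camera as the input face image."""
    cap = cv2.VideoCapture(camera_index)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError("failed to read a frame from the camera")
    return frame  # BGR image as a numpy array
```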
  • the processor 100 operates as the feature point recognition unit 12 and recognizes the first feature points from the face image stored in the acquired image storage unit 301 (step S4).
  • the processor 100 uses the face_landmark_detection function of dlib (see, for example, http://dlib.net/face_landmark_detection.py.html) to recognize feature points in the face image.
  • the processor 100 extracts the gradient direction distribution of luminance called HOG (Histogram of Oriented Gradients) features from the input face image.
  • a model trained based on data in which HOG features and positions of facial feature points are associated is generally provided. Therefore, the processor 100 inputs the extracted HOG features into this learning model to obtain the positions of the feature points of the face.
  • the processor 100 stores the acquired positions of the first feature points in the feature point storage unit 302 of the data memory 300.
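  • The dlib-based recognition described above can be sketched as follows; this is a minimal example assuming dlib's HOG face detector and the publicly distributed 68-point model file shape_predictor_68_face_landmarks.dat, not the publication's own code:

```python
import dlib  # assumes dlib and its 68-point landmark model are installed

detector = dlib.get_frontal_face_detector()   # HOG-based face detector
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def recognize_feature_points(image):
    """Return {feature point ID (1-68): (x, y)} for the first detected face."""
    faces = detector(image, 1)  # upsample once to help find smaller faces
    if not faces:
        return {}
    shape = predictor(image, faces[0])
    # dlib indexes parts 0-67; the publication's feature point IDs run 1-68.
    return {i + 1: (shape.part(i).x, shape.part(i).y) for i in range(68)}
```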
  • the processor 100 operates as the control point generation unit 13 and generates control points (step S5). Specifically, the processor 100 stores the recognized first feature point in the control point storage unit 303 of the data memory 300 as a control point. Furthermore, the processor 100 adds a second feature point, which is the feature point on the other side, to the facial part for which the feature points on only one side have been recognized. The processor 100 then stores the added second feature points in the control point storage unit 303 as additional control points.
  • examples of facial parts for which feature points are recognized on only one side are the eyebrows, the eyelids, and the contour. Since only the upper feature points of the eyebrows are recognized, the processor 100 adds lower eyebrow feature points. Since only the upper feature points of the eyes, which form the lower edge of the double eyelid, are recognized, the processor 100 adds upper eyelid feature points. Since the contour is recognized without its shadow, the processor 100 adds feature points for the shadowed part.
  • the processor 100 adds eyebrow feature points as follows.
  • FIG. 7 is a schematic diagram for explaining the relationship between the feature points of the eyebrows and the feature points above the eyes.
  • the processor 100 drops a vertical line from each eyebrow feature point (feature point IDs "18" to "22" for one eyebrow) and finds, among the feature points on the upper side of the eye (feature point IDs "37" to "40"), the one closest to that vertical line.
  • in this way, the processor 100 pairs the feature point with ID "18" with the feature point with ID "37", the feature point with ID "19" with ID "37", the feature point with ID "20" with ID "38", the feature point with ID "21" with ID "39", the feature point with ID "22" with ID "40", and so on for the other eyebrow. The distance between each eyebrow feature point and its paired eye feature point is denoted d18, d19, d20, d21, d22, and so on.
  • FIG. 8 is a schematic diagram for explaining a method of adding the lower feature point, which is the second eyebrow feature point.
  • the processor 100 calculates the average distance da of the distances d18 to d27 between the above feature point pairs.
  • the average distance da may be obtained without distinguishing between the right eye and the left eye, or the average distance da may be obtained separately since there is generally a slight difference between the left and right eyes.
  • the processor 100 sets 1/2 of the average distance da thus calculated, that is, da/2, as the feature point addition distance d, and adds a second feature point the distance d below each first feature point of the eyebrows. That is, the processor 100 adds the second feature point with feature point ID "69" at the distance d below the first feature point with feature point ID "18". Similarly, the processor 100 adds the second feature point with ID "70" below the first feature point with ID "19", the second feature point with ID "71" below the first feature point with ID "20", and so on, through the second feature point with ID "78" below the first feature point with ID "27".
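  • A compact sketch of this step follows (again not the publication's code): pairings are found by horizontal proximity, i.e. the smallest |Δx| to the dropped vertical line, and the ID ranges are taken from FIG. 3 and the description above:

```python
import math

EYEBROW_IDS = list(range(18, 28))                          # IDs 18-27, both eyebrows
UPPER_EYE_IDS = list(range(37, 41)) + list(range(43, 47))  # upper-eye points

def add_lower_eyebrow_points(points, next_id=69):
    """Add second feature points da/2 below each eyebrow point (FIG. 8)."""
    dists = []
    for b in EYEBROW_IDS:
        bx, by = points[b]
        # the eye point closest to the vertical line through the eyebrow point
        e = min(UPPER_EYE_IDS, key=lambda i: abs(points[i][0] - bx))
        ex, ey = points[e]
        dists.append(math.hypot(ex - bx, ey - by))
    da = sum(dists) / len(dists)    # average eyebrow-to-eye distance
    d = da / 2                      # feature point addition distance
    added = {}
    for offset, b in enumerate(EYEBROW_IDS):
        bx, by = points[b]
        added[next_id + offset] = (bx, by + d)  # +y is downward in image coordinates
    return added, da
```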
  • the processor 100 adds feature points of the eyelids, for example, as follows.
  • FIG. 9 is a schematic diagram for explaining a method of adding the second feature point of the eyelid.
  • the recognized unilateral feature point of the eyelid is the upper feature point of the eye. Therefore, when adding the second eyelid feature points, the average distance da used when adding the second eyebrow feature points can be used.
  • the processor 100 sets 1/4 of the left and right average distances da, i.e., da/4, as the feature point addition distance d.
  • the processor 100 then adds a second feature point the distance d above each feature point on the upper side of the eye (the first feature points with feature point IDs "37" to "40" and "43" to "46").
  • that is, the processor 100 adds the second feature point with feature point ID "79" at the distance d above the first feature point with feature point ID "37". Similarly, the processor 100 adds the second feature point with ID "80" above the first feature point with ID "38", the second feature point with ID "81" above the first feature point with ID "39", and so on, through the second feature point with ID "86" above the first feature point with ID "46".
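  • Continuing the sketch above, the eyelid points reuse the average distance da with a quarter ratio; this is illustrative only, with IDs per FIG. 3:

```python
def add_upper_eyelid_points(points, da, next_id=79):
    """Add second feature points da/4 above each upper-eye point (FIG. 9)."""
    d = da / 4
    added = {}
    for offset, e in enumerate(UPPER_EYE_IDS):
        ex, ey = points[e]
        added[next_id + offset] = (ex, ey - d)  # -y is upward in image coordinates
    return added
```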
  • the processor 100 adds the second feature point of the contour shadow, for example, as follows. For example, the processor 100 adds a second feature point at a predetermined direction and distance for each contour feature point (first feature points with feature point IDs "1" to "17").
  • the processor 100 operates as the facial expression transforming unit 16 to transform the face image stored in the acquired image storage unit 301 (step S6). That is, the processor 100 transforms the face image using the control points stored in the control point storage unit 303 and the change amounts, stored in the change amount storage unit 305, that correspond to the converted facial expression held in the converted facial expression designation storage unit 304.
  • for example, the processor 100 utilizes an implementation of MLS (see, e.g., https://github.com/Jarvis73/Moving-Least-Squares) or the like. Specifically, the processor 100 moves each control point by the change amount corresponding to the converted facial expression stored in the converted facial expression designation storage unit 304.
  • suppose, for example, that the xy coordinates before conversion of the control point with control point ID "1" are (23, 45) (see FIG. 4), and that for the designated expression the change amount of the x-coordinate is "+1" and that of the y-coordinate is "+2" (see FIG. 5); the pixel of that control point is then moved to (24, 47).
  • the deformation around each point is modeled as an affine transformation:
    x' = a·x + b·y + t_x
    y' = c·x + d·y + t_y
    where x, y are the coordinates of a neighboring control point, x', y' are the coordinates obtained by adding the change amount to that control point's coordinates, a, b, c, d are the affine parameters, and t_x, t_y are the translation amounts.
  • the processor 100 determines the parameters a, b, c, d, t_x, t_y by least-squares optimization over the control point coordinates x, y and the shifted coordinates x', y' obtained by adding the change amounts. Then, taking the coordinates of each target point to be transformed as x and y, it obtains the transformed coordinates using the determined parameters.
  • in other words, the processor 100 uses the parameters a, b, c, d, t_x, and t_y obtained in this way to compute, by the above affine transformation, the post-transformation coordinates induced by the moved control points.
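  • The publication cites a Rigid MLS implementation; the following is a minimal sketch of the closely related affine MLS variant, which matches the parameters a, b, c, d, t_x, t_y above. The weighting scheme and alpha follow Schaefer et al.'s moving-least-squares formulation and are assumptions, not taken from the publication:

```python
import numpy as np

def mls_affine_warp(v, p, q, alpha=1.0, eps=1e-8):
    """Deform one point v with affine Moving Least Squares.

    v: (2,) point to transform; p: (n, 2) control points before the move;
    q: (n, 2) control points after adding the change amounts.
    """
    w = 1.0 / (np.sum((p - v) ** 2, axis=1) ** alpha + eps)  # distance weights
    p_star = w @ p / w.sum()                 # weighted centroid of p
    q_star = w @ q / w.sum()                 # weighted centroid of q
    p_hat, q_hat = p - p_star, q - q_star
    # Least-squares solve for the 2x2 matrix M (the a, b, c, d parameters);
    # the translation t_x, t_y is implicit in the centroids p_star, q_star.
    A = (p_hat * w[:, None]).T @ p_hat
    B = (p_hat * w[:, None]).T @ q_hat
    M = np.linalg.solve(A, B)
    return (v - p_star) @ M + q_star

# Usage sketch: warp every pixel coordinate of the face image, where
# p comes from the control point table and q = p + the FIG. 5 change amounts.
```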
  • the processor 100 stores the face image thus converted in the converted image storage section 306 of the data memory 300 as a converted image.
  • the processor 100 operates as the image output unit 17 and outputs the converted image stored in the converted image storage unit 306 (step S7).
  • the processor 100 causes the display unit 800 to display the face image via the input/output interface 500.
  • alternatively, the processor 100 transmits the image over the network via the communication interface 400 so that it is displayed on a display device connected to the network or on the display unit of another computer connected to the network.
  • the processor 100 determines whether or not to end the operation as the image conversion device 1 shown in this flowchart (step S8). For example, the processor 100 checks whether or not the user has instructed, from the input unit 700 via the input/output interface 500 or via the communication interface 400, to end the image conversion. If the operation is to end, the processor 100 ends the operation shown in this flowchart.
  • the processor 100 operates as the converted facial expression input unit 14 and determines whether or not the user has entered a change designation input for the converted facial expression (step S9). If there is no change specification input for the converted facial expression, the processor 100 proceeds to the process of step S3. Also, when there is an input specifying a change in the converted facial expression, the processor 100 proceeds to the process of step S2.
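  • Putting steps S1 to S9 together, the overall control flow might look like this schematic sketch, which reuses the hypothetical helpers above; wait_for_expression_input, transform_expression, output_image, end_requested, and check_expression_change are invented names, not the publication's code:

```python
def run_image_conversion():
    expression = wait_for_expression_input()              # S1-S2
    while True:
        image = acquire_face_image()                      # S3
        points = recognize_feature_points(image)          # S4
        lower, da = add_lower_eyebrow_points(points)      # S5: add second feature points
        points.update(lower)
        points.update(add_upper_eyelid_points(points, da))
        converted = transform_expression(image, points, expression)  # S6 (MLS warp)
        output_image(converted)                           # S7
        if end_requested():                               # S8
            break
        expression = check_expression_change(expression)  # S9
```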
  • the image conversion device 1 includes the control point generation section 13 and the facial expression conversion section 16 .
  • the control point generation unit 13 adds second feature points, which are unrecognized feature points on the other side, based on first feature points, which are feature points on one side of facial parts recognized from the image of a person's face, and sets the first and second feature points as control points.
  • the facial expression conversion unit 16 obtains a converted image in which the person's facial expression is converted by deforming the control points by a deformation amount according to the converted facial expression. Therefore, the image conversion apparatus 1 according to one embodiment adds the feature points on the unrecognized side of a facial part based on the feature points on the recognized side before converting the image, and so can provide an image conversion technique that enables conversion into a face image with a natural expression even when only one side of a facial part is recognized.
  • in one embodiment, the facial parts include at least the eyebrows or the eyelids.
  • accordingly, the lower eyebrow feature points and/or the upper eyelid feature points are added, making it possible to convert the image into a face image with a natural expression.
  • in one embodiment, the control point generation unit 13 calculates the feature point addition distance d based on the distance between the eyebrow feature points and the upper-eye feature points, which are first feature points, and adds the second feature points at the distance d below the eyebrow feature points.
  • likewise, the control point generation unit 13 calculates the feature point addition distance d based on the distance between the eyebrow feature points and the upper-eye feature points, which form the lower edge of the double eyelid, and adds the second feature points at the distance d above the eye feature points.
  • in one embodiment, the change amount storage unit 15 stores in advance, for each converted facial expression, a change amount representing the deformation amount of each control point, and the facial expression conversion unit 16 deforms the control points using the change amounts for the input converted facial expression.
  • accordingly, the control points corresponding to the second feature points are also moved, making it possible to convert the image into a face image with a natural expression.
  • in the embodiment described above, a fixed ratio of the average distance da, such as 1/2 or 1/4, is used as the feature point addition distance, but the user may specify an arbitrary value.
  • the user may be allowed to select which part of the face to add the second feature point, which is the feature point on the other side.
  • the method described in the embodiment can be stored, as a program (software means) executable by a computer, on a recording medium such as a magnetic disk (floppy (registered trademark) disk, hard disk, etc.), an optical disc (CD-ROM, DVD, MO, etc.), or a semiconductor memory (ROM, RAM, flash memory, etc.), and can also be transmitted and distributed via a communication medium.
  • the programs stored on the medium also include a setting program for configuring software means (including not only execution programs but also tables and data structures) to be executed by the computer.
  • a computer that realizes this apparatus reads the program recorded on the recording medium and, as needed, constructs the software means using a setting program; its operation is then controlled by the software means to execute the processes described above.
  • the term "recording medium” as used herein is not limited to those for distribution, and includes storage media such as magnetic disks, semiconductor memories, etc. provided in computers or devices connected via a network.
  • the present invention is not limited to the above embodiments, and can be modified in various ways without departing from the gist of the invention at the implementation stage.
  • each embodiment may be implemented in combination as much as possible, in which case the combined effect can be obtained.
  • the above-described embodiments include inventions at various stages, and various inventions can be extracted by appropriately combining a plurality of disclosed constituent elements.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)

Abstract

An image converting device according to one embodiment comprises a control point generating unit and an expression converting unit. On the basis of first feature points on one side of a face part recognized from an image of a person's face, the control point generating unit adds second feature points on the other, unrecognized side, and sets the first and second feature points as control points. The expression converting unit deforms the control points by a deformation amount corresponding to the target expression, thereby obtaining a converted image in which the expression of the person's face is converted.

Description

Image conversion device, image conversion method and image conversion program
 Embodiments of the present invention relate to an image conversion device, an image conversion method, and an image conversion program.
 Non-Patent Document 1 discloses the possibility of manipulating emotional experience through real-time facial expression deformation feedback. In Non-Patent Document 1, a subject's face is tracked in real time and natural facial expression deformation processing is performed, using the Rigid MLS (Moving Least Squares) method as the image transformation method to transform facial expressions in face images. The Rigid MLS method distorts an image by treating feature points recognized in the image as control points and moving each control point. The face image is, for example, an image of the subject's face or an image extracted from the face of a computer-generated avatar.
 When recognizing facial parts, some feature point recognition methods recognize only one side of a facial part. If facial parts recognized by such a method are moved using the Rigid MLS method, only face images with unnatural expressions can be obtained.
 For example, there is a feature point recognition method in which only the upper side of the eyebrow is recognized and the lower side is not. If eyebrows recognized only on the upper side are moved upward by this image transformation, the resulting face image has thickened eyebrows and an unnatural expression. Similarly, if only one side of the double-eyelid width, the contour shadow, or the like is recognized, an unnatural expression results when the image is deformed.
 The present invention seeks to provide an image conversion technique that enables conversion into an image with a natural expression even when only one side of a facial part is recognized.
 To solve the above problem, an image conversion device according to one aspect of the present invention includes a control point generation unit and a facial expression conversion unit. The control point generation unit adds second feature points, which are unrecognized feature points on the other side, based on first feature points, which are feature points on one side of facial parts recognized from an image of a person's face, and sets the first and second feature points as control points. The facial expression conversion unit obtains a converted image in which the person's facial expression is converted by deforming the control points by a deformation amount according to the converted facial expression.
 According to one aspect of the present invention, feature points on the other side of a facial part are added based on the feature points on the recognized side before the image is converted, so it is possible to provide an image conversion technique that enables conversion into a face image with a natural expression even when only one side of the facial part is recognized.
 FIG. 1 is a block diagram showing an example of the configuration of an image conversion device according to one embodiment of the invention. FIG. 2 is a diagram showing an example of the hardware configuration of the image conversion device. FIG. 3 is a diagram showing an example of facial feature points. FIG. 4 is a diagram showing an example of a storage form of feature points. FIG. 5 is a diagram showing an example of a storage form of the amount of change. FIG. 6 is a flowchart showing an example of an image conversion processing operation by the image conversion device. FIG. 7 is a schematic diagram for explaining the relationship between the feature points of the eyebrows and the feature points above the eyes. FIG. 8 is a schematic diagram for explaining a method of adding feature points on the lower side of the eyebrows. FIG. 9 is a schematic diagram for explaining a method of adding feature points of the eyelids.
[One Embodiment]
 An embodiment according to the present invention will be described below with reference to the drawings.
 (Configuration Example)
 FIG. 1 is a block diagram showing an example of the configuration of an image conversion device 1 according to one embodiment of the invention. The image conversion device 1 has an image acquisition unit 11, a feature point recognition unit 12, a control point generation unit 13, a converted facial expression input unit 14, a change amount storage unit 15, a facial expression conversion unit 16, and an image output unit 17.
 The image acquisition unit 11 acquires a face image from a web camera, avatar, or the like, and outputs the acquired face image to the feature point recognition unit 12 and the facial expression conversion unit 16.
 The feature point recognition unit 12 receives the face image acquired by the image acquisition unit 11 and recognizes feature points from the face image. The method of recognizing feature points in the feature point recognition unit 12 will be described later. The feature point recognition unit 12 outputs the recognized feature points to the control point generation unit 13.
 The control point generation unit 13 receives as input the first feature points recognized by the feature point recognition unit 12, and adds second feature points, which are unrecognized feature points, based on the input first feature points. For example, the control point generation unit 13 calculates the distance between an eyebrow feature point, which is a first feature point, and an eye feature point, and adds a second feature point below each eyebrow feature point at half the obtained distance. The method for adding the second feature points will be described in detail later. The control point generation unit 13 outputs the first and second feature points to the facial expression conversion unit 16 as control points. Which first feature points the second feature points are based on, and how many second feature points are added, are predetermined; therefore, the number of control points is also predetermined.
 The converted facial expression input unit 14 acquires a converted facial expression, such as a smiling face, which the user specifies and inputs from a user interface such as a keyboard, and outputs the acquired converted facial expression to the facial expression conversion unit 16.
 The change amount storage unit 15 stores in advance the change amount of each control point for each target expression. The change amount is information indicating how much a control point should be moved. It can be obtained in advance, for example, by having the user apply expression deformation processing to an expressionless face in a specific face image and adjusting it until a natural expression is obtained.
 The facial expression conversion unit 16 receives the face image acquired by the image acquisition unit 11, the control points output by the control point generation unit 13, and the converted facial expression acquired by the converted facial expression input unit 14. The facial expression conversion unit 16 also reads from the change amount storage unit 15 the change amounts for the expression indicated by the input converted facial expression, and obtains a face image with the converted expression by moving each control point in the input face image based on the read change amounts. The facial expression conversion unit 16 outputs the converted face image to the image output unit 17.
 The image output unit 17 receives the converted face image from the facial expression conversion unit 16 and outputs it. Here, output includes, for example, storing in a storage medium, displaying on a display, and transmitting to another device via a communication network.
 FIG. 2 is a diagram showing an example of the hardware configuration of the image conversion device 1.
 The image conversion device 1 is composed of a computer such as a personal computer, a smartphone, or a server computer, for example. As shown in FIG. 2, the image conversion device 1 has a hardware processor 100 such as a CPU (Central Processing Unit). Using a multi-core, multi-threaded CPU makes it possible to execute multiple information processes at the same time, and the processor 100 may include multiple CPUs. In the image conversion device 1, a program memory 200, a data memory 300, a communication interface 400, and an input/output interface (input/output IF in FIG. 2) 500 are connected to the processor 100 via a bus 600.
 The communication interface 400 can include, for example, one or more wired or wireless communication modules, and can communicate with other computers, web cameras, and the like connected via a cable or a network such as a LAN (Local Area Network) or the Internet.
 An input unit 700 and a display unit 800 are connected to the input/output interface 500. The input unit 700 includes input devices such as a keyboard, pointing devices such as a mouse, and sensor devices such as a camera. The display unit 800 is a display device such as a liquid crystal display or a CRT (Cathode Ray Tube) display. A so-called tablet-type input/display device can also be used as the input unit 700 and the display unit 800; this type of device is configured by arranging an input detection sheet using an electrostatic or pressure method on the display screen of a display device using, for example, liquid crystal or organic EL (Electro Luminescence). The input/output interface 500 inputs to the processor 100 the operation information entered through the input unit 700 and causes the display unit 800 to display the display information generated by the processor 100.
 Note that the input unit 700 and the display unit 800 need not be connected to the input/output interface 500; by providing a communication unit for connecting to the communication interface 400 directly or via a network, they can exchange information with the processor 100.
 The input/output interface 500 may also have a read/write function for a recording medium such as a semiconductor memory, e.g., a flash memory, or a connection function with a reader/writer that has a read/write function for such a recording medium. Furthermore, the input/output interface 500 may have a connection function with other devices.
 The program memory 200 is a non-transitory tangible computer-readable storage medium implemented as a combination of a non-volatile memory that can be written and read at any time, such as an HDD (Hard Disk Drive) or SSD (Solid State Drive), and a non-volatile memory that can only be read, such as a ROM. The program memory 200 stores the programs necessary for the processor 100 to execute the various control processes according to the embodiment, such as an image conversion program. The processing functions of the image acquisition unit 11, the feature point recognition unit 12, the control point generation unit 13, the converted facial expression input unit 14, the change amount storage unit 15, the facial expression conversion unit 16, and the image output unit 17 can all be realized by causing the processor 100 to read and execute the image conversion program stored in the program memory 200. Some or all of these processing functions may instead be implemented in various other forms, including integrated circuits such as application-specific integrated circuits (ASICs) or field-programmable gate arrays (FPGAs).
 The data memory 300 is a tangible computer-readable storage medium implemented, for example, as a combination of the above non-volatile memory and a volatile memory such as a RAM (Random Access Memory). It is used to store the various data acquired and created in the course of processing; areas for storing such data are secured in the data memory 300 as needed. As such areas, the data memory 300 can be provided with, for example, an acquired image storage unit 301, a feature point storage unit 302, a control point storage unit 303, a converted facial expression designation storage unit 304, a change amount storage unit 305, a converted image storage unit 306, and a temporary storage unit 307.
 The acquired image storage unit 301 is used to store face images acquired when the processor 100 operates as the image acquisition unit 11 described above.
 The feature point storage unit 302 is used to store feature points acquired when the processor 100 operates as the feature point recognition unit 12 described above.
 FIG. 3 is a diagram showing an example of facial feature points. The asterisks in FIG. 3 are feature points recognized by the processor 100, and the number attached to each feature point is a unique feature point ID for identifying it. The number of feature point IDs and the part of the face corresponding to each feature point ID are determined by the feature point recognition method employed; for example, the feature point with feature point ID "18" is predetermined to be the left end of the eyebrow on the viewer's left.
 FIG. 4 is a diagram showing an example of the storage form of feature points in the feature point storage unit 302. As shown in FIG. 4, the feature point storage unit 302 stores, in a table format, the x-coordinate and y-coordinate in the face image in association with each feature point ID. Coordinate values are in pixels. In the example of FIG. 3, the feature point storage unit 302 therefore stores the xy coordinates of the feature points with feature point IDs "1" to "68".
 The control point storage unit 303 is used to store control points generated when the processor 100 operates as the control point generation unit 13 described above. The storage form of the control points in the control point storage unit 303 is, for example, the same as the storage form of the feature points in the feature point storage unit 302 shown in FIG. 4; that is, the control point storage unit 303 can store, in a table format, the x-coordinate and y-coordinate in the face image in association with each control point ID. The control point storage unit 303 uses the feature point IDs "1" to "68" assigned in FIG. 3 directly as control point IDs "1" to "68" and stores the corresponding xy coordinates in association with them. The processor 100 also stores the xy coordinates of the added second feature points in association with control point IDs "69" onward.
 The converted facial expression designation storage unit 304 is used to store the converted facial expression specified by the user, acquired when the processor 100 operates as the converted facial expression input unit 14 described above.
 The change amount storage unit 305 corresponds to the change amount storage unit 15 described above.
 FIG. 5 is a diagram showing an example of the storage form of the amount of change in the change amount storage unit 305. As shown in FIG. 5, the change amount storage unit 305 can take a table format that stores, for each converted facial expression, the change amount of the x-coordinate and the change amount of the y-coordinate in association with each control point ID. Change amounts are expressed in pixels and are represented by the direction and amount of movement of the control point; for example, a movement amount of "+1" represents a movement of 1 pixel in the positive direction.
 The converted image storage unit 306 is used to store face images converted when the processor 100 operates as the facial expression conversion unit 16 described above.
 The temporary storage unit 307 is used to store various intermediate data generated during the operation of the processor 100 that are not stored in the acquired image storage unit 301, the feature point storage unit 302, the control point storage unit 303, the converted facial expression designation storage unit 304, the change amount storage unit 305, or the converted image storage unit 306.
 (Operation)
 Next, the operation of the image conversion device 1 will be described.
 FIG. 6 is a flowchart showing an example of the image conversion processing operation performed by the image conversion device 1. The processor 100 of the image conversion device 1 reads and executes the image conversion program stored in the program memory 200, thereby starting the operation of the image conversion device 1 shown in this flowchart. Execution of the image conversion program by the processor 100 is started when an instruction to perform image conversion is given from the input unit 700 via the input/output interface 500 or via the communication interface 400.
 The processor 100 operates as the converted facial expression input unit 14 and waits for the user to designate a converted facial expression, that is, the target expression such as a smile into which the image is to be converted (step S1). For example, the processor 100 determines whether an input signal from the input unit 700, received via the input/output interface 500 or the communication interface 400, includes the designation of a converted facial expression. If a converted facial expression has been designated, the processor 100 proceeds to step S2.
 The processor 100 stores the designated converted facial expression in the converted facial expression designation storage unit 304 of the data memory 300 (step S2).
 The processor 100 operates as the image acquisition unit 11 and acquires a face image (step S3). For example, the processor 100 acquires, via the input/output interface 500, an image of the subject's face captured by the camera of the input unit 700. Alternatively, the processor 100 acquires, via the communication interface 400, a face image captured by a web camera connected to a network or the face of an avatar generated by another computer. The processor 100 stores the acquired face image in the acquired image storage unit 301 of the data memory 300.
 The processor 100 operates as the feature point recognition unit 12 and recognizes the first feature points from the face image stored in the acquired image storage unit 301 (step S4). The processor 100 recognizes the feature points in the face image by using, for example, dlib's face_landmark_detection function (see, for example, http://dlib.net/face_landmark_detection.py.html). Specifically, the processor 100 extracts from the input face image a distribution of luminance gradient directions called HOG (Histogram of Oriented Gradients) features. Models trained on data that associate HOG features with the positions of facial feature points are publicly available. The processor 100 therefore inputs the extracted HOG features into such a trained model and obtains the positions of the facial feature points. The processor 100 stores the positions of the acquired first feature points in the feature point storage unit 302 of the data memory 300.
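 As one concrete possibility, step S4 can be realized with dlib's standard 68-point landmark detector, which combines the HOG-based face detector with a trained shape predictor as described above. The following is a minimal sketch, assuming the pre-trained model file shape_predictor_68_face_landmarks.dat distributed for dlib has been downloaded; the file names and paths are assumptions of this sketch.

    import cv2
    import dlib

    # HOG-based frontal face detector and a trained 68-point landmark model.
    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

    def recognize_first_feature_points(image_path: str) -> dict[int, tuple[int, int]]:
        """Return {feature point ID (1-68): (x, y) in pixels} for the first detected face."""
        img = cv2.imread(image_path)
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        faces = detector(gray, 1)          # upsample once to catch smaller faces
        if not faces:
            return {}
        shape = predictor(gray, faces[0])  # 68 landmarks, 0-indexed in dlib
        # Re-index to the 1-based feature point IDs used in this document.
        return {i + 1: (shape.part(i).x, shape.part(i).y) for i in range(68)}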
 The processor 100 operates as the control point generation unit 13 and generates control points (step S5). Specifically, the processor 100 stores the recognized first feature points in the control point storage unit 303 of the data memory 300 as control points. Furthermore, for each facial part whose feature points are recognized on only one side, the processor 100 adds second feature points, which are the feature points on the other side. The processor 100 then also stores these added second feature points in the control point storage unit 303 as additional control points.
 Facial parts whose feature points are recognized on only one side include, for example, the eyebrows, the eyelids, and the contour. For an eyebrow, only the feature points on its upper side are recognized, so the processor 100 adds feature points on the lower side of the eyebrow. For an eyelid, only the feature points on the upper side of the eye, which correspond to the lower line of a double eyelid, are recognized, so the processor 100 adds feature points on the upper side. For the contour, the unshadowed portion is recognized as feature points, so the processor 100 adds feature points for the shadowed portion.
 For example, the processor 100 adds the eyebrow feature points as follows.
 FIG. 7 is a schematic diagram for explaining the relationship between the feature points of the eyebrows and the feature points above the eyes. The processor 100 drops a perpendicular from each feature point of an eyebrow (the feature points with feature point IDs "18" to "22") and obtains, from among the feature points above the eye (the feature points with feature point IDs "37" to "40"), the feature point closest to that perpendicular. That is, the processor 100 obtains the feature point with feature point ID "37" for the feature point with feature point ID "18", the feature point with feature point ID "37" for the feature point with feature point ID "19", the feature point with feature point ID "38" for the feature point with feature point ID "20", the feature point with feature point ID "39" for the feature point with feature point ID "21", the feature point with feature point ID "40" for the feature point with feature point ID "22", ..., and the feature point with feature point ID "46" for the feature point with feature point ID "27". The distances between the paired feature points (the differences in their ordinates) are thereby obtained as d18, d19, d20, d21, d22, ..., d27.
 FIG. 8 is a schematic diagram for explaining the method of adding the lower feature points, which are the second feature points of the eyebrows. The processor 100 calculates the average distance da of the distances d18 to d27 between the above feature points. The average distance da may be computed without distinguishing between the right eye and the left eye; alternatively, since there is generally a slight difference between the left and right eyes, it may be computed separately for each side. Here, the average distance is computed separately for each side. That is, the processor 100 calculates the average distance as da = (d18 + d19 + d20 + d21 + d22)/5 for the left eyebrow as viewed, and as da = (d23 + d24 + d25 + d26 + d27)/5 for the right eyebrow as viewed.
 The processor 100 sets half of the average distance da thus calculated, that is, da/2, as the feature point addition distance d, and adds a second feature point below each first feature point of the eyebrow at the feature point addition distance d. That is, the processor 100 adds a second feature point, which becomes feature point ID "69", at the feature point addition distance d below the first feature point with feature point ID "18". Similarly, the processor 100 adds the second feature point with feature point ID "70" below the first feature point with feature point ID "19", the second feature point with feature point ID "71" below the first feature point with feature point ID "20", ..., and the second feature point with feature point ID "78" below the first feature point with feature point ID "27".
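 The following sketch illustrates this addition step for one eyebrow. It assumes the recognized coordinates are available as a dictionary keyed by feature point ID (as in the feature point storage unit 302) and that the y-axis points downward in image coordinates; the pairing of eyebrow and eye-top IDs follows FIG. 7, and all function and variable names are illustrative assumptions.

    EYEBROW_IDS = [18, 19, 20, 21, 22]  # upper eyebrow points (one side)
    EYE_TOP_IDS = [37, 37, 38, 39, 40]  # nearest eye-top point for each, per FIG. 7

    def add_eyebrow_points(points: dict[int, tuple[int, int]],
                           next_id: int = 69,
                           ratio: float = 0.5) -> dict[int, tuple[int, int]]:
        """Add second feature points below the eyebrow at d = ratio * da."""
        # Average vertical distance da between eyebrow and eye-top points.
        da = sum(points[e][1] - points[b][1]
                 for b, e in zip(EYEBROW_IDS, EYE_TOP_IDS)) / len(EYEBROW_IDS)
        d = ratio * da  # feature point addition distance (da/2 by default)
        added = {}
        for i, b in enumerate(EYEBROW_IDS):
            x, y = points[b]
            added[next_id + i] = (x, int(round(y + d)))  # +y is downward
        return added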
 The processor 100 also adds the eyelid feature points, for example, as follows.
 FIG. 9 is a schematic diagram for explaining the method of adding the second feature points of the eyelids. The feature points recognized on one side of an eyelid are the feature points on the upper side of the eye. Therefore, the average distance da used when adding the second feature points of the eyebrows can also be used when adding the second feature points of the eyelids. The processor 100 sets one quarter of this per-side average distance da, that is, da/4, as the feature point addition distance d. The processor 100 adds a second feature point above each feature point on the upper side of the eye (the first feature points with feature point IDs "37" to "40") at the feature point addition distance d. That is, the processor 100 adds a second feature point, which becomes feature point ID "79", at the feature point addition distance d above the first feature point with feature point ID "37". Similarly, the processor 100 adds the second feature point with feature point ID "80" above the first feature point with feature point ID "38", the second feature point with feature point ID "81" above the first feature point with feature point ID "39", ..., and the second feature point with feature point ID "86" above the first feature point with feature point ID "46".
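 Under the same assumptions as the eyebrow sketch above, the eyelid step differs only in the ratio (da/4 instead of da/2), the direction (upward instead of downward), and the anchor points; again, the names are illustrative.

    def add_eyelid_points(points: dict[int, tuple[int, int]],
                          da: float,
                          next_id: int = 79) -> dict[int, tuple[int, int]]:
        """Add second feature points above the eye-top points at d = da / 4."""
        d = da / 4.0
        added = {}
        for i, e in enumerate([37, 38, 39, 40]):  # upper-eye first feature points (one side)
            x, y = points[e]
            added[next_id + i] = (x, int(round(y - d)))  # -y is upward
        return added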
 As for the shadow along the contour, its size does not vary greatly between individuals, so the processor 100 adds the second feature points for the contour shadow, for example, as follows. For each contour feature point (the first feature points with feature point IDs "1" to "17"), the processor 100 adds a second feature point at a position in a predetermined direction and at a predetermined distance.
 The processor 100 operates as the facial expression conversion unit 16 and converts the expression of the face image stored in the acquired image storage unit 301 (step S6). That is, the processor 100 converts the face image based on the control points stored in the control point storage unit 303 and on the change amounts, stored in the change amount storage unit 305, that correspond to the converted facial expression stored in the converted facial expression designation storage unit 304. For example, the processor 100 uses an implementation of MLS (see, for example, https://github.com/Jarvis73/Moving-Least-Squares).
 Specifically, the processor 100 moves each control point by the change amount corresponding to the converted facial expression stored in the converted facial expression designation storage unit 304. For example, when converting the expression to a smile, the xy coordinates of the control point with control point ID "1" before conversion are (23, 45) (see FIG. 4), so the processor 100 changes the x-coordinate by "+1" and the y-coordinate by "+2" (see FIG. 5), performing a conversion that moves the pixel of that control point to (24, 47).
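 A minimal sketch of this per-control-point displacement follows, reusing the table layout assumed in the earlier change amount sketch. It computes the deformed target positions that are then handed to the MLS deformation; the names are illustrative assumptions, not the actual program's.

    def displaced_control_points(control_points: dict[int, tuple[int, int]],
                                 changes: dict[int, tuple[int, int]]
                                 ) -> dict[int, tuple[int, int]]:
        """Apply the (dx, dy) change amount of one expression to every control point."""
        moved = {}
        for cid, (x, y) in control_points.items():
            dx, dy = changes.get(cid, (0, 0))  # leave points without an entry in place
            moved[cid] = (x + dx, y + dy)
        return moved

    # e.g. with changes = change_amount_table["smile"], control point 1
    # at (23, 45) and change (+1, +2) moves to (24, 47).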
 For points other than the control points, the processor 100 applies the following affine transformation (which includes the Helmert transformation, i.e., similarity transformation, and rigid deformation):
\[
\begin{pmatrix} x' \\ y' \end{pmatrix}
=
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
+
\begin{pmatrix} t_x \\ t_y \end{pmatrix}
\]
 Here, x and y are the coordinates of a neighboring control point; x' and y' are that control point's coordinates with the change amount added; a, b, c, and d are parameters; and t_x and t_y are translation parameters. The processor 100 computes the mean squared residual between the transformed control point coordinates x, y and the target coordinates x', y' obtained by adding the change amounts, and finds the parameters a, b, c, d, t_x, t_y that minimize it by global optimization. Then, taking the coordinates of a target point to be converted as x, y, the processor 100 obtains the converted coordinates using the obtained parameters. Using the parameters a, b, c, d, t_x, t_y obtained in this way, the processor 100 also obtains the converted coordinates of the added control points by the above affine transformation.
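 Since the residual of this fit is linear in the parameters a, b, c, d, t_x, t_y, the minimization can be written as an ordinary linear least-squares problem. The following sketch solves it with numpy's least-squares routine as one possible optimizer; this solver choice and all names are assumptions of the sketch, not the method prescribed by the embodiment.

    import numpy as np

    def fit_affine(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
        """Fit x' = a*x + b*y + tx, y' = c*x + d*y + ty in the least-squares sense.

        src: (N, 2) control point coordinates (x, y)
        dst: (N, 2) target coordinates (x', y'), i.e. src plus the change amounts
        Returns the parameter vector [a, b, tx, c, d, ty].
        """
        n = len(src)
        A = np.zeros((2 * n, 6))
        A[0::2, 0:2] = src   # rows for x': a*x + b*y + tx
        A[0::2, 2] = 1.0
        A[1::2, 3:5] = src   # rows for y': c*x + d*y + ty
        A[1::2, 5] = 1.0
        b = dst.reshape(-1)  # interleaved [x'0, y'0, x'1, y'1, ...]
        params, *_ = np.linalg.lstsq(A, b, rcond=None)
        return params

    def apply_affine(params: np.ndarray, pts: np.ndarray) -> np.ndarray:
        """Apply the fitted affine transformation to (N, 2) points."""
        a, b_, tx, c, d, ty = params
        x, y = pts[:, 0], pts[:, 1]
        return np.stack([a * x + b_ * y + tx, c * x + d * y + ty], axis=1)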
 The processor 100 stores the face image converted in this way in the converted image storage unit 306 of the data memory 300 as the converted image.
 The processor 100 operates as the image output unit 17 and outputs the converted image stored in the converted image storage unit 306 (step S7). For example, the processor 100 causes the display unit 800 to display the face image via the input/output interface 500. Alternatively, the processor 100 transmits the image over a network via the communication interface 400 and causes it to be displayed on a display device connected to the network or on the display unit of another computer connected to the network.
 The processor 100 determines whether to end the operation as the image conversion device 1 shown in this flowchart (step S8). For example, the processor 100 checks whether the user has instructed it, from the input unit 700 via the input/output interface 500 or via the communication interface 400, to end the image conversion. When ending the operation, the processor 100 ends the operation shown in this flowchart.
 If, on the other hand, the operation is not yet to be ended, the processor 100 operates as the converted facial expression input unit 14 and determines whether the user has input a change of the designated converted facial expression (step S9). If no change of the converted facial expression has been input, the processor 100 proceeds to step S3. If a change of the converted facial expression has been input, the processor 100 proceeds to step S2.
 The image conversion device 1 according to the embodiment described above includes the control point generation unit 13 and the facial expression conversion unit 16. The control point generation unit 13 adds, based on the first feature points, which are the feature points on one side of a facial part recognized from an image of a person's face, second feature points, which are the unrecognized feature points on the other side, and uses the first and second feature points as control points. The facial expression conversion unit 16 obtains a converted image in which the person's facial expression has been converted by deforming the control points by a deformation amount corresponding to the converted facial expression to be converted into.
 Accordingly, the image conversion device 1 according to the embodiment converts the image after adding the feature points on the other side of a facial part based on the feature points on its one side. It can therefore provide an image conversion technique that enables conversion into a face image with a natural expression even when only one side of a facial part is recognized.
 Furthermore, in the image conversion device 1 according to the embodiment, the facial parts include at least the eyebrows or the eyelids.
 Thus, even when only the feature points on the upper side of an eyebrow and/or on the lower side of an eyelid can be recognized, the feature points on the lower side of the eyebrow and/or on the upper side of the eyelid are added, making it possible to convert the image into a face image with a natural expression.
 Here, the control point generation unit 13 calculates the feature point addition distance d based on the distance between the feature points on the upper side of the eyebrow, which are first feature points, and the feature points on the upper side of the eye, and adds, as second feature points, points on the lower side of the eyebrow separated by the feature point addition distance d from the first feature points on the upper side of the eyebrow.
 The unrecognized second feature points on the lower side of the eyebrow can therefore be added easily.
 Alternatively, the control point generation unit 13 calculates the feature point addition distance d based on the distance between the feature points on the upper side of the eyebrow and the feature points on the lower side of the eyelid, which are first feature points, and adds, as second feature points, points on the upper side of the eyelid separated by the feature point addition distance d from the first feature points on the lower side of the eyelid.
 The unrecognized second feature points on the upper side of the eyelid can therefore be added easily.
 The image conversion device 1 according to the embodiment further includes the change amount storage unit 15, which stores in advance, for each converted facial expression to be converted into, the change amount representing the deformation amount for each control point, and the converted facial expression input unit 14, which inputs the converted facial expression to be converted into. The facial expression conversion unit 16 reads the change amounts corresponding to the input converted facial expression from the change amount storage unit and obtains the converted image using the read change amounts.
 In this way, by also storing change amounts in advance for the control points corresponding to the added second feature points, those control points can likewise be used to convert the image into a face image with a natural expression.
 [Other embodiments]
 Note that the present invention is not limited to the embodiment described above.
 For example, the flow of each process described above is not limited to the described procedure; the order of some steps may be changed, and some steps may be performed in parallel.
 In addition, although the flow of each process described above converts, in real time, the expression in a face image acquired in real time, the same processing can equally be applied not in real time but to converting the expression in a stored face image.
 When adding the second feature points, fixed values of 1/2 and 1/4 of the average distance da are used, but the user may be allowed to specify an arbitrary value.
 The user may also be allowed to select the facial parts to which the second feature points, that is, the feature points on the other side, are added.
 The methods described in the embodiment can be stored, as a program (software means) executable by a computer, in a recording medium such as a magnetic disk (floppy (registered trademark) disk, hard disk, etc.), an optical disc (CD-ROM, DVD, MO, etc.), or a semiconductor memory (ROM, RAM, flash memory, etc.), and can also be transmitted and distributed via a communication medium. The programs stored on the medium side include a setting program that configures, within the computer, the software means (including not only execution programs but also tables and data structures) to be executed by the computer. A computer that realizes the present device reads the program recorded on the recording medium, builds the software means by the setting program as the case may be, and executes the processes described above with its operation controlled by the software means. The recording medium referred to in this specification is not limited to one for distribution, and includes storage media such as magnetic disks and semiconductor memories provided inside the computer or in devices connected to it via a network.
 In short, the present invention is not limited to the above embodiment and can be modified in various ways at the implementation stage without departing from its gist. The embodiments may also be combined as appropriate wherever possible, in which case the combined effects are obtained. Furthermore, the above embodiment includes inventions at various stages, and various inventions can be extracted by appropriately combining the disclosed constituent features.
DESCRIPTION OF SYMBOLS
   1…Image conversion device
  11…Image acquisition unit
  12…Feature point recognition unit
  13…Control point generation unit
  14…Converted facial expression input unit
  15…Change amount storage unit
  16…Facial expression conversion unit
  17…Image output unit
 100…Processor
 200…Program memory
 300…Data memory
 301…Acquired image storage unit
 302…Feature point storage unit
 303…Control point storage unit
 304…Converted facial expression designation storage unit
 305…Change amount storage unit
 306…Converted image storage unit
 307…Temporary storage unit
 400…Communication interface
 500…Input/output interface
 600…Bus
 700…Input unit
 800…Display unit

Claims (7)

  1.  An image conversion device comprising:
      a control point generation unit that adds, based on first feature points that are feature points on one side of a facial part recognized from an image of a person's face, second feature points that are unrecognized feature points on the other side, and uses the first and second feature points as control points; and
      a facial expression conversion unit that obtains a converted image in which the facial expression of the person has been converted by deforming the control points by a deformation amount corresponding to a converted facial expression to be converted into.
  2.  The image conversion device according to claim 1, wherein the facial part includes at least an eyebrow or an eyelid.
  3.  The image conversion device according to claim 2, wherein the control point generation unit
      calculates a feature point addition distance based on a distance between a feature point on an upper side of the eyebrow, which is one of the first feature points, and a feature point on an upper side of an eye, and
      adds, as one of the second feature points, a point on a lower side of the eyebrow separated by the feature point addition distance from the feature point on the upper side of the eyebrow, which is one of the first feature points.
  4.  The image conversion device according to claim 2, wherein the control point generation unit
      calculates a feature point addition distance based on a distance between a feature point on an upper side of the eyebrow and a feature point on a lower side of the eyelid, which is one of the first feature points, and
      adds, as one of the second feature points, a point on an upper side of the eyelid separated by the feature point addition distance from the feature point on the lower side of the eyelid, which is one of the first feature points.
  5.  The image conversion device according to any one of claims 1 to 4, further comprising:
      a change amount storage unit that stores in advance, for each converted facial expression to be converted into, a change amount representing the deformation amount for each of the control points; and
      a converted facial expression input unit that inputs the converted facial expression to be converted into,
      wherein the facial expression conversion unit reads the change amount corresponding to the input converted facial expression from the change amount storage unit and obtains the converted image using the read change amount.
  6.  An image conversion method in an image conversion device that has a processor and converts an expression in an image of a person's face, the image conversion method comprising:
      adding, by the processor, based on first feature points that are feature points on one side of a facial part recognized from the image of the person's face, second feature points that are unrecognized feature points on the other side, and using the first and second feature points as control points; and
      obtaining, by the processor, a converted image in which the facial expression of the person has been converted by deforming the control points by a deformation amount corresponding to a converted facial expression to be converted into.
  7.  An image conversion program that causes a processor to function as each of the units of the image conversion device according to any one of claims 1 to 5.

PCT/JP2022/007869 2022-02-25 2022-02-25 Image converting device, image converting method, and image converting program WO2023162131A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/007869 WO2023162131A1 (en) 2022-02-25 2022-02-25 Image converting device, image converting method, and image converting program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/007869 WO2023162131A1 (en) 2022-02-25 2022-02-25 Image converting device, image converting method, and image converting program

Publications (1)

Publication Number Publication Date
WO2023162131A1 true WO2023162131A1 (en) 2023-08-31

Family

ID=87765086

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/007869 WO2023162131A1 (en) 2022-02-25 2022-02-25 Image converting device, image converting method, and image converting program

Country Status (1)

Country Link
WO (1) WO2023162131A1 (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004164649A (en) * 2002-11-14 2004-06-10 Eastman Kodak Co System and method for modifying portrait image in response to stimulus
JP2004265406A (en) * 2003-02-28 2004-09-24 Eastman Kodak Co Method and system for improving portrait image processed in batch mode
JP2005215763A (en) * 2004-01-27 2005-08-11 Konica Minolta Photo Imaging Inc Method, device and program for image processing

Similar Documents

Publication Publication Date Title
CN109359538B (en) Training method of convolutional neural network, gesture recognition method, device and equipment
US10832039B2 (en) Facial expression detection method, device and system, facial expression driving method, device and system, and storage medium
JP7282810B2 (en) Eye-tracking method and system
JP7155271B2 (en) Image processing system and image processing method
WO2021093453A1 (en) Method for generating 3d expression base, voice interactive method, apparatus and medium
JP4799104B2 (en) Information processing apparatus and control method therefor, computer program, and storage medium
JP4799105B2 (en) Information processing apparatus and control method therefor, computer program, and storage medium
JP2021503662A (en) Neural network model training
JP5505409B2 (en) Feature point generation system, feature point generation method, and feature point generation program
WO2020103700A1 (en) Image recognition method based on micro facial expressions, apparatus and related device
WO2017035966A1 (en) Method and device for processing facial image
KR101525133B1 (en) Image Processing Device, Information Generation Device, Image Processing Method, Information Generation Method, Control Program, and Recording Medium
JP2021517330A (en) A method for identifying an object in an image and a mobile device for carrying out the method.
JP2019510325A (en) Method and system for generating multimodal digital images
JP7064257B2 (en) Image depth determination method and creature recognition method, circuit, device, storage medium
JP7149124B2 (en) Image object extraction device and program
CN110598638A (en) Model training method, face gender prediction method, device and storage medium
JP2019117577A (en) Program, learning processing method, learning model, data structure, learning device and object recognition device
JP2013242757A (en) Image processing apparatus, image processing method, and computer program
JP2014211719A (en) Apparatus and method for information processing
WO2023162131A1 (en) Image converting device, image converting method, and image converting program
JP6694638B2 (en) Program, information storage medium, and recognition device
JP2010244251A (en) Image processor for detecting coordinate position for characteristic site of face
CN112580395A (en) Depth information-based 3D face living body recognition method, system, device and medium
WO2023162132A1 (en) Image transformation device, method, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22928658

Country of ref document: EP

Kind code of ref document: A1