WO2023119557A1 - Avatar display device, avatar generation device, and program - Google Patents

Avatar display device, avatar generation device, and program

Info

Publication number
WO2023119557A1
Authority
WO
WIPO (PCT)
Prior art keywords
avatar
person
data
image
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2021/047876
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
勝秀 安倉
卓也 坂口
伸幸 岡
武史 福泉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SoftBank Corp
Original Assignee
SoftBank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SoftBank Corp
Priority to US17/764,149 (US11887234B2)
Priority to CA3229535A (CA3229535A1)
Priority to EP21968992.4A (EP4455998A4)
Priority to AU2021480445A (AU2021480445B2)
Priority to PCT/JP2021/047876 (WO2023119557A1)
Priority to JP2022517911A (JP7200439B1)
Priority to KR1020227009844A (KR102728245B1)
Priority to CN202180005532.2A (CN116648729A)
Priority to JP2022182125A (JP7504968B2)
Publication of WO2023119557A1
Legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 Three-dimensional [3D] animation
    • G06T13/40 Three-dimensional [3D] animation of characters, e.g. humans, animals or virtual beings
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 Two-dimensional [2D] image generation
    • G06T11/10 Texturing; Colouring; Generation of textures or colours
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 Two-dimensional [2D] image generation
    • G06T11/40 Filling planar surfaces by adding surface attributes, e.g. adding colours or textures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 Three-dimensional [3D] image rendering
    • G06T15/04 Texture mapping
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three-dimensional [3D] modelling for computer graphics
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating three-dimensional [3D] models or images for computer graphics
    • G06T19/20 Editing of three-dimensional [3D] images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments

Definitions

  • the present invention relates to an avatar display device, an avatar generation device, and a program.
  • Patent Literature 1 discloses a technique for creating facial animation using facial meshes. It is said that this makes it possible to create a realistic facial animation of a person's face image.
  • An avatar display device includes: an acquisition unit that acquires imaging data of a person; a second acquisition unit that acquires depth data or image data of the person from the imaging data; a generation unit that synthesizes a 3D model based on the depth data or the image data to generate an avatar; and a display unit that displays the avatar following the movement of the person in an image display area of an information communication device.
  • An avatar generation device includes: an acquisition unit that acquires imaging data of a person; a second acquisition unit that acquires depth data or image data of the person from the imaging data; a generation unit that synthesizes a 3D model based on the depth data or the image data to generate an avatar; and an output unit that outputs the generated avatar.
  • the avatar display device and avatar generation device may be realized by a computer.
  • a control program for an avatar display device that implements the device on a computer, and a computer-readable recording medium recording it are also included in the scope of the present invention.
  • FIG. 1 is a block diagram of an avatar display system according to Embodiment 1 of the present invention
  • FIG. 2 is a flow chart showing the flow of an avatar display method according to Embodiment 1.
  • FIG. 3 is a conceptual diagram showing an example of a procedure for generating and displaying an upper-body avatar.
  • FIG. 4 is a diagram in which only data corresponding to the eyes, nose, and mouth are extracted from the point cloud data of the face.
  • FIG. 5 is a conceptual diagram showing processing for connecting a face model and an upper body model of a person when generating an avatar.
  • FIG. 6 is a conceptual diagram showing processing for generating an avatar using texture data.
  • FIG. 7 is a block diagram of a mobile terminal according to Modification 1 of Embodiment 1.
  • FIG. 9 is a block diagram of an avatar display system according to Modification 2 of Embodiment 1;
  • FIG. 4 is a block diagram showing the configuration of a mobile terminal according to Embodiment 2 of the present invention; 9 is a flow chart showing the flow of an avatar display method according to Embodiment 2;
  • FIG. 11 is a block diagram showing the configuration of an avatar display system according to Embodiment 3 of the present invention; 11 is a flow chart showing the flow of an avatar display method according to Embodiment 3.
  • FIG. This is an example of a facial expression to be followed by an avatar.
  • FIG. 4 is a conceptual diagram showing procedures for replacing both the background and the person in the moving image;
  • FIG. 4 is a conceptual diagram showing procedures for replacing both the background and the person in the moving image;
  • FIG. 4 is a conceptual diagram showing procedures for replacing both the background and the person in the moving image;
  • FIG. 4 is a conceptual diagram showing procedures for replacing both the background and the person in the moving image;
  • FIG. 4 is
  • FIG. 10 is a conceptual diagram showing an example of a result of extracting the outline of a person from imaging data
  • FIG. 4 is a conceptual diagram showing an example of a result of extracting feature points of a person's face from image data
  • FIG. 4 is a conceptual diagram showing an example of a result of extracting feature points corresponding to the eyes, nose, and mouth of a person's face from image data
  • FIG. 2 is a conceptual diagram showing an example of a 3D model of the front part of the face and a 3D model of the head excluding the front part of the face. It is one structural example in the case of comprising each part, such as an avatar display device, using a computer.
  • FIG. 1 is a block diagram showing the configuration of an avatar display system 100 according to Embodiment 1.
  • the avatar display system 100 includes an imaging device 10 , an avatar display device 20 and an information communication device 30 .
  • the information communication device 30 is an information communication device used by a user, and is connected to information communication devices 30a, 30b, 30c, . . . used by other users via a network.
  • the information communication devices may be connected by P2P, or may be connected via a server (not shown).
  • an avatar display system refers to a system that displays an avatar in the image display area of an information communication device
  • an avatar display device refers to a device that displays an avatar in the image display area of an information communication device.
  • The information communication device on which the avatar is displayed may be the information communication device 30 used by the user, the information communication devices 30a, 30b, 30c, . . . used by other users, or both.
  • the imaging device 10 is a device for imaging a person (user) who creates an avatar.
  • the imaging device 10 can acquire still images or moving images, and includes a communication unit 11 for transmitting the acquired imaging data to the avatar display device 20 .
  • the imaging device 10 may be a depth camera capable of not only capturing an image (for example, an RGB image) but also measuring a distance (depth) to an object.
  • Known techniques can be used for the distance measurement method, such as three-dimensional Lidar (Light Detection and Ranging), triangulation using infrared light, or TOF (Time Of Flight).
  • the imaging device 10 may be a stereo camera having two or more imaging units.
  • the imaging data acquired by the imaging device 10 may include information indicating depth, and imaging data including information indicating depth may also be simply referred to as "imaging data".
  • the communication unit 11 is, for example, a wireless communication interface such as Bluetooth (registered trademark) or Wi-Fi (registered trademark), and can be connected to a network. As another example, the communication unit 11 may be a wired connection interface.
  • the imaging device 10 may be a video camera with a communication function. Alternatively, the imaging device 10 may be a personal computer, a mobile communication terminal, or the like having a communication function and an imaging device.
  • a mobile communication terminal is a mobile phone, a tablet, or the like.
  • the avatar display device 20 may include a main control section 21, a communication section 22, and a memory 23.
  • the main control unit 21 may include an acquisition unit 210 , a second acquisition unit 211 , a generation unit 212 , a display unit 213 and a texture generation unit 214 .
  • the second acquisition unit 211 may acquire (calculate) depth data from the imaging data. Note that, as described later, the second acquisition unit 211 may acquire image data from captured image data.
  • The generation unit 212 may synthesize the 3D model based on the depth data and generate the avatar. In one aspect, the generation unit 212 may generate an avatar by synthesizing the texture data of the person generated by the texture generation unit 214 with the 3D model. The details of the avatar generation method will be described later.
  • Display unit 213 uses the avatar generated by generation unit 212 to generate avatar display data for displaying the avatar that follows the movement of the person, and transmits the data to information communication device 30 via communication unit 22 .
  • The information communication device 30 may transmit the avatar display data to the information communication devices 30a, 30b, 30c, . . . .
  • The display unit 213 can display an avatar that follows the movement of the person in all or part of the image display areas 31 of the information communication devices 30, 30a, 30b, 30c, . . . .
  • the display unit 213 may transmit only the difference between the avatar display data and the past avatar display data to the information communication device 30 .
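One way to picture transmitting only the difference is sketched below in Python. It is a minimal illustration, not the method of the application: it assumes (hypothetically) that avatar display data can be represented as a dictionary mapping a part name to its pose, and the names `diff_display_data` and `apply_diff` are invented for this example.

```python
def diff_display_data(previous: dict, current: dict) -> dict:
    """Return only the entries that changed since the previous data (assumed dict layout)."""
    changed = {
        key: value
        for key, value in current.items()
        if key not in previous or previous[key] != value
    }
    removed = [key for key in previous if key not in current]
    return {"changed": changed, "removed": removed}


def apply_diff(previous: dict, diff: dict) -> dict:
    """Rebuild the current data on the receiving side from the previous data and a diff."""
    merged = {k: v for k, v in previous.items() if k not in diff["removed"]}
    merged.update(diff["changed"])
    return merged


if __name__ == "__main__":
    before = {"head": (0, 0, 0), "arm": (1, 1, 1)}
    after = {"head": (0, 0, 5), "arm": (1, 1, 1)}
    delta = diff_display_data(before, after)
    print(delta)
    print(apply_diff(before, delta) == after)
```

The receiver keeps the last full data and applies each diff in order, so unchanged parts are never resent.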
  • the avatar display data may be a moving image of the avatar.
  • the display unit 213 may generate avatar display data by replacing a person in each frame image of the moving image acquired by the imaging device 10 with an avatar that follows the movement of the person.
  • Replace in the present embodiment refers to general image processing in which a part of an image is seemingly replaced with another image. It includes a process of replacing with an avatar image, or a process of overlapping an image of a person's area and an avatar image with a predetermined transmittance.
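The two kinds of replacement described above (full replacement and overlap with a predetermined transmittance) can be sketched as per-pixel blending. This is a minimal illustration under stated assumptions: frames are nested lists of RGB tuples, the person's area is given as a boolean mask, and `transmittance` is taken to mean the fraction of the original person pixel that remains visible. The function names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def blend_pixel(person: Pixel, avatar: Pixel, transmittance: float) -> Pixel:
    """Overlay the avatar pixel on the person pixel.

    transmittance = 0.0 replaces the person entirely; 1.0 leaves the person unchanged.
    """
    if not 0.0 <= transmittance <= 1.0:
        raise ValueError("transmittance must be between 0 and 1")
    return tuple(
        round(transmittance * p + (1.0 - transmittance) * a)
        for p, a in zip(person, avatar)
    )


def replace_person(frame: Image, avatar: Image, person_mask: List[List[bool]],
                   transmittance: float = 0.0) -> Image:
    """Replace (or blend) only the pixels flagged as belonging to the person."""
    return [
        [blend_pixel(f, a, transmittance) if m else f
         for f, a, m in zip(frame_row, avatar_row, mask_row)]
        for frame_row, avatar_row, mask_row in zip(frame, avatar, person_mask)
    ]


if __name__ == "__main__":
    frame = [[(100, 100, 100), (10, 10, 10)]]
    avatar = [[(200, 0, 0), (0, 0, 0)]]
    mask = [[True, False]]
    print(replace_person(frame, avatar, mask))        # full replacement
    print(replace_person(frame, avatar, mask, 0.5))   # half-transparent overlay
```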
  • Alternatively, the display unit 213 may generate the avatar display data by placing the avatar that follows the movement of the person into another image. That is, the display unit 213 may generate a moving image of the avatar that follows the movement of the person by synthesizing the avatar that follows the movement of the person with a predetermined still image or moving image.
  • the data for displaying the avatar that follows the movement of the person may be data of the avatar that follows the movement of the person.
  • In that case, the display unit 213 may transmit the data indicating the avatar generated by the generation unit 212 and motion data indicating the movement of the avatar following the movement of the person to the information communication device 30 via the communication unit 22, so that an avatar that follows the movement of the person is displayed in all or part of the image display areas 31 of the information communication devices 30, 30a, 30b, 30c, . . . .
  • the texture generation unit 214 may generate texture data of a person based on the imaging data. Details of the texture generation unit 214 will be described later.
  • The main control unit 21 of the avatar display device 20 may include at least one processor, and this processor may read and execute an avatar display program stored in the memory 23 so as to function as the generation unit 212 and the display unit 213.
  • the program functioning as generation unit 212 and the program functioning as display unit 213 may be separate programs, and they may be recorded in memories of separate devices. Such a configuration will be described later.
  • the communication unit 22 receives images from the imaging device 10 and transmits avatar display data to the information communication device 30 .
  • the communication unit 22 is, for example, a wireless communication interface such as Bluetooth (registered trademark) or Wi-Fi (registered trademark), and can be connected to a network.
  • the communication unit 22 may be a wired connection interface.
  • the main control unit 21 records (stores) the image acquired via the communication unit 22 in the memory 23 .
  • the memory 23 stores various programs executed by the main control unit 21 and various data referred to by these programs. Also, the memory 23 stores the avatars generated by the generation unit 212 . Still images and moving images acquired by the imaging device 10 are recorded in the memory 23 .
  • the information communication devices 30, 30a, 30b, 30c, . . . are devices capable of wirelessly or wiredly communicating information with other information communication devices.
  • the 3D viewer is, for example, a headset- or goggle-type display device that displays an xR (VR (Virtual Reality), AR (Augmented Reality) and MR (Mixed Reality)) space.
  • An image is displayed in the image display area 31 .
  • the image display area 31 may be part or all of the display provided in the information communication device 30 .
  • the communication unit 32 is, for example, a wireless communication interface such as Bluetooth (registered trademark) or Wi-Fi (registered trademark), and can be connected to a network such as the Internet.
  • The communication unit 32 of the information communication device 30 may receive avatar display data from the avatar display device 20 and transmit it to the information communication devices 30, 30a, 30b, 30c, . . . .
  • each information communication device may receive avatar display data directly or indirectly from avatar display device 20 .
  • An application provided in each information communication device may be an application that implements a communication system. In that case, the image display area 31 may be a participant image display area, that is, an area on a display or viewer in which the participants in the communication system are displayed.
  • the image displayed in the participant image display area may be a two-dimensional image or a three-dimensional image.
  • a user of the avatar display system 100 can display his/her own avatar as a participant in the participant image display area.
  • Examples of communication systems include communication systems via moving images and communication systems via virtual space.
  • a communication system via moving images is a system in which people or their avatars communicate while viewing moving images of each other, and examples thereof include a remote conference system, a videophone system, and the like.
  • a communication system through a virtual space is a system that communicates by displaying an avatar in the virtual space.
  • the virtual space may be recognized using VR, AR, MR technology, or the like, and may be, for example, a VR space.
  • Examples of systems that display avatars in a virtual space for communication include metaverse, VR chat, and the like.
  • FIG. 2 is a flow chart showing the flow of the avatar display method S1.
  • the avatar display method S1 includes the following steps. First, in step S10, the imaging device 10 may capture an image of a person and generate imaging data. The acquisition unit 210 of the avatar display device 20 may acquire imaging data generated by the imaging device 10 via the communication unit 22 and the communication unit 11 and record it in the memory 23 .
  • the second acquisition unit 211 may acquire (calculate) the depth data of the person from the imaging data acquired by the acquisition unit 210 .
  • the second acquisition unit 211 may calculate point cloud (three-dimensional) data as the person's depth data.
  • For example, when the imaging data includes multi-viewpoint image data captured by a stereo camera, the second acquisition unit 211 may analyze the correlation between the multi-viewpoint images included in the imaging data to calculate the depth data.
  • When the imaging data includes information indicating depth (for example, when the imaging data includes a depth image), the second acquisition unit 211 may calculate the person's depth data (for example, point cloud (three-dimensional) data) based on the information indicating the depth included in the imaging data.
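The two routes to depth data mentioned above can be sketched with standard camera geometry. This is an illustrative sketch only: the text does not specify the calculation, so the rectified-stereo relation (depth = focal length × baseline / disparity) and the pinhole back-projection used here are assumptions, as are the parameter names `focal_px`, `baseline_m`, `fx`, `fy`, `cx`, and `cy`.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Depth of a point seen by a rectified stereo pair (assumed camera model)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_image_to_points(depth: List[List[float]], fx: float, fy: float,
                          cx: float, cy: float) -> List[Point3D]:
    """Back-project a depth image into a point cloud with a pinhole model.

    Pixels with a depth of zero or less are treated as 'no measurement' and skipped.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


if __name__ == "__main__":
    print(depth_from_disparity(100, focal_px=400, baseline_m=0.5))
    print(depth_image_to_points([[0.0, 2.0], [1.0, 0.0]], fx=2.0, fy=2.0, cx=0.0, cy=0.0))
```

In practice the intrinsics would come from camera calibration, and disparity would come from matching the multi-viewpoint images.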
  • the second acquisition unit 211 may refer to the depth data and delete the background.
  • the second acquisition unit 211 can delete the background by referring to the depth data and deleting objects other than the largest object among the objects closest to the imaging device 10 . This is because when an image for generating an avatar of a person is captured, the person is often positioned closest to the imaging device 10 .
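The background deletion described above can be sketched as a connected-component search over a depth image. This is a minimal illustration under assumptions not stated in the text: "closest" is interpreted as lying within a hypothetical `near_margin` of the smallest measured depth, objects are 4-connected pixel regions, and a depth of zero or less means no measurement.

```python
from collections import deque
from typing import List


def foreground_mask(depth: List[List[float]], near_margin: float) -> List[List[bool]]:
    """Keep only the largest connected region among the pixels nearest the camera."""
    height = len(depth)
    width = len(depth[0]) if height else 0
    mask = [[False] * width for _ in range(height)]
    valid = [z for row in depth for z in row if z > 0]
    if not valid:
        return mask
    limit = min(valid) + near_margin  # assumed definition of "closest"

    def is_near(y: int, x: int) -> bool:
        return 0 < depth[y][x] <= limit

    seen = [[False] * width for _ in range(height)]
    best: list = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or not is_near(y, x):
                continue
            # Breadth-first search collects one 4-connected near region.
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < height and 0 <= nx < width \
                            and not seen[ny][nx] and is_near(ny, nx):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) > len(best):
                best = component
    for y, x in best:
        mask[y][x] = True
    return mask


if __name__ == "__main__":
    depth_image = [[1.0, 1.1, 5.0, 1.0],
                   [1.0, 1.0, 5.0, 5.0]]
    for row in foreground_mask(depth_image, near_margin=0.5):
        print(row)
```

Pixels outside the returned mask would be treated as background and removed.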
  • In step S12, the generation unit 212 may synthesize a 3D model based on the depth data calculated by the second acquisition unit 211 and generate an avatar.
  • the generation unit 212 may record the generated avatar in the memory 23 .
  • In step S13, the display unit 213 may generate avatar display data for displaying an avatar that follows the movement of the person and transmit the data to the information communication device 30 via the communication unit 22, so that the avatar that follows the movement of the person is displayed in at least one image display area 31 of the information communication devices 30, 30a, 30b, 30c, . . . .
  • The display unit 213 may generate an avatar that follows the movement of the person by reading out the avatar recorded in the memory 23 and having it follow the movement of the person. For example, the display unit 213 can change the orientation of the avatar to follow the movement (orientation) of each part of the person and place the avatar so that it follows the position of the person. A method for determining the position and orientation of the person will be described later.
  • The display unit 213 may generate avatar display data by (i) replacing the person in each image frame of the imaging data recorded in the memory 23 with an avatar that follows the movement of the person, or (ii) synthesizing an avatar that follows the movement of the person with each frame of a predetermined still image or moving image, and may transmit the avatar display data to the information communication device 30 via the communication unit 22.
  • Alternatively, the display unit 213 may generate motion data indicating the movement of the avatar following the movement of the person and transmit it, together with the data of the avatar read from the memory 23, to the information communication device 30 via the communication unit 22 as avatar display data.
  • the display unit 213 may generate motion data that chronologically specifies the movement (orientation) and position of each part of the avatar corresponding to the movement (orientation) and position of each part of the person.
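Motion data that chronologically specifies the orientation and position of each part could be organized as in the sketch below. The text does not define a data layout, so everything here is an assumption made for illustration: the class names `PartPose` and `MotionFrame`, the use of (yaw, pitch, roll) angles in degrees, and JSON as the transmission format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class PartPose:
    position: Tuple[float, float, float]      # (x, y, z), assumed units
    orientation: Tuple[float, float, float]   # (yaw, pitch, roll) in degrees, assumed


@dataclass
class MotionFrame:
    time_s: float                 # timestamp of the observation
    parts: Dict[str, PartPose]    # part name -> pose at that time


def build_motion_data(observations: Iterable[tuple]) -> List[MotionFrame]:
    """Turn (time, {part: (position, orientation)}) observations into ordered frames."""
    frames = [
        MotionFrame(
            time_s=time_s,
            parts={name: PartPose(tuple(pos), tuple(ori))
                   for name, (pos, ori) in parts.items()},
        )
        for time_s, parts in observations
    ]
    frames.sort(key=lambda frame: frame.time_s)  # chronological order
    return frames


def to_json(frames: List[MotionFrame]) -> str:
    """Serialize the motion data so it can be sent alongside the avatar data."""
    return json.dumps([asdict(frame) for frame in frames])


if __name__ == "__main__":
    observed = [
        (0.2, {"head": ((0.0, 0.1, 1.0), (10.0, 0.0, 0.0))}),
        (0.1, {"head": ((0.0, 0.0, 1.0), (0.0, 0.0, 0.0))}),
    ]
    print(to_json(build_motion_data(observed)))
```

Sending compact motion data with a one-time avatar transfer avoids resending rendered frames, which matches the option of transmitting avatar data plus motion data.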
  • the generation unit 212 synthesizes a 3D model of the person's head from the depth data up to the person's face acquired by the imaging device 10 to generate the person's avatar.
  • the method will be specifically described below.
  • A method of generating and displaying an avatar of a person's head is described below, but the avatar is not limited to the head and may be of the upper body or the whole body. That is, the generation unit 212 may generate, for example, an avatar of the head of the person, an avatar of the upper body of the person, or an avatar of the whole body of the person.
  • the generating unit 212 may generate an avatar composed of separate parts, such as an avatar composed of a head and arms.
  • FIG. 3 is a conceptual diagram showing an example of the procedure for generating a head avatar.
  • the acquisition unit 210 acquires imaging data for avatar generation from the imaging device 10 .
  • An image 331 in FIG. 3 is imaging data, acquired from the imaging device 10, showing the front of the person for whom the avatar is generated.
  • The second acquisition unit 211 calculates point cloud (three-dimensional) data 332 as the depth data of the person.
  • the point cloud (three-dimensional) data 332 is depth data representing the depth (three-dimensional shape) of the face with discrete point cloud data.
  • the generation unit 212 generates a face model based on the point cloud data 332 and combines it with the head model to generate a head avatar 333 .
  • the head model will be described later.
  • The generation unit 212 may generate a face model by pasting a texture onto the point cloud data 332, or may build a 3D (three-dimensional) model with a triangular mesh from the point cloud data 332 and paste the texture onto it. Textures will be discussed later.
  • the generation unit 212 does not need to generate a face model using all depth data of the face.
  • the generation unit 212 generates an avatar using one or more depth data selected from depth data corresponding to the eyes, depth data corresponding to the nose, and depth data corresponding to the mouth, among the depth data of the face.
  • Depth data of one or more feature points around the eyes can be used as the depth data corresponding to the eyes, and the number of feature points is not particularly limited.
  • the one or more feature points around the eyes may include feature points corresponding to eyebrows.
  • Depth data of one or more feature points around the nose can be used as depth data corresponding to the nose, and the number of feature points is not particularly limited.
  • Depth data of one or more feature points around the mouth can be used as depth data corresponding to the mouth, and the number of feature points is not particularly limited.
  • the one or more feature points around the mouth may include a feature point corresponding to the chin.
  • Fig. 4 is an image in which feature points corresponding to the eyes, nose, and mouth are extracted from the point cloud data of the face.
  • An image 341 in FIG. 4 displays the point cloud data of the entire face, and an image 342 displays only the feature points corresponding to the eyes, nose, and mouth. The position and orientation of the face can be determined from the arrangement of the eyes, nose, and mouth alone, and the image 342 contains less data than the image 341.
  • Thus, an avatar generated from the depth data corresponding to the eyes, nose, and mouth requires less data than an avatar generated using all of the depth data of the face. As a result, image processing can be sped up.
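The data reduction from keeping only eye, nose, and mouth feature points can be illustrated with a small filter over a labeled point cloud. This is a sketch under an assumption the text does not make explicit: each point is taken to carry a region label (for example from a face landmark detector), and the label names and function names are hypothetical.

```python
from typing import List, Sequence, Tuple

LabeledPoint = Tuple[str, Tuple[float, float, float]]

# Hypothetical region labels; eyebrows and chin may be folded into "eye" and "mouth".
FEATURE_REGIONS = ("eye", "nose", "mouth")


def select_feature_points(points: Sequence[LabeledPoint],
                          regions: Sequence[str] = FEATURE_REGIONS) -> List[LabeledPoint]:
    """Keep only the points whose label is one of the requested facial regions."""
    wanted = set(regions)
    return [(label, xyz) for label, xyz in points if label in wanted]


def reduction_ratio(points: Sequence[LabeledPoint], kept: Sequence[LabeledPoint]) -> float:
    """Fraction of the original points that were discarded."""
    if not points:
        return 0.0
    return 1.0 - len(kept) / len(points)


if __name__ == "__main__":
    face_cloud = [
        ("eye", (-0.3, 0.2, 0.9)),
        ("eye", (0.3, 0.2, 0.9)),
        ("nose", (0.0, 0.0, 1.0)),
        ("mouth", (0.0, -0.3, 0.9)),
        ("cheek", (0.4, -0.1, 0.8)),
        ("forehead", (0.0, 0.5, 0.85)),
        ("cheek", (-0.4, -0.1, 0.8)),
        ("jaw", (0.0, -0.6, 0.8)),
    ]
    kept = select_feature_points(face_cloud)
    print(len(kept), reduction_ratio(face_cloud, kept))
```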
  • The generation unit 212 may, for example, generate a 3D model of the face by using a trained model obtained by machine learning, with sets of 3D models of various faces and one or more of the depth data corresponding to the eyes, the depth data corresponding to the nose, and the depth data corresponding to the mouth of those 3D models as training data.
  • FIG. 5 is a conceptual diagram showing the process of connecting a person's face model and head model in generating the head avatar 333 .
  • An image 352 is an image obtained by the generation unit 212 specifying the boundary between the face and the hair in the head model.
  • An image 362 is an image in which the generation unit 212 specifies the boundary between the face and both sides of the face in the head model.
  • The generation unit 212 can use a known algorithm such as blendshape to smoothly combine the face model and the head model in the elliptical portions of the images 352 and 362 and thereby generate the head avatar 333.
  • the avatar generated by the generation unit 212 is generated, for example, in a dedicated file format for 3D avatars and recorded in the memory 23.
  • the data format may be, for example, the VRM format, but the data format is not limited to this. It should be noted that depending on the platform or application that displays the avatar, it may not be possible to display the avatar recorded in the VRM format. In that case, the avatar recorded in VRM format may be converted into a data format that can be processed by the application. That is, the generation unit 212 may generate the avatar in a data format suitable for the application or system in which the avatar is used.
  • the display unit 213 can generate the avatar display data 334 including the background by replacing the person in the imaging data with the head avatar 333 generated by the generation unit 212 .
  • the display unit 213 may generate the avatar display data 334 by synthesizing the head avatar 333 generated by the generation unit 212 with a predetermined background image.
  • the display unit 213 generates motion data representing the movement of the avatar as described later, and combines the motion data with the head avatar 333 generated by the generation unit 212 to generate the avatar display data 334.
  • a face model and an upper body model may be combined, a face model and a whole body model may be combined, or a face model and other body models may be combined.
  • Existing models may be used with their sizes adjusted, and such models may be prepared and kept in advance. Also, a large number of existing models may be prepared and the user may select one, or the generation unit 212 may select a model similar to the image of the user captured by the imaging device 10.
  • The generation unit 212 may also combine the face model with a head model, upper body model, whole body model, or other body model of a character different from the person, or with such a model that differs from the person in gender or the like. As a result, it is possible to generate an avatar of a different character or gender from the person while retaining the facial features of the person.
  • the face orientation can be determined from the eyes, nose and/or mouth. For example, since the shape and arrangement of a person's two eyes are roughly fixed, it is possible to determine which direction the face is facing from the shape and positional relationship.
  • the orientation of the face can be expressed using a spherical coordinate vector with a specific predetermined position as the origin.
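As a concrete illustration of determining face orientation from the arrangement of the eyes and mouth, the sketch below computes the normal of the plane through three 3D feature points and expresses it as two spherical angles. The specific construction is an assumption, not taken from the text: the coordinate system has x to the right, y up, and z toward the camera; `left_eye` and `right_eye` are left and right as seen by the camera; and yaw and pitch are the names chosen here for the azimuth and elevation of the facing direction.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def face_direction(left_eye: Vec3, right_eye: Vec3, mouth: Vec3) -> Tuple[float, float]:
    """Return (yaw, pitch) in degrees of the direction the face is pointing.

    A face looking straight at the camera yields (0, 0) in the assumed axes.
    """
    across = _sub(right_eye, left_eye)   # eye-to-eye vector
    down = _sub(mouth, left_eye)         # eye-to-mouth vector
    normal = _cross(down, across)        # points out of the face toward the viewer
    length = math.sqrt(sum(c * c for c in normal))
    if length == 0.0:
        raise ValueError("feature points are collinear; orientation is undefined")
    nx, ny, nz = (c / length for c in normal)
    yaw = math.degrees(math.atan2(nx, nz))
    pitch = math.degrees(math.asin(max(-1.0, min(1.0, ny))))
    return yaw, pitch


if __name__ == "__main__":
    # Frontal face, then the same face turned a quarter turn about the vertical axis.
    print(face_direction((-1.0, 1.0, 0.0), (1.0, 1.0, 0.0), (0.0, -1.0, 0.0)))
    print(face_direction((0.0, 1.0, 1.0), (0.0, 1.0, -1.0), (0.0, -1.0, 0.0)))
```

The resulting angles can then be applied to the avatar so that it faces the same way as the person.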
  • the display unit 213 detects the position of at least one of the person's eyes, nose, and mouth based on the captured image of the person captured by the imaging device 10 .
  • the display unit 213 refers to the direction of the face determined from the detected position of at least one of the eyes, nose, and mouth, and replaces the image with the avatar facing that direction.
  • the position of the head can be determined as coordinates of a predetermined reference part of the head.
  • the generating unit 212 may generate a face model using, for example, general skin-colored textures. However, the generation unit 212 may use textures based on image data in order to approximate the face of an actual person (user).
  • the texture generation unit 214 may generate texture data of a person's face from the imaging data acquired by the acquisition unit 210 .
  • FIG. 6 is a conceptual diagram showing processing for generating an avatar using texture data generated by the texture generation unit 214.
  • the generation unit 212 generates a face model 1521 as shown in the image 152 based on the depth data calculated by the second acquisition unit 211 as shown in the image 151 . Then, the generation unit 212 synthesizes the generated face model 1521 with the texture data generated by the texture generation unit 214 to generate a face model as shown in the image 153 . This makes it possible to generate a more realistic avatar.
  • the texture generation unit 214 since the texture generation unit 214 generates texture data based on the captured data, it may fail to generate texture data for a portion of the person's face that is not included in the captured data. For example, texture data for both sides of the face may be generated based on the background of the person, as in the portion indicated by the ellipse 1541 in the image 154 . Therefore, the texture generation unit 214 may supplement the texture data of the portion of the person's face that is not included in the captured data with texture data in the vicinity thereof, as shown in the image 155 .
  • An avatar may also be generated using texture data prepared in advance for a given character or gender.
  • As described above, the user's avatar is generated from the image of the user captured by the imaging device 10, and the avatar can be displayed in the image display areas 31 of the information communication devices 30, 30a, 30b, 30c, . . . . A technique can therefore be realized for generating an avatar of a user from an image of the user and displaying it.
  • the generation unit 212 generates an avatar using the texture data generated by the texture generation unit 214, so that a more realistic avatar can be generated.
  • The generated avatar has at least its face matched to (made to resemble) the target person, so a person who sees it can immediately recognize whose avatar it is.
  • Such an avatar can be used in situations where it is necessary for the person to be recognized with certainty, but the person's true face should not be shown at all.
  • an avatar can be generated with an image of a well-groomed person, such as shaving and applying makeup, and can be used as necessary.
  • FIG. 7 is a block diagram of a mobile terminal 40 according to Modification 1 of Embodiment 1. The mobile terminal 40 is an example of an information communication device, and can communicate with the information communication devices 30a, 30b, 30c, . . . .
  • the mobile terminal 40 may be, for example, a mobile phone or a tablet terminal that includes the imaging device 10 and has a communication function.
  • the mobile terminal 40 may include the imaging device 10 , the main controller 21 , the memory 23 and the display 41 .
  • the imaging device 10 has the functions described in the first embodiment, and may be incorporated in the mobile terminal 40 .
  • the display 41 is the display screen of the mobile terminal 40 and may have an image display area 411 .
  • the image display area 411 is an area where an image is displayed using an application of the mobile terminal 40, for example.
  • the main control unit 21 may include an acquisition unit 210, a second acquisition unit 211, a generation unit 212, a display unit 213, and a texture generation unit 214.
  • the second acquisition unit 211 may calculate depth data based on the imaging data.
  • the generation unit 212 may synthesize a 3D model based on the depth data to generate an avatar and record it in the memory 23 .
  • the display unit 213 may cause the avatar recorded in the memory 23 to follow the orientation of the person in the moving image, and may display the avatar that follows the movement of the person in the image display area 411 of the display 41 and in the image display areas 31 of the information communication devices 30a, 30b, 30c, . . . .
  • the display unit 213 may (i) replace the person in each frame image of the moving image acquired by the imaging device 10 with an avatar that follows the movement of the person, (ii) synthesize an avatar that follows the movement of the person with a predetermined still image or moving image, or (iii) generate data of an avatar that follows the movement of the person.
  • the texture generation unit 214 may generate texture data of a person based on the imaging data. The function of the texture generation unit 214 is as described in the first embodiment.
  • the main control unit 21 may display the data generated by the display unit 213 on the display 41, or may transmit the data to the information communication devices 30a, 30b, 30c, . . . so that an avatar that follows the movement of the person is displayed in the image display areas 31 of those devices.
  • the memory 23 is as described in the first embodiment.
  • the mobile terminal 40 according to Modification 1 implements the imaging device 10, the avatar display device 20, and the information communication device 30 according to Embodiment 1 in one device.
  • the mobile terminal 40 is configured to have the functions of the imaging device 10, the avatar display device 20, and the information communication device 30 according to the first embodiment.
  • a desktop PC desktop personal computer
  • the mobile terminal 40 incorporates the imaging device 10 and also serves as an information communication device having the image display area 411 .
  • the performance of mobile terminals has improved, and various applications can be used. Therefore, the same effect as the avatar display system 100 described in the first embodiment can be obtained even with the configuration of the mobile terminal having all the functions described above. Also, the user can easily display his or her own avatar in the moving image using the portable terminal.
  • FIG. 8 is a block diagram of an avatar display system 200 according to Modification 2 of Embodiment 1.
  • the avatar display system 200 may comprise a mobile terminal 40A and a desktop PC 50.
  • the mobile terminal 40A may include the imaging device 10, the main control section 21, the communication section 22, and the memory 23.
  • the main control unit 21 may include an acquisition unit 210 , a second acquisition unit 211 , a generation unit 212 , a display unit 213 and a texture generation unit 214 .
  • the main control unit 21 may transmit avatar display data for displaying an avatar that follows the movement of the person to the desktop PC 50 via the communication unit 22 .
  • the configuration of the mobile terminal 40A is the same as that of the mobile terminal 40 described in Modification 1, except that the communication unit 22 is provided and the display 41 is not used.
  • the desktop PC 50 may include a display 51 and a communication section 52 , and the display 51 may include an image display area 511 .
  • the communication unit 52 may receive avatar display data from the mobile terminal 40A.
  • the desktop PC 50 may use the avatar display data received from the mobile terminal 40A to display an avatar that follows the movement of the person in the image display area 511 of the display 51 .
  • the desktop PC 50 is one form of the information communication device 30 described in the first embodiment.
  • the mobile terminal 40A may include an imaging device 10, an acquisition unit 210, a second acquisition unit 211, a generation unit 212, a display unit 213, and a texture generation unit 214, similar to the mobile terminal 40 of Modification 1. However, unlike the mobile terminal 40 of Modification 1, the mobile terminal 40A does not display the avatar that follows the movement of the person on its own display; instead, the avatar can be displayed on the display 51 of the desktop PC 50.
  • the mobile terminal 40A may generate avatar display data for displaying an avatar that follows the movement of a person, and cause the desktop PC 50 to display the avatar using the generated avatar display data.
  • the display unit 213 may be provided in the desktop PC 50 instead of the mobile terminal 40A (not shown). According to the above configuration, the same effects as those of the avatar display system 100 described in the first embodiment can be obtained. In addition, the user can easily display his or her own avatar on the moving image of the desktop personal computer using the portable terminal.
  • FIG. 9 is a block diagram showing the configuration of a mobile terminal 40B according to Embodiment 2 of the present invention.
  • the mobile terminal 40B may include the imaging device 10, the main controller 21, the memory 23 and the display 41.
  • the main control unit 21 may include an acquisition unit 210 , a second acquisition unit 211 , a generation unit 212 , a display unit 213 , a conversion unit 215 and a texture generation unit 214 .
  • the imaging device 10, acquisition unit 210, second acquisition unit 211, generation unit 212, display unit 213, texture generation unit 214, and memory 23 are the same as those described in the first embodiment.
  • the mobile terminal 40B like the mobile terminal 40, also serves as the information communication device 30 described in the first embodiment.
  • the mobile terminal 40B may be equipped with an application for a communication system via moving images.
  • a moving image used in a communication system may be a two-dimensional image or a three-dimensional image. Examples of a communication system via moving images include a remote conference system and a communication system in a VR space (virtual space).
  • a remote conference system is a system in which multiple people hold meetings, conversations, chats, etc. via mobile phones, personal computers, videophones, etc. using the Internet or a dedicated line. It is also called a conference system.
  • a communication system using moving images used in this embodiment can be implemented using a known application, and the type thereof is arbitrary.
  • the display 41 displays the communication system 411 via moving images (for example, the conference room of the remote conference system), and the participant image display area 4111 is displayed therein. In the participant image display area 4111, the participants of the communication system via moving images are displayed. In this embodiment, images of the users of the mobile terminal 40B and the information communication devices 30a, 30b, 30c, . . . are displayed.
  • a participant image display area 4111 corresponds to the image display area 31 described in the first embodiment.
  • the display unit 213 may display an avatar that follows the movement of the participant in the participant image display area 4111 .
  • the user of the mobile terminal 40B may, for example, take an image of himself/herself with the mobile terminal 40B before starting the meeting. Then, a desired image may be selected from the images, and the generation unit 212 may generate an avatar. As an example, when the user selects the upper body or head region of the desired image and taps the avatar generation button, the generation unit 212 may generate an avatar of the upper body or the head from the image of the region and record it in the memory 23.
  • the user may specify a participant image display area 4111 where his/her own image is displayed in the communication system through moving images. Specifically, for example, you may enter or select your own name or ID to be displayed in the image. It should be noted that the names or IDs of other participants cannot be entered or selected.
  • the display unit 213 may replace the user's upper body, whole body, or head in the moving image captured by the imaging device 10 with an avatar that follows the movement of the user.
  • the main control unit 21 may display the moving image replaced with the avatar in the participant image display area 4111 . Furthermore, the main control unit 21 may transmit the moving image in which the user's image is replaced with the avatar to the information communication devices used by the other participants through the application of the communication system via the moving image.
  • the conversion unit 215 may convert the data format of the avatar generated by the generation unit 212. Specifically, the conversion unit 215 may convert the data format of the avatar so as to conform to the image data format of the application used by the mobile terminal 40B. If the image data format that can be used differs depending on the application of the remote conference system, the conversion unit 215 may convert the avatar data format (for example, VRM format) into a data format that can be processed by the application. As a result, the avatar can be displayed regardless of the type of communication system using moving images.
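  • The format selection performed by a conversion unit can be sketched as a lookup from application to required format, followed by a lookup of a converter. Everything named here is hypothetical: the application names, the `APP_FORMATS` and `CONVERTERS` tables, and the placeholder converter, which only relabels the data and does not actually re-encode a model.

```python
from typing import Callable, Dict, Tuple

Avatar = Dict[str, object]

# Hypothetical mapping from application name to the avatar format it accepts.
APP_FORMATS: Dict[str, str] = {
    "conference_app_a": "vrm",
    "conference_app_b": "glb",
}


def vrm_to_glb(avatar: Avatar) -> Avatar:
    # Placeholder only: a real converter would re-encode the model data.
    return {**avatar, "format": "glb"}


# Converters are keyed by (source format, target format).
CONVERTERS: Dict[Tuple[str, str], Callable[[Avatar], Avatar]] = {
    ("vrm", "glb"): vrm_to_glb,
}


def convert_for_app(avatar: Avatar, app: str) -> Avatar:
    """Return the avatar in the format required by the given application."""
    target = APP_FORMATS[app]
    source = str(avatar["format"])
    if source == target:
        return avatar  # already usable as-is
    converter = CONVERTERS.get((source, target))
    if converter is None:
        raise ValueError(f"no converter from {source} to {target}")
    return converter(avatar)


avatar = {"format": "vrm", "owner": "user"}
print(convert_for_app(avatar, "conference_app_b")["format"])
```

Keeping the tables separate from the dispatch function means support for another application can be added without touching the conversion logic.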
  • the avatar display method S2 for converting the avatar data format will be described below with reference to the drawings.
  • FIG. 10 is a flow chart showing the flow of the avatar display method S2 executed by the mobile terminal 40B.
  • the avatar display method S2 includes the following steps.
  • the imaging device 10 may capture an image of a person and generate imaging data.
  • the acquisition unit 210 of the avatar display device 20 may acquire the imaging data generated by the imaging device 10 via the communication units 22 and 11 .
  • the second acquisition unit 211 may acquire (calculate) the depth data of the person from the imaging data acquired by the acquisition unit 210.
  • the second acquisition unit 211 may acquire image data of a person from the imaging data.
  • the generation unit 212 may synthesize the 3D model based on the depth data calculated by the second acquisition unit 211 to generate the avatar.
  • the generation unit 212 may record the generated avatar in the memory 23 .
  • the conversion unit 215 may convert the data format of the avatar according to the application of the communication system via the moving image used by the mobile terminal 40B.
  • the display unit 213 may display an avatar following the movement of the person in the participant image display area 4111 of the display 41 of the mobile terminal 40B (information communication device). Further, the display unit 213 may transmit the avatar whose data format has been converted to the information communication devices 30a, 30b, 30c, . . . so that the avatar is displayed in their image display areas 31 (participant image display areas).
  • According to the mobile terminal 40B and the avatar display method S2 according to Embodiment 2 having the above configuration, the user's avatar can be displayed in a communication system (for example, a remote conference system) via moving images. As a result, the same effect as that obtained in the first embodiment can be obtained. Especially in a remote conference system, it is necessary to be able to clearly recognize who has spoken. Since the avatar according to this embodiment is generated from an image of the user, the other participants can clearly recognize the user from the avatar. Further, since the avatar according to the present embodiment is an avatar that resembles the user, there is an advantage that the other participants do not feel a great deal of discomfort even if the real face is not shown. Furthermore, by providing the conversion unit 215, it is possible to cope with applications that require different data formats.
  • the conversion unit 215 described in the second embodiment above can also be implemented in combination with the first embodiment.
  • FIG. 11 is a block diagram showing the configuration of an avatar display system 300 according to Embodiment 3 of the present invention.
  • the avatar display system 300 may include the imaging device 10 , the avatar display device 20 and the information communication device 30 .
  • the imaging device 10 and the information communication device 30 are the same as the devices described in the first embodiment, so their description is omitted.
  • the avatar display device 20 may include a main control section 21, a communication section 22, and a memory 23.
  • the main control unit 21 may include an acquisition unit 210 , a second acquisition unit 211 , a generation unit 212 , a display unit 213 , a texture generation unit 214 and a tracking unit 216 .
  • the acquisition unit 210, the second acquisition unit 211, the generation unit 212, the display unit 213, and the texture generation unit 214 are as described in the first embodiment.
  • the tracking unit 216 may track the face of the person in the image acquired by the imaging device 10 . Additionally, the tracking unit 216 may track the neck of the person in the image. Tracking is position detection.
  • the display unit 213 may refer to the tracked face orientation and cause the avatar to follow the movement of the person. Further, when the tracking unit 216 tracks the neck of the person, the display unit 213 may refer to the tracked orientation of the neck and cause the avatar to follow the movement of the body below the neck of the person. The operation will be specifically described below.
  • the generation unit 212 may generate not only the head model of the person, but also the model of the body below the neck from the image acquired by the imaging device 10 .
  • the body part below the neck is, for example, the upper half of the body excluding the head.
  • the body below the neck is also simply referred to as the "body".
  • the body model does not have to be as similar as the face model.
  • an existing body model may be used as the body model.
  • the body model may be an existing upper body model from which the head is removed.
  • the tracking unit 216 may track the person's face, for example, the eyes. It should be noted that any known program or application can be used to track a specific object in the image.
  • the objects to be tracked may be the eyes, nose and mouth.
  • the tracking unit 216 may derive the orientation of the face from the shape and placement of the two tracked eyes. The method for deriving the face orientation is the same as the method for determining the face orientation described in the first embodiment.
  • the tracking unit 216 may track the person's neck together with the face.
  • the neck for example, is perceived as a cylindrical area that extends down the face.
  • the tracking unit 216 may derive the tracked neck orientation.
  • the orientation of the neck refers to the direction (front, back, left, or right) in which the neck is bent. In other words, the orientation of the neck is the direction in which the neck is inclined. If the neck is tilted left or right, it is determined that the body is tilted left or right at that angle. Likewise, if the neck is tilted forward or backward, it is determined that the body is tilted forward or backward at that angle.
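  • The left/right and forward/backward inclination of the neck can be expressed as two angles. The sketch below assumes that the tracked neck is available as two 3D points (its base and its top) in a coordinate system where y points up, x to the side, and z toward the camera; this representation and the function name `neck_lean_angles` are assumptions for this example.

```python
import math
from typing import Tuple

Point3 = Tuple[float, float, float]


def neck_lean_angles(base: Point3, top: Point3) -> Tuple[float, float]:
    """Return (sideways, forward) lean of the neck axis in degrees.

    0 degrees means the neck is upright along +y. A positive sideways
    angle leans toward +x; a positive forward angle leans toward +z.
    """
    dx = top[0] - base[0]
    dy = top[1] - base[1]
    dz = top[2] - base[2]
    sideways = math.degrees(math.atan2(dx, dy))
    forward = math.degrees(math.atan2(dz, dy))
    return sideways, forward


# A neck leaning equally up and to the side gives a 45 degree sideways lean.
print(neck_lean_angles((0.0, 0.0, 0.0), (1.0, 1.0, 0.0)))
```

Under the rule described above, the same two angles would then be applied to the body model of the avatar.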
  • the reason for tracking the neck is to process the avatar's head and body separately.
  • the generating unit 212 may generate separately a model of the head of the person and a model of the body below the neck. The head moves independently of the body.
  • the orientation of the neck is linked to the orientation of the body. For example, if the neck is tilted to the right, it is considered that the body is tilted to the right. Therefore, the display unit 213 may move the avatar's head by tracking the person's eyes, nose, and mouth, and move the avatar's body by referring to the orientation of the person's neck. This makes it possible to display more realistic movements of the avatar.
  • Moving the head of the avatar means that the display unit 213 changes the orientation of the head model generated by the generation unit 212 so as to follow the orientation of the face derived by the tracking unit 216, and displays it in the image display area 31 of at least one of the information communication devices 30, 30a, 30b, 30c, . . . .
  • Moving the body of the avatar means that the display unit 213 changes the orientation of the body model generated by the generation unit 212 so as to follow the orientation of the neck derived by the tracking unit 216, and displays it in the image display area 31 of at least one of the information communication devices 30, 30a, 30b, 30c, . . . .
  • the display unit 213 may generate a head model or body model whose orientation has been changed, or may generate motion data specifying that the orientation of the head model or body model should be changed.
  • the position and orientation of the shoulders may also be detected by tracking the shoulders, the degree of twist of the body may be derived from them, and the body model of the avatar may be twisted accordingly. In addition, the limbs move independently of the body. Therefore, if the limbs are also avatarized, the positions and orientations of the limbs may be tracked from the image, and the orientation of the separately generated limb models may be changed.
  • FIG. 12 is a flow chart showing the flow of the avatar display method S3 executed by the avatar display system 300. The avatar display method S3 will be described with an example of displaying an avatar including both a head and a body.
  • the avatar display method S3 includes the following steps. First, in step S31, the acquisition unit 210 may acquire image data of a person imaged by the imaging device 10. In step S32, the tracking unit 216 may track the eyes, nose, mouth, and neck of the person in the imaging data acquired by the acquisition unit 210. At this time, depth data or image data may be obtained from the imaging data, and the eyes, nose, mouth and neck of the person may be tracked based on the depth data or image data.
  • the display unit 213 may refer to the tracked eyes, nose, and mouth of the person to move the head of the avatar.
  • the display unit 213 may refer to the direction of the tracked neck of the person to move the body of the avatar.
  • the display unit 213 may move the avatar by generating avatar display data for displaying the avatar following the movement of the head and body of the person in the moving image. A moving image in which the avatar moves is displayed in the image display area 31 .
  • the head model and the body model of the person are generated separately; the avatar's head may be moved by tracking the eyes, nose, and mouth, and the avatar's body may be moved by referring to the orientation of the neck. This allows the avatar's head and body to move independently. Therefore, in addition to the effect according to the first or second embodiment, it is possible to obtain the effect of being able to display an avatar that is more faithful to the movement of the person.
  • tracking unit 216 described in Embodiment 3 above can also be implemented in combination with Embodiment 1 or Embodiment 2.
  • An avatar generation device includes an acquisition unit that acquires imaging data of a person, a second acquisition unit that calculates depth data of the person from the imaging data, a generation unit that generates an avatar by synthesizing a 3D model based on the depth data, and an output unit that outputs the generated avatar.
  • the generation unit is the same as the generation unit 212 described in the first to third embodiments.
  • the output section is the same as the communication section 22 described in the first to third embodiments.
  • the avatar generation device may include a memory for recording the avatar generated by the generation unit. This memory is similar to the memory 23 described in the first to third embodiments.
  • the avatar generated by the avatar generation device can be output to the outside by the output unit.
  • an avatar generation device can output an avatar in a virtual space.
  • Virtual space refers to a virtual reality space called VR (Virtual Reality) that is entirely composed of computer graphic images (for example, VRChat), an augmented reality space called AR (Augmented Reality) that combines real space and computer graphic images, and a mixed reality space called MR (Mixed Reality).
  • Outputting an avatar in a virtual space means displaying an avatar in such a virtual space.
  • the avatar generation device may include the conversion unit 215 described above.
  • one's own avatar can be generated and displayed in various virtual spaces.
  • the imaging apparatus 10 described in the first to fourth embodiments can acquire color data of a person by using color filters.
  • the generator 212 can generate an avatar using the depth data and the color data.
  • the avatars described in Embodiments 1 to 5 above can not only follow the orientation of the person, but can also follow the facial expression of the person.
  • the display unit 213 may detect feature points in the person's face based on the captured image of the person captured by the imaging device 10 .
  • the display unit 213 may detect, for example, one or more feature points around the eyes, one or more feature points around the nose, and one or more feature points around the mouth.
  • the one or more feature points around the eyes may include feature points corresponding to eyebrows.
  • the one or more feature points around the mouth may include a feature point corresponding to the chin.
  • the number of feature points in the person's face that are detected to follow the facial expression may be less than the number of feature points in the person's face that are detected to generate the avatar.
  • the greater the number of detected feature points, the more accurately a person's facial expression can be represented. However, as the number of feature points increases, the information processing takes longer, which makes it difficult to display the avatar in real time. Therefore, by narrowing down the number of feature points representing facial expressions, it is possible to perform quick information processing and generate avatars with rich facial expressions in real time.
  • Feature points in a person's face that are detected to follow facial expressions may be, for example, those shown in FIG.
  • the display unit 213 deforms the face model of the avatar based on the change in the coordinates of the detected feature points from the coordinates of the feature points when the avatar is generated, so that the facial expression of the avatar follows the facial expression of the person.
  • since the shapes of the eyes, nose, and mouth can be deformed according to the facial expression of the person based on the change in the coordinates of the feature points around the eyes, nose, and mouth, the facial expression of the avatar can follow the facial expression of the person.
  • the generation unit 212 may record, in association with the face model, the coordinates of those feature points, among the feature points in the face of the person detected for generating the avatar, that are used to follow the facial expression.
  • the display unit 213 may deform the face model using a known algorithm such as blendshape, based on the difference between the coordinates of each feature point linked to the face model and the coordinates of each detected feature point. Thereby, the display unit 213 can make the facial expression of the avatar follow the facial expression of the person in real time.
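  • A blendshape-style deformation computes each vertex as the neutral position plus a weighted sum of per-shape offsets. The sketch below shows that formula together with one possible way of turning a measured feature-point distance into a weight; the single "mouth_open" shape, the lip-gap measurement, and the function names are assumptions for this example, not details taken from the specification.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


def expression_weight(neutral: float, current: float, maximum: float) -> float:
    """Map a measured distance (e.g. lip gap) to a blend weight in [0, 1]."""
    if maximum <= neutral:
        raise ValueError("maximum must be greater than neutral")
    weight = (current - neutral) / (maximum - neutral)
    return max(0.0, min(1.0, weight))


def apply_blendshapes(base: List[Vec3],
                      deltas: Dict[str, List[Vec3]],
                      weights: Dict[str, float]) -> List[Vec3]:
    """vertex_i = base_i + sum over shapes of weight * delta_i."""
    blended: List[Vec3] = []
    for i, (x, y, z) in enumerate(base):
        for name, w in weights.items():
            dx, dy, dz = deltas[name][i]
            x, y, z = x + w * dx, y + w * dy, z + w * dz
        blended.append((x, y, z))
    return blended


# Two vertices: upper lip (fixed) and lower lip (moves down when opening).
base = [(0.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
deltas = {"mouth_open": [(0.0, 0.0, 0.0), (0.0, -2.0, 0.0)]}
w = expression_weight(neutral=2.0, current=6.0, maximum=10.0)
print(apply_blendshapes(base, deltas, {"mouth_open": w}))
```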
  • for parts of the face that are not visible in a normal state (for example, the tongue and teeth), the generation unit 212 may prepare 3D model parts in advance and synthesize them with the avatar.
  • the display unit 213 may reflect the facial expression specified by the user in the facial expression of the avatar. For example, when a person has an angry expression, the avatar may have a smiling expression. In this case, the display unit 213 fixes (does not transform) only the feature points for creating a smile (for example, the feature points around the mouth), and changes the other feature points according to the coordinates of the detected feature points. In this way, the avatar can show a smiling expression even though the person is angry.
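  • Fixing some feature points while letting the others follow the detected coordinates can be sketched as a simple merge of two coordinate tables. The feature-point names and the function name `merge_expression` are hypothetical and serve only this example.

```python
from typing import Dict, Iterable, Tuple

Point = Tuple[float, float]


def merge_expression(detected: Dict[str, Point],
                     chosen: Dict[str, Point],
                     fixed: Iterable[str]) -> Dict[str, Point]:
    """Use the user-chosen coordinates for the fixed feature points and
    the detected coordinates for every other feature point."""
    merged = dict(detected)
    for name in fixed:
        merged[name] = chosen[name]
    return merged


detected = {"brow_left": (0.30, 0.20), "mouth_left": (0.40, 0.80),
            "mouth_right": (0.60, 0.80)}
smile = {"mouth_left": (0.38, 0.76), "mouth_right": (0.62, 0.76)}
print(merge_expression(detected, smile, fixed=["mouth_left", "mouth_right"]))
```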
  • the facial feature points of a character and the facial feature points of the person may also be associated in advance, so that the facial expression of the person is reflected in the facial expression of the character. As a result, the user's facial expression can be reflected on characters other than humans.
  • the display unit 213 may use templates such as images 361a to 361h shown in FIG. 13 to make the avatar's facial expression follow the person's facial expression.
  • the display unit 213 may analyze the facial expression of a person in the captured image, and if a facial expression similar to one of the images 361a to 361h is detected, the facial expression of the avatar may follow the facial expression.
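  • Detecting an expression that is "similar to" a template can be sketched as a nearest-template search with a distance threshold. The sketch assumes each template is stored as a list of normalized feature-point coordinates; that representation, the threshold value, and the function name `closest_expression` are assumptions for this example.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


def closest_expression(points: List[Point],
                       templates: Dict[str, List[Point]],
                       max_distance: float) -> Optional[str]:
    """Return the name of the nearest template, or None if none is close enough."""
    best_name: Optional[str] = None
    best_dist = float("inf")
    for name, template in templates.items():
        dist = math.sqrt(sum((px - tx) ** 2 + (py - ty) ** 2
                             for (px, py), (tx, ty) in zip(points, template)))
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None


# Hypothetical mouth-corner / lower-lip templates in normalized coordinates.
templates = {
    "smile": [(0.30, 0.70), (0.50, 0.75), (0.70, 0.70)],
    "surprise": [(0.40, 0.70), (0.50, 0.85), (0.60, 0.70)],
}
observed = [(0.31, 0.70), (0.50, 0.76), (0.69, 0.70)]
print(closest_expression(observed, templates, max_distance=0.1))
```

Returning `None` when no template is close enough lets the caller fall back to direct feature-point tracking.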
  • the avatar of only the upper body or head of the person in the moving image is displayed, and the background remains as it is.
  • the background in the moving image may be replaced and displayed.
  • the procedure in this case is shown in FIG. 14. Image 371 in FIG. 14 is the original image.
  • part of the background is cut out along the outline of the person. This outline may be a rough outline slightly larger than the person. Then, by inserting the avatar of the person into the cut-out area and smoothly blending the gap between the avatar and the background, it is possible to obtain an image such as image 373, in which the desired background and the avatar are displayed.
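  • Smoothly blending the avatar into the background can be sketched as alpha compositing with a feathered (softened) mask. The sketch uses single-channel pixel rows and a simple box blur for the feathering; both choices, as well as the function names `feather_row` and `composite`, are assumptions for this example.

```python
from typing import List


def feather_row(mask_row: List[float], radius: int) -> List[float]:
    """Soften a 0/1 mask row with a box blur so that the edge fades gradually."""
    out = []
    for i in range(len(mask_row)):
        window = mask_row[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out


def composite(avatar: List[List[float]],
              background: List[List[float]],
              alpha: List[List[float]]) -> List[List[int]]:
    """Per pixel: out = alpha * avatar + (1 - alpha) * background."""
    result = []
    for a_row, b_row, m_row in zip(avatar, background, alpha):
        result.append([round(m * a + (1.0 - m) * b)
                       for a, b, m in zip(a_row, b_row, m_row)])
    return result


mask = [feather_row([0.0, 0.0, 1.0, 1.0, 1.0], radius=1)]
avatar = [[200.0] * 5]
background = [[0.0] * 5]
print(composite(avatar, background, mask))
```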
  • the face of the avatar matches the person as closely as possible, but the other parts do not need to match the person.
  • hairstyles may be changed, and accessories such as necklaces, eyeglasses, etc. may be added to the avatar.
  • body parts other than the head may be replaced with an avatar having a body shape different from that of the person.
  • the avatar display device 20 calculates depth data from imaging data of a person (user), synthesizes a 3D model of the person based on the depth data, and generates an avatar.
  • data for generating avatars is not limited to depth data.
  • an avatar may be generated from image data.
  • in this case, the second acquisition unit 211 acquires image data from the imaging data, and the generation unit 212 synthesizes a 3D model based on the image data to generate an avatar.
  • the image data is not particularly limited; for example, RGB image data can be used.
  • the second acquisition unit 211 may extract the image data of the person from the imaging data.
  • the second acquisition unit 211 may use a known edge detection algorithm to extract contours from the image data, thereby distinguishing between the person and the background in the image data and extracting the image data of the person.
  • FIG. 15 is a conceptual diagram showing an example of the result of extracting the outline of a person from imaging data.
  • An image 151 on the left side of FIG. 15 shows an example of captured data, and includes a person 153 and a background 154 .
  • An image 152 on the right side of FIG. 15 shows an example of an image in which the contour is extracted from the image 151 by the second acquisition unit 211 .
  • based on the image 152, the second acquisition unit 211 can acquire image data of the person 153 by removing from the image 151 everything outside the contour of the largest object, thereby deleting the background 154.
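  • Keeping only the largest object can be sketched as a connected-component search over a binary foreground grid. The sketch assumes that the regions enclosed by the extracted contours have already been filled with 1s; that preprocessing, the 4-connectivity, and the function name `largest_region_mask` are assumptions for this example.

```python
from collections import deque
from typing import List, Tuple


def largest_region_mask(grid: List[List[int]]) -> List[List[int]]:
    """Return a mask that keeps only the largest 4-connected region of 1s."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    best: List[Tuple[int, int]] = []
    for sy in range(h):
        for sx in range(w):
            if grid[sy][sx] == 0 or seen[sy][sx]:
                continue
            # Breadth-first search collects one connected region.
            region: List[Tuple[int, int]] = []
            queue = deque([(sy, sx)])
            seen[sy][sx] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and grid[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    mask = [[0] * w for _ in range(h)]
    for y, x in best:
        mask[y][x] = 1
    return mask


foreground = [[1, 0, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]]
for row in largest_region_mask(foreground):
    print(row)
```

Multiplying the original image by the returned mask would leave only the pixels of the largest object.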
  • the generation unit 212 may detect feature points from the person's image data.
  • the generator 212 may detect feature points of a person's face using a known feature point detection algorithm.
  • FIG. 16 is a conceptual diagram showing an example of the result of detecting the feature points P of the face of the person 161 from the image data of the person 161.
  • the generation unit 212 can detect the arrangement of the eyes, nose, and mouth of the face of the person 161 and the size of the face from the detected feature points P of the face of the person 161 .
  • FIG. 17 is a conceptual diagram showing an example of the result of extracting feature points Q of the eyes, nose, and mouth of a person's face from various image data including persons facing different directions.
  • the generation unit 212 can detect the feature points Q of the eyes, nose, and mouth of the person's face in a similar manner not only from the image 171 of the person facing the front but also from the images 172 to 174 of the person facing various directions.
  • a tracking template is a template used to track the orientation of a person's face, and indicates a reference arrangement of the feature points Q of the eyes, nose, and mouth of a person's face corresponding to various orientations of the person's face.
  • a tracking template can be generated, for example, by using the technique described above to detect the feature points Q of the eyes, nose, and mouth in images of various people (who may be of different races) facing various directions, and then performing machine learning on the results. By detecting the arrangement of the feature points of the eyes, nose, and mouth in an image of a person whose face orientation is to be tracked and comparing it with the tracking template, the orientation of the person's face can be estimated.
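  • Comparing a detected arrangement with a tracking template can be sketched as a nearest-arrangement search after removing position and scale. In the sketch the reference arrangements are written by hand rather than learned, and each arrangement lists the left eye, right eye, nose, and mouth in that order; these choices and the function names are assumptions for this example.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def normalize(points: List[Point]) -> List[Point]:
    """Remove position and scale: center on the centroid, scale to unit RMS radius."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]
    scale = math.sqrt(sum(x * x + y * y for x, y in centered) / n)
    if scale == 0:
        raise ValueError("all feature points coincide")
    return [(x / scale, y / scale) for x, y in centered]


def estimate_orientation(points: List[Point],
                         templates: Dict[str, List[Point]]) -> str:
    """Return the orientation whose reference arrangement is closest to the points."""
    q = normalize(points)

    def squared_distance(name: str) -> float:
        t = normalize(templates[name])
        return sum((qx - tx) ** 2 + (qy - ty) ** 2
                   for (qx, qy), (tx, ty) in zip(q, t))

    return min(templates, key=squared_distance)


# Hand-written reference arrangements (left eye, right eye, nose, mouth).
templates = {
    "front": [(-1.0, 1.0), (1.0, 1.0), (0.0, 0.0), (0.0, -1.0)],
    "left": [(-0.8, 1.0), (0.6, 1.0), (-0.5, 0.0), (-0.3, -1.0)],
    "right": [(-0.6, 1.0), (0.8, 1.0), (0.5, 0.0), (0.3, -1.0)],
}
detected = [(104.0, 220.0), (132.0, 220.0), (110.0, 200.0), (114.0, 180.0)]
print(estimate_orientation(detected, templates))
```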
  • the generation unit 212 may acquire information about the feature points Q of the eyes, nose, and mouth of the person's face with the tracking template as a reference when generating the person's avatar. For example, when the feature points Q of the eyes, nose, and mouth are detected from each of the images 171 to 174, they may be compared with the reference arrangement indicated by the tracking template created in advance to obtain the differences in arrangement and size relative to the reference. Then, the generation unit 212 may generate a 3D model of the front part of the person's face based on the information of the feature points Q of the eyes, nose, and mouth thus acquired (differences in arrangement and size with respect to the reference).
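  • Differences in size and arrangement relative to a reference can be sketched as a scale ratio plus per-feature offsets. The sketch measures size by the distance between the eyes and aligns both arrangements at the midpoint between the eyes; these conventions, the feature names, and the function name `difference_from_reference` are assumptions for this example.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]


def _eye_midpoint(p: Dict[str, Point]) -> Point:
    (lx, ly), (rx, ry) = p["left_eye"], p["right_eye"]
    return ((lx + rx) / 2.0, (ly + ry) / 2.0)


def difference_from_reference(points: Dict[str, Point],
                              reference: Dict[str, Point]
                              ) -> Tuple[float, Dict[str, Point]]:
    """Return (size ratio, per-feature offsets) relative to the reference.

    Size is measured by the inter-eye distance. Offsets are computed after
    scaling the detected points to the reference size and aligning both
    arrangements at the midpoint between the eyes.
    """
    ref_eyes = math.dist(reference["left_eye"], reference["right_eye"])
    if ref_eyes == 0:
        raise ValueError("reference eyes coincide")
    scale = math.dist(points["left_eye"], points["right_eye"]) / ref_eyes
    if scale == 0:
        raise ValueError("detected eyes coincide")
    pmx, pmy = _eye_midpoint(points)
    rmx, rmy = _eye_midpoint(reference)
    offsets: Dict[str, Point] = {}
    for name, (x, y) in points.items():
        nx, ny = (x - pmx) / scale, (y - pmy) / scale
        rx, ry = reference[name][0] - rmx, reference[name][1] - rmy
        offsets[name] = (nx - rx, ny - ry)
    return scale, offsets


reference = {"left_eye": (-1.0, 1.0), "right_eye": (1.0, 1.0),
             "nose": (0.0, 0.0), "mouth": (0.0, -1.0)}
detected = {"left_eye": (-2.0, 2.0), "right_eye": (2.0, 2.0),
            "nose": (0.0, 0.0), "mouth": (0.0, -3.0)}
print(difference_from_reference(detected, reference))
```

In this example the detected face is twice the reference size and its mouth sits half a unit lower than the reference mouth.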
  • the method of generating the 3D model of the face based on the information of the feature points Q of the eyes, nose, and mouth is not particularly limited. For example, the generation unit 212 may generate the 3D model of the face using a learning model that has undergone machine learning with, as teacher data, pairs of a 3D model of a face and information on the feature points Q of the eyes, nose, and mouth of that 3D model (differences in arrangement and size with respect to the reference).
  • FIG. 18 is a conceptual diagram showing an example of a 3D model of the front part of the face and a 3D model of the head excluding the front part of the face.
  • The generation unit 212 may generate a 3D model of the head by combining the 3D model 181 of the front part of the face, generated based on the feature points Q of the person's eyes, nose, and mouth, with a 3D model 182 of the head including hair, which is stored in advance in the memory 23, for example.
  • At the same time, the generation unit 212 may perform, for example, adjustment of the size of each 3D model, calibration (correction of image distortion due to angle), and morphing (image processing that smooths the joint between the two 3D models).
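  • The size adjustment and joint smoothing mentioned above can be illustrated with a very small sketch. It is not the processing of the specification: matching the models by their width along the x axis, and averaging paired seam vertices as a stand-in for morphing, are simplifying assumptions, as are the function names and the representation of a model as a list of (x, y, z) tuples.

```python
def width(vertices):
    """Extent of a vertex list along the x axis."""
    xs = [v[0] for v in vertices]
    return max(xs) - min(xs)


def scale_to_width(vertices, target_width):
    """Uniformly scale vertices about the origin so their x extent matches."""
    w = width(vertices)
    if w == 0:
        raise ValueError("model has zero width")
    s = target_width / w
    return [(x * s, y * s, z * s) for x, y, z in vertices]


def smooth_seam(face_seam, head_seam, weight=0.5):
    """Blend paired seam vertices of the two models.

    weight=1.0 keeps the face-model position, weight=0.0 keeps the
    head-model position, and values in between interpolate linearly.
    """
    return [
        tuple(weight * f + (1.0 - weight) * h for f, h in zip(fv, hv))
        for fv, hv in zip(face_seam, head_seam)
    ]
```

  • In this sketch the front-face model would first be scaled to the width of the opening in the head model, and the vertices along the joint would then be moved to blended positions so that the two surfaces meet without a visible step.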
  • The generation unit 212 may also combine models of the upper body and other body parts with the head model, or synthesize textures based on the imaging data, as in the first embodiment.
  • The avatar generated by the generation unit 212 in this way may be displayed by the display unit 213, as in the first embodiment.
  • The display unit 213 may use the tracking template described above to detect the orientation of the person's face and make the orientation of the avatar's face follow it. The display unit 213 may also detect predetermined feature points in the person's face and make the avatar's facial expression follow the person's facial expression.
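  • Making the avatar's face follow a detected orientation amounts to rotating the head model by the estimated angles. The sketch below assumes the orientation is given as yaw and pitch in degrees, that the y axis is vertical and the x axis horizontal, and that the head model is a list of (x, y, z) vertices centered on the rotation pivot. None of these conventions come from the specification.

```python
import math


def rotate_head(vertices, yaw_deg, pitch_deg):
    """Rotate head vertices by yaw (about the y axis), then pitch (about the x axis)."""
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    cos_p, sin_p = math.cos(pitch), math.sin(pitch)
    rotated = []
    for x, y, z in vertices:
        # Yaw: rotation about the vertical (y) axis.
        x1 = cos_y * x + sin_y * z
        z1 = -sin_y * x + cos_y * z
        # Pitch: rotation about the horizontal (x) axis.
        y2 = cos_p * y - sin_p * z1
        z2 = sin_p * y + cos_p * z1
        rotated.append((x1, y2, z2))
    return rotated
```

  • Called once per frame with the latest estimated angles, this keeps the avatar's head aligned with the person's face. In practice the angles would usually be smoothed over time to avoid jitter.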
  • In this way, the avatar display device 20 can generate an avatar of a person using the person's image data instead of, or in addition to, the person's depth data. That is, the second acquisition unit 211 can acquire depth data or image data of the person. The same applies to the avatar display devices and avatar generation devices according to the first to fifth embodiments described above.
  • The control blocks of the avatar display device 20 and the mobile terminals 40, 40A, and 40B may be implemented by a logic circuit (hardware) formed on an integrated circuit (IC chip) or the like, or by software using a CPU (Central Processing Unit). In the latter case, the avatar display device and the like may be configured using a computer (electronic computer) as shown in FIG. 19.
  • FIG. 19 is a block diagram illustrating the configuration of a computer 910 that can be used as an avatar display device or the like.
  • The computer 910 includes an arithmetic device 912, a main storage device 913, an auxiliary storage device 914, an input/output interface 915, and a communication interface 916, which are interconnected via a bus 911.
  • The arithmetic device 912, the main storage device 913, and the auxiliary storage device 914 may be, for example, a CPU, a RAM (random access memory), and a solid state drive or hard disk drive, respectively.
  • The input/output interface 915 is connected to an input device 920, with which the user inputs various information to the computer 910, and an output device 930, with which the computer 910 outputs various information to the user.
  • The input device 920 and the output device 930 may be built into the computer 910 or may be connected to it externally.
  • The input device 920 may be, for example, a button, keyboard, mouse, or touch sensor.
  • The output device 930 may be, for example, a lamp, display, printer, or speaker.
  • A device that combines the functions of the input device 920 and the output device 930, such as a touch panel in which a touch sensor and a display are integrated, may also be used.
  • The communication interface 916 is an interface through which the computer 910 communicates with external devices.
  • The auxiliary storage device 914 stores an information processing program for operating the computer 910 as an avatar display device or the like.
  • The arithmetic device 912 loads the information processing program stored in the auxiliary storage device 914 into the main storage device 913 and executes the instructions included in the program, thereby causing the computer 910 to function as each unit of the avatar display device or the like.
  • The recording medium that the auxiliary storage device 914 uses to record information such as the information processing program may be any computer-readable "non-transitory tangible medium", for example a tape, disk, card, semiconductor memory, or programmable logic circuit.
  • The computer 910 may also be made to function using a program recorded in a recording medium external to the computer 910, or a program supplied to the computer 910 via any transmission medium (communication network, broadcast wave, etc.).
  • The present invention can also be implemented in the form of a data signal embedded in a carrier wave, in which the program is embodied by electronic transmission.
  • An avatar display device comprising: a generation unit; and a display unit that displays the avatar following the movement of the person in an image display area of an information communication device.
  • An avatar generation device comprising: a generation unit; and an output unit that outputs the generated avatar.
  • At least one processor comprising an acquisition process for acquiring imaging data of a person, a second acquisition process for acquiring depth data or image data of the person from the imaging data, and a 3D model based on the depth data or image data.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Software Systems (AREA)
  • Architecture (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Geometry (AREA)
  • Processing Or Creating Images (AREA)
  • Image Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
PCT/JP2021/047876 2021-12-23 2021-12-23 アバター表示装置、アバター生成装置及びプログラム Ceased WO2023119557A1 (ja)

Priority Applications (9)

Application Number Priority Date Filing Date Title
US17/764,149 US11887234B2 (en) 2021-12-23 2021-12-23 Avatar display device, avatar generating device, and program
CA3229535A CA3229535A1 (en) 2021-12-23 2021-12-23 Avatar display device, avatar generation device, and program
EP21968992.4A EP4455998A4 (en) 2021-12-23 2021-12-23 AVATAR DISPLAY DEVICE, AVATAR GENERATION DEVICE AND PROGRAM
AU2021480445A AU2021480445B2 (en) 2021-12-23 2021-12-23 Avatar display device, avatar generation device, and program
PCT/JP2021/047876 WO2023119557A1 (ja) 2021-12-23 2021-12-23 アバター表示装置、アバター生成装置及びプログラム
JP2022517911A JP7200439B1 (ja) 2021-12-23 2021-12-23 アバター表示装置、アバター生成装置及びプログラム
KR1020227009844A KR102728245B1 (ko) 2021-12-23 2021-12-23 아바타 표시 장치, 아바타 생성 장치 및 프로그램
CN202180005532.2A CN116648729A (zh) 2021-12-23 2021-12-23 头像显示装置、头像生成装置以及程序
JP2022182125A JP7504968B2 (ja) 2021-12-23 2022-11-14 アバター表示装置、アバター生成装置及びプログラム

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/047876 WO2023119557A1 (ja) 2021-12-23 2021-12-23 アバター表示装置、アバター生成装置及びプログラム

Publications (1)

Publication Number Publication Date
WO2023119557A1 (ja) 2023-06-29

Family

ID=84797002

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/047876 Ceased WO2023119557A1 (ja) 2021-12-23 2021-12-23 アバター表示装置、アバター生成装置及びプログラム

Country Status (8)

Country Link
US (1) US11887234B2
EP (1) EP4455998A4
JP (2) JP7200439B1
KR (1) KR102728245B1
CN (1) CN116648729A
AU (1) AU2021480445B2
CA (1) CA3229535A1
WO (1) WO2023119557A1

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7235153B2 (ja) * 2017-12-29 2023-03-08 株式会社三洋物産 遊技機
JP7235154B2 (ja) * 2018-02-15 2023-03-08 株式会社三洋物産 遊技機
JP7231076B2 (ja) * 2018-03-08 2023-03-01 株式会社三洋物産 遊技機
JP2020130466A (ja) * 2019-02-15 2020-08-31 株式会社三洋物産 遊技機
JP7234741B2 (ja) * 2019-03-28 2023-03-08 株式会社三洋物産 遊技機
JP7234740B2 (ja) * 2019-03-28 2023-03-08 株式会社三洋物産 遊技機
JP7234761B2 (ja) * 2019-04-11 2023-03-08 株式会社三洋物産 遊技機
JP7234760B2 (ja) * 2019-04-11 2023-03-08 株式会社三洋物産 遊技機
JP2023063369A (ja) * 2022-01-07 2023-05-09 株式会社三洋物産 遊技機
JP2023053387A (ja) * 2022-02-04 2023-04-12 株式会社三洋物産 遊技機
JP2023060270A (ja) * 2022-04-01 2023-04-27 株式会社三洋物産 遊技機
JP2023060269A (ja) * 2022-04-01 2023-04-27 株式会社三洋物産 遊技機
JP2023105195A (ja) * 2022-05-19 2023-07-28 株式会社三洋物産 遊技機
JP2023105194A (ja) * 2022-05-19 2023-07-28 株式会社三洋物産 遊技機
US12192257B2 (en) * 2022-05-25 2025-01-07 Microsoft Technology Licensing, Llc 2D and 3D transitions for renderings of users participating in communication sessions
US12374054B2 (en) 2022-05-27 2025-07-29 Microsoft Technology Licensing, Llc Automation of audio and viewing perspectives for bringing focus to relevant activity of a communication session
US12477016B2 (en) 2022-05-27 2025-11-18 Microsoft Technology Licensing, Llc Automation of visual indicators for distinguishing active speakers of users displayed as three-dimensional representations
JP2024008596A (ja) * 2022-07-08 2024-01-19 キヤノン株式会社 画像処理装置、制御方法、プログラム及び画像処理システム
US12608917B2 (en) * 2023-02-01 2026-04-21 Htc Corporation Face tracking system and method
JP2025001590A (ja) * 2023-06-20 2025-01-08 ソフトバンクグループ株式会社 情報処理システム及び情報処理方法
US20250124662A1 (en) * 2023-10-17 2025-04-17 Kyndryl, Inc. Preventing harassment on metaverse environments
JP2025076885A (ja) * 2023-11-02 2025-05-16 株式会社アシックス 足形状モデル提示装置、足形状モデル提示方法およびプログラム
JP7717211B1 (ja) * 2024-03-14 2025-08-01 ソフトバンク株式会社 画像処理装置、プログラム、及び画像処理方法
JP7634919B1 (ja) 2024-05-15 2025-02-25 株式会社Berry 頭部3次元モデル生成システム、プログラム及び頭部3次元モデル生成方法
WO2025249460A1 (ja) * 2024-05-28 2025-12-04 大日本印刷株式会社 プログラム、情報処理方法、情報処理装置及び製造方法
WO2026043226A1 (ko) * 2024-08-21 2026-02-26 삼성전자주식회사 아바타를 표시하기 위한 전자 장치, 방법, 및 비-일시적 컴퓨터 판독 가능 저장 매체

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009523288A (ja) 2006-01-10 2009-06-18 ソニー株式会社 顔メッシュを使用して顔アニメーションを作成する技術
JP2015531098A (ja) * 2012-06-21 2015-10-29 マイクロソフト コーポレーション デプスカメラを使用するアバター構築
JP2016500954A (ja) * 2012-10-10 2016-01-14 マイクロソフト コーポレーション 制御された三次元通信エンドポイント

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9646340B2 (en) 2010-04-01 2017-05-09 Microsoft Technology Licensing, Llc Avatar-based virtual dressing room
AU2017335736B2 (en) * 2016-09-28 2022-08-11 Magic Leap, Inc. Face model capture by a wearable device
KR101866407B1 (ko) 2017-03-15 2018-06-12 주식회사 한글과컴퓨터 아바타 생성 시스템 및 이를 이용한 아바타 생성 방법
CN110490093B (zh) * 2017-05-16 2020-10-16 苹果公司 表情符号录制和发送
US10467793B2 (en) * 2018-02-08 2019-11-05 King.Com Ltd. Computer implemented method and device
US11218668B2 (en) * 2019-05-09 2022-01-04 Present Communications, Inc. Video conferencing method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4455998A4

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7717879B1 (ja) * 2024-03-14 2025-08-04 ソフトバンク株式会社 画像処理装置、プログラム、及び画像処理方法
WO2025192687A1 (ja) * 2024-03-14 2025-09-18 ソフトバンク株式会社 画像処理装置、プログラム、及び画像処理方法

Also Published As

Publication number Publication date
EP4455998A4 (en) 2025-10-22
EP4455998A1 (en) 2024-10-30
US20230206531A1 (en) 2023-06-29
CN116648729A (zh) 2023-08-25
JPWO2023119557A1 2023-06-29
JP7200439B1 (ja) 2023-01-06
AU2021480445A1 (en) 2024-03-07
JP2023094549A (ja) 2023-07-05
KR102728245B1 (ko) 2024-11-07
JP7504968B2 (ja) 2024-06-24
KR20230098089A (ko) 2023-07-03
AU2021480445B2 (en) 2025-04-10
US11887234B2 (en) 2024-01-30
CA3229535A1 (en) 2023-06-29

Similar Documents

Publication Publication Date Title
JP7200439B1 (ja) アバター表示装置、アバター生成装置及びプログラム
US12282594B2 (en) Presenting avatars in three-dimensional environments
US11736756B2 (en) Producing realistic body movement using body images
TWI650675B (zh) 群組視頻會話的方法及系統、終端、虛擬現實設備及網路設備
US10527846B2 (en) Image processing for head mounted display devices
US20210358227A1 (en) Updating 3d models of persons
US11783524B2 (en) Producing realistic talking face with expression using images text and voice
KR101190686B1 (ko) 화상 처리 장치, 화상 처리 방법 및 컴퓨터 판독가능한 기록 매체
TW202305551A (zh) 用於人工實境之全像通話
CN114219878A (zh) 虚拟角色的动画生成方法及装置、存储介质、终端
JP2019145108A (ja) 顔に対応する3次元アバターを用いて顔の動きが反映された3dアバターを含むイメージを生成する電子装置
CN114821675A (zh) 对象的处理方法、系统和处理器
CN115049016A (zh) 基于情绪识别的模型驱动方法及设备
Malleson et al. Rapid one-shot acquisition of dynamic VR avatars
JPH10240908A (ja) 映像合成方法
US20240404160A1 (en) Method and System for Generating Digital Avatars
JP5894505B2 (ja) 画像コミュニケーションシステム、画像生成装置及びプログラム
KR20200134623A (ko) 3차원 가상 캐릭터의 표정모사방법 및 표정모사장치
CN104715505A (zh) 三维头像产生系统及其装置、产生方法
KR20250109641A (ko) Ai 및 딥페이크 기술을 이용한 풀트래킹 기반 맞춤형 굿즈 제작 시스템 및 방법
JP2026047923A (ja) システム
WO2024163006A1 (en) Methods and systems for constructing facial images using partial images
JP2026510388A (ja) 人型キャラクタのパーソナライズされた表現およびアニメーション
CN121680620A (zh) 一种基于随身设备的虚拟场景多人实时交互系统
CN121767524A (zh) 用于用户表示的高斯溅射体

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase. Ref document number: 2022517911; Country of ref document: JP
WWE Wipo information: entry into national phase. Ref document number: 202180005532.2; Country of ref document: CN
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 21968992; Country of ref document: EP; Kind code of ref document: A1
WWE Wipo information: entry into national phase. Ref document number: 2021480445; Country of ref document: AU. Ref document number: 3229535; Country of ref document: CA. Ref document number: AU2021480445; Country of ref document: AU
WWE Wipo information: entry into national phase. Ref document number: 808436; Country of ref document: NZ
ENP Entry into the national phase. Ref document number: 2021480445; Country of ref document: AU; Date of ref document: 20211223; Kind code of ref document: A
WWE Wipo information: entry into national phase. Ref document number: 11202401081W; Country of ref document: SG
NENP Non-entry into the national phase. Ref country code: DE
ENP Entry into the national phase. Ref document number: 2021968992; Country of ref document: EP; Effective date: 20240723