WO2021256319A1

WO2021256319A1 - Information processing device, information processing method, and recording medium

Info

Publication number: WO2021256319A1
Application number: PCT/JP2021/021615
Authority: WO
Inventors: 泰成池田
Original assignee: ソニーグループ株式会社
Priority date: 2020-06-15
Filing date: 2021-06-07
Publication date: 2021-12-23

Abstract

An information processing device (10) is provided with: a deletion unit (124b) that deletes, in 3D face model data, a range having a predetermined offset amount from the side corresponding to a 2D image including the face of a subject in a direction corresponding to the photographing direction of the 2D image; and a hair search unit (125) that determines whether a part of the 3D face model data that is the remaining part of the 3D face model data in which the range has been deleted by the deletion unit (124b) corresponds to the hairstyle of the subject included in the 2D image, wherein, when it is determined that the part corresponds to the hairstyle, the hair search unit (125) selects a 3D hair associated with the 3D face model data from a 3D hair database (111) (corresponding to an example of a "database").

Description

Information processing device, information processing method, and recording medium

This disclosure relates to an information processing device, an information processing method, and a recording medium.

A technique for generating "3D (three dimensions) hair" by selecting a plurality of hairstyles similar to the hairstyle of the subject from a database and combining them is known. The 3D hair is represented, for example, as a 3D object representing the hair, defined by a set of line segments that continuously connect points in virtual space.

In the technique disclosed in Non-Patent Document 1, a 2D (two dimensions) image of a subject for which 3D hair is desired is first input to the system, and a stroke representing the flow of the subject's hair is accepted for the 2D image.

Then, when a stroke is input, the system selects, from a database in which a plurality of 3D hairs are registered in advance, 3D hair that includes a "3D hair strand" whose shape is similar to the input stroke. A 3D hair strand is data corresponding to an individual hair constituting the 3D hair. The system projects the 3D hair strands onto the 2D image plane on which the stroke was input, and determines similarity by comparing the distances between the vertices constituting the 3D hair strand and the vertices constituting the stroke.

However, the above-mentioned conventional technique has a problem in that the similarity of the shape of the entire hair cannot be taken into consideration, because what is compared with the stroke for similarity is the 3D hair strand corresponding to each individual hair. There is also a problem in that 3D hair strands in portions invisible from the shooting direction of the subject are also included in the comparison.

Therefore, in the present disclosure, we propose an information processing device, an information processing method, and a recording medium capable of improving the accuracy of the selection process of 3D hair and improving the quality of the finally generated 3D hair.

In order to solve the above problems, the information processing device according to the present disclosure includes: a deletion unit that deletes, in 3D face model data, a range having a predetermined offset amount from the side corresponding to a 2D image including the face of a subject, in the direction corresponding to the shooting direction of the 2D image; and a hair search unit that determines whether a part of the 3D face model data remaining after the range has been deleted by the deletion unit corresponds to the hairstyle of the subject included in the 2D image and, when it is determined to correspond, selects the 3D hair associated with the 3D face model data from a database.

FIG. 1 is a schematic explanatory diagram of the information processing method according to the embodiment of the present disclosure.
FIG. 2 is a schematic block diagram of the information processing apparatus according to the comparative example of the embodiment of the present disclosure.
FIG. 3 is a flowchart showing the processing procedure of the information processing apparatus according to the comparative example of the embodiment of the present disclosure.
FIG. 4 is a block diagram showing a configuration example of the information processing apparatus according to the embodiment of the present disclosure.
FIG. 5 is an explanatory diagram (No. 1) of a stroke.
FIG. 6 is an explanatory diagram (No. 2) of a stroke.
FIG. 7 is a diagram showing an example of the silhouette image M_H.
FIG. 8 is a diagram showing an example of 3D face model data.
FIG. 9 is a diagram showing an example of the silhouette image M_S.
FIG. 10 is an explanatory diagram (No. 1) of the cutting process executed by the cutting unit.
FIG. 11 is an explanatory diagram (No. 2) of the cutting process executed by the cutting unit.
FIG. 12 is an explanatory diagram (No. 3) of the cutting process executed by the cutting unit.
FIG. 13 is an explanatory diagram of the calculation process executed by the calculation unit.
FIG. 14 is a diagram showing an example of a depth image.
FIG. 15 is an explanatory diagram (No. 1) of the deletion process executed by the deletion unit.
FIG. 16 is an explanatory diagram (No. 2) of the deletion process executed by the deletion unit.
FIG. 17 is an explanatory diagram (No. 3) of the deletion process executed by the deletion unit.
FIG. 18 is an explanatory diagram (No. 4) of the deletion process executed by the deletion unit.
FIG. 19 is an explanatory diagram (No. 5) of the deletion process executed by the deletion unit.
FIG. 20 is a diagram showing an example of output to the output unit.
FIG. 21 is a flowchart showing the processing procedure of the information processing apparatus according to the embodiment of the present disclosure.
FIG. 22 is an explanatory diagram of the third modification.
FIG. 23 is a hardware block diagram showing an example of a computer that realizes the functions of the information processing apparatus.

Hereinafter, embodiments of the present disclosure will be described in detail with reference to the drawings. In each of the following embodiments, the same parts are designated by the same reference numerals, so that overlapping description will be omitted.

In addition, the present disclosure will be described according to the order of items shown below.
1. Overview
1-1. Outline of comparative example of this embodiment
1-2. Problems in the comparative example of this embodiment
1-3. Outline of this embodiment
2. Information processing device configuration
3. Information processing device processing procedure
4. Modifications
4-1. First modification
4-2. Second modification
4-3. Third modification
4-4. Other modifications
5. Hardware configuration
6. Conclusion

<< 1. Overview >>
FIG. 1 is a schematic explanatory diagram of an information processing method according to an embodiment of the present disclosure. Further, FIG. 2 is a schematic block diagram of an information processing apparatus according to a comparative example of the embodiment of the present disclosure. Further, FIG. 3 is a flowchart showing a processing procedure of the information processing apparatus according to the comparative example of the embodiment of the present disclosure. Note that FIG. 1 shows an outline of both the comparative example and the present embodiment.

As shown in FIG. 1, the information processing method according to the comparative example and the present embodiment forms part of a system that generates 3D hair corresponding to the hairstyle of the subject by selecting, from the 3D hair DB (Database) 111 in which a plurality of 3D hairs are registered in advance, a plurality of hairstyles similar to the hairstyle of the subject and combining them.

The 3D hair DB 111 stores a large number of 3D head models fitted with 3D hair, with their faces aligned. The 3D hair data is collected from, for example, data shared by online communities related to games, and is converted to a unified format.

Actual hair is an aggregate of elongated fibers, and it is difficult to capture each hair with a camera. In addition, most of the hair is occluded by other hairs, so it is difficult to estimate the internal structure. Hair is therefore known as an object that is difficult to generate automatically or semi-automatically as a 3D object in virtual space.

For this reason, in the process of producing movies and games that use high-quality 3DCG (three-dimensional computer graphics), specialized artists often produce 3D hair by hand. That is, the work of producing 3D hair is a work that requires a high degree of specialized knowledge and skill, and is not a work that anyone can easily perform.

On the other hand, in recent years, owing to the higher performance of PCs (Personal Computers), tablet PCs, smartphones, and the like, and the spread of SNS (Social Networking Service), general users who do not have specialized knowledge and skills have more and more opportunities to create and disseminate, for example, CG (computer graphics) content on their own.

Therefore, in the comparative example and the present embodiment, it is assumed that such a general user creates 3D hair simply by drawing on a 2D image using simple equipment such as the above-mentioned PC, tablet PC, or smartphone.

<1-1. Outline of comparative example of this embodiment>
The outline of the comparative example will be described. As shown in FIG. 1, in the information processing method according to the comparative example, a 2D image of a subject for which 3D hair is desired is first input to the system as an "input image", and the user then performs "stroke input", drawing on the input image strokes that roughly represent the flow of the subject's hair.

Then, in the information processing method according to the comparative example, once strokes are input, a "hair search process" is executed that selects, from the 3D hair DB 111, 3D hair including a 3D hair strand whose shape is similar to an input stroke.

Specifically, as shown in FIG. 2, the information processing apparatus to which the information processing method according to the comparative example is applied has an image acquisition unit 121, a stroke acquisition unit 122, and a hair search unit 125. The image acquisition unit 121 acquires a 2D image of a subject for which the user wants to make 3D hair, which is taken by a camera (for example, a monocular RGB camera), as an input image.

The stroke acquisition unit 122 acquires a stroke manually drawn by the user via the input unit (for example, a mouse) with respect to the input image acquired by the image acquisition unit 121. For example, several to 10 strokes are input, but the number is not limited. In the hair search process in the subsequent stage, the same number of 3D hair candidates as the number of strokes acquired by the stroke acquisition unit 122 is selected from the 3D hair DB 111.

Based on the strokes acquired by the stroke acquisition unit 122, the hair search unit 125 searches the 3D hair DB 111 and selects, for each stroke, 3D hair including a 3D hair strand of similar shape.

Here, the hair search unit 125 projects the 3D hair strands onto the 2D image plane in order to compare the stroke drawn on the 2D image with the shape of each 3D hair strand. Projection here refers to the process of moving all the vertices of the 3D hair strands belonging to a 3D hair onto the 2D image plane, on the assumption that the 3D hair arranged in the virtual space is photographed by a virtual camera arranged in that virtual space (hereinafter referred to as the "virtual camera" as appropriate).

At this time, the hair search unit 125 makes the positional relationship between the virtual camera and the 3D hair correspond to the positional relationship between the camera and the subject in the real space where the input image was taken. This makes it possible to compare, on the 2D image plane, the 3D hair strands, which originally exist in the virtual space, with the strokes input on the input image taken by the camera in the real space.
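
As a minimal sketch of the projection described above, the following Python code moves 3D vertices onto a 2D image plane. It is illustrative only: the pinhole camera model, the focal length `focal`, the principal point `(cx, cy)`, and the function name `project_vertices` are assumptions that do not appear in this disclosure, and the vertices are assumed to be already expressed in the virtual camera's coordinate system.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Vec2 = Tuple[float, float]


def project_vertices(vertices: List[Vec3], focal: float,
                     cx: float, cy: float) -> List[Vec2]:
    """Project camera-space vertices onto the 2D image plane.

    Assumes a pinhole model (not specified in the source text).
    """
    projected = []
    for x, y, z in vertices:
        if z <= 0.0:
            # The vertex lies on or behind the camera plane and is skipped.
            continue
        # Perspective division: after this step the depth z is no longer kept.
        projected.append((focal * x / z + cx, focal * y / z + cy))
    return projected


# Hypothetical vertices of one hair strand in camera coordinates.
strand = [(0.0, 0.0, 2.0), (0.1, -0.2, 2.0), (0.2, -0.4, 4.0)]
print(project_vertices(strand, focal=100.0, cx=50.0, cy=50.0))
```

In this example the last two vertices land on the same 2D point even though their depths differ, which illustrates how depth information is lost by projection.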

Then, the hair search unit 125 determines the similarity between a 3D hair strand projected on the 2D image plane and a stroke by comparing the distances between the vertices constituting each of them. A stroke processed by a computer is generally defined as a set of vertices connected in sequence, so the distance between the vertices of the stroke and the vertices of the 3D hair strand on the 2D image plane can be calculated.

That is, the hair search unit 125 searches for the projected 3D hair strand having the shortest distance from each stroke drawn on the 2D image. By this process, as many 3D hair strands as the number of strokes input by the user are selected as similar strands. Then, the hair search unit 125 selects the 3D hairs, equal in number to the strokes, that contain each of the selected 3D hair strands.

FIG. 3 shows in more detail the processing procedure of the information processing apparatus according to the comparative example. Note that FIG. 3 shows the processing procedure of the hair search process for one stroke S(i) by the hair search unit 125.

Here, i is the index number of the stroke S. j is the index number of a 3D hair stored in the 3D hair DB 111. j_max is the number of 3D hairs stored in the 3D hair DB 111. d_c is a variable to which the distance between the stroke S(i) and a 3D hair strand of the 3D hair (j) is assigned. d is a variable to which the distance between the stroke S(i) and the 3D hair (j) is assigned. d_max is the maximum value that can be represented in the data format used to store the value of d in the processing system of the information processing apparatus according to the comparative example.

As shown in FIG. 3, the hair search unit 125 executes the loop process shown in step S11 for each stroke S(i). In this loop process, j is initialized to 0 and d is initialized to d_max, and the loop body is executed repeatedly, incrementing j, while j < j_max holds.

In such a loop process, the hair search unit 125 acquires the 3D hair (j) from the 3D hair DB 111 (step S111). Then, the hair search unit 125 projects the 3D hair (j) onto the 2D image plane (step S112).

Then, the hair search unit 125 calculates the distance d_c between each 3D hair strand of the 3D hair (j) and the stroke S(i) (step S113). The hair search unit 125 extracts the smallest distance d_c (step S114) and determines whether that distance d_c is smaller than the distance d (step S115).

If it is smaller (step S115, Yes), the hair search unit 125 updates the distance d with the distance d_c and records the value of j (step S116); then, while the loop of step S11 continues, the process is repeated from step S111. If it is not smaller (step S115, No), the process is likewise repeated from step S111 while the loop of step S11 continues.

Then, after step S11 has been executed for all the strokes S(i), the 3D hair (j) having the smallest distance to each stroke S(i), that is, the greatest similarity, is selected based on the value of j recorded for that stroke.
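
The loop of FIG. 3 can be sketched as follows for a single stroke S(i). This is only an illustrative sketch under stated assumptions: every 3D hair in the database is assumed to have been projected onto the 2D image plane already (step S112), and the distance used here, the mean distance from each stroke vertex to its nearest strand vertex, is an assumed metric, because the text states only that the distances of the vertices are compared. The names `strand_distance` and `search_hair` are hypothetical.

```python
import math
from typing import List, Tuple

Vec2 = Tuple[float, float]
Strand = List[Vec2]   # one projected individual hair (strand), as 2D vertices
Hair = List[Strand]   # one projected 3D hair, as a list of strands


def strand_distance(stroke: Strand, strand: Strand) -> float:
    """Mean distance from each stroke vertex to its nearest strand vertex.

    This metric is an assumption made for illustration.
    """
    total = 0.0
    for sx, sy in stroke:
        total += min(math.hypot(sx - px, sy - py) for px, py in strand)
    return total / len(stroke)


def search_hair(stroke: Strand, hair_db: List[Hair]) -> int:
    """Return the index j of the hair containing the strand nearest the stroke."""
    d = math.inf      # plays the role of d_max
    best_j = -1       # stays -1 if the database is empty
    for j, hair in enumerate(hair_db):                        # loop of step S11
        d_c = min(strand_distance(stroke, s) for s in hair)   # steps S113, S114
        if d_c < d:                                           # step S115
            d, best_j = d_c, j                                # step S116
    return best_j


stroke = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
hair_db = [
    [[(5.0, 5.0), (6.0, 6.0)]],
    [[(0.0, 0.1), (1.0, 1.1), (2.0, 2.1)], [(9.0, 9.0)]],
]
print(search_hair(stroke, hair_db))
```

Because only the single nearest strand decides the result, a hair can be selected even when the rest of its strands look nothing like the subject's hairstyle.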

<1-2. Problems in the comparative example of this embodiment>
The information processing method according to this comparative example has the following problems (a) and (b). The first problem (a) is that, because what is compared with the stroke for similarity is the 3D hair strand corresponding to each individual hair of the 3D hair, the similarity of the shape of the entire 3D hair cannot be taken into consideration.

In the information processing method according to the comparative example, when selecting 3D hair having a shape close to the stroke input by the user, the vertices of the stroke and of each 3D hair strand are compared in order. However, this process is specialized for selecting the 3D hair strand whose shape is closest to the stroke, and does not consider the overall shape of the 3D hair that is finally selected.

As a result, 3D hair may be selected that closely resembles the stroke in only one part but has a completely different overall shape. In such a case, the shape of the final 3D hair may differ significantly from the hairstyle of the subject included in the input image.

The second problem (b) is that 3D hair strands in a part that is inherently invisible from the shooting direction of the subject (hereinafter referred to as the "invisible part" as appropriate) can also be compared with the stroke.

In the information processing method according to the comparative example, the depth information of the 3D hair is lost in the projection process described above. Depth information is position information in the direction from front to back as viewed from the virtual camera in the virtual space. When this depth information is lost, it becomes difficult for the computer to grasp the front-to-back relationship of objects.

However, in the comparative example and the present embodiment, in which a user without a high degree of specialized knowledge or skill generates 3D hair using only operations that can be performed with simple equipment, the projection process is a necessary means of searching 3D hair that exists in the virtual space using 2D information.

For example, a method of inputting strokes directly in the virtual space requires a special device that enables input in the virtual space, such as a head-mounted display that realizes VR (Virtual Reality) or a device that realizes AR (Augmented Reality), as well as the skill to actually input strokes using such a device, and it cannot be said that everyone can use it.

As long as the projection process is performed, the front-to-back relationship between, for example, the bangs and the back hair becomes unclear in the hair search process. As a result, when calculating the similarity with the stroke input by the user, hair that cannot be seen from the front, which is the shooting direction, may be selected as the most similar 3D hair strand. In this case, a problem arises in that, for example, 3D hair whose back hair is short is finally generated even though it should be long.

<1-3. Outline of this embodiment>
Therefore, in the information processing method according to the present embodiment, problem (a) is addressed by comparing the shape of the entire hair using silhouette images. Problem (b) is addressed by using the positional relationship between the virtual camera and the 3D hair in the virtual space to remove the invisible part, that is, the 3D hair strands other than those visible from the virtual camera, from the targets of comparison with the stroke before projection, so that only the visible part is compared with the stroke.

Specifically, as shown in FIG. 1, in the information processing method according to the present embodiment, after the input image is acquired and the strokes are input, the "silhouette determination process" is executed for problem (a) and the "invisible part deletion process" is executed for problem (b).

In the silhouette determination process, similarity comparison using silhouette images is performed (step S1). Here, there are two types of silhouette images, one based on the input image and the other based on the 3D hair DB 111.

As shown in the figure, the image based on the input image is an image in which the hair portion of the subject included in the input image taken by the camera in the real space is painted in a single color and the rest is painted in another color. This is hereinafter referred to as the "silhouette image M_H" or the "hair shape comparison image" as appropriate.

The image based on the 3D hair DB 111 is generated, for every 3D hair included in the 3D hair DB 111, on the assumption that a virtual camera placed in front of the 3D face model data, which is a 3D head model fitted with that 3D hair in the virtual space, has photographed it. As shown in the figure, the portion of the 3D hair in this image is filled with a single color. This is hereinafter referred to as the "silhouette image M_S" as appropriate.

In the silhouette determination process, the similarity of the shape of the entire hair is determined by comparing these silhouette images M_H and M_S. In addition, in the silhouette determination process, the strokes are used to limit the area over which similarity is compared (step S2).

In step S2, the target area of the subsequent processing is limited by applying the cutting process described later to the silhouette images M_H and M_S, using the start and end points of the stroke input by the user. Then, for the limited target area, the calculation process described later computes the degree of overlap between the hair portions of the silhouette images M_H and M_S, and the similarity of the silhouette images M_H and M_S is determined based on that degree of overlap.

Then, based on the result of such determination, 3D hair whose overall hair shape is different from the hair shape of the subject included in the input image is excluded from the search target of the hair search process. As a result, the above-mentioned problem (a) can be solved and the accuracy of selecting 3D hair can be improved.

A specific example of each process in the silhouette determination process will be described later with reference to FIGS. 4 and 6 to 13.

Further, in the invisible part deletion process, the invisible part is deleted using the depth information (step S3). The depth information includes, for example, a depth value, and the depth value represents the distance between the virtual camera arranged in the virtual space and the 3D object arranged in the virtual space. In the present embodiment, for all the 3D hairs included in the 3D hair DB 111, a depth image in which the depth value which is the distance to the 3D hair seen from the virtual camera is recorded is used as the depth information.

In step S3, using the above-mentioned 3D face model data, the depth value obtained when the 3D face model data is viewed from the position of the virtual camera arranged in the virtual space is calculated, and a depth image in which the depth values are recorded (hereinafter referred to as the "depth image M_D" as appropriate) is generated.

Then, the value obtained by adding to the depth value a predetermined offset amount in the depth direction, as viewed from the virtual camera arranged in the virtual space, is used as a threshold value, and the visibility from the virtual camera of each 3D hair strand included in the 3D face model data is judged.

Here, an offset amount is, as is well known, a constant value added to or subtracted from a threshold value or a parameter in order to realize some processing. In the present embodiment, adding an offset amount applied in the shooting direction of the virtual camera arranged in the virtual space (hereinafter referred to as the "offset amount O_d" as appropriate) to the depth value makes it possible to extract only the visible part of the 3D hair.

Then, as a result of the determination of visibility, the 3D hair that is regarded as an invisible part is deleted from the comparison target with the stroke in the hair search process. As a result, the above-mentioned problem (b) can be solved and the selection accuracy of 3D hair can be improved.
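
The threshold test described above can be sketched as follows. This is a simplified illustration, not the procedure of the embodiment: it assumes that each hair vertex has already been assigned a pixel position and a depth as seen from the virtual camera, that the depth image M_D is held as a mapping from pixel to the nearest recorded depth, and that visibility is tested vertex by vertex. The names `visible_vertices` and `offset_od` are hypothetical.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]
Vertex = Tuple[int, int, float]  # (pixel u, pixel v, depth from the virtual camera)


def visible_vertices(vertices: List[Vertex],
                     depth_image: Dict[Pixel, float],
                     offset_od: float) -> List[Vertex]:
    """Keep vertices whose depth is within O_d of the depth-image value."""
    kept = []
    for u, v, depth in vertices:
        nearest = depth_image.get((u, v))
        if nearest is None:
            # No surface recorded at this pixel; skipping is an assumption.
            continue
        threshold = nearest + offset_od  # depth value plus offset amount O_d
        if depth <= threshold:
            kept.append((u, v, depth))
    return kept


depth_image = {(0, 0): 1.0, (1, 0): 1.2}
vertices = [(0, 0, 1.0), (0, 0, 1.3), (0, 0, 2.5), (1, 0, 3.0)]
print(visible_vertices(vertices, depth_image, offset_od=0.5))
```

With these hypothetical values, the two vertices lying far behind the front surface are treated as the invisible part and dropped before any comparison with a stroke.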

A specific example of each process in the invisible portion deletion process will be described later with reference to FIGS. 4 and 14 to 19.

Thus, in the information processing method according to the present embodiment, problem (a) is addressed by comparing the shape of the entire hair using the silhouette images M_H and M_S. Problem (b) is addressed by using the positional relationship between the virtual camera and the 3D hair in the virtual space to remove the invisible part, that is, the 3D hair strands other than those visible from the virtual camera, from the targets of comparison with the stroke before projection, so that only the visible part is compared with the stroke.

Therefore, the information processing method according to the present embodiment solves problems (a) and (b) above, improves the accuracy of the 3D hair selection process, and improves the quality of the finally generated 3D hair.

Hereinafter, a configuration example of the information processing apparatus 10 to which the information processing method according to the above-described embodiment is applied will be described more specifically.

<< 2. Information processing device configuration >>
FIG. 4 is a block diagram showing a configuration example of the information processing apparatus 10 according to the embodiment of the present disclosure. Note that FIG. 4 shows only the components necessary for explaining the features of the present embodiment, and the description of general components is omitted.

In other words, each component shown in FIG. 4 is a functional concept and does not necessarily have to be physically configured as shown in the figure. For example, the specific form of distribution and integration of each block is not limited to the one shown in the figure, and all or part of the blocks may be functionally or physically distributed or integrated in arbitrary units according to various loads and usage conditions.

Further, in the explanation using FIG. 4, the explanation may be simplified or omitted for the components already explained.

As shown in FIG. 4, the information processing apparatus 10 includes a storage unit 11 and a control unit 12. Further, the information processing apparatus 10 includes a camera 3, an input unit 5, and an output unit 7. The information processing device 10 is realized by, for example, the above-mentioned PC, tablet PC, smartphone, or the like.

The camera 3 is realized by, for example, a monocular RGB camera. The input unit 5 is realized by, for example, a mouse. The output unit 7 is realized by, for example, a display monitor.

The input unit 5 may be a pointing device capable of inputting strokes by the user, and may be, for example, a pen type. Further, the input unit 5 and the output unit 7 may be realized by the touch panel display together, and the input unit 5 may accept the input of the stroke by the user's finger.

The storage unit 11 is realized by, for example, a semiconductor memory element such as a RAM (Random Access Memory), a ROM (Read Only Memory), or a flash memory (Flash Memory), or a storage device such as a hard disk or an optical disk. In the example shown in FIG. 4, the storage unit 11 stores the 3D hair DB 111.

The control unit 12 is a controller and is realized by, for example, a CPU (Central Processing Unit), an MPU (Micro Processing Unit), or the like executing various programs stored in the storage unit 11 using the RAM as a work area. The control unit 12 can also be realized by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).

The control unit 12 has an image acquisition unit 121, a stroke acquisition unit 122, a silhouette determination unit 123, an invisible portion deletion unit 124, and a hair search unit 125, and realizes or executes the functions and actions of the information processing described below.

The image acquisition unit 121 acquires a 2D image of the subject for which the user wants to make 3D hair, which is taken by the camera 3, as an input image. The stroke acquisition unit 122 acquires a stroke manually drawn by the user via the input unit 5 with respect to the input image acquired by the image acquisition unit 121.

Here, FIGS. 5 and 6 are explanatory views (No. 1) and (No. 2) of the strokes. As shown in FIG. 5, the strokes are a plurality of lines representing the rough flow of the subject's hair, drawn manually by the user via the input unit 5 on the input image output to the output unit 7.

As shown in FIG. 6, the stroke is defined as a set of line segments connecting each vertex continuously with an arbitrary point on the 2D image plane as a vertex. As shown in the figure, the stroke has a start point and an end point. In the figure, the start point and the end point may be reversed. Such start points and end points are used in the cutting process described later.

Returning to the description of FIG. 4. The silhouette determination unit 123 executes the silhouette determination process (see FIG. 1) according to the above-described embodiment. As shown in FIG. 4, the silhouette determination unit 123 includes a silhouette image generation unit 123a, a cutting unit 123b, a calculation unit 123c, and a determination unit 123d.

The silhouette image generation unit 123a generates the silhouette image M_H described above based on the input image acquired by the image acquisition unit 121. FIG. 7 is a diagram showing an example of the silhouette image M_H. As shown in FIG. 7, the silhouette image generation unit 123a generates, as the silhouette image M_H, an image in which the hair portion of the subject included in the input image taken by the camera 3 in the real space is painted in a single color and the rest is painted in another color.

For example, the silhouette image generation unit 123a estimates the hair portion of the subject in the input image and generates the silhouette image M_H based on a learning model trained, with an algorithm such as semantic segmentation, on teacher data consisting of images in which the hair portion and the rest of the image are painted in different colors.

Returning to the description of FIG. 4. The silhouette image generation unit 123a also generates the silhouette image M_S described above based on the 3D hair DB 111. FIG. 8 is a diagram showing an example of 3D face model data. FIG. 9 is a diagram showing an example of the silhouette image M_S.

As shown in FIG. 8, the 3D face model data corresponds to a 3D head model fitted with 3D hair, in other words, data in which the 3D hair and the 3D head model are displayed together. The silhouette image generation unit 123a generates the silhouette image M_S based on an image assumed to have been taken by a virtual camera placed in front of such 3D face model data.

As shown in FIG. 9, the silhouette image generation unit 123a generates, as the silhouette image M_S, an image assumed to have been taken by the virtual camera placed in front of the 3D face model data, in which at least the 3D hair portion is filled with a single color.

For example, the silhouette image generation unit 123a converts the coordinates of all the 3D face model data included in the 3D hair DB 111 based on the external parameters of the camera 3 at the time the input image was shot, and generates silhouette images M_S in which the hair portion, the skin portion, and the background portion are painted in different colors.

Note that the external parameters of the camera 3 generally hold the position and rotation direction of the camera 3. Once these are known, the position in another coordinate system of each vertex defined in the coordinate system of a specific virtual space can be calculated.
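
The coordinate conversion mentioned above can be illustrated with a short sketch. It assumes one common convention, which the text does not specify: the external parameters are given as a 3x3 world-to-camera rotation matrix and a camera position, and a vertex p is converted as R(p - c). The function names `world_to_camera` and `rotation_about_y` are hypothetical.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = List[List[float]]


def world_to_camera(p: Vec3, rotation: Mat3, cam_pos: Vec3) -> Vec3:
    """Convert a vertex from virtual-space coordinates to camera coordinates.

    Uses the assumed convention p_cam = R (p - c).
    """
    d = [p[i] - cam_pos[i] for i in range(3)]
    x, y, z = (sum(rotation[r][k] * d[k] for k in range(3)) for r in range(3))
    return (x, y, z)


def rotation_about_y(angle_rad: float) -> Mat3:
    """Rotation matrix about the vertical (y) axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


identity = rotation_about_y(0.0)
camera_position = (0.0, 0.0, -2.0)
# A vertex at the origin of the virtual space, seen from a camera 2 units away.
print(world_to_camera((0.0, 0.0, 0.0), identity, camera_position))
```

Applying the same rotation and position to every vertex of the 3D face model data places the model in the coordinate system of the camera that shot the input image, after which it can be rendered as a silhouette.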

Returning to the description of FIG. 4. The cutting unit 123b executes the cutting process according to the present embodiment described above. The cutting unit 123b limits the target area to be processed in the subsequent processing by applying the start and end points of the stroke input by the user to the silhouette images M_H and M_S generated by the silhouette image generation unit 123a.

FIGS. 10 to 12 are explanatory views (No. 1) to (No. 3) of the cutting process executed by the cutting unit 123b. As shown in FIG. 10, the cutting unit 123b removes unnecessary portions of the silhouette images M_H and M_S based on the stroke S(i) input by the user.

First, the cutout unit 123b extracts, from among the vertices constituting the stroke S(i), the leftmost and rightmost vertices on the 2D image plane, that is, the start point and the end point shown in FIG. 6. Then, the cutout unit 123b sets the extracted start point and end point positions as the provisional left end and right end of the region to be compared. After that, as shown in FIG. 10, the cutout unit 123b sets the area bounded by the positions obtained by applying a predetermined offset amount O_w to these left and right ends as the cutout target area A_c.

For example, the cutout unit 123b sets the offset amount O_w to 5% of the lateral length defined by the start point and the end point. The offset amount O_w is a variation for practical processing and is not essential to this embodiment. It is also distinct from the offset amount O_d applied in the depth direction as viewed from the virtual camera arranged in the virtual space described above.
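A minimal sketch of how the cutout target area could be derived from a stroke is shown below. It assumes that the stroke is a list of (x, y) pixel coordinates, that the offset widens the range outward on both sides, and that a silhouette is a 2D list of pixel values; the function names `cutout_range` and `crop_columns` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cutout_range(stroke: Sequence[Tuple[float, float]],
                 offset_ratio: float = 0.05) -> Tuple[float, float]:
    """Return the (left, right) x-range of the cutout target area.

    The leftmost and rightmost stroke vertices give the provisional ends;
    the offset O_w is taken as offset_ratio times their horizontal distance.
    Widening outward is an assumption made for this sketch.
    """
    xs = [x for x, _ in stroke]
    left, right = min(xs), max(xs)
    o_w = offset_ratio * (right - left)
    return left - o_w, right + o_w


def crop_columns(mask: List[List[int]], x_min: float, x_max: float) -> List[List[int]]:
    """Keep only the pixel columns inside [x_min, x_max], clamped to the image."""
    width = len(mask[0]) if mask else 0
    lo = max(0, math.floor(x_min))
    hi = min(width - 1, math.ceil(x_max))
    return [row[lo:hi + 1] for row in mask]


# A stroke spanning x = 100..300 gives O_w = 10, so the range is about 90..310.
left, right = cutout_range([(100, 50), (150, 40), (300, 80)])
```

The same range would be applied to both silhouette images so that they are compared over an identical region.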

FIG. 11 shows an example of the result of the cutout unit 123b performing the cutout process on the silhouette image M_H shown in FIG. 7. Similarly, FIG. 12 shows an example of the result of the cutout unit 123b performing the cutout process on the silhouette image M_S shown in FIG. 9.

Returning to the description of FIG. 4, the calculation unit 123c executes the calculation process according to the present embodiment described above. The calculation unit 123c calculates the degree of overlap of the hair portions in the silhouette images M_H and M_S within the comparison target region defined by the cutout unit 123b.

FIG. 13 is an explanatory diagram of the calculation process executed by the calculation unit 123c. The calculation unit 123c compares the silhouette images M_H and M_S cut out on the basis of the cutout target area A_c, and calculates the overlap of the regions representing hair in each (see "overlapping region" in FIG. 13) as the IoU (Intersection over Union).

IoU is a general index for quantifying the overlap between images, and is calculated by the formula IoU = (region A ∩ region B) / {region A + region B - (region A ∩ region B)}. In the present embodiment, for example, region A corresponds to the hair portion of the silhouette image M_H, and region B corresponds to the hair portion of the silhouette image M_S.
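The IoU formula above can be illustrated with a short sketch. It assumes that each cut-out silhouette has been reduced to a binary mask of equal size (1 for a hair pixel, 0 otherwise); the names `hair_iou` and `is_candidate`, and the choice to return 0.0 when neither mask contains hair, are assumptions made for illustration. The value 0.45 is the example threshold given in this description.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]


def hair_iou(mask_a: Mask, mask_b: Mask) -> float:
    """Intersection over Union of two equally sized binary hair masks."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                # Counting "a or b" equals |A| + |B| - |A and B|.
                union += 1
    # Assumption: two empty masks are treated as having no overlap.
    return intersection / union if union else 0.0


def is_candidate(mask_h: Mask, mask_s: Mask, threshold: float = 0.45) -> bool:
    """Keep a 3D hair as a search target candidate when IoU >= threshold."""
    return hair_iou(mask_h, mask_s) >= threshold


# Two masks sharing 2 of 6 hair pixels overlap with IoU = 2 / 6.
mask_h = [[0, 1, 1], [0, 1, 1]]
mask_s = [[1, 1, 0], [1, 1, 0]]
iou = hair_iou(mask_h, mask_s)
```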

Returning to the description of FIG. 4, the determination unit 123d determines the similarity of the silhouette images M_H and M_S based on the IoU calculated by the calculation unit 123c. Specifically, when the calculated IoU is equal to or greater than a predetermined value, the determination unit 123d sets the corresponding 3D hair as a search target candidate in the hair search process of the stroke S(i). Further, when the calculated IoU is less than the predetermined value, the determination unit 123d excludes the corresponding 3D hair from the search target candidates in the hair search process of the stroke S(i).

In the present embodiment, the above-mentioned predetermined value, which is a threshold value indicating whether or not the overall hair shape of the 3D hair differs from the hair shape of the subject included in the input image, is set to, for example, "0.45". This is merely an example and does not limit the value.

Returning to the description of FIG. 4, the invisible portion deletion unit 124 executes the invisible portion deletion process (see FIG. 1) according to the above-described embodiment. As shown in FIG. 4, the invisible portion deletion unit 124 has a depth image generation unit 124a and a deletion unit 124b.

The depth image generation unit 124a uses the above-mentioned 3D face model data to calculate the depth values when the 3D face model data is viewed from the position of the virtual camera arranged in the virtual space, and generates a depth image M_D in which these values are recorded. FIG. 14 is a diagram showing an example of such a depth image M_D.

Specifically, the depth image generation unit 124a uses parameters, including the shooting position and rotation direction, calculated so as to minimize the error between the positions of the eyes, mouth, and face contour of the subject in the input image and the positions of the eyes, mouth, and face contour when the 3D face model data is viewed from the virtual camera arranged in the virtual space.

Further, the depth image generation unit 124a uses these parameters to align the virtual camera so that it is placed in front of the 3D face model data in the virtual space. Assuming that the virtual camera captures an image in the virtual space, the depth image generation unit 124a generates the depth image M_D by reading, as the depth value, the distance to the 3D face model data appearing at each pixel of that image.

The deletion unit 124b deletes the invisible portion using the depth image M_D generated by the depth image generation unit 124a. FIGS. 15 to 19 are explanatory views (No. 1) to (No. 5) of the deletion process executed by the deletion unit 124b.

As already mentioned, the invisible portion deletion process uses the positional relationship between the virtual camera and the 3D hair in the virtual space to remove the invisible portion, that is, the part of the 3D hair other than the individual 3D hair strands seen from the virtual camera, from the targets of comparison with the stroke before projection, so that only the visible portion is compared with the stroke.

This is based on the premise that "the stroke input by the user is drawn on the hair portion that is actually visible in the input image", and excludes from the comparison targets the 3D hair strands that do not appear in the input image.

Using the individual 3D hair strands contained in the 3D hair that remains as a search target candidate as a result of the determination by the determination unit 123d, together with the depth values extracted from the depth image M_D generated by the depth image generation unit 124a, the deletion unit 124b compares positions in the depth direction in the camera coordinate system of the virtual camera 3V (see FIG. 18) arranged in front of the 3D face model data.

At this time, the deletion unit 124b applies the offset amount O_d to the depth values recorded in the depth image M_D, and thereby extracts only the visible portion of the 3D hair that is a search target candidate.

In a typical example, the deletion unit 124b applies the offset amount O_d, in the depth direction as viewed from the virtual camera 3V defined in the virtual space, to the depth value stored in each pixel of the depth image M_D.

FIG. 15 corresponds to an example before the application of the offset amount O_d, and FIG. 16 corresponds to an example after the application of the offset amount O_d. Further, the depth value D shown in both figures corresponds to a threshold value for separating the visible portion and the invisible portion.

In a typical example, the value used as the offset amount O_d corresponds to, for example, 5 cm in real space. This length is based on the assumption that, for a typical hairstyle, the distance in real space between the frontmost bangs of the face and the hairline is within 5 cm.

As shown in FIG. 16, the coordinate information held by the depth image M_D after the offset amount O_d has been applied lies within the virtual space in which the 3D hair is present. Here, as shown in the figure, of the entire 3D face model data, the range defined by the offset amount O_d is defined as "a part of the face model data". The "part of the face model data" corresponds to the visible portion of the 3D hair. Further, the range of the entire 3D face model data excluding "a part of the face model data" is defined as "the range having an offset amount". Further, the direction in which the offset amount O_d is applied is defined as "the direction corresponding to the shooting direction".

Then, the deletion unit 124b separates the visible portion and the invisible portion using the above-mentioned depth value D. In the present embodiment, as shown in FIG. 17, the deletion unit 124b judges the visibility of each individual 3D hair strand ST based on the proportion of the vertices constituting that strand that lie on the virtual camera 3V side of the boundary given by the depth value D.

Specifically, as shown in FIG. 17, when, for example, 50% or more of the vertices constituting one 3D hair strand ST exist in the region C1 on the virtual camera 3V side of the boundary given by the depth value D, the deletion unit 124b determines that the 3D hair strand ST is visible; otherwise, it determines that the strand is invisible. Although shown only schematically in FIG. 17, the region C1 is the entire region on the virtual camera 3V side of the boundary given by the depth value D.

Such processing serves to correctly determine that half or more of the 3D hair strand ST is visible when the depth value D and the 3D hair strand ST are in a positional relationship like the one shown in the figure.

If the condition is not met in this process, deletion is applied to the 3D hair strand ST judged to be invisible, and it is excluded from the targets of comparison with the stroke as the "invisible part" shown in FIG. 18.
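The visibility test described above can be sketched as follows. The sketch rests on several assumptions that are not stated in the disclosure: each strand vertex is given as (u, v, z), where (u, v) is the pixel it projects to and z is its depth in the virtual camera's coordinate system; the threshold is evaluated per pixel as the stored depth plus O_d; every vertex projects onto a pixel holding a valid depth; and depths are expressed in metres, so that 0.05 stands for the 5 cm example. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Vertex = Tuple[int, int, float]  # (pixel column u, pixel row v, camera-space depth z)


def is_strand_visible(strand: Sequence[Vertex],
                      depth_image: Sequence[Sequence[float]],
                      offset_d: float,
                      min_ratio: float = 0.5) -> bool:
    """Judge a strand visible when enough of its vertices lie on the camera side.

    A vertex counts as camera-side when z <= depth_image[v][u] + offset_d.
    """
    if not strand:
        return False
    near = sum(1 for u, v, z in strand if z <= depth_image[v][u] + offset_d)
    return near / len(strand) >= min_ratio


def visible_strands(strands: Sequence[Sequence[Vertex]],
                    depth_image: Sequence[Sequence[float]],
                    offset_d: float,
                    min_ratio: float = 0.5) -> List[Sequence[Vertex]]:
    """Drop strands judged invisible; the rest stay as comparison targets."""
    return [s for s in strands if is_strand_visible(s, depth_image, offset_d, min_ratio)]


# Toy 2x2 depth image with the face surface 1.0 m from the camera.
depth = [[1.0, 1.0], [1.0, 1.0]]
front = [(0, 0, 0.98), (0, 1, 1.02), (1, 1, 1.04), (1, 0, 1.20)]  # 3 of 4 vertices near
back = [(0, 0, 1.02), (0, 1, 1.10), (1, 1, 1.20), (1, 0, 1.30)]   # 1 of 4 vertices near
kept = visible_strands([front, back], depth, offset_d=0.05)
```

Keeping `min_ratio` as a parameter makes it straightforward to try proportions other than the 50% example.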

In FIG. 17, for the sake of clarity, one 3D hair strand ST is drawn in exaggerated form as extending straight forward from the person, but in an actual typical hairstyle the bangs lie along the forehead while extending forward.

In addition, for the 3D hair strands that were not excluded as the invisible part of FIG. 18, the deletion unit 124b further narrows down the strands to be compared with the stroke based on the cutout target area A_c described above.

That is, the deletion unit 124b determines whether each 3D hair strand that has not been excluded fits within the cutout target area A_c in the horizontal direction on the 2D image plane, in other words, in the width direction of the face.

The deletion unit 124b keeps the 3D hair strands that fit within the cutout target area A_c as comparison targets, since they are likely to have a shape similar to the input stroke. Further, the deletion unit 124b applies deletion to the 3D hair strands that do not fit within the cutout target area A_c, and excludes them from comparison with the stroke. FIG. 19 corresponds to an example of the result of the deletion process by the deletion unit 124b up to this point.

Returning to the description of FIG. 4, the hair search unit 125 compares the stroke with individual 3D hair strands, restricted to the visible portion left by the invisible portion deletion unit 124 among the 3D hairs selected as search target candidates by the silhouette determination unit 123, and selects the 3D hair that includes a strand with a similar shape.

The hair search unit 125 compares the distances between the vertices constituting the stroke and the vertices of the individual 3D hair strands constituting the 3D hair, as in the comparative example described above. Specifically, the hair search unit 125 equalizes the number of vertices of the stroke and of the 3D hair strand (for example, to 100 vertices), calculates the Euclidean distance between corresponding vertices in order from the first vertex, and takes the total as the distance d_c between the stroke and the 3D hair strand.

When evaluating the similarity of a given 3D hair to a stroke, the hair search unit 125 calculates the distance d_c for all of its 3D hair strands and takes the smallest of these values as the distance d between the stroke and the 3D hair of interest. The hair search unit 125 performs this for the visible portions of all the 3D hairs that the silhouette determination unit 123 has selected as search target candidates, and the 3D hair with the smallest distance d is selected as the 3D hair having a shape similar to the stroke.
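The distance calculation can be sketched as below. The description only states that the vertex counts are matched (for example, to 100); resampling both polylines at equal arc-length intervals is one possible way to do so and is an assumption of this sketch, as are the function names. Strokes and projected strands are taken to be lists of (x, y) points on the 2D image plane, running in the same direction.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def resample(points: Sequence[Point], n: int = 100) -> List[Point]:
    """Resample a 2D polyline to n points (n >= 2) evenly spaced in arc length."""
    if len(points) == 1:
        return [points[0]] * n
    # Cumulative arc length at each original vertex.
    cum = [0.0]
    for p, q in zip(points, points[1:]):
        cum.append(cum[-1] + math.dist(p, q))
    total = cum[-1]
    if total == 0.0:
        return [points[0]] * n
    out: List[Point] = []
    seg = 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = (target - cum[seg]) / span if span > 0 else 0.0
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def stroke_strand_distance(stroke: Sequence[Point], strand: Sequence[Point], n: int = 100) -> float:
    """d_c: sum of Euclidean distances between corresponding resampled vertices."""
    return sum(math.dist(p, q) for p, q in zip(resample(stroke, n), resample(strand, n)))


def stroke_hair_distance(stroke: Sequence[Point], strands: Sequence[Sequence[Point]], n: int = 100) -> float:
    """d: the smallest d_c over all (visible) strands of one 3D hair."""
    return min((stroke_strand_distance(stroke, s, n) for s in strands), default=math.inf)
```

Because corresponding vertices are paired in order from the first vertex, a strand stored in the opposite direction to the stroke would yield a large distance; whether and how the direction is normalized is not specified here.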

The hair search unit 125 performs such a process a number of times equal to the number of strokes input by the user, so that 3D hair estimated to have a portion having a shape most similar to each stroke is selected. Further, the hair search unit 125 outputs the selection result of the 3D hair for each selected stroke to the output unit 7.

FIG. 20 is a diagram showing a specific example of the selection result output to the output unit 7. For comparison, the upper part of FIG. 20 shows an output example in the above-mentioned comparative example, and the lower part shows an output example in the present embodiment.

As shown in FIG. 20, the difference in the selection results between the comparative example and the present embodiment is reflected in the shape of the bangs and the length of the hair. The subject included in the input image shown in FIG. 1 and the like has a hairstyle that exposes the forehead portion.

If these characteristics are correctly reflected, the selected 3D hair would be expected to tend to expose the forehead, but it can be seen that, in the third from the left of the comparative example, 3D hair that hides the forehead is selected.

On the other hand, in the present embodiment, it can be seen that, owing for example to the effect of the invisible portion deletion process, 3D hair whose bangs are close to those of the subject has been selected, that is, 3D hair that leaves the forehead relatively exposed rather than hiding it as the third from the left in the comparative example does.

Further, although the subject included in the input image shown in FIG. 1 and the like has a short hairstyle, it can be seen that long-haired 3D hair is selected as the first and second from the right in the comparative example.

If such 3D hair is mixed in the selection result, there is a possibility that 3D hair having a shape different from the expected shape will be generated when the selected 3D hair is integrated to generate a new 3D hair. In order to prevent this, it is necessary to improve the accuracy of the selection process and adjust so that 3D hair with unnecessarily long hair is not selected.

Therefore, looking at the selection results of this embodiment, it can be seen that short-haired 3D hair similar to the silhouette of the subject's hair can be selected due to the effect of the silhouette determination process.

In the description so far, the information processing apparatus 10 has no UI (User Interface) other than accepting the input of strokes for the input image and outputting the selection result of the 3D hair considered most similar for each stroke. However, the UI is not limited to this. Such a modification will be described later with reference to FIG. 22.

<< 3. Information processing device processing procedure >>
Next, a processing procedure executed by the information processing apparatus 10 according to the embodiment will be described with reference to FIG. 21. FIG. 21 is a flowchart showing a processing procedure of the information processing apparatus 10 according to the embodiment of the present disclosure.

Note that FIG. 21 corresponds to FIG. 3, and similarly to FIG. 3, the processing procedure for one stroke S (i) is shown.

Further, as in FIG. 3, i is the index number of the stroke S. j is the index number of a 3D hair stored in the 3D hair DB 111. j_max is the number of 3D hairs stored in the 3D hair DB 111. d_c is a variable that holds the distance between the stroke S(i) and an individual 3D hair strand of the 3D hair (j). d is a variable that holds the distance between the stroke S(i) and the 3D hair (j). d_max is the maximum value representable in the data format used to store the value of d in the processing system of the information processing apparatus 10.

As shown in FIG. 21, the control unit 12 executes the loop process shown in step S101 for each stroke S(i). In this loop process, j is initialized to 0 and d is initialized to d_max, and the loop is repeated, incrementing j, while j < j_max holds.

In this loop process, the silhouette image generation unit 123a acquires the 3D hair (j) from the 3D hair DB 111 and generates the silhouette images M_H and M_S from the input image and the 3D hair (j) (step S1011).

The cutout unit 123b cuts out a part (that is, the cutout target area A_c) from the silhouette images M_H and M_S based on the start point and end point of the stroke S(i) (step S1012).

Then, the calculation unit 123c compares the cut-out silhouette images M_H and M_S and calculates the IoU (step S1013). Then, the determination unit 123d determines whether or not the IoU is 0.45 or more (step S1014).

Here, when the IoU is 0.45 or more (step S1014, Yes), the deletion unit 124b deletes the invisible portion by using the depth image M_D and the offset amount O_d (step S1015).

Then, the hair search unit 125 projects the 3D hair (j) onto the 2D image plane with respect to the visible portion left by the deletion unit 124b (step S1016).

Then, the hair search unit 125 calculates the distance d_c between each individual 3D hair strand of the 3D hair (j) and the stroke S(i) (step S1017). The hair search unit 125 then extracts the smallest value of the distance d_c (step S1018) and determines whether or not this distance d_c is smaller than the distance d (step S1019).

Here, if it is smaller (step S1019, Yes), the distance d is updated with the distance d_c and the value of j is recorded (step S1020). Then, while the loop of step S101 continues, the process is repeated from step S1011.

If the condition of step S1014 is not satisfied (step S1014, No), or if the condition of step S1019 is not satisfied (step S1019, No), the process is repeated from step S1011 while the loop of step S101 continues.

Then, after step S101 has been executed for all the strokes S(i), the 3D hair (j) having the smallest distance to each stroke S(i), that is, the most similar 3D hair, is selected based on the value of j recorded for each stroke S(i).
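The procedure of FIG. 21 for a single stroke can be summarized in the following sketch. The three callables passed in (`silhouette_iou`, `visible_projected_strands`, `strand_distance`) are hypothetical placeholders for the processing of steps S1011 to S1017 and are not names from the disclosure; `math.inf` is used here as a stand-in for d_max.

```python
import math
from typing import Any, Callable, Optional, Sequence, Tuple


def select_hair_for_stroke(
    stroke: Any,
    hairs: Sequence[Any],
    silhouette_iou: Callable[[Any, Any], float],
    visible_projected_strands: Callable[[Any], Sequence[Any]],
    strand_distance: Callable[[Any, Any], float],
    iou_threshold: float = 0.45,
) -> Tuple[Optional[int], float]:
    """Return (index j of the most similar 3D hair, its distance d) for one stroke."""
    d = math.inf        # stands in for d_max
    best_j = None
    for j, hair in enumerate(hairs):                      # loop of step S101
        # Steps S1011 to S1014: silhouette comparison and IoU threshold check.
        if silhouette_iou(stroke, hair) < iou_threshold:
            continue
        # Steps S1015 and S1016: keep visible strands and project them to 2D.
        strands = visible_projected_strands(hair)
        if not strands:
            continue
        # Steps S1017 and S1018: smallest strand-to-stroke distance d_c.
        d_c = min(strand_distance(stroke, s) for s in strands)
        # Steps S1019 and S1020: update the best distance and record j.
        if d_c < d:
            d, best_j = d_c, j
    return best_j, d
```

Calling this function once per stroke S(i) and collecting the returned indices corresponds to the final selection step. Skipping a hair with no visible strands is a choice made for this sketch.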

<< 4. Modification example >>
Several modifications of the above-described embodiment can also be given.

<4-1. First variant>
In the above-described embodiment, the calculation using the offset amount O_d when determining the invisible portion of the 3D hair was performed assuming a typical hairstyle. The typical hairstyle referred to here is a hairstyle shaped such that at least the bangs extend forward from the person and lie along the forehead.

When a hairstyle having such a shape is assumed, visible and invisible 3D hair strands can be separated by using the setting of the offset amount O_d described in the above-described embodiment.

However, for a hairstyle to which the typical hairstyle defined here does not apply, for example a swept-back ("all-back") hairstyle, it is expected that the above setting of the offset amount O_d cannot separate visible 3D hair strands from invisible ones.

The reason is that, in a hairstyle such as all-back, all the 3D hair strands growing from the scalp extend backward relative to the person, so that even if the depth image M_D is offset by the offset amount O_d, more than half of each 3D hair strand is highly likely to be determined to be invisible.

In such a case, it is necessary not only to set the offset amount O_d but also to change the ratio used to determine visible or invisible, that is, the proportion of vertices existing in the region C1, which was described as 50% with reference to FIG. 17.

Taking the all-back hairstyle as an example, after the offset by the offset amount O_d has been applied, a 3D hair strand is determined to be visible if, for example, about 10% of its vertices exist in the region C1. When the all-back hairstyle has sufficient height and strokes can be adequately drawn on an input image taken from the front of the subject, the case can be handled by applying the first modification.

If the all-back hairstyle has little height and the strokes need to be drawn from the side, the second modification described below can be used.

<4-2. Second variant>
In the above-described embodiment and the first modification, the camera 3 is installed in front of the subject, but by introducing the method shown below, it also becomes possible to shoot the subject from the side.

For example, when shooting a subject from the left or right, it can be assumed that one eye, the nose, the mouth, half of the chin, an ear, the skin of the cheek, and the like can be acquired as personal features on the input image. Based on this, in the second modification, in the projection process of projecting the 3D hair onto the 2D image plane, the points corresponding to the feature points obtained from the above-mentioned personal features are selected on the 3D face model data, and the two point clouds are aligned, using for example the ICP (Iterative Closest Point) algorithm, so that the distances between corresponding points are minimized. After performing such alignment, the case can be handled by performing the processing from the projection process onward of the above-described embodiment.

Further, as described above, the information processing apparatus 10 according to the above-described embodiment has no UI other than accepting the input of strokes for the input image and outputting the selection result of the 3D hair considered most similar for each stroke; however, the UI is not limited to this. Next, a third modification concerning the UI will be described.

<4-3. Third variant>
FIG. 22 is an explanatory diagram of the third modification. In the third modification, a UI as shown in FIG. 22 is assumed.

Specifically, as shown in FIG. 22, in the UI according to the third modification, for example, the input image and the stroke drawn for the input image are displayed at the top of the display layout.

Then, under the input image, the selection results of 3D hair having strands similar to each stroke are displayed, for example, in descending order of similarity, and a function is provided that allows the user to select the 3D hair to be used for the final 3D hair generation.

For example, the middle part of FIG. 22 shows how the 3D hair surrounded by a thick solid line or a broken line frame is selected by the user's operation via the input unit 5.

Although this embodiment improves the selection accuracy of 3D hair as compared with the comparative example, it does not always automatically select the best candidate. Therefore, by providing the function of selecting the 3D hair to be used by the user in this way, the user can be made to select the best candidate by his / her own judgment and generate the final 3D hair.

In addition, as shown at the bottom of the figure, the integrated result of the selected 3D hair (see the left figure at the bottom) and the image of the finally generated 3D hair (see the right figure at the bottom) may be previewed.

This allows the user to proceed with the work while visually observing the finally generated 3D hair and confirming whether or not the hair shape as the user imagines is generated. That is, usability can be improved.

<4-4. Other variants>
Other modifications can also be given. For example, the camera 3 has been described as a monocular RGB camera, but the camera 3 may be any photographing device capable of acquiring the shape of a subject, and may be another device such as an RGB-D camera.

Further, for example, the output unit 7 has been described taking a display monitor or a touch panel display as an example, but any display device having a function of accepting and displaying video input, such as a television, a projector, or a screen, can be used. Since the modification of the input unit 5 has already been described, the description thereof is omitted here.

Further, among the processes described in the above-described embodiment, all or part of the processes described as being performed automatically can be performed manually, and all or part of the processes described as being performed manually can be performed automatically by a known method. In addition, information including processing procedures, specific names, various data, and parameters shown in the above description and drawings can be arbitrarily changed unless otherwise specified. For example, the various information shown in each figure is not limited to the information shown in the figure.

Further, each component of each device shown in the figures is a functional concept, and does not necessarily have to be physically configured as shown in the figures. That is, the specific form of distribution and integration of each device is not limited to the one shown in the figures, and all or part of them may be functionally or physically distributed or integrated in arbitrary units according to various loads and usage conditions. For example, the image acquisition unit 121 and the stroke acquisition unit 122 shown in FIG. 4 may be integrated. Similarly, for example, the silhouette determination unit 123 and the invisible portion deletion unit 124 shown in FIG. 4 may be integrated.

Further, the above-described embodiments can be appropriately combined in an area where the processing contents do not contradict each other. Further, the order of each step shown in the sequence diagram or the flowchart of the present embodiment can be changed as appropriate.

<< 5. Hardware configuration >>
The information processing apparatus 10 according to the above-described embodiment is realized by, for example, a computer 1000 having a configuration as shown in FIG. 23. FIG. 23 is a hardware configuration diagram showing an example of a computer 1000 that realizes the functions of the information processing apparatus 10. The computer 1000 has a CPU 1100, a RAM 1200, a ROM 1300, an HDD (Hard Disk Drive) 1400, a communication interface 1500, and an input / output interface 1600. Each part of the computer 1000 is connected by a bus 1050.

The CPU 1100 operates based on the program stored in the ROM 1300 or the HDD 1400, and controls each part. For example, the CPU 1100 expands the program stored in the ROM 1300 or the HDD 1400 into the RAM 1200, and executes processing corresponding to various programs.

The ROM 1300 stores a boot program such as a BIOS (Basic Input Output System) executed by the CPU 1100 when the computer 1000 is started, a program depending on the hardware of the computer 1000, and the like.

The HDD 1400 is a computer-readable recording medium that non-temporarily records a program executed by the CPU 1100 and data used by such a program. Specifically, the HDD 1400 is a recording medium for recording an information processing program according to the present disclosure, which is an example of program data 1450.

The communication interface 1500 is an interface for the computer 1000 to connect to an external network 1550 (for example, the Internet). For example, the CPU 1100 receives data from another device or transmits data generated by the CPU 1100 to another device via the communication interface 1500.

The input / output interface 1600 is an interface for connecting the input / output device 1650 and the computer 1000. For example, the CPU 1100 receives data from an input device such as a keyboard or mouse via the input / output interface 1600. Further, the CPU 1100 transmits data to an output device such as a display, a speaker, or a printer via the input / output interface 1600. Further, the input / output interface 1600 may function as a media interface for reading a program or the like recorded on a predetermined recording medium (media). The media is, for example, an optical recording medium such as a DVD (Digital Versatile Disc) or PD (Phase change rewritable Disk), a magneto-optical recording medium such as an MO (Magneto-Optical disk), a tape medium, a magnetic recording medium, or a semiconductor memory.

For example, when the computer 1000 functions as the information processing apparatus 10 according to the embodiment, the CPU 1100 of the computer 1000 realizes the function of the control unit 12 by executing the information processing program loaded on the RAM 1200. Further, the information processing program according to the present disclosure and the data in the storage unit 11 are stored in the HDD 1400. The CPU 1100 reads the program data 1450 from the HDD 1400 and executes the program, but as another example, these programs may be acquired from another device via the external network 1550.

<< 6. Conclusion >>
As described above, according to the embodiment of the present disclosure, the information processing apparatus 10 includes: the deletion unit 124b that deletes, in the 3D face model data, a range having a predetermined offset amount from the side corresponding to a 2D image including the face of a subject, in a direction corresponding to the shooting direction of the 2D image; and the hair search unit 125 that determines whether or not a part of the 3D face model data, namely the part remaining after the range has been deleted by the deletion unit 124b, corresponds to the hairstyle of the subject included in the 2D image and, when it is determined to correspond, selects the 3D hair associated with the 3D face model data from the 3D hair DB 111 (corresponding to an example of the "database"). As a result, the accuracy of the 3D hair selection process can be improved, and the quality of the finally produced 3D hair can be improved.

Although each embodiment of the present disclosure has been described above, the technical scope of the present disclosure is not limited to the above-mentioned embodiments as they are, and various changes can be made without departing from the gist of the present disclosure. In addition, components spanning different embodiments and modifications may be combined as appropriate.

Further, the effects in each embodiment described in the present specification are merely examples and are not limited, and other effects may be obtained.

The present technology can also have the following configurations.
(1)
An information processing device comprising:
a deletion unit that deletes, in 3D face model data, a range having a predetermined offset amount from the side corresponding to a 2D image including the face of a subject, in a direction corresponding to the shooting direction of the 2D image; and
a hair search unit that determines whether or not a part of the 3D face model data, namely the part remaining after the range has been deleted by the deletion unit, corresponds to the hairstyle of the subject included in the 2D image, and that selects 3D hair associated with the 3D face model data from a database when the part is determined to correspond.
(2)
The information processing device according to (1) above, further comprising
an acquisition unit that acquires a stroke representing the hair flow of the subject, the stroke being input to the 2D image by a user,
wherein the hair search unit determines, based on the stroke, whether or not the part of the 3D face model data corresponds to the hairstyle of the subject included in the 2D image.
(3)
The information processing device according to (2) above,
wherein the hair search unit projects the part of the 3D face model data onto the image plane of the 2D image and, on the image plane, compares the distances between the vertices constituting each individual 3D hair strand included in the part of the 3D face model data and the vertices constituting the stroke, thereby determining whether or not the part of the 3D face model data corresponds to the hairstyle of the subject included in the 2D image.
(4)
The information processing device according to (3) above,
wherein a plurality of the strokes are input, and
the hair search unit determines, for each of the strokes, whether or not the part of the 3D face model data corresponds to the hairstyle of the subject included in the 2D image.
(5)
The information processing device according to (3) or (4) above,
wherein the deletion unit deletes the range having the offset amount based on depth information including a depth value representing the distance between a virtual camera placed in a virtual space and a 3D object.
(6)
The information processing device according to (5) above,
wherein the depth information is a depth image in which the depth value is held in each pixel, and
the deletion unit uses, as the depth value, the distance to the 3D face model data held by each pixel of an image assumed to be taken by the virtual camera arranged in the virtual space so that at least the face can be photographed.
(7)
The deletion unit aligns the virtual camera so that it is arranged in front of the 3D face model data in the virtual space, based on parameters calculated so as to minimize the error between the positions of the contours of the eyes, mouth, and face of the subject in the 2D image and the positions of the contours of the eyes, mouth, and face when the 3D face model data is viewed from the virtual camera, and assumes that the image is taken by the virtual camera so aligned.
The information processing device according to (6) above.
(8)
The deletion unit adds the offset amount, in the depth direction as seen from the virtual camera, to the depth value held in each pixel of the depth image.
The information processing device according to (6) or (7) above.
(9)
The part of the 3D face model data corresponds to the visible portion of the 3D hair when the 3D face model data is viewed from the virtual camera, and
the deletion unit deletes the range having the offset amount, which corresponds to the invisible portion, by using the value obtained by adding the offset amount to the depth value as a threshold for separating the visible portion from the invisible portion other than the visible portion.
The information processing device according to (8) above.
(10)
The deletion unit determines that a 3D hair is the visible portion when, of all the vertices constituting that one 3D hair, a predetermined ratio or more of the vertices exist in the region on the virtual camera side of the boundary defined by the threshold.
The information processing device according to (9) above.
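The thresholding described in (8) to (10) can be sketched as follows. This is a minimal illustration under stated assumptions: each vertex is given as a hypothetical `(col, row, depth)` triple already expressed in the pixel coordinates of the depth image, and `min_ratio` stands in for the predetermined ratio. The function names and data layout are not taken from the original text.

```python
def is_visible_hair(hair, depth_image, offset, min_ratio=0.5):
    """Return True when enough vertices lie on the camera side of the threshold.

    hair: list of (col, row, depth) vertices (assumed layout).
    depth_image: 2D list holding the depth of the face model at each pixel.
    offset: offset amount added to each depth value to form the threshold.
    """
    near = 0
    for col, row, depth in hair:
        threshold = depth_image[row][col] + offset
        if depth <= threshold:
            near += 1
    return near / len(hair) >= min_ratio

def delete_invisible(hairs, depth_image, offset, min_ratio=0.5):
    """Keep only the hairs judged visible; the rest are treated as the deleted range."""
    return [h for h in hairs if is_visible_hair(h, depth_image, offset, min_ratio)]
```

Judging visibility per hair rather than per vertex keeps a hair intact even when a few of its vertices fall behind the threshold.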
(11)
The deletion unit uses the offset amount set based on the real-space distance between the frontmost bangs of the face and the hairline.
The information processing device according to any one of (2) to (10) above.
(12)
A silhouette determination unit that determines the similarity between a first silhouette, which is the silhouette of the hair portion of the subject included in the 2D image, and a second silhouette, which is the silhouette of the hair portion of the 3D face model data,
is further provided, and
the silhouette determination unit makes the 3D hair of the 3D face model data a target of selection by the hair search unit when it determines that the first silhouette and the second silhouette are similar, and excludes the 3D hair from selection by the hair search unit when it determines that they are not similar.
The information processing device according to any one of (2) to (11) above.
(13)
The silhouette determination unit calculates the degree of superimposition of the first silhouette and the second silhouette, and determines the similarity between the first silhouette and the second silhouette based on the calculated degree of superimposition.
The information processing device according to (12) above.
(14)
The silhouette determination unit calculates the degree of superimposition as IoU (Intersection over Union), and determines that the first silhouette and the second silhouette are similar when the IoU is equal to or higher than a predetermined value.
The information processing device according to (13) above.
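The IoU test in (14) can be sketched as follows. This is a minimal illustration that assumes both silhouettes are given as binary masks of the same size (2D lists of 0 and 1). The threshold `min_iou` stands in for the predetermined value and its default is an arbitrary placeholder.

```python
def iou(mask_a, mask_b):
    """Intersection over Union of two same-sized binary masks."""
    inter = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                inter += 1
            if a or b:
                union += 1
    # Two empty masks have no union; report 0.0 rather than dividing by zero.
    return inter / union if union else 0.0

def silhouettes_similar(mask_a, mask_b, min_iou=0.5):
    """Judge the silhouettes similar when the IoU reaches the threshold."""
    return iou(mask_a, mask_b) >= min_iou
```

The IoU ranges from 0 (no overlap) to 1 (identical silhouettes), so a single threshold is enough to separate similar from dissimilar candidates.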
(15)
The silhouette determination unit generates a first image showing the first silhouette based on the 2D image, generates a second image showing the second silhouette by coordinate-converting the 3D face model data according to the shooting position and rotation direction at the time the 2D image was shot, and calculates the degree of superimposition based on the first image and the second image.
The information processing device according to (13) or (14) above.
(16)
The silhouette determination unit cuts out, from each of the first image and the second image, a cutout target area defined based on both ends of one stroke, and determines the similarity between the first silhouette and the second silhouette within the cutout target area.
The information processing device according to (15) above.
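One possible reading of the cutout in (16) is sketched below. The original text says only that the cutout target area is defined based on both ends of one stroke; the axis-aligned rectangle around the two end points and the hypothetical `margin` parameter used here are assumptions made purely for illustration.

```python
def cutout_box(stroke, margin, width, height):
    """Rectangle (left, top, right, bottom) around the two end points of a stroke.

    stroke: list of (x, y) integer pixel vertices; only the first and last are used.
    margin: padding in pixels around the end points (assumed parameter).
    The right and bottom bounds are exclusive and clamped to the image size.
    """
    (x0, y0), (x1, y1) = stroke[0], stroke[-1]
    left = max(0, min(x0, x1) - margin)
    top = max(0, min(y0, y1) - margin)
    right = min(width, max(x0, x1) + margin + 1)
    bottom = min(height, max(y0, y1) + margin + 1)
    return left, top, right, bottom

def crop(mask, box):
    """Cut the same area out of a silhouette image stored as a 2D list."""
    left, top, right, bottom = box
    return [row[left:right] for row in mask[top:bottom]]
```

Applying `crop` with the same box to both the first image and the second image yields two regions of equal size that can then be compared.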
(17)
The deletion unit further deletes, from the part of the 3D face model data, any portion that does not correspond to the cutout target area.
The information processing device according to (16) above.
(18)
The hair search unit presents to the user the selection results of the 3D hairs corresponding to each of the strokes in descending order of similarity and, when the user arbitrarily selects 3D hairs from the selection results, presents to the user an image in which the selected 3D hairs are combined.
The information processing device according to any one of (2) to (17) above.
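The ordering in (18) can be sketched as follows. This is a minimal illustration that assumes each candidate is a hypothetical `(hair_id, distance)` pair and that a smaller distance means a higher similarity. Neither the pair layout nor the identifiers come from the original text.

```python
def rank_candidates(candidates):
    """Return hair identifiers in descending order of similarity.

    candidates: list of (hair_id, distance) pairs, where a smaller
    distance is assumed to mean a closer match to the stroke.
    """
    return [hair_id for hair_id, _ in sorted(candidates, key=lambda c: c[1])]

def rank_per_stroke(results):
    """Rank the candidates of every stroke independently.

    results: dict mapping a stroke identifier to its candidate list.
    """
    return {stroke_id: rank_candidates(c) for stroke_id, c in results.items()}
```

The ranked list for each stroke is what would be shown to the user, who then picks one 3D hair per stroke for the combined image.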
(19)
Deleting, from 3D face model data, a range having a predetermined offset amount in the direction corresponding to the shooting direction of a 2D image including the face of a subject, measured from the side corresponding to the 2D image; and
Determining whether or not a part of the 3D face model data, being the part remaining after the range has been deleted by the deleting, corresponds to the hairstyle of the subject included in the 2D image, and selecting the 3D hair associated with the 3D face model data from a database when it is determined that the part corresponds to the hairstyle.
An information processing method including the above.
(20)
On a computer,
deleting, from 3D face model data, a range having a predetermined offset amount in the direction corresponding to the shooting direction of a 2D image including the face of a subject, measured from the side corresponding to the 2D image; and
determining whether or not a part of the 3D face model data, being the part remaining after the range has been deleted by the deleting, corresponds to the hairstyle of the subject included in the 2D image, and selecting the 3D hair associated with the 3D face model data from a database when it is determined that the part corresponds to the hairstyle.
A computer-readable recording medium on which is recorded a program that causes the computer to realize the above.

3 Camera
3V Virtual camera
5 Input unit
7 Output unit
10 Information processing device
11 Storage unit
111 3D hair DB
12 Control unit
121 Image acquisition unit
122 Stroke acquisition unit
123 Silhouette determination unit
123a Silhouette image generation unit
123b Cutout unit
123c Calculation unit
123d Judgment unit
124 Invisible part deletion unit
124a Depth image generation unit
124b Deletion unit
125 Hair search unit

Claims

A deletion unit that deletes, from 3D face model data, a range having a predetermined offset amount in the direction corresponding to the shooting direction of a 2D image including the face of a subject, measured from the side corresponding to the 2D image; and
A hair search unit that determines whether or not a part of the 3D face model data, being the part remaining after the range has been deleted by the deletion unit, corresponds to the hairstyle of the subject included in the 2D image, and that selects the 3D hair associated with the 3D face model data from a database when it determines that the part corresponds to the hairstyle.
An information processing device comprising the above.
An acquisition unit that acquires a stroke, input to the 2D image by the user, representing the hair flow of the subject
is further provided, and
the hair search unit determines, based on the stroke, whether or not the part of the 3D face model data corresponds to the hairstyle of the subject included in the 2D image.
The information processing device according to claim 1.
The hair search unit projects the part of the 3D face model data onto the image plane of the 2D image and, on the image plane, compares the distances between the vertices constituting each of the 3D hairs included in the part of the 3D face model data and the vertices constituting the stroke, thereby determining whether or not the part of the 3D face model data corresponds to the hairstyle of the subject included in the 2D image.
The information processing device according to claim 2.
A plurality of the strokes are input, and
the hair search unit determines, for each of the strokes, whether or not the part of the 3D face model data corresponds to the hairstyle of the subject included in the 2D image.
The information processing device according to claim 3.
The deletion unit deletes the range having the offset amount based on depth information including a depth value representing the distance between a virtual camera placed in a virtual space and a 3D object.
The information processing device according to claim 3.
The depth information is a depth image in which the depth value is held in each pixel, and
the deletion unit uses, as the depth value, the distance to the 3D face model data held by each pixel of an image assumed to be taken by the virtual camera, which is arranged in the virtual space so that at least the face can be photographed.
The information processing device according to claim 5.
The deletion unit aligns the virtual camera so that it is arranged in front of the 3D face model data in the virtual space, based on parameters calculated so as to minimize the error between the positions of the contours of the eyes, mouth, and face of the subject in the 2D image and the positions of the contours of the eyes, mouth, and face when the 3D face model data is viewed from the virtual camera, and assumes that the image is taken by the virtual camera so aligned.
The information processing device according to claim 6.
The deletion unit adds the offset amount, in the depth direction as seen from the virtual camera, to the depth value held in each pixel of the depth image.
The information processing device according to claim 6.
The part of the 3D face model data corresponds to the visible portion of the 3D hair when the 3D face model data is viewed from the virtual camera, and
the deletion unit deletes the range having the offset amount, which corresponds to the invisible portion, by using the value obtained by adding the offset amount to the depth value as a threshold for separating the visible portion from the invisible portion other than the visible portion.
The information processing device according to claim 8.
The deletion unit determines that a 3D hair is the visible portion when, of all the vertices constituting that one 3D hair, a predetermined ratio or more of the vertices exist in the region on the virtual camera side of the boundary defined by the threshold.
The information processing device according to claim 9.
The deletion unit uses the offset amount set based on the real-space distance between the frontmost bangs of the face and the hairline.
The information processing device according to claim 2.
A silhouette determination unit that determines the similarity between a first silhouette, which is the silhouette of the hair portion of the subject included in the 2D image, and a second silhouette, which is the silhouette of the hair portion of the 3D face model data,
is further provided, and
the silhouette determination unit makes the 3D hair of the 3D face model data a target of selection by the hair search unit when it determines that the first silhouette and the second silhouette are similar, and excludes the 3D hair from selection by the hair search unit when it determines that they are not similar.
The information processing device according to claim 2.
The silhouette determination unit calculates the degree of superimposition of the first silhouette and the second silhouette, and determines the similarity between the first silhouette and the second silhouette based on the calculated degree of superimposition.
The information processing device according to claim 12.
The silhouette determination unit calculates the degree of superimposition as IoU (Intersection over Union), and determines that the first silhouette and the second silhouette are similar when the IoU is equal to or higher than a predetermined value.
The information processing device according to claim 13.
The silhouette determination unit generates a first image showing the first silhouette based on the 2D image, generates a second image showing the second silhouette by coordinate-converting the 3D face model data according to the shooting position and rotation direction at the time the 2D image was shot, and calculates the degree of superimposition based on the first image and the second image.
The information processing device according to claim 13.
The silhouette determination unit cuts out, from each of the first image and the second image, a cutout target area defined based on both ends of one stroke, and determines the similarity between the first silhouette and the second silhouette within the cutout target area.
The information processing device according to claim 15.
The deletion unit further deletes, from the part of the 3D face model data, any portion that does not correspond to the cutout target area.
The information processing device according to claim 16.
The hair search unit presents to the user the selection results of the 3D hairs corresponding to each of the strokes in descending order of similarity and, when the user arbitrarily selects 3D hairs from the selection results, presents to the user an image in which the selected 3D hairs are combined.
The information processing device according to claim 2.
Deleting, from 3D face model data, a range having a predetermined offset amount in the direction corresponding to the shooting direction of a 2D image including the face of a subject, measured from the side corresponding to the 2D image; and
Determining whether or not a part of the 3D face model data, being the part remaining after the range has been deleted by the deleting, corresponds to the hairstyle of the subject included in the 2D image, and selecting the 3D hair associated with the 3D face model data from a database when it is determined that the part corresponds to the hairstyle.
An information processing method including the above.
On a computer,
deleting, from 3D face model data, a range having a predetermined offset amount in the direction corresponding to the shooting direction of a 2D image including the face of a subject, measured from the side corresponding to the 2D image; and
determining whether or not a part of the 3D face model data, being the part remaining after the range has been deleted by the deleting, corresponds to the hairstyle of the subject included in the 2D image, and selecting the 3D hair associated with the 3D face model data from a database when it is determined that the part corresponds to the hairstyle.
A computer-readable recording medium on which is recorded a program that causes the computer to realize the above.