WO2017006872A1 - Facial expression identification system, facial expression identification method, and facial expression identification program - Google Patents

Facial expression identification system, facial expression identification method, and facial expression identification program

Info

Publication number
WO2017006872A1
Authority
WO
WIPO (PCT)
Prior art keywords
facial expression
unit
data
person
facial
Prior art date
Application number
PCT/JP2016/069683
Other languages
English (en)
Japanese (ja)
Inventor
麻樹 杉本
克俊 正井
正泰 尾形
鈴木 克洋
中村 文彦
稲見 昌彦
裕太 杉浦
Original Assignee
学校法人慶應義塾
Priority date
Filing date
Publication date
Application filed by 学校法人慶應義塾
Priority to JP2017527431A (patent JP6850723B2)
Publication of WO2017006872A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis

Definitions

  • the present invention relates to a facial expression identification system, a facial expression identification method, and a facial expression identification program for identifying a facial expression of a person.
  • However, the system using the camera described in Patent Document 1 is expensive, and its computational load is enormous. Furthermore, since the positions and directions from which a person's face can be imaged with a camera are limited, it is difficult to identify facial expressions continuously (on a daily basis).
  • The present invention therefore aims to provide a facial expression identification system, a facial expression identification method, and a facial expression identification program capable of continuously (daily) identifying a person's facial expression with an inexpensive and simple configuration.
  • The gist of the first aspect of the present invention is a facial expression identification system comprising: (a) a mounting device that can be worn on a person's head; (b) a plurality of detection devices disposed at a plurality of locations of the mounting device facing the person's face, which respectively detect, at those locations, the distance between the face and the mounting device when the mounting device is worn; (c) a storage device that stores learning data obtained by machine learning of the correspondence between past detection result data from the plurality of detection devices and facial expressions; and (d) a facial expression identification unit that reads the learning data from the storage device and identifies the person's facial expression using the detection results from the plurality of detection devices as input data.
  • The gist of the second aspect of the present invention is a facial expression identification method including: a step in which a plurality of detection devices, respectively disposed at a plurality of locations of the mounting device, detect the distance between the person's face and the mounting device when the mounting device is worn on the person's head; and a step of reading, from a storage device, learning data obtained by machine learning of the correspondence between past detection result data and facial expressions, and identifying the person's facial expression using the detection results from the plurality of detection devices as input data.
  • The gist of the third aspect of the present invention is a facial expression identification program that causes a computer to execute a series of processes including: causing a plurality of detection devices, respectively disposed at a plurality of locations of the mounting device, to detect the distance between the person's face and the mounting device when the mounting device is worn on the person's head; reading, from a storage device, learning data obtained by machine learning of the correspondence between past detection result data from the plurality of detection devices and facial expressions; and causing a facial expression identification unit to identify the person's facial expression using the detection results from the plurality of detection devices as input data.
  • According to the present invention, it is possible to provide a facial expression identification system, a facial expression identification method, and a facial expression identification program capable of continuously (daily) identifying a person's facial expression with an inexpensive and simple configuration.
  • FIG. 3(a) is a schematic view showing a state in which a person wearing the mounting device according to the first embodiment of the present invention has no expression.
  • FIG. 3(b) is a schematic view showing a state in which a person wearing the mounting device according to the first embodiment of the present invention is laughing.
  • FIGS. 18(a) and 18(b) are schematic diagrams illustrating examples of display images on the display unit of the mounting device according to the third embodiment of the present invention. Further drawings show the relationship between distance and sensor value before and after the preprocessing according to the third embodiment, and an example of the neural network of the learning phase according to the third embodiment.
  • FIGS. 27A to 27C are graphs respectively showing sensor values when the eyes are opened and closed according to the third embodiment of the present invention.
  • FIG. 28A and FIG. 28B are graphs respectively showing the cluster classification result and the true value when the eyes are opened and closed according to the third embodiment of the present invention.
  • FIGS. 29(a) to 29(c) are graphs showing sensor values when the mouth is moved according to the third embodiment of the present invention.
  • FIGS. 30(a) and 30(b) are graphs respectively showing sensor values when the mouth is moved according to the third embodiment of the present invention.
  • FIGS. 35(a) to 35(d) are graphs respectively showing the results of regression obtained by merging with the multi-class classification according to the third embodiment of the present invention.
  • FIG. 39(a) is a graph showing changes in sensor value with respect to the facial expression “no expression” when the mounting device according to the fourth embodiment of the present invention is shifted in the front-rear direction.
  • FIG. 40(a) is a graph showing changes in sensor value with respect to the facial expression “disgust” when the mounting device according to the fourth embodiment of the present invention is shifted in the front-rear direction, and FIG. 40(b) is a graph showing changes in sensor value with respect to the facial expression “anger” under the same condition.
  • FIG. 41(a) is a graph showing changes in sensor value with respect to the facial expression “surprise” when the mounting device according to the fourth embodiment of the present invention is shifted in the front-rear direction, and FIG. 41(b) is a graph showing changes in sensor value with respect to the facial expression “fear” under the same condition.
  • FIG. 42(a) is a graph showing changes in sensor value with respect to the facial expression “sadness” when the mounting device according to the fourth embodiment of the present invention is shifted in the front-rear direction, and FIG. 42(b) is a graph showing changes in sensor value with respect to the facial expression “contempt” under the same condition.
  • Further drawings include a flowchart for explaining an example of the facial expression identification method according to the fourth embodiment of the present invention, a block diagram showing an example of the facial expression identification system according to the fifth embodiment of the present invention, a block diagram showing an example of the facial expression identification system according to the sixth embodiment of the present invention, and a flowchart for explaining an example of a method of adjusting the emitted light intensity of the optical sensors according to the sixth embodiment of the present invention.
  • The facial expression identification system according to the first embodiment includes a central processing unit (CPU) 1, a storage device 2, a mounting device (wearable device) 3, a plurality of detection devices (optical sensors) 4a, 4b, 4c, 4d, 4e, 4f, 4g, 4h, 4i, 4j, 4k, 4l, 4m, 4n, 4o, 4p, and 4q, an input device 5, and an output device 6.
  • The CPU 1, the storage device 2, the detection devices 4a to 4q, the input device 5, and the output device 6 can transmit and receive signals to and from each other by wire or wirelessly.
  • As the mounting device 3, a spectacle-type device or a head-mounted display (HMD) that can be worn on a person's head can be used; commercially available glasses may also be used.
  • the plurality of detection devices 4a to 4q are provided in the mounting device 3, and detect the distance between the human face and the mounting device 3 at a plurality of locations when the mounting device 3 is mounted on the head.
  • As the detection devices 4a to 4q, for example, a reflective optical sensor (photoreflector), a pyroelectric sensor, a proximity sensor, a distance sensor, or the like can be used.
  • the photoreflector includes a light emitting unit made up of a light emitting diode (LED) that irradiates the face of the person wearing the wearing device 3 with infrared light, and a light receiving unit made up of a phototransistor that detects reflected light from the face of the person.
  • The mounting device 3 is, for example, a glasses-type device as shown in FIG. 2.
  • the detection devices 4a to 4q are provided around the lens portion of the mounting device 3 and at positions facing the human face when the mounting device 3 is mounted.
  • the detection devices 4a to 4q are preferably arranged at positions facing the eyelids, cheeks, between eyebrows, the corners of the eyes, and the like, which are easily changed when the facial expression of a person changes, for example.
  • the arrangement positions of the detection devices 4a to 4q are not particularly limited, and can be set as appropriate according to the type of facial expression desired to be identified.
  • FIG. 2 shows 17 detection devices 4a to 4q, but the number of detection devices is not particularly limited; at least two detection devices may be used, and the number can be selected as appropriate according to the required identification accuracy and the like. For example, of the 17 detection devices 4a to 4q shown in FIG. 2, it is possible to identify the seven facial expressions described later using only the eight detection devices 4b, 4h, 4j, 4k, 4l, 4o, 4p, and 4q.
  • The facial expression identification system identifies seven facial expressions: “no expression”, “smile”, “laughter”, “disgust”, “anger”, “surprise”, and “sadness”.
  • the types and number of facial expression classifications are not particularly limited, as long as differences by facial expressions can be detected by the detection devices 4a to 4q and learning data can be stored in the learning data storage unit 20.
  • other examples of facial expressions include “joy” and “fear”.
  • the facial expression identification system identifies facial expressions using three-dimensional skin deformation caused by facial muscle fluctuations when the facial expression of a person changes.
  • When a person's facial expression changes, parts of the face such as the eyelids, cheeks, eyebrows, and eye corners move. Although there are individual differences, the variation of each part for a given facial expression shows a common tendency.
  • FIGS. 3 (a) and 3 (b) show a state in which a person wearing the mounting device 3 on the head has no expression and is laughing, respectively.
  • the temple portions of the eyeglass-type mounting device 3 are omitted for convenience, and the detection devices 4a and 4j are schematically shown.
  • As shown in FIG. 3(a), when the person has no expression, the person's eyelids and cheeks do not swell much, the distances D1 and D2 between the eyelids and cheeks and the detection devices 4a and 4j are relatively large, and the illuminance of the reflected light detected by the detection devices 4a and 4j therefore decreases.
  • On the other hand, as shown in FIG. 3(b), when the person laughs, the eyelids and cheeks swell, the distances D1 and D2 between the eyelids and cheeks and the detection devices 4a and 4j become shorter, and the illuminance of the reflected light detected by the detection devices 4a and 4j therefore increases.
  • The detection devices 4a to 4q output to the CPU 1 detection results (for example, current values) corresponding to the distance between the person's face and the detection devices 4a to 4q. Since the degree of reflection varies depending on the part of the face facing each detection device, the detection results may be corrected as appropriate according to the arrangement positions of the detection devices 4a to 4q. The detection results may also be weighted as appropriate according to the type of facial expression to be identified, as in the sketch below.
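  • The following is a minimal sketch of such correction and weighting, assuming hypothetical correction factors and weights and indexing the sensors 0 to 16 for 4a to 4q; in practice the values would be tuned to the mounting positions and the target expressions.
```python
import numpy as np

# Hypothetical per-sensor correction and weighting of raw photoreflector
# readings; the factors below are illustrative placeholders only.
N_SENSORS = 17                                # detection devices 4a to 4q

raw = np.random.rand(N_SENSORS)               # raw readings (e.g. current values)
correction = np.ones(N_SENSORS)               # compensates for how strongly each facing part reflects
correction[[0, 9]] *= 1.2                     # e.g. boost sensors facing less reflective regions
weights = np.ones(N_SENSORS)                  # emphasis per sensor for the expressions of interest
weights[[1, 7, 9, 10, 11, 14, 15, 16]] = 2.0  # e.g. the eight sensors 4b, 4h, 4j, 4k, 4l, 4o, 4p, 4q

feature = raw * correction * weights          # corrected, weighted detection-result vector
print(feature.shape)                          # (17,)
```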
  • The CPU 1 includes a machine learning unit 10, a facial expression identification unit 11, a geographic information acquisition unit 12, a facial expression map creation unit 13, and a recommendation information extraction unit 14, and further has a control circuit that controls the entire facial expression identification system, an arithmetic circuit, registers for temporarily storing data, and the like. Note that some or all of the functions of the CPU 1 may be built into the mounting device 3 or the like.
  • the machine learning unit 10 generates learning data by performing machine learning using the correspondence between the past detection results by the detection devices 4a to 4q and the facial expression as teacher data.
  • a known support vector machine (SVM), a neural network, or the like can be used as a method of machine learning by the machine learning unit 10.
  • FIG. 4 shows an example of a correspondence relationship between past detection results by the detection devices 4a to 4q used for machine learning of the machine learning unit 10 and facial expressions.
  • Each plot is obtained by a large number of persons wearing the wearing device 3, sampling a large number of detection results from the detection devices 4a to 4q, and classifying each detection result for each facial expression.
  • Although the sampling results of a large number of persons are shown here, the sampling results of the user who is the target of facial expression identification may be included, or only the sampling results of other persons may be included.
  • Alternatively, only the sampling results of the user who is the target of facial expression identification, or of another specific person, may be used for machine learning.
  • The machine learning unit 10 uses, for example, the correspondence between past detection results by the detection devices 4a to 4q and the facial expressions shown in FIG. 4 as teacher data, and generates, by SVM, learning data including hyperplanes (boundary surfaces) for classifying new detection results from the detection devices 4a to 4q into regions (clusters) corresponding to the seven facial expressions.
  • The learning data generated by the machine learning unit 10 is stored in the learning data storage unit 20 of the storage device 2.
  • The facial expression identification unit 11 shown in FIG. 1 reads the learning data stored in advance in the learning data storage unit 20 of the storage device 2 and, based on the detection results by the detection devices 4a to 4q for the person to be measured, identifies (classifies) the facial expression of that person as one of the seven facial expressions “no expression”, “smile”, “laughter”, “disgust”, “anger”, “surprise”, and “sadness”.
  • For example, the facial expression identification unit 11 determines which of the seven clusters in the learning data stored in the learning data storage unit 20 is closest to the detection result pattern from the detection devices 4a to 4q, and classifies the facial expression as the one corresponding to that cluster; a sketch of this train-and-classify flow is given below.
  • The identification result by the facial expression identification unit 11 is stored in the facial expression storage unit 21 of the storage device 2.
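  • The flow can be sketched as follows, using scikit-learn's SVC as one possible SVM implementation; the arrays, labels, and kernel settings are hypothetical stand-ins for the learning data and detection results described above, not values taken from this specification.
```python
import numpy as np
from sklearn.svm import SVC

EXPRESSIONS = ["no expression", "smile", "laughter", "disgust",
               "anger", "surprise", "sadness"]

# Hypothetical teacher data: each row is a past detection-result vector
# (one value per detection device 4a-4q), each label one of the seven expressions.
X_train = np.random.rand(700, 17)
y_train = np.random.randint(0, len(EXPRESSIONS), size=700)

# Learning phase: the SVM learns boundary surfaces separating the expression clusters.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train, y_train)

# Identification phase: classify a new detection-result pattern.
x_new = np.random.rand(1, 17)
print(EXPRESSIONS[clf.predict(x_new)[0]])
```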
  • In order to improve the recognition results of the facial expression identification unit 11, it is desirable to record learning data for each individual whose facial expressions are to be identified and to perform machine learning for that individual. Learning data recorded for one user can also be used to identify the facial expressions of another user. In addition, when the measurement results are similar to learning data recorded by others, that learning data can be used, which simplifies the process of recording learning data for each individual.
  • The geographic information acquisition unit 12 acquires, from a global positioning system (GPS) receiver or the like, geographic information such as the current position, and time information such as the date, at the time a facial expression is identified by the facial expression identification unit 11, and stores them in the geographic information storage unit 22 of the storage device 2.
  • The facial expression map creation unit 13 creates a facial expression map that associates the facial expressions identified by the facial expression identification unit 11 with the geographic information and time information acquired by the geographic information acquisition unit 12. For example, as shown in FIG. 6, the facial expression map indicates the ratio (frequency) of the seven facial expressions for each combination of time information and geographic information. The facial expression map is not limited to the mode shown in FIG. 6; for example, instead of the table format, time information and facial expression information may be arranged at positions on a map corresponding to the geographic information. Further, the facial expression map may collect time information, geographic information, and facial expression identification results for a plurality of persons at once, so that the facial expression information becomes a cumulative value over many persons.
  • the facial expression map created by the facial expression map creation unit 13 is stored in the facial expression map storage unit 23.
  • The recommendation information extraction unit 14 searches the facial expression map created by the facial expression map creation unit 13 based on search keys such as facial expression, time information, and geographic information input via the input device 5, and extracts, as recommendation information, the facial expressions, time information, geographic information, and the like that match the search keys (see the sketch below). For example, when the facial expression input as a search key is “laughter” or “smile”, “place B”, which has a relatively high ratio of “laughter” and “smile” in the facial expression map shown in FIG. 6, is extracted as recommendation information.
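  • A minimal sketch of one way such a facial expression map and recommendation search might be represented, assuming hypothetical records of (time slot, place, identified expression); the actual map of FIG. 6 holds per-time, per-place ratios of the seven expressions.
```python
from collections import Counter, defaultdict

# Hypothetical identification records accumulated into a facial expression map.
records = [
    ("morning", "place A", "no expression"),
    ("morning", "place B", "laughter"),
    ("evening", "place B", "smile"),
    ("evening", "place B", "laughter"),
]

expression_map = defaultdict(Counter)
for time_slot, place, expression in records:
    expression_map[(time_slot, place)][expression] += 1

def recommend(target_expressions):
    """Rank (time slot, place) keys by the ratio of the target expressions."""
    def ratio(key):
        counts = expression_map[key]
        return sum(counts[e] for e in target_expressions) / sum(counts.values())
    return sorted(expression_map, key=ratio, reverse=True)

print(recommend({"laughter", "smile"}))   # entries for "place B" rank first
```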
  • As the storage device 2, for example, a semiconductor memory, a magnetic disk, an optical disk, or the like can be used.
  • The storage device 2 includes a learning data storage unit 20 that stores the learning data generated by the machine learning unit 10, a facial expression storage unit 21 that stores the facial expression identification results by the facial expression identification unit 11, a geographic information storage unit 22 that stores the geographic information acquired by the geographic information acquisition unit 12, and a facial expression map storage unit 23 that stores the facial expression map created by the facial expression map creation unit 13.
  • the storage device 2 further stores a facial expression identification program executed by the CPU 1 and various data necessary for executing the program.
  • As the input device 5, a keyboard, a mouse, a touch panel, a voice recognition device, or the like can be used.
  • the input device 5 accepts search keys such as facial expressions, time information, geographic information, and the like from the user.
  • As the output device 6, a display device such as a liquid crystal display (LCD), a tablet terminal, or the like can be used.
  • The output device 6 outputs (displays), as appropriate, the facial expression identification results by the facial expression identification unit 11, the facial expression map created by the facial expression map creation unit 13, and the recommendation information extracted by the recommendation information extraction unit 14.
  • <Facial expression identification method> An example of a facial expression identification method, including the facial expression map creation method according to the first embodiment of the present invention, will be described with reference to the accompanying flowchart. Note that the facial expression identification method described below is merely an example, and the present invention is not limited to this procedure.
  • In step S10, the machine learning unit 10 generates learning data by performing machine learning using the correspondence between past detection results by the detection devices 4a to 4q and facial expressions as shown in FIG. 4, and stores the learning data in the learning data storage unit 20.
  • In step S11, the plurality of detection devices 4a to 4q detect, at a plurality of locations, the distance between the mounting device 3 and the face of the person to be measured who wears the mounting device 3 on the head.
  • In step S12, the facial expression identification unit 11 performs pattern identification of the current detection results from the plurality of detection devices 4a to 4q using the learning data stored in advance in the learning data storage unit 20, and identifies the facial expression of the person to be measured as one of the seven facial expressions shown in FIG. 4 (for example, “smile”).
  • Also in step S12, the geographic information acquisition unit 12 acquires, from the GPS or the like, geographic information such as the current position at the time the identification result by the facial expression identification unit 11 is obtained.
  • The geographic information acquisition unit 12 further acquires the time information at which the identification result by the facial expression identification unit 11 was obtained.
  • In step S13, the facial expression map creation unit 13 creates a facial expression map by associating the identification result by the facial expression identification unit 11 with the geographic information and time information acquired by the geographic information acquisition unit 12.
  • the facial expression map created by the facial expression map creation unit 13 is stored in the facial expression map storage unit 23.
  • In step S21, the recommendation information extraction unit 14 causes the output device 6 to display a search menu.
  • The input device 5 receives search keys such as a facial expression, geographic information, and time information selected by the user from the search menu.
  • In step S22, the recommendation information extraction unit 14 searches the facial expression map stored in the facial expression map storage unit 23 based on the search keys received by the input device 5, and extracts, as recommendation information, the facial expressions, geographic information, time information, and the like that match the search keys.
  • In step S23, the output device 6 displays the recommendation information extracted by the recommendation information extraction unit 14 on the screen.
  • The facial expression identification program according to the first embodiment of the present invention causes the CPU 1 to execute the facial expression identification procedure shown in the above flowcharts. That is, the facial expression identification program according to the first embodiment causes the computer to execute a series of processes including: (a) causing the plurality of detection devices 4a to 4q to detect the distance between the mounting device 3 and the face of the person wearing the mounting device 3 on the head; (b) reading, from the learning data storage unit 20, learning data obtained by machine learning of the correspondence between past detection results and facial expressions; and (c) causing the facial expression identification unit 11 to identify the person's facial expression using the detection results from the detection devices 4a to 4q as input data.
  • As described above, in the facial expression identification system according to the first embodiment, a person's facial expression is identified using the plurality of wearable detection devices 4a to 4q, so facial expressions can be identified continuously (on a daily basis) with an inexpensive, simple, and low-power configuration. Furthermore, since the detection devices 4a to 4q can be mounted inside spectacle frames or an HMD, facial expressions can be identified continuously and easily even under conditions in which imaging with a camera is difficult, such as in a daily environment or when the person's face is shielded by a worn HMD. Therefore, a facial expression map can be created by continuously identifying the user's facial expressions without imposing an excessive burden on the user. Furthermore, by using the created facial expression map to extract and display recommendation information corresponding to the user's search key, recommendation information related to facial expressions can be presented to the user.
  • In the second embodiment of the present invention, the CPU 1 includes a time information acquisition unit 15 and a facial expression distribution calculation unit 16 in addition to the machine learning unit 10 and the facial expression identification unit 11.
  • The storage device 2 includes a time information storage unit 24 and a facial expression distribution storage unit 25 in addition to the learning data storage unit 20 and the facial expression storage unit 21, and differs from the storage device 2 shown in FIG. 1 in that the geographic information storage unit 22 and the facial expression map storage unit 23 are not provided.
  • the time information acquisition unit 15 of the CPU 1 acquires time information when the facial expression is identified by the facial expression identification unit 11 in association with the facial expression.
  • the time information acquired by the time information acquisition unit 15 is stored in the time information storage unit 24.
  • the facial expression distribution calculation unit 16 calculates a facial expression distribution in which the identification result by the facial expression identification unit 11 and the time information acquired by the time information acquisition unit 15 are associated with each other.
  • For example, the facial expression distribution shown in FIG. 10 indicates the ratio (frequency) of the seven facial expressions per day.
  • The facial expression distribution is not limited to the mode shown in FIG. 10; for example, the facial expression ratios for the past several days may be calculated and arranged side by side, or the facial expression distribution may be calculated on a weekly or monthly basis.
  • the facial expression distribution calculated by the facial expression distribution calculation unit 16 is stored in the facial expression distribution storage unit 25.
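  • A minimal sketch of such a per-day distribution, assuming hypothetical stored (timestamp, expression) identification results; the stored format and time granularity in the actual system may differ.
```python
from collections import Counter
from datetime import datetime

# Hypothetical stored identification results: (timestamp, expression) pairs.
results = [
    (datetime(2016, 7, 1, 9, 0), "smile"),
    (datetime(2016, 7, 1, 12, 30), "no expression"),
    (datetime(2016, 7, 1, 18, 45), "sadness"),
]

def daily_distribution(results, day):
    """Ratio (frequency) of each expression identified on the given date."""
    counts = Counter(expr for ts, expr in results if ts.date() == day)
    total = sum(counts.values())
    return {expr: n / total for expr, n in counts.items()}

print(daily_distribution(results, datetime(2016, 7, 1).date()))
```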
  • the other configuration of the facial expression identification system according to the second embodiment of the present invention is the same as the configuration of the facial expression identification system according to the first embodiment of the present invention, and a duplicate description is omitted.
  • The procedure from step S30 to step S32 is the same as the procedure from step S10 to step S12 described above.
  • In step S33, the time information acquisition unit 15 acquires the time information at the time the facial expression is identified by the facial expression identification unit 11, and stores it in the time information storage unit 24.
  • In step S41, the facial expression distribution calculation unit 16 causes the output device 6 to display an input screen, and the user inputs, via the input device 5, the user information of the person whose facial expression distribution is to be confirmed.
  • In step S42, the facial expression distribution calculation unit 16 calculates, from the identification results of the facial expression identification unit 11, the daily facial expression distribution of the person matching the user information.
  • In step S43, the output device 6 outputs the facial expression distribution calculated by the facial expression distribution calculation unit 16. Note that the series of processes shown in FIGS. 11 and 12 may be performed continuously or in parallel with each other.
  • As described above, according to the second embodiment, a person's facial expression is identified using the wearable detection devices 4a to 4q, so facial expressions can be identified continuously with an inexpensive and simple configuration.
  • Furthermore, since the detection devices 4a to 4q can be mounted inside spectacle frames or an HMD, facial expressions can be identified even under conditions in which imaging with a camera is difficult, such as in a daily environment or when the person's face is shielded by a worn HMD.
  • the facial expression distribution can be created by continuously identifying the facial expression of the user without imposing an excessive burden on the user.
  • the facial expression distribution calculation unit 16 calculates a facial expression distribution in which the identification result by the facial expression identification unit 11 and the time information acquired by the time information acquisition unit 15 are associated with each other, and displays them on the output device 6.
  • From the facial expression distribution, the user can infer the state of a person, such as a family member, who lives or stays in a remote place, and can thus watch over that person via a network.
  • In the first and second embodiments, the mounting device 3 is a spectacle-type device, but the mounting device 3 may instead be an HMD capable of presenting visual information in a virtual reality environment.
  • Although illustration is omitted, inside the mounting device 3, for example, 17 detection devices similar to the detection devices 4a to 4q shown in FIG. 2 and a display unit are provided.
  • For example, a virtual avatar 41 of the user wearing the mounting device 3 and a virtual avatar 42 of another person wearing a mounting device similar to the mounting device 3 are displayed on an online service on the display unit 40 inside the mounting device 3.
  • the facial expression identifying unit 11 identifies the facial expressions of the user and others as in the first and second embodiments.
  • The facial expression identification unit 11 further transmits the facial expression identification results over the network and changes the facial expressions of the virtual avatars 41 and 42 according to the transmitted identification results, so that the facial expressions of the user and the other person can be synchronized with the facial expressions of the virtual avatars 41 and 42. Thereby, facial expression communication in an immersive online game or the like can be realized.
  • Alternatively, the facial expression identification unit 11 may transmit the identification result over the network, and the facial expression of an avatar on the online service displayed on the screen of the output device 6 may be changed according to the transmitted identification result; a sketch of this synchronization is given below.
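  • A minimal sketch of synchronizing avatar expressions with transmitted identification results; the message format, transport, and avatar API below are hypothetical stand-ins, not interfaces defined in this specification.
```python
import json

def make_sender(transport):
    """Sender side: serialize the identification result and hand it to a transport."""
    def send_expression(user_id, expression):
        transport(json.dumps({"user": user_id, "expression": expression}))
    return send_expression

def on_message(message, avatars):
    """Receiver side: update the corresponding virtual avatar's face image."""
    data = json.loads(message)
    avatars[data["user"]].set_face_image(data["expression"])

class Avatar:
    def set_face_image(self, expression):
        print(f"avatar now shows: {expression}")

avatars = {"avatar-41": Avatar(), "avatar-42": Avatar()}
send = make_sender(lambda msg: on_message(msg, avatars))  # loopback transport for illustration
send("avatar-41", "laughter")                             # the avatar mirrors the identified expression
```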
  • the facial expression identification system includes a CPU 1, a storage device 2, a mounting device 3, an input device 5, and an output device 6.
  • the CPU 1, the storage device 2, the mounting device 3, the input device 5, and the output device 6 can transmit and receive signals and data to and from each other in a wired or wireless manner.
  • the mounting device 3 is an HMD.
  • Virtual reality (VR) using an HMD is expected to be used in a wide range of applications, such as immersive games and communication with remote locations.
  • With VR using an HMD, it is possible to give the user an immersive feeling as if the user actually existed in the virtual environment.
  • By using a virtual avatar (hereinafter simply referred to as an “avatar”), smooth communication can be realized even when the upper part of the user's face is covered by the HMD. From the viewpoint of privacy, it is also suitable for users who do not want their actual face exposed in the virtual environment.
  • An avatar means a two-dimensional or three-dimensional character that serves as the user's alter ego in the virtual environment. An avatar may simulate a human figure, an animal, or a robot; in this specification, any character capable of expressing a plurality of facial expressions corresponding to a plurality of facial expressions of the user can be employed.
  • The mounting device 3 includes a plurality (16) of detection devices (optical sensors) 61a, 61b, 61c, 61d, 61e, 61f, 61g, 61h, 61i, 61j, 61k, 61l, 61m, 61n, 61o, and 61p, and a display unit 62.
  • As the optical sensors 61a to 61p, devices similar to the detection devices 4a to 4q shown in FIG. 1 can be used; for example, a reflective optical sensor (photoreflector) or the like can be used.
  • the optical sensors 61a to 61p are respectively disposed at a plurality of locations facing the user's face when the mounting device 3 is mounted.
  • the optical sensors 61a to 61p detect the distance between the face of the user wearing the mounting device 3 and the mounting device 3 at a plurality of locations.
  • FIG. 16 shows an example of the mounting device 3 viewed from the side mounted on the head.
  • a flexible circuit board 72 is fixed to the main body 71 of the mounting device 3.
  • Two openings are provided in the position of the flexible circuit board 72 facing both eyes of the user, and a pair of lenses 73a and 73b are disposed at the positions of the two openings.
  • 16 optical sensors 61a to 61p are arranged on the flexible circuit board 72.
  • the 14 optical sensors 61a to 61f, 61h to 61n, and 61p are arranged around the pair of lenses 73a and 73b.
  • the optical sensor 61a faces the vicinity of the eyebrows of the user who wears the mounting device 3.
  • the optical sensors 61b to 61d face the vicinity of the user's left eyebrow.
  • the optical sensor 61e faces the vicinity of the user's left eye corner.
  • the optical sensors 61f and 61h face the lower vicinity of the user's left eye.
  • the optical sensor 61i faces the vicinity of the user's eyebrow.
  • the optical sensors 61j to 61l face the vicinity of the user's right eyebrow.
  • the optical sensor 61m faces the vicinity of the user's right eye corner.
  • the optical sensors 61n and 61p face the lower vicinity of the user's right eye.
  • the flexible circuit board 72 has two portions protruding from the lower part of the main body 71, and two optical sensors 61g and 61o are arranged in the two protruding portions, respectively.
  • the two optical sensors 61g and 61o face the cheek vicinity of the user wearing the wearing device 3. Since the cheek muscles are connected to the mouth muscles, the state around the mouth can be estimated by measuring the movement of the cheeks.
  • FIG. 17 shows the user wearing the mounting device 3.
  • FIG. 17 schematically shows that the optical sensors 61a to 61p can be seen through the main body 71 of the mounting device 3.
  • The optical sensors 61a to 61p detect, at a plurality of locations, the distances D (illustrated by arrows) between the mounting device 3 and the user's face around the eyes and cheeks.
  • the arrangement positions and number of the optical sensors 61a to 61p are not particularly limited, and can be set as appropriate according to the type of facial expression to be identified.
  • Although an immersive (non-transmissive) structure is illustrated as the display unit 62, the display unit 62 may instead have a transmissive structure in which the real environment and the avatar are superimposed and made visible using a half mirror or the like, or a structure in which an image is projected to only one eye.
  • The CPU 1 illustrated in FIG. 15 includes an avatar display control unit 31, a learning data generation unit 32, and a facial expression identification unit 33, and further has a control circuit that controls the entire facial expression identification system, an arithmetic circuit, registers for temporarily storing data, and the like. Note that a part of the functions of the CPU 1 may be realized by another device, and a part or all of the functions of the CPU 1 may be realized by a microprocessor or the like built into the mounting device 3.
  • the storage device 2 includes an avatar data storage unit 50, a learning data storage unit 51, an optical sensor data storage unit 52, and an identification result storage unit 53.
  • the avatar data storage unit 50 stores information related to the avatar including the avatar face image.
  • the learning data storage unit 51 stores a machine learning data set generated by the learning data generation unit 32 and learning data (such as an identification function) obtained by machine learning by the facial expression identification unit 33.
  • the optical sensor data storage unit 52 stores data (sensor values) of detection results detected by the optical sensors 61a to 61p.
  • the identification result storage unit 53 stores the facial expression identification result by the facial expression identification unit 33.
  • the storage device 2 further stores a facial expression identification program executed by the CPU 1 and various data necessary for executing the program. All or a part of the information stored in the storage device 2 may be stored in a memory built in the mounting device 3.
  • Next, a machine learning method (learning phase), including a method for generating a data set for machine learning, and a facial expression identification method (identification phase) using the learning data obtained by the machine learning will be described.
  • the avatar data storage unit 50 stores information on one or more avatars used in the learning phase and the identification phase.
  • the avatar data storage unit 50 stores data of multiple types of facial expressions (face images) of avatars.
  • Each avatar face image simulates a human facial expression such as “no expression”, “smile”, “laughter”, “disgust”, “anger”, “surprise”, or “sadness”.
  • For example, “no expression” can be expressed by closing the mouth in a straight line, “smile” by lifting the corners of the mouth, and “laughter” by narrowing the eyes and opening the mouth.
  • “Disgust” can be expressed by wrinkling the brow and raising the corners of the eyes, and “anger” by wrinkling the brow and raising the outer ends of the eyebrows more strongly than for “disgust”.
  • “Surprise” can be expressed by opening the eyes and the mouth wide, and “sadness” by lowering the eyebrows.
  • The avatar data storage unit 50 may further store data of face images in which individual parts are changed, such as raising or lowering the eyebrows, opening, closing, or tightly shutting both eyes, and opening or closing the mouth.
  • the number and type of avatar face images stored in the avatar data storage unit 50 are not particularly limited.
  • The avatar display control unit 31 extracts, from the data of the plurality of avatar face images stored in the avatar data storage unit 50, the data of the avatar face image corresponding to the facial expression to be identified by machine learning, based on instruction information or the like input via the input device 5.
  • The avatar display control unit 31 then controls the display unit 62 of the mounting device 3 so as to display the image of the extracted facial expression of the avatar 100.
  • FIG. 18A illustrates a case where the face image of the avatar 100 is “laughter”.
  • The avatar display control unit 31 sequentially extracts, from the data of the plurality of avatar face images stored in the avatar data storage unit 50, the data of the avatar face image corresponding to the next facial expression to be identified by machine learning and, as shown in FIG. 18(b), sequentially updates the face image of the avatar 100 displayed on the display unit 62 to the extracted face image.
  • FIG. 18B illustrates a case where the face image of the avatar 100 is “smile”.
  • The timing for updating the face image of the avatar 100 can be set as appropriate; for example, it may be the timing after a predetermined time has elapsed, the timing after a predetermined number of frames have been detected by the optical sensors 61a to 61p, or a timing according to instruction information input by the user via the input device 5.
  • Before, after, or simultaneously with the display of the face image of the avatar 100, the avatar display control unit 31 displays text information such as “imitate the face of the avatar” on the display unit 62 to prompt the user to imitate the expression of the avatar 100.
  • Instead of, or in addition to, the text information, voice information from headphones, a speaker, or the like (not shown) attached to the mounting device 3 may be used.
  • The avatar display control unit 31 may prompt the user to imitate the expression of the avatar 100 continuously or intermittently; alternatively, after prompting the user when the face image of the avatar 100 is first displayed, the presentation of the text information and voice information may be stopped.
  • The optical sensors 61a to 61p detect a predetermined number of frames of the distance between the face and the mounting device 3 while the user wearing the mounting device 3 imitates the face image of the avatar.
  • After the avatar display control unit 31 prompts the user to imitate the avatar's face image or displays the avatar on the display unit 62, a time lag occurs until the user recognizes the type of avatar face image and the user's facial expression changes to the expression corresponding to that face image. For this reason, the optical sensors 61a to 61p may start detection at a timing after a predetermined time has elapsed since the prompt or the display of the avatar, or at a timing according to instruction information input via the input device 5.
  • Since the detected sensor values differ from individual to individual, preprocessing is performed so as to absorb individual differences.
  • For example, the sensor values are normalized so that the average of the sensor values detected when the facial expression is “no expression” becomes 0.5, and the maximum and minimum of the sensor values detected over the plurality of types of facial expressions become 1 and 0, respectively.
  • In addition, the sensor values are linearly interpolated (corrected) so that distance and sensor value have a linear relationship. One possible realization of the normalization is sketched below.
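  • A minimal sketch of a normalization satisfying the three anchor points described above (minimum to 0, maximum to 1, mean of the “no expression” frames to 0.5); the exact formula is not specified in this text, so the piecewise-linear mapping below is an assumption.
```python
import numpy as np

def normalize_sensor_values(samples, neutral_samples):
    """Per-sensor, piecewise-linear normalization.
    samples: array of shape (n_frames, n_sensors) covering all expressions.
    neutral_samples: frames recorded with "no expression".
    Assumes, per sensor, that the neutral mean lies strictly between the
    minimum and maximum observed values."""
    lo = samples.min(axis=0)
    hi = samples.max(axis=0)
    mid = neutral_samples.mean(axis=0)
    below = 0.5 * (samples - lo) / (mid - lo)          # [lo, mid] -> [0.0, 0.5]
    above = 0.5 + 0.5 * (samples - mid) / (hi - mid)   # [mid, hi] -> [0.5, 1.0]
    return np.where(samples <= mid, below, above)

frames = np.random.rand(5000, 16)        # hypothetical frames from sensors 61a-61p
neutral = frames[:1000]                  # hypothetical "no expression" frames
print(normalize_sensor_values(frames, neutral).shape)
```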
  • the learning data generation unit 32 generates a data set for machine learning using the sensor values from the optical sensors 61a to 61p as input data in the learning phase. For example, the learning data generation unit 32 performs a clustering process for classifying the sensor values from the optical sensors 61a to 61p into a subset (cluster) for each facial expression.
  • The learning data generation unit 32 further performs a labeling process that assigns to each classified cluster a label corresponding to the type of avatar face image (for example, “smile” or “sadness”) displayed on the display unit 62 by the avatar display control unit 31 when the sensor values were detected, and stores the labeled clusters in the learning data storage unit 51 as a data set (sampling result) for machine learning. In this manner, a machine learning data set is generated and stored in the learning data storage unit 51 for each individual, whereby a learning database for facial expression recognition can be constructed; the clustering and labeling steps are sketched below.
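  • A minimal sketch of the clustering and labeling steps, assuming hypothetical sensor frames and using k-means as one possible clustering method; the clustering algorithm is not specified in this text.
```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical sensor frames collected while the user imitated each of five
# displayed avatar expressions (labels 0-4), e.g. 1000 frames per expression.
frames = np.random.rand(5000, 16)                 # 16 optical sensors 61a-61p
displayed = np.repeat(np.arange(5), 1000)         # avatar expression shown for each frame

# Clustering: group the frames into one subset (cluster) per expression.
clusters = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(frames)

# Labeling: tag each cluster with the avatar expression most often displayed
# while its frames were being recorded.
labels = np.empty_like(clusters)
for c in range(5):
    members = clusters == c
    labels[members] = np.bincount(displayed[members]).argmax()

dataset = list(zip(frames, labels))               # stored in the learning data storage unit 51
```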
  • The facial expression identification unit 33 performs machine learning for identifying the facial expression of the user wearing the mounting device 3, using the machine learning data set stored in the learning data storage unit 51 as input data and using a neural network, a support vector machine, or the like.
  • For example, as schematically shown in FIG. 20, the facial expression identification unit 33 performs machine learning by error backpropagation (BP) using a neural network composed of a multilayer perceptron including an input layer L11, a hidden layer L12, and an output layer L13.
  • Since the neural network shown in FIG. 20 handles a multi-class classification problem, for example, a rectified linear unit (ReLU) is adopted as the activation function of the hidden layer, cross-entropy is adopted as the error function, and the softmax function is adopted as the activation function of the output layer L13.
  • The neural network outputs, from the output layer L13, the similarity between the facial expression input to the input layer L11 and each of the plurality of trained facial expressions to which teacher signals have been assigned, and the facial expression with the highest similarity is taken as the identification result.
  • During learning, the weights are corrected by back-propagating the error gradient from the output layer L13 toward the input layer L11 so that the similarity to the correct facial expression increases (in other words, so that the similarity to the correct facial expression becomes the maximum); a small sketch of such a classifier is given below.
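  • The following is a minimal sketch of such a multi-class classifier, written with PyTorch as one possible implementation; the layer sizes, training data, and number of iterations are assumptions, not values from this specification.
```python
import torch
import torch.nn as nn

# Multilayer perceptron: ReLU hidden layer, softmax over the expression
# classes at the output, trained with cross-entropy by error backpropagation.
N_SENSORS, N_HIDDEN, N_CLASSES = 16, 32, 5

model = nn.Sequential(
    nn.Linear(N_SENSORS, N_HIDDEN),   # input layer L11 -> hidden layer L12
    nn.ReLU(),
    nn.Linear(N_HIDDEN, N_CLASSES),   # hidden layer L12 -> output layer L13
)
criterion = nn.CrossEntropyLoss()      # applies log-softmax + cross-entropy internally
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.rand(5000, N_SENSORS)                    # normalized sensor frames (hypothetical)
y = torch.randint(0, N_CLASSES, (5000,))           # teacher signals (expression labels)

for _ in range(100):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()                                # back-propagate the error gradient
    optimizer.step()

similarities = torch.softmax(model(x[:1]), dim=1)  # similarity to each trained expression
print(int(similarities.argmax()))                  # expression with the highest similarity
```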
  • Further, as shown in FIG. 21, the facial expression identification unit 33 may implement a regression neural network composed of an input layer L21, a hidden layer L22, and an output layer L23.
  • In the regression neural network, for example, a rectified linear unit (ReLU) is adopted as the activation function of the hidden layer, a mean squared error is adopted as the error function, and a hyperbolic tangent function is adopted as the activation function of the output layer L23.
  • As shown in FIG. 21, for example, in the case of a regression neural network for the facial expression “laughter”, the similarity to “no expression” is set to 0 as the minimum value and the similarity to “laughter” is set to 1 as the maximum value.
  • A regression neural network similar to that shown in FIG. 21 is implemented for each facial expression to be identified, and is used depending on the result of the multi-class classification by the neural network shown in FIG. 20; a sketch is given below.
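  • A minimal sketch of such per-expression regression networks and their combination with the multi-class classifier, again using PyTorch; the layer sizes, the training data, and the clamping of the tanh output to [0, 1] are assumptions.
```python
import torch
import torch.nn as nn

def make_regression_net(n_sensors=16, n_hidden=32):
    """Regression network: ReLU hidden layer, hyperbolic-tangent output,
    trained with mean squared error so that "no expression" maps to 0 and
    the full target expression maps to 1."""
    return nn.Sequential(
        nn.Linear(n_sensors, n_hidden),   # input layer L21 -> hidden layer L22
        nn.ReLU(),
        nn.Linear(n_hidden, 1),           # hidden layer L22 -> output layer L23
        nn.Tanh(),
    )

regressors = {e: make_regression_net()
              for e in ["smile", "anger", "surprise", "sadness"]}

# Training sketch for one regressor (hypothetical frames and target intensities).
x = torch.rand(1000, 16)
t = torch.rand(1000, 1)                   # 0 = no expression ... 1 = full expression
opt = torch.optim.SGD(regressors["smile"].parameters(), lr=0.01)
for _ in range(100):
    opt.zero_grad()
    nn.functional.mse_loss(regressors["smile"](x), t).backward()
    opt.step()

def intensity(frame, predicted_expression):
    """The multi-class classifier picks the expression; the matching regressor
    reports how strongly it is expressed (0..1)."""
    with torch.no_grad():
        return float(regressors[predicted_expression](frame).clamp(0.0, 1.0))

print(intensity(torch.rand(1, 16), "smile"))
```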
  • the avatar data storage unit 50 may store intermediate facial expressions between representative facial expressions such as “no expression” and “laughter”.
  • The intermediate facial expressions can be generated, for example, by morphing the texture and geometry of the avatar.
  • The avatar display control unit 31 extracts representative facial expressions such as “no expression” and “laughter”, and the intermediate expressions between them, from the face image data of the avatar 100, and displays the extracted representative facial expressions and their intermediate expressions continuously on the display unit 62 of the mounting device 3. This makes it possible to generate a machine learning data set for identifying intermediate facial expressions.
  • In the identification phase, the facial expression identification unit 33 identifies the facial expression of the user wearing the mounting device 3 by using the multi-class classifier 80, which is the learning data, as shown in FIG. 22, with the sensor values from the optical sensors 61a to 61p as input data. For example, the facial expression identification unit 33 calculates the similarity to a plurality of types of trained template expressions and identifies the expression with the highest similarity as the user's facial expression.
  • The facial expression identification unit 33 may also identify an intermediate facial expression of the user wearing the mounting device 3 by using the regression networks 81 to 84 for “smile”, “anger”, “surprise”, and “sadness”, which are learning data, with the sensor values from the optical sensors 61a to 61p as input data.
  • the avatar display control unit 31 extracts the avatar face image data corresponding to the user's facial expression from the avatar data storage unit 50 based on the identification result by the facial expression identification unit 33.
  • The avatar display control unit 31 further transmits the extracted avatar face image data and displays it on the display unit 62 of the mounting device 3, on the output device 6 such as a display, or on the display unit of a mounting device worn by a communication partner via a communication network. For example, the face image of the user's own avatar and the face image of the communication partner's avatar are displayed on the display unit 62 of the mounting device 3 in the same manner as the display image described above.
  • The avatar display control unit 31 may also perform another kind of display according to the identification result of the user's facial expression, instead of displaying the avatar face image corresponding to that expression. For example, when the user's facial expression is identified as “anger”, a character face image simulating “surprise” or “anxiety” may be displayed on the display unit 62 of the mounting device 3, or text information or voice information such as “What happened?” or “Are you okay?” may be presented.
  • In step S51, the avatar display control unit 31 extracts, from the avatar data storage unit 50, the data of the avatar face image corresponding to the facial expression to be identified by machine learning, based on instruction information or the like input via the input device 5. The avatar display control unit 31 then displays the avatar with the extracted facial expression on the display unit 62 of the mounting device 3 and prompts the user wearing the mounting device 3 to imitate the avatar's face image.
  • In step S52, the optical sensors 61a to 61p acquire sensor values while the user wearing the mounting device 3 imitates the face image of the avatar.
  • In step S53, the avatar display control unit 31 determines whether sensor values have been acquired for the predetermined number of facial expressions to be identified by machine learning. If not, the process returns to step S50 and the same processing is repeated for the remaining facial expressions. For example, for five types of facial expressions, a total of 5000 data sets are acquired as 10 sets of 100 frames each. If it is determined in step S53 that sensor values for the predetermined number of facial expressions have been acquired, the process proceeds to step S54.
  • In step S54, the learning data generation unit 32 classifies the sensor values of the optical sensors 61a to 61p into clusters for each facial expression.
  • In step S55, the learning data generation unit 32 generates a machine learning data set by assigning to each classified cluster a label corresponding to the avatar face image displayed by the avatar display control unit 31, and stores the generated data set for machine learning in the learning data storage unit 51.
  • Also in step S55, the facial expression identification unit 33 performs machine learning for identifying facial expressions using the machine learning data set stored in the learning data storage unit 51.
  • In step S61, the optical sensors 61a to 61p acquire sensor values while the mounting device 3 is worn on the user's head.
  • In step S62, the facial expression identification unit 33 reads the learning data stored in the learning data storage unit 51 and identifies the facial expression of the user wearing the mounting device 3, using the sensor values from the optical sensors 61a to 61p as input data.
  • In step S63, the avatar display control unit 31 extracts, from the avatar data storage unit 50, the avatar face image data corresponding to the facial expression of the user wearing the mounting device 3, based on the identification result by the facial expression identification unit 33.
  • The avatar display control unit 31 then displays the extracted avatar face image on the display unit 62 of the mounting device 3, on the display unit of the communication partner's mounting device, or the like. A compact sketch of this identification loop is given below.
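  • A compact sketch of the identification phase (steps S61 to S63); read_sensors, normalize, model, and display_avatar are hypothetical stand-ins for the sensor interface, the preprocessing, the trained classifier, and the avatar display call, and the expression list is an assumption.
```python
EXPRESSIONS = ["no expression", "smile", "anger", "surprise", "sadness"]

def identification_step(read_sensors, normalize, model, display_avatar):
    frame = normalize(read_sensors())          # S61: sensor values from 61a-61p
    scores = model(frame)                      # S62: similarity to each trained expression
    expression = EXPRESSIONS[int(scores.argmax())]
    display_avatar(expression)                 # S63: show the matching avatar face image
    return expression
```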
  • FIG. 25 shows the calculation results of the first principal component obtained by principal component analysis of the acquired sensor values of the optical sensors 61a to 61p.
  • FIGS. 26(a) to 26(c) show changes in sensor value when the eyebrows are moved.
  • The left side of FIGS. 26(a) to 26(c) schematically shows a state in which the eyebrows are raised, a state in which the eyebrows are in the normal position, and a state in which the eyebrows are lowered, and the right side shows the sensor values in each state.
  • The vertical axis of the graphs on the right side of FIGS. 26(a) to 26(c) shows the average normalized sensor value, and the sensor numbers on the horizontal axis correspond in order to the optical sensors 61a to 61p shown in FIGS. 16 and 17 (for example, sensor number 1 corresponds to optical sensor 61a).
  • From FIGS. 26(a) to 26(c), it can be seen that sensor values close to the position of the eyebrows, such as Nos. 3 and 12, vary greatly between the states, reflecting the movement of the eyebrows, while sensor values far from the eyebrows, such as Nos. 6 and 14, vary little.
  • FIGS. 27(a) to 27(c) show changes in sensor value when the eyes are opened and closed.
  • The left side of FIGS. 27(a) to 27(c) schematically shows a state in which only the right eye is closed, a state in which only the left eye is closed, and a state in which both eyes are closed (strongly shut), and the right side shows the sensor values in each state.
  • From FIG. 27(a), it can be seen that when only the right eye is closed, the sensor value of No. 12, in the right-eye region, fluctuates compared with the others.
  • From FIG. 27(b), it can be seen that when only the left eye is closed, the sensor values of Nos. 3, 4, and 5 fluctuate compared with the others.
  • From FIG. 27(c), it can be seen that the sensor values of Nos. 1 and 9 fluctuate when both eyes are closed.
  • FIG. 28(a) shows the cluster classification result obtained by adding the normal state (a state in which both eyes are open) to the states of FIGS. 27(a) to 27(c), and FIG. 28(b) shows the true values.
  • From FIGS. 28(a) and 28(b), it can be seen that, regarding the opening and closing of the eyes, most of the data is classified into the correct class among the normal state, the state in which only the right eye is closed, the state in which only the left eye is closed, and the state in which both eyes are closed.
  • FIGS. 29(a) to 29(c), 30(a), and 30(b) show sensor values when the movement of the mouth is changed.
  • FIGS. 29(a) to 29(c), 30(a), and 30(b) correspond to the states in which the user's mouth forms the vowels “a”, “i”, “u”, “e”, and “o”, respectively, and show the sensor values in each state.
  • From FIGS. 29(a) to 29(c), 30(a), and 30(b), it can be seen that the sensor values near the positions of the cheeks, Nos. 7 and 15, fluctuate.
  • FIG. 31(a) shows the cluster classification results for the states of FIGS. 29(a) to 29(c), 30(a), and 30(b), and FIG. 31(b) shows the corresponding true values.
  • From FIGS. 31(a) and 31(b), it can be seen that although “e”, “o”, and the like are sometimes mixed, the mouth movement for “u” is far from the other classes and can be classified well. In addition, the data can be classified into three clusters: “a” and “i”; “e”; and “u” and “o”.
  • FIG. 32A shows training data of a classification network of five facial expressions of “no expression”, “smile”, “anger”, “surprise”, and “sadness”, and FIG. The first principal component by component analysis is shown, and FIG. 32C shows the second principal component by principal component analysis. From FIG. 32A to FIG. 32C, it can be seen that the sensor value changes when the user imitates the face image of the avatar.
  • FIG. 33(a) shows the training data of the regression network when the four facial expressions "smile", "anger", "surprise", and "sadness" are changed gradually, and FIG. 33(b) shows the first principal component obtained by principal component analysis.
  • FIG. 34(a) shows the target facial expressions, and FIG. 34(b) shows the result of multi-class classification.
  • FIGS. 35(a) to 35(d) show the regression results for "smile", "anger", "surprise", and "sadness", respectively, obtained by merging the multi-class classifications shown in FIG. 34(b). From FIGS. 35(a) to 35(d), it can be seen that the output of each regression changes linearly.
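  • The reference numerals list a multi-class classifier 80 and neural networks 81 to 84, but the exact rule for merging the multi-class classification into the per-expression regressions is not spelled out; the following is a minimal sketch under the assumption that the classifier's class probabilities gate four per-expression regressors (all data is synthetic placeholder data).

```python
import numpy as np
from sklearn.neural_network import MLPClassifier, MLPRegressor

expressions = ["smile", "anger", "surprise", "sadness"]
rng = np.random.default_rng(3)

# Placeholder training data: 16 sensor values -> expression class and intensity.
X = rng.normal(size=(400, 16))
y_cls = rng.integers(0, 4, size=400)
y_int = rng.uniform(0.0, 1.0, size=400)

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000,
                    random_state=0).fit(X, y_cls)
regs = [MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
        .fit(X[y_cls == k], y_int[y_cls == k]) for k in range(4)]

def merged_output(x: np.ndarray) -> dict:
    """Class probabilities weight each expression's regression output."""
    p = clf.predict_proba(x.reshape(1, -1))[0]
    return {expressions[k]: float(p[k] * regs[k].predict(x.reshape(1, -1))[0])
            for k in range(4)}

print(merged_output(rng.normal(size=16)))
```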
  • The avatar is displayed on the display unit 62 of the mounting device 3, and a machine learning data set is generated from the sensor values of the optical sensors 61a to 61p obtained while the user imitates the displayed avatar's facial expression, so a machine learning data set can be collected efficiently in a short time.
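  • A minimal sketch of the data-collection loop described above; display_avatar_expression() and read_optical_sensors() are hypothetical stand-ins for the display unit 62 and the optical sensors 61a to 61p of the mounting device 3.

```python
import time
from typing import List, Tuple

EXPRESSIONS = ["neutral", "smile", "anger", "surprise", "sadness"]

def display_avatar_expression(name: str) -> None:
    """Hypothetical: show the avatar face image on the display unit 62."""
    print(f"[display] avatar shows: {name}")

def read_optical_sensors() -> List[float]:
    """Hypothetical: return the 16 reflection intensities of sensors 61a-61p."""
    return [0.0] * 16

def collect_dataset(seconds_per_expression: float = 3.0,
                    hz: float = 30.0) -> List[Tuple[float, List[float], str]]:
    dataset = []
    for label in EXPRESSIONS:
        display_avatar_expression(label)
        t_end = time.time() + seconds_per_expression
        while time.time() < t_end:
            # Each sample: (timestamp, 16 sensor values, expression label).
            dataset.append((time.time(), read_optical_sensors(), label))
            time.sleep(1.0 / hz)
    return dataset

samples = collect_dataset(seconds_per_expression=0.1)   # short run for illustration
print(len(samples), "labeled samples collected")
```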
  • Since the user only has to imitate the avatar's face image, the user can intuitively understand and imitate the avatar's facial expression, and the burden on the user can be reduced compared with the case where the facial expression is indicated by voice information.
  • The data recorded in consideration of the time series of the displayed avatar face images is labeled as training data for machine learning, but a time lag may occur between the presented avatar and the user's facial expression.
  • When the avatar face image is changed based on the identification result of the user's facial expression in the identification phase, if an avatar identical or similar to the avatar imitated by the user in the learning phase is used, the avatar's face image can be changed appropriately to follow fine movements of parts such as the degree of opening and closing of the mouth and the raising and lowering of the eyebrows, and the user's emotion can be expressed more clearly.
  • This embodiment differs from the facial expression identification system according to the third embodiment shown in FIG. 15 in that the mounting device 3 further includes a deviation sensor 63, the CPU 1 further includes a deviation amount calculation unit 34 and a correction presentation unit 35, and the storage device 2 further includes a deviation data storage unit 55.
  • The other configurations are the same as those of the facial expression identification system according to the third embodiment shown in FIG. 15.
  • As the deviation sensor 63, for example, a displacement sensor or a length measurement sensor can be used.
  • The deviation sensor 63 is arranged at a position of the mounting device 3 different from the optical sensors 61a to 61p. Only one deviation sensor 63 may be arranged, or a plurality of deviation sensors 63 may be arranged. Based on landmarks such as the eyes of the user wearing the mounting device 3, the deviation sensor 63 takes the position when the mounting device 3 is worn normally as a reference position and detects a shift in at least one of the front-rear, vertical, and left-right directions with respect to that reference position. The detection result of the deviation sensor 63 is stored in the deviation data storage unit 55. Even if a deviation sensor 63 is not provided separately, the sensor values of the optical sensors 61a to 61p used for the distance detection for facial expression detection can also be used for deviation detection.
  • The deviation amount calculation unit 34 determines the deviation amount and the deviation direction with respect to the reference position from the detection result of the deviation sensor 63 or from the distribution of the sensor values of the optical sensors 61a to 61p.
  • FIG. 37 is a graph showing the changes in the sensor values of the optical sensors 61a to 61p for various facial expressions when the mounting device 3 is shifted in the front-rear direction. The amount of shift is varied over levels 1 to 4; the higher the level value, the larger the distance between the mounting device 3 and the person's face.
  • The deviation amount calculation unit 34 uses some of the sensor values of the optical sensors 61a to 61p shown in FIG. 37 to calculate, by regression, the deviation amount of the mounting device 3 in the front-rear direction as shown in FIG. 38.
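  • The regression behind FIG. 38 is not specified further; the following is a minimal sketch assuming ordinary least squares over an assumed subset of the sensor channels, with the shift level (1 to 4) as the target and synthetic placeholder data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(4)

# Placeholder sensor values at shift levels 1-4 (values drop as the gap grows).
levels = np.repeat([1, 2, 3, 4], 60)
X_all = rng.normal(size=(240, 16)) - 0.2 * levels[:, None]
subset = [0, 5, 8, 13]            # assumed subset of informative sensor channels
X = X_all[:, subset]

model = LinearRegression().fit(X, levels)
print("estimated front-rear shift level:", float(model.predict(X[:1])[0]))
```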
  • 39 (a) to 42 (b) are graphs in which changes in sensor values for the various facial expressions shown in FIG. 37 are divided for each facial expression.
  • 39 (a) is “no expression”
  • FIG. 39 (b) is “joy”
  • FIG. 40 (a) is “disgust”
  • FIG. 40 (b) is “anger”
  • FIG. 41 (a) is “surprise”.
  • 41 (b) shows the change in sensor value for “fear”
  • FIG. 42 (a) shows “sadness”
  • FIG. 42 (b) shows the change in sensor value for “contempt”.
  • From FIGS. 39(a) to 42(b), the sensor values generally tend to decrease as the level value increases, that is, as the distance between the mounting device 3 and the person's face increases.
  • When the deviation amount calculated by the deviation amount calculation unit 34 is equal to or greater than a predetermined threshold, the correction presentation unit 35 presents the correction content to the user wearing the mounting device 3 and prompts the user to correct the deviation.
  • The predetermined threshold can be set as appropriate and may be stored in advance in the deviation data storage unit 55.
  • When presenting the correction content to the user, the correction presentation unit 35 may display on the display unit 62 character information indicating a correction direction, such as "the mounting device is shifted in the upper-right direction" or "please move the mounting device toward the lower left", or an image indicating the correction direction, such as an arrow.
  • Alternatively, voice information may be output instead of character information or an image, or the correction may be presented to the user's sense of touch by vibration or the like so that the correction direction for the displacement of the mounting device 3 can be understood intuitively.
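  • A minimal sketch of the correction prompt described above; the threshold value and the wording of the messages are illustrative assumptions.

```python
from typing import Optional

def correction_message(dx: float, dy: float, threshold: float = 2.0) -> Optional[str]:
    """Return a correction prompt when the deviation exceeds the threshold.

    dx: left-right deviation (positive = right), dy: vertical deviation
    (positive = up), both relative to the reference mounting position.
    """
    if (dx ** 2 + dy ** 2) ** 0.5 < threshold:
        return None   # within tolerance; no prompt is presented
    vert = "upper" if dy > 0 else "lower"
    horiz = "right" if dx > 0 else "left"
    opposite = f"{'lower' if dy > 0 else 'upper'}-{'left' if dx > 0 else 'right'}"
    return (f"The mounting device is shifted in the {vert}-{horiz} direction. "
            f"Please move it toward the {opposite}.")

print(correction_message(2.5, 1.0))
```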
  • In step S52, the deviation sensor 63 detects a deviation simultaneously with the detection by the optical sensors 61a to 61p.
  • The learning data generation unit 32 then generates a data set for machine learning by adding labels corresponding to the facial expression and the deviation to the sensor values of the optical sensors 61a to 61p. The information on the label corresponding to the deviation may be input via the input device 5, for example.
  • In step S71, the optical sensors 61a to 61p detect the distance between the face of the user wearing the mounting device 3 and the mounting device 3.
  • In step S72, the deviation sensor 63 detects a deviation, and the deviation amount calculation unit 34 calculates the deviation amount and the deviation direction based on the detection result of the deviation sensor 63.
  • In step S73, the correction presentation unit 35 determines whether the deviation amount calculated by the deviation amount calculation unit 34 is equal to or greater than a predetermined threshold. When the deviation amount is equal to or greater than the predetermined threshold, the process proceeds to step S74, and the correction presentation unit 35 presents the correction content to the user so as to correct the deviation.
  • When the deviation amount is less than the predetermined threshold in step S73, the process proceeds to step S76, where the facial expression identification unit 33 selects the learning data to which the label corresponding to the deviation is given and identifies the facial expression of the user using the selected learning data.
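  • A minimal sketch of the deviation-aware identification step: one model is kept per deviation label and the model matching the currently detected deviation is used. The k-nearest-neighbour classifier, the deviation buckets, and the placeholder data are assumptions; the description does not fix the learning algorithm.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(5)
deviation_labels = ["none", "small", "large"]      # assumed deviation buckets

# One classifier per deviation label, trained on learning data carrying that label.
models = {}
for d in deviation_labels:
    X = rng.normal(size=(150, 16))
    y = rng.integers(0, 5, size=150)               # five facial-expression classes
    models[d] = KNeighborsClassifier(n_neighbors=5).fit(X, y)

def identify(sensor_values: np.ndarray, detected_deviation: str) -> int:
    """Select the learning data matching the detected deviation, then identify."""
    return int(models[detected_deviation].predict(sensor_values.reshape(1, -1))[0])

print(identify(rng.normal(size=16), "small"))
```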
  • The avatar display control unit 31 then extracts the avatar face image based on the identification result of the facial expression identification unit 33 and displays the avatar with the extracted facial expression on the display unit 62 of the mounting device 3.
  • As described above, by detecting the shift when the mounting device 3 is worn and selecting the learning data to which the label corresponding to the shift is given, the facial expression can be identified properly. Further, by detecting the shift when the mounting device 3 is worn and prompting the user to correct it, the shift of the mounting device 3 can be corrected appropriately.
  • This embodiment differs from the facial expression identification system according to the third embodiment shown in FIG. 15 in that the mounting device 3 further includes a blood flow sensor 64 and the storage device 2 further includes a blood flow data storage unit 56. The other configurations are the same as those of the facial expression identification system according to the third embodiment shown in FIG. 15.
  • The blood flow sensor 64 detects the blood flow of the user wearing the mounting device 3.
  • The blood flow sensor 64 includes a light source that emits multi-wavelength light and a detection unit that detects the light intensity of the reflected multi-wavelength light.
  • The blood flow sensor 64 irradiates multi-wavelength light at a position where the face color, such as the cheeks, can be easily measured, and detects the blood flow based on the light intensity of the reflected light.
  • Hemoglobin contained in the red blood cells in blood has the property of absorbing green light. As the blood flow increases, the amount of hemoglobin increases, and green light is more readily absorbed when the face is irradiated with multi-wavelength light. The blood flow can therefore be detected based on the light intensity of the green wavelength of the reflected light.
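  • The following is a rough sketch of the relationship described above (more blood flow, more hemoglobin, more green light absorbed, less green light reflected); the scaling is purely illustrative and not a calibrated measurement.

```python
def blood_flow_index(green_reflected: float, green_emitted: float) -> float:
    """Relative blood-flow proxy from the green-wavelength reflectance.

    The ratio green_reflected / green_emitted falls as absorption by
    hemoglobin rises, so the index grows with blood flow.
    """
    reflectance = max(min(green_reflected / green_emitted, 1.0), 1e-6)
    return 1.0 - reflectance

print(blood_flow_index(green_reflected=0.35, green_emitted=1.0))
```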
  • The blood flow detected by the blood flow sensor 64 is stored in the blood flow data storage unit 56.
  • The facial expression identification unit 33 identifies the user's face color by comparing the blood flow detected by the blood flow sensor 64 with predetermined thresholds. For example, when the blood flow detected by the blood flow sensor 64 is equal to or greater than a first threshold, the facial expression identification unit 33 identifies the user's face color as "blush". When the blood flow detected by the blood flow sensor 64 is less than a second threshold, which is smaller than the first threshold, the facial expression identification unit 33 identifies the user's face color as "pale". When the blood flow detected by the blood flow sensor 64 is less than the first threshold and equal to or greater than the second threshold, the facial expression identification unit 33 identifies the user's face color as "normal".
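  • The two-threshold rule above translates directly into code; the threshold values themselves are illustrative assumptions.

```python
def identify_face_color(blood_flow: float,
                        first_threshold: float = 0.7,
                        second_threshold: float = 0.3) -> str:
    """Classify face color from the blood flow detected by the blood flow sensor 64."""
    if blood_flow >= first_threshold:
        return "blush"
    if blood_flow < second_threshold:
        return "pale"
    return "normal"    # second_threshold <= blood_flow < first_threshold

for value in (0.8, 0.5, 0.1):
    print(value, "->", identify_face_color(value))
```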
  • The avatar display control unit 31 changes the avatar face image based on the identification results of the user's facial expression and face color by the facial expression identification unit 33.
  • The avatar display control unit 31 can display, for example, a "blushing and angry" face image or a "pale and surprised" face image as the avatar face image.
  • According to this embodiment, the user's emotion can be identified in more detail by identifying the face color in addition to the facial expression of the user wearing the mounting device 3. Furthermore, the degree of freedom of facial-expression communication in a VR environment can be improved by changing the avatar's face color according to the identification result of the user's face color.
  • This embodiment differs from the facial expression identification system according to the third embodiment shown in FIG. 15 in that the CPU 1 further includes an optical sensor adjustment unit 37.
  • The other configurations are the same as those of the facial expression identification system according to the third embodiment shown in FIG. 15.
  • The optical sensor adjustment unit 37 adjusts the emission intensity and sensitivity of the optical sensors 61a to 61p according to the sensor values of the optical sensors 61a to 61p at the time of calibration in the learning phase or the identification phase. For example, the optical sensor adjustment unit 37 extracts the maximum value and the minimum value from the sensor values of the optical sensors 61a to 61p and determines whether the emission intensity and sensitivity of the optical sensors 61a to 61p need to be adjusted by comparing the extracted maximum and minimum values with predetermined thresholds.
  • When it is determined that adjustment is necessary, the optical sensor adjustment unit 37 adjusts the emission intensity and sensitivity by adjusting the variable resistance values of the optical sensors 61a to 61p. For example, when the maximum value of the sensor values of the optical sensors 61a to 61p is determined to be equal to or greater than a first threshold, the value of the variable resistance of the optical sensors 61a to 61p is increased to reduce their emission intensity and sensitivity. Conversely, when the sensor values are too low (for example, when the minimum value falls below a second threshold), the value of the variable resistance of the optical sensors 61a to 61p is decreased to increase their emission intensity and sensitivity.
  • In step S81, the optical sensors 61a to 61p acquire the sensor values (reflection intensities).
  • In step S82, the optical sensor adjustment unit 37 extracts the maximum value and the minimum value of the sensor values (reflection intensities) acquired by the optical sensors 61a to 61p.
  • In step S83, the optical sensor adjustment unit 37 determines whether the emission intensity and sensitivity of the optical sensors 61a to 61p need to be adjusted by comparing the extracted maximum and minimum values with predetermined thresholds. If it is determined that adjustment is necessary, the process proceeds to step S84, and the emission intensity and sensitivity of the optical sensors 61a to 61p are adjusted by adjusting their variable resistances. On the other hand, if it is determined in step S83 that no adjustment is necessary, the process ends.
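  • A minimal sketch of the adjustment logic in steps S81 to S84; the threshold values, the resistance step, and the representation of the variable resistances are illustrative assumptions about the hardware interface.

```python
from typing import List

def adjust_optical_sensors(sensor_values: List[float],
                           resistances: List[float],
                           high_threshold: float = 0.9,
                           low_threshold: float = 0.1,
                           step: float = 0.1) -> List[float]:
    """Adjust emission intensity/sensitivity via each sensor's variable resistance.

    Raising the resistance lowers emission intensity and sensitivity;
    lowering it raises them, as described for the optical sensors 61a-61p.
    """
    max_v, min_v = max(sensor_values), min(sensor_values)
    if max_v >= high_threshold:
        # Signal too strong: increase resistance to reduce intensity/sensitivity.
        return [r + step for r in resistances]
    if min_v <= low_threshold:
        # Signal too weak: decrease resistance to increase intensity/sensitivity.
        return [max(r - step, 0.0) for r in resistances]
    return resistances   # within range: no adjustment needed

print(adjust_optical_sensors([0.95, 0.4, 0.6], [1.0, 1.0, 1.0]))
```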
  • According to the sixth embodiment of the present invention, by adjusting the emission intensity and sensitivity of the optical sensors 61a to 61p based on their reflection-intensity information, sensor values in an appropriate range can be detected.
  • The mounting device 3 may further include a reflective image sensor that detects the amount of skin movement on the face of the user wearing the mounting device 3.
  • Alternatively, the mounting device 3 may include such a reflective image sensor instead of the optical sensors 61a to 61p.
  • As the reflective image sensor, for example, a CMOS image sensor or a CCD image sensor can be used.
  • The reflective image sensor detects, for example, the amount of skin movement on the face in one or two dimensions.
  • The facial expression identification unit 33 identifies the facial expression of the user wearing the mounting device 3 based on the sensor values from the optical sensors 61a to 61p and the movement amount from the reflective image sensor, whereby the identification accuracy of the user's facial expression can be further improved.
  • When the learning data of the user wearing the mounting device 3 is stored in the learning data storage unit 51, the facial expression identification unit 33 may read the user's own learning data to identify the facial expression. On the other hand, when the learning data of the user wearing the mounting device 3 is not stored in the learning data storage unit 51, the facial expression identification unit 33 may use the learning data of other persons stored in the learning data storage unit 51.
  • In this case, using the sensor values of the optical sensors 61a to 61p and the label corresponding to the current facial expression of the user wearing the mounting device 3 as input data, the facial expression identification unit 33 calculates, for each other person's learning data, the degree of similarity with the template of the same facial expression, that is, the template to which the same label as the one corresponding to the user's current facial expression is assigned. The facial expression identification unit 33 then reads the learning data of the other person with the highest similarity and uses it to identify the user's facial expression.
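  • A minimal sketch of the template-matching step described above, assuming cosine similarity between the user's current sensor vector and each stored person's template for the same expression label (the similarity measure and data layout are assumptions).

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def most_similar_person(sensor_values: np.ndarray, label: str, templates: dict) -> str:
    """templates: {person_id: {expression_label: 16-dimensional template vector}}."""
    scores = {person: cosine(sensor_values, t[label])
              for person, t in templates.items() if label in t}
    return max(scores, key=scores.get)

rng = np.random.default_rng(6)
stored = {"person_A": {"smile": rng.normal(size=16)},
          "person_B": {"smile": rng.normal(size=16)}}
print(most_similar_person(rng.normal(size=16), "smile", stored))
```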
  • As the label corresponding to the user's facial expression, a label input via the input device 5 may be used, or a label may be generated based on the data of the avatar face image extracted by the avatar display control unit 31.
  • The components of the facial expression identification systems according to the first to sixth embodiments may be combined with one another.
  • For example, the CPU 1 illustrated in FIG. 1 may further include the time information acquisition unit 15 and the facial expression distribution calculation unit 16 illustrated in FIG. 9, and the storage device 2 illustrated in FIG. 1 may further include the geographic information storage unit 22 and the facial expression map storage unit 23 illustrated in FIG. 9.
  • The machine learning method (learning phase) processing according to the first to sixth embodiments and the facial expression identification method (identification phase) processing according to the first to sixth embodiments may also be combined with one another.
  • The present invention can be used for a facial expression identification system, a facial expression identification method, and a facial expression identification program that automatically identify facial expressions.
  • DESCRIPTION OF SYMBOLS: 1 ... Central processing unit (CPU); 2 ... Storage device; 3 ... Mounting device; 4a-4q ... Detection device; 5 ... Input device; 6 ... Output device; 10 ... Machine learning unit; 11 ... Facial expression identification unit; 12 ... Geographic information acquisition unit; 13 ... Facial expression map creation unit; 14 ... Recommended information extraction unit; 15 ... Time information acquisition unit; 16 ... Facial expression distribution calculation unit; 20 ... Learning data storage unit; 21 ... Facial expression storage unit; 22 ... Geographic information storage unit; 23 ... Facial expression map storage unit; 24 ... Time information storage unit; 25 ... Facial expression distribution storage unit; 31 ... Avatar display control unit; 32 ... Learning data generation unit; 33 ... Facial expression identification unit; 40 ... Display unit; 41, 42 ... Avatar; 50 ... Avatar data storage unit; 51 ... Learning data storage unit; 52 ... Optical sensor data storage unit; 53 ... Identification result storage unit; 61a-61p ... Detection device (optical sensor); 62 ... Display unit; 71 ... Main body; 72 ... Flexible circuit board; 73a, 73b ... Lens; 80 ... Multi-class classifier; 81-84 ... Neural network; 100 ... Avatar

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention relates to a facial expression identification system that has an inexpensive, simple configuration capable of continuously (automatically) identifying a person's facial expression. The system is provided with: a wearable device (3) that can be worn on a person's head; multiple detection devices (4a to 4q) that are respectively arranged at multiple positions facing the face of the person wearing the wearable device (3) and that respectively detect the distances between the face and the wearable device (3) at the multiple positions when the wearable device (3) is worn; a storage unit (2) for storing learning data obtained through machine learning of the correspondence between past detection-result data from the multiple detection devices (4a to 4q) and facial expressions; and a facial expression identification unit (11) that reads the learning data from the storage unit (2) and, receiving the detection results of the multiple detection devices (4a to 4q) as input data, identifies the person's facial expression.
PCT/JP2016/069683 2015-07-03 2016-07-01 Système, procédé et programme d'identification d'expression faciale WO2017006872A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2017527431A JP6850723B2 (ja) 2015-07-03 2016-07-01 顔表情識別システム、顔表情識別方法及び顔表情識別プログラム

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2015134460 2015-07-03
JP2015-134460 2015-07-03

Publications (1)

Publication Number Publication Date
WO2017006872A1 true WO2017006872A1 (fr) 2017-01-12

Family

ID=57685715

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2016/069683 WO2017006872A1 (fr) 2015-07-03 2016-07-01 Système, procédé et programme d'identification d'expression faciale

Country Status (2)

Country Link
JP (1) JP6850723B2 (fr)
WO (1) WO2017006872A1 (fr)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2561537A (en) * 2017-02-27 2018-10-24 Emteq Ltd Optical muscle sensor
CN109670393A (zh) * 2018-09-26 2019-04-23 平安科技(深圳)有限公司 人脸数据采集方法、设备、装置及计算机可读存储介质
JP2019096130A (ja) * 2017-11-24 2019-06-20 Kddi株式会社 モーフィング画像生成装置及びモーフィング画像生成方法
KR20190130179A (ko) * 2018-04-13 2019-11-22 인하대학교 산학협력단 미세한 표정변화 검출을 위한 2차원 랜드마크 기반 특징점 합성 및 표정 세기 검출 방법
JP2020052793A (ja) * 2018-09-27 2020-04-02 株式会社Nttドコモ システム
KR102108422B1 (ko) * 2019-08-07 2020-05-11 (주)자이언트스텝 Ai 기반의 표정 분류 및 리타겟팅을 통한 가상 캐릭터의 표정 최적화 시스템 및 방법, 및 컴퓨터 판독 가능한 저장매체
WO2020170645A1 (fr) * 2019-02-22 2020-08-27 ソニー株式会社 Dispositif de traitement d'informations, procédé de traitement d'informations et programme
KR20200119546A (ko) * 2019-04-10 2020-10-20 한양대학교 산학협력단 전자 장치, 아바타 얼굴 표정 표시 시스템 및 제어 방법
WO2021145243A1 (fr) * 2020-01-16 2021-07-22 株式会社コロプラ Programme, procédé exécuté par ordinateur, et ordinateur
JP2021517689A (ja) * 2018-03-16 2021-07-26 マジック リープ, インコーポレイテッドMagic Leap,Inc. 眼追跡カメラからの顔の表情
WO2022117776A1 (fr) * 2020-12-04 2022-06-09 Socialdream Dispositif immersif
WO2023287425A1 (fr) * 2021-07-16 2023-01-19 Hewlett-Packard Development Company, L.P. Identification de l'expression faciale d'un porteur d'un visiocasque
US20230035961A1 (en) * 2021-07-29 2023-02-02 Fannie Liu Emoji recommendation system using user context and biosignals
CN111240482B (zh) * 2020-01-10 2023-06-30 北京字节跳动网络技术有限公司 一种特效展示方法及装置

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11803237B2 (en) 2020-11-14 2023-10-31 Facense Ltd. Controlling an eye tracking camera according to eye movement velocity

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2014021707A (ja) * 2012-07-18 2014-02-03 Nikon Corp 情報入出力装置、及び情報入出力方法
JP2015092646A (ja) * 2013-11-08 2015-05-14 ソニー株式会社 情報処理装置、制御方法、およびプログラム

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015031996A (ja) * 2013-07-31 2015-02-16 Kddi株式会社 携帯型動き検出装置、感情情報集計装置、感情情報取得報知システム、感情情報取得報知方法およびコンピュータプログラム

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2014021707A (ja) * 2012-07-18 2014-02-03 Nikon Corp 情報入出力装置、及び情報入出力方法
JP2015092646A (ja) * 2013-11-08 2015-05-14 ソニー株式会社 情報処理装置、制御方法、およびプログラム

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2561537B (en) * 2017-02-27 2022-10-12 Emteq Ltd Optical expression detection
GB2561537A (en) * 2017-02-27 2018-10-24 Emteq Ltd Optical muscle sensor
US11003899B2 (en) 2017-02-27 2021-05-11 Emteq Limited Optical expression detection
US11538279B2 (en) 2017-02-27 2022-12-27 Emteq Limited Optical expression detection
US11836236B2 (en) 2017-02-27 2023-12-05 Emteq Limited Optical expression detection
JP2019096130A (ja) * 2017-11-24 2019-06-20 Kddi株式会社 モーフィング画像生成装置及びモーフィング画像生成方法
JP2021517689A (ja) * 2018-03-16 2021-07-26 マジック リープ, インコーポレイテッドMagic Leap,Inc. 眼追跡カメラからの顔の表情
JP7344894B2 (ja) 2018-03-16 2023-09-14 マジック リープ, インコーポレイテッド 眼追跡カメラからの顔の表情
KR20190130179A (ko) * 2018-04-13 2019-11-22 인하대학교 산학협력단 미세한 표정변화 검출을 위한 2차원 랜드마크 기반 특징점 합성 및 표정 세기 검출 방법
KR102138809B1 (ko) * 2018-04-13 2020-07-28 인하대학교 산학협력단 미세한 표정변화 검출을 위한 2차원 랜드마크 기반 특징점 합성 및 표정 세기 검출 방법
CN109670393B (zh) * 2018-09-26 2023-12-19 平安科技(深圳)有限公司 人脸数据采集方法、设备、装置及计算机可读存储介质
CN109670393A (zh) * 2018-09-26 2019-04-23 平安科技(深圳)有限公司 人脸数据采集方法、设备、装置及计算机可读存储介质
JP2020052793A (ja) * 2018-09-27 2020-04-02 株式会社Nttドコモ システム
WO2020170645A1 (fr) * 2019-02-22 2020-08-27 ソニー株式会社 Dispositif de traitement d'informations, procédé de traitement d'informations et programme
US20220084196A1 (en) * 2019-02-22 2022-03-17 Sony Group Corporation Information processing apparatus, information processing method, and program
US11106899B2 (en) 2019-04-10 2021-08-31 Industry University Cooperation Foundation Hanyang University Electronic device, avatar facial expression system and controlling method thereof
KR102243040B1 (ko) * 2019-04-10 2021-04-21 한양대학교 산학협력단 전자 장치, 아바타 얼굴 표정 표시 시스템 및 제어 방법
KR20200119546A (ko) * 2019-04-10 2020-10-20 한양대학교 산학협력단 전자 장치, 아바타 얼굴 표정 표시 시스템 및 제어 방법
WO2021025279A1 (fr) * 2019-08-07 2021-02-11 (주)자이언트스텝 Système, procédé et support de stockage lisible par ordinateur pour optimiser une expression d'un caractère virtuel via une classification et un reciblage d'une expression basés sur l'intelligence artificielle (ai)
KR102108422B1 (ko) * 2019-08-07 2020-05-11 (주)자이언트스텝 Ai 기반의 표정 분류 및 리타겟팅을 통한 가상 캐릭터의 표정 최적화 시스템 및 방법, 및 컴퓨터 판독 가능한 저장매체
CN111240482B (zh) * 2020-01-10 2023-06-30 北京字节跳动网络技术有限公司 一种特效展示方法及装置
JP2021114036A (ja) * 2020-01-16 2021-08-05 株式会社コロプラ プログラム、コンピュータが実行する方法及びコンピュータ
JP7295045B2 (ja) 2020-01-16 2023-06-20 株式会社コロプラ プログラム、コンピュータが実行する方法及びコンピュータ
WO2021145243A1 (fr) * 2020-01-16 2021-07-22 株式会社コロプラ Programme, procédé exécuté par ordinateur, et ordinateur
FR3117221A1 (fr) * 2020-12-04 2022-06-10 Socialdream dispositif immersif
WO2022117776A1 (fr) * 2020-12-04 2022-06-09 Socialdream Dispositif immersif
WO2023287425A1 (fr) * 2021-07-16 2023-01-19 Hewlett-Packard Development Company, L.P. Identification de l'expression faciale d'un porteur d'un visiocasque
US20230035961A1 (en) * 2021-07-29 2023-02-02 Fannie Liu Emoji recommendation system using user context and biosignals
US11765115B2 (en) * 2021-07-29 2023-09-19 Snap Inc. Emoji recommendation system using user context and biosignals

Also Published As

Publication number Publication date
JP6850723B2 (ja) 2021-03-31
JPWO2017006872A1 (ja) 2018-04-19

Similar Documents

Publication Publication Date Title
WO2017006872A1 (fr) Système, procédé et programme d'identification d'expression faciale
US10986270B2 (en) Augmented reality display with frame modulation functionality
KR102402467B1 (ko) 혼합 현실 교정을 위한 안구주위 테스트
US20230105027A1 (en) Adapting a virtual reality experience for a user based on a mood improvement score
KR102516112B1 (ko) 증강 현실 아이덴티티 검증
US11430169B2 (en) Animating virtual avatar facial movements
US11797105B2 (en) Multi-modal hand location and orientation for avatar movement
JP2021509495A (ja) ディスプレイデバイスのための向上された姿勢決定
CN112106066A (zh) 根据眼睛跟踪相机的面部表情
US9482622B2 (en) Methods and apparatus for surface classification
WO2019246044A1 (fr) Systèmes d'affichage montés sur la tête à fonctionnalité d'économie d'énergie
US11127181B2 (en) Avatar facial expression generating system and method of avatar facial expression generation
TWI829944B (zh) 虛擬化身臉部表情產生系統和虛擬化身臉部表情產生方法
US11720168B1 (en) Inferred body movement using wearable RF antennas
US20240028294A1 (en) Automatic Quantitative Food Intake Tracking
US20220331196A1 (en) Biofeedback-based control of sexual stimulation devices
US11861778B1 (en) Apparatus and method for generating a virtual avatar
US20230372190A1 (en) Adaptive speech and biofeedback control of sexual stimulation devices

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16821341

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2017527431

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16821341

Country of ref document: EP

Kind code of ref document: A1