JP7037159B2 - Systems, programs, and methods for measuring a subject's jaw movements

Info

Publication number: JP7037159B2
Application number: JP2021515056A
Authority: JP
Inventors: 基文十河; 善之木戸; 一徳野▲崎▼; 一典池邉; 哲山口; 雅也西願
Original assignee: Osaka University NUC; iLAND Solutions Inc
Current assignee: Osaka University NUC; iLAND Solutions Inc
Priority date: 2019-11-08
Filing date: 2020-11-06
Publication date: 2022-03-16
Anticipated expiration: 2040-11-06
Also published as: WO2021090921A1; JPWO2021090921A1; JP2022074153A

Description

The present invention relates to a system, a program, and a method for measuring a subject's jaw movement. It further relates to a system, a program, and a method for constructing a motion point trajectory trained model used to measure a subject's jaw movement.

Evaluation of jaw movement is one of the diagnostic procedures in dentistry. Jaw movement measurement includes "border movement measurement", which checks the range of motion of the jawbone while the subject is not chewing (non-functional state), and "masticatory jaw movement measurement", which measures mandibular movement (masticatory movement) during chewing (functional state). In dental practice, these measurements are applied to patients with temporomandibular disorders, patients who need prosthetic replacement of missing teeth, and others.

Patent Document 1 discloses a technique for measuring jaw movement. This technique requires a head frame and a mandibular marker to be fixed to the patient. With conventional techniques for measuring jaw movement such as the one shown in Patent Document 1, the patient had to visit the hospital in person so that the instruments needed for the measurement could be fitted. This can be a considerable burden and inconvenience for patients, doctors, and others. Moreover, all of these instruments are purpose-built and can be very expensive.

Patent Document 1: Japanese Unexamined Patent Publication No. 2002-336282

An object of the present invention is to provide a system and the like capable of easily measuring a subject's jaw movement.

In one embodiment of the present invention, a system for measuring jaw movement comprises: acquisition means for acquiring a plurality of consecutive images of a subject's face during jaw movement; correction means for at least correcting the coordinate system of the subject's face; extraction means for at least extracting a motion point in the mandibular region of the face; and generation means for at least generating motion point trajectory information indicating the trajectory of the motion point by tracking the motion point.

In one embodiment, the extraction means comprises first extraction means for extracting a motion point in the mandibular region of the face and second extraction means for extracting, from the plurality of images, a fixed point in the upper facial region of the face, and the correction means corrects the coordinate system based on the fixed point and a predefined face reference position template.

In one embodiment, the first extraction means extracts a plurality of feature portions in the plurality of images and extracts, as a motion point, a feature portion among them whose coordinate change within a predetermined period is within a predetermined range, and the second extraction means extracts, as a fixed point, a feature portion among the plurality of feature portions whose coordinate change within the predetermined period is less than a predetermined threshold.
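The selection rule in the preceding paragraph can be illustrated with a minimal Python sketch. It is not taken from the specification: the function names, the use of the largest displacement from the first frame as the "coordinate change", and the numeric threshold and range are all assumptions made for illustration.

```python
from math import hypot


def coordinate_change(track):
    """Largest displacement of a tracked feature from its first-frame position.

    `track` is a list of (x, y) positions, one per frame in the period.
    Using the maximum displacement as the "coordinate change" is an assumption.
    """
    x0, y0 = track[0]
    return max(hypot(x - x0, y - y0) for x, y in track)


def classify_features(tracks, fixed_threshold, motion_range):
    """Split tracked feature portions into fixed points and motion points.

    tracks          : dict mapping a feature name to its list of (x, y) positions
    fixed_threshold : a change below this value marks a fixed point
    motion_range    : (low, high); a change inside this range marks a motion point
    Features that satisfy neither condition are discarded.
    """
    low, high = motion_range
    fixed, motion = [], []
    for name, track in tracks.items():
        change = coordinate_change(track)
        if change < fixed_threshold:
            fixed.append(name)
        elif low <= change <= high:
            motion.append(name)
    return fixed, motion


# Hypothetical tracks: a forehead feature, a chin feature, and a tracking glitch.
tracks = {
    "forehead": [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)],
    "chin": [(0.0, 10.0), (0.0, 15.0), (0.0, 12.0)],
    "glitch": [(0.0, 0.0), (100.0, 100.0)],
}
fixed_points, motion_points = classify_features(tracks, 1.0, (2.0, 30.0))
```

In this sketch the upper bound of the range serves to reject implausibly large jumps, such as tracking failures, while the lower bound separates genuine jaw motion from small jitter.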

In one embodiment, the correction means comprises first correction means for correcting the coordinate system of the subject's face, second generation means for generating fixed point trajectory information indicating the trajectory of the fixed point by tracking the fixed point, and second correction means for correcting the motion point trajectory information based on the fixed point trajectory information.
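One simple way to picture the correction of the motion point trajectory by the fixed point trajectory is to subtract, frame by frame, the displacement of the fixed point from the motion point. The Python sketch below does only that. It is an assumption for illustration, not the method of the specification: it compensates for translation of the head but not for rotation or scale, and the function name is hypothetical.

```python
def correct_trajectory(motion_track, fixed_track):
    """Remove head translation from a motion point trajectory.

    Both arguments are lists of (x, y) positions with one entry per frame.
    The displacement of the fixed point relative to its first frame is
    treated as head movement and subtracted from the motion point.
    """
    if len(motion_track) != len(fixed_track):
        raise ValueError("trajectories must cover the same frames")
    fx0, fy0 = fixed_track[0]
    return [
        (mx - (fx - fx0), my - (fy - fy0))
        for (mx, my), (fx, fy) in zip(motion_track, fixed_track)
    ]


# Hypothetical example: the head drifts to the right while the jaw opens and closes.
chin = [(0, 10), (2, 15), (4, 12)]
forehead = [(0, 0), (2, 0), (4, 0)]
corrected = correct_trajectory(chin, forehead)
```

After the correction, the lateral drift shared by both points is gone and only the vertical jaw movement remains.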

In one embodiment, the correction means comprises a reference coordinate system trained model that has undergone processing to learn the reference coordinate system of the faces of a plurality of subjects, the reference coordinate system trained model being configured to correct the coordinate system of a subject's face in an input image to the reference coordinate system.

In one embodiment, the reference coordinate system trained model corrects the coordinate system by taking the difference between the coordinate system of the subject's face in the input image and the reference coordinate system, and by transforming the input image based on the difference.

In one embodiment, the transformation includes an affine transformation.
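As a rough illustration of such a transformation, the Python sketch below computes a similarity transform (rotation, uniform scale, and translation, which is a special case of an affine transformation) that maps two detected facial points onto their positions in a reference coordinate system, and then applies it to another point. The choice of two points, the use of complex arithmetic, and all names and coordinates are assumptions for illustration. The specification does not state how the transformation is derived, and the trained model described above is not reproduced here.

```python
def similarity_from_two_points(src, dst):
    """Return (a, b) such that z -> a * z + b maps src[0], src[1] onto dst[0], dst[1].

    Points are (x, y) tuples treated as complex numbers x + yj.
    The complex factor a encodes rotation and uniform scale, b the translation.
    """
    p1, p2 = (complex(*p) for p in src)
    q1, q2 = (complex(*q) for q in dst)
    if p1 == p2:
        raise ValueError("source points must be distinct")
    a = (q2 - q1) / (p2 - p1)
    b = q1 - a * p1
    return a, b


def apply_transform(transform, point):
    """Apply the transform returned by similarity_from_two_points to one point."""
    a, b = transform
    z = a * complex(*point) + b
    return (z.real, z.imag)


# Hypothetical example: the pupils are detected at (10, 20) and (30, 20) in the
# image, and the reference coordinate system places them at (0, 0) and (1, 0).
transform = similarity_from_two_points([(10, 20), (30, 20)], [(0, 0), (1, 0)])
chin_in_reference = apply_transform(transform, (20, 40))
```

A general affine transformation has six parameters in two dimensions and would need at least three point correspondences; the two-point version is used here only to keep the sketch short.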

In one embodiment, the extraction means extracts a plurality of feature portions in the plurality of images and extracts, as a motion point, a pixel among the plurality of feature portions whose coordinate change within a predetermined period is within a predetermined range.

In one embodiment, the correction means comprises base face model generation means for generating a base face model of the subject's face, and jaw movement face model generation means for generating a jaw movement face model of the subject by reflecting the subject's face in the plurality of images in the base face model, and the correction means corrects the coordinate system by correcting the coordinate system of the jaw movement face model based on the coordinate system of the base face model.

In one embodiment, the extraction means extracts a motion point in the jaw movement face model or the base face model.

In one embodiment, the system of the present invention further comprises evaluation means for generating, based at least on the generated motion point trajectory information, jaw movement evaluation information indicating an evaluation of the subject's jaw movement.

In one embodiment, the system of the present invention further comprises a marker configured to be placed on the mandibular region of the subject, and the extraction means extracts the marker in the plurality of images as a motion point.

In one embodiment, the system further comprises a marker configured to be placed on the upper facial region of the subject, and the second extraction means extracts the marker in the plurality of images as a fixed point.

In one embodiment, the marker is configured to indicate a specific point on the marker.

In one embodiment of the present invention, a program for measuring jaw movement is executed in a system comprising a processor unit, and the program causes the processor unit to perform processing including: acquiring a plurality of consecutive images of a subject's face during jaw movement; at least correcting the coordinate system of the subject's face; at least extracting a motion point in the mandibular region of the face; and at least generating motion point trajectory information indicating the trajectory of the motion point by tracking the motion point.

In one embodiment of the present invention, a method for measuring jaw movement includes: acquiring a plurality of consecutive images of a subject's face during jaw movement; at least correcting the coordinate system of the subject's face; at least extracting a motion point in the mandibular region of the face; and at least generating motion point trajectory information indicating the trajectory of the motion point by tracking the motion point.

In one embodiment of the present invention, a system for measuring jaw movement comprises: acquisition means for acquiring a plurality of consecutive images of a subject's face during jaw movement; extraction means for at least extracting, based on the plurality of images, a motion point in the mandibular region of the face; generation means for at least generating motion point trajectory information indicating the trajectory of the motion point by tracking the motion point; and evaluation means for generating, based at least on the motion point trajectory information, jaw movement evaluation information indicating an evaluation of the subject's jaw movement. The evaluation means comprises a motion point trajectory trained model that has undergone processing to learn the motion point trajectory information of a plurality of subjects, and the motion point trajectory trained model is configured to correlate input motion point trajectory information with an evaluation of jaw movement.

In one embodiment of the present invention, a program for measuring jaw movement is executed in a system comprising a processor unit, and the program causes the processor unit to perform processing including: acquiring a plurality of consecutive images of a subject's face during jaw movement; at least extracting, based on the plurality of images, a motion point in the mandibular region of the face; at least generating motion point trajectory information indicating the trajectory of the motion point by tracking the motion point; and generating, based at least on the motion point trajectory information, jaw movement evaluation information indicating an evaluation of the subject's jaw movement by using a motion point trajectory trained model that has undergone processing to learn the motion point trajectory information of a plurality of subjects, the motion point trajectory trained model being configured to correlate input motion point trajectory information with an evaluation of jaw movement.

In one embodiment of the present invention, a method for measuring jaw movement includes: acquiring a plurality of consecutive images of a subject's face during jaw movement; at least extracting, based on the plurality of images, a motion point in the mandibular region of the face; at least generating motion point trajectory information indicating the trajectory of the motion point by tracking the motion point; and generating, based at least on the motion point trajectory information, jaw movement evaluation information indicating an evaluation of the subject's jaw movement by using a motion point trajectory trained model that has undergone processing to learn the motion point trajectory information of a plurality of subjects, the motion point trajectory trained model being configured to correlate input motion point trajectory information with an evaluation of jaw movement.

In one embodiment of the present invention, a system for constructing a motion point trajectory trained model used to measure a subject's jaw movement comprises: acquisition means for at least acquiring motion point trajectory information indicating the trajectories of motion points obtained by tracking the motion points of a plurality of subjects; and construction means for constructing a motion point trajectory trained model by a learning process that uses at least the motion point trajectory information of the plurality of subjects as input teacher data.

In one embodiment, the learning process is supervised learning, the acquisition means acquires evaluations of the jaw movements of the plurality of subjects, and the acquired evaluations of jaw movement are used as output teacher data.
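The pairing of input teacher data (motion point trajectories) with output teacher data (evaluations) can be illustrated with a deliberately small Python sketch. A nearest-neighbour classifier over two hand-picked features is substituted here purely for brevity. The learning algorithm, the features, the labels, and the data below are all assumptions and are not taken from the specification.

```python
from math import dist


def trajectory_features(track):
    """Reduce a trajectory to (lateral range, vertical range); an assumed feature set."""
    xs = [x for x, _ in track]
    ys = [y for _, y in track]
    return (max(xs) - min(xs), max(ys) - min(ys))


def fit(trajectories, evaluations):
    """'Train' a 1-nearest-neighbour model by storing features with their labels.

    trajectories : input teacher data, one list of (x, y) positions per subject
    evaluations  : output teacher data, one evaluation label per subject
    """
    return [(trajectory_features(t), e) for t, e in zip(trajectories, evaluations)]


def evaluate(model, track):
    """Return the evaluation label of the stored trajectory closest in feature space."""
    query = trajectory_features(track)
    _, label = min(model, key=lambda item: dist(item[0], query))
    return label


# Hypothetical training data: one mostly vertical trajectory and one with large
# lateral deviation, labelled by an examiner.
training_trajectories = [
    [(0, 0), (1, 10), (0, 20)],
    [(0, 0), (8, 3), (-8, 5)],
]
training_evaluations = ["normal", "abnormal"]
model = fit(training_trajectories, training_evaluations)
result = evaluate(model, [(0, 0), (2, 9), (1, 18)])
```

Any supervised learner could take the place of the nearest-neighbour rule; the sketch shows only how the two kinds of teacher data are paired during construction of the model and how a new trajectory is then mapped to an evaluation.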

In one embodiment, the learning process is unsupervised learning, and the system of the present invention further comprises classification means for classifying the output of the constructed motion point trajectory trained model.

In one embodiment of the present invention, a program for constructing a motion point trajectory trained model used to measure a subject's jaw movement is executed in a system comprising a processor unit, and the program causes the processor unit to perform processing including: at least acquiring motion point trajectory information indicating the trajectories of motion points obtained by tracking the motion points of a plurality of subjects; and constructing a motion point trajectory trained model by a learning process that uses at least the motion point trajectory information of the plurality of subjects as input teacher data.

In one embodiment, a method for constructing a motion point trajectory trained model used to measure a subject's jaw movement includes: at least acquiring motion point trajectory information indicating the trajectories of motion points obtained by tracking the motion points of a plurality of subjects; and constructing a motion point trajectory trained model by a learning process that uses at least the motion point trajectory information of the plurality of subjects as input teacher data.

According to the present invention, it is possible to provide a system and the like capable of easily measuring a subject's jaw movement. By using the system of the present invention, a user can measure jaw movement regardless of location, for example at a workplace or at home as well as at a medical institution.

A diagram showing an example of a flow 10 for easily measuring a patient's jaw movement using an embodiment of the present invention.
A diagram showing an example of the configuration of a computer system 100 for measuring a subject's jaw movement.
A diagram showing an example of the configuration of a processor unit 120 in one embodiment.
A diagram showing an example of the result of extraction of a plurality of facial feature portions by extraction means 122.
A diagram comparing the results of extracting facial feature portions using Openface for two frames of a video in which the angle between the face and the camera is shifted.
A diagram showing an example of the configuration of a processor unit 130 in one embodiment.
A diagram showing an example of the configuration of a processor unit 140 in another embodiment.
A diagram showing an example of the configuration of a processor unit 150 in another embodiment.
A diagram showing an example of the configuration of a processor unit 160 in another embodiment.
A diagram showing an example of the structure of a neural network model 1610 that can be used by evaluation means 161.
A flowchart showing an example (process 800) of processing by the computer system 100 for measuring a subject's jaw movement.
A flowchart showing another example (process 900) of processing by the computer system 100 for measuring a subject's jaw movement.
A flowchart showing another example (process 1000) of processing by the computer system 100 for measuring a subject's jaw movement.
A diagram showing an example of a marker.
A diagram showing an example of a marker.

(Definitions)
As used herein, the "mandibular region" refers to the region of the face overlying the mandible.

As used herein, the "upper facial region" refers to the region of the face other than the mandibular region. That is, the face is divided into two regions: the "mandibular region" and the "upper facial region".

As used herein, the "face coordinate system" refers to a coordinate system defined on the face. The face coordinate system consists of, for example, a horizontal system (x-axis), a sagittal system (y-axis), and a coronal system (z-axis). The horizontal system is defined along, for example, the Frankfort plane (eye-ear plane), the ala-tragus line (or Camper's plane), the HIP plane, the occlusal plane, or the interpupillary line. The sagittal system is defined along, for example, the midline or the midsagittal plane. The coronal system is defined along, for example, the orbital plane. The planes or lines that define the horizontal, sagittal, and coronal systems are not limited to those listed above and may be any plane or line.

As used herein, the "face reference coordinate system" refers to the face coordinate system that a person facing forward normally has. The face reference coordinate system is derived by learning from a plurality of facial images.

As used herein, a "motion point" refers to a point or region on a site that moves with jaw movement.

As used herein, a "fixed point" refers to a point or region on a site that does not move with jaw movement.

As used herein, a "base face model" refers to a three-dimensional model of a face, a so-called "3D avatar". Reflecting in the base face model the movement of feature portions extracted from a video of actual jaw movement yields a "jaw movement face model" that reflects the actual jaw movement. The jaw movement face model is, so to speak, a "moving 3D avatar" that moves in step with the jaw movement in the video.

As used herein, "about" means ±10% of the numerical value that follows.

Embodiments of the present invention are described below with reference to the drawings.

1. Flow for easily measuring a patient's jaw movement
FIG. 1 shows an example of a flow 10 for easily measuring a patient's jaw movement using an embodiment of the present invention. In the flow 10, a dentist simply records a video of the patient 20 chewing, the jaw movement of the patient 20 is measured (that is, masticatory jaw movement measurement is performed), and the measurement result is provided to the dentist.

First, the dentist uses a terminal device 300 (for example, a smartphone or a tablet) to record a video of the patient 20 chewing. Because a video can be regarded as a plurality of consecutive images (still images), "video" and "a plurality of consecutive images" are used interchangeably in this specification.

In step S1, the recorded video is provided to a server device 30. The manner in which the video is provided to the server device 30 is not limited. For example, the video may be provided to the server device 30 via a network (for example, the Internet or a LAN), or via a storage medium (for example, removable media).

Next, the server device 30 processes the video provided in step S1. This processing generates a measurement result of the jaw movement of the patient 20. The measurement result may be, for example, information representing the trajectory of the jaw movement. Alternatively, the measurement result may be, for example, jaw movement evaluation information that evaluates the jaw movement of the patient 20 and is generated based on the information representing the trajectory of the jaw movement. The jaw movement evaluation information includes, for example, information on whether the trajectory of the jaw movement follows a normal pattern.

In step S2, the measurement result of the jaw movement generated by the server device 30 is provided to the dentist. The manner in which the measurement result is provided is not limited. For example, the measurement result may be provided via a network, via a storage medium, or on paper.

The dentist reviews the measurement result of the jaw movement of the patient 20 provided in step S2.

In this way, the dentist can easily measure the jaw movement of the patient 20 simply by recording the patient 20 chewing with the terminal device 300, without needing any special instruments.

The example above describes masticatory jaw movement measurement in the flow 10, but the present invention is not limited to this. For example, border movement measurement can also be performed in the flow 10. This can be achieved, for example, by the dentist using the terminal device 300 to record a video of the patient 20 moving the jaw widely (for example, opening the mouth wide or protruding the jaw) and by the server device 30 processing the recorded video.

The example above describes the flow 10 in which a dentist receives the measurement result of jaw movement simply by recording a video of the jaw movement of the patient 20, but the present invention is not limited to this. For example, a flow is also possible in which the patient 20 simply records a video of himself or herself chewing using his or her own terminal device 300, or simply has another person record such a video using the terminal device 300, and the measurement result of the jaw movement of the patient 20 is provided to a doctor, a dentist, a caregiver, the family of the patient 20, or the patient 20 himself or herself.

The processing by the server device 30 described above can also be performed by the terminal device 300. In this case, the server device 30 can be omitted and the terminal device 300 can operate standalone. In the above example in which the measurement result of the jaw movement of the patient 20 is provided to a doctor, dentist, caregiver, or family member, the measurement result can be provided from the terminal device 300, without passing through the server device 30, to a hospital doctor or dentist, a caregiver, the family of the patient 20, or the patient 20 himself or herself.

The flow 10 described above can be realized using the computer system 100 of the present invention, which is described below.

2. Configuration of a computer system for measuring a subject's jaw movement
FIG. 2 shows an example of the configuration of a computer system 100 for measuring a subject's jaw movement.

The computer system 100 is connected to a database unit 200. The computer system 100 may also be connected to at least one terminal device 300 via a network 400.

The network 400 may be any type of network. The network 400 may be, for example, the Internet or a LAN, and may be a wired network or a wireless network.

An example of the computer system 100 is a server device, but the computer system 100 is not limited to this. An example of the terminal device 300 is a computer held by a user (for example, a terminal device) or a computer installed in a hospital (for example, a terminal device), but the terminal device 300 is not limited to these. Here, the computer (server device or terminal device) may be any type of computer. For example, the terminal device may be any type of terminal device such as a smartphone, a tablet, a personal computer, or smart glasses.

The computer system 100 comprises an interface unit 110, a processor unit 120, and a memory unit 170.

The interface unit 110 exchanges information with the outside of the computer system 100. Via the interface unit 110, the processor unit 120 of the computer system 100 can receive information from outside the computer system 100 and can transmit information to outside the computer system 100. The interface unit 110 can exchange information in any format.

The interface unit 110 comprises, for example, an input unit that allows information to be input to the computer system 100. The manner in which the input unit allows information to be input to the computer system 100 is not limited. For example, when the input unit is a touch panel, the user may input information by touching the touch panel. When the input unit is a mouse, the user may input information by operating the mouse. When the input unit is a keyboard, the user may input information by pressing keys on the keyboard. When the input unit is a microphone, the user may input information by speaking into the microphone. When the input unit is a camera, information captured by the camera may be input. When the input unit is a data reading device, information may be input by reading it from a storage medium connected to the computer system 100. When the input unit is a receiver, information may be input by the receiver receiving it from outside the computer system 100 via the network 400.

インターフェース部１１０は、例えば、コンピュータシステム１００から情報を出力することを可能にする出力部を備える。出力部が、どのような態様でコンピュータシステム１００から情報を出力することを可能にするかは問わない。例えば、出力部が表示画面である場合、表示画面に情報を出力するようにしてもよい。あるいは、出力部がスピーカである場合には、スピーカからの音声によって情報を出力するようにしてもよい。あるいは、出力部がデータ書き込み装置である場合、コンピュータシステム１００に接続された記憶媒体に情報を書き込むことによって情報を出力するようにしてもよい。あるいは、出力部が送信器である場合、送信器がネットワーク４００を介してコンピュータシステム１００の外部に情報を送信することにより出力してもよい。この場合、ネットワークの種類は問わない。例えば、送信器は、インターネットを介して情報を送信してもよいし、ＬＡＮを介して情報を送信してもよい。 The interface unit 110 includes, for example, an output unit that enables information to be output from the computer system 100. It does not matter in what manner the output unit enables the information to be output from the computer system 100. For example, when the output unit is a display screen, information may be output to the display screen. Alternatively, when the output unit is a speaker, the information may be output by the voice from the speaker. Alternatively, when the output unit is a data writing device, the information may be output by writing the information to the storage medium connected to the computer system 100. Alternatively, when the output unit is a transmitter, the transmitter may output information by transmitting information to the outside of the computer system 100 via the network 400. In this case, the type of network does not matter. For example, the transmitter may transmit information via the Internet or may transmit information via LAN.

例えば、出力部は、コンピュータシステム１００によって生成された運動点軌跡情報を出力することができる。例えば、出力部は、コンピュータシステム１００によって生成された顎運動評価情報を出力することができる。 For example, the output unit can output the motion point trajectory information generated by the computer system 100. For example, the output unit can output the jaw movement evaluation information generated by the computer system 100.

プロセッサ部１２０は、コンピュータシステム１００の処理を実行し、かつ、コンピュータシステム１００全体の動作を制御する。プロセッサ部１２０は、メモリ部１７０に格納されているプログラムを読み出し、そのプログラムを実行する。これにより、コンピュータシステム１００を所望のステップを実行するシステムとして機能させることが可能である。プロセッサ部１２０は、単一のプロセッサによって実装されてもよいし、複数のプロセッサによって実装されてもよい。 The processor unit 120 executes the processing of the computer system 100 and controls the operation of the entire computer system 100. The processor unit 120 reads a program stored in the memory unit 170 and executes the program. This makes it possible to make the computer system 100 function as a system that performs a desired step. The processor unit 120 may be implemented by a single processor or may be implemented by a plurality of processors.

メモリ部１７０は、コンピュータシステム１００の処理を実行するために必要とされるプログラムやそのプログラムの実行に必要とされるデータ等を格納する。メモリ部１７０は、被験者の顎運動を測定するための処理をプロセッサ部１２０に行わせるためのプログラム（例えば、後述する図８、図９に示される処理を実現するプログラム）、および、被験者の顎運動を測定するために利用される運動点軌跡学習済モデルを構築するための処理（例えば、後述する図１０に示される処理を実現するプログラム）を格納してもよい。ここで、プログラムをどのようにしてメモリ部１７０に格納するかは問わない。例えば、プログラムは、メモリ部１７０にプリインストールされていてもよい。あるいは、プログラムは、ネットワークを経由してダウンロードされることによってメモリ部１７０にインストールされるようにしてもよい。この場合、ネットワークの種類は問わない。メモリ部１７０は、任意の記憶手段によって実装され得る。 The memory unit 170 stores programs required for executing the processing of the computer system 100, data required for executing those programs, and the like. The memory unit 170 may store a program for causing the processor unit 120 to perform processing for measuring the jaw movement of a subject (for example, a program that realizes the processing shown in FIGS. 8 and 9 described later), and a program for causing it to perform processing for constructing a motion point trajectory trained model used for measuring the jaw movement of a subject (for example, a program that realizes the processing shown in FIG. 10 described later). It does not matter how a program is stored in the memory unit 170. For example, the program may be pre-installed in the memory unit 170. Alternatively, the program may be installed in the memory unit 170 by being downloaded via a network. In this case, the type of network does not matter. The memory unit 170 may be implemented by any storage means.

データベース部２００には、例えば、複数の被験者のそれぞれについて、顎運動中の被験者の顔の連続した複数の画像が格納され得る。顎運動中の被験者の顔の連続した複数の画像は、例えば、端末装置３００からネットワーク４００を介してデータベース部２００に送信されたものであってもよいし、例えば、コンピュータシステム１００が備える撮影手段によって撮影されたものであってもよい。複数の被験者の顎運動中の顔の連続した複数の画像は、後述する顔基準座標系学習済モデルを構築するために利用され得る。 The database unit 200 may store, for example, a plurality of consecutive images of the face of a subject during jaw movement for each of a plurality of subjects. The plurality of consecutive images of the subject's face during jaw movement may be, for example, images transmitted from the terminal device 300 to the database unit 200 via the network 400, or images captured by a photographing means provided in the computer system 100. The consecutive images of the faces of the plurality of subjects during jaw movement can be used to construct the face reference coordinate system trained model described later.

データベース部２００には、例えば、複数の被験者のそれぞれについて、顎運動中の顔の連続した複数の画像またはそれらの画像から導出された運動点軌跡情報と、その被験者の顎運動の評価とが関連付けて格納され得る。格納されたデータは、後述する運動点軌跡学習済モデルを構築するために利用され得る。 In the database unit 200, for example, for each of a plurality of subjects, a plurality of consecutive images of the face during jaw movement, or motion point trajectory information derived from those images, may be stored in association with an evaluation of that subject's jaw movement. The stored data can be used to construct the motion point trajectory trained model described later.

また、データベース部２００には、例えば、コンピュータシステム１００によって出力された運動点軌跡情報、または顎運動評価情報が格納され得る。 Further, the database unit 200 may store, for example, motion point trajectory information output by the computer system 100 or jaw motion evaluation information.

端末装置３００は、少なくとも、カメラ等の撮影手段を備える。撮影手段は、少なくとも連続した複数の画像を撮影することが可能である限り、任意の構成を備え得る。撮影手段は、顎運動中の被験者の顔の連続した複数の画像を撮影するために利用される。撮影される画像は、２次元情報（縦×横）を含む画像であってもよいし、３次元情報（縦×横×奥行）を含む画像であってもよい。 The terminal device 300 includes at least a photographing means such as a camera. The photographing means may have any configuration as long as it can capture at least a plurality of consecutive images. The photographing means is used to capture a plurality of consecutive images of the subject's face during jaw movement. The captured images may be images containing two-dimensional information (vertical x horizontal) or images containing three-dimensional information (vertical x horizontal x depth).

図３は、一実施形態におけるプロセッサ部１２０の構成の一例を示す。 FIG. 3 shows an example of the configuration of the processor unit 120 in one embodiment.

プロセッサ部１２０は、取得手段１２１と、抽出手段１２２と、生成手段１２３とを備える。 The processor unit 120 includes an acquisition means 121, an extraction means 122, and a generation means 123.

取得手段１２１は、顎運動中の被験者の顔の連続した複数の画像を取得するように構成されている。取得手段１２１は、例えば、データベース部２００に格納されている顎運動中の被験者の顔の連続した複数の画像をインターフェース部１１０を介して取得することができる。あるいは、取得手段１２１は、例えば、インターフェース部１１０を介して端末装置３００から受信された連続した複数の画像を取得することができる。 The acquisition means 121 is configured to acquire a plurality of consecutive images of the face of the subject during jaw movement. The acquisition means 121 can acquire, for example, a plurality of consecutive images of the subject's face during jaw movement stored in the database unit 200 via the interface unit 110. Alternatively, the acquisition means 121 can acquire a plurality of continuous images received from the terminal device 300 via the interface unit 110, for example.

取得手段１２１は、後述する運動点軌跡学習済モデルを構築するために、複数の被験者の運動点を追跡することによって得られた運動点の軌跡を示す運動点軌跡情報を取得するように構成されてもよい。運動点軌跡情報は、例えば、本発明のコンピュータシステム１００を用いて取得された運動点軌跡情報であってもよいし、公知の任意の顎運動測定装置から得られた運動点軌跡情報であってもよい。取得手段１２１は、例えば、データベース部２００に格納されている運動点軌跡情報をインターフェース部１１０を介して取得することができる。取得手段１２１は、さらに、後述する運動点軌跡学習済モデルを構築するために、複数の被験者の顎運動の評価を取得するように構成されてもよい。取得手段１２１は、例えば、データベース部２００に格納されている顎運動の評価をインターフェース部１１０を介して取得することができる。 The acquisition means 121 may be configured to acquire motion point trajectory information indicating the trajectories of motion points obtained by tracking the motion points of a plurality of subjects, in order to construct the motion point trajectory trained model described later. The motion point trajectory information may be, for example, motion point trajectory information acquired using the computer system 100 of the present invention, or motion point trajectory information obtained from any known jaw movement measuring device. The acquisition means 121 can acquire, for example, the motion point trajectory information stored in the database unit 200 via the interface unit 110. The acquisition means 121 may be further configured to acquire evaluations of the jaw movements of a plurality of subjects in order to construct the motion point trajectory trained model described later. The acquisition means 121 can acquire, for example, the evaluations of jaw movement stored in the database unit 200 via the interface unit 110.

抽出手段１２２は、顔の下顎領域内の運動点を少なくとも抽出するように構成されている。抽出手段１２２は、例えば、取得手段１２１が取得した画像から運動点を抽出することができる。抽出手段１２２は、例えば、後述する補正手段１３１による出力から運動点を抽出することができる。抽出手段１２２は、例えば、画像中の運動点がどこであるかの入力を受け、その入力に基づいて、運動点を抽出するようにしてもよいし、入力を受けることなく、自動的に運動点を抽出するようにしてもよい。抽出手段１２２は、例えば、顔の正中線上の点または領域を運動点として自動的に抽出することができる。顔の下顎領域内の正中線上の点は、顎運動によって大きく動くため、顎運動を評価するために好適である。抽出手段１２２は、例えば、下顎の先端の点を運動点として自動的に抽出するようにしてもよい。 The extraction means 122 is configured to extract at least a motion point in the mandibular region of the face. The extraction means 122 can, for example, extract the motion point from the images acquired by the acquisition means 121. The extraction means 122 can also, for example, extract the motion point from the output of the correction means 131 described later. The extraction means 122 may, for example, receive an input indicating where the motion point is in the image and extract the motion point based on that input, or may extract the motion point automatically without receiving an input. The extraction means 122 can, for example, automatically extract a point or region on the midline of the face as the motion point. A point on the midline within the mandibular region of the face moves significantly with jaw movement and is therefore suitable for evaluating jaw movement. The extraction means 122 may, for example, automatically extract the point at the tip of the lower jaw as the motion point.

一実施形態において、抽出手段１２２が運動点を自動的に抽出する場合、抽出手段１２２は、まず、複数の画像の各々に対して、顔の複数の特徴部分を検出する。顔の複数の特徴部分のうちの１つは、例えば、画像中の１以上のピクセルとして表現され得る。顔の複数の特徴部分は、例えば、目、眉毛、鼻、口、眉間、顎、耳、喉、輪郭等に該当する部分であり得る。抽出手段１２２による顔の複数の特徴部分の抽出は、例えば、複数の被験者の顔画像を用いて特徴部分を学習する処理を施された学習済モデルを用いて行うことができる。一例において、抽出手段１２２による顔の複数の特徴部分の抽出は、顔認識アプリケーション「Ｏｐｅｎｆａｃｅ」を用いて行うことができる（Tadas Baltrusaitis, Peter Robinson, Louis-Philippe Morency、「OpenFace: an open source facial behavior analysis toolkit」、2016 IEEE Winter Conference on Applications of Computer Vision (WACV)、pp.1-10、2016）。「Ｏｐｅｎｆａｃｅ」は、顔の２次元情報（縦×横）を含む画像から顔の３次元情報（縦×横×奥行）を出力できるソフトウェアである。 In one embodiment, when the extraction means 122 extracts the motion point automatically, the extraction means 122 first detects a plurality of feature portions of the face in each of the plurality of images. Each feature portion can be represented, for example, as one or more pixels in the image. The feature portions of the face may be, for example, portions corresponding to the eyes, eyebrows, nose, mouth, glabella, chin, ears, throat, facial contour, and the like. The extraction of the feature portions by the extraction means 122 can be performed, for example, using a trained model that has learned feature portions from facial images of a plurality of subjects. In one example, the extraction of the feature portions by the extraction means 122 can be performed using the face recognition application "OpenFace" (Tadas Baltrusaitis, Peter Robinson, Louis-Philippe Morency, "OpenFace: an open source facial behavior analysis toolkit", 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1-10, 2016). "OpenFace" is software that can output three-dimensional information (vertical x horizontal x depth) of a face from an image containing two-dimensional information (vertical x horizontal) of the face.

抽出手段１２２は、例えば、画像に対して色強調処理を行い、色調が周囲と異なる部分を特徴部分として抽出するようにしてもよい。これは、眉毛等の周囲と色調が明確に異なる部分を抽出する際に特に好ましい。 For example, the extraction means 122 may perform color enhancement processing on an image so as to extract a portion having a color tone different from that of the surroundings as a feature portion. This is particularly preferable when extracting a portion having a color tone that is clearly different from the surroundings such as eyebrows.

図４は、抽出手段１２２による顔の複数の特徴部分の抽出の結果の一例を示す。抽出手段１２２は、顔認識アプリケーション「Ｏｐｅｎｆａｃｅ」を用いて、顔の複数の特徴部分を抽出している。抽出された複数の特徴部分のそれぞれが、黒点４０００で表示されている。「Ｏｐｅｎｆａｃｅ」では、輪郭が特徴部分として抽出され、輪郭内部の目立つ特徴がない部分（例えば、頬、口と顎との間等）を抽出することはできない。 FIG. 4 shows an example of the result of extracting a plurality of feature portions of a face by the extraction means 122. Here, the extraction means 122 uses the face recognition application "OpenFace" to extract the feature portions. Each extracted feature portion is displayed as a black dot 4000. With "OpenFace", the facial contour is extracted as a feature portion, but portions inside the contour that have no conspicuous features (for example, the cheeks or the area between the mouth and the chin) cannot be extracted.

抽出手段１２２は、抽出された複数の特徴部分のうち、下顎領域内の点４１００を運動点として抽出する。抽出手段１２２は、例えば、抽出された複数の特徴部分のうち、所定期間内の座標変化が所定の範囲内の部分を運動点として抽出することができる。ここで、所定期間は、例えば、連続した複数の画像が撮影された期間のすべてまたは一部であり得る。所定範囲は、例えば、約５ｍｍ～約２０ｍｍの範囲等であり得る。これにより、顎運動による運動以外で座標変化している特徴部分を誤って運動点として抽出することを回避することができる。例えば、被験者を撮影する際の被験者の身体の大きな動きを誤って抽出することを回避することができる。 The extraction means 122 extracts the point 4100 in the mandibular region as a motion point from the extracted feature portions. The extraction means 122 can, for example, extract a portion of the extracted plurality of feature portions whose coordinate change within a predetermined period is within a predetermined range as a motion point. Here, the predetermined period may be, for example, all or a part of the period in which a plurality of consecutive images are taken. The predetermined range may be, for example, a range of about 5 mm to about 20 mm. This makes it possible to avoid erroneously extracting a feature portion whose coordinates change other than the movement due to jaw movement as a movement point. For example, it is possible to avoid erroneously extracting a large movement of the subject's body when photographing the subject.
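
The range-based selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes each feature portion has already been tracked across the frames, that its coordinates have been converted to millimetres, and that "coordinate change" is measured as the largest distance between any two positions of that feature. The function names `displacement_range_mm` and `select_motion_candidates` are hypothetical.

```python
from math import dist


def displacement_range_mm(track):
    """Largest distance (in mm) between any two positions of one feature.

    `track` is a list of (x, y) or (x, y, z) positions, one per frame.
    Assumption: positions are already expressed in millimetres.
    """
    return max(dist(p, q) for p in track for q in track)


def select_motion_candidates(tracks, low_mm=5.0, high_mm=20.0):
    """Keep the features whose coordinate change lies within [low_mm, high_mm].

    Features that barely move (e.g. the nose) and features that move far more
    than a jaw would (e.g. a shift of the whole body) are both rejected.
    """
    return [
        name
        for name, track in tracks.items()
        if low_mm <= displacement_range_mm(track) <= high_mm
    ]


# Hypothetical tracks over three frames.
tracks = {
    "chin": [(0.0, 0.0), (0.0, 10.0), (0.0, 2.0)],   # moves 10 mm: kept
    "nose": [(0.0, 0.0), (0.0, 0.5), (0.0, 0.2)],    # moves 0.5 mm: rejected
    "body": [(0.0, 0.0), (40.0, 0.0), (35.0, 0.0)],  # moves 40 mm: rejected
}
candidates = select_motion_candidates(tracks)
```

The default bounds mirror the example range of about 5 mm to about 20 mm given in the text; in practice they would be chosen for the measurement at hand.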

例えば、図４に示されるように、下顎領域内の下顎の下端の運動点４１００が抽出され得る。 For example, as shown in FIG. 4, the motion point 4100 at the lower end of the mandible in the mandibular region can be extracted.

再び図３を参照して、生成手段１２３は、運動点を追跡することによって、運動点の軌跡を示す運動点軌跡情報を少なくとも生成するように構成されている。運動点軌跡情報は、所定期間内の運動点の軌跡を示す情報である。ここで、所定期間は、例えば、連続した複数の画像が撮影された期間のすべてまたは一部であり得る。生成手段１２３は、例えば、連続した複数の画像の各々における運動点の画像中の座標を特定し、連続した複数の画像間で座標をトレースすることによって、運動点軌跡情報を生成することができる。このとき、運動点の座標は、２次元座標であってもよいし、３次元座標であってもよい。画像が２次元情報（縦×横）を含む画像である場合には、運動点の座標は２次元座標となり、画像が３次元情報（縦×横×奥行）を含む画像である場合には、運動点の座標は３次元座標となる。あるいは、画像が２次元情報（縦×横）を含む画像であっても、２次元画像から３次元情報（縦×横×奥行）を出力するアルゴリズムを利用する場合には、運動点の座標は３次元座標となり得る。 Referring again to FIG. 3, the generation means 123 is configured to generate at least motion point trajectory information indicating the trajectory of the motion point by tracking the motion point. The motion point trajectory information is information indicating the trajectory of the motion point within a predetermined period. Here, the predetermined period may be, for example, all or part of the period in which the plurality of consecutive images were captured. The generation means 123 can generate the motion point trajectory information by, for example, identifying the in-image coordinates of the motion point in each of the plurality of consecutive images and tracing those coordinates across the consecutive images. The coordinates of the motion point may be two-dimensional or three-dimensional. When the images contain two-dimensional information (vertical x horizontal), the coordinates of the motion point are two-dimensional; when the images contain three-dimensional information (vertical x horizontal x depth), the coordinates of the motion point are three-dimensional. Alternatively, even when the images contain only two-dimensional information (vertical x horizontal), the coordinates of the motion point can be three-dimensional if an algorithm that outputs three-dimensional information (vertical x horizontal x depth) from a two-dimensional image is used.
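
As an illustration of tracing coordinates across consecutive images, the sketch below collects the motion point's coordinates frame by frame into a trajectory. It assumes, hypothetically, that each frame has already been reduced to a dictionary mapping feature names to coordinates; the function names and the choice to skip frames in which the point was not detected are assumptions of this sketch, not details given in the description.

```python
from math import dist


def build_trajectory(frames, point_name):
    """Return [(frame_index, coordinates), ...] for one motion point.

    `frames` is a list of dicts, one per consecutive image, mapping a feature
    name to its (x, y) or (x, y, z) coordinates. Frames in which the point
    was not detected are skipped (an assumption of this sketch).
    """
    return [
        (index, landmarks[point_name])
        for index, landmarks in enumerate(frames)
        if point_name in landmarks
    ]


def path_length(trajectory):
    """Total distance travelled along the trajectory, in coordinate units."""
    points = [coords for _, coords in trajectory]
    return sum(dist(a, b) for a, b in zip(points, points[1:]))


# Hypothetical input: the point is missing from the second frame.
frames = [
    {"chin": (1.0, 2.0)},
    {},
    {"chin": (1.0, 6.0)},
    {"chin": (4.0, 2.0)},
]
trajectory = build_trajectory(frames, "chin")
total = path_length(trajectory)
```

Because the coordinates are carried through as tuples, the same code works for two-dimensional and three-dimensional motion points.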

生成された運動点軌跡情報は、インターフェース部１１０を介してコンピュータシステム１００の外部に出力され得る。 The generated motion point trajectory information can be output to the outside of the computer system 100 via the interface unit 110.

上述した抽出手段１２２による顔の複数の特徴部分の抽出では、連続した複数の画像の各々において、顔上の同一の点を抽出することが難しい場合がある。例えば、上述したＯｐｅｎｆａｃｅでは、連続した複数の画像の各々で独立して特徴部分を抽出するため、連続した複数の画像のうちの第１の画像で抽出された特徴部分と、連続した複数の画像のうちの第２の画像で抽出された特徴部分とが同一であるとは限らない。さらに、Ｏｐｅｎｆａｃｅでは、目、眉毛、鼻、口、眉間、顎、耳、喉、輪郭等が特徴部分として抽出されるため、例えば、輪郭上の点を下顎領域に対応する特徴部分とする場合には、連続した複数の画像の撮影中に被験者の身体が動いたり、傾いたりすると、画像内の被写体の顔の輪郭が変わり、抽出される特徴部分も変わる。これにより、抽出される顔の複数の特徴部分は、連続した複数の画像間でずれることとなる。 When the extraction means 122 extracts the feature portions of the face as described above, it may be difficult to extract the same point on the face in each of the plurality of consecutive images. For example, OpenFace extracts feature portions independently in each of the consecutive images, so the feature portion extracted in a first image is not necessarily the same as the feature portion extracted in a second image. Furthermore, OpenFace extracts the eyes, eyebrows, nose, mouth, glabella, chin, ears, throat, facial contour, and the like as feature portions. Therefore, when a point on the contour is used as the feature portion corresponding to the mandibular region, for example, and the subject's body moves or tilts while the consecutive images are being captured, the contour of the subject's face in the image changes and the extracted feature portion changes with it. As a result, the extracted feature portions of the face shift between the consecutive images.

図５は、顔とカメラとの角度をずらした場合の動画における２つのフレームについて、Ｏｐｅｎｆａｃｅを用いて顔の特徴部分を抽出した結果を比較した図である。図５において、白線は、鼻筋の延長線であり、黒点が抽出された特徴部分である。矢印で示される下顎上の特徴部分と白線との間の距離が、２つのフレームで異なっていることから、抽出された特徴部分が、２つのフレームで同一ではないことが確認される。 FIG. 5 compares the results of extracting facial feature portions using OpenFace for two frames of a video in which the angle between the face and the camera was changed. In FIG. 5, the white line is an extension of the bridge of the nose, and the black dots are the extracted feature portions. The distance between the feature portion on the lower jaw indicated by the arrow and the white line differs between the two frames, which confirms that the extracted feature portion is not the same in the two frames.
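
The comparison made in FIG. 5 can be expressed numerically as a point-to-line distance. The sketch below is only an illustration: it assumes the nose-bridge line is given by two points in image coordinates, and the tolerance used to decide whether two frames agree is an arbitrary, hypothetical value. The function names are likewise hypothetical.

```python
from math import hypot


def distance_to_line(point, line_a, line_b):
    """Perpendicular distance from `point` to the line through line_a and line_b."""
    (px, py), (ax, ay), (bx, by) = point, line_a, line_b
    length = hypot(bx - ax, by - ay)
    if length == 0.0:
        raise ValueError("line_a and line_b must be distinct points")
    # Magnitude of the 2D cross product divided by the segment length.
    return abs((bx - ax) * (py - ay) - (by - ay) * (px - ax)) / length


def same_feature(distance_1, distance_2, tolerance=1.0):
    """Treat the feature as consistent if its distance to the line is stable.

    `tolerance` is a hypothetical threshold in the same units as the input.
    """
    return abs(distance_1 - distance_2) <= tolerance


# Hypothetical coordinates: a vertical nose-bridge line and a jaw feature
# that lands at different offsets from it in two frames.
d_frame_1 = distance_to_line((3.0, 5.0), (0.0, 0.0), (0.0, 10.0))
d_frame_2 = distance_to_line((8.0, 5.0), (0.0, 0.0), (0.0, 10.0))
consistent = same_feature(d_frame_1, d_frame_2)
```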

このように、連続した複数の画像間で抽出される特徴部分が異なると、特徴部分に基づいて抽出される運動点も、連続した複数の画像間で異なるようになる。これでは、正確な運動点軌跡情報を生成することができなくなる。従って、連続した複数の画像間で、抽出される運動点が異ならないように、撮影中の被験者の動きや傾きを補正することが好ましい。 As described above, when the feature portions extracted from the plurality of consecutive images are different, the motion points extracted based on the feature portions are also different between the plurality of consecutive images. This makes it impossible to generate accurate motion point trajectory information. Therefore, it is preferable to correct the movement and inclination of the subject during imaging so that the extracted motion points do not differ between a plurality of consecutive images.

例えば、被験者の下顎領域に標点を設置することにより、抽出手段１２２は、下顎領域に設置された標点を認識することによって、運動点を抽出するようにしてもよい。標点は、例えば、シールであってもよいし、インクであってもよい。抽出手段１２２が、標点を運動点として抽出することにより、抽出される運動点は、連続した複数の画像間で同一になり得る。 For example, by setting a gauge point in the mandibular region of the subject, the extraction means 122 may extract the motion point by recognizing the gauge point installed in the mandibular region. The reference point may be, for example, a sticker or an ink. By extracting the reference point as a motion point by the extraction means 122, the extracted motion points can be the same among a plurality of consecutive images.

一実施形態において、標点は、標点上の特定点を表すように構成され得る。標点がある程度の大きさを有すると、抽出手段１２２によって連続した複数の画像のそれぞれから認識される標点が、連続した複数の画像間でわずかにずれることがある。抽出手段１２２が、標点内の任意の点を認識してしまうからである。これでは、抽出される運動点が、連続した複数の画像間でわずかに異なってしまい、これは、運動点軌跡情報の誤差につながり得る。抽出手段１２２が、標点上の特定点を認識するようにすることで、認識される標点のずれを無くすことができ、抽出される運動点を、連続した複数の画像間で同一にすることができる。特定点は、抽出手段１２２によって点として認識される大きさであることが好ましい。 In one embodiment, the gauge point may be configured to indicate a specific point on the gauge point. If the gauge point has a certain size, the position at which the extraction means 122 recognizes the gauge point in each of the consecutive images may shift slightly between images, because the extraction means 122 may recognize an arbitrary point within the gauge point. The extracted motion point would then differ slightly between the consecutive images, which can lead to errors in the motion point trajectory information. By having the extraction means 122 recognize a specific point on the gauge point, this shift in the recognized position can be eliminated, and the extracted motion point can be made the same across the consecutive images. The specific point is preferably of a size that the extraction means 122 recognizes as a point.

例えば、標点は、特定点を表す模様を有することができる。模様は、例えば、ドット模様（例えば、図１１Ａに示されるようなＡＲマーカー（例えば、ＡｒＵｃｏマーカー）、ＱＲコード（登録商標）等）であり得る。ドット模様は、各ドットの角または模様の中心を特定点として表している。模様は、例えば、機械製図における重心記号（例えば、図１１Ｂに示されるような、円形を４等分して塗り分けた記号）であり得る。機械製図における重心記号は、その中心を特定点として表している。例えば、標点は、色分けされる（例えば、特定点の色を標点の他の部分の色の補色とする、または、特定点の色を肌色の補色とする）ことにより、特定点を表すようにしてもよい。例えば、標点は、その大きさを十分に小さくすることにより、特定点を表すようにしてもよい。 For example, the gauge point can have a pattern representing a specific point. The pattern can be, for example, a dot pattern (eg, an AR marker (eg, ArUco marker) as shown in FIG. 11A, a QR code®, etc.). The dot pattern represents the corner of each dot or the center of the pattern as a specific point. The pattern can be, for example, a center of gravity symbol in mechanical drawing (for example, a symbol in which a circle is divided into four equal parts and painted separately, as shown in FIG. 11B). The center of gravity symbol in the mechanical drawing represents the center as a specific point. For example, the reference point is color-coded (for example, the color of the specific point is the complementary color of the color of other parts of the reference point, or the color of the specific point is the complementary color of the skin color) to represent the specific point. You may do so. For example, the gauge point may represent a specific point by making its size sufficiently small.

例えば、標点が有する模様がドット模様である場合、ドット模様上の少なくとも４点を認識することにより、標点の３次元座標を導出することができる。３次元座標の導出は、ＡＲ（拡張現実）の分野等で公知の技術を用いて達成することができる。標点の３次元座標は、運動点の３次元座標に相当することになる。 For example, when the pattern possessed by the reference point is a dot pattern, the three-dimensional coordinates of the reference point can be derived by recognizing at least four points on the dot pattern. Derivation of three-dimensional coordinates can be achieved by using techniques known in the field of AR (augmented reality) and the like. The three-dimensional coordinates of the reference point correspond to the three-dimensional coordinates of the motion point.

なお、抽出手段１２２は、「Ｏｐｅｎｆａｃｅ」以外のアプリケーションを用いて特徴部分を抽出するようにしてもよい。例えば、複数の画像間で一貫した特徴部分を抽出することができるアプリケーションであることが好ましい。これにより、上述した標点を用いる場合と同様の効果、すなわち、抽出される運動点が、連続した複数の画像間で同一になり得るという効果を得られるからである。また、抽出手段１２２は、例えば、顔の３次元情報（縦×横×奥行）を含む画像から特徴部分を抽出することも可能である。抽出手段１２２が、顔の３次元情報（縦×横×奥行）を含む画像から特徴部分を抽出することによっても、抽出される運動点は、連続した複数の画像間で同一になり得る。 The extraction means 122 may extract the feature portion by using an application other than "Open face". For example, an application that can extract a consistent feature portion between a plurality of images is preferable. This is because the same effect as the case of using the above-mentioned reference point, that is, the effect that the extracted motion points can be the same among a plurality of consecutive images can be obtained. Further, the extraction means 122 can, for example, extract a feature portion from an image including three-dimensional information (length x width x depth) of the face. The extraction means 122 also extracts a feature portion from an image including three-dimensional information (length x width x depth) of the face, and the extracted motion points can be the same among a plurality of consecutive images.

このように、抽出手段１２２が、標点を運動点として抽出する、または、複数の画像間で一貫した特徴部分を抽出することができるアプリケーションを用いて特徴部分を抽出する、または、顔の３次元情報を含む画像から特徴部分を抽出することにより、抽出される運動点は、連続した複数の画像間で同一になり得る。しかしながら、これらの場合であっても、撮影中の被験者の動きや傾きが存在すると、運動点の軌跡に被験者の動きや傾きが含まれることになり、正確な運動点軌跡情報を生成することができなくなる。従って、撮影中の被験者の動きや傾きを補正することが好ましい。 In this way, the extracted motion point can be made the same across the consecutive images when the extraction means 122 extracts the gauge point as the motion point, extracts feature portions using an application capable of extracting consistent feature portions across a plurality of images, or extracts feature portions from images containing three-dimensional information of the face. Even in these cases, however, if the subject moves or tilts during imaging, that movement or tilt is included in the trajectory of the motion point, and accurate motion point trajectory information can no longer be generated. It is therefore preferable to correct for the subject's movement and tilt during imaging.

図６Ａは、一実施形態におけるプロセッサ部１３０の構成の一例を示す。プロセッサ部１３０は、連続した複数の画像間で、抽出される運動点が異ならないように補正をするための構成を有し得る。プロセッサ部１３０は、上述したプロセッサ部１２０の代替としてコンピュータシステム１００が備えるプロセッサ部である。図６Ａでは、図３に示される要素と同一の要素に同じ参照番号を付し、ここでは説明を省略する。 FIG. 6A shows an example of the configuration of the processor unit 130 in one embodiment. The processor unit 130 may have a configuration for performing correction so that the extracted motion point does not differ between the consecutive images. The processor unit 130 is a processor unit that the computer system 100 includes as an alternative to the processor unit 120 described above. In FIG. 6A, elements identical to those shown in FIG. 3 are given the same reference numbers, and their description is omitted here.

プロセッサ部１３０は、取得手段１２１と、補正手段１３１と、抽出手段１２２と、生成手段１２３とを備える。 The processor unit 130 includes an acquisition means 121, a correction means 131, an extraction means 122, and a generation means 123.

補正手段１３１は、取得手段１２１によって取得された複数の画像に基づいて、被験者の顔の座標系を少なくとも補正するように構成されている。 The correction means 131 is configured to at least correct the coordinate system of the subject's face based on the plurality of images acquired by the acquisition means 121.

一実施形態において、補正手段１３１は、複数の画像の各々で被験者の顔の座標系が一致するように、複数の画像の各々について、被験者の顔の座標系を補正することができる。補正手段１３１は、例えば、複数の画像の各々をアフィン変換することにより、顔の座標系を補正することができる。複数の画像が３次元情報（縦×横×奥行）を含む画像である場合には、３次元のアフィン変換が行われ得る。これにより、複数の画像の各々から抽出される運動点が、複数の画像間で一致するようになる。 In one embodiment, the correction means 131 can correct the coordinate system of the subject's face for each of the plurality of images so that the coordinate system of the subject's face matches for each of the plurality of images. The correction means 131 can correct the coordinate system of the face, for example, by performing an affine transformation on each of the plurality of images. When a plurality of images are images including three-dimensional information (length x width x depth), three-dimensional affine transformation can be performed. As a result, the motion points extracted from each of the plurality of images are matched between the plurality of images.
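
One way to picture the affine correction is to estimate, for each image, the affine map that carries three reference points on the face onto their positions in a common coordinate system, and then to apply that map to the motion point. The sketch below does this in two dimensions with plain Python. It is a hedged illustration only: the description does not state how the transformation is estimated, so the use of three non-collinear point correspondences, and the function names `fit_affine` and `apply_affine`, are assumptions.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, rhs):
    """Solve the 3x3 linear system m * x = rhs by Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("reference points must not be collinear")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution


def fit_affine(src, dst):
    """Affine map (two rows of three coefficients) sending src[i] to dst[i]."""
    m = [[x, y, 1.0] for x, y in src]
    row_x = solve3(m, [u for u, _ in dst])
    row_y = solve3(m, [v for _, v in dst])
    return row_x, row_y


def apply_affine(affine, point):
    """Apply the fitted affine map to one (x, y) point."""
    (a, b, c), (d, e, f) = affine
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


# Hypothetical example: the face in this frame is shifted by (-2, -3)
# relative to the common coordinate system, so the correction shifts it back.
frame_refs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
common_refs = [(2.0, 3.0), (3.0, 3.0), (2.0, 4.0)]
affine = fit_affine(frame_refs, common_refs)
corrected_motion_point = apply_affine(affine, (5.0, 5.0))
```

With three exact correspondences the map is determined uniquely; with more reference points a least-squares fit would be the usual choice, and for images containing depth information the same idea extends to a three-dimensional affine transformation.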

別の実施形態において、補正手段１３１は、複数の被験者の顔の基準座標系を学習する処理を施された基準座標系学習済モデルを利用して、複数の画像の各々について、被験者の顔の座標系を補正することができる。基準座標系学習済モデルは、入力された画像中の被験者の顔の座標系を基準座標系に補正した画像を出力するように構成されている。基準座標系学習済モデルは、任意の機械学習モデルを用いて構築することができる。 In another embodiment, the correction means 131 utilizes a reference coordinate system trained model that has been processed to learn the reference coordinate system of the faces of a plurality of subjects, for each of the plurality of images, of the subject's face. The coordinate system can be corrected. The reference coordinate system trained model is configured to output an image obtained by correcting the coordinate system of the subject's face in the input image to the reference coordinate system. The frame of reference trained model can be constructed using any machine learning model.

基準座標系学習済モデルは、例えば、教師あり学習によって構築され得る。教師あり学習では、例えば、被験者の顔の画像が入力用教師データとして用いられ、その画像における基準座標系が出力用教師データとして用いられ得る。複数の被験者（例えば、少なくとも５０人分）の複数の画像を繰り返し学習することにより、学習済モデルは、複数の被験者の顔が統計上有すると推定される基準座標系を認識することができるようになる。このような学習済モデルに被験者の顔の画像を入力すると、被験者の顔の座標系と基準座標系との差分が出力されるようになる。例えば、被験者の顔の座標系と基準座標系との差分をゼロにまたは所定の閾値未満にするように、入力された画像を変換処理（例えば、拡縮、回転、剪断、平行移動等）するように学習済モデルを構成することによって、基準座標系学習済モデルが生成され得る。画像の変換処理は、例えば、アフィン変換を利用して行われ得る。 The reference coordinate system trained model can be constructed, for example, by supervised learning. In supervised learning, for example, an image of a subject's face can be used as input training data, and the reference coordinate system in that image can be used as output training data. By repeatedly learning from a plurality of images of a plurality of subjects (for example, at least 50 people), the trained model becomes able to recognize the reference coordinate system that the faces of the plurality of subjects are statistically estimated to have. When an image of a subject's face is input to such a trained model, the difference between the coordinate system of the subject's face and the reference coordinate system is output. The reference coordinate system trained model can then be generated, for example, by configuring the trained model to transform the input image (for example, by scaling, rotation, shearing, or translation) so that the difference between the coordinate system of the subject's face and the reference coordinate system becomes zero or falls below a predetermined threshold. The image transformation can be performed using, for example, an affine transformation.

基準座標系学習済モデルは、例えば、教師なし学習によって構築され得る。教師なし学習では、例えば、複数の被験者の顔の画像が入力用教師データとして用いられ得る。複数の被験者の多数の画像を繰り返し学習することにより、学習済モデルは、多くの画像に共通する顔の座標系を、複数の被験者が統計上有すると推定される基準座標系として認識することができるようになる。このような学習済モデルに被験者の顔の画像を入力すると、被験者の顔の座標系と基準座標系との差分が出力されるようになる。例えば、被験者の顔の座標系と基準座標系との差分をゼロにまたは所定の閾値未満にするように、入力された画像を変換処理（例えば、拡縮、回転、剪断、平行移動等）するように学習済モデルを構成することによって、基準座標系学習済モデルが生成され得る。画像の変換処理は、例えば、アフィン変換を利用して行われ得る。 The reference coordinate system trained model can also be constructed, for example, by unsupervised learning. In unsupervised learning, for example, images of the faces of a plurality of subjects can be used as input training data. By repeatedly learning from a large number of images of a plurality of subjects, the trained model becomes able to recognize the facial coordinate system common to many of the images as the reference coordinate system that the plurality of subjects are statistically estimated to have. When an image of a subject's face is input to such a trained model, the difference between the coordinate system of the subject's face and the reference coordinate system is output. The reference coordinate system trained model can then be generated, for example, by configuring the trained model to transform the input image (for example, by scaling, rotation, shearing, or translation) so that the difference between the coordinate system of the subject's face and the reference coordinate system becomes zero or falls below a predetermined threshold. The image transformation can be performed using, for example, an affine transformation.

入力用教師データに用いられる画像は、例えば、２次元情報（縦×横）を含む画像であってもよいが、３次元情報（縦×横×奥行）を含む画像であることが好ましい。これにより、構築される基準座標系学習済モデルが、３次元情報を含む画像を出力することができるようになるからである。３次元情報（縦×横×奥行）を含む画像は、例えば、ＲＧＢ－Ｄカメラを用いて取得され得る。２次元情報を含む画像を利用する場合には、奥行き情報を推定する処理を行い、奥行き情報を付加したうえで、学習処理に利用することが好ましい。 The image used for the input teacher data may be, for example, an image including two-dimensional information (vertical x horizontal), but is preferably an image including three-dimensional information (vertical x horizontal x depth). This is because the constructed reference coordinate system trained model can output an image including three-dimensional information. An image containing three-dimensional information (length x width x depth) can be acquired using, for example, an RGB-D camera. When using an image containing two-dimensional information, it is preferable to perform a process of estimating the depth information, add the depth information, and then use the image for the learning process.

抽出手段１２２は、補正手段１３１によって補正された座標系において、顔の下顎領域内の運動点を抽出する。生成手段１２３は、抽出された運動点を追跡することによって、運動点の軌跡を示す運動点軌跡情報を生成する。複数の画像の各々から抽出される運動点は、それぞれ同一の座標系において取得されるため、生成される運動点軌跡情報は、より正確な情報となる。 The extraction means 122 extracts the motion points in the lower jaw region of the face in the coordinate system corrected by the correction means 131. The generation means 123 generates motion point locus information indicating the locus of the motion point by tracking the extracted motion points. Since the motion points extracted from each of the plurality of images are acquired in the same coordinate system, the generated motion point trajectory information becomes more accurate information.

図６Ｂは、別の実施形態におけるプロセッサ部１４０の構成の一例を示す。プロセッサ部１４０は、顎運動においては運動しない固定点を基準として用いて、被験者の顔の座標系を補正するための構成を有し得る。プロセッサ部１４０は、上述したプロセッサ部１２０の代替としてコンピュータシステム１００が備えるプロセッサ部である。図６Ｂでは、図３、図６Ａで上述した構成要素と同じ構成要素には同じ参照数字を付し、ここでは説明を省略する。 FIG. 6B shows an example of the configuration of the processor unit 140 in another embodiment. The processor unit 140 may have a configuration for correcting the coordinate system of the subject's face by using a fixed point that does not move in jaw movement as a reference. The processor unit 140 is a processor unit included in the computer system 100 as an alternative to the processor unit 120 described above. In FIG. 6B, the same reference numbers are assigned to the same components as those described above in FIGS. 3 and 6A, and the description thereof will be omitted here.

プロセッサ部１４０は、取得手段１２１と、補正手段１３１と、抽出手段１２２と、生成手段１２３とを備える。抽出手段１２２は、第１の抽出手段１４１と、第２の抽出手段１４２とを備える。 The processor unit 140 includes an acquisition means 121, a correction means 131, an extraction means 122, and a generation means 123. The extraction means 122 includes a first extraction means 141 and a second extraction means 142.

第２の抽出手段１４２は、複数の画像に基づいて、顔の上顔面領域内の固定点を抽出するように構成されている。第２の抽出手段１４２は、例えば、画像中の固定点がどこであるかの入力を受け、その入力に基づいて、固定点を抽出するようにしてもよいし、入力を受けることなく、自動的に固定点を抽出するようにしてもよい。例えば、固定点は、顎運動によっては運動しない額、眉、眉間、鼻頭等の部位上の点または領域であり得る。例えば、固定点は、解剖学的特徴（例えば、眉、眉間、外眼角（目尻）、内眼角、瞳孔、耳珠、鼻頭等）上の点または領域であり得る。固定点は、画像上で少なくとも３ピクセルを有することが好ましく、固定点は、例えば、少なくとも３ピクセルを有する１つの領域であってもよいし、それぞれが少なくとも１ピクセルを有する３つの点であってもよい。少なくとも３ピクセルあれば、後述する補正手段１３１が、これら３つを基準として用いて、複数の画像を顔基準位置テンプレートに対応付けるように補正することができるからである。例えば、固定点は、相互に離間した少なくとも３つの点であり得、この場合、後述する補正手段１３３によって、固定点と事前に定義された顔基準位置テンプレートとに基づいて被験者の顔の座標系を補正する際に生じ得る誤差を小さくすることができる。例えば、固定点は、顔の上顔面領域の少なくとも一部を覆う領域（例えば、額を覆う曲面）であり得、この場合、後述する補正手段１３３によって、固定点と事前に定義された顔基準位置テンプレートとに基づいて被験者の顔の座標系を補正する際に生じ得る誤差を小さくすることができる。 The second extraction means 142 is configured to extract fixed points in the upper facial region of the face based on the plurality of images. The second extraction means 142 may, for example, receive an input indicating where the fixed point is in the image and extract the fixed point based on that input, or may extract the fixed point automatically without receiving any input. For example, the fixed point can be a point or region on a site that does not move with jaw movement, such as the forehead, eyebrows, glabella, or nasal tip. For example, the fixed point can be a point or region on an anatomical feature (for example, the eyebrows, glabella, lateral canthus (outer corner of the eye), medial canthus, pupil, tragus, or nasal tip). The fixed point preferably has at least 3 pixels on the image; the fixed point may be, for example, one region having at least 3 pixels, or three points each having at least 1 pixel. This is because, with at least 3 pixels, the correction means 131 described later can use these three as references to correct the plurality of images so that they are aligned with the face reference position template. 
For example, the fixed points can be at least three points spaced apart from one another; in this case, the error that may arise when the correction means 133 described later corrects the coordinate system of the subject's face based on the fixed points and a predefined face reference position template can be reduced. For example, the fixed point can be a region covering at least a part of the upper facial region of the face (for example, a curved surface covering the forehead); in this case as well, the error that may arise when the correction means 133 described later corrects the coordinate system of the subject's face based on the fixed point and the predefined face reference position template can be reduced.

一実施形態において、第２の抽出手段１４２が自動的に固定点を抽出する場合、第２の抽出手段１４２は、図３を参照して上述した抽出手段１２２と同様に、まず、複数の画像の各々に対して、顔の複数の特徴部分を検出する。第２の抽出手段１４２による顔の複数の特徴部分の抽出は、例えば、複数の被験者の顔画像を用いて特徴部分を学習する処理を施された学習済モデルを用いて行うことができ、例えば、顔認識アプリケーション「Ｏｐｅｎｆａｃｅ」を用いて行うことができる。第２の抽出手段１４２は、抽出された複数の特徴部分のうち、上顔面領域内の点または領域を固定点として抽出する。第２の抽出手段１４２は、例えば、抽出された複数の特徴部分のうち、所定期間内の座標変化が所定の閾値未満の部分を固定点として抽出することができる。ここで、所定期間は、例えば、連続した複数の画像が撮影された期間のすべてまたは一部であり得る。所定の閾値は、例えば、約１ｍｍであり得る。第２の抽出手段１４２は、例えば、画像に対して色強調処理を行い、色調が周囲と異なる部分を特徴部分として抽出するようにしてもよい。これは、眉毛等の周囲と色調が明確に異なる部分を抽出する際に特に好ましい。 In one embodiment, when the second extraction means 142 extracts fixed points automatically, the second extraction means 142 first detects a plurality of feature portions of the face in each of the plurality of images, in the same manner as the extraction means 122 described above with reference to FIG. 3. The extraction of the plurality of feature portions of the face by the second extraction means 142 can be performed, for example, using a trained model that has been trained on the feature portions using facial images of a plurality of subjects, for example, using the face recognition application "Openface". The second extraction means 142 extracts, from among the extracted feature portions, a point or region in the upper facial region as a fixed point. The second extraction means 142 can, for example, extract as a fixed point a portion, among the extracted feature portions, whose coordinate change within a predetermined period is less than a predetermined threshold. Here, the predetermined period may be, for example, all or part of the period during which the plurality of consecutive images were captured. The predetermined threshold can be, for example, about 1 mm. The second extraction means 142 may, for example, perform color enhancement processing on the image and extract a portion whose color tone differs from its surroundings as a feature portion. 
This is particularly preferable when extracting a portion having a color tone that is clearly different from the surroundings such as eyebrows.
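
The threshold-based selection described in this paragraph can be sketched as follows. This is only an illustrative Python sketch: the function name `select_fixed_points`, the dictionary layout of the tracked coordinates, and the choice to measure "coordinate change" as the maximum displacement from the first frame are assumptions, and the coordinates are assumed to have already been converted to millimetres.

```python
import math


def select_fixed_points(tracks, threshold_mm=1.0):
    """Return the names of feature portions that stay nearly still.

    tracks: dict mapping a feature name to a list of (x, y) coordinates in mm,
            one entry per captured image (hypothetical data layout).
    A feature is kept when its maximum displacement from the first frame
    is below threshold_mm.
    """
    fixed = []
    for name, coords in tracks.items():
        x0, y0 = coords[0]
        max_disp = max(math.hypot(x - x0, y - y0) for x, y in coords)
        if max_disp < threshold_mm:
            fixed.append(name)
    return fixed


# Example with made-up coordinates: the glabella barely moves, the chin does.
tracks = {
    "glabella": [(50.0, 20.0), (50.2, 20.1), (49.9, 20.0)],
    "chin": [(50.0, 120.0), (50.5, 128.0), (51.0, 135.0)],
}
fixed_names = select_fixed_points(tracks)
```

A real pipeline would additionally restrict the candidates to the upper facial region, as the paragraph above requires.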

一実施形態において、被験者の上顔面領域に標点を設置することにより、第２の抽出手段１４２は、上顔面領域に設置された標点を認識することによって、固定点を抽出するようにしてもよい。本明細書および特許請求の範囲では、上顔面領域に設置される標点は、下顎領域に設置される標点と区別するために、固定点用標点とも呼ばれる。固定点用標点は、下顎領域に設置される標点と同様の構成を有することができる。固定点用標点は、例えば、シールであってもよいし、インクであってもよい。固定点用標点は、下顎領域に設置される標点とは別個に用いられてもよいが、固定点用標点は、下顎領域に設置される標点と併用されることが好ましい。 In one embodiment, reference marks may be placed on the subject's upper facial region, and the second extraction means 142 may extract the fixed points by recognizing the reference marks placed on the upper facial region. In the present specification and claims, a reference mark placed on the upper facial region is also called a fixed-point reference mark to distinguish it from a reference mark placed on the mandibular region. The fixed-point reference mark can have the same configuration as a reference mark placed on the mandibular region. The fixed-point reference mark may be, for example, a sticker or ink. The fixed-point reference mark may be used separately from the reference mark placed on the mandibular region, but it is preferably used together with the reference mark placed on the mandibular region.

一実施形態において、固定点用標点は、固定点用標点上の特定点を表すように構成され得る。例えば、固定点用標点は、特定点を表す模様を有することができる。模様は、例えば、ドット模様（例えば、図１１Ａに示されるようなＡＲマーカー（例えば、ＡｒＵｃｏマーカー）、ＱＲコード（登録商標）等）であり得る。ドット模様は、各ドットの角または模様の中心を特定点として表している。模様は、例えば、機械製図における重心記号（例えば、図１１Ｂに示されるような、円形を４等分して塗り分けた記号）であり得る。機械製図における重心記号は、その中心を特定点として表している。例えば、固定点用標点は、色分けされる（例えば、特定点の色を標点の他の部分の色の補色とする、または、特定点の色を肌色の補色とする）ことにより、特定点を表すようにしてもよい。例えば、固定点用標点は、その大きさを十分に小さくすることにより、特定点を表すようにしてもよい。 In one embodiment, the fixed-point reference mark may be configured to indicate a specific point on the mark. For example, the fixed-point reference mark can have a pattern that indicates the specific point. The pattern can be, for example, a dot pattern (for example, an AR marker (for example, an ArUco marker) as shown in FIG. 11A, a QR code (registered trademark), or the like). A dot pattern indicates the corner of each dot or the center of the pattern as the specific point. The pattern can also be, for example, the center-of-gravity symbol used in mechanical drawing (for example, a symbol in which a circle is divided into four equal parts that are shaded differently, as shown in FIG. 11B). The center-of-gravity symbol indicates its center as the specific point. For example, the fixed-point reference mark may indicate the specific point through color coding (for example, by making the color of the specific point complementary to the color of the rest of the mark, or complementary to skin color). For example, the fixed-point reference mark may indicate the specific point by being made sufficiently small.

例えば、固定点用標点が有する模様がドット模様である場合、ドット模様上の少なくとも４点を認識することにより、固定点用標点の３次元座標を導出することができる。３次元座標の導出は、ＡＲ（拡張現実）の分野等で公知の技術を用いて達成することができる。固定点用標点の３次元座標は、固定点の３次元座標に相当することになる。 For example, when the pattern on the fixed-point reference mark is a dot pattern, the three-dimensional coordinates of the fixed-point reference mark can be derived by recognizing at least four points on the dot pattern. The three-dimensional coordinates can be derived using techniques known in fields such as AR (augmented reality). The three-dimensional coordinates of the fixed-point reference mark correspond to the three-dimensional coordinates of the fixed point.

補正手段１３１は、固定点と、事前に定義された顔基準位置テンプレートとに基づいて、被験者の顔の座標系を補正するように構成されている。例えば、補正手段１３１は、複数の画像の各々の固定点が、顔基準位置テンプレート上の対応する点に移動するように、複数の画像の各々を変換処理（例えば、拡縮、回転、剪断、平行移動等）する。あるいは、例えば、補正手段１３１は、複数の画像の各々の固定点と顔基準位置テンプレート上の対応する点との間の距離をゼロにまたは所定の閾値未満にするように、複数の画像の各々を変換処理する。あるいは、例えば、補正手段１３１は、複数の画像の各々の固定点によって定義される平面と顔基準位置テンプレート上の対応する平面とが一致するように、複数の画像の各々を変換処理する。変換処理後の複数の画像内の被験者の顔の座標系が、補正後の顔の座標系となる。画像の変換処理は、例えば、アフィン変換によって行われ得る。例えば、固定点が顔の上顔面領域の少なくとも一部を覆う領域（例えば、額を覆う曲面）である場合に、補正手段１３１は、当該領域と、顔基準位置テンプレート上の対応する領域との誤差が最小になるように、複数の画像の各々を変換処理することができる。これは、例えば、最小二乗法を用いて行うことができる。補正手段１３１は、例えば、固定点に加えて、運動点も利用して、被験者の顔の座標系を補正するようにしてもよい。 The correction means 131 is configured to correct the coordinate system of the subject's face based on the fixed points and a predefined face reference position template. For example, the correction means 131 transforms each of the plurality of images (for example, by scaling, rotation, shearing, or translation) so that the fixed points in each image move to the corresponding points on the face reference position template. Alternatively, for example, the correction means 131 transforms each of the plurality of images so that the distance between the fixed points in each image and the corresponding points on the face reference position template becomes zero or falls below a predetermined threshold. Alternatively, for example, the correction means 131 transforms each of the plurality of images so that the plane defined by the fixed points in each image coincides with the corresponding plane on the face reference position template. The coordinate system of the subject's face in the transformed images is the corrected coordinate system of the face. The image transformation can be performed by, for example, an affine transformation. 
For example, when the fixed point is a region covering at least a part of the upper facial region of the face (for example, a curved surface covering the forehead), the correction means 131 can transform each of the plurality of images so that the error between that region and the corresponding region on the face reference position template is minimized. This can be done using, for example, the method of least squares. The correction means 131 may also correct the coordinate system of the subject's face using, for example, the motion points in addition to the fixed points.
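
To make the least-squares alignment concrete, the following Python sketch fits a 2D similarity transform (rotation, uniform scaling, and translation) that maps fixed points in an image onto the corresponding template points. It is a simplified stand-in, not the method of the embodiment: shear is deliberately left out, and the function names `fit_similarity` and `apply_similarity` are hypothetical. Points are treated as complex numbers so that the transform is w = a*z + b and the least-squares solution has a closed form.

```python
def fit_similarity(src, dst):
    """Least-squares fit of w = a*z + b from src points to dst points.

    src, dst: equally long lists of (x, y) tuples with at least two
    distinct src points. Returns complex (a, b): a encodes rotation and
    uniform scale, b encodes translation.
    """
    zs = [complex(x, y) for x, y in src]
    ws = [complex(x, y) for x, y in dst]
    n = len(zs)
    zc = sum(zs) / n  # centroid of the image fixed points
    wc = sum(ws) / n  # centroid of the template points
    num = sum((w - wc) * (z - zc).conjugate() for z, w in zip(zs, ws))
    den = sum(abs(z - zc) ** 2 for z in zs)
    a = num / den
    b = wc - a * zc
    return a, b


def apply_similarity(a, b, point):
    """Map one (x, y) point with the fitted transform."""
    w = a * complex(*point) + b
    return (w.real, w.imag)


# Example: three fixed points and where the template expects them.
image_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
template_points = [(3.0, 4.0), (3.0, 6.0), (1.0, 4.0)]
a, b = fit_similarity(image_points, template_points)
```

Once `a` and `b` are fitted from the fixed points, the same transform can be applied to the motion points of that image so that all images share the template's coordinate system.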

一例において、顔基準位置テンプレートは、顔が正面を向いている状態での顔の位置（すなわち、顔基準位置）を定義するテンプレートである。例えば、顔基準位置テンプレートは、解剖学的に定義され得る。例えば、顔基準位置テンプレートは、眼耳平面（フランクフルト平面）、鼻聴導線（もしくはカンペル平面）、ヒップ平面、咬合平面、または、両瞳孔線が水平となる状態での顔の位置を定義するテンプレートであり得る。例えば、顔基準位置テンプレートは、正中線、または、正中矢状平面が垂直となる状態での顔の位置を定義するテンプレートであり得る。例えば、顔基準位置テンプレートは、眼窩平面が正面を向く状態での顔の位置を定義するテンプレートであり得る。例えば、顔基準位置テンプレートは、眼耳平面（フランクフルト平面）、鼻聴導線（もしくはカンペル平面）、ヒップ平面、咬合平面、または、両瞳孔線が水平となる状態で、かつ／または、正中線、または、正中矢状平面が垂直となる状態で、かつ／または、眼窩平面が正面を向く状態での顔の位置を定義するテンプレートであり得る。顔基準位置テンプレートは、顔が正面を向いている状態での顔の各部位の位置を確認するために用いられることができる。例えば、補正手段１３１は、顔基準位置テンプレートにおける顔の部位の位置を確認し、その位置に複数の画像中の対応する部位（例えば、固定点が位置する部位）を移動させるように、複数の画像の各々を変換処理することができる。例えば、補正手段１３１は、顔基準位置テンプレートにおける種々の平面（例えば、フランクフルト平面、カンペル平面等）の向きを確認し、その向きに複数の画像中の対応する平面（例えば、固定点によって定義される平面）を移動させるように、複数の画像の各々を変換処理することができる。 In one example, the face reference position template is a template that defines the position of the face when the face is facing forward (that is, the face reference position). For example, the face reference position template can be anatomically defined. For example, the face reference position template can be a template that defines the position of the face in a state where the eye-ear plane (Frankfurt plane), the ala-tragus line (or Camper's plane), the HIP plane, the occlusal plane, or the interpupillary line is horizontal. For example, the face reference position template can be a template that defines the position of the face in a state where the midline or the midsagittal plane is vertical. For example, the face reference position template can be a template that defines the position of the face in a state where the orbital plane faces forward. For example, the face reference position template can be a template that defines the position of the face in a state where the eye-ear plane (Frankfurt plane), the ala-tragus line (or Camper's plane), the HIP plane, the occlusal plane, or the interpupillary line is horizontal, and/or the midline or the midsagittal plane is vertical, and/or the orbital plane faces forward. 
The face reference position template can be used to confirm the position of each part of the face when the face is facing forward. For example, the correction means 131 can check the position of a facial part in the face reference position template and transform each of the plurality of images so that the corresponding part in the images (for example, the part where a fixed point is located) moves to that position. For example, the correction means 131 can check the orientation of various planes (for example, the Frankfurt plane or Camper's plane) in the face reference position template and transform each of the plurality of images so that the corresponding plane in the images (for example, the plane defined by the fixed points) is brought into that orientation.

一例において、顔基準位置テンプレートは、連続した複数の画像を撮影する際に利用されるテンプレートとして実装され得る。例えば、顔基準位置テンプレートが、端末装置３００を用いて複数の画像を撮影する際に端末装置３００の画面に表示され、被験者の顔の向きが、顔基準位置テンプレートに整合した場合にのみ、画像が撮影されるようにすることができる。これにより、撮影される複数の画像の各々における被験者の顔の写り方が一貫し、連続した複数の画像間で、抽出される運動点が異なる可能性を低減することができる。このような撮影される画像を制限することも、コンピュータシステム１００による補正の一種である。 In one example, the face reference position template can be implemented as a template used when capturing the plurality of consecutive images. For example, the face reference position template can be displayed on the screen of the terminal device 300 while the terminal device 300 is used to capture the plurality of images, and an image can be captured only when the orientation of the subject's face matches the face reference position template. As a result, the subject's face appears consistently in each of the captured images, which reduces the possibility that the extracted motion points differ among the consecutive images. Restricting the captured images in this way is also a type of correction by the computer system 100.
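
The capture restriction described above can be illustrated with a small gate function. The specification does not state how the match between the face orientation and the template is judged, so everything here is an assumption: the function name `should_capture`, the representation of orientation as (yaw, pitch, roll) angles in degrees, and the 5-degree tolerance are hypothetical choices for this Python sketch.

```python
def should_capture(face_angles_deg, template_angles_deg=(0.0, 0.0, 0.0),
                   tolerance_deg=5.0):
    """Return True when the face orientation matches the template.

    face_angles_deg: (yaw, pitch, roll) of the subject's face, in degrees,
                     as estimated by some upstream face tracker (assumed).
    template_angles_deg: orientation defined by the face reference position
                         template; (0, 0, 0) stands for a front-facing face.
    tolerance_deg: largest per-axis deviation still treated as a match
                   (an assumed value, not taken from the specification).
    """
    return all(
        abs(face - template) <= tolerance_deg
        for face, template in zip(face_angles_deg, template_angles_deg)
    )


# A frame is stored only when the gate passes.
accepted = should_capture((2.0, -3.0, 1.0))
rejected = should_capture((12.0, 0.0, 0.0))
```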

第１の抽出手段１４１は、図３を参照して上述した抽出手段１２２と同様の構成であり得る。第１の抽出手段１４１は、補正手段１３１によって補正された座標系において、顔の下顎領域内の運動点を抽出する。生成手段１２３は、抽出された運動点を追跡することによって、運動点の軌跡を示す運動点軌跡情報を生成する。複数の画像の各々から抽出される運動点は、それぞれ同一の座標系において取得されるため、生成される運動点軌跡情報は、より正確な情報となる。 The first extraction means 141 may have the same configuration as the extraction means 122 described above with reference to FIG. The first extraction means 141 extracts the motion points in the lower jaw region of the face in the coordinate system corrected by the correction means 131. The generation means 123 generates motion point locus information indicating the locus of the motion point by tracking the extracted motion points. Since the motion points extracted from each of the plurality of images are acquired in the same coordinate system, the generated motion point trajectory information becomes more accurate information.

上述したように、図６Ｂに示される例では、固定点を基準として用いて顔の座標系を補正することにより、より正確な運動点軌跡情報を生成することができる。例えば、連続した複数の画像の撮影時に、被験者の身体の動き等により、顎運動とは独立して固定点が移動してしまう場合（例えば、眉を固定点とする際に顎運動をしながら眉を上下に動かしたり、上顔面領域の皮膚が動いたりすることにより、固定点が複数の画像間で一致しない場合等）には、固定点を基準として用いて顔の座標系を補正することに加えて、固定点の移動を相殺するように補正することが好ましい。固定点の移動による運動点の軌跡の誤差が低減され、さらに正確な運動点軌跡情報を生成することができるからである。 As described above, in the example shown in FIG. 6B, more accurate motion point trajectory information can be generated by correcting the coordinate system of the face using the fixed points as a reference. For example, when the fixed points move independently of the jaw movement while the plurality of consecutive images are being captured, owing to movement of the subject's body or the like (for example, when the fixed points do not coincide among the images because the subject moves the eyebrows up and down during jaw movement while the eyebrows serve as fixed points, or because the skin of the upper facial region moves), it is preferable to correct so as to cancel the movement of the fixed points, in addition to correcting the coordinate system of the face using the fixed points as a reference. This reduces the error in the motion point trajectory caused by movement of the fixed points, so that even more accurate motion point trajectory information can be generated.

固定点の移動を相殺するように補正するために、補正手段１３１は、第２の生成手段（図示せず）と、第２の補正手段（図示せず）とをさらに備え得る。 To perform correction that cancels out the movement of the fixed points, the correction means 131 may further include a second generation means (not shown) and a second correction means (not shown).

第２の生成手段は、固定点を追跡することによって、固定点の軌跡を示す固定点軌跡情報を生成するように構成されている。第２の生成手段は、例えば、少なくとも３ピクセルのそれぞれについて固定点軌跡情報を生成するようにしてもよいし、少なくとも３ピクセルのうちの少なくとも１ピクセル（例えば、最も正中線に近いピクセル、最も上側にあるピクセル、最も下側にあるピクセル、ランダムに選択されるピクセルなど）について固定点軌跡情報を生成するようにしてもよい。第２の生成手段は、例えば、少なくとも３ピクセルの重心について固定点軌跡情報を生成するようにしてもよい。固定点軌跡情報は、所定期間内の固定点の軌跡を示す情報である。ここで、所定期間は、例えば、連続した複数の画像が撮影された期間のすべてまたは一部であり得る。第２の生成手段は、例えば、連続した複数の画像の各々における固定点の画像中の座標を追跡することによって、固定点軌跡情報を生成することができる。 The second generation means is configured to generate fixed point trajectory information indicating the trajectory of a fixed point by tracking the fixed point. The second generation means may, for example, generate fixed point trajectory information for each of the at least 3 pixels, or for at least 1 pixel of the at least 3 pixels (for example, the pixel closest to the midline, the uppermost pixel, the lowermost pixel, or a randomly selected pixel). The second generation means may, for example, generate fixed point trajectory information for the centroid of the at least 3 pixels. The fixed point trajectory information is information indicating the trajectory of the fixed point within a predetermined period. Here, the predetermined period may be, for example, all or part of the period during which the plurality of consecutive images were captured. The second generation means can generate the fixed point trajectory information, for example, by tracking the in-image coordinates of the fixed point in each of the plurality of consecutive images.

第２の補正手段は、固定点軌跡情報に基づいて、運動点軌跡情報を補正するように構成されている。例えば、第２補正手段は、運動点の軌跡から固定点の軌跡を差し引くことによって、運動点軌跡情報を補正するようにしてもよい。このとき、例えば、少なくとも３ピクセルのそれぞれの固定点軌跡情報のうち、最も軌跡の移動が小さいピクセルの固定点軌跡情報を用いて、運動点の軌跡から固定点の軌跡を差し引くことによって、運動点軌跡情報を補正することができる。あるいは、例えば、少なくとも３ピクセルのそれぞれの固定点軌跡情報のうち、最も軌跡の移動が大きいピクセルの固定点軌跡情報を用いて、運動点の軌跡から固定点の軌跡を差し引くことによって、運動点軌跡情報を補正することができる。あるいは、例えば、少なくとも３ピクセルの重心の固定点軌跡情報を用いて、運動点の軌跡から固定点の軌跡を差し引くことによって、運動点軌跡情報を補正することができる。 The second correction means is configured to correct the motion point trajectory information based on the fixed point trajectory information. For example, the second correction means may correct the motion point trajectory information by subtracting the trajectory of the fixed point from the trajectory of the motion point. In this case, for example, the motion point trajectory information can be corrected by subtracting the fixed point trajectory from the motion point trajectory using the fixed point trajectory information of the pixel whose trajectory moves the least among the fixed point trajectory information of the at least 3 pixels. Alternatively, for example, the motion point trajectory information can be corrected in the same way using the fixed point trajectory information of the pixel whose trajectory moves the most among the fixed point trajectory information of the at least 3 pixels. Alternatively, for example, the motion point trajectory information can be corrected by subtracting the fixed point trajectory from the motion point trajectory using the fixed point trajectory information of the centroid of the at least 3 pixels.

例えば、第２の補正手段は、運動点の座標から固定点の座標を差し引いて得られる補正後の運動点を追跡することによって、運動点軌跡情報を補正するようにしてもよい。このとき、例えば、少なくとも３ピクセルのそれぞれの固定点軌跡情報のうち、最も軌跡の移動が小さいピクセルの座標を運動点の座標から差し引いて、補正後の運動点を得るようにしてもよい。あるいは、例えば、少なくとも３ピクセルのそれぞれの固定点軌跡情報のうち、最も軌跡の移動が大きいピクセルの座標を運動点の座標から差し引いて、補正後の運動点を得るようにしてもよい。あるいは、例えば、少なくとも３ピクセルの重心の座標を運動点の座標から差し引いて、補正後の運動点を得るようにしてもよい。 For example, the second correction means may correct the motion point locus information by tracking the corrected motion point obtained by subtracting the coordinates of the fixed point from the coordinates of the motion point. At this time, for example, among the fixed point trajectory information of at least 3 pixels, the coordinates of the pixel having the smallest locus movement may be subtracted from the coordinates of the motion point to obtain the corrected motion point. Alternatively, for example, the coordinates of the pixel having the largest movement of the locus among the fixed point locus information of at least 3 pixels may be subtracted from the coordinates of the moving point to obtain the corrected moving point. Alternatively, for example, the coordinates of the center of gravity of at least 3 pixels may be subtracted from the coordinates of the motion point to obtain the corrected motion point.
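
The centroid-based variant of this correction can be sketched in a few lines of Python. The sketch is illustrative only: the function names `centroid` and `correct_trajectory` and the list-based data layout are hypothetical, and it assumes that "subtracting the fixed point" means removing the displacement of the fixed point centroid relative to the first frame, so that the corrected trajectory stays in the coordinates of the first image.

```python
def centroid(points):
    """Centroid of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def correct_trajectory(motion_track, fixed_tracks):
    """Remove fixed point drift from a motion point trajectory.

    motion_track: list of (x, y) motion point coordinates, one per frame.
    fixed_tracks: list with one entry per frame, each a list of the
                  (x, y) coordinates of the fixed points in that frame.
    Returns the motion trajectory after subtracting, frame by frame, the
    displacement of the fixed point centroid relative to the first frame.
    """
    c0x, c0y = centroid(fixed_tracks[0])
    corrected = []
    for (mx, my), fixed in zip(motion_track, fixed_tracks):
        cx, cy = centroid(fixed)
        corrected.append((mx - (cx - c0x), my - (cy - c0y)))
    return corrected


# Example: the head drifts right and down while the jaw opens.
motion = [(10.0, 20.0), (11.0, 25.0), (12.0, 30.0)]
fixed = [
    [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)],
    [(1.0, 0.0), (3.0, 0.0), (2.0, 3.0)],
    [(2.0, 1.0), (4.0, 1.0), (3.0, 4.0)],
]
corrected = correct_trajectory(motion, fixed)
```

Replacing the centroid with the least-moving or most-moving fixed pixel, as the text also allows, would only change which fixed point coordinates are passed in for each frame.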

このようにして補正された運動点軌跡情報には、固定点の移動による誤差が含まれておらず、より正確な情報となり得る。 The motion point trajectory information corrected in this way does not include an error due to the movement of the fixed point, and can be more accurate information.

上述した例では、顔の座標系を補正し、補正された座標系において運動点を抽出し、その運動点を追跡することによって、運動点軌跡情報を生成することを説明したが、本発明では、補正のタイミングはこれに限定されない。例えば、運動点を抽出し、抽出された運動点を補正し、補正された運動点を追跡することによって運動点軌跡情報を生成するようにしてもよい。例えば、運動点を抽出し、抽出された運動点を追跡することによって運動点軌跡情報を生成し、生成された運動点軌跡情報を補正するようにしてもよい。これは、上述した座標系の補正と同様の手法で、運動点の座標系または運動点軌跡情報を補正することによって達成され得る。 In the example described above, motion point trajectory information is generated by correcting the coordinate system of the face, extracting the motion points in the corrected coordinate system, and tracking those motion points; in the present invention, however, the timing of the correction is not limited to this. For example, the motion point trajectory information may be generated by extracting the motion points, correcting the extracted motion points, and tracking the corrected motion points. For example, the motion point trajectory information may be generated by extracting the motion points and tracking the extracted motion points, and the generated motion point trajectory information may then be corrected. This can be achieved by correcting the coordinate system of the motion points or the motion point trajectory information using the same method as the coordinate system correction described above.

すなわち、生成手段１２３によって生成される運動点軌跡情報は、下顎の運動点以外の運動（例えば、撮影中の被験者の身体の動き（例えば、上顎の運動）や傾きによるノイズ）を示す情報を含む場合と、下顎の運動点以外の運動を示す情報を含まない場合とがある。前者の場合は、運動点軌跡情報を生成した後に運動点軌跡情報を補正する場合である。後者の場合は、運動点軌跡情報を生成する前に、座標系を補正して運動点を抽出する場合であるか、または、運動点追跡情報を生成する前に、運動点を抽出し、抽出された運動点を補正する場合である。 That is, the motion point trajectory information generated by the generation means 123 includes information indicating motions other than the motion points of the lower jaw (for example, movement of the subject's body during imaging (for example, movement of the upper jaw) and noise due to tilt). In some cases, it may not contain information indicating movement other than the movement point of the lower jaw. In the former case, the motion point trajectory information is corrected after the motion point trajectory information is generated. In the latter case, the coordinate system is corrected to extract the motion points before the motion point trajectory information is generated, or the motion points are extracted and extracted before the motion point tracking information is generated. This is the case of correcting the motion points.

図６Ｃは、別の実施形態におけるプロセッサ部１５０の構成の一例を示す。プロセッサ部１５０は、被験者の顎運動顔モデルを用いて、被験者の顔の座標系を補正するための構成を有し得る。プロセッサ部１５０は、上述したプロセッサ部１２０の代替としてコンピュータシステム１００が備えるプロセッサ部である。図６Ｃでは、図３、図６Ａで上述した構成要素と同じ構成要素には同じ参照数字を付し、ここでは説明を省略する。 FIG. 6C shows an example of the configuration of the processor unit 150 in another embodiment. The processor unit 150 may have a configuration for correcting the coordinate system of the subject's face by using the subject's jaw movement face model. The processor unit 150 is a processor unit included in the computer system 100 as an alternative to the processor unit 120 described above. In FIG. 6C, the same reference numbers are assigned to the same components as those described above in FIGS. 3 and 6A, and the description thereof will be omitted here.

プロセッサ部１５０は、取得手段１２１と、補正手段１３１と、抽出手段１２２と、生成手段１２３とを備える。補正手段１３１は、ベース顔モデル生成手段１５１と、顎運動顔モデル生成手段１５２とを備える。 The processor unit 150 includes an acquisition means 121, a correction means 131, an extraction means 122, and a generation means 123. The correction means 131 includes a base face model generation means 151 and a jaw movement face model generation means 152.

ベース顔モデル生成手段１５１は、被験者の顔のベース顔モデルを生成するように構成されている。ベース顔モデル作成手段１５１は、例えば、予め取得された被験者の顔の画像（例えば、データベース部２００に格納されている被験者の顔の画像）から、ベース顔モデルを生成するようにしてもよいし、インターフェース部１１０を介して端末装置３００から受信された被験者の顔の画像から、ベース顔モデルを生成するようにしてもよい。ベース顔モデルを生成するために利用される顔の画像は、正面を向いて静止した無表情の画像であることが好ましい。生成されるベース顔モデルが、正面を向いた無表情のものとなり、種々の動きに対応し易くなるからである。ベース顔モデル生成手段１５１は、公知の任意の手法を用いて、ベース顔モデルを生成することができる。利用される画像は、２次元情報（縦×横）を含む画像であってもよいしが、３次元情報（縦×横×奥行）を含む画像であることが好ましい。３次元情報を含む画像により、より容易にかつ高精度のベース顔モデルを生成することができるからである。 The base face model generation means 151 is configured to generate a base face model of the subject's face. The base face model creating means 151 may generate a base face model from, for example, an image of the subject's face acquired in advance (for example, an image of the subject's face stored in the database unit 200). The base face model may be generated from the image of the subject's face received from the terminal device 300 via the interface unit 110. The face image used to generate the base face model is preferably a front-facing, stationary, expressionless image. This is because the generated base face model becomes an expressionless one facing the front, and it becomes easy to correspond to various movements. The base face model generation means 151 can generate a base face model by using any known method. The image used may be an image containing two-dimensional information (vertical x horizontal), but is preferably an image containing three-dimensional information (vertical x horizontal x depth). This is because an image containing three-dimensional information can generate a base face model more easily and with high accuracy.

顎運動顔モデル生成手段１５２は、複数の画像中の被験者の顔をベース顔モデルに反映させることにより、被験者の顎運動顔モデルを生成するように構成されている。顎運動顔モデル生成手段１５２は、インターフェース部１１０を介して端末装置３００から受信された連続した複数の画像内の被験者の顔をベース顔モデルに反映することにより、被験者の顎運動顔モデルを生成する。顎運動顔モデル生成手段１５２は、公知の任意の手法を用いて、顎運動顔モデルを生成することができる。例えば、連続した複数の画像内の被験者の顔の各部位の座標を導出し、各部位の座標をベース顔モデル上にマッピングすることにより顎運動顔モデルが作成され得る。生成された顎運動顔モデルは、複数の画像に写る被験者の動きに合わせて動く３Ｄアバターとなる。 The jaw movement face model generation means 152 is configured to generate a jaw movement face model of the subject by reflecting the subject's face in the plurality of images in the base face model. The jaw movement face model generation means 152 generates the jaw movement face model of the subject by reflecting, in the base face model, the subject's face in the plurality of consecutive images received from the terminal device 300 via the interface unit 110. The jaw movement face model generation means 152 can generate the jaw movement face model using any known method. For example, a jaw movement face model can be created by deriving the coordinates of each part of the subject's face in the plurality of consecutive images and mapping the coordinates of each part onto the base face model. The generated jaw movement face model is a 3D avatar that moves in accordance with the movement of the subject shown in the plurality of images.

一例において、ベース顔モデル生成手段１５１および顎運動顔モデル生成手段１５２は、ｉＰｈｏｎｅ（登録商標）Ｘにおいて実装されている「アニ文字」を構築するための処理と同様の処理によって、顎運動顔モデルを生成することができる。 In one example, the base face model generation means 151 and the jaw movement face model generation means 152 can generate the jaw movement face model by processing similar to that used to build the "Animoji" feature implemented on the iPhone (registered trademark) X.

補正手段１３１は、ベース顔モデルと顎運動顔モデルとの座標系の違いを補正する。補正手段１３１は、ベース顔モデルの座標系に基づいて、顎運動顔モデルの座標系を補正することによって、顔の座標系を補正するように構成されている。補正手段１３１は、例えば、任意の座標変換処理を行うことにより、顎運動顔モデルの座標系をベース顔モデルの座標系に変換することによって、顔の座標系を補正する。 The correction means 131 corrects the difference in the coordinate system between the base face model and the jaw movement face model. The correction means 131 is configured to correct the coordinate system of the face by correcting the coordinate system of the jaw movement face model based on the coordinate system of the base face model. The correction means 131 corrects the coordinate system of the face by converting the coordinate system of the jaw movement face model to the coordinate system of the base face model, for example, by performing an arbitrary coordinate conversion process.

補正手段１３１は、例えば、顎運動顔モデル生成手段１５２が顎運動顔モデルを生成する前に、連続した複数の画像内の被験者の顔の各部位の座標に対して座標変換処理を行うようにしてもよい。これにより、生成される顎運動顔モデルの座標系は、ベース顔モデルの座標系と一致する。 The correction means 131 may, for example, perform the coordinate transformation on the coordinates of each part of the subject's face in the plurality of consecutive images before the jaw movement face model generation means 152 generates the jaw movement face model. As a result, the coordinate system of the generated jaw movement face model matches the coordinate system of the base face model.

抽出手段１２２は、補正手段１３１によって補正された座標系において、顔の下顎領域内の運動点を抽出する。抽出手段１２２は、例えば、ベース顔モデルにおいて運動点を抽出するようにしてもよいし、座標変換前の顎運動顔モデルにおいて運動点を抽出するようにしてもよいし、座標変換後の顎運動顔モデルにおいて運動点を抽出するようにしてもよい。ベース顔モデルにおいて運動点を抽出する場合には、運動点は、顎運動顔モデルの生成過程において、顎運動顔モデルの対応する点に反映される。 The extraction means 122 extracts the motion points in the lower jaw region of the face in the coordinate system corrected by the correction means 131. The extraction means 122 may, for example, extract the movement points in the base face model, may extract the movement points in the jaw movement face model before the coordinate conversion, or may extract the movement points in the jaw movement after the coordinate conversion. The movement points may be extracted in the face model. When extracting motion points in the base face model, the motion points are reflected in the corresponding points of the jaw motion face model in the process of generating the jaw motion face model.

生成手段１２３は、抽出された運動点を追跡することによって、運動点の軌跡を示す運動点軌跡情報を生成する。複数の画像の各々から抽出される運動点は、それぞれ同一の座標系において取得されるため、生成される運動点軌跡情報は、より正確な情報となる。 The generation means 123 generates motion point locus information indicating the locus of the motion point by tracking the extracted motion points. Since the motion points extracted from each of the plurality of images are acquired in the same coordinate system, the generated motion point trajectory information becomes more accurate information.

上述した例では、補正された座標系（すなわち、ベース顔モデルの座標系）における運動点を追跡することによって、運動点軌跡情報を生成することを説明したが、本発明では、補正のタイミングはこれに限定されない。例えば、顎運動顔モデルの座標系において運動点を抽出し、抽出された運動点を補正し、補正された運動点を追跡することによって運動点軌跡情報を生成するようにしてもよい。例えば、顎運動顔モデルの座標系において運動点を抽出し、運動点を追跡することによって生成された運動点軌跡情報を補正することにより、補正された運動点軌跡情報を生成するようにしてもよい。例えば、複数の画像において運動点を抽出し、抽出された運動点を顎運動顔モデルに反映し、反映された運動点または反映された運動点を追跡することによって生成された運動点軌跡情報を補正することによって運動点軌跡情報を生成するようにしてもよい。これは、上述した座標変換処理と同様の手法で、運動点の座標系または運動点の軌跡を補正することによって達成され得る。 In the above example, the motion point trajectory information is generated by tracking the motion points in the corrected coordinate system (that is, the coordinate system of the base face model), but in the present invention the timing of the correction is not limited to this. For example, the motion point trajectory information may be generated by extracting the motion points in the coordinate system of the jaw movement face model, correcting the extracted motion points, and tracking the corrected motion points. For example, corrected motion point trajectory information may be generated by extracting the motion points in the coordinate system of the jaw movement face model and correcting the motion point trajectory information generated by tracking those motion points. For example, the motion point trajectory information may be generated by extracting the motion points in the plurality of images, reflecting the extracted motion points in the jaw movement face model, and correcting either the reflected motion points or the motion point trajectory information generated by tracking the reflected motion points. This can be achieved by correcting the coordinate system of the motion points or the trajectory of the motion points by the same method as the coordinate conversion process described above.

図６Ｄは、別の実施形態におけるプロセッサ部１６０の構成の一例を示す。プロセッサ部１６０は、生成された運動点軌跡情報に基づいて被験者の運動を評価するための構成を有し得る。プロセッサ部１６０は、上述したプロセッサ部１２０の代替としてコンピュータシステム１００が備えるプロセッサ部である。図６Ｄでは、図３で上述した構成要素と同じ構成要素には同じ参照数字を付し、ここでは説明を省略する。 FIG. 6D shows an example of the configuration of the processor unit 160 in another embodiment. The processor unit 160 may have a configuration for evaluating the movement of the subject based on the generated motion point trajectory information. The processor unit 160 is a processor unit that the computer system 100 includes as an alternative to the processor unit 120 described above. In FIG. 6D, components that are the same as those described above with reference to FIG. 3 are given the same reference numerals, and their description is omitted here.

プロセッサ部１６０は、取得手段１２１と、抽出手段１２２と、生成手段１２３と、評価手段１６１とを備える。 The processor unit 160 includes an acquisition unit 121, an extraction unit 122, a generation unit 123, and an evaluation unit 161.

評価手段１６１は、運動点軌跡情報に少なくとも基づいて、被験者の顎運動の評価を示す顎運動評価情報を生成するように構成されている。評価手段１６１は、複数の被験者の運動点軌跡情報を学習する処理を施された運動点軌跡学習済モデルを利用して、顎運動評価情報を生成することができる。運動点軌跡学習済モデルは、入力された運動点軌跡情報を顎運動の評価と相関させるように構成されている。生成手段１２３によって生成された運動点軌跡情報には、撮影中の被験者の身体の動きや傾きによるノイズが含まれ得るが、運動点軌跡学習済モデルは、このようなノイズも含めた運動点軌跡情報を学習する処理を施されているため、運動点軌跡情報に含まれ得る撮影中の被験者の身体の動きや傾きによるノイズに関わらず、精度よく、顎運動評価情報を生成することができる。学習に用いられる運動点軌跡情報は、２次元の情報（或る平面における運動点の軌跡を示す情報）であってもよいし、３次元の情報（或る空間における運動点の軌跡を示す情報）であってもよいし、４次元の情報（或る空間における運動点の軌跡および運動点の速度を示す情報）であってもよい。ここで、速度は、スカラー量であってもよいが、好ましくは、ベクトル量である。より高次元の情報を利用することにより、構築される運動点軌跡学習済モデルの精度が向上する。運動点軌跡学習済モデルは、例えば、数千、数万、数十万、または数百万の運動点軌跡情報を学習することによって構築され得る。より多くの情報を学習するほど精度は向上するが、過学習に留意する必要がある。 The evaluation means 161 is configured to generate jaw movement evaluation information indicating an evaluation of the subject's jaw movement, based at least on the motion point trajectory information. The evaluation means 161 can generate the jaw movement evaluation information by using a motion point trajectory trained model that has been trained on the motion point trajectory information of a plurality of subjects. The motion point trajectory trained model is configured to correlate input motion point trajectory information with an evaluation of jaw movement. The motion point trajectory information generated by the generation means 123 may include noise due to movement or tilt of the subject's body during imaging. Because the motion point trajectory trained model has been trained on motion point trajectory information that includes such noise, it can generate the jaw movement evaluation information accurately regardless of that noise. The motion point trajectory information used for learning may be two-dimensional information (information indicating the trajectory of a motion point in a plane), three-dimensional information (information indicating the trajectory of a motion point in a space), or four-dimensional information (information indicating the trajectory of a motion point in a space and the velocity of the motion point). Here, the velocity may be a scalar quantity, but is preferably a vector quantity. Using higher-dimensional information improves the accuracy of the constructed motion point trajectory trained model. The motion point trajectory trained model can be constructed, for example, by learning thousands, tens of thousands, hundreds of thousands, or millions of pieces of motion point trajectory information. Learning from more information improves accuracy, but care must be taken to avoid overfitting.
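As an illustration of the "four-dimensional" option (a spatial trajectory plus velocity), the following minimal sketch appends a finite-difference velocity estimate to each sample of a trajectory. The function name `append_velocity`, the per-frame tuple layout, and the constant frame interval `dt` are assumptions made for this example; the description does not specify how the velocity is computed.

```python
import math


def append_velocity(trajectory, dt, as_vector=True):
    """Append a velocity estimate to each sample of a motion point trajectory.

    trajectory: list of (x, y, z) positions, one per frame (hypothetical layout).
    dt: time between consecutive frames in seconds (assumed constant).
    Returns a list of tuples: the position followed by either the velocity
    vector (vx, vy, vz) or its magnitude (a scalar speed).
    """
    if len(trajectory) < 2:
        raise ValueError("need at least two frames to estimate velocity")
    out = []
    for i, pos in enumerate(trajectory):
        # Forward difference; the last frame reuses the preceding interval.
        j = i if i + 1 < len(trajectory) else i - 1
        v = tuple((b - a) / dt for a, b in zip(trajectory[j], trajectory[j + 1]))
        if as_vector:
            out.append(tuple(pos) + v)
        else:
            out.append(tuple(pos) + (math.sqrt(sum(c * c for c in v)),))
    return out
```

With `as_vector=True` each sample keeps the direction of movement, which corresponds to the vector quantity that the description prefers over a scalar speed.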

運動点軌跡学習済モデルは、任意の機械学習モデルを用いて構築することができる。運動点軌跡学習済モデルは、例えば、ニューラルネットワークモデルであり得る。 The motion point trajectory trained model can be constructed using any machine learning model. The motion point trajectory trained model can be, for example, a neural network model.

図７は、評価手段１６１が利用し得るニューラルネットワークモデル１６１０の構造の一例を示す。 FIG. 7 shows an example of the structure of the neural network model 1610 that can be used by the evaluation means 161.

ニューラルネットワークモデル１６１０は、入力層と、少なくとも１つの隠れ層と、出力層とを有する。ニューラルネットワークモデル１６１０の入力層のノード数は、入力されるデータの次元数に対応する。ニューラルネットワークモデル１６１０の隠れ層は、任意の数のノードを含むことができる。ニューラルネットワークモデル１６１０の出力層のノード数は、出力されるデータの次元数に対応する。例えば、顎運動に異常が有るかないかを評価する場合、出力層のノード数は、１であり得る。例えば、顎運動の軌跡が７つのパターンのうちのいずれであるかを評価する場合、出力層のノード数は、７であり得る。 The neural network model 1610 has an input layer, at least one hidden layer, and an output layer. The number of nodes in the input layer of the neural network model 1610 corresponds to the number of dimensions of the input data. The hidden layer of the neural network model 1610 can contain any number of nodes. The number of nodes in the output layer of the neural network model 1610 corresponds to the number of dimensions of the output data. For example, when evaluating whether or not there is an abnormality in jaw movement, the number of nodes in the output layer may be 1. For example, when evaluating which of the seven patterns the jaw movement trajectory is, the number of nodes in the output layer can be seven.
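The relationship between layer sizes and data dimensions described above can be sketched as a plain fully connected forward pass. This is only an illustrative sketch: the names `make_layers` and `forward`, the tanh hidden activation, and the sigmoid/softmax output are assumptions, and the zero weights are untrained placeholders rather than anything taken from the neural network model 1610.

```python
import math


def make_layers(sizes):
    """Build zero-initialised (weights, biases) pairs for the given layer sizes.

    sizes: e.g. [4, 3, 7] means 4 input nodes, 3 hidden nodes, 7 output nodes.
    """
    return [([[0.0] * n_in for _ in range(n_out)], [0.0] * n_out)
            for n_in, n_out in zip(sizes, sizes[1:])]


def forward(x, layers):
    """Forward pass of a fully connected network.

    x: list of input features; its length equals the input-layer node count.
    layers: list of (weights, biases); weights[j][i] connects input i to node j.
    Hidden layers use tanh; the output layer uses a sigmoid when it has one
    node and a softmax when it has several (assumed activation choices).
    """
    a = list(x)
    for k, (w, b) in enumerate(layers):
        z = [sum(wi * ai for wi, ai in zip(row, a)) + bj for row, bj in zip(w, b)]
        if k < len(layers) - 1:
            a = [math.tanh(v) for v in z]
        elif len(z) == 1:
            a = [1.0 / (1.0 + math.exp(-z[0]))]
        else:
            m = max(z)  # subtract the maximum for numerical stability
            e = [math.exp(v - m) for v in z]
            s = sum(e)
            a = [v / s for v in e]
    return a
```

A network built with `make_layers([n, h, 1])` would suit the abnormal/not-abnormal case, and `make_layers([n, h, 7])` the seven-pattern case, where `n` is the dimension of the input trajectory data and `h` is a freely chosen hidden size.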

ニューラルネットワークモデル１６１０は、取得手段１２１が取得した情報を使用して予め学習処理がなされ得る。学習処理は、取得手段１２１が取得したデータを使用して、ニューラルネットワークモデル１６１０の隠れ層の各ノードの重み係数を計算する処理である。 The neural network model 1610 can be preliminarily trained using the information acquired by the acquisition means 121. The learning process is a process of calculating the weighting coefficient of each node of the hidden layer of the neural network model 1610 by using the data acquired by the acquisition means 121.

学習処理は、例えば、教師あり学習である。教師あり学習では、例えば、運動点軌跡情報を入力用教師データとし、対応する顎運動の評価を出力用教師データとして、複数の被験者の情報を使用してニューラルネットワークモデル１６１０の隠れ層の各ノードの重み係数を計算することにより、運動点軌跡情報を顎運動の評価と相関させることが可能な学習済モデルを構築することができる。 The learning process is, for example, supervised learning. In supervised learning, for example, the motion point trajectory information is used as input teacher data, the evaluation of the corresponding jaw movement is used as output teacher data, and the information of a plurality of subjects is used for each node of the hidden layer of the neural network model 1610. By calculating the weighting coefficient of, it is possible to construct a trained model that can correlate the motion point trajectory information with the evaluation of jaw motion.

例えば、教師あり学習のための（入力用教師データ，出力用教師データ）の組は、（第１の被験者の運動点軌跡情報，第１の被験者の顎運動の評価）、（第２の被験者の運動点軌跡情報，第２の被験者の顎運動の評価）、・・・（第ｉの被験者の運動点軌跡情報，第ｉの被験者の顎運動の評価）、・・・等であり得る。このような学習済のニューラルネットワークモデルの入力層に被験者から新たに取得された運動点軌跡情報を入力すると、その被験者の顎運動の評価が出力層に出力される。 For example, the pairs of (input teacher data, output teacher data) for supervised learning can be (motion point trajectory information of the first subject, evaluation of the jaw movement of the first subject), (motion point trajectory information of the second subject, evaluation of the jaw movement of the second subject), ..., (motion point trajectory information of the i-th subject, evaluation of the jaw movement of the i-th subject), and so on. When motion point trajectory information newly acquired from a subject is input to the input layer of such a trained neural network model, an evaluation of that subject's jaw movement is output to the output layer.

学習処理は、例えば、教師なし学習である。教師なし学習では、例えば、複数の被験者について、運動点軌跡情報を入力用教師データとしたときの複数の出力をクラスタリングすることによって、出力を複数のクラスタに区分する。複数のクラスタの各クラスタについて、属する被験者の顎運動の評価に基づいて、各クラスタを特徴付ける。これにより、運動点軌跡情報を顎運動の評価と相関させることが可能な学習済モデルが構築される。クラスタリングは、例えば、任意の公知の手法を用いて行われ得る。このような学習済のニューラルネットワークモデルの入力層に被験者から新たに取得された運動点軌跡情報を入力すると、その被験者の顎運動の評価が出力層に出力される。 The learning process is, for example, unsupervised learning. In unsupervised learning, for example, for a plurality of subjects, the outputs are divided into a plurality of clusters by clustering a plurality of outputs when the motion point trajectory information is used as input teacher data. For each cluster of multiple clusters, characterize each cluster based on the assessment of jaw movements of the subject to which it belongs. As a result, a trained model capable of correlating the motion point trajectory information with the evaluation of jaw motion is constructed. Clustering can be performed using, for example, any known technique. When the motion point trajectory information newly acquired from the subject is input to the input layer of the trained neural network model, the evaluation of the jaw motion of the subject is output to the output layer.
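The step of characterizing each cluster from the evaluations of the subjects that belong to it can be sketched as follows. The clustering itself is assumed to have been done already by any known method. The function name `characterize_clusters` and the majority-vote rule are assumptions for illustration; the description only states that each cluster is characterized based on the evaluations of its members.

```python
from collections import Counter, defaultdict


def characterize_clusters(cluster_ids, evaluations):
    """Label each cluster with the most common evaluation among its members.

    cluster_ids: cluster index assigned to each subject's model output.
    evaluations: known jaw movement evaluation of each subject, same order.
    Returns a dict mapping cluster index to its representative evaluation.
    """
    if len(cluster_ids) != len(evaluations):
        raise ValueError("cluster_ids and evaluations must have equal length")
    by_cluster = defaultdict(Counter)
    for cid, ev in zip(cluster_ids, evaluations):
        by_cluster[cid][ev] += 1
    # Majority vote per cluster (an assumed rule, not taken from the source).
    return {cid: counts.most_common(1)[0][0] for cid, counts in by_cluster.items()}
```

A newly measured subject whose output falls into a given cluster would then receive that cluster's representative evaluation.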

図６Ａ～図６Ｃを参照して上述した例で生成された運動点軌跡情報は、例えば、被験者の顎運動を評価するために利用され得る。この場合、プロセッサ部１２０、１３０、１４０、または１５０は、生成された運動点軌跡情報に少なくとも基づいて、被験者の顎運動の評価を示す顎運動評価情報を生成する評価手段（図示せず）をさらに備え得る。評価手段は、プロセッサ部１６０が備える評価手段１６１と同様の構成であってもよいし、異なる構成であってもよい。評価手段は、例えば、複数の被験者の運動点軌跡情報を学習する処理を施された運動点軌跡学習済モデルを利用して、顎運動評価情報を生成することができる。 The motion point trajectory information generated in the examples described above with reference to FIGS. 6A to 6C can be used, for example, to evaluate the jaw movement of the subject. In this case, the processor unit 120, 130, 140, or 150 may further include an evaluation means (not shown) that generates jaw movement evaluation information indicating an evaluation of the subject's jaw movement, based at least on the generated motion point trajectory information. The evaluation means may have the same configuration as the evaluation means 161 included in the processor unit 160, or a different configuration. The evaluation means can, for example, generate the jaw movement evaluation information by using a motion point trajectory trained model that has been trained on the motion point trajectory information of a plurality of subjects.

図６Ａ～図６Ｃを参照して上述した例では、連続した複数の画像間で、抽出される運動点が異ならないように、撮影中の被験者の動きや傾きに対処するために、補正手段１３１を備える構成を説明した。図６Ｄに示される例は、撮影中の被験者の身体の動きや傾きを含めた顎運動の軌跡を学習した学習済モデルを用いることにより、撮影中の被験者の動きや傾きに対処している。 The examples described above with reference to FIGS. 6A to 6C use a configuration including the correction means 131 to deal with movement and tilt of the subject during imaging, so that the extracted motion points do not differ between the consecutive images. The example shown in FIG. 6D instead deals with movement and tilt of the subject during imaging by using a trained model that has learned jaw movement trajectories that include the movement and tilt of the subject's body during imaging.

図２に示される例では、データベース部２００は、コンピュータシステム１００の外部に設けられているが、本発明はこれに限定されない。データベース部２００の少なくとも一部をコンピュータシステム１００の内部に設けることも可能である。このとき、データベース部２００の少なくとも一部は、メモリ部１７０を実装する記憶手段と同一の記憶手段によって実装されてもよいし、メモリ部１７０を実装する記憶手段とは別の記憶手段によって実装されてもよい。いずれにせよ、データベース部２００の少なくとも一部は、コンピュータシステム１００のための格納部として構成される。データベース部２００の構成は、特定のハードウェア構成に限定されない。例えば、データベース部２００は、単一のハードウェア部品で構成されてもよいし、複数のハードウェア部品で構成されてもよい。例えば、データベース部２００は、コンピュータシステム１００の外付けハードディスク装置として構成されてもよいし、ネットワークを介して接続されるクラウド上のストレージとして構成されてもよい。 In the example shown in FIG. 2, the database unit 200 is provided outside the computer system 100, but the present invention is not limited to this. At least a part of the database unit 200 can also be provided inside the computer system 100. In that case, at least a part of the database unit 200 may be implemented by the same storage means that implements the memory unit 170, or by a storage means different from the one that implements the memory unit 170. In either case, at least a part of the database unit 200 is configured as a storage unit for the computer system 100. The configuration of the database unit 200 is not limited to a specific hardware configuration. For example, the database unit 200 may be composed of a single hardware component or of a plurality of hardware components. For example, the database unit 200 may be configured as an external hard disk device of the computer system 100, or as cloud storage connected via a network.

上述した図３、図６Ａ～図６Ｄに示される例では、プロセッサ部１２０、１３０、１４０、１５０、１６０の各構成要素が同一のプロセッサ部１２０、１３０、１４０、１５０、１６０内に設けられているが、本発明はこれに限定されない。プロセッサ部１２０、１３０、１４０、１５０、１６０の各構成要素が、複数のプロセッサ部に分散される構成も本発明の範囲内である。このとき、複数のプロセッサ部は、同一のハードウェア部品内に位置してもよいし、近傍または遠隔の別個のハードウェア部品内に位置してもよい。例えば、プロセッサ部１５０のベース顔モデル生成手段１５１は、他の構成要素とは別のプロセッサ部によって実装されることが好ましい。これにより、ベースモデル作成という負荷が大きい処理を別個に行うことができるようになるからである。 In the example shown in FIGS. 3 and 6A to 6D described above, the components of the processor units 120, 130, 140, 150 and 160 are provided in the same processor unit 120, 130, 140, 150 and 160. However, the present invention is not limited to this. It is also within the scope of the present invention that the components of the processor units 120, 130, 140, 150, and 160 are distributed to a plurality of processor units. At this time, the plurality of processor units may be located in the same hardware component, or may be located in separate hardware components in the vicinity or remote. For example, it is preferable that the base face model generation means 151 of the processor unit 150 is implemented by a processor unit different from other components. This makes it possible to separately perform the heavy-duty process of creating a base model.

なお、上述したコンピュータシステム１００の各構成要素は、単一のハードウェア部品で構成されていてもよいし、複数のハードウェア部品で構成されていてもよい。複数のハードウェア部品で構成される場合は、各ハードウェア部品が接続される態様は問わない。各ハードウェア部品は、無線で接続されてもよいし、有線で接続されてもよい。本発明のコンピュータシステム１００は、特定のハードウェア構成には限定されない。プロセッサ部１２０、１３０、１４０、１５０、１６０をデジタル回路ではなくアナログ回路によって構成することも本発明の範囲内である。本発明のコンピュータシステム１００の構成は、その機能を実現できる限りにおいて上述したものに限定されない。 Each component of the computer system 100 described above may be composed of a single hardware component or may be composed of a plurality of hardware components. When it is composed of a plurality of hardware parts, the mode in which each hardware part is connected does not matter. Each hardware component may be connected wirelessly or may be connected by wire. The computer system 100 of the present invention is not limited to a specific hardware configuration. It is also within the scope of the present invention that the processor units 120, 130, 140, 150, 160 are configured by an analog circuit instead of a digital circuit. The configuration of the computer system 100 of the present invention is not limited to the above-mentioned one as long as the function can be realized.

３．被験者の顎運動を測定するためのコンピュータシステムによる処理 3. Processing by a computer system for measuring the jaw movement of a subject
図８は、被験者の顎運動を測定するためのコンピュータシステム１００による処理の一例（処理８００）を示すフローチャートである。処理８００は、例えば、コンピュータシステム１００におけるプロセッサ部１３０、１４０または１５０において実行される。 FIG. 8 is a flowchart showing an example (process 800) of processing by the computer system 100 for measuring the jaw movement of a subject. The process 800 is executed, for example, in the processor unit 130, 140, or 150 of the computer system 100.

ステップＳ８０１では、プロセッサ部の取得手段１２１が、顎運動中の被験者の顔の連続した複数の画像を取得する。取得手段１２１は、例えば、データベース部２００に格納されている顎運動中の被験者の顔の連続した複数の画像をインターフェース部１１０を介して取得することができる。あるいは、取得手段１２１は、例えば、インターフェース部１１０を介して端末装置３００から受信された連続した複数の画像を取得することができる。 In step S801, the acquisition means 121 of the processor unit acquires a plurality of continuous images of the face of the subject during jaw movement. The acquisition means 121 can acquire, for example, a plurality of consecutive images of the subject's face during jaw movement stored in the database unit 200 via the interface unit 110. Alternatively, the acquisition means 121 can acquire a plurality of continuous images received from the terminal device 300 via the interface unit 110, for example.

ステップＳ８０２では、プロセッサ部の補正手段１３１が、被験者の顔の座標系を少なくとも補正する。プロセッサ部の補正手段１３１は、例えば、ステップＳ８０１で取得された画像に基づいて、被験者の顔の座標系を補正することができる。 In step S802, the correction means 131 of the processor unit corrects at least the coordinate system of the subject's face. The correction means 131 of the processor unit can correct the coordinate system of the subject's face based on, for example, the images acquired in step S801.

ステップＳ８０３では、プロセッサ部の抽出手段１２２が、上述した各実施形態のステップＳ８０２で補正された座標系において、顔の下顎領域内の運動点を少なくとも抽出する。抽出手段１２２は、例えば、画像中の運動点がどこであるかの入力をインターフェース部１１０を介して（例えば、端末装置３００から）受け、その入力に基づいて、運動点を抽出することができる。あるいは、抽出手段１２２は、例えば、入力を受けることなく、自動的に運動点を抽出することができる。 In step S803, the extraction means 122 of the processor unit extracts at least the motion points in the lower jaw region of the face in the coordinate system corrected in step S802 of each of the above-described embodiments. For example, the extraction means 122 can receive an input of where the motion point in the image is via the interface unit 110 (for example, from the terminal device 300), and can extract the motion point based on the input. Alternatively, the extraction means 122 can automatically extract the motion point without receiving an input, for example.

抽出手段１２２は、例えば、複数の画像の各々に対して、顔の複数の特徴部分を検出し、抽出された複数の特徴部分のうち、所定期間内の座標変化が所定の範囲内の部分を運動点として抽出することができる。ここで、所定期間は、例えば、連続した複数の画像が撮影された期間のすべてまたは一部であり得る。所定範囲は、例えば、約５ｍｍ～約２０ｍｍであり得る。 For example, the extraction means 122 detects a plurality of feature portions of the face for each of the plurality of images, and among the plurality of extracted feature portions, a portion whose coordinate change within a predetermined period is within a predetermined range. It can be extracted as a motion point. Here, the predetermined period may be, for example, all or a part of the period in which a plurality of consecutive images are taken. The predetermined range can be, for example, about 5 mm to about 20 mm.
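The range-based selection of motion points described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `extract_motion_points` is hypothetical, feature coordinates are assumed to be already tracked per frame and expressed in millimetres, and "coordinate change" is interpreted here as the maximum displacement from the first frame. The source does not fix any of these details.

```python
import math


def extract_motion_points(tracks, low_mm=5.0, high_mm=20.0):
    """Select facial features whose coordinate change lies within a range.

    tracks: dict mapping a feature name to its list of (x, y) coordinates in
        millimetres, one entry per frame (hypothetical input format).
    low_mm, high_mm: bounds of the predetermined range (about 5 mm to about
        20 mm in the example given in the description).
    Returns the names of the features treated as motion points.
    """
    selected = []
    for name, coords in tracks.items():
        x0, y0 = coords[0]
        # Maximum displacement from the first frame (assumed interpretation).
        change = max(math.hypot(x - x0, y - y0) for x, y in coords)
        if low_mm <= change <= high_mm:
            selected.append(name)
    return selected
```

Features that barely move fall below the range, and features that move far more than a jaw plausibly could fall above it, so both are excluded from the motion points.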

ステップＳ８０４では、プロセッサ部の生成手段１２３が、運動点を追跡することによって、運動点の軌跡を示す運動点軌跡情報を少なくとも生成する。運動点軌跡情報は、所定期間内の運動点の軌跡を示す情報である。ここで、所定期間は、例えば、連続した複数の画像が撮影された期間のすべてまたは一部であり得る。 In step S804, the generation means 123 of the processor unit traces the motion point to generate at least the motion point trajectory information indicating the trajectory of the motion point. The motion point locus information is information indicating the trajectory of the motion point within a predetermined period. Here, the predetermined period may be, for example, all or a part of the period in which a plurality of consecutive images are taken.

処理８００によれば、複数の画像の各々から抽出される運動点は、それぞれ同一の座標系において取得されるため、生成される運動点軌跡情報は、より正確な情報となる。 According to the process 800, since the motion points extracted from each of the plurality of images are acquired in the same coordinate system, the generated motion point trajectory information becomes more accurate information.

一実施形態において、ステップＳ８０２では、プロセッサ部１４０の補正手段１３１が、固定点と、事前に定義された顔基準位置テンプレートとに基づいて、被験者の顔の座標系を補正する。 In one embodiment, in step S802, the correction means 131 of the processor unit 140 corrects the coordinate system of the subject's face based on the fixed point and the predefined face reference position template.

この実施形態では、ステップＳ８０２の前に、ステップＳ８０２０を含むことができる。ステップＳ８０２０では、プロセッサ部１４０の抽出手段１２２の第２の抽出手段１４２が、ステップＳ８０１で取得された複数の画像に基づいて、顔の上顔面領域内の固定点を抽出する。第２の抽出手段１４２は、例えば、画像中の固定点がどこであるかの入力をインターフェース部１１０を介して（例えば、端末装置３００から）受け、その入力に基づいて、固定点を抽出することができる。あるいは、第２の抽出手段１４２は、例えば、入力を受けることなく、自動的に固定点を抽出することができる。 In this embodiment, step S8020 can be included before step S802. In step S8020, the second extraction means 142 of the extraction means 122 of the processor unit 140 extracts fixed points in the upper facial region of the face based on the plurality of images acquired in step S801. The second extraction means 142 can, for example, receive an input indicating where the fixed points are in the image via the interface unit 110 (for example, from the terminal device 300) and extract the fixed points based on that input. Alternatively, the second extraction means 142 can, for example, extract the fixed points automatically without receiving an input.

第２の抽出手段１４２は、例えば、複数の画像の各々に対して、顔の複数の特徴部分を検出し、抽出された複数の特徴部分のうち、所定期間内の座標変化が所定閾値未満の部分を固定点として抽出することができる。ここで、所定期間は、例えば、連続した複数の画像が撮影された期間のすべてまたは一部であり得る。所定範囲は、例えば、約５ｍｍ～約２０ｍｍであり得る。第２の抽出手段１４２は、例えば、画像に対して色強調処理を行い、色調が周囲と異なる部分を固定点とし抽出することもできる。ステップＳ８０２０で抽出される固定点は、ステップＳ８０３の抽出するステップで抽出されるものの１つであるとみなすことができ、従って、ステップＳ８０２０は、ステップＳ８０３の一部であるとみなすことができる。 The second extraction means 142 can, for example, detect a plurality of feature portions of the face in each of the plurality of images and extract, as fixed points, those feature portions whose coordinate change within a predetermined period is less than a predetermined threshold. Here, the predetermined period may be, for example, all or part of the period in which the plurality of consecutive images were captured. The predetermined range can be, for example, about 5 mm to about 20 mm. The second extraction means 142 can also, for example, perform color enhancement processing on an image and extract, as a fixed point, a portion whose color tone differs from its surroundings. The fixed points extracted in step S8020 can be regarded as one of the things extracted in the extraction step of step S803, and step S8020 can therefore be regarded as part of step S803.

ステップＳ８０２では、プロセッサ部１４０の補正手段１３１が、ステップＳ８０２０で抽出された固定点と、事前に定義された顔基準位置テンプレートとに基づいて、被験者の顔の座標系を補正する。例えば、補正手段１３１は、複数の画像の各々について、ステップＳ８０２０で抽出された固定点が、顔基準位置テンプレート上の対応する点に移動するように、複数の画像の各々を変換処理（例えば、拡縮、回転、剪断、平行移動等）する。あるいは、例えば、補正手段１３１は、複数の画像の各々について、ステップＳ８０２０で抽出された固定点と顔基準位置テンプレート上の対応する点との間の距離をゼロにまたは所定の閾値未満にするように、複数の画像の各々を変換処理する。あるいは、例えば、補正手段１３１は、複数の画像の各々の固定点によって定義される平面と顔基準位置テンプレート上の対応する平面とが一致するように、複数の画像の各々を変換処理する。ここで、例えば、顔基準位置テンプレートは、顔が正面を向いている状態での顔の位置を定義するテンプレートであり、解剖学的に定義され得る。 In step S802, the correction means 131 of the processor unit 140 corrects the coordinate system of the subject's face based on the fixed point extracted in step S8020 and the predetermined face reference position template. For example, the correction means 131 converts each of the plurality of images (for example, for example) so that the fixed point extracted in step S8020 moves to the corresponding point on the face reference position template for each of the plurality of images. Scale, rotation, shear, translation, etc.). Alternatively, for example, the correction means 131 makes the distance between the fixed point extracted in step S8020 and the corresponding point on the face reference position template zero or less than a predetermined threshold for each of the plurality of images. In addition, each of the plurality of images is converted. Alternatively, for example, the correction means 131 transforms each of the plurality of images so that the plane defined by each fixed point of the plurality of images coincides with the corresponding plane on the face reference position template. Here, for example, the face reference position template is a template that defines the position of the face when the face is facing the front, and can be anatomically defined.
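Moving the extracted fixed points onto the corresponding template points can be sketched for the simplest two-dimensional case. The sketch below derives, from two fixed points and their two template counterparts, a similarity transform (uniform scale, rotation, and translation) and applies it to other points such as motion points. Shear is not covered. The function name `similarity_from_two_points` and the use of exactly two fixed points are assumptions made for illustration, not the method fixed by the description.

```python
def similarity_from_two_points(src, dst):
    """Build a 2D similarity transform that maps two points onto two others.

    src: two (x, y) fixed points extracted from an image.
    dst: the two corresponding (x, y) points on the reference template.
    Returns a function that maps any (x, y) in the image to template
    coordinates. Points are handled as complex numbers, so z -> a * z + b
    encodes rotation and uniform scale (a) and translation (b).
    """
    p1, p2 = (complex(*p) for p in src)
    t1, t2 = (complex(*p) for p in dst)
    if p1 == p2:
        raise ValueError("the two source points must be distinct")
    a = (t2 - t1) / (p2 - p1)  # rotation and uniform scale
    b = t1 - a * p1            # translation

    def apply(point):
        z = a * complex(*point) + b
        return (z.real, z.imag)

    return apply
```

Applying the returned function to the fixed points themselves places them exactly on the template, which corresponds to making the distance between each fixed point and its template counterpart zero.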

別の実施形態において、ステップＳ８０２では、プロセッサ部１３０の補正手段１３１は、複数の被験者の顔の基準座標系を学習する処理を施された基準座標系学習済モデルを利用して、複数の画像の各々について、被験者の顔の座標系を補正することができる。基準座標系学習済モデルは、入力された画像中の被験者の顔の座標系を基準座標系に補正した画像を出力するように構成されている。 In another embodiment, in step S802, the correction means 131 of the processor unit 130 uses a reference coordinate system trained model that has been processed to learn the reference coordinate system of the faces of a plurality of subjects, and uses a plurality of images. The coordinate system of the subject's face can be corrected for each of the above. The reference coordinate system trained model is configured to output an image obtained by correcting the coordinate system of the subject's face in the input image to the reference coordinate system.

基準座標系学習済モデルは、例えば、教師あり学習によって構築され得る。教師あり学習では、例えば、被験者の顔の画像が入力用教師データとして用いられ、その画像における基準座標系が出力用教師データとして用いられ得る。複数の被験者の複数の画像を繰り返し学習することにより、学習済モデルは、複数の被験者の顔が統計上有すると推定される基準座標系を認識することができるようになる。このような学習済モデルに被験者の顔の画像を入力すると、被験者の顔の座標系と基準座標系との差分が出力されるようになる。例えば、被験者の顔の座標系と基準座標系との差分をゼロにまたは所定の閾値未満にするように、入力された画像を変換処理（例えば、拡縮、回転、剪断、平行移動等）するように学習済モデルを構成することによって、基準座標系学習済モデルが生成され得る。画像の変換処理は、例えば、アフィン変換を利用して行われ得る。 The frame of reference trained model can be constructed, for example, by supervised learning. In supervised learning, for example, an image of the subject's face can be used as input teacher data, and the reference coordinate system in the image can be used as output teacher data. By repeatedly learning a plurality of images of a plurality of subjects, the trained model becomes able to recognize a reference coordinate system that is estimated to be statistically possessed by the faces of the plurality of subjects. When an image of the subject's face is input to such a trained model, the difference between the subject's face coordinate system and the reference coordinate system is output. For example, the input image is converted (for example, scaling, rotation, shearing, translation, etc.) so that the difference between the coordinate system of the subject's face and the reference coordinate system becomes zero or less than a predetermined threshold. By constructing a trained model in, a frame of reference trained model can be generated. The image conversion process can be performed using, for example, an affine transformation.

ステップＳ８０２では、プロセッサ部１３０の補正手段１３１が、ステップＳ８０１で取得された複数の画像を基準座標系学習済モデルに入力し、入力された画像中の被験者の顔の座標系が基準座標系に補正された画像を得ることができる。 In step S802, the correction means 131 of the processor unit 130 inputs a plurality of images acquired in step S801 into the reference coordinate system trained model, and the coordinate system of the subject's face in the input image becomes the reference coordinate system. A corrected image can be obtained.

さらに別の実施形態において、ステップＳ８０２では、プロセッサ部１５０の補正手段１３１が、被験者の顎運動顔モデルを用いて、被験者の顔の座標系を補正する。 In yet another embodiment, in step S802, the correction means 131 of the processor unit 150 corrects the coordinate system of the subject's face using the subject's jaw movement face model.

この実施形態では、ステップＳ８０２の前に、ステップＳ８０２１と、ステップＳ８０２２とを含むことができる。ステップＳ８０２１では、プロセッサ部１５０の補正手段１３１のベース顔モデル生成手段１５１が、被験者の顔のベース顔モデルを生成する。ベース顔モデル生成手段１５１は、例えば、予め取得された被験者の顔の画像（例えば、データベース部２００に格納されている被験者の顔の画像）から、ベース顔モデルを生成するようにしてもよいし、インターフェース部１１０を介して端末装置３００から受信された被験者の顔の画像から、ベース顔モデルを生成するようにしてもよい。なお、ベース顔モデルが予め生成されている場合には、ステップＳ８０２１は省略され得る。 In this embodiment, step S8021 and step S8022 can be included before step S802. In step S8021, the base face model generation means 151 of the correction means 131 of the processor unit 150 generates a base face model of the subject's face. The base face model generation means 151 may, for example, generate the base face model from an image of the subject's face acquired in advance (for example, an image of the subject's face stored in the database unit 200), or from an image of the subject's face received from the terminal device 300 via the interface unit 110. If the base face model has already been generated, step S8021 may be omitted.

ステップＳ８０２２では、プロセッサ部１５０の補正手段１３１の顎運動顔モデル生成手段１５２が、ステップＳ８０１で取得された複数の画像中の被験者の顔をベース顔モデルに反映させることにより、被験者の顎運動顔モデルを生成する。生成された顎運動顔モデルは、ステップＳ８０１で取得された複数の画像に写る被験者の顎運動に合わせて動く３Ｄアバターとなる。 In step S8022, the jaw movement face model generation means 152 of the correction means 131 of the processor unit 150 reflects the face of the subject in the plurality of images acquired in step S801 on the base face model, so that the jaw movement face of the subject is reflected. Generate a model. The generated jaw movement face model is a 3D avatar that moves according to the jaw movement of the subject shown in the plurality of images acquired in step S801.

ステップＳ８０２では、プロセッサ部１５０の補正手段１３１が、ステップＳ８０２１で生成されたベース顔モデルの座標系に基づいて、ステップＳ８０２２で生成された顎運動顔モデルの座標系を補正する。補正手段１３１は、例えば、任意の座標変換処理を行うことにより、顎運動顔モデルの座標系をベース顔モデルの座標系に変換することができる。 In step S802, the correction means 131 of the processor unit 150 corrects the coordinate system of the jaw movement face model generated in step S8022 based on the coordinate system of the base face model generated in step S8021. The correction means 131 can convert the coordinate system of the jaw movement face model to the coordinate system of the base face model, for example, by performing an arbitrary coordinate conversion process.

ステップＳ８０２で、固定点と、顔基準位置テンプレートとに基づいて、被験者の顔の座標系を補正した場合には、連続した複数の画像の撮影時の被験者の身体の動き等により、顎運動とは独立して固定点が移動してしまうことがある。このため、処理８００は、運動点軌跡情報に含まれ得る固定点軌跡情報を相殺するための処理（ステップＳ８０５、ステップＳ８０６）を含むことができる。 When the coordinate system of the subject's face is corrected in step S802 based on the fixed points and the face reference position template, the fixed points may move independently of the jaw movement, for example because of movement of the subject's body while the consecutive images are captured. For this reason, the process 800 can include processing (steps S805 and S806) for canceling out the fixed point trajectory information that may be contained in the motion point trajectory information.

ステップＳ８０５では、プロセッサ部１４０の補正手段１３１の第２の生成手段が、ステップＳ８０２０で抽出された固定点を追跡することによって、固定点の軌跡を示す固定点軌跡情報を生成する。固定点軌跡情報は、所定期間内の固定点の軌跡を示す情報である。ここで、所定期間は、例えば、連続した複数の画像が撮影された期間のすべてまたは一部であり得る。第２の生成手段は、例えば、連続した複数の画像の各々における固定点の画像中の座標を追跡することによって、固定点軌跡情報を生成することができる。 In step S805, the second generation means of the correction means 131 of the processor unit 140 generates fixed point trajectory information indicating the trajectory of the fixed points by tracking the fixed points extracted in step S8020. The fixed point trajectory information is information indicating the trajectory of the fixed points within a predetermined period. Here, the predetermined period may be, for example, all or part of the period in which the plurality of consecutive images were captured. The second generation means can generate the fixed point trajectory information, for example, by tracking the image coordinates of the fixed points in each of the plurality of consecutive images.

ステップＳ８０６では、プロセッサ部１４０の補正手段１３１の第２の補正手段が、固定点軌跡情報に基づいて、運動点軌跡情報を補正する。例えば、第２の補正手段は、運動点の軌跡から固定点の軌跡を差し引くことによって、運動点軌跡情報を補正することができる。あるいは、例えば、第２の補正手段は、運動点の座標から固定点の座標を差し引いて得られる補正後の運動点を追跡することによって、運動点軌跡情報を補正するようにしてもよい。ステップＳ８０５およびステップＳ８０６で補正される運動点軌跡情報は、ステップＳ８０２の補正するステップで補正されるものの１つであるとみなすことができ、従って、ステップＳ８０５およびステップＳ８０６は、ステップＳ８０２の一部であるとみなすことができる。 In step S806, the second correction means of the correction means 131 of the processor unit 140 corrects the motion point trajectory information based on the fixed point trajectory information. For example, the second correction means can correct the motion point trajectory information by subtracting the trajectory of the fixed point from the trajectory of the motion point. Alternatively, for example, the second correction means may correct the motion point trajectory information by tracking the corrected motion point obtained by subtracting the coordinates of the fixed point from the coordinates of the motion point. The motion point trajectory information corrected in steps S805 and S806 can be regarded as one of the things corrected in the correction step of step S802, and steps S805 and S806 can therefore be regarded as part of step S802.
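The subtraction of the fixed point trajectory from the motion point trajectory can be sketched as a per-frame coordinate difference. The function name `subtract_fixed_trajectory` and the list-of-tuples layout are assumptions for this example; both trajectories are assumed to be sampled on the same frames.

```python
def subtract_fixed_trajectory(motion, fixed):
    """Correct a motion point trajectory by removing fixed point movement.

    motion: list of (x, y) motion point coordinates, one per frame.
    fixed: list of (x, y) fixed point coordinates for the same frames.
    Returns the motion point trajectory expressed relative to the fixed
    point, so movement shared by both points (for example, the subject's
    head or body shifting during imaging) cancels out.
    """
    if len(motion) != len(fixed):
        raise ValueError("trajectories must cover the same frames")
    return [tuple(m - f for m, f in zip(mp, fp)) for mp, fp in zip(motion, fixed)]
```

In the usage below, the head drifts 1 unit to the right per frame while the jaw opens downward; after correction only the vertical jaw movement remains.

```python
motion = [(10.0, 20.0), (11.0, 25.0), (12.0, 30.0)]
fixed = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
corrected = subtract_fixed_trajectory(motion, fixed)
# corrected == [(10.0, 20.0), (10.0, 25.0), (10.0, 30.0)]
```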

上述した例では、補正された座標系における運動点を追跡することによって、運動点軌跡情報を生成することを説明したが、本発明では、補正のタイミングはこれに限定されない。例えば、ステップＳ８０２の前のステップＳ８０３において、ステップＳ８０１で取得された複数の画像から運動点を抽出し、次いで、ステップＳ８０４において、運動点を追跡することによって運動点軌跡情報を生成した後に、ステップＳ８０２において、ステップＳ８０４で生成された運動点軌跡情報を補正するようにしてもよい。あるいは、例えば、ステップＳ８０２の前のステップＳ８０３において、ステップＳ８０１で取得された複数の画像から運動点を抽出し、次いで、ステップＳ８０２において、ステップＳ８０３で抽出された運動点を補正し、次いで、ステップＳ８０４において、補正された運動点を追跡することによって運動点軌跡情報を生成するようにしてもよい。 In the above-mentioned example, it has been described that the motion point trajectory information is generated by tracking the motion points in the corrected coordinate system, but in the present invention, the timing of the correction is not limited to this. For example, in step S803 before step S802, motion points are extracted from a plurality of images acquired in step S801, and then in step S804, motion point trajectory information is generated by tracking the motion points, and then the step. In S802, the motion point locus information generated in step S804 may be corrected. Alternatively, for example, in step S803 before step S802, the motion points are extracted from the plurality of images acquired in step S801, then in step S802, the motion points extracted in step S803 are corrected, and then the step. In S804, the motion point trajectory information may be generated by tracking the corrected motion point.

That is, the motion point trajectory information generated by the generation means 123 may or may not include information indicating motion other than that of the motion point of the lower jaw (for example, noise caused by movement of the subject's body during imaging, such as movement of the upper jaw, or by tilt). It includes such information when the motion point trajectory information is corrected after it has been generated. It does not include such information when, before the motion point trajectory information is generated, either the coordinate system is corrected and the motion points are then extracted, or the motion points are extracted and the extracted motion points are then corrected.
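The two orderings (tracking first and correcting the trajectory afterwards, or correcting each extracted point and then tracking) can be compared in a small sketch. It assumes, purely for illustration, that the correction is a per-frame rigid transform that moves a reference origin to (0, 0) and undoes a known head tilt, and that tracking simply collects the per-frame positions. The helpers `correct_point`, `track`, `track_then_correct` and `correct_then_track` are hypothetical names, not part of the specification.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def correct_point(p: Point, origin: Point, tilt_rad: float) -> Point:
    """Translate so that `origin` becomes (0, 0), then rotate by -tilt_rad."""
    x, y = p[0] - origin[0], p[1] - origin[1]
    c, s = math.cos(-tilt_rad), math.sin(-tilt_rad)
    return (c * x - s * y, s * x + c * y)


def track(points: List[Point]) -> List[Point]:
    """Stand-in for tracking: the trajectory is the per-frame point sequence."""
    return list(points)


def track_then_correct(raw: List[Point], origins: List[Point], tilts: List[float]) -> List[Point]:
    trajectory = track(raw)  # this trajectory still contains head motion
    return [correct_point(p, o, t) for p, o, t in zip(trajectory, origins, tilts)]


def correct_then_track(raw: List[Point], origins: List[Point], tilts: List[float]) -> List[Point]:
    corrected = [correct_point(p, o, t) for p, o, t in zip(raw, origins, tilts)]
    return track(corrected)  # this trajectory never contained head motion


# Hypothetical per-frame data: raw motion point, reference origin, head tilt.
raw_points = [(10.0, 30.0), (12.0, 31.0), (9.0, 33.0)]
origins = [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
tilts = [0.0, 0.05, -0.03]

late = track_then_correct(raw_points, origins, tilts)
early = correct_then_track(raw_points, origins, tilts)
print(late == early)
```

Under these assumptions both orderings yield the same final trajectory; they differ only in whether the intermediate trajectory contains the head motion. With a real tracker whose behaviour depends on the image content, the two orderings need not be numerically identical.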

FIG. 9 is a flowchart showing another example (process 900) of processing by the computer system 100 for measuring the jaw movement of a subject. Process 900 is a process for evaluating the jaw movement of the subject. Process 900 is executed, for example, by the processor unit 160 of the computer system 100.

In step S901, the acquisition means 121 of the processor unit 160 acquires a plurality of consecutive images of the subject's face during jaw movement. Step S901 is the same processing as step S801.

In step S902, the extraction means 122 of the processor unit 160 extracts at least a motion point in the mandibular region of the face based on the plurality of images acquired in step S901. The extraction means 122 can, for example, receive an input indicating where the motion point is in an image via the interface unit 110 (for example, from the terminal device 300) and extract the motion point based on that input. Alternatively, the extraction means 122 can extract the motion point automatically, without receiving any input.

In step S903, the generation means 123 of the processor unit 160 generates at least motion point trajectory information indicating the trajectory of the motion point by tracking the motion point extracted in step S902. The motion point trajectory information indicates the trajectory of the motion point within a predetermined period. The predetermined period may be, for example, all or part of the period over which the plurality of consecutive images were captured.
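One possible data shape for trajectory information limited to a predetermined period is sketched below. It assumes that each tracked position carries a capture timestamp in seconds; the `TrajectorySample` type, the `build_trajectory` function and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TrajectorySample:
    t: float  # capture time of the frame in seconds (assumed unit)
    x: float  # motion point x coordinate in pixels
    y: float  # motion point y coordinate in pixels


def build_trajectory(
    tracked: List[Tuple[float, Tuple[float, float]]],
    start: Optional[float] = None,
    end: Optional[float] = None,
) -> List[TrajectorySample]:
    """Keep tracked positions whose timestamps fall inside [start, end].

    Omitting both bounds keeps the whole capture period.
    """
    samples = []
    for t, (x, y) in sorted(tracked, key=lambda item: item[0]):
        if (start is None or t >= start) and (end is None or t <= end):
            samples.append(TrajectorySample(t, x, y))
    return samples


# Hypothetical tracked positions: (timestamp, (x, y)).
tracked_points = [
    (0.0, (100.0, 150.0)),
    (0.1, (100.0, 158.0)),
    (0.2, (101.0, 165.0)),
    (0.3, (100.0, 152.0)),
]

full = build_trajectory(tracked_points)  # whole period
partial = build_trajectory(tracked_points, start=0.1, end=0.2)  # part of it
print(len(full), len(partial))
```

Keeping the timestamp with each sample lets later stages select a sub-period without re-running the tracking.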

In step S904, the evaluation means 161 of the processor unit 160 generates jaw movement evaluation information indicating an evaluation of the subject's jaw movement, based at least on the motion point trajectory information generated in step S903. The evaluation means 161 can generate the jaw movement evaluation information by using a motion point trajectory trained model that has been trained on the motion point trajectory information of a plurality of subjects. The motion point trajectory trained model is configured to correlate input motion point trajectory information with an evaluation of jaw movement. For example, when the motion point trajectory information generated in step S903 is input to the motion point trajectory trained model, an estimated evaluation of the subject's jaw movement can be output.

The motion point trajectory information generated in step S903 may include noise caused by movement or tilt of the subject's body during imaging. However, because the motion point trajectory trained model has been trained on motion point trajectory information that includes such noise, step S904 can generate the jaw movement evaluation information accurately regardless of any such noise in the motion point trajectory information.

For example, after steps S901 to S904 have been performed, the motion point trajectory trained model may be updated by performing processing similar to the learning processing described later.

FIG. 10 is a flowchart showing another example (process 1000) of processing by the computer system 100 for measuring the jaw movement of a subject. Process 1000 is a process for constructing the motion point trajectory trained model used to measure the jaw movement of a subject. Process 1000 is executed, for example, by the processor unit 160 of the computer system 100.

In step S1001, the acquisition means 121 of the processor unit 160 acquires at least motion point trajectory information indicating the trajectories of motion points obtained by tracking the motion points of a plurality of subjects. The acquisition means 121 can acquire, for example, motion point trajectory information stored in the database unit 200 via the interface unit 110. The motion point trajectory information may be, for example, motion point trajectory information acquired using the computer system 100 of the present invention, or motion point trajectory information obtained from any known jaw movement measuring device.

In step S1001, the acquisition means 121 may additionally acquire evaluations of the jaw movements of the plurality of subjects. The acquisition means 121 can acquire, for example, evaluations of jaw movement stored in the database unit 200 via the interface unit 110.

In step S1002, the processor unit 160 constructs the motion point trajectory trained model through learning processing that uses at least the motion point trajectory information of the plurality of subjects acquired in step S1001 as input training data. The motion point trajectory trained model can be, for example, a neural network model.

For example, when the motion point trajectory trained model is a neural network model, the learning processing in step S1002 uses the data acquired in step S1001 to calculate the weighting coefficient of each node in the hidden layer of the neural network model.

The learning processing is, for example, unsupervised learning. In unsupervised learning, for example, the outputs obtained when the motion point trajectory information of the plurality of subjects is used as input training data are classified. The classification can be performed using any known method. By characterizing the classified outputs with evaluations of jaw movement, a trained model capable of correlating motion point trajectory information with an evaluation of jaw movement is constructed.

The classification is performed, for example, by clustering. For example, the plurality of outputs are clustered so as to divide them into a plurality of clusters. Each of the clusters is then characterized based on the jaw movement evaluations of the subjects belonging to it. A trained model capable of correlating motion point trajectory information with an evaluation of jaw movement is thereby constructed. The clustering can be performed using, for example, any known method.
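The cluster-then-characterize idea can be sketched as follows. This is only an illustration under stated assumptions: it clusters hand-made two-dimensional feature vectors directly rather than the outputs of a neural network, it uses plain k-means as one example of "any known method", and the features (maximum opening, lateral deviation) and the evaluation labels ("normal", "restricted") are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(v: Vector, centroids: List[List[float]]) -> int:
    """Index of the centroid closest to v."""
    return min(range(len(centroids)), key=lambda i: sq_dist(v, centroids[i]))


def kmeans(data: List[Vector], k: int, iters: int = 20) -> List[List[float]]:
    centroids = [list(v) for v in data[:k]]  # deterministic init: first k samples
    for _ in range(iters):
        groups: List[List[Vector]] = [[] for _ in range(k)]
        for v in data:
            groups[nearest(v, centroids)].append(v)
        for i, group in enumerate(groups):
            if group:  # keep the previous centroid if a cluster is empty
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def characterize(data: List[Vector], evaluations: List[str],
                 centroids: List[List[float]]) -> Dict[int, str]:
    """Label each cluster with the most common evaluation among its members."""
    votes: Dict[int, Counter] = {}
    for v, e in zip(data, evaluations):
        votes.setdefault(nearest(v, centroids), Counter())[e] += 1
    return {i: c.most_common(1)[0][0] for i, c in votes.items()}


# Hypothetical per-subject features: (max opening in mm, lateral deviation in mm).
features = [(30.0, 1.0), (10.0, 6.0), (32.0, 0.5), (12.0, 5.0), (29.0, 1.5), (11.0, 7.0)]
evaluations = ["normal", "restricted", "normal", "restricted", "normal", "restricted"]

centroids = kmeans(features, k=2)
labels = characterize(features, evaluations, centroids)

# Evaluate a new subject by looking up the label of the nearest cluster.
new_subject = (28.0, 2.0)
print(labels[nearest(new_subject, centroids)])
```

The evaluations are not used while forming the clusters; they are attached afterwards, which is what allows a trajectory to be mapped to an evaluation at inference time.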

In the examples described above with reference to FIGS. 8 to 10, the processing is described as being performed in a specific order. However, the order is not limited to the one described, and the processing may be performed in any logically possible order.

In the examples described above with reference to FIGS. 8 to 10, the processing of each step shown in FIGS. 8 to 10 is described as being realized by the processor unit 120, the processor unit 130, the processor unit 140, the processor unit 150, or the processor unit 160 together with a program stored in the memory unit 170, but the present invention is not limited to this. At least one of the steps shown in FIGS. 8 to 10 may be realized by a hardware configuration such as a control circuit.

The examples above describe the case where the computer system 100 is a server device connected to the terminal device 300 via the network 400, but the present invention is not limited to this. The computer system 100 can be any information processing device that includes a processor unit. For example, the computer system 100 can be the terminal device 300. Alternatively, the computer system 100 can be a combination of the terminal device 300 and a server device.

The examples above describe the computer system 100 as one implementation of the present invention, but the present invention can also be implemented as, for example, a system that includes the computer system 100. Such a system comprises, for example, the computer system 100 described above and a gauge point configured to be placed in the mandibular region of the subject. As described above, the gauge point may be configured to represent a specific point on the gauge point. Using a plurality of consecutive images captured with the gauge point placed in the subject's mandibular region makes it easier for the computer system 100 to extract the motion point, and ensures that the extracted motion point is consistent across the plurality of images. The system may further comprise a fixed-point gauge point configured to be placed in the upper facial region of the subject. As described above, the fixed-point gauge point may likewise be configured to represent a specific point on the fixed-point gauge point. Using a plurality of consecutive images captured with the fixed-point gauge point placed in the subject's upper facial region makes it easier for the computer system 100 to extract the fixed point.

The present invention is not limited to the embodiments described above. Various modifications are possible within the scope of the claims, and embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included in the technical scope of the present invention.

The present invention is useful in that it provides a system and the like capable of easily measuring the jaw movement of a subject.

100 Computer system
110 Interface unit
120, 130, 140, 150, 160 Processor unit
170 Memory unit
200 Database unit
300 Terminal device
400 Network

Claims

A system for measuring jaw movement, comprising:
an acquisition means for acquiring a plurality of consecutive images of a subject's face during jaw movement, wherein the plurality of images include an image in which the subject's face is tilted, the subject's face in that image being tilted with respect to the subject's face in at least one of the plurality of images;
a correction means for correcting at least the coordinate system of the subject's face, wherein the correction means corrects the coordinate system of the subject's face so that the coordinate system of the subject's face matches in each of the plurality of images, including the image in which the subject's face is tilted;
an extraction means for extracting at least a motion point in the mandibular region of the face; and
a generation means for generating at least motion point trajectory information indicating the trajectory of the motion point by tracking the motion point.

The system according to claim 1, wherein the extraction means comprises:
a first extraction means for extracting a motion point in the mandibular region of the face; and
a second extraction means for extracting a fixed point in the upper facial region of the face from the plurality of images,
and wherein the correction means corrects the coordinate system based on the fixed point and a predefined face reference position template.

The system according to claim 2, wherein:
the first extraction means extracts a plurality of feature portions in the plurality of images and extracts, as motion points, those feature portions among the plurality of feature portions whose coordinate change within a predetermined period falls within a predetermined range; and
the second extraction means extracts, as a fixed point, a feature portion among the plurality of feature portions whose coordinate change within a predetermined period is less than a predetermined threshold value.

The system according to any one of claims 2 to 3, wherein the correction means comprises:
a first correction means for correcting the coordinate system of the subject's face;
a second generation means for generating fixed point trajectory information indicating the trajectory of the fixed point by tracking the fixed point; and
a second correction means for correcting the motion point trajectory information based on the fixed point trajectory information.

The system according to claim 1, wherein the correction means comprises a reference coordinate system trained model that has been trained on the reference coordinate system of the faces of a plurality of subjects, the reference coordinate system trained model being configured to correct the coordinate system of the subject's face in an input image to the reference coordinate system.

The system according to claim 5, wherein the reference coordinate system trained model corrects the coordinate system by:
taking the difference between the coordinate system of the subject's face in the input image and the reference coordinate system; and
performing a conversion process on the input image based on the difference.

The system according to claim 6, wherein the conversion process includes an affine transformation.

The extraction means extracts a plurality of feature portions in the plurality of images, and extracts pixels having a coordinate change within a predetermined period within a predetermined range as motion points from the plurality of feature portions. The system according to any one of 7.

The system according to claim 1, wherein the correction means comprises:
a base face model generation means for generating a base face model of the subject's face; and
a jaw movement face model generation means for generating a jaw movement face model of the subject by reflecting the subject's face in the plurality of images in the base face model,
and wherein the correction means corrects the coordinate system by correcting the coordinate system of the jaw movement face model based on the coordinate system of the base face model.

The system according to claim 9, wherein the extraction means extracts motion points in the jaw motion face model or the base face model.

The system according to any one of claims 1 to 10, further comprising an evaluation means for generating jaw movement evaluation information indicating the evaluation of the jaw movement of the subject based on at least the generated movement point trajectory information.

The system according to any one of claims 1 and 5 to 11, further comprising a gauge point configured to be placed in the mandibular region of the subject,
wherein the extraction means extracts the gauge point in the plurality of images as a motion point.

The system according to any one of claims 2 to 4, further comprising a gauge point configured to be placed in the mandibular region of the subject,
wherein the extraction means extracts the gauge point in the plurality of images as a motion point.

The system according to claim 13, further comprising a gauge point configured to be placed in the upper facial region of the subject,
wherein the second extraction means extracts that gauge point in the plurality of images as a fixed point.

The system according to any one of claims 12 to 14, wherein the gauge point is configured to represent a specific point on the gauge point.

A program for measuring jaw movement, the program being executed in a system including a processor unit and causing the processor unit to perform processing comprising:
acquiring a plurality of consecutive images of a subject's face during jaw movement, wherein the plurality of images include an image in which the subject's face is tilted, the subject's face in that image being tilted with respect to the subject's face in at least one of the plurality of images;
correcting at least the coordinate system of the subject's face, including correcting the coordinate system of the subject's face so that the coordinate system of the subject's face matches in each of the plurality of images, including the image in which the subject's face is tilted;
extracting at least a motion point in the mandibular region of the face; and
generating at least motion point trajectory information indicating the trajectory of the motion point by tracking the motion point.

A method for measuring jaw movement, comprising:
acquiring a plurality of consecutive images of a subject's face during jaw movement, wherein the plurality of images include an image in which the subject's face is tilted, the subject's face in that image being tilted with respect to the subject's face in at least one of the plurality of images;
correcting at least the coordinate system of the subject's face, including correcting the coordinate system of the subject's face so that the coordinate system of the subject's face matches in each of the plurality of images, including the image in which the subject's face is tilted;
extracting at least a motion point in the mandibular region of the face; and
generating at least motion point trajectory information indicating the trajectory of the motion point by tracking the motion point.