JP2020187714A

JP2020187714A - Information processor, information processing system, learning device, and information processing program

Info

Publication number: JP2020187714A
Application number: JP2019099768A
Authority: JP
Inventors: 美子大平; Yoshiko Ohira
Original assignee: Individual
Current assignee: Individual
Priority date: 2019-05-10
Filing date: 2019-05-10
Publication date: 2020-11-19

Abstract

To provide an information processing device, an information processing system, a learning device, and an information processing program that appropriately estimate the occupation of a subject.

SOLUTION: An information processing device 100 includes: a reception unit 102 that receives input of subject information about a subject; an extraction unit 1460 that extracts, from the subject information, a feature quantity for estimating the subject's future occupation; a trained estimation model 1400 that outputs, on the basis of the feature quantity, the likelihood of the subject's future occupations as an estimation result; and an output unit 108 that outputs the estimation result of the estimation model. The estimation model 1400 is generated by learning processing that uses a learning data set. The learning data set includes learning data in which learning parameters, which are parameters of another subject, are labeled with that other subject's occupation.

SELECTED DRAWING: Figure 7

Description

The present invention relates to an information processing device, an information processing system, a learning device, and an information processing program.

For example, a guardian (for example, a parent) of a young child (the subject) may want to predict the future occupation to which the child is suited. Examples of future occupations include athlete, singer, doctor, and lawyer. Non-Patent Document 1 discloses a Web page that suggests an occupation for a subject.

[Retrieved May 10, 2019] Internet <URL> http://oshigoto.mydreams.jp/shindan/shindan_select1.php

However, Non-Patent Document 1 discloses a system that suggests an occupation by having the subject answer questions intended for subjects who have reached a certain age. When the subject is, for example, a young child, that system cannot suggest an appropriate occupation.

An object of the present invention is to provide an information processing device, an information processing system, a learning device, and an information processing program that appropriately estimate the future occupation of a subject.

According to one aspect of the present invention, an information processing device is proposed that includes: a reception unit that receives input of subject information about a subject; an extraction unit that extracts, from the subject information, a feature quantity for estimating the subject's future occupation; a trained estimation model that outputs, on the basis of the feature quantity, the likelihood of the subject's future occupations as an estimation result; and an output unit that outputs the estimation result of the estimation model. The estimation model is generated by learning processing that uses a learning data set, and the learning data set includes learning data in which learning parameters, which are parameters of another subject, are labeled with that other subject's occupation.

According to the present invention, the future occupation of a subject can be estimated appropriately.

A diagram showing phases in the occupation estimation system according to the present embodiment (part 1).
A diagram showing phases in the occupation estimation system according to the present embodiment (part 2).
A diagram showing phases in the occupation estimation system according to the present embodiment (part 3).
A diagram showing an example of the occupation estimation system according to the present embodiment.
A diagram showing an example of the hardware configuration of the information processing device according to the present embodiment.
A diagram showing an example of the hardware configuration of the management device according to the present embodiment.
A diagram showing an example of the functional configuration of the information processing device according to the present embodiment.
A diagram showing an example of the configuration of the estimation unit according to the present embodiment.
A diagram showing an example of the estimation model according to the present embodiment.
A diagram showing an example of the functional configuration of the management device according to the present embodiment.
A flowchart of the occupation estimation processing according to the present embodiment.
A flowchart of the management processing according to the present embodiment.

Embodiments of the present invention will be described in detail with reference to the drawings. Identical or corresponding parts in the drawings are given the same reference numerals, and their description is not repeated.

<A. Overview of the occupation estimation system>
First, an overview of occupation estimation system 1 according to the present embodiment is given as a typical example of the information processing system according to the present invention. In this specification, a "subject" is a person whose future occupation is to be estimated, for example a person aged 0 to 18. A guardian is a person who cares for the subject, for example the subject's parents.

FIGS. 1 and 2 are diagrams for explaining an example of how the occupation estimation system according to the present embodiment is used. In the present embodiment, a plurality of phases are defined for using the occupation estimation system. The information acquired in each of these phases is input to the information processing device of the present embodiment, which estimates the subject's future occupation on the basis of that information. FIG. 1(A) is a diagram for explaining the voice acquisition phase, in which the voice in a question-and-answer exchange between the subject and a questioner is acquired. Microphone 20 acquires the voice, and the resulting voice data is input to information processing device 100. Information processing device 100 stores the input voice data in a predetermined storage area. Information processing device 100 may be any device capable of processing information, for example a PC (personal computer), a tablet, or a smartphone.

For example, the questions from the questioner include a question asking which occupation the subject wants to have in the future, such as "What do you want to be in the future?" Other questions, such as "What do you like?" and "What are you good at?", may also be included. Hereinafter, these questions are referred to as "specific questions".

For example, people who work as doctors tend to have answered questions such as "What do you like?" and "What are you good at?" confidently in childhood, with answers such as "arithmetic and science". Likewise, they tend to have answered the question "What do you want to be in the future?" confidently with "a doctor".

Thus, for people in a specific occupation, the answers to the specific questions contain, as feature quantities, specific keywords and a specific way of speaking (tempo and the like). A specific keyword is a keyword that indicates a specific occupation. The specific keywords and the specific way of speaking (the feature quantities) in answers to the specific questions are therefore useful indicators for estimating the subject's occupation.
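The two feature quantities named above, keywords and speaking tempo, could be computed along the following lines. This is only a minimal sketch: it assumes the answer has already been transcribed to text and that its duration is known, and the keyword table, the `AnswerFeatures` class, and the `extract_answer_features` function are hypothetical names that do not appear in the specification.

```python
from dataclasses import dataclass

# Hypothetical keyword table; the specification does not enumerate keywords.
OCCUPATION_KEYWORDS = {
    "doctor": ("doctor", "arithmetic", "science"),
    "baseball player": ("baseball", "bat"),
}


@dataclass
class AnswerFeatures:
    keyword_hits: dict       # occupation -> number of keyword occurrences
    words_per_second: float  # crude stand-in for "tempo"


def extract_answer_features(transcript: str, duration_seconds: float) -> AnswerFeatures:
    """Count occupation keywords and estimate speaking tempo for one answer."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Lower-case each word and strip surrounding punctuation.
    words = [w.strip(".,!?\"'").lower() for w in transcript.split()]
    words = [w for w in words if w]
    hits = {
        occupation: sum(words.count(keyword) for keyword in keywords)
        for occupation, keywords in OCCUPATION_KEYWORDS.items()
    }
    return AnswerFeatures(keyword_hits=hits, words_per_second=len(words) / duration_seconds)
```

Words per second is used here only because it is easy to compute from a transcript; the specification says "tempo and the like" without fixing a measure.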

FIG. 1(B) is a diagram for explaining the still image acquisition phase, in which a still image containing at least one of a picture drawn by the subject and characters written by the subject is acquired. Camera 40 acquires the still image, and the still image data is input to information processing device 100. Information processing device 100 stores the input still image data in a predetermined storage area.

For example, the still image contains a picture drawn by the subject, characters written by the subject, and the like. People in occupations that require a high level of education (such as lawyers and doctors) tend to have had neat handwriting in childhood. "Neat handwriting" means, for example, that the degree of deviation from "model characters" is small.

The picture is preferably a picture of a specific object, for example a flower or an animal. People who work as painters, for example, tend to have drawn well in childhood. A "well-drawn picture" means that the degree of deviation from a "model picture" is small. A "model picture" is, for example, a photograph of the specific object.

Thus, the pictures drawn and the characters written in childhood by people in a specific occupation are useful indicators for estimating the subject's occupation.
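The specification does not say how the "degree of deviation" from a model character or model picture is measured. One simple possibility, shown here purely as an assumption, is the mean absolute difference between two equally sized grayscale images whose pixel values lie in the range 0 to 1. The function name `deviation_from_model` is hypothetical.

```python
def deviation_from_model(drawing, model):
    """Mean absolute pixel difference between two same-shaped grayscale images.

    Both arguments are lists of rows, each row a list of floats in [0, 1].
    0.0 means the drawing matches the model exactly; larger values mean
    a larger deviation. This metric is an illustrative assumption.
    """
    if len(drawing) != len(model) or any(
        len(row_d) != len(row_m) for row_d, row_m in zip(drawing, model)
    ):
        raise ValueError("drawing and model must have the same shape")
    total = 0.0
    count = 0
    for row_d, row_m in zip(drawing, model):
        for pixel_d, pixel_m in zip(row_d, row_m):
            total += abs(pixel_d - pixel_m)
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    return total / count
```

A real system would first need to align and scale the drawing to the model, which this sketch leaves out.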

FIG. 1(C) is a diagram for explaining the moving image acquisition phase, in which a moving image of the subject is acquired. Camera 40 acquires the moving image, and the moving image data is input to information processing device 100. Information processing device 100 stores the input moving image data in a predetermined storage area.

For example, the moving image shows the subject performing a specific motion. Specific motions include walking, running, kicking a ball, and swinging a bat. People who work as baseball players, for example, tend to have had good running form and good batting form in childhood. "Good form" means that the degree of deviation from a "model form" is small. People who work as baseball players also tend to have run fast in childhood.

Thus, the specific motions performed in childhood by people in a specific occupation are useful indicators for estimating the subject's occupation.

FIG. 2(D) is a diagram for explaining the grade acquisition phase, in which the subject's academic grades are acquired. Academic grades are, for example, the results of examinations at a cram school or an elementary school, expressed as test scores, test deviation values, and the like. As one method of entering the grades, information processing device 100 displays a grade input screen (not shown), and the subject's guardian or another user enters the subject's grades on that screen. Information processing device 100 stores the entered grade data in a predetermined storage area.

People in occupations that require a high level of education (such as lawyers and doctors) tend to have had high academic grades in childhood. Thus, the childhood grades of people in a specific occupation are useful indicators for estimating the subject's occupation.
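The test deviation value mentioned among the grades is conventionally computed as 50 plus 10 times the standardized score. The sketch below follows that conventional definition; the function name `deviation_value` and the choice to return 50 when every examinee has the same score are assumptions, not details taken from the specification.

```python
from statistics import mean, pstdev


def deviation_value(score, all_scores):
    """Conventional deviation value: 50 + 10 * (score - mean) / standard deviation.

    `all_scores` holds the scores of every examinee, including this one.
    """
    mu = mean(all_scores)
    sigma = pstdev(all_scores)  # population standard deviation
    if sigma == 0:
        # Everyone scored the same; treat the score as exactly average.
        return 50.0
    return 50.0 + 10.0 * (score - mu) / sigma
```

For example, with scores of 40 and 60 the mean is 50 and the standard deviation is 10, so a score of 60 maps to a deviation value of 60.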

FIG. 2(E) is a diagram for explaining the guardian information acquisition phase, in which information about the guardian is acquired. Guardian information includes, for example, the guardian's occupation, educational background, and annual income. As one method of entering this information, information processing device 100 displays a guardian information input screen (not shown), and the subject's guardian or another user enters the guardian information on that screen. Information processing device 100 stores the entered guardian information data in a predetermined storage area.

The guardians of people in a specific occupation (for example, doctors) are often in that same occupation. For example, the guardians of doctors are often doctors themselves. In addition, entering an occupation that requires a high level of education (for example, lawyer or doctor) typically involves attending a cram school or the like, which is expensive. It is therefore preferable that the guardian's annual income be high. Entering such an occupation also requires the subject to be admitted to and graduate from a university with a high deviation value. In general, when the guardian is highly educated, the subject also tends to be highly educated. One reason is that a highly educated guardian tends to provide a good educational environment, which makes it easier for the subject to concentrate on studying. Furthermore, when the guardian and the subject are related by blood, heredity also makes a subject with a highly educated guardian more likely to become highly educated.

Thus, in the present embodiment, information processing device 100 acquires voice data, still image data, moving image data, the subject's grade data, and guardian information data in the five phases shown in FIGS. 1(A) to 1(C) and FIGS. 2(D) and 2(E). Information processing device 100 stores these five data items in a predetermined storage area. Hereinafter, the five data items are also referred to as "subject information about the subject" or simply "the five data items". The present embodiment describes a form in which information processing device 100 and the other devices use all five data items, but as a modification, at least one of the five data items may be used.
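As an illustration of how the five data items might be held together, the following sketch defines a simple container in which each item is optional, reflecting the modification that uses at least one of the five. The class name `SubjectInfo`, its field names, and the field types are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SubjectInfo:
    """Container for the five data items; every field may be absent."""
    voice_data: Optional[bytes] = None         # voice acquisition phase
    still_image_data: Optional[bytes] = None   # still image acquisition phase
    moving_image_data: Optional[bytes] = None  # moving image acquisition phase
    grade_data: Optional[dict] = None          # grade acquisition phase
    guardian_info: Optional[dict] = None       # guardian information acquisition phase

    def available(self):
        """Names of the data items that have actually been provided."""
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]

    def is_usable(self):
        """True when at least one of the five data items is present."""
        return len(self.available()) >= 1
```

Keeping absent items as `None` lets downstream processing decide which feature quantities it can extract for a given subject.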

FIG. 3 is a diagram for explaining the result display phase. On the basis of the five data items it has stored, information processing device 100 estimates the likelihood of the subject's future occupations using estimation model 1400 and the like. Information processing device 100 outputs the estimation result to display device 200 and causes display device 200 to display it. In the example of FIG. 3, the most suitable occupations are displayed in ranked order together with the text "Occupations that suit you". Display device 200 shows "doctor" in 1st place as the most suitable (most likely) occupation, "lawyer" in 2nd place as the next most suitable (next most likely) occupation, and "finance office worker" in 3rd place as the occupation after that.

Thus, information processing device 100 outputs the likelihood of the subject's future occupations to display device 200 as the estimation result, and display device 200 displays a screen showing that result. The subject and the subject's guardian can therefore see which occupations the subject is likely to have in the future. Information processing device 100 may also display the probability (for example, 50%) that the subject will be able to enter a specific occupation (for example, doctor) in the future.
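The ranked display described for FIG. 3 can be sketched as a sort over per-occupation likelihoods. The function name `rank_occupations` and the probability values in the usage example are illustrative assumptions; the specification does not state how the estimation result is represented.

```python
def rank_occupations(probabilities, top_n=3):
    """Return (rank, occupation, probability) tuples, most likely first.

    `probabilities` maps an occupation name to its estimated likelihood.
    """
    ordered = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return [
        (rank, occupation, probability)
        for rank, (occupation, probability) in enumerate(ordered[:top_n], start=1)
    ]


if __name__ == "__main__":
    # Made-up likelihoods, chosen only to reproduce the ordering in FIG. 3.
    result = {"lawyer": 0.3, "doctor": 0.5, "singer": 0.05, "finance office worker": 0.15}
    for rank, occupation, probability in rank_occupations(result):
        print(f"{rank}. {occupation} ({probability:.0%})")
```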

[Configuration example of the occupation estimation system]
Next, a system configuration example of occupation estimation system 1 according to the present embodiment is described. The overall configuration of occupation estimation system 1 is described first, followed by hardware configuration examples of the main devices included in occupation estimation system 1.

(System configuration example)
FIG. 4 is a schematic diagram showing an example of the system configuration of occupation estimation system 1 according to the present embodiment. Referring to FIG. 4, occupation estimation system 1 includes management device 300, network 2, one or more information processing devices 100, and one or more display devices 200, each connected to one of the information processing devices 100.

Information processing device 100 receives input of the five data items that constitute the subject information and estimates the subject's future occupation from them. Management device 300 manages and updates the trained model used by information processing device 100. More specifically, management device 300 acquires the subject information (the five data items) that was input to information processing device 100. This subject information serves as "subject information A for learning". Several years later, when the subject has taken up an occupation, management device 300 generates a learning data set by associating the stored "subject information A for learning" with occupation data indicating that occupation.

Alternatively, an administrator of occupation estimation system 1 or another person may interview people who are already working in an occupation and acquire at least one of the five data items from their childhood. For a doctor, for example, at least one of the following is acquired from the doctor's childhood: voice data of a question-and-answer exchange between subject A and questioner B; still image data (childhood pictures and handwriting); moving image data (for example, footage of running in childhood); academic grade data; and guardian data about the doctor's guardian at that time. This at least one data item serves as "subject information B for learning". The administrator then inputs the acquired data and occupation data indicating the occupation of the person it was acquired from into management device 300. Management device 300 generates a learning data set by associating the input data ("subject information B for learning") with the occupation data. Management device 300 uses the generated learning data set to train the trained model (which may include both new learning and additional learning).
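The association of learning subject information with occupation data can be sketched as a join on a subject identifier, so that only subjects whose occupation is already known become labeled examples. The names `LabeledExample` and `build_learning_dataset`, and the use of an identifier as the join key, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledExample:
    subject_id: str
    subject_info: dict  # childhood data items collected for this person
    occupation: str     # occupation data used as the correct-answer label


def build_learning_dataset(subject_infos, occupations):
    """Pair stored subject information with occupations that are now known.

    `subject_infos` maps a subject id to that subject's data items;
    `occupations` maps a subject id to an occupation. Subjects without a
    known occupation are skipped because they cannot be labeled yet.
    """
    return [
        LabeledExample(subject_id, info, occupations[subject_id])
        for subject_id, info in subject_infos.items()
        if subject_id in occupations
    ]
```

The same function covers both routes described above: information stored years earlier (information A) and information gathered by interview (information B) simply arrive as different entries in `subject_infos`.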

(Information processing device 100)
FIG. 5 is a schematic diagram showing an example of the hardware configuration of information processing device 100, which is part of occupation estimation system 1 according to the present embodiment.

Referring to FIG. 5, information processing device 100 includes, as its main hardware elements, processor 154, memory 156, network controller 158, storage 160, printer 120, camera interface 170, input device interface 172, microphone interface 174, and display device interface 176.

Processor 154 is the computing entity that executes the processing required to realize information processing device 100 by executing the various programs described later. Processor 154 is composed of, for example, one or more CPUs (Central Processing Units) or GPUs (Graphics Processing Units). A CPU or GPU having a plurality of cores may be used.

Memory 156 provides a storage area that temporarily holds program code, working memory, and the like while processor 154 executes a program. A volatile memory device such as DRAM (Dynamic Random Access Memory) or SRAM (Static Random Access Memory) may be used as memory 156.

Storage 160 stores OS (Operating System) 162 executed by processor 154, application program 164 for realizing the functional configuration described later, trained model 326, and the like. A non-volatile memory device such as a hard disk or an SSD (Solid State Drive) may be used as storage 160.

Some of the libraries or functional modules needed when processor 154 executes application program 164 may be libraries or functional modules that OS 162 provides as standard.

Optical drive 152 reads information such as programs stored on optical disc 124, for example a CD-ROM (Compact Disc Read Only Memory) or a DVD (Digital Versatile Disc). Optical disc 124 is an example of a non-transitory recording medium and is distributed with an arbitrary program stored on it in a non-volatile manner. Information processing device 100 according to the present embodiment can be configured by having optical drive 152 read a program from optical disc 124 and install it in storage 160. The subject matter of the present invention may therefore also be the program itself installed in storage 160 or the like, or a recording medium such as optical disc 124 that stores a program for realizing the functions and processing according to the present embodiment.

FIG. 5 shows an optical recording medium such as optical disc 124 as an example of a non-transitory recording medium, but the medium is not limited to this. A semiconductor recording medium such as flash memory, a magnetic recording medium such as a hard disk or storage tape, or a magneto-optical recording medium such as an MO (Magneto-Optical disk) may also be used.

Alternatively, the program for realizing information processing device 100 need not be distributed on a recording medium as described above; it may also be distributed by download from a server device or the like via the Internet or an intranet.

Camera interface 170 serves as the interface between camera 40 and information processing device 100. Input device interface 172 serves as the interface between input device 50 and information processing device 100. Input device 50 includes at least one of a keyboard and a mouse. Microphone interface 174 serves as the interface between microphone 20 and information processing device 100. Display device interface 176 serves as the interface with display device 200.

(Management device 300)
FIG. 6 is a schematic diagram showing an example of the hardware configuration of management device 300. Referring to FIG. 6, management device 300 includes, as its main hardware elements, processor 304, memory 306, network controller 308, and storage 310.

Processor 304 is the computing entity that executes the processing required to realize management device 300 by executing various programs. Processor 304 is composed of, for example, one or more CPUs or GPUs. A CPU or GPU having a plurality of cores may be used. In management device 300, it is preferable to use a GPU or similar processor suited to the learning processing that generates the trained model.

Memory 306 provides a storage area that temporarily holds program code, working memory, and the like while processor 304 executes a program. A volatile memory device such as DRAM or SRAM may be used as memory 306.

Network controller 308 transmits data to and receives data from external devices (such as information processing device 100) via network 2. Network controller 308 may support any communication method, for example Ethernet, wireless LAN, or Bluetooth.

ストレージ３１０は、プロセッサ３０４にて実行されるＯＳ３１２、後述するような機能構成を実現するためのアプリケーションプログラム３１４、学習用データセット３２４を生成するための前処理プログラム３１６、ならびに、学習用データセット３２４を用いて学習済モデル３２６を生成するための学習用プログラム３１８などを格納する。 The storage 310 stores an OS 312 executed by the processor 304, an application program 314 for realizing the functional configuration described later, a preprocessing program 316 for generating the learning data set 324, a learning program 318 for generating the trained model 326 using the learning data set 324, and the like.

学習用の対象者情報３２５は、情報処理装置１００に入力された前述の「学習用の対象者情報Ａ」と、「学習用の対象者情報Ｂ」とを含む。 The learning target person information 325 includes the above-mentioned "learning target person information A" and "learning target person information B" input to the information processing apparatus 100.

学習用データセット３２４は、学習用の対象者情報３２５に後述の職業データ２１８（正解ラベルあるいは、タグ）が付与した訓練データセットである。学習済モデル３２６は、学習用データセット３２４を用いて学習処理を実行することで得られる推定モデルである。 The learning data set 324 is a training data set in which the occupation data 218 (correct-answer label or tag) described later is attached to the learning target person information 325. The trained model 326 is an estimation model obtained by executing a learning process using the learning data set 324.

ストレージ３１０としては、例えば、ハードディスク、ＳＳＤなどの不揮発性メモリデバイスを用いてもよい。 As the storage 310, for example, a non-volatile memory device such as a hard disk or SSD may be used.

アプリケーションプログラム３１４、前処理プログラム３１６および学習用プログラム３１８をプロセッサ３０４で実行する際に必要となるライブラリや機能モジュールの一部を、ＯＳ３１２が標準で提供するライブラリまたは機能モジュールを用いるようにしてもよい。この場合には、アプリケーションプログラム３１４、前処理プログラム３１６および学習用プログラム３１８の各単体では、対応する機能を実現するために必要なプログラムモジュールのすべてを含むものにはならないが、ＯＳ３１２の実行環境下にインストールされることで、後述するような機能構成を実現できることになる。そのため、このような一部のライブラリまたは機能モジュールを含まないプログラムであっても、本発明の技術的範囲に含まれ得る。 Some of the libraries and functional modules required to execute the application program 314, the preprocessing program 316, and the learning program 318 on the processor 304 may be supplied by the libraries or functional modules that the OS 312 provides as standard. In this case, the application program 314, the preprocessing program 316, and the learning program 318 do not by themselves include all the program modules necessary to realize the corresponding functions, but the functional configuration described later can be realized by installing them under the execution environment of the OS 312. Therefore, even a program that does not include some of these libraries or functional modules may be included in the technical scope of the present invention.

アプリケーションプログラム３１４、前処理プログラム３１６および学習用プログラム３１８は、光学ディスクなどの光学記録媒体、フラッシュメモリなどの半導体記録媒体、ハードディスクまたはストレージテープなどの磁気記録媒体、ならびにＭＯなどの光磁気記録媒体といった非一過的な記録媒体に格納されて流通し、ストレージ３１０にインストールされてもよい。したがって、本発明の主題は、ストレージ３１０などにインストールされたプログラム自体、または、本実施形態に従う機能や処理を実現するためのプログラムを格納した記録媒体でもあり得る。 The application program 314, the preprocessing program 316, and the learning program 318 include optical recording media such as optical disks, semiconductor recording media such as flash memory, magnetic recording media such as hard disks or storage tapes, and magneto-optical recording media such as MO. It may be stored in a non-transient recording medium, distributed, and installed in the storage 310. Therefore, the subject of the present invention may be the program itself installed in the storage 310 or the like, or a recording medium in which the program for realizing the functions and processes according to the present embodiment is stored.

あるいは、管理装置３００を実現するためのプログラムは、上述したような任意の記録媒体に格納されて流通するだけでなく、インターネットまたはイントラネットを介してサーバ装置などからダウンロードすることで配布されてもよい。 Alternatively, the program for realizing the management device 300 may be distributed not only by being stored in any recording medium as described above, but also by being downloaded from a server device or the like via the Internet or an intranet.

入力部３３０は、各種の入力操作を受け付ける。入力部３３０としては、例えば、キーボード、マウス、タッチパネル、ペンなどを用いてもよい。 The input unit 330 accepts various input operations. As the input unit 330, for example, a keyboard, a mouse, a touch panel, a pen, or the like may be used.

図６には、汎用コンピュータ（プロセッサ３０４）がアプリケーションプログラム３１４、前処理プログラム３１６および学習用プログラム３１８を実行することで管理装置３００を実現する構成例を示すが、管理装置３００を実現するために必要な機能の全部または一部を、集積回路などのハードワイヤード回路を用いて実現してもよい。例えば、ＡＳＩＣやＦＰＧＡなどを用いて実現してもよい。 FIG. 6 shows a configuration example in which a general-purpose computer (processor 304) realizes the management device 300 by executing the application program 314, the preprocessing program 316, and the learning program 318. However, all or part of the functions required to realize the management device 300 may be realized by using a hard-wired circuit such as an integrated circuit. For example, they may be realized by using an ASIC, an FPGA, or the like.

＜Ｃ．情報処理装置１００の機能および処理＞
次に、本実施形態に従う職業推定システム１を構成する情報処理装置１００の機能および処理について説明する。
<C. Functions and processing of information processing device 100>
Next, the functions and processes of the information processing apparatus 100 constituting the occupation estimation system 1 according to the present embodiment will be described.

（ｃ１：情報処理装置１００の機能構成）
図７は、本実施形態に従う職業推定システム１を構成する情報処理装置１００の機能構成の一例を示す模式図である。図７に示す各機能は、典型的には、情報処理装置１００のプロセッサ１５４がＯＳ１６２およびアプリケーションプログラム１６４（いずれも図７参照）を実行することで実現されてもよい。
(C1: Functional configuration of information processing device 100)
FIG. 7 is a schematic diagram showing an example of the functional configuration of the information processing device 100 constituting the occupation estimation system 1 according to the present embodiment. Each function shown in FIG. 7 may typically be realized by the processor 154 of the information processing device 100 executing the OS 162 and the application program 164 (see FIG. 7 for both).

図７は、情報処理装置１００の機能構成例を示す図である。情報処理装置１００は、受付部１０２と、記憶部１０４と、推定部１０６と、出力部１０８とを有する。 FIG. 7 is a diagram showing a functional configuration example of the information processing device 100. The information processing device 100 has a reception unit 102, a storage unit 104, an estimation unit 106, and an output unit 108.

図１および図２で説明したように、マイク２０からは音声データが情報処理装置１００に入力される。また、カメラ４０からは動画像データおよび静止画像データが情報処理装置１００に入力される。また、成績入力画面からは情報処理装置１００に成績データが入力され、保護者入力画面からは情報処理装置１００に保護者データが入力される。 As described with reference to FIGS. 1 and 2, voice data is input to the information processing device 100 from the microphone 20. Further, moving image data and still image data are input to the information processing device 100 from the camera 40. Further, the grade data is input to the information processing device 100 from the grade input screen, and the guardian data is input to the information processing device 100 from the guardian input screen.

受付部１０２は、入力された５つのデータを受付ける。受付部１０２が受付けた５つのデータは、一旦、記憶部１０４に記憶される。記憶部１０４に記憶された５つのデータは、所定の条件が成立したときに、推定部１０６に入力される。所定の条件とは、たとえば、ユーザにより、所定の動作が行われることにより成立する。 The reception unit 102 receives the five pieces of input data. The five pieces of data received by the reception unit 102 are temporarily stored in the storage unit 104. The five pieces of data stored in the storage unit 104 are input to the estimation unit 106 when a predetermined condition is satisfied. The predetermined condition is satisfied, for example, when the user performs a predetermined operation.

所定の条件が成立すると、５つのデータ（動画像データ１３６、静止画像データ１３７、音声データ１３８、保護者データ１３９、および成績データ１４０）は、推定部１０６に入力される。推定部１０６は、該５つのデータに基づいて、対象者の将来の職業の可能性を出力する。推定部１０６は、推定モデル１４００と、抽出部１４６０とを有する。抽出部１４６０は、図８に示す領域特定モジュール１４１と、サイズ調整モジュール１４２と、領域特定モジュール１４３と、サイズ調整モジュール１４４と、区間特定モジュール１４５と、リサンプリングモジュール１４６と、抽出モジュール１４７と、抽出モジュール１４８などに対応する。推定部１０６の詳細な処理は後述する。出力部１０８は、推定結果を外部に出力する。本実施形態では、「外部」は、表示装置２００であるとする。「外部」は、他であってもよく、「外部」は、たとえば、プリンタであってもよく、ネットワークであってもよい。 When the predetermined condition is satisfied, the five pieces of data (moving image data 136, still image data 137, audio data 138, guardian data 139, and grade data 140) are input to the estimation unit 106. The estimation unit 106 outputs the possibility of the subject's future occupation based on the five pieces of data. The estimation unit 106 has an estimation model 1400 and an extraction unit 1460. The extraction unit 1460 corresponds to the area identification module 141, the size adjustment module 142, the area identification module 143, the size adjustment module 144, the section identification module 145, the resampling module 146, the extraction module 147, the extraction module 148, and the like shown in FIG. 8. The detailed processing of the estimation unit 106 will be described later. The output unit 108 outputs the estimation result to the outside. In the present embodiment, the "outside" is assumed to be the display device 200. The "outside" may be something else, for example, a printer or a network.

表示装置２００は、情報処理装置１００から出力された推定結果に基づいた画面（図３参照）を表示する。 The display device 200 displays a screen (see FIG. 3) based on the estimation result output from the information processing device 100.

また、記憶部１０４からの５つのデータは、出力部１０８を経由して、管理装置３００に対して送信される。管理装置３００に対して送信された５つのデータが、学習用の対象者情報Ａとして用いられる。 Further, the five data from the storage unit 104 are transmitted to the management device 300 via the output unit 108. The five data transmitted to the management device 300 are used as the target person information A for learning.

［推定部の処理］
次に、図８を用いて、推定部１０６の処理の詳細を説明する。図８を参照して、推定部１０６は、領域特定モジュール１４１と、サイズ調整モジュール１４２と、領域特定モジュール１４３と、サイズ調整モジュール１４４と、区間特定モジュール１４５と、リサンプリングモジュール１４６と、抽出モジュール１４７と、抽出モジュール１４８と、推定モデル１４００とを含む。
[Processing of estimation unit]
Next, the details of the processing of the estimation unit 106 will be described with reference to FIG. 8. With reference to FIG. 8, the estimation unit 106 includes an area identification module 141, a size adjustment module 142, an area identification module 143, a size adjustment module 144, a section identification module 145, a resampling module 146, an extraction module 147, an extraction module 148, and an estimation model 1400.

領域特定モジュール１４１は、動画像データ１３６の動画像において対象者領域を特定する。そして、領域特定モジュール１４１は、特定した対象者領域に対応する対象者動画像を動画像データ１３６から抽出して出力する。 The area identification module 141 specifies the target person area in the moving image of the moving image data 136. Then, the area specifying module 141 extracts and outputs the target person moving image corresponding to the specified target person area from the moving image data 136.

典型的には、領域特定モジュール１４１は、目や鼻などの顔特徴を抽出するとともに、手足などの骨格特徴を抽出することで、対象者領域を特定する。領域特定モジュール１４１は、動画像データ１３６から抽出した対象者動画像をサイズ調整モジュール１４２に出力する。 Typically, the region identification module 141 identifies the subject region by extracting facial features such as eyes and nose and skeletal features such as limbs. The area identification module 141 outputs the target person moving image extracted from the moving image data 136 to the size adjusting module 142.

サイズ調整モジュール１４２において、対象者動画像は、予め定められた次元をもつ特徴量（特徴量ベクトル）に変換されて推定モデル１４００に入力される。ここで、領域特定モジュール１４１により抽出される対象者動画像の画像サイズは変動し得るため、サイズ調整モジュール１４２は画像サイズを規格化する。 In the size adjustment module 142, the subject moving image is converted into a feature amount (feature amount vector) having a predetermined dimension and input to the estimation model 1400. Here, since the image size of the target person moving image extracted by the area specifying module 141 can fluctuate, the size adjustment module 142 standardizes the image size.

より具体的には、サイズ調整モジュール１４２は、領域特定モジュール１４１からの対象者動画像を予め定められた画素数の画像に調整した上で、調整後の画像を構成する各画素の画素値を動画像特徴量１４１０として推定モデル１４００に入力する。 More specifically, the size adjustment module 142 adjusts the target person moving image from the area identification module 141 to an image having a predetermined number of pixels, and then inputs the pixel value of each pixel constituting the adjusted image to the estimation model 1400 as the moving image feature amount 1410.
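
As a rough illustration of the size normalization described above, the following Python sketch rescales a cropped frame to a fixed pixel count and flattens it into a feature vector. The function name `resize_and_flatten`, the 4x4 target size, and the proportional index mapping are assumptions made for illustration; the embodiment only requires that the image be adjusted to a predetermined number of pixels.

```python
from typing import List


def resize_and_flatten(image: List[List[float]], out_h: int, out_w: int) -> List[float]:
    """Rescale a 2-D grayscale image to out_h x out_w and return the pixels as a flat vector."""
    in_h, in_w = len(image), len(image[0])
    features: List[float] = []
    for r in range(out_h):
        # Map each output row to a source row by proportional index mapping.
        src_r = min(in_h - 1, r * in_h // out_h)
        for c in range(out_w):
            src_c = min(in_w - 1, c * in_w // out_w)
            features.append(image[src_r][src_c])
    return features


# Two crops of different sizes yield vectors of the same dimension.
small = [[0.0, 1.0], [2.0, 3.0]]
large = [[float(r * 6 + c) for c in range(6)] for r in range(3)]
vec_small = resize_and_flatten(small, 4, 4)
vec_large = resize_and_flatten(large, 4, 4)
```

Whatever the size of the extracted region, the resulting vector has `out_h * out_w` elements, which is what lets a fixed-size network input accept it.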

同様に、領域特定モジュール１４３は、静止画像データ１３７の静止画像において対象者領域を特定する。そして、領域特定モジュール１４３は、特定した対象者領域に対応する対象者静止画像を静止画像データ１３７から抽出して出力する。 Similarly, the area identification module 143 identifies the target person area in the still image of the still image data 137. Then, the area specifying module 143 extracts and outputs the target person still image corresponding to the specified target person area from the still image data 137.

典型的には、領域特定モジュール１４３は、文字や絵画を特定する。領域特定モジュール１４３は、静止画像データ１３７から抽出した文字画像や絵画画像をサイズ調整モジュール１４４に出力する。 Typically, the area identification module 143 identifies characters and paintings. The area identification module 143 outputs a character image or a painting image extracted from the still image data 137 to the size adjustment module 144.

サイズ調整モジュール１４４において、対象者静止画像は、予め定められた次元をもつ特徴量（特徴量ベクトル）に変換されて推定モデル１４００に入力される。ここで、領域特定モジュール１４３により抽出される文字画像や絵画画像の画像サイズは変動し得るため、サイズ調整モジュール１４４は画像サイズを規格化する。 In the size adjustment module 144, the still image of the subject is converted into a feature amount (feature amount vector) having a predetermined dimension and input to the estimation model 1400. Here, since the image size of the character image or the painting image extracted by the area specifying module 143 can fluctuate, the size adjustment module 144 standardizes the image size.

より具体的には、サイズ調整モジュール１４４は、領域特定モジュール１４３からの文字画像や絵画画像を予め定められた画素数の画像に調整した上で、調整後の画像を構成する各画素の画素値を静止画像特徴量１４２０として推定モデル１４００に入力する。 More specifically, the size adjustment module 144 adjusts the character image or painting image from the area identification module 143 to an image having a predetermined number of pixels, and then inputs the pixel value of each pixel constituting the adjusted image to the estimation model 1400 as the still image feature amount 1420.

区間特定モジュール１４５は、音声データ１３８に含まれる対象者が発した音声の区間を特定して、特定区間音声を抽出して出力する。典型的には、区間特定モジュール１４５は、音声データ１３８が示す音声の時間的変化を解析して、情報処理装置１００の周囲にある雑音成分に対して、振幅あるいは周波数などが変化した区間を特定することで、特定区間音声を抽出する。 The section specifying module 145 identifies the section of the voice emitted by the target person included in the voice data 138, extracts the specific section voice, and outputs the voice. Typically, the section identification module 145 analyzes the temporal change of the voice indicated by the voice data 138, and identifies the section in which the amplitude or frequency has changed with respect to the noise component around the information processing apparatus 100. By doing so, the specific section voice is extracted.

また、特定区間音声には、質問者Ｂの質問が含まれる。区間特定モジュール１４５は、質問者Ｂの質問部分については特定せずに、対象者Ａの回答部分のみを特定する。さらに、特定区間音声は、対象者の現在のフィーリング（気分）、およびテンポを示す情報を含む場合がある。 In addition, the specific section voice includes the question of the questioner B. The section specifying module 145 does not specify the question part of the questioner B, but specifies only the answer part of the target person A. In addition, the particular section audio may include information indicating the subject's current mood and tempo.

このように、区間特定モジュール１４５は、マイク２０で収集された音声のうち対象者の発話に対応する部分の音声を特定する。 In this way, the section specifying module 145 identifies the voice of the portion of the voice collected by the microphone 20 that corresponds to the utterance of the target person.
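
The section identification described above can be sketched, under simplifying assumptions, as an amplitude threshold relative to the ambient noise level. The function name `find_speech_sections`, the `ratio` parameter, and the use of amplitude alone (without frequency analysis) are hypothetical choices for illustration. The sketch also does not separate the questioner's speech from the subject's answers, which the embodiment additionally requires.

```python
from typing import List, Optional, Tuple


def find_speech_sections(samples: List[float], noise_level: float,
                         ratio: float = 3.0) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, where |amplitude| exceeds ratio * noise_level."""
    threshold = ratio * noise_level
    sections: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, sample in enumerate(samples):
        active = abs(sample) > threshold
        if active and start is None:
            start = i                      # a section begins
        elif not active and start is not None:
            sections.append((start, i))    # the section ends
            start = None
    if start is not None:
        sections.append((start, len(samples)))
    return sections


signal = [0.01, -0.02, 0.5, -0.6, 0.4, 0.01, 0.0, 0.7, 0.8]
sections = find_speech_sections(signal, noise_level=0.05)
```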

再度図８を参照して、区間特定モジュール１４５が音声データ１３８から抽出した特定区間音声は、リサンプリングモジュール１４６へ出力される。リサンプリングモジュール１４６において、特定区間音声は、予め定められた次元をもつ特徴量（特徴量ベクトル）に変換されて推定モデル１４００に出力される。ここで、区間特定モジュール１４５により特定される特定区間音声の時間長さは変動し得るため、リサンプリングモジュール１４６が音声サンプリング数を規格化する。 With reference to FIG. 8 again, the specific section voice extracted by the section specifying module 145 from the voice data 138 is output to the resampling module 146. In the resampling module 146, the specific section voice is converted into a feature quantity (feature quantity vector) having a predetermined dimension and output to the estimation model 1400. Here, since the time length of the specific section voice specified by the section specifying module 145 can vary, the resampling module 146 standardizes the number of voice samples.

より具体的には、リサンプリングモジュール１４６は、区間特定モジュール１４５からの特定区間音声が示す音声の時間波形を予め定められたサンプル数でサンプリングすることで、各サンプリング点での振幅値を音声特徴量１４３０として推定モデル１４００に入力する。 More specifically, the resampling module 146 samples the time waveform of the specific section voice from the section identification module 145 at a predetermined number of sample points, and inputs the amplitude value at each sampling point to the estimation model 1400 as the voice feature amount 1430.
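
A minimal sketch of this normalization, assuming linear interpolation between neighboring samples (the embodiment does not state which interpolation is used), is shown below. The function name `resample_waveform` is hypothetical.

```python
from typing import List


def resample_waveform(samples: List[float], num_points: int) -> List[float]:
    """Linearly interpolate a waveform onto num_points evenly spaced positions."""
    if not samples:
        raise ValueError("samples must not be empty")
    if num_points == 1 or len(samples) == 1:
        return [samples[0]] * num_points
    step = (len(samples) - 1) / (num_points - 1)
    out: List[float] = []
    for k in range(num_points):
        pos = k * step
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


# A short and a long section both become 5-element feature vectors.
short = resample_waveform([0.0, 1.0], 5)
long_ = resample_waveform([float(i) for i in range(9)], 5)
```

Sections of any duration are mapped to the same number of amplitude values, so the voice feature amount always has a fixed dimension.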

抽出モジュール１４７は、保護者データ１３９から、保護者の職業、保護者の学歴、および保護者の年収に対応する特徴量を抽出する。抽出モジュール１４７で抽出された保護者特徴量１４４０は、推定モデル１４００に入力される。抽出モジュール１４８は、成績データ１４０から、対象者の成績を抽出する。抽出モジュール１４８で抽出された成績特徴量１４５０は、推定モデル１４００に入力される。 The extraction module 147 extracts feature amounts corresponding to the guardian's occupation, the guardian's educational background, and the guardian's annual income from the guardian data 139. The guardian feature amount 1440 extracted by the extraction module 147 is input to the estimation model 1400. The extraction module 148 extracts the grades of the subject from the grade data 140. The performance feature amount 1450 extracted by the extraction module 148 is input to the estimation model 1400.
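
One possible way to turn guardian data into a numeric feature vector is sketched below. The category lists, the one-hot encoding, the record keys, and the income scaling constant are all assumptions for illustration; the embodiment does not specify how these attributes are encoded.

```python
from typing import Dict, List

# Hypothetical category lists; the embodiment does not enumerate them.
OCCUPATIONS = ["athlete", "singer", "doctor", "lawyer"]
EDUCATION_LEVELS = ["high_school", "bachelor", "graduate"]


def one_hot(value: str, categories: List[str]) -> List[float]:
    """Encode a categorical value as a vector with a single 1.0 entry."""
    return [1.0 if value == category else 0.0 for category in categories]


def guardian_features(record: Dict[str, object],
                      income_scale: float = 10_000_000.0) -> List[float]:
    """Concatenate occupation, education, and scaled annual income into one vector."""
    occupation = one_hot(str(record["occupation"]), OCCUPATIONS)
    education = one_hot(str(record["education"]), EDUCATION_LEVELS)
    income = [float(str(record["annual_income"])) / income_scale]
    return occupation + education + income


features = guardian_features(
    {"occupation": "doctor", "education": "graduate", "annual_income": 8_000_000}
)
```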

推定モデル１４００は、ネットワーク構造および対応するパラメータを規定する学習済モデル３２６に基づいて構築される。動画像特徴量１４１０、静止画像特徴量１４２０、音声特徴量１４３０、保護者特徴量１４４０、および成績特徴量１４５０が推定モデル１４００に入力されることで、推定モデル１４００が定義する演算処理が実行されて、推定結果１４４９として職業毎のスコアが算出される。ここで、職業毎のスコアは、対象者に適している各職業それぞれの可能性を示す値である。 The estimation model 1400 is constructed based on the trained model 326, which defines the network structure and the corresponding parameters. When the moving image feature amount 1410, the still image feature amount 1420, the audio feature amount 1430, the guardian feature amount 1440, and the performance feature amount 1450 are input to the estimation model 1400, the arithmetic processing defined by the estimation model 1400 is executed, and a score for each occupation is calculated as the estimation result 1449. Here, the score for each occupation is a value indicating the possibility that the occupation is suitable for the subject.

推定モデル１４００は、後述するような学習用データセット３２４を用いた学習処理により生成される。 The estimation model 1400 is generated by a learning process using the learning data set 324 as described later.

このように、学習済の推定モデルである推定モデル１４００は、動画像特徴量１４１０、静止画像特徴量１４２０、音声特徴量１４３０、保護者特徴量１４４０、および成績特徴量１４５０の入力を受けて、対象者の将来の職業の可能性を、推定結果として出力する。 As described above, the estimated model 1400, which is the trained estimation model, receives the inputs of the moving image feature amount 1410, the still image feature amount 1420, the audio feature amount 1430, the guardian feature amount 1440, and the performance feature amount 1450. The possibility of the subject's future occupation is output as an estimation result.

図９は、図８に示す推定モデル１４００のネットワーク構成例を示す模式図である。図９を参照して、推定モデル１４００は、ＤＮＮ（ＤｅｅｐＮｅｕｒａｌＮｅｔｗｏｒｋ）に分類されるネットワークである。推定モデル１４００は、ＣＮＮ（ＣｏｎｖｏｌｕｔｉｏｎａｌＮｅｕｒａｌＮｅｔｗｏｒｋ）に分類される前処理ネットワーク１４６０，１４７０，１４８０，１４９０，１５００と、中間層１５１０と、出力層に相当する活性化関数１４９２と、Ｓｏｆｔｍａｘ関数１４９４とを含む。 FIG. 9 is a schematic diagram showing a network configuration example of the estimation model 1400 shown in FIG. 8. With reference to FIG. 9, the estimation model 1400 is a network classified as a DNN (Deep Neural Network). The estimation model 1400 includes preprocessing networks 1460, 1470, 1480, 1490, and 1500 classified as CNNs (Convolutional Neural Networks), an intermediate layer 1510, an activation function 1492 corresponding to an output layer, and a Softmax function 1494.

前処理ネットワーク１４６０，１４７０，１４８０，１４９０，１５００は、相対的に次数の大きな動画像特徴量１４１０、静止画像特徴量１４２０、音声特徴量１４３０、保護者特徴量１４４０、および成績特徴量１４５０から、推定結果１４４９を算出するために有効な特徴量を抽出するための一種のフィルタとして機能する。前処理ネットワーク１４６０，１４７０，１４８０，１４９０，１５００の各々は、畳み込み層（ＣＯＮＶ）およびプーリング層（Ｐｏｏｌｉｎｇ）が交互に配置された構成を有している。なお、畳み込み層とプーリング層との数は同数でなくてもよく、また、畳み込み層の出力側にはＲｅＬＵ（正規化線形関数：ｒｅｃｔｉｆｉｅｄｌｉｎｅａｒｕｎｉｔ）などの活性化関数が配置される。また、図９では、前処理ネットワーク１４７０，１４８０，１４９０，１５００の詳細は省略されているが、例えば、前処理ネットワーク１４７０，１４８０，１４９０，１５００は、例えば、前処理ネットワーク１４６０と同様の構成とされる。 The preprocessing networks 1460, 1470, 1480, 1490, and 1500 function as a kind of filter for extracting, from the moving image feature amount 1410, the still image feature amount 1420, the audio feature amount 1430, the guardian feature amount 1440, and the performance feature amount 1450, which have relatively large dimensions, the features that are effective for calculating the estimation result 1449. Each of the preprocessing networks 1460, 1470, 1480, 1490, and 1500 has a configuration in which convolutional layers (CONV) and pooling layers (Pooling) are alternately arranged. The number of convolutional layers and the number of pooling layers do not have to be the same, and an activation function such as ReLU (rectified linear unit) is arranged on the output side of each convolutional layer. In FIG. 9, the details of the preprocessing networks 1470, 1480, 1490, and 1500 are omitted; they have, for example, the same configuration as the preprocessing network 1460.
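
To make the convolution, activation, and pooling sequence concrete, the following Python sketch applies one stage of each to a one-dimensional input. The kernel values, the pooling size, and the one-dimensional setting are assumptions chosen for brevity; the actual layer shapes of the preprocessing networks are not given in the embodiment.

```python
from typing import List


def conv1d(x: List[float], kernel: List[float]) -> List[float]:
    """Valid (no padding) 1-D cross-correlation, as used in CNN convolutional layers."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k)) for i in range(len(x) - k + 1)]


def relu(x: List[float]) -> List[float]:
    """Rectified linear unit: negative values are clipped to zero."""
    return [max(0.0, v) for v in x]


def max_pool1d(x: List[float], size: int) -> List[float]:
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


x = [1.0, 2.0, 0.0, -1.0, 3.0, 1.0]
kernel = [1.0, -1.0]                    # hypothetical learned filter
conv_out = conv1d(x, kernel)            # [-1.0, 2.0, 1.0, -4.0, 2.0]
pooled = max_pool1d(relu(conv_out), 2)  # [2.0, 1.0]
```

Stacking several such stages progressively reduces a high-dimensional input to a smaller set of internal features.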

より具体的には、前処理ネットワーク１４６０は、動画像特徴量１４１０（ｘ１１，ｘ１２，・・・，ｘ１ａ）の入力を受けて、動画における文字および絵画の綺麗さ度合い等を示す内部特徴量を出力するように構築される。前処理ネットワーク１４７０は、静止画像特徴量１４２０（ｘ２１，ｘ２２，・・・，ｘ２ｂ）の入力を受けて、静止画における文字および絵画などの綺麗さ度合い等を示す内部特徴量を出力するように構築される。前処理ネットワーク１４８０は、音声特徴量１４３０（ｘ３１，ｘ３２，・・・，ｘ３ｃ）の入力を受けて、発言における職業を特定するための情報、および、対象者の現在のフィーリング（気分）を示す情報等を示す内部特徴量を出力するように構築される。前処理ネットワーク１４９０は、保護者特徴量１４４０（ｘ４１，ｘ４２，・・・，ｘ４ｄ）の入力を受けて、保護者の職業、保護者の学歴、および保護者の年収等を出力するように構築される。前処理ネットワーク１５００は、成績特徴量１４５０（ｘ５１，ｘ５２，・・・，ｘ５ｅ）の入力を受けて、対象者の成績を出力するように構築される。 More specifically, the preprocessing network 1460 is constructed so as to receive the moving image feature amount 1410 (x11, x12, ..., x1a) and output an internal feature amount indicating, for example, the degree of neatness of characters and paintings in the moving image. The preprocessing network 1470 is constructed so as to receive the still image feature amount 1420 (x21, x22, ..., x2b) and output an internal feature amount indicating, for example, the degree of neatness of characters, paintings, and the like in the still image. The preprocessing network 1480 is constructed so as to receive the voice feature amount 1430 (x31, x32, ..., x3c) and output an internal feature amount indicating, for example, information for identifying an occupation in the utterance and information indicating the subject's current feeling (mood). The preprocessing network 1490 is constructed so as to receive the guardian feature amount 1440 (x41, x42, ..., x4d) and output the guardian's occupation, the guardian's educational background, the guardian's annual income, and the like. The preprocessing network 1500 is constructed so as to receive the grade feature amount 1450 (x51, x52, ..., x5e) and output the grades of the subject.

中間層１５１０は、所定数の層数を有する全結合ネットワークからなり、前処理ネットワーク１４６０，１４７０，１４８０，１４９０，１５００の各々からの出力を、各ノードについて決定される重みおよびバイアスを用いてノード毎に順次結合する。 The intermediate layer 1510 consists of a fully connected network having a predetermined number of layers, and sequentially combines, node by node, the outputs from each of the preprocessing networks 1460, 1470, 1480, 1490, and 1500 using the weights and biases determined for each node.

なお、図９の例では前処理ネットワーク１４６０，１４７０，１４８０，１４９０，１５００は全て同じ構成であるとする。 In the example of FIG. 9, it is assumed that the preprocessing networks 1460, 1470, 1480, 1490, and 1500 all have the same configuration.

中間層１５１０の出力側には、ＲｅＬＵなどの活性化関数１４９２が配置され、最終的には、Ｓｏｆｔｍａｘ関数１４９４により確率分布に正規化された上で、推定結果１４４９（ｙ１，ｙ２，・・・，ｙＮ）が出力される。 An activation function 1492 such as ReLU is arranged on the output side of the intermediate layer 1510, and finally, after normalization to a probability distribution by the Softmax function 1494, the estimation result 1449 (y1, y2, ..., yN) is output.
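
The path from the merged branch outputs through a fully connected layer, an activation function, and the Softmax function can be sketched as follows. The single dense layer, the weight values, and the branch output sizes are placeholders chosen for illustration; the embodiment only states that the intermediate layer has a predetermined number of layers.

```python
import math
from typing import List


def dense(x: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """Fully connected layer: each output node is a weighted sum of all inputs plus a bias."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x: List[float]) -> List[float]:
    return [max(0.0, v) for v in x]


def softmax(z: List[float]) -> List[float]:
    """Normalize scores to a probability distribution (shifted by the max for numerical stability)."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical outputs of three preprocessing branches, concatenated into one vector.
branch_outputs = [[0.5, 1.0], [0.2], [0.3]]
merged = [v for branch in branch_outputs for v in branch]

# Placeholder parameters for N = 3 occupations.
weights = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 1.0]]
bias = [0.0, 0.0, 0.0]

scores = softmax(relu(dense(merged, weights, bias)))  # y1, y2, y3
```

The entries of `scores` are non-negative and sum to 1, so each can be read as the estimated possibility of one occupation.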

後述するような学習フェーズにおいては、推定モデル１４００のネットワークを構築する各エレメントのパラメータが最適化される。 In the learning phase as described below, the parameters of each element that builds the network of the estimation model 1400 are optimized.

［学習フェーズ］
次に、学習フェーズを説明する。図１０は、管理装置３００の学習機能における処理内容を説明するための図である。図１０を参照して、管理装置３００は、学習機能として、領域特定モジュール１４１と、サイズ調整モジュール１４２と、領域特定モジュール１４３と、サイズ調整モジュール１４４と、区間特定モジュール１４５と、リサンプリングモジュール１４６と、抽出モジュール１４７と、抽出モジュール１４８とを有する。これらのモジュールは、情報処理装置１００が、図８で説明した各モジュールと実質的に同一である。そのため、これらのモジュールについての詳細な説明は繰り返さない。
[Learning phase]
Next, the learning phase will be described. FIG. 10 is a diagram for explaining the processing contents of the learning function of the management device 300. With reference to FIG. 10, the management device 300 has, as its learning function, the area identification module 141, the size adjustment module 142, the area identification module 143, the size adjustment module 144, the section identification module 145, the resampling module 146, the extraction module 147, and the extraction module 148. These modules are substantially the same as the corresponding modules of the information processing device 100 described with reference to FIG. 8. Therefore, the detailed description of these modules will not be repeated.

さらに、管理装置３００は、学習機能として、最適化部３６２を含む。最適化部３６２は、推定モデル１４００を規定するためのモデルパラメータ３６４を最適化することで、学習済モデル３２６を生成する。 Further, the management device 300 includes an optimization unit 362 as a learning function. The optimization unit 362 generates the trained model 326 by optimizing the model parameter 364 for defining the estimation model 1400.

最適化部３６２は、学習用データセット３２４に含まれる動画像データ１３６、静止画像データ１３７、音声データ１３８、保護者データ１３９、および成績データ１４０の各組（学習用データ）を用いて、モデルパラメータ３６４を最適化する。 The optimization unit 362 optimizes the model parameter 364 using each set (learning data) of moving image data 136, still image data 137, audio data 138, guardian data 139, and grade data 140 included in the learning data set 324.

より具体的には、管理装置３００は、学習用データセット３２４に含まれる各組の動画像データ１３６、静止画像データ１３７、音声データ１３８、保護者データ１３９、および成績データ１４０から、動画像特徴量１４１０、静止画像特徴量１４２０、音声特徴量１４３０、保護者特徴量１４４０、および成績特徴量１４５０を生成する。動画像特徴量１４１０、静止画像特徴量１４２０、音声特徴量１４３０、保護者特徴量１４４０、および成績特徴量１４５０は、推定モデル１４００に入力されることで推定結果１４４９を算出する。そして、最適化部３６２は、推定モデル１４００から出力される推定結果１４４９と対応する職業データ２１８（正解ラベル）とを比較することで誤差を算出し、算出した誤差に応じて（算出した誤差を小さくするように）モデルパラメータ３６４の値を最適化（調整）する。 More specifically, the management device 300 generates the moving image feature amount 1410, the still image feature amount 1420, the audio feature amount 1430, the guardian feature amount 1440, and the performance feature amount 1450 from each set of moving image data 136, still image data 137, audio data 138, guardian data 139, and grade data 140 included in the learning data set 324. These feature amounts are input to the estimation model 1400, which calculates the estimation result 1449. Then, the optimization unit 362 calculates an error by comparing the estimation result 1449 output from the estimation model 1400 with the corresponding occupation data 218 (correct-answer label), and optimizes (adjusts) the value of the model parameter 364 according to the calculated error (so as to reduce the calculated error).

すなわち、最適化部３６２は、学習部に相当し、学習用データセット３２４（動画像データ１３６、静止画像データ１３７、音声データ１３８、保護者データ１３９、および成績データ１４０に職業データ２１８がラベル付けされているデータ）から抽出された、動画像特徴量１４１０、静止画像特徴量１４２０、音声特徴量１４３０、保護者特徴量１４４０、および成績特徴量１４５０を推定モデル１４００に入力して出力される推定結果１４４９が、当該学習用データにラベル付けされている職業データ２１８に近付くように、推定モデル１４００を最適化する。言い換えれば、最適化部３６２は、学習用データセット３２４に含まれる動画像データ１３６、静止画像データ１３７、音声データ１３８、保護者データ１３９、および成績データ１４０から特徴量を抽出して推定モデル１４００に入力したときに算出される推定結果１４４９が対応する職業データ２１８と一致するようにモデルパラメータ３６４を調整する。 That is, the optimization unit 362 corresponds to a learning unit. It optimizes the estimation model 1400 so that the estimation result 1449, which is output when the moving image feature amount 1410, the still image feature amount 1420, the audio feature amount 1430, the guardian feature amount 1440, and the performance feature amount 1450 extracted from the learning data set 324 (data in which the moving image data 136, still image data 137, audio data 138, guardian data 139, and grade data 140 are labeled with the occupation data 218) are input to the estimation model 1400, approaches the occupation data 218 with which the learning data is labeled. In other words, the optimization unit 362 adjusts the model parameter 364 so that the estimation result 1449, calculated when the feature amounts extracted from the moving image data 136, still image data 137, audio data 138, guardian data 139, and grade data 140 included in the learning data set 324 are input to the estimation model 1400, matches the corresponding occupation data 218.
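
The error calculation and parameter adjustment described above can be illustrated with a deliberately small model. The sketch below assumes a single linear layer followed by Softmax and a cross-entropy error; both are stand-ins, since the embodiment specifies neither the error function nor a network this simple. The names `sgd_step` and `cross_entropy` are hypothetical.

```python
import math
from typing import List, Tuple


def softmax(z: List[float]) -> List[float]:
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs: List[float], label: int) -> float:
    """Error between the predicted distribution and the correct-answer label."""
    return -math.log(probs[label])


def sgd_step(weights: List[List[float]], x: List[float], label: int,
             lr: float) -> Tuple[float, List[List[float]]]:
    """One gradient step; returns (error before the update, updated weights)."""
    logits = [sum(w * v for w, v in zip(row, x)) for row in weights]
    probs = softmax(logits)
    loss = cross_entropy(probs, label)
    new_weights = []
    for k, row in enumerate(weights):
        # Gradient of the cross-entropy error with respect to logit k.
        grad_logit = probs[k] - (1.0 if k == label else 0.0)
        new_weights.append([w - lr * grad_logit * v for w, v in zip(row, x)])
    return loss, new_weights


weights = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]  # 3 occupations, 2 features
x, label = [1.0, 2.0], 2                        # one labeled learning example
losses = []
for _ in range(20):
    loss, weights = sgd_step(weights, x, label, lr=0.1)
    losses.append(loss)
```

Repeating the step drives the error down, that is, the estimation result moves toward the labeled occupation.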

同様の手順で、学習用データセット３２４に含まれる各学習用データ（動画像データ１３６、静止画像データ１３７、音声データ１３８、保護者データ１３９、および成績データ１４０）に基づいて、推定モデル１４００のモデルパラメータ３６４を繰り返し最適化することで、学習済モデル３２６が生成される。 In the same procedure, the estimation model 1400 is based on the training data (moving image data 136, still image data 137, audio data 138, guardian data 139, and performance data 140) included in the training data set 324. The trained model 326 is generated by iteratively optimizing the model parameter 364.

最適化部３６２がモデルパラメータ３６４の値を最適化するにあたっては、任意の最適化アルゴリズムを用いることができる。より具体的には、最適化アルゴリズムとしては、例えば、ＳＧＤ（ＳｔｏｃｈａｓｔｉｃＧｒａｄｉｅｎｔＤｅｓｃｅｎｔ：確率的勾配降下法）、ＭｏｍｅｎｔｕｍＳＧＤ（慣性項付加ＳＧＤ）、ＡｄａＧｒａｄ、ＲＭＳｐｒｏｐ、ＡｄａＤｅｌｔａ、Ａｄａｍ（Ａｄａｐｔｉｖｅｍｏｍｅｎｔｅｓｔｉｍａｔｉｏｎ）などの勾配法を用いることができる。 Any optimization algorithm can be used when the optimization unit 362 optimizes the value of the model parameter 364. More specifically, a gradient method such as SGD (Stochastic Gradient Descent), Momentum SGD (SGD with an added momentum term), AdaGrad, RMSprop, AdaDelta, or Adam (Adaptive moment estimation) can be used as the optimization algorithm.
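
For reference, the update rules of two of the listed algorithms are sketched below for a single scalar parameter. These follow the commonly published formulations of Momentum SGD and Adam; the default hyperparameter values and the function names are assumptions, and the embodiment does not prescribe either.

```python
import math
from typing import Tuple


def momentum_sgd_update(param: float, grad: float, velocity: float,
                        lr: float = 0.01, momentum: float = 0.9) -> Tuple[float, float]:
    """Momentum SGD: the velocity accumulates past gradients; returns (param, velocity)."""
    velocity = momentum * velocity - lr * grad
    return param + velocity, velocity


def adam_update(param: float, grad: float, m: float, v: float, t: int,
                lr: float = 0.001, beta1: float = 0.9, beta2: float = 0.999,
                eps: float = 1e-8) -> Tuple[float, float, float]:
    """Adam: bias-corrected first and second moment estimates scale the step; t starts at 1."""
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    return param - lr * m_hat / (math.sqrt(v_hat) + eps), m, v


# One step on f(p) = p**2 (gradient 2p) starting from p = 1.0.
p_momentum, velocity = momentum_sgd_update(1.0, 2.0, 0.0, lr=0.1)
p_adam, m, v = adam_update(1.0, 2.0, 0.0, 0.0, t=1, lr=0.05)
```

On the first step Adam moves the parameter by approximately `lr` regardless of the gradient magnitude, whereas Momentum SGD moves it by `lr * grad`.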

最適化部３６２によりモデルパラメータ３６４を最適化された推定モデル１４００は、学習済モデル３２６に相当し、情報処理装置１００へ送信される。 The estimation model 1400 whose model parameter 364 is optimized by the optimization unit 362 corresponds to the trained model 326 and is transmitted to the information processing apparatus 100.

［職業推定処理のフロー］
次に、図１１を用いて、情報処理装置１００による職業推定処理のフローチャートを説明する。ステップＳ２において、情報処理装置１００は、動画像データ１３６から動画像特徴量１４１０（図８参照）を抽出する。次に、ステップＳ４において、情報処理装置１００は、静止画像データ１３７から静止画像特徴量１４２０（図８参照）を抽出する。次に、ステップＳ６において、情報処理装置１００は、音声データ１３８から音声特徴量１４３０（図８参照）を抽出する。次に、ステップＳ８において、情報処理装置１００は、保護者データ１３９から保護者特徴量１４４０（図８参照）を抽出する。次に、ステップＳ１０において、情報処理装置１００は、成績データ１４０から成績特徴量１４５０（図８参照）を抽出する。
[Flow of occupation estimation processing]
Next, a flowchart of the occupation estimation process by the information processing device 100 will be described with reference to FIG. 11. In step S2, the information processing device 100 extracts the moving image feature amount 1410 (see FIG. 8) from the moving image data 136. Next, in step S4, the information processing device 100 extracts the still image feature amount 1420 (see FIG. 8) from the still image data 137. Next, in step S6, the information processing device 100 extracts the voice feature amount 1430 (see FIG. 8) from the voice data 138. Next, in step S8, the information processing device 100 extracts the guardian feature amount 1440 (see FIG. 8) from the guardian data 139. Next, in step S10, the information processing device 100 extracts the grade feature amount 1450 (see FIG. 8) from the grade data 140.

次に、ステップＳ１２において、情報処理装置１００は、５つの特徴量（動画像特徴量１４１０、静止画像特徴量１４２０、音声特徴量１４３０、保護者特徴量１４４０、および成績特徴量１４５０）に基づいて、対象者の職業を推定する（図８の推定結果１４４９を出力する）。 Next, in step S12, the information processing device 100 estimates the occupation of the subject based on the five feature amounts (moving image feature amount 1410, still image feature amount 1420, audio feature amount 1430, guardian feature amount 1440, and performance feature amount 1450); that is, it outputs the estimation result 1449 of FIG. 8.
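
The flow of steps S2 through S12 can be summarized in a short sketch. The extractor functions and the toy model below are placeholders (assumptions) standing in for the modules of FIG. 8 and the estimation model 1400; only the ordering of the steps is taken from the description.

```python
from typing import Callable, Dict, List

# Order of feature extraction: steps S2, S4, S6, S8, S10.
FEATURE_ORDER = ["video", "still_image", "audio", "guardian", "grades"]


def estimate_occupation(inputs: Dict[str, List[float]],
                        extractors: Dict[str, Callable[[List[float]], List[float]]],
                        model: Callable[[Dict[str, List[float]]], Dict[str, float]]
                        ) -> Dict[str, float]:
    """Extract the five feature amounts in order, then run the model (step S12)."""
    features = {name: extractors[name](inputs[name]) for name in FEATURE_ORDER}
    return model(features)


def toy_model(features: Dict[str, List[float]]) -> Dict[str, float]:
    """Placeholder model: maps the summed features to two occupation scores that sum to 1."""
    total = sum(sum(values) for values in features.values())
    return {"doctor": total / (total + 1.0), "singer": 1.0 / (total + 1.0)}


inputs = {name: [1.0, 2.0] for name in FEATURE_ORDER}
extractors = {name: (lambda data: list(data)) for name in FEATURE_ORDER}
scores = estimate_occupation(inputs, extractors, toy_model)
best = max(scores, key=lambda occupation: scores[occupation])
```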

［学習処理のフロー］
次に、図１２を用いて、管理装置３００による学習処理のフローチャートを説明する。まず、ステップＳ１０１において、管理装置３００は、各情報処理装置１００から学習用データセットを取得する。ここで、各情報処理装置１００は、ネットワーク２経由で管理装置３００と接続されている全ての情報処理装置１００である（図４参照）。
[Learning process flow]
Next, a flowchart of the learning process by the management device 300 will be described with reference to FIG. 12. First, in step S101, the management device 300 acquires learning data sets from each information processing device 100. Here, "each information processing device 100" refers to all the information processing devices 100 connected to the management device 300 via the network 2 (see FIG. 4).

また、管理装置３００は、所定の契機で、要求信号を全ての情報処理装置１００に対して送信する。所定の契機は、たとえば、管理者が管理装置３００に対して要求信号を送信させるための操作を行うという第１契機と、所定の期間が経過するという第２契機とを含む。 Further, the management device 300 transmits a request signal to all the information processing devices 100 at a predetermined trigger. The predetermined trigger includes, for example, a first trigger in which the administrator performs an operation on the management device 300 to make it transmit the request signal, and a second trigger in which a predetermined period elapses.

次に、ステップＳ１０２において、管理装置３００は、ステップＳ１０１で取得した学習用データセットのうち１つの組（学習用データセット）を選択する。 Next, in step S102, the management device 300 selects one set (learning data set) of the learning data sets acquired in step S101.

次に、ステップＳ１０４において、管理装置３００は、選択した学習用データセットから、他の対象者の特徴量を抽出する。ステップＳ１０４の処理は、たとえば、図１０の領域特定モジュール１４１、サイズ調整モジュール１４２、領域特定モジュール１４３、サイズ調整モジュール１４４、区間特定モジュール１４５、リサンプリングモジュール１４６、抽出モジュール１４７、および抽出モジュール１４８が行う。 Next, in step S104, the management device 300 extracts the feature amounts of the other target person from the selected learning data set. The processing of step S104 is performed by, for example, the area identification module 141, the size adjustment module 142, the area identification module 143, the size adjustment module 144, the section identification module 145, the resampling module 146, the extraction module 147, and the extraction module 148 of FIG. 10.

次に、ステップＳ１０６において、管理装置３００は、抽出された５つの特徴量を推定モデル１４００に入力して、推定結果１４４９を生成する。次に、ステップＳ１０８において、管理装置３００は、選択した学習用データセットの職業（ラベル）と、推定結果に係る職業との誤差に基づいて学習済モデルのパラメータを最適化する。具体的には、管理装置３００は、選択した学習用データセットの職業（正解ラベル）と、推定結果に係る職業との誤差が小さくなるように、学習済モデルのパラメータを更新する。 Next, in step S106, the management device 300 inputs the extracted five feature quantities into the estimation model 1400 and generates an estimation result 1449. Next, in step S108, the management device 300 optimizes the parameters of the trained model based on the error between the occupation (label) of the selected learning data set and the occupation related to the estimation result. Specifically, the management device 300 updates the parameters of the trained model so that the error between the occupation (correct label) of the selected learning data set and the occupation related to the estimation result becomes small.
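Steps S106 and S108 can be illustrated with a single forward pass followed by one parameter update. The sketch below is only one possible reading: the specification does not state the model type, the error measure, or the optimizer, so the linear softmax classifier, the cross-entropy error, the plain gradient-descent update, and every identifier are assumptions made for illustration.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, features):
    """S106 (assumed form): one score per occupation, then softmax."""
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(scores)

def update_step(weights, features, label_index, lr=0.1):
    """S108 (assumed form): one gradient step that reduces the error
    between the labeled occupation and the estimation result.
    Updates `weights` in place and returns the error before the update."""
    probs = predict(weights, features)
    loss = -math.log(probs[label_index])  # cross-entropy for the correct label
    for k, row in enumerate(weights):
        # Gradient of cross-entropy w.r.t. score k is (p_k - 1[k == label]).
        grad_scale = probs[k] - (1.0 if k == label_index else 0.0)
        for j, x in enumerate(features):
            row[j] -= lr * grad_scale * x
    return loss
```

Calling `update_step` repeatedly on the same sample should make the returned error shrink, which is the behavior described for step S108.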

次に、ステップＳ１１０において、管理装置３００は、全ての学習用データセットを処理済であるか否かを判断する。管理装置３００は、全ての学習用データセットが処理済ではないと判断した場合には（ステップＳ１１０でＮＯ）、ステップＳ１０２に戻る。一方、ステップＳ１１０において、管理装置３００は、全ての学習用データセットを処理済であると判断した場合には（ステップＳ１１０でＹＥＳ）、ステップＳ１１２に移行する。 Next, in step S110, the management device 300 determines whether or not all the learning data sets have been processed. When the management device 300 determines that not all the learning data sets have been processed (NO in step S110), it returns to step S102. On the other hand, when the management device 300 determines in step S110 that all the learning data sets have been processed (YES in step S110), it proceeds to step S112.

ステップＳ１１２においては、管理装置３００は、全ての情報処理装置１００に対して、学習済モデルを送信する。全ての情報処理装置１００は、送信された学習済モデルで、記憶していた学習済モデルを更新する。 In step S112, the management device 300 transmits the trained model to all the information processing devices 100. All the information processing devices 100 update the stored learned model with the transmitted trained model.
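The overall control flow of steps S101 to S112 can be sketched as a loop over the collected learning data sets followed by a broadcast of the updated model. This is a minimal sketch of the control flow only: the `Device` class, the `extract` and `update` callbacks, and the dictionary used as a stand-in for the model are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence, Tuple

# (raw target person data, occupation label); an assumed representation.
Sample = Tuple[Sequence[float], str]

@dataclass
class Device:
    """Stand-in for an information processing device 100."""
    samples: List[Sample]
    model: Dict = field(default_factory=dict)

def run_learning_process(devices: List[Device],
                         model: Dict,
                         extract: Callable,
                         update: Callable) -> Dict:
    # S101: acquire the learning data sets from every connected device.
    datasets = [s for d in devices for s in d.samples]
    # S102 / S110: select one data set at a time until all are processed.
    for raw, label in datasets:
        features = extract(raw)          # S104: feature extraction
        update(model, features, label)   # S106 + S108: estimate and optimize
    # S112: send the trained model to every device, replacing the stored one.
    for d in devices:
        d.model = dict(model)
    return model
```

Passing the extraction and update steps as callbacks keeps the loop independent of any particular model, which the specification leaves open.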

［小括］ [Brief Summary]
前述の実施の形態の情報処理装置１００によれば、例えば、図７に示すように、受付部１０２は、対象者に関する対象者情報の入力を受付ける。抽出部１４６０は、対象者の将来の職業を推定するための特徴量を、対象者情報から抽出する。また、推定モデル１４００は、学習済推定モデルであり、かつ推定モデル１４００は、特徴量に基づいて、対象者の将来の職業の可能性を、推定結果として出力する。出力部１０８は、推定モデル１４００での推定結果を出力する。また、図１０等に示すように、推定モデル１４００は、学習用データセットを用いた学習処理により生成され、学習用データセットは、他の対象者のパラメータである学習用パラメータに対して、該他の対象者の職業（職業データ２１８）をラベル付けした学習用データを含む。これにより、情報処理装置１００は、学習済の推定モデル１４００を用いて、対象者の対象者情報に基づいて、対象者の将来の職業を推定できる。したがって、情報処理装置１００は、適切に、対象者の将来の職業を推定できる。 According to the information processing device 100 of the embodiment described above, as shown in FIG. 7 for example, the reception unit 102 accepts input of target person information about the target person. The extraction unit 1460 extracts, from the target person information, the feature amount for estimating the future occupation of the target person. The estimation model 1400 is a trained estimation model and outputs, based on the feature amount, the possibility of the target person's future occupation as an estimation result. The output unit 108 outputs the estimation result of the estimation model 1400. Further, as shown in FIG. 10 and elsewhere, the estimation model 1400 is generated by a learning process using a learning data set, and the learning data set includes learning data in which a learning parameter, which is a parameter of another target person, is labeled with the occupation of that other target person (occupation data 218). The information processing device 100 can thereby use the trained estimation model 1400 to estimate the future occupation of the target person from the target person information. Therefore, the information processing device 100 can appropriately estimate the future occupation of the target person.

また、特徴量は、質問者による質問に対する対象者の回答（音声特徴量１４３０）、対象者の学業の成績（成績特徴量１４５０）、対象者が描く画像（静止画像特徴量１４２０）、対象者が描く文字（静止画像特徴量１４２０）、および対象者の動画（動画像特徴量１４１０）のうち少なくとも１つを含む。したがって、情報処理装置１００は、対象者の将来の職業を推定するために適切な特徴量に基づいて、該対象者の職業を適切に推定できる。 Further, the feature amount includes at least one of the target person's answer to a question asked by a questioner (voice feature amount 1430), the target person's academic grades (grade feature amount 1450), an image drawn by the target person (still image feature amount 1420), characters written by the target person (still image feature amount 1420), and a moving image of the target person (moving image feature amount 1410). Therefore, the information processing device 100 can appropriately estimate the occupation of the target person based on feature amounts suited to estimating the target person's future occupation.

また、質問者による質問は、対象者が将来に就きたい職業を問う質問を含む。このような質問、および該質問に対する回答は、対象者の職業を推定するにあたって有用な指標となる。したがって、情報処理装置１００は、対象者の職業を適切に推定できる。 In addition, the question asked by the questioner includes a question asking the profession that the subject wants to get in the future. Such a question and the answer to the question are useful indicators for estimating the occupation of the subject. Therefore, the information processing device 100 can appropriately estimate the occupation of the target person.

また、特徴量は、対象者の保護者についてのパラメータである保護者パラメータを含む。対象者の保護者の保護者パラメータは、対象者の職業を推定するにあたって有用な指標となる。したがって、情報処理装置１００は、対象者の職業を適切に推定できる。 In addition, the feature amount includes a guardian parameter which is a parameter for the guardian of the target person. The guardian parameters of the subject's guardian are useful indicators in estimating the subject's occupation. Therefore, the information processing device 100 can appropriately estimate the occupation of the target person.

また、保護者パラメータは、保護者の職業、保護者の学歴、および保護者の年収のうちの少なくとも１つを含む。保護者の職業、保護者の学歴、および保護者の年収は、対象者の職業を推定するにあたって有用な指標となる。したがって、情報処理装置１００は、対象者の職業を適切に推定できる。 Further, the guardian parameter includes at least one of the guardian's occupation, the guardian's educational background, and the guardian's annual income. The guardian's occupation, educational background, and annual income are useful indicators for estimating the occupation of the target person. Therefore, the information processing device 100 can appropriately estimate the occupation of the target person.

また、本実施の形態では、図３に示したように、推定結果は、特定の職業になれる確率に基づく情報である。図３の例では、特定の職業は、医者、弁護士、金融系サラリーマンなどである。したがって、情報処理装置１００は、対象者および対象者の保護者などに対して、特定の職業になれる確率に基づいた職業を認識させることができる。 Further, in the present embodiment, as shown in FIG. 3, the estimation result is information based on the probability that the target person can enter a specific occupation. In the example of FIG. 3, the specific occupations are doctor, lawyer, financial office worker, and the like. Therefore, the information processing device 100 can let the target person, the target person's guardian, and others see occupations ranked by the probability of entering each specific occupation.
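One way to present probability-based information is to sort the occupations by their estimated probability and show the top few. The sketch below assumes the estimation result is available as a mapping from occupation name to probability; that representation, the function name, and the probabilities in the usage note are illustrative assumptions and are not values from FIG. 3.

```python
def rank_occupations(probabilities, top_n=3):
    """Return the top_n occupations as (name, percentage string) pairs,
    ordered from most to least probable."""
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, f"{p:.0%}") for name, p in ranked[:top_n]]
```

For example, calling `rank_occupations({"doctor": 0.5, "lawyer": 0.3, "financial office worker": 0.2}, top_n=2)` with these made-up probabilities yields the doctor and lawyer entries, formatted as percentages.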

また、学習装置としての管理装置３００は、対象者の将来の職業を推定するための特徴量の入力を受けて、対象者の将来の職業の可能性を、推定結果として出力する推定モデル１４００を生成する。管理装置３００は、学習用データセットを取得する。また、学習用データセット３２４は、他の対象者の特徴量に対して、該他の対象者の職業をラベル付けした学習用データを複数含む。管理装置３００は、学習用データセット３２４から、他の対象者の特徴量（動画像特徴量１４１０、静止画像特徴量１４２０、音声特徴量１４３０、保護者特徴量１４４０、および成績特徴量１４５０）を抽出する。また、最適化部３６２は、受付部が受付けた他の対象者の特徴量を前記推定モデルに入力して出力される推定結果が、学習用データにラベル付けされている他の対象者の職業（職業データ２１８）に近づくように、推定モデル１４００を最適化する。したがって、管理装置３００は、適切な推定結果を出力できるように、推定モデル１４００を更新することができる。 Further, the management device 300, acting as a learning device, generates the estimation model 1400, which receives an input of feature amounts for estimating the future occupation of the target person and outputs the possibility of the target person's future occupation as an estimation result. The management device 300 acquires a learning data set. The learning data set 324 includes a plurality of learning data in which the feature amounts of other target persons are labeled with the occupations of those other target persons. The management device 300 extracts the feature amounts of the other target persons (moving image feature amount 1410, still image feature amount 1420, voice feature amount 1430, guardian feature amount 1440, and grade feature amount 1450) from the learning data set 324. The optimization unit 362 optimizes the estimation model 1400 so that the estimation result output when the feature amounts of another target person received by the reception unit are input into the estimation model approaches the occupation of that other target person labeled in the learning data (occupation data 218). Therefore, the management device 300 can update the estimation model 1400 so that it outputs appropriate estimation results.

１００情報処理装置、１０２受付部、１０４記憶部、１０６推定部、１０８出力部。100 information processing device, 102 reception unit, 104 storage unit, 106 estimation unit, 108 output unit.

Claims

An information processing device comprising:
a reception unit that accepts input of target person information about a target person;
an extraction unit that extracts, from the target person information, a feature amount for estimating the future occupation of the target person;
a trained estimation model that outputs, based on the feature amount, the possibility of the target person's future occupation as an estimation result; and
an output unit that outputs the estimation result of the estimation model,
wherein the estimation model is generated by a learning process using a learning data set, and the learning data set includes learning data in which a learning parameter, which is a parameter of another target person, is labeled with the occupation of the other target person.

The information processing device according to claim 1, wherein the feature amount includes at least one of the target person's answer to a question asked by a questioner, the target person's academic grades, an image drawn by the target person, characters written by the target person, and a moving image of the target person.

The information processing device according to claim 2, wherein the question by the questioner includes a question asking the profession that the target person wants to take in the future.

The information processing device according to any one of claims 1 to 3, wherein the feature amount includes a guardian parameter which is a parameter for the guardian of the target person.

The information processing apparatus according to claim 4, wherein the guardian parameter includes at least one of the guardian's occupation, the guardian's educational background, and the guardian's annual income.

The information processing device according to any one of claims 1 to 5, wherein the estimation result includes at least one of information indicating whether or not the target person can enter a specific occupation and information based on the probability of entering the specific occupation.

An information processing system comprising:
an information processing device that inputs a feature amount for estimating the future occupation of a target person into a trained estimation model and outputs the future occupation of the target person; and
a learning device that generates the estimation model,
wherein the information processing device includes an extraction unit that extracts the feature amount,
the estimation model is trained to receive the input of the feature amount and output the possibility of the target person's future occupation as an estimation result,
the information processing device includes an output unit that outputs information about the future occupation of the target person based on the estimation result,
the learning device includes an acquisition unit that acquires a learning data set,
the learning data set includes a plurality of learning data in which the feature amounts of other target persons are labeled with the occupations of the other target persons, and
the learning device further includes:
a reception unit that accepts input of the feature amount of the other target person; and
a learning unit that optimizes the estimation model so that the estimation result output by inputting the feature amount of the other target person received by the reception unit into the estimation model approaches the occupation of the other target person labeled in the learning data.

A learning device for generating an estimation model that receives input of a feature amount for estimating the future occupation of a target person and outputs the possibility of the target person's future occupation as an estimation result, the learning device comprising:
an acquisition unit that acquires a learning data set,
wherein the learning data set includes a plurality of learning data in which the feature amounts of other target persons are labeled with the occupations of the other target persons, and
the learning device further comprises:
an extraction unit that extracts the feature amount of the other target person from the learning data set; and
a learning unit that optimizes the estimation model so that the estimation result output by inputting the feature amount of the other target person into the estimation model approaches the occupation of the other target person labeled in the learning data.

An information processing program that causes a computer to execute:
a step of extracting, from target person information, a feature amount for estimating the future occupation of a target person; and
a step of outputting, by a trained estimation model and based on the feature amount, the possibility of the target person's future occupation as an estimation result,
wherein the estimation model is generated by a learning process using a learning data set, and the learning data set includes learning data in which a learning parameter, which is a parameter of another target person, is labeled with the occupation of the other target person.