JP6783454B2

JP6783454B2 - Interesting quantitative evaluation device, interesting quantitative evaluation method, interesting adjustment method, and program

Info

Publication number: JP6783454B2
Application number: JP2016155057A
Authority: JP
Inventors: 哲也眞榮城
Original assignee: University of Tsukuba NUC
Current assignee: University of Tsukuba NUC
Priority date: 2016-08-05
Filing date: 2016-08-05
Publication date: 2020-11-11
Anticipated expiration: 2036-08-05
Also published as: JP2018022118A

Description

Embodiments of the present invention relate to a fun quantitative evaluation device, a fun quantitative evaluation method, a fun adjustment method, and a program.

In performances such as manzai and conte (sketch comedy), exchanges intended to make the audience laugh (laughter-producing exchanges) are presented one after another as the performers' oral performance. Manzai usually consists of exchanges between two performers. A conte is structured as a short skit. The forms and contents of the material elements that produce laughter in an oral performance vary widely; they include not only spoken lines but also gestures and movements (Non-Patent Document 1).

Yoshito Hirose, "Correlation between the components of manzai and 'laughter': boke, tsukkomi, facial expressions, furi, etc.", Nihon University Society of Japanese Literature (日本大学国文学会), Gobun (語文), No. 146, pp. 75-65, 2013

However, as described above, the forms and contents of the material elements in oral performances such as manzai and conte vary widely. It has therefore been difficult to quantitatively evaluate the fun of an oral performance.

The problem to be solved by the present invention is to provide a fun quantitative evaluation device, a fun quantitative evaluation method, a fun adjustment method, and a program that make it easier to quantitatively evaluate the fun of an oral performance.

(1) To achieve the above object, a fun quantitative evaluation device according to one aspect of the present invention includes an evaluation unit that evaluates the fun of a performance performed by a performer based on a first analysis result, which indicates a tendency in the performer's manner of speaking and is derived from the voice uttered by the performer during the performance, and a second analysis result, which indicates a tendency in the content performed by the performer and is derived from the laughter of people viewing the performance, wherein the first analysis result includes a result of evaluating the tempo of the exchanges among a plurality of persons who are the performers.
(2) In the above fun quantitative evaluation device, the first analysis result does not include an analysis result concerning the content of the performer's utterances based on the voice uttered by the performer, and the second analysis result includes a result of evaluating the proportion of material elements without spoken lines in the performance.
(3) A fun quantitative evaluation device according to one aspect of the present invention includes an evaluation unit that evaluates the fun of a performance performed by a performer based on a first analysis result, which indicates a tendency in the performer's manner of speaking and is derived from the voice uttered by the performer during the performance, and a second analysis result, which indicates a tendency in the content performed by the performer and is derived from the laughter of people viewing the performance, wherein the second analysis result includes information on the time intervals at which the performer performed material elements in the performance, or information on the time from when the performer performed a material element in the latter half of the performance until the end of the performance.
(4) In the above fun quantitative evaluation device, the material element performed by the performer in the latter half of the performance is the material element performed last in the performance.
(5) A fun quantitative evaluation device according to one aspect of the present invention includes an evaluation unit that evaluates the fun of a performance performed by a performer based on a first analysis result, which indicates a tendency in the performer's manner of speaking and is derived from the voice uttered by the performer during the performance, and a second analysis result, which indicates a tendency in the content performed by the performer and is derived from the laughter of people viewing the performance, wherein the first analysis result includes information indicating the amount of information in the voice uttered by the performer.
(6) In the above fun quantitative evaluation device, the amount of information in the voice includes one or more of the following, measured on the voice uttered by the performer: the time allotted to spoken lines, the number of morae per unit time, and the number of morae in the relatively long lines among the performer's lines.
(7) In the above fun quantitative evaluation device, the evaluation unit derives a first evaluation value of the fun of the performance performed by the performer, based on the first analysis result and the second analysis result for the performance that the performer performed in a first round among predetermined rounds of a contest in which selection proceeds in stages over a plurality of rounds, and a second evaluation value of the fun of the performance performed by the performer, based on the first analysis result and the second analysis result for the performance that the performer performed in a second round, different from the first round, among the predetermined rounds. The fun quantitative evaluation device further includes an estimation unit that estimates, based on the first evaluation value and the second evaluation value, the fun of a performance that the performer performs after both the first round and the second round.
(8) A fun quantitative evaluation device according to one aspect of the present invention includes an evaluation unit that evaluates the fun of a performance performed by a performer based on a first analysis result, which indicates a tendency in the performer's manner of speaking and is derived from the voice uttered by the performer during the performance, and a second analysis result, which indicates a tendency in the content performed by the performer and is derived from the laughter of people viewing the performance. The evaluation unit derives a first evaluation value of the fun of the performance performed by the performer, based on the first analysis result and the second analysis result for the performance that the performer performed in a first round among predetermined rounds of a contest in which selection proceeds in stages over a plurality of rounds, and a second evaluation value of the fun of the performance performed by the performer, based on the first analysis result and the second analysis result for the performance that the performer performed in a second round, different from the first round, among the predetermined rounds. The fun quantitative evaluation device further includes an estimation unit that estimates the fun of a performance that the performer performs after both the first round and the second round, based on the first evaluation value, the second evaluation value, and coefficients that weight the first evaluation value and the second evaluation value.
(9) The above fun quantitative evaluation device includes a model generation unit that generates a model for evaluating the fun of a performance based on the feature quantities and evaluation results of performances whose fun has been evaluated, wherein the feature quantities of such a performance include the voice uttered by the performer who performed it and the laughter of the people who viewed it.
(10) A fun quantitative evaluation method according to one aspect of the present invention includes a process in which a computer evaluates the fun of a performance performed by a performer based on a first analysis result, which indicates a tendency in the performer's manner of speaking and is derived from the voice uttered by the performer during the performance, and a second analysis result, which indicates a tendency in the content performed by the performer and is derived from the laughter of people viewing the performance, wherein the first analysis result includes a result of evaluating the tempo of the exchanges among a plurality of persons who are the performers, or information indicating the amount of information in the voice uttered by the performer, or wherein the second analysis result includes information on the time intervals at which the performer performed material elements in the performance, or information on the time from when the performer performed a material element in the latter half of the performance until the end of the performance.
(11) A program according to one aspect of the present invention causes a computer of a fun quantitative evaluation device to execute a step of evaluating the fun of a performance performed by a performer based on a first analysis result, which indicates a tendency in the performer's manner of speaking and is derived from the voice uttered by the performer during the performance, and a second analysis result, which indicates a tendency in the content performed by the performer and is derived from the laughter of people viewing the performance, wherein the first analysis result includes a result of evaluating the tempo of the exchanges among a plurality of persons who are the performers, or information indicating the amount of information in the voice uttered by the performer, or wherein the second analysis result includes information on the time intervals at which the performer performed material elements in the performance, or information on the time from when the performer performed a material element in the latter half of the performance until the end of the performance.
(12) A fun quantitative evaluation device according to one aspect of the present invention includes an evaluation unit configured to generate an evaluation value as the result of evaluating the fun of a performance performed by a performer based on a first analysis result, which indicates a tendency in the performer's manner of speaking and is derived from the voice uttered by the performer during the performance, and a second analysis result, which indicates a tendency in the content performed by the performer and is derived from the laughter of people viewing the performance, and adjusts the structure of the script of the performance so that the evaluation value becomes a desired value.
(13) A fun quantitative evaluation device according to one aspect of the present invention includes an evaluation unit configured to generate an evaluation value as the result of evaluating the fun of a performance performed by a performer based on a first analysis result, which indicates a tendency in the performer's manner of speaking and is derived from the voice uttered by the performer during the performance, and a second analysis result, which indicates a tendency in the content performed by the performer and is derived from the laughter of people viewing the performance, and supports the creation of the script of the performance by adjusting the structure of the performance so that the evaluation value becomes a desired value.
(14) A fun quantitative evaluation device according to one aspect of the present invention includes an evaluation unit configured to generate an evaluation value as the result of evaluating the fun of a performance performed by a performer based on a first analysis result, which indicates a tendency in the performer's manner of speaking and is derived from the voice uttered by the performer during the performance, and a second analysis result, which indicates a tendency in the content performed by the performer and is derived from the laughter of people viewing the performance, and quantitatively evaluates the script of the performance based on the evaluation value.
(15) A fun adjustment method according to one aspect of the present invention is a method by which a computer supports adjustment of the fun of a performance. The method includes a process in which the computer evaluates the fun of a first performance performed by a performer based on a first analysis result, which indicates a tendency in the performer's manner of speaking and is derived from the voice uttered by the performer during the performance, and a second analysis result, which indicates a tendency in the content performed by the performer and is derived from the laughter of people viewing the performance, and a process in which the computer evaluates the fun of a second performance whose fun has been adjusted based on the result of that evaluation.
(16) A fun adjustment method according to one aspect of the present invention includes a process in which a computer, tuned so that it can evaluate the fun of a performance performed by a performer based on a first analysis result, which indicates a tendency in the performer's manner of speaking and is derived from the voice uttered by the performer during the performance, and a second analysis result, which indicates a tendency in the content performed by the performer and is derived from the laughter of people viewing the performance, evaluates a target performance and supports the creation of a script for the performance so that the desired fun is obtained from the target performance.
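Item (6) above names the number of morae per unit time as one measure of the amount of information in the performer's voice. The following is a minimal sketch of how such a figure could be computed from a kana transcription, not the method of the embodiment. The function names (`count_morae`, `morae_per_second`) and the use of a kana transcript as input are assumptions made for illustration. The counting rule is the usual one: each kana is one mora, except small kana such as ゃ, ゅ, ょ, which merge with the preceding kana; っ, ん, and ー each count as one mora.

```python
# Illustrative sketch only: names and input format are assumptions.

# Small kana that merge with the preceding kana into a single mora.
SMALL_KANA = frozenset("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")


def is_kana(ch: str) -> bool:
    """Return True for hiragana, katakana, or the long-vowel mark."""
    return (
        "\u3041" <= ch <= "\u3096"      # hiragana
        or "\u30a1" <= ch <= "\u30fa"   # katakana
        or ch == "\u30fc"               # long-vowel mark
    )


def count_morae(kana: str) -> int:
    """Count morae in a kana string, ignoring punctuation and spaces."""
    return sum(1 for ch in kana if is_kana(ch) and ch not in SMALL_KANA)


def morae_per_second(kana: str, duration_s: float) -> float:
    """Speech rate in morae per second for one transcribed line."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return count_morae(kana) / duration_s
```

For example, under this rule `count_morae("きょう")` gives 2, because ょ merges with き.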

According to one aspect of the present invention, it is possible to provide a fun quantitative evaluation device, a fun quantitative evaluation method, a fun adjustment method, and a program that make it easier to quantitatively evaluate the fun of an oral performance.

A diagram showing an overview of the fun quantitative evaluation device 1 of the first embodiment.
A diagram showing the hardware configuration of the fun quantitative evaluation device 1 of the present embodiment.
A diagram showing the configuration of the fun quantitative evaluation device 1 of the present embodiment.
A diagram showing an example of the detailed configuration of the fun quantitative evaluation device 1 of the present embodiment.
A diagram showing an example of an exchange between performer A and performer B.
A diagram showing the time from the start to the end of a manzai performance and the positions of the detected material elements.
The number of groups appearing in the final round and in the final decisive battle of each edition of the manzai contest.
A diagram showing the number of deleted portions in the video recorded on the DVDs of the manzai contest.
A configuration diagram of the neural network 241 according to the present embodiment.
A diagram showing learning data according to the present embodiment.
A diagram showing evaluation values according to the present embodiment.
A flowchart showing the verification process of the fun quantitative evaluation device 1 according to the present embodiment.
A diagram showing the prediction accuracy for the final rounds of MZ-2001 to MZ-2010.
A diagram showing the prediction accuracy for the final decisive battles of MZ-2001 to MZ-2010.
A diagram showing the degree of influence of the six indices p1 to p6 on the prediction accuracy.
A diagram showing the configuration of the fun quantitative evaluation device 1 of the second embodiment.
A diagram showing the progression of M1-2015, an example of a manzai contest.
A diagram showing an overview of the ranking prediction process for the final round of M1-2015, an example of a manzai contest.
A diagram showing the results of predicting the rankings in the final round according to the present embodiment.
A configuration diagram of the fun quantitative evaluation device 1 including the analysis device of the present embodiment.
A diagram showing an example of the indices obtained by the feature quantity acquisition unit 26 according to the present embodiment.

Hereinafter, embodiments of the fun quantitative evaluation device, the fun quantitative evaluation method, and the program will be described with reference to the drawings.

(First Embodiment)
[1. Configuration of the fun quantitative evaluation device]
FIG. 1 is a diagram showing an overview of the fun quantitative evaluation device 1 of the present embodiment.
The fun quantitative evaluation device 1 evaluates the fun of a performance performed by a performer based on a first analysis result, which indicates a tendency in the performer's manner of speaking and is derived from the voice uttered by the performer during the performance, and a second analysis result, which indicates a tendency in the content performed by the performer and is derived from the laughter of people viewing the performance.

By using indices related to the content of an oral performance, such as manzai or conte, together with indices related to the utterances, the fun quantitative evaluation device 1 makes it possible to evaluate the oral performance quantitatively. Because the fun quantitative evaluation device 1 derives a quantitative evaluation result for the oral performance, that result can be compared with evaluation results obtained by other means. The fun quantitative evaluation device 1 therefore applies a machine learning method that uses evaluation results obtained by other means to further improve the accuracy of its own evaluation.

The following description of the present embodiment uses manzai as an example of the evaluation target. The fun quantitative evaluation device 1 quantitatively evaluates the fun of a manzai performance based on video data 2 of that performance. For example, the fun quantitative evaluation device 1 uses video of a manzai contest as the video data 2. As the evaluation results by other means used for machine learning, it uses the evaluation results given by the judges of that manzai contest.

Manzai comes in many varieties: "presenting a single story", "presenting several developments of a specific situation", "starting from one topic and moving on to similar topics", and many other forms besides these.

In evaluating the fun of such manzai, the fun quantitative evaluation device 1 of the present embodiment uses indices that do not depend on any specific form of manzai. For example, the fun quantitative evaluation device 1 need not include in its evaluation indices any index of the connections between successive material elements of the manzai, such as the flow of the story or the continuity of topics, which depend on the content of the utterances. These connections involve the content of each material element, the meaning and degree of fun of each material element, the semantic links between material elements, and so on. The present embodiment illustrates the case in which the fun quantitative evaluation device 1 does not use such connections between individual material elements as evaluation indices.

Likewise, the fun quantitative evaluation device 1 need not include in its evaluation indices the sources of fun that arise from relationships between the individual material elements of the manzai, such as the specific way a laugh-inducing material element is constructed, the pattern of tsukkomi (retorts) in response to boke (deliberately foolish remarks), or the timing of pauses during speech. In other words, the fun quantitative evaluation device 1 of the present embodiment does not use information about the time series of the oral performance or linguistic information as indices for evaluating fun.

For example, the fun quantitative evaluation device 1 may predict the rankings of a manzai contest based on a quantitative evaluation that does not rely on any of the information described above. The fun quantitative evaluation device 1 regards manzai as a chain of exchanges, each of which makes the audience laugh once (laughter-producing exchanges). Here, an exchange that produces one laugh is defined as a "material element". A material element may take any form and have any content, including spoken lines, gestures, and movements.
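Treating a performance as a chain of material elements suggests a simple data representation. The sketch below is illustrative only and assumes that each material element has already been detected and time-stamped; the class `MaterialElement` and the three helper functions are hypothetical names, not part of the embodiment. They compute the quantities that items (2) to (4) of the summary mention for the second analysis result: the intervals between material elements, the time from the last material element to the end of the performance, and the proportion of material elements without spoken lines.

```python
from dataclasses import dataclass


@dataclass
class MaterialElement:
    """One laughter-producing exchange (hypothetical representation)."""
    time_s: float        # seconds from the start of the performance
    has_dialogue: bool   # False for gesture-only or movement-only elements


def element_intervals(elements: list[MaterialElement]) -> list[float]:
    """Time gaps between consecutive material elements, in seconds."""
    times = sorted(e.time_s for e in elements)
    return [later - earlier for earlier, later in zip(times, times[1:])]


def time_from_last_element_to_end(
    elements: list[MaterialElement], duration_s: float
) -> float:
    """Seconds between the last material element and the end of the act."""
    if not elements:
        raise ValueError("at least one material element is required")
    return duration_s - max(e.time_s for e in elements)


def nonverbal_ratio(elements: list[MaterialElement]) -> float:
    """Proportion of material elements that contain no spoken lines."""
    if not elements:
        raise ValueError("at least one material element is required")
    return sum(not e.has_dialogue for e in elements) / len(elements)


# Made-up example: a 240-second act with four detected material elements.
act = [
    MaterialElement(10.0, True),
    MaterialElement(25.0, True),
    MaterialElement(55.0, False),
    MaterialElement(230.0, True),
]
print(element_intervals(act))
print(time_from_last_element_to_end(act, 240.0))
print(nonverbal_ratio(act))
```

The timestamps and the 240-second duration are invented for the example; how the elements are actually detected is outside the scope of this sketch.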

FIG. 2 is a diagram showing the hardware configuration of the fun quantitative evaluation device 1 of the present embodiment. The fun quantitative evaluation device 1 is a computer that includes a CPU 1A; a non-volatile storage device 1B such as a ROM (Read Only Memory), an EEPROM (Electrically Erasable and Programmable Read Only Memory), or an HDD (Hard Disk Drive); a volatile storage device 1C such as a RAM (Random Access Memory) or registers; a portable recording medium drive device 1D; an input/output device 1E; a communication interface 1F; and the like. By executing a program, it performs the processing that evaluates the fun of a manzai oral performance.

FIG. 3A is a diagram showing the configuration of the fun quantitative evaluation device 1 of the present embodiment. FIG. 3B is a diagram showing an example of the detailed configuration of the fun quantitative evaluation device 1 of the present embodiment. The fun quantitative evaluation device 1 includes a storage unit 10, an evaluation unit 21, an average score calculation unit 22, an order determination unit 23, a model generation unit 24, a feature quantity extraction unit 25, and a prediction accuracy calculation unit 26.

The storage unit 10 is realized by the non-volatile storage device 1B and the volatile storage device 1C. In addition to the program that makes the fun quantitative evaluation device 1 function, the storage unit 10 stores data such as learning data 11, model data 12, rating data 13, evaluation result data 14, and examination result data 15. Details of the learning data 11, the model data 12, the rating data 13, the evaluation result data 14, and the examination result data 15 will be described later.

For example, the evaluation unit 21, the average score calculation unit 22, the order determination unit 23, the model generation unit 24, the feature quantity extraction unit 25, the prediction accuracy calculation unit 26, and the other units function when the CPU 1A executes a program stored in the storage unit 10.

Based on the rating data 13 read from the storage unit 10, the evaluation unit 21 evaluates the fun of the performance performed by the performers of a group (hereinafter simply "the performer") and outputs an evaluation value (predicted evaluation score k (k = 1, ..., 10)) as the evaluation result. The rating data 13 includes a first analysis result 131 and a second analysis result 132. The first analysis result 131 is the result of analyzing the tendency in the performer's manner of speaking based on the voice uttered by the performer during the performance. The second analysis result 132 is the result of analyzing the tendency in the content performed by the performer based on the laughter of the people viewing the performance. For example, the evaluation unit 21 carries out an evaluation process that evaluates the fun of the performance based on an evaluation model generated by the model generation unit 24 using a machine learning method, and thereby derives the above evaluation value (predicted evaluation score k). Details of the process for evaluating the fun of the performance will be described later.

The average score calculation unit 22 calculates, for each performer of each performance, the average of the evaluation values (predicted evaluation scores k) derived by the evaluation unit 21.

Based on the average evaluation value of each group (performer) for each performance, derived by the evaluation unit 21, the order determination unit 23 determines the order of fun of the oral performances of the target performers and generates the evaluation result data 14. The fun quantitative evaluation device 1 may predict the fun of oral performances to be given in the future based on the order of fun of the performers determined by the order determination unit 23.
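The averaging and ordering steps performed by the average score calculation unit 22 and the order determination unit 23 can be sketched in a few lines. This is an illustration under assumptions, not the embodiment's implementation: the function name `rank_groups`, the dictionary layout, and the scores are all hypothetical, and it assumes that each group has several predicted evaluation scores to average.

```python
from statistics import mean


def rank_groups(predicted_scores: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Average each group's predicted scores and sort from highest to lowest.

    predicted_scores maps a group name to its predicted evaluation scores.
    Groups with equal averages keep their input order (sorted() is stable).
    """
    averages = {group: mean(scores) for group, scores in predicted_scores.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Made-up scores for three hypothetical groups.
scores = {
    "A": [80.0, 90.0],
    "B": [95.0, 85.0, 90.0],
    "C": [70.0, 75.0],
}
for position, (group, average) in enumerate(rank_groups(scores), start=1):
    print(position, group, average)
```

Sorting by the average in descending order places the group predicted to be funniest first, which corresponds to the order stored in the evaluation result data 14.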

予測精度算出部２６は、順序判定部２３により決定されたグループ（演者）の面白さの順序（評価結果データ１４）と、審査結果データ１５により示される演者の面白さの順序とを対比して、予測精度を算出する。 The prediction accuracy calculation unit 26 compares the order of fun of the group (performer) determined by the order determination unit 23 (evaluation result data 14) with the order of fun of the performer indicated by the examination result data 15. , Calculate the prediction accuracy.

モデル生成部２４は、評価部２１として機能させる評価モデルを予め生成し、モデルデータ１２を生成する。例えば、モデル生成部２４は、上記の評価モデルを、機械学習の手法に基づき生成する。以下、モデル生成部２４の一例として、機械学習の手法の一例としてニューラルネットを適用する場合について説明する。例えば、モデル生成部２４は、ニューラルネットワーク２４１と、学習処理部２４２とを含む。学習処理部２４２は、ニューラルネットワーク２４１の予備学習を制御する。予備学習を終えたニューラルネットワーク２４１によって、評価モデルが生成される。モデル生成部２４は、生成した評価モデルをモデルデータ１２として記憶部１０に格納する。上記の評価モデルとは、ニューラルネットを定式化した際の、層間の結合の重み、各層のノードに与えられたバイアスなどを含む数値情報を含む。この評価モデルは、予備学習により上記の数値情報が決定される。 The model generation unit 24 generates an evaluation model to function as the evaluation unit 21 in advance, and generates model data 12. For example, the model generation unit 24 generates the above evaluation model based on a machine learning method. Hereinafter, as an example of the model generation unit 24, a case where a neural network is applied as an example of a machine learning method will be described. For example, the model generation unit 24 includes a neural network 241 and a learning processing unit 242. The learning processing unit 242 controls the preliminary learning of the neural network 241. The evaluation model is generated by the neural network 241 that has completed the preliminary learning. The model generation unit 24 stores the generated evaluation model as model data 12 in the storage unit 10. The above evaluation model includes numerical information including the weight of the connection between layers and the bias given to the node of each layer when the neural network is formulated. In this evaluation model, the above numerical information is determined by preliminary learning.

特徴量抽出部２５は、演者が発する音声と、視聴者の笑い声とのそれぞれを互いに識別可能に検出する。例えば、特徴量抽出部２５は、演者の近傍に配置されたマイクによって集音された音声を、演者が発する音声として検出する。特徴量抽出部２５は、演者が発する音声から、その音声を構成する文字数を識別し、発声期間と非発声期間とを識別する。 The feature amount extraction unit 25 detects the voice uttered by the performers and the laughter of the viewers so that the two can be distinguished from each other. For example, the feature amount extraction unit 25 detects the sound collected by a microphone placed near the performers as the voice uttered by the performers. From that voice, the feature amount extraction unit 25 identifies the number of characters that make up the speech and distinguishes voiced periods from unvoiced periods.
特徴量抽出部２５は、演者が発する音声から識別した文字数と発声期間と非発声期間とに基づいて、演者の話し方の傾向の特徴量を抽出し、記憶部１０にその解析結果（第１解析結果）を書き込む。演者の話し方の傾向の特徴量の解析についての詳細な説明は後述する。 Based on the number of characters identified from the performers' voices, the voiced periods, and the unvoiced periods, the feature amount extraction unit 25 extracts feature amounts describing the tendency of the performers' way of speaking and writes the analysis result (first analysis result) to the storage unit 10. The analysis of these feature amounts is described in detail later.

特徴量抽出部２５は、客席側で集音された音声から、視聴者の笑い声を検出する。なお、演者が発する音声と、視聴者の笑い声とが混合されている場合には、混合されている音から、音声の特徴に基づいて、演者が発する音声と視聴者の笑い声のそれぞれを分離する。特徴量抽出部２５は、分離した視聴者の笑い声に基づいて、演者が演じた内容の傾向を解析し、記憶部１０にその解析結果（第２解析結果）を書き込む。演者が演じた内容の傾向の解析についての詳細な説明は後述する。 The feature amount extraction unit 25 detects the viewers' laughter from sound collected on the audience side. When the performers' voices and the viewers' laughter are mixed, the two are separated from the mixed sound based on their acoustic characteristics. Based on the separated laughter, the feature amount extraction unit 25 analyzes the tendency of the performed content and writes the analysis result (second analysis result) to the storage unit 10. This analysis is described in detail later.

特徴量抽出部２５は、審査官による審査結果を抽出し、その結果を記憶部１０に書き込む。 The feature amount extraction unit 25 extracts the examination result by the examiner and writes the result in the storage unit 10.

［２．面白さの定量評価の方法］ [2. Method of quantitative evaluation of fun]
面白さ定量評価装置１による面白さの定量評価の方法について説明する。面白さ定量評価装置１は、漫才口演を、演者の発話と漫才の内容の２つの側面について定量化する。なお、本実施形態においては、どの指標も１つの漫才口演全体の値を計測し、口演を前中後盤のように時系列に基づいて分割した計測は行わない定量評価の適用例を例示する。 A method of quantitatively evaluating fun with the fun quantitative evaluation device 1 is described below. The fun quantitative evaluation device 1 quantifies a manzai performance in terms of two aspects: the performers' utterances and the content of the manzai. In the present embodiment, every index is measured over one whole manzai performance; the example does not split the performance chronologically into, say, opening, middle, and closing parts.

［2.1 漫才の評価指標］ [2.1 Evaluation indexes for manzai]
面白さ定量評価装置１は、漫才口演を(A)発話と(B)漫才の内容の２つの側面から捉える。 The fun quantitative evaluation device 1 captures a manzai performance from two aspects: (A) utterance and (B) the content of the manzai.
面白さ定量評価装置１は、発話については、以下の３つの特徴を計測する。発話についての３つの特徴とは、(A-1)ｐ１：１台詞に割当て可能な時間と、(A-2)ｐ２：１秒（単位時間）の発話に割当て可能なモーラ数と、(A-3)ｐ３：最も長い台詞のモーラ数とである。上記は、演者が発した音声における音声の情報量の一例である。 For utterance, the fun quantitative evaluation device 1 measures the following three features: (A-1) p1, the time that can be allocated to one line; (A-2) p2, the number of morae that can be allocated to one second (unit time) of utterance; and (A-3) p3, the number of morae in the longest line. These are examples of the amount of speech information in the voice uttered by the performers.

一方、面白さ定量評価装置１は、漫才の内容については、観客に笑いを生じさせる要因であるネタ要素を扱い、以下の３つの特徴を計測する。漫才の内容についての３つの特徴とは、(B-1)ｐ４：ネタ要素の平均時間間隔と、(B-2)ｐ５：最後のネタ要素から漫才終了までの時間と、(B-3)ｐ６：動作のみで台詞のないネタ要素の、全てのネタ要素に対する出現頻度の割合とである。 For the content of the manzai, on the other hand, the fun quantitative evaluation device 1 deals with material elements, which are the factors that make the audience laugh, and measures the following three features: (B-1) p4, the average time interval between material elements; (B-2) p5, the time from the last material element to the end of the manzai; and (B-3) p6, the proportion of all material elements that consist of movement only, without lines.

［2.1.1 発話に関する指標］ [2.1.1 Indexes related to utterance]
漫才は、通常２人の演者が発話する。身振りや手振りといった動作も聴衆に漫才の内容を伝える手段ではあるが、発話が主な手段である。面白さ定量評価装置１は、発話に関する特徴を定量化する。発話に関する指標ｐ１〜ｐ３は、映像データ２から台詞を抽出又は書き起こした文字データを用いて算出される。面白さ定量評価装置１の特徴量抽出部２５は、演者が発する音声から識別した文字数と発声期間と非発声期間とに基づいて、発話に関する指標ｐ１〜ｐ３を導出する。 Manzai is usually performed by two performers who speak. Physical movements such as gestures also convey the content of the manzai to the audience, but utterance is the main means. The fun quantitative evaluation device 1 quantifies features of utterance. The utterance-related indexes p1 to p3 are calculated from character data obtained by extracting or transcribing the lines from the video data 2. The feature amount extraction unit 25 of the fun quantitative evaluation device 1 derives the utterance-related indexes p1 to p3 based on the number of characters identified from the performers' voices, the voiced periods, and the unvoiced periods.

本実施形態では、１台詞を会話分析における発話が次に演者に移るまでを１ターンと定義する。図４は、演者Ａと演者Ｂのやりとりの一例を示す図である。例えば、図４のやりとりの台詞数は５である。発話に関する特徴を扱うため、書き起こした台詞の文字数ではなく、台詞の発話量により近い値が得られるモーラ（mora）数を計測する。モーラ数は基本的に平仮名の数と同じだが、拗音は数えない。例えば「がくふ」「らっこ」「れたー」は全て３モーラだが、「りょひ」は２モーラである。面白さ定量評価装置１は、演者の音声に基づいて抽出した文字データを全て読みの平仮名に変換し、台詞毎のモーラ数を計測する。面白さ定量評価装置１は、演者の音声に基づいて抽出又は書き起こした文字データを全て読みの平仮名に変換し、台詞毎のモーラ数として計測してもよい。 In this embodiment, one line is defined as one turn in the conversation-analysis sense, that is, a stretch of speech that lasts until the floor passes to the other performer. FIG. 4 shows an example of an exchange between performer A and performer B; the exchange in FIG. 4 contains five lines. Because the aim is to capture features of utterance, the device counts morae, which track the amount of speech more closely than the number of characters in the transcribed line. The number of morae is basically the same as the number of hiragana, except that the small kana of contracted sounds (youon) are not counted. For example, 「がくふ」 (gakufu), 「らっこ」 (rakko), and 「れたー」 (retaa) are each 3 morae, whereas 「りょひ」 (ryohi) is 2 morae. The fun quantitative evaluation device 1 converts all character data extracted from the performers' voices into its hiragana reading and counts the morae in each line. The device may also apply the same conversion and count to character data that was extracted or transcribed from the performers' voices.
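The mora-counting rule described above can be sketched as follows. This is a minimal illustration that assumes the line has already been converted to its kana reading; the function name `count_morae` and the set `YOON_KANA` are hypothetical, and the treatment of small kana other than those of contracted sounds is not specified in the description and is left out.

```python
# Small kana that merge with the preceding kana into a single mora (youon).
# Only these are skipped, following the rule stated in the description;
# other small kana (e.g. small vowels) are outside the scope of this sketch.
YOON_KANA = set("ゃゅょャュョ")


def count_morae(kana_reading: str) -> int:
    """Count morae in a kana reading: one per kana, skipping youon small kana."""
    return sum(
        1
        for ch in kana_reading
        if ch not in YOON_KANA and not ch.isspace()
    )


if __name__ == "__main__":
    for word in ["がくふ", "らっこ", "れたー", "りょひ"]:
        print(word, count_morae(word))
```

The geminate marker 「っ」 and the long-vowel mark 「ー」 each count as one mora here, which matches the three-mora examples given in the description.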

ところで、「小気味好いやりとり」、「小気味好いテンポ」のような表現があるように、やりとりの速さも漫才の評価に関係する。面白さ定量評価装置１は、ｐ１（１台詞に割当て可能な時間）を第１解析結果１３１に含める。また、面白さ定量評価装置１は、発話速度に関する指標として、ｐ２（１秒の発話に割当て可能なモーラ数）を、第１解析結果１３１に含める。さらには、面白さ定量評価装置１は、演者のやりとりに関する別の指標として、ｐ３（最も長い台詞のモーラ数）を第１解析結果１３１に含める。これは、発話の独占が生じると、やりとりのテンポに影響を与えるためである。全台詞の中で長い台詞の割合が高い漫才も有り得るが、一又は複数の比較的長い台詞、又は、最長の台詞をサンプルとして抽出し、その長さを指標として用いる。 Expressions such as "a snappy exchange" and "a snappy tempo" suggest that the speed of the exchange also bears on how a manzai is evaluated. The fun quantitative evaluation device 1 therefore includes p1 (time that can be allocated to one line) in the first analysis result 131. As an index related to speech rate, it also includes p2 (number of morae that can be allocated to one second of utterance) in the first analysis result 131. Furthermore, as another index of the interaction between the performers, it includes p3 (number of morae in the longest line) in the first analysis result 131, because one performer monopolizing the floor affects the tempo of the exchange. A manzai may contain a high proportion of long lines, but one or more relatively long lines, or the longest line, is taken as a sample and its length is used as the index.

例えば、指標ｐ１は、式（１）に基づいて算出される。 For example, the index p1 is calculated based on the equation (1).

例えば、指標ｐ２は、式（２）に基づいて算出される。 For example, the index p2 is calculated based on the equation (2).

上記の式（１）と式（２）とに示すように、指標ｐ１とｐ２は、実際の発話速度の値と異なる。なお、式（１）と式（２）とにおいて、分子と分母とで口演時間の扱いを逆にしている。これは、台詞の場合は１台詞当りの時間、また、発話速度と関係するモーラ数の場合は１秒当りのモーラ数とする方が、より直感的に把握でき、漫才口演の実践に応用しやすくしたことによる。なお、特徴量抽出部２５は、指標ｐ１とｐ２を、上記式（１）と式（２）とに基づいて導出してもよい。 As equations (1) and (2) show, the indexes p1 and p2 differ from the actual speech rate. In equations (1) and (2), the performance time appears in the numerator in one and in the denominator in the other. This is because time per line (for lines) and morae per second (for the mora count, which relates to speech rate) are more intuitive and easier to apply in manzai practice. The feature amount extraction unit 25 may derive the indexes p1 and p2 based on equations (1) and (2).
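Equations (1) and (2) are not reproduced in this text. The sketch below therefore rests on an assumption drawn from the surrounding description: p1 is taken as the total performance time divided by the number of lines, and p2 as the total number of morae divided by the total performance time, so that the performance time sits in the numerator of one and the denominator of the other. The names `Line` and `utterance_indexes` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Line:
    """One turn of speech by one performer (hypothetical container)."""
    speaker: str
    morae: int


def utterance_indexes(lines: List[Line], performance_seconds: float) -> Tuple[float, float, int]:
    """Return (p1, p2, p3) under the assumptions stated in the lead-in.

    p1: seconds available per line       = performance time / number of lines
    p2: morae available per second       = total morae / performance time
    p3: number of morae in the longest line
    """
    if not lines:
        raise ValueError("at least one line is required")
    if performance_seconds <= 0:
        raise ValueError("performance time must be positive")
    total_morae = sum(line.morae for line in lines)
    p1 = performance_seconds / len(lines)
    p2 = total_morae / performance_seconds
    p3 = max(line.morae for line in lines)
    return p1, p2, p3


if __name__ == "__main__":
    demo = [Line("A", 10), Line("B", 4), Line("A", 20), Line("B", 6), Line("A", 10)]
    print(utterance_indexes(demo, performance_seconds=25.0))
```

Because the whole performance time is used, including pauses and laughter, neither value equals the performers' actual speech rate, which is consistent with the remark in the description.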

［2.1.2 漫才の内容に関する指標］ [2.1.2 Indexes related to the content of manzai]
漫才の内容は、漫才口演の評価に大きな影響を与える。本実施形態では、ネタ要素を聴衆に笑いを生じさせる要因として扱う。例えば、ネタ要素の数が多い程、漫才口演の評価が高いとともに、ネタ要素が評価へ与える影響も大きいものとして定義する。面白さ定量評価装置１は、ｐ４（ネタ要素の平均時間間隔）を計測する。 The content of a manzai strongly influences how the performance is evaluated. In the present embodiment, a material element is treated as a factor that makes the audience laugh. For example, it is assumed that the more material elements there are, the higher the evaluation of the performance and the greater the influence of the material elements on the evaluation. The fun quantitative evaluation device 1 measures p4 (average time interval of material elements).

一方、口演の後半に、盛り上がりを持ってくることが良いとされ、口演の印象や評価は終了直前の事柄によって左右される。そこで、口演の中で最後のネタ要素の位置が漫才口演の評価へ影響すると仮定する。面白さ定量評価装置１は、ｐ５（最後のネタ要素から漫才終了までの時間）を計測する。 On the other hand, it is said that it is good to bring excitement in the latter half of the oral performance, and the impression and evaluation of the oral performance depends on the matter immediately before the end. Therefore, it is assumed that the position of the last material element in the oral performance affects the evaluation of the comic dialogue. The fun quantitative evaluation device 1 measures p5 (the time from the last material element to the end of the comic dialogue).

前述のようにｐ１（１台詞に割当て可能な時間）とｐ２（１秒の発話に割当て可能なモーラ数）は実際の発話速度を表わした値ではない。面白さ定量評価装置１は、ｐ１とｐ２を補完するために、ｐ６（台詞無しのネタ要素の割合）を計測する。 As described above, p1 (time that can be assigned to one line) and p2 (the number of mora that can be assigned to one second of utterance) do not represent the actual utterance speed. The fun quantitative evaluation device 1 measures p6 (ratio of material elements without dialogue) in order to complement p1 and p2.

実際の発話速度を評価項目とする場合を比較例として、本実施形態の場合と対比する。上記の比較例の場合、演者が口演において発話速度を変更することは容易である。その発話速度の変更に伴い口演の総時間が変化し、その結果、他の指標であるｐ１（１台詞に割当て可能な時間）、ｐ４（ネタ要素の平均時間間隔）、ｐ５（最後のネタ要素から漫才終了までの時間）の値が影響を受ける。本実施形態の評価の指標にした、ｐ６（台詞無しのネタ要素の割合）は、発話に関する指標とは独立した仕様であり、実際の漫才口演に応用しやすいという利点がある。 As a comparative example, consider using the actual speech rate as an evaluation item, in contrast to the present embodiment. In that comparative example, a performer can easily change the speech rate during a performance. The change in speech rate changes the total duration of the performance, which in turn affects the values of the other indexes p1 (time that can be allocated to one line), p4 (average time interval of material elements), and p5 (time from the last material element to the end of the manzai). The index p6 (proportion of material elements without lines) adopted in the present embodiment is defined independently of the utterance-related indexes, which has the advantage of making it easy to apply to actual manzai performances.

特徴量抽出部２５は、漫才の内容に関する指標ｐ４〜ｐ６を、漫才の映像データ２における観客の笑いの発生時点を検出することにより計測する。観客の笑いに関して計測するポイントは、発生時点のみであり、笑いの大きさや持続時間は計測しなくてもよい。例えば、前述の図４に示すやり取りの中では、笑いが２箇所で発生しており、特徴量抽出部２５は、ネタ要素が２つあると判断する。特徴量抽出部２５は、聴衆に笑いが生じれば、その内容に関係なく、ネタ要素として扱う。なお、作成者が意図していない箇所で、聴衆に笑いが生じる場合や、逆に意図した箇所で笑いが生じないことが起こり得るが、どちらであるかを台本の作成者には確認できない。面白さ定量評価装置１は、上記の両者を区別せずに扱う。 The feature amount extraction unit 25 measures the indexes p4 to p6 relating to the content of the comic dialogue by detecting the time point at which the laughter of the audience occurs in the video data 2 of the comic dialogue. The point to measure the laughter of the audience is only at the time of occurrence, and it is not necessary to measure the magnitude and duration of the laughter. For example, in the above-mentioned exchange shown in FIG. 4, laughter occurs at two places, and the feature amount extraction unit 25 determines that there are two material elements. If the audience laughs, the feature amount extraction unit 25 treats it as a material element regardless of the content. It is possible that the audience may laugh at a place not intended by the creator, or conversely, laughter may not occur at a place intended by the creator, but the author of the script cannot confirm which is the case. The fun quantitative evaluation device 1 handles the above two without distinguishing between them.

図５は、漫才の開始から終了までの時間と、検出されたネタ要素の位置を示す図である。同図において、ｔ［ｉ］、（ｉ=１からＮ）は、観客に笑いが生じたときを示す。つまり、ｔ［ｉ］は、ネタ要素の発生時間である。Ｎは、検出されたネタ要素の数である。なお、同図において、ｔ［０］は、漫才の開始時間である。また、ｔ［Ｎ+１］は、漫才の終了時間であり、その値はグループ毎に異なる。 FIG. 5 shows the time from the start to the end of a manzai and the positions of the detected material elements. In the figure, t[i] (i = 1 to N) indicates a time at which the audience laughed; that is, t[i] is the time at which a material element occurred. N is the number of detected material elements. In the figure, t[0] is the start time of the manzai, and t[N+1] is its end time, whose value differs from group to group.

例えば、ネタ要素の平均時間間隔（ｐ４）は、式（３）により算出される。 For example, the average time interval (p4) of the material element is calculated by the equation (3).

なお、式（３）において、Ｎは検出したネタ要素の数である。 In the equation (3), N is the number of detected material elements.

また、最後のネタ要素から漫才終了までの時間（ｐ５）は、式（４）により算出される。 In addition, the time (p5) from the last material element to the end of the comic dialogue is calculated by the equation (4).
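Equations (3) and (4) are not reproduced in this text, so the following is only a sketch under stated assumptions. p5 follows directly from the definitions of t[N] and t[N+1] as their difference. For p4, the sketch assumes the mean of the N gaps t[i] - t[i-1] for i = 1 to N, counted from the start time t[0]; the actual equation (3) may define the average differently. The function name `content_indexes` and the `silent_flags` argument (marking material elements that consist of movement only) are hypothetical.

```python
from typing import Sequence, Tuple


def content_indexes(
    start: float,
    laugh_times: Sequence[float],
    end: float,
    silent_flags: Sequence[bool],
) -> Tuple[float, float, float]:
    """Return (p4, p5, p6) from laughter onset times.

    start        : t[0], start time of the manzai in seconds
    laugh_times  : t[1..N], times at which audience laughter begins
    end          : t[N+1], end time of the manzai in seconds
    silent_flags : True where the material element has movement only, no line
    """
    n = len(laugh_times)
    if n == 0:
        raise ValueError("at least one material element is required")
    if len(silent_flags) != n:
        raise ValueError("one flag per material element is required")
    t = [start] + sorted(laugh_times)
    # Assumed form of p4: mean gap between consecutive material elements,
    # measured from the start time; this telescopes to (t[N] - t[0]) / N.
    p4 = (t[-1] - t[0]) / n
    # p5: time from the last material element to the end of the manzai.
    p5 = end - t[-1]
    # p6: share of material elements without any line.
    p6 = sum(1 for flag in silent_flags if flag) / n
    return p4, p5, p6


if __name__ == "__main__":
    print(content_indexes(0.0, [10.0, 30.0, 50.0, 80.0], 100.0, [False, True, False, False]))
```

Only the onset of each laugh is used, in line with the description's statement that the loudness and duration of laughter need not be measured.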

［2.2 対象データ］ [2.2 Target data]
本実施形態における対象データは、漫才コンテストにおけるデータである。 The target data in this embodiment is data from a manzai contest.
本実施形態における対象データとして、より具体的な漫才コンテストにおけるデータを適用した場合を例示して、その効果について説明する。漫才コンテストの一例として、日本においてテレビ放映される放送番組「Ｍ−１グランプリ(登録商標)」が知られている（詳しくは、http://www.m-1gp.comを参照。）。Ｍ−１グランプリは、第１回が２００１年に開催され、以降、２０１０年まで毎年１回開催され、２０１０年に開催された第１０回で、その開催が一旦中断された。 The effect is described using data from a specific manzai contest as the target data of the present embodiment. One known example of a manzai contest is the program "M-1 Grand Prix (registered trademark)", which is televised in Japan (for details, see http://www.m-1gp.com). The first M-1 Grand Prix was held in 2001; it was then held once a year until 2010, and the series was temporarily discontinued after the tenth contest in 2010.

この１０回開催された「Ｍ−１グランプリ」は、進行方法が年により一部異なるが、概ね同様のルールで実施されている。例えば、「Ｍ−１グランプリ」は、４段階（第１０回のみ５段階）の予選（１回戦、２回戦、３回戦、準々決勝（第１０回のみ実施）、準決勝）を経て、決勝戦に進出する９グループ（第１回のみ１０グループ）が選ばれる。なお、第２回以降では、決勝戦へ進出するグループの内、１グループは準決勝または準々決勝敗退組から選ばれる。 The ten "M-1 Grand Prix" contests followed broadly the same rules, although the procedure differed somewhat from year to year. For example, nine groups (ten in the first contest only) are selected to advance to the final through four preliminary stages (five in the tenth contest only): the first round, second round, third round, quarterfinals (held in the tenth contest only), and semifinals. From the second contest onward, one of the groups advancing to the final is chosen from among the groups eliminated in the semifinals or quarterfinals.

参加グループは、結成後１０年以内のグループであり、その数は、毎回数千グループに達する。例えば、第１回Ｍ−１グランプリ２００１には１、６０３グループが参加し、第１０回Ｍ−１グランプリ２０１０には４、８３５グループが参加している。この中で、決勝戦に出場できるグループの能力は高いと判断できる。 Participating groups must have been formed within the past ten years, and several thousand groups enter each contest. For example, 1,603 groups took part in the first M-1 Grand Prix 2001 and 4,835 groups in the tenth M-1 Grand Prix 2010. The groups that reach the final can therefore be regarded as highly capable.

決勝戦では、７人の審査員が各グループの口演の出来栄えを評価して、各グループの口演に対して採点する。各グループは、各審査官によって１００点を満点とする採点基準に従い採点され、その合計得点によって評価される。最終決戦への進出権を得られるのは、合計得点が上位３位（第１回のみ上位２位）までのグループである。最終決戦では、決勝戦とは異なる漫才を披露することが要求される。なお、第１回の決勝戦のみ観客の点数が加算されている。 In the final, seven judges evaluate and score each group's performance. Each group is scored by each judge on a 100-point scale and evaluated by its total score. Only the groups ranked in the top three by total score (top two in the first contest only) earn the right to advance to the final battle. In the final battle, groups are required to perform a manzai different from the one performed in the final. Audience points were added only in the final of the first contest.
Ｍ−１グランプリの決勝戦と最終決戦の順位は、審査員の評価によって決定される。審査の評価項目の存在や、各審査員の判定基準は不明である。 The rankings in the final and the final battle of the M-1 Grand Prix are determined by the judges' evaluations. Whether explicit evaluation items exist, and what criteria each judge applies, are unknown.

面白さ定量評価装置１は、例えば、「Ｍ−１グランプリ」のように複数回開催された漫才コンテストの決勝戦と最終決戦についてのデータを解析の対象にする。特徴量抽出部２５は、複数回開催された漫才コンテストとして、第１回漫才コンテスト（以下MZ-2001）から第１０回漫才コンテスト（以下MZ-2010）の１０回に渡って開催された漫才コンテストを対象としてもよい。例えば、面白さ定量評価装置１は、複数回開催された漫才コンテストの映像データ２を解析し、例えば、指標ｐ１〜ｐ６の値を抽出する。面白さ定量評価装置１は、各開催回のデータを１セットとするleave-one-out法で各開催回の順位を予測し、予測精度を計測する。 The fun quantitative evaluation device 1 analyzes data on the finals and final battles of a manzai contest that has been held multiple times, such as the "M-1 Grand Prix". The feature amount extraction unit 25 may target the ten contests from the first manzai contest (hereinafter MZ-2001) to the tenth manzai contest (hereinafter MZ-2010). For example, the fun quantitative evaluation device 1 analyzes the video data 2 of these contests and extracts the values of the indexes p1 to p6. Treating the data of each contest as one set, the fun quantitative evaluation device 1 predicts the ranking of each contest by the leave-one-out method and measures the prediction accuracy.

面白さ定量評価装置１は、第１回漫才コンテスト（以下MZ-2001）から第１０回漫才コンテスト（以下MZ-2010）に、例えば、第１回開催のＭ−１グランプリ2001（以下M1-2001）から第１０回のＭ−１グランプリ2010（以下M1-2010）のデータを適用してもよい。例えば、M1-2001からM1-2010のデータには、市販の映像データを利用する。市販の映像データとして、Ｍ−１グランプリthe FINAL PREMIUM COLLECTION 2001-2010、（よしもとアール・アンド・シー社）のＤＶＤからのデータを利用してもよい。 The fun quantitative evaluation device 1 may use, as the data for the first manzai contest (MZ-2001) through the tenth manzai contest (MZ-2010), the data of the first M-1 Grand Prix 2001 (hereinafter M1-2001) through the tenth M-1 Grand Prix 2010 (hereinafter M1-2010). For example, commercially available video data is used for M1-2001 to M1-2010; data from the DVD "M-1 Grand Prix the FINAL PREMIUM COLLECTION 2001-2010" (Yoshimoto R & C Co., Ltd.) may be used.

図６は、漫才コンテストの各開催回の決勝戦と最終決戦の出場グループ数の一例を示す図である。なお、市販の映像データ等には、著作権や表現等の関係で削除されている部分が含まれる。図７は、漫才コンテストのＤＶＤに記録された映像において削除された箇所の個数の一例を示す図である。削除された部分の台詞は必然的に解析の対象外となる。また、解析する映像データは削除時間分だけ実際の時間よりも短くなっているが、削除された映像の時間は不明である。なお、同図に示すグループのデータのうち、学習データとして使用したデータには、「学習データＩＤ」（後述の図９参照）を記載している。なお、グループ名は、仮称である。 FIG. 6 is a diagram showing an example of the number of participating groups in the finals and finals of each of the Manzai contests. It should be noted that commercially available video data and the like include parts that have been deleted due to copyright, expression, and the like. FIG. 7 is a diagram showing an example of the number of deleted parts in the video recorded on the DVD of the Manzai contest. The lines of the deleted part are inevitably excluded from the analysis. Further, the video data to be analyzed is shorter than the actual time by the deletion time, but the time of the deleted video is unknown. Of the data in the group shown in the figure, the data used as the learning data includes a "learning data ID" (see FIG. 9 described later). The group name is a tentative name.

［2.3 漫才口演の評価値の計算］ [2.3 Calculation of the evaluation value of a manzai performance]
漫才の口演の評価値を計算するためのモデルを、フィードフォワード型多層ニューラルネットワークを用いて構築する。モデル生成部２４におけるニューラルネットワーク２４１は、フィードフォワード型多層ニューラルネットワークの一例である。例えば、その詳細については、「Geoffrey E. Hinton, "Learning multiple layers of representation", Trends in Cognitive Sciences, Vol. 11,pp. 428-34, 2007.」を参照してもよい。 A model for calculating the evaluation value of a manzai performance is built using a feedforward multilayer neural network. The neural network 241 in the model generation unit 24 is an example of a feedforward multilayer neural network. For details, see, for example, Geoffrey E. Hinton, "Learning multiple layers of representation", Trends in Cognitive Sciences, Vol. 11, pp. 428-34, 2007.

図８は、本実施形態に係るニューラルネットワーク２４１の構成図である。同図に示すように、ニューラルネットワーク２４１の隠れ層数は３、各隠れ層のノード数は１０である。学習処理部２４２は、ニューラルネットワーク２４１の学習を教師付きで行い、その学習処理において、バックプロパゲーション（誤差逆伝搬法）を用いた確率的勾配降下法を使用して、評価の誤差を低減し評価の精度を高めるように特性を最適化する。例えば、ニューラルネットワーク２４１は、H2Oパッケージを用いて構成してもよく、その場合、学習パラメータはパッケージのデフォルト値を使ってもよい。H2Oパッケージについての詳細は、http://0xdata.comを参照する。 FIG. 8 is a configuration diagram of the neural network 241 according to the present embodiment. As shown in the figure, the neural network 241 has 3 hidden layers with 10 nodes each. The learning processing unit 242 trains the neural network 241 by supervised learning, using stochastic gradient descent with backpropagation to optimize the network so that the evaluation error decreases and the evaluation accuracy increases. For example, the neural network 241 may be built with the H2O package, in which case the package's default values may be used for the training parameters. For details on the H2O package, see http://0xdata.com.

なお、上記のニューラルネットワーク２４１の入力層のノード数は６であり、上記の各ノードには、漫才の口演データから抽出された６個の指標ｐ１〜ｐ６がパラメータとしてそれぞれ入力される。出力層のノード数は１であり、上記のノードは、評価結果である評価値を出力する。なお、上記のとおり各隠れ層のノード数は、入力層のノード数と出力層のノード数よりも多い。 The number of nodes in the input layer of the neural network 241 is 6, and six indexes p1 to p6 extracted from the oral data of the comic dialogue are input to each of the nodes as parameters. The number of nodes in the output layer is 1, and the above-mentioned nodes output an evaluation value which is an evaluation result. As described above, the number of nodes in each hidden layer is larger than the number of nodes in the input layer and the number of nodes in the output layer.
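The layer layout described above (6 input nodes, three hidden layers of 10 nodes, 1 output node) can be sketched in plain Python as a forward pass. This is an illustrative sketch only: the activation function (tanh here), the weight initialization range, and the absence of input scaling are assumptions not stated in the description, and the functions `init_params`, `forward`, and `count_params` are hypothetical names rather than part of any package. Training by backpropagation is not shown.

```python
import math
import random
from typing import List, Sequence, Tuple

# Layer widths taken from the description: p1..p6 in, 3 x 10 hidden, 1 out.
LAYER_SIZES = [6, 10, 10, 10, 1]

Layer = Tuple[List[List[float]], List[float]]  # (weights[out][in], biases[out])


def init_params(seed: int) -> List[Layer]:
    """Create randomly initialized weights; the range is an arbitrary choice."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params


def forward(params: List[Layer], indexes: Sequence[float]) -> float:
    """Map the six indexes p1..p6 to a single evaluation value."""
    if len(indexes) != LAYER_SIZES[0]:
        raise ValueError("expected six input indexes")
    activations = list(indexes)
    for layer_no, (weights, biases) in enumerate(params):
        z = [
            sum(w * a for w, a in zip(row, activations)) + b
            for row, b in zip(weights, biases)
        ]
        is_output = layer_no == len(params) - 1
        # Assumed: tanh on hidden layers, linear output for a regression score.
        activations = z if is_output else [math.tanh(v) for v in z]
    return activations[0]


def count_params(params: List[Layer]) -> int:
    """Total number of weights and biases in the network."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)


if __name__ == "__main__":
    net = init_params(seed=0)
    print(count_params(net))
    print(forward(net, [5.0, 2.0, 20.0, 20.0, 20.0, 0.25]))
```

With this layout the model has 301 adjustable values (weights plus biases), which correspond to the numerical information that the description says is stored as the evaluation model.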

図９は、本実施形態に係る学習データを示す図である。学習処理部２４２は、ニューラルネットワーク２４１の学習と検証を、leave-one-out交差検証法により行う。leave-one-out交差検証法は、標本群から1つの事例だけを抜き出してテストデータ（テスト事例）とし、残りを学習用データ（訓練事例）とする。学習処理部２４２は、全事例を一回ずつテストデータにして、それぞれの検証を繰り返し実施する。これを、本実施形態のニューラルネットワーク２４１の学習と検証に適用する。 FIG. 9 is a diagram showing learning data according to the present embodiment. The learning processing unit 242 learns and verifies the neural network 241 by the leave-one-out cross-validation method. In the leave-one-out cross-validation method, only one case is extracted from the sample group and used as test data (test case), and the rest is used as learning data (training case). The learning processing unit 242 converts all the cases into test data once, and repeatedly executes each verification. This is applied to the learning and verification of the neural network 241 of the present embodiment.

ニューラルネットワーク２４１の学習には、各開催回の決勝戦の代表的な事例のデータを利用する。学習処理部２４２は、図９に示すように、MZ-2001〜MZ-2010各回の決勝戦から上位２グループ、下位２グループと中位１グループを抽出し、各回分の学習データとする。従って、学習処理部２４２は、各開催回の決勝戦のデータセットの半数程度のデータを学習時に利用する。MZ-2001は１０個のうち５個、MZ-2003とMZ-2010は８個のうち５個、それら以外は９個のうち５個が、学習データとして採用される。 For the learning of the neural network 241, the data of typical cases of the finals of each held time is used. As shown in FIG. 9, the learning processing unit 242 extracts the top two groups, the bottom two groups, and the middle one group from the finals of each of the MZ-2001 to MZ-2010 times, and uses them as the learning data for each time. Therefore, the learning processing unit 242 uses about half of the data of the data set of the final match of each held time at the time of learning. Five out of ten MZ-2001, five out of eight MZ-2003 and MZ-2010, and five out of nine other than those are adopted as training data.

MZ-2001〜MZ-2010の各開催回の決勝戦データを１セットとし、テストデータは各開催回の決勝戦と最終決戦に出場の全グループとする。例えば、MZ-2001がテストデータの場合、MZ-2002〜MZ-2010の各決勝戦の上位２グループ、中位１グループと下位２グループ、１セット当り５グループ、計４５グループが学習データであり、MZ-2001の決勝戦の１０グループと最終決戦の２グループがテストデータである。なお、MZ-2003とMZ-2010の決勝戦には９グループが出場しているが、著作権の問題によりＤＶＤに収録されていないグループは解析せず、どちらも８グループの解析となる。 The final match data for each of the MZ-2001 to MZ-2010 will be set as one set, and the test data will be for all groups participating in the final and final finals of each event. For example, when MZ-2001 is the test data, the training data is the top 2 groups, the middle 1 group and the bottom 2 groups, 5 groups per set, and 45 groups in each final of MZ-2002 to MZ-2010. , 10 groups of the final match of MZ-2001 and 2 groups of the final match are test data. Nine groups have participated in the finals of MZ-2003 and MZ-2010, but due to copyright issues, groups not recorded on the DVD will not be analyzed, and both will be analyzed by eight groups.
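The fold structure described above, in which one contest is held out as test data and the selected groups of the remaining contests form the learning data, can be sketched as follows. The function name `leave_one_out_folds` and the group identifiers are hypothetical placeholders, and the placeholder test-set size of 12 entries per contest is illustrative only, since the actual number differs by contest.

```python
from typing import Dict, List, Tuple


def leave_one_out_folds(
    train_by_contest: Dict[str, List[str]],
    test_by_contest: Dict[str, List[str]],
) -> List[Tuple[str, List[str], List[str]]]:
    """Build one (held_out, training_groups, test_groups) tuple per contest.

    train_by_contest: the groups selected as learning data for each contest
    test_by_contest : all final and final-battle entries for each contest
    """
    folds = []
    for held_out in sorted(test_by_contest):
        training = [
            group
            for contest in sorted(train_by_contest)
            if contest != held_out  # never train on the held-out contest
            for group in train_by_contest[contest]
        ]
        folds.append((held_out, training, list(test_by_contest[held_out])))
    return folds


# Placeholder data: 10 contests, 5 selected training groups per contest.
contests = [f"MZ-{year}" for year in range(2001, 2011)]
train_by_contest = {c: [f"{c}/train{i}" for i in range(1, 6)] for c in contests}
test_by_contest = {c: [f"{c}/entry{i}" for i in range(1, 13)] for c in contests}

folds = leave_one_out_folds(train_by_contest, test_by_contest)
held_out, training, test = folds[0]
print(held_out, len(training), len(test))
```

With ten contests and five selected groups each, every fold has 45 training groups, matching the count given in the description for the case where MZ-2001 is the test data.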

上記の図９に示すように、学習に複数回使われるグループがある。例えば「PGB(仮称)」はX2006-05とX2007-05のどちらも順位が下位の学習データとして使われるグループである。一方、「KRN（仮称）」のように、上位(X2005-02)、中位(X2001-03)、下位(X2003-04)、全種類の学習データとして使われるグループもある。同一のグループであっても口演内容は毎回異なっており、学習データとしての重複は無い。ニューラルネットワークにおける学習の目的は、グループの同定や特徴抽出ではないことから、選択したデータは学習データとして適したものである。 As shown in FIG. 9 above, there are groups that are used multiple times for learning. For example, "PGB (tentative name)" is a group in which both X2006-05 and X2007-05 are used as learning data with lower ranks. On the other hand, there are also groups such as "KRN (tentative name)" that are used as all types of learning data, such as upper (X2005-02), middle (X2001-03), and lower (X2003-04). Even in the same group, the content of the oral presentation is different each time, and there is no duplication as learning data. Since the purpose of learning in the neural network is not group identification or feature extraction, the selected data is suitable as training data.

なお、学習処理部２４２は、学習用データ１１に含めて学習時に用いる評価値（審査結果）には、実際の決勝戦において審査員が投票した得点の合計点ではなく、規格化した値を用いる。図１０は、本実施形態に係る評価値を示す図である。このように審査官による得点を直接利用しない理由として、審査員の審査項目や項目毎の加点の値が明示されていないこと、評価基準の客観性が不明確であることなどが挙げられる。また、異なる開催回の点数を直接比較することも、開催回ごとに値に開きがあり困難であることも挙げられる。例えば、MZ-2002の決勝戦第１位のグループ(「FTB-RAW-（仮称）」)の点数は６２１点であるが、MZ-2005の決勝戦第５位のグループ(「C−TRAR（仮称）」)の点数(６２２点)よりも低い。 Note that, for the evaluation values (judging results) included in the learning data 11 and used during training, the learning processing unit 242 uses normalized values rather than the total of the scores awarded by the judges in the actual final. FIG. 10 shows the evaluation values according to the present embodiment. The judges' scores are not used directly because the judging items and the points awarded per item are not disclosed and the objectivity of the evaluation criteria is unclear. In addition, scores from different contests are difficult to compare directly because the values vary from contest to contest. For example, the first-place group in the MZ-2002 final ("FTB-RAW- (tentative name)") scored 621 points, which is lower than the 622 points of the fifth-place group in the MZ-2005 final ("C-TRAR (tentative name)").

上記などの理由により、本実施形態では、学習に用いるグループの口演の評価値は審査員が投じた得点の合計点ではなく、図１０に示す開催回毎の順位に基づいた評価値（相対値）を用いる。例えば、上位のグループには１００点、中位のグループには５０点、下位のグループには０点を与える。順位に基づく評価値であれば、異なる開催回のグループであっても、少なくとも同一順位のグループは同等に扱うことができる。その反面、ある開催回の１位のグループの口演が、別の開催回の例えば中位のグループの口演と同等の出来栄えであっても、異なる評価値が与えられる可能性がある。順位に基づく評価値、審査点に基づく評価値のどちらも学習上の欠点を有するものであるが、開催回毎のバラつきの影響を低減させることの効果を奏することを期待して、本実施形態では順位に基づく評価値を用いる。 For the above reasons, in this embodiment, the evaluation value of the oral performance of the group used for learning is not the total score of the scores cast by the judges, but the evaluation value (relative value) based on the ranking of each holding time shown in FIG. ) Is used. For example, the upper group is given 100 points, the middle group is given 50 points, and the lower group is given 0 points. As long as the evaluation value is based on the ranking, at least groups having the same ranking can be treated equally even if the groups are held at different times. On the other hand, even if the oral performance of the first group in one holding time is equivalent to the oral performance of the middle group in another holding time, different evaluation values may be given. Both the evaluation value based on the ranking and the evaluation value based on the examination points have learning defects, but the present embodiment is expected to have the effect of reducing the influence of the variation in each holding time. Then, the evaluation value based on the ranking is used.
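The selection of training groups and the rank-based evaluation values described above can be sketched as follows. The values 100, 50, and 0 come from the example in the description. Which group counts as the "middle" group is not specified in this text; the sketch assumes the group at the median position of the ranking, and the function name `select_training_examples` is hypothetical.

```python
from typing import List, Sequence, Tuple

TOP_VALUE = 100.0     # evaluation value for the top groups (from the description)
MIDDLE_VALUE = 50.0   # evaluation value for the middle group
BOTTOM_VALUE = 0.0    # evaluation value for the bottom groups


def select_training_examples(ranked_groups: Sequence[str]) -> List[Tuple[str, float]]:
    """Pick the top 2, middle 1, and bottom 2 groups of one final.

    ranked_groups: group identifiers ordered from first place to last place.
    Returns (group, rank-based evaluation value) pairs.
    """
    n = len(ranked_groups)
    if n < 5:
        raise ValueError("at least five ranked groups are required")
    # Assumption: the middle group is the one at the median position.
    middle = ranked_groups[n // 2]
    top = list(ranked_groups[:2])
    bottom = list(ranked_groups[-2:])
    return (
        [(g, TOP_VALUE) for g in top]
        + [(middle, MIDDLE_VALUE)]
        + [(g, BOTTOM_VALUE) for g in bottom]
    )


if __name__ == "__main__":
    final_ranking = [f"group{i}" for i in range(1, 10)]  # 1st to 9th place
    for group, value in select_training_examples(final_ranking):
        print(group, value)
```

Because the values depend only on rank within a contest, two groups placed first in different contests receive the same target value even if their raw judge totals differ.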

ニューラルネットワーク２４１の学習に用いる確率的勾配降下法は、初期値依存性がある。本実施形態のモデル生成部２４は、確率的勾配降下法の初期値依存性による影響を解消するために、複数個のモデルを生成する。例えば、モデル生成部２４のニューラルネットワーク２４１は、同一のデータに基づいた学習を、学習処理部２４２により生成され、値が異なる乱数の種を用いて１０回行い、特性の異なる１０個のモデルを構築する。 The stochastic gradient descent method used to train the neural network 241 depends on the initial values. To remove the effect of this dependence, the model generation unit 24 of the present embodiment generates multiple models. For example, the neural network 241 of the model generation unit 24 is trained on the same data 10 times, each time with a different random seed generated by the learning processing unit 242, yielding 10 models with different characteristics.

この１０個のモデルは、評価部２１として機能する。モデル（２１−ｋ）、（k=1,…,10）のそれぞれには、グループ毎に当該グループの評定用データ１３（テストデータ）が入力される。モデル（２１−ｋ）のそれぞれに入力される評定用データ１３は、グループ毎に同一のものである。各モデル（２１−ｋ）は、それぞれグループｇの評価値Ｐｇ−ｋをグループ毎に出力する。評価値Ｐｇ−ｋは、グループｇの評価用データ１３に基づいてモデル（２１−ｋ）により導出した予測評価点ｋを示す。 These 10 models function as the evaluation unit 21. Each model (21-k) (k = 1, ..., 10) receives, for each group, the rating data 13 (test data) of that group. The rating data 13 input to the models (21-k) is the same for a given group. Each model (21-k) outputs an evaluation value Pg-k for each group g. The evaluation value Pg-k is the k-th predicted evaluation score, derived by the model (21-k) from the rating data 13 of the group g.
平均点算出部２２は、グループ毎に、各モデル（２１−ｋ）から出力される１０個の評価値（評価値Ｐｇ−ｋ）ｙの平均値Ｐｇｙを算出する。上記の関係を式（５）に示す。なお、図に示す構成は、特定の開催回の処理を示すものであり、処理の対象にする評価用データに対応する開催回毎に用意される。 The average point calculation unit 22 calculates, for each group, the average value Pgy of the 10 evaluation values (Pg-k) output from the models (21-k) for test data y. This relationship is shown in Equation (5). The configuration shown in the figure covers the processing of one specific contest year and is prepared for each contest year corresponding to the rating data to be processed.
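Equation (5) itself is not reproduced in this text. From the description above it presumably takes the following form, where M = 10 is the number of models and P_{g-k} is the output of model (21-k) for group g on test data y. The exact notation of the original equation is an assumption.

```latex
P_{gy} = \frac{1}{M} \sum_{k=1}^{M} P_{g\text{-}k}, \qquad M = 10
```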

上記の式（５）において、ｙはMZ-2001〜MZ-2010のテストデータを指し、2001はMZ-2001、2002はMZ-2002を意味し、以下同様である。 In Equation (5) above, y refers to the test data MZ-2001 to MZ-2010; 2001 means MZ-2001, 2002 means MZ-2002, and so on.
順序判定部２３は、平均値Ｐｇｙに基づいて、それぞれのテストデータの予測順位Ｒ_ｙ（y=2001,…,2010）を決定する。順序判定部２３は、予測順位Ｒ_ｙを、「グループｇの平均点算出部」が出力する評価値の平均値Ｐｇｙの大きい順に付与する。 The order determination unit 23 determines the predicted rank R_y (y = 2001, ..., 2010) for each test data set based on the average value Pgy. The order determination unit 23 assigns the predicted ranks R_y in descending order of the average evaluation value Pgy output by the average point calculation unit for each group g.
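The two steps above, averaging the model outputs per group and then ranking the groups by that average, can be sketched as follows. The function name, the group labels, and the numbers are made up for illustration, and only three model outputs per group are shown instead of ten.

```python
from statistics import mean


def predicted_ranking(model_outputs: dict[str, list[float]]) -> dict[str, int]:
    """Average each group's model outputs, then rank groups in descending order.

    model_outputs maps a group label to the evaluation values produced
    for that group by each model. Returns a 1-based rank per group.
    """
    averages = {group: mean(values) for group, values in model_outputs.items()}
    ordered = sorted(averages, key=averages.get, reverse=True)
    return {group: position + 1 for position, group in enumerate(ordered)}


# Hypothetical outputs of three models for three groups.
outputs = {
    "A": [60.0, 70.0, 65.0],
    "B": [80.0, 75.0, 85.0],
    "C": [40.0, 55.0, 45.0],
}
ranking = predicted_ranking(outputs)
```

Here group B has the highest average and is ranked first. Ties between averages are not handled specially in this sketch.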

予測精度算出部２６は、予測順位Ｒ_ｙと実際の順位ＲＲ_ｙ（y=2001,…,2010）を比較し、順位相関係数を計算する。予測精度算出部２６は、この順位相関係数をテストデータの予測精度Ｓｙ（y=2001,…,2010）とする。 The prediction accuracy calculation unit 26 compares the predicted rank R _y with the actual rank RR _y (y = 2001, ..., 2010) and calculates the rank correlation coefficient. The prediction accuracy calculation unit 26 sets this rank correlation coefficient as the prediction accuracy Sy (y = 2001, ..., 2010) of the test data.
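As a worked illustration of the rank correlation used as the prediction accuracy S_y, the sketch below computes Spearman's coefficient with the standard formula for rankings without ties. The function name is hypothetical, and the no-ties restriction is an assumption made to keep the example short.

```python
def spearman_rho(predicted: list[int], actual: list[int]) -> float:
    """Spearman's rank correlation for two rankings without ties.

    Uses rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the
    difference between the predicted and actual rank of each group.
    """
    n = len(predicted)
    if n != len(actual) or n < 2:
        raise ValueError("rankings must have the same length (at least 2)")
    d_squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Two adjacent groups swapped in a five-group ranking.
rho = spearman_rho([1, 2, 3, 4, 5], [2, 1, 3, 4, 5])
```

A perfect prediction gives 1 and a fully reversed ranking gives -1, so values near 0 correspond to the uncorrelated cases discussed in the results.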

MZ-2001〜MZ-2010の全テストデータに対する予測精度Ｓを、式（６）に示す。順序判定部２３は、各テストデータに対するモデルの順位予測精度Ｓｙを平均し、MZ-2001〜MZ-2010の全テストデータに対する予測精度Ｓとする。 The prediction accuracy S for all the test data of MZ-2001 to MZ-2010 is shown in Equation (6). The order determination unit 23 averages the rank prediction accuracy Sy of the model for each test data, and sets the prediction accuracy S for all the test data of MZ-2001 to MZ-2010.
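Equation (6) is likewise not reproduced in this text. Based on the sentence above, which describes S as the average of the per-test-data accuracies S_y, it presumably has the following form. The notation is an assumption.

```latex
S = \frac{1}{10} \sum_{y=2001}^{2010} S_y
```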

面白さ定量評価装置１は、その検証に用いる評定用データ（テストデータ）を、MZ-2001〜MZ-2010の各回に対応する１０セット（Ｙセット）のデータセット毎に２種類用意する。１つが決勝戦のデータであり、面白さ定量評価装置１は、そのデータに対する予測精度をスピアマンの順位相関係数で評価する。もう１つが最終決戦のデータであり、面白さ定量評価装置１は、そのデータに対する予測精度を１位のグループ(開催回の優勝グループ)を予測できたか否かをもって評価する。 The fun quantitative evaluation device 1 prepares two types of evaluation data (test data) used for the verification for each of 10 sets (Y sets) of data sets corresponding to each time of MZ-2001 to MZ-2010. One is the data of the final match, and the fun quantitative evaluation device 1 evaluates the prediction accuracy for the data by Spearman's rank correlation coefficient. The other is the data of the final battle, and the fun quantitative evaluation device 1 evaluates the prediction accuracy for the data based on whether or not the first-ranked group (the winning group of the holding times) can be predicted.

図１１は、面白さ定量評価装置１の検証処理を示すフローチャートである。例えば、テストデータをMZ-2001とする場合を例示して、具体的に説明する。 FIG. 11 is a flowchart showing a verification process of the fun quantitative evaluation device 1. For example, a case where the test data is MZ-2001 will be illustrated and described in detail.

まず、学習処理部２４２は、図９に記載のMZ-2002〜MZ-2010の計４５個の漫才口演データを学習用データ１１として、学習を１０回（Ｍ回）実行し（Ｓ１０）、１０個（Ｍ個）のモデルを生成する（Ｓ１５）。その結果、１０個（Ｍ個）の独立したモデルが生成される。 First, the learning processing unit 242 executes learning 10 times (M times) using, as learning data 11, the 45 manzai oral performance data sets of MZ-2002 to MZ-2010 shown in FIG. 9 (S10), and generates 10 (M) models (S15). As a result, 10 (M) independent models are generated.
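The data split used in this verification (one contest year held out as test data, the remaining years used for learning, M models per split) can be sketched as follows. The names `EVENTS`, `SEEDS`, and `leave_one_out` are illustrative assumptions, and the training itself is omitted.

```python
from typing import Iterator

EVENTS = [f"MZ-{year}" for year in range(2001, 2011)]  # MZ-2001 .. MZ-2010
SEEDS = list(range(10))  # hypothetical: one random seed per model (M = 10)


def leave_one_out(events: list[str]) -> Iterator[tuple[str, list[str]]]:
    """Yield (test_event, training_events), holding each event out once."""
    for held_out in events:
        yield held_out, [e for e in events if e != held_out]


splits = list(leave_one_out(EVENTS))
test_event, train_events = splits[0]  # MZ-2001 held out, as in the example above
```

For each split, one model would be trained per seed on `train_events` and then applied to `test_event`.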

次に、MZ-2001の決勝戦に出場した１０組（Ｇ組）のグループのデータ(例えば、MZ-2001決勝戦)が評価部２１の各モデルに入力され、評価部２１は、グループ毎の評価値を、モデルごとに導出する（S２０）。 Next, the data of the 10 groups (G groups) that took part in the MZ-2001 final match (for example, the MZ-2001 final match) is input to each model of the evaluation unit 21, and the evaluation unit 21 derives an evaluation value for each group with each model (S20).
面白さ定量評価装置１は、同一グループにおける全モデルの処理を終えた後、平均点算出部２２は、同一グループにおける各モデルにより導出された評価値の平均値を導出する（Ｓ２５）。 After the fun quantitative evaluation device 1 has processed all the models for a given group, the average point calculation unit 22 derives the average of the evaluation values derived by the models for that group (S25).
面白さ定量評価装置１は、全グループの処理が終了したか否かを判定し（Ｓ３０）、当該グループの全モデルの処理を終えるまで、Ｓ２０からの処理を繰り返す。 The fun quantitative evaluation device 1 determines whether the processing of all the groups has finished (S30), and repeats the processing from S20 until all the models have been processed for the group.
全グループの処理を終えた後、順序判定部２３は、各グループの評価値の平均点に基づいて順位Ｒｙｋを導出し（Ｓ３５）、１０組（Ｇ組）の順位を導出する。 After all the groups have been processed, the order determination unit 23 derives the rank Ryk based on the average evaluation value of each group (S35), which gives the ranking of the 10 groups (G groups).

次に、予測精度算出部２６は、Ｓ３５において導出した順位と実際の順位との順位相関係数(スピアマン)を導出する（Ｓ４０）。予測精度算出部２６は、導出した順位相関係数を、MZ-2001決勝戦の予測精度とする。 Next, the prediction accuracy calculation unit 26 derives the rank correlation coefficient (Spearman) between the rank derived in S35 and the actual rank (S40). The prediction accuracy calculation unit 26 uses the derived rank correlation coefficient as the prediction accuracy of the MZ-2001 final match.

次に、MZ-2001の最終決戦に出場した２組（Ｄ組、Ｄ＜Ｇ）グループのデータ(MZ-2001最終決戦)が、決勝戦の予測に用いた１０個（Ｍ個）のモデル（評価部２１）に入力され、評価部２１は、グループ毎の評価値を１０個（Ｍ個）導出し（Ｓ４５）、平均点算出部２２は、同一グループにおける各モデルにより導出された評価値の平均値を導出する（Ｓ５０）。次に、順序判定部２３は、グループ毎の評価値の平均値に基づいて、予測１位のグループを１０個（Ｙ個）選択する（Ｓ５５）。 Next, the data of the two groups (D groups, D < G) that took part in the MZ-2001 final battle (MZ-2001 final battle) is input to the 10 (M) models (evaluation unit 21) that were used to predict the final match. The evaluation unit 21 derives 10 (M) evaluation values for each group (S45), and the average point calculation unit 22 derives the average of the evaluation values derived by the models for the same group (S50). Next, the order determination unit 23 selects 10 (Y) predicted first-place groups based on the average evaluation value of each group (S55).

面白さ定量評価装置１は、全モデルの処理を終えるまで、Ｓ４５からの処理を繰り返す（Ｓ６０）。全モデルの処理を終えた後、予測精度算出部２６は、Ｓ５０において選択した予測１位のグループが、実際に１位を獲得したグループである割合を導出し（Ｓ６５）、その結果を最終決戦の予測精度とする。面白さ定量評価装置１は、上記の処理を、評価の対象にした開催回毎に実行する。 The fun quantitative evaluation device 1 repeats the processing from S45 until all the models have been processed (S60). After all the models have been processed, the prediction accuracy calculation unit 26 derives the proportion of cases in which the predicted first-place group selected in S50 is the group that actually took first place (S65), and uses the result as the prediction accuracy for the final battle. The fun quantitative evaluation device 1 executes the above processing for each contest year under evaluation.
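One possible reading of the final-battle accuracy in S65 is the fraction of models whose top-scoring group is the actual winner. The sketch below follows that reading. The function name, group labels, and scores are made up, and the interpretation itself is an assumption, since the flowchart is not reproduced here.

```python
def winner_accuracy(model_scores: list[dict[str, float]], actual_winner: str) -> float:
    """Fraction of models whose highest-scoring group is the actual winner.

    model_scores holds one dict per model, mapping group label to the
    evaluation value that model produced for the group.
    """
    if not model_scores:
        raise ValueError("at least one model is required")
    hits = sum(1 for scores in model_scores if max(scores, key=scores.get) == actual_winner)
    return hits / len(model_scores)


# Hypothetical outputs of four models for a three-group final battle.
final_battle = [
    {"A": 90.0, "B": 70.0, "C": 60.0},
    {"A": 85.0, "B": 80.0, "C": 55.0},
    {"A": 60.0, "B": 75.0, "C": 50.0},
    {"A": 95.0, "B": 65.0, "C": 70.0},
]
accuracy = winner_accuracy(final_battle, actual_winner="A")
```

Three of the four hypothetical models pick group A, so the accuracy in this example is 0.75.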

なお、解析対象の漫才コンテストは、開催回により審査方法が異なることがあり、又、実際の順位が不明な回がある。このような場合、統一可能範囲で審査方法を調整してもよい。例えば、一例に示すM1-2001では、決勝戦でのみ、札幌、大阪、福岡の吉本興業の劇場にいた観客１００人ずつによる合計３００点の観客点が審査員の点数に加算され、順位が決定されている。M1-2002以降では審査員による点数のみに変更されたことから、順位決定の評価基準を統一するために、M1-2001の決勝戦の解析には、審査員の点数のみによる順位を用いる。そのため、M1-2001の決勝戦は５グループの順位が実際の順位と異なる。一方、最終決戦の順位は、審査員のみによるため、実際の順位を用いた。なお、審査員の点数のみによって決定される順位を用いると、対象とするグループのデータが存在しないことがある。対象とするグループのデータが存在しない場合には、対象とするグループの順位に近接する順位のグループのデータを利用してもよい。例えば、上記のように順位を調整する場合に、最終決戦に参加しなかったグループを利用としても、最終決戦の口演データが存在しないグループは、解析の対象外にする。 Note that the judging method of the manzai contest under analysis may differ from year to year, and in some years the actual ranking is unknown. In such cases, the judging method may be adjusted as far as it can be unified. For example, in M1-2001, in the final match only, audience points totaling 300 points, from 100 audience members at each of the Yoshimoto Kogyo theaters in Sapporo, Osaka, and Fukuoka, were added to the judges' points to decide the ranking. From M1-2002 onward the ranking was decided by the judges' points alone, so to unify the ranking criteria, the analysis of the M1-2001 final match uses the ranking given by the judges' points alone. As a result, the ranks of five groups in the M1-2001 final match differ from the actual ranks. The ranking of the final battle, on the other hand, depended on the judges alone, so the actual ranking was used. When the ranking determined by the judges' points alone is used, data for a target group may not exist. In that case, the data of a group whose rank is close to that of the target group may be used. For example, when the ranking is adjusted as described above, a group that did not take part in the final battle might be called for, but a group with no final-battle oral performance data is excluded from the analysis.

また、審査員による評価点が同点になる場合には、これら２グループの順位を、審査員毎の最高点の数に基づき決定した順位を解析に用いてもよい。 When two groups receive the same score from the judges, a ranking of the two groups decided by the number of highest scores received, counted per judge, may be used in the analysis.

また、最終決戦では、実際の順位が不明確な場合がある。決勝戦には９グループ（M1-2001のみ１０グループ）が参加し、審査員はグループ毎に１００点満点の点数を付ける。全審査員の点数の合計がグループの得点となり、上位３グループが最終決戦へ進む。最終決戦では、各審査員が１つのグループに票を入れ、票数が最多のグループが優勝する。このため、最終決戦では、優勝グループ以外の２グループの票数がゼロを含めて同数の場合があり、２位と３位が定まらないことがある。例えば、M1-2009の最終決戦では全審査員が同じグループに投票しているため、１位は明らかだが、２位と３位のグループは不明である。そのため、最終決戦については、１位のグループを予測できたかで提案手法の精度を測る。 Also, in the final battle, the actual ranking may be unclear. Nine groups (10 groups only for M1-2001) will participate in the final, and the judges will give each group a maximum of 100 points. The total score of all judges will be the score of the group, and the top three groups will advance to the final battle. In the final battle, each judge will vote in one group, and the group with the most votes will win. Therefore, in the final decisive battle, the number of votes of the two groups other than the winning group may be the same including zero, and the second and third places may not be decided. For example, in the final decisive battle of M1-2009, all judges voted for the same group, so the 1st place is clear, but the 2nd and 3rd place groups are unknown. Therefore, for the final battle, the accuracy of the proposed method will be measured based on whether the first group can be predicted.

［３．結果］ [3. Results]
［3.1 漫才コンテストの順位の予測精度］ [3.1 Prediction accuracy of Manzai contest ranking]
図１２は、MZ-2001〜MZ-2010の決勝戦の予測精度を示す図である。同図は、テストデータMZ-2001〜MZ-2010の決勝戦の予測順位の順位相関係数の一例を示す。この順位相関係数は、同一テストデータを入力した１０個のモデルによりそれぞれ導出されたものである。順位相関係数に基づいて予測精度を判定できる。なお、最下段の平均は、全テストデータの値の平均値である。なお、具体的な数値は、テストデータMZ-2001〜MZ-2010として、テストデータM1-2001〜M1-2010を利用した場合を例示する。その場合の出場グループ数は図６を参照する。 FIG. 12 is a diagram showing the prediction accuracy for the final matches of MZ-2001 to MZ-2010. The figure shows an example of the rank correlation coefficients of the predicted rankings for the final matches of the test data MZ-2001 to MZ-2010. These rank correlation coefficients were derived by the 10 models, each given the same test data. The prediction accuracy can be judged from the rank correlation coefficient. The average in the bottom row is the average over all the test data. The specific values illustrate the case where the test data M1-2001 to M1-2010 are used as the test data MZ-2001 to MZ-2010. See FIG. 6 for the number of participating groups in that case.

M1-2001〜M1-2010の全テストデータに基づいた平均予測精度は、決勝戦の順位相関係数が０．５８であった。上記の決勝戦の順位相関係数の値（０．５８）は、決勝戦の予測順位が実際の順位と相関があることを示している。決勝戦の精度はスピアマンの順位相関係数であり、７個のテストデータについて、相関ありの結果が得られた。 The average prediction accuracy over all the test data of M1-2001 to M1-2010 was a rank correlation coefficient of 0.58 for the final match. This value (0.58) indicates that the predicted ranking of the final match correlates with the actual ranking. The accuracy for the final match is Spearman's rank correlation coefficient, and a correlation was found for seven of the test data sets.
同図に示すように、強い相関が６個のテストデータ（MZ-2001、MZ-2002、MZ-2006、MZ-2007、MZ-2008、MZ-2009）、弱い相関が１個のテストデータ（MZ-2005）となっている。残りの３個のテストデータ（MZ-2003、MZ-2004、MZ-2010）については、無相関となっており、その中でもMZ-2004の予測精度（０．１７）が最も低い。 As the figure shows, six test data sets show a strong correlation (MZ-2001, MZ-2002, MZ-2006, MZ-2007, MZ-2008, MZ-2009) and one shows a weak correlation (MZ-2005). The remaining three (MZ-2003, MZ-2004, MZ-2010) are uncorrelated, and among them MZ-2004 has the lowest prediction accuracy (0.17).

図１３は、MZ-2001〜MZ-2010の最終決戦の予測精度を示す図である。同図は、テストデータMZ-2001〜MZ-2010の最終決戦の１位の予測精度の一例を示す。この予測精度は、同一テストデータを入力した１０個のモデルによりそれぞれ導出された予測精度の平均値である。「高精度」の欄は、比較例として１位をランダムに予測した場合よりも精度が高い結果が得られた回を示し、高精度であると判定された場合に「ＧＯＯＤ」を記載している。最下段の平均は、全テストデータの値の平均値である。なお、具体的な数値は、テストデータMZ-2001〜MZ-2010として、テストデータM1-2001〜M1-2010を利用した場合を例示する。その場合の出場グループ数は図６を参照する。 FIG. 13 is a diagram showing the prediction accuracy for the final battles of MZ-2001 to MZ-2010. The figure shows an example of the first-place prediction accuracy for the final battles of the test data MZ-2001 to MZ-2010. This prediction accuracy is the average of the prediction accuracies derived by the 10 models, each given the same test data. The "High accuracy" column marks the contest years in which the result was more accurate than the comparative example of predicting first place at random; "GOOD" is entered when the result is judged to be highly accurate. The average in the bottom row is the average over all the test data. The specific values illustrate the case where the test data M1-2001 to M1-2010 are used as the test data MZ-2001 to MZ-2010. See FIG. 6 for the number of participating groups in that case.

なお、テストデータMZ-2001〜MZ-2010として、M1-2001〜M1-2010の全テストデータに利用した場合の平均予測精度は、最終決戦の１位予測精度が０．６８であった。最終決戦の平均予測精度（０．６８）は、比較例として１位をランダムに予測した場合の予測精度（０．３５）の約２倍であることから、１位のグループを予測可能なことを示している。なお、比較例の１位をランダムに予測した場合の予測精度とは、M1-2001が０．５、他が０．３３であり、M1-2001〜M1-2010の平均値は０．３５（=（0.5 + 9x0.33）／10）である。 When the test data MZ-2001 to MZ-2010 were used for all the test data of M1-2001 to M1-2010, the average prediction accuracy was 0.68 for the first place prediction accuracy in the final battle. Since the average prediction accuracy (0.68) of the final battle is about twice the prediction accuracy (0.35) when the first place is randomly predicted as a comparative example, the first place group can be predicted. Is shown. The prediction accuracy when the first place of the comparative example is randomly predicted is 0.5 for M1-2001 and 0.33 for the others, and the average value of M1-2001 to M1-2010 is 0.35 ( = (0.5 + 9x0.33) / 10).
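The random-guess baseline quoted above can be checked with a few lines of arithmetic. The sketch assumes, as the text indicates, two groups in the M1-2001 final battle and three groups in each of the other nine years; the variable names are illustrative.

```python
# Number of groups in the final battle per contest year
# (two in 2001, three in 2002-2010, following the description).
finalists = {2001: 2, **{year: 3 for year in range(2002, 2011)}}

# Chance of picking the winner at random in each year.
chance = {year: 1 / n for year, n in finalists.items()}

# Average chance level over the ten contest years.
mean_chance = sum(chance.values()) / len(chance)

# Ratio of the reported average accuracy (0.68) to the chance level.
ratio = 0.68 / mean_chance
```

The average chance level comes out at 0.35, and 0.68 is a little under twice that value, which matches the "about twice" statement in the text.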

MZ-2001〜MZ-2010にそれぞれ対応するM1-2001〜M1-2010の１０個のテストデータのうち、８個のテストデータについて、比較例として例示したランダムな予測精度の値よりも高い値となっている。さらには、４個のテストデータ（M1-2004、M1-2005、M1-2006、M1-2010）では、１００%の確率で１位を予測している。なお、２個のテストデータ（M1-2003とM1-2007）で若干低い精度となっている。一方、２個のテストデータ（M1-2008とM1-2009）で１位の予測に失敗しており、どちらも比較例として例示したランダムに予測した場合の予測精度０．３３（=１/３)よりも低い値である。 Of the 10 test data of M1-2001 to M1-2010 corresponding to MZ-2001 to MZ-2010, 8 test data are higher than the random prediction accuracy value illustrated as a comparative example. It has become. Furthermore, in the four test data (M1-2004, M1-2005, M1-2006, M1-2010), the first place is predicted with a 100% probability. The accuracy of the two test data (M1-2003 and M1-2007) is slightly lower. On the other hand, the prediction of the first place failed in the two test data (M1-2008 and M1-2009), and the prediction accuracy in the case of random prediction illustrated as a comparative example is 0.33 (= 1/3). ) Is lower than.

面白さ定量評価装置１の有効性をＭ−１グランプリのデータを用いて検証した。Ｍ−１グランプリの審査員による漫才評価の基準は不明である。面白さ定量評価装置１の評価の基準は、上記の審査員による漫才評価の基準に整合させたものではない。 The effectiveness of the fun quantitative evaluation device 1 was verified using the data of the M-1 Grand Prix. The criteria for manzai evaluation by the judges of the M-1 Grand Prix are unknown. The evaluation criteria of the fun quantitative evaluation device 1 are not consistent with the above-mentioned criteria for manzai evaluation by the judges.

面白さ定量評価装置１は、独自に定めた評価基準に従って、MZ-2001〜MZ-2010にそれぞれ対応するM1-2001〜M1-2010の１０個のテストデータを評価した結果、審査員による漫才評価の結果と相関性がある評価結果を導出した。 Evaluating the 10 test data sets M1-2001 to M1-2010, which correspond to MZ-2001 to MZ-2010, according to its own independently defined criteria, the fun quantitative evaluation device 1 derived evaluation results that correlate with the judges' manzai evaluations.

［3.2 指標ｐ１〜ｐ６］ [3.2 Indexes p1 to p6]
提案手法は、漫才口演の評価にｐ１〜ｐ６の6個の指標を用いる。それぞれの指標の予測精度に対する寄与率を調べた。順位予測にはニューラルネットワークを用いているため個々の指標の寄与率を正確に把握することは困難であるが、個々の指標を除いて予測した場合の精度の変化を調べることで、指標毎の影響の度合いを推測する。 The proposed method uses six indexes, p1 to p6, to evaluate a manzai oral performance. The contribution of each index to the prediction accuracy was investigated. Because a neural network is used for the rank prediction, it is difficult to determine the contribution of each index exactly, but the degree of influence of each index is estimated by examining how the accuracy changes when that index is excluded from the prediction.

図１４は、６個の指標ｐ１〜ｐ６の予測精度に対する影響度を示す図である。同図には、６個の指標ｐ１〜ｐ６から、そのうちの１つの指標を除いた場合のM1-2001〜M1-2010の決勝戦の順位予測精度の平均が示されている。同図は、ｐ１〜ｐ６の全ての指標を用いた場合と同様、leave-one-out法で各回１０回の順位相関係数の平均を示す。「ｐ１〜ｐ６を用いた場合との差」は、前述の図１２に記載のｐ１〜ｐ６全ての指標を用いた場合の決勝戦の平均予測精度の値(０．５８)との差である。 FIG. 14 is a diagram showing the degree of influence of the six indexes p1 to p6 on the prediction accuracy. The figure shows the average rank prediction accuracy for the final matches of M1-2001 to M1-2010 when one of the six indexes p1 to p6 is removed. As in the case where all the indexes p1 to p6 are used, the figure shows the average of the rank correlation coefficients over 10 runs per contest year, obtained by the leave-one-out method. The "difference from the case of using p1 to p6" is the difference from the average prediction accuracy for the final match (0.58) obtained when all the indexes p1 to p6 are used, as shown in FIG. 12.
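The index-removal analysis behind FIG. 14 can be sketched as a simple ablation loop. The `evaluate` callback stands in for the whole train-and-score procedure, which is not shown; it, the function name `ablation_deltas`, and the toy scoring rule at the end are all assumptions for illustration and do not reproduce any reported value.

```python
from typing import Callable, Sequence

INDEXES = ("p1", "p2", "p3", "p4", "p5", "p6")


def ablation_deltas(evaluate: Callable[[Sequence[str]], float]) -> dict[str, float]:
    """Accuracy change when each index is removed, relative to using all six.

    evaluate(indexes) must return the average prediction accuracy obtained
    when only the given indexes are used (hypothetical callback).
    """
    baseline = evaluate(INDEXES)
    deltas = {}
    for removed in INDEXES:
        remaining = tuple(name for name in INDEXES if name != removed)
        deltas[removed] = evaluate(remaining) - baseline
    return deltas


# Toy stand-in for the real procedure: accuracy grows with the number of indexes.
toy_deltas = ablation_deltas(lambda used: 0.1 * len(used))
```

A negative delta means that removing the index lowered the accuracy, so the most negative entry corresponds to the most influential index.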

同図が示すように、６個の指標ｐ１〜ｐ６のうち何れの指標を取り除いても全指標を用いた場合よりも順位予測精度が低下する。予測精度が低下した値の大きさから、指標の中では指標ｐ４（ネタ要素の平均時間間隔）が最も影響が大きいものであることが分かる。以下、指標ｐ１（１台詞に割当て可能な時間）、指標ｐ２（１秒の発話に割当て可能なモーラ数）と指標ｐ６（台詞無しのネタ要素の割合）、そして指標ｐ５（最後のネタ要素から漫才終了までの時間）と続く。指標ｐ３（最も長い台詞のモーラ数）の影響が最も低い。指標ｐ４は、ネタ要素の時間当りの密度を計測して得られたものであることから、ネタ要素の数は口演の評価に大きな影響を与えると言える。指標ｐ１は、演者のやりとりのテンポに関する指標であり、「やりとりの小気味好さ」とネタ要素の間隔は同等の重要性を持つことが判る。他の指標も重要ではあるが，より低い影響を持つ。同図から、発話の速さ（指標ｐ２）は、演者同士のやりとりの速さ（指標ｐ１）よりも重要性が低いことが判る。なお、台詞無しのネタ要素の割合（指標ｐ６）が発話の速さ（指標ｐ２）と同等の影響を持ち、さらには指標ｐ５（最後のネタ要素から漫才終了までの時間）よりも大きな影響を持つことが、特徴として挙げられる。 As shown in the figure, even if any of the six indexes p1 to p6 is removed, the ranking prediction accuracy is lower than when all the indexes are used. From the size of the value whose prediction accuracy has decreased, it can be seen that the index p4 (average time interval of the material elements) has the greatest influence among the indexes. Below, index p1 (time that can be assigned to one line), index p2 (number of mora that can be assigned to one second utterance), index p6 (ratio of material elements without dialogue), and index p5 (from the last material element). Time until the end of Manzai) and so on. The influence of the index p3 (the number of mora of the longest line) is the lowest. Since the index p4 is obtained by measuring the density of the material elements per hour, it can be said that the number of material elements has a great influence on the evaluation of the oral performance. The index p1 is an index related to the tempo of the performer's interaction, and it can be seen that the "feeling of interaction" and the interval between the material elements have the same importance. Other indicators are important, but have a lower impact. From the figure, it can be seen that the speed of utterance (index p2) is less important than the speed of interaction between performers (index p1). 
A notable feature is that the ratio of material elements without dialogue (index p6) has an effect comparable to that of the speed of utterance (index p2), and a greater effect than index p5 (the time from the last material element to the end of the manzai).

上記の通り、６個の指標の中で、ネタ要素の時間間隔（ｐ４）の寄与率が最も高いと推測される。ここで、笑いの数は、台本に含まれるネタ要素の数と等価であると仮定する。つまり、漫才の面白さは笑いの数と関連しており、観客に生じる笑いの数が多い程、面白い漫才であるとの仮説が考えられる。これを検証するために、時間当りの笑いの数（ｐ４の逆数）の値に基づく順位と、実際の順位とを、順位相関係数に基づいてその関係を示す。 As described above, the time interval between material elements (p4) is estimated to contribute the most among the six indexes. Here, the number of laughs is assumed to be equivalent to the number of material elements in the script. In other words, one can hypothesize that the fun of a manzai is related to the number of laughs, and that the more laughs it draws from the audience, the funnier the manzai is. To test this, the relationship between the ranking based on the number of laughs per unit time (the reciprocal of p4) and the actual ranking is shown using the rank correlation coefficient.

前述の図１２には、M1-2001〜M1-2010の決勝戦における時間当りの笑いの数(笑い密度：1/ｐ４)に基づく順位と実際の順位の順位相関係数と、提案手法に基づく順位と実際の順位の順位相関係数が示されている。同図に示されるように、M1-2001〜M1-2010における決勝戦の順位相関係数の平均値は０．３４と、無相関の値であり、提案手法による予測の順位相関係数の平均値（０．５８）より低い。 FIG. 12 above shows, for the final matches of M1-2001 to M1-2010, the rank correlation coefficient between the ranking based on the number of laughs per unit time (laugh density: 1/p4) and the actual ranking, together with the rank correlation coefficient between the ranking given by the proposed method and the actual ranking. As the figure shows, the average rank correlation coefficient for the laugh-density ranking over M1-2001 to M1-2010 is 0.34, a value treated as uncorrelated, and is lower than the average rank correlation coefficient of the prediction by the proposed method (0.58).

さらには、M1-2010の場合は（−０．２４）であり、負の値かつ無相関であった。これらは、笑いの数が多い程、面白いとの仮説に反する。これらのことから、漫才の面白さの評価には、笑いの数および密度は単体では指標として使えないものであることが分かる。さらには、漫才の台本を作成する際に、単純にネタ要素を多く入れるだけでは不十分であることが上記の結果から分かる。 Furthermore, the value for M1-2010 was -0.24, which is negative and uncorrelated. These results contradict the hypothesis that more laughs mean a funnier performance. They show that the number and density of laughs cannot serve on their own as an index for evaluating the fun of a manzai. They also show that, when writing a manzai script, simply adding more material elements is not enough.

上記の実施形態によれば、演目中の演者が発した音声に基づいて導出された演者の話し方の傾向を示す第１解析結果と、演目を視聴する人の笑い声に基づいて導出された演者が演じた内容の傾向を示す第２解析結果と、に基づいて、演者が演じる演目の面白さを評価することにより、口演の面白さを定量的に評価することができる。 According to the above embodiment, the first analysis result showing the tendency of the performer's speaking style derived based on the voice emitted by the performer during the performance and the performer derived based on the laughter of the viewer of the performance are derived. By evaluating the fun of the performance performed by the performer based on the second analysis result showing the tendency of the performance, the fun of the oral performance can be quantitatively evaluated.

（第２の実施形態） (Second Embodiment)
第２の実施形態について説明する。前述の第１の実施形態では、M1-2001からM1-2010の各開催回のデータ（履歴データ）に基づく面白さ定量評価装置１の評価結果を、各開催回の審査結果と対比して、その有効性を確認した。これに代えて、第２の実施形態では、漫才コンテストの審査結果を予測する一例について説明する。例えば、Ｍ−１グランプリは、前述の１０回が開催されてから４年間の未開催期間を経て、第１１回のＭ−１グランプリ２０１５（M1-2015という。）が開催された。本実施形態の手法をM1-2015に適用し、その決勝が実施される前に、決勝の順位と最終決戦の１位のグループを予測する事例について説明する。 A second embodiment will be described. In the first embodiment described above, the evaluation results of the fun quantitative evaluation device 1 based on the data (history data) of each contest year from M1-2001 to M1-2010 were compared with the judging results of each year to confirm the effectiveness of the device. The second embodiment instead describes an example of predicting the judging results of a manzai contest. For example, after the 10 contests mentioned above, the M-1 Grand Prix was not held for four years, and then the 11th contest, the M-1 Grand Prix 2015 (referred to as M1-2015), was held. The following describes a case in which the method of this embodiment is applied to M1-2015 to predict, before the final is held, the ranking in the final and the first-place group in the final battle.

図１５は、本実施形態の面白さ定量評価装置１の構成を示す図である。図３との相違点を中心に説明する。 FIG. 15 is a diagram showing the configuration of the fun quantitative evaluation device 1 of the present embodiment. The description focuses on the differences from FIG. 3.
面白さ定量評価装置１における記憶部１０は、学習用データ１１として、M1-2001からM1-2010の１０回分のデータ（履歴データ）を格納し、評定用データ１３として、M1-2015における予選会と準々決勝のデータを格納する。 The storage unit 10 of the fun quantitative evaluation device 1 stores the data (history data) of the 10 contests from M1-2001 to M1-2010 as learning data 11, and stores the data of the preliminary rounds and quarterfinals of M1-2015 as rating data 13.
評価部２１は、評定用データ１３であるM1-2015における予選会と準々決勝のデータに基づいて、対象のグループの口演の面白さを評価し、予測精度算出部２６が決勝戦以降の順位を予測する。 The evaluation unit 21 evaluates the fun of the target groups' oral performances based on the rating data 13, that is, the data of the preliminary rounds and quarterfinals of M1-2015, and the prediction accuracy calculation unit 26 predicts the ranking from the final match onward.

図１６は、漫才コンテストの一例であるM1-2015の進行を示す図である。M1-2015は、１回戦、２回戦、３回戦の３回の予選会が行われた後、準々決勝、準決勝が行われ、準決勝において８組のグループが決勝出場者として選抜された。M1-2015では、面白い口演をした演者を複数の回に分けて段階的に選抜する。M1-2015の参加者（演者）は、所定の回ごとに演目を演じ、演じた演目の面白さの程度に基づいて選択される。８組のグループの選抜の結果は、２０１５年１１月１９日に公開された。 FIG. 16 is a diagram showing the progress of M1-2015, which is an example of a comic contest. In M1-2015, after three qualifying rounds of the first, second and third rounds were held, the quarterfinals and semifinals were held, and eight groups were selected as finalists in the semifinals. In M1-2015, performers who made interesting oral presentations will be selected in stages by dividing them into multiple times. Participants (performers) of M1-2015 will perform the performance at each predetermined time and will be selected based on the degree of fun of the performance. The results of the selection of the eight groups were released on November 19, 2015.

さらに、M1-2015の決勝出場者は、上記で選抜された８組のグループの他に、準決勝までに敗者となったグループのうちから１組のグループが選抜され、追加される。なお、その結果は決勝の当日（２０１５年１２月６日）に公開される。 In addition to the eight groups selected above, one group will be selected and added to the M1-2015 finalists from the groups that lost by the semi-finals. The results will be released on the day of the final (December 6, 2015).

M1-2015の決勝は、上記の計９組のグループ（決勝出場者）で争われ、勝ち進んだ３組のグループが、最終決戦に参加するグループとして選抜される。最終決戦では、選抜された３組のグループのうちから１位のグループ（優勝者）が選ばれる。なお、決勝と最終決戦は、同日に行われる。 The final of M1-2015 will be contested by the above 9 groups (final contestants), and the 3 groups that have won will be selected as the groups that will participate in the final final. In the final battle, the first group (winner) will be selected from the three selected groups. The final and final battle will be held on the same day.

本実施形態の面白さ定量評価装置１は、M1-2015の準決勝を終えた段階で、準決勝において選抜された８グループの決勝における順位と、最終決戦の優勝者を予想する。ただし、この予想の結果には、決勝当日に追加される１組のグループは含まないものである。 The fun quantitative evaluation device 1 of the present embodiment predicts the ranking in the finals of the eight groups selected in the semifinals and the winner of the final final at the stage when the semifinals of M1-2015 are completed. However, the results of this forecast do not include one group that will be added on the day of the final.

図１７は、漫才コンテストの一例であるM1-2015の決勝戦の順位予測処理の概要を示す図である。M1-2015の決勝戦の順位予測処理は、評価部２１、順序判定部２３、及び、予測精度算出部２６により実施される。順序判定部２３と予測精度算出部２６は、推定部の一例である。例えば、順序判定部２３と予測精度算出部２６は、評価部２１による評定の結果に基づいて、グループの順位を推定する。 FIG. 17 is a diagram showing an outline of the ranking prediction process of the final match of M1-2015, which is an example of the comic contest. The ranking prediction process of the final match of M1-2015 is carried out by the evaluation unit 21, the order determination unit 23, and the prediction accuracy calculation unit 26. The order determination unit 23 and the prediction accuracy calculation unit 26 are examples of the estimation unit. For example, the order determination unit 23 and the prediction accuracy calculation unit 26 estimate the ranking of the group based on the evaluation result by the evaluation unit 21.

評価部２１は、決勝における順位を予想する処理において、前述の１０個のモデルを用いる。この１０個のモデルは、第１の実施形態に示した手順により学習され、生成されたものである。評価部２１は、M1-2015の３回戦と準々決勝の映像の解析結果に基づいて、各グループの口演の面白さを評価する。 The evaluation unit 21 uses the 10 models described above in the process of predicting the ranking in the final. These 10 models were trained and generated by the procedure shown in the first embodiment. The evaluation unit 21 evaluates the fun of each group's oral performance based on the analysis results of the videos of the third round and the quarterfinals of M1-2015.

例えば、評価部２１は、準決勝において選抜された８組のグループについて、３回戦の映像の解析結果と準々決勝の映像の解析結果とに基づいて、それぞれの回のグループごとの評価値を導出する。評価部２１は、グループごとに、それぞれの回の評価値の平均（平均評価値）を導出する。つまり、評価部２１は、所定の回のうち３回戦（第１の回）においてグループが演じた演目における第１解析結果と前記第２解析結果とに基づいて、グループが演じた演目の面白さの第１評価値を導出する。また、評価部２１は、前記所定の回のうち準々決勝（第１の回とは異なる第２の回）においてグループが演じた演目における第１解析結果と第２解析結果とに基づいて、グループが演じた演目の面白さの第２評価値を導出する。例えば、評価部２１は、上記の第１評価値と第２評価値の平均（平均評価値）を導出する。例えば、評価部２１は、予め定められた係数値に基づいて、上記の第１評価値と第２評価値の加重平均（平均評価値）を導出する。 For example, the evaluation unit 21 derives the evaluation value for each group of eight groups selected in the semi-finals based on the analysis result of the video of the third round and the analysis result of the video of the quarter-finals. .. The evaluation unit 21 derives the average (average evaluation value) of the evaluation values of each time for each group. That is, the evaluation unit 21 is interested in the performance performed by the group based on the first analysis result and the second analysis result in the performance performed by the group in the third round (first round) of the predetermined rounds. The first evaluation value of is derived. In addition, the evaluation unit 21 groups based on the first analysis result and the second analysis result in the performance performed by the group in the quarterfinals (the second time different from the first time) among the predetermined times. Derivation of the second evaluation value of the fun of the performance played by. For example, the evaluation unit 21 derives the average (average evaluation value) of the first evaluation value and the second evaluation value. For example, the evaluation unit 21 derives a weighted average (average evaluation value) of the first evaluation value and the second evaluation value based on a predetermined coefficient value.

順序判定部２３は、準々決勝に重きを置いた加重平均により評価部２１によって導出されたグループごとの平均評価値に基づいて、対象のグループの口演の面白さの評価点を導出する。 The order determination unit 23 derives an evaluation score for the fun of each target group's oral performance based on the per-group average evaluation value that the evaluation unit 21 derives as a weighted average with greater weight on the quarterfinals.
予測精度算出部２６は、順序判定部２３により導出された口演の面白さの平均点に基づいて、各グループの順位を予測する。 The prediction accuracy calculation unit 26 predicts the ranking of each group based on the average fun score of the oral performances derived by the order determination unit 23.
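The combination of the third-round and quarterfinal evaluation values, followed by the ranking, can be sketched as follows. The weight of 0.75 on the quarterfinal is purely hypothetical, since the text says only that a predetermined coefficient emphasizing the quarterfinals is used. The function names, group labels, and scores are likewise made up.

```python
def combined_score(third_round: float, quarterfinal: float, w_quarter: float = 0.75) -> float:
    """Weighted average of two evaluation values, emphasizing the quarterfinal.

    w_quarter is the weight on the quarterfinal value (hypothetical default).
    """
    if not 0.0 <= w_quarter <= 1.0:
        raise ValueError("w_quarter must be between 0 and 1")
    return (1.0 - w_quarter) * third_round + w_quarter * quarterfinal


def rank_groups(scores: dict[str, float]) -> list[str]:
    """Return group labels ordered from the highest to the lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical (third round, quarterfinal) evaluation values for two groups.
raw = {"A": (60.0, 80.0), "B": (90.0, 50.0)}
combined = {group: combined_score(t, q) for group, (t, q) in raw.items()}
order = rank_groups(combined)
```

With this weighting, group A (75.0) is ranked above group B (60.0) even though B scored higher in the third round, which illustrates the effect of emphasizing the later round.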

図１８は、本実施形態に係る決勝戦の順位の予測の結果を示す図である。同図に示すように、順序判定部２３は、上記の結果から、上位の３グループを予測した。また、同図は、併せて実際の審査結果を示す。同図に示されるように、上位３グループについての予測の結果は、実際の結果と同じであった。また、決勝戦の順位相関係数の値は、第１の実施形態の検証の結果を示す平均値より高い値が得られた。全８グループについての値は、０.６９であった。 FIG. 18 is a diagram showing the result of prediction of the ranking of the final match according to the present embodiment. As shown in the figure, the order determination unit 23 predicted the top three groups from the above results. In addition, the figure also shows the actual examination results. As shown in the figure, the results of the predictions for the top three groups were the same as the actual results. In addition, the value of the ranking correlation coefficient of the final match was higher than the average value indicating the result of the verification of the first embodiment. The value for all 8 groups was 0.69.

The conditions under which the above value was derived differ from those of the first embodiment in the following respect. The data used to predict M1-2015 are based on oral performances that the same groups gave in the preliminary rounds held before the finals. The content of the performances in the preliminary rounds differs from that of the performances in the final round. Thus, the above embodiment is not limited to evaluating specific lines or a specific oral performance by a group; it can evaluate the skill of the group itself.

According to the above embodiment, in addition to the same effects as the first embodiment, the judging result of a manzai contest can be predicted by quantitatively evaluating the fun of the oral performances.

(Third Embodiment)
A third embodiment will be described. The first and second embodiments described above used manzai as an example. The third embodiment instead uses conte (skit comedy) as an example.

In the case of manzai described above, the oral performance was evaluated based on the six indices p1 to p6. In the case of conte, the oral performance is evaluated based on eight indices, namely p1 to p9 below excluding p4. The description focuses on the differences from manzai. Of these eight indices, the five indices p1 to p3, p5, and p6 are the same as in the case of manzai.

p7 is the time per pronunciation. p8 is the time from the second-to-last laugh to the end of the oral performance. p9 is the time from the start of the oral performance to the first laugh. The index p7 (time per pronunciation) may be defined, for example, as in Equation (7).
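As an illustration of the two timing indices defined in words above, the sketch below derives p8 and p9 from a list of laugh times. The function name, the use of seconds measured from the start of the performance, and the sample times are assumptions for the example; p7 is omitted because it depends on Equation (7).

```python
def closing_and_opening_gaps(laugh_times, end_time):
    """Return (p8, p9) from laugh times in seconds from the start.

    p8: time from the second-to-last laugh to the end of the performance.
    p9: time from the start of the performance to the first laugh.
    """
    if len(laugh_times) < 2:
        raise ValueError("need at least two laughs to define p8")
    ordered = sorted(laugh_times)
    if end_time < ordered[-1]:
        raise ValueError("end_time precedes the last laugh")
    p8 = end_time - ordered[-2]
    p9 = ordered[0]
    return p8, p9


# Invented laugh times for a 180-second performance.
p8, p9 = closing_and_opening_gaps([12.0, 30.5, 95.0, 170.0, 176.0], 180.0)
```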

With the change in the number of indices, the number of input nodes of the neural network 241 in the model generation unit 24, and of the model that the model generation unit 24 generates, is set to eight, and the evaluation unit 21 uses the eight indices above.
For example, the processing for model generation and the processing for evaluation are the same as in the first embodiment.
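The change in input size can be pictured as selecting a different set of indices per genre before the values reach the model. The helper below is hypothetical: the names `build_input_vector`, `MANZAI_INDICES`, and `CONTE_INDICES` and the placeholder values are assumptions, and the network itself is not shown.

```python
# Index names per genre, following the text: manzai uses p1 to p6,
# conte uses p1 to p9 except p4.
MANZAI_INDICES = ("p1", "p2", "p3", "p4", "p5", "p6")
CONTE_INDICES = ("p1", "p2", "p3", "p5", "p6", "p7", "p8", "p9")


def build_input_vector(index_values, genre):
    """Return the model input as a list ordered by index name."""
    if genre == "manzai":
        names = MANZAI_INDICES
    elif genre == "conte":
        names = CONTE_INDICES
    else:
        raise ValueError(f"unknown genre: {genre!r}")
    missing = [name for name in names if name not in index_values]
    if missing:
        raise KeyError(f"missing indices: {missing}")
    return [float(index_values[name]) for name in names]


# Placeholder values only; real values come from analysing a performance.
example = {f"p{i}": float(i) for i in range(1, 10)}
conte_vector = build_input_vector(example, "conte")
manzai_vector = build_input_vector(example, "manzai")
```

The length of the returned list (eight for conte, six for manzai) corresponds to the number of input nodes the model needs.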

According to the above embodiment, in addition to the same effects as the first embodiment, a conte can be used as the target performance.

(Fourth Embodiment)
A fourth embodiment will be described. The fourth embodiment describes a technique for improving the accuracy of detecting audience laughter and its time of occurrence (the time at which the laughter occurred).

Accurately measuring the loudness and duration of laughter can be difficult; for example, when an automatic identification process works from the audio waveform, the loudness of the sound may be hard to identify. When people make the judgment, there are individual differences, and it is difficult for different judges to apply the same criteria. One case in which judgment becomes difficult is the following. With consecutive material elements, the next laugh arises before the audience's previous laugh has subsided. In such a case, the audience's laughter need not be interrupted: even if the next laugh arises before the previous one has subsided, it is counted as a new material element.

In the present embodiment, to handle such cases, three judges independently detect the audience's laughter and its time of occurrence (the time at which the laughter occurred), and audience laughter detected by two or more of the judges is used in the analysis as a material element.

FIG. 19 is a configuration diagram of the fun quantitative evaluation device 1 including the analysis device of the present embodiment.
The fun quantitative evaluation device 1 includes a storage unit 10, an evaluation unit 21, an order determination unit 23, a prediction accuracy calculation unit 26, a model generation unit 24, and a feature amount acquisition unit 26.

The feature amount acquisition unit 26 includes an operation detection unit 261, an operation determination unit 262, and a timekeeping unit 263.

For example, the input/output device 1E (FIG. 2) is provided with a dialogue material element detection key and an action material element detection key, which are operated by a judge who judges laughter in the oral performance. The dialogue material element detection key is operated when there is a material element that includes dialogue. The action material element detection key is operated when there is a material element that does not include dialogue, for example a material element consisting only of action.

For example, a dialogue material element detection key and an action material element detection key form a pair, and one such pair is provided for each judge.

The operation detection unit 261 detects operation of the dialogue material element detection keys and the action material element detection keys. For example, the operation detection unit 261 separately detects operations on the three pairs of keys operated by the three judges.

The operation determination unit 262 detects whether keys of the same type are being operated among the three pairs of keys, and detects the occurrence of the material element corresponding to the key according to the type and number of keys detected. Of the three keys of the same type, the operation determination unit 262 determines the state in which operation of zero or one key is detected to be the "non-detected state", and determines a change from the "non-detected state" to a state in which operation of two or three keys is detected to be the "detected state". Likewise, the operation determination unit 262 determines the state in which the count has returned from the "detected state" to zero or one to be the "non-detected state".
The above detection method is common to the dialogue material element detection key and the action material element detection key.
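The two-state rule above amounts to a majority vote with edge detection. The sketch below is one way to express it, assuming the key states have already been sampled into a sequence of counts (how many of the three same-type keys are pressed at each sample); the sampling scheme and the function name are assumptions.

```python
def detect_onsets(pressed_counts, threshold=2):
    """Return sample indices where a material element is detected.

    pressed_counts: for each sample, how many of the three judges are
    pressing the same type of key (0 to 3).
    A detection is the transition from the non-detected state
    (count below threshold) to the detected state (count at or above it).
    """
    onsets = []
    detected = False
    for index, count in enumerate(pressed_counts):
        now_detected = count >= threshold
        if now_detected and not detected:
            onsets.append(index)
        detected = now_detected
    return onsets


# Two agreements among the judges, separated by a return to 0 or 1 keys.
onsets = detect_onsets([0, 1, 2, 3, 2, 1, 0, 1, 1, 2, 1])
```

Because the state must fall back to zero or one key before a new detection can occur, a single sustained agreement is counted once rather than at every sample.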

The timekeeping unit 263 measures and records the elapsed time from the start of the manzai in synchronization with the detection of the occurrence of a material element.

The feature amount acquisition unit 26 detects the time of occurrence of a material element based on the result of the determination by the operation determination unit 262, and stores the result in the storage unit 10. For example, while watching the video, a judge presses a predetermined key at the moment the judge decides that the audience is laughing. The judge operates the dialogue material element detection key for a material element that includes dialogue, and the action material element detection key for a material element consisting only of action. By detecting the judges' operations and measuring the timing at which each operation is detected, the feature amount acquisition unit 26 automatically collects the data for calculating the index p6. When it detects the operation of a key, the feature amount acquisition unit 26 records the elapsed time from the start of the manzai. The feature amount acquisition unit 26 calculates the average of the time data of the two or three judges who determined that the audience was laughing, and the calculated time is used as the time of occurrence of the material element, for example in the evaluation by the evaluation unit 21.
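The averaging of the judges' time data can be sketched as below. The representation of a judge who did not press a key as `None`, the function name, and the sample times are assumptions for the example.

```python
def element_time(press_times):
    """Average the press times of the judges who detected a laugh.

    press_times: one entry per judge, in seconds from the start of the
    performance, or None if that judge did not press the key.
    Returns None when fewer than two judges detected the laugh, since
    such a laugh is not used as a material element.
    """
    detected = [t for t in press_times if t is not None]
    if len(detected) < 2:
        return None
    return sum(detected) / len(detected)


# Judges 1 and 3 pressed the key; judge 2 did not.
occurrence = element_time([41.2, None, 41.8])
```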

When comparing manzai performances that differ in stage setting, audience size, or circumstances, the feature amount acquisition unit 26 may detect the loudness of laughter, or may normalize the duration of laughter.

FIG. 20 shows an example of the indices obtained by the feature amount acquisition unit 26 according to the present embodiment. The index values shown in the figure are used as training data for the neural network 241.

According to the above embodiment, in addition to the same effects as the first embodiment, the accuracy of detecting audience laughter and its time of occurrence (the time at which the laughter occurred) can be improved.

(Fifth Embodiment)
A fifth embodiment will be described. The fifth embodiment describes a technique for supporting the creation of a manzai script.

The fun quantitative evaluation device 1 is used in creating a manzai script and thereby supports the creation of a highly rated script.

For example, a user creating a script writes the lines of the script while counting the number of lines, the number of mora in the longest line, and the total number of mora. If the performance time is not prescribed, the user decides it at this point.

It is assumed that the fun quantitative evaluation device 1 has completed preliminary training and is ready to evaluate oral performances.

Based on the number of lines, the total number of mora, and the performance time, the fun quantitative evaluation device 1 derives the time that can be allotted to one line and the number of mora that can be allotted to one second of speech. The fun quantitative evaluation device 1 measures, in mora, the interval between material elements that make the audience laugh (points where laughter is expected). For example, the fun quantitative evaluation device 1 either calculates an approximate time between material elements from this mora count and the number of mora that can be allotted to one second, or measures the time from audio of the user reading the script aloud.
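The arithmetic in this step is simple division, sketched below. The function names and the sample script figures (120 lines, 1800 mora, 240 seconds) are assumptions chosen only to make the numbers easy to follow.

```python
def script_timing(num_lines, total_mora, duration_s):
    """Return (seconds per line, mora per second) for a script."""
    if num_lines <= 0 or total_mora <= 0 or duration_s <= 0:
        raise ValueError("all inputs must be positive")
    time_per_line = duration_s / num_lines
    mora_per_second = total_mora / duration_s
    return time_per_line, mora_per_second


def approx_interval_seconds(mora_between_elements, mora_per_second):
    """Approximate the time between two material elements.

    Converts an interval counted in mora into seconds using the
    speaking rate derived from the whole script.
    """
    return mora_between_elements / mora_per_second


# Hypothetical script: 120 lines, 1800 mora, 4-minute performance.
time_per_line, mora_per_second = script_timing(120, 1800, 240.0)
gap = approx_interval_seconds(45, mora_per_second)
```

Here the script allows 2 seconds per line at 7.5 mora per second, so two material elements 45 mora apart are roughly 6 seconds apart.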

The fun quantitative evaluation device 1 also adjusts the composition of the script so that the last material element is placed as close to the end as possible. The evaluation unit 21 of the fun quantitative evaluation device 1 performs an evaluation based on the result of this adjustment. The fun quantitative evaluation device 1 takes the result of the evaluation by the evaluation unit 21 as the evaluation value of the script.

When a rehearsal is held with listeners present in place of an audience, the fun quantitative evaluation device 1 records the times at which the listeners laughed in the rehearsal and recalculates the positions of, and time intervals between, the material elements. If the time that can be allotted to a line and the number of mora that can be allotted to one second fall outside the range of recommended values, a good evaluation value is not obtained. The fun quantitative evaluation device 1 may repeatedly adjust the composition of the script by adding or deleting lines, mora, actions, and gestures until the evaluation value becomes good. As the values of evaluation items such as the time per line of the script approach the recommended values, the fun quantitative evaluation device 1 outputs a good evaluation value.

In this way, the fun quantitative evaluation device 1 supports the creation of a manzai script by quantitatively evaluating the script. By using indices that can be evaluated quantitatively, the fun quantitative evaluation device 1 makes it possible to evaluate manzai performances, verify how much script-writing ability has improved, detect points to improve, and objectively compare the quality of manzai. By capturing the global characteristics of a manzai performance with quantitative indices, the fun quantitative evaluation device 1 evaluates the performance objectively without depending on linguistic information or on time-series features.

According to the above embodiment, in addition to the same effects as the first embodiment, the creation of a manzai script can be supported.

Although preferred embodiments of the present invention have been described in detail above, the present invention is not limited to these specific embodiments, and various modifications and changes are possible within the scope of the gist of the present invention described in the claims.

The above embodiments illustrate the case of comparing only the manzai groups that appeared in a specific, repeatedly held manzai contest (for example, the M-1 Grand Prix), that is, within a limited scope. Each edition of such a contest is held without major changes in the number of judges, the size of the audience, and so on. Performances held with a similar number of judges and a similar audience size are less affected by such differences. To improve its prediction performance, the fun quantitative evaluation device 1 should, for example, evaluate manzai performed on a stage whose conditions are the same as or similar to those of the training data, as with the M-1 Grand Prix. Besides the M-1 Grand Prix, the broadcast program "THE MANZAI" and others are known. The present embodiment may also be applied to "THE MANZAI" and similar programs.

On the other hand, even when the analysis includes manzai performed on stages with different conditions, the fun quantitative evaluation device 1 can evaluate based on the generality of what makes an oral performance fun. It is natural to think that the manzai an audience finds funny changes over time. That the results of the 2015 contest could be predicted using manzai from 2001 to 2010 means that this method captured a core of manzai's fun that is independent of changes in taste and of passing trends over a period as long as ten years, and that it yielded a way of evaluating fun that still held five years later. Furthermore, since the judges of the 2001 to 2010 M-1 Grand Prix and those of the 2015 M-1 Grand Prix do not overlap, the result evidently does not depend on the combination of judges either.

The above embodiments showed that the fun of manzai can potentially be evaluated quantitatively using only static, non-linguistic information; however, the fun quantitative evaluation device 1 may also combine dynamic or linguistic information with the static, non-linguistic information in its evaluation.

The fun quantitative evaluation device 1 is not limited to the uses shown in the embodiments and may also be applied to uses such as the following.
For example, because the fun quantitative evaluation device 1 evaluates using quantitative indices, it can be applied to deciding whether to change an existing script or manzai performance, clarifying what should be changed, coaching manzai performances, and similar purposes. Furthermore, when a manzai script is being written, using the device to evaluate the script before it is performed, or to evaluate automatically generated scripts, is expected to help in providing funny scripts.

In addition, the analysis covers highly skilled manzai groups that passed the preliminary rounds among the participating groups. Even among them, however, there is a gap between the upper and lower ranks, and clarifying the factors behind that gap makes it possible to set concrete goals for performing more highly rated manzai. For example, the device can be used as a judging robot that conducts judging such as the preliminary rounds of a manzai contest.

The fun quantitative evaluation device 1 can also pick out "funny" manzai. This capability may be used for producing and planning manzai, or for organizing programs, for example selecting the groups to broadcast and deciding the order of the groups.

1 ... fun quantitative evaluation device, 10 ... storage unit, 11 ... training data, 12 ... model data, 13 ... rating data, 14 ... evaluation result data, 15 ... judging result data, 21 ... evaluation unit, 22 ... average score calculation unit, 23 ... order determination unit, 24 ... model generation unit, 25 ... feature amount extraction unit, 26 ... prediction accuracy calculation unit, 131 ... first analysis result, 132 ... second analysis result.

Claims

A fun quantitative evaluation device comprising:
an evaluation unit that evaluates the fun of a performance performed by a performer, based on a first analysis result indicating a tendency of the performer's speaking style, derived from the voice uttered by the performer during the performance, and a second analysis result indicating a tendency of the content performed by the performer, derived from the laughter of viewers of the performance,
wherein the first analysis result includes a result of evaluating the tempo of the exchange between a plurality of the performers.

The fun quantitative evaluation device according to claim 1,
wherein the first analysis result does not include an analysis result concerning the content of the performer's utterances based on the voice uttered by the performer, and
the second analysis result includes a result of evaluating the proportion of material elements without dialogue in the performance.

A fun quantitative evaluation device comprising:
an evaluation unit that evaluates the fun of a performance performed by a performer, based on a first analysis result indicating a tendency of the performer's speaking style, derived from the voice uttered by the performer during the performance, and a second analysis result indicating a tendency of the content performed by the performer, derived from the laughter of viewers of the performance,
wherein the second analysis result includes information on the time intervals at which the performer performed material elements in the performance, or information on the time from when the performer performed a material element in the latter half of the performance until the end of the performance.

The fun quantitative evaluation device according to claim 3,
wherein the material element performed by the performer in the latter half of the performance is the material element performed last in the performance.

A fun quantitative evaluation device comprising:
an evaluation unit that evaluates the fun of a performance performed by a performer, based on a first analysis result indicating a tendency of the performer's speaking style, derived from the voice uttered by the performer during the performance, and a second analysis result indicating a tendency of the content performed by the performer, derived from the laughter of viewers of the performance,
wherein the first analysis result includes information indicating the amount of information in the voice uttered by the performer.

The fun quantitative evaluation device according to claim 5,
wherein the amount of information in the voice includes one or more of: the time allotted to a line of dialogue, the number of mora allotted per unit time in the voice uttered by the performer, and the number of mora in a relatively long line of the performer.

The fun quantitative evaluation device according to any one of claims 1 to 6,
wherein the evaluation unit derives:
a first evaluation value of the fun of a performance performed by the performer, based on the first analysis result and the second analysis result for the performance that the performer performed in a first round among predetermined rounds of a contest that is divided into a plurality of rounds with stepwise selection; and
a second evaluation value of the fun of a performance performed by the performer, based on the first analysis result and the second analysis result for the performance that the performer performed in a second round, different from the first round, among the predetermined rounds,
the device further comprising an estimation unit that estimates, based on the first evaluation value and the second evaluation value, the fun of a performance performed by the performer in a round after both the first round and the second round.

A fun quantitative evaluation device comprising:
an evaluation unit that evaluates the fun of a performance performed by a performer, based on a first analysis result indicating a tendency of the performer's speaking style, derived from the voice uttered by the performer during the performance, and a second analysis result indicating a tendency of the content performed by the performer, derived from the laughter of viewers of the performance,
wherein the evaluation unit derives:
a first evaluation value of the fun of a performance performed by the performer, based on the first analysis result and the second analysis result for the performance that the performer performed in a first round among predetermined rounds of a contest that is divided into a plurality of rounds with stepwise selection; and
a second evaluation value of the fun of a performance performed by the performer, based on the first analysis result and the second analysis result for the performance that the performer performed in a second round, different from the first round, among the predetermined rounds,
the device further comprising an estimation unit that estimates the fun of a performance performed by the performer in a round after both the first round and the second round, based on the first evaluation value, the second evaluation value, and coefficients that weight the first evaluation value and the second evaluation value.

The fun quantitative evaluation device according to any one of claims 1 to 8, further comprising:
a model generation unit that generates a model for evaluating the fun of the performance, based on feature amounts of performances whose fun has been evaluated and the results of that evaluation,
wherein the feature amounts of a performance whose fun has been evaluated include the voice uttered by the performer who performed that performance and the laughter of those who viewed that performance.

A fun quantitative evaluation method comprising:
a step in which a computer evaluates the fun of a performance performed by a performer, based on a first analysis result indicating a tendency of the performer's speaking style, derived from the voice uttered by the performer during the performance, and a second analysis result indicating a tendency of the content performed by the performer, derived from the laughter of viewers of the performance,
wherein the first analysis result includes a result of evaluating the tempo of the exchange between a plurality of the performers or information indicating the amount of information in the voice uttered by the performer, or
the second analysis result includes information on the time intervals at which the performer performed material elements in the performance or information on the time from when the performer performed a material element in the latter half of the performance until the end of the performance.

A program for causing a computer of a fun quantitative evaluation device to execute:
a step of evaluating the fun of a performance performed by a performer, based on a first analysis result indicating a tendency of the performer's speaking style, derived from the voice uttered by the performer during the performance, and a second analysis result indicating a tendency of the content performed by the performer, derived from the laughter of viewers of the performance,
wherein the first analysis result includes a result of evaluating the tempo of the exchange between a plurality of the performers or information indicating the amount of information in the voice uttered by the performer, or
the second analysis result includes information on the time intervals at which the performer performed material elements in the performance or information on the time from when the performer performed a material element in the latter half of the performance until the end of the performance.

A fun quantitative evaluation device comprising:
an evaluation unit configured to generate an evaluation value as a result of evaluating the fun of a performance performed by a performer, based on a first analysis result indicating a tendency of the performer's speaking style, derived from the voice uttered by the performer during the performance, and a second analysis result indicating a tendency of the content performed by the performer, derived from the laughter of viewers of the performance,
wherein the device adjusts the composition of the script of the performance so that the evaluation value becomes a desired value.

A fun quantitative evaluation device comprising:
an evaluation unit configured to generate an evaluation value as a result of evaluating the fun of a performance performed by a performer, based on a first analysis result indicating a tendency of the performer's speaking style, derived from the voice uttered by the performer during the performance, and a second analysis result indicating a tendency of the content performed by the performer, derived from the laughter of viewers of the performance,
wherein the device supports creation of the script of the performance by adjusting the composition of the performance so that the evaluation value becomes a desired value.

A fun quantitative evaluation device comprising:
an evaluation unit configured to generate an evaluation value as a result of evaluating the fun of a performance performed by a performer, based on a first analysis result indicating a tendency of the performer's speaking style, derived from the voice uttered by the performer during the performance, and a second analysis result indicating a tendency of the content performed by the performer, derived from the laughter of viewers of the performance,
wherein the device quantitatively evaluates the script of the performance based on the evaluation value.

A fun adjustment method in which a computer supports adjustment of the fun of a performance, the method comprising:
a step in which the computer evaluates the fun of a first performance performed by a performer, based on a first analysis result indicating a tendency of the performer's speaking style, derived from the voice uttered by the performer during the performance, and a second analysis result indicating a tendency of the content performed by the performer, derived from the laughter of viewers of the performance; and
a step in which the computer evaluates the fun of a second performance whose fun has been adjusted based on the result of the evaluation.

A fun adjustment method comprising:
a step in which a computer, configured to be able to evaluate the fun of a performance performed by a performer based on a first analysis result indicating a tendency of the performer's speaking style, derived from the voice uttered by the performer during the performance, and a second analysis result indicating a tendency of the content performed by the performer, derived from the laughter of viewers of the performance, evaluates a target performance and supports creation of the script of the target performance so that the desired fun is obtained from the target performance.