JP7455889B2

JP7455889B2 - Image evaluation device, image processing system, user terminal, image evaluation method, and image evaluation program

Info

Publication number: JP7455889B2
Application number: JP2022064667A
Authority: JP
Inventors: 満中澤; 科翰陳
Original assignee: Rakuten Group Inc
Current assignee: Rakuten Group Inc
Priority date: 2022-04-08
Filing date: 2022-04-08
Publication date: 2024-03-26
Anticipated expiration: 2042-04-08
Also published as: JP2023154980A

Description

The present invention relates to an image evaluation device, an image processing system, a user terminal, an image evaluation method, an image evaluation program, and an image display support method.

For example, Patent Document 1 listed below describes a device that detects abnormalities in image data using a trained model.

Patent Document 1: Japanese Patent Application Publication No. 2022-26016

The inventors considered treating image data in which a predetermined part of the body is exposed as abnormal. In that case, if the trained model outputs its determination of whether the image data is abnormal without paying attention to the predetermined part, the determination accuracy may be low.

Means for solving the above problem, and their effects, are described below.

1. An image evaluation device comprising an execution device and a storage device. The storage device stores position information mapping data and evaluation mapping data. The position information mapping data is data that defines a position information mapping. The position information mapping is a mapping that outputs position information data. The position information data is data indicating position information of a predetermined part of a person in the image represented by the image data to be evaluated. The evaluation mapping data is data that defines an evaluation mapping. The evaluation mapping is a mapping that takes evaluation input data and the position information data as input and outputs an evaluation result for the image data. The evaluation input data is data that corresponds to the image data and serves as input to the evaluation mapping. The execution device is configured to execute a position information generation process and an evaluation process. The position information generation process generates the position information data by inputting position input data into the position information mapping. The position input data is data that corresponds to the image data and serves as input to the position information mapping. The evaluation process evaluates the image data by inputting the position information data and the evaluation input data into the evaluation mapping.

In the above configuration, the evaluation mapping receives not only the evaluation input data corresponding to the image data but also the position information data. The position information data indicates information on the predetermined part. The evaluation mapping can therefore use information about the position of the predetermined part in the image data when it outputs the evaluation result. As a result, abnormalities such as exposure of the predetermined part can be evaluated more accurately than when the position information data is not used.
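The data flow of item 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the publication: the names `evaluate_image`, `toy_position_mapping`, and `toy_evaluation_mapping` are hypothetical, and the two toy functions merely stand in for the trained mappings defined by the stored mapping data so that the example runs.

```python
from typing import Callable, List

# Hypothetical types; the source does not prescribe concrete data structures.
Image = List[List[float]]        # h x w grid of brightness values in [0, 1]
PositionInfo = List[List[int]]   # h x w grid, 1 where a predetermined part lies


def evaluate_image(
    image: Image,
    position_mapping: Callable[[Image], PositionInfo],
    evaluation_mapping: Callable[[Image, PositionInfo], bool],
) -> bool:
    """Return True when the image is evaluated as abnormal."""
    # Position information generation process.
    position_info = position_mapping(image)
    # Evaluation process: the image and the position information are both inputs.
    return evaluation_mapping(image, position_info)


def toy_position_mapping(image: Image) -> PositionInfo:
    """Toy stand-in: pretend the predetermined part is the top-left pixel."""
    mask = [[0 for _ in row] for row in image]
    mask[0][0] = 1
    return mask


def toy_evaluation_mapping(image: Image, position_info: PositionInfo) -> bool:
    """Toy stand-in: abnormal when the marked pixels are bright on average."""
    marked = [
        value
        for image_row, mask_row in zip(image, position_info)
        for value, flag in zip(image_row, mask_row)
        if flag
    ]
    return bool(marked) and sum(marked) / len(marked) > 0.5


bright_result = evaluate_image(
    [[0.9, 0.1], [0.1, 0.1]], toy_position_mapping, toy_evaluation_mapping
)
dark_result = evaluate_image(
    [[0.2, 0.9], [0.9, 0.9]], toy_position_mapping, toy_evaluation_mapping
)
```

The point of the sketch is only the ordering: the position information is produced first and is then passed to the evaluation together with the image, so the evaluation can depend on where the predetermined part is.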

2. The image evaluation device according to 1 above, wherein the predetermined part includes at least one of three parts: a person's chest, buttocks, and front lower abdomen.

Whether a person's chest, buttocks, and front lower abdomen are exposed is particularly important in determining whether an image offends public order and morals. With the above configuration, using the position information of the predetermined part therefore makes it possible to evaluate with high accuracy whether the image data offends public order and morals.

3. The image evaluation device according to 1 or 2 above, wherein the position information data is data that assigns, to each pixel constituting the evaluation input data, information on whether that pixel represents the predetermined part, and the evaluation mapping is a mapping that outputs the evaluation result when each pixel of the evaluation input data is input in association with the position information data corresponding to that pixel.

In the above configuration, the evaluation input data and the position information data are input into the evaluation mapping in association with each other. This makes it more certain that the evaluation result is calculated with attention to those pixels of the image data that correspond to the predetermined part.
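One possible way to input each pixel "in association with" its position information is to append the position flag to the pixel as an extra channel. The sketch below assumes that encoding; the publication does not state that a channel is used, and the function name `attach_position_channel` is hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]              # (R, G, B)
PixelWithFlag = Tuple[float, float, float, float]  # (R, G, B, M)


def attach_position_channel(
    rgb: List[List[Pixel]], mask: List[List[int]]
) -> List[List[PixelWithFlag]]:
    """Pair every RGB pixel with its mask value, giving (R, G, B, M) per pixel."""
    same_shape = len(rgb) == len(mask) and all(
        len(rgb_row) == len(mask_row) for rgb_row, mask_row in zip(rgb, mask)
    )
    if not same_shape:
        raise ValueError("image and mask must have the same height and width")
    return [
        [(r, g, b, float(m)) for (r, g, b), m in zip(rgb_row, mask_row)]
        for rgb_row, mask_row in zip(rgb, mask)
    ]


# A 1 x 2 image whose second pixel belongs to the predetermined part.
rgb = [[(0.1, 0.2, 0.3), (0.4, 0.5, 0.6)]]
mask = [[0, 1]]
combined = attach_position_channel(rgb, mask)
```

With this encoding, a model that consumes `combined` sees, for every pixel, both its color and whether it lies on the predetermined part.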

4. The image evaluation device according to any one of 1 to 3 above, wherein the evaluation mapping includes a provisional evaluation mapping; the provisional evaluation mapping is a mapping that takes the evaluation input data as input and outputs a provisional evaluation result for the image data; the evaluation process includes a provisional evaluation process and a validity evaluation process; the provisional evaluation process includes a process of outputting the provisional evaluation result by inputting the evaluation input data into the provisional evaluation mapping; and the validity evaluation process includes a process that takes the position information data as input and evaluates the validity of the provisional evaluation result according to the degree to which the area representing the predetermined part, among the areas represented by the image data, contributed to the provisional evaluation result.

In the above configuration, the execution device evaluates the validity of the provisional evaluation result according to the degree to which the area representing the predetermined part, among the areas represented by the image data, contributed to the provisional evaluation result output by the provisional evaluation mapping. The final evaluation result can therefore be more accurate than when it is determined from the provisional evaluation result without any validity evaluation.
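The validity evaluation of item 4 can be illustrated as below. The sketch assumes that a non-negative per-pixel contribution map is available (how it is obtained is not covered here) and that validity is decided by comparing the share of contribution inside the predetermined part against a threshold. The function names and the threshold of 0.5 are hypothetical choices for illustration.

```python
from typing import List


def contribution_ratio(
    contribution: List[List[float]], mask: List[List[int]]
) -> float:
    """Share of the total contribution that falls on the predetermined part."""
    total = sum(sum(row) for row in contribution)
    if total == 0:
        return 0.0
    inside = sum(
        value
        for value_row, mask_row in zip(contribution, mask)
        for value, flag in zip(value_row, mask_row)
        if flag
    )
    return inside / total


def is_provisional_result_valid(
    contribution: List[List[float]],
    mask: List[List[int]],
    threshold: float = 0.5,  # hypothetical value
) -> bool:
    """Treat the provisional result as valid when the part contributed enough."""
    return contribution_ratio(contribution, mask) >= threshold


contribution = [[0.125, 0.125], [0.0, 0.75]]
on_target = [[0, 0], [0, 1]]    # the part covers the high-contribution pixel
off_target = [[0, 0], [1, 0]]   # the part covers a pixel that did not contribute

ratio = contribution_ratio(contribution, on_target)
valid_on_target = is_provisional_result_valid(contribution, on_target)
valid_off_target = is_provisional_result_valid(contribution, off_target)
```

A provisional result that was driven mostly by pixels outside the predetermined part is flagged as not valid, which is the case a person would then review.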

5. The image evaluation device according to 4 above, wherein the execution device is configured to execute a notification process, and the notification process gives notice that the provisional evaluation result is not valid when the validity evaluation process determines that it is not valid.

In the above configuration, when the validity evaluation process determines that the provisional evaluation result is not valid, the execution device gives notice to that effect. A person can therefore make the final judgment on the validity of the provisional evaluation result.

6. The image evaluation device according to 1 above, wherein the evaluation mapping includes a feature layer; the feature layer is a layer that quantifies features of the image data by assigning a numerical value to each of a plurality of areas into which the area represented by the image data is divided; and the evaluation mapping is a mapping that outputs the evaluation result through processing that includes combining the values of at least some of the plurality of areas indicated by the feature layer with the position information data.

In the above configuration, the execution device combines the values of at least some of the plurality of areas indicated by the feature layer with the position information data. The features that have undergone this processing can therefore focus more on the predetermined part than features that have not been corrected in this way. This makes it possible to obtain an evaluation result that reliably focuses on the predetermined part.
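As one hedged illustration of combining a feature layer with the position information, the sketch below averages a pixel mask over the same grid of areas as the feature layer and multiplies each feature value by the resulting weight. Element-wise weighting is an assumption made here; the publication only says the values are combined. The function names are hypothetical, and the mask size is assumed to be divisible by the grid size.

```python
from typing import List


def pool_mask(mask: List[List[int]], rows: int, cols: int) -> List[List[float]]:
    """Average a pixel mask over a rows x cols grid of equally sized areas."""
    cell_h = len(mask) // rows
    cell_w = len(mask[0]) // cols
    return [
        [
            sum(
                mask[y][x]
                for y in range(r * cell_h, (r + 1) * cell_h)
                for x in range(c * cell_w, (c + 1) * cell_w)
            )
            / (cell_h * cell_w)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def weight_features(
    features: List[List[float]], weights: List[List[float]]
) -> List[List[float]]:
    """Scale each area's feature value by how much of it is the predetermined part."""
    return [
        [f * w for f, w in zip(feature_row, weight_row)]
        for feature_row, weight_row in zip(features, weights)
    ]


# 4 x 4 mask whose top-left 2 x 2 block is the predetermined part.
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
features = [[2.0, 3.0], [4.0, 5.0]]  # a 2 x 2 feature layer

pooled = pool_mask(mask, rows=2, cols=2)
weighted = weight_features(features, pooled)
```

After weighting, only the area that overlaps the predetermined part keeps its feature value, which is one way features can be made to focus on that part.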

7. An image processing system comprising the execution device and the storage device of the image evaluation device according to any one of 1 to 6 above, and a plurality of user terminals, wherein the execution device is configured to execute a provision process and a restriction process; the provision process transmits the image data to the user terminal so that the image represented by image data that the evaluation process has evaluated as not requiring display restriction can be displayed on the user terminal; and the restriction process restricts the user terminal from displaying the image represented by image data that the evaluation process has evaluated as requiring display restriction.

The provision process and the restriction process make it possible to appropriately support selective display, on the user terminal, of the image represented by the image data according to the result of the evaluation process.

8. The image processing system according to 7 above, wherein the user terminal is configured to execute an instruction process; the instruction process specifies an allowable range for the degree of body exposure; the evaluation that display need not be restricted is an evaluation that the degree of exposure is within the allowable range; and the evaluation that display by the user terminal should be restricted is an evaluation that the degree of exposure is outside the allowable range.
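The user-specified allowable range of item 8 can be sketched as follows. The encoding is an assumption made for illustration: the degree of exposure is represented as an integer level and the allowable range as a maximum accepted level. The names `AllowableRange` and `split_by_allowable_range` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AllowableRange:
    """Hypothetical encoding: the highest exposure level the user accepts."""
    max_level: int


def split_by_allowable_range(
    exposure_levels: Dict[str, int], allowed: AllowableRange
) -> Tuple[List[str], List[str]]:
    """Split image IDs into (displayable, restricted) for one user terminal."""
    displayable = [
        image_id
        for image_id, level in exposure_levels.items()
        if level <= allowed.max_level
    ]
    restricted = [
        image_id
        for image_id, level in exposure_levels.items()
        if level > allowed.max_level
    ]
    return displayable, restricted


# Evaluated exposure level per candidate (illustrative values only).
levels = {"video_a": 0, "video_b": 2, "video_c": 1}
displayable, restricted = split_by_allowable_range(
    levels, AllowableRange(max_level=1)
)
```

The same evaluated levels give different results for different users, because each user terminal supplies its own allowable range.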

With the above configuration, selective display of the image represented by the image data on the user terminal can be appropriately supported according to the user's wishes.

9. The image processing system according to 7 or 8 above, wherein the restriction process includes at least one of a prohibition process and a restricted transmission process; the prohibition process prohibits transmission, to the user terminal, of image data evaluated as requiring restriction of display by the user terminal; the restricted transmission process transmits image data evaluated as requiring restriction of display by the user terminal together with a restriction command; and the restriction command is one of a command to display the image represented by the image data on the user terminal together with a warning, a command to mask the exposed predetermined part in the image represented by the image data on the user terminal, and a command not to display the image represented by the image data on the user terminal.

The prohibition process reliably prevents the image represented by image data evaluated as requiring display restriction from being displayed on the user terminal. The restricted transmission process makes one of the following possible.

- The user terminal issues a warning that display on the user terminal should be restricted.
- The exposed predetermined part in the image displayed on the user terminal is masked.
- The user terminal is prohibited from displaying the image represented by image data evaluated as requiring restriction of display by the user terminal.

10. The user terminal in the image processing system according to 8 above.
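The prohibition process and the restricted transmission process of item 9 can be sketched as a small decision function. This is an illustration only: the enum `RestrictionCommand`, the function `plan_transmission`, and the convention that a missing command selects the prohibition process are all hypothetical.

```python
from enum import Enum
from typing import Optional, Tuple


class RestrictionCommand(Enum):
    """The three restriction commands named in item 9."""
    SHOW_WITH_WARNING = "show_with_warning"
    MASK_EXPOSED_PART = "mask_exposed_part"
    DO_NOT_DISPLAY = "do_not_display"


def plan_transmission(
    restricted: bool, command: Optional[RestrictionCommand]
) -> Tuple[bool, Optional[RestrictionCommand]]:
    """Return (send_image_data, command_to_attach) for one item of image data."""
    if not restricted:
        # Provision process: send the data with no restriction command.
        return True, None
    if command is None:
        # Prohibition process: the data is not sent at all.
        return False, None
    # Restricted transmission process: send the data with the command.
    return True, command


unrestricted_plan = plan_transmission(False, None)
prohibited_plan = plan_transmission(True, None)
masked_plan = plan_transmission(True, RestrictionCommand.MASK_EXPOSED_PART)
```

Under the restricted transmission process the image data still reaches the user terminal, so the terminal itself is responsible for applying the warning, the mask, or the display prohibition.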

11. An image evaluation method comprising steps of executing each of the processes of the image evaluation device according to any one of 1 to 6 above.

12. An image evaluation program that causes a computer to execute each of the processes of the image evaluation device according to any one of 1 to 6 above.

13. An image display support method for supporting selective display of images on a user terminal, the method comprising: a providing step of transmitting, to the user terminal, image data that, among image data representing candidate images to be displayed by the user terminal, has been evaluated by the evaluation process of the image evaluation device according to any one of 1 to 6 above as not requiring display restriction, so that the image represented by that image data can be displayed on the user terminal; and a restricting step of restricting the user terminal from displaying the image represented by image data that the evaluation process has evaluated as requiring display restriction.

The providing step and the restricting step make it possible to appropriately support selective display, on the user terminal, of the image represented by the image data according to the result of the evaluation process.

14. An image display support method for supporting selective display of images on a user terminal, wherein the user terminal is configured to execute an instruction process that specifies an allowable range for the degree of body exposure, the method comprising: a providing step of transmitting, to the user terminal, image data that, among image data representing candidate images to be displayed by the user terminal, has been evaluated by the evaluation process of the image evaluation device according to any one of 1 to 6 above as having a degree of exposure within the allowable range, so that the image represented by that image data can be displayed on the user terminal; and a restricting step of restricting the user terminal from displaying the image represented by image data that, among the image data representing candidate images to be displayed by the user terminal, the evaluation process has evaluated as having a degree of exposure outside the allowable range.

With the above method, selective display of the image represented by the image data on the user terminal can be appropriately supported according to the user's wishes.

15. An image display support method for supporting selective display of images on a user terminal, the method comprising: an acquisition step of acquiring image data representing candidate images to be displayed by the user terminal; a providing step of transmitting, to the user terminal, image data that, among the image data acquired in the acquisition step, has been evaluated by the evaluation process of the image evaluation device according to any one of 1 to 6 above as not requiring display restriction, so that the image represented by that image data can be displayed on the user terminal; and a restricting step of restricting the user terminal from displaying the image represented by image data that, among the image data acquired in the acquisition step, the evaluation process has evaluated as requiring display restriction.

The providing step and the restricting step make it possible to appropriately support selective display, on the user terminal, of the image represented by the image data according to the result of the evaluation process.

FIG. 1 is a diagram showing the overall configuration of an image processing system according to a first embodiment.
FIG. 2 is a diagram defining displayable images according to the first embodiment.
FIG. 3 is a flowchart showing the procedure of processing related to display support for images represented by CM video data according to the first embodiment.
FIG. 4 is a diagram showing the detailed procedure of frame evaluation processing according to the first embodiment.
FIG. 5 is a diagram for explaining a heat map according to the first embodiment.
FIG. 6 is a flowchart showing the procedure of processing related to display support for images represented by CM video data according to a second embodiment.
FIG. 7 is a flowchart showing the detailed procedure of frame evaluation processing according to a third embodiment.
FIG. 8 is a diagram showing the detailed procedure of frame evaluation processing according to a fourth embodiment.
FIG. 9 is a diagram showing the detailed procedure of final evaluation processing according to the fourth embodiment.
FIG. 10 is a diagram showing the detailed procedure of frame evaluation processing according to a fifth embodiment.
FIG. 11 is a flowchart showing the procedure of processing related to display support for images represented by CM video data according to the fifth embodiment.

<First embodiment>
The first embodiment is described below with reference to the drawings.
"Prerequisite configuration"
FIG. 1 shows the overall configuration of the image processing system.

The vendor terminals 10(1), 10(2), ... are terminals owned by vendors who sell products to consumers. Below, the vendor terminals 10(1), 10(2), ... are referred to collectively as the vendor terminal 10. The vendor terminal 10 includes a PU 12, a storage device 14, and a communication device 16. The PU 12 is a software processing device that includes arithmetic units such as a CPU, a GPU, and a TPU. The storage device 14 includes storage media such as an electrically rewritable nonvolatile memory and a disk medium. The storage device 14 stores the programs executed by the PU 12.

The communication device 16 of the vendor terminal 10 can communicate with the image evaluation device 30 via the network 20. The vendor terminal 10 transmits, to the image evaluation device 30, CM video data for a product that the vendor wants to sell. Here, the CM video data corresponds to, for example, a commercial video advertising a product. There are no restrictions on the playback time, number of frames, resolution, or data format of the CM video.

The image evaluation device 30 evaluates the CM video data and, if there is no problem, transmits it to the user terminals 50(1), 50(2), .... Below, the user terminals 50(1), 50(2), ... are referred to collectively as the user terminal 50.

The image evaluation device 30 includes a PU 32, a storage device 34, and a communication device 36. The PU 32 is a software processing device that includes arithmetic units such as a CPU, a GPU, and a TPU. Specifically, the PU 32 may include, as an arithmetic unit, a hardware accelerator specialized for predetermined processing such as image processing, inference processing, and parallel processing. The PU 32 may include arithmetic units in a homogeneous or heterogeneous architecture. The PU 32 may also include an arithmetic unit monolithically integrated on a single chip, or an arithmetic unit in which a plurality of chips, such as chiplets, are connected. The storage device 34 includes storage media such as an electrically rewritable nonvolatile memory and a disk medium. The storage device 34 stores an image evaluation program 34a, position information mapping data 34b, and evaluation mapping data 34c. The communication device 36 is a device that enables communication with the vendor terminal 10 and the user terminal 50 via the network 20.

The user terminal 50 is a terminal that plays back CM video data. The user terminal 50 may have a function for carrying out a purchase of the product shown in the CM video.
The user terminal 50 includes a PU 52, a storage device 54, a communication device 56, and a user interface 58. The PU 52 is a software processing device that includes arithmetic units such as a CPU, a GPU, and a TPU. The PU 52 may include arithmetic units in any of the forms that the PU 32 may include. The storage device 54 includes storage media such as an electrically rewritable nonvolatile memory and a disk medium. The storage device 54 stores an application program 54a. The application program 54a is a program that plays back CM video data transmitted from the image evaluation device 30. The communication device 56 is a device that enables communication with the image evaluation device 30 via the network 20. The user interface 58 includes a display device and the like.

"Image evaluation processing"
The PU 32 of the image evaluation device 30 evaluates whether the CM video data transmitted from the vendor terminal 10 is data that the user terminal 50 can play back without any problem.

FIG. 2 shows an example of the definition of data that is acceptable to play back. In FIG. 2, the predetermined parts of the body are marked. Specifically, the chest, defined by portions "3" to "5", is marked. In the front view of the human body, the "front lower abdomen", defined by portions "10", "11", and "18", is marked. On the back of the human body, the "buttocks", defined by portions "12" to "14" and "18", are marked. The PU 32 evaluates an image of a person in which a location marked in FIG. 2 is exposed as an image that is problematic to play back.

FIG. 3 shows the procedure of a process, executed by the image evaluation device 30, that supports selective display of CM video data by the user terminal 50 based on evaluation according to the criteria shown in FIG. 2. The process shown in FIG. 3 is realized by the PU 32 repeatedly executing the image evaluation program 34a stored in the storage device 34, for example at a predetermined cycle. Below, the step number of each process is written as a number prefixed with "S".

In the series of processes shown in FIG. 3, the PU 32 first acquires CM video data transmitted from the vendor terminal 10 (S10). Here, "acquire" means selectively reading out one item of the CM video data stored in the storage device 34. Before this step, the PU 32 receives, via the communication device 36, the CM video data transmitted from the vendor terminal 10 and stores the received CM video data in the storage device 34.

Next, the PU 32 initializes a variable i that specifies a frame of the CM video data (S12). This sets the variable i to the value that specifies the first frame of the CM video data. Next, the PU 32 samples three items of frame data FD at an interval of N frames (S14). That is, the PU 32 samples the frame data FD(i), FD(i+N), and FD(i+2N). The frame data FD(1), FD(2), FD(3), ... denote the time series of frame data FD in the playback order of the CM video data: the larger the number in parentheses, the later the frame data FD is played back. Each item of frame data FD is two-dimensional data with "w×h" pixels that indicates the luminance of each of the three primary colors red, green, and blue. Here, "w" and "h" are natural numbers. Below, the three primary colors are written as "R, G, B" where appropriate.
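The sampling in S14 can be sketched as an index calculation. The function name `sample_frame_indices` is hypothetical, and the sketch does not cover what happens when an index runs past the last frame, which the passage above does not describe.

```python
from typing import List


def sample_frame_indices(i: int, n: int, count: int = 3) -> List[int]:
    """Return the indices i, i+N, i+2N, ... of the frames to sample."""
    if n <= 0 or count <= 0:
        raise ValueError("n and count must be positive")
    return [i + j * n for j in range(count)]


# Starting from the first frame (frames are numbered from 1 in the text)
# with an interval of N = 10 frames.
indices = sample_frame_indices(i=1, n=10)
```

For i = 1 and N = 10 this selects FD(1), FD(11), and FD(21).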

The PU 32 then executes a frame evaluation process that evaluates each of the frame data FD(i), FD(i+N), and FD(i+2N) (S16).
FIG. 4 shows details of the frame evaluation process.

In the series of processes shown in FIG. 4, the PU 32 inputs one item of frame data FD into a regression model that outputs position information for the chest, front lower abdomen, buttocks, and face (S30). The regression model outputs a two-dimensional distribution indicating the existence probability of each of the chest, front lower abdomen, buttocks, and face over at least part of the "w×h" image area represented by the frame data FD. Here, the two-dimensional distribution is, as an example, a two-dimensional Gaussian distribution.

ＰＵ３２は、例として、胸部、正面下腹部、尻部、および顔部のそれぞれの存在確率に関する２次元ガウス分布が平均値を示すｘｙ座標成分の値と、分散パラメータとを出力する。本実施形態では、２次元ガウス分布の共分散行列を、対角行列とみなす。また、本実施形態では、２つの対角成分を等しいとみなす。したがって、本実施形態の２次元ガウス分布の分散パラメータは、１個である。回帰モデルは、画像領域内において扱える上限の人数を所定数ｋとしている。そして、回帰モデルでは、人毎に、胸部、正面下腹部、尻部、および顔部のそれぞれの存在確率の分布を表現するガウス分布に関する、上記ｘｙ座標成分の値、および分散パラメータの値を出力する。したがって、回帰モデルは、例として、「１２ｋ」個の出力値を有する。 The PU 32 outputs, for example, the values of the x and y coordinate components at which the two-dimensional Gaussian distribution for the existence probability of each of the chest, front lower abdomen, buttocks, and face takes its mean, together with a dispersion parameter. In this embodiment, the covariance matrix of the two-dimensional Gaussian distribution is regarded as a diagonal matrix. Furthermore, in this embodiment, the two diagonal components are regarded as equal. Therefore, the two-dimensional Gaussian distribution of this embodiment has one dispersion parameter. The regression model sets the upper limit of the number of people that can be handled within the image area to a predetermined number k. For each person, the regression model outputs the values of the x and y coordinate components and the value of the dispersion parameter for the Gaussian distribution that expresses the existence probability of each of the chest, front lower abdomen, buttocks, and face. Therefore, the regression model has, by way of example, "12k" output values.
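
The count of "12k" output values follows from 4 body parts, 3 values per part (x, y, and one dispersion parameter), and k people. A minimal sketch of how such a flat output vector might be unpacked is given below. The names `PARTS`, `PartGaussian`, and `unpack_outputs`, and the ordering of values within the vector, are assumptions made for illustration and are not specified in this description.

```python
from typing import Dict, List, NamedTuple

# Order of the parts inside the output vector is an assumption.
PARTS = ("chest", "front_lower_abdomen", "buttocks", "face")


class PartGaussian(NamedTuple):
    x: float         # x coordinate of the mean
    y: float         # y coordinate of the mean
    variance: float  # single dispersion parameter (isotropic Gaussian)


def unpack_outputs(outputs: List[float], k: int) -> List[Dict[str, PartGaussian]]:
    """Split a flat list of 12k values into per-person, per-part parameters."""
    values_per_person = len(PARTS) * 3  # 4 parts x (x, y, variance) = 12
    if len(outputs) != values_per_person * k:
        raise ValueError("expected 12 * k output values")
    people = []
    for p in range(k):
        base = p * values_per_person
        person = {}
        for j, part in enumerate(PARTS):
            x, y, var = outputs[base + 3 * j: base + 3 * j + 3]
            person[part] = PartGaussian(x, y, var)
        people.append(person)
    return people
```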

回帰モデルは、例として、教師あり学習によって学習された学習済みモデルである。
図５に、回帰モデルの訓練データを例示する。訓練データは、胸部、正面下腹部、尻部、および顔部のそれぞれの代表点の座標（ｘｉ，ｙｉ：ｉ＝１～４）と、分散パラメータの値が定義されたデータである。分散パラメータは、画像領域に占める人の大きさに応じて設定される。 The regression model is, for example, a trained model learned by supervised learning.
FIG. 5 illustrates training data for the regression model. The training data is data in which the coordinates (xi, yi: i=1 to 4) of representative points of the chest, front lower abdomen, buttocks, and face, and the values of the dispersion parameters are defined. The dispersion parameter is set depending on the size of the person occupying the image area.

なお、回帰モデルは、ニューラルネットワーク（以下、ＮＮと記載）を含むモデルであってよい。具体的には、畳み込みニューラルネットワーク（以下、ＣＮＮと記載）としてもよい。ただし、ＣＮＮに限らない。 Note that the regression model may be a model including a neural network (hereinafter referred to as NN). Specifically, it may be a convolutional neural network (hereinafter referred to as CNN). However, it is not limited to CNN.

回帰モデルは、図１に示した記憶装置３４に記憶された位置情報写像データ３４ｂによって規定される。位置情報写像データ３４ｂは、パラメトリックモデルである回帰モデルの学習済みのパラメータの値を含むデータである。すなわち、回帰モデルがＣＮＮの場合、位置情報写像データ３４ｂは、畳み込み処理に用いる各フィルタの値を含む。また、回帰モデルが全結合レイヤ等を含む場合、位置情報写像データ３４ｂは、全結合レイヤの重みを示すパラメータの値を含む。なお、回帰モデルがＮＮの場合、位置情報写像データ３４ｂは、活性化関数を規定するデータを含んでもよい。ただし、たとえば活性化関数を規定するデータは、画像評価プログラム３４ａに含めてもよい。なお、回帰モデルは、教師なし学習によって学習された学習済みモデルであってもよく、半教師あり学習によって学習された学習済みモデルであってもよい。 The regression model is defined by the position information mapping data 34b stored in the storage device 34 shown in FIG. The position information mapping data 34b is data that includes values of learned parameters of a regression model that is a parametric model. That is, when the regression model is CNN, the position information mapping data 34b includes the values of each filter used in the convolution process. Furthermore, when the regression model includes a fully connected layer or the like, the position information mapping data 34b includes the value of a parameter indicating the weight of the fully connected layer. Note that when the regression model is a NN, the position information mapping data 34b may include data that defines an activation function. However, for example, data defining the activation function may be included in the image evaluation program 34a. Note that the regression model may be a trained model learned by unsupervised learning, or a trained model learned by semi-supervised learning.

図４に戻り、ＰＵ３２は、Ｓ３０の処理を完了する場合、回帰モデルの出力値に基づき、胸部、正面下腹部、尻部、および顔部のそれぞれに関するヒートマップを生成する（Ｓ３２）。それらヒートマップは、いずれも「ｗ×ｈ」個の領域を有する。ヒートマップは、「ｗ×ｈ」個の領域のそれぞれに、存在確率に応じた値が定められたマップである。たとえば胸部のヒートマップは、胸部に対応する２次元ガウス分布の平均値の座標（ｘ１，ｙ１）に対応する領域において、最も大きい値が定められている。そして、座標（ｘ１，ｙ１）に対応する領域から離れた領域ほど、小さい値が定められている。座標（ｘ１，ｙ１）に対応する領域からの距離と値との関係は、分散パラメータによって規定される。 Returning to FIG. 4, when completing the process of S30, the PU 32 generates a heat map for each of the chest, front lower abdomen, buttocks, and face based on the output values of the regression model (S32). Each of these heat maps has "w×h" regions. A heat map is a map in which a value corresponding to the existence probability is set for each of the "w×h" regions. For example, in the heat map of the chest, the largest value is set in the region corresponding to the coordinates (x1, y1) of the mean of the two-dimensional Gaussian distribution corresponding to the chest. The farther a region is from the region corresponding to the coordinates (x1, y1), the smaller the value set for it. The relationship between the distance from the region corresponding to the coordinates (x1, y1) and the value is defined by the dispersion parameter.
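
The heat-map generation of S32 can be sketched as follows for one body part of one person. The function name `gaussian_heat_map` and the unnormalized form exp(-d²/(2·variance)), which gives a peak value of 1 at the mean, are assumptions; the description only states that the value is largest at (x1, y1) and decreases with distance at a rate governed by the dispersion parameter.

```python
import math
from typing import List


def gaussian_heat_map(w: int, h: int, x1: float, y1: float,
                      variance: float) -> List[List[float]]:
    """Return h rows of w values that peak at (x1, y1) and decay with distance."""
    heat = []
    for y in range(h):
        row = []
        for x in range(w):
            # Squared distance from the mean of the isotropic Gaussian.
            d2 = (x - x1) ** 2 + (y - y1) ** 2
            row.append(math.exp(-d2 / (2.0 * variance)))
        heat.append(row)
    return heat
```

A larger dispersion parameter spreads the high-value region over more of the image, which matches the statement that the parameter is set according to the size of the person in the image area.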

次にＰＵ３２は、フレームデータＦＤと、同フレームデータＦＤから生成されたヒートマップとを、識別モデル７０ａに入力することによって、フレームデータＦＤを評価する評価変数ｙｏｋ，ｙｎｇの値を算出する（Ｓ３４）。評価変数ｙｏｋは、フレームデータＦＤが、胸部、正面下腹部、および尻部の露出がない画像を示すデータである確率を示す変数である。また、評価変数ｙｎｇは、フレームデータＦＤが、胸部、正面下腹部、および尻部の少なくとも１つに露出がある画像を示すデータである確率を示す変数である。 Next, the PU 32 calculates the values of evaluation variables yok and yng for evaluating the frame data FD by inputting the frame data FD and the heat map generated from the frame data FD into the identification model 70a (S34 ). The evaluation variable yok is a variable that indicates the probability that the frame data FD is data representing an image in which the chest, front lower abdomen, and buttocks are not exposed. Furthermore, the evaluation variable yng is a variable that indicates the probability that the frame data FD is data representing an image in which at least one of the chest, front lower abdomen, and buttocks is exposed.

識別モデル７０ａは、例として、ＣＮＮを含む。識別モデル７０ａにおいて、Ｒ，Ｇ，Ｂの３つの「ｗ×ｈ」のデータと、胸部、正面下腹部、尻部、および顔部のそれぞれの「ｗ×ｈ」のヒートマップ６０～６６とは、１または複数のフィルタｆｌによって、畳み込まれる。フィルタｆｌは、「ａ×ｂ×７」個の数値よりなる。ただし、「ａ」は、「ｗ」より小さい自然数である。また、「ｂ」は、「ｈ」より小さい自然数である。この処理は、フレームデータＦＤを構成する各画素領域と、ヒートマップ６０～６６の対応する領域とが、互いに対応付けられて識別モデル７０ａに入力されることを意味する。 The identification model 70a includes, for example, a CNN. In the identification model 70a, the three "w×h" data planes for R, G, and B and the "w×h" heat maps 60 to 66 for the chest, front lower abdomen, buttocks, and face are convolved by one or more filters fl. Each filter fl consists of "a×b×7" numerical values. Here, "a" is a natural number smaller than "w", and "b" is a natural number smaller than "h". This processing means that each pixel region constituting the frame data FD and the corresponding regions of the heat maps 60 to 66 are input to the identification model 70a in association with each other.
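
The 7-channel input and a single "a×b×7" filter can be illustrated with the sketch below. The helper names `stack_channels` and `convolve_valid`, the stride of 1, and the absence of padding are assumptions; as is usual in CNN implementations, the filter is applied without flipping (cross-correlation).

```python
from typing import List

Grid = List[List[float]]  # h rows, each holding w values


def stack_channels(rgb: List[Grid], heat_maps: List[Grid]) -> List[Grid]:
    """Stack the 3 colour planes and the 4 heat maps into a 7-channel input."""
    if len(rgb) != 3 or len(heat_maps) != 4:
        raise ValueError("expected 3 colour planes and 4 heat maps")
    return list(rgb) + list(heat_maps)


def convolve_valid(channels: List[Grid], kernel: List[Grid]) -> Grid:
    """Apply one filter (one b-by-a plane per channel), stride 1, no padding."""
    h, w = len(channels[0]), len(channels[0][0])
    b, a = len(kernel[0]), len(kernel[0][0])
    out = []
    for top in range(h - b + 1):
        row = []
        for left in range(w - a + 1):
            acc = 0.0
            # Each output value mixes pixel values and heat-map values
            # taken from the same image location.
            for c, plane in enumerate(channels):
                for dy in range(b):
                    for dx in range(a):
                        acc += kernel[c][dy][dx] * plane[top + dy][left + dx]
            row.append(acc)
        out.append(row)
    return out
```

Because every filter spans all seven channels, the colour values and the existence probabilities at the same location are combined in a single weighted sum, which is the association the paragraph above describes.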

識別モデル７０ａは、いくつかの畳み込みレイヤおよびプーリングレイヤ等を有する。なお、識別モデル７０ａのＣＮＮは、残差ブロックを有してもよい。識別モデル７０ａは、下流の全結合レイヤＭ１０において、特徴マップが結合されることによって２つの出力値が出力される。そして、出力活性化関数としての２出力のソフトマックス関数Ｍ１２によって、評価変数ｙｏｋ，ｙｎｇの値が算出される。 The identification model 70a has several convolution layers, pooling layers, and the like. Note that the CNN of the identification model 70a may have residual blocks. In the identification model 70a, the feature maps are combined in the downstream fully connected layer M10, which outputs two output values. Then, the values of the evaluation variables yok and yng are calculated by the two-output softmax function M12 serving as the output activation function.

識別モデル７０ａは、例として、教師あり学習によって学習がなされた学習済みモデルである。識別モデル７０ａは、次の訓練データを用いて訓練されたモデルである。すなわち、胸部、正面下腹部、および尻部の露出がない画像を示す画像データと、胸部、正面下腹部、および尻部の少なくとも１つに露出がある画像を示す画像データとである。学習によって、識別モデル７０ａを規定するフィルタの数値および全結合レイヤＭ１０の重みパラメータ等が学習される。なお、識別モデル７０ａは、教師なし学習によって学習された学習済みモデルであってもよく、半教師あり学習によって学習された学習済みモデルであってもよい。 The identification model 70a is, for example, a trained model that has been trained by supervised learning. The identification model 70a is a model trained using the following training data. That is, there are image data showing images in which the chest, front lower abdomen, and buttocks are not exposed, and image data showing images in which at least one of the chest, front lower abdomen, and buttocks is exposed. Through learning, numerical values of the filter that define the identification model 70a, weight parameters of the fully connected layer M10, etc. are learned. Note that the identification model 70a may be a learned model learned by unsupervised learning, or may be a learned model learned by semi-supervised learning.

学習がなされたフィルタの数値および全結合レイヤＭ１０の重みパラメータ等は、図１に示した記憶装置３４に記憶される評価写像データ３４ｃに含まれる。評価写像データ３４ｃは、フィルタの数値および全結合レイヤＭ１０の重みパラメータ等を含むことによって、識別モデル７０ａを規定する。なお、評価写像データ３４ｃは、ＣＮＮの活性化関数を規定するデータを含んでもよい。ただし、ＣＮＮの活性化関数を規定するデータは、画像評価プログラム３４ａに含めてもよい。 The numerical values of the trained filter, the weight parameters of the fully connected layer M10, and the like are included in the evaluation mapping data 34c stored in the storage device 34 shown in FIG. The evaluation mapping data 34c defines the identification model 70a by including filter values, weight parameters of the fully connected layer M10, and the like. Note that the evaluation mapping data 34c may include data that defines an activation function of the CNN. However, the data defining the CNN activation function may be included in the image evaluation program 34a.

ＰＵ３２は、Ｓ３４の処理を完了する場合、評価変数ｙｏｋの値が評価変数ｙｎｇの値よりも大きいか否かを判定する（Ｓ３６）。この処理は、評価変数ｙｏｋ，ｙｎｇの値に基づき、フレームデータＦＤが胸部、正面下腹部、および尻部の露出がない画像を示すか否かを判定する処理である。 When completing the process of S34, the PU 32 determines whether the value of the evaluation variable yok is larger than the value of the evaluation variable yng (S36). This process determines, based on the values of the evaluation variables yok and yng, whether the frame data FD represents an image in which the chest, front lower abdomen, and buttocks are not exposed.

ＰＵ３２は、評価変数ｙｏｋの値が評価変数ｙｎｇの値よりも大きいと判定する場合（Ｓ３６：ＹＥＳ）、胸部、正面下腹部、および尻部の露出がない旨評価する（Ｓ３８）。以下では、この評価を、ＯＫ判定と称する。一方、ＰＵ３２は、評価変数ｙｏｋの値が評価変数ｙｎｇの値以下であると判定する場合（Ｓ３６：ＮＯ）、胸部、正面下腹部、および尻部の少なくとも１つに露出がある旨評価する（Ｓ４０）。以下では、この評価を、ＮＧ判定と称する。 When determining that the value of the evaluation variable yok is greater than the value of the evaluation variable yng (S36: YES), the PU 32 evaluates that the chest, front lower abdomen, and buttocks are not exposed (S38). Hereinafter, this evaluation will be referred to as OK determination. On the other hand, when determining that the value of the evaluation variable yok is less than or equal to the value of the evaluation variable yng (S36: NO), the PU 32 evaluates that at least one of the chest, front lower abdomen, and buttocks is exposed ( S40). Hereinafter, this evaluation will be referred to as NG determination.
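
The two-output softmax function M12 and the comparison in S36 to S40 can be sketched as follows. The function names `softmax2` and `judge_frame` and the argument names `z_ok` and `z_ng` (the two outputs of the fully connected layer) are assumptions introduced for illustration.

```python
import math
from typing import Tuple


def softmax2(z_ok: float, z_ng: float) -> Tuple[float, float]:
    """Two-output softmax; subtracting the maximum avoids overflow."""
    m = max(z_ok, z_ng)
    e_ok, e_ng = math.exp(z_ok - m), math.exp(z_ng - m)
    total = e_ok + e_ng
    return e_ok / total, e_ng / total


def judge_frame(z_ok: float, z_ng: float) -> str:
    """Return "OK" only when yok is strictly larger than yng (S36)."""
    y_ok, y_ng = softmax2(z_ok, z_ng)
    return "OK" if y_ok > y_ng else "NG"
```

Note that a tie between yok and yng yields an NG determination, since S36 requires yok to be strictly larger.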

なお、ＰＵ３２は、フレームデータＦＤ（ｉ），ＦＤ（ｉ＋Ｎ），ＦＤ（ｉ＋２Ｎ）に関してＳ３８，Ｓ４０の処理を完了する場合、図３のＳ１６の処理を完了する。
そしてＰＵ３２は、フレームデータＦＤ（ｉ），ＦＤ（ｉ＋Ｎ），ＦＤ（ｉ＋２Ｎ）のうちの２つ以上についてＯＫ判定がなされたか否かを判定する（Ｓ１８）。そして、ＰＵ３２は、ＯＫ判定が１つ以下であると判定する場合（Ｓ１８：ＮＯ）、Ｓ１０の処理によって取得したＣＭ動画データの配信を禁止する（Ｓ２０）。Ｓ１８の処理において否定判定されることは、ＣＭ動画データに、胸部、正面下腹部、および尻部の少なくとも１つに露出がある旨の最終的な判定がなされたことを意味する。 Note that when the PU 32 completes the processing of S38 and S40 regarding the frame data FD(i), FD(i+N), and FD(i+2N), it completes the processing of S16 in FIG. 3.
Then, the PU 32 determines whether an OK determination has been made for two or more of the frame data FD(i), FD(i+N), and FD(i+2N) (S18). If the PU 32 determines that there is one or fewer OK determinations (S18: NO), the PU 32 prohibits distribution of the CM video data acquired through the process of S10 (S20). A negative determination in the process of S18 means that a final determination has been made that at least one of the chest, front lower abdomen, and buttocks is exposed in the CM video data.

一方、ＰＵ３２は、ＯＫ判定が２つ以上であると判定する場合（Ｓ１８：ＹＥＳ）、変数ｉに「Ｎ」を加算する（Ｓ２２）。そしてＰＵ３２は、フレームデータＦＤ（ｉ＋２Ｎ）が存在するか否かを判定する（Ｓ２４）。この処理は、ＣＭ動画データの全てについてＳ１６の処理を完了したか否かを判定する処理である。ＰＵ３２は、フレームデータＦＤ（ｉ＋２Ｎ）が存在すると判定する場合（Ｓ２４：ＹＥＳ）、Ｓ１４の処理に戻る。一方、ＰＵ３２は、フレームデータＦＤ（ｉ＋２Ｎ）が存在しないと判定する場合（Ｓ２４：ＮＯ）、ＣＭ動画データを配信可能とする（Ｓ２６）。すなわち、ＰＵ３２は、ユーザ端末５０からのリクエストに応じてＣＭ動画データを配信する。 On the other hand, if the PU 32 determines that there are two or more OK determinations (S18: YES), it adds "N" to the variable i (S22). The PU 32 then determines whether frame data FD(i+2N) exists (S24). This process is a process for determining whether the process of S16 has been completed for all of the CM video data. If the PU 32 determines that frame data FD (i+2N) exists (S24: YES), the process returns to S14. On the other hand, when the PU 32 determines that the frame data FD (i+2N) does not exist (S24: NO), the PU 32 enables distribution of the CM video data (S26). That is, the PU 32 distributes CM video data in response to a request from the user terminal 50.
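
The loop of S12 to S26 can be summarized by the sketch below. The function name `may_distribute`, the `judge` callback standing in for the frame evaluation process of S16, and the 0-based indexing are assumptions. The sketch also assumes that a video too short to provide three samples is treated as distributable, a case the description does not address.

```python
from typing import Callable, Sequence


def may_distribute(frames: Sequence[object], n: int,
                   judge: Callable[[object], bool]) -> bool:
    """Return True when every sampled triple has at least two OK frames."""
    i = 0  # S12: 0-based index of the first frame (the text counts from FD(1))
    while i + 2 * n < len(frames):  # S24: does FD(i+2N) exist?
        triple = (frames[i], frames[i + n], frames[i + 2 * n])  # S14
        ok_count = sum(1 for f in triple if judge(f))           # S16
        if ok_count < 2:   # S18: NO
            return False   # S20: prohibit distribution
        i += n             # S22
    return True            # S26: distribution allowed
```

Since i advances by N rather than 3N, consecutive triples overlap in two frames, so each sampled frame other than those near the ends of the video takes part in three majority votes.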

なお、ＰＵ３２は、Ｓ２０，Ｓ２６の処理を完了する場合、図３に示した一連の処理を一旦終了する。
ここで、本実施形態の作用および効果について説明する。 Note that when the PU 32 completes the processes of S20 and S26, it temporarily ends the series of processes shown in FIG.
Here, the functions and effects of this embodiment will be explained.

ＰＵ３２は、フレームデータＦＤを識別モデル７０ａに入力することによって、評価変数ｙｏｋ，ｙｎｇの値を算出する。そして、ＰＵ３２は、評価変数ｙｏｋ，ｙｎｇの値に応じてフレームデータＦＤを評価する。この処理は、フレームデータＦＤを入力としてＯＫ判定またはＮＧ判定である、フレームデータＦＤの評価結果を出力する評価写像を利用する処理である。 The PU 32 calculates the values of the evaluation variables yok and yng by inputting the frame data FD into the identification model 70a. Then, the PU 32 evaluates the frame data FD according to the values of the evaluation variables yok and yng. This process is a process that uses an evaluation mapping that inputs the frame data FD and outputs an evaluation result of the frame data FD, which is an OK judgment or a NG judgment.

上記識別モデル７０ａは、学習済みモデルである。特に、識別モデル７０ａは、例として、中間層の層数が多いディープニューラルネットワーク（以下、ＤＮＮと記載）である。ＤＮＮは、特徴量を自動で抽出する。そのため、上記評価写像を、ＤＮＮを用いて構成する場合、その入力をフレームデータＦＤのみとすることが簡素である。しかし、その場合、ＤＮＮが胸部、正面下腹部、および尻部に着目して評価結果を出力するとは限らない。 The identification model 70a is a learned model. In particular, the identification model 70a is, for example, a deep neural network (hereinafter referred to as DNN) having a large number of intermediate layers. DNN automatically extracts feature amounts. Therefore, when the evaluation mapping is configured using a DNN, it is simple to input only the frame data FD. However, in that case, the DNN does not necessarily output evaluation results focusing on the chest, front lower abdomen, and buttocks.

そこで、本実施形態では、識別モデル７０ａを、フレームデータＦＤに加えて胸部、正面下腹部、および尻部のそれぞれの存在確率を示すヒートマップ６０～６４を入力とするモデルとした。換言すれば、評価写像の入力に、胸部、正面下腹部、および尻部のそれぞれの存在確率を示すヒートマップ６０～６４を加えた。これにより、ヒートマップ６０～６４を加えない場合と比較して、識別モデル７０ａにおいて抽出された特徴量を示す特徴量マップが胸部、正面下腹部、および尻部のそれぞれの特徴を表現する可能性が高まる。そのため、評価変数ｙｏｋ，ｙｎｇの値を、胸部、正面下腹部、および尻部の少なくとも１つが露出しているか否かをより高精度に表現した値とすることができる。 Therefore, in this embodiment, the identification model 70a is a model that takes as input, in addition to the frame data FD, the heat maps 60 to 64 indicating the existence probability of each of the chest, front lower abdomen, and buttocks. In other words, the heat maps 60 to 64 indicating the respective existence probabilities of the chest, front lower abdomen, and buttocks are added to the input of the evaluation mapping. As a result, compared to the case where the heat maps 60 to 64 are not added, the feature maps indicating the feature amounts extracted in the identification model 70a are more likely to express the respective features of the chest, front lower abdomen, and buttocks. Therefore, the values of the evaluation variables yok and yng can be values that more accurately represent whether at least one of the chest, front lower abdomen, and buttocks is exposed.

なお、フレームデータＦＤが示す領域のうちのヒートマップ６０～６４によって示される胸部、正面下腹部、および尻部の存在確率が低い領域を一律、マスキングして評価写像に入力することも考えられる。しかし、その場合、胸部、正面下腹部、および尻部以外の露出部分に示される肌の色に関する情報、および人の挙動に関する情報等が省かれることとなる。しかし、それらの情報は、胸部、正面下腹部、および尻部の少なくとも１つが露出しているか否かを評価するうえで有益な情報となりうる。 It is also conceivable to uniformly mask the areas where the existence probability of the chest, front lower abdomen, and buttocks shown by the heat maps 60 to 64 among the areas shown by the frame data FD is low, and then input the masked areas to the evaluation mapping. However, in that case, information regarding the skin color shown in exposed areas other than the chest, front lower abdomen, and buttocks, information regarding human behavior, etc. will be omitted. However, such information can be useful information in evaluating whether at least one of the chest, front lower abdomen, and buttocks is exposed.

したがって、上記識別モデル７０ａによれば、画像中の胸部、正面下腹部、および尻部に着目する可能性を高めつつ、胸部、正面下腹部、および尻部以外の情報をも加味して評価変数ｙｏｋ，ｙｎｇの値を高精度に算出できる。 Therefore, according to the identification model 70a, the values of the evaluation variables yok and yng can be calculated with high accuracy by also taking into account information other than the chest, front lower abdomen, and buttocks, while increasing the likelihood that the model focuses on the chest, front lower abdomen, and buttocks in the image.

以上説明した本実施形態によれば、さらに以下に記載する作用および効果が得られる。
（１－１）識別モデル７０ａの入力に、顔部の存在確率を示すヒートマップ６６を含めた。身体のうち顔部は露出している可能性が最も高い。そのため、顔部に着目することにより、その人の肌の色に着目する可能性を高めることができる。したがって、評価変数ｙｏｋ，ｙｎｇの値を、胸部、正面下腹部、および尻部の少なくとも１つが露出しているか否かをより高精度に評価した値とすることができる。 According to the present embodiment described above, the following actions and effects are further obtained.
(1-1) A heat map 66 indicating the existence probability of a face part is included in the input of the identification model 70a. The facial part of the body is most likely to be exposed. Therefore, by focusing on the face, it is possible to increase the possibility of focusing on the person's skin color. Therefore, the values of the evaluation variables yok and yng can be values that more accurately evaluate whether or not at least one of the chest, front lower abdomen, and buttocks is exposed.

（１－２）時系列的にＮ個周期で隣接する３つのフレームデータＦＤのうちの２つ以上でＯＫ判定される場合に、２Ｎ＋１個のフレームデータの区間に問題がないとした。これにより、１つのフレームデータＦＤに関して誤判定された場合であっても、誤判定の影響を抑制できる。 (1-2) If OK is determined for two or more of three frame data FDs that are chronologically adjacent at N intervals, it is assumed that there is no problem in the section of 2N+1 frame data. Thereby, even if an erroneous determination is made regarding one frame data FD, the influence of the erroneous determination can be suppressed.

＜第２の実施形態＞
以下、第２の実施形態について、第１の実施形態との相違点を中心に図面を参照しつつ説明する。 <Second embodiment>
The second embodiment will be described below with reference to the drawings, focusing on the differences from the first embodiment.

図６に、本実施形態にかかる画像評価装置３０によって実行される、上記基準に沿った評価に基づいてＣＭ動画データがユーザ端末５０によって選択的に表示されることを支援する処理の手順を示す。図６に示す処理は、記憶装置３４に記憶された画像評価プログラム３４ａを、ＰＵ３２がたとえば所定周期でくり返し実行することにより実現される。なお、図６において、図３に示した処理に対応する処理については、便宜上、同一のステップ番号を付与してその説明を省略する。 FIG. 6 shows the procedure of a process executed by the image evaluation device 30 according to the present embodiment to support selective display of CM video data by the user terminal 50 based on evaluation in accordance with the above criteria. . The process shown in FIG. 6 is realized by the PU 32 repeatedly executing the image evaluation program 34a stored in the storage device 34, for example, at a predetermined period. In addition, in FIG. 6, for convenience, the same step numbers are given to the processes corresponding to the processes shown in FIG. 3, and the description thereof will be omitted.

図６に示すように、ＰＵ３２は、Ｓ１８の処理において否定判定する場合（Ｓ１８：ＮＯ）、ＣＭ動画データに制限指令を付与する（Ｓ２０ａ）。そして、ＰＵ３２は、Ｓ２６の処理に移行する。 As shown in FIG. 6, when the PU 32 makes a negative determination in the process of S18 (S18: NO), it gives a restriction command to the CM video data (S20a). The PU 32 then proceeds to the process of S26.

制限指令は、次の何れかの指令である。
・ユーザ端末５０においてＣＭ動画データが再生される場合、始めに警告をする指令である。これはユーザインターフェース５８が備えるディスプレイに視覚情報を表示する処理としてもよい。またたとえば、ユーザインターフェース５８が備えるスピーカから音声情報を出力する処理としてもよい。 The restriction command is one of the following commands.
- When commercial video data is played back on the user terminal 50, this is a command to issue a warning at the beginning. This may be a process of displaying visual information on a display included in the user interface 58. Alternatively, for example, the process may be performed to output audio information from a speaker included in the user interface 58.

・ユーザ端末５０においてＣＭ動画データが再生される場合、所定部位が露出しているシーンが再生されるときに再生画像にマスクをする指令である。マスクをする指令は、たとえば、再生画像に対して所定画像を重畳することでマスクをする指令であってもよい。また、たとえば、マスクをする指令は、再生画像に対してぼかし等のエフェクトを適用することでマスクをする指令であってもよい。またたとえば、マスクをする指令は、露出している所定部位の領域を少なくとも含む再生画像の一部に対して所定画像を重畳することでマスクをする指令であってもよい。またたとえば、露出している所定部位の領域を少なくとも含む再生画像の一部に対してぼかし等のエフェクトを適用することでマスクをする指令であってもよい。 - When commercial video data is played back on the user terminal 50, this is a command to mask the playback image when a scene in which a predetermined part is exposed is played back. The command to mask may be, for example, a command to perform masking by superimposing a predetermined image on the reproduced image. Further, for example, the command to mask may be a command to perform masking by applying an effect such as blurring to the reproduced image. For example, the command to mask may be a command to perform masking by superimposing a predetermined image on a part of the reproduced image that includes at least the region of the exposed predetermined part. Alternatively, for example, it may be a command to perform masking by applying an effect such as blurring to a part of the reproduced image that includes at least the region of the exposed predetermined part.

・ユーザ端末５０に対してなされる、ＣＭ動画データを再生してはいけない旨の指令である。これは、たとえばアプリケーションプログラム５４ａが、同指令を検知する場合、ＣＭ動画データの再生をしない設定とすることで実現できる。 - This is a command issued to the user terminal 50 to the effect that commercial video data should not be played back. This can be realized, for example, by setting the application program 54a not to reproduce the commercial video data when the application program 54a detects the command.

＜第３の実施形態＞
以下、第３の実施形態について、第１の実施形態との相違点を中心に図面を参照しつつ説明する。 <Third embodiment>
The third embodiment will be described below with reference to the drawings, focusing on the differences from the first embodiment.

本実施形態では、識別モデル７０ａに代えて、識別モデル７０ｂを用いる。
図７に、フレーム評価処理の詳細を示す。図７に示す処理は、記憶装置３４に記憶された画像評価プログラム３４ａを、ＰＵ３２がたとえば所定周期でくり返し実行することにより実現される。なお、図７において、図４に示した処理に対応する処理については、便宜上、同一のステップ番号を付与する。 In this embodiment, the identification model 70b is used instead of the identification model 70a.
FIG. 7 shows details of the frame evaluation process. The process shown in FIG. 7 is realized by the PU 32 repeatedly executing the image evaluation program 34a stored in the storage device 34, for example, at a predetermined period. Note that in FIG. 7, the same step numbers are given to the processes corresponding to the processes shown in FIG. 4 for convenience.

図７に示す一連の処理において、ＰＵ３２は、ヒートマップ６０～６６を生成すると（Ｓ３２）、それらヒートマップ６０～６６を縮小する（Ｓ４２）。すなわち、Ｓ３２の処理において生成された「ｗ×ｈ」の領域数のヒートマップを「ｗ１×ｈ１」の領域数を有したヒートマップに縮小する。ここで、「ｗ１＜ｗ，ｈ１＜ｈ」である。この処理は、たとえば、「ｗ×ｈ」の領域数のヒートマップを「１／４」のヒートマップに縮小する場合、次のようにすればよい。すなわち、「ｗ×ｈ」の領域数のヒートマップの「４×４」の領域の平均値を、「ｗ１×ｈ１」の領域数を有したヒートマップの１つの領域の値とすればよい。 In the series of processes shown in FIG. 7, after generating the heat maps 60 to 66 (S32), the PU 32 reduces the heat maps 60 to 66 (S42). That is, each heat map with "w×h" regions generated in the process of S32 is reduced to a heat map with "w1×h1" regions. Here, "w1<w, h1<h". For example, when a heat map with "w×h" regions is reduced to a "1/4" heat map, this process may be performed as follows. That is, the average value of a "4×4" block of regions of the heat map with "w×h" regions may be used as the value of one region of the heat map with "w1×h1" regions.
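
The reduction of S42 by block averaging can be sketched as follows. The function name `shrink_heat_map` and the requirement that w and h be multiples of the block size are assumptions; with a block size of 4, each side of the map becomes 1/4 of its original length.

```python
from typing import List

Grid = List[List[float]]  # h rows, each holding w values


def shrink_heat_map(heat: Grid, block: int = 4) -> Grid:
    """Average non-overlapping block-by-block regions of the heat map."""
    h, w = len(heat), len(heat[0])
    if h % block or w % block:
        raise ValueError("height and width must be multiples of block")
    out = []
    for top in range(0, h, block):
        row = []
        for left in range(0, w, block):
            total = sum(heat[y][x]
                        for y in range(top, top + block)
                        for x in range(left, left + block))
            row.append(total / (block * block))
        out.append(row)
    return out
```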

次に、ＰＵ３２は、縮小されたヒートマップから、「ｗ１×ｈ１」の領域数の係数行列ＭＫを算出する（Ｓ４４）。係数行列ＭＫは、「ｗ１×ｈ１」個の領域のそれぞれに関する係数Ｋの値を定義する。係数Ｋの値は、縮小されたヒートマップにおける領域の値が閾値以上の場合に「１」よりも大きい値となる。係数Ｋの値は、縮小されたヒートマップにおける領域の値が閾値未満の場合に「１」となる。ここで、閾値は、「０」よりも大きく「１」未満の値である。Ｓ４４の処理において生成される係数行列ＭＫは、ヒートマップ６０～６６に対応した４個である。たとえば、ヒートマップ６０に対応した係数行列ＭＫは、その成分の値が「１」よりも大きい場合、その領域に胸部が存在する確率が大きいことを意味する。また、たとえば、ヒートマップ６０に対応した係数行列ＭＫは、その成分の値が「１」の場合、その領域に胸部が存在する確率が小さいことを意味する。 Next, the PU 32 calculates a coefficient matrix MK with the number of regions of “w1×h1” from the reduced heat map (S44). Coefficient matrix MK defines the value of coefficient K for each of “w1×h1” regions. The value of the coefficient K is greater than "1" when the value of the region in the reduced heat map is equal to or greater than the threshold value. The value of the coefficient K is "1" when the value of the area in the reduced heat map is less than the threshold value. Here, the threshold value is a value greater than "0" and less than "1". There are four coefficient matrices MK generated in the process of S44, which correspond to heat maps 60 to 66. For example, when the coefficient matrix MK corresponding to the heat map 60 has a value of a component greater than "1", it means that there is a high probability that a chest exists in that region. Further, for example, when the value of a component of the coefficient matrix MK corresponding to the heat map 60 is "1", it means that the probability that a chest exists in that region is small.
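
The calculation of one coefficient matrix MK in S44 can be sketched as follows. The function name `coefficient_matrix` and the specific values `threshold=0.5` and `gain=2.0` are assumptions; the description only requires a threshold between "0" and "1" and a coefficient larger than "1" where the reduced heat-map value is at or above the threshold.

```python
from typing import List

Grid = List[List[float]]


def coefficient_matrix(small_heat: Grid, threshold: float = 0.5,
                       gain: float = 2.0) -> Grid:
    """Return K = gain where the heat value >= threshold, and K = 1 elsewhere."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    if gain <= 1.0:
        raise ValueError("gain must be larger than 1")
    return [[gain if v >= threshold else 1.0 for v in row]
            for row in small_heat]
```

Calling this once for each of the reduced heat maps 60 to 66 gives the four coefficient matrices.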

次に、ＰＵ３２は、識別モデル７０ｂに、フレームデータＦＤおよび係数行列ＭＫを入力することによって、評価変数ｙｏｋ，ｙｎｇの値を算出する（Ｓ３４ａ）。
識別モデル７０ｂは、フレームデータＦＤおよび係数行列ＭＫを入力とし、評価変数ｙｏｋ，ｙｎｇの値を出力する。ただし、識別モデル７０ｂは、係数行列ＭＫを最上流から入力するモデルではない。 Next, the PU 32 calculates the values of the evaluation variables yok and yng by inputting the frame data FD and the coefficient matrix MK to the identification model 70b (S34a).
The identification model 70b inputs the frame data FD and the coefficient matrix MK, and outputs the values of evaluation variables yok and yng. However, the identification model 70b is not a model in which the coefficient matrix MK is input from the most upstream side.

識別モデル７０ｂは、特徴抽出器Ｍ２０を備える。特徴抽出器Ｍ２０は、フレームデータＦＤを入力として、フレームデータＦＤの特徴量を抽出する。特徴抽出器Ｍ２０は、ＣＮＮを含む。この特徴量は、「ｈ１×ｗ１」の特徴マップ８０である。特徴マップ８０は、複数存在する。識別モデル７０ｂは、積算処理Ｍ２２を有する。積算処理Ｍ２２は、４個の係数行列ＭＫのそれぞれについて、特徴マップ８０のいくつかとのアダマール積を算出する処理である。ここで、１つの特徴マップ８０とアダマール積が算出される係数行列ＭＫは、４個の係数行列ＭＫのうちのいずれか１つであってよい。また、特徴抽出器Ｍ２０が出力する特徴マップ８０の中には、係数行列ＭＫとのアダマール積の算出対象とされないものもある。 The identification model 70b includes a feature extractor M20. The feature extractor M20 receives the frame data FD as input and extracts the feature amounts of the frame data FD. The feature extractor M20 includes a CNN. Each feature amount is a feature map 80 of "h1×w1". There are a plurality of feature maps 80. The identification model 70b has an integration process M22. The integration process M22 is a process of calculating, for each of the four coefficient matrices MK, the Hadamard product with some of the feature maps 80. Here, the coefficient matrix MK whose Hadamard product with a given feature map 80 is calculated may be any one of the four coefficient matrices MK. Furthermore, some of the feature maps 80 output by the feature extractor M20 are not subject to calculation of the Hadamard product with any coefficient matrix MK.

具体的には、たとえば、「ｎ」を自然数として、特徴マップ８０の数が「５ｎ」個の場合、ヒートマップ６０～６６のそれぞれに対応する係数行列ＭＫとの積算対象となる特徴マップ８０を、「ｎ」個ずつとしてもよい。その場合、いずれの係数行列ＭＫとの積算対象ともならない特徴マップ８０が「ｎ」個存在する。 Specifically, for example, when the number of feature maps 80 is "5n" where "n" is a natural number, the feature map 80 to be multiplied with the coefficient matrix MK corresponding to each of the heat maps 60 to 66 is , "n" pieces. In that case, there are "n" feature maps 80 that are not subject to integration with any coefficient matrix MK.
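
The allocation in the "5n" example can be sketched as follows. The function name `apply_coefficients` and the grouping of feature maps into contiguous blocks of n by index are assumptions; the description does not state which feature maps are paired with which coefficient matrix.

```python
from typing import List

Grid = List[List[float]]


def apply_coefficients(feature_maps: List[Grid],
                       coeff_matrices: List[Grid]) -> List[Grid]:
    """Take Hadamard products for the first 4n maps; copy the last n as is."""
    if len(coeff_matrices) != 4 or len(feature_maps) % 5:
        raise ValueError("expected 4 coefficient matrices and 5n feature maps")
    n = len(feature_maps) // 5
    out = []
    for idx, fmap in enumerate(feature_maps):
        group = idx // n  # groups 0..3 use a matrix, group 4 uses none
        if group < 4:
            mk = coeff_matrices[group]
            out.append([[f * k for f, k in zip(frow, krow)]
                        for frow, krow in zip(fmap, mk)])
        else:
            out.append([row[:] for row in fmap])
    return out
```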

識別モデル７０ｂは、識別ユニットＭ２４を有する。識別ユニットＭ２４には、積算処理Ｍ２２が施された特徴マップが入力される。識別ユニットＭ２４は、たとえば出力活性化関数がソフトマックス関数であるＣＮＮであってもよい。 The identification model 70b has an identification unit M24. The feature map subjected to the integration process M22 is input to the identification unit M24. The identification unit M24 may be, for example, a CNN whose output activation function is a softmax function.

なお、識別モデル７０ｂは、教師あり学習によって学習された学習済みモデルである。ここでの訓練データは、識別モデル７０ａによるものと同様である。本実施形態において、評価写像データ３４ｃは、識別モデル７０ｂの学習によって得られたパラメータを含む。ちなみに、特徴抽出器Ｍ２０を規定するパラメータについては、胸部、正面下腹部、および尻部の少なくとも１つに露出があるか否かに応じた画像を示す画像データを用いた学習によって得られたものでなくてもよい。すなわち、既存の画像認識処理によって得られた特徴抽出器を転移学習によって利用してもよい。 Note that the identification model 70b is a trained model learned by supervised learning. The training data here is the same as that for the identification model 70a. In this embodiment, the evaluation mapping data 34c includes parameters obtained by training the identification model 70b. Incidentally, the parameters defining the feature extractor M20 need not be ones obtained through learning that uses image data representing images classified according to whether at least one of the chest, front lower abdomen, and buttocks is exposed. That is, a feature extractor obtained for existing image recognition processing may be used through transfer learning.

以上説明した本実施形態によれば、以下の作用および効果が得られる。
（３－１）複数個の特徴マップ８０に、４個の係数行列ＭＫのうちのいずれか１つと選択的に合成されるマップを含めた。これにより、合成後の特徴マップは、胸部、正面下腹部、尻部、および顔部のうちの対応する部分の値が増幅されるマップとなることから、対応する部分の特徴を抽出するマップとなりやすい。 According to this embodiment described above, the following actions and effects can be obtained.
(3-1) The plurality of feature maps 80 include maps that are selectively combined with any one of the four coefficient matrices MK. As a result, each combined feature map is a map in which the values of the portion corresponding to the chest, front lower abdomen, buttocks, or face are amplified, and it therefore tends to be a map that extracts the features of the corresponding portion.

（３－２）複数個の特徴マップ８０に、４個の係数行列ＭＫのうちのいずれによっても補正されない特徴マップを含めた。これにより、胸部、正面下腹部、尻部、および顔部に特化しないものの胸部、正面下腹部、および尻部の少なくとも１つが露出しているか否かを判定するうえで有効な特徴を抽出するマップを生成させやすい。 (3-2) The plurality of feature maps 80 include feature maps that are not corrected by any of the four coefficient matrices MK. This makes it easier to generate maps that, although not specialized for the chest, front lower abdomen, buttocks, or face, extract features effective for determining whether at least one of the chest, front lower abdomen, and buttocks is exposed.

＜第４の実施形態＞
以下、第４の実施形態について、第１の実施形態との相違点を中心に図面を参照しつつ説明する。 <Fourth embodiment>
The fourth embodiment will be described below with reference to the drawings, focusing on the differences from the first embodiment.

本実施形態では、識別モデル７０ａに代えて、識別モデル７０ｃを用いる。
図８に、フレーム評価処理の詳細を示す。図８に示す処理は、記憶装置３４に記憶された画像評価プログラム３４ａを、ＰＵ３２がたとえば所定周期でくり返し実行することにより実現される。なお、図８において、図４に示した処理に対応する処理については、便宜上、同一のステップ番号を付与する。 In this embodiment, an identification model 70c is used instead of the identification model 70a.
FIG. 8 shows details of the frame evaluation process. The process shown in FIG. 8 is realized by the PU 32 repeatedly executing the image evaluation program 34a stored in the storage device 34, for example, at a predetermined period. Note that in FIG. 8, the same step numbers are given to the processes corresponding to the processes shown in FIG. 4 for convenience.

図８に示す一連の処理において、ＰＵ３２は、まず識別モデル７０ｃにフレームデータＦＤを入力することによって、評価変数ｙｏｋ，ｙｎｇの値を算出する（Ｓ３４ｂ）。識別モデル７０ｃは、フレームデータＦＤを入力することによって、評価変数ｙｏｋ，ｙｎｇの値を出力するモデルである。識別モデル７０ｃは、特徴抽出器Ｍ３０を含む。特徴抽出器Ｍ３０は、Ｒ，Ｇ，Ｂのそれぞれについて「ｗ×ｈ」からなる画像データを入力として、「ｗ２×ｈ２」の特徴マップ８２を出力する。ここで、「ｗ２＜ｗ，ｈ２＜ｈ」である。特徴抽出器Ｍ３０は、ＣＮＮを含む。特徴抽出器Ｍ３０が一度に出力する特徴マップ８２の数は、Ｊ個である。ただし、「Ｊ＞２」である。 In the series of processes shown in FIG. 8, the PU 32 first calculates the values of the evaluation variables yok and yng by inputting the frame data FD to the identification model 70c (S34b). The identification model 70c is a model that outputs the values of evaluation variables yok and yng by inputting the frame data FD. Discrimination model 70c includes a feature extractor M30. The feature extractor M30 inputs image data of "w×h" for each of R, G, and B, and outputs a "w2×h2" feature map 82. Here, "w2<w, h2<h". Feature extractor M30 includes a CNN. The number of feature maps 82 that the feature extractor M30 outputs at one time is J. However, "J>2".

識別モデル７０ｃは、全結合レイヤＭ３２を含む。全結合レイヤＭ３２は、Ｊ個の特徴マップ８２の「ｗ２×ｈ２×Ｊ」個の数値を結合して２つの出力値を出力する。それら２つの出力値は、ソフトマックス関数Ｍ３４に入力される。ソフトマックス関数Ｍ３４は、評価変数ｙｏｋ，ｙｎｇの値を出力する。 The identification model 70c includes a fully connected layer M32. The fully connected layer M32 combines "w2×h2×J" numerical values of the J feature maps 82 and outputs two output values. These two output values are input to the softmax function M34. The softmax function M34 outputs the values of evaluation variables yok and yng.

識別モデル７０ｃは、例として、教師あり学習によって学習された学習済みモデルである。識別モデル７０ｃの訓練データは、識別モデル７０ａの訓練データと同様である。なお、識別モデル７０ｃは、教師なし学習によって学習された学習済みモデルであってもよく、半教師あり学習によって学習された学習済みモデルであってもよい。 The identification model 70c is, for example, a learned model learned by supervised learning. The training data for the identification model 70c is similar to the training data for the identification model 70a. Note that the identification model 70c may be a learned model learned by unsupervised learning, or may be a learned model learned by semi-supervised learning.

Then, when the PU 32 determines that the evaluation variable yok is larger than the evaluation variable yng (S36: YES), it executes the final evaluation process (S50).
FIG. 9 shows details of the final evaluation process.

In the series of processes shown in FIG. 9, the PU 32 generates an activation map indicating the region on which the identification model 70c focused when the evaluation variable yok is large (S60). Here, the J feature maps are written as f1(x, y), f2(x, y), ..., fJ(x, y), where "1≦x≦w2, 1≦y≦h2". Among the parameters of the fully connected layer M32, those corresponding to the evaluation variable yok are denoted wok1, wok2, ..., wokJ.

In that case, the degree to which each coordinate (x, y) contributed to the evaluation variable yok being large is
wok1・f1(x, y) + wok2・f2(x, y) + … + wokJ・fJ(x, y).

The PU 32 generates the activation map by calculating the above value for each coordinate (x, y) of the "w2×h2" region.
Next, the PU 32 generates a binarized map MACT by binarizing the activation map (S62). Here, the PU 32 first substitutes each of the "w2×h2" values of the activation map into the logistic sigmoid function, thereby converting them to values between "0" and "1" inclusive. The PU 32 then sets a component to "1" when the output value of the logistic sigmoid function is larger than a predetermined value, and to "-1" when it is less than or equal to the predetermined value. Here, the predetermined value is set to a value greater than "0" and smaller than "1". Note that the predetermined value may be "1/2" or more.
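The processes of S60 and S62 can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names, the nested-list representation (h2 rows by w2 columns), and the default predetermined value of 0.5 are choices made here, not details given in the text.

```python
import math

def activation_map(feature_maps, w_ok):
    """Weighted sum of the J feature maps at each coordinate (S60)."""
    h2, w2 = len(feature_maps[0]), len(feature_maps[0][0])
    return [[sum(w * f[y][x] for w, f in zip(w_ok, feature_maps))
             for x in range(w2)]
            for y in range(h2)]

def sigmoid(v):
    """Logistic sigmoid, written to avoid overflow for large |v|."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)

def binarize(act_map, predetermined=0.5):
    """Map each value to 1 if sigmoid(value) > predetermined, else -1 (S62)."""
    return [[1 if sigmoid(v) > predetermined else -1 for v in row]
            for row in act_map]
```

With the predetermined value at 1/2, a component is "1" exactly when its weighted sum is positive, because the sigmoid of 0 is 1/2.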

The binarized map MACT is a map over the "w2×h2" region, obtained by reducing the region indicated by the frame data FD, in which "1" is assigned to regions that contributed greatly to the OK determination and "-1" is assigned to regions that contributed little to the OK determination.

Next, the PU 32 inputs the frame data FD into the regression model (S64). The regression model is the same as that used in the process of S30, except that it has no output defining a two-dimensional distribution, such as a two-dimensional Gaussian distribution, for the face. Next, the PU 32 generates heat maps 60 to 64 (S66). Next, the PU 32 reduces each of the three heat maps 60 to 64 to "w2×h2" (S68). This process is similar to the process of S42. Next, the PU 32 generates one binarized map Mbody from the three reduced heat maps (S70). In the process of S70, the PU 32 first generates three provisional maps by comparing the value of each component of the three heat maps with a threshold value. That is, when the value of a component of a heat map is larger than the threshold value, the PU 32 sets the value of the corresponding component of the provisional map to "1". When the value of a component of a heat map is less than or equal to the threshold value, the PU 32 sets the value of the corresponding component of the provisional map to "-1". Then, for each component position, if all three provisional-map values are "-1", the PU 32 sets the corresponding value of the binarized map Mbody to "-1".
If at least one of the three provisional-map values is "1", the PU 32 sets the corresponding value of the binarized map Mbody to "1".
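The process of S70 amounts to thresholding each heat map and then taking a component-wise OR. A minimal sketch follows; the function name and the nested-list representation are assumptions made for illustration, and the heat maps are taken to be already reduced to "w2×h2".

```python
def body_map(heat_maps, threshold):
    """Combine reduced heat maps into one map Mbody with values 1 or -1."""
    h2, w2 = len(heat_maps[0]), len(heat_maps[0][0])
    # Provisional maps: 1 where the heat-map value exceeds the threshold, else -1.
    provisional = [[[1 if v > threshold else -1 for v in row] for row in hm]
                   for hm in heat_maps]
    # Mbody is 1 where at least one provisional map is 1, and -1 otherwise.
    return [[1 if any(p[y][x] == 1 for p in provisional) else -1
             for x in range(w2)]
            for y in range(h2)]
```

A value exactly equal to the threshold yields "-1", matching the "less than or equal to" condition in the text.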

The binarized map Mbody is a map over the "w2×h2" region, obtained by reducing the region indicated by the frame data FD, in which "1" is assigned to regions where the existence probability of any of the chest, front lower abdomen, and buttocks is large, and "-1" is assigned to regions where the existence probabilities of the chest, front lower abdomen, and buttocks are all small.

Then, the PU 32 generates the gaze map MATT by calculating the Hadamard product of the binarized map MACT and the binarized map Mbody (S72). The gaze map MATT assigns "1" to regions that contributed to the evaluation variable yok being large and in which the existence probability of the chest, front lower abdomen, or buttocks is large. It also assigns "1" to regions that contributed little to the evaluation variable yok being large and in which the existence probability of the chest, front lower abdomen, and buttocks is small. The gaze map MATT assigns "-1" to regions that contributed to the evaluation variable yok being large but in which the existence probability of the chest, front lower abdomen, and buttocks is small. It also assigns "-1" to regions that contributed little to the evaluation variable yok being large but in which the existence probability of the chest, front lower abdomen, or buttocks is large.

Next, the PU 32 determines whether an index value indicating the average of the components of the gaze map MATT (denoted GAP(MATT) in the figure) is greater than or equal to the determination value gth (S74). When the index value is a simple average, it is a value between "-1" and "1" inclusive. A larger index value means that the OK determination was made by focusing on regions where the existence probability of the chest, front lower abdomen, and buttocks is large. Note that the index value need not be a simple average. For example, filter processing may be applied first, such as rewriting to "-1" the value of an isolated "1" component that is surrounded by "-1" components, and then taking the average.
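The processes of S72 and S74 can be sketched as follows, assuming the simple-average variant of the index value. The function names are illustrative, and the optional filter processing mentioned above is omitted.

```python
def gaze_map(m_act, m_body):
    """Hadamard (component-wise) product of two maps with values 1 or -1 (S72)."""
    return [[a * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(m_act, m_body)]

def gap(m):
    """Simple average of all components; lies between -1 and 1 inclusive."""
    values = [v for row in m for v in row]
    return sum(values) / len(values)

def ok_is_plausible(m_act, m_body, g_th):
    """True when the index value GAP(MATT) is at least g_th (S74: YES)."""
    return gap(gaze_map(m_act, m_body)) >= g_th
```

Because both inputs contain only 1 and -1, the product is 1 wherever the two maps agree and -1 wherever they disagree, so the index value measures how closely the model's focus matches the body-part regions.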

When the PU 32 determines that the index value is greater than or equal to the threshold value gth (S74: YES), it makes an OK determination (S38).
On the other hand, when the PU 32 determines that the index value is less than the threshold value gth (S74: NO), it makes an NG determination (S76). Then, the PU 32 stores the frame data FD at that time in the storage device 34 (S78). Then, by operating the user interface 40 shown in FIG. 1, the PU 32 notifies a person that the large value of the evaluation variable yok has low validity (S80). Here, for example, the user interface 40 may include a display device on which visual information indicating the low validity is displayed.

Note that when the process of S38 or S80 has been performed, the PU 32 temporarily completes the process of S50 shown in FIG. 8.
In this manner, in this embodiment, the PU 32 calculates the values of the evaluation variables yok and yng from the frame data FD alone. However, when the value of the evaluation variable yok is large, the PU 32 does not immediately make an OK determination, but analyzes which part of the image was focused on when the evaluation variable yok was calculated to be large. The PU 32 then makes an OK determination when the evaluation variable yok was calculated to be large by focusing on the chest, front lower abdomen, and buttocks in the image. This suppresses erroneous OK determinations.

According to the embodiment described above, the following operations and effects are further obtained.
(4-1) When the PU 32 determines that the large value of the evaluation variable yok has low validity, it saves the frame data FD at that time and notifies a person. This allows a person to make the final decision. Therefore, if none of the chest, front lower abdomen, and buttocks is exposed, the NG determination can be canceled and the CM video data can be distributed. In that case, the saved frame data FD can also be used to retrain the identification model 70c.

<Fifth embodiment>
The fifth embodiment will be described below with reference to the drawings, focusing on the differences from the first embodiment.

In this embodiment, an identification model 70d is used instead of the identification model 70a. The identification model 70d identifies the following three states.
State 1: None of the chest, front lower abdomen, and buttocks is exposed.

State 2: Exposure is limited to at least one of a part of the portion "3" of the chest shown in FIG. 2, a part of the portion "10" of the front lower abdomen shown in FIG. 2, and a part of the portion "14" of the buttocks shown in FIG. 2. State 2 is, for example, a state in which the chest, front lower abdomen, and buttocks are covered with clothing, but the clothing is revealing to some degree. State 2 is also, for example, a state in which the chest, front lower abdomen, and buttocks are covered with clothing, but the clothing is see-through to some degree.

State 3: At least one of the chest, front lower abdomen, and buttocks is exposed more conspicuously than in state 2.
FIG. 10 shows details of the frame evaluation process. The process shown in FIG. 10 is realized by the PU 32 repeatedly executing the image evaluation program 34a stored in the storage device 34 at, for example, a predetermined period. Note that in FIG. 10, processes corresponding to those shown in FIG. 4 are given the same step numbers for convenience.

In the series of processes shown in FIG. 10, the PU 32 first inputs the frame data FD into the semantic segmentation model (S30a). The semantic segmentation model is a model that outputs the value of a label variable for each of the "w×h" pixels indicated by the frame data FD. Here, the label variable takes values that distinguish at least the portions "3 to 5, 10 to 14, and 18" in FIG. 2 from one another and from everything else. The label variable can therefore take ten or more different values.

The semantic segmentation model is a discriminative model. It is a trained model obtained by supervised learning, trained on various image data using, as training data, data in which label variable values have been assigned in advance to the pixels of each of the above portions. The position information mapping data 34b according to this embodiment includes parameters that define the semantic segmentation model. That is, for example, when the semantic segmentation model includes a CNN, the position information mapping data 34b includes the values of the filters, the weight parameters of the fully connected layer, and the like.

Next, the PU 32 generates physical information maps using the value of the label variable of each pixel output by the semantic segmentation model (S32a). The physical information maps consist of the following six maps.

(bm1) A map that distinguishes the region where the portion "3" of the chest shown in FIG. 2 is located from everything else. This can be realized, for example, by setting the region where the portion "3" of the chest shown in FIG. 2 is located to "1" and setting everything else to "0".

(bm2) A map that distinguishes the region where the portion "10" of the front lower abdomen shown in FIG. 2 is located from everything else.

(bm3) A map that distinguishes the region where the portion "14" of the buttocks shown in FIG. 2 is located from everything else.

(bm4) A map that distinguishes the region where the portions of the chest other than the portion "3" shown in FIG. 2 are located from everything else.

(bm5) A map that distinguishes the region where the portions of the front lower abdomen other than the portion "10" shown in FIG. 2 are located from everything else.

(bm6) A map that distinguishes the region where the portions of the buttocks other than the portion "14" shown in FIG. 2 are located from everything else.

The maps (bm1) to (bm3) are map data indicating the positions of the portions of particular interest for state 2. The maps (bm4) to (bm6) are map data indicating the positions of the portions of particular interest for state 3.
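The construction of the six maps from the per-pixel label values (S32a) can be sketched as below. This is only an illustration: the function name is hypothetical, and so is the grouping of labels. The labels 3, 10, and 14 come from the text, but which of the remaining labels (4, 5, 11 to 13, 18) belong to the chest, front lower abdomen, or buttocks depends on FIG. 2, which is not reproduced here, so the sets used for (bm4) to (bm6) are placeholders.

```python
def body_info_maps(label_map, groups):
    """Return one 0/1 map per label group: 1 where the pixel's label is in the group."""
    return [[[1 if label in group else 0 for label in row]
             for row in label_map]
            for group in groups]

# Label sets for (bm1) to (bm6). The last three sets are ASSUMED placeholders.
EXAMPLE_GROUPS = [
    {3},       # bm1: portion "3" of the chest
    {10},      # bm2: portion "10" of the front lower abdomen
    {14},      # bm3: portion "14" of the buttocks
    {4, 5},    # bm4: rest of the chest (assumed labels)
    {11, 12},  # bm5: rest of the front lower abdomen (assumed labels)
    {13, 18},  # bm6: rest of the buttocks (assumed labels)
]
```

Each resulting map has the same "w×h" size as the label map, which lets the six maps be stacked with the R, G, and B channels of the frame data FD as input to the identification model 70d.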

The position information mapping, which outputs the six physical information maps from the frame data FD through the processes of S30a and S32a, is a mapping defined according to the semantic segmentation model.

Next, the PU 32 calculates the values of the evaluation variables yok1, yok2, and yng by inputting the six physical information maps 84 and the frame data FD into the identification model 70d (S34c). The identification model 70d includes a CNN. The six physical information maps 84 and the frame data FD input to the identification model 70d are convolved with a plurality of filters fl each having dimensions "a2×b2×9". Here, "a2" is a natural number smaller than "w", and "b2" is a natural number smaller than "h". The number of convolution layers is arbitrary. The CNN may include a pooling layer and may include residual blocks. The feature maps output by the CNN are combined in a fully connected layer M40, which has three outputs. These outputs are input to a softmax function M42, which serves as the output activation function. The softmax function M42 outputs the values of the three evaluation variables yok1, yok2, and yng.

The identification model 70d is, for example, a trained model obtained by supervised learning. The training data for the identification model 70d is image data of each of state 1, state 2, and state 3 described above. For image data of state 1, the target value for the evaluation variable yok1 is set to "1" and the other target values are set to "0". For image data of state 2, the target value for the evaluation variable yok2 is set to "1" and the other target values are set to "0". For image data of state 3, the target value for the evaluation variable yng is set to "1" and the other target values are set to "0".

The parameters obtained by training the identification model 70d are included in the evaluation mapping data 34c.
The PU 32 determines whether the maximum value ymax among the values of the evaluation variables yok1, yok2, and yng is the value of the evaluation variable yok1 (S36a). When the PU 32 determines that the maximum value ymax is the value of the evaluation variable yok1 (S36a: YES), it determines that the frame is in state 1 (S38a). Hereinafter, this is referred to as an OK1 determination. On the other hand, when the PU 32 determines that the maximum value ymax is not the value of the evaluation variable yok1 (S36a: NO), it determines whether the maximum value ymax is the value of the evaluation variable yok2 (S36b). When the PU 32 determines that the maximum value ymax is the value of the evaluation variable yok2 (S36b: YES), it determines that the frame is in state 2 (S38b). Hereinafter, this is referred to as an OK2 determination. On the other hand, when the PU 32 determines that the maximum value ymax is not the value of the evaluation variable yok2 (S36b: NO), it makes an NG determination (S40).
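The branch structure of S36a to S40 can be sketched as follows. The function name and the string return values are illustrative choices. Checking yok1 before yok2 mirrors the order of the steps, which also settles how ties are resolved in this sketch; the text does not discuss ties.

```python
def classify_frame(y_ok1, y_ok2, y_ng):
    """Return 'OK1', 'OK2', or 'NG' according to which evaluation variable is largest."""
    y_max = max(y_ok1, y_ok2, y_ng)
    if y_max == y_ok1:   # S36a: YES -> state 1 (S38a)
        return "OK1"
    if y_max == y_ok2:   # S36b: YES -> state 2 (S38b)
        return "OK2"
    return "NG"          # S36b: NO -> NG determination (S40)
```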

FIG. 11 shows the procedure of a process that supports displaying, on the user terminal 50, images that satisfy the criteria desired by the user. The process shown in FIG. 11 consists of two processes. One is realized by the PU 52 repeatedly executing the application program 54a stored in the storage device 54 of the user terminal 50 at a predetermined period. The other is realized by the PU 32 repeatedly executing the image evaluation program 34a stored in the storage device 34 of the image evaluation device 30 at a predetermined period.

As shown in FIG. 11, the PU 52 of the user terminal 50 accepts a request from the user regarding the criteria for CM videos that are permitted to be played back (S90). Here, the PU 52 displays, on the display device included in the user interface 58, that the user can select whether to allow only state 1 or to allow both state 1 and state 2. The process of S90 is a process of accepting the request that the user inputs to the user interface 58. Upon completing the process of S90, the PU 52 operates the communication device 56 to transmit the user ID, which is the identification symbol of the user, together with the requested criteria (S92).

In response, the PU 32 of the image evaluation device 30 receives the user ID and the requested criteria (S100). Then, the PU 32 stores the user ID and the requested criteria in the storage device 34 (S102). Then, the PU 32 evaluates the CM video data (S104). The process of S104 conforms to the processes of S10 to S18 and S22 to S26. Here, however, the process of FIG. 10 is executed in place of the process of S16. In addition, the processes of S22 to S26 are replaced with a process that gives CM video data containing no NG determination an OK2 determination if it contains even one OK2 determination.
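The video-level determination described for S104 can be sketched as follows. The text states only that a video with no NG determination and at least one OK2 determination is treated as OK2. Treating any NG frame as making the whole video NG, and treating a video with only OK1 frames as OK1, are assumptions made here for illustration, as is the function name.

```python
def classify_video(frame_verdicts):
    """Aggregate per-frame verdicts ('OK1', 'OK2', 'NG') into one video verdict."""
    if "NG" in frame_verdicts:    # assumption: a single NG frame makes the video NG
        return "NG"
    if "OK2" in frame_verdicts:   # stated in the text: any OK2 without NG gives OK2
        return "OK2"
    return "OK1"                  # assumption: all frames were OK1
```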

Then, by operating the communication device 36, the PU 32 transmits the CM video data to each of the user terminals 50(1), 50(2), ... in accordance with the evaluation result, in such a way that images satisfying the requested criteria can be displayed (S106). That is, the PU 32 unconditionally transmits CM video data with an OK1 determination to all user terminals 50. The PU 32 also unconditionally transmits CM video data with an OK2 determination to the user terminals 50 of users who allow state 2. In contrast, transmission of CM video data with an OK2 determination to the user terminal 50 of a user who does not allow state 2 may be prohibited. Alternatively, the PU 32 may execute a process similar to the process of S20a. For CM video data with an NG determination, the PU 32 may prohibit transmission to the user terminals of all users. Alternatively, the PU 32 may execute the process of S20a. Note that upon completing the process of S106, the PU 32 temporarily ends its part of the series of processes shown in FIG. 11.
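The delivery rule of S106 can be summarized in a short sketch. The function name and the return values are hypothetical. "restrict" stands for whichever of the alternatives in the text is adopted, that is, prohibiting transmission or applying a process like S20a.

```python
def delivery_action(video_verdict, allows_state2):
    """Decide how CM video data is handled for one user.

    video_verdict: 'OK1', 'OK2', or 'NG'.
    allows_state2: True if the user's requested criteria allow state 2.
    """
    if video_verdict == "OK1":
        return "send"        # sent unconditionally to every user terminal
    if video_verdict == "OK2" and allows_state2:
        return "send"        # sent unconditionally to users who allow state 2
    return "restrict"        # OK2 for other users, or NG for all users
```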

Meanwhile, the PU 52 of the user terminal 50 receives the CM video data transmitted from the image evaluation device 30 (S94). Here, the PU 52 unconditionally plays back CM video data that meets the criteria. When it receives CM video data that does not meet the criteria, the PU 52 issues a warning, applies a mask, or prohibits playback, in accordance with the restriction command attached to that CM video data.

Note that upon completing the process of S94, the PU 52 temporarily ends its part of the series of processes shown in FIG. 11.
<Correspondence>
The correspondence between the matters in the above embodiments and the matters described in the "Means for solving the problem" section is as follows. The correspondence is shown below for each number of the solutions listed in the "Means for solving the problem" section.

[1, 2, 11] The execution device corresponds to the PU 32. The storage device corresponds to the storage device 34. The position information mapping data corresponds to the position information mapping data 34b. The evaluation mapping data corresponds to the evaluation mapping data 34c.

In FIG. 4, the position information mapping corresponds to the mapping realized by the processes of S30 and S32. In other words, it corresponds to a mapping that outputs heat maps when the frame data FD is input. In FIG. 7, the position information mapping corresponds to the mapping realized by the processes of S30, S32, S42, and S44. In other words, it corresponds to a mapping that outputs the coefficient matrix MK when the frame data FD is input. In FIG. 9, the position information mapping corresponds to the mapping realized by the processes of S64 to S70. In other words, it corresponds to a mapping that outputs the binarized map Mbody when the frame data FD is input. In FIG. 10, the position information mapping corresponds to the mapping realized by the processes of S30a and S32a. In other words, it corresponds to a mapping that outputs the physical information maps 84 when the frame data FD is input.

The position information calculation process corresponds to the processes of S30 and S32 in FIG. 4, the processes of S30, S32, S42, and S44 in FIG. 7, the processes of S64 to S70 in FIG. 9, and the processes of S30a and S32a in FIG. 10.

In FIG. 4, the evaluation mapping corresponds to the mapping realized by the processes of S34 to S40. In other words, it corresponds to a mapping that receives the frame data FD and the heat maps 60 to 66 as input and outputs an OK determination or an NG determination. In FIG. 7, it corresponds to the mapping realized by the processes of S34a and S36 to S40. In other words, it corresponds to a mapping that receives the frame data FD and the coefficient matrix MK as input and outputs an OK determination or an NG determination. In FIG. 8, it corresponds to the mapping realized by the processes of S34b, S36, S40, S60, S62, S72 to S76, and S38. In other words, it corresponds to a mapping that receives the frame data FD and the binarized map Mbody as input and outputs an OK determination or an NG determination. In FIG. 10, it corresponds to the mapping realized by the processes of S34c, S36a, S36b, S38a, S38b, and S40. In other words, it corresponds to a mapping that receives the frame data FD and the physical information maps 84 as input and outputs an OK1 determination, an OK2 determination, or an NG determination.

In the processes of FIGS. 3 and 4, the evaluation process corresponds to the processes of S34 to S40. In FIG. 7, it corresponds to the processes of S34a and S36 to S40. In FIG. 8, it corresponds to the processes of S34b, S36, S40, S72 to S76, and S38. In FIG. 10, it corresponds to the processes of S34c, S36a, S36b, S38a, S38b, and S40.

The image data corresponds to the frame data FD. The position input data corresponds to the frame data FD. The evaluation input data corresponds to the frame data FD.
[3] The position information data corresponds to the heat maps 60 to 66 and the physical information maps 84. The evaluation mapping corresponds to the mapping realized by the processes of S34 to S40 and the mapping realized by the processes of S34c, S36a, S36b, S38a, S38b, and S40. [4] The provisional evaluation mapping corresponds to the mapping realized by the processes of S34b and S36. In other words, it corresponds to a mapping that receives one piece of frame data FD as input and outputs the result of comparing the values of the evaluation variables yok and yng. The provisional evaluation process corresponds to the processes of S34b and S36. The validity evaluation process corresponds to the processes of S60, S62, and S72 to S74. [5] The notification process corresponds to the process of S80. [6] The feature layer corresponds to the output layer of the feature extractor M20.

[7] The user terminal corresponds to the user terminals 50(1), 50(2), .... The provision process corresponds to the process of S26 in FIG. 3, the process of S26 when a negative determination is made in S24 in FIG. 6, and the process of S106 when it is determined in S104 of FIG. 11 that the requested criteria are met. The restriction process corresponds to the process of S20 in FIG. 3, the process of S20a in FIG. 6, and the process of S106 when it is determined in S104 of FIG. 11 that the requested criteria are not met. [8, 10] The instruction process corresponds to the processes of S90 and S92. [9] The prohibition process corresponds to the process of S20 and to the process of S106 when it is determined in S104 of FIG. 11 that the requested criteria are not met. The restriction process corresponds to the process of S20a and to the process of S106 when it is determined in S104 of FIG. 11 that the requested criteria are not met. [12] The image evaluation program corresponds to the image evaluation program 34a.

[13-15] The providing step corresponds to the step of executing the process of S26 in FIG. 3 and to the step of executing the process of S26 when a negative determination is made in S24 of FIG. 6. The providing step also corresponds to the step of executing the process of S106 when it is determined in S104 of FIG. 11 that the required criterion is satisfied. The restriction step corresponds to the step of executing the process of S20 in FIG. 3, the step of executing the process of S20a in FIG. 6, and the step of executing the process of S106 when it is determined in S104 of FIG. 11 that the required criterion is not satisfied. The acquisition step corresponds to the process of S10.

<Other embodiments>
This embodiment can be modified as follows. This embodiment and the following modifications can be combined with one another to the extent that they are technically consistent.

"About the predetermined part"
The one or more body parts (predetermined parts) that serve as the basis for evaluating an image as problematic to reproduce are not limited to the chest, front lower abdomen, and buttocks. They may be defined as appropriate, for example so as to include at least one of the body parts shown in FIG. 2. In one embodiment, the basis may also be defined as a portion of a single body part shown in FIG. 2. Specifically, a portion of a body part, such as the lips, eyes, or hair that make up the face, may serve as the basis.

"About position information mapping"
- The position information mapping used in the process of FIG. 4 is not limited to a mapping that takes the frame data FD as input and outputs heat maps of the face, chest, lower abdomen, and buttocks. For example, it may be a mapping that takes the frame data FD as input and outputs heat maps of the chest, lower abdomen, and buttocks only.

- In the process of FIG. 4, it is not essential to use a mapping that outputs heat maps. For example, a mapping may be used that employs a semantic segmentation model to output maps identifying the pixels that correspond to each of the chest, lower abdomen, and buttocks.

- The position information mapping used in the process of FIG. 7 is not limited to a mapping that takes the frame data FD as input and outputs coefficient matrices MK corresponding to the heat maps of the chest, lower abdomen, buttocks, and face. For example, it may be a mapping that takes the frame data FD as input and outputs coefficient matrices MK corresponding to the heat maps of the chest, lower abdomen, and buttocks only.

- In the process of FIG. 7, it is not essential to use a mapping that outputs heat maps. For example, a mapping may be used that outputs coefficient matrices based on the pixels that a semantic segmentation model identifies as the chest, lower abdomen, buttocks, and face. In that case, within the pixel region reduced to match the output layer of the feature extractor M20, the coefficient K of each region showing the chest, lower abdomen, buttocks, or face may be set to a value larger than "1".
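One possible way to build such a coefficient matrix from a segmentation result is sketched below in Python with NumPy. The function names, the boolean `part_mask` input, the block-wise reduction rule, and the value K = 2.0 are all assumptions made for illustration; the text only requires that K exceed 1 in the reduced regions that show the predetermined parts.

```python
import numpy as np

def coefficient_matrix(part_mask, feat_hw, k=2.0):
    """Reduce a boolean part mask to the feature-map size; part cells get K > 1.

    part_mask: (H, W) boolean array, True where the segmentation model labeled
               the pixel as the part (hypothetical input format).
    feat_hw:   (FH, FW) size of the feature extractor's output layer.
    """
    h, w = part_mask.shape
    fh, fw = feat_hw
    # Assumption: H and W are integer multiples of FH and FW.
    blocks = part_mask.reshape(fh, h // fh, fw, w // fw)
    # A reduced cell counts as the part if any pixel in its block is the part.
    reduced = blocks.any(axis=(1, 3))
    return np.where(reduced, k, 1.0)

def apply_coefficients(feature_map, mk):
    """Hadamard (element-wise) product of a feature map and a coefficient matrix."""
    return feature_map * mk
```

Cells outside the part keep the coefficient 1, so the corresponding feature values pass through unchanged while the part regions are emphasized.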

- In the process of FIG. 9, it is not essential to use a mapping that outputs the binarized map Mbody based on heat maps. For example, a mapping may be used that outputs a binarized map Mbody based on the regions that a semantic segmentation model identifies as the chest, lower abdomen, and buttocks.

- In the process of FIG. 10, a mapping is used that outputs physical information maps identifying part "3", parts "4, 5", part "10", parts "11, 18", part "14", and parts "12, 13, 18" shown in FIG. 2, but the mapping is not limited to this. For example, a mapping may be used that outputs physical information maps identifying part "3", parts "3, 4, 5", part "10", parts "10, 11, 18", part "14", and parts "12, 13, 14, 18".

- The mapping that outputs heat maps is not limited to one that outputs the existence probability of each of the chest, lower abdomen, and buttocks. For example, it may be a mapping that outputs the probability of each of part "3", parts "4, 5", part "10", parts "11, 18", part "14", and parts "12, 13, 18".

- The mapping that outputs heat maps is not limited to one that outputs a separate map for each of the chest, lower abdomen, and buttocks. For example, it may be a mapping that outputs, for each pixel, the maximum of the existence probabilities of the chest, lower abdomen, and buttocks.
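As a rough illustration of this variant, per-part heat maps can be collapsed into a single map by taking the maximum over the part axis. The sketch assumes NumPy and a hypothetical array layout of (parts, height, width); the function name is illustrative.

```python
import numpy as np

def merge_heat_maps(heat_maps):
    """Collapse per-part heat maps of shape (parts, H, W) into one (H, W) map.

    Each output pixel holds the largest existence probability among the parts.
    """
    return np.asarray(heat_maps).max(axis=0)
```

The merged map no longer records which part produced the maximum, so it suits cases where only "some predetermined part is here" matters.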

- It is not essential that the mapping that outputs heat maps output the existence probability of each of the chest, lower abdomen, and buttocks. For example, when the image is guaranteed to be a frontal view, the mapping may output only the existence probabilities of the chest and the front lower abdomen. In other words, the heat maps output may cover the existence probability of whichever predetermined parts are chosen as those whose exposure is a problem.

- The probability distribution represented by a heat map is not limited to an isotropic distribution. An anisotropic distribution can be realized, for example, by replacing the regression model used in the process of S30 in FIG. 4 with a regression model that outputs two or three variance parameters for each Gaussian distribution.
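The difference between the isotropic and anisotropic cases can be illustrated as follows. This is a minimal NumPy sketch that assumes the regression model outputs, per part, a center plus either two parameters (variances along x and y) or three (the two variances and a covariance). The function name, this parameterization, and the unnormalized peak value of 1 are assumptions, not details taken from the embodiment.

```python
import numpy as np

def gaussian_heat_map(shape, center, var_x, var_y, cov_xy=0.0):
    """Render an unnormalized 2D Gaussian heat map with peak value 1.

    shape:  (H, W) of the output map.
    center: (cx, cy) in pixel coordinates.
    var_x, var_y: variances along x and y (the two-parameter case).
    cov_xy: covariance term (the optional third parameter).
    """
    h, w = shape
    cx, cy = center
    ys, xs = np.mgrid[0:h, 0:w]
    dx = xs - cx
    dy = ys - cy
    det = var_x * var_y - cov_xy ** 2
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Quadratic form d^T Sigma^{-1} d using the closed-form 2x2 inverse.
    q = (var_y * dx ** 2 - 2.0 * cov_xy * dx * dy + var_x * dy ** 2) / det
    return np.exp(-0.5 * q)
```

Setting `var_x == var_y` and `cov_xy == 0` recovers the isotropic case, so the same rendering code can serve both variants.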

- The semantic segmentation model is not limited to a model that identifies three or more parts as described above. For example, when the image is guaranteed to be a frontal view, the model may output a label variable that distinguishes only the chest, the front lower abdomen, and everything else.

- A mapping built with a semantic segmentation model is not limited to one that outputs a different map for each part to be identified. For example, it may be a mapping that outputs a single map whose label variable takes different values for the chest, the front lower abdomen, the buttocks, and other regions.

- It is not essential that the position information mapping output a map in which a value is defined for each unit region obtained by subdividing the image region. For example, it may be a mapping that outputs the coordinates of a representative point of the chest, a representative point of the lower abdomen, and a representative point of the buttocks.
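For illustration only, representative-point coordinates of this kind could be derived from heat maps by taking the location of each map's peak. A mapping could equally output the coordinates directly. The sketch assumes NumPy, and the function name and dictionary layout are hypothetical.

```python
import numpy as np

def representative_points(heat_maps):
    """Return the peak location (x, y) of each named heat map.

    heat_maps: dict mapping a part name to an (H, W) array (assumed layout).
    """
    points = {}
    for name, hm in heat_maps.items():
        row, col = np.unravel_index(np.argmax(hm), hm.shape)
        points[name] = (int(col), int(row))  # x is the column, y is the row
    return points
```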

- The image data input to the position information mapping is not limited to the frame data FD described above. For example, monochrome image data may be used instead of the R, G, and B image data.

"About position input data"
- The position input data that is input to the position information mapping is not limited to the image data to be evaluated itself. For example, the position input data may be image data obtained by extracting the person-related portion from the pixels of the image data to be evaluated and embedding it in a predetermined background image. Here, the PU 32 extracts the person-related portion using a semantic segmentation model. The preprocessing that turns the image data to be evaluated into position input data is not limited to this. For example, it may be a process that adjusts brightness.

"About evaluation mapping"
(a) Mapping in which each pixel of the evaluation input data and the position information data corresponding to that pixel are input in association with each other
- S34 of FIG. 4 illustrates a mapping that receives the frame data FD and the four heat maps 60 to 66. FIG. 10 illustrates a mapping that receives the frame data FD and six physical information maps 84. However, the map data to be input, in which each pixel carries information on whether it belongs to a predetermined part of the body, is not limited to these. It may be any of the map data described in the section "About position information mapping".

- In S34 of FIG. 4, the frame data FD is input, but the input is not limited to this. For example, monochrome image data may be used instead of the R, G, and B image data.
- FIG. 4 illustrates the identification model 70a using a CNN, but the model is not limited to this. For example, it may be a model that uses an attention mechanism, such as a Transformer-based model (one equipped with a Transformer encoder) that uses multi-head attention. In that case, the frame data FD and the heat maps 60 to 66 are divided into patches, and the vectors obtained by applying an appropriate linear transformation to those patches are input to the Transformer encoder. Each patch is data on a partial pixel region of the frame data FD together with the corresponding region of the heat maps 60 to 66. This also allows each pixel of the evaluation input data to be input in association with the position information data corresponding to that pixel.
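The patch construction described for the Transformer-based variant can be sketched as follows. This is a minimal NumPy illustration, assuming the frame data and heat maps share the same height and width, that both are divisible by the patch size, and that the linear transformation is a single projection matrix `proj`. The function name, array layouts, and `proj` are hypothetical and not taken from the embodiment.

```python
import numpy as np

def to_patch_vectors(frame, heat_maps, patch, proj):
    """Split frame data plus heat maps into patches and project each patch.

    frame:     (H, W, 3) array, e.g. R, G, B channels.
    heat_maps: (H, W, M) array, one channel per heat map.
    patch:     side length of a square patch in pixels.
    proj:      (patch * patch * (3 + M), D) projection matrix (assumed learned).
    Returns an (N, D) array with one vector per patch.
    """
    # Stack channels so each pixel keeps its image values and its heat map values together.
    x = np.concatenate([frame, heat_maps], axis=-1)  # (H, W, C)
    h, w, c = x.shape
    # Assumption: H and W are multiples of the patch size.
    x = x.reshape(h // patch, patch, w // patch, patch, c)
    x = x.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * c)
    return x @ proj
```

Because the channels are concatenated before patching, each patch vector carries both the pixel values and the position information for the same region, which is the association the text describes.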

- The output activation function of the identification model 70a need not be a softmax function. For example, it may be a logistic sigmoid function.
(b) About the provisional evaluation mapping
- S34b of FIG. 8 illustrates a mapping that takes the frame data FD as input, but the mapping is not limited to this. For example, monochrome image data may be used instead of the R, G, and B image data.

- FIG. 8 illustrates the identification model 70c using a CNN, but the model is not limited to this. For example, it may be a model that uses an attention mechanism, such as a Transformer-based model (one equipped with a Transformer encoder) that uses multi-head attention. In other words, the data that identifies the portion that contributed to the evaluation result is not limited to a CNN feature map. When a Transformer-based model is used as the identification model 70c, the validity evaluation described above may be performed using a gaze map MATT based on an attention map or an attention mask.

- The output activation function of the identification model 70c need not be a softmax function. For example, it may be a logistic sigmoid function.
- It is not essential that the identification model defining the provisional evaluation mapping make no use of the output of the position information mapping. For example, the identification model defining the provisional evaluation mapping may be the identification model 70a or any of the above modifications of the identification model 70a. It may likewise be the identification model 70b or any of the above modifications of the identification model 70b, or the identification model 70d or any of the above modifications of the identification model 70d.

(c) About the validity evaluation process
- In the process of FIG. 8, the conditions for executing the process of S50 may include a condition that a predetermined part of a person is included in the image. This allows an appropriate evaluation when the image indicated by the frame data FD does not include a predetermined part of a person.

- In the process of FIG. 8, the process of S50 is executed only when an affirmative determination is made in the process of S36, but this is not a limitation. For example, a process equivalent to that of S50 may also be executed when a negative determination is made in the process of S36. Such a process can be realized, for example, as follows.

In place of the process of S70, the PU 32 generates a binarized map Mbody(1) marking pixels whose probability of being the chest is greater than or equal to a threshold, a binarized map Mbody(2) marking pixels whose probability of being the front lower abdomen is greater than or equal to the threshold, and a binarized map Mbody(3) marking pixels whose probability of being the buttocks is greater than or equal to the threshold. In the binarized map Mbody(1), the label variable is set to "0" for every pixel for which the logical OR of "the probability of being the front lower abdomen is greater than or equal to the threshold" and "the probability of being the buttocks is greater than or equal to the threshold" is true. In the binarized map Mbody(2), the label variable is set to "0" for every pixel for which the logical OR of "the probability of being the chest is greater than or equal to the threshold" and "the probability of being the buttocks is greater than or equal to the threshold" is true. In the binarized map Mbody(3), the label variable is set to "0" for every pixel for which the logical OR of "the probability of being the chest is greater than or equal to the threshold" and "the probability of being the front lower abdomen is greater than or equal to the threshold" is true. Then, in place of the process of S72, the PU 32 calculates the Hadamard product of each of the binarized maps Mbody(1) to Mbody(3) with the binarized map MACT to obtain gaze maps MATT(1) to MATT(3). The PU 32 evaluates the process of S36 as valid if, for any of the gaze maps MATT(1) to MATT(3), the average of its components is greater than or equal to the threshold gth.

This process reflects the fact that an NG determination is made on the basis of exposure of at least one of the chest, lower abdomen, and buttocks. That is, the PU 32 evaluates the NG determination as valid if the region it attended to when making the NG determination is at least one of the chest, lower abdomen, and buttocks.
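The per-part validity check described above can be sketched as follows. This is a minimal NumPy illustration that assumes the three probability maps have already been reduced to the resolution of the binarized activation map MACT. The function name and the threshold values `p_th` and `g_th` are hypothetical placeholders, not values from the embodiment.

```python
import numpy as np

def ng_is_valid(p_chest, p_abdomen, p_buttocks, m_act, p_th=0.5, g_th=0.1):
    """Evaluate whether an NG determination attended to a predetermined part.

    p_*:   (H, W) probability maps for chest, front lower abdomen, buttocks.
    m_act: (H, W) binarized activation map MACT with values 0 or 1.
    """
    over = [p >= p_th for p in (p_chest, p_abdomen, p_buttocks)]
    for i in range(3):
        # Pixels that exceed the threshold for either of the other two parts get label 0.
        others = np.logical_or(over[(i + 1) % 3], over[(i + 2) % 3])
        m_body = np.logical_and(over[i], ~others).astype(float)  # Mbody(i+1)
        m_att = m_body * m_act                                    # gaze map MATT(i+1)
        if m_att.mean() >= g_th:
            return True
    return False
```

The check succeeds as soon as one gaze map reaches the threshold, matching the idea that attention to any one of the three parts is enough to support the NG determination.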

- It is not essential that the validity evaluation process include a process of generating the binarized map Mbody. For example, the following may be done. The PU 32 identifies, within the activation map generated by the process of S60, the chest, lower abdomen, and buttocks indicated by the heat maps reduced in the process of S68. The PU 32 then compares the sum of the values in each of those regions with the sum of the values in a region of a predetermined area outside them. In that case, it is desirable that the value of each pixel of the activation map be converted in advance using ReLU, a logistic sigmoid function, or the like.

(d) Process of combining the position information data with the values of at least some of the plurality of regions indicated by the feature layer
- In the integration process M22 of S34a in FIG. 7, a single coefficient matrix MK is used in the Hadamard product with each feature map, but the number is not limited to this. Two or more may be used.

- In the integration process M22 of S34a in FIG. 7, some feature maps are excluded from the Hadamard product with the coefficient matrix MK, but this is not essential.
- As suggested in the section "About position information mapping", the number of coefficient matrices MK is not limited to four.

- The process of combining the position information data with the values of at least some of the plurality of regions indicated by the feature layer is not limited to the integration process M22 of the coefficient matrix MK and the feature map. For example, it may be a convolution of the coefficient matrix MK with the feature map.

(e) About the value of the evaluation variable according to the degree of exposure
- FIG. 10 illustrates an evaluation variable that can take three different values according to the degree of exposure, but the variable is not limited to this. For example, the evaluation variable may be one that can take four or more values.

(f) Others
- A model that uses an attention mechanism may be used in place of not only the identification models 70a and 70c but also, for example, the identification models 70b and 70d. Specifically, a model that uses multi-head attention, such as a Transformer-based model (one equipped with a Transformer encoder), may be used.

- The position information data input to the evaluation mapping is not limited to data that contains information on the predetermined parts for each region of the two-dimensional image. For example, it may be the coordinates of a representative point of each predetermined part, as described in the section "About position information mapping".

- For example, the frame data FD and the heat maps 60 to 66 may be input to a fully connected feedforward NN. In that case, the first layer may perform a weighted averaging, using the weight coefficients of the NN, of the values of the frame data FD and of the map data in which each pixel carries information on whether it belongs to a predetermined part of the body.

"About the output of the model that defines the evaluation mapping"
- For example, when there are multiple predetermined parts whose exposure is a problem, the model may output the presence or absence of exposure for each part. That is, when the chest, the front lower abdomen, and the buttocks are the three predetermined parts, the model may output, for each of them, a variable value indicating the result of determining whether that part is exposed. This can be realized, for example, by providing three logistic sigmoid functions, one for each predetermined part, as the output activation functions. In such a case, when the validity of an NG determination is evaluated in the validity evaluation process, it is desirable to make the following change in place of what is described above. For example, when an NG determination is made only for the chest, the determination is evaluated as valid if the model attends to the chest. Likewise, when NG determinations are made for both the chest and the front lower abdomen, the determination is evaluated as valid if the model attends to both the chest and the front lower abdomen.
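A per-part output of this kind, together with the matching validity rule, might look like the following sketch. It assumes NumPy, three raw output values (logits) in a fixed part order, and a decision threshold of 0.5. The names `PARTS`, `per_part_exposure`, and `ng_valid_per_part`, and the `attended` input, are hypothetical.

```python
import numpy as np

PARTS = ("chest", "front_lower_abdomen", "buttocks")

def sigmoid(z):
    """Logistic sigmoid applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def per_part_exposure(logits, threshold=0.5):
    """Apply one logistic sigmoid per part and threshold each independently."""
    probs = sigmoid(np.asarray(logits, dtype=float))
    return {part: bool(p >= threshold) for part, p in zip(PARTS, probs)}

def ng_valid_per_part(exposed, attended):
    """An NG result is valid only if every part judged exposed was attended to.

    exposed:  dict part -> bool, output of per_part_exposure.
    attended: dict part -> bool, whether the model attended to that part
              (assumed to come from a separate gaze-map check).
    """
    return all(attended[part] for part in PARTS if exposed[part])
```

Unlike a softmax, the three sigmoid outputs are independent, so any combination of parts can be flagged at once.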

"About evaluation input data"
- The evaluation input data that is input to the evaluation mapping is not limited to the image data to be evaluated itself. For example, the evaluation input data may be image data obtained by extracting the person-related portion from the pixels of the image data to be evaluated and embedding it in a predetermined background image. Here, the PU 32 extracts the person-related portion using a semantic segmentation model. The preprocessing that turns the image data to be evaluated into evaluation input data is not limited to this. For example, it may be a process that adjusts brightness.

"About position information mapping data"
- It is not essential that the position information mapping data consist only of the learned parameters of a parametric model. For example, it may consist of data defining a feature extractor together with support vectors. Here, the feature extractor is a trained model, such as a CNN, that takes the frame data FD or the like as input and outputs a feature vector. The support vectors are vectors selected by training a support vector regression. That is, during training, the support vectors are extracted from the feature vectors that the feature extractor outputs for the training data. The parameters defining the feature extractor and the support vectors are then stored in the storage device 34 as the position information mapping data.

"About evaluation mapping data"
- When a model that uses an attention mechanism is used in place of the identification models 70a to 70d, the evaluation mapping data 34c may include the following data. The evaluation mapping data 34c may include the component values of a transformation matrix that converts each patch obtained by dividing the frame data FD into a one-dimensional vector. It may also include, for example, the values of the matrices that convert each patch, once converted into a one-dimensional vector, into key, query, and value vectors. It may also include, for example, the position information data to be embedded in the patches converted into one-dimensional vectors. The transformation matrix and the position information data may be learned data. However, they need not be values learned in the learning process described above. For example, the learning process described above may be treated as fine-tuning, and the values may be ones learned only in training performed before it.

- It is not essential that the evaluation mapping data consist only of the learned parameters of a parametric model. For example, it may consist of data defining a feature extractor together with support vectors. Here, the feature extractor is a trained model, such as a CNN, that takes the frame data FD or the like as input and outputs a feature vector. The support vectors are vectors selected by training a support vector machine. That is, during training, the support vectors are extracted from the feature vectors that the feature extractor outputs for the training data. The parameters defining the feature extractor and the support vectors are then stored in the storage device 34 as the evaluation mapping data.

In addition, a function that outputs the posterior distribution of parameters, such as the parameters of the fully connected layer and the parameters of the convolution filters, may be stored in the storage device 34 as the evaluation mapping data. In this case, for example, the output activation function of the identification model 70c is replaced with a logistic sigmoid function, and a prior distribution such as a normal distribution is assumed for each of these parameters. A posterior distribution is then generated by Bayesian estimation in the learning process. The PU 32 may calculate the mean of each parameter from the posterior distribution by a sampling method or the like and use it in place of the parameters used in FIG. 8. In that case, the posterior distribution may be updated by relearning with data that the validity evaluation process has evaluated as not valid.

"About the evaluation process"
- In the evaluation process, the image data for 2N+1 frames is determined to be OK when the frame evaluation process determines OK for two or more of three items of frame data FD sampled at a period of N frames, but the process is not limited to this. For example, in the above modification of the process of FIG. 8, the criterion for the OK determination may be changed according to the result of the validity evaluation process. Specifically, when an NG determination is made in the process of S36 but the validity evaluation process determines that the NG determination is not valid, that NG determination may be counted as "0.5". In that case, the criterion for the OK determination can be made to depend on the result of the validity evaluation process, for example by making the final OK determination when the number of NG determinations is less than "1".
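The weighted counting rule can be written out as a short sketch. The function name and the tuple layout of `frame_results` are hypothetical; the weight 0.5 and the limit 1 follow the example in the text.

```python
def final_ok(frame_results, invalid_ng_weight=0.5, ng_limit=1.0):
    """Decide the final OK/NG result from per-frame results.

    frame_results: iterable of (is_ng, ng_is_valid) pairs, one per sampled frame.
    An NG judged valid counts as 1; an NG judged not valid counts as 0.5.
    Returns True (OK) when the weighted NG count stays below ng_limit.
    """
    ng_count = 0.0
    for is_ng, ng_is_valid in frame_results:
        if is_ng:
            ng_count += 1.0 if ng_is_valid else invalid_ng_weight
    return ng_count < ng_limit
```

With three sampled frames, a single NG that failed the validity check still yields a final OK, whereas a single valid NG, or two NGs that failed the check, does not.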

- It is not essential that the final OK determination be made when some of a plurality of frames, numbering at least a predetermined number, receive OK determinations. For example, the final OK determination may be made only when all of the frame data FD sampled at a predetermined period are judged OK by the frame evaluation process. The predetermined period here may be equal to the frame period.

"About the restriction process and the restriction step"
- The process of S20a may be, for example, a command to mask part of the reproduced image, that part including at least the area in which the predetermined part is exposed, by superimposing a predetermined image on it. Alternatively, it may be a command to mask such a part of the reproduced image by applying an effect such as blurring. That is, the form of the process of S20a is not limited as long as it is a command that hides a predetermined part whose exposure is regarded as a problem and which has been determined to be exposed. The part of the reproduced image that includes at least the area in which such a predetermined part is actually exposed may be determined based at least on the above-mentioned two-dimensional distribution, for example. When a plurality of predetermined parts exist in the image and only the actually exposed areas are to be masked, it is desirable to identify the actually exposed areas. This may be done, for example, by identifying the exposed areas based on the gaze area of the identification model.
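One way to picture the masking command is the Python sketch below, which derives a rectangular region from a two-dimensional distribution and overwrites it with a fill value, standing in for the superimposed image. The threshold, the bounding-box approach, the grayscale list-of-lists image format, and the function names are all assumptions for illustration; the description leaves the concrete masking method open.

```python
def exposed_bounding_box(distribution, threshold=0.5):
    """Return (top, left, bottom, right) covering cells >= threshold, or None."""
    hits = [
        (r, c)
        for r, row in enumerate(distribution)
        for c, value in enumerate(row)
        if value >= threshold
    ]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return min(rows), min(cols), max(rows), max(cols)


def mask_region(image, box, fill=0):
    """Return a copy of `image` with the boxed region replaced by `fill`."""
    masked = [row[:] for row in image]
    if box is None:
        return masked
    top, left, bottom, right = box
    for r in range(top, bottom + 1):
        for c in range(left, right + 1):
            masked[r][c] = fill
    return masked


# Hypothetical 3x4 distribution and a uniform grayscale frame.
distribution = [[0.0, 0.1, 0.0, 0.0], [0.0, 0.9, 0.7, 0.0], [0.0, 0.0, 0.2, 0.0]]
frame = [[9, 9, 9, 9] for _ in range(3)]
masked_frame = mask_region(frame, exposed_bounding_box(distribution))
```

A blur effect could replace the constant fill without changing how the region is selected.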

- For example, the PU 32 may prohibit distribution (transmission) of the CM video data when, among a predetermined number of consecutive frame data FD, the number of frame data FD that did not receive an OK determination is less than a predetermined number. In that case, the PU 32 may distribute (transmit) the CM video data on the condition that all of the predetermined number of consecutive frame data FD receive OK determinations.

As another example, the PU 32 may prohibit distribution (transmission) of the CM video data when, among a predetermined number of consecutive frame data FD, the number of frame data FD that did not receive an OK determination equals the predetermined number. In that case, the PU 32 may distribute (transmit) the CM video data on the condition that at least some of the predetermined number of consecutive frame data FD receive OK determinations.

As a further example, the PU 32 may distribute (transmit) the CM video data when, among a predetermined number of consecutive frame data FD, frame data FD that received OK determinations continue across fewer than a predetermined number of frame data FD that did not receive an OK determination.

Note that the PU 32 may prohibit distribution (transmission) of the CM video data when, among a predetermined number of consecutive frame data FD, frame data FD that did not receive an OK determination continue across fewer than a predetermined number of frame data FD that received OK determinations.
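Two of the distribution policies above can be sketched in Python as follows. This is one simplified reading, not the definitive rule: the strict policy requires every frame in the window to be OK, while the tolerant policy allows short gaps of non-OK frames. The names `longest_ng_run`, `may_distribute`, and `ng_run_limit` are hypothetical, and runs at the start or end of the window are treated the same as runs between OK frames, which is an assumption.

```python
from itertools import groupby


def longest_ng_run(ok_flags):
    """Length of the longest consecutive run of non-OK frames."""
    return max(
        (sum(1 for _ in group) for ok, group in groupby(ok_flags) if not ok),
        default=0,
    )


def may_distribute(ok_flags, ng_run_limit, strict=False):
    """Decide whether a window of frame determinations permits distribution.

    strict=True  : distribute only if every frame received an OK determination.
    strict=False : distribute if every run of non-OK frames is shorter than
                   `ng_run_limit`.
    """
    if strict:
        return all(ok_flags)
    return longest_ng_run(ok_flags) < ng_run_limit


# A window with a single non-OK frame between OK frames.
window = [True, True, False, True, True]
tolerant = may_distribute(window, ng_run_limit=2)
strict = may_distribute(window, ng_run_limit=2, strict=True)
```

Here the tolerant policy permits distribution and the strict policy prohibits it.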

- For the process of restricting display of an image on the user terminal 50 through the process of S20a, it is not essential that the evaluation be made by the frame evaluation process shown in FIG. 4 or the like. For example, the process of S20a may be applied when the evaluation is made by the process shown in FIG. 7 or by the frame evaluation process described in its modifications. Likewise, the process of S20a may be applied when the evaluation is made by the process shown in FIG. 8 or by the frame evaluation process described in its modifications.

- In the process of FIG. 11, the processes of S92, S100, and S102 may be omitted. In that case, when an OK2 determination is made, the PU 32 may attach to the CM video data, in the process of S106, a restriction command indicating that the determination is OK2. The PU 52 of the user terminal 50 may then decide, for example, whether to reproduce the CM video data according to the criterion received in the process of S90 and the restriction command.

- For example, CM video data from which the frames that received an NG determination have been deleted may be transmitted to the user terminal 50. This also suppresses the display of inappropriate images on the user terminal 50.
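Deleting NG frames before transmission amounts to a simple filter, sketched below in Python. The function name `drop_ng_frames` and the representation of frames and determinations as parallel lists are assumptions for illustration only.

```python
def drop_ng_frames(frames, is_ng):
    """Return only the frames that did not receive an NG determination."""
    if len(frames) != len(is_ng):
        raise ValueError("frames and is_ng must have the same length")
    return [frame for frame, ng in zip(frames, is_ng) if not ng]


# Hypothetical frame identifiers; the middle frame was judged NG.
kept = drop_ng_frames(["f0", "f1", "f2"], [False, True, False])
```

In a real video stream, removing frames would also need to account for timing and inter-frame encoding, which this sketch ignores.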

- It is not essential that the dedicated application program 54a be stored in the user terminal 50. For example, the CM video data distributed from the image evaluation device 30 may be reproduced using a general-purpose browser.

"About the execution device"
- The execution device is not limited to the PU 32. For example, the execution device may be a dedicated hardware circuit such as an ASIC or an FPGA. That is, the execution device may include a processing circuit having any of the following configurations (a) to (c). (a) A processing circuit that includes a processing device that executes all of the above processing according to a program, and a program storage device, such as a storage device, that stores the program. (b) A processing circuit that includes a processing device and a program storage device that execute part of the above processing according to a program, and a dedicated hardware circuit (hardware accelerator) that executes the remaining processing. (c) A processing circuit that includes a dedicated hardware circuit that executes all of the above processing. Here, there may be a plurality of software execution devices each including a processing device and a program storage device. There may also be a plurality of dedicated hardware circuits.

"About the image evaluation device"
- It is not essential for the image evaluation device to perform both the process of evaluating an image and the process of distributing the evaluation result to the user terminal 50. For example, a device other than the image evaluation device may execute the process of distributing the evaluation result to the user terminal 50. It is also not essential that the image evaluation device receive the image data transmitted from the vendor terminal 10. For example, a separate device may be provided that receives the image data transmitted from the vendor terminal 10 and forwards it to the image evaluation device. Furthermore, the image evaluation device and the vendor terminal 10 may be integrated.

"About the computer"
- The computer that executes the image evaluation program 34a is not limited to the PU 32 included in the image evaluation device 30. For example, by installing the image evaluation program 34a on the user terminal 50, the PU 52 of the user terminal 50 may serve as the computer that executes the image evaluation program 34a.

"Others"
- The PU 12 may include an arithmetic unit of any form that can be included in the PU 32.
- In the above embodiment, the CM video data is assumed to be distributed (transmitted) to the user terminal 50 as streaming video, but it is not limited to this. For example, it may be distributed (transmitted) to the user terminal 50 in real time through a live-streaming service. The image data to be evaluated is also not limited to CM video data. For example, the video is not limited to one used to advertise products or the like; as long as it is a video that can include at least part of a person's body, there are no restrictions on its type or on the purpose of the abnormality determination. Furthermore, the image data to be evaluated may be, for example, data of any still image.

- The PU 32 may notify the vendor terminal 10 and/or the user terminal 50, before distribution (transmission) of the CM video data, of information indicating the explicitly predetermined part that is treated as the basis for an OK or NG determination. The PU 32 may also notify the vendor terminal 10 and/or the user terminal 50 of the information indicating the predetermined part during distribution (transmission) of the CM video data. Further, the PU 32 may notify the vendor terminal 10 and/or the user terminal 50 of information indicating the predetermined part that was the basis for an NG determination on frame data FD of the CM video data being distributed (transmitted). The "notification" here may take the form of a message sent to the vendor terminal 10 and/or the user terminal 50, or the form of a display process that superimposes the information on the CM video data being displayed on the vendor terminal 10 and/or the user terminal 50.

10…Vendor terminal
14…Storage device
16…Communication device
20…Network
30…Image evaluation device
34…Storage device
34a…Image evaluation program
34b…Position information mapping data
34c…Evaluation mapping data
36…Communication device
40…User interface
50…User terminal
54…Storage device
54a…Application program
56…Communication device
58…User interface
60 to 66…Heat map
70a to 70d…Identification model

Claims

An image evaluation device comprising an execution device and a storage device, wherein
The storage device stores position information mapping data and evaluation mapping data,
The position information mapping data is data for defining a position information mapping,
The position information mapping is a mapping that outputs position information data,
The position information data is data indicating position information of a predetermined part of a person in an image shown by image data to be evaluated,
The evaluation mapping data is data for defining an evaluation mapping,
The evaluation mapping is a mapping that receives evaluation input data and the position information data as input and outputs an evaluation result of the image data,
The evaluation input data is data that corresponds to the image data and is data that is input to the evaluation mapping,
The execution device is configured to execute a position information generation process and an evaluation process,
The position information generation process is a process of generating the position information data by inputting position input data into the position information mapping,
The position input data is data that corresponds to the image data and is data that is input to the position information mapping,
The evaluation process is a process of evaluating the image data by inputting the position information data and the evaluation input data into the evaluation mapping,
The evaluation mapping includes a feature amount layer,
The feature amount layer is a layer that quantifies the feature amount of the image data by dividing the region indicated by the image data into a plurality of regions and giving a numerical value to each of the regions,
The evaluation mapping is a mapping that outputs the evaluation result through processing that includes combining at least some of the values of the plurality of regions indicated by the feature amount layer with the position information data.

An image evaluation device comprising an execution device and a storage device, wherein
The storage device stores position information mapping data and evaluation mapping data,
The position information mapping data is data for defining a position information mapping,
The position information mapping is a mapping that outputs position information data,
The position information data is data indicating position information of a predetermined part of a person in an image shown by image data to be evaluated,
The evaluation mapping data is data for defining an evaluation mapping,
The evaluation mapping is a mapping that receives evaluation input data and the position information data as input and outputs an evaluation result of the image data,
The evaluation input data is data that corresponds to the image data and is data that is input to the evaluation mapping,
The execution device is configured to execute a position information generation process and an evaluation process,
The position information generation process is a process of generating the position information data by inputting position input data into the position information mapping,
The position input data is data that corresponds to the image data and is data that is input to the position information mapping,
The evaluation process is a process of evaluating the image data by inputting the position information data and the evaluation input data into the evaluation mapping,
The evaluation mapping includes a provisional evaluation mapping,
The provisional evaluation mapping is a mapping that receives the evaluation input data as input and outputs a provisional evaluation result of the image data,
The evaluation process includes a provisional evaluation process and a validity evaluation process,
The provisional evaluation process includes processing to output the provisional evaluation result by inputting the evaluation input data to the provisional evaluation mapping,
The image evaluation device is characterized in that the validity evaluation process includes processing that takes the position information data as input and evaluates the validity of the provisional evaluation result according to the degree to which the area indicating the predetermined part, within the area shown by the image data, contributed to the provisional evaluation result.

An image evaluation device comprising an execution device and a storage device, wherein
The storage device stores position information mapping data and evaluation mapping data,
The position information mapping data is data for defining a position information mapping,
The position information mapping is a mapping that outputs position information data,
The position information data is data indicating position information of a predetermined part of a person in an image shown by image data to be evaluated,
The evaluation mapping data is data for defining an evaluation mapping,
The evaluation mapping is a mapping that receives evaluation input data and the position information data as input and outputs an evaluation result of the image data,
The evaluation input data is data that corresponds to the image data and is data that is input to the evaluation mapping,
The execution device is configured to execute a position information generation process and an evaluation process,
The position information generation process is a process of generating the position information data by inputting position input data into the position information mapping,
The position input data is data that corresponds to the image data and is data that is input to the position information mapping,
The evaluation process is a process of evaluating the image data by inputting the position information data and the evaluation input data into the evaluation mapping,
The execution device is configured to execute a provision process and a restriction process,
The provision process is a process of transmitting the image data to the user terminal so that the image represented by image data that the evaluation process has evaluated as not needing to be restricted from display can be displayed on the user terminal,
The restriction process is a process of restricting the user terminal from displaying an image indicated by image data that the evaluation process has evaluated as one whose display should be restricted,
The evaluation that there is no need to restrict is an evaluation that the degree of body exposure is within a permissible range based on an instruction from the user terminal,
The image evaluation device is characterized in that the evaluation that display by the user terminal should be restricted is an evaluation that the degree of exposure is outside the permissible range.

The execution device is configured to execute a notification process,
The image evaluation device according to claim 2, wherein the notification process is a process of notifying that the provisional evaluation result is not valid when the validity evaluation process determines that it is not valid.

The image evaluation device according to any one of claims 1 to 4, wherein the predetermined part includes at least one of a person's chest, buttocks, and front lower abdomen.

The position information data is data that provides information as to whether each pixel forming the evaluation input data indicates the predetermined part,
The evaluation mapping is a mapping that outputs the evaluation result upon receiving each pixel of the evaluation input data in association with the position information data corresponding to that pixel. The image evaluation device according to any one of the above.

An image processing system comprising the image evaluation device according to claim 1 or 2 and a plurality of user terminals.

An image processing system comprising an image evaluation device having an execution device and a storage device, and a plurality of user terminals, wherein
The storage device stores position information mapping data and evaluation mapping data,
The position information mapping data is data for defining a position information mapping,
The position information mapping is a mapping that outputs position information data,
The position information data is data indicating position information of a predetermined part of a person in an image shown by image data to be evaluated,
The evaluation mapping data is data for defining an evaluation mapping,
The evaluation mapping is a mapping that receives evaluation input data and the position information data as input and outputs an evaluation result of the image data,
The evaluation input data is data that corresponds to the image data and is data that is input to the evaluation mapping,
The execution device is configured to execute a position information generation process and an evaluation process,
The position information generation process is a process of generating the position information data by inputting position input data into the position information mapping,
The position input data is data that corresponds to the image data and is data that is input to the position information mapping,
The evaluation process is a process of evaluating the image data by inputting the position information data and the evaluation input data into the evaluation mapping,
The execution device is configured to execute a provision process and a restriction process,
The provision process is a process of transmitting the image data to the user terminal so that the image represented by image data that the evaluation process has evaluated as not needing to be restricted from display can be displayed on the user terminal,
The restriction process is a process of restricting the user terminal from displaying an image indicated by image data that the evaluation process has evaluated as one whose display should be restricted,
The user terminal is configured to perform instruction processing,
The instruction process is a process of instructing a permissible range for the degree of body exposure,
The evaluation that there is no need to limit is an evaluation that the degree of exposure is within the permissible range,
The image processing system is characterized in that the evaluation that display by the user terminal should be restricted is an evaluation that the degree of exposure is outside the permissible range.

The user terminal in the image processing system according to claim 8.

An image evaluation method comprising the step of executing each of the processes in the image evaluation device according to any one of claims 1 to 3.

An image evaluation program that causes a computer to execute each of the processes in the image evaluation device according to any one of claims 1 to 3.