JP7272705B1 - Information processing device, information processing method and information processing program

Info

Publication number: JP7272705B1
Application number: JP2021195921A
Authority: JP
Inventors: 剛平西野; ヤシンメクトビ; カンキンオウ
Original assignee: Ridge I Inc
Current assignee: Ridge I Inc
Priority date: 2021-12-02
Filing date: 2021-12-02
Publication date: 2023-05-12
Anticipated expiration: 2041-12-02
Also published as: JP2023082277A

Abstract

[Problem] To provide an information processing device, an information processing method, and an information processing program capable of estimating the number of persons based on an image. [Solution] The information processing device includes: a reception unit that receives first image information in which persons are recorded; an estimation unit that performs an estimation regarding the first image information based on a first trained model and the first image information; a first acquisition unit that, when the content estimated by the estimation unit corresponds to a first case, acquires the number of persons recorded in the first image information based on the first image information and a second trained model generated by learning heat maps indicating the presence or absence of persons and the number of persons in each heat map; and a second acquisition unit that, when the content estimated by the estimation unit corresponds to a second case, recognizes each person recorded in the first image information based on the first image information and a third trained model generated by learning images in which persons are recorded and the persons themselves, and acquires the number of persons based on the recognition result. [Selected drawing] FIG. 1

Description

The present disclosure relates to an information processing device, an information processing method, and an information processing program.

Devices that estimate congestion have been known. The image processing device described in Patent Literature 1 generates a heat map corresponding to the level of congestion based on an input image and uses the heat map to estimate the degree of congestion. Depending on the estimated degree of congestion, the image processing device then selects either an analysis unit for normal congestion levels (a normal-time image analysis unit) or an analysis unit for crowded conditions (a congestion-time image analysis unit), and performs image analysis using the selected unit.

Patent Literature 1: Domestic re-publication of PCT international publication No. 2018/061976

The image analysis device described in Patent Literature 1 above is used to detect loitering persons and abandoned objects.
However, there are cases where it is desirable to estimate the number of persons recorded in an image, and the technique described in Patent Literature 1 cannot estimate that number.

The present disclosure provides an information processing device, an information processing method, and an information processing program capable of estimating the number of persons based on an image.

An information processing device according to one aspect includes: a reception unit that receives first image information in which persons are recorded; an estimation unit that estimates the number of persons recorded in the first image information based on the first image information received by the reception unit and a first trained model generated by learning images in which a plurality of persons are recorded and the number of persons recorded in each image; a first acquisition unit that, when the number of persons estimated by the estimation unit is equal to or greater than a threshold, acquires the number of persons recorded in the first image information based on the first image information received by the reception unit and a second trained model generated by learning heat maps indicating the presence or absence of persons and the number of persons in each heat map; and a second acquisition unit that, when the number of persons estimated by the estimation unit is less than the threshold, recognizes each person recorded in the first image information based on the first image information received by the reception unit and a third trained model generated by learning images in which persons are recorded and the persons themselves, and acquires the number of persons based on the recognition result.

According to one aspect, first image information in which persons are recorded is received, and an estimation regarding the first image information is performed based on the first image information and the first trained model. When the estimation corresponds to a first case, the number of persons recorded in the first image information can be acquired based on the first image information and a second trained model generated by learning heat maps indicating the presence or absence of persons and the number of persons in each heat map. When the estimation corresponds to a second case different from the first case, each person recorded in the first image information can be recognized based on the first image information and a third trained model generated by learning images in which persons are recorded and the persons themselves, and the number of persons can be acquired based on the recognition result.

FIG. 1 is a diagram for explaining an information processing device according to one embodiment.
FIG. 2 is a block diagram for explaining the information processing device according to one embodiment.
FIG. 3 is a first flowchart for explaining an information processing method according to one embodiment.
FIG. 4 is a second flowchart for explaining the information processing method according to one embodiment.

An embodiment of the present invention will be described below.

[Overview of information processing device 1]
First, an overview of the information processing device 1 according to one embodiment will be described.
FIG. 1 is a diagram for explaining the information processing device 1 according to one embodiment.

The information processing device 1 may be configured as, for example, a person-count estimation device that estimates the number of persons recorded in an image. The information processing device 1 may also be configured as, for example, an estimation device that estimates the content recorded in an image.

The information processing device 1 may be, for example, a computer such as a server, a desktop, a laptop, or a tablet.
The information processing device 1 estimates the number of persons recorded in first image information 101 based on the first image information 101 and a first trained model 201. The first trained model 201 may be, for example, a trained model generated by learning images in which a plurality of persons are recorded and the number of persons recorded in each image.
Alternatively, based on the first image information 101 and the first trained model 201, the information processing device 1 may select whether to estimate the number of persons using a heat map or using person recognition. In this case, the first trained model 201 may be a trained model generated by learning images in which a plurality of persons are recorded together with the choice, for each image, between heat-map-based estimation and person-recognition-based estimation.

When the number of persons estimated as described above is equal to or greater than a preset threshold, and also when heat-map-based estimation is selected, the information processing device 1 acquires the number of persons recorded in the first image information 101 based on the first image information 101 and a second trained model 202. The second trained model 202 may be, for example, a trained model generated by learning heat maps indicating the presence or absence of persons and the number of persons in each heat map.

On the other hand, when the number of persons estimated as described above is less than the threshold, and also when person-recognition-based estimation is selected, the information processing device 1 recognizes each person recorded in the first image information 101 based on the first image information 101 and a third trained model 203. The third trained model 203 may be, for example, a trained model generated by learning images in which persons are recorded and the persons themselves. The information processing device 1 acquires the number of persons recorded in the first image information 101 by tallying the recognized persons.
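
The branching described in this overview can be illustrated with a minimal Python sketch. This is not the implementation of the embodiment: the function name, the three callables standing in for the first to third trained models, and the default threshold of 50 are all assumptions made for illustration.

```python
from typing import Any, Callable


def count_persons(
    image: Any,
    estimate_rough_count: Callable[[Any], float],  # hypothetical stand-in for first trained model 201
    count_with_heat_map: Callable[[Any], int],     # hypothetical stand-in for second trained model 202
    count_with_recognition: Callable[[Any], int],  # hypothetical stand-in for third trained model 203
    threshold: float = 50,                         # illustrative value only
) -> int:
    """Route an image to one of two counting methods based on a rough estimate."""
    rough = estimate_rough_count(image)
    if rough >= threshold:
        # Estimated count at or above the threshold: heat-map-based counting.
        return count_with_heat_map(image)
    # Estimated count below the threshold: recognize each person and tally.
    return count_with_recognition(image)
```

In this sketch the rough estimate is used only to choose the counting method; the count that is returned always comes from the selected method.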

[Details of information processing device 1]
Next, details of the information processing device 1 according to one embodiment will be described.
FIG. 2 is a block diagram for explaining the information processing device 1 according to one embodiment.

The information processing device 1 includes, for example, a communication unit 21, a storage unit 22, a display unit 23, and a control unit 11. The communication unit 21, the storage unit 22, and the display unit 23 may each be an embodiment of an output unit. The control unit 11 includes, for example, a reception unit 12, an estimation unit 13, a selection unit 14, a first acquisition unit 15, a second acquisition unit 16, and an output control unit 17. The selection unit 14 is described later in a modification. The control unit 11 may be implemented by, for example, an arithmetic processing unit of the information processing device 1. The control unit 11 (for example, the arithmetic processing unit) may realize the functions of each unit (for example, the reception unit 12, the estimation unit 13, the selection unit 14, the first acquisition unit 15, the second acquisition unit 16, and the output control unit 17) by reading and executing, as appropriate, various programs stored in the storage unit 22 or elsewhere.

The communication unit 21 can, for example, transmit and receive various information to and from a device outside the information processing device 1 (an external device).

The storage unit 22 may store, for example, various information and programs. Examples of the storage unit 22 include a memory, a solid state drive, and a hard disk drive.

The display unit 23 can display, for example, various characters, symbols, and images.

The reception unit 12 receives first image information in which persons are recorded. The first image information may be moving image information or still image information.
The reception unit 12 may receive the first image information from, for example, an external device. The external device in this case may be, for example, a server or a camera (not shown). The camera may be installed outdoors or indoors and may be, for example, a surveillance camera. The camera may generate image information by, for example, capturing an image in a predetermined direction (of a subject or the like). The server may, for example, acquire and accumulate the image information generated by the camera.
That is, the reception unit 12 may receive the image information accumulated in the server (the first image information) via, for example, the communication unit 21.
The reception unit 12 may also receive image information generated by the camera (the first image information). In this case, the reception unit 12 may receive the image information transmitted from the camera via, for example, the communication unit 21. Further, when a memory (not shown) on which image information generated by the camera is recorded is inserted into an interface (not shown) of the information processing device 1, the reception unit 12 may receive the image information (the first image information) from that memory.

The estimation unit 13 estimates the number of persons recorded in the first image information based on the first trained model and the first image information received by the reception unit 12. The first trained model may be, for example, a trained model generated by learning images in which a plurality of persons are recorded and the number of persons recorded in each image. That is, the first trained model may be, for example, a trained model that has learned a plurality of images of various patterns in which a plurality of persons are recorded, together with the number of persons recorded in each of those images.

In this case, the first trained model may be, for example, a trained model generated by learning an image in which persons are recorded, the number of persons estimated by inputting that image into the second trained model described later, and the number of persons estimated for that image using the third trained model described later. The first trained model may additionally learn which of these two estimates is closer to the correct number. When such a first trained model is used, the estimation unit 13 may estimate the number of persons using, for example, a first trained model generated by learning the number of persons estimated from second image information and the second trained model, the number of persons estimated from the second image information and the third trained model, and the number of persons actually recorded in the second image information. The second image information may be, for example, image information in which persons are recorded and which is used for learning.

The first trained model described above may be generated through learning performed by, for example, the control unit 11 (for example, a learning unit (not shown)).
The first trained model may also be generated through learning performed by a device outside the information processing device 1 (an external device such as a learning device (not shown)). When an external device generates the first trained model, the information processing device 1 may acquire the first trained model via, for example, the communication unit 21 or by using a memory (not shown) or the like.

When the number of persons estimated by the estimation unit 13 is equal to or greater than the threshold, the first acquisition unit 15 acquires the number of persons recorded in the first image information based on the second trained model and the first image information received by the reception unit 12. The second trained model may be, for example, a trained model generated by learning heat maps indicating the presence or absence of persons and the number of persons in each heat map.

That is, the second trained model may be, for example, a trained model that has learned heat maps expressing the presence or absence of persons on a scale that ranges from locations where a person is highly likely to be present to locations where a person is unlikely to be present. In this case, the first acquisition unit 15 may estimate the number of persons using such a second trained model. In the heat map, the presence or absence of persons may be expressed by, for example, differences in color or in color intensity. The heat map can be created by, for example, various known methods.

The second trained model may be, for example, a trained model that has learned a plurality of heat maps of various patterns in which a plurality of persons are recorded, together with the number of persons recorded in each of those heat maps.
The second trained model described above may be generated through learning performed by, for example, the control unit 11 (for example, a learning unit (not shown)).
The second trained model may also be generated through learning performed by a device outside the information processing device 1 (an external device such as a learning device (not shown)). When an external device generates the second trained model, the information processing device 1 may acquire the second trained model via, for example, the communication unit 21 or by using a memory (not shown) or the like.

The first acquisition unit 15 may generate a heat map indicating the presence or absence of persons based on, for example, the first image information received by the reception unit 12. The first acquisition unit 15 can generate the heat map from the first image information using, for example, various known methods. As one example, the first acquisition unit 15 may generate the heat map based on the first image information received by the reception unit 12 and features of a person that are specified in advance. The features of a person may be, for example, the contour (outline) of the person's body or various other features. Having generated the heat map in this way, the first acquisition unit 15 may estimate the number of persons based on the heat map and the second trained model.
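
As a rough illustration of turning a heat map into a count, the following Python sketch uses density-map integration, a common crowd-counting technique in which each cell holds a fractional share of a person and the count is the sum of all cells. This technique is swapped in here only as a stand-in for the second trained model, which the description does not specify in this detail. The function name and the nested-list representation of the heat map are assumptions.

```python
from typing import List


def count_from_density_map(density_map: List[List[float]]) -> int:
    """Estimate a person count by summing every cell of a density-style heat map.

    Assumption: each cell holds the fraction of a person located there, so the
    total over the map approximates the head count. A trained model, as in the
    description, would replace this simple summation.
    """
    total = sum(sum(row) for row in density_map)
    return round(total)


# Hypothetical 2x3 map in which two persons are spread over several cells.
example_map = [
    [0.0, 0.5, 0.5],
    [0.25, 0.25, 0.5],
]
estimated = count_from_density_map(example_map)
```

Summation tends to remain usable in crowded scenes where individual outlines overlap, which is consistent with the description's use of the heat-map path for larger counts.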

When the number of persons estimated by the estimation unit 13 is less than the threshold, the second acquisition unit 16 recognizes each person recorded in the first image information based on the third trained model and the first image information received by the reception unit 12, and acquires the number of persons based on the recognition result. The third trained model may be, for example, a trained model for recognizing persons based on an image. That is, the third trained model may be, for example, a trained model generated by learning images in which persons are recorded and the persons themselves. The third trained model may be, for example, a trained model that has learned a plurality of images of various patterns in which persons are recorded, together with the persons recorded in each of those images. As one example, the third trained model may have learned features of a person as the person. The features in this case may be various features, including the contour of the person's body, the face (facial features such as the eyes, nose, and mouth), the person's appearance from behind, the person's appearance from the side, the person's skeleton, and feature points of the person's body (for example, joints).

The third trained model described above may be generated through learning performed by, for example, the control unit 11 (for example, a learning unit (not shown)).
The third trained model may also be generated through learning performed by a device outside the information processing device 1 (an external device such as a learning device (not shown)). When an external device generates the third trained model, the information processing device 1 may acquire the third trained model via, for example, the communication unit 21 or by using a memory (not shown) or the like.

The second acquisition unit 16 recognizes one or more persons recorded in the first image information based on, for example, the first image information and the third trained model described above. The second acquisition unit 16 acquires the number of persons recorded in the first image information based on, for example, the number of recognized persons.
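
The tallying step of the second acquisition unit 16 can be sketched as follows. The `Detection` record, its `label` and `score` fields, and the `min_score` cutoff are hypothetical. The description states only that recognized persons are counted and does not mention a confidence cutoff.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Detection:
    """Hypothetical output record of a person-recognition model."""
    label: str    # recognized class, e.g. "person"
    score: float  # recognition confidence in [0, 1]


def count_recognized_persons(detections: Iterable[Detection], min_score: float = 0.5) -> int:
    """Count detections labeled as a person with sufficient confidence.

    The confidence cutoff is an illustrative assumption, not part of the description.
    """
    return sum(1 for d in detections if d.label == "person" and d.score >= min_score)


sample = [
    Detection("person", 0.92),
    Detection("person", 0.41),  # below the cutoff, so not counted
    Detection("bicycle", 0.88),
    Detection("person", 0.77),
]
recognized = count_recognized_persons(sample)
```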

The threshold used by the first acquisition unit 15 and the threshold used by the second acquisition unit 16 may be, for example, the same value or different values. The threshold may be set by determining in advance, through experiments or the like, which approach outputs the more accurate count: estimating the number of persons using a heat map (using the second trained model) or estimating it using person recognition (using the third trained model).
In general, when the number of persons recorded in the first image information is relatively large, estimation using a heat map (the second trained model) yields the more accurate count. Conversely, when the number of persons is relatively small, estimation using person recognition (the third trained model) yields the more accurate count. The appropriate value depends on the situation, but as examples the threshold may be 50, 60, 70, 80, or 90 persons.
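
One way such a preliminary experiment might be carried out is sketched below in Python. This is a sketch under assumptions, not the procedure of the embodiment: the record layout (rough estimate, true count, heat-map count, recognition count) and the sample values are invented for illustration, and a single threshold is assumed for both acquisition units.

```python
from typing import Iterable, List, Tuple

# Hypothetical validation record: (rough estimate, true count, heat-map count, recognition count)
Record = Tuple[float, int, int, int]


def pick_threshold(records: List[Record], candidates: Iterable[float]) -> float:
    """Return the candidate threshold that minimizes total absolute counting error."""

    def total_error(threshold: float) -> int:
        error = 0
        for rough, truth, heat_count, recog_count in records:
            # Same routing rule as the device: heat map at or above the threshold.
            chosen = heat_count if rough >= threshold else recog_count
            error += abs(chosen - truth)
        return error

    # On ties, min() keeps the first candidate encountered.
    return min(candidates, key=total_error)


validation = [
    (10, 10, 14, 10),
    (40, 42, 47, 41),
    (65, 66, 66, 58),
    (95, 90, 91, 70),
]
best = pick_threshold(validation, [50, 60, 70, 80, 90])
```

With the invented data above, thresholds of 50 and 60 give the same total error, and the sketch returns the first of them.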

The output control unit 17 may control the output unit so as to output the estimation result from the first acquisition unit 15 and the estimation result from the second acquisition unit 16. The output unit may be, for example, the communication unit 21, the storage unit 22, or the display unit 23.
That is, the output control unit 17 may control the communication unit 21 so as to transmit information on the estimation result from the first acquisition unit 15 and information on the estimation result from the second acquisition unit 16 to the outside (an external device). The external device in this case may be, for example, a server or a user terminal (not shown). The user terminal may be, for example, a terminal used by a user of the information processing device 1; specific examples include a desktop, a laptop, a tablet, and a smartphone.
The output control unit 17 may control the storage unit 22 so as to store information on the estimation result from the first acquisition unit 15 and information on the estimation result from the second acquisition unit 16.
The output control unit 17 may control the display unit 23 so as to display the estimation result from the first acquisition unit 15 and the estimation result from the second acquisition unit 16.

[Information processing method]
Next, an information processing method according to one embodiment will be described.

First, an example of a learning method will be described as an information processing method according to one embodiment.
FIG. 3 is a first flowchart for explaining the information processing method according to one embodiment.

In step ST101, the control unit 11 acquires second image information in which persons are recorded.

In step ST102, the control unit 11 estimates the number of persons recorded in the second image information based on the second image information acquired in step ST101 and the second trained model. The second trained model may be, for example, a trained model generated by learning heat maps indicating the presence or absence of persons and the number of persons in each heat map. That is, the second trained model may be, for example, a trained model that has learned heat maps expressing the presence or absence of persons on a scale that ranges from locations where a person is highly likely to be present to locations where a person is unlikely to be present.

In step ST103, the control unit 11 estimates the number of persons recorded in the second image information based on the second image information acquired in step ST101 and the third trained model. The third trained model may be, for example, a trained model for recognizing persons based on an image. That is, the third trained model may be, for example, a trained model generated by learning images in which persons are recorded and the persons themselves. The control unit 11 recognizes one or more persons recorded in the second image information based on, for example, the second image information and the third trained model. The control unit 11 acquires the number of persons recorded in the second image information based on, for example, the number of recognized persons.

In step ST104, the control unit 11 generates the first trained model by learning the second image information acquired in step ST101, the number of persons estimated in step ST102, and the number of persons estimated in step ST103. In this case, the control unit 11 may additionally learn which of the two estimates, the one from step ST102 or the one from step ST103, is more accurate.
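The training data used in step ST104 can be pictured with a short sketch. It is a minimal illustration only, not the patented implementation: the `Sample` record, the `true_count` annotation, the label strings, and the tie-break rule are all hypothetical names and choices introduced here.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    image_id: str         # stands in for one piece of second image information
    heatmap_count: float  # count estimated in step ST102 (second trained model)
    detect_count: int     # count estimated in step ST103 (third trained model)
    true_count: int       # hypothetical annotated count for the image


def label_more_accurate(sample: Sample) -> str:
    """Return which of the two estimates is closer to the annotated count."""
    heat_err = abs(sample.heatmap_count - sample.true_count)
    det_err = abs(sample.detect_count - sample.true_count)
    # Ties go to detection; this tie-break is an arbitrary assumption.
    return "heatmap" if heat_err < det_err else "detection"


def build_training_rows(samples):
    """Build (image, ST102 count, ST103 count, better method) training rows."""
    return [
        (s.image_id, s.heatmap_count, s.detect_count, label_more_accurate(s))
        for s in samples
    ]


rows = build_training_rows([
    Sample("crowd.jpg", 98.0, 60, 100),
    Sample("lobby.jpg", 7.5, 5, 5),
])
```

Rows of this shape could then be fed to whatever learner produces the first trained model; the choice of learner is left open here, as it is in the description.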

Note that steps ST101 to ST104 described above may be performed by a device external to the information processing device 1 (an external device). In this case, the information processing device 1 may acquire the first trained model generated in step ST104, the second trained model used in step ST102, and the third trained model used in step ST103.

Next, an example of a method for estimating the number of persons will be described as an information processing method according to one embodiment.
FIG. 4 is a second flowchart for explaining the information processing method according to one embodiment.

In step ST201, the reception unit 12 receives first image information in which a person is recorded.

In step ST202, the estimation unit 13 estimates the number of persons recorded in the first image information based on the first trained model and the first image information received in step ST201. The first trained model may be, for example, the trained model generated in step ST104, or a trained model generated by learning an image in which a plurality of persons are recorded and the number of persons recorded in that image.

In step ST203, the control unit 11 (for example, the estimation unit 13) selects one of the first acquisition unit 15 and the second acquisition unit 16 based on the estimation result of step ST202. That is, between estimation of the number of persons using the second trained model (first acquisition unit 15) and estimation of the number of persons using the third trained model (second acquisition unit 16), the control unit 11 (for example, the estimation unit 13) selects the function that can estimate the number of persons more accurately given the estimation result of step ST202. When the second trained model (first acquisition unit 15) is selected, the process proceeds to step ST204. When the third trained model (second acquisition unit 16) is selected, the process proceeds to step ST205.

In step ST204, the first acquisition unit 15 acquires the number of persons recorded in the first image information based on the second trained model and the first image information received in step ST201.
The second trained model may be, for example, a trained model generated by learning a heat map indicating the presence or absence of persons and the number of persons in that heat map. That is, the second trained model may be, for example, a trained model that has learned a heat map indicating the presence or absence of a person over a range spanning from locations where a person is highly likely to be present to locations where a person is unlikely to be present. In other words, the second trained model may be, for example, a trained model generated by learning such a heat map.
The first acquisition unit 15 may generate a heat map indicating the presence or absence of persons based on the first image information received in step ST201 and features of persons specified in advance. Having generated the heat map in this way, the first acquisition unit 15 may, for example, estimate the number of persons based on that heat map and the second trained model.
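How a heat map is turned into a count is not spelled out in the description. One common convention in crowd counting, assumed here purely for illustration, treats the heat map as a density map whose cell values sum to the number of persons. The function name `count_from_heatmap` and the sample values below are hypothetical.

```python
def count_from_heatmap(heatmap):
    """Sum a 2-D density-style heat map to obtain an estimated head count.

    Assumption: each cell holds a non-negative density value and the
    values over the whole map add up to the number of persons.
    """
    return sum(sum(row) for row in heatmap)


# A toy 3x3 heat map: one person spread over two cells, one person in one cell.
heatmap = [
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
estimated = round(count_from_heatmap(heatmap))
```

A trained regressor applied to the heat map, as the description suggests with the second trained model, could replace the plain summation without changing the surrounding flow.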

In step ST205, the second acquisition unit 16 recognizes each person recorded in the first image information based on the third trained model and the first image information received in step ST201, and acquires the number of persons based on the recognition result. The third trained model may be, for example, a trained model for recognizing a person based on an image. That is, the third trained model may be, for example, a trained model generated by learning an image in which a person is recorded and that person.
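Counting by recognition, as in step ST205, can be sketched as tallying the recognized persons. The `Detection` record, the `"person"` label, and the `min_score` cutoff are assumptions made for this illustration; the description does not state the output format of the third trained model.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    label: str    # class name reported by a hypothetical recognizer
    score: float  # recognizer confidence in [0, 1]


def count_persons(detections, min_score=0.5):
    """Count detections labelled as a person at or above the confidence cutoff."""
    return sum(
        1 for d in detections
        if d.label == "person" and d.score >= min_score
    )


detections = [
    Detection("person", 0.92),
    Detection("person", 0.61),
    Detection("person", 0.30),  # below the cutoff, ignored
    Detection("bicycle", 0.88),  # not a person, ignored
]
person_count = count_persons(detections)
```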

In step ST206, the output control unit 17 may control the output unit to output the estimation result of step ST204 and the estimation result of step ST205. The output unit may be, for example, the communication unit 21, the storage unit 22, the display unit 23, or the like.
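The flow of steps ST202 to ST205 can be summarized in a short sketch. It assumes the threshold comparison given in the aspects of this disclosure as the selection rule; the function name, the stub models passed as callables, and the default threshold of 20 are hypothetical and chosen only to make the example run.

```python
def estimate_person_count(image, coarse_estimator, heatmap_counter,
                          detection_counter, threshold=20):
    """Route one image through steps ST202 to ST205 and return (route, count)."""
    # ST202: rough count from the first trained model.
    coarse = coarse_estimator(image)
    # ST203: choose the acquisition unit from the rough count.
    if coarse >= threshold:
        # ST204: crowded scene, count with the heat-map model.
        return "first_acquisition_unit", heatmap_counter(image)
    # ST205: sparse scene, count by recognizing each person.
    return "second_acquisition_unit", detection_counter(image)


# Stub callables standing in for the three trained models.
crowded = estimate_person_count(
    "crowd.jpg", lambda img: 150, lambda img: 148, lambda img: 90)
sparse = estimate_person_count(
    "lobby.jpg", lambda img: 3, lambda img: 5, lambda img: 3)
```

Only one of the two counters is called per image, so the cost of the more expensive model is paid only when it is the one selected.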

[Modifications]
Next, modifications of the present embodiment will be described.

(Modification 1)
First, a first modification will be described. In the first modification, the "selection unit 14" described below may be provided in place of the "estimation unit 13" described above. Alternatively, the "estimation unit 13" and the "selection unit 14" may be combined into a single function.

In the first modification, the reception unit 12 may have the same configuration as described above.

The selection unit 14 selects which of the first acquisition unit 15 and the second acquisition unit 16 is to estimate the persons, based on the first trained model and the first image information received by the reception unit 12. The first trained model in this case may be, for example, a trained model generated by learning an image in which a plurality of persons are recorded and the choice between estimating the number of persons using a heat map and estimating the number of persons using person recognition. That is, the first trained model may be, for example, a trained model generated by learning an image in which persons are recorded, the number of persons estimated by inputting that image into the second trained model described later, and the number of persons estimated for that image using the third trained model described later. In this case, the first trained model may additionally have learned, for example, which of the two estimates, the one using the second trained model or the one using the third trained model, gives the more correct number of persons (that is, which of the first acquisition unit 15 and the second acquisition unit 16 yields the appropriate estimation result). Using the first trained model, the selection unit 14 outputs one of the first acquisition unit 15 and the second acquisition unit 16 as the function suited to estimating the number of persons recorded in the first image information.

When the first acquisition unit 15 is output (selected) by the selection unit 14, the first acquisition unit 15 estimates the number of persons recorded in the first image information based on the second trained model and the first image information. That is, when the first acquisition unit 15 is selected by the selection unit 14, the first acquisition unit 15 acquires the number of persons recorded in the first image information based on the first image information received by the reception unit 12 and the second trained model, which is generated by learning a heat map indicating the presence or absence of persons and the number of persons in that heat map. The second trained model may have the same configuration as described in the embodiment above.

When the second acquisition unit 16 is output (selected) by the selection unit 14, the second acquisition unit 16 recognizes the persons recorded in the first image information based on the third trained model and the first image information. The second acquisition unit 16 estimates the number of persons recorded in the first image information based on the number of persons recognized in this way. That is, when the second acquisition unit 16 is selected by the selection unit 14, the second acquisition unit 16 recognizes each person recorded in the first image information based on the first image information received by the reception unit 12 and the third trained model, which is generated by learning an image in which a person is recorded and that person, and acquires the number of persons based on the recognition result. The third trained model may have the same configuration as described in the embodiment above.
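In the first modification the first trained model outputs the acquisition unit to use rather than a rough count. A minimal sketch of that dispatch follows; the `selector` callable, the `"first"` and `"second"` keys, and the stub counters are hypothetical stand-ins for the trained models and are not taken from the description.

```python
def select_and_count(image, selector, acquisition_units):
    """Ask the selector which acquisition unit to use, then run only that unit."""
    choice = selector(image)
    if choice not in acquisition_units:
        raise ValueError(f"selector returned unknown unit: {choice!r}")
    return choice, acquisition_units[choice](image)


# Stub acquisition units: heat-map counting and recognition-based counting.
units = {
    "first": lambda img: 120,
    "second": lambda img: 4,
}
result = select_and_count("crowd.jpg", lambda img: "first", units)
```

Keeping the units in a mapping means the same dispatch would also cover the second modification, where the units estimate object content instead of a person count.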

(Modification 2)
Next, a second modification will be described.
The embodiment and the first modification described above are used to estimate the number of persons recorded in the first image information. However, the estimation performed by the information processing device 1 is not limited to the number of persons and may concern the content of various objects. The content of various objects may be anything recorded in image information (third image information); examples include indoor and in-room situations and the situation in each season (spring, summer, autumn, and winter).

The reception unit 12 receives third image information in which an object is recorded. The reception unit 12 may have the same configuration as in the embodiment described above.

The estimation unit 13 estimates the content recorded in the third image information based on the fourth trained model and the third image information received by the reception unit 12. The fourth trained model may be, for example, a trained model generated by learning an image in which a plurality of objects are recorded and the content of the objects recorded in that image. That is, the fourth trained model may be, for example, a trained model that has learned images of a plurality of patterns (a plurality of images), each recording one or more objects, and the content of the objects recorded in the image of each pattern.

In the second modification, as in the first modification described above, the selection unit 14 may be provided in place of the estimation unit 13. That is, the selection unit 14 may select which of the first acquisition unit 15 and the second acquisition unit 16 is to estimate the content of the object, based on the fourth trained model and the third image information received by the reception unit 12. The fourth trained model in this case may be, for example, a trained model generated by learning an image in which one or more objects are recorded and the choice between estimating the state of the object content using a heat map and estimating the state of the content using object recognition. That is, the fourth trained model may be, for example, a trained model generated by learning an image in which one or more objects are recorded, the state of the object content estimated by inputting that image into the fifth trained model described later, and the state of the object content estimated for that image using the sixth trained model described later. In this case, the fourth trained model may additionally have learned, for example, which of the two estimates, the one using the fifth trained model or the one using the sixth trained model, gives the more correct state (that is, which of the first acquisition unit 15 and the second acquisition unit 16 yields the appropriate estimation result).

When the content of the object estimated by the estimation unit 13 corresponds to a first case, the first acquisition unit 15 acquires the content of the object recorded in the third image information based on the fifth trained model and the third image information received by the reception unit 12. The fifth trained model may be, for example, a trained model generated by learning a heat map indicating the state of the object content and the content of the object in that heat map.
When the first acquisition unit 15 is output (selected) by the selection unit 14, the first acquisition unit 15 estimates the content (state) of the object recorded in the fourth image information based on the fifth trained model and the third image information.

When the content of the object estimated by the estimation unit 13 corresponds to a second case different from the first case, the second acquisition unit 16 recognizes each object recorded in the third image information based on the sixth trained model and the third image information received by the reception unit 12, and acquires the content of the object based on the recognition result. The sixth trained model may be, for example, a trained model generated by learning an image in which an object is recorded and that object.
When the second acquisition unit 16 is output (selected) by the selection unit 14, the second acquisition unit 16 recognizes the objects recorded in the third image information based on the sixth trained model and the third image information. The second acquisition unit 16 estimates the content (state) of the object recorded in the first image information based on the objects recognized in this way.

Each unit of the information processing device 1 described above may be realized as a function of an arithmetic processing unit or the like of a computer. That is, the reception unit 12, the estimation unit 13, the selection unit 14, the first acquisition unit 15, the second acquisition unit 16, and the output control unit 17 (control unit 11) of the information processing device 1 may be realized as a reception function, an estimation function, a selection function, a first acquisition function, a second acquisition function, and an output control function (control function), respectively, performed by an arithmetic processing unit or the like of a computer.
The information processing program can cause a computer to realize each of the functions described above. The information processing program may be recorded on a non-transitory computer-readable recording medium such as a memory, a solid-state drive, a hard disk drive, or an optical disc.
As described above, each unit of the information processing device 1 may be realized by an arithmetic processing unit or the like of a computer. The arithmetic processing unit or the like is configured by, for example, an integrated circuit. Each unit of the information processing device 1 may therefore be realized as a circuit constituting the arithmetic processing unit or the like. That is, the reception unit 12, the estimation unit 13, the selection unit 14, the first acquisition unit 15, the second acquisition unit 16, and the output control unit 17 (control unit 11) of the information processing device 1 may be realized as a reception circuit, an estimation circuit, a selection circuit, a first acquisition circuit, a second acquisition circuit, and an output control circuit (control circuit) constituting an arithmetic processing unit or the like of a computer.
The communication unit 21, the storage unit 22, and the display unit 23 (output unit) of the information processing device 1 may be realized as, for example, a communication function, a storage function, and a display function (output function) that include functions of an arithmetic processing unit or the like. The communication unit 21, the storage unit 22, and the display unit 23 (output unit) may also be realized as a communication circuit, a storage circuit, and a display circuit (output circuit) by being configured by, for example, an integrated circuit. The communication unit 21, the storage unit 22, and the display unit 23 (output unit) may also be configured as a communication device, a storage device, and a display device (output device) by being configured by, for example, a plurality of devices.

In the information processing device 1, any one of the units described above, or any plurality of them, can be combined.
Although the term "information" is used in this disclosure, "information" can be read as "data", and "data" can be read as "information".

[Aspects and effects of the present embodiment]
Next, aspects of the present embodiment and the effects of each aspect will be described. The present embodiment is not limited to the aspects described below and may be realized by combining the units described above as appropriate. The effects described below are examples, and the effects of each aspect are not limited to those described below.

(Aspect 1)
An information processing device according to one aspect includes: a reception unit that receives first image information in which a person is recorded; an estimation unit that estimates the number of persons recorded in the first image information based on the first image information received by the reception unit and a first trained model generated by learning an image in which a plurality of persons are recorded and the number of persons recorded in that image; a first acquisition unit that, when the number of persons estimated by the estimation unit is equal to or greater than a threshold, acquires the number of persons recorded in the first image information based on the first image information received by the reception unit and a second trained model generated by learning a heat map indicating the presence or absence of persons and the number of persons in that heat map; and a second acquisition unit that, when the number of persons estimated by the estimation unit is less than the threshold, recognizes each person recorded in the first image information based on the first image information received by the reception unit and a third trained model generated by learning an image in which a person is recorded and that person, and acquires the number of persons based on the recognition result.
Accordingly, the information processing device can estimate the number of persons based on the first image information.
When the number of persons recorded in the first image information is relatively large, estimating the number of persons using a heat map (using the second trained model) can output a more accurate number. Conversely, when the number of persons recorded in the first image information is relatively small, estimating the number of persons using person recognition (using the third trained model) can output a more accurate number. Because the information processing device selects and uses one of the first acquisition unit and the second acquisition unit according to the number of persons recorded in the first image information, it can estimate the number of persons more accurately.

(Aspect 2)
In the information processing device according to one aspect, the first acquisition unit may estimate the number of persons using a second trained model that has learned a heat map indicating the presence or absence of a person over a range spanning from locations where a person is highly likely to be present to locations where a person is unlikely to be present.
Accordingly, even when the number of persons recorded in the first image information is relatively large, the information processing device can acquire the number of persons more accurately by using the heat map.

(Aspect 3)
In the information processing device according to one aspect, the first acquisition unit may generate a heat map indicating the presence or absence of persons based on the first image information received by the reception unit and features of persons specified in advance, and may estimate the number of persons based on that heat map and the second trained model.
Accordingly, even when the number of persons recorded in the first image information is relatively large, the information processing device can acquire the number of persons more accurately by using the heat map.

(Aspect 4)
In the information processing device according to one aspect, the estimation unit may estimate the number of persons using a first trained model generated by learning the number of persons estimated based on second image information in which persons are recorded and the second trained model, the number of persons estimated based on the second image information and the third trained model, and the number of persons recorded in the second image information.
Accordingly, the information processing device can estimate, based on the first image information, which of the first acquisition unit and the second acquisition unit is more appropriate for estimating the number of persons. That is, the information processing device can select, from the first acquisition unit and the second acquisition unit, the function better suited to estimating the number of persons.

(Aspect 5)
An information processing device according to one aspect includes: a reception unit that receives first image information in which a person is recorded; a selection unit that selects which of a first acquisition unit and a second acquisition unit is to estimate the persons, based on the first image information received by the reception unit and a first trained model generated by learning an image in which a plurality of persons are recorded and the choice between estimating the number of persons using a heat map and estimating the number of persons using person recognition; the first acquisition unit, which, when selected by the selection unit, acquires the number of persons recorded in the first image information based on the first image information received by the reception unit and a second trained model generated by learning a heat map indicating the presence or absence of persons and the number of persons in that heat map; and the second acquisition unit, which, when selected by the selection unit, recognizes each person recorded in the first image information based on the first image information received by the reception unit and a third trained model generated by learning an image in which a person is recorded and that person, and acquires the number of persons based on the recognition result.
Accordingly, the information processing device can achieve the same effects as the aspects described above.

(Aspect 6)
An information processing device according to one aspect includes: a reception unit that receives third image information in which an object is recorded; an estimation unit that estimates the content recorded in the third image information based on the third image information received by the reception unit and a fourth trained model generated by learning an image in which a plurality of objects are recorded and the content of the objects recorded in that image; a first acquisition unit that, when the content of the object estimated by the estimation unit corresponds to a first case, acquires the content of the object recorded in the third image information based on the third image information received by the reception unit and a fifth trained model generated by learning a heat map indicating the state of the object content and the content of the object in that heat map; and a second acquisition unit that, when the content of the object estimated by the estimation unit corresponds to a second case different from the first case, recognizes each object recorded in the third image information based on the third image information received by the reception unit and a sixth trained model generated by learning an image in which an object is recorded and that object, and acquires the content of the object based on the recognition result.
Accordingly, the information processing device can achieve the same effects as the aspects described above.

(Aspect 7)
In an information processing method according to one aspect, a computer executes: a reception step of receiving first image information in which persons are recorded; an estimation step of estimating the number of persons recorded in the first image information, based on a first trained model generated by learning images in which a plurality of persons are recorded and the number of persons recorded in those images, and on the first image information received in the reception step; a first acquisition step of, when the number of persons estimated in the estimation step is equal to or greater than a threshold, acquiring the number of persons recorded in the first image information, based on a second trained model generated by learning heat maps indicating the presence or absence of persons and the number of persons in those heat maps, and on the first image information received in the reception step; and a second acquisition step of, when the number of persons estimated in the estimation step is less than the threshold, recognizing each person recorded in the first image information, based on a third trained model generated by learning images in which persons are recorded and the persons, and on the first image information received in the reception step, and acquiring the number of persons based on the recognition result.
The information processing method can thereby achieve the same effects as the information processing device of the aspect described above.
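
The routing between the two acquisition steps in this aspect can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the three trained models are represented by stand-in callables, and the names `count_people`, `coarse_counter`, `heatmap_counter` and `detector`, as well as the threshold value in the example calls, are hypothetical.

```python
from typing import Any, Callable, Sequence

Image = Any  # placeholder for whatever image type the models accept


def count_people(
    image: Image,
    coarse_counter: Callable[[Image], float],    # stands in for the first trained model
    heatmap_counter: Callable[[Image], float],   # stands in for the second trained model
    detector: Callable[[Image], Sequence[Any]],  # stands in for the third trained model
    threshold: float,
) -> int:
    """Route an image to heat-map counting or to per-person recognition."""
    estimate = coarse_counter(image)  # estimation step
    if estimate >= threshold:
        # First acquisition step: read the count from the heat-map model.
        return round(heatmap_counter(image))
    # Second acquisition step: recognize each person, then count them.
    return len(detector(image))


# Example calls with stub models (all values are made up for illustration).
crowded = count_people("img", lambda _: 40.0, lambda _: 38.6, lambda _: ["p"] * 12, threshold=20)
sparse = count_people("img", lambda _: 3.0, lambda _: 2.4, lambda _: ["p"] * 4, threshold=20)
```

In this sketch an estimate exactly equal to the threshold takes the heat-map path, matching the "equal to or greater than" wording of the aspect.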

(Aspect 8)
An information processing program according to one aspect causes a computer to realize: a reception function of receiving first image information in which persons are recorded; an estimation function of estimating the number of persons recorded in the first image information, based on a first trained model generated by learning images in which a plurality of persons are recorded and the number of persons recorded in those images, and on the first image information received by the reception function; a first acquisition function of, when the number of persons estimated by the estimation function is equal to or greater than a threshold, acquiring the number of persons recorded in the first image information, based on a second trained model generated by learning heat maps indicating the presence or absence of persons and the number of persons in those heat maps, and on the first image information received by the reception function; and a second acquisition function of, when the number of persons estimated by the estimation function is less than the threshold, recognizing each person recorded in the first image information, based on a third trained model generated by learning images in which persons are recorded and the persons, and on the first image information received by the reception function, and acquiring the number of persons based on the recognition result.
The information processing program can thereby achieve the same effects as the information processing device of the aspect described above.
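
The second trained model in the aspects above learns heat maps together with the number of persons in them. One common way to build such a training target in crowd counting, assumed here purely for illustration and not stated in the source, is to place a normalized Gaussian at each annotated person position so that the map sums to the head count. The function name `density_heatmap`, the `sigma` value and the sample coordinates are hypothetical.

```python
import math


def density_heatmap(points, height, width, sigma=4.0):
    """Return a height x width heat map with one unit-mass Gaussian per point."""
    heatmap = [[0.0] * width for _ in range(height)]
    for pr, pc in points:
        # Unnormalized Gaussian centred on the annotated (row, col) position.
        blob = [
            [math.exp(-((r - pr) ** 2 + (c - pc) ** 2) / (2.0 * sigma ** 2))
             for c in range(width)]
            for r in range(height)
        ]
        mass = sum(map(sum, blob))
        for r in range(height):
            for c in range(width):
                # Dividing by the total mass makes each person contribute 1 overall.
                heatmap[r][c] += blob[r][c] / mass
    return heatmap


heads = [(20, 30), (22, 34), (50, 10)]  # hypothetical (row, col) annotations
heatmap = density_heatmap(heads, height=64, width=64)
estimated_count = sum(map(sum, heatmap))  # close to the number of annotated persons
```

With this construction, overlapping persons still add up correctly in the total even when individual outlines cannot be separated, which is one reason a heat-map approach may suit crowded images.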

1 Information processing device
11 Control unit
12 Reception unit
13 Estimation unit
14 Selection unit
15 First acquisition unit
16 Second acquisition unit
17 Output control unit
21 Communication unit
22 Storage unit
23 Display unit

Claims

1. An information processing device comprising:
a reception unit that receives first image information in which persons are recorded;
an estimation unit that estimates the number of persons recorded in the first image information, based on a first trained model generated by learning images in which a plurality of persons are recorded and the number of persons recorded in those images, and on the first image information received by the reception unit;
a first acquisition unit that, when the number of persons estimated by the estimation unit is equal to or greater than a threshold, acquires the number of persons recorded in the first image information, based on a second trained model generated by learning heat maps indicating the presence or absence of persons and the number of persons in those heat maps, and on the first image information received by the reception unit; and
a second acquisition unit that, when the number of persons estimated by the estimation unit is less than the threshold, recognizes each person recorded in the first image information, based on a third trained model generated by learning images in which persons are recorded and the persons, and on the first image information received by the reception unit, and acquires the number of persons based on the recognition result.

2. The information processing device according to claim 1, wherein the first acquisition unit estimates the number of persons using the second trained model obtained by learning a heat map that indicates the presence or absence of persons over a range from locations where a person is estimated as highly likely to be present to locations where a person is estimated as unlikely to be present.

3. The information processing device according to claim 1, wherein the first acquisition unit generates a heat map indicating the presence or absence of persons based on the first image information received by the reception unit and on person features specified in advance, and estimates the number of persons based on the trained model.

4. The information processing device according to any one of claims 1 to 3, wherein the estimation unit estimates the number of persons using the first trained model generated by learning: the number of persons estimated based on second image information in which persons are recorded and the second trained model; the number of persons estimated based on the second image information and the third trained model; and the number of persons recorded in the second image information.

5. An information processing device comprising:
a reception unit that receives first image information in which persons are recorded;
a selection unit that selects whether estimation of persons is to be performed by a first acquisition unit or by a second acquisition unit, based on a first trained model generated by learning images in which a plurality of persons are recorded and the selection of either estimation of the number of persons using a heat map or estimation of the number of persons using recognition of persons, and on the first image information received by the reception unit;
the first acquisition unit, which, when the first acquisition unit is selected by the selection unit, acquires the number of persons recorded in the first image information, based on a second trained model generated by learning heat maps indicating the presence or absence of persons and the number of persons in those heat maps, and on the first image information received by the reception unit; and
the second acquisition unit, which, when the second acquisition unit is selected by the selection unit, recognizes each person recorded in the first image information, based on a third trained model generated by learning images in which persons are recorded and the persons, and on the first image information received by the reception unit, and acquires the number of persons based on the recognition result.

6. An information processing method in which a computer executes:
a reception step of receiving first image information in which persons are recorded;
an estimation step of estimating the number of persons recorded in the first image information, based on a first trained model generated by learning images in which a plurality of persons are recorded and the number of persons recorded in those images, and on the first image information received in the reception step;
a first acquisition step of, when the number of persons estimated in the estimation step is equal to or greater than a threshold, acquiring the number of persons recorded in the first image information, based on a second trained model generated by learning heat maps indicating the presence or absence of persons and the number of persons in those heat maps, and on the first image information received in the reception step; and
a second acquisition step of, when the number of persons estimated in the estimation step is less than the threshold, recognizing each person recorded in the first image information, based on a third trained model generated by learning images in which persons are recorded and the persons, and on the first image information received in the reception step, and acquiring the number of persons based on the recognition result.

7. An information processing program that causes a computer to realize:
a reception function of receiving first image information in which persons are recorded;
an estimation function of estimating the number of persons recorded in the first image information, based on a first trained model generated by learning images in which a plurality of persons are recorded and the number of persons recorded in those images, and on the first image information received by the reception function;
a first acquisition function of, when the number of persons estimated by the estimation function is equal to or greater than a threshold, acquiring the number of persons recorded in the first image information, based on a second trained model generated by learning heat maps indicating the presence or absence of persons and the number of persons in those heat maps, and on the first image information received by the reception function; and
a second acquisition function of, when the number of persons estimated by the estimation function is less than the threshold, recognizing each person recorded in the first image information, based on a third trained model generated by learning images in which persons are recorded and the persons, and on the first image information received by the reception function, and acquiring the number of persons based on the recognition result.