JP7533011B2

JP7533011B2 - Information processing device, program, and information processing method

Info

Publication number: JP7533011B2
Application number: JP2020139493A
Authority: JP
Inventors: 隆寛田中
Original assignee: Dai Nippon Printing Co Ltd
Current assignee: Dai Nippon Printing Co Ltd
Priority date: 2020-08-20
Filing date: 2020-08-20
Publication date: 2024-08-14
Anticipated expiration: 2040-08-20
Also published as: JP2022035280A

Description

The present invention relates to an information processing device, a program, and an information processing method.

Patent Document 1 discloses a technique that determines the size of a layout frame according to preset rules based on the width, height, and resolution of an image, and automatically adjusts the display size of the image by fitting the image into the determined layout frame.

Patent Document 1: Japanese Patent No. 3870834

The technique disclosed in Patent Document 1 requires that the rules for determining the size of the layout frame be set in advance, and setting those rules imposes a large processing burden. In addition, if the aspect ratio of the image differs significantly from the aspect ratio of the layout frame, the image fitted into the layout frame may look unnatural.

The present invention has been made in view of these circumstances, and its object is to provide an information processing device and the like that can appropriately display a subject (object) at a specified display size without increasing the processing burden.

An information processing device according to one aspect of the present invention includes: an image acquisition unit that acquires an image including an object; an aspect ratio acquisition unit that acquires the aspect ratio of a display area in which the object is to be displayed; a comparison unit that detects the object from the image and compares the aspect ratio of the region of the detected object with the aspect ratio of the display area; a switching unit that, when the aspect ratio of the region of the object differs from the aspect ratio of the display area, sequentially switches among regions of the object that have the aspect ratio of the display area; and an identification unit that identifies the region to switch to based on the degree of relevance between the object and an image cut out according to each switched region.

According to one aspect of the present invention, a subject (object) can be appropriately displayed at a specified display size without increasing the processing burden.

FIG. 1 is a block diagram showing an example of the configuration of the information processing device.
FIG. 2 is a schematic diagram for explaining the image editing process.
FIG. 3 is a flowchart showing an example of the image editing processing procedure.
FIG. 4 is a flowchart showing an example of the image editing processing procedure.
FIG. 5 is a schematic diagram for explaining the image editing process.
FIG. 6 is a schematic diagram for explaining the image editing process.
FIG. 7 is a schematic diagram for explaining the image editing process.
FIG. 8 is a flowchart showing another example of the image editing processing procedure.
FIG. 9 is a schematic diagram showing an example of a screen.
FIG. 10 is a flowchart showing an example of the score calculation processing procedure according to Embodiment 2.
FIG. 11 is a schematic diagram for explaining the score calculation process.
FIG. 12 is a schematic diagram for explaining the layout process.
FIG. 13 is a flowchart showing an example of the layout processing procedure.
FIG. 14 is a flowchart showing an example of the layout processing procedure.
FIG. 15 is a flowchart showing an example of the layout processing procedure.
FIG. 16 is a schematic diagram for explaining the layout process.
FIG. 17 is a schematic diagram for explaining the layout process.
FIG. 18 is a schematic diagram for explaining the layout process.
FIG. 19 is a schematic diagram for explaining the layout process.

The information processing device, program, and information processing method of the present disclosure are described in detail below with reference to the drawings showing embodiments thereof.

(Embodiment 1)
An information processing device that generates an image for posting by extracting (clipping), from an image of a photographed subject (object), a region in which the subject is highly visible will be described. FIG. 1 is a block diagram showing an example of the configuration of the information processing device. The information processing device 10 is a device capable of various kinds of information processing and of sending and receiving information, such as a smartphone, a tablet terminal, a personal computer, or a server computer. The information processing device 10 may also be configured as a virtual machine running on a large-scale computer, a cloud computing system, a quantum computer, or the like, or as a dedicated terminal. When posting an image (captured image) to, for example, an SNS (Social Networking Service), the information processing device 10 of this embodiment generates an image for posting by clipping, from the image, a region in which the subject is highly visible. Note that the image to be processed is not limited to an image for posting to an SNS, and may be an image for any medium such as a book, magazine, weekly magazine, pamphlet, catalog, newspaper, menu, or flyer; the medium may be a paper medium or a digital medium such as an electronic book. The image to be processed may be a photograph or an illustration.

The information processing device 10 includes a control unit 11, a storage unit 12, a communication unit 13, an input unit 14, a display unit 15, a camera 16, a reading unit 17, and the like, and these units are connected to one another via a bus. The control unit 11 includes one or more processors such as a CPU (Central Processing Unit), an MPU (Micro-Processing Unit), or a GPU (Graphics Processing Unit). By executing, as appropriate, a control program 12P stored in the storage unit 12, the control unit 11 performs the various information processing and control processing to be performed by the information processing device of the present disclosure.

The storage unit 12 includes a RAM (Random Access Memory), a flash memory, a hard disk, an SSD (Solid State Drive), and the like. The storage unit 12 stores in advance the control program 12P executed by the control unit 11 and the various data required to execute the control program 12P. The storage unit 12 also temporarily stores data generated when the control unit 11 executes the control program 12P. The storage unit 12 further stores an image editing application program 12AP (hereinafter referred to as the image editing app 12AP) for generating an image for posting from an image.

The communication unit 13 has an interface for connecting to a network such as the Internet by wired or wireless communication, and transmits and receives information to and from other devices via the network. The input unit 14 accepts operation input from the user operating the information processing device 10 and sends a control signal corresponding to the operation to the control unit 11. The display unit 15 is a liquid crystal display, an organic EL display, or the like, and displays various information according to instructions from the control unit 11. The input unit 14 and the display unit 15 may be integrated as a touch panel.

The camera 16 is an imaging device having a lens, an imaging element, and the like, and acquires image data of a subject image through the lens. The camera 16 captures images according to instructions from the control unit 11, acquires, for example, one frame of image data (a still image), and stores the acquired image data in the storage unit 12. The camera 16 may be built into the information processing device 10 or externally attached to it. In the latter case, the information processing device 10 has a connection unit to which an external camera can be connected, or a camera communication unit for wired or wireless communication with the external camera, and acquires image data captured by the external camera via the connection unit or the camera communication unit. The information processing device 10 of this embodiment need not include the camera 16, and may instead acquire image data captured by another information processing device or camera via a network or via a portable storage medium 1a.

The reading unit 17 reads information stored in the portable storage medium 1a, which may be a CD (Compact Disc)-ROM, a DVD (Digital Versatile Disc)-ROM, a USB (Universal Serial Bus) memory, an SD (Secure Digital) card, or the like. The control program 12P, the image editing app 12AP, and the various data stored in the storage unit 12 may be read by the control unit 11 from the portable storage medium 1a via the reading unit 17 and stored in the storage unit 12. Alternatively, they may be downloaded by the control unit 11 from another device via the communication unit 13 and stored in the storage unit 12.

The process by which the information processing device 10 of this embodiment generates an image for posting from an image (captured image) is described below. FIG. 2 is a schematic diagram for explaining the image editing process. The information processing device 10 of this embodiment takes as input data the captured image to be edited and the aspect ratio (horizontal length : vertical length) of the edited image for posting, and generates the image for posting (output data) by extracting, from the captured image, a region that has the input (set) aspect ratio and in which the subject is clearly visible. An image for posting in which the subject is displayed in an easy-to-see state is thus generated from the image captured by the user.

FIGS. 3 and 4 are flowcharts showing an example of the image editing processing procedure, and FIGS. 5 to 7 are schematic diagrams for explaining the image editing process. The score calculation process shown in FIG. 4 is the "score calculation process" within the image editing process shown in FIG. 3. The following processing is executed by the control unit 11 in accordance with the control program 12P and the image editing app 12AP stored in the storage unit 12 of the information processing device 10. Part of the following processing may be implemented by a dedicated hardware circuit.

In the information processing device 10 of this embodiment, when posting an image of a photographed subject (object) to an SNS or the like, the user performs a predetermined operation via the input unit 14 to specify the captured image and the aspect ratio of the image for posting, and instructs execution of the process of generating the image for posting. Note that if, for example, the aspect ratio of the image for posting is set in advance according to the destination SNS, the user only needs to specify the captured image. When the control unit 11 (image acquisition unit) of the information processing device 10 receives, via the input unit 14, the instruction to execute the process of generating the image for posting, it acquires the specified captured image (S11). For example, if the captured image is stored in the storage unit 12, the control unit 11 reads the captured image from the storage unit 12. The captured image is not limited to an image captured by the camera 16, and may be an image captured by the camera of another device and stored in the storage unit 12 via a network or via the portable storage medium 1a. FIG. 5A shows an example of a captured image. When the aspect ratio of the image for posting (the display area of the image) is specified via the input unit 14, the control unit 11 (aspect ratio acquisition unit) acquires the specified aspect ratio; when an aspect ratio for the image for posting is set for the destination SNS, it acquires the aspect ratio set for that SNS. The following description uses the case where the aspect ratio of the image for posting is 1:1 as an example.

Next, the control unit 11 performs object detection processing on the acquired captured image to detect the subject (object) in the image (S12). Object detection in the image can be performed using a learning model such as R-CNN (Regions with Convolutional Neural Network), Fast R-CNN, Faster R-CNN, Mask R-CNN, YOLO (You Only Look Once), or SSD (Single Shot MultiBox Detector). Specifically, the control unit 11 inputs the captured image into such a learning model and detects the object (subject) in the captured image based on the information output from the learning model. Since object detection processing using a learning model is publicly known, a detailed description is omitted. The learning model may be incorporated in the image editing app 12AP or stored in the storage unit 12. In FIG. 5B, a bounding box indicating the subject (here, a dog) detected by object detection using a learning model is superimposed, as a dashed line, on the captured image shown in FIG. 5A. Object detection in the image may also be performed using a template matching technique. In this case, templates indicating the image features of the objects to be detected are stored in advance in the storage unit 12, and the control unit 11 can detect whether an object is present according to whether the captured image contains a region that matches any of the templates.

The control unit 11 sets the clipping range to be extracted from the captured image based on the subject detected from the captured image (S13). For example, as shown by the dashed rectangle in FIG. 5B, the control unit 11 sets, as the clipping range, the region of the bounding box (circumscribing rectangle) surrounding the subject detected from the captured image using the learning model. The region of the captured image that contains the subject (object) is thereby set as the clipping range. Note that the control unit 11 may instead set, as the clipping range, a region obtained by expanding the bounding box by a predetermined amount (a predetermined number of pixels) in each of the vertical and horizontal directions.

When capturing an image to post to an SNS, users often frame the shot so that a single subject appears in the center of the image. In this case, the control unit 11 detects one subject from the captured image, so a clipping range containing that one subject is set. If the captured image contains multiple subjects, the control unit 11 detects the multiple subjects from the captured image. In this case, the control unit 11 may set, as the clipping range, the circumscribing rectangular region that contains all of the subjects.
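The clipping-range setting of step S13 can be illustrated with a minimal sketch. The function name `clipping_range`, the `(left, top, right, bottom)` box convention with exclusive right and bottom edges, and the `margin` parameter are assumptions introduced here for illustration; the source only states that the bounding box, optionally expanded by a predetermined number of pixels, or the rectangle circumscribing multiple subjects, is used.

```python
from typing import Iterable, Tuple

# Hypothetical box convention: (left, top, right, bottom), right/bottom exclusive.
Box = Tuple[int, int, int, int]


def clipping_range(boxes: Iterable[Box], margin: int = 0) -> Box:
    """Return the rectangle circumscribing all detected subjects.

    `margin` expands the rectangle by the same number of pixels on
    every side (the optional expansion mentioned for step S13).
    """
    boxes = list(boxes)
    if not boxes:
        raise ValueError("at least one detected subject is required")
    left = min(b[0] for b in boxes) - margin
    top = min(b[1] for b in boxes) - margin
    right = max(b[2] for b in boxes) + margin
    bottom = max(b[3] for b in boxes) + margin
    return (left, top, right, bottom)
```

With a single detected subject the function returns that subject's bounding box unchanged; with several subjects it returns their common circumscribing rectangle. Whether the expanded range should be clamped to the image bounds is not stated in the source, so the sketch leaves it unclamped.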

The control unit 11 (comparison unit) compares the aspect ratio of the clipping range set in step S13 with the specified aspect ratio of the image for posting, and determines whether the two aspect ratios match (S14). Specifically, the control unit 11 counts the number of pixels in the left-right (horizontal) direction and the number of pixels in the up-down (vertical) direction of the image within the clipping range, and calculates the aspect ratio of the clipping range (number of horizontal pixels : number of vertical pixels). The control unit 11 then determines whether the calculated aspect ratio matches the aspect ratio of the image for posting. If it determines that the aspect ratios match (S14: YES), the control unit 11 (generation unit) proceeds to step S29 and generates the image for posting (display image) by extracting, from the captured image acquired in step S11, the image (pixels) within the clipping range set in step S13 (S29).
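The comparison in step S14 can be sketched as follows. The function names `aspect_ratio` and `ratios_match` are hypothetical, and the use of exact integer arithmetic (rather than a tolerance) is an assumption; the source only states that the two ratios are compared for a match.

```python
from fractions import Fraction


def aspect_ratio(width_px: int, height_px: int) -> Fraction:
    """Aspect ratio (horizontal pixels : vertical pixels) in lowest terms."""
    return Fraction(width_px, height_px)


def ratios_match(clip_w: int, clip_h: int, target_w: int, target_h: int) -> bool:
    """True if clip_w:clip_h equals target_w:target_h.

    Cross-multiplication avoids floating-point rounding.
    """
    return clip_w * target_h == clip_h * target_w
```

For instance, `aspect_ratio(330, 270)` reduces to 11/9, so a 330×270 clipping range does not match a 1:1 target and the flow would continue to step S15.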

If it determines that the aspect ratios do not match (S14: NO), the control unit 11 identifies the adjustment direction for the clipping range set in step S13 (S15). That is, the control unit 11 identifies the direction in which the clipping range should be adjusted so that its aspect ratio matches the aspect ratio of the image for posting. FIGS. 6 and 7 are explanatory diagrams of the adjustment directions of the clipping range: FIG. 6 shows the adjustment directions for a horizontally long clipping range, and FIG. 7 shows the adjustment directions for a vertically long clipping range. FIGS. 6A and 7A show the clipping range set in step S13 as a dashed rectangle. The clipping range shown in FIG. 6A has an image size of 330 pixels horizontally and 270 pixels vertically (330×270 pixels), and its aspect ratio is 11:9. To make such a clipping range match an aspect ratio of 1:1, it can be expanded in the vertical direction or reduced in the horizontal direction. Therefore, in this embodiment, as shown in FIG. 6B, the clipping range is adjusted (expanded) to an aspect ratio of 1:1 in three ways, each yielding an image size of 330×330 pixels: by expanding it 60 pixels upward, by expanding it 30 pixels upward and 30 pixels downward, and by expanding it 60 pixels downward. If such an expansion produces a region that lies outside the captured image and therefore contains no pixels, black pixels may be added to that region to generate the image of the expanded clipping range. Likewise, as shown in FIG. 6C, the clipping range is adjusted (reduced) to an aspect ratio of 1:1 in three ways, each yielding an image size of 270×270 pixels: by trimming 60 pixels from the left edge, by trimming 30 pixels from each of the left and right edges, and by trimming 60 pixels from the right edge.
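The optional black-pixel padding for an expanded clipping range that extends beyond the captured image can be sketched as below. This is only an illustration under stated assumptions: the image is modeled as a nested list of pixel values, `crop_with_black_padding` is a hypothetical helper name, and the black value defaults to `0`.

```python
def crop_with_black_padding(image, left, top, width, height, black=0):
    """Cut a width x height region out of `image` (a list of pixel rows).

    Positions that fall outside the image are filled with `black`,
    so the result always has exactly the requested size.
    """
    img_h = len(image)
    img_w = len(image[0]) if img_h else 0
    result = []
    for y in range(top, top + height):
        row = []
        for x in range(left, left + width):
            if 0 <= y < img_h and 0 <= x < img_w:
                row.append(image[y][x])   # pixel exists in the captured image
            else:
                row.append(black)         # no pixel exists: pad with black
        result.append(row)
    return result
```

For example, cropping a 2×2 image one row above its top edge yields a first row made entirely of black pixels followed by the original rows.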

The clipping range shown in FIG. 7A has an image size of 270 pixels horizontally and 330 pixels vertically (270×330 pixels), and its aspect ratio is 9:11. To make such a clipping range match an aspect ratio of 1:1, it can be expanded in the horizontal direction or reduced in the vertical direction. Therefore, in this embodiment, as shown in FIG. 7B, the clipping range is adjusted (expanded) to an aspect ratio of 1:1 in three ways, each yielding an image size of 330×330 pixels: by expanding it 60 pixels leftward, by expanding it 30 pixels leftward and 30 pixels rightward, and by expanding it 60 pixels rightward. Likewise, as shown in FIG. 7C, the clipping range is adjusted (reduced) to an aspect ratio of 1:1 in three ways, each yielding an image size of 270×270 pixels: by trimming 60 pixels from the top edge, by trimming 30 pixels from each of the top and bottom edges, and by trimming 60 pixels from the bottom edge.
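The six adjusted clipping ranges of FIGS. 6 and 7 can be enumerated with a short sketch. The function name `candidate_ranges`, the `(left, top, width, height)` representation, and the generalization to an arbitrary target ratio with rounding are assumptions made for illustration; the source describes only the 1:1 case with three expansions and three reductions.

```python
def candidate_ranges(left, top, width, height, target_w=1, target_h=1):
    """List (label, (left, top, width, height)) candidates whose aspect
    ratio is target_w:target_h. Returns [] if the ratio already matches."""
    candidates = []
    if width * target_h > height * target_w:        # horizontally long
        new_h = round(width * target_h / target_w)
        grow = new_h - height
        for label, up in (("expand up", grow),
                          ("expand up and down", grow // 2),
                          ("expand down", 0)):
            candidates.append((label, (left, top - up, width, new_h)))
        new_w = round(height * target_w / target_h)
        cut = width - new_w
        for label, off in (("trim left edge", cut),
                           ("trim left and right edges", cut // 2),
                           ("trim right edge", 0)):
            candidates.append((label, (left + off, top, new_w, height)))
    elif width * target_h < height * target_w:      # vertically long
        new_w = round(height * target_w / target_h)
        grow = new_w - width
        for label, lft in (("expand left", grow),
                           ("expand left and right", grow // 2),
                           ("expand right", 0)):
            candidates.append((label, (left - lft, top, new_w, height)))
        new_h = round(width * target_h / target_w)
        cut = height - new_h
        for label, off in (("trim top edge", cut),
                           ("trim top and bottom edges", cut // 2),
                           ("trim bottom edge", 0)):
            candidates.append((label, (left, top + off, width, new_h)))
    return candidates
```

For a 330×270 range the three expansions move the top edge up by 60, 30, and 0 pixels to reach 330×330, and the three reductions move the left edge right by 60, 30, and 0 pixels to reach 270×270, mirroring FIGS. 6B and 6C.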

Having identified the adjustment direction of the clipping range, the control unit 11 expands or reduces the clipping range according to the identified direction. In this way, the control unit 11 (switching unit) can sequentially switch among regions (adjusted clipping ranges) that contain the subject and have the aspect ratio of the image for posting. The control unit 11 first expands the clipping range, for example, upward or leftward (S16). Here, if the clipping range is horizontally long, the control unit 11 expands it upward as shown on the left side of FIG. 6B; if it is vertically long, the control unit 11 expands it leftward as shown on the left side of FIG. 7B. Then, based on the image cut out from the captured image according to the expanded clipping range, the control unit 11 performs a process of calculating, for this expanded clipping range, a score relating to the degree of relevance to the subject (S17).

In the score calculation process shown in FIG. 4, the control unit 11 (proportion score calculation unit) calculates a score corresponding to the proportion of the subject region in the image cut out from the captured image based on the expanded (adjusted) clipping range. Specifically, the control unit 11 (object detection unit) identifies the subject region within the adjusted clipping range based on the subject region detected from the captured image in step S12 (S41). For the clipping range shown on the left side of FIG. 5C, the region of the dog (subject region) shown in white on the right side of FIG. 5C is identified. The control unit 11 then calculates the number of pixels in the identified subject region (S42), and calculates the ratio of the number of pixels in the subject region to the number of pixels in the adjusted clipping range, thereby obtaining a score relating to the area of the subject region relative to the adjusted clipping range (S43). In the example shown in FIG. 5C, a subject region of 55,806 pixels is detected in a clipping range of 330×330 pixels, and an area score of 0.51 is calculated. Thus, the larger the subject region within the clipping range, the higher the area score.
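The area score of steps S42 and S43 can be reproduced with a minimal sketch. The function names and the representation of the subject region as a 2D mask of truthy values are assumptions for illustration; the source specifies only the ratio of subject pixels to clipping-range pixels.

```python
def count_subject_pixels(mask) -> int:
    """Count truthy cells in a 2D mask of the subject region (S42)."""
    return sum(1 for row in mask for value in row if value)


def area_score(subject_pixels: int, clip_width: int, clip_height: int) -> float:
    """Ratio of subject pixels to all pixels in the clipping range (S43)."""
    total = clip_width * clip_height
    if total <= 0:
        raise ValueError("clipping range must contain at least one pixel")
    return subject_pixels / total
```

With the values from FIG. 5C, `area_score(55806, 330, 330)` gives 55,806 / 108,900, which rounds to the 0.51 stated in the text.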

Next, the control unit 11 (position score calculation unit) calculates a score corresponding to the position of the subject region relative to the adjusted clipping range. Specifically, the control unit 11 calculates the coordinates of the center (image center) of the adjusted clipping range and the coordinates of the center (centroid) of the subject region within the clipping range identified in step S41 (S44). The coordinates of each pixel in the clipping range are expressed, for example, with the upper left corner of the clipping range as the origin (0, 0), as the number of pixels rightward from the origin and the number of pixels downward from the origin. In the example shown in FIG. 5D, for a clipping range of 330×330 pixels, the coordinates of the center (image center) of the clipping range are calculated as (165, 165), and the coordinates of the center of the subject region are calculated as (207, 179). The center of the subject region may be given, for example, by the average of the coordinates of all pixels in the subject region; by the horizontal midpoint between the leftmost and rightmost pixels of the subject region together with the vertical midpoint between its topmost and bottommost pixels; or by the average of the coordinates of the pixels on the contour of the subject region. Next, the control unit 11 calculates the length of the half diagonal of the adjusted clipping range (S45). The half diagonal is the line segment between the image center of the clipping range and any one of its four corners; in the example shown in FIG. 5D, the length of the half diagonal is calculated as 233.3. The control unit 11 also calculates the distance between the image center of the clipping range and the center of the subject region (center-to-center distance), based on the coordinates of the image center of the clipping range and the coordinates of the center of the subject region calculated in step S44 (S46). In the example shown in FIG. 5D, the center-to-center distance is calculated as 44.3. The control unit 11 calculates a score relating to the position of the subject region relative to the adjusted clipping range based on the length of the half diagonal and the center-to-center distance (S47). For example, the control unit 11 calculates the position score as 1.0 - (center-to-center distance) / (length of the half diagonal); in the example shown in FIG. 5D, a position score of 0.81 is calculated. Thus, the closer the subject region is to the center of the clipping range, the higher the position score.
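The position score of steps S44 to S47 can be checked against the FIG. 5D values with a short sketch. The function names and the mask representation are hypothetical; `centroid` implements only the first of the center definitions mentioned above (the average of all subject pixel coordinates).

```python
import math


def centroid(mask):
    """Average (x, y) of all truthy cells in a 2D mask, with the origin
    at the upper left corner of the clipping range."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not points:
        raise ValueError("mask contains no subject pixels")
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def position_score(clip_width, clip_height, subject_center):
    """1.0 - (center-to-center distance) / (half-diagonal length)."""
    cx, cy = clip_width / 2, clip_height / 2          # image center (S44)
    half_diagonal = math.hypot(cx, cy)                # S45
    distance = math.hypot(subject_center[0] - cx,     # S46
                          subject_center[1] - cy)
    return 1.0 - distance / half_diagonal             # S47
```

For a 330×330 range with the subject center at (207, 179), the half diagonal is about 233.3, the center-to-center distance is about 44.3, and the score rounds to 0.81, in agreement with the text.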

制御部１１は、ステップＳ４３で算出した面積に関するスコアと、ステップＳ４７で算出した位置に関するスコアとを、調整後のクリッピング範囲に対応付けて記憶する（Ｓ４８）。例えば制御部１１は、撮影画像に対する調整後のクリッピング範囲の位置を示す情報に対応付けて、面積に関するスコア及び位置に関するスコアを記憶する。なお、クリッピング範囲の位置は、例えば撮影画像においてクリッピング範囲の４隅の画素の座標値で表され、４隅の画素の座標値は、例えば撮影画像の左上を原点（０，０）とし、原点から右方向への画素数と原点から下方向への画素数とによって表される。また、クリッピング範囲の位置は、撮影画像においてクリッピング範囲の左上の画素の座標値と、クリッピング範囲の画像サイズとで表されてもよい。 The control unit 11 stores the area score calculated in step S43 and the position score calculated in step S47 in association with the adjusted clipping range (S48). For example, the control unit 11 stores the area score and the position score in association with information indicating the position of the adjusted clipping range for the captured image. Note that the position of the clipping range is represented, for example, by the coordinate values of the pixels at the four corners of the clipping range in the captured image, and the coordinate values of the pixels at the four corners are represented, for example, by the number of pixels to the right and the number of pixels to the bottom from the origin, with the top left corner of the captured image being the origin (0,0). The position of the clipping range may also be represented by the coordinate values of the top left pixel of the clipping range in the captured image and the image size of the clipping range.
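The two ways of representing the position of a clipping range, and the association of scores with that position in step S48, might be organized as in the sketch below. The class `ClipRange`, the dictionary `scores`, and the function `store_scores` are hypothetical names introduced for illustration. It is also an assumption that the corner coordinates denote the corner pixels themselves, so that the right edge is `left + width - 1`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ClipRange:
    """Clipping range given by its top-left pixel and image size.

    Coordinates are in the captured image, whose top-left is the origin (0, 0).
    """
    left: int
    top: int
    width: int
    height: int

    def corners(self):
        """Equivalent representation by the four corner pixels (assumed inclusive)."""
        right = self.left + self.width - 1
        bottom = self.top + self.height - 1
        return [(self.left, self.top), (right, self.top),
                (self.left, bottom), (right, bottom)]

# Hypothetical in-memory store: clipping range -> scores (S48).
scores = {}

def store_scores(clip, area_score, position_score):
    """Associate the area score and position score with a clipping range."""
    scores[clip] = {"area": area_score, "position": position_score}
```

Because the dataclass is frozen, a `ClipRange` is hashable and can serve directly as the key under which its scores are stored.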

制御部１１は、図３に示す画像編集処理に戻り、ステップＳ１３で設定したクリッピング範囲に対して、ステップＳ１８～Ｓ１９の処理を行う。具体的には、制御部１１は、ステップＳ１３で設定したクリッピング範囲を上下方向又は左右方向に拡張する（Ｓ１８）。ここでは、制御部１１は、クリッピング範囲が横長である場合、図６Ｂの中央に示すように上方向及び下方向に拡張し、縦長である場合、図７Ｂの中央に示すように左方向及び右方向に拡張する。そして制御部１１は、拡張後のクリッピング範囲に応じて撮影画像から切り出した画像に基づいて、この拡張後のクリッピング範囲に対するスコアの算出処理を行う（Ｓ１９）。ここでのスコア算出処理は、ステップＳ１７と同様の処理であり、図４に示す処理である。これにより、ステップＳ１８で拡張したクリッピング範囲についても面積に関するスコア及び位置に関するスコアが算出されて記憶される。 The control unit 11 returns to the image editing process shown in FIG. 3 and performs the processes of steps S18 to S19 on the clipping range set in step S13. Specifically, the control unit 11 expands the clipping range set in step S13 in the vertical or horizontal direction (S18). Here, if the clipping range is horizontal, the control unit 11 expands it in the upward and downward directions as shown in the center of FIG. 6B, and if the clipping range is vertical, the control unit 11 expands it in the left and right directions as shown in the center of FIG. 7B. Then, the control unit 11 performs a score calculation process for the expanded clipping range based on an image cut out from the captured image according to the expanded clipping range (S19). The score calculation process here is the same as step S17 and is the process shown in FIG. 4. As a result, the area score and position score are calculated and stored for the clipping range expanded in step S18.

次に制御部１１は、ステップＳ１３で設定したクリッピング範囲に対して、ステップＳ２０～Ｓ２１の処理を行う。具体的には、制御部１１は、ステップＳ１３で設定したクリッピング範囲を下方向又は右方向に拡張する（Ｓ２０）。ここでは、制御部１１は、クリッピング範囲が横長である場合、図６Ｂの右側に示すように下方向に拡張し、縦長である場合、図７Ｂの右側に示すように右方向に拡張する。そして制御部１１は、拡張後のクリッピング範囲内の画像に基づいて、この拡張後のクリッピング範囲に対するスコアの算出処理を行う（Ｓ２１）。これにより、ステップＳ２０で拡張したクリッピング範囲についても面積に関するスコア及び位置に関するスコアが算出されて記憶される。 Then, the control unit 11 performs the processes of steps S20 to S21 on the clipping range set in step S13. Specifically, the control unit 11 expands the clipping range set in step S13 in a downward or rightward direction (S20). Here, if the clipping range is horizontal, the control unit 11 expands it in a downward direction as shown on the right side of FIG. 6B, and if the clipping range is vertical, the control unit 11 expands it in a rightward direction as shown on the right side of FIG. 7B. Then, the control unit 11 performs a score calculation process for the expanded clipping range based on the image within the expanded clipping range (S21). As a result, area-related scores and position-related scores are calculated and stored for the clipping range expanded in step S20 as well.

同様に制御部１１は、ステップＳ１３で設定したクリッピング範囲に対して、ステップＳ２２～Ｓ２７の処理を行う。なお、制御部１１は、ステップＳ２２において、クリッピング範囲に対して左側又は上側を縮小する。ここでは、制御部１１は、クリッピング範囲が横長である場合、図６Ｃの左側に示すようにクリッピング範囲の左側を縮小し、縦長である場合、図７Ｃの左側に示すようにクリッピング範囲の上側を縮小する。また、ステップＳ２４において、制御部１１は、クリッピング範囲の左側及び右側、或いは、上側及び下側を縮小する。ここでは、制御部１１は、クリッピング範囲が横長である場合、図６Ｃの中央に示すようにクリッピング範囲の左側及び右側をそれぞれ縮小し、縦長である場合、図７Ｃの中央に示すようにクリッピング範囲の上側及び下側をそれぞれ縮小する。更に、ステップＳ２６において、制御部１１は、クリッピング範囲の右側又は下側を縮小する。ここでは、制御部１１は、クリッピング範囲が横長である場合、図６Ｃの右側に示すようにクリッピング範囲の右側を縮小し、縦長である場合、図７Ｃの右側に示すようにクリッピング範囲の下側を縮小する。制御部１１は、それぞれ縮小後のクリッピング範囲内の画像に基づいて、縮小後のクリッピング範囲に対するスコアの算出処理を行う（Ｓ２３，Ｓ２５，Ｓ２７）。これにより、ステップＳ２２，Ｓ２４，Ｓ２６でそれぞれ縮小したクリッピング範囲について、面積に関するスコア及び位置に関するスコアが算出されて記憶される。 Similarly, the control unit 11 performs the processes of steps S22 to S27 on the clipping range set in step S13. In step S22, the control unit 11 reduces the left side or the top side of the clipping range. Here, if the clipping range is horizontal, the control unit 11 reduces the left side of the clipping range as shown in the left side of FIG. 6C, and if the clipping range is vertical, the control unit 11 reduces the top side of the clipping range as shown in the left side of FIG. 7C. In addition, in step S24, the control unit 11 reduces the left and right sides, or the top and bottom sides of the clipping range. Here, if the clipping range is horizontal, the control unit 11 reduces the left and right sides of the clipping range as shown in the center of FIG. 6C, and if the clipping range is vertical, the control unit 11 reduces the top and bottom sides of the clipping range as shown in the center of FIG. 7C. Furthermore, in step S26, the control unit 11 reduces the right or bottom side of the clipping range. Here, if the clipping range is horizontal, the control unit 11 reduces the right side of the clipping range as shown on the right side of Fig. 6C, and if the clipping range is vertical, the control unit 11 reduces the bottom side of the clipping range as shown on the right side of Fig. 7C. 
The control unit 11 performs a process of calculating scores for the reduced clipping ranges based on the images within the respective reduced clipping ranges (S23, S25, S27). As a result, area scores and position scores are calculated and stored for the clipping ranges reduced in steps S22, S24, and S26, respectively.
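The generation of the six candidates described for steps S16 to S27 can be sketched as below. Several points are assumptions rather than statements of the specification: the function name `candidate_ranges` is hypothetical, the amount of expansion or reduction is taken to be exactly what is needed to reach the target aspect ratio, a range that is wider than the target ratio is treated as the horizontally long case, the direction of step S16 (upward or leftward) is inferred from the other steps, and clamping to the bounds of the captured image is omitted.

```python
def candidate_ranges(left, top, width, height, target_ratio):
    """Return six (left, top, width, height) candidates whose
    width / height equals target_ratio (up to rounding)."""
    candidates = []
    if width / height > target_ratio:
        # Horizontally long case: grow the height or trim the width.
        grow = round(width / target_ratio) - height
        for offset in (grow, grow // 2, 0):   # upward, up and down, downward
            candidates.append((left, top - offset, width, height + grow))
        cut = width - round(height * target_ratio)
        for offset in (cut, cut // 2, 0):     # trim left, both sides, right
            candidates.append((left + offset, top, width - cut, height))
    else:
        # Vertically long case: grow the width or trim the height.
        grow = round(height * target_ratio) - width
        for offset in (grow, grow // 2, 0):   # leftward, left and right, rightward
            candidates.append((left - offset, top, width + grow, height))
        cut = height - round(width / target_ratio)
        for offset in (cut, cut // 2, 0):     # trim top, both sides, bottom
            candidates.append((left, top + offset, width, height - cut))
    return candidates

# A 330 x 270 range adjusted to a 1:1 aspect ratio.
for candidate in candidate_ranges(100, 100, 330, 270, 1.0):
    print(candidate)
```

For a 330×270 range and a 1:1 target, the three expanded candidates are 330×330 and the three reduced candidates are 270×270. Each candidate would then be passed to the score calculation process.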

上述した処理により、ステップＳ１３で設定したクリッピング範囲に対して、図６Ｂ及び図６Ｃ、或いは、図７Ｂ及び図７Ｃに示すように拡張又は縮小することにより、投稿用画像を生成するためのクリッピング範囲の候補が生成される。そして、それぞれのクリッピング範囲の候補に対して、被写体領域の大きさ及び位置に関するスコアが算出される。制御部１１（特定部）は、上述した処理によって算出したクリッピング範囲の各候補に対するスコアに基づいて、最適な（適切な）クリッピング範囲を特定する（Ｓ２８）。例えば制御部１１は、面積に関するスコアが最高のクリッピング範囲、位置に関するスコアが最高のクリッピング範囲、或いは、面積に関するスコア及び位置に関するスコアが共に最高のクリッピング範囲を最適なクリッピング範囲に特定してもよい。また制御部１１は、面積に関するスコア及び位置に関するスコアのそれぞれに重み付けを行い、両方のスコアを加味した総合スコアを算出し、総合スコアが最高のクリッピング範囲を最適なクリッピング範囲に特定してもよい。最適なクリッピング範囲を特定する際のルールは予め設定されて記憶部１２に記憶されている。 By the above-mentioned process, the clipping range set in step S13 is expanded or contracted as shown in FIG. 6B and FIG. 6C, or FIG. 7B and FIG. 7C, to generate clipping range candidates for generating an image to be posted. Then, for each clipping range candidate, a score for the size and position of the subject area is calculated. The control unit 11 (identification unit) identifies an optimal (appropriate) clipping range based on the score for each clipping range candidate calculated by the above-mentioned process (S28). For example, the control unit 11 may identify the clipping range with the highest score for area, the clipping range with the highest score for position, or the clipping range with the highest score for area and the highest score for position as the optimal clipping range. The control unit 11 may also weight each of the score for area and the score for position, calculate a total score taking both scores into account, and identify the clipping range with the highest total score as the optimal clipping range. The rules for identifying the optimal clipping range are set in advance and stored in the storage unit 12.
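One of the rules mentioned for step S28, the weighted total score, can be sketched as follows. The function name `best_candidate`, the weights of 0.5, and the sample score values are assumptions for illustration only; the specification states only that the rule is set in advance and stored in the storage unit 12.

```python
def best_candidate(scored, w_area=0.5, w_position=0.5):
    """Return the clipping range with the highest weighted total score.

    scored: list of (clip_range, area_score, position_score) tuples.
    The weights are hypothetical; any preset rule could be substituted.
    """
    return max(scored, key=lambda c: w_area * c[1] + w_position * c[2])[0]

# Made-up scores for three candidates.
scored = [("A", 0.30, 0.81), ("B", 0.45, 0.60), ("C", 0.40, 0.90)]
print(best_candidate(scored))                                # C
print(best_candidate(scored, w_area=1.0, w_position=0.0))    # B
```

Setting one weight to zero reduces the rule to "highest area score" or "highest position score", so the single-score rules in the text are special cases of the weighted rule.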

制御部１１は、ステップＳ１１で取得した撮影画像から、ステップＳ２８で特定した最適なクリッピング範囲内の画像（画素）を抽出して投稿用画像を生成し（Ｓ２９）、処理を終了する。上述した処理により、撮影画像から、指定されたアスペクト比を有すると共に、被写体の撮影領域がより画像中央に位置し、サイズがより大きい投稿用画像を生成することができる。これにより、被写体を見易い位置及びサイズで表示することができる画像をＳＮＳ等に投稿することが可能となる。また本実施形態では、撮影画像から、被写体が見易い状態の投稿用画像を自動的に生成するので画像編集を行うユーザの作業負担を軽減できる。 The control unit 11 extracts images (pixels) within the optimal clipping range identified in step S28 from the captured image acquired in step S11, generates an image to post (S29), and ends the process. Through the above-described process, an image to post that has the specified aspect ratio and in which the captured area of the subject is located closer to the center of the image and is larger in size can be generated from the captured image. This makes it possible to post an image that displays the subject in an easy-to-view position and size on SNS, etc. Furthermore, in this embodiment, an image to post in which the subject is easy to view is automatically generated from the captured image, reducing the workload of the user who edits the image.

本実施形態では、撮影画像から検出した被写体に基づいて設定されたクリッピング範囲に対して、図６Ｂ及び図６Ｃ、或いは、図７Ｂ及び図７Ｃに示すように拡張及び縮小を行うことによってクリッピング範囲の候補を生成する構成を例に説明したが、この構成に限定されない。例えば、図６Ａに示すクリッピング範囲に対して、上方向に３０画素拡張し、右側を３０画素縮小することによって300×300画素の画像サイズ（アスペクト比が１：１）のクリッピング範囲の候補を生成してもよい。また、図６Ａに示すクリッピング範囲に対して、上方向及び下方向に１５画素ずつ拡張し、左側及び右側を１５画素ずつ縮小することによって300×300画素の画像サイズ（アスペクト比が１：１）のクリッピング範囲の候補を生成してもよい。このようにクリッピング範囲の候補は、各種の方法で生成することができる。なお、各候補に対して行うスコア算出処理による処理負荷を考慮し、適切な数の候補を生成し、各候補に対するスコアを算出して最適なクリッピング範囲を特定すればよい。 In this embodiment, a configuration has been described as an example in which clipping range candidates are generated by expanding and reducing the clipping range set based on the subject detected from the captured image, as shown in FIG. 6B and FIG. 6C or FIG. 7B and FIG. 7C, but the present invention is not limited to this configuration. For example, the clipping range shown in FIG. 6A may be expanded by 30 pixels upward and reduced by 30 pixels on the right side to generate a clipping range candidate with an image size of 300×300 pixels (aspect ratio of 1:1). Alternatively, the clipping range shown in FIG. 6A may be expanded by 15 pixels each in the upward and downward directions and reduced by 15 pixels each on the left and right sides to generate a clipping range candidate with an image size of 300×300 pixels (aspect ratio of 1:1). In this way, clipping range candidates can be generated by various methods. Note that an appropriate number of candidates should be generated in view of the processing load of the score calculation process performed for each candidate, after which the score for each candidate is calculated and the optimal clipping range is identified.
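The two combined adjustments given as examples above can be checked with a small sketch. The function name `adjust` is hypothetical, and the size of the clipping range in FIG. 6A is assumed to be 330×270 pixels, which is inferred from the arithmetic in the text (30 pixels added to the height and 30 pixels removed from the width yield 300×300) rather than stated explicitly here.

```python
def adjust(clip, top=0, bottom=0, left=0, right=0):
    """Move each edge outward by the given number of pixels.

    clip is (x, y, width, height); a negative value moves the edge
    inward, that is, it reduces the range on that side.
    """
    x, y, w, h = clip
    return (x - left, y - top, w + left + right, h + top + bottom)

# Assumed size of the FIG. 6A range, with its own top-left as the origin.
base = (0, 0, 330, 270)

# Expand 30 pixels upward and reduce the right side by 30 pixels.
print(adjust(base, top=30, right=-30))                        # (0, -30, 300, 300)
# Expand 15 pixels up and down, reduce 15 pixels on the left and right.
print(adjust(base, top=15, bottom=15, left=-15, right=-15))   # (15, -15, 300, 300)
```

Both variants give a 300×300 candidate and differ only in where the candidate sits relative to the original range, which is what the subsequent score calculation distinguishes.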

本実施形態では、クリッピング範囲の各候補に対するスコアに基づいて最適なクリッピング範囲を所定のルールに従って自動的に特定して投稿用画像を生成する構成を例に説明したが、この構成に限定されない。例えば、スコアが高いクリッピング範囲の候補を複数特定してユーザに提示し、ユーザが複数の候補から最適なクリッピング範囲を選択する構成とすることができる。図８は画像編集処理手順の他の例を示すフローチャート、図９は画面例を示す模式図である。図８に示す処理は、図３に示す処理中のステップＳ２７，Ｓ２８の間にステップＳ５１～Ｓ５３を追加したものである。図３と同じステップについては説明を省略する。なお、図８では、図３中のステップＳ１１～Ｓ２５の図示を省略している。 In this embodiment, an example has been described in which an optimal clipping range is automatically identified according to a predetermined rule based on the score for each candidate clipping range, and an image to be posted is generated, but this is not a limitation. For example, a configuration is possible in which multiple clipping range candidates with high scores are identified and presented to the user, and the user selects the optimal clipping range from the multiple candidates. Figure 8 is a flowchart showing another example of the image editing process procedure, and Figure 9 is a schematic diagram showing an example screen. The process shown in Figure 8 is obtained by adding steps S51 to S53 between steps S27 and S28 in the process shown in Figure 3. Explanation of the same steps as in Figure 3 will be omitted. Note that steps S11 to S25 in Figure 3 are not shown in Figure 8.

図８に示す画像編集処理では、制御部１１は、ステップＳ２７の処理後、クリッピング範囲の各候補に対して算出したスコアに基づいて、スコアが高い複数のクリッピング範囲の候補を選択する（Ｓ５１）。例えば制御部１１は、面積に関するスコアが高い順に所定数のクリッピング範囲、位置に関するスコアが高い順に所定数のクリッピング範囲、或いは、面積に関するスコア及び位置に関するスコアが共に高い順に所定数のクリッピング範囲を選択してもよい。ここでも制御部１１は、面積に関するスコア及び位置に関するスコアのそれぞれに重み付けを行い、両方のスコアを加味した総合スコアを算出し、総合スコアが高い順に所定数のクリッピング範囲を選択してもよい。スコアが高いクリッピング範囲の候補を選択する際のルールも予め設定されて記憶部１２に記憶されている。 In the image editing process shown in FIG. 8, the control unit 11 selects multiple clipping range candidates with high scores based on the score calculated for each clipping range candidate after the process of step S27 (S51). For example, the control unit 11 may select a predetermined number of clipping ranges in descending order of area-related scores, a predetermined number of clipping ranges in descending order of position-related scores, or a predetermined number of clipping ranges in descending order of area-related scores and position-related scores. Here, too, the control unit 11 may weight each of the area-related scores and the position-related scores, calculate a total score taking both scores into account, and select a predetermined number of clipping ranges in descending order of total score. Rules for selecting clipping range candidates with high scores are also set in advance and stored in the storage unit 12.
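The selection of a predetermined number of high-scoring candidates in step S51 can be sketched with the weighted total score as the ordering rule. The function name `top_candidates`, the record layout, the weights, and the sample values are all assumptions; the default of three candidates follows the selection screen of FIG. 9, which shows three candidates.

```python
import heapq

def top_candidates(scored, n=3, w_area=0.5, w_position=0.5):
    """Return the n candidates with the highest weighted total score,
    in descending order of that score.

    scored: list of dicts with "area" and "position" scores.
    """
    return heapq.nlargest(
        n, scored,
        key=lambda c: w_area * c["area"] + w_position * c["position"],
    )

# Made-up scores for four candidates.
records = [
    {"name": "A", "area": 0.30, "position": 0.81},
    {"name": "B", "area": 0.45, "position": 0.60},
    {"name": "C", "area": 0.40, "position": 0.90},
    {"name": "D", "area": 0.10, "position": 0.20},
]
print([c["name"] for c in top_candidates(records)])  # ['C', 'A', 'B']
```

The returned candidates would then be rendered as posting images on the selection screen in step S52.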

制御部１１は、選択した複数のクリッピング範囲の候補を表示し、これらの候補から最適な（適切な）クリッピング範囲の選択を受け付けるための選択画面を生成して表示部１５に表示する（Ｓ５２）。図９は選択画面例を示しており、図９に示す画面は、３つのクリッピング範囲の候補について、それぞれのクリッピング範囲に基づいて生成された投稿用画像を表示する。なお、選択画面は、各クリッピング範囲の候補に対応する投稿用画像に対応付けて、それぞれ算出したスコア（面積に関するスコア及び位置に関するスコア）を表示してもよい。この場合、各クリッピング範囲の候補に対するスコアをユーザに提示でき、ユーザは、スコアに基づいて各投稿用画像の評価を行うことができる。選択画面は、表示された投稿用画像のうちの１つの選択を受け付けるように構成されており、選択された１つの投稿用画像での投稿を指示するための投稿ボタンを有する。なお、図９に示す画面では、左下の投稿用画像（クリッピング範囲）が選択された状態を示している。ユーザは、選択画面に表示された投稿用画像のいずれかを選択して投稿ボタンを操作することにより、選択した投稿用画像での投稿を指示する。 The control unit 11 displays the selected multiple clipping range candidates, generates a selection screen for accepting the selection of the optimal (appropriate) clipping range from these candidates, and displays it on the display unit 15 (S52). FIG. 9 shows an example of the selection screen, and the screen shown in FIG. 9 displays the posting images generated based on the clipping ranges for the three clipping range candidates. The selection screen may display the calculated scores (area score and position score) in association with the posting images corresponding to each clipping range candidate. In this case, the score for each clipping range candidate can be presented to the user, and the user can evaluate each posting image based on the score. The selection screen is configured to accept the selection of one of the displayed posting images, and has a posting button for instructing posting with the selected posting image. The screen shown in FIG. 9 shows a state in which the posting image (clipping range) at the bottom left is selected. The user instructs posting with the selected posting image by selecting one of the posting images displayed on the selection screen and operating the posting button.

制御部１１は、選択画面において入力部１４を介していずれかのクリッピング範囲（投稿用画像）に対する選択を受け付けたか否かを判断しており（Ｓ５３）、受け付けていないと判断した場合（Ｓ５３：ＮＯ）、選択画面の表示を継続して待機する。いずれかのクリッピング範囲に対する選択を受け付けたと判断した場合（Ｓ５３：ＹＥＳ）、制御部１１は、選択されたクリッピング範囲を最適な（適切な）クリッピング範囲に特定し（Ｓ２８）、特定したクリッピング範囲に基づいて投稿用画像を生成する（Ｓ２９）。 The control unit 11 determines whether a selection of any clipping range (image to post) has been accepted on the selection screen via the input unit 14 (S53), and if it determines that a selection has not been accepted (S53: NO), it continues to display the selection screen and waits. If it determines that a selection of any clipping range has been accepted (S53: YES), the control unit 11 identifies the selected clipping range as an optimal (appropriate) clipping range (S28) and generates an image to post based on the identified clipping range (S29).

上述した処理では、撮影画像から投稿用画像を生成するためのクリッピング範囲について、被写体の撮影位置及び撮影サイズに基づいて適切な候補を複数選択してユーザに提示できる。ユーザは、複数のクリッピング範囲（投稿用画像）の候補から任意のクリッピング範囲を選択することができる。よって、指定されたアスペクト比を有すると共に、被写体の撮影領域が画像中央に位置しサイズが大きい投稿用画像の候補から、ユーザの好みの投稿用画像が選択されてＳＮＳ等に投稿することが可能となる。 In the above-described process, multiple appropriate candidates for the clipping range for generating an image to post from a captured image can be selected based on the subject's shooting position and shooting size, and presented to the user. The user can select any clipping range from multiple clipping range (image to post) candidates. This makes it possible for the user to select a preferred image to post from candidates for images to post that have a specified aspect ratio, have the subject's shooting area located in the center of the image, and are large in size, and post the image to post on SNS, etc.

（実施形態２） (Embodiment 2)
被写体（対象物）の撮影画像から投稿用画像を生成するためのクリッピング範囲を設定する際に、クリッピング範囲に含まれる被写体の各部位の領域を考慮する情報処理装置について説明する。本実施形態の情報処理装置は、実施形態１の情報処理装置１０と同様の構成を有するので、構成についての詳細な説明は省略する。なお、本実施形態の情報処理装置１０は、図１に示す実施形態１の構成に加えて、記憶部１２に、被写体となる対象物に対して各対象物の部位に関する情報が登録された辞書ＤＢ（データベース）を記憶している。図示は省略するが、辞書ＤＢは、例えば犬に対して、犬の手足、犬の目、犬の口、犬の鼻等の用語が予め登録されている。 An information processing device that takes into account the area of each part of the subject included in a clipping range when setting the clipping range for generating an image for posting from a captured image of the subject (object) will be described. The information processing device of this embodiment has a configuration similar to that of the information processing device 10 of the first embodiment, so a detailed description of the configuration will be omitted. In addition to the configuration of the first embodiment shown in FIG. 1, the information processing device 10 of this embodiment stores, in the storage unit 12, a dictionary DB (database) in which, for each object that can be a subject, information on the parts of that object is registered. Although not shown in the figures, the dictionary DB has, for example for a dog, terms such as dog's paws, dog's eyes, dog's mouth, and dog's nose registered in advance.

本実施形態の情報処理装置１０において、制御部１１は、図３に示す処理と同様の処理を実行する。なお、図３に示す画像編集処理において、スコア算出処理は図４に示す処理と若干異なる。図１０は実施形態２のスコア算出処理手順の一例を示すフローチャート、図１１はスコア算出処理を説明するための模式図である。図１０に示す処理は、図４に示す処理中のステップＳ４１の前にステップＳ６１～Ｓ６２を追加したものである。図４と同じステップについては説明を省略する。 In the information processing device 10 of this embodiment, the control unit 11 executes a process similar to the process shown in FIG. 3. Note that in the image editing process shown in FIG. 3, the score calculation process is slightly different from the process shown in FIG. 4. FIG. 10 is a flowchart showing an example of the score calculation process procedure of the second embodiment, and FIG. 11 is a schematic diagram for explaining the score calculation process. The process shown in FIG. 10 adds steps S61 to S62 before step S41 in the process shown in FIG. 4. Explanations of the same steps as in FIG. 4 will be omitted.

本実施形態のスコア算出処理において、制御部１１は、図３中のステップＳ１６，Ｓ１８，Ｓ２０，Ｓ２２，Ｓ２４，Ｓ２６で拡張又は縮小した後のクリッピング範囲（調整後のクリッピング範囲）について、クリッピング範囲に含まれる被写体の各部位に基づくスコアを算出する。具体的には、制御部１１（部位検出部）は、クリッピング範囲内の画像に対してセグメンテーションを行い、被写体の部位毎に領域を分類（クラス分類）する（Ｓ６１）。例えば制御部１１は、図３中のステップＳ１２において、ＭａｓｋＲ－ＣＮＮを用いて撮影画像から被写体領域を検出すると共に、検出した被写体領域に対してセグメンテーションを行って被写体の部位毎にクラス分類していた場合、クラス分類結果に基づいて、クリッピング範囲内の画像における各部位の領域を特定できる。図１１Ａの左側に示すクリッピング範囲では、図１１Ａの右側に黒色（背景）以外で示す犬の各部位の領域（部位領域）が特定されている。図１１Ａでは、クリッピング範囲内に犬の手足、目、口、鼻、首、顔、胴体が検出されている。 In the score calculation process of this embodiment, the control unit 11 calculates, for the clipping range after expansion or reduction in steps S16, S18, S20, S22, S24, and S26 in FIG. 3 (the adjusted clipping range), a score based on the parts of the subject included in the clipping range. Specifically, the control unit 11 (part detection unit) performs segmentation on the image within the clipping range and classifies the regions by part of the subject (class classification) (S61). For example, if in step S12 in FIG. 3 the control unit 11 detected the subject area from the captured image using Mask R-CNN and also performed segmentation on the detected subject area to classify it by part of the subject, the control unit 11 can identify the area of each part in the image within the clipping range based on that classification result. For the clipping range shown on the left side of FIG. 11A, the areas of the dog's parts (part areas) have been identified and are shown on the right side of FIG. 11A in colors other than black (background). In FIG. 11A, the dog's paws, eyes, mouth, nose, neck, face, and torso are detected within the clipping range.

制御部１１は、クリッピング範囲内の画像における各部位領域に基づいて、被写体の部位に関するスコアを算出する（Ｓ６２）。例えば制御部１１は、クリッピング範囲に含まれる各部位領域が、辞書ＤＢに被写体（対象物）に対応付けて記憶してある部位であるか否かを判断し、辞書ＤＢに記憶してある部位について１を加算し、辞書ＤＢに記憶されていない部位について１を減算してスコアを算出する。図１１Ａに示す例では、クリッピング範囲内に犬の手足、目、口及び鼻が含まれているので、被写体の部位に関するスコアとして４．０が算出される。図１１Ｂに示す例では、犬の手足、口及び鼻がそれぞれ一部しか含まれておらず、クリッピング範囲内に犬の目のみが含まれているので、被写体の部位に関するスコアとして１．０が算出される。なお、被写体の各部位についてクリッピング範囲に一部しか含まれないか全部含まれているかの判断は、例えばクリッピング範囲の内側及び外側の画像に基づいて行われる。例えばクリッピング範囲の輪郭が、被写体の各部位領域上にある場合、この部位は一部のみがクリッピング範囲に含まれる部位であると判断できる。図１１Ｃに示す例では、犬の手足、目、口及び鼻に加えて、画像の右下の領域（図１１Ｃの右側の画像では閉曲線で囲んだ領域）に被写体（犬）以外のもの（ここでは猫）が含まれているので、被写体の部位に関するスコアとして３．０が算出される。これにより、クリッピング範囲内に含まれる被写体の部位の数が多いほど、また、被写体以外のものが含まれないほど、被写体の部位に関するスコアとして高いスコアが算出される。 The control unit 11 calculates a score for the subject's body parts based on each body part area in the image within the clipping range (S62). For example, the control unit 11 determines whether each body part area included in the clipping range is a body part stored in the dictionary DB in association with the subject (object), adds 1 to the body parts stored in the dictionary DB, and subtracts 1 from the body parts not stored in the dictionary DB to calculate the score. In the example shown in FIG. 11A, the dog's paws, eyes, mouth, and nose are included in the clipping range, so a score of 4.0 is calculated for the body parts of the subject. In the example shown in FIG. 11B, only a portion of the dog's paws, mouth, and nose are included, and only the dog's eyes are included in the clipping range, so a score of 1.0 is calculated for the body parts of the subject. Note that the determination of whether each body part of the subject is only partially or completely included in the clipping range is made based on, for example, the images inside and outside the clipping range. For example, if the contour of the clipping range is on each body part area of the subject, it can be determined that this body part is only partially included in the clipping range. 
In the example shown in FIG. 11C, in addition to the dog's paws, eyes, mouth, and nose, the lower right area of the image (the area surrounded by a closed curve in the right-hand image of FIG. 11C) contains something other than the subject (the dog), here a cat, so 1 is subtracted from the 4.0 obtained for the four registered parts and a score of 3.0 is calculated for the parts of the subject. As a result, the more parts of the subject are included within the clipping range, and the less anything other than the subject is included, the higher the score calculated for the parts of the subject.
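The part score of step S62 can be sketched as follows. This is an interpretation rather than the specified implementation: the names `DICTIONARY_DB` and `part_score` and the tuple layout are hypothetical, and it is assumed that parts of the subject that are not registered in the dictionary DB (such as the neck, face, and torso in FIG. 11A) are neither added nor subtracted, since that reading reproduces the scores of 4.0, 1.0, and 3.0 given for FIGS. 11A to 11C.

```python
# Hypothetical dictionary DB contents for a dog.
DICTIONARY_DB = {"dog": {"paws", "eyes", "mouth", "nose"}}

def part_score(subject, regions):
    """Score the regions found by segmentation within a clipping range.

    regions: iterable of (owner, part, fully_inside) tuples, where
    fully_inside is False when the clipping range outline crosses the part.
    """
    score = 0.0
    for owner, part, fully_inside in regions:
        if owner != subject:
            score -= 1.0   # something other than the subject is included
        elif part in DICTIONARY_DB[subject] and fully_inside:
            score += 1.0   # registered part entirely within the range
    return score

fig_11a = [("dog", p, True) for p in
           ("paws", "eyes", "mouth", "nose", "neck", "face", "torso")]
fig_11b = [("dog", "paws", False), ("dog", "eyes", True),
           ("dog", "mouth", False), ("dog", "nose", False)]
fig_11c = fig_11a + [("cat", "body", True)]

print(part_score("dog", fig_11a))  # 4.0
print(part_score("dog", fig_11b))  # 1.0
print(part_score("dog", fig_11c))  # 3.0
```

Whether a part is fully inside the range would be decided as described in the text, for example by checking whether the outline of the clipping range lies on the part area.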

その後、制御部１１は、図４に示すステップＳ４１～Ｓ４８と同様の処理を行う。これにより、クリッピング範囲の各候補について、被写体の部位に関するスコア、被写体領域の面積に関するスコア、被写体領域の位置に関するスコアが算出される。よって、本実施形態では、制御部１１は、ステップＳ６２で算出した被写体の部位に関するスコアと、ステップＳ４３で算出した面積に関するスコアと、ステップＳ４７で算出した位置に関するスコアとを、調整後のクリッピング範囲に対応付けて記憶する（Ｓ４８）。 Then, the control unit 11 performs the same processes as steps S41 to S48 shown in FIG. 4. As a result, for each candidate clipping range, a score for the part of the subject, a score for the area of the subject region, and a score for the position of the subject region are calculated. Therefore, in this embodiment, the control unit 11 stores the score for the part of the subject calculated in step S62, the score for the area calculated in step S43, and the score for the position calculated in step S47 in association with the adjusted clipping range (S48).

また本実施形態では、図３中のステップＳ２８において、制御部１１は、クリッピング範囲の各候補に対して算出した、被写体の部位に関するスコア、被写体領域の面積に関するスコア、被写体領域の位置に関するスコアに基づいて、最適な（適切な）クリッピング範囲を特定する（Ｓ２８）。ここでは制御部１１は、被写体の部位に関するスコアが最高のクリッピング範囲、面積に関するスコアが最高のクリッピング範囲、位置に関するスコアが最高のクリッピング範囲、或いは、３つのスコアが共に最高のクリッピング範囲を最適なクリッピング範囲に特定してもよい。また制御部１１は、３つのスコアのそれぞれに重み付けを行い、３つのスコアを加味した総合スコアを算出し、総合スコアが最高のクリッピング範囲を最適なクリッピング範囲に特定してもよい。 In this embodiment, in step S28 in FIG. 3, the control unit 11 identifies an optimal (appropriate) clipping range based on the score for the subject's part, the score for the area of the subject region, and the score for the position of the subject region, which are calculated for each candidate clipping range (S28). Here, the control unit 11 may identify the clipping range with the highest score for the subject's part, the clipping range with the highest score for the area, the clipping range with the highest score for the position, or the clipping range with the highest score for all three scores as the optimal clipping range. The control unit 11 may also weight each of the three scores, calculate an overall score taking into account the three scores, and identify the clipping range with the highest overall score as the optimal clipping range.

上述した処理により、本実施形態の情報処理装置１０では、撮影画像から、指定されたアスペクト比を有すると共に、被写体の部位を多く含み、被写体の撮影領域がより画像中央に位置し、サイズがより大きい投稿用画像を生成することができる。これにより、本実施形態においても、被写体を見易い状態で表示することができる画像をＳＮＳ等に投稿することが可能となる。また、本実施形態においても、図８及び図９に示した変形例の適用が可能であり、適用した場合には同様の効果が得られる。 By the above-mentioned processing, the information processing device 10 of this embodiment can generate an image for posting from a captured image that has a specified aspect ratio, includes many parts of the subject, has the captured area of the subject located closer to the center of the image, and is larger in size. This makes it possible to post an image on SNS or the like in this embodiment, in which the subject can be displayed in an easy-to-see state. Also, in this embodiment, the modified examples shown in Figures 8 and 9 can be applied, and similar effects can be obtained when applied.

本実施形態では、上述した実施形態１と同様の効果が得られる。また本実施形態では、指定されたアスペクト比を有すると共に、被写体に設定された部位をより多く含み、被写体の撮影領域がより画像中央に位置し、サイズがより大きい投稿用画像を生成することができる。よって、被写体の各部位がより見易く表示された画像をＳＮＳ等に投稿することが可能となる。本実施形態においても、上述した各実施形態で適宜説明した変形例の適用が可能である。 In this embodiment, the same effects as those of the first embodiment described above can be obtained. Furthermore, in this embodiment, it is possible to generate an image for posting that has a specified aspect ratio, includes more of the body parts set on the subject, has the subject's shooting area positioned closer to the center of the image, and is larger in size. This makes it possible to post an image on SNS or the like in which each body part of the subject is more easily visible. In this embodiment as well, it is possible to apply the modified examples described in each of the above-mentioned embodiments as appropriate.

（実施形態３） (Embodiment 3)
画像（画像データ）及び画像に対応付けられたテキスト（テキストデータ）に基づいて、テキストで述べられている対象物（被写体）に対する視認性が高い領域を画像から抽出（クリッピング）する情報処理装置について説明する。本実施形態の情報処理装置は、実施形態１の情報処理装置１０と同様の構成を有するので、構成についての詳細な説明は省略する。 The following describes an information processing device that, based on an image (image data) and text (text data) associated with the image, extracts (clips) from the image an area in which the object (subject) described in the text is highly visible. The information processing device of this embodiment has a configuration similar to that of the information processing device 10 of the first embodiment, so a detailed description of the configuration will be omitted.

以下に、本実施形態の情報処理装置１０が、画像及びテキストをそれぞれのレイアウト枠にレイアウトしてページレイアウトを生成する処理について説明する。図１２はレイアウト処理を説明するための模式図である。本実施形態の情報処理装置１０は、レイアウト対象の画像及びテキストと、画像及びテキストを配置すべきレイアウト枠がそれぞれ設定されたレイアウトデータとを入力データとする。テキストは、画像の内容に関する情報が記載されたテキストである。本実施形態では、レイアウト対象を２つの画像及び１つのテキストとするが、画像及びテキストの数はこれらに限定されない。本実施形態の情報処理装置１０は、レイアウト対象の画像のそれぞれから、テキストの内容に応じた被写体の視認性が良好であり、且つ、割り当てられたレイアウト枠のアスペクト比と同じアスペクト比の領域を抽出してレイアウト用画像を生成する。これにより、テキストに記載された内容に応じた被写体が見易い状態で表示されるレイアウト用画像が生成される。また情報処理装置１０は、生成したレイアウト用画像とテキストとをそれぞれのレイアウト枠に配置することによりページレイアウト（出力データ）を生成する。よって、画像及びテキストをユーザが読み易い状態で配置したページレイアウトが生成される。 The following describes the process in which the information processing device 10 of this embodiment lays out images and text in their respective layout frames to generate a page layout. FIG. 12 is a schematic diagram for explaining the layout process. The information processing device 10 of this embodiment receives as input data the image and text to be laid out, and layout data in which the layout frames in which the images and text should be placed are set. The text is text in which information on the contents of the image is described. In this embodiment, the layout objects are two images and one piece of text, but the number of images and text is not limited to these. The information processing device 10 of this embodiment extracts areas from each of the images to be laid out, in which the visibility of the subject according to the content of the text is good and which have the same aspect ratio as the aspect ratio of the assigned layout frame, to generate a layout image. As a result, a layout image is generated in which the subject according to the content described in the text is displayed in an easy-to-see state. The information processing device 10 also generates a page layout (output data) by arranging the generated layout images and text in their respective layout frames. Thus, a page layout is generated in which the images and text are arranged in a state in which the user can easily read them.

図１３～図１５はレイアウト処理手順の一例を示すフローチャート、図１６～図１９はレイアウト処理を説明するための模式図である。図１５に示すスコア算出処理は、図１４に示すレイアウト処理中の「スコア算出処理」である。図１３～図１４に示す処理は、図３に示す画像編集処理中のステップＳ１１の前にステップＳ７１～Ｓ７５を追加し、ステップＳ２９の代わりにステップＳ７６～Ｓ７８を追加したものである。また図１５に示す処理は、図４に示す処理中のステップＳ４１の前にステップＳ８１～Ｓ８２を追加したものである。図１３～図１５に示す処理において、図３～図４と同じステップについては説明を省略する。 Figures 13 to 15 are flow charts showing an example of a layout processing procedure, and Figures 16 to 19 are schematic diagrams for explaining the layout processing. The score calculation processing shown in Figure 15 is the "score calculation processing" during the layout processing shown in Figure 14. The processing shown in Figures 13 to 14 is obtained by adding steps S71 to S75 before step S11 during the image editing processing shown in Figure 3, and adding steps S76 to S78 instead of step S29. The processing shown in Figure 15 is obtained by adding steps S81 to S82 before step S41 during the processing shown in Figure 4. In the processing shown in Figures 13 to 15, explanations of the same steps as in Figures 3 to 4 will be omitted.

本実施形態の情報処理装置１０では、ユーザは、画像及びテキストを含むページレイアウトを生成する場合、入力部１４を介して所定の操作を行い、レイアウト対象の画像及びテキストとレイアウトデータとを指定し、ページレイアウトの生成処理の実行指示を行う。情報処理装置１０の制御部１１（テキスト取得部）は、入力部１４を介してページレイアウトの生成処理の実行指示を受け付けた場合、指定されたテキストを取得する（Ｓ７１）。例えばテキストが記憶部１２に記憶してある場合、制御部１１は、テキストを記憶部１２から読み出す。テキストは、入力部１４を介したユーザの操作によって生成されたテキストに限定されず、他の装置からネットワーク経由又は可搬型記憶媒体１ａ経由で記憶部１２に記憶されたテキストであってもよい。 In the information processing device 10 of this embodiment, when a user generates a page layout including images and text, the user performs a predetermined operation via the input unit 14 to specify the images and text to be laid out and the layout data, and issues an instruction to execute a page layout generation process. When the control unit 11 (text acquisition unit) of the information processing device 10 receives an instruction to execute a page layout generation process via the input unit 14, it acquires the specified text (S71). For example, if the text is stored in the storage unit 12, the control unit 11 reads the text from the storage unit 12. The text is not limited to text generated by a user's operation via the input unit 14, but may be text stored in the storage unit 12 from another device via a network or via a portable storage medium 1a.

制御部１１（出現頻度算出部）は、取得したテキストに出現する各単語の出現頻度をそれぞれ算出する（Ｓ７２）。例えば制御部１１は、形態素解析等の手法を用いてテキストから各単語を抽出し、各単語について出現回数を計数する。そして制御部１１は、各単語について、例えば以下の（１）式を用いて出現頻度を算出する。図１６Ａの上側にはテキストの一例を示しており、図１６Ａの下側にはテキストに含まれる各単語の出現頻度を示すグラフを示している。図１６Ａの下側のグラフの横軸はテキストに含まれる単語を示し、縦軸は各単語の出現頻度を示す。図１６Ａに示す例では、「犬」の出現頻度として０．７が算出され、「飼い主」「草原」「水」の出現頻度として０．１が算出されている。また図１６Ｂに示す例では、「犬」の出現頻度として０．５が算出され、「猫」の出現頻度として０．３が算出され、「飼い主」「水」の出現頻度として０．１が算出されている。 The control unit 11 (occurrence frequency calculation unit) calculates the occurrence frequency of each word that appears in the acquired text (S72). For example, the control unit 11 extracts each word from the text using a method such as morphological analysis, and counts the number of occurrences of each word. The control unit 11 then calculates the occurrence frequency for each word using, for example, the following formula (1). The upper part of FIG. 16A shows an example of text, and the lower part of FIG. 16A shows a graph showing the occurrence frequency of each word included in the text. The horizontal axis of the graph at the lower part of FIG. 16A shows the words included in the text, and the vertical axis shows the occurrence frequency of each word. In the example shown in FIG. 16A, the occurrence frequency of "dog" is calculated as 0.7, and the occurrence frequencies of "owner", "grassland", and "water" are calculated as 0.1. In the example shown in FIG. 16B, the occurrence frequency of "dog" is calculated as 0.5, the occurrence frequency of "cat" is calculated as 0.3, and the occurrence frequency of "owner" and "water" is calculated as 0.1.

単語の出現頻度＝単語の出現回数／全単語の総出現回数 …（１） Frequency of occurrence of a word = number of times a word occurs / total number of times all words occur … (1)
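
As an illustration of formula (1), the following is a minimal sketch that assumes the text has already been split into words (the morphological analysis step is not shown). The function name `occurrence_frequencies` and the sample word list are hypothetical; the list is chosen only so that the ratios match the FIG. 16A example.

```python
from collections import Counter


def occurrence_frequencies(words):
    """Return {word: occurrences of word / total occurrences of all words}."""
    counts = Counter(words)
    total = sum(counts.values())
    if total == 0:
        return {}  # empty text: no frequencies to report
    return {word: count / total for word, count in counts.items()}


# Hypothetical pre-tokenized text: 7 of 10 words are "dog".
words = ["dog"] * 7 + ["owner", "grassland", "water"]
freqs = occurrence_frequencies(words)
```

By construction the frequencies of all words in a text sum to 1, which is why the values in FIG. 16A and FIG. 16B can be compared against a single fixed threshold.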

制御部１１（対象物特定部）は、テキスト中の各単語の出現頻度に基づいて、テキストが示す画像中の対象物（被写体）を特定する（Ｓ７３）。例えば制御部１１は、出現頻度が、予め設定された閾値（例えば０．２８）以上である単語を特定し、特定した単語が示す対象物を特定する。図１６Ａに示す例では、出現頻度が閾値以上である単語（対象物）は「犬」だけであり、制御部１１は、「犬」を特定する。また図１６Ｂに示す例では、出現頻度が閾値以上である単語（対象物）は「犬」及び「猫」であり、制御部１１は「犬」及び「猫」を特定する。 The control unit 11 (object identification unit) identifies the object (subject) in the image indicated by the text based on the frequency of occurrence of each word in the text (S73). For example, the control unit 11 identifies words whose frequency of occurrence is equal to or greater than a preset threshold (e.g., 0.28), and identifies the object indicated by the identified word. In the example shown in FIG. 16A, the only word (object) whose frequency of occurrence is equal to or greater than the threshold is "dog," and the control unit 11 identifies "dog." In the example shown in FIG. 16B, the words (objects) whose frequency of occurrence is equal to or greater than the threshold are "dog" and "cat," and the control unit 11 identifies "dog" and "cat."
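
The threshold test of step S73 can be sketched as below. This is only an illustration: the function name `identify_objects` is hypothetical, the default threshold of 0.28 is the example value given above, and the two dictionaries restate the frequencies of FIG. 16A and FIG. 16B.

```python
def identify_objects(frequencies, threshold=0.28):
    """Return the words whose occurrence frequency is at or above the threshold."""
    return [word for word, freq in frequencies.items() if freq >= threshold]


fig_16a = {"dog": 0.7, "owner": 0.1, "grassland": 0.1, "water": 0.1}
fig_16b = {"dog": 0.5, "cat": 0.3, "owner": 0.1, "water": 0.1}

objects_16a = identify_objects(fig_16a)  # only "dog" passes
objects_16b = identify_objects(fig_16b)  # "dog" and "cat" pass
```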

また制御部１１は、テキスト中に出現する各単語から、テキストが示す画像に関する用語を生成する（Ｓ７４）。例えば制御部１１は、係り受け解析等の手法を用いて、テキスト中に出現する各単語の内容を考慮して各単語を組み合わせることによって用語を生成する。図１６Ａに示すテキストの例では、図１７に示すように「犬の手足」「犬の目」「犬の口」「犬の鼻」「飼い主」「草原」「水」等の用語が生成される。なお、用語は、例えば実施形態２の情報処理装置１０が使用した辞書ＤＢを用いて生成されてもよい。 The control unit 11 also generates terms related to the image indicated by the text from each word appearing in the text (S74). For example, the control unit 11 generates terms by combining each word appearing in the text while taking into account the content of each word using a technique such as dependency analysis. In the example text shown in FIG. 16A, terms such as "dog's paws," "dog's eyes," "dog's mouth," "dog's nose," "owner," "grassland," and "water" are generated as shown in FIG. 17. Note that the terms may be generated using, for example, the dictionary DB used by the information processing device 10 of embodiment 2.

そして制御部１１は、生成した各用語に対して、各単語の出現頻度に基づく関連度を対応付ける（Ｓ７５）。図１６Ａに示す例では、単語「犬」の出現頻度が０．７であるので、「犬の手足」「犬の目」「犬の口」「犬の鼻」等の犬に関する用語に対しては０．７の関連度を対応付ける。また、単語「飼い主」「草原」「水」の出現頻度はそれぞれ０．１であるので、これらの用語に対しては０．１の関連度を対応付ける。 The control unit 11 then associates each of the generated terms with a degree of relevance based on the frequency of occurrence of each word (S75). In the example shown in FIG. 16A, the frequency of occurrence of the word "dog" is 0.7, so a degree of relevance of 0.7 is associated with dog-related terms such as "dog paws," "dog eyes," "dog mouth," and "dog nose." In addition, the frequency of occurrence of the words "owner," "grassland," and "water" is each 0.1, so a degree of relevance of 0.1 is associated with these terms.
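
A minimal sketch of step S75, assuming each generated term is recorded together with the word it was derived from. That mapping (`term_to_word`) and the function name `assign_relevance` are assumptions made for this illustration; the description does not specify how the link between a term and its source word is stored.

```python
def assign_relevance(term_to_word, frequencies):
    """Give each term the occurrence frequency of the word it derives from."""
    return {
        term: frequencies.get(word, 0.0)
        for term, word in term_to_word.items()
    }


# Frequencies and terms from the FIG. 16A / FIG. 17 example.
frequencies = {"dog": 0.7, "owner": 0.1, "grassland": 0.1, "water": 0.1}
term_to_word = {
    "dog's paws": "dog",
    "dog's eyes": "dog",
    "dog's mouth": "dog",
    "dog's nose": "dog",
    "owner": "owner",
    "grassland": "grassland",
    "water": "water",
}
term_relevance = assign_relevance(term_to_word, frequencies)
```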

その後、制御部１１は、図３に示すステップＳ１１～Ｓ２８と同様の処理を行う。なお、本実施形態では、ステップＳ１１において、制御部１１は、指定されたレイアウト対象の画像を取得する。またステップＳ１２において、制御部１１（検知部）は、取得した画像に対して物体検出処理を行い、ステップＳ７３で特定した対象物（被写体）を検知する。またステップＳ１３において、制御部１１は、レイアウト対象の画像に対して、ステップＳ１２で検出した対象物を含むクリッピング範囲を設定する。これにより、本実施形態では、テキストの内容に関連する対象物の領域をクリッピング範囲に設定することができる。また本実施形態においても、図５Ｂ中に破線矩形で示すように、画像中の対象物を含む領域がクリッピング範囲に設定される。なお、図１８Ａに示すように、ステップＳ７３で複数の対象物が特定された場合、制御部１１は、図１８Ｂに示すように、複数の対象物を含む外接矩形の領域をクリッピング範囲に設定する。この場合、テキストで述べられている複数の対象物を含む領域をクリッピング範囲に設定できる。またステップＳ１４において、制御部１１は、ステップＳ１３で設定したクリッピング範囲のアスペクト比と、ここでのレイアウト対象の画像に対して指定されたレイアウト枠のアスペクト比とを比較する。 After that, the control unit 11 performs the same processes as steps S11 to S28 shown in FIG. 3. In this embodiment, in step S11, the control unit 11 acquires the specified image to be laid out. In step S12, the control unit 11 (detection unit) performs object detection processing on the acquired image to detect the object (subject) identified in step S73. In step S13, the control unit 11 sets, for the image to be laid out, a clipping range that includes the object detected in step S12. As a result, in this embodiment, the region of the object related to the content of the text can be set as the clipping range. In this embodiment, too, the region including the object in the image is set as the clipping range, as shown by the dashed rectangle in FIG. 5B. If multiple objects are identified in step S73, as shown in FIG. 18A, the control unit 11 sets the region of a circumscribed rectangle that includes the multiple objects as the clipping range, as shown in FIG. 18B. In this case, a region including the multiple objects mentioned in the text can be set as the clipping range. In step S14, the control unit 11 compares the aspect ratio of the clipping range set in step S13 with the aspect ratio of the layout frame specified for the image to be laid out.
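
The circumscribed rectangle of FIG. 18B and the aspect ratio comparison of step S14 can be sketched as follows. This is an illustration under stated assumptions: the `Box` type, the pixel coordinate convention (left, top, right, bottom), the function names, and the relative tolerance used to decide that two aspect ratios are equal are all hypothetical and not taken from the description.

```python
import math
from typing import Iterable, NamedTuple


class Box(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def circumscribed_box(boxes: Iterable[Box]) -> Box:
    """Smallest axis-aligned rectangle that contains every detected object."""
    boxes = list(boxes)
    if not boxes:
        raise ValueError("at least one detected object is required")
    return Box(
        min(b.left for b in boxes),
        min(b.top for b in boxes),
        max(b.right for b in boxes),
        max(b.bottom for b in boxes),
    )


def aspect_ratio(box: Box) -> float:
    """Width divided by height."""
    return (box.right - box.left) / (box.bottom - box.top)


def matches_frame(box: Box, frame_width: float, frame_height: float,
                  rel_tol: float = 1e-3) -> bool:
    """True when the box and the layout frame have (nearly) the same aspect ratio."""
    return math.isclose(aspect_ratio(box), frame_width / frame_height,
                        rel_tol=rel_tol)


# Hypothetical detections for a dog and a cat in the same image.
dog = Box(10, 20, 110, 120)
cat = Box(150, 60, 250, 160)
clip = circumscribed_box([dog, cat])
needs_adjustment = not matches_frame(clip, 16, 9)
```

When `needs_adjustment` is true, the clipping range would go on to the expansion and reduction steps described for the earlier embodiments; that adjustment is not shown here.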

図１５に示す本実施形態のスコア算出処理において、制御部１１は、図１４中のステップＳ１６，Ｓ１８，Ｓ２０，Ｓ２２，Ｓ２４，Ｓ２６で拡張又は縮小した後（調整後）のクリッピング範囲について、レイアウト対象のテキストとの関連度に関するスコアを算出する。具体的には、制御部１１は、クリッピング範囲内の画像に対してセグメンテーションを行い、対象物の部位毎に領域を分類（クラス分類）する（Ｓ８１）。ステップＳ８１の処理は、実施形態２で説明した図１０中のステップＳ６１の処理と同様である。 In the score calculation process of this embodiment shown in FIG. 15, the control unit 11 calculates, for the clipping range after it has been expanded or reduced (adjusted) in steps S16, S18, S20, S22, S24, and S26 in FIG. 14, a score for its relevance to the text to be laid out. Specifically, the control unit 11 performs segmentation on the image within the clipping range and classifies the regions by part of the object (class classification) (S81). The process of step S81 is the same as the process of step S61 in FIG. 10 described in embodiment 2.

そして制御部１１（関連度算出部）は、クリッピング範囲内の画像における各部位領域に基づいて、このクリッピング範囲内の画像とテキストとの関連度に関するスコアを算出する（Ｓ８２）。図１９Ａに示す例では、クリッピング範囲内に犬の手足、目、口及び鼻が含まれており、これらの部位の用語には０．７の関連度が対応付けられているので、図１９Ａに示すクリッピング範囲内の画像とテキストとの関連度に関するスコアとして２．８が算出される。図１９Ｂに示す例では、犬の手足、口及び鼻がそれぞれ一部しか含まれておらず、クリッピング範囲内に犬の目のみが含まれており、「犬の目」の用語には０．７の関連度が対応付けられているので、図１９Ｂに示すクリッピング範囲内の画像とテキストとの関連度に関するスコアとして０．７が算出される。図１９Ｃに示す例では、犬の手足、目、口及び鼻に加えて、画像の右下の領域（図１９Ｃの右側の画像では閉曲線で囲んだ領域）に対象物（犬）以外のもの（ここでは猫）が含まれており、対象物以外のものの用語には－１．０の関連度が対応付けられているので、図１９Ｃに示すクリッピング範囲内の画像とテキストとの関連度に関するスコアとして１．８が算出される。これにより、クリッピング範囲内に含まれる、テキストから生成された用語の数が多いほど、また、対象物以外のものが含まれないほど、クリッピング範囲内の画像とテキストとの関連度に関するスコアとして高いスコアが算出される。 Then, the control unit 11 (relevance calculation unit) calculates a score for the relevance between the image within the clipping range and the text, based on each part region in that image (S82). In the example shown in FIG. 19A, the dog's paws, eyes, mouth, and nose are included in the clipping range, and a relevance of 0.7 is associated with the term for each of these parts, so a score of 2.8 is calculated for the relevance between the image within the clipping range shown in FIG. 19A and the text. In the example shown in FIG. 19B, the dog's paws, mouth, and nose are each only partially included, and only the dog's eyes are entirely included in the clipping range; since a relevance of 0.7 is associated with the term "dog's eyes", a score of 0.7 is calculated for the relevance between the image within the clipping range shown in FIG. 19B and the text. In the example shown in FIG. 19C, in addition to the dog's paws, eyes, mouth, and nose, the lower right region of the image (the region enclosed by a closed curve in the right-hand image of FIG. 19C) contains something other than the target object (dog), here a cat; since a relevance of -1.0 is associated with terms for things other than the target object, a score of 1.8 is calculated for the relevance between the image within the clipping range shown in FIG. 19C and the text. As a result, the more terms generated from the text that are represented within the clipping range, and the fewer things other than the target object it contains, the higher the score calculated for the relevance between the image within the clipping range and the text.
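
The three scores of FIG. 19A to FIG. 19C can be reproduced with a short sketch. It assumes, based on the FIG. 19B example, that a part contributes only when it is entirely inside the clipping range, and that each non-target thing contributes the relevance of -1.0 given above. The function name `relevance_score` and its parameters are hypothetical.

```python
NON_TARGET_RELEVANCE = -1.0  # relevance associated with non-target things


def relevance_score(included_terms, term_relevance, non_target_count=0):
    """Sum the relevance of the included terms, then penalize non-target things."""
    score = sum(term_relevance.get(term, 0.0) for term in included_terms)
    return score + NON_TARGET_RELEVANCE * non_target_count


term_relevance = {
    "dog's paws": 0.7,
    "dog's eyes": 0.7,
    "dog's mouth": 0.7,
    "dog's nose": 0.7,
}
all_parts = list(term_relevance)

score_19a = relevance_score(all_parts, term_relevance)        # about 2.8
score_19b = relevance_score(["dog's eyes"], term_relevance)   # 0.7
score_19c = relevance_score(all_parts, term_relevance,
                            non_target_count=1)               # about 1.8
```

Because the values are floating-point sums, comparisons against 2.8 or 1.8 should use a tolerance rather than exact equality.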

その後、制御部１１は、図４に示すステップＳ４１～Ｓ４８と同様の処理を行う。これにより、クリッピング範囲の各候補について、クリッピング範囲内の画像とテキストとの関連度に関するスコア、被写体領域（対象物領域）の面積に関するスコア、被写体領域の位置に関するスコアが算出される。よって、本実施形態では、制御部１１は、ステップＳ８２で算出したテキストとの関連度に関するスコアと、ステップＳ４３で算出した面積に関するスコアと、ステップＳ４７で算出した位置に関するスコアとを、調整後のクリッピング範囲に対応付けて記憶する（Ｓ４８）。 Then, the control unit 11 performs the same processes as steps S41 to S48 shown in FIG. 4. As a result, for each candidate clipping range, a score related to the relevance between the image and text within the clipping range, a score related to the area of the subject region (object region), and a score related to the position of the subject region are calculated. Therefore, in this embodiment, the control unit 11 stores the score related to the relevance with the text calculated in step S82, the score related to the area calculated in step S43, and the score related to the position calculated in step S47 in association with the adjusted clipping range (S48).

また本実施形態では、図１４中のステップＳ２８において、制御部１１は、クリッピング範囲の各候補に対して算出した、テキストとの関連度に関するスコア、被写体領域の面積に関するスコア、被写体領域の位置に関するスコアに基づいて、最適な（適切な）クリッピング範囲を特定する（Ｓ２８）。ここでは制御部１１は、テキストとの関連度に関するスコアが最高のクリッピング範囲、面積に関するスコアが最高のクリッピング範囲、位置に関するスコアが最高のクリッピング範囲、或いは、３つのスコアが共に最高のクリッピング範囲を最適なクリッピング範囲に特定してもよい。また制御部１１は、３つのスコアのそれぞれに重み付けを行い、３つのスコアを加味した総合スコアを算出し、総合スコアが最高のクリッピング範囲を最適なクリッピング範囲に特定してもよい。 In this embodiment, in step S28 in FIG. 14, the control unit 11 identifies an optimal (appropriate) clipping range based on the score for relevance to the text, the score for the area of the subject region, and the score for the position of the subject region, which are calculated for each candidate clipping range (S28). Here, the control unit 11 may identify the clipping range with the highest score for relevance to the text, the clipping range with the highest score for area, the clipping range with the highest score for position, or the clipping range with the highest score for all three scores as the optimal clipping range. The control unit 11 may also weight each of the three scores, calculate an overall score taking into account the three scores, and identify the clipping range with the highest overall score as the optimal clipping range.
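
The weighted variant of step S28 can be sketched as follows. The weights, the dictionary keys, and the candidate values are all hypothetical; the description states only that each of the three scores is weighted and that the candidate with the highest overall score is selected.

```python
def best_clipping_range(candidates, weights=(1.0, 1.0, 1.0)):
    """Return the candidate with the highest weighted sum of the three scores."""
    w_relevance, w_area, w_position = weights

    def overall(candidate):
        return (w_relevance * candidate["relevance"]
                + w_area * candidate["area"]
                + w_position * candidate["position"])

    return max(candidates, key=overall)


# Hypothetical candidates with scores stored per adjusted clipping range.
candidates = [
    {"name": "A", "relevance": 2.8, "area": 0.4, "position": 0.9},
    {"name": "B", "relevance": 0.7, "area": 0.8, "position": 0.5},
    {"name": "C", "relevance": 1.8, "area": 0.5, "position": 0.7},
]
best_overall = best_clipping_range(candidates)
best_by_area = best_clipping_range(candidates, weights=(0.0, 1.0, 0.0))
```

Setting two of the weights to zero reduces this to the single-score selection also mentioned above, so one function can cover both alternatives.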

本実施形態では、ステップＳ２８の処理後、ステップＳ１１で取得したレイアウト対象の画像から、ステップＳ２８で特定した最適なクリッピング範囲内の画像（画素）を抽出してレイアウト用画像を生成する（Ｓ７６）。制御部１１は、生成したレイアウト用画像を記憶部１２に記憶しておく。そして制御部１１は、編集処理が未処理の画像が有るか否かを判断し（Ｓ７７）、未処理の画像が有ると判断した場合（Ｓ７７：ＹＥＳ）、ステップＳ１１の処理に戻り、未処理のレイアウト対象の画像を取得し（Ｓ１１）、取得した画像に対して、ステップＳ１２～Ｓ２８及びＳ７６の処理を行う。これにより、レイアウト対象の画像のそれぞれからレイアウト用画像を生成できる。 In this embodiment, after the processing of step S28, an image (pixels) within the optimal clipping range identified in step S28 is extracted from the image to be laid out obtained in step S11 to generate an image for layout (S76). The control unit 11 stores the generated image for layout in the storage unit 12. The control unit 11 then determines whether or not there is an image that has not been edited (S77), and if it determines that there is an image that has not been edited (S77: YES), the process returns to step S11, and the unprocessed image for layout is obtained (S11), and the processing of steps S12 to S28 and S76 is performed on the obtained image. This allows a layout image to be generated from each of the images to be laid out.
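
The loop over steps S11 to S28, S76, and S77 amounts to cropping each image to its optimal clipping range. The sketch below represents an image as a nested list of pixel rows so that it needs no image library; that representation, the (left, top, right, bottom) range convention with exclusive right and bottom edges, and the function names are assumptions made for this illustration.

```python
def crop(image, clipping_range):
    """Extract the pixels inside (left, top, right, bottom) from a row-major image."""
    left, top, right, bottom = clipping_range
    return [row[left:right] for row in image[top:bottom]]


def generate_layout_images(images, optimal_ranges):
    """Crop every image to be laid out to its own optimal clipping range."""
    return [crop(image, rng) for image, rng in zip(images, optimal_ranges)]


# Hypothetical 4x3 image whose pixel value encodes its position (x + 10 * y).
image = [[x + 10 * y for x in range(4)] for y in range(3)]
layout_images = generate_layout_images([image], [(1, 0, 3, 2)])
```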

未処理の画像がないと判断した場合（Ｓ７７：ＮＯ）、制御部１１は、指定されたレイアウトデータに基づいて、レイアウト対象のテキストと、ステップＳ７６で生成したレイアウト用画像とを配置してページレイアウトを生成する（Ｓ７８）。具体的には、制御部１１は、レイアウト用画像のそれぞれを対応するレイアウト枠にはめ込み、レイアウト対象のテキストを対応するレイアウト枠にはめ込むことによりページレイアウトを生成する。なお、テキストを所定のレイアウト枠にはめ込む場合、文字サイズの変更及び改行の挿入等を適宜行ってもよい。 If it is determined that there are no unprocessed images (S77: NO), the control unit 11 generates a page layout by arranging the text to be laid out and the layout images generated in step S76 based on the specified layout data (S78). Specifically, the control unit 11 generates a page layout by fitting each of the layout images into the corresponding layout frame and fitting the text to be laid out into the corresponding layout frame. When fitting the text into a specified layout frame, the control unit 11 may change the character size and insert line breaks as appropriate.

上述した処理により、レイアウト対象の画像から、レイアウトデータで指定されたアスペクト比を有すると共に、テキストの内容に関連する対象物（被写体）の領域がより画像中央に位置し、サイズがより大きいレイアウト用画像を生成することができる。これにより、テキストで述べられている対象物が見易い位置及びサイズで表示された画像を生成することができ、このような画像を各種の媒体で使用することにより視認性が高く読み易いページレイアウトを生成することができる。 The above-described process makes it possible to generate a layout image from the image to be laid out, which has the aspect ratio specified in the layout data, and in which the area of the object (subject) related to the content of the text is located closer to the center of the image and is larger in size. This makes it possible to generate an image in which the object mentioned in the text is displayed in a position and size that makes it easy to see, and by using such an image in various media, it is possible to generate page layouts that are highly visible and easy to read.

本実施形態では、上述した各実施形態と同様の効果が得られる。また本実施形態では、画像及びテキストを含むページレイアウトを生成する際に、画像から、テキストの内容に沿ったクリッピング範囲のレイアウト用画像を生成することができる。よって、テキストの内容に適した対象物がより見易く表示された画像を各種の媒体で使用することが可能となる。また本実施形態では、レイアウト対象の画像から、対象物が見易い状態のレイアウト用画像を自動的に生成するので画像編集を行うユーザの作業負担を軽減できる。本実施形態においても、上述した各実施形態で適宜説明した変形例の適用が可能である。 In this embodiment, the same effects as those of the above-mentioned embodiments can be obtained. In addition, in this embodiment, when generating a page layout including images and text, a layout image with a clipping range that matches the content of the text can be generated from the image. This makes it possible to use images in which objects suitable for the content of the text are displayed more easily in various media. In addition, in this embodiment, a layout image in which the objects are easily visible is automatically generated from the image to be laid out, thereby reducing the workload of the user who edits the images. In this embodiment, the modified examples described as appropriate in each of the above-mentioned embodiments can also be applied.

上述した各実施形態において、情報処理装置１０が画像から投稿用画像又はレイアウト用画像を生成する処理を、ネットワークに接続された所定のサーバで行うように構成してもよい。この場合、情報処理装置１０の制御部１１は、処理対象の画像、或いは、レイアウト対象の画像及びテキストをネットワーク経由で所定のサーバへ送信し、所定のサーバで生成された投稿用画像又はレイアウト用画像を取得してもよい。このような構成におけるサーバは、サーバコンピュータ又はパーソナルコンピュータを用いて実現されてもよく、１台のサーバ内に設けられた複数の仮想マシンを用いて実現されてもよく、クラウドサーバを用いて実現されてもよい。 In each of the above-described embodiments, the information processing device 10 may be configured to perform the process of generating an image for posting or an image for layout from an image on a specified server connected to a network. In this case, the control unit 11 of the information processing device 10 may transmit the image to be processed, or the image and text to be laid out, to the specified server via the network, and obtain the image for posting or the image for layout generated by the specified server. The server in such a configuration may be realized using a server computer or a personal computer, may be realized using multiple virtual machines provided in a single server, or may be realized using a cloud server.

今回開示された実施の形態はすべての点で例示であって、制限的なものではないと考えられるべきである。本発明の範囲は、上記した意味ではなく、特許請求の範囲によって示され、特許請求の範囲と均等の意味及び範囲内でのすべての変更が含まれることが意図される。 The embodiments disclosed herein are illustrative in all respects and should not be considered limiting. The scope of the present invention is indicated by the claims, not by the meaning described above, and is intended to include all modifications within the scope and meaning equivalent to the claims.

REFERENCE SIGNS LIST
１０情報処理装置 10 Information processing device
１１制御部 11 Control unit
１２記憶部 12 Storage unit
１３通信部 13 Communication unit
１４入力部 14 Input unit
１５表示部 15 Display unit
１６カメラ 16 Camera

Claims

1. An information processing device comprising:
an image acquisition unit that acquires an image including an object;
an aspect ratio acquisition unit that acquires an aspect ratio of a display area in which the object is displayed;
a comparison unit that detects the object from the image and compares an aspect ratio of a region of the detected object with the aspect ratio of the display area;
a switching unit that, when the aspect ratio of the region of the object and the aspect ratio of the display area are different, sequentially switches the region of the object having the aspect ratio of the display area; and
a specifying unit that specifies the region to be switched based on a degree of relevance between an image cut out based on the switched region and the object.

2. The information processing device according to claim 1, further comprising:
an object detection unit that detects the region of the object from the image cut out based on the switched region; and
a ratio score calculation unit that calculates a score according to a ratio of the detected region of the object to the cut-out image,
wherein the specifying unit specifies the region to be switched based on the calculated score according to the ratio.

3. The information processing device according to claim 1, further comprising:
an object detection unit that detects the region of the object from the image cut out based on the switched region; and
a position score calculation unit that calculates a score according to a position of the detected region of the object in the cut-out image,
wherein the specifying unit specifies the region to be switched based on the calculated score according to the position.

4. The information processing device according to claim 1, further comprising:
a part detection unit that detects a region of each part of the object from the image cut out based on the switched region,
wherein the specifying unit specifies the region to be switched based on the detected region of each part of the object.

5. The information processing device according to claim 1, further comprising:
a text acquisition unit that acquires text data corresponding to the image including the object;
an object identification unit that identifies an object indicated by a word appearing in the acquired text data; and
a detection unit that detects, from the acquired image, the object appearing in the text data,
wherein the comparison unit compares an aspect ratio of the region of the object detected by the detection unit with the aspect ratio of the display area.

6. The information processing device according to claim 5, further comprising:
an occurrence frequency calculation unit that calculates an occurrence frequency of the object appearing in the text data; and
a relevance calculation unit that calculates, based on the occurrence frequency of the object and on the object included in the image cut out based on the switched region, a relevance between the object appearing in the text data and the image cut out based on the switched region,
wherein the specifying unit specifies the region to be switched based on the calculated relevance.

7. The information processing device according to claim 1, further comprising a generation unit that generates a display image by cutting out the region of the object from the acquired image when the aspect ratio of the region of the object and the aspect ratio of the display area are the same.

8. A program for causing a computer to execute a process comprising:
acquiring an image including an object;
acquiring an aspect ratio of a display area in which the object is displayed;
detecting the object from the image, and comparing an aspect ratio of a region of the detected object with the aspect ratio of the display area;
when the aspect ratio of the region of the object and the aspect ratio of the display area are different, sequentially switching the region of the object having the aspect ratio of the display area; and
specifying the region to be switched based on a degree of relevance between an image cut out based on the switched region and the object.

9. An information processing method comprising:
acquiring an image including an object;
acquiring an aspect ratio of a display area in which the object is displayed;
detecting the object from the image, and comparing an aspect ratio of a region of the detected object with the aspect ratio of the display area;
when the aspect ratio of the region of the object and the aspect ratio of the display area are different, sequentially switching the region of the object having the aspect ratio of the display area; and
specifying the region to be switched based on a degree of relevance between an image cut out based on the switched region and the object.