JP2020088611A

JP2020088611A - Image processing apparatus and control method thereof

Info

Publication number: JP2020088611A
Application number: JP2018220676A
Authority: JP
Inventors: 哲平関口; Teppei Sekiguchi; 咲樋渡; Saki Hiwatari; 健司杉原; Kenji Sugihara
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2018-11-26
Filing date: 2018-11-26
Publication date: 2020-06-04
Anticipated expiration: 2038-11-26
Also published as: JP7299690B2

Abstract

To provide an image processing apparatus that efficiently encodes a moving image. SOLUTION: An image processing apparatus (monitoring camera) comprises: an encoding section that encodes frame images constituting a moving image; an image rotation determining section that determines whether image rotation has occurred relative to a preceding frame image in the moving image; storage means that stores the frame image preceding a frame image for which it is determined that image rotation has occurred; and an encoding processing control section that, when it is determined that image rotation has occurred again after an earlier determination that image rotation occurred, controls the encoding section to encode the frame image for which the renewed rotation was determined, using inter-frame prediction with the frame image stored in the storage means as the reference frame. SELECTED DRAWING: Figure 1

Description

The present invention relates to a technique for encoding moving images.

In recent years, video systems using cameras have been widely introduced along with the spread of video conference systems. Surveillance cameras are also increasingly installed in facilities, not only for crime prevention but also for investigation and management. In particular, there are more and more applications in which a surveillance camera monitors a target and the video data is transmitted via the Internet or wireless communication, or stored on a storage medium. The image quality and resolution of video shot by surveillance cameras have recently been rising and the volume of video data has been growing, so a reduction in data volume is required.

Video coding techniques include inter-frame prediction, which predicts the pixel values of a target frame from the frames before and after it, and intra-frame prediction, which predicts the pixel values of a target frame from within that frame, as well as entropy coding and quantization. Using these techniques makes it possible to compress video data effectively.

When a scene change occurs in a video, it is desirable to encode the frame as an I frame (Intra-coded Frame) in order to avoid degrading image quality. An I frame is a frame whose prediction signal is generated from decoded pixels surrounding the block to be encoded within the same frame, and its code amount is larger than that of a P frame (Predicted Frame) or the like. Patent Document 1 proposes a technique for a video conference system that changes the range to be used (performs a scene change) based on a trigger, such as a voice or the appearance of a person, and on frame information notified in advance from the encoder side. More specifically, efficient encoding is achieved by performing the scene change at the timing of an I frame in response to receiving the trigger.

Patent Document 1: Japanese Patent Laid-Open No. 2017-28375 (JP 2017-28375 A)

However, unlike in a video conference, in surveillance camera applications an unspecified number of people speak or appear at irregular times. It is therefore difficult to use a trigger such as a voice or the appearance of a person as in Patent Document 1. In addition, with a ceiling-suspended surveillance camera, scene changes are likely to occur due to pan-tilt operation. This is because, in order to output video that looks natural, the video is flipped upside down (the image is rotated by 180°) at the moment the shooting direction points vertically downward. As a result, the amount of motion vector information used in inter-frame prediction becomes large between the frame images before and after the flip. Consequently, the coding efficiency of the video as a whole falls and the encoded data size increases.

The present invention has been made in view of these problems, and its object is to provide a technique that enables efficient encoding of moving images.

In order to solve the above problems, the image processing apparatus according to the present invention has the following configuration. That is, the image processing apparatus comprises:
encoding means for encoding frame images constituting a moving image;
determination means for determining whether image rotation has occurred relative to a preceding frame image in the moving image;
storage means for storing a frame image preceding the frame image for which the determination means determined that image rotation occurred; and
control means for controlling the encoding means so that, when the determination means determines that image rotation has occurred again in the moving image after having determined that image rotation occurred, the frame image for which the renewed image rotation was determined is encoded using inter-frame prediction with the frame image stored in the storage means as the reference frame.

According to the present invention, it is possible to provide a technique that enables efficient encoding of moving images.

FIG. 1 is a diagram showing the functional configuration of a surveillance camera including the image processing apparatus according to the first embodiment.
FIG. 2 is a diagram showing the hardware configuration of the video processing unit.
FIG. 3 is a diagram explaining the pan-tilt operation in three-dimensional coordinates.
FIG. 4 is a diagram explaining the video acquired by the pan-tilt operation.
FIG. 5 is a diagram showing an example of the search range when an image is rotated by 180°.
FIG. 6 is a diagram explaining the change of the encoding process in the first embodiment.
FIG. 7 is an operation flowchart of the image processing apparatus in the first embodiment.
FIG. 8 is a diagram explaining the reference frame for each frame in the first embodiment.
FIG. 9 is a diagram explaining the relationship between the pan angle and the frame buffer.
FIG. 10 is a diagram explaining the change of the encoding process in a modification.
FIG. 11 is a diagram explaining the change of the encoding process in the second embodiment.
FIG. 12 is a diagram explaining the reference frame for each frame in the second embodiment.
FIG. 13 is a diagram explaining image processing based on DPTZ setting information.
FIG. 14 is a diagram explaining the position coordinates of a rectangular area.

An example of an embodiment of the present invention is described in detail below with reference to the drawings. The following embodiments are merely examples and are not intended to limit the scope of the present invention.

(First embodiment)
As a first embodiment of the image processing apparatus according to the present invention, an image processing apparatus (video processing unit 100) that encodes video obtained by a ceiling-suspended surveillance camera is described below as an example.

<Device configuration>
FIG. 1 is a diagram showing the functional configuration of a surveillance camera including the image processing apparatus according to the first embodiment. The surveillance camera includes a video processing unit 100, which is the image processing apparatus, a camera 300, and an overall control unit 200, and operates based on settings received from a PTZ setting unit 201. In the following description the PTZ setting unit 201 is assumed to be housed separately from the surveillance camera, but it may instead be included in the same housing.

The PTZ setting unit 201 receives PTZ setting information (direction information) concerning pan, tilt, and zoom (PTZ) of the surveillance camera from the user and transmits the setting to the overall control unit 200, for example via a network. In the following description it is assumed that pan and tilt angles (φpan, φtilt) relative to the initial position of the surveillance camera are accepted, but the unit may instead be configured to accept an angle (Δφ) relative to the current tilt angle of the surveillance camera.

To simplify the description, the following focuses on the tilt setting. Instead of receiving the setting from the user, the PTZ setting unit 201 may calculate it itself. For example, it may calculate, from the captured image, the settings needed to track an object being shot by the surveillance camera.

Based on the settings received from the PTZ setting unit 201, the overall control unit 200 controls a motor or the like (not shown) to control the shooting direction by pan and tilt. Here, pan means moving the direction of the camera lens (the shooting direction) horizontally (left and right), and tilt means moving the direction of the camera lens vertically (up and down). The overall control unit 200 also transmits zoom information to the camera 300 to control zoom, and transmits information to the video processing unit 100.

The camera 300 shoots video and transmits it to the video processing unit 100. The camera 300 has a camera control unit 301, a lens 302, and an imaging unit 303. The camera control unit 301 controls the lens 302 and the imaging unit 303, and performs zoom operation and focus control based on information received from the overall control unit 200.

The video processing unit 100 performs image processing and compression encoding on the video obtained by the camera 300 and outputs an encoded stream. Its distinguishing feature is that it performs the image processing and compression encoding based on the PTZ setting information input from the overall control unit 200.

The image rotation determination unit 101 generates trigger information for the timing of image rotation in the image processing unit 103, based on an instruction from the overall control unit 200. The trigger information indicates the timing at which, in the video obtained by the ceiling-suspended surveillance camera (the video output from the imaging unit 303), the image is rotated by the tilt operation (the top and bottom of the captured image are inverted). In the following description the rotation of the image is determined inside the video processing unit 100, but the determination may instead be made by an external device and the result received. For example, the overall control unit 200 may determine the timing at which the image rotates by determining whether the tilt setting received from the PTZ setting unit 201 has exceeded a preset threshold.

Based on the image rotation timing trigger information output from the image rotation determination unit 101, the encoding processing control unit 102 determines whether the encoding unit 104 performs intra-frame prediction or inter-frame prediction. In particular, it sets the reference frame used for inter-frame prediction (the LongTerm reference frame described later). The frame image set as the reference frame may be made freely selectable by the user.

The image processing unit 103 performs image processing on the video obtained by the camera 300. The video is input from the camera 300 as a series of frame images (for example, 30 frames per second). In the following description the image processing unit 103 rotates a frame image by a predetermined angle (for example, 180°) based on the trigger information received from the image rotation determination unit 101, but it may be configured to perform other image processing as well. The frame image that has (or has not) been rotated by the image processing unit 103 is output to the encoding unit 104.

The encoding unit 104 encodes the frame image received from the image processing unit 103 based on the information received from the encoding processing control unit 102 (whether to use intra-frame prediction or inter-frame prediction).

FIG. 2 is a diagram showing the hardware configuration of the video processing unit 100. The video processing unit 100 includes a CPU 151, a ROM 152, a RAM 153, a camera I/F 154, a control I/F 155, and a network I/F 156. The ROM 152 stores programs executed by the CPU 151 and various setting data. The CPU 151 loads a program stored in the ROM 152 into the RAM 153 and executes it, thereby implementing image processing and encoding of frame images input via the camera I/F 154 and communication via the network I/F 156. The camera I/F 154 receives video (frame images) from the camera 300. The control I/F 155 receives control signals from the overall control unit 200. The network I/F 156 transmits the encoded stream via the network.

<Video captured by a ceiling-suspended surveillance camera>
FIG. 3 is a diagram explaining the pan-tilt operation in three-dimensional coordinates. It shows the pan and tilt operation in an XYZ orthogonal coordinate system, with the negative direction of the X axis as the initial position of the camera. The shooting direction of a ceiling-suspended surveillance camera points below the horizontal (the XZ plane). For that reason, the tilt angle (φtilt) and the pan angle (φpan) are defined here relative to the direction of the camera's initial position.

FIG. 4 is a diagram explaining the video acquired by the pan-tilt operation. FIG. 4 corresponds to viewing the XY plane from the positive direction of the Z axis in FIG. 3. The camera positions (1) to (5) correspond to tilt angles (φtilt) of 30°, 60°, 90°, 120°, and 150°, respectively. To simplify the description, the pan angle (φpan) is assumed to be 0° throughout.

When the camera position is changed continuously from (1) to (5), the top and bottom of the frame image are inverted at position (3), where the camera shoots vertically downward. FIG. 4 shows a situation in which a person moving along a straight passage running from west (West Gate) to east (East Gate) is tracked by the tilt operation of a surveillance camera installed on the ceiling of the passage. In this case, frame images F1, F2, FA, FB, and FC are obtained in order as the video output from the imaging unit 303. That is, the image flips upside down partway through the video, which looks unnatural. To make the video look natural (so that no upside-down flip occurs), a method of rotating frame images FA, FB, and FC by 180° is sometimes used. With this method, frame images F1, F2, F3, F4, and F5 are obtained in order.
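The 180° rotation applied to FA, FB, and FC can be illustrated with a minimal Python sketch. It assumes a frame is represented as a list of pixel rows; the function name `rotate_180` and the sample values are hypothetical and are not taken from the patent.

```python
def rotate_180(frame):
    """Rotate a frame (a list of pixel rows) by 180 degrees.

    Reversing the row order and then reversing each row is equivalent
    to a 180-degree rotation about the image centre.
    """
    return [list(reversed(row)) for row in reversed(frame)]


# Hypothetical 2x3 "frame" standing in for FA.
frame_fa = [
    [1, 2, 3],
    [4, 5, 6],
]
frame_f3 = rotate_180(frame_fa)
print(frame_f3)  # [[6, 5, 4], [3, 2, 1]]
```

Applying the function twice returns the original frame, which is why a frame stored before one rotation has the same orientation as the frames that follow the next rotation.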

Therefore, in the first embodiment, the image rotation determination unit 101 determines whether to rotate the input frame image by 180° based on whether the tilt angle (φtilt) supplied by the overall control unit 200 has reached a predetermined angle (φtrig). For example, with φtrig = 90°, the image rotation determination unit 101 determines that the frame image is to be rotated by 180° once φtilt = 90° (position (3) in FIG. 4). It then generates trigger information at the moment the 180° rotation changes from "not applied" to "applied" (or vice versa) and outputs it to the image processing unit 103. The initial camera position used as the reference and the tilt angle at which the trigger information is output are not limited to those described above and can be set freely by the user.
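The threshold test and the trigger generation can be sketched as follows. This is only an illustration under stated assumptions: the rotation is taken to apply once φtilt is at or beyond φtrig (matching the φtilt = 90° example), the camera is assumed to start below the threshold, and the name `rotation_triggers` and the sample angles are hypothetical.

```python
def rotation_triggers(tilt_angles, phi_trig=90.0):
    """Yield (frame_index, rotate, trigger) for each tilt angle.

    rotate  -- True when the frame should be rotated by 180 degrees
    trigger -- True when the rotate decision differs from the previous frame
    """
    previous = False  # assumption: the camera starts below the threshold
    for index, phi_tilt in enumerate(tilt_angles):
        rotate = phi_tilt >= phi_trig
        yield index, rotate, rotate != previous
        previous = rotate


# Hypothetical tilt sweep: down through vertical and back again.
angles = [30, 60, 90, 120, 150, 120, 90, 60]
trigger_frames = [i for i, _, trig in rotation_triggers(angles) if trig]
print(trigger_frames)  # [2, 7]
```

A trigger is produced only when the decision changes, so a camera that stays beyond the threshold keeps rotating frames without producing further triggers.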

In the first embodiment, the image rotation determination unit 101 determines, from the pan-tilt angle information set by the PTZ setting unit 201 and received via the overall control unit 200, whether the instruction is a pan operation, a tilt operation, or a combination of the two. Because the video rotation described above does not occur with a pan operation alone, the image rotation determination unit 101 does not generate image rotation trigger information in that case.

FIG. 5 is a diagram showing an example of the search range in inter-frame prediction when an image is rotated by 180°. When encoding with inter-frame prediction, the encoding unit 104 uses the immediately preceding frame image as the reference frame. However, as described with reference to FIG. 4, when the image processing unit 103 rotates the image by 180° partway through the video, the images input to the encoding unit 104 before and after the rotation differ by a 180° rotation. Therefore, when encoding image frame 500b, the frame immediately after the 180° rotation, image frame 500a, the frame immediately before the rotation, is used as the reference frame.

Consequently, when calculating the motion vector for encoding rectangle B of image frame 500b, the area around rectangle A of image frame 500a is set as the search range for motion prediction. That is, rectangle C, which actually corresponds to rectangle B, falls outside the search range. As a result, when image frame 500b is encoded using inter-frame prediction, the prediction residual of the motion prediction increases.
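The geometry behind this can be checked with a small Python sketch. It assumes a static scene, a square search window centred on the co-located block, and hypothetical values for the frame size, block size, and search radius; none of these figures appear in the patent.

```python
def rotated_block_position(x, y, block_w, block_h, frame_w, frame_h):
    """Top-left corner of a block after the frame is rotated by 180 degrees."""
    return frame_w - x - block_w, frame_h - y - block_h


def in_search_range(candidate, center, search_radius):
    """True if `candidate` lies within +/- search_radius of `center`."""
    return (abs(candidate[0] - center[0]) <= search_radius
            and abs(candidate[1] - center[1]) <= search_radius)


# Hypothetical numbers: 1920x1080 frame, 16x16 block, +/-64 pixel search.
rect_b = (160, 96)   # block being encoded in frame 500b
rect_a = rect_b      # co-located block in reference frame 500a
rect_c = rotated_block_position(160, 96, 16, 16, 1920, 1080)

print(rect_c)                               # (1744, 968)
print(in_search_range(rect_c, rect_a, 64))  # False
```

With these numbers the matching content sits more than 1500 pixels away from the co-located block, far outside the search window, so the encoder cannot find a good match in frame 500a.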

Therefore, in accordance with trigger information for performing the 180° rotation of the image, the encoding processing control unit 102 stores the frame image preceding the frame image corresponding to that trigger information in a frame buffer, which is the storage unit for the LongTerm reference frame. When trigger information for performing the 180° rotation is input again later, the encoding processing control unit 102 performs control so that inter-frame prediction is carried out with the frame image stored in the LongTerm reference frame buffer as the LongTerm reference frame. In other words, this control is applied when the image orientation becomes the same as that of the frame image stored in the frame buffer.

<Device operation>
FIG. 6 is a diagram explaining the change control of the encoding process. It shows, as an example, the frame images constituting the moving image obtained when the camera position is changed continuously from (1) to (5) and then continuously from (5) to (2).

Conventionally, encoding using intra-frame prediction is performed, for example, at a predetermined cycle, and all other frames are encoded using inter-frame prediction. In FIG. 6, I0 is an I frame (I picture) encoded using intra-frame prediction, and P1 to P14 are P frames (P pictures) encoded using inter-frame prediction. In FIG. 6, the reference frame used to encode each frame image is indicated by a white arrow.

Conventionally, when encoding the frame image at the timing corresponding to P13, the immediately preceding frame image P12 is set as the reference frame, and the frame image is encoded as a P frame (P13) using inter-frame prediction. In the first embodiment, by contrast, when encoding the frame image at the timing corresponding to P13, the previously stored frame image P3 is set as the LongTerm reference frame, and the frame image is encoded as a P frame (P13) using inter-frame prediction. The other frame images are encoded as P frames (P1, ..., P12, P14) with the preceding frame image as the reference frame.

FIG. 8 is a diagram explaining the reference frame for each frame. It tabulates the relationship between the reference frame for each frame to be processed at each camera position and the 180° rotation trigger. As shown in FIG. 8, the encoding unit 104 receives trigger information for performing the 180° rotation at the timing of camera position (3). The encoding unit 104 therefore stores the frame immediately preceding the frame image being encoded at that timing in the frame buffer for the LongTerm reference frame. Specifically, P3 is stored at the timing of the first camera position (3), and P12 is stored at the timing of the second camera position (3). The frame image being encoded at the timing of the second camera position (3) is encoded as P13 by inter-frame prediction with P3 as the LongTerm reference frame. P12, which is stored at the timing of P13, will be used as the LongTerm reference frame the next time camera position (3) is reached.
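The reference-frame assignment described for FIG. 6 and FIG. 8 can be reproduced with a short simulation. This is a sketch under assumptions: the per-frame tilt angles in `ASSUMED_TILT_ANGLES` are hypothetical values chosen so that rotations fall on P4 and P13, a single long-term buffer is modelled, and the names `plan_references` and `label` are illustrative rather than part of the patent.

```python
# Hypothetical tilt angle for frames I0, P1, ..., P14 (degrees).
ASSUMED_TILT_ANGLES = [30, 45, 60, 75, 90, 105, 120, 135,
                       150, 135, 120, 105, 90, 75, 60]


def label(index):
    """Frame label: None for no reference, I0 for frame 0, Pn otherwise."""
    if index is None:
        return None
    return "I0" if index == 0 else f"P{index}"


def plan_references(tilt_angles, phi_trig=90.0):
    """Return a list of (frame_label, reference_label) pairs.

    Frame 0 is intra-coded. Every other frame references the preceding
    frame, except a frame at a second or later rotation, which references
    the frame held in the long-term buffer.
    """
    plan = []
    long_term = None          # index of the frame in the long-term buffer
    previous_rotate = False   # assumption: no rotation before frame 0
    for index, phi_tilt in enumerate(tilt_angles):
        rotate = phi_tilt >= phi_trig
        trigger = index > 0 and rotate != previous_rotate
        if index == 0:
            reference = None
        elif trigger and long_term is not None:
            reference = long_term      # reuse the stored orientation
        else:
            reference = index - 1      # conventional P-frame reference
        if trigger:
            long_term = index - 1      # store (overwrite) the preceding frame
        plan.append((label(index), label(reference)))
        previous_rotate = rotate
    return plan


plan = dict(plan_references(ASSUMED_TILT_ANGLES))
print(plan["P4"], plan["P13"])  # P3 P3
```

In this run P4 still references P3 in the conventional way, while P13 references the stored P3 instead of P12, and P12 then replaces P3 in the buffer for the next rotation.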

In the following description, it is assumed that the frame buffer for LongTerm reference frames is configured to store, each time a rotation occurs, a plurality of frame images (a frame image group) according to the pan angle. It is also assumed that each stored frame image is updated (overwritten) in turn each time a rotation occurs. However, the frame image may instead be left without updating and the previous LongTerm reference frame kept in use. The frame buffer may also be configured to store frame image groups corresponding to multiple rotations.

FIG. 9 is a diagram explaining the relationship between the pan angle and the frame buffer. Here, reference frame buffer A is used when the pan angle is close to (1). Similarly, reference frame buffers B, C, and D are used when the pan angle is close to (2), (3), and (4), respectively. However, the pan angles to be set and the reference frame buffers to be used are not limited to these.

FIG. 7 is an operation flowchart of the image processing apparatus in the first embodiment. The process is performed, for example, when monitoring by the surveillance camera starts.

In step S701, the PTZ setting unit 201 receives the initial settings of the surveillance camera from the user. The information set here includes the shooting time of the surveillance camera, the pan-tilt speed, threshold information on the camera's horizontal swing angle, and the like. The PTZ setting unit 201 notifies the overall control unit 200 of the settings received from the user, for example via a network.

In step S702, the overall control unit 200 starts shooting with the camera 300. Position (1) is assumed here as the initial position at the start of shooting, but the starting position is not limited to this.

In step S703, the image rotation determination unit 101 receives the moving image (frame images) obtained by shooting and determines whether the current frame image to be processed has been rotated by 180° by the image processing unit 103 relative to the preceding frame image. If the frame image to be processed has not been rotated by 180° (the surveillance camera position is other than (3) in FIG. 6) (No in S703), the encoding unit 104 encodes the frame in the conventional manner, and the process returns to S703 to handle the next frame image. If the image has been rotated by 180° (the surveillance camera position is (3) in FIG. 6) (Yes in S703), the process proceeds to S704.

In step S704, the image rotation determination unit 101 determines the pan angle at the time the image was rotated by 180°. Then, according to the pan angle (that is, the horizontal angle), it decides which reference frame buffer holds the frame image to be set as the LongTerm reference frame.
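One way to picture the buffer selection in S704 is a nearest-angle lookup. The sketch below is an assumption-laden illustration: the patent does not give the pan angles for positions (1) to (4) in FIG. 9, so the values in `BUFFER_PAN_ANGLES` are hypothetical, as are the function names.

```python
# Hypothetical pan angles (degrees) for positions (1)-(4) of FIG. 9,
# mapped to reference frame buffers A-D.
BUFFER_PAN_ANGLES = {"A": 0.0, "B": 90.0, "C": 180.0, "D": 270.0}


def angular_distance(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def select_buffer(pan_angle, buffer_angles=BUFFER_PAN_ANGLES):
    """Name of the buffer whose pan angle is closest to `pan_angle`."""
    return min(buffer_angles,
               key=lambda name: angular_distance(pan_angle, buffer_angles[name]))


print(select_buffer(10.0))   # A
print(select_buffer(350.0))  # A (wraps around 360 degrees)
print(select_buffer(200.0))  # C
```

Measuring the distance on a circle keeps pan angles just below 360° associated with the buffer at 0°, which a plain subtraction would miss.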

In step S705, the image rotation determination unit 101 determines whether the image rotation is the first 180° rotation. If it is the first image rotation (Yes in S705) (camera position (3), P4 in FIG. 6), the process proceeds to S706. If the image rotation has occurred again, that is, it is not the first image rotation (No in S705) (camera position (3), P13 in FIG. 6), the process proceeds to S707.

In step S706, the encoding processing control unit 102 stores the frame image one frame before the current frame image in the reference frame buffer A and sets it as the LongTerm reference frame. Here, it is assumed that the pan angle was determined in S704 to be closest to (1). This corresponds to the timing of P4 and camera position (3) in FIGS. 6 and 8, and the reference frame at this time is P3. When the pan angle is close to (2), (3), or (4) in FIG. 9, the frame image is stored in the reference frame buffer B, C, or D, respectively, and set as the LongTerm reference frame.

In step S707, the encoding unit 104 performs the encoding process using the LongTerm reference frame that was set at the time of the previous image rotation (camera position (3), P4 in FIG. 6). This corresponds to the timing of P13 and camera position (3) in FIGS. 6 and 8, and the reference frame at this time is P3.

In step S708, the encoding processing control unit 102 stores the frame image one frame before the current frame image in the reference frame buffer A and sets it as the LongTerm reference frame. As a result, the previously set LongTerm reference frame is overwritten.

In step S709, the overall control unit 200 determines whether or not a shooting end setting exists. If there is no shooting end setting (No in S709), the process returns to S703. If there is a shooting end setting (Yes in S709), shooting with the camera 300 ends and the process ends.

As described above, according to the first embodiment, the frame image immediately before a 180° image rotation occurs is stored in a frame buffer. Then, when a 180° image rotation occurs again (that is, when the image direction becomes the same as that of the frame image stored in the frame buffer), encoding by inter-frame prediction is performed using the stored frame image as the LongTerm reference frame. This makes it possible to reduce the bit rate of the output encoded stream.
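
The decision logic of steps S703 to S708 can be sketched roughly as follows. This is a minimal illustration, not the disclosed implementation: the class `LongTermController`, the function `select_buffer`, and the preset pan angles in `PAN_PRESETS` are hypothetical names and values introduced here (the text only says that positions (1) to (4) of FIG. 9 map to buffers A to D). The sketch also tracks "first rotation" per buffer, which is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical pan angles (degrees) standing in for positions (1)-(4) of FIG. 9.
PAN_PRESETS = {"A": 0.0, "B": 90.0, "C": 180.0, "D": 270.0}


def select_buffer(pan_deg: float) -> str:
    """S704: pick the buffer whose preset pan angle is closest (circular distance)."""
    def dist(a: float, b: float) -> float:
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)
    return min(PAN_PRESETS, key=lambda name: dist(pan_deg, PAN_PRESETS[name]))


@dataclass
class LongTermController:
    # buffer name -> index of the frame currently held as LongTerm reference
    buffers: Dict[str, int] = field(default_factory=dict)

    def on_frame(self, index: int, rotated: bool, pan_deg: float) -> Optional[int]:
        """Return the LongTerm reference frame index to use, or None for
        conventional encoding of this frame."""
        if not rotated:                  # S703: No
            return None
        name = select_buffer(pan_deg)    # S704
        ref = self.buffers.get(name)     # S705: None means first rotation
        # S706 / S708: store the frame just before the rotation,
        # overwriting any previously stored frame in the same buffer.
        self.buffers[name] = index - 1
        return ref                       # S707: reference used when not None
```

With the frame numbering of FIG. 6, calling `on_frame(4, True, 0.0)` stores frame 3 and returns `None`, and a later `on_frame(13, True, 0.0)` returns 3 and then stores frame 12.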

(Modification)
In the first embodiment, the frame image one frame before the previous (first) image rotation is set as the LongTerm reference frame, but another frame image may be set as the LongTerm reference frame. For example, as shown in FIG. 10, the frame image three frames before the previous (first) image rotation may be set as the LongTerm reference frame. In this case, P13 is encoded using P1 as the LongTerm reference frame. When the frame image P14 is processed, it is encoded by inter-frame prediction using P13 as the reference frame.
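
One way to keep a frame from several frames back available as a LongTerm candidate is a small ring buffer of recent frames. The sketch below is only an illustration of that idea; the class `RecentFrames` and its methods are hypothetical names, and frames are represented by simple labels.

```python
from collections import deque


class RecentFrames:
    """Keeps the last `offset` frames so that the frame `offset` frames
    before the next incoming frame can be used as a LongTerm candidate."""

    def __init__(self, offset: int = 3):
        self.offset = offset
        self._frames = deque(maxlen=offset)

    def push(self, frame_id):
        # The oldest entry is dropped automatically once `offset` frames are held.
        self._frames.append(frame_id)

    def candidate(self):
        """Frame `offset` frames before the next frame, or None if too few seen."""
        if len(self._frames) < self.offset:
            return None
        return self._frames[0]
```

After pushing P1, P2, and P3, `candidate()` returns P1, which matches the FIG. 10 case where the rotation occurs at P4 and the frame three frames earlier is chosen.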

Further, the above description assumes a 180° image rotation, but the image rotation may be by another angle. That is, the same processing is applicable to any form in which the second image rotation restores the original image direction.

(Second Embodiment)
The second embodiment describes encoding control for the case where a ceiling-suspended camera shoots while performing a periodic tilt operation. In particular, when the camera is installed, camera position information specifying where image rotation is performed in the periodic operation, and time information for the tilt operation to reach that camera position, are set. A mode of performing inter-frame prediction using a LongTerm reference frame based on this initial information is then described. The functional configuration and hardware configuration of the image processing apparatus are almost the same as in the first embodiment, so the following description focuses on the differences.

<Device operation>
The PTZ setting unit 201 receives from the user the setting of initial information corresponding to the periodic operation of the camera. The overall control unit 200 receives the initial information from the PTZ setting unit 201 and configures the image rotation determination unit 101 accordingly. After that, the image rotation determination unit 101 calculates the image rotation position based on the initial information and, when it determines that the camera position has reached the image rotation position, notifies the encoding processing control unit 102 of trigger information.

FIG. 11 is a diagram illustrating control of the encoding process in the second embodiment. Here, the camera position changes periodically, reciprocating between (1) and (5) shown in FIG. 11 by the tilt operation. The lower part of FIG. 11 shows which frame image is used as the reference frame for encoding by inter-frame prediction at each camera position in this reciprocating motion.

As shown in FIG. 11, while the camera periodically performs the tilt operation, the presence or absence of image rotation applied to the frame image is switched every time the camera position reaches position (3). At that timing, the frame image is encoded by inter-frame prediction using the LongTerm reference frame.

FIG. 12 is a diagram illustrating the reference frame for each frame in the second embodiment. As shown in FIG. 12, the camera positions are set as initial setting information according to the installation location of the camera (third column of the table). In addition, the setting information of the reference frame (including the LongTerm reference frame) for each frame image (second column of the table) and the camera position at which the frame image is rotated by 180° (fourth column of the table) are set. In the following, the period from a change in the presence or absence of 180° rotation of the frame image until the next 180° rotation of the frame image is denoted T.

The image rotation determination unit 101 determines whether to rotate the frame image based on the table information shown in FIG. 12, which is set by the overall control unit 200. As described above, the table information includes given control content that defines the shooting direction at each time (each timing). The image rotation determination unit 101 starts counting the operation time at the moment shooting starts. Let the operation time be Tcount. If shooting starts from the initial position and orientation ((1) in FIG. 11), the camera reaches the position shown in (3) of FIG. 11 when Tcount = T/2. At this time, the image rotation determination unit 101 determines that the camera position requires image rotation and notifies the encoding processing control unit 102 of trigger information for rotating the image by 180°. As a result, the encoding processing control unit 102 can perform encoding by inter-frame prediction using the LongTerm reference frame at the camera position where the image is rotated by 180°. It can also store in the frame buffer a frame image to be used later as a LongTerm reference frame. When the camera position is determined to be (3), the image rotation determination unit 101 sets Tcount to its initial value 0 and restarts the process.
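
The timer-based trigger can be sketched as follows. This is a minimal sketch under stated assumptions: the class `RotationTimer` and its `tick` method are hypothetical, time is advanced in explicit steps, and the rule that every trigger after the first fires when Tcount reaches T is a reading of the definition of T rather than something the text spells out.

```python
class RotationTimer:
    """Fires a rotation trigger at Tcount = T/2 after start (first pass through
    position (3)), then every T after each reset (assumed from the definition of T)."""

    def __init__(self, period_t: float):
        self.period_t = period_t
        self.t_count = 0.0
        self.next_trigger = period_t / 2.0

    def tick(self, dt: float) -> bool:
        """Advance the operation time by dt; return True when a trigger fires."""
        self.t_count += dt
        if self.t_count >= self.next_trigger:
            self.t_count = 0.0                  # reset Tcount to its initial value
            self.next_trigger = self.period_t   # later passes are T apart
            return True
        return False
```

For example, with `period_t = 8.0` and one-unit ticks, triggers fire on ticks 4, 12, and 20.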

In the above description, the image rotation timing is detected based on the time elapsed since the start of the tilt operation (sixth column of the table). Alternatively, the image rotation timing may be detected based on the setting information indicating whether or not the LongTerm reference frame is used (second column of the table).

Further, in the above description, the table information matched to the periodic operation of the camera is created and set when the camera is installed, but the table information may instead be updated while the camera is operating. This makes it possible to handle cases where the mode of operation changes partway through.

As described above, according to the second embodiment, when a ceiling-suspended surveillance camera performs a periodic operation, table information matched to the operation cycle of the camera is used. Using this table information provides the same effect as the first embodiment.

(Third Embodiment)
The third embodiment describes processing when a surveillance camera with a digital pan/tilt/zoom (DPTZ) function is used. Here, the DPTZ function generates panned, tilted, and zoomed video by applying a series of image processing operations, such as trimming and rotation, to the video obtained by the surveillance camera. For example, by using an ultra-wide-angle imaging optical system, video covering a wide range of directions can be obtained without a movable part made up of motors, gears, and the like.

The functional configuration of the image processing apparatus is almost the same as in the first embodiment (FIG. 1), but instead of physically changing the orientation of the camera 300, pan, tilt, and zoom are performed in a pseudo manner by image processing in the video processing unit 100. Specifically, the overall control unit 200 provides DPTZ setting information to the video processing unit 100. The DPTZ setting information instructs the image processing unit 103 which image area to output. The video processing unit 100 outputs an encoded stream based on the camera information input from the camera 300 and the DPTZ setting information input from the overall control unit 200.

More specifically, the image processing unit 103 performs trimming and rotation on the frame image input from the camera 300 based on the DPTZ setting information. The image rotation determination unit 101 determines whether to perform image rotation based on the DPTZ setting information. The encoding processing control unit 102 then controls how the encoding unit 104 encodes the frames generated by the image processing unit 103 through the DPTZ function. Here, it performs control so that the LongTerm reference frame is used as the reference frame for inter-frame prediction.

<Generation of frame image by DPTZ>
FIG. 13 is a diagram illustrating image processing based on the DPTZ setting information. An image 1200a shows a frame image output from the imaging unit 303, and an image 1200b shows a frame image generated by the image processing unit 103 based on the DPTZ setting information.

Specifically, the image 1200b is the result of applying trimming, enlargement, and 180° rotation to the upper-body region of one of the three persons in the image 1200a. A region 1201 indicates the rectangular region (trimming region) set in the image 1200a based on the DPTZ setting information.
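
The trimming, enlargement, and 180° rotation chain can be illustrated on a tiny image held as nested lists. This is only a sketch: the function names are hypothetical, the enlargement is a simple nearest-neighbour upscale chosen for brevity, and the coordinates here are plain array indices with a top-left origin, which is an assumption made for this example.

```python
def crop(img, x1, y1, x2, y2):
    """Trim to rows y1..y2-1 and columns x1..x2-1 (array indices)."""
    return [row[x1:x2] for row in img[y1:y2]]


def enlarge(img, k):
    """Nearest-neighbour upscale by an integer factor k."""
    return [[px for px in row for _ in range(k)] for row in img for _ in range(k)]


def rotate180(img):
    """Rotate by 180 degrees: reverse the row order and each row."""
    return [row[::-1] for row in img[::-1]]


def dptz_frame(img, x1, y1, x2, y2, k=1, rotate=False):
    """Apply trimming, then enlargement, then optional 180-degree rotation."""
    out = enlarge(crop(img, x1, y1, x2, y2), k)
    return rotate180(out) if rotate else out
```

Because a 180° rotation and a uniform upscale commute, the order of the last two steps does not change the result in this sketch.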

FIG. 14 is a diagram illustrating the position coordinates of the rectangular region. Here, in the image 1200a, the left edge of the image is set to coordinate 0 on the X axis, and the center of the image in the vertical (up-down) direction is set to coordinate 0 on the Y axis. A region 1301 corresponds to the region 1201 and is defined by four vertices A to D. The four vertices are defined as A(x1, y1), B(x2, y1), C(x1, y2), and D(x2, y2).

The image rotation determination unit 101 acquires the position coordinate information of the region 1301 from the DPTZ setting information transmitted from the overall control unit 200. Based on the acquired position coordinate information, the image rotation determination unit 101 determines whether or not the processed image should be rotated.

For example, whether or not to perform image rotation on the image of the region 1301 is determined using the following expression (1).

|y1| - |y2| < 0 ... (1)
That is, expression (1) determines whether the region 1301 contains more of the positive side or more of the negative side in the Y-axis direction. When expression (1) is satisfied (the region contains more of the negative side in the Y-axis direction), the image rotation determination unit 101 decides to rotate the target rectangle by 180° and notifies the encoding processing control unit 102 of trigger information. When expression (1) is not satisfied (the region contains more of the positive side in the Y-axis direction), the image rotation determination unit 101 decides that no rotation is needed and does not send the trigger information.

Note that the determination using expression (1) is only an example, and the method of determining image rotation is not limited to it.
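
Expression (1) translates directly into a one-line predicate. The sketch below assumes the FIG. 14 convention (Y = 0 at the vertical center of the image, y1 shared by vertices A and B, y2 shared by C and D); the function name `needs_rotation` is hypothetical.

```python
def needs_rotation(y1: float, y2: float) -> bool:
    """Expression (1): rotate by 180 degrees when |y1| - |y2| < 0,
    i.e. when the trimming region lies more on the negative-Y side."""
    return abs(y1) - abs(y2) < 0
```

For a region with y1 = 10 and y2 = -40 the predicate is true, so a trigger would be sent; for y1 = 40 and y2 = -10 it is false. When the two magnitudes are equal the expression is not satisfied and no rotation is performed.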

As described above, according to the third embodiment, the occurrence of image rotation can be determined even in a surveillance camera with the DPTZ function. Therefore, by using the LongTerm reference frame as the reference frame for inter-frame prediction at the timing when image rotation occurs, the same effect as the first embodiment can be obtained.

(Other Embodiments)
The present invention can also be realized by a process in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

Reference numerals: 100 video processing unit; 101 image rotation determination unit; 102 encoding processing control unit; 103 image processing unit; 104 encoding unit; 200 overall control unit; 201 PTZ setting unit; 300 camera; 301 camera control unit; 302 lens; 303 imaging unit

Claims

1. An image processing apparatus comprising:
encoding means for encoding frame images forming a moving image;
determination means for determining whether or not image rotation has occurred in the moving image with respect to a preceding frame image;
storage means for storing a frame image preceding the frame image for which the determination means determined that image rotation occurred; and
control means for controlling the encoding means so that, when the determination means determines that image rotation has occurred again in the moving image after determining that image rotation occurred, the frame image determined to have undergone the image rotation again is encoded using inter-frame prediction with the frame image stored in the storage means as a reference frame.

2. The image processing apparatus according to claim 1, wherein, when the control means determines that, as a result of the image rotation occurring again, the image orientation of the frame image determined to have undergone the image rotation again has become the same as the image orientation of the frame image stored in the storage means, the control means controls the encoding means to encode that frame image using inter-frame prediction with the frame image stored in the storage means as a reference frame.

3. The image processing apparatus according to claim 1, further comprising:
image pickup means for obtaining a moving image by image pickup;
shooting direction control means for controlling the shooting direction of the image pickup means; and
image processing means for rotating a frame image forming the moving image acquired by the image pickup means by a predetermined angle and outputting the frame image to the encoding means,
wherein the shooting direction control means is configured to provide direction information regarding the shooting direction of the image pickup means to the control means and the image processing means,
the image processing means determines, based on the direction information, whether to rotate the frame image corresponding to the direction information, and
the control means determines, based on the direction information, whether or not to encode the frame image corresponding to the direction information using inter-frame prediction with the frame image stored in the storage means as a reference frame.

4. The image processing apparatus according to claim 3, wherein
the image pickup means is a ceiling-suspended camera installed on the ceiling of a passage, and
the shooting direction control means controls the shooting direction of the image pickup means by a pan/tilt operation.

5. The image processing apparatus according to claim 4, wherein
the passage is a straight passage,
the shooting direction control means controls the shooting direction of the image pickup means along the straight passage by a tilt operation,
the image processing means determines to rotate the frame image corresponding to the direction information by 180° when the direction information indicates a vertically downward direction, and
the control means, when the direction information indicates a vertically downward direction, encodes the frame image corresponding to the direction information using inter-frame prediction with the frame image stored in the storage means as a reference frame.

6. The image processing apparatus according to claim 4, wherein
the passage is a straight passage,
the shooting direction control means is configured to control the shooting direction of the image pickup means according to given control content defining the shooting direction at each time,
the image processing means determines to rotate the frame image by 180° according to the given control content, and
the control means encodes a frame image according to the given control content using inter-frame prediction with the frame image stored in the storage means as a reference frame.

7. The image processing apparatus according to claim 6, further comprising setting means for setting the given control content.

8. The image processing apparatus according to claim 3, wherein
the image pickup means is a ceiling-suspended camera installed on the ceiling of a passage,
the image processing means is configured to further perform trimming processing and enlargement processing on the frame images forming the moving image acquired by the image pickup means, and
the shooting direction control means controls the shooting direction in a pseudo manner by causing the image processing means to perform image processing on the moving image acquired by the image pickup means.

9. The image processing apparatus according to claim 8, wherein
the passage is a straight passage,
the shooting direction control means controls the shooting direction in a pseudo manner along the straight passage by changing the trimming area that the image processing means applies to the moving image,
the image processing means determines whether to rotate the frame image based on the position coordinates of the trimming area in the moving image, and
the control means may encode the frame image, based on the position coordinates of the trimming area in the moving image, using inter-frame prediction with the frame image stored in the storage means as a reference frame.

10. A control method for an image processing apparatus having an encoding unit that encodes frame images forming a moving image, the method comprising:
a determination step of determining whether or not image rotation has occurred in the moving image with respect to a preceding frame image;
a storing step of storing, in a storage unit, a frame image preceding the frame image determined in the determination step to have undergone image rotation; and
a control step of controlling the encoding unit so that, when it is determined that image rotation has occurred again in the moving image after it was determined in the determination step that image rotation occurred, the frame image determined to have undergone the image rotation again is encoded using inter-frame prediction with the frame image stored in the storage unit as a reference frame.

11. A program for causing a computer to function as each means of the image processing apparatus according to claim 1.