JP7244926B2

JP7244926B2 - SURGERY IMAGE DISPLAY DEVICE, CONTROL METHOD FOR SURGICAL IMAGE DISPLAY DEVICE, AND PROGRAM

Info

Publication number: JP7244926B2
Application number: JP2019562838A
Authority: JP
Inventors: 大樹梶田; 英雄斎藤; 圭大石; 麻樹杉本; 翔萩原; 佳史高詰
Original assignee: Keio University
Current assignee: Keio University
Priority date: 2017-12-26
Filing date: 2018-11-15
Publication date: 2023-03-23
Anticipated expiration: 2038-11-15
Also published as: JPWO2019130889A1; WO2019130889A1

Description

The present invention relates to a system that captures video of a procedure, such as a surgical operation, with multi-view cameras and selects the video to be displayed on a display.

As a viewing interface that makes it easy to choose a viewpoint when viewing multi-view video content, a target-centered fixed viewing interface has already been proposed (for example, Non-Patent Document 1). However, that interface assumes that the whole body of a person performing a sport or a stage performance is viewed from every angle; it does not assume that the operations at an operator's hands are viewed from every angle. If there is no obstacle between the cameras and the subject, the subject appears in every video, so any video can be included among the viewpoint options without problem. When hand operations such as those in surgery are filmed, however, the operator's head or body can come between a camera and the hands and hide them. Such video has no informational value and should not be offered as an option in a multi-view viewing interface. If a viewer selects a video with no informational value, the viewpoint must be selected again.
In this specification, "operator" (practitioner) means a surgeon or any of the following.

Manufacturing: furniture makers, blade makers, metal engravers, charcoal burners, weaving technicians, glass product makers, ceramics engineers, cloisonné workers, brick and roof tile makers, foundry workers, candle makers, mold makers, etc.

Traditional crafts: glassblowers, doll makers, metalworkers, bamboo craftsmen, swordsmiths, wax craftsmen, Yuzen dyers, Buddhist altar and altar fittings craftsmen, Japanese umbrella makers, pyrotechnicians, calligraphy supplies craftsmen, Japanese paper (washi) makers, joiners, oshie hagoita (padded battledore) craftsmen, decorative metalworkers (kazari-shi), potters, onigawara (ridge-end tile) craftsmen, lacquer craftsmen, lantern, fan, and folding fan makers, mounting technicians, bow and arrow makers, mask makers (Noh, Kyogen, and Kagura masks), lacquerware workers, etc.

Construction and civil engineering: plasterers, carpenters, stonemasons, shrine and temple carpenters, painters, sheet metal workers, tatami makers, welders, electricians, landscapers (gardeners), etc.

Food: cooks, sushi chefs, Japanese confectioners, salt makers, tofu makers, bakers, sake brewers, soba makers, coffee roasters, miso makers, soy sauce makers, winemakers.

Services (fashion, medical): hairdressers, barbers, dry cleaners, shoemakers, kimono tailors, estheticians, total beauticians, kimono dressers, tanners, leather garment tailors, seitai therapists, judo therapists, chiropractors, anma massage and shiatsu practitioners, acupuncturists and moxibustion practitioners, magicians, etc.

Arts: painters, calligraphers, etc.

As described above, when multi-view video of hand operations such as surgery is viewed, the viewing interface should exclude from the options any video in which the hands are hidden by an obstacle.

Lighting is used in medical procedures such as surgery. A shadowless lamp (surgical light) is normally used so that the operator's body, in particular the head, arms, and hands, and the surgical instruments do not cast shadows on the affected area being treated.
A common shadowless lamp structure comprises a circular support base that is attached to the tip of an arm suspended from the ceiling and can be moved and tilted, a central light placed at the center of the support base, and a plurality of peripheral lights arranged around the central light and attached to the support base.
During a procedure, people who assist, for example by preparing the next surgical instrument, and people who follow the progress of the surgery from another room cannot observe the affected area directly, so they usually learn its condition from captured video.
A camera is used for this purpose, and it is often attached to the shadowless lamp because it is convenient for the camera to move and tilt together with the lamp. Japanese Patent Application Laid-Open No. 2000-102547 (Patent Document 1) and International Publication No. 2015/173851 (Patent Document 2) propose shadowless lamps equipped with a camera, in which the camera is placed close to the central light source of the lamp or inside the handle used to operate the lamp.
When a procedure is performed with a camera-equipped shadowless lamp of this structure, even if the affected area is illuminated, the operator's body, in particular the head, arms, and hands, or the surgical instruments may appear in the captured video while the affected area itself does not. This problem is described in detail below.

When the affected area being treated is filmed during surgery, view obstruction occurs: the affected area, which is the subject, is hidden by the operator leaning in to look, by movements of the hands and fingers, by surgical instruments, and so on. To film without view obstruction, the position, direction, and angle of view of the camera must be adjusted to the positional relationship, which changes constantly during surgery, between the affected area and the operator, surgical instruments, and other things that can obstruct filming. In practice, however, it is difficult to operate the camera immediately in response to changes in the positions of the affected area and the obstacles. The factors that cause this difficulty are listed below.

First, it is difficult to assign staff to operate the camera in the operating room. The operators, that is, the surgeons who take part as the lead surgeon or as assistants, and the scrub nurse who hands surgical instruments to the operators are concentrating on the progress of the surgery, so it is difficult for them to keep track of the captured video during the operation. A circulating nurse, who supplies additional instruments to the scrub nurse, is also present in the operating room and may operate the camera between other duties. Even so, it is difficult to operate the camera immediately every time the positions of the affected area and the obstacles change.

Second, there are situations in which it is difficult to change the position of the camera. In surgery, the affected area is disinfected, the patient lying on the operating table is covered with sterile drapes, the operators and the scrub nurse wear sterile gowns over their surgical clothing, and instrument tables holding sterilized surgical instruments are placed around them. When the affected area is filmed, the camera must not touch any of these sterile people or objects, so the camera can be placed only near the point directly above the affected area or farther away from the sterile people and objects. The shadowless lamp is also normally placed near the point directly above the affected area. When the camera is near the point directly above the affected area, the circulating nurse, who could reposition the camera, is not wearing a sterile gown and therefore has difficulty approaching the camera without touching sterile people or objects.

Furthermore, when the shadowless lamp is equipped with only one camera, as in Patent Documents 1 and 2, view obstruction can occur even though the affected area is illuminated, because an obstacle such as the operator's head or hand comes between the camera and the affected area. The camera position would then have to be changed to film the affected area around the obstacle. In both Patent Documents 1 and 2, however, the camera and the shadowless lamp share one housing, so changing the position or direction of the camera to follow the operator's movements also changes the position and direction of the lamp, and the affected area is no longer illuminated.

What these cases have in common is that, in the operating room, completing the surgery safely takes priority over capturing good video, and stable illumination of the affected area takes priority over camera operation. For these reasons, it is difficult for the cameras of Patent Documents 1 and 2 to film the affected area while avoiding view obstruction caused by obstacles.

In addition, when an obstacle obstructs the camera's view during filming, the operator is concentrating on the progress of the surgery and cannot watch the display at the same time, so the obstruction is difficult to notice and the surgery proceeds while the affected area is not being filmed.

On the other hand, the most important information in the surgical video needed in clinical practice is the content of the medical operations performed on the affected area. For viewers of the captured video to follow these operations accurately at all times, the affected area must be filmed while avoiding view obstruction by the operator's head, body, arms, hands, and so on.

JP 2000-102547 A (Patent Document 1)
WO 2015/173851 (Patent Document 2)
JP 2012-120812 A

Non-Patent Document 1: Kenji Mase, Shogo Tokai, Tetsuya Kawamoto, Toshiaki Fujii, "Considerations on the fixed viewing method of multi-view images and design of operation interface," Human Interface Society Research Report, Vol. 11, No. 1, pp. 7-12, March 10-11, 2009.

The present invention has been made in view of the needs and circumstances described above, and its object is to provide a technique for avoiding view obstruction as far as possible when surgical video is displayed.

To achieve this object, the present invention provides a surgical image display device comprising: an acquisition unit that acquires a plurality of video data captured of the same operative field from different viewpoints using a plurality of cameras; an evaluation unit that evaluates, for the image of each frame of each video data, an operative field exposure, which is the degree to which the operative field is visible in the image; a selection unit that selects, for each frame, video data from among the plurality of video data based on the operative field exposure of the image of each of the plurality of video data; and a video display unit that displays the image of the video data selected by the selection unit on a display device.
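As a rough illustration of the evaluation and selection units described above, the following Python sketch picks one camera per frame from precomputed exposure scores. It is only a sketch under stated assumptions: the function name `select_per_frame`, the representation of exposure as one float per frame, and the rule "choose the camera with the highest exposure" are all hypothetical, since the text above says only that the selection is based on the operative field exposure.

```python
from typing import Dict, List


def select_per_frame(exposure: Dict[str, List[float]]) -> List[str]:
    """Return, for each frame index, the ID of the camera to display.

    `exposure` maps a (hypothetical) camera ID to its per-frame operative
    field exposure scores. Assumption: higher score = operative field more
    visible, and the camera with the highest score is selected.
    """
    camera_ids = list(exposure)
    if not camera_ids:
        return []
    # Only frames available from every camera can be compared.
    frame_count = min(len(scores) for scores in exposure.values())
    return [
        # On a tie, max() keeps the first camera in dictionary order.
        max(camera_ids, key=lambda cam: exposure[cam][i])
        for i in range(frame_count)
    ]
```

A per-frame maximum like this can switch cameras very often; any smoothing of the switching is outside this sketch.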

According to the present invention, view obstruction can be avoided as far as possible when surgical video is displayed.

FIG. 1 is a diagram showing an example of the configuration of the first invention of the present application.
FIG. 2 is a flowchart showing an example of an algorithm for selecting a video in the first invention.
FIG. 3 is a flowchart showing an example of an algorithm for calculating the average point from the hand recognition regions in the first invention.
FIG. 4 is a diagram showing an example of recognition regions obtained by detecting hand regions contained in an image in the device of the first invention.
FIG. 5 is a diagram showing an example of the process, in the device of the first invention, of detecting the hand regions contained in an image, finding the center point of each region, calculating the average point from the average of the coordinates of these center points, and calculating the distance to the center point of the image.
FIG. 6 is a diagram showing an example of the same process in the device of the first invention.
FIG. 7 is a diagram showing an example of the same process in the device of the first invention.
FIG. 8 is a schematic diagram of a camera system according to an embodiment of the second invention of the present application.
FIG. 9 is a diagram showing a configuration in which the video to be displayed on the display device is selected from the videos of a plurality of cameras by an operating device.
FIG. 10 is a diagram showing a configuration in which the videos of a plurality of cameras are recorded simultaneously by a recording unit.
FIG. 11 is a diagram showing an embodiment in which a plurality of simultaneously captured videos are recorded with a common capture date and time and played back simultaneously.
FIG. 12 is a diagram showing an embodiment in which the tilting unit of a peripheral camera unit is tilted by rotating the operating handle.
FIG. 13 is a diagram showing an embodiment of the guide light projected onto the subject when the device of the present invention is equipped with a guide light irradiation device that indicates the direction of the camera video.
FIG. 14 shows an embodiment in which the device of the present invention is equipped with a microphone.
FIG. 15 is a diagram showing an embodiment in which a plurality of simultaneously acquired videos and audio are recorded with a common capture date and time and played back simultaneously.
FIG. 16 shows an embodiment in which the device of the present invention is equipped with a hemispherical camera.
FIG. 17 shows an embodiment of the device of the present invention in which the operating handle is provided with operating devices for the light source, the angle of view, autofocus, the recording unit, and the audio recording device, and in which the focal length operating device activates autofocus in response to input from the operating device of the monitor mechanism or to tilting of the unit base.
FIG. 18 is a diagram showing a configuration in which a plurality of recorded videos and audio are cut out over a common capture date and time range and output as a plurality of divided data.
FIG. 19 is a flowchart showing an example of an algorithm for switching the video when the display device recognizes a video obstruction.
FIG. 20 is a diagram showing a configuration in which the video to be displayed on the display device is selected by a distance measuring device and a CPU provided in the monitor mechanism.
FIG. 21 is a diagram showing a configuration in which the content of video data and audio data is automatically recognized to create character string data, and list data is created by adding character string tag information to capture time information.
FIG. 22 is a diagram showing a configuration example of a surgical image display device according to the third invention.
FIG. 23 is a flowchart showing an example of the processing of an information processing device according to the third invention.
FIGS. 24A and 24B are diagrams showing examples of images captured by a camera installed above the operative field.
FIG. 25 is a diagram showing an example of frame-by-frame changes in the operative field exposure and the switching of the displayed image.
FIG. 26 is a diagram showing an example of an image switching control algorithm that applies Dijkstra's algorithm.
FIG. 27 is a diagram showing another configuration example of the surgical image display device according to the third invention.

<First Invention>
In the first invention of the present application, an information processing device performs hand-targeted image recognition on image data captured by a plurality of video cameras and on moving image (video) data, which is a sequence of frames arranged in the time domain. By selecting videos in which hands appear, or by not selecting videos in which hands do not appear, the device pre-screens the videos shown on the video display unit before presenting them to the viewer.

With this configuration, the first invention provides a function that supports effective viewing of multi-view video of procedures such as surgery. Videos with no informational value, in which the hands do not appear because they are hidden by an obstacle or have moved out of the camera's angle of view, are removed from the options in advance, or videos with high informational value, in which many hands appear near the center of the screen, are selected and presented in advance. This is expected to reduce the stress a viewer experiences when choosing an appropriate video from many options.

An embodiment of the first invention is described in detail below with reference to the drawings. The drawings are schematic for convenience of explanation. The multi-view video viewing system according to the embodiment is not limited to the configuration shown below and can be applied to any configuration.

FIG. 1 shows an embodiment of the present invention in which three videos of a subject 10, captured by three video cameras consisting of one main camera 12 and two sub cameras 14, are sent to an image acquisition unit 22 of an information processing device 20, and the video selected by a control unit 24 is sent to a display device 40 via a video display unit 34. The number of video cameras is two or more, with no upper limit. Each camera is assigned a camera ID, and each camera sends image data stamped with the same time information to the image acquisition unit 22 together with its own camera ID. The image data sent from the control unit 24 to the video display unit 34 at a given time is, among the plurality of image data input to the image acquisition unit 22, image data that contains hand image data in which a hand was recognized by an image recognition unit 26. The image (video) selection unit 32 does not necessarily select only one image data to send to the video display unit 34; it may select several. The information processing device 20 and the display device 40 may be a PC (personal computer), a smartphone, or a tablet terminal. The processing and functions of the information processing device 20 described later may be implemented in software by loading a program stored non-transitorily in a storage device or storage medium (for example, a hard disk drive, solid-state drive, flash memory, or magnetic disk) into memory and executing it on a processor (for example, a CPU or GPU). Alternatively, some of the processing and functions of the information processing device 20 may be implemented in hardware such as an ASIC, or executed on another computer such as a cloud server.

FIG. 2 is a flowchart showing an example of the algorithm in the control unit 24. First, the videos captured in time synchronization by the video cameras are sent to the image acquisition unit 22 of the information processing device 20 and passed to the control unit 24. For each image data received, the image recognition unit 26 in the control unit 24 recognizes the recognition regions, that is, the regions of the image in which a hand appears in the frame at a given time (step S1; 52, 54, 56, and 58 in FIG. 4). Regarding image recognition, imaging devices such as digital cameras and digital video cameras have been proposed that detect a person's face or hand in an image and then track the detected face to keep it in focus, or execute predetermined processing according to the detected hand movement. Methods for recognizing a hand in an image include recognition based on color features and recognition based on shape features. When the target is surgical video, the operator wears prescribed surgical gloves, so color features are relatively easy to capture. To make recognition easier, the operator may wear surgical gloves of a special color (one clearly different from anything else in the camera's field of view), or markers for image recognition may be attached to the gloves. Shape-based methods include methods using HOG features and methods that recognize the structure formed by a palm and five fingers. A hand recognizer may also be generated by machine learning such as deep learning. Examples of image recognition technology usable for moving images include APIs (application programming interfaces) such as Cloud Video Intelligence provided by Google and Video Indexer provided by Microsoft. A pose detection algorithm such as OpenPose may also be used.
Image data obtained with thermography, which can display a temperature distribution visually as an image, or with a depth camera, which can acquire the distance to the subject as an image, may also be used as the target image for hand recognition.
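The color-feature approach to step S1 can be illustrated with a small pure-Python sketch. It is not the recognizer used in the embodiment: the function name `detect_glove_regions`, the reference glove color, the per-channel tolerance, and the minimum blob area are all hypothetical parameters chosen for illustration. The sketch marks pixels whose RGB value is close to the glove color, groups them into 4-connected blobs, and returns one rectangular recognition region per blob.

```python
from collections import deque
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]          # (R, G, B)
Box = Tuple[int, int, int, int]       # (x_min, y_min, x_max, y_max)


def detect_glove_regions(image: Sequence[Sequence[Pixel]], glove_rgb: Pixel,
                         tolerance: int = 40, min_area: int = 4) -> List[Box]:
    """Return bounding boxes of glove-colored blobs (assumed parameters)."""
    height = len(image)
    width = len(image[0]) if height else 0

    def is_glove(pixel: Pixel) -> bool:
        # A pixel matches when every channel is within `tolerance`.
        return all(abs(c - g) <= tolerance for c, g in zip(pixel, glove_rgb))

    visited = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for y in range(height):
        for x in range(width):
            if visited[y][x] or not is_glove(image[y][x]):
                continue
            # Breadth-first flood fill over one 4-connected blob.
            queue = deque([(x, y)])
            visited[y][x] = True
            x_min = x_max = x
            y_min = y_max = y
            area = 0
            while queue:
                cx, cy = queue.popleft()
                area += 1
                x_min, x_max = min(x_min, cx), max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                               (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not visited[ny][nx]
                            and is_glove(image[ny][nx])):
                        visited[ny][nx] = True
                        queue.append((nx, ny))
            if area >= min_area:      # drop isolated noise pixels
                boxes.append((x_min, y_min, x_max, y_max))
    return boxes
```

The number of returned boxes corresponds to the number of recognition regions, and each box can be passed on to the center-point calculation. A production system would more likely rely on a trained detector or an image processing library than on a fixed RGB threshold.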

Next, the counting unit 28 counts the number of recognition regions in each image data (step S2). The average point calculation unit 30 then calculates the coordinates of the average point from the recognition regions of each image data (step S3; 70 in FIG. 6). The method of calculating the average point is described later. As a result, each camera's image data for the same time carries the number of recognition regions and the coordinates of the average point. Based on these, the image (video) selection unit 32 first checks the maximum number of recognition regions (step S4). If no hand was recognized in any image data, that is, if the number of recognition regions is 0 for every image data, the image (video) selection unit 32 selects the main video, which is the video of the main camera 12 (step S9), and sends it to the video display unit 34. Through a viewer operation or a computer algorithm, any of the sub cameras 14 can be redefined as the main camera 12, in which case the former main camera 12 becomes a sub camera 14.

If the maximum number of recognition regions in step S4 is 1 or more, the image (video) selection unit 32 selects the image data whose number of recognition regions equals the maximum (step S5). It then checks whether more than one image data was selected (step S6). If only one image data has the maximum number of recognition regions, the image (video) selection unit 32 selects the video containing that image data and sends it to the video display unit 34. If two or more image data have the maximum number, the image (video) selection unit 32 calculates, for each of these images, the distance between the average point and the center point of the image (90 in FIG. 7), selects the video containing the image data with the smallest distance, and sends it to the video display unit 34.
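The selection flow of FIG. 2 (steps S4 to S6 and S9) can be sketched in Python as follows. This is a minimal sketch, not the embodiment's implementation: the `FrameInfo` record, its field names, and `select_camera` are hypothetical, and the region counts and average points are assumed to have been computed already in steps S2 and S3.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class FrameInfo:
    """Per-camera data for one time-synchronized frame (hypothetical record)."""
    camera_id: str
    region_count: int                              # result of step S2
    average_point: Optional[Tuple[float, float]]   # result of step S3, None if no hand
    image_size: Tuple[int, int]                    # (width, height) in pixels


def select_camera(frames: Sequence[FrameInfo], main_camera_id: str) -> str:
    """Return the camera ID whose video should be displayed for this frame."""
    # Step S4: find the maximum number of recognition regions.
    max_count = max((f.region_count for f in frames), default=0)
    if max_count == 0:
        # Step S9: no hand recognized anywhere, fall back to the main camera.
        return main_camera_id
    # Step S5: keep the image data with the maximum number of regions.
    candidates = [f for f in frames if f.region_count == max_count]
    # Step S6: a single candidate is selected directly.
    if len(candidates) == 1:
        return candidates[0].camera_id

    def distance_to_center(f: FrameInfo) -> float:
        # Distance between the average point and the image center.
        center_x, center_y = f.image_size[0] / 2, f.image_size[1] / 2
        avg_x, avg_y = f.average_point
        return math.hypot(avg_x - center_x, avg_y - center_y)

    # Several candidates: choose the one whose average point is closest to
    # the image center. On an exact tie, min() keeps the first candidate,
    # which is an assumption of this sketch.
    return min(candidates, key=distance_to_center).camera_id
```

For example, with two sub cameras that each see two hands, the one whose hands are grouped nearer the middle of its image is chosen over the main camera that sees only one hand.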

図３は平均点算出部３０において、各画像データの認識領域から平均点の座標を算出するアルゴリズムの例を示すフローチャートである。まず、平均点算出部３０は各画像データについてカウント部２８で得られた認識領域の数を確認する（ステップＳ３１）。認識領域の数が０、すなわち手が認識されなかった画像データについては、平均点は算出されない（ステップＳ３６）。認識領域の数が１以上であれば、平均点算出部３０は画像認識部２６で検出された認識領域の座標情報から、それぞれの領域の中心点の座標を算出する（ステップＳ３２。図５に領域を矩形で認識する場合の例を示す。この場合、中心点は矩形の４つの角の座標の平均となる座標に等しい）。次に、認識領域の数が２以上であるかどうかを確認する（ステップＳ３３）。認識領域の数が１の場合は、この領域の中心点が、このカメラ画像の平均点となる（ステップＳ３５）。認識領域の数が２以上の場合は、各認識領域の中心点のすべて座標から、平均値を算出し、これを平均点の座標と定義する（ステップＳ３４、図６の７０）。 FIG. 3 is a flowchart showing an example of the algorithm by which the average point calculation unit 30 calculates the coordinates of the average point from the recognition regions of each image data. First, the average point calculation unit 30 checks, for each image data, the number of recognition regions obtained by the counting unit 28 (step S31). No average point is calculated for image data in which the number of recognition regions is 0, that is, in which no hand was recognized (step S36). If the number of recognition regions is 1 or more, the average point calculation unit 30 calculates the coordinates of the center point of each region from the coordinate information of the recognition regions detected by the image recognition unit 26 (step S32; FIG. 5 shows an example in which the regions are recognized as rectangles, in which case the center point equals the average of the coordinates of the four corners of the rectangle). Next, it checks whether the number of recognition regions is 2 or more (step S33). If the number of recognition regions is 1, the center point of that region is the average point of the camera image (step S35). If the number of recognition regions is 2 or more, the average of the coordinates of all the region center points is calculated and defined as the coordinates of the average point (step S34; 70 in FIG. 6).
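As a rough sketch of steps S31 to S36, the average point for one frame could be computed as below. The rectangle layout `(x_min, y_min, x_max, y_max)` and the names `rect_center` and `average_point` are assumptions made for illustration; the source does not specify a data format.

```python
from typing import List, Optional, Tuple

# Hypothetical layout for an axis-aligned recognition region.
Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Point = Tuple[float, float]


def rect_center(rect: Rect) -> Point:
    """Center of a rectangle, equal to the mean of its four corner coordinates."""
    x_min, y_min, x_max, y_max = rect
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def average_point(regions: List[Rect]) -> Optional[Point]:
    """Average point of one frame, or None when no hand was recognized."""
    if not regions:                                # S31 -> S36: nothing to average
        return None
    centers = [rect_center(r) for r in regions]    # S32: one center per region
    if len(centers) == 1:                          # S33 -> S35: single region
        return centers[0]
    n = len(centers)                               # S34: mean of all centers
    return (sum(x for x, _ in centers) / n,
            sum(y for _, y in centers) / n)
```

For two regions `(0, 0, 10, 10)` and `(10, 10, 30, 30)`, the centers are `(5, 5)` and `(20, 20)`, so the average point is `(12.5, 12.5)`.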

なお図２、図３に示したアルゴリズムは一例であって、それぞれのステップの順が入れ替わったり、ステップが追加・削除されたりしても、期待した結果、すなわち、手が認識されない映像が選択されなかったり、手が多く含まれている映像や、手が映像の中心を囲むように配置されている映像が選択されるようなアルゴリズムであれば、本願発明の範囲に含まれ得る。 Note that the algorithms shown in FIGS. 2 and 3 are only examples. Even if the order of the steps is changed or steps are added or deleted, any algorithm that produces the expected result, namely that videos in which no hand is recognized are not selected and that videos containing many hands, or videos in which the hands are arranged around the center of the image, are selected, can fall within the scope of the present invention.

また、画像取得部２２に送られた映像について、制御部２４が画像認識から映像の選択を行う処理を行う対象は、全てのフレームの画像データである必要はなく、処理が行われる時間の間隔は、５秒ごと、１０秒ごとなど、視聴者の選択やコンピューターのアルゴリズムによって、調整されうる。 In addition, for the video sent to the image acquisition unit 22, the control unit 24 need not perform the processing from image recognition through video selection on the image data of every frame; the interval at which the processing is performed, such as every 5 seconds or every 10 seconds, can be adjusted by viewer selection or by a computer algorithm.
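One simple way to realize such an interval is to run recognition and selection only on every N-th frame. The sketch below assumes a constant frame rate; the function name and parameters are hypothetical.

```python
from typing import List


def frames_to_process(total_frames: int, fps: float,
                      interval_seconds: float) -> List[int]:
    """Indices of the frames on which recognition and selection are run.

    Assumption: the video has a constant frame rate `fps`.
    """
    # Convert the interval to a whole number of frames, at least one.
    step = max(1, round(fps * interval_seconds))
    return list(range(0, total_frames, step))
```

For a 30-second clip at 30 frames per second with a 5-second interval, this yields frames 0, 150, 300, 450, 600, and 750; the view chosen at each of these frames would presumably stay in effect until the next one.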

図４は画像認識部２６によって、手が認識された領域である認識領域５２、５４、５６、５８の一例を示す図である。この図では該認識領域が矩形である場合を示しているが、認識領域の形状は、楕円や多角形など、どのような形状でもよい。また、分かりやすいように画面に矩形を示しているが、実際のユーザー・インターフェースとしては、矩形は前記ディスプレー装置に表示されなくてもよく、内部的にコンピューター処理が行われていればよい。また、例えば手術を撮影の対象とする場合には、手を治療の対象とする場合もあり、この場合には治療対象の手を認識領域に含めることは望ましくない。よって、画像認識手段においては、手袋を装着した手と、手袋を装着しない手を区別して認識できるものを採用してもよい。 FIG. 4 is a diagram showing an example of recognition regions 52, 54, 56, and 58, which are the regions in which hands are recognized by the image recognition unit 26. Although this figure shows the case where the recognition regions are rectangular, a recognition region may have any shape, such as an ellipse or a polygon. The rectangles are drawn on the screen for clarity, but in an actual user interface they need not be displayed on the display device as long as the processing is performed internally by the computer. In addition, when surgery is being filmed, for example, the hand itself may be the target of treatment, in which case it is undesirable to include the treated hand in the recognition regions. The image recognition means may therefore be one that can distinguish gloved hands from ungloved hands.

図５は平均点算出部３０によって、前記認識領域５２、５４、５６、５８の中心点６２、６４、６６、６８を求めた例を示す図である。該認識領域５２、５４、５６、５８が矩形である場合、該中心点６２、６４、６６、６８は矩形の４つの角の座標を平均して求める場合や、対角線の交点を求める場合がある。該中心点の求め方には他にも多くの場合があるが、結果として該中心点と同一もしくは近似される点が算出されるのであれば、どのような方法をとってもよい。また、認識領域が矩形でない場合には、認識領域の面積を算出して、この重心となるような点を中心点とするような方法をとってもよい。 FIG. 5 is a diagram showing an example in which the average point calculation unit 30 has obtained the center points 62, 64, 66, and 68 of the recognition regions 52, 54, 56, and 58. When the recognition regions 52, 54, 56, and 58 are rectangular, the center points 62, 64, 66, and 68 may be obtained by averaging the coordinates of the four corners of each rectangle or by finding the intersection of its diagonals. There are many other ways to obtain the center point, and any method may be used as long as it yields a point identical or close to the center point. When a recognition region is not rectangular, the area of the region may be calculated and its centroid used as the center point.
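For a non-rectangular region, one standard way to obtain an area-based centroid is the shoelace formula for a simple polygon. The source does not name a specific method, so the following is only an illustrative sketch, assuming the region is given as an ordered list of polygon vertices.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_centroid(vertices: List[Point]) -> Point:
    """Area centroid of a simple (non-self-intersecting) polygon.

    Uses the shoelace formula; vertices must be listed in order around
    the boundary, either clockwise or counter-clockwise.
    """
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]   # wrap around to close the polygon
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    # Centroid = sum / (6 * area), and 6 * area == 3 * twice_area.
    return (cx / (3.0 * twice_area), cy / (3.0 * twice_area))
```

For a rectangle this agrees with the corner average; for the triangle `(0, 0), (6, 0), (0, 3)` it gives `(2, 1)`.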

図６は前記平均点算出部３０によって、前記中心点６２、６４、６６、６８の平均点７０を求めた例を示す図である。該平均点７０の座標は、前記中心点６２、６４、６６、６８について、それぞれの座標の値から平均値を算出して得られた座標として算出すればよい。 FIG. 6 is a diagram showing an example in which the average point calculation unit 30 has obtained the average point 70 of the center points 62, 64, 66, and 68. The coordinates of the average point 70 may be calculated by averaging the respective coordinate values of the center points 62, 64, 66, and 68.

図７は前記平均点算出部３０によって算出された前記平均点７０の座標から、画像（映像）選択部３２によって該平均点７０と映像の中心の点８０との距離を求めた例を示す図である。なお、この図における距離とはｘ座標とｙ座標の両者を考慮して算出したものを示しているが、例えば、ｘ座標もしくはｙ座標のいずれか一方を利用して算出してもよい。また、該平均点７０と映像の中心の点８０との距離９０を求めるのは、上記の構成のように画像（映像）選択部３２で行ってもよいし、前記平均点算出部３０で予め該平均点７０と映像の中心点８０との距離９０を算出してもよく、この場合には、該距離９０の値を前記画像（映像）選択部３２に受け渡して、映像の選択に利用してもよい。 FIG. 7 is a diagram showing an example in which the image (video) selection unit 32 has obtained the distance between the average point 70 and the center point 80 of the image from the coordinates of the average point 70 calculated by the average point calculation unit 30. The distance shown in this figure is calculated from both the x-coordinate and the y-coordinate, but it may instead be calculated from only the x-coordinate or only the y-coordinate. The distance 90 between the average point 70 and the center point 80 of the image may be obtained by the image (video) selection unit 32, as in the configuration above, or may be calculated in advance by the average point calculation unit 30; in the latter case, the value of the distance 90 may be passed to the image (video) selection unit 32 and used for selecting the video.
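The distance variants mentioned here (both coordinates, or only one of them) might be expressed as in the sketch below. The `axes` parameter and the function name are hypothetical, and Euclidean distance is assumed for the two-coordinate case.

```python
from math import hypot
from typing import Tuple

Point = Tuple[float, float]


def center_distance(average_point: Point, image_center: Point,
                    axes: str = "xy") -> float:
    """Distance between the average point and the image center.

    axes="xy" uses both coordinates (Euclidean distance, an assumption),
    while "x" or "y" uses only the named coordinate.
    """
    dx = average_point[0] - image_center[0]
    dy = average_point[1] - image_center[1]
    if axes == "xy":
        return hypot(dx, dy)
    if axes == "x":
        return abs(dx)
    if axes == "y":
        return abs(dy)
    raise ValueError("axes must be 'xy', 'x', or 'y'")
```

Because the value depends only on the average point and the image center, it can be computed either where the average point is produced or where the selection is made.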

第１の発明は、以下のように捉えることができる。
（１）施術者の手により、対象物を処理する際の状態を示す多視点の複数の映像データを得、該映像データを処理しディスプレー装置で表示する多視点映像視聴システムにおいて、
異なった視点で、対象物を処理する施術者の手を同時並行して撮影し、連続する多数の画像データからなる映像データを出力する複数のビデオカメラと、
前記複数のビデオカメラで同時に撮影した複数の映像データを受け取り、処理する情報処理装置を備え、
前記情報処理装置は、
多重受信した複数の映像データについて、個々の画像データを探知し、個々の画像データ中の手に関する手画像データを認識する画像認識部と、
前記複数の画像データの中から、時系列に、前記画像認識部によって手画像データが認識された画像データを選択する画像選択部、および
前記画像選択部により時系列に選択された画像データを前記ディスプレー装置に送信する映像表示部
を備えていることを特徴とする多視点映像視聴システム。
（２）前記映像表示部は、前記画像選択部により時系列に選択された画像データを繋ぎ合わせる編集を行い、前記ディスプレー装置に出力すべき１つの編集映像データを作成し、この編集映像データを前記ディスプレー装置に送信する（１）に記載の多視点映像視聴システム。
（３）前記情報処理装置はカウント部を備え、個々の画像データについて、前記画像認識部によって手が認識された手画像データの領域である認識領域の数を数え、前記画像選択部に送り、
前記画像選択部は前記認識領域の数が多い画像データを選択して前記映像表示部に送る、（１）又は（２）に記載の多視点映像視聴システム。
（４）前記複数のビデオカメラは、１台の主カメラと１台以上の副カメラからなり、
前記情報処理装置はカウント部を備え、個々の画像データについて、前記画像認識部によって手が認識された手画像データの領域である認識領域の数を数え、前記画像選択部に送り、
前記画像選択部はいずれの映像においても前記認識領域の数が０の場合に、主カメラの画像データを選択して前記映像表示部に送る、（１）～（３）のいずれかに記載の多視点映像視聴システム。
（５）前記多視点映像視聴システムは、任意の前記ビデオカメラを前記主カメラと定義することができる、（４）に記載の多視点映像視聴システム。
（６）前記画像認識部は手を認識した前記認識領域の表示された位置と画像における領域を出力し、
前記情報処理装置は平均点算出部を備え、前記画像認識部が出力した１つ以上の前記認識領域の中心点から該中心点の平均点を算出し、前記画像選択部に送り、
前記画像選択部は前記平均点が画像の中心点に近い画像データを選択して前記映像表示部に送る、（１）～（５）のいずれかに記載の多視点映像視聴システム。
（７）前記画像認識部は、手袋を装着した手の画像データのみを認識し、手袋を装着していない手の画像データは認識しない（１）～（６）のいずれかの多視点映像視聴システム。
（８）施術者の手により、対象物を処理する際の状態を示す多視点の複数の映像データを処理し、ディスプレー装置で表示する映像を選択する方法において、
多重受信した複数の映像データについて、個々の画像データを探知し、個々の画像データ中の手に関する手画像データを認識する画像認識ステップと、
前記複数の画像データの中から、時系列に、前記画像認識ステップによって手画像データが認識された画像データを選択する画像選択ステップと、を有することを特徴とする方法。
（９）（８）に記載の方法が、
前記個々の画像データについて、前記画像認識ステップによって手が認識された手画像データの領域である認識領域の数を数えるカウントステップを有し、
前記画像選択ステップが、前記個々の画像データにおける前記認識領域の数が多い画像データを選択する方法。
（１０）前記複数の映像データは、１つの主映像データと１つ以上の副映像データからなり、
前記画像選択ステップは、いずれの映像データの同時刻における個々の画像データにおいても前記認識領域の数が０の場合に、主映像データの画像データを選択する、（９）に記載の方法。
（１１）（８）～（１０）のいずれかに記載の方法が、
前記画像認識ステップにおいて手を認識した前記認識領域の表示された位置と画像における領域を出力し、
前記画像認識ステップが出力した１つ以上の前記認識領域の中心点から該中心点の平均点を算出し、
前記画像選択ステップに送る平均点算出ステップを有し、
前記画像選択ステップは前記平均点が画像の中心点に近い画像データを選択する方法。
（１２）前記画像認識ステップは、手袋を装着した手の画像データのみを認識し、手袋を装着していない手の画像データは認識しない（８）～（１１）のいずれかの方法。
（１３）（８）～（１２）のいずれかに記載の方法の各ステップを、情報処理装置もしくは画像認識部に実行させるコンピュータープログラム。
（１４）（１３）に記載のコンピュータープログラムを記録したコンピューター読み取り可能な記録媒体。
The first invention can be understood as follows.
(1) A multi-view video viewing system that obtains a plurality of multi-viewpoint video data showing the state of an object being processed by a practitioner's hands, processes the video data, and displays it on a display device, the system comprising:
a plurality of video cameras that simultaneously capture, from different viewpoints, the hands of the practitioner processing the object, and that output video data consisting of a large number of consecutive image data; and
an information processing device that receives and processes the plurality of video data captured simultaneously by the plurality of video cameras,
wherein the information processing device comprises:
an image recognition unit that detects individual image data in the plurality of video data received in multiplex and recognizes hand image data relating to hands in the individual image data;
an image selection unit that selects, in time series, from among the plurality of image data, the image data in which hand image data has been recognized by the image recognition unit; and
a video display unit that transmits the image data selected in time series by the image selection unit to the display device.
(2) The multi-view video viewing system according to (1), wherein the video display unit edits the image data selected in time series by the image selection unit by joining it together, creates one piece of edited video data to be output to the display device, and transmits the edited video data to the display device.
(3) the information processing device includes a counting unit, counts the number of recognition regions, which are regions of the hand image data in which hands are recognized by the image recognition unit, for each image data, and sends the number to the image selection unit;
The multi-view video viewing system according to (1) or (2), wherein the image selection unit selects image data with a large number of recognition regions and sends the image data to the video display unit.
(4) The multi-view video viewing system according to any one of (1) to (3), wherein the plurality of video cameras consist of one main camera and one or more secondary cameras,
the information processing device includes a counting unit that counts, for each image data, the number of recognition regions, which are regions of hand image data in which a hand is recognized by the image recognition unit, and sends the number to the image selection unit, and
the image selection unit selects the image data of the main camera and sends it to the video display unit when the number of recognition regions is 0 in every video.
(5) The multi-view video viewing system according to (4), wherein the multi-view video viewing system can define any video camera as the main camera.
(6) the image recognition unit outputs the displayed position of the recognition area in which the hand is recognized and the area in the image;
The information processing device includes an average point calculation unit, calculates an average point of the center points from the center points of the one or more recognition regions output by the image recognition unit, and sends the average point to the image selection unit,
The multi-view video viewing system according to any one of (1) to (5), wherein the image selection unit selects image data in which the average point is close to the center point of the image and sends the image data to the video display unit.
(7) The multi-view video viewing system according to any one of (1) to (6), wherein the image recognition unit recognizes only image data of gloved hands and does not recognize image data of ungloved hands.
(8) In the method of processing a plurality of multi-viewpoint image data showing the state of the object being processed by the operator's hands and selecting the image to be displayed on the display device,
an image recognition step of detecting individual image data of a plurality of video data multiplexed and recognizing hand image data relating to hands in the individual image data;
and an image selection step of selecting, from among the plurality of image data, image data in which hand image data has been recognized by the image recognition step in time series.
(9) The method according to (8), further comprising
a counting step of counting, for each of the individual image data, the number of recognition regions, which are regions of hand image data in which a hand is recognized by the image recognition step,
wherein the image selection step selects image data having a large number of the recognition regions among the individual image data.
(10) The method according to (9), wherein the plurality of video data consist of one main video data and one or more sub video data, and
the image selection step selects the image data of the main video data when the number of recognition regions is 0 in the individual image data at the same time point in every video data.
(11) The method according to any one of (8) to (10), wherein
the image recognition step outputs the displayed position and the area in the image of each recognition region in which a hand is recognized,
the method further comprises an average point calculation step of calculating, from the center points of the one or more recognition regions output by the image recognition step, the average point of those center points
and passing it to the image selection step, and
the image selection step selects image data in which the average point is close to the center point of the image.
(12) The method according to any one of (8) to (11), wherein the image recognition step recognizes only the image data of the gloved hand and does not recognize the image data of the ungloved hand.
(13) A computer program that causes an information processing device or an image recognition unit to execute each step of the method according to any one of (8) to (12).
(14) A computer-readable recording medium recording the computer program according to (13).

＜第２の発明＞
次に、本願の第２の発明について説明する。第２の発明の目的は、施術者が撮影を意識せず、手術に集中しながらにして、治療対象患部の撮影を完遂することができ、撮影された映像のいずれかには、高確率に治療対象患部が映っているカメラシステムを提供することにある。
<Second invention>
Next, the second invention of the present application will be described. The object of the second invention is to provide a camera system with which the practitioner can complete the imaging of the affected area to be treated while concentrating on the surgery, without being conscious of the imaging, and in which at least one of the captured videos shows the affected area to be treated with high probability.

本願の第２の発明は、カメラと複数の光源で構成されるカメラユニットについて、光源による照明の範囲がカメラで撮影される範囲に収まるものとし、このカメラユニットを複数配置して無影灯を構成することで、無影灯の複数の光源の集合の内部に複数のカメラが分散して配置されたものとし、治療対象患部にいずれの光源が当たっていた場合においても、同時にこの光源の近傍に位置するカメラによって撮影が完遂される状況を実現する。この状況においては、施術者は撮影の状況を意識することなく、治療対象患部への安定した照明を保ちながら、カメラの位置を変化させずとも、いずれかのカメラにおいて、治療対象患部の撮影を完遂することが可能である。また治療対象患部への安定した照明を改善するために無影灯が移動した場合においても、治療対象患部の撮影はいずれかのカメラにおいて達成される。 In the second invention of the present application, each camera unit is composed of a camera and a plurality of light sources such that the range illuminated by the light sources falls within the range captured by the camera, and a plurality of these camera units are arranged to form a shadowless light. A plurality of cameras are thus distributed within the set of light sources of the shadowless light, so that whichever light source illuminates the affected area to be treated, imaging is accomplished at the same time by the camera located near that light source. In this situation, the practitioner can, without being conscious of the imaging, keep stable illumination on the affected area and have it imaged by one of the cameras without changing the camera positions. Even when the shadowless light is moved to improve the illumination of the affected area, imaging of the affected area is still achieved by one of the cameras.

本願の第２の発明は、上記目的を達成するため手術などの医療に際し治療対象患部を撮影する無影灯付きカメラ機構、ならび該カメラ機構で撮影された映像を表示するモニター機構を備えたカメラシステムであり、該無影灯付きカメラ機構は、全体の形態は手術に際して一般に用いられる無影灯と同様であり、複数の光源を備え、一般的な無影灯と同様に、移動および傾斜が可能な全体基体と、該全体基体のほぼ中央に滅菌のカバーを装着できる操作ハンドルを備え、該全体基体の中央部に中央カメラユニット、該中央カメラユニットの周囲に少なくとも１つの周囲カメラユニットを備えることを特徴とする。全体的な形態や構成は一般的な無影灯と同様であり、施術者は一般的な無影灯を操作する感覚で治療対象患部に照明を当てることが可能である。 To achieve the above object, the second invention of the present application is a camera system comprising a camera mechanism with shadowless light that images the affected area to be treated during medical care such as surgery, and a monitor mechanism that displays the video captured by the camera mechanism. The camera mechanism with shadowless light has the same overall form as a shadowless light commonly used in surgery and includes a plurality of light sources; like a general shadowless light, it includes an overall base that can be moved and tilted and, approximately at the center of the overall base, an operating handle to which a sterile cover can be attached. It is characterized by including a central camera unit at the center of the overall base and at least one peripheral camera unit around the central camera unit. Because the overall form and configuration are the same as those of a general shadowless light, the practitioner can illuminate the affected area to be treated just as when operating a general shadowless light.

このとき、前記周囲カメラユニットは、それぞれに備える複数の光源の近傍に周囲カメラを備えており、光源の照明の方向と該周囲カメラの撮影の方向は一致しているため、複数の光源のうちいずれかの照明が治療対象患部に届いている場合には、該患部に到達した光源の近傍に位置する周囲カメラからは、該患部が撮影されている可能性が高い。逆に、照明が治療対象患部に到達していない場合には、撮影されている映像にも該患部は写っていない可能性が高くなるが、該患部に照明が到達していない状況では手術の実施が困難であるので、施術者は前記操作ハンドルを把持して該無影灯付きカメラ機構の前記全体基体を移動および傾動させ、該患部に照明が届くようにすることは、一般的な無影灯における操作と同様であり、この操作によって、該患部がいずれかの光源の近傍のカメラによって撮影される状況に復帰することが可能となる。この操作の間、施術者は撮影の状況を気に留める必要はない。なお、前記周囲カメラユニットには光源が必須であるが、前記中央カメラユニットに光源を設置するかどうかは任意である。 Here, each peripheral camera unit has a peripheral camera near its plurality of light sources, and the illumination direction of the light sources coincides with the imaging direction of the peripheral camera. Therefore, when illumination from any of the light sources reaches the affected area to be treated, the affected area is likely to be captured by the peripheral camera located near the light source whose light reaches it. Conversely, when the illumination does not reach the affected area, the affected area is likely to be missing from the captured video as well. However, surgery is difficult to perform when the affected area is not illuminated, so the practitioner grips the operating handle and moves and tilts the overall base of the camera mechanism with shadowless light so that the illumination reaches the affected area, just as with a general shadowless light. This operation restores the situation in which the affected area is captured by a camera near one of the light sources. During this operation, the practitioner does not need to pay attention to the imaging. Note that a light source is essential for the peripheral camera units, whereas providing a light source in the central camera unit is optional.

前記モニター機構には複数のカメラの映像が入力され、ディスプレー装置に表示される映像は初期設定では中央カメラのものが望ましいが、手術の状況によっては周囲カメラの映像のほうが多くの情報を有する場合があり、該モニター機構に備える操作装置を操作することで、該ディスプレー装置の映像を切り替えたり、複数の映像を同時に表示したりすることができる。 Images from a plurality of cameras are input to the monitor mechanism, and the image displayed on the display device is preferably that of the central camera in the initial setting, but depending on the surgical situation, the images of the surrounding cameras may have more information. By operating an operation device provided in the monitor mechanism, it is possible to switch images on the display device or display a plurality of images at the same time.

本願の第２の発明は、上記のような構成により、手術などの医療に関わる動画の効果的な活用を支援する機能を提供するものであり、
・施術者は撮影を意識することなく、従来の無影灯を操作するのと同じ感覚で手術を行うのみで、治療対象患部の撮影を完遂できる。
・撮影のための人員を配置する必要がない。
・障害物によるカメラの視野障害が生じるときには、同時に光源の照明も治療対象患部に届かなくなるため、施術者が視野障害の発生に気付くことができ、照明を当てるために無影灯付きカメラ機構を移動、傾動させることで、視野障害も改善させることができる。
・ディスプレーに、より確実に治療対象患部の映像を表示することが可能となり、手術に関わる外回り看護師、麻酔科医、手術の見学者である医師や学生、手術室の管理者などのスタッフが、より正確な手術の進行状況を、実際に手術が行われている現場を施術者の背後から覗き込まなくても把握でき、手術室の効率的な運営や、汚染の減少による創部感染の発生数の減少などが期待される。
・その他、医療の現場において、手術を含めた医療処置を撮影した映像の用途については、医療事故の原因の究明や、患者へのより具体的な説明による満足度の向上、外科医や若手医療スタッフの教育の改善などの効果が期待できる。
The second invention of the present application provides a function to support effective utilization of moving images related to medical care such as surgery with the above configuration,
・The operator can complete imaging of the affected area to be treated simply by performing the operation with the same feeling as operating a conventional operating light without being conscious of imaging.
・There is no need to allocate personnel for filming.
・When an obstacle obstructs the camera's field of view, the illumination from the light source also fails to reach the affected area at the same time, so the practitioner can notice the obstruction; moving and tilting the camera mechanism with shadowless light to restore the illumination also resolves the field-of-view obstruction.
・The video of the affected area can be shown on the display more reliably, so staff such as circulating nurses involved in the surgery, anesthesiologists, doctors and students observing the surgery, and operating room administrators can grasp the progress of the surgery more accurately without looking over the practitioner's shoulder at the actual surgical site; this is expected to lead to more efficient operation of the operating room and fewer wound infections owing to reduced contamination.
・In addition, video of medical procedures, including surgery, can be used in medical settings to investigate the causes of medical accidents, to improve patient satisfaction through more concrete explanations, and to improve the education of surgeons and young medical staff.

なお歯科医療の領域においては、特開２０１２－１２０８１２号公報（特許文献３）に、複数のビデオカメラによる多視点映像を出力するビデオカメラシステムに無影灯を組み合わせる手段が提案されているが、単純にこの手段を、全身を扱う医科の手術に流用しても、良好な撮影は達成し得ない。 In the field of dental care, Japanese Patent Application Laid-Open No. 2012-120812 (Patent Document 3) proposes means for combining a shadowless light with a video camera system that outputs multi-viewpoint images from a plurality of video cameras. Even if this means is simply applied to a medical operation that treats the whole body, good imaging cannot be achieved.

まず、歯科における治療対象患部を多視点から撮影するためには、口腔の入り口である口裂や、上下の切歯の間を通して、少なくとも大臼歯までの深さの構造を複数のカメラで撮影できるように、ビデオカメラを配置する必要がある。口裂は成人男性でも６ｃｍ大程度までの広さであって、上下の切歯間の距離は５ｃｍ程度までであり、第三大臼歯までの深さも６ｃｍ程度である。歯科における診察の体勢で歯科医が無影灯を操作するためには、無影灯は高くても患者から６０～７０ｃｍの位置にあるので、この距離から口裂や切歯間を通して第三大臼歯を複数の視点から撮影するためには、カメラ間の距離は１５ｃｍ程度以内に配置する必要がある。もちろん、小児や女性などで口が小さい場合には、カメラはより近い位置に配置されていなければならない。これでは、手術の際に視野障害を生じうる施術者の頭の幅よりも狭い距離しかカメラが離れておらず、医科の手術において治療対象患部を撮影することは困難である。医科の手術では、歯科の診療よりも広い身体範囲を治療対象とし、治療に携わるスタッフの人数も多いことから、治療対象患部に良好な照明を当てるためには、無影灯は少なくとも５０ｃｍ大程度の広さに光源を備える必要がある。カメラと光源の位置関係に留意しなければ、この無影灯に複数のカメラを備えたところで、前述したように、治療対象患部に照明は当たっているものの、カメラでは撮影されていない状況が生じる。この状況に気付くためには、第三者がディスプレーを監視する必要があり、撮影が達成されていない状況から復帰するためには、やはり無影灯ごとカメラの位置、方向を操作する必要があり、これは治療の進行に支障を生じ、患者に危害を生じるおそれもある。 First of all, in order to photograph the affected area to be treated in dentistry from multiple viewpoints, multiple cameras can photograph the cleft, which is the entrance to the oral cavity, and the structures at least down to the molars through the space between the upper and lower incisors. so you need to place a video camera. Even in adult males, the cleft is about 6 cm wide, the distance between the upper and lower incisors is about 5 cm, and the depth to the third molar is about 6 cm. In order for the dentist to operate the operating light in the position of examination in dentistry, the operating light should be at a distance of at most 60 to 70 cm from the patient. In order to photograph molar teeth from a plurality of viewpoints, it is necessary to arrange the cameras within a distance of about 15 cm. Of course, if the mouth is small, such as a child or a woman, the camera should be placed at a closer position. In this case, the distance between the cameras is narrower than the width of the operator's head, which may cause visual disturbance during surgery, and it is difficult to photograph the affected area to be treated in medical surgery. 
Medical surgery treats a wider range of the body than dental care and involves more staff, so to illuminate the affected area well the shadowless light needs light sources spread over an area at least about 50 cm across. Unless attention is paid to the positional relationship between the cameras and the light sources, equipping such a shadowless light with multiple cameras leads, as described above, to situations in which the affected area is illuminated but is not captured by any camera. Noticing this situation requires a third party to monitor the display, and recovering from it requires adjusting the position and direction of the cameras together with the shadowless light, which disrupts the progress of treatment and may harm the patient.

次に、本願の第２の発明を具体化した実施形態を、以下図８以降を参照して詳細に説明する。図は説明の都合上模式的に描いてある。 Next, an embodiment embodying the second invention of the present application will be described in detail below with reference to FIG. 8 and subsequent drawings. The figures are drawn schematically for convenience of explanation.

図８はカメラシステム１１０が１台の中央カメラユニット１１６と６台の周囲カメラユニット１２６からなる本発明の実施例を示す。無影灯の光源は中央カメラユニットの光源１２４については必須ではなく、周囲カメラユニットの光源１３２については図１の実施例のように周囲カメラ１３０を取り囲むように配置してもよいし、並列するように配置してもよい。中央カメラ１２２は図８の実施例のように操作ハンドル１２０に並設してもよいし、操作ハンドル内に設置してもよい。各カメラの映像はモニター機構１４０のディスプレー装置１４２に表示される。無影灯付きカメラ機構１１２とモニター機構１４０の数は、最小で１対１であるが、多対１、１対多、多対多のいずれの場合もありうる。モニター機構１４０は手術室において壁面に固定される場合や、カートに乗せて移動可能な場合に限らず、院内ネットワークを介して手術室や医局のスタッフルームに設置される場合、患者の同意があれば他施設に設置されてインターネットを介して手術の状況をライブサージャリーとして視聴される場合がありうる。ディスプレー装置１４２に表示する映像は操作装置１４４を用いて選択が可能である。操作装置１４４はモニター機構１４０に設置されたボタンであったり、マウスやキーボードであったりする場合や、ディスプレー装置１４２がタッチパネルとなっている場合、モニター機構１４０に無線で接続するリモートコントローラーである場合、これらの組み合わせである場合がありうる。リモートコントローラーを滅菌されたビニール袋に入れれば、滅菌ガウンを着たスタッフでも映像の選択が可能となる。ディスプレー装置１４２は表示されている映像の一部を拡大して表示することが可能であり、治療対象患部が小さく映っている場合には、該患部を選択して拡大して表示することができる。これはディスプレー装置１４２にポインタを表示してマウスなどで操作すること、もしくはディスプレー装置１４２がタッチパネルとなっていることによって、拡大すべき領域の選択が可能となる。 FIG. 8 shows an embodiment of the invention in which the camera system 110 consists of one central camera unit 116 and six surrounding camera units 126 . The light source of the shadowless light is not essential for the light source 124 of the central camera unit, and the light source 132 of the surrounding camera unit may be arranged so as to surround the surrounding camera 130 as in the embodiment of FIG. 1, or side by side. can be arranged as follows. The central camera 122 may be arranged side by side with the operating handle 120 as in the embodiment of FIG. 8, or may be installed inside the operating handle. Images from each camera are displayed on the display device 142 of the monitor mechanism 140 . The number of light camera mechanisms 112 and monitor mechanisms 140 is at least one to one, but may be many to one, one to many, or many to many. 
The monitor mechanism 140 may be fixed to a wall of the operating room or placed on a cart so that it can be moved; it may also be installed in the operating room or in a staff room of the medical office via the hospital network or, with the patient's consent, installed at another facility so that the surgery can be viewed over the Internet as live surgery. The video to be displayed on the display device 142 can be selected using the operation device 144. The operation device 144 may be a button installed on the monitor mechanism 140, a mouse or keyboard, the display device 142 itself when it is a touch panel, a remote controller wirelessly connected to the monitor mechanism 140, or a combination of these. If the remote controller is placed in a sterilized plastic bag, even staff wearing sterile gowns can select the video. The display device 142 can enlarge part of the displayed video, so when the affected area to be treated appears small, it can be selected and enlarged. The area to be enlarged can be selected by displaying a pointer on the display device 142 and operating it with a mouse or the like, or by using the display device 142 as a touch panel.

図９は操作装置１４４が中央カメラ１２２と６台の周囲カメラ１３０の映像の中からディスプレー装置１４２に表示する映像を選択する構成を示す。手術の現場では、ディスプレーの映像を視聴するのは施術者ではなく手術室看護師や、手術には参加していない見学者が主である。本発明では、撮影された複数の映像をモニター機構で選択して表示し、拡大する手段を提供する。各映像のそれぞれについて表示のオンとオフを指定することができ、指定のない場合は中央カメラ１２２の映像のみをディスプレー装置１４２に表示する。オンと指定されたカメラの数に応じてディスプレー装置の画面の領域を自動的に分割して映像を並べて表示するが、分割の大きさの比率や、それぞれの映像の配置の順は操作装置１４４で調整できる。またそれぞれの分割された領域に表示された映像について、映像の拡大が可能である。 FIG. 9 shows a configuration in which the operation device 144 selects the video to be displayed on the display device 142 from among the videos of the central camera 122 and the six peripheral cameras 130. At the surgical site, the video on the display is viewed mainly not by the practitioner but by operating room nurses and by observers who are not taking part in the surgery. The present invention provides means for selecting, displaying, and enlarging the plurality of captured videos on the monitor mechanism. Display can be switched on or off for each video individually; if nothing is specified, only the video of the central camera 122 is displayed on the display device 142. The screen area of the display device is divided automatically according to the number of cameras switched on and the videos are displayed side by side, while the size ratio of the divisions and the order in which the videos are arranged can be adjusted with the operation device 144. The video displayed in each divided area can also be enlarged.
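The source does not say how the screen is divided, so the following is only one plausible sketch: an equal-sized, near-square grid, with the central camera shown alone when no camera is switched on. The names `views_to_show` and `tile_layout` and the choice of grid are assumptions.

```python
from math import ceil, sqrt
from typing import List, Sequence, Tuple

Tile = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def views_to_show(on_flags: Sequence[bool], central_index: int = 0) -> List[int]:
    """Cameras switched on, or only the central camera when none is specified."""
    selected = [i for i, on in enumerate(on_flags) if on]
    return selected if selected else [central_index]


def tile_layout(n_views: int, width: int, height: int) -> List[Tile]:
    """Split a width x height screen into n_views equal tiles, row by row."""
    if n_views < 1:
        raise ValueError("at least one view is required")
    cols = ceil(sqrt(n_views))          # near-square grid (an assumption)
    rows = ceil(n_views / cols)
    tile_w, tile_h = width // cols, height // rows
    return [((i % cols) * tile_w, (i // cols) * tile_h, tile_w, tile_h)
            for i in range(n_views)]
```

With all seven cameras on, this gives a 3 x 3 grid with two empty cells; adjusting the size ratio or the ordering, as the operation device 144 allows, would replace the fixed grid with user-supplied tiles.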

図１０は録画ユニット１６２が中央カメラ１２２と６台の周囲カメラ１３０の映像を同時に入力する構成を示す。図示していない録画の開始と停止を操作するスイッチを備え、一つのスイッチの操作ですべてのカメラの録画の開始と停止を行う。スイッチは録画ユニットに備える場合と、図８に例示した操作ハンドル１２０に備える場合と、リモートコントローラーに備える場合と、これらの組み合わせである場合がありうる。 FIG. 10 shows a configuration in which the recording unit 162 simultaneously inputs images from the central camera 122 and six surrounding cameras 130 . A switch (not shown) for starting and stopping recording is provided, and operation of one switch starts and stops recording for all cameras. The switch may be provided in the recording unit, provided in the operation handle 120 illustrated in FIG. 8, provided in the remote controller, or a combination thereof.

FIG. 11 shows an embodiment in which a plurality of videos captured simultaneously by the central camera 122 and two surrounding cameras 130 are recorded in the recording device 164 with a common shooting date and time attached. Via the output means 168, any combination of these videos is played back simultaneously, aligned on the recorded shooting date/time signal Sg1 from the shooting date/time signal adding device 166, and displayed on the display device 142 of the monitor mechanism 140. Conventionally, when multiple videos are played back simultaneously, showing the same scene at the same real time could require fine adjustment of the playback position whenever the recording times stored in the individual videos were offset from one another. To eliminate this effort, the present invention provides means for recording a plurality of videos simultaneously and in parallel on a common time base, and means for simultaneously playing back, or outputting as data, these recorded videos with their shooting times aligned. By playing back the videos aligned on the common shooting date/time signal Sg1, the display can be switched to another camera's video without any time offset, or several videos can be shown at once. Although only two surrounding cameras 130 are drawn schematically, in practice it is desirable that three or more surrounding cameras 130 surround the central camera 122 so that imaging remains effective when a view is obstructed. The recording unit 162 may be separate from the monitor mechanism 140, as illustrated in FIG. 11, or built into the monitor mechanism 140. Likewise, the output means 168 may be provided in the recording unit 162, as illustrated in FIG. 11, or, when the recording unit 162 is built into the monitor mechanism 140, the operation device 144 may serve as the output means 168. When the output destination is the display device 142, as in FIG. 11, the videos to be displayed and the screen layout can be selected and adjusted through the operation device 144 while the videos play back simultaneously, in the same way as illustrated in FIG. 2. The output destination may instead be a storage medium such as a DVD; in that case, any one of the videos may be output as an independent video file, or any combination of videos may be arranged in any layout and output as a single video file.
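The alignment on a common shooting date/time can be illustrated with a minimal sketch. It assumes that each recording stores its start time taken from the shared signal, along with a constant frame rate; the names `Recording`, `frame_at`, and `aligned_frames` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Recording:
    """One camera's recording, stamped with the common shooting date/time (assumed layout)."""
    camera_id: str
    start: datetime      # start time taken from the shared date/time signal
    fps: float           # assumed constant frame rate
    frame_count: int


def frame_at(rec: Recording, t: datetime) -> Optional[int]:
    """Return the frame index of `rec` that was captured at real time `t`, or None."""
    offset_s = (t - rec.start).total_seconds()
    if offset_s < 0:
        return None  # this camera had not started recording yet
    index = int(offset_s * rec.fps)
    return index if index < rec.frame_count else None


def aligned_frames(recs: List[Recording], t: datetime) -> Dict[str, Optional[int]]:
    """For one real-time instant, pick the matching frame in every recording."""
    return {rec.camera_id: frame_at(rec, t) for rec in recs}
```

Because every lookup goes through the shared time base, switching the displayed camera amounts to reading a different entry of the same dictionary, with no manual offset adjustment.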

FIG. 12 shows an embodiment in which a link device 182 mechanically links the operation handle 120 to the tilting unit 184 of each surrounding camera unit 126, so that rotating the operation handle 120 operates the tilting units 184. The link device 182 may use bevel gears or helical gears (not shown) to change the direction of the rotational force and transmit it mechanically to the tilting units 184. In FIG. 12 each surrounding camera unit 126 is drawn as a circle that tilts toward the center of the camera mechanism 112 with shadowless light by rotating about an axis coinciding with a diameter of that circle. The shape of the surrounding camera unit 126 is not limited to a circle, however, and as long as the unit tilts relative to the center of the camera mechanism 112, the center or fulcrum of tilting may be offset from the center of the surrounding camera unit 126. As another embodiment, the link device 182 may be electrically operated: it transmits an electronic signal that drives a motor provided in the tilting unit 184 to achieve the tilt.

FIG. 13 shows an embodiment of the guide light projected onto the imaging target 190 when the camera mechanism with shadowless light is equipped with a guide light irradiation device that indicates the orientation of the camera image. By projecting guide light 192 along the vertical direction of the image and guide light 194 along the horizontal direction of the image onto the imaging target 190, the operator can adjust the tilt of the imaging target 190 as it appears in the image by manipulating the camera mechanism while watching the guide light, without checking the image on the display. The guide light is a combination of line-shaped beams, and one of them should carry a pointer that marks a specific direction of the image. In this example the line-shaped beam 192 bears an arrowhead indicating the upward vertical direction, but the form of the pointer is arbitrary. If the images of the cameras share the same horizontal orientation, a single guide light irradiation device suffices; otherwise, one may be provided for each camera unit.

The guide light is preferably turned on automatically when the camera mechanism with shadowless light is moved or tilted, and turned off automatically after a fixed time has elapsed. An acceleration sensor is used to detect that the camera mechanism has been moved or tilted. The means of detection is of course not limited to an acceleration sensor; these motions may also be detected by other means, such as internal sensors that measure the angles of the joints involved in moving or tilting the camera mechanism or in tilting the surrounding camera units.
The guide light must also be safe for the human body, and for the eyes in particular.
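The automatic on/off behavior of the guide light can be sketched as a small timer. This is only an illustration under assumptions: the class name `GuideLightController`, the `hold_s` duration, and the idea that a sensor callback reports motion with a timestamp in seconds are all hypothetical and not taken from the specification.

```python
from typing import Optional


class GuideLightController:
    """Turns the guide light on at each detected motion and off after `hold_s` seconds."""

    def __init__(self, hold_s: float = 5.0) -> None:
        self.hold_s = hold_s                      # assumed hold time; the text only says "a fixed time"
        self._last_motion_s: Optional[float] = None

    def on_motion(self, now_s: float) -> None:
        """Call when the acceleration (or joint-angle) sensor reports movement or tilting."""
        self._last_motion_s = now_s

    def is_on(self, now_s: float) -> bool:
        """The light stays on only while the hold time since the last motion has not elapsed."""
        if self._last_motion_s is None:
            return False
        return (now_s - self._last_motion_s) < self.hold_s
```

Each new motion event restarts the hold time, so the light remains on while the operator is still positioning the mechanism.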

FIG. 14 shows an embodiment of the present invention in which the camera mechanism 112 with shadowless light is provided with a microphone 202 that outputs sound to a recording device 204 and an audio output device 206. In surgery, conversation among the staff is also important information. In particular, when instructing trainee staff who take part in the operation in sterile gowns, the trainees cannot take notes during the operation, so it is difficult for them to remember accurately what they were taught. The instruction is also worth sharing with operating room nurses and observers, but the operator's voice is sometimes not loud enough for those nearby to hear. For convenience of explanation, the central camera 122, the monitor mechanism 140, and other elements of FIG. 8 are omitted, but their configurations are the same as in FIG. 8. FIG. 14 illustrates the microphone 202 arranged alongside the operation handle 120; for effective sound pickup, however, the microphone 202 is preferably installed near the tip of the operation handle 120, close to the operator. In addition, because the operating room may be filled with sounds from vital-sign monitors that track the electrocardiogram and blood oxygen saturation and from medical devices such as electric scalpels, the microphone 202 is preferably a unidirectional microphone with a noise-canceling function, oriented along the direction of illumination and imaging.

図１５は図１１にマイク２０２、音声出力装置２０６、録音装置２０４を追加した本発明の実施例である。音声出力装置２０６は図示のようにモニター機構１４０に組み込まれている場合の他に、独立に備えられる場合や、イヤホンのようにスタッフが装用する機器である場合がありうる。録音装置２０４は、図示のように録画ユニット１６２に組み込み、同時に記録される動画と同一の撮影日時を付加して録音を行うことが望ましい。動画の出力においては、出力手段１６８を介して、撮影日時を揃えて動画と音声を同時に出力することが可能である。共通の撮影日時を揃えて複数の動画および音声を再生することで、別カメラの映像へ時間のずれのなく画面の切り替えをしたり、同時に表示したりすることが可能である。 FIG. 15 shows an embodiment of the present invention in which a microphone 202, an audio output device 206 and a recording device 204 are added to FIG. The audio output device 206 may be incorporated in the monitor mechanism 140 as shown, may be provided independently, or may be a device such as an earphone worn by the staff. It is preferable that the recording device 204 is incorporated in the recording unit 162 as shown in the drawing, and the recording is performed by adding the same shooting date and time as the simultaneously recorded moving image. When outputting moving images, it is possible to simultaneously output moving images and sounds through the output means 168 with the shooting date and time aligned. By aligning the common shooting date and time and reproducing multiple moving images and sounds, it is possible to switch the screen to the image of another camera without time lag, or to display it at the same time.

FIG. 16 shows an embodiment of the present invention in which the central camera 122 of the camera mechanism 112 with shadowless light is a semi-spherical camera, a stereo camera, or an infrared camera 222, and this camera 222 is provided at the tip of the operation handle 120.
In the operating room, it is desirable to capture not only images of the affected area being treated but also the wider situation, such as the placement and handling of staff and surgical instruments. The present invention provides means for acquiring wide-area images with a semi-spherical camera. When a semi-spherical camera is used, a central camera may be provided in the central camera unit 116 separately from the semi-spherical camera; even in this case, the semi-spherical camera is preferably installed at the tip of the operation handle 120 so as to obtain unobstructed images over the full 360°. Although the monitor mechanism 140 of FIG. 8 is omitted from FIG. 16, the image from the semi-spherical camera 222 is input to the monitor mechanism 140 like that of the other cameras, and the operation device 144 can be used to select the image and adjust the magnified region.

また、治療対象患部の状況を把握するうえで、二次元表示の映像で得られる情報は、三次元表示の映像で得られる情報に劣る。本発明では、ステレオカメラ２２２によって三次元で映像を表示する手段を提供する。ステレオカメラを用いる場合、前記ディスプレー装置１４２の映像を二次元で表示するか三次元で表示するかは前記モニター機構１４０の前記操作装置１４４で選択できる。ステレオカメラ２２２は前記中央カメラ１２２もしくは前記周囲カメラ１３０のうち任意の２台を組み合わせたものでもよいし、いずれか１台以上のカメラ自体がステレオカメラ２２２であるものでもよい。 In addition, in terms of grasping the condition of the affected area to be treated, the information obtained from the two-dimensional display image is inferior to the information obtained from the three-dimensional display image. The present invention provides means for displaying images in three dimensions by the stereo camera 222 . When a stereo camera is used, it can be selected by the operation device 144 of the monitor mechanism 140 whether the image on the display device 142 is to be displayed two-dimensionally or three-dimensionally. The stereo camera 222 may be a combination of any two of the central camera 122 and the surrounding cameras 130 , or one or more of the cameras may be the stereo camera 222 .

また、手術室ではしばしば赤外線カメラを用いて治療対象患部の評価を行う場面があるが、このとき、該患部の付近に赤外線カメラを滅菌の袋に入れて持ち込んで撮影を行う手間がある。本発明では、この手間を省略して赤外線カメラの映像を表示する手段を提供する。赤外線カメラを用いる場合、前記中央カメラ１２２もしくは前記周囲カメラ１３０のうち１台が赤外線カメラであるか、赤外線カメラに切り替えることが可能であればよい。 In addition, an infrared camera is often used in an operating room to evaluate an affected area to be treated. At this time, it is troublesome to bring the infrared camera into the vicinity of the affected area in a sterilized bag and take pictures. The present invention provides a means for displaying the image of the infrared camera by omitting this trouble. When infrared cameras are used, it is sufficient if one of the central camera 122 or the surrounding cameras 130 is an infrared camera or can be switched to an infrared camera.

FIG. 17 shows a configuration in an embodiment of the present invention in which switches 234, 236, 242, 246, and 248 are provided on the side of the operation handle 120. By operating them during surgery, the operator can, via electronic signals, control the light sources and the angle of view, switch the automatic focus adjustment function of each camera on and off, and start and stop video and audio recording. The automatic focus adjustment function of each camera can also be triggered when the tilt detector 244 detects tilting of a unit base. Conventionally, staff wearing sterile gowns could not operate camera settings or video and audio recording, so these operations were difficult when the operating room was short-handed and no staff member without a sterile gown was present. The present invention provides a camera system in which means for operating the light sources, the camera angle of view, the autofocus setting, audio recording, and video recording are provided on the handle portion, which on an ordinary shadowless light is the part fitted with a sterile cover. During surgery the operation handle 120 is fitted with a sterilized cover so that an operator wearing a sterile gown can handle it. So as not to hinder operation of the switches 234, 236, 242, 246, and 248, the cover is preferably made of a transparent, soft material such as vinyl or thin plastic. The switches 234, 236, 242, 246, and 248 must also be positioned so that they are not actuated by mistake when the operation handle 120 is grasped to move or tilt the whole camera mechanism or to tilt the surrounding camera units.
The automatic adjustment function of the camera should preferably operate automatically whenever the positional relationship between a camera and the affected area changes. That relationship can change through movement or tilting of the camera mechanism 112 with shadowless light or through tilting of the surrounding camera units 126, and an acceleration sensor mounted on any one of the surrounding camera units 126 can detect all of these motions, so use of an acceleration sensor is preferred. The means of detection is of course not limited to an acceleration sensor; these motions may also be detected by other means, such as internal sensors that measure the angles of the joints involved in moving or tilting the camera mechanism 112 or in tilting the surrounding camera units 126.

FIG. 18 shows an embodiment of the present invention for outputting a plurality of recorded videos 252 and 254 and audio 256 to an external recorder. Rather than outputting data for the entire recorded duration, the output is limited to a shooting-time range T1 to T2: each data stream is cut out over the same time range, and the audio is combined with each video before output. As shown in FIGS. 11 and 15, all of the video and audio data share the same shooting date/time data, so specifying the time range T1 to T2 to be extracted allows multiple sets of video data to be split in a single operation.

FIG. 19 is a flowchart showing an example of an algorithm by which the display device switches images when it recognizes an image obstruction. When no staff member without a sterile gown is in the operating room to operate the monitor mechanism, it is difficult to switch the image shown on the display, which is inconvenient when a surgical assistant or scrub nurse wants to change the view. The present invention provides a camera system in which the monitor mechanism detects that the displayed image no longer shows the affected area because of an obstruction and can switch automatically to another image. As explained with reference to FIG. 9, when no image is designated, the display device shows the image from the central camera (step S101). From the pixel composition of the image, the display device can detect that the displayed image probably does not show the affected area, and can switch automatically to the image from another camera. In surgery, an image of the affected area consists largely of skin-colored and red pixels, whereas an image that does not show the affected area consists of pixels that are blue or green, the usual colors of the sterile drapes around the affected area and of surgical gowns and caps, or green, the color of the operator's gloves. The display device recognizes that an image obstruction has occurred when blue or green pixels occupy at least a certain proportion of the image (step S102). If the result of step S102 is YES, a monitoring timer is set (step S103). Until the timer's set time elapses, the device continues to check whether the obstruction persists (step S104). If the obstruction has cleared, step S105 yields YES and the central camera image continues to be displayed. If the obstruction persists, step S105 yields NO and monitoring continues until the set time elapses. If the set time elapses with the obstruction still present, step S104 yields YES and the display switches automatically to the image from another camera (step S106). Thereafter, obstruction of the currently displayed image is detected in the same way, and the display can again switch automatically to another camera (steps S107 to S111). When several images are displayed at once, this recognition is performed for each of them simultaneously. Images that are not being displayed can also be monitored for obstruction in the background, so that the image switched to is one in which no obstruction is occurring.
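The flow of FIG. 19 can be sketched as follows. This is a simplified illustration, not the patented implementation: the pixel test (`max(g, b) > r`), the 0.6 ratio threshold, the class name `AutoSwitcher`, and the fallback to the next camera when every view is obstructed are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0-255


def is_obstructed(frame: Sequence[Pixel], ratio_threshold: float = 0.6) -> bool:
    """Step S102 (assumed test): blue/green-dominant pixels fill enough of the frame."""
    if not frame:
        return False
    drape_like = sum(1 for r, g, b in frame if max(g, b) > r)
    return drape_like / len(frame) >= ratio_threshold


class AutoSwitcher:
    """Switches the displayed camera when an obstruction outlasts `timeout_s`."""

    def __init__(self, cameras: List[str], timeout_s: float) -> None:
        self.cameras = cameras
        self.current = cameras[0]                 # S101: start on the central camera
        self.timeout_s = timeout_s
        self._obstructed_since: Optional[float] = None

    def update(self, now_s: float, obstructed: Dict[str, bool]) -> str:
        """Feed the per-camera obstruction flags; returns the camera to display."""
        if not obstructed[self.current]:          # obstruction cleared: keep the view
            self._obstructed_since = None
            return self.current
        if self._obstructed_since is None:        # S103: start the monitoring timer
            self._obstructed_since = now_s
        if now_s - self._obstructed_since >= self.timeout_s:   # S104: timer elapsed
            others = [c for c in self.cameras if c != self.current]
            clear = [c for c in others if not obstructed[c]]   # background monitoring
            if others:
                self.current = (clear or others)[0]            # S106: switch view
            self._obstructed_since = None
        return self.current
```

Waiting for the timer before switching keeps the view stable when a hand or head passes briefly in front of a camera.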

FIG. 20 shows a configuration example of the present invention in which a distance measuring device 272 is provided near the central camera 122 and near each surrounding camera 130, and a CPU 274 in the monitor mechanism 140 automatically selects the image to be shown on the display device 142. When a laser rangefinder is used as the distance measuring device 272, its light source is preferably a Class 1 laser (near infrared) to ensure safety for the human body, and for the eyes in particular. The measured value from each distance measuring device 272 is transmitted to the CPU 274. When a value departs substantially, for a certain period of time, from the value measured while the affected area was being imaged, the CPU 274 can prevent the images from the camera 122 or 130 located near that distance measuring device 272 from being shown on the display device 142.
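A minimal sketch of the distance-based decision follows, with one gate per camera. The class name `DistanceGate`, the millimeter units, and the tolerance and hold-time parameters are hypothetical; the specification states only that a large change lasting a certain time leads to the camera's image being excluded.

```python
from typing import Optional


class DistanceGate:
    """Flags a camera for exclusion when its measured distance stays far from the baseline."""

    def __init__(self, baseline_mm: float, tolerance_mm: float, hold_s: float) -> None:
        self.baseline_mm = baseline_mm     # distance measured while the affected area was in view
        self.tolerance_mm = tolerance_mm   # assumed allowance for normal variation
        self.hold_s = hold_s               # how long the deviation must persist
        self._deviating_since: Optional[float] = None

    def update(self, now_s: float, measured_mm: float) -> bool:
        """Return True when this camera's image should not be shown on the display."""
        if abs(measured_mm - self.baseline_mm) <= self.tolerance_mm:
            self._deviating_since = None   # distance is back to normal
            return False
        if self._deviating_since is None:
            self._deviating_since = now_s  # deviation just started
        return now_s - self._deviating_since >= self.hold_s
```

A sudden drop in the measured distance typically means that something, such as the operator's head, has moved between the camera and the affected area.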

FIG. 21 shows a configuration of the present invention in which the contents of video data 282 and audio data 288 are automatically recognized by video recognition means 284 and speech recognition means 290, respectively, to produce character-string data; the strings are attached as tag information to the shooting-time information to create list data 286 and 292. Recorded videos are long and often monotonous in overall color and composition, so it is frequently difficult to locate the recording time of the surgical scene one wishes to replay. One remedy is for the viewer to add tag information to the video manually, but this takes considerable effort. The present invention mechanically and automatically recognizes the recorded video and audio and attaches character-string tags to the time information of the data, making the recording times searchable and thereby providing a means of locating the scene to be replayed. Specific examples of the video recognition means 284 and the speech recognition means 290 are the Cloud Video Intelligence and Cloud Speech APIs (application programming interfaces) provided by Google. In practice, however, the video data 282 and audio data 288 contain patient information, so the patient's permission must be obtained in advance before using these APIs, which require communication with Google's servers. To carry out the invention without passing through a network outside the hospital, the video recognition means 284 and the speech recognition means 290 must be built and operated within the hospital.
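The searchable list data can be pictured with a small sketch. It assumes the recognizers have already produced text tags with times in seconds from the start of recording; the `TagEntry` structure and the `find_times` function are hypothetical, and no call to any recognition service is shown.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class TagEntry:
    """One row of the list data: a recognized string attached to a shooting time."""
    time_s: float   # time on the common recording time base (seconds)
    text: str       # character string produced by video or speech recognition
    source: str     # "video" or "audio" (assumed label)


def find_times(entries: Iterable[TagEntry], keyword: str) -> List[float]:
    """Return the sorted times of all tags containing `keyword`, ignoring case."""
    needle = keyword.casefold()
    return sorted(e.time_s for e in entries if needle in e.text.casefold())
```

Playback can then start from any returned time, and because video and audio share one time base, a hit in the audio tags locates the matching frames in every camera's video.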

第２の発明は以下のように捉えることができる。
（１５）手術などの施術行為に際し用いられ、治療対象患部および／または施術者の手許を固定画及び動画で撮影できる無影灯付きカメラ機構、ならびにこのカメラ機構で撮影された映像を表示するモニター機構を備えたカメラシステムにおいて、前記カメラ機構は、移動および傾斜が可能な全体基体、この全体基体のほぼ中央に配設された中央カメラユニット、および前記全体基体に設けられ、前記中央カメラユニットの周囲に配設された少なくとも１つの周囲カメラユニットを備え、前記中央カメラユニットは、中央ユニット基体、この中央ユニット基体の中央部に立設され、施術者の操作により、前記カメラ機構全体を移動および傾動させることができるようになっている操作ハンドル、およびこの操作ハンドルに内蔵もしくは隣接して配設された中央カメラを備え、前記周囲カメラユニットは、それぞれ、周囲ユニット基体、この周囲ユニット基体に配設された周囲カメラ、および前記周囲ユニット基体に、前記周囲カメラに隣接して配置された複数の光源を備え、前記中央カメラと周囲カメラは、それぞれ異なった視点で、治療対象患部および／または施術者の手許を固定画及び動画で同時に撮影できるようになっており、前記モニター機構は、通常は前記中央カメラの映像を表示する少なくとも１台のディスプレー装置、および視聴者の操作により、該ディスプレー装置に表示される映像を周囲カメラが撮影した映像からも選択することが可能な操作装置を備えていることを特徴とするカメラシステム。
（１６）前記中央カメラユニットは、前記中央カメラの近傍に配置された複数の光源を備え、これら複数の光源は無影灯を構成している（１５）に記載のカメラシステム。
（１７）前記操作装置は、前記中央カメラユニットおよび前記周囲カメラユニットの撮影した映像のうち、２つ以上の映像を組み合わせて前記ディスプレー装置に表示させることができる機能を有している（１５）または（１６）に記載のカメラシステム。
（１８）前記中央カメラユニットおよび前記周囲カメラユニットで撮影されたそれぞれの映像を並行して同時に録画する録画ユニットを更に備えた（１５）～（１７）のいずれかに記載のカメラシステム。
（１９）前記録画ユニットは、前記中央カメラユニットおよび前記周囲カメラユニットで同時に撮影された２つ以上の映像を、撮影された時間を同期して同時に再生することができる（１８）に記載のカメラシステム。
（２０）前記周囲カメラユニットは、前記周囲ユニット基体を傾動させる傾動ユニットを更に備え、前記操作ハンドルと該傾動ユニットはリンク装置を介して連絡しており、前記操作ハンドルは回転軸に沿って回転させることで、該リンク装置を介して前記傾動ユニットを作動して、前記周囲カメラユニットの基体を、前記中央カメラユニットに対して傾動させることによって、前記周囲カメラの方向を調整することが可能な（１５）または（１６）に記載のカメラシステム。
（２１）前記無影灯付きカメラ機構は、前記中央カメラもしくは前記周囲カメラのいずれかのカメラの映像の縦横方向を示す光線であるガイド光を撮影対象に照射するガイド光照射装置を備え、ディスプレーを見なくても撮影された映像の治療対象患部に対する傾きを知ることができる（１５）～（２０）のいずれかに記載のカメラシステム。
（２２）手術中の会話を含む音声情報を取得するマイクを備え、更に取得した音声情報を記録する録音装置、取得した音声を出力する音声出力装置を備えた（１５）～（２１）のいずれかに記載のカメラシステム。
（２３）前記録画ユニットは、前記中央カメラユニット、前記周囲カメラユニットおよび前記マイクで撮影および録音された映像および音声を、時間を同期して同時に再生することができる（１８）～（２２）のいずれかに記載のカメラシステム。
（２４）前記操作ハンドルは、ハンドルの先端に半天球カメラを備え、周囲の環境を３６０°に渡って録画することが可能な（１５）～（２３）のいずれかに記載のカメラシステム。
（２５）前記中央カメラが半天球カメラであり、周囲の環境を３６０°に渡って録画することが可能な（１５）～（２３）のいずれかに記載のカメラシステム。
（２６）前記操作ハンドルは、ハンドルの側面に前記中央ユニットおよび前記周囲ユニット基体の光源の操作装置を備え、光源の点消灯、照度の調整に関する指示信号を送ることができる（１５）～（２４）のいずれかに記載のカメラシステム。
（２７）前記中央カメラと前記周囲カメラはそれぞれ画角操作装置を備え、前記操作ハンドルは、ハンドルの側面に該画角操作装置に指示信号を送るためのスイッチを備えた（１５）～（２６）のいずれかに記載のカメラシステム。
（２８）前記中央カメラと前記周囲カメラはそれぞれ焦点距離操作装置を備え、該焦点距離操作装置は自動ピント調節機能によって自動で撮影対象に合わせてピントを調節することが可能な（１５）～（２７）のいずれかに記載のカメラシステム。
（２９）前記操作ハンドルは、ハンドルの側面のスイッチを介して、前記中央カメラと前記周囲カメラの前記焦点距離操作装置の前記自動ピント調節機能の作動と停止を操作することが可能な（１５）～（２８）のいずれかに記載のカメラシステム。
（３０）前記焦点距離操作装置は、前記自動ピント調節機能が停止しているときに、前記操作ハンドルを介した前記前記中央カメラと前記周囲カメラの画角の調整、および前記カメラ機構全体の移動もしくは傾動の操作が生じたことを検知したときに、前記自動ピント調節機能を働かせてピントを合わせてから、自動で自動ピント調節機能を停止させることが可能な（１５）～（２９）のいずれかに記載のカメラシステム。
（３１）前記操作ハンドルは、ハンドルの側面に前記録画ユニットの操作装置を付随し、録画の開始と停止の操作ができる（１５）～（３０）のいずれかに記載のカメラシステム。
（３２）前記操作ハンドルは、ハンドルの側面に前記録音装置の操作装置を付随し、録音の開始と停止の操作ができる（２１）～（３１）のいずれかに記載のカメラシステム。
（３３）前記モニター機構の前記操作装置は、前記中央カメラと前記周囲カメラの前記焦点距離操作装置の自動ピント調節機能の作動と停止を操作することが可能な（１５）～（２２）のいずれかに記載のカメラシステム。
（３４）前記中央カメラと前記周囲カメラのうち少なくとも一台がステレオカメラであり、前記ディスプレー装置は、該ステレオカメラで撮影された映像を三次元表示できる（１５）～（３３）のいずれかに記載のカメラシステム。
（３５）前記中央カメラと前記周囲カメラのうち少なくとも一台が赤外線カメラであり、前記ディスプレー装置は、該赤外線カメラで撮影された映像を表示できる（１５）～（３４）のいずれかに記載のカメラシステム。
（３６）前記モニター機構は、前記ディスプレー装置に表示される映像を選択する前記操作装置によって、前記中央カメラ、前記周囲カメラ、前記半天球カメラのいずれかで撮影された映像を、任意の撮影範囲で切り出し拡大して前記ディスプレーに表示することができる（２４）～（３５）のいずれかに記載のカメラシステム。
（３７）前記録画ユニットは、記録された複数の映像および音声を、時間の同期を保ちながら、任意の時間の範囲で切り出して、データとして出力することができる（１８）～（３６）のいずれかに記載のカメラシステム。
（３８）前記ディスプレー装置は、表示されている映像が治療対象患部および／または施術者の手許などの撮影対象を表示していないことを映像の画素を認識して検知し、一定時間以上にわたり状況が改善しない場合に、自動で別の映像の表示に切り替わることが可能な（１５）～（３７）のいずれかに記載のカメラシステム。
（３９）前記中央ユニット基体および前記周囲ユニット基体は、前記中央カメラおよび前記周囲カメラの近傍に撮影対象までの距離を光学的に計測する距離計測装置を備え、前記モニター機構に備えるCPUに距離情報を伝達し、該CPUはこの距離が突然に大きく変化し、一定の時間に渡って距離が復帰しないことを検知したときに、撮影対象が撮影されていないものと判定し、前記ディスプレー装置の映像を自動で別のカメラの映像の表示に切り替えることが可能な（１５）～（３８）のいずれかに記載のカメラシステム。
（４０）前記録画ユニットおよび前記録音装置は、記録された映像の画像情報、音声情報を機械的に自動認識し、データの時間情報にタグ情報を付加することで、タグ情報に基づいて記録の時間情報を検索可能とし、該当する時点からの再生を可能とする（１８）～（３９）のいずれかに記載のカメラシステム。
（４１）手術などの施術行為に際し用いられ、治療対象患部および／または術者の手許を固定画及び動画で撮影できる無影灯付きカメラ機構、ならびにこのカメラ機構で撮影された映像を表示するモニター機構を備えたカメラシステムにおいて、
前記カメラ機構は、
移動および傾斜が可能な全体基体、
この全体基体に設置された少なくとも２つのカメラユニット、並びに、
前記カメラ機構全体を移動および傾動させることができるようになっている操作ハンドル
を備え、
前記少なくとも２つのカメラユニットは、それぞれ、ユニット基体、およびこのユニット基体に設置された少なくとも１つのカメラを有し、
前記少なくとも２つのカメラユニットのカメラは、それぞれ異なった視点で、治療対象患部および／または術者の手許を固定画及び動画で同時に撮影できるようになっており、
前記カメラ機構は、更に、前記ディスプレー装置に表示される映像を、前記少なくとも２つのカメラユニットのカメラがそれぞれ撮影した映像から選択した映像とする映像選択手段を備えていることを特徴とするカメラシステム。
The second invention can be understood as follows.
(15) A camera system used during procedures such as surgery, comprising a camera mechanism with a shadowless light capable of capturing still and moving images of the affected area to be treated and/or the operator's hands, and a monitor mechanism that displays the images captured by the camera mechanism, wherein the camera mechanism comprises an overall base that can be moved and tilted, a central camera unit disposed substantially at the center of the overall base, and at least one surrounding camera unit provided on the overall base and disposed around the central camera unit; the central camera unit comprises a central unit base, an operation handle that stands at the center of the central unit base and by which the operator can move and tilt the entire camera mechanism, and a central camera built into or disposed adjacent to the operation handle; each surrounding camera unit comprises a surrounding unit base, a surrounding camera disposed on the surrounding unit base, and a plurality of light sources disposed on the surrounding unit base adjacent to the surrounding camera; the central camera and the surrounding cameras can simultaneously capture still and moving images of the affected area to be treated and/or the operator's hands from mutually different viewpoints; and the monitor mechanism comprises at least one display device that normally displays the image from the central camera, and an operation device by which a viewer can also select the image displayed on the display device from among the images captured by the surrounding cameras.
(16) The camera system according to (15), wherein the central camera unit comprises a plurality of light sources arranged near the central camera, the plurality of light sources forming a shadowless light.
(17) The camera system according to (15) or (16), wherein the operation device has a function of combining two or more of the images captured by the central camera unit and the surrounding camera units and displaying them on the display device.
(18) The camera system according to any one of (15) to (17), further comprising a recording unit that simultaneously records images taken by the central camera unit and the surrounding camera units in parallel.
(19) The camera system according to (18), wherein the recording unit can simultaneously play back two or more images captured simultaneously by the central camera unit and the surrounding camera units, synchronized to the times at which they were captured.
(20) The camera system according to (15) or (16), wherein each surrounding camera unit further comprises a tilting unit that tilts the surrounding unit base, the operation handle and the tilting unit are connected via a link device, and rotating the operation handle about its rotation axis actuates the tilting unit via the link device and tilts the base of the surrounding camera unit relative to the central camera unit, whereby the direction of the surrounding camera can be adjusted.
(21) The camera system according to any one of (15) to (20), wherein the camera mechanism with shadowless light comprises a guide light irradiation device that projects onto the imaging target a guide light, namely a beam indicating the vertical and horizontal directions of the image from either the central camera or one of the surrounding cameras, so that the tilt of the captured image relative to the affected area to be treated can be known without looking at the display.
(22) The camera system according to any one of (15) to (21), comprising a microphone that acquires audio information including conversation during surgery, a recording device that records the acquired audio information, and an audio output device that outputs the acquired audio.
(23) The camera system according to any one of (18) to (22), wherein the recording unit can simultaneously play back, synchronized in time, the video and audio captured and recorded by the central camera unit, the surrounding camera units, and the microphone.
(24) The camera system according to any one of (15) to (23), wherein the operation handle is provided with a semi-spherical camera at the tip of the handle and capable of recording the surrounding environment over 360°.
(25) The camera system according to any one of (15) to (23), wherein the central camera is a semi-spherical camera and capable of recording the surrounding environment over 360°.
(26) The camera system according to any one of (15) to (24), wherein the operation handle has, on the side of the handle, a device for operating the light sources of the central unit base and the peripheral unit base, and can send instruction signals for turning the light sources on and off and for adjusting their illuminance.
(27) The camera system according to any one of (15) to (26), wherein the central camera and the surrounding cameras each have a view angle operation device, and the operation handle has a switch on the side of the handle for sending an instruction signal to the view angle operation device.
(28) The camera system according to any one of (15) to (27), wherein the central camera and the surrounding cameras each have a focal length operation device, and the focal length operation device can automatically adjust the focus according to the object to be photographed by an automatic focus adjustment function.
(29) The camera system according to any one of (15) to (28), wherein the operation handle is capable of activating and deactivating the automatic focus adjustment function of the focal length operation devices of the central camera and the peripheral cameras via a switch on the side of the handle.
(30) The camera system according to any one of (15) to (29), wherein, when the focal length operation device detects, while the automatic focus adjustment function is stopped, that the angle of view of the central camera or the peripheral cameras has been adjusted or that the entire camera mechanism has been moved or tilted via the operation handle, it can activate the automatic focus adjustment function, adjust the focus, and then automatically stop the automatic focus adjustment function again.
(31) The camera system according to any one of (15) to (30), wherein the operation handle has an operation device for the recording unit attached to the side of the handle, and can be operated to start and stop recording.
(32) The camera system according to any one of (21) to (31), wherein the operation handle has an operation device for the recording device attached to the side of the handle, and can be operated to start and stop recording.
(33) The camera system according to any one of (15) to (22), wherein the operation device of the monitor mechanism is capable of activating and deactivating the automatic focus adjustment function of the focal length operation devices of the central camera and the peripheral cameras.
(34) The camera system according to any one of (15) to (33), wherein at least one of the central camera and the surrounding cameras is a stereo camera, and the display device is capable of three-dimensionally displaying images captured by the stereo camera.
(35) The camera system according to any one of (15) to (34), wherein at least one of the central camera and the surrounding cameras is an infrared camera, and the display device can display images captured by the infrared camera.
(36) The camera system according to any one of (24) to (35), wherein the operation device of the monitor mechanism for selecting an image to be displayed on the display device can cut out an arbitrary shooting range from an image captured by any one of the central camera, the peripheral cameras, and the semi-spherical camera, and enlarge and display it on the display device.
(37) The camera system according to any one of (18) to (36), wherein the recording unit can cut out a plurality of recorded images and sounds in an arbitrary time range while maintaining time synchronization and output them as data.
(38) The camera system according to any one of (15) to (37), wherein the display device detects, by recognizing the pixels of the image, that the displayed image does not show the imaging target such as the treatment target area and/or the hand of the practitioner, and can automatically switch to displaying another image when the situation does not improve.
(39) The camera system according to any one of (15) to (38), wherein the central unit base body and the surrounding unit base body are each provided, near the central camera and the surrounding camera, with a distance measuring device that optically measures the distance to the photographed object, the distance information is sent to a CPU provided in the monitor mechanism, and when the CPU detects that the distance has suddenly changed greatly and has not returned for a certain period of time, it determines that the object to be photographed is no longer being captured and can automatically switch the image displayed on the display device to video from another camera.
(40) The camera system according to any one of (18) to (39), wherein the recording unit and the recording device automatically recognize, by machine, the image information and audio information of the recorded video and add tag information to the time information of the data, so that time information can be searched based on the tag information and playback can start from the corresponding point in time.
(41) A camera system comprising a camera mechanism with a shadowless light that can be used for procedures such as surgery and can capture still and moving images of the affected area to be treated and/or the hands of the operator, and a monitor mechanism that displays the images captured by this camera mechanism, wherein
the camera mechanism comprises:
an overall base that can be moved and tilted;
at least two camera units mounted on the overall base; and
an operation handle that can move and tilt the entire camera mechanism,
each of the at least two camera units has a unit base and at least one camera mounted on the unit base,
the cameras of the at least two camera units are capable of simultaneously capturing still images and moving images of the affected area to be treated and/or the operator's hand from different viewpoints, and
the camera mechanism further comprises image selection means for selecting an image to be displayed on the display device from images captured by the cameras of the at least two camera units.

<Third invention>
The third invention of the present application relates to a surgical image display device comprising: an acquisition unit that acquires a plurality of video data obtained by photographing the same operative field from different viewpoints using a plurality of cameras; an evaluation unit that evaluates, for the image of each frame of each video data, the operative field exposure, which is the degree to which the operative field is captured in the image; a selection unit that selects, for each frame, one video data from the plurality of video data based on the operative field exposure of the image of each of the plurality of video data; and a video display unit that displays the image of the video data selected by the selection unit on a display device. A surgical image display device is a system for recording, reproducing, and displaying video of procedures such as surgical operations. The multi-view video viewing system according to the first invention of the present application described above is also a specific example of the surgical image display device.

According to the third invention of the present application, appropriate video data is selectively displayed, based on the operative field exposure (the degree to which the operative field is captured in the image), from among a plurality of video data captured from different viewpoints. As a result, the user (the person viewing the surgical video) can avoid video in which the operative field is blocked by an obstacle and can selectively view images in which the operative field is exposed. The user can therefore view only images that show the operative field well throughout the operation, and can easily observe the state of the operative field (affected area or incision) and the progress of the operation.

An embodiment of the third invention of the present application will be described in detail with reference to the drawings. FIG. 22 shows a configuration example of a surgical image display device. As shown in FIG. 22, the surgical image display device 300 of this embodiment has three cameras 310, an information processing device 320, and a display device 330.

The camera 310 is a color digital video camera and can capture video data at, for example, 30 frames per second. Each camera 310 is installed above the surgical bed 350 at a position from which the operative field of the patient on the surgical bed can be photographed. Each camera 310 may be attached to the shadowless light as shown in FIG. 8, or may be attached somewhere other than the shadowless light (a ceiling, an arm, or the like). The three cameras 310 are installed at different positions and orientations so that the operative field can be photographed from different viewpoints (directions). When the position and orientation of each camera 310 are fixed, the viewpoint information of each camera 310 (information defining its three-dimensional position and orientation) is registered in advance in the information processing device 320. When the position and orientation of each camera 310 are variable, the viewpoint information is preferably sent to the information processing device 320 whenever the position or orientation of a camera 310 is changed. Note that the number of cameras is not limited to three; any number of two or more may be provided. In addition, not all cameras need to have the same configuration and specifications, and cameras with different configurations or specifications may be mixed (for example, a visible light camera and an infrared camera, a high-resolution camera and a low-resolution camera, or a color camera and a monochrome camera).

The information processing device 320 has an image acquisition unit 321, an image recognition unit 322, a score calculation unit 323, a video selection unit 324, and a video display unit 325. The image acquisition unit 321 acquires video data from each of the plurality of cameras 310. Here, three pieces of video data obtained by photographing the same operative field from different viewpoints are input to the image acquisition unit 321 in synchronization. The image recognition unit 322 applies image recognition processing to the image of each frame of each video data and detects, from the image, a predetermined object or a region (pixel group) that satisfies a predetermined condition. The score calculation unit 323 evaluates the degree to which the operative field is captured in the image based on the detection result of the image recognition unit 322, and calculates an evaluation score indicating that degree. Hereinafter, this evaluation score is called the "operative field exposure", meaning the degree to which the operative field is exposed to the camera's field of view. The video selection unit 324 selects, for each frame, one video data from the three video data based on the operative field exposure of the image of each of the three video data. The video display unit 325 displays the image of the video data selected by the video selection unit 324 on the display device 330. In the configuration of this embodiment, the image acquisition unit 321 is an example of the "acquisition unit" of the present invention, the image recognition unit 322 and the score calculation unit 323 are an example of the "evaluation unit" of the present invention, the video selection unit 324 is an example of the "selection unit" of the present invention, and the video display unit 325 is an example of the "video display unit" of the present invention.

The information processing device 320 can be configured by, for example, a general-purpose computer equipped with a processor (CPU, GPU, etc.), memory, a nonvolatile storage device (hard disk drive, solid state drive, flash memory, etc.), an input device, and the like. In that case, the functions 321 to 325 described above are implemented in software by loading a program stored non-transitorily in the storage device into memory and having the processor execute it. Alternatively, part of the processing and functions of the information processing device 320 may be implemented in hardware such as an ASIC, or may be executed by another computer such as a cloud server. A personal computer, a smartphone, a tablet terminal, or the like can be used as the information processing device 320.

The display device 330 is a device that displays the video output from the video display unit 325. A liquid crystal display, an organic EL display, or the like can be used as the display device 330. Note that the information processing device 320 and the display device 330 may be configured integrally.

FIG. 23 is a flowchart showing an example of the processing of the information processing device 320. FIG. 23 shows the processing executed for each frame. First, the image acquisition unit 321 acquires the data of three images captured in time synchronization by the three cameras 310 (step S300). These three images are a set of images of the same operative field taken at the same time from different viewpoints. Next, the image recognition unit 322 applies image recognition processing to each of the three images (step S301). Next, the score calculation unit 323 calculates the operative field exposure of each image based on the result of the image recognition (step S302). Next, the video selection unit 324 selects, from the three images, the image with the highest operative field exposure (step S303). Then, the video display unit 325 displays the image selected in step S303 on the display device 330 (step S304). By executing the above processing for each frame, the image that best captures the operative field is selected from the images of the plurality of cameras 310 and displayed on the display device 330.
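The per-frame flow of steps S300 to S304 can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the embodiment: the names `select_frame` and `exposure_fn` are hypothetical, and the recognition and scoring of steps S301 and S302 are represented by a function supplied by the caller.

```python
from typing import Callable, Sequence, Tuple, TypeVar

Image = TypeVar("Image")


def select_frame(images: Sequence[Image],
                 exposure_fn: Callable[[Image], float]) -> Tuple[int, Image]:
    """Return (camera index, image) with the highest operative field exposure.

    `images` holds one time-synchronized image per camera (step S300).
    `exposure_fn` is a hypothetical stand-in for steps S301 and S302.
    """
    scores = [exposure_fn(img) for img in images]            # S301-S302
    best = max(range(len(images)), key=scores.__getitem__)   # S303
    return best, images[best]                                 # shown by the caller (S304)
```

When two cameras tie, this sketch keeps the one with the lower index; the source does not specify a tie-breaking rule.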

Next, a specific example of the processing of the image recognition unit 322 and the score calculation unit 323 will be described. This embodiment focuses on the fact that the operative field is often blocked by the heads of doctors and nurses, and detects a "cap" by image recognition. The smaller the area of the cap region in the image (that is, the proportion of the image occupied by the cap region), the higher the operative field exposure is evaluated to be; conversely, the larger the area of the cap region, the lower the operative field exposure. The color and shape of surgical caps are known in advance, and the caps do not become soiled during surgery. Image recognition of caps is therefore relatively easy and has the advantage of yielding stable recognition results.

FIGS. 24A and 24B show examples of images captured by a camera 310 installed above the operative field. In FIG. 24A, the operative field (incision 400) appears approximately in the center of the image, and around it appear the surgeon's hands 401, surgical instruments 402, and the cap 403 worn by the surgeon. FIG. 24B is an example in which the surgeon has leaned over to look into the incision 400, so that the cap 403 blocks most of the operative field from the view of the camera 310. In these examples, the image of FIG. 24A should be evaluated as having a higher operative field exposure than the image of FIG. 24B.

Any algorithm may be used to recognize the cap 403 in the image. For example, since the surgical cap 403 has a predetermined color such as blue, green, or white, a recognition algorithm based on color features is suitable. The color features of the cap may be set (taught) by the user, or may be machine-learned using a large number of surgical images (training data). Alternatively, since the shape of the head is roughly round, a recognition algorithm that combines color features and shape features may be employed. Furthermore, to make image recognition easier or more accurate, doctors and nurses may be made to wear caps with a special color, texture, or shape, or caps bearing markers for image recognition.
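As one possible reading of a recognition algorithm based on color features, the sketch below marks every pixel whose RGB value lies close to a user-taught cap color. The function name `cap_color_mask`, the reference color, and the tolerance are all hypothetical values chosen for illustration; the source does not prescribe a color space or a distance measure.

```python
import numpy as np


def cap_color_mask(rgb: np.ndarray,
                   cap_rgb=(0, 90, 200),
                   tol: float = 60.0) -> np.ndarray:
    """Boolean (H, W) mask of pixels within `tol` of the taught cap color.

    `rgb` is an (H, W, 3) image. `cap_rgb` and `tol` are assumed,
    user-taught values, not values given in the source.
    """
    diff = rgb.astype(np.float64) - np.asarray(cap_rgb, dtype=np.float64)
    # Euclidean distance in RGB space, computed per pixel.
    return np.linalg.norm(diff, axis=-1) <= tol
```

A practical system would likely add the shape cue mentioned above or a learned classifier; plain thresholding is used here only because it is the simplest instance of a color-feature method.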

When the image recognition unit 322 in FIG. 22 detects the cap 403, it extracts the region of the cap 403 from the image. A known segmentation algorithm can be used for the region extraction. The score calculation unit 323 then calculates the area of the cap region (for example, the number of pixels) and obtains the operative field exposure using the following equation.

Operative field exposure = (area of entire image - area of cap region) / area of entire image

This operative field exposure takes its maximum value of 1 when no cap appears in the image, and its minimum value of 0 when the entire image is completely blocked by the cap (when the operative field is not captured at all).
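The equation above can be evaluated directly from a segmentation mask by counting pixels. The following is a minimal sketch; the function name `operative_field_exposure` and the representation of the cap region as a boolean array are assumptions made for the example.

```python
import numpy as np


def operative_field_exposure(cap_mask: np.ndarray) -> float:
    """(area of entire image - area of cap region) / area of entire image.

    `cap_mask` is a non-empty boolean (H, W) array, True where a cap was
    detected; areas are measured in pixels.
    """
    total = cap_mask.size
    cap_area = int(np.count_nonzero(cap_mask))
    return (total - cap_area) / total
```

For a 4 x 4 image in which 4 pixels belong to the cap, the result is (16 - 4) / 16 = 0.75.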

FIG. 25 shows an example of the change in operative field exposure from frame to frame and the switching of the displayed image. In the frame at time t1, shown in the upper part of FIG. 25, image I2 has the highest operative field exposure among the images I1 to I3 of the three cameras, and image I2 is displayed on the display device 330. Then, in the frame at time t2, as shown in the lower part of FIG. 25, image I3 has the highest operative field exposure, and image I3 is displayed on the display device 330 (FIG. 22). By selectively displaying the image of the camera with the highest operative field exposure in this way, the user can always observe the state of the operative field.

In this embodiment, an example of detecting a "cap" as an object that blocks the operative field has been described, but the method of evaluating the operative field exposure is not limited to this. For example, the image recognition unit 322 may detect objects that can be present around the operative field (for example, hands and surgical instruments). This is because, when such objects appear in the image, it is highly probable that the operative field also appears in the image. In this case, an image in which hands or surgical instruments are detected can be evaluated as having a higher operative field exposure. The operative field exposure may also be evaluated as higher the more hands and surgical instruments are detected in the image. The same algorithm as described for the first invention can be used for hand detection. For the detection of surgical instruments, a recognition algorithm based on color features or shape features may be used, for example. Image recognition may also be made easier by attaching markers for image recognition to gloves or surgical instruments.

The image recognition unit 322 may detect the operative field itself (for example, an incision). Since everything other than the operative field (incision) is generally covered with a sterile drape, skin color and reddish colors often appear only in the operative field (incision). A region satisfying the color condition of having color features such as skin color or reddish color may therefore be recognized as the operative field. In addition, because the light of the shadowless light always falls on the operative field, it appears brighter in the image than other parts, so a region satisfying the luminance condition of being brighter than a predetermined threshold may be recognized as the operative field. The color features and the luminance threshold used to recognize the operative field may be set (taught) by the user, or may be machine-learned using a large number of surgical images (training data). In this case, the larger the area of the region satisfying the color condition or the luminance condition, the higher the operative field exposure should be evaluated to be.
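The luminance condition can be illustrated with a simple threshold on per-pixel brightness. In this sketch the function name `bright_region_ratio` and the threshold of 200 are hypothetical, and brightness is approximated with the Rec. 601 luma weights, which is a choice made for the example rather than something stated in the source.

```python
import numpy as np


def bright_region_ratio(rgb: np.ndarray, luma_threshold: float = 200.0) -> float:
    """Fraction of pixels brighter than `luma_threshold`.

    `rgb` is an (H, W, 3) image with values in 0..255. The returned ratio
    can serve as an exposure score: larger bright area, higher score.
    """
    # Rec. 601 luma approximation: Y = 0.299 R + 0.587 G + 0.114 B
    luma = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    return float(np.mean(luma > luma_threshold))
```

The same pattern applies to the color condition by replacing the luminance test with a test on hue or on distance to a taught skin or red color.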

The video selection unit 324 of this embodiment selects, from the three images, the image with the highest operative field exposure as the image for display. With this method, however, camera switching may occur frequently and may be unpleasant for the viewer. It is therefore preferable that the video selection unit 324 determine the selected image by considering not only the magnitude of the operative field exposure but also the frequency of camera switching. For example, the video selection unit 324 may determine the selected image by solving an optimization problem set up to make the operative field exposure of the image as high as possible and the switching frequency as low as possible. As one example, the reciprocal of the operative field exposure is defined as the operative field occlusion, an objective function (cost function) containing an operative field occlusion term and a switching frequency term is set, and image switching is controlled so that the value of this objective function is minimized. The user may be allowed to adjust the parameters that determine the respective weights of the operative field occlusion and the switching frequency in the objective function. Providing such a parameter adjustment function lets the user choose, according to preference, whether to prioritize displaying images in which the operative field is exposed as much as possible or to prioritize switching images as little as possible.
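The source does not give the exact form of the objective function, so the following is only one plausible sketch: a weighted sum of the total occlusion (reciprocal of exposure) and the number of camera switches. The name `switching_cost`, the additive form, the two weights, and the small constant `eps` that guards against division by zero are all assumptions.

```python
from typing import Sequence


def switching_cost(exposures: Sequence[Sequence[float]],
                   choice: Sequence[int],
                   w_occlusion: float = 1.0,
                   w_switch: float = 1.0,
                   eps: float = 1e-6) -> float:
    """Cost of showing camera choice[t] at frame t.

    exposures[t][c] is the exposure of camera c at frame t. Occlusion is
    taken as 1 / exposure; `eps` (an assumption) avoids dividing by zero.
    """
    occlusion = sum(1.0 / max(exposures[t][c], eps)
                    for t, c in enumerate(choice))
    switches = sum(1 for a, b in zip(choice, choice[1:]) if a != b)
    return w_occlusion * occlusion + w_switch * switches
```

Raising `w_switch` relative to `w_occlusion` makes a sequence with fewer switches cheaper, which corresponds to the user preferring a steadier picture over maximum exposure.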

FIG. 26 shows an example of an image switching control algorithm that applies Dijkstra's algorithm, a method for solving the shortest path problem. For convenience of explanation, FIG. 26 shows switching control among three cameras (three images), but the algorithm is also applicable to switching among four or more cameras. First, the minimum number of frames n during which image switching is prohibited (n is an integer of 2 or more) is set in advance (for example, when n = 10 is set, the image of the same camera is guaranteed to continue to be displayed for at least 10 frames). The camera C1 selected at a certain frame t is regarded as a vertex, and edges are drawn from it to the cameras that can be selected next. For the same camera, frame t and the next frame t+1 are connected by an edge 500, and the operative field occlusion of the image of camera C1 at frame t+1 is set as the weight of edge 500. For different cameras, camera C1 at frame t and camera C2 at frame t+n are connected by an edge 501, and the sum of the operative field occlusion of the images of camera C2 from frame t+1 to frame t+n is set as the weight of edge 501. Similarly, camera C1 at frame t and camera C3 at frame t+n are connected by an edge 502, and the sum of the operative field occlusion of the images of camera C3 from frame t+1 to frame t+n is set as the weight of edge 502. After the graph structure has been created in this way, the optimal combination problem is solved by Dijkstra's algorithm. When edge 501 (switching to a different camera) is selected, the display switches to camera C2 at frame t+1, and the image of camera C2 continues to be displayed for the n frames from frame t+1 to frame t+n. The same applies when edge 502 is selected. With such an algorithm, images are selected so that the operative field occlusion is minimized while image switching within a period shorter than n frames is prohibited.
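The graph construction described above can be sketched as follows. This is an illustrative reading, not the implementation of the embodiment: the function name `plan_cameras` is hypothetical, the whole occlusion table is assumed to be available in advance (as with recorded video), any camera is allowed at the first frame, and a switch is only allowed while at least n frames remain.

```python
import heapq
from typing import List, Sequence


def plan_cameras(occ: Sequence[Sequence[float]], n: int) -> List[int]:
    """Choose one camera per frame with Dijkstra's algorithm.

    occ[t][c] is the non-negative occlusion of camera c at frame t;
    n is the number of frames a camera is held after a switch.
    """
    T, C = len(occ), len(occ[0])
    # prefix[c][t] = occ[0][c] + ... + occ[t-1][c], for quick range sums.
    prefix = [[0.0] * (T + 1) for _ in range(C)]
    for c in range(C):
        for t in range(T):
            prefix[c][t + 1] = prefix[c][t] + occ[t][c]

    # Vertices are (frame, camera); every camera is a possible start.
    dist = {(0, c): float(occ[0][c]) for c in range(C)}
    prev = {(0, c): None for c in range(C)}
    heap = [(d, v) for v, d in dist.items()]
    heapq.heapify(heap)
    done = set()
    goal = None
    while heap:
        d, v = heapq.heappop(heap)
        if v in done:
            continue
        done.add(v)
        t, c = v
        if t == T - 1:  # first settled vertex in the last frame is optimal
            goal = v
            break
        # Same camera, next frame (edge 500 in the text).
        edges = [((t + 1, c), occ[t + 1][c])]
        # Different camera, n frames ahead (edges 501 and 502).
        if t + n < T:
            for c2 in range(C):
                if c2 != c:
                    w = prefix[c2][t + n + 1] - prefix[c2][t + 1]
                    edges.append(((t + n, c2), w))
        for u, w in edges:
            if d + w < dist.get(u, float("inf")):
                dist[u] = d + w
                prev[u] = v
                heapq.heappush(heap, (d + w, u))

    # Walk back from the goal, filling the frames covered by each edge.
    plan = [0] * T
    v = goal
    while v is not None:
        t, c = v
        p = prev[v]
        first = p[0] + 1 if p is not None else 0
        for k in range(first, t + 1):
            plan[k] = c
        v = p
    return plan
```

Because a switch edge jumps n frames at once and charges the occlusion of the new camera for all of them, a short spike in occlusion on the current camera does not cause a switch unless the new camera is better over the whole n-frame window.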

The video display unit 325 in FIG. 22 may extract a region including the operative field (called the "operative field region") from the image selected by the video selection unit 324, and display the extracted operative field region enlarged on the display device 330. This makes it possible to observe the operative field more appropriately. The operative field region may be extracted, for example, by recognizing it with a classifier that has learned the image features of operative field regions from a large number of surgical images (training data), by extracting a high-luminance region in the image, or based on the user's teaching of the operative field region. Methods by which the user can teach the operative field region include specifying the region with a pointing device and indicating a region of interest by gaze input. These teaching methods can also be applied when setting operative field regions in training data.

FIG. 27 shows another configuration example of the surgical image display device. It differs from the configuration of FIG. 22 in that an image conversion unit 326 is provided between the video selection unit 324 and the video display unit 325. The image conversion unit 326 has a function of converting a camera's image into an image viewed from a reference viewpoint, based on the information on that camera's viewpoint.

For example, if the three cameras are denoted C1 to C3 and the viewpoint of camera C1 is set as the reference viewpoint, the images I2 and I3 of cameras C2 and C3 can be converted into images viewed from the viewpoint of camera C1 by the following conversion formula.

(Conversion formula: not reproduced in this text.)

Here, (x1, y1) are the coordinates of image I1, (xi, yi) are the coordinates of image Ii (i = 2, 3), and Ai is the transformation matrix that converts the coordinates of image Ii into the coordinates of image I1. The transformation matrix Ai can express transformations such as rotation and enlargement/reduction of the image. As described above, the viewpoint information of each camera is known, so the transformation matrix Ai can be obtained from that viewpoint information by geometric calculation.
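Since the formula itself is missing from this text, the following is only one form that is consistent with the definitions of (x1, y1), (xi, yi) and Ai given above. It assumes homogeneous coordinates and a 3 x 3 matrix Ai; the original may instead use a 2 x 2 matrix acting on (xi, yi) directly. The rotation angle \theta_i, the scale factor s_i and the translation (t_{x,i}, t_{y,i}) are symbols introduced here for illustration and do not appear in the source.

```latex
\begin{pmatrix} x_1 \\ y_1 \\ 1 \end{pmatrix}
= A_i \begin{pmatrix} x_i \\ y_i \\ 1 \end{pmatrix},
\qquad i = 2, 3,
\qquad
A_i =
\begin{pmatrix}
s_i \cos\theta_i & -s_i \sin\theta_i & t_{x,i} \\
s_i \sin\theta_i & \phantom{-}s_i \cos\theta_i & t_{y,i} \\
0 & 0 & 1
\end{pmatrix}
```

With s_i = 1 and zero translation this reduces to a pure rotation, and with \theta_i = 0 to a pure enlargement/reduction, which are the two cases the text names.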

Such image conversion makes it possible to align the viewpoints of the displayed images (that is, how the images look). The discontinuity of the image at camera switching can therefore be kept as small as possible, so that viewing can continue without a sense of discomfort even when the camera is switched according to the surgical field exposure.

The image conversion unit 326 can also acquire depth information for each pixel in each camera's image from the images of the plurality of cameras and convert the images into a three-dimensional image. By omitting pixels whose depth value is smaller than a certain value, the video display unit 325 can display the surgical field as a three-dimensional image in which the operator's head appears to have vanished. The video display unit 325 can also rotate the three-dimensional image created by the image conversion unit 326 and display it from an arbitrary viewpoint in response to the viewer's operation. When a head-mounted display is used as the display device 330, the viewer can observe the three-dimensional image more intuitively and can have a viewing experience as if participating in the actual surgery.
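The depth-based omission described above can be sketched as a simple per-pixel filter. This is an illustrative assumption rather than the method of the specification: the function name `hide_near_pixels`, the use of `None` to mark a pixel that is not drawn, and the depth values in metres are all hypothetical, and the depth map is taken as already computed.

```python
def hide_near_pixels(colors, depths, min_depth):
    """Drop pixels that lie closer to the camera than min_depth.

    colors and depths are equally sized lists of rows. A pixel whose depth
    is smaller than min_depth (for example the operator's head, which sits
    between the camera and the surgical field) is replaced by None, meaning
    "do not draw". All other pixels are kept unchanged.
    """
    return [
        [color if depth >= min_depth else None
         for color, depth in zip(color_row, depth_row)]
        for color_row, depth_row in zip(colors, depths)
    ]


# Toy 2x3 frame: two near pixels (a head) in front of the surgical field.
colors = [["head", "head", "field"],
          ["field", "field", "field"]]
depths = [[0.4, 0.5, 1.2],
          [1.1, 1.2, 1.3]]

visible = hide_near_pixels(colors, depths, min_depth=1.0)
```

In a real system the omitted positions would be filled from another camera's view of the same surface; the sketch only shows the thresholding step.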

10 Subject
12 Main camera
14 Sub camera
20 Information processing device
22 Image acquisition unit
24 Control unit
26 Image recognition unit
28 Counting unit
30 Average point calculation unit
32 Video selection unit (image selection unit)
34 Video display unit
40 Display device
52, 54, 56, 58 Regions in which a hand is recognized (recognition regions)
62, 64, 66, 68 Center points of the recognition regions
70 Average point of the center points of the recognition regions
80 Center point of the image
90 Distance between the average point and the center point of the image

Claims

1. A surgical image display device comprising:
an acquisition unit that acquires a plurality of video data obtained by photographing the same surgical field from different viewpoints using a plurality of cameras;
an evaluation unit that evaluates, for each frame image of each video data, a surgical field exposure, which is the degree to which the surgical field is captured in the image;
a selection unit that selects, for each frame, video data from among the plurality of video data based on the surgical field exposure of each image of the plurality of video data; and
a video display unit that displays an image of the video data selected by the selection unit on a display device,
wherein the evaluation unit detects, from the image, a region that is exposed without being covered by a sterile drape as the surgical field, and evaluates the surgical field exposure as greater the larger the area of the detected region.

2. A surgical image display device comprising:
an acquisition unit that acquires a plurality of video data obtained by photographing the same surgical field from different viewpoints using a plurality of cameras;
an evaluation unit that evaluates, for each frame image of each video data, a surgical field exposure, which is the degree to which the surgical field is captured in the image;
a selection unit that selects, for each frame, video data from among the plurality of video data based on the surgical field exposure of each image of the plurality of video data; and
a video display unit that displays an image of the video data selected by the selection unit on a display device,
wherein the selection unit determines the video data to be selected by solving an optimization problem set so as to make the surgical field exposure of the image as large as possible and the switching frequency of the video data as small as possible.

3. The surgical image display device according to claim 2, wherein the selection unit has a function of allowing a user to adjust a parameter that determines the weighting between the surgical field exposure and the switching frequency.

4. A surgical image display device comprising:
an acquisition unit that acquires a plurality of video data obtained by photographing the same surgical field from different viewpoints using a plurality of cameras;
an evaluation unit that evaluates, for each frame image of each video data, a surgical field exposure, which is the degree to which the surgical field is captured in the image;
a selection unit that selects, for each frame, video data from among the plurality of video data based on the surgical field exposure of each image of the plurality of video data;
a video display unit that displays an image of the video data selected by the selection unit on a display device; and
an image conversion unit that converts, based on viewpoint information of each camera, the image of the video data captured by that camera into an image viewed from the same viewpoint,
wherein the video display unit displays the image converted by the image conversion unit on the display device.

5. The surgical image display device according to any one of claims 1 to 4, wherein the evaluation unit detects, from the image, at least one of the surgical field, an object that may be present around the surgical field, and an object that blocks the surgical field, and evaluates the surgical field exposure based on the detection result.

6. The surgical image display device according to claim 5, wherein the evaluation unit detects a hand from the image as an object that may be present around the surgical field, and evaluates an image in which a hand is detected as having a greater surgical field exposure than an image in which no hand is detected.

7. The surgical image display device according to claim 6, wherein the evaluation unit evaluates the surgical field exposure as greater the more hands are detected in the image.

8. The surgical image display device according to claim 5, wherein the evaluation unit detects a cap from the image as an object that blocks the surgical field, and evaluates an image in which a cap is detected as having a smaller surgical field exposure than an image in which no cap is detected.

9. The surgical image display device according to claim 8, wherein the evaluation unit evaluates the surgical field exposure as smaller the larger the area of the cap region detected in the image.

10. The surgical image display device according to claim 5, wherein the evaluation unit detects an incision from the image as the surgical field, and evaluates an image in which an incision is detected as having a greater surgical field exposure than an image in which no incision is detected.

11. The surgical image display device according to claim 5, wherein the evaluation unit detects, from the image, a region that satisfies a predetermined color condition as the surgical field, and evaluates the surgical field exposure as greater the larger the area of the region that satisfies the predetermined color condition.

12. The surgical image display device according to claim 5, wherein the evaluation unit detects, from the image, a region that satisfies a predetermined brightness condition as the surgical field, and evaluates the surgical field exposure as greater the larger the area of the region that satisfies the predetermined brightness condition.

13. The surgical image display device according to any one of claims 1 to 12, wherein the selection unit selects, from among the plurality of video data, the video data whose image has the greatest surgical field exposure.

14. The surgical image display device according to any one of claims 1 to 13, wherein the video display unit extracts a region containing the surgical field from the image of the video data selected by the selection unit, and displays the extracted region enlarged on the display device.

15. The surgical image display device according to any one of claims 1 to 14, further comprising a shadowless lamp that illuminates the surgical field,
wherein the plurality of cameras are provided on the shadowless lamp.

16. A control method for a surgical image display device, the method comprising:
a step of acquiring a plurality of video data obtained by photographing the same surgical field from different viewpoints using a plurality of cameras;
a step of evaluating, for each frame image of each video data, a surgical field exposure, which is the degree to which the surgical field is captured in the image;
a step of selecting, for each frame, video data from among the plurality of video data based on the surgical field exposure of each image of the plurality of video data; and
a step of displaying an image of the selected video data on a display device,
wherein, in the evaluating step, a region that is exposed without being covered by a sterile drape is detected from the image as the surgical field, and the surgical field exposure is evaluated as greater the larger the area of the detected region.

17. A control method for a surgical image display device, the method comprising:
a step of acquiring a plurality of video data obtained by photographing the same surgical field from different viewpoints using a plurality of cameras;
a step of evaluating, for each frame image of each video data, a surgical field exposure, which is the degree to which the surgical field is captured in the image;
a step of selecting, for each frame, video data from among the plurality of video data based on the surgical field exposure of each image of the plurality of video data; and
a step of displaying an image of the selected video data on a display device,
wherein, in the selecting step, the video data to be selected is determined by solving an optimization problem set so as to make the surgical field exposure of the image as large as possible and the switching frequency of the video data as small as possible.

18. A control method for a surgical image display device, the method comprising:
a step of acquiring a plurality of video data obtained by photographing the same surgical field from different viewpoints using a plurality of cameras;
a step of converting, based on viewpoint information of each camera, the image of the video data captured by that camera into an image viewed from the same viewpoint;
a step of evaluating, for each frame image of each video data, a surgical field exposure, which is the degree to which the surgical field is captured in the image;
a step of selecting, for each frame, video data from among the plurality of video data based on the surgical field exposure of each image of the plurality of video data; and
a step of displaying an image of the selected video data on a display device,
wherein the image converted into an image viewed from the same viewpoint is displayed on the display device.

19. A program for causing a processor provided in a surgical image display device to execute each step of the control method according to any one of claims 16 to 18.

20. A non-transitory computer-readable recording medium storing the program according to claim 19.