JP6904757B2

JP6904757B2 - Camera system

Info

Publication number: JP6904757B2
Application number: JP2017073640A
Authority: JP
Inventors: 道輝柴原
Original assignee: Hitachi Kokusai Electric Inc
Current assignee: Hitachi Kokusai Electric Inc
Priority date: 2017-04-03
Filing date: 2017-04-03
Publication date: 2021-07-21
Anticipated expiration: 2037-04-03
Also published as: JP2018182367A

Description

The present invention relates to a camera system comprising a camera body and a receiving unit.

An outline of a conventional camera system is shown in FIGS. 1 to 3.
The surveillance camera 101 applies signal processing to the image signal output from the image sensor and outputs a video signal. The video signal processing unit 16 includes an accumulation processor 37, a gain-up unit 31, gamma correction 35, a knee processor 33, a color corrector (masking) 32, an enhancer 34, WDR (wide dynamic range) processing, a fog/haze corrector (DEFOG) 39, a DNR (digital noise reduction) processor 36, electronic zoom, and so on. The receiving unit 102 receives a compressed or uncompressed video signal such as HD-VLC (trademark) or HD-SDI (High Definition Serial Digital Interface) from the camera and converts its format so that it can be connected to a variety of monitors and recording devices. It provides VBS (Video (video signal), Burst (color burst signal), Sync (synchronization signal)) output, DVI-D output, analog RGB (Red, Green, Blue) output, and HD-SDI output.

The greatest advantage of the configuration consisting of the surveillance camera 101 and the receiving unit 102 is that, by outputting from the camera in the HD-VLC compressed transmission format and restoring the HD-SDI signal in the receiving unit, video can be transmitted over a coaxial cable 103 of up to 300 m. This made it possible to replace an analog camera with an HD-SDI camera while reusing the existing coaxial cable. In addition, the conventional surveillance camera 101 and receiving unit 102 are equipped with FPGAs (Field Programmable Gate Arrays) 12 and 22 to perform video signal processing. The advantage of an FPGA, which is a kind of dynamically reconfigurable device, is that signal processing is easy to customize, so requests such as added functions can be met quickly. The disadvantage is that an FPGA consumes more power than a commercially available signal processing LSI chip (SoC); adding functions raises power consumption further, and the temperature of the camera body rises accordingly.

Conventionally, functions have been added so that the camera alone satisfies a wide range of requirements, so a rise in body temperature accompanying the increase in camera power consumption was unavoidable.
If the camera and the receiving unit are instead treated as a set, functions need not be added only to the camera; adding them on the receiving unit side makes it possible to hold down the power consumption of the camera (see, for example, Patent Documents 1 to 3).

Patent Document 1: Japanese Unexamined Patent Publication No. 2010-154462
Patent Document 2: Japanese Unexamined Patent Publication No. 2007-221684
Patent Document 3: Japanese Patent No. 5874519
Patent Document 4: Japanese Unexamined Patent Publication No. 2011-135532
Patent Document 5: Japanese Unexamined Patent Publication No. 2007-036787
Patent Document 6: Japanese Unexamined Patent Publication No. 2010-136032
Patent Document 7: Japanese Unexamined Patent Publication No. 2014-036414
Patent Document 8: Japanese Unexamined Patent Publication No. H09-139939
Patent Document 9: Japanese Patent No. 5535476
Patent Document 10: Japanese Patent No. 2995683
Patent Document 11: Japanese Unexamined Patent Publication No. 2011-259047
Patent Document 12: Japanese Unexamined Patent Publication No. 2014-171119

"Partial & Dynamic Reconfiguration to Improve the Multifunctionality of 28nm FPGAs", [online], 2010, Altera Corporation, [retrieved February 28, 2017], Internet <https://www.altera.co.jp/ja_JP/pdfs/literature/wp/wp-01137-stxv-dynamic-partial-reconfig_j.pdf>

However, when the camera and the receiving unit to be combined are not fixed, some combinations may result in duplicated or missing functions. Duplication causes no malfunction as long as one of the duplicated functions is stopped, but it means that hardware resources are not being used effectively.
Moreover, the image processing that users require is diverse, and even with a camera and receiving unit configured as a set, the available processing may be insufficient. Trying to cover a wide range of functions according to demand increases the kinds of processing to be performed on the camera side, so that many models must be prepared.

In view of the above problems, an object of the present invention is to provide a camera system in which necessary functions can be added appropriately to either the camera or the receiving unit.

The camera system of the present invention is a camera system in which a camera and a receiving unit are connected by a coaxial cable. The surveillance camera has a first dynamically reconfigurable device that performs video signal processing on the signal output from an image sensor, a first non-volatile memory that holds first configuration data for setting the configuration of the first dynamically reconfigurable device, and a first CPU that loads the first configuration data into the first dynamically reconfigurable device and performs control so that information capable of identifying the video signal processing performed by the first dynamically reconfigurable device, as configured by that loading, is output over the cable. The receiving unit has a second dynamically reconfigurable device that performs video signal processing on the video signal output from the surveillance camera and received via the cable, a second non-volatile memory that holds a plurality of second configuration data for setting the configuration of the second dynamically reconfigurable device, and a second CPU that, based on the identifying information output from the surveillance camera, selects from the plurality of second configuration data the data to be loaded and loads it. The video signal of a predetermined format output from the receiving unit is a signal that has undergone video signal processing shared between the first dynamically reconfigurable device and the second dynamically reconfigurable device.

Preferably, the receiving unit has a memory, and the second dynamically reconfigurable device uses the memory to perform video signal processing that involves inter-frame operations, while the first dynamically reconfigurable device performs no video signal processing that involves inter-frame operations.

Furthermore, it is preferable that the first dynamically reconfigurable device has a first color corrector that converts an input video signal of four or more colors into a three-color video signal; that the receiving unit has an artificial intelligence that learns and recognizes a predetermined subject appearing in the video corrected by the color corrector; that the color corrector performs color correction using a clustering unit, a mapping table, a transformation matrix table, a subtractor, a matrix multiplier, and an adder; and that the artificial intelligence comprises a multilayer neural network, a learning processor, a learning data store, a learning history store, and a second color corrector.

According to the present invention, combining a camera equipped with an FPGA with a receiving unit equipped with the same kind of FPGA makes it possible to swap functions between them while suppressing heat generation in the camera.

FIG. 1: Configuration diagram of the conventional surveillance camera 101 and receiving unit 102. Off-the-shelf receiving units are available in 1-channel and 4-channel versions.
FIG. 2: Block diagram of the conventional camera system. The receiving unit has no video signal processing unit.
FIG. 3: Example block diagram of the video signal processing unit 16 of the conventional camera. Power consumption rises because functions are added on the camera side.
FIG. 4: Block diagram of the camera system of the embodiment.
FIG. 5: Block diagram of the video signal processing unit 16 of the camera 1.
FIG. 6: Block diagram of the video signal processing unit 26 of the receiving unit 2.
FIG. 7: Block diagram of the video signal processing unit 16 of the modified example.
FIG. 8: Block diagram of the artificial intelligence 50 on the receiving unit 2 side of the modified example.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
FIG. 4 shows the configuration of the camera system according to the embodiment. The camera system includes a surveillance camera 1 and a receiving unit 2. As a feature of this embodiment, a video signal processing unit 26 is added to the receiving unit 2.
The surveillance camera 1 includes an image sensor (CMOS) 11, an FPGA 12, a memory 13, a ROM (Read Only Memory) 14, and a power separator 15.
The CMOS 11 is an image sensor that is mainly sensitive to visible light. It has a digital video interface such as LVDS for outputting the video signal and a control interface such as I2C, and is connected to the FPGA 12 through both interfaces. It is desirable that the CMOS 11 can output at least one of 1080p, 1080i, and 720p video signals.

The FPGA 12 includes a video signal processing unit 16, an HD-VLC compression unit 17, and a CPU (Central Processing Unit) core 18. A device in which a CPU and an FPGA are integrated in this way is also called a programmable SoC.
The memory 13 is, for example, an ordinary DDR2 (Double Data Rate 2) DRAM (Dynamic Random Access Memory), and is used as a frame memory and the like.

The ROM 14 is a flash memory or the like that stores the configuration data of the FPGA 12 and the instructions and data for the CPU core 18.
The power separator 15 is connected to the coaxial cable 3 and separates the video signal transmitted from the surveillance camera 1 to the receiving unit 2 from the power supplied from the receiving unit 2 to the surveillance camera 1.

The video signal processing unit 16 is a reconfigurable hardware resource within the FPGA 12, in which soft macros for video signal processing are instantiated. The video signal processing unit 16 is configured to be able to access the memory 13.
The HD-VLC compression unit 17 is an instantiation of hard macros such as Dirac-based video compression and a cable driver. It receives an HD-SDI video signal, converts it into a signal suited to cable transmission, and outputs it. The HD-VLC compression unit 17 can transmit HD-SDI, including ancillary data, in a form that the receiving unit can reproduce.

The CPU core 18 is a CPU or DSP (Digital Signal Processor) built into the FPGA 12 and, like an ordinary single-chip microcontroller, includes an interrupt controller, timers, DMA, cache memory, a memory controller, and so on. The CPU core 18 exchanges control information or video data with the video signal processing unit 16, the HD-VLC compression unit 17, and the CMOS 11 via registers, memory-mapped I/O, and a serial bus, and can control their operation or take on part of the video signal processing. As a feature of this example, the CPU core 18 can select, from among a plurality of bitstreams for partial reconfiguration stored in the ROM 14, the streams to be loaded into the video signal processing unit 16, and can control the actual loading. It can also pass the result of the stream selection, or information identifying the image processing pipeline instantiated on the camera 1 side, to the HD-VLC compression unit 17 as ancillary data.
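The hand-off of pipeline-identifying information described above can be sketched as follows. This is only an illustration: the patent does not specify the ancillary data layout, so the length-prefixed byte format, the `STAGES` table (which simply reuses the reference numerals as IDs), and the function names are all assumptions.

```python
# Hypothetical stage IDs: the values reuse the reference numerals of the
# description purely for readability; the real ancillary format is not given.
STAGES = {
    "gain_up": 31, "color_correction": 32, "knee": 33, "enhancer": 34,
    "gamma": 35, "dnr": 36, "accumulation": 37, "wdr": 38, "defog": 39,
}

def encode_pipeline(active):
    """Pack the active stage names, in pipeline order, as count + ID bytes."""
    ids = [STAGES[name] for name in active]
    return bytes([len(ids)] + ids)

def decode_pipeline(payload):
    """Recover the ordered stage names from a payload built by encode_pipeline."""
    names = {value: key for key, value in STAGES.items()}
    count = payload[0]
    return [names[b] for b in payload[1:1 + count]]

# Camera-side pipeline of FIG. 5 as listed in the description.
camera_pipeline = ["gain_up", "color_correction", "knee", "enhancer", "gamma"]
payload = encode_pipeline(camera_pipeline)
```

On the receiving side, `decode_pipeline(payload)` would return the same ordered list, which is all the receiver needs in order to know which processing has already been applied.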

The receiving unit 2 includes a format conversion IC 21, an FPGA 22, a memory 23, a ROM 24, and a power superimposer 25. In terms of hardware, the FPGA 22, the memory 23, and the ROM 24 are the same as the FPGA 12, the memory 13, and the ROM 14. The FPGA 22 differs from the FPGA 12 in, for example, that an HD-VLC restoration unit 27 is instantiated in place of the HD-VLC compression unit 17. The ROM 24 differs from the ROM 14 in, for example, that configuration data and instructions needed only on the camera side are omitted and configuration data and instructions needed only on the receiving unit side are added.

The format conversion IC 21 is a dedicated IC that receives predetermined digital video data, converts it to a video interface such as VBS (Video, Blanking, and Sync), RGB, or DVI-D, and outputs it. When an HD-SDI signal is input and VBS is output, the format conversion IC 21 performs resampling (downsampling), progressive/interlace conversion, and the like.

The power superimposer 25 is connected to the coaxial cable 3, receives the video signal transmitted from the camera 1, and supplies 12 V DC to the coaxial cable 3 as the power source for the surveillance camera 1.

The video signal processing unit 26 is a reconfigurable hardware resource within the FPGA 22, in which soft macros for video signal processing are instantiated. The video signal processing unit 26 is configured to be able to access the memory 23.

The HD-VLC restoration unit 27 is an instantiation of hard macros such as Dirac-based video decoding and a cable equalizer. It decompresses the video signal received from its counterpart, the HD-VLC compression unit 17, and outputs a video signal equivalent to HD-SDI. The ancillary data is embedded in the HD-SDI signal and is also provided to the CPU core 28.

The CPU core 28 can select, from among a plurality of streams for partial reconfiguration stored in the ROM 24, the streams to be loaded into the video signal processing unit 26, and can control the actual loading. When it receives from the HD-VLC restoration unit 27, as ancillary data, the result of the stream selection on the camera 1 side or information identifying the instantiated image processing pipeline, it performs a partial or full reconfiguration of the FPGA 22 accordingly. When a partial, dynamic update is performed, reconfiguration can be achieved without interrupting the video by providing bypass wiring for the signal.
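A minimal sketch of the selection step performed by the CPU core 28 is given below, assuming the receiver knows which stages the camera has instantiated and which stages the final output requires. The catalogue `ROM24_STREAMS`, the file names, and the tie-breaking rule (least overlap with the camera, then smallest stream) are hypothetical; the patent only states that a stream is selected based on the identifying information.

```python
# Hypothetical catalogue of partial-reconfiguration streams held in ROM 24.
# Each entry lists the processing stages that the stream instantiates.
ROM24_STREAMS = {
    "rx_full.bit": {"dnr", "accumulation", "wdr", "defog"},
    "rx_tone.bit": {"wdr", "defog"},
    "rx_camera_like.bit": {"knee", "enhancer", "gamma", "dnr"},
}

def select_stream(camera_stages, required, streams=ROM24_STREAMS):
    """Choose a stream that supplies every required stage the camera lacks.

    Among the streams that cover the missing stages, prefer the one that
    duplicates the fewest camera-side stages, then the smallest one.
    Returns None when no stream covers the missing stages.
    """
    camera = set(camera_stages)
    missing = set(required) - camera
    candidates = [
        (len(stages & camera), len(stages), name)
        for name, stages in streams.items()
        if missing <= stages
    ]
    return min(candidates)[2] if candidates else None

camera = ["gain_up", "color_correction", "knee", "enhancer", "gamma"]
chosen = select_stream(camera, camera + ["dnr", "wdr"])
```

With the catalogue above, `chosen` is `"rx_full.bit"`, because it is the only stream that provides both noise reduction and wide dynamic range processing.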

FIG. 5 is a block diagram of the video signal processing unit 16 of the surveillance camera 1 of the present embodiment. Here, the camera 1 is assumed to be a single-sensor color camera. The image processing pipeline of the surveillance camera 1 consists of a gain-up unit 31, a color corrector 32, a knee processor 33, an enhancer 34, and gamma correction 35. As a feature of this example, functions that require memory processing are removed in order to hold down power consumption. The input/output interfaces of every stage from the gain-up unit 31 onward are unified; for example, the bit depth of the pixel data is 14 bits and the interface is synchronous. For color video, it is desirable that the data after color separation be unified to a predetermined color format such as RGB 4:4:4 or YUV 4:2:2.

The gain-up unit 31 receives pixel data directly from the CMOS 11, applies digital gain by multiplying the data by a predetermined coefficient, and outputs the result to the CSC. When the CMOS 11 has a large number of pixels, binning may be performed, and when, for example, different exposures are used line by line, HDR processing may be performed. The bit depth of the pixel data from the CMOS 11 may differ from the bit depth of the data output by the gain-up unit 31.
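As an illustration of the digital gain step, the sketch below scales sensor samples to the 14-bit pipeline depth mentioned for FIG. 5 and multiplies them by a fixed-point coefficient with saturation. The 12-bit input depth and the Q8 gain format are assumptions chosen for the example, not values taken from the patent.

```python
def gain_up(pixels, gain_q8, in_bits=12, out_bits=14):
    """Apply digital gain to raw samples and clip to the output bit depth.

    gain_q8 is the gain in Q8 fixed point (256 means x1.0), a format assumed
    here because it maps naturally onto an FPGA multiplier.
    """
    shift = out_bits - in_bits          # align input samples to output depth
    max_out = (1 << out_bits) - 1       # 16383 for 14-bit output
    return [min(((p << shift) * gain_q8) >> 8, max_out) for p in pixels]

unity = gain_up([0, 100, 4095], gain_q8=256)
doubled = gain_up([0, 100, 4095], gain_q8=512)
```

At unity gain the samples are only re-scaled to 14 bits, while at x2 gain the brightest sample saturates at 16383 instead of wrapping around.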

The color corrector 32 demosaics, as needed, video captured through RGB or CMYG (Cyan, Magenta, Yellow, Green) color filters, and then outputs a color-corrected video signal by means of white balance adjustment, a conditional linear operation such as that of Patent Document 10 (Japanese Patent No. 2995683), a projective transformation in color space such as that of Patent Document 11, or the like.

The knee processor 33 reduces the slope of the tone mapping (the knee slope) for luminance above a predetermined knee point, so that luminance saturation is less likely to occur even with a small dynamic range. In addition, when the luminance exceeds the level at which saturation begins in the most easily saturated of the color channels of the CMOS 11, processing to prevent false color, such as setting the color difference to 0, may be performed.
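The knee characteristic described above is a piecewise-linear tone curve, which can be sketched as follows. The normalized luminance scale and the particular knee point and slope values are assumptions for illustration only.

```python
def knee(y, knee_point=0.8, knee_slope=0.25):
    """Piecewise-linear knee: unity slope up to knee_point, reduced slope above.

    y is luminance normalized so that 1.0 is the nominal output maximum;
    the default knee_point and knee_slope are illustrative values.
    """
    if y <= knee_point:
        return y
    return knee_point + (y - knee_point) * knee_slope

# Input highlights up to 1.6 are compressed into the output range 0.8 to 1.0.
compressed = [knee(v) for v in (0.5, 0.8, 1.2, 1.6)]
```

With these values, scene luminance up to twice the knee point still fits below the output maximum, which is the sense in which saturation becomes less likely.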

The enhancer 34 performs spatial filtering that emphasizes image detail or spatially smooths the gradation. Alternatively, particularly when the gain-up unit is applying a high gain (low illuminance), it performs filtering that suppresses outliers, such as median filtering.

The gamma correction 35 performs gamma processing corresponding to a display gamma of about 1.9 to 2.4 in accordance with ITU-R Rec. 709, BT.1886, or the like and, if necessary, converts the color space to a format compatible with HD-SDI (YCbCr (Y: luminance, Cb: blue color difference, Cr: red color difference) or RGB) and outputs the result to the HD-VLC compression unit 17.
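As one concrete instance of such gamma processing, the sketch below implements the ITU-R BT.709 opto-electronic transfer function on normalized linear light. Whether the camera uses exactly this curve is not stated in the patent; the function name is an assumption.

```python
def gamma_709(linear):
    """ITU-R BT.709 OETF for normalized linear light in the range 0.0 to 1.0."""
    if linear < 0.018:
        return 4.5 * linear                      # linear segment near black
    return 1.099 * linear ** 0.45 - 0.099        # power-law segment

# Encode a short ramp of linear values.
encoded = [gamma_709(v) for v in (0.0, 0.01, 0.18, 1.0)]
```

The linear segment near black avoids the infinite slope that a pure power law would have at zero, which would otherwise amplify sensor noise in dark regions.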

FIG. 6 is a block diagram of the video signal processing unit 26 of the receiving unit 2 of the present embodiment. The video signal processing unit 26 receives a baseband video signal compatible with HD-SDI from the HD-VLC restoration unit 27. The image processing pipeline of the video signal processing unit 26 consists of a digital noise reduction processor 36, an accumulation processor 37, a wide dynamic range processor 38, a fog/haze corrector 39, and a memory IF 40.

The digital noise reduction processor 36 is an adaptive noise filter that performs time-domain filtering with weights that are optimally controlled according to the speed of subject motion, the SN ratio, and the like. It is commonly implemented as a recursive filter.
The accumulation processor 37 simply adds the images of a plurality of frames and outputs the sum. For example, when N frames are accumulated, the output frame is updated once every N frames.
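The two inter-frame operations above can be sketched on frames represented as flat lists of pixel values. The recursive filter here uses a single fixed weight, whereas the text describes a weight that is adapted to motion and SN ratio; that simplification and the function names are assumptions.

```python
def recursive_dnr(frames, weight):
    """First-order recursive temporal filter: y[n] = (1 - w) * x[n] + w * y[n-1].

    A fixed weight is used for brevity; an adaptive version would vary it
    per pixel according to motion and noise level.
    """
    output, previous = [], None
    for frame in frames:
        if previous is None:
            current = list(frame)
        else:
            current = [(1 - weight) * x + weight * p
                       for x, p in zip(frame, previous)]
        output.append(current)
        previous = current
    return output

def accumulate(frames, n):
    """Sum each consecutive group of n frames; one output per n input frames."""
    output = []
    for start in range(0, len(frames) - n + 1, n):
        group = frames[start:start + n]
        output.append([sum(pixels) for pixels in zip(*group)])
    return output

filtered = recursive_dnr([[0.0], [8.0], [8.0]], weight=0.5)
summed = accumulate([[1, 2], [3, 4], [5, 6], [7, 8]], n=2)
```

Both operations need at least one previous frame to be held in memory, which is why the description places them on the receiving unit side, where the frame memory 23 is used.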

The wide dynamic range processor 38 performs tone management using techniques such as histogram normalization, histogram adjustment, and Retinex. Although this processing does not require extensive time-domain operations on the pixel data, it demands considerable memory bandwidth for histogram computation and for generating an illumination model or a visual response model.
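To make the histogram-based variant concrete, the following sketch performs plain global histogram equalization on integer pixel levels. It is a textbook technique used here as a stand-in; the patent does not say which histogram method the wide dynamic range processor 38 actually uses.

```python
def equalize(pixels, levels=256):
    """Global histogram equalization of integer pixel values in [0, levels)."""
    histogram = [0] * levels
    for p in pixels:
        histogram[p] += 1

    # Cumulative distribution of pixel counts.
    cdf, running = [], 0
    for count in histogram:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)   # CDF at the darkest used level
    total = len(pixels)
    if total == cdf_min:                      # flat image: nothing to stretch
        return list(pixels)
    return [(cdf[p] - cdf_min) * (levels - 1) // (total - cdf_min)
            for p in pixels]

stretched = equalize([10, 10, 11, 12], levels=16)
```

The histogram has to be gathered over a whole frame before the mapping can be applied, which is one source of the memory traffic mentioned in the description.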

The memory IF 40 provides the digital noise reduction processor 36, the accumulation processor 37, and the wide dynamic range processor 38 with read and write access to the memory 23. These processors do not all need to be operable at the same time; only some of them may be run, within the limits of the available memory bandwidth.

The fog/haze corrector 39 removes the scattered-light component caused by fog and haze by using a dark channel image, which is composed of the minimum value across the color channels in the neighborhood of each pixel of interest. This processing yields a visibility improvement similar to that of wide dynamic range processing and can be realized with a processing load not much different from that of the enhancer 34.
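The dark channel image referred to above can be computed as sketched below, taking for each pixel the minimum over all color channels within a small square window. The window radius and the nested-list image representation are assumptions; the subsequent estimation and removal of the scattered light is not shown.

```python
def dark_channel(image, radius=1):
    """Dark channel of an image given as rows of (R, G, B) tuples.

    For each pixel, returns the minimum channel value found in the
    (2 * radius + 1) square window around it, clipped at the image border.
    """
    height, width = len(image), len(image[0])
    # Minimum across color channels at each pixel.
    channel_min = [[min(pixel) for pixel in row] for row in image]

    result = []
    for y in range(height):
        row = []
        for x in range(width):
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            row.append(min(channel_min[j][i] for j in ys for i in xs))
        result.append(row)
    return result

image = [[(9, 8, 7), (5, 6, 4)],
         [(3, 9, 9), (8, 8, 8)]]
per_pixel = dark_channel(image, radius=0)
windowed = dark_channel(image, radius=1)
```

Because only a local window is needed, the computation is a spatial filter comparable in cost to the enhancer, consistent with the remark about processing load.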

In the present embodiment, the FPGA 12 mounted in the surveillance camera 1 and the FPGA 22 mounted in the receiving unit 2 are the same device or highly compatible devices. The configuration data corresponding to the functions from the gain-up unit 31 to the memory IF 40 can therefore be shared at the stream level, at the hardware description language level, or at least at the high-level language level, so that design assets can be used effectively. The functional units from the gain-up unit 31 to the memory IF 40 need not each be made independently partially reconfigurable; it may be easier to prepare configuration streams for several patterns of combinations of those units and load them selectively. There is no intention here to rule out all duplication of functions between the camera 1 and the receiving unit 2; to reduce the number of streams that must be prepared, functions that consume little power or few resources may be duplicated.

By selectively instantiating, on one side or the other, the functional units that are held in an instantiable form in both the ROM 14 of the surveillance camera 1 and the ROM 24 of the receiving unit 2, the surveillance camera 1 and the receiving unit 2 behave as if functions could be swapped between them. For example, if it is noticed after installation that a camera has been placed in an environment where it is particularly prone to high temperatures, heat generation can be suppressed by simplifying the functions of that camera to those shown in FIG. 4. Stopping processing that uses the frame memory, such as wide dynamic range processing, is especially effective in reducing heat generation. At night, when there is no risk of overheating, the accumulation processing in the surveillance camera 1 can be enabled as appropriate.

In this example, transmission between the surveillance camera 1 and the receiving unit 2 is one-way, so the image processing performed in the surveillance camera 1 must be carried out autonomously, without instructions from the receiving unit 2, and the receiving unit 2 must be able to learn what processing has been performed. It is therefore desirable that the CPU core 18 also obtain, through registers and the like, the parameters set in the CMOS 11 and the parameters and internal states used by each functional unit of the video signal processing unit 16, and include them in the ancillary data so that they are transmitted to the receiving unit 2.

Any well-known scheme can also be used for transmission between the surveillance camera 1 and the receiving unit 2, including schemes capable of bidirectional communication such as HDcctv (trademark) and SMPTE 2022-6.

A modified example of the embodiment is described below with reference to FIGS. 7 and 8. In the modified example, an artificial intelligence 50 is provided on the receiving unit 2 side, together with a communication means capable of transmitting from the artificial intelligence 50 to the surveillance camera 1.
FIG. 7 is an internal block diagram of the color corrector 42 of the video signal processing unit 16 according to the modified example. The modified video signal processing unit 16 includes the color corrector 42 either as a replacement for the color corrector 32 or in addition to it.
The color corrector 42 executes the mapping (dimensionality reduction) part, which must be performed in real time, of the processing known as manifold learning or multidimensional scaling.
Specifically, the color corrector 42 receives a video signal with four or more color channels, performs in real time a non-linear color conversion that preserves subtle color differences as far as possible, and outputs a video signal with three color channels. The input signal is video in a color format such as that of a Cy, Mg, Ye, G complementary color sensor, or of a Bayer sensor in which one of the G channels has been changed to a different sensitivity wavelength (for example, near infrared); it is hereinafter called a four-color video signal.

The clustering unit 43 classifies each pixel of the four-color video signal into one of N clusters and outputs the result as a cluster ID. The clustering is prepared in advance by manifold learning or the like, and each cluster corresponds to a submanifold. The classification itself can be performed by well-known methods such as k-d trees, R-trees, and LSH. Non-linear methods such as LLE, t-SNE, and Laplacian eigenmaps can be used for the manifold learning. Alternatively, the clustering can be achieved by DSMAP or by a simple division of the color space at equal intervals.

When the mapping table 44 receives a cluster ID, it outputs the representative vector (source vector) of the corresponding cluster and the mapped vector that is the mapping destination. In this example, the representative vector is four-dimensional and the mapped vector is three-dimensional. The representative vector may instead be output from the clustering device 43.
When the transformation matrix table 45 receives a cluster ID, it outputs a 3-row, 4-column projective transformation (submersion) matrix A optimized in the neighborhood of the representative vector of that cluster.

The contents of the mapping table 44 and the transformation matrix table 45 are initially set to values acquired by manifold learning or the like on video of a predetermined subject, and can subsequently be modified by backpropagation in the course of learning by the artificial intelligence at the output destination of the receiving unit 2. If the clustering device 43, the mapping table 44, and the transformation matrix table 45 are implemented as embedded block RAM (Random Access Memory) in the FPGA, their contents are easy to rewrite. The version of each table is managed at each rewrite, and the camera 1 notifies the downstream side of the version in use as ancillary data, so that the artificial intelligence 50 can know what color conversion was performed in the surveillance camera 1.

When manifold learning is performed online, it is desirable that the video signal processing unit 16 also include a configuration for selecting pixels at random, a configuration for calculating and holding the variance of each cluster, and the like. On the other hand, the selection of representative vectors and the eigenvalue calculation, steepest descent algorithm, and the like for obtaining the projective transformation matrix A can be processed by the CPU core 18 or the HD-VLC restoration unit 27. Because manifold learning and the like leave degrees of freedom, when some constraint is needed, a constraint can be imposed so that the mapping becomes close to a general RGB color space. If consistency with human vision is given further weight, injectivity may be sacrificed. If the color distribution of the subject is biased, the mapping need not be surjective either. When various kinds of subjects are to be recognized, the inherent property of a local isometry can be given priority without imposing additional constraints. Note that the local isometry is not unique and can be learned.

The subtractor 46 subtracts the representative vector from the input four-color video signal and outputs the difference vector.

The matrix multiplier 47 multiplies the difference vector, which is a column vector, by the projective transformation matrix from behind and outputs the product.
The adder 48 adds the mapped vector to the output of the matrix multiplier 47 and outputs the sum as the result of the color corrector 42.
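The signal path through the clustering device 43, the mapping table 44, the transformation matrix table 45, the subtractor 46, the matrix multiplier 47, and the adder 48 can be sketched in Python as follows. This is only an illustrative sketch: the function names, the nearest-representative search standing in for the clustering device, and the sample table values are assumptions. The product is taken here as the 3×4 matrix A times the 4-element difference vector, since that is the dimensionally consistent reading for a column vector.

```python
def nearest_cluster(pixel, representatives):
    """Stand-in for clustering device 43: index of the closest representative."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(representatives)),
               key=lambda i: sq_dist(pixel, representatives[i]))


def correct_color(pixel, representatives, mapped, matrices):
    """Convert one 4-channel pixel to 3 channels with a per-cluster affine map."""
    cid = nearest_cluster(pixel, representatives)
    source = representatives[cid]  # mapping table 44: 4-D representative vector
    target = mapped[cid]           # mapping table 44: 3-D mapped vector
    a = matrices[cid]              # transformation matrix table 45: 3x4 matrix A
    # Subtractor 46: difference from the representative vector.
    diff = [p - s for p, s in zip(pixel, source)]
    # Matrix multiplier 47: A (3x4) applied to the difference vector (4x1).
    projected = [sum(a[r][c] * diff[c] for c in range(4)) for r in range(3)]
    # Adder 48: add the mapped vector.
    return [t + q for t, q in zip(target, projected)]


if __name__ == "__main__":
    # Hypothetical two-cluster tables, for illustration only.
    reps = [(0, 0, 0, 0), (100, 100, 100, 100)]
    mapped = [(0, 0, 0), (50, 50, 50)]
    drop_fourth = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
    print(correct_color((110, 90, 100, 100), reps, mapped, [drop_fourth] * 2))
```

Because each cluster has its own matrix and offset, the overall mapping is piecewise affine, which is how a table-driven circuit can approximate a non-linear conversion.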

The color corrector 42 can correct every pixel of the video individually, but it may also run on pixels thinned out to 1/2 in accordance with a format such as 4:2:2 or 4:2:0.

FIG. 8 is a configuration diagram of the artificial intelligence 50, which learns and recognizes video corrected by the color corrector 42. The artificial intelligence 50 includes a convolutional neural network (CNN) 51, a learning processor 52, a learning data store 53, a learning history store 54, and a color corrector 55, and can perform semi-supervised learning online. As hardware for the artificial intelligence 50, a computer suited to high-performance computing (HPC) can be used, in particular a workstation or PC equipped with GPGPU (General-Purpose computing on Graphics Processing Units).

The CNN 51 is a multi-layer neural network that recognizes known objects contained in an image, such as Faster R-CNN, SSD (Single Shot MultiBox Detector), or YOLO, and outputs the score and region coordinates of each recognized object. The CNN 51 is assumed to have been sufficiently trained on known objects before being used for recognition; in this example, it has been trained on multiple symptoms of a particular crop (for example, the leaves of a plant). Part or all of the CNN 51 can be implemented by the GPGPU, main memory, and the like of the computer.

The learning processor 52 is a processor that trains the CNN 51, and can be implemented by the CPU, main memory, and the like of the computer.

The learning data store 53 stores labeled images, which are the learning data. The learning data store 53 can be implemented by a non-volatile storage device such as an SSD provided in the computer.
A labeled image is an image containing the leaves or stems of a plant, with a label indicating, for example, a deficiency or excess of nitrogen, phosphoric acid, potassium, magnesium, calcium, or the like, a deficiency or excess of sunlight or irrigation, or the symptoms shown when the plant is attacked by a specific disease or pest. Each labeled image also carries color representation information indicating the color space in which it is expressed or the color correction it has undergone. The learning data store 53 temporarily holds input images and, in response to an instruction from the learning processor 52, can add a label and color representation information to a partial image of a temporarily held image and register it as a new labeled image.

The color representation information is, for example, the coefficients of a color matrix (3×3, 4×3, or 4×4) for converting the labeled image into a reference color system such as sRGB (IEC 61966-2-1), or for the inverse conversion, or information that identifies such a color matrix (color temperature, ID). When the labeled image was acquired by the camera 1, the version of each table in the color corrector 42 can be used as the color representation information.

The learning history store 54 stores the initial contents of each table used for mapping in the color corrector 42 and, when those tables are modified by the backpropagation performed by the learning processor 52, holds the history of the modifications. Each modification (version upgrade) changes the tables only slightly, so it is more efficient to generate a table on demand by applying the modifications in the history to the initial table than to keep a snapshot of the table for every modification. History older than the oldest labeled image held in the learning data store 53 never needs to be reproduced, so the contents of each table at the time of the oldest labeled image acquired by the camera 1 can be taken as the initial contents.
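The idea of regenerating a table from the initial contents plus a modification history, rather than storing a snapshot per version, can be sketched in Python as below. The record layout (version number, cluster ID, additive delta) and the function name are assumptions chosen for illustration; the description does not specify how the history is encoded.

```python
import copy


def rebuild_table(initial_table, history, version):
    """Recreate a table as of `version` by replaying recorded modifications.

    Assumptions: `initial_table` maps cluster ID -> list of numbers,
    `history` is a list of (version, cluster_id, delta) tuples sorted by
    version, and each delta is added element-wise to the stored row.
    """
    table = copy.deepcopy(initial_table)  # never modify the stored initial table
    for entry_version, cluster_id, delta in history:
        if entry_version > version:
            break  # later modifications are not part of the requested version
        row = table[cluster_id]
        table[cluster_id] = [v + d for v, d in zip(row, delta)]
    return table


if __name__ == "__main__":
    # Hypothetical mapped-vector table with two clusters and three updates.
    initial = {0: [10.0, 20.0, 30.0], 1: [40.0, 50.0, 60.0]}
    history = [(1, 0, [1.0, 0.0, 0.0]),
               (2, 1, [0.0, -2.0, 0.0]),
               (3, 0, [0.0, 0.0, 0.5])]
    print(rebuild_table(initial, history, version=2))
```

Storing only the changed rows keeps the history small when each version touches few clusters, at the cost of a replay whenever an older version is needed.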

The color corrector 55 approximately converts a labeled image held in the learning data store 53 into an image color-corrected according to the latest learning results. When the labeled image was acquired as a general sRGB image, the color corrector 55 can achieve the color correction with the same configuration and/or method as the color corrector 42. When the labeled image was acquired by the camera 1 and corrected by the color corrector 42, it is converted using a conversion table generated from the contents of the tables at the time of acquisition and of the latest tables. The conversion table is generated by the following steps 1 to 4.

Step 1: Associate every mapped vector in the tables at the time of acquisition with the mapped vector in the latest tables that corresponds to it best in the original space (the four-color video signal input to the color corrector 42). This is necessary when the clustering or the representative vectors have been updated since the time of acquisition. It is done by associating each cluster used by the clustering device 43 at the time of acquisition with the latest cluster that corresponds to it best spatially, and the result is expressed as an old cluster ID to new cluster ID conversion table. Alternatively, it can be done by searching the latest mapping table for the representative vector closest to each representative vector of the mapping table 44 at the time of acquisition.

Step 2: Set the clustering device of the color corrector 55 so that it classifies all the mapped vectors associated with the old cluster IDs in the three-color space and outputs the new cluster ID corresponding to each old cluster ID.

Step 3: Set the mapping table of the color corrector 55 to the latest contents.

Step 4: For every transformation matrix associated with a new cluster ID in the latest transformation matrix table, multiply it from the right by the inverse of the transformation matrix, in the transformation matrix table at the time of acquisition, of the old cluster ID corresponding to that new cluster ID, to obtain a 3-row, 3-column transformation matrix. Then associate these matrices with the new cluster IDs and set them in the transformation matrix table of the color corrector 55. The matrix multiplier of the color corrector 55 is assumed to be configured to multiply by a 3-row, 3-column transformation matrix.
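Steps 1 and 4 can be sketched in Python as follows. This is a sketch under stated assumptions, not the procedure as implemented: Step 1 uses the nearest-representative alternative, and in Step 4 the "inverse" of the 3×4 acquisition-time matrix is interpreted as its right (Moore-Penrose) pseudo-inverse, because a non-square matrix has no ordinary inverse. All function names and sample values are hypothetical.

```python
def build_id_conversion(old_reps, new_reps):
    """Step 1 (alternative form): for each old representative vector, find the
    index of the closest representative vector in the latest mapping table."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(new_reps)), key=lambda j: sq_dist(old, new_reps[j]))
            for old in old_reps]


def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def inverse_3x3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]


def conversion_matrix(a_new, a_old):
    """Step 4: A_new (3x4) times the pseudo-inverse of A_old (4x3) -> 3x3.

    Assumption: A_old has full row rank, so its right pseudo-inverse is
    A_old^T (A_old A_old^T)^-1.
    """
    a_old_t = transpose(a_old)
    pinv = matmul(a_old_t, inverse_3x3(matmul(a_old, a_old_t)))
    return matmul(a_new, pinv)


if __name__ == "__main__":
    old_reps = [(0, 0, 0, 0), (10, 10, 10, 10)]
    new_reps = [(12, 9, 10, 10), (1, 0, 0, 1)]
    print(build_id_conversion(old_reps, new_reps))
    a_old = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
    a_new = [[2, 0, 0, 1], [0, 3, 0, 1], [0, 0, 4, 1]]
    print(conversion_matrix(a_new, a_old))
```

The pseudo-inverse cannot recover the component of the four-color signal that the old matrix discarded, which is consistent with the conversion being described as approximate.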

Here, learning by the learning processor 52 will be described. The learning broadly consists of the following first to third processes.
[First process]
The learning processor 52 first trains the CNN 51 using labeled images prepared initially. That is, it inputs the image of a labeled image to the CNN 51 and calculates the error of the value of the output node of the CNN 51 corresponding to the label (the difference from the correct value for the label). It then adjusts the weights and the output values of the intermediate-layer nodes that must be fed to that output node for it to output the correct value (the required change in an output value becomes the error for that intermediate-layer node). Stochastic gradient (steepest) descent can be used for this adjustment. By carrying out this adjustment while working back toward the input side, the entire CNN 51 can be trained. Other training specific to the CNN 51 in use may also be performed. For example, to improve generalization over differences in position, size, and viewpoint, internal parameters may be trained using labeled images to which various affine transformations and the like have been applied.
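The adjustment at a single output node can be illustrated with the following Python sketch. It assumes a linear output node with a squared-error loss, which is a simplification chosen for this example; the actual CNN 51 would have non-linear activations and many layers, and the function name is hypothetical.

```python
def backprop_output_node(inputs, weights, target, learning_rate=0.1):
    """One gradient step for a linear output node with squared-error loss.

    Returns the updated weights, the errors handed back to the nodes that
    produced `inputs` (the intermediate layer), and the output error.
    """
    output = sum(w * x for w, x in zip(weights, inputs))
    error = output - target  # difference from the correct value for the label
    # Error propagated to each intermediate-layer node (gradient w.r.t. inputs).
    input_errors = [error * w for w in weights]
    # Weight update in the direction that reduces the squared error.
    new_weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
    return new_weights, input_errors, error


if __name__ == "__main__":
    weights = [0.5, 0.5]
    for _ in range(3):
        weights, input_errors, error = backprop_output_node([1.0, 2.0], weights, 1.0)
        print(error)
```

Repeating the same step at each earlier layer, using the propagated errors as that layer's output errors, is what the description refers to as working back toward the input side.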

[Second process]
Somewhere in the intermediate layers of the CNN 51, the per-RGB features are converted into features on a different scale. When backpropagating from such an intermediate layer, the first process adjusted the weight of each of the R, G, and B colors.
Once those adjustments have converged, the second process instead adjusts the color conversion weights of the color corrector 42. That is, by updating each table of the color corrector 42 through backpropagation, the color weights are adjusted with far more degrees of freedom than an adjustment over three colors allows. In practice, the learning is done by backpropagating into the color corrector 55, which reproduces the correction of the color corrector 42, so that the color weight adjustment is applied in a simulated manner to the labeled images input to the CNN 51.
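How backpropagation could reach the table entries can be sketched as follows, assuming the corrected color is computed as y = m + A(x - s), where s is the representative vector, m the mapped vector, and A the 3×4 matrix of the pixel's cluster. Under that assumption the gradient with respect to m is the gradient arriving at y, and the gradient with respect to A is the outer product of that gradient with the difference vector. The function name, learning rate, and sample values are hypothetical.

```python
def update_cluster_entry(mapped, matrix, diff, grad_output, learning_rate=0.01):
    """One gradient step on the table entries of a single cluster.

    mapped      : 3-element mapped vector m
    matrix      : 3x4 transformation matrix A
    diff        : 4-element difference vector (x - s) for the current pixel
    grad_output : 3-element gradient of the loss with respect to y
    """
    # dL/dm equals the gradient arriving at the output.
    new_mapped = [m - learning_rate * g for m, g in zip(mapped, grad_output)]
    # dL/dA[r][c] = grad_output[r] * diff[c] (outer product).
    new_matrix = [[matrix[r][c] - learning_rate * grad_output[r] * diff[c]
                   for c in range(4)] for r in range(3)]
    return new_mapped, new_matrix


if __name__ == "__main__":
    m = [50.0, 50.0, 50.0]
    a = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
    print(update_cluster_entry(m, a, [10.0, -10.0, 0.0, 0.0], [1.0, 0.0, -2.0], 0.1))
```

Each cluster contributes 3 + 12 adjustable values, so N clusters give 15N parameters, compared with a handful for a per-channel RGB weighting.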

[Third process]
The CNN 51 classifies images acquired by the camera 1 and processed by the color corrector 42, and the results are used for weakly supervised learning. In the simplest form of weakly supervised learning, those classification results that are sufficiently reliable are assumed to be correct and are stored in the learning data store 53 as new labeled images. Learning is then repeated using these added labeled images as well.
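The simplest form of this process, accepting only high-confidence results as new labeled data, can be sketched in Python as below. The dictionary fields, the score threshold of 0.9, and the use of a table version number as the color representation information are assumptions for illustration; the description does not specify a threshold or a data layout.

```python
def add_pseudo_labels(store, detections, table_version, threshold=0.9):
    """Append sufficiently reliable detections to the store as labeled images.

    `store` is a list standing in for the learning data store; each detection
    is a dict with "label", "score", and "box" (region coordinates).
    Returns the number of entries added.
    """
    added = 0
    for det in detections:
        if det["score"] >= threshold:
            store.append({
                "label": det["label"],  # assumed correct because the score is high
                "box": det["box"],      # region of the partial image to register
                "table_version": table_version,  # color representation information
            })
            added += 1
    return added


if __name__ == "__main__":
    store = []
    detections = [
        {"label": "nitrogen_deficiency", "score": 0.97, "box": (10, 20, 110, 140)},
        {"label": "potassium_excess", "score": 0.55, "box": (200, 40, 260, 90)},
    ]
    print(add_pseudo_labels(store, detections, table_version=7), len(store))
```

A high threshold limits the risk of reinforcing the network's own mistakes, at the cost of adding new examples more slowly.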

The modified example described above is not limited to the identification of plants and can be used for various kinds of video-based identification, for example, diagnosis of the appearance of concrete structures, detection of pollution, red tide, and other conditions in seas and rivers, and detection of leaks of specific substances such as oil.

The present invention can also be provided as, for example, a method or apparatus for executing the processing according to the present invention, a program for causing a computer to implement such a method, or a non-transitory tangible medium that records the program.

In the camera system according to the embodiment described above, a surveillance camera equipped with an FPGA for signal processing is combined with a receiving unit equipped with the same type of FPGA, so that functions whose signal processing was performed by the FPGA in the camera can be moved into the receiving unit. When a new function is added, installing it in the receiving unit likewise keeps the power consumption of the camera down. Naturally, functions on the receiving unit side can also be moved to the camera side, so the allocation of the required functions between the camera and the receiving unit can be chosen as needed.
The reduction in power consumption depends on which functions are moved, but a reduction of several watts can be expected in particular by placing the wide dynamic range and digital noise reduction functions, which use frame memory, on the receiving unit side.

The present invention can be widely used in digital cameras, video cameras, and the like. It is not limited to systems that transmit images in real time and can also be applied when a camera stores images on a recording medium and they are played back on a receiving unit such as a PC.

101: surveillance camera, 102: receiving unit, 103: coaxial cable, 1: camera, 2: receiving unit, 3: coaxial cable, 11: image sensor (CMOS), 12: FPGA, 13: memory, 14: ROM, 15: power separator, 16: video processing unit, 17: HD-VLC compression unit, 18: CPU core, 19: optical system, 21: format conversion IC, 22: FPGA, 23: memory, 24: ROM, 25: power superimposer, 26: video processing unit, 27: HD-VLC restoration unit, 28: CPU core, 31: gain-up device, 32: color corrector (masking), 33: knee processor, 34: enhancer, 35: γ (gamma) correction, 36: digital noise reduction processor, 37: accumulation processor, 38: wide dynamic range processor, 39: fog and haze corrector, 40: memory IF, 42: color corrector, 43: clustering device, 44: mapping table, 45: transformation matrix table, 46: subtractor, 47: matrix multiplier, 48: adder, 50: artificial intelligence, 51: convolutional neural network (CNN), 52: learning processor, 53: learning data store, 54: learning history store, 55: color corrector, 56: switch.

Claims

A camera system in which a surveillance camera and a receiving unit are connected by a coaxial cable, wherein
the surveillance camera has a first dynamically reconfigurable device that processes a signal output from an image sensor, a first non-volatile memory that holds first configuration data that sets the configuration of the first dynamically reconfigurable device, and a first CPU that loads the first configuration data into the first dynamically reconfigurable device and performs control so that information capable of specifying the video signal processing performed by the first dynamically reconfigurable device configured by the load is output over the coaxial cable,
the receiving unit has a second dynamically reconfigurable device that processes a video signal output from the surveillance camera and received via the coaxial cable, a second non-volatile memory that holds a plurality of second configuration data that set the configuration of the second dynamically reconfigurable device, and a second CPU that selects and loads, from the plurality of second configuration data, the data that should be loaded, based on the specifying information output from the surveillance camera, and
the video signal of a predetermined format output from the receiving unit is a signal that has undergone video signal processing shared between the first dynamically reconfigurable device and the second dynamically reconfigurable device.

The camera system according to claim 1, wherein
the receiving unit has a memory, and the second dynamically reconfigurable device uses the memory to perform video signal processing that involves inter-frame calculation, and
the first dynamically reconfigurable device does not perform video signal processing that involves inter-frame calculation.

The camera system according to claim 1, wherein
the first dynamically reconfigurable device has a first color corrector that converts an input video signal of four or more colors into a video signal of three colors,
the receiving unit has an artificial intelligence that learns and recognizes a predetermined subject appearing in the video corrected by the first color corrector,
the first color corrector performs color correction processing using a clustering device, a mapping table, a transformation matrix table, a subtractor, a matrix multiplier, and an adder, and
the artificial intelligence includes a multi-layer neural network, a learning processor, a learning data store, a learning history store, and a second color corrector.