TWI766237B - Object at sea distance measuring system - Google Patents


Info

Publication number
TWI766237B
TWI766237B
Authority
TW
Taiwan
Prior art keywords
image
sea level
common object
common
image processor
Prior art date
Application number
TW109104366A
Other languages
Chinese (zh)
Other versions
TW202131278A (en)
Inventor
林春宏
王新誠
賴均威
Original Assignee
國立臺中科技大學
Priority date
Filing date
Publication date
Application filed by 國立臺中科技大學 filed Critical 國立臺中科技大學
Priority to TW109104366A priority Critical patent/TWI766237B/en
Publication of TW202131278A publication Critical patent/TW202131278A/en
Application granted granted Critical
Publication of TWI766237B publication Critical patent/TWI766237B/en

Landscapes

  • Image Analysis (AREA)

Abstract

An object-at-sea distance measuring system identifies a common object by means of a plurality of image capture devices that obtain a plurality of sea images containing the common object, an image processor, and a convolutional neural network algorithm. The system then calculates the distance between the common object and each image capture device.

Description

Maritime object ranging system

The present invention relates to a maritime object ranging system that uses a convolutional neural network to identify a common object in sea-level images and to estimate the distance between the common object and each image capture device.

Unmanned-ship maritime monitoring systems have become an important part of current development. Such systems can serve military purposes, strengthening the surveillance and management of sea areas and the investigation of illegal activities; they can also serve civilian purposes, such as managing port traffic or supporting maritime rescue.

However, an unmanned ship sailing at sea may encounter obstacles such as collisions with marine animals, typhoons, or large drifting objects. Current unmanned-ship maritime monitoring systems can only capture sea-level images and cannot identify multiple objects at sea in real time, so a method and system for identifying multiple objects at sea are needed.

In existing image recognition, a camera captures an image and multiple bounding boxes scan the objects in the image one by one; each box frames an object and feeds it to a convolutional neural network (CNN) for classification. However, because object sizes are unpredictable, the box size must change on each pass, so the image must be scanned many times to select boxes that fit objects of different sizes. This wastes computing resources and prevents objects from being detected in real time; existing image recognition therefore cannot meet an unmanned ship's need to identify objects at sea.

In view of the foregoing, the inventors of the present invention conceived and designed a maritime object ranging system to remedy the deficiencies of the prior art and to enhance its industrial application.

In view of the above problems, an object of the present invention is to provide a maritime object ranging system that solves the problems faced by the prior art.

Based on the above objective, the present invention provides a maritime object ranging system that includes a plurality of image capture devices, an image processor, and a memory device. The image capture devices capture a plurality of sea-level images; each sea-level image contains a common object, and each is taken from a different viewing angle. The image processor is electrically connected to the image capture devices; it divides each sea-level image into a plurality of blocks and assigns each block to a corresponding category. The memory device is connected to the image processor and stores a convolutional neural network algorithm. The image processor runs the algorithm to produce, from the blocks, a plurality of bounding boxes to which the common object belongs and to compute a corresponding trust probability for each bounding box; the image processor then identifies the common object from each box's trust probability and category. Finally, the image processor obtains the distance between the common object and each image capture device from the different viewing angles of the sea-level images.

Preferably, the image processor runs the convolutional neural network algorithm to compute the trust probability of each bounding box to which the common object belongs from the probability that each box contains the common object and from an estimate of the overlap between each box and the corresponding ground-truth box.

Preferably, each bounding box carries the center coordinates of the common object and the length and width of the common object.

Preferably, when the image processor determines that a bounding box's trust probability for the common object is 0, the box contains no common object.

Preferably, each sea-level image contains multiple common objects, and the number of common objects differs from the number of blocks.

Preferably, the convolutional neural network algorithm has a plurality of convolutional layers, a plurality of max pooling layers, and a fully connected layer.

Preferably, the image capture devices are divided into a left image capture device and a right image capture device, and the sea-level images into left sea-level images and right sea-level images.

Preferably, the image processor converts the coordinates of the common object from a two-dimensional coordinate system to a three-dimensional coordinate system according to the object's positions in the left and right sea-level images, so that the common object has three-dimensional coordinates; from these coordinates and the positions of the left and right image capture devices, the image processor obtains the distance between the common object and the left image capture device and the distance between the common object and the right image capture device.

Preferably, the positions of the common object in the left and right sea-level images differ, and the difference between these positions is the disparity.

Preferably, the image processor estimates the disparity between the left and right sea-level images using a disparity flow network.
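Though the patent does not specify the camera geometry, the relation between disparity and distance in a rectified stereo pair is standard; a minimal sketch, with focal length and baseline values that are purely illustrative:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Classic pinhole stereo relation: Z = f * b / d.
    Assumes a rectified image pair; the disparity is the horizontal
    pixel shift of the common object between the left and right
    sea-level images."""
    if disparity_px <= 0:
        raise ValueError("object at infinity or invalid match")
    return focal_px * baseline_m / disparity_px

# Hypothetical numbers: 700 px focal length, 0.5 m baseline between
# the left and right image capture devices, 35 px measured disparity.
distance = depth_from_disparity(700.0, 0.5, 35.0)
print(round(distance, 1))  # -> 10.0 (meters)
```

A larger disparity means a closer object; as the disparity approaches zero, the estimated distance grows without bound, which is why the sketch rejects non-positive disparities.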

In summary, through the configuration of the convolutional neural network algorithm and the segmentation of the sea-level images, the maritime object ranging system of the present invention identifies all common objects in a sea-level image in a single pass, thereby speeding up object identification, and estimates the distance between each common object and each image capture device.

The advantages and features of the present invention, and the technical means of achieving them, will be more easily understood from the detailed description of exemplary embodiments with reference to the accompanying drawings. The invention may be embodied in different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those of ordinary skill in the art, the invention being defined only by the appended claims.

It will be understood that although the terms "first", "second", etc. may be used herein to describe various elements, components, regions, layers, and/or sections, these elements, components, regions, layers, and/or sections should not be limited by these terms. These terms are only used to distinguish one element, component, region, layer, and/or section from another. Thus, a "first element", "first component", "first region", "first layer", and/or "first section" discussed below could be termed a "second element", "second component", "second region", "second layer", and/or "second section" without departing from the spirit and teachings of the present invention.

In addition, the terms "comprising" and/or "including" refer to the presence of the stated features, regions, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, regions, integers, steps, operations, elements, components, and/or combinations thereof.

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

Please refer to FIG. 1, a block diagram of the maritime object ranging system of the present invention. As shown in FIG. 1, the maritime object ranging system of the present invention includes a plurality of image capture devices 10, an image processor 20, and a memory device 30. The image capture devices 10 capture a plurality of sea-level images SLM; each sea-level image SLM contains a common object CO, and each is taken from a different viewing angle. The image processor 20 is electrically connected to the image capture devices 10; it divides each sea-level image SLM into a plurality of blocks B and assigns each block B to a corresponding category C. The memory device 30 is connected to the image processor 20 and stores a convolutional neural network algorithm CNN. The image processor 20 runs the algorithm CNN to produce, from the blocks B, a plurality of bounding boxes BB to which the common object CO belongs and to compute the corresponding trust probability CS of each bounding box BB; the image processor 20 then identifies the common object CO from each bounding box BB's trust probability CS and category C. The image processor 20 obtains the distance between the common object CO and each image capture device 10 from the different viewing angles of the sea-level images SLM. For example, one sea-level image SLM is captured by an image capture device 10 from a viewing angle of 60 degrees and another from a viewing angle of -60 degrees; the image processor 20 compares the common object CO in the two sea-level images SLM to obtain the disparity, and from the disparity estimates the distance between the common object CO and each image capture device 10.

The image processor 20 runs the convolutional neural network algorithm CNN to produce the bounding boxes BB to which the common object CO belongs from the probability that each block B contains the common object CO; the algorithm CNN then computes the trust probability CS of each bounding box BB from the probability that the box contains the common object CO and from an estimate of the overlap between the box and the corresponding ground-truth box. The image processor 20 identifies the common object CO from each bounding box BB's trust probability CS and the category C to which the box belongs. Moreover, when the image processor 20 determines that a bounding box BB's trust probability for the common object CO is 0, the box BB contains no common object CO. Through the configuration of the algorithm CNN and the segmentation of the sea-level image SLM, the common objects CO in the sea-level image SLM are identified, speeding up object identification.

It should be noted that, since not every block B in a sea-level image SLM contains a common object CO, a plurality of bounding boxes BB frame a single common object CO at different positions and orientations; each bounding box BB carries the center coordinates of the common object CO within block B and the length and width of the common object CO relative to the sea-level image SLM. Furthermore, the overlap estimate is the intersection area of the bounding box BB and the corresponding ground-truth box divided by their union area.
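The overlap estimate described here is the standard intersection-over-union; a minimal sketch, assuming axis-aligned boxes given as (x1, y1, x2, y2) corner coordinates (the box format is an assumption for illustration, not from the patent):

```python
def iou(box_a, box_b):
    """Intersection area of two axis-aligned boxes divided by the
    area of their union, as the text describes the overlap estimate."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlapping rectangle, clipped to zero width/height when disjoint.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

# Two 2x2 boxes overlapping in a 1x1 square:
# intersection = 1, union = 4 + 4 - 1 = 7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # -> 0.14285714285714285
```

An IoU of 1.0 means the detected box matches the ground-truth box exactly; 0.0 means they do not overlap at all.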

It should be mentioned that each sea-level image SLM contains multiple common objects CO, and the number of common objects CO differs from the number of blocks B. The preceding paragraph described the identification of a single common object CO; the remaining common objects CO are identified by the same mechanism, which is not repeated here. The image processor 20 runs the convolutional neural network algorithm CNN to produce the bounding boxes BB to which each common object CO belongs from the probability that each block B contains the corresponding common objects CO; the algorithm then obtains the trust probability CS of the common objects CO corresponding to each bounding box BB from the probability that the box contains them and from the estimated overlap between the box and the corresponding ground-truth box. The image processor 20 identifies the multiple common objects CO from each bounding box BB's trust probability CS and category C. In short, the algorithm CNN identifies the multiple common objects CO in a sea-level image SLM simultaneously, together with their corresponding categories C, speeding up the identification of all common objects CO.

Furthermore, the convolutional neural network algorithm CNN continually accelerates the identification of common objects CO through machine learning, and can use the more important image features of a common object CO (such as shape, color, or texture) as the basis for its identification, eliminating manual feature selection. Through repeated input of sea-level images SLM, the algorithm CNN is trained to select suitable image features, weights, and classification criteria.

Please refer to FIG. 2 to FIG. 5, which are, respectively, a structure diagram of the first aspect of the convolutional neural network algorithm of the present invention, a schematic diagram of a convolutional layer of the present invention, a schematic diagram of a max pooling layer of the present invention, and a schematic diagram of a fully connected layer of the present invention. As shown in FIG. 2 to FIG. 5, the convolutional neural network algorithm CNN of the present invention includes a plurality of convolutional layers CL, a plurality of max pooling layers MPL, and at least one fully connected layer FCL; each max pooling layer MPL is arranged between convolutional layers CL, and the fully connected layer FCL is connected to the last of the convolutional layers CL and links a plurality of neurons NE.

As shown in FIG. 3, together with FIG. 2, the convolutional layers CL are connected to one another; the matrix corresponding to a sea-level image SLM is the pixel matrix PM, and a convolutional layer CL multiplies the filter F1 matrix element-wise against the corresponding entries of the pixel matrix PM to compute the feature matrix EM of the sea-level image SLM. As shown in FIG. 4, together with FIG. 2 and FIG. 3, the feature matrix EM is multiplied element-wise by the filter F2 matrix of the corresponding max pooling layer MPL to reduce its dimensionality, yielding a reduced feature matrix EM and lowering the computational load of the algorithm CNN. Note that the foregoing description covers only a single convolutional layer CL and a single max pooling layer MPL; the remaining convolutional layers CL and max pooling layers MPL work the same way and are not described again.
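The element-wise multiply-and-sum of a filter over the pixel matrix, followed by max pooling, can be sketched in pure Python (a toy single-channel case; real layers add channels, strides, and padding):

```python
def conv2d(image, kernel):
    """Slide the filter over the pixel matrix and sum the element-wise
    products at each position (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + r][j + c] * kernel[r][c]
                 for r in range(kh) for c in range(kw))
             for j in range(out_w)] for i in range(out_h)]

def max_pool(feat, size=2):
    """Keep the maximum of each size x size window, reducing the
    dimensions of the feature matrix."""
    return [[max(feat[i + r][j + c]
                 for r in range(size) for c in range(size))
             for j in range(0, len(feat[0]) - size + 1, size)]
            for i in range(0, len(feat) - size + 1, size)]

image = [[1, 2, 0, 1],          # a hypothetical 4x4 pixel matrix
         [0, 1, 3, 2],
         [2, 1, 0, 0],
         [1, 0, 2, 1]]
edge = [[1, 0], [0, -1]]        # a tiny hypothetical 2x2 filter
features = conv2d(image, edge)  # 3x3 feature matrix
pooled = max_pool(features)     # reduced by 2x2 pooling
print(features)  # -> [[0, -1, -2], [-1, 1, 3], [2, -1, -1]]
print(pooled)    # -> [[1]]
```

The pooled output keeps only the strongest response in each window, which is the dimensionality reduction the text attributes to the max pooling layer.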

Through the operations of the multiple convolutional layers CL and max pooling layers MPL, the important image features of the pixel matrix PM (such as the color or shape of a common object CO) are extracted. The numbers of convolutional layers CL and max pooling layers MPL in the algorithm CNN are adjusted according to actual needs and are not limited here.

In practice, the feature matrix EM produced by the matrix operation of a convolutional layer's filter F1 is usually passed through an activation function, which preserves the prominent, important image features of the pixel matrix PM and removes or attenuates features that are not prominent.

As shown in FIG. 5, together with FIG. 2, the feature matrix EM that has passed through the convolutional layers CL and max pooling layers MPL is finally flattened by the fully connected layer FCL into a feature map FM. The values in the feature map FM are weighted by the corresponding weights W and output to the neurons NE corresponding to the blocks B. From the processed feature map FM and the probability that each block B contains the corresponding common objects CO, the neurons NE produce the bounding boxes BB to which each common object CO belongs; each neuron NE then obtains the trust probability CS of the common objects CO corresponding to each bounding box BB from the probability that the box contains them and from the estimated overlap between the box and the corresponding ground-truth box. The image processor 20 identifies the multiple common objects CO from each bounding box BB's trust probability CS and category C, thereby achieving the goal of identifying all common objects CO in the sea-level image SLM.

It should be mentioned that, in one embodiment, the image capture devices 10 are mounted on the unmanned ship and the image processor 20 resides in a remote computer; the image capture devices 10 and the image processor 20 are therefore each equipped with a corresponding wireless transceiver to transmit the sea-level images SLM wirelessly. In another embodiment, the image capture devices 10 are mounted on the unmanned ship, the image processor 20 resides in a remote computer, and a cloud server is networked with the image capture devices 10 and the remote computer; the image capture devices 10 upload the sea-level images SLM to the cloud server, and the remote computer connects to the cloud server to retrieve them. These configurations are only examples; other preferred configurations are possible and are not limited to the scope enumerated by the present invention.

Please refer to FIG. 6, a flowchart of the maritime object ranging system of the present invention. As shown in FIG. 6, together with FIG. 1 to FIG. 5, the maritime object identification method of the present invention is as follows. (1) Step S11: a plurality of image capture devices 10 capture sea-level images SLM; each sea-level image SLM contains a common object CO, and there may be several common objects CO. To simplify the description, the following steps describe the identification flow for a single common object CO; the remaining common objects CO follow the same flow. (2) Step S12: the image processor 20 divides each sea-level image SLM into a plurality of blocks B and assigns each block B to a corresponding category C. (3) Step S13: the image processor 20 runs the convolutional neural network algorithm CNN stored in the memory device 30 to produce the bounding boxes BB to which the common object CO belongs from the probability that each block B contains the common object CO; the algorithm then computes the trust probability CS of each bounding box BB from the probability that the box contains the common object CO and from the estimated overlap between the box and the corresponding ground-truth box. When the image processor 20 determines that a bounding box BB's trust probability for the common object CO is 0, the box BB contains no common object CO. (4) Step S14: the image processor 20 identifies the common object CO from each bounding box BB's trust probability CS and category C.
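Steps S13 and S14 — discarding boxes whose trust probability is 0 and identifying objects from the remaining boxes' trust probabilities and categories — can be sketched as follows (the function and the data shapes are hypothetical illustrations, not APIs from the patent):

```python
def rank_detections(boxes):
    """Keep boxes with a positive trust probability and report the
    best-scoring box per category; each box is a
    (category, trust_probability) pair."""
    best = {}
    for category, trust in boxes:
        if trust == 0:  # trust probability 0 -> no common object in the box
            continue
        if trust > best.get(category, 0.0):
            best[category] = trust
    return best

# Hypothetical CNN output for one sea-level image:
boxes = [("ship", 0.9), ("ship", 0.6), ("buoy", 0.0), ("debris", 0.4)]
print(rank_detections(boxes))  # -> {'ship': 0.9, 'debris': 0.4}
```

This is only the selection step; producing the boxes and their trust probabilities is the CNN's job, described in the preceding steps.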

Please refer to FIG. 7, a flowchart of the operation of the convolutional neural network algorithm of the present invention. As shown in FIG. 7, together with FIG. 1 to FIG. 6, the image processor 20 divides the image into S × S blocks B and classifies each block B into a corresponding category C, marking the categories with different colors. The image processor 20 runs the convolutional neural network algorithm CNN to produce the bounding boxes BB to which each common object CO belongs from the probability that each block B contains the corresponding common objects CO; the algorithm then obtains the trust probability CS of the common objects CO corresponding to each bounding box BB from the probability that the box contains them and from the estimated overlap between the box and the corresponding ground-truth box (the thickness of a drawn box indicates the level of its trust probability CS). The trust probability CS of each bounding box BB is obtained from the following formula:

CS = Pr(Class_i | Object) × Pr(Object) × IOU(truth, pred)

This formula follows the paper "You Only Look Once: Unified, Real-Time Object Detection" published by Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (IEEE, 2016), and the convolutional neural network algorithm of the present invention applies the architecture of that paper to the maritime object ranging system. Here, CS is the probability that the common object CO matches the ground-truth answer of category C; Pr(Class_i | Object) is the probability that the common object CO belongs to category C; Pr(Object) is the probability that block B contains a common object CO, usually expressed as 1 or 0, where 1 means block B contains a common object CO and 0 means it does not; and IOU(truth, pred) is the overlap between the ground-truth box (actual box) and the detected box (bounding box BB). Finally, the image processor 20 identifies each common object CO (such as the common objects CO inside the purple, yellow, and green boxes in FIG. 7) according to each category C and the trust probability CS of each bounding box BB.
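The trust probability multiplies three factors; a minimal sketch of that computation (the IoU value is assumed to be precomputed elsewhere):

```python
def trust_probability(p_class_given_object, p_object, iou_truth_pred):
    """CS = Pr(Class_i | Object) * Pr(Object) * IOU, following the
    YOLO confidence formula cited in the text. Pr(Object) is 1 when
    the block contains a common object and 0 otherwise, so boxes in
    empty blocks score exactly 0."""
    return p_class_given_object * p_object * iou_truth_pred

print(trust_probability(0.8, 1, 0.5))  # -> 0.4
print(trust_probability(0.8, 0, 0.5))  # -> 0.0 (block has no object)
```

Because Pr(Object) is binary, any bounding box in an empty block has trust probability 0, matching the rule that such a box contains no common object.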

Please refer to FIG. 8, which is a structural diagram of the second aspect of the convolutional neural network algorithm of the present invention. As shown in FIG. 8, in practice, the convolutional neural network algorithm CNN of the present invention includes 24 convolutional layers CL (with kernel matrices such as 7×7, 3×3, and 1×1), 4 max pooling layers MPL (2×2), and 2 fully connected layers FCL, where each convolutional layer CL further includes a reduction layer RL to reduce the parameters of the feature matrix EM. After the pixel matrix PM has passed through the operations of the 24 convolutional layers CL, the 4 max pooling layers MPL, and the 2 fully connected layers FCL, and after recognition by the image processor 20, a 7×7×30 tensor is obtained to represent the output detection result shown in FIG. 8. The foregoing convolutional neural network algorithm is merely an example; the numbers of convolutional layers CL, max pooling layers MPL, fully connected layers FCL, and reduction layers RL of the convolutional neural network algorithm CNN of the present invention can be adjusted according to actual needs and are not limited to the ranges listed herein.
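A quick way to see how the pixel matrix shrinks to a 7×7 grid is to track the spatial size through a YOLO-style stack. The 448×448 input size, the strided first convolution, and the exact layer ordering below are assumptions taken from the cited YOLO paper, not stated in this section; the sketch only follows the spatial dimension, not the channel counts of the 24 convolutional layers.

```python
def conv_out(size, kernel, stride=1, pad="same"):
    """Spatial size after a convolution ('same' padding keeps size/stride)."""
    if pad == "same":
        return (size + stride - 1) // stride
    return (size - kernel) // stride + 1

def maxpool_out(size, window=2, stride=2):
    """Spatial size after a 2x2 max pooling layer with stride 2."""
    return size // stride

# Follow a 448x448 pixel matrix: each of the 4 max pooling layers
# (and two strided convolutions) halves the grid until 7x7 remains.
size = 448
size = conv_out(size, 7, stride=2)   # strided 7x7 convolution
size = maxpool_out(size)
size = maxpool_out(size)
size = maxpool_out(size)
size = maxpool_out(size)
size = conv_out(size, 3, stride=2)   # strided 3x3 convolution
# Two fully connected layers then map the features to a
# size x size x 30 output tensor.
output_shape = (size, size, 30)
```

The point of the arithmetic is that five halvings of 448 land exactly on 7, which is why the output detection tensor has a 7×7 spatial grid.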

Please refer to FIG. 9, which is a function diagram of the rectified linear activation function and the leaky rectified linear activation function of the present invention. As shown in FIG. 7 and FIG. 8, together with FIG. 5, after the pixel matrix PM has passed through the operations of the multiple convolutional layers CL, the multiple max pooling layers MPL, and the multiple fully connected layers FCL, the feature map FM is output to a plurality of neurons NE. Each neuron NE creates the plurality of bounding boxes BB to which each common object CO belongs according to the computed feature map FM and the probability that each block B contains the corresponding common objects CO; each neuron NE then obtains the confidence score CS of the common objects CO corresponding to each bounding box BB according to the probability that each bounding box BB contains the corresponding common objects CO and the estimated intersection over union between each bounding box BB and the corresponding ground truth box. Each neuron NE then compresses the data of the bounding boxes BB to which each common object CO belongs through the rectified linear activation function ReLU and the leaky rectified linear activation function LReLU shown in FIG. 9; finally, with the recognition of the image processor 20, the detection result shown in FIG. 8 is output, and each common object CO is identified. The formulas of ReLU and LReLU are as follows:

ReLU(y) = max(0, y)

LReLU(y) = y if y > 0, and LReLU(y) = αy otherwise

where y denotes the data of each bounding box BB. ReLU compresses values that are too small in each bounding box BB to 0, which may make such values unidentifiable and unrecoverable; LReLU instead multiplies the data of each bounding box BB by a small weight α (0 < α < 1) rather than compressing small values directly to 0. Through the operations of ReLU and LReLU, the data of the bounding boxes BB to which each common object CO belongs is compressed.
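The two activation functions above can be written directly from their formulas. The default α = 0.1 below is an assumption (a value commonly used with leaky rectifiers); the patent only requires 0 < α < 1.

```python
def relu(y):
    """Rectified linear unit: values <= 0 are compressed to 0."""
    return y if y > 0 else 0.0

def leaky_relu(y, alpha=0.1):
    """Leaky ReLU: non-positive values keep a scaled trace (0 < alpha < 1)
    instead of being compressed to 0, so they remain recoverable."""
    return y if y > 0 else alpha * y
```

For a positive input both functions pass the value through unchanged; for a negative input ReLU discards it entirely while LReLU preserves a small multiple of it, which is exactly the recoverability difference described above.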

Referring again to FIG. 1, the plurality of image capture devices 10 are divided into a left image capture device and a right image capture device, and the plurality of sea level images SLM are divided into left sea level images and right sea level images. Specifically, the left image capture device corresponds to a person's left eye and has a left imaging plane, and the right image capture device corresponds to a person's right eye and has a right imaging plane; each left image capture device has a corresponding right image capture device (that is, the left and right image capture devices are arranged as a pair). The viewing angle of the sea level image SLM captured by each image capture device 10 therefore differs, so the position of the common object CO captured by each image capture device 10 also differs.

Further, the image processor 20 converts the coordinates of the common object CO from a two-dimensional coordinate system to a three-dimensional coordinate system according to the positions of the common object CO in the left sea level image and the right sea level image, so that the common object CO has stereo coordinates. Because the positions of the common object CO in the left sea level image and the right sea level image differ, the positional difference of the common object CO between the two images is the disparity. From the stereo coordinates, the positions of the left and right image capture devices, and the disparity, the image processor 20 obtains the distance between the common object CO and the left image capture device and the distance between the common object CO and the right image capture device.

For example, the disparity may be obtained as follows: the image processor 20 uses a disparity flow network (DispFlowNet) to estimate the disparity between the left sea level image and the right sea level image. In detail, the disparity flow network receives the left and right sea level images and superimposes them; the network then uses the multiple convolutional layers CL of its convolutional neural network algorithm CNN (which may be structured as shown in FIG. 2 or FIG. 8) together with deconvolution operations to obtain the common objects CO, and estimates the disparity from the positions of the common objects CO in the left and right sea level images. The disparity flow network is trained repeatedly so that it computes the disparity faster and faster. Specifically, the disparity flow network superimposes the left and right sea level images; the first and second convolutional layers CL receive the images and obtain a feature map containing the common objects CO and their image features. The image map computed by the first convolutional layer CL and the feature map are then deconvolved, and the result is input to the third convolutional layer CL, which at this point also receives the image map computed by the second convolutional layer CL and accordingly outputs another feature map; the remaining convolutional layers CL compute feature maps by the same mechanism, finally obtaining the common objects CO (whose number at this point differs from the common objects CO extracted by the first and second convolutional layers CL) and their image features. The disparity flow network then computes the disparity from the positions of the common objects CO in the left and right sea level images.

It should be noted that the terms left image capture device and right image capture device above are merely class distinctions, as are left sea level image, right sea level image, left imaging plane, and right imaging plane; they are not limitations on quantity.

Please refer to FIG. 10, which is a schematic diagram of the disparity of the maritime object ranging system of the present invention. First, because the sea level image SLM captured by an image capture device 10 is a planar image, a conversion formula between the planar coordinates in the image and the actual space coordinates (stereo coordinates) is needed to assign coordinates to the objects in the sea level image SLM. The conversion formula is as follows:

s · [u, v, 1]^T = K · [R | t] · [X, Y, Z, 1]^T

K = [[f_x, γ, c_x], [0, f_y, c_y], [0, 0, 1]]

where c_x and c_y are the offsets from the origin of each image capture device 10 to the center of the sea level image SLM in the x direction and the y direction, respectively (the origin of the image capture device 10 is its exact center, and the center of the sea level image SLM is the exact center of the sea level image SLM captured by the image capture device 10); f_x and f_y are the focal lengths of each image capture device 10 in the x direction and the y direction; γ is the skew factor between the imaging direction and the sea level image SLM (the imaging direction is the direction in which the left and right sea level images are captured horizontally; because the sea level images SLM captured by the left and right image capture devices may be at an angle and overlap, the ideal state is that the left and right image capture devices shoot at the same horizontal level); (u, v) are the coordinates in the sea level image SLM; (X, Y, Z) are the coordinates in actual space; and [R | t] is the homogeneous transformation matrix, R being the rotation matrix and t the translation vector.
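The conversion formula reduces to a few lines of arithmetic when the camera frame coincides with the world frame (R = I, t = 0). The numeric intrinsics below are illustrative values, not parameters from the patent, and the function name is an assumption.

```python
def project(point, fx, fy, cx, cy, skew=0.0):
    """Pinhole projection of a camera-frame point (X, Y, Z) to image
    coordinates (u, v): s * [u, v, 1]^T = K [R | t] [X, Y, Z, 1]^T,
    here with R = I and t = 0, so s equals the depth Z."""
    X, Y, Z = point
    u = (fx * X + skew * Y + cx * Z) / Z
    v = (fy * Y + cy * Z) / Z
    return u, v

# A point 10 m in front of a camera with 800 px focal lengths and a
# 640x480 image (principal point at its center):
u, v = project((1.0, 0.5, 10.0), fx=800.0, fy=800.0, cx=320.0, cy=240.0)
```

Dividing by Z is what turns the homogeneous vector s·[u, v, 1] back into pixel coordinates; this per-camera conversion is what lets the system express the same common object CO in both the left and right image planes.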

As shown in FIG. 10, the sea level images SLM are captured through the left and right image capture devices (corresponding to the left and right imaging planes), and the common objects CO in the sea level images SLM are given stereo coordinates through the conversion formula between planar and stereo coordinates. The common object CO and its corresponding image features are found in the corresponding left and right sea level images, and the corresponding stereo coordinates P(X, Y, Z) of the image features of the common object CO are obtained. From the stereo coordinates (X, Y, Z), the stereo coordinates O_L and O_R of the left and right image capture devices, and the relative coordinates P_L and P_R of the common object CO in the sea level images SLM on the left and right imaging planes, the disparity value Z is obtained. The foregoing discussion concerns the image features of a single common object CO; the disparity values of the remaining common objects CO can likewise be obtained from the stereo coordinates of their corresponding image features with respect to the left and right image capture devices and their relative coordinates on the left and right imaging planes, and similar details are not repeated here.

Please refer to FIG. 11, which is a schematic diagram of the disparity-to-distance conversion of the maritime object ranging system of the present invention. As shown in FIG. 11, together with FIG. 10, given the corresponding stereo coordinates P(X, Y, Z) of the image features of the common object CO, the stereo coordinates O_L and O_R of the left and right image capture devices, the focal length f of the left and right image capture devices and their relative distance B, and the relative coordinates P_L and P_R of the common object CO in the sea level images SLM on the left and right imaging planes, the similar-triangle theorem yields the following formulas:

(B − (x_L − x_R)) / B = (Z − f) / Z

Z = B · f / (x_L − x_R)

where x_L is the distance from the left image capture device to the relative coordinate P_L of the common object CO in the sea level image SLM on the left imaging plane, x_R is the distance from the right image capture device to the relative coordinate P_R of the common object CO in the sea level image SLM on the right imaging plane, d_L is the relative distance between the left image capture device and the common object CO, and d_R is the relative distance between the right image capture device and the common object CO. Through the foregoing formulas, the disparity value Z, the relative distance d_L between the left image capture device and the common object CO, and the relative distance d_R between the right image capture device and the common object CO can be obtained.
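The similar-triangle relation Z = B · f / (x_L − x_R) is a one-line computation once the two image coordinates are known. The function and parameter names below are assumptions for illustration; f and the image coordinates are in pixels, B and the result in the same length unit as B.

```python
def depth_from_disparity(x_left, x_right, focal_length, baseline):
    """Similar triangles: Z = B * f / (x_L - x_R), where x_left and
    x_right are the horizontal image coordinates of the common object
    on the left and right imaging planes."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("left coordinate must exceed right coordinate")
    return baseline * focal_length / disparity

# With f = 800 px, baseline B = 0.5 m, and a 40 px disparity:
z = depth_from_disparity(420.0, 380.0, focal_length=800.0, baseline=0.5)
```

Note the inverse relationship: halving the disparity doubles the computed distance, which is why distant objects (small disparity) are measured less precisely than nearby ones.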

As described above, the maritime object identification method and system of the present invention identify multiple common objects CO of the sea level image SLM at once through the configuration of the convolutional neural network algorithm CNN and the segmentation of the sea level image SLM, thereby speeding up the identification of the common objects CO, and estimate the distance between each common object CO and each image capture device 10. In summary, the maritime object identification method and system of the present invention have the above advantages and can identify each common object CO of the sea level image SLM and estimate the distance between each common object CO and each image capture device 10 in real time.

The above description is merely exemplary and not restrictive. Any equivalent modification or change that does not depart from the spirit and scope of the present invention shall be included in the scope of the appended claims.

10: image capture device
20: image processor
30: memory device
B: block
BB: bounding box
d_L: relative distance between the left image capture device and the common object
d_R: relative distance between the right image capture device and the common object
C: category
CL: convolutional layer
CNN: convolutional neural network algorithm
CS: confidence score
CO: common object
EM: feature matrix
f: focal length of the image capture device
FCL: fully connected layer
FM: feature map
F1, F2: filter layers
MPL: max pooling layer
NE: neuron
O_L: stereo coordinates of the left image capture device 10
O_R: stereo coordinates of the right image capture device 10
PM: pixel matrix
P(X, Y, Z): stereo coordinates of the common object
P_L: relative coordinates of the common object in the sea level image on the left imaging plane
P_R: relative coordinates of the common object in the sea level image on the right imaging plane
RL: reduction layer
SLM: sea level image
S11~S15: steps
W: weight
x_L: distance from the left image capture device to the relative coordinate of the common object in the sea level image on the left imaging plane
x_R: distance from the right image capture device to the relative coordinate of the common object in the sea level image on the right imaging plane
Z: disparity value

FIG. 1 is a block diagram of the maritime object ranging system of the present invention.
FIG. 2 is a structural diagram of the first aspect of the convolutional neural network algorithm of the present invention.
FIG. 3 is a schematic diagram of the convolutional layer of the present invention.
FIG. 4 is a schematic diagram of the max pooling layer of the present invention.
FIG. 5 is a schematic diagram of the fully connected layer of the present invention.
FIG. 6 is a flowchart of the maritime object identification method of the present invention.
FIG. 7 is a flowchart of the operation of the convolutional neural network algorithm of the present invention.
FIG. 8 is a structural diagram of the second aspect of the convolutional neural network algorithm of the present invention.
FIG. 9 is a function diagram of the rectified linear activation function and the leaky rectified linear activation function of the present invention.
FIG. 10 is a schematic diagram of the disparity of the maritime object ranging system of the present invention.
FIG. 11 is a schematic diagram of the disparity-to-distance conversion of the maritime object ranging system of the present invention.

10: image capture device
20: image processor
30: memory device
B: block
BB: bounding box
C: category
CNN: convolutional neural network algorithm
CS: confidence score
CO: common object

Claims (9)

1. A maritime object ranging system, comprising: a plurality of image capture devices capturing a plurality of sea level images, each of the plurality of sea level images having a common object, the viewing angles of the plurality of sea level images being different; an image processor electrically connected to the plurality of image capture devices, the image processor dividing each sea level image into a plurality of blocks and assigning each block to a corresponding category; and a memory device connected to the image processor, the memory device storing a convolutional neural network algorithm, the convolutional neural network algorithm being run by the image processor to produce, according to each block, the plurality of bounding boxes to which the common object belongs and to calculate therefrom the confidence score corresponding to each bounding box, the image processor identifying the common object according to the confidence score of each bounding box corresponding to the common object and each category; wherein the image processor obtains the distance between the common object and each of the plurality of image capture devices according to the different viewing angles of the plurality of sea level images; and wherein the convolutional neural network algorithm is run by the image processor to calculate the confidence score of each bounding box to which the common object belongs according to the probability that each bounding box contains the common object and the estimated intersection over union between each bounding box and a corresponding ground truth box.

2. The maritime object ranging system of claim 1, wherein each bounding box has the center coordinates of the common object and the length and width of the common object.

3. The maritime object ranging system of claim 1, wherein when the image processor determines that the confidence score of a bounding box corresponding to the common object is 0, the common object does not exist in that bounding box.

4. The maritime object ranging system of claim 1, wherein each of the plurality of sea level images has a plurality of common objects, and the number of common objects differs from the number of blocks.

5. The maritime object ranging system of claim 1, wherein the convolutional neural network algorithm has a plurality of convolutional layers, a plurality of max pooling layers, and a fully connected layer.

6. The maritime object ranging system of claim 1, wherein the plurality of image capture devices are divided into a left image capture device and a right image capture device, and the plurality of sea level images are divided into a left sea level image and a right sea level image.

7. The maritime object ranging system of claim 6, wherein the image processor converts the coordinates of the common object from a two-dimensional coordinate system to a three-dimensional coordinate system according to the positions of the common object in the left sea level image and the right sea level image, so that the common object has a stereo coordinate, and the image processor obtains the distance between the common object and the left image capture device and the distance between the common object and the right image capture device according to the stereo coordinate and the positions of the left image capture device and the right image capture device.

8. The maritime object ranging system of claim 6, wherein the positions of the common object in the left sea level image and the right sea level image are different, and the positional difference of the common object between the left sea level image and the right sea level image is a disparity.

9. The maritime object ranging system of claim 8, wherein the image processor uses a disparity flow network to estimate the disparity between the left sea level image and the right sea level image.
TW109104366A 2020-02-12 2020-02-12 Object at sea distance measuring system TWI766237B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW109104366A TWI766237B (en) 2020-02-12 2020-02-12 Object at sea distance measuring system


Publications (2)

Publication Number Publication Date
TW202131278A TW202131278A (en) 2021-08-16
TWI766237B true TWI766237B (en) 2022-06-01

Family

ID=78282912


Country Status (1)

Country Link
TW (1) TWI766237B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI801025B (en) * 2021-12-08 2023-05-01 財團法人船舶暨海洋產業研發中心 Automatic berthing image ranging system for vessels and operation method thereof

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW375136U (en) * 1999-04-27 1999-11-21 Sesame Door Co Ltd Skirting board of retractable type
CN104764445A (en) * 2015-04-20 2015-07-08 中测新图(北京)遥感技术有限责任公司 Method and device for determining coordinates of underwater object point
CN110738211A (en) * 2019-10-17 2020-01-31 腾讯科技(深圳)有限公司 object detection method, related device and equipment


