TWI804845B - Object positioning method and object positioning system - Google Patents

Object positioning method and object positioning system

Info

Publication number
TWI804845B
Authority
TW
Taiwan
Prior art keywords
processing device, feature descriptor, scene, point cloud data
Application number
TW110112747A
Other languages
Chinese (zh)
Other versions
TW202240464A (en)
Inventor
李政昕
李佳樺
吳懷恩
姜皇成
Original Assignee
中強光電股份有限公司
Application filed by 中強光電股份有限公司
Priority to TW110112747A
Publication of TW202240464A
Application granted
Publication of TWI804845B

Landscapes

  • Automatic Control Of Machine Tools (AREA)
  • Control And Safety Of Cranes (AREA)
  • Vehicle Body Suspensions (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

An object positioning method and an object positioning system are provided. A sensing device collects point cloud data obtained from a scene including a target object. A processing device extracts a key point from the point cloud data, and inputs surrounding area data centered on the key point and a preset feature descriptor of the target object into a neural network to calculate a scene feature descriptor of the scene. The processing device then performs feature matching between the scene feature descriptor and the preset feature descriptor, and calculates the position of the target object in actual space.

Description

Object positioning method and object positioning system

The present invention relates to an electronic device, and in particular to an object positioning method and an object positioning system.

With the continuous development of factory automation, transporting goods has become an important part of automation, and for handling large goods the forklift is the preferred target for automation. For an automated forklift to move goods smoothly, it needs autonomous navigation, and it is more flexible if it can also recognize goods and automatically adjust its path when a pallet is skewed. Since the goods a forklift carries are generally placed on pallets, pallet recognition has become an important technical element in developing automated forklifts.

Pallet recognition technology has two parts: identifying the pallet and locating it. Identification means finding the pallet in an uncertain environment; positioning means relating the pallet's spatial position to the forklift so that goods can be handled smoothly. Ordinary cameras produce planar images, that is, two-dimensional information, and traditional machine-vision recognition methods are likewise based on the two-dimensional plane. Pallet recognition, however, requires positioning in addition to identification: a single camera easily produces large positioning errors, while a dual-camera setup infers object position from the parallax and geometric constraints between the two cameras but requires a relatively large amount of computation.

The "Prior Art" paragraph is only intended to aid understanding of the present disclosure; its content may therefore include material that does not constitute prior art known to a person of ordinary skill in the art. The content disclosed in the "Prior Art" paragraph does not represent the problems to be solved by one or more embodiments of the invention, nor does it mean that such content was known to or recognized by a person of ordinary skill in the art before the filing of this application.

The invention provides an object positioning method and an object positioning system that can improve the accuracy of identifying and positioning a target object.

Other objectives and advantages of the invention can be further understood from the technical features disclosed herein.

To achieve one, some, or all of the above objectives or other objectives, an embodiment of the invention provides an object positioning method including the following steps. A sensing device receives point cloud data obtained from a scene including a target object. A processing device extracts a key point from the point cloud data. The processing device inputs surrounding area data centered on the key point and a preset feature descriptor of the target object into a neural network to calculate a scene feature descriptor of the scene. The processing device performs feature matching between the scene feature descriptor and the preset feature descriptor. The processing device then calculates the position of the target object in actual space.

The invention also provides an object positioning system including a sensing device, a storage device, and a processing device. The sensing device collects point cloud data obtained from a scene including a target object. The storage device stores a preset feature descriptor of the target object. The processing device is coupled to the sensing device and the storage device and is used to receive the point cloud data, extract a key point from the point cloud data, input surrounding area data centered on the key point and the preset feature descriptor into a neural network to calculate a scene feature descriptor of the scene, perform feature matching between the scene feature descriptor and the preset feature descriptor, and calculate the position of the target object in actual space.

Based on the above, embodiments of the invention extract key points from three-dimensional point cloud data, input the surrounding area data centered on the key points and the preset feature descriptor of the target object into a neural network to calculate a scene feature descriptor of the scene, perform feature matching between the scene feature descriptor and the preset feature descriptor, and calculate the position of the target object in actual space. By inputting the preset feature descriptor together with the surrounding area data (including the key points) extracted from the three-dimensional point cloud data into the neural network to calculate the scene feature descriptor, the feature extraction capability of the neural network can be exploited to effectively improve the accuracy and stability of target object recognition and positioning.

To make the above features and advantages of the invention more comprehensible, embodiments are described in detail below together with the accompanying drawings.

100: object positioning system

102: sensing device

104: processing device

106: storage device

S202~S210, S302~S308: steps

FIG. 1 is a schematic diagram of an object positioning system according to an embodiment of the invention.

FIG. 2 is a flowchart of an object positioning method according to an embodiment of the invention.

FIG. 3 is a flowchart of an object positioning method according to another embodiment of the invention.

The foregoing and other technical content, features, and effects of the invention will be clearly presented in the following detailed description of a preferred embodiment with reference to the accompanying drawings. Directional terms mentioned in the following embodiments, such as up, down, left, right, front, or back, refer only to directions in the accompanying drawings; they are used for illustration and not to limit the invention.

FIG. 1 is a schematic diagram of an object positioning system according to an embodiment of the invention; please refer to FIG. 1. The object positioning system 100 may include a sensing device 102, a processing device 104, and a storage device 106, with the processing device 104 coupled to the sensing device 102 and the storage device 106. The sensing device 102 collects point cloud data obtained from a scene including a target object. Representing spatial information as point cloud data makes it possible to extract the target object's features by spatial geometry and then, after processing, confirm whether it is indeed the target object. In this embodiment, the sensing device 102 may be, for example, a TOF (time-of-flight) camera, which may use infrared or laser light as a light source, compute the distance to an object from the time of flight of the reflected light, and derive three-dimensional coordinates, thereby producing three-dimensional point cloud data of the scene including the target object. The target object may be, for example, a pallet, although the invention is not limited thereto. The storage device 106 may be, for example, any type of fixed or removable random access memory (RAM), read-only memory (ROM), flash memory, hard disk, other similar device, or a combination of these devices, and stores a preset feature descriptor of the target object. The preset feature descriptor is obtained by extracting key points and descriptors from a standard template. The standard template is a pallet template prepared in advance; to reduce computation time, the pallet template is stored after coordinate transformation and key point and descriptor extraction. In this embodiment, key points may be chosen as feature points that have low data volume yet high stability and distinctiveness in the point cloud data and are not easily confused with other feature points; for example, corners in the point cloud data may be detected as key points. Key points may be extracted by, for example, uniform sampling, although the invention is not limited thereto; in other embodiments they may be extracted by, for example, the SIFT (scale-invariant feature transform) algorithm, Harris corner detection, or the NARF (normal aligned radial feature) algorithm. A descriptor is then obtained by computing a feature description from the surrounding area data centered on the key point. In the example where the target object is a pallet, clear and complete point cloud data may be captured with the TOF camera at a preset distance (for example, 1 meter) from the pallet, and after filtering out noise, the front end of the pallet is taken as the standard template.
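As a rough illustration of the uniform-sampling key point extraction and the gathering of surrounding area data described above, the following Python sketch shows one possible realization; the numpy representation and the stride and radius values are assumptions for illustration, not the patent's implementation.

```python
import numpy as np

def extract_keypoints_uniform(points: np.ndarray, stride: int = 50) -> np.ndarray:
    """Take every `stride`-th point of an (N, 3) cloud as a key point.

    A uniform-sampling stand-in; SIFT, Harris, or NARF detectors could
    be substituted here, as the embodiment notes.
    """
    return points[::stride]

def neighborhood(points: np.ndarray, center: np.ndarray, radius: float = 0.05) -> np.ndarray:
    """Surrounding area data: all points within `radius` of a key point."""
    dist = np.linalg.norm(points - center, axis=1)
    return points[dist < radius]
```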

The processing device 104 may be a general-purpose processor, a special-purpose processor, a conventional processor, a digital signal processor, several microprocessors, one or more microprocessors combined with a digital signal processor core, a controller, a microcontroller, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), any other kind of integrated circuit, a state machine, or an Advanced RISC Machine (ARM)-based processor. The processing device 104 may receive the point cloud data provided by the sensing device 102, extract key points from the point cloud data, and input the surrounding area data centered on each key point, together with the preset feature descriptor, into a trained neural network to calculate a scene feature descriptor of the scene including the target object. In some embodiments, to reduce the amount of computation, the processing device 104 may first discard part of the point cloud data. For example, the processing device 104 may first segment the point cloud data according to the plane on which the target object is located (for example, the floor plane or the plane of the shelf on which the pallet is placed) and then segment the point cloud data with the Euclidean clustering algorithm, that is, divide the point cloud data into several point groups according to spatial distance. The processing device 104 may choose to extract key points only from the N largest point groups, where N is a positive integer, which reduces the amount of computation and prevents false detections and noise interference.
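A minimal sketch of the Euclidean clustering step, assuming the supporting plane has already been removed; the tolerance and minimum cluster size below are illustrative values, and greedy region growing is only one way to cluster by spatial distance.

```python
import numpy as np
from scipy.spatial import cKDTree

def euclidean_clusters(points: np.ndarray, tol: float = 0.02, min_size: int = 100):
    """Group points whose mutual distance is below `tol` (greedy region growing)."""
    tree = cKDTree(points)
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue, members = [seed], [seed]
        while queue:
            idx = queue.pop()
            for j in tree.query_ball_point(points[idx], tol):
                if j in unvisited:
                    unvisited.discard(j)
                    queue.append(j)
                    members.append(j)
        if len(members) >= min_size:          # drop tiny, likely-noise groups
            clusters.append(points[members])
    return sorted(clusters, key=len, reverse=True)

# Key points are then extracted only from the N largest groups:
# groups = euclidean_clusters(cloud_without_plane)[:N]
```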

Further, since a spatial point cloud is directional, when calculating the scene feature descriptor, the local reference coordinates of each key point may first be calculated and a coordinate transformation performed, ensuring that the scene feature descriptor calculated each time is the same. In addition, in some embodiments, the surrounding area data may first be smoothed with a Gaussian distribution to eliminate noise interference and avoid affecting the correctness of the calculated scene feature descriptor.
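One common way to obtain such a local reference frame is a weighted principal component analysis of the neighborhood; the sketch below combines it with a Gaussian weighting of the surrounding points, though the patent does not specify this exact construction.

```python
import numpy as np

def local_reference_frame(neighbors: np.ndarray, center: np.ndarray, sigma: float = 0.02):
    """PCA-based local reference frame with Gaussian-weighted neighbors.

    Down-weighting distant, noisier points both smooths the estimate and
    makes the axes repeatable, so the same physical key point yields the
    same descriptor in different scans.
    """
    offsets = neighbors - center
    w = np.exp(-np.sum(offsets**2, axis=1) / (2.0 * sigma**2))
    cov = (offsets * w[:, None]).T @ offsets / w.sum()
    _, eigvecs = np.linalg.eigh(cov)        # columns sorted by ascending eigenvalue
    x, z = eigvecs[:, 2], eigvecs[:, 0]     # largest / smallest variance directions
    if np.sum(offsets @ x) < 0:             # disambiguate signs for repeatability
        x = -x
    if np.sum(offsets @ z) < 0:
        z = -z
    y = np.cross(z, x)                      # complete a right-handed frame
    return np.stack([x, y, z], axis=1)      # rotate offsets by frame.T before describing
```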

In this embodiment, the neural network may be, for example, a Siamese network, which may be an architecture comprising two convolutional neural networks (CNNs), although the invention is not limited thereto. In this embodiment, the neural network may be trained with the 3DMatch dataset, which consists of RGB-D (RGB-Depth) indoor scene data; each scene is composed of many separate three-dimensional point clouds, and adjacent point clouds overlap. When training the Siamese network, a key point and the points in its surrounding area are used together as input; the training objective makes the loss as small as possible for descriptors of the same key point while pushing descriptors of distant key points apart. In addition, pallet data may also be used for training.
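A compact PyTorch sketch of such a Siamese setup, under assumptions not fixed by the patent: a PointNet-style patch encoder and a contrastive loss with an illustrative margin; the layer sizes and descriptor dimension are placeholders.

```python
import torch
import torch.nn as nn

class PatchEncoder(nn.Module):
    """Embed a local point patch (B, 3, P) into a descriptor (B, D)."""
    def __init__(self, dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1), nn.Flatten(),
            nn.Linear(128, dim),
        )

    def forward(self, patch: torch.Tensor) -> torch.Tensor:
        return self.net(patch)

def contrastive_loss(d1, d2, same, margin: float = 1.0):
    """Small loss for matching key points, large for distant ones."""
    dist = torch.norm(d1 - d2, dim=1)
    pos = same * dist.pow(2)                                     # pull matches together
    neg = (1 - same) * torch.clamp(margin - dist, min=0).pow(2)  # push non-matches apart
    return (pos + neg).mean()

# Siamese usage: the same weights encode both patches of a 3DMatch pair.
# encoder = PatchEncoder()
# loss = contrastive_loss(encoder(patch_a), encoder(patch_b), same_labels)
```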

The processing device 104 may match the calculated scene feature descriptors against the preset feature descriptor; for example, it may compare the preset feature descriptor with the scene feature descriptors of each point group and determine whether a point group has a matching scene feature descriptor. For instance, it may determine whether the similarity between the preset feature descriptor and a scene feature descriptor exceeds a threshold; if so, the match succeeds. After the feature descriptors are matched, the processing device 104 may perform a coordinate transformation according to the matching result to calculate the position of the target object in actual space. In some embodiments, after the matching is complete, the processing device 104 may remove mismatched outlier points before performing the coordinate transformation, so that mismatched outliers do not affect the correctness of the calculated position of the target object in actual space. In some embodiments, the processing device 104 controls a forklift to carry the pallet according to the calculated position of the target object in actual space, or transmits the position of the target object in actual space to the forklift so that the forklift can carry the pallet accordingly.
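The matching and coordinate transformation can be sketched as below, assuming cosine similarity as the match score and a least-squares (Kabsch) fit between matched key points; the threshold value is illustrative and the patent does not prescribe this particular similarity measure.

```python
import numpy as np

def match_descriptors(scene_desc: np.ndarray, template_desc: np.ndarray, thresh: float = 0.9):
    """Pair each scene descriptor with its best template match above `thresh`."""
    s = scene_desc / np.linalg.norm(scene_desc, axis=1, keepdims=True)
    t = template_desc / np.linalg.norm(template_desc, axis=1, keepdims=True)
    sim = s @ t.T
    return [(i, int(sim[i].argmax())) for i in range(len(s)) if sim[i].max() > thresh]

def rigid_transform(src: np.ndarray, dst: np.ndarray):
    """Least-squares R, t with dst ≈ src @ R.T + t (Kabsch algorithm)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:        # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs
```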

FIG. 2 is a flowchart of an object positioning method according to an embodiment of the invention; please refer to FIG. 1 and FIG. 2 together. As the above embodiment shows, the object positioning method includes at least the following steps. First, the sensing device 102 receives point cloud data obtained from a scene including a target object (step S202). Next, the processing device 104 extracts key points from the point cloud data (step S204); the key points may be extracted by, for example, uniform sampling, although the invention is not limited thereto. Then, the processing device 104 inputs the surrounding area data centered on each key point and the preset feature descriptor of the target object into a neural network to calculate a scene feature descriptor of the scene (step S206); the neural network may be, for example, a Siamese network comprising two convolutional neural networks, although the invention is not limited thereto. After that, the processing device 104 performs feature matching between the scene feature descriptor and the preset feature descriptor (step S208). Finally, the processing device 104 calculates the position of the target object in actual space (step S210). In some embodiments, mismatched outlier points may be removed before calculating the position of the target object in actual space, so that they do not affect the correctness of the calculation.
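Tying steps S202–S210 together, a hypothetical end-to-end sketch using the helper functions above might look as follows; `describe` (running the trained network on one patch) is a placeholder name, not an API from the patent.

```python
import numpy as np

def locate_object(cloud, template_keypoints, template_desc, describe):
    """Sketch of steps S202–S210; `describe(patch) -> descriptor` is assumed."""
    keypoints = extract_keypoints_uniform(cloud)                       # S204
    patches = [neighborhood(cloud, kp) for kp in keypoints]
    scene_desc = np.stack([describe(p) for p in patches])              # S206
    pairs = match_descriptors(scene_desc, template_desc)               # S208
    if not pairs:
        return None                       # no match: target not found in scene
    src = template_keypoints[np.array([j for _, j in pairs])]
    dst = keypoints[np.array([i for i, _ in pairs])]
    return rigid_transform(src, dst)                                   # S210 (pose of target)
```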

FIG. 3 is a flowchart of an object positioning method according to another embodiment of the invention. This embodiment differs from the embodiment of FIG. 2 in that, after step S202, the processing device 104 further divides the point cloud data into several point groups (step S302); the point cloud data may be segmented, for example, according to the plane on which the target object is located and with the Euclidean clustering method, although the invention is not limited thereto. Key points are then extracted from each of the N largest point groups (step S304), where N is a positive integer; this reduces the amount of computation and prevents false detections and noise interference. Next, the processing device 104 calculates the local reference coordinates of each key point and performs a coordinate transformation (step S306), ensuring that the scene feature descriptor calculated each time is the same. The processing device 104 then smooths the surrounding area data with a Gaussian distribution (step S308) to further eliminate noise interference, after which the flow proceeds to step S206. Since steps S206–S210 have been described in the embodiment of FIG. 2, they are not repeated here.

To summarize, embodiments of the invention extract key points from three-dimensional point cloud data, input the surrounding area data centered on the key points and the preset feature descriptor of the target object into a neural network to calculate a scene feature descriptor of the scene, perform feature matching between the scene feature descriptor and the preset feature descriptor, and calculate the position of the target object in actual space. By inputting the preset feature descriptor together with the surrounding area data (including the key points) extracted from the three-dimensional point cloud data into the neural network to calculate the scene feature descriptor, the feature extraction capability of the neural network can be exploited to effectively improve the accuracy and stability of target object recognition and positioning. In some embodiments, the point cloud data may also be segmented according to the plane on which the target object is located and with the Euclidean clustering algorithm; by choosing to extract key points only from the larger point groups, the amount of computation is reduced and false detections and noise interference are prevented.

The above are merely preferred embodiments of the invention and should not be used to limit the scope of its implementation; simple equivalent changes and modifications made according to the claims and the description of the invention all remain within the scope covered by this patent. Moreover, no individual embodiment or claim of the invention needs to achieve all of the objectives, advantages, or features disclosed herein. In addition, the abstract and the title are provided only to assist patent document searching and are not intended to limit the scope of the invention. Furthermore, terms such as "first" and "second" mentioned in the specification are used only to name elements, not to limit the upper or lower bound on the number of elements.

S202~S210: steps

Claims (15)

1. An object positioning method, comprising: receiving, by a sensing device, point cloud data obtained from a scene including a target object; extracting, by a processing device, a key point from the point cloud data; inputting, by the processing device, surrounding area data centered on the key point and a preset feature descriptor of the target object into a neural network to calculate a scene feature descriptor of the scene; performing, by the processing device, feature matching between the scene feature descriptor and the preset feature descriptor; and calculating, by the processing device, a position of the target object in actual space.

2. The object positioning method according to claim 1, comprising: removing mismatched outlier points.

3. The object positioning method according to claim 1, comprising: segmenting the point cloud data according to a plane on which the target object is located.

4. The object positioning method according to claim 1, comprising: segmenting the point cloud data by a Euclidean clustering method.

5. The object positioning method according to claim 1, comprising: calculating local reference coordinates of the key point for coordinate transformation.

6. The object positioning method according to claim 1, comprising: smoothing the surrounding area data with a Gaussian distribution.

7. The object positioning method according to claim 1, wherein the neural network is a Siamese network.

8. The object positioning method according to claim 1, comprising: dividing the point cloud data into a plurality of point groups, and extracting the key point from each of the N largest point groups, where N is a positive integer.

9. An object positioning system, comprising: a sensing device for collecting point cloud data obtained from a scene including a target object; a storage device for storing a preset feature descriptor of the target object; and a processing device, coupled to the sensing device and the storage device, for receiving the point cloud data, extracting a key point from the point cloud data, inputting surrounding area data centered on the key point and the preset feature descriptor into a neural network to calculate a scene feature descriptor of the scene, performing feature matching between the scene feature descriptor and the preset feature descriptor, and calculating a position of the target object in actual space.

10. The object positioning system according to claim 9, wherein the processing device further removes mismatched outlier points.

11. The object positioning system according to claim 9, wherein the processing device segments the point cloud data according to a plane on which the target object is located.

12. The object positioning system according to claim 9, wherein the processing device segments the point cloud data by a Euclidean clustering method.

13. The object positioning system according to claim 9, wherein the processing device further calculates local reference coordinates of the key point for coordinate transformation.

14. The object positioning system according to claim 9, wherein the processing device further smooths the surrounding area data with a Gaussian distribution.

15. The object positioning system according to claim 9, wherein the neural network is a Siamese network.
TW110112747A 2021-04-08 2021-04-08 Object positioning method and object positioning system TWI804845B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW110112747A TWI804845B (en) 2021-04-08 2021-04-08 Object positioning method and object positioning system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW110112747A TWI804845B (en) 2021-04-08 2021-04-08 Object positioning method and object positioning system

Publications (2)

Publication Number Publication Date
TW202240464A TW202240464A (en) 2022-10-16
TWI804845B true TWI804845B (en) 2023-06-11

Family

ID=85460462

Family Applications (1)

Application Number Title Priority Date Filing Date
TW110112747A TWI804845B (en) 2021-04-08 2021-04-08 Object positioning method and object positioning system

Country Status (1)

Country Link
TW (1) TWI804845B (en)


Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112598735A (en) * 2020-12-21 2021-04-02 西北工业大学 Single-image object pose estimation method fusing three-dimensional model information

Non-Patent Citations (1)

Title
Y. Guo et al., "3D Object Recognition in Cluttered Scenes with Local Surface Features: A Survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 11, pp. 2270–2287, Nov. 2014. *

Also Published As

Publication number Publication date
TW202240464A (en) 2022-10-16
