TWI830549B - Objects automatic labeling method and system applying the same - Google Patents
- Publication number
- TWI830549B (Application TW111149431A)
- Authority
- TW
- Taiwan
- Prior art keywords
- frame
- detection
- view
- image
- marked
- Prior art date
Description
The present disclosure relates to an automatic object labeling method and a system applying the same, and in particular to an automatic object labeling method and system suited to bringing artificial intelligence to an assembly line.
Artificial intelligence (AI) performs data analysis and decision-making through computer programs, and can be applied to production-line automation, energy saving, operational-efficiency improvement, and similar fields. It has become a development trend across major industries worldwide.
The first problem an industry faces when adopting AI is how to define its operating procedures and analyze the work content so as to provide standardized data from which computers/machines can learn. Take computer-vision AI as an example: for a computer to recognize images, a specific region (bounding box) must first be delimited on the image and given a standardized annotation, which serves as the "ground truth" from which machine learning learns to distinguish different objects. This process of processing and preparing the raw data describing operating procedures and work content is called data labeling.
Labeling, however, is deceptively simple: it is tedious work that consumes enormous manpower and time (more than 80% of the time of an entire AI project). Moreover, manual labeling can introduce subjective bias that undermines the ground truth on which machine learning rests.
There is therefore a need for an advanced automatic object labeling method and system that solve the problems faced by the prior art.
One embodiment of this specification discloses an automatic object labeling method comprising the following steps. First, M consecutive frames are captured at a station of an assembly line. An object detection step is then performed, which includes: selecting from the M consecutive frames a detection frame that shows an operating procedure performed on a target object with a workpiece; delimiting the position range of the target object in the detection frame; backtracking from the detection frame to select the N-th backtracked frame among the M consecutive frames; and obtaining, according to the position range, a labeled image of the target object from the N-th backtracked frame. The labeled image is then compared against the M consecutive frames to find at least one other labeled image similar to it, and the representative labeled image and the other labeled image(s) are stored as the same labeled data set.
Another embodiment of this specification discloses an automatic object labeling system comprising an image capture device, an object detection module, and an association comparison module. The image capture device captures M consecutive frames at a station of an assembly line. The object detection module performs an object detection step that includes: selecting from the M consecutive frames a detection frame that shows an operating procedure performed on a target object with a workpiece; delimiting the position range of the target object in the detection frame; backtracking from the detection frame to select the N-th backtracked frame among the M consecutive frames; and obtaining, according to the position range, a labeled image of the target object from the N-th backtracked frame. The association comparison module compares the labeled image against the M consecutive frames to find at least one other labeled image similar to it, and stores the representative labeled image and the other labeled image(s) as the same labeled data set.
According to the above embodiments, this specification provides an automatic object labeling method and system. A two-stage artificial-intelligence module labels the objects to be labeled in M consecutive frames captured at one station of an assembly line. First, an AI model is trained and built with manual verification; this model then uses the workpiece in the operating procedure as a reference to identify the target object participating in that procedure and to determine its position range. After the image feature parameters of the target object are extracted, an unsupervised-learning algorithm (for example, association comparison) uses these feature parameters to find, among the M consecutive frames, the image of at least one other object similar to the target object, and both are given the same label.
Labeling data with an AI module greatly reduces the manpower cost and time of data labeling. And because labeling rests on an objective criterion (whether the workpiece in the operating procedure performs the specified operation on the target object), it also mitigates the machine-learning inaccuracy that human subjective bias introduces into manual labeling.
10: Automatic object labeling system
11: Image capture device
12: Object detection module
13: Association comparison module
14: Database
14a: Labeled data set
20: Assembly line
21: Station
22: Glove
22a: Workpiece block
22a': Backtracked workpiece block
23: Object to be labeled
23a: Object-to-be-labeled block
24: Target object
S21, S22, S23, S241, S242, S25, S26, S27: Steps
K1-K5, Ft+1-Fm: Consecutive frames
F0-Ft: Training frames
FD: Detection frame
S, S': Position range of the target object
P1-Pn: Feature parameters
FN: Backtracked frame
L: Field-of-view length
V: Horizontal moving speed
For a better understanding of the above and other aspects of this specification, embodiments are described in detail below with reference to the accompanying drawings: Figure 1 is a schematic diagram of the functional configuration of an automatic object labeling system according to an embodiment of this specification; Figure 2 is a flowchart of a method of automatic object labeling using the automatic object labeling system of Figure 1, according to an embodiment of this specification; and Figure 3 is a schematic diagram of some of the frame changes during execution of the automatic object labeling method of Figure 2, according to an embodiment of this specification.
This specification provides an automatic object labeling method and a system applying the same, which greatly reduce the manpower cost and time of data labeling while mitigating the machine-learning inaccuracy caused by subjective bias in manual labeling. To make the above embodiments and other objects, features, and advantages of this specification more readily understandable, several preferred embodiments are described in detail below with reference to the accompanying drawings.
It must be noted, however, that these specific embodiments and methods are not intended to limit the present invention. The invention may still be implemented with other features, elements, methods, and parameters. The preferred embodiments are presented only to illustrate the technical features of the invention, not to limit the scope of the claims. Those of ordinary skill in the art will be able to make equivalent modifications and variations based on the following description without departing from the spirit of the invention. In the different embodiments and drawings, the same elements are denoted by the same reference numerals.
Please refer to Figures 1 to 3. Figure 1 is a schematic diagram of the functional configuration of an automatic object labeling system 10 according to an embodiment of this specification. Figure 2 is a flowchart of a method of automatic object labeling using the automatic object labeling system 10 of Figure 1, according to an embodiment of this specification. Figure 3 is a schematic diagram of some of the frame changes during execution of the automatic object labeling method of Figure 2, according to an embodiment of this specification.
In some embodiments of this specification, the automatic object labeling system 10 is deployed on an assembly line 20. The operating procedure performed at a station 21 of the assembly line 20 may include using a workpiece to machine, pick, sort, print, mark, or otherwise process a target object. In this embodiment, for example, the assembly line 20 may include a conveyor belt, and the operating procedure performed at the station 21 includes a picking worker wearing a glove 22 (the workpiece) to pick a particular kind of plastic product (the target object 24) out of the various plastic products (the objects to be labeled 23) on the conveyor belt.
As shown in Figure 1, the automatic object labeling system 10 includes an image capture device 11, an object detection module 12, an association comparison module 13, and a database 14. The image capture device 11 is installed at a station 21 of the assembly line 20 and, as described in step S21 of Figure 2, captures M consecutive frames F0-Fm (M being a positive integer greater than 1) of the operating procedure performed at that station 21. In some embodiments of this specification, the image capture device 11 may include at least one digital still or video camera.
In this embodiment, when the image capture device 11 films the various plastic products (objects to be labeled) 23 passing on the conveyor belt at station 21 of the assembly line 20 for about 4.5 minutes, it obtains about 8100 consecutive frames (that is, M equals 8100). The frame rate of the M consecutive frames F0-Fm captured by the image capture device 11 is 30 frames per second (FPS).
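As a concrete illustration of step S21, the sketch below shows what such a capture loop could look like in Python with OpenCV. It is a minimal sketch under stated assumptions: the camera index, the use of OpenCV, and the in-memory frame list are not specified by the patent, which gives only the 4.5-minute duration and the 30 FPS rate.

```python
# Illustrative sketch of step S21 (not the patent's implementation):
# capture M consecutive frames at one station of the assembly line.
import cv2

def capture_frames(camera_index=0, fps=30, minutes=4.5):
    cap = cv2.VideoCapture(camera_index)   # assumed camera interface
    m = int(fps * 60 * minutes)            # M = 8100 frames for 4.5 min at 30 FPS
    frames = []
    while len(frames) < m:
        ok, frame = cap.read()
        if not ok:
            break                          # stop early if the stream ends
        frames.append(frame)
    cap.release()
    return frames                          # F0 .. Fm
```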
The object detection module 12 performs an object detection step. First, as described in step S22 of Figure 2, the object detection module 12 delimits, in each of the M consecutive frames F0-Fm, a plurality of object-to-be-labeled blocks 23a and/or one workpiece block 22a.
For example, in some embodiments of this specification, step S22 of using the object detection module 12 to delimit the object-to-be-labeled blocks 23a and/or the workpiece block 22a may employ an object detection AI model such as (but not limited to) Faster Region-based Convolutional Neural Networks (Faster R-CNN) or YOLOv4. In some embodiments, the object detection AI model may be applied to each of the M consecutive frames F0-Fm so as to delimit, in each frame, a plurality of object-to-be-labeled blocks 23a framing the plastic products (objects to be labeled) 23 shown in that frame.
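A hypothetical sketch of this per-frame detection pass follows. The `detector` callable and its (label, box) output format are assumptions standing in for whatever trained Faster R-CNN or YOLOv4 model is used; the patent does not prescribe this interface.

```python
# Hypothetical sketch of step S22: delimit object blocks 23a and, when a glove
# appears, the workpiece block 22a in every frame. `detector` is an assumed
# stand-in for a trained Faster R-CNN / YOLOv4 model.
def detect_blocks(frames, detector):
    per_frame = []
    for frame in frames:
        detections = detector(frame)  # assumed: [(label, (x, y, w, h)), ...]
        objects = [box for label, box in detections if label == "object"]
        gloves = [box for label, box in detections if label == "workpiece"]
        per_frame.append({
            "objects": objects,                          # blocks 23a
            "workpiece": gloves[0] if gloves else None,  # block 22a, if any
        })
    return per_frame
```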
In this embodiment, as shown in Figure 3, the M consecutive frames F0-Fm contain K consecutive operation frames K0-K6 in which the glove 22 worn by the picking worker appears. Frames K0-K2 show the worker's glove 22 moving toward the target object 24; frame K3 shows the glove 22 beginning to grasp the target object 24; and frames K4-K6 show the glove 22, having grasped the target object 24, moving out of the frame. When the content of such a frame (for example, one of the K consecutive operation frames K0-K6) includes the glove 22 (the workpiece) worn by the picking worker, the object detection module 12 delimits not only the plastic products (objects to be labeled) 23 but also a workpiece block 22a framing the glove 22 (the workpiece) in that frame.
In some embodiments of this specification, step S22 of using the object detection module 12 to delimit the object-to-be-labeled blocks 23a and/or the workpiece block 22a includes the following sub-steps. First, for each of the K consecutive operation frames K0-K6, the intersection ratio (intersection rate over objects) between the workpiece block 22a and each object-to-be-labeled block 23a (including the target object 24) in that frame is computed. Then, from the K consecutive operation frames K0-K6, the frames having the largest intersection ratio (for example, operation frames K3-K6) are selected, so as to determine that the workpiece (glove 22) performs a particular operating procedure on the target object 24. For example, in some embodiments of this specification, when the number of frames (K3-K6) having the largest intersection ratio among the K consecutive operation frames K0-K6 is greater than H (for example, 4 frames), it can be confirmed that the particular kind of plastic product (target object 24) has been grasped by the glove (workpiece block 22a), where H is a positive integer greater than 2 (H > 2); in this embodiment, H may be 3.
In this embodiment, the maximum intersection ratio between the workpiece block 22a and each object-to-be-labeled block 23a in the corresponding frame (consecutive operation frames K0-K6) is the maximum overlap-area ratio between the workpiece block 22a and that object-to-be-labeled block 23a, as in Equation (1): intersection rate over objects = overlap area(block 22a, block 23a) / area(block 23a) ... (1)
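The following sketch implements Equation (1) and the H-consecutive-frame rule described above. The (x, y, w, h) box format and the helper names are assumptions for illustration; the decision rule follows the text: the grasp is confirmed once more than H consecutive operation frames show an object block overlapped by the workpiece block.

```python
# Sketch of Equation (1) and the grasp-confirmation rule
# (assumed box format: (x, y, w, h) in pixels).
def intersection_rate_over_object(work_box, obj_box):
    ax, ay, aw, ah = work_box
    bx, by, bw, bh = obj_box
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))  # overlap width
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))  # overlap height
    return (ix * iy) / float(bw * bh)  # Equation (1): overlap / object area

def grasp_confirmed(operation_frames, h=3):
    """Return True once more than H consecutive operation frames (K0..K6)
    contain an object block overlapped by the workpiece block."""
    consecutive = 0
    for f in operation_frames:
        work, objs = f["workpiece"], f["objects"]
        if work is None or not objs:
            consecutive = 0
            continue
        best = max(intersection_rate_over_object(work, o) for o in objs)
        consecutive = consecutive + 1 if best > 0 else 0
        if consecutive > h:          # H = 3 in this embodiment
            return True
    return False
```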
In some embodiments of this specification, as described in step S23 of Figure 2, several of the M consecutive frames F0-Fm may optionally be selected as training frames F0-Ft (as shown in Figure 3) to train and build, with manual verification, an AI model (the object detection module 12) that performs step S22 of delimiting the object-to-be-labeled blocks 23a and/or the workpiece block 22a. In this embodiment, for example, machine learning (ML) based on convolutional-neural-network algorithms may be used to build this AI model (object detection module 12), and 120 frames within the first minute of the 8100 consecutive frames (M equals 8100) filmed over 4.5 minutes by the image capture device 11 are selected as training frames F0-Ft to train the AI model (object detection module 12) with manual verification. In some embodiments of this specification, AI models suitable for building the object detection module 12 include, for example, object detection models such as Faster Region-based Convolutional Neural Networks (Faster R-CNN), YOLOv4, or a combination thereof.
Next, as described in step S241 of Figure 2, a detection frame FD is selected from the M consecutive frames F0-Fm. In some embodiments of this specification, the detection frame FD is the frame from which it can first be determined that the picking worker's glove 22 (workpiece) has grasped the particular kind of plastic product (target object) 24. In this embodiment, the training frames F0-Ft are first excluded from the M consecutive frames F0-Fm shown in Figure 3, leaving the remaining frames (for example, consecutive frames Ft+1-Fm); then, among the K (K being a positive integer greater than 1) consecutive operation frames that contain a workpiece block 22a (where the glove 22 appears), those having the largest intersection ratio (K3-K6) are considered, and the first of them (for example, operation frame K3) is selected as the detection frame FD.
At the same time, as described in step S242 of Figure 2, the position range S of the target object 24 is delimited in the detection frame FD. In some embodiments of this specification, delimiting the position range S of the target object 24 in the detection frame FD includes the following step: among the object-to-be-labeled blocks 23a in the detection frame FD, select the block framing range of the one having the largest intersection ratio with the workpiece block 22a. The object-to-be-labeled block 23a within that framing range can then be labeled as the target object 24, and the position range S of the target object 24 in the detection frame FD is thereby located.
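Continuing the earlier sketches, selecting the target object in the detection frame FD reduces to a maximum-intersection search; the dictionary layout reuses the assumed structures from the sketches above.

```python
# Sketch of step S242: the object block 23a with the largest intersection
# ratio with the workpiece block 22a is labeled as target object 24, and its
# bounding box is the position range S.
def locate_target(detection_frame):
    work = detection_frame["workpiece"]
    s = max(detection_frame["objects"],
            key=lambda box: intersection_rate_over_object(work, box))
    return s  # position range S of the target object 24 in FD
```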
Then, as described in step S25 of Figure 2, the method backtracks from the detection frame FD: the N-th backtracked frame FN (for example, operation frame K0) is selected from the K consecutive operation frames K0-K6, and a labeled image of the target object 24 is obtained from the N-th backtracked frame FN according to the position range S of the target object 24 in the detection frame FD (and the corresponding relative position ranges S'). Backtracking from the detection frame FD includes the following sub-steps. First, from the horizontal moving speed V of the assembly line and the field-of-view length L of the detection frame FD, the relative position ranges S' corresponding to the position range S of the target object 24 in the detection frame FD are estimated. The N-th backtracked frame FN is then selected.
The sub-step of estimating the relative position range S' includes: dividing the horizontal moving speed V of the assembly line by the field-of-view length L of the detection frame FD (V/L) to compute the time the target object 24 needs to travel from entering to leaving the field of view of the detection frame FD. From the frame rate (FPS) of the M consecutive frames F0-Fm, the displacement per consecutive frame (consecutive operation frames K0-K6) can then be estimated, and the method backtracks N frames (for example, N = 3 in Figure 3) from the detection frame FD (for example, operation frame K3) to an earlier frame (for example, operation frame K0). The position range S in the detection frame FD (for example, operation frame K3) is thereby converted into the relative position range S' in the backtracked frame (for example, operation frame K0).
In some embodiments of this specification, N can be obtained as the Gauss bracket (floor) of one sixth of the frame rate of the M consecutive frames F0-Fm: N = [FPS/6]. In this embodiment, for example, the horizontal moving speed V is expressed as the number of frames the target object 24 takes from entering the field of view to leaving it (for example, V = 58). Together with the size of the detection frame FD (height × width, h × w), the displacement corresponding to the backtracked frames can be estimated as a length of approximately this many pixels: pixels = int(w/V) × int(FPS/6) = int(1920/58) × int(30/6) = 33 × 5 = 165
The position range S in the detection frame FD (for example, operation frame K3) can thus be converted into the relative position range S' in each backtracked frame (operation frames K0-K2); likewise, the workpiece block 22a in the detection frame FD can be converted into the backtracked workpiece block 22a' in each backtracked frame (operation frames K0-K2).
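A sketch of this conversion follows, using the embodiment's numbers (w = 1920 pixels, V = 58 frames, FPS = 30). The sign of the shift assumes the belt moves in the +x direction of the image; that direction is an assumption, as the patent does not state it.

```python
# Sketch of the backtracking displacement estimate. With the embodiment's
# numbers this gives N = 5 and a total shift of 33 * 5 = 165 pixels.
def backtrack_offset(w=1920, v=58, fps=30):
    n = fps // 6              # N = [FPS/6] (Gauss bracket / floor)
    per_frame = w // v        # ~33 px of belt travel per frame
    return n, n * per_frame   # (N, total displacement in pixels)

def shifted_range(s, offset_px):
    x, y, w, h = s
    # S': position range S moved against the belt direction (assumed +x)
    return (x - offset_px, y, w, h)
```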
The sub-step of selecting the N-th backtracked frame FN includes: among the K consecutive operation frames K0-K6, selecting an operation frame, other than the detection frame FD, in which the target object 24 is not occluded by the workpiece block 22a (glove 22), as the N-th backtracked frame FN (for example, operation frame K0). At that point the intersection ratio between the workpiece block 22a and the relative position range S' is 0, while the intersection ratio between the backtracked workpiece block 22a' and the relative position range S' remains the largest. The image framed by the relative position range S' in the N-th backtracked frame FN is the labeled image of the target object 24.
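Picking the backtracked frame FN can then be sketched as below, reusing the helpers from the earlier sketches; the zero-intersection test mirrors the occlusion condition just described and is, like the frame dictionaries, an assumed representation.

```python
# Sketch of selecting FN: an operation frame (other than FD) in which the
# shifted range S' is no longer overlapped by the detected workpiece block,
# i.e. the target object 24 is not occluded by the glove 22.
def pick_backtracked_frame(operation_frames, detection_frame, s_shifted):
    for f in operation_frames:
        if f is detection_frame:
            continue
        work = f["workpiece"]
        if work is None or intersection_rate_over_object(work, s_shifted) == 0:
            return f  # FN: intersection ratio with block 22a is 0
    return None
```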
Subsequently, as described in step S26 of Figure 2, the labeled image is compared against the M consecutive frames F0-Fm to find at least one other labeled image similar to it. In some embodiments of this specification, the step of comparing the labeled image with the M consecutive frames F0-Fm includes an image association comparison technique using an unsupervised-learning algorithm. In this embodiment, the image association comparison technique includes the following steps: first, a plurality of feature parameters P1-Pn of the labeled image are extracted (feature extraction); these feature parameters P1-Pn are then fed into the association comparison module 13 to find at least one other labeled image similar to the labeled image.
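As one possible reading of step S26, the sketch below pairs a simple feature extractor with a cosine-similarity association. Both the color-histogram features and the 0.9 threshold are assumptions for illustration; the patent specifies only that feature parameters P1-Pn are extracted and matched by an unsupervised association comparison.

```python
# Illustrative sketch of step S26 (feature extraction + association
# comparison). The color histogram and cosine threshold are assumptions,
# not the patent's prescribed algorithm.
import cv2
import numpy as np

def features(img, bins=32):
    hist = cv2.calcHist([img], [0, 1, 2], None, [bins] * 3, [0, 256] * 3)
    hist = hist.flatten()
    return hist / (np.linalg.norm(hist) + 1e-9)  # P1..Pn, L2-normalized

def associate(labeled_img, candidate_crops, threshold=0.9):
    p = features(labeled_img)
    matches = []
    for crop in candidate_crops:
        if float(np.dot(p, features(crop))) >= threshold:
            matches.append(crop)  # another labeled image of the same target
    return matches
```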
Then, as described in step S27 of Figure 2, the representative labeled image and the other labeled images are transmitted to the database 14 of the automatic object labeling system 10 and stored as the same labeled data set 14a, completing the automatic labeling of a target object 24.
In some embodiments of this specification, steps S21 to S25 may also be repeated, at the same station 21 of the assembly line 20 or at a different station (not shown) and with a different operating procedure, to carry out the next object labeling procedure on other kinds of objects to be labeled on the assembly line 20. For example, in some embodiments, a picking worker wearing gloves of a different color or style (not shown) as the workpiece may grasp, at the same station 21 or a different station (not shown), another kind of plastic product (object to be labeled) 23 from the conveyor belt as the target object (not shown) of the next labeling procedure, and steps S21 to S25 are repeated to label that other kind of plastic product on the assembly line 20.
Comparing the automatic object labeling method provided by the embodiments of this specification with traditional manual labeling shows that the labeling efficiency of the automatic method is 68 times that of manual labeling; the accuracy of selecting the detection frame FD (judging that the glove 22 has grasped the particular target object 24) reaches 86.73%; and the recall of the object image comparison reaches 92.68%. The automatic object labeling method provided by the embodiments of this specification can thus greatly increase the data labeling efficiency of an AI system and reduce the manpower cost and time of data labeling. Moreover, because the method uses an objective criterion (whether the workpiece in the operating procedure performs the specified operation on the target object, that is, the glove 22 grasping the particular target object 24), it also mitigates the machine-learning inaccuracy that human subjective bias introduces into manual labeling.
According to the above embodiments, this specification provides an automatic object labeling method and system. A two-stage artificial-intelligence module labels the objects to be labeled in M consecutive frames captured at one station of an assembly line. First, an AI model is trained and built with manual verification; this model then uses the workpiece in the operating procedure as a reference to identify the target object participating in that procedure and to determine its position range. After the image feature parameters of the target object are extracted, an unsupervised-learning algorithm (for example, association comparison) uses these feature parameters to find, among the M consecutive frames, the image of at least one other object similar to the target object, and both are given the same label.
Labeling data with an AI module greatly reduces the manpower cost and time of data labeling. And because labeling rests on an objective criterion (whether the workpiece in the operating procedure performs the specified operation on the target object), it also mitigates the machine-learning inaccuracy that human subjective bias introduces when the prior art labels data manually.
Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit the invention. Anyone of ordinary skill in the art may make minor changes and refinements without departing from the spirit and scope of the invention; the scope of protection of the invention is therefore defined by the appended claims.
S21, S22, S23, S241, S242, S25, S26, S27: Steps
Claims (18)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW111149431A TWI830549B (en) | 2022-12-22 | 2022-12-22 | Objects automatic labeling method and system applying the same |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW111149431A TWI830549B (en) | 2022-12-22 | 2022-12-22 | Objects automatic labeling method and system applying the same |
Publications (1)
Publication Number | Publication Date |
---|---|
TWI830549B true TWI830549B (en) | 2024-01-21 |
Family ID: 90459285
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW111149431A TWI830549B (en) | 2022-12-22 | 2022-12-22 | Objects automatic labeling method and system applying the same |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI830549B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TW201915943A (en) * | 2017-09-29 | 2019-04-16 | 香港商阿里巴巴集團服務有限公司 | Method, apparatus and system for automatically labeling target object within image |
TW201926140A (en) * | 2017-11-23 | 2019-07-01 | 財團法人資訊工業策進會 | Method, electronic device and non-transitory computer readable storage medium for image annotation |
TW201937405A (en) * | 2018-02-26 | 2019-09-16 | 財團法人工業技術研究院 | System and method for object labeling |
US20220101016A1 (en) * | 2020-09-30 | 2022-03-31 | Invisible AI Inc. | Assembly monitoring system |
US20220292306A1 (en) * | 2021-03-15 | 2022-09-15 | Nvidia Corporation | Automatic labeling and segmentation using machine learning models |
- 2022-12-22: TW application TW111149431A filed; patent TWI830549B (en) active
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021047232A1 (en) | Interaction behavior recognition method, apparatus, computer device, and storage medium | |
CN107194559B (en) | Workflow identification method based on three-dimensional convolutional neural network | |
US10930037B2 (en) | Image processing device for displaying object detected from input picture image | |
CN104992449A (en) | Information identification and surface defect on-line detection method based on machine visual sense | |
CN110991360B (en) | Robot inspection point position intelligent configuration method based on visual algorithm | |
CN106960175A (en) | The first visual angle dynamic gesture detection method based on depth convolutional neural networks | |
CN111507261A (en) | Process operation quality monitoring method based on visual target positioning | |
CN115330734A (en) | Automatic robot repair welding system based on three-dimensional target detection and point cloud defect completion | |
CN112801965A (en) | Sintering belt foreign matter monitoring method and system based on convolutional neural network | |
CN111985387A (en) | Helmet wearing early warning method and system based on deep learning | |
CN116259002A (en) | Human body dangerous behavior analysis method based on video | |
CN115035088A (en) | Helmet wearing detection method based on yolov5 and posture estimation | |
CN113111771A (en) | Method for identifying unsafe behaviors of power plant workers | |
CN114926781A (en) | Multi-user time-space domain abnormal behavior positioning method and system supporting real-time monitoring scene | |
CN115512134A (en) | Express item stacking abnormity early warning method, device, equipment and storage medium | |
TWI830549B (en) | Objects automatic labeling method and system applying the same | |
CN113420839B (en) | Semi-automatic labeling method and segmentation positioning system for stacking planar target objects | |
CN115564031A (en) | Detection network for glass defect detection | |
CN114782860A (en) | Violent behavior detection system and method in monitoring video | |
CN113642473A (en) | Mining coal machine state identification method based on computer vision | |
CN113139946A (en) | Shirt stain positioning device based on vision | |
CN102214307B (en) | Image matching method | |
CN111428813A (en) | Panel number identification and pressing method based on deep learning | |
CN113297910B (en) | Distribution network field operation safety belt identification method | |
CN117409332B (en) | Long wood shaving appearance data detection system and method based on big data processing |