TW202201275A - Device and method for scoring hand work motion and storage medium


Info

Publication number
TW202201275A
Authority
TW
Taiwan
Prior art keywords
hand
data
image
key point
model
Prior art date
Application number
TW109122169A
Other languages
Chinese (zh)
Other versions
TWI776176B (en)
Inventor
范亮
張橋
林定遠
陳淑如
謝巧琳
陳怡靜
危清清
Original Assignee
大陸商富泰華工業(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 大陸商富泰華工業(深圳)有限公司
Publication of TW202201275A
Application granted
Publication of TWI776176B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/28 Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06398 Performance of employee with respect to a job function
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Educational Administration (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Operations Research (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Game Theory and Decision Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Marketing (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Manipulator (AREA)

Abstract

A method for scoring hand work motion includes: converting acquired hand work image frames into HSV images and obtaining, from the HSV images, binarized images representing the region where the skin is located and multiple rectangular frames representing the regions where the hands are located; segmenting first hand images from the binarized images based on the multiple rectangular frames and inputting the first hand images into a first predetermined model; building tracking data that accumulates over the tracking time based on the analysis results of the first predetermined model; invoking a predetermined hand series detection model to detect the hand work image frames again if the hand key point coordinates obtained by the first predetermined model are unreliable; using a second predetermined model to assign hand labels to each hand in the tracking data for data classification; preprocessing the classified tracking data to obtain refined data for each hand; and scoring the hand work motions of each hand based on the refined data of that hand and standard refined data. A device for scoring hand work motion and a storage medium are also provided.

Description

Device, method and computer-readable storage medium for scoring hand work motions

The present invention relates to the technical field of data processing, and in particular to a device, a method, and a computer-readable storage medium for scoring hand work motions.

In recent years, with the development of deep learning technology, conventional hand recognition techniques generally perform hand recognition with hand detection models built on various convolutional neural network architectures. Such models usually require that only one hand be present in the input image, which is a considerable limitation.

In modern factories, the accuracy of workers' hand work motions affects the yield and efficiency of the production line. Conventionally, a worker's hand work motions are evaluated by an assessor who observes and scores them in real time; the accuracy of such an evaluation is affected by the assessor's human factors and cannot be guaranteed.

In view of this, it is necessary to provide a device, a method, and a computer-readable storage medium for scoring hand work motions that can intelligently analyze the differences between a worker's hand work motions and standard work motions and give a corresponding score.

An embodiment of the present invention provides a method for scoring hand work motions. The method includes: acquiring a hand work video and decoding the hand work video to obtain hand work image frames; converting a hand work image frame into an HSV image, and obtaining from the HSV image a binarized image representing the region where the skin is located and a plurality of rectangular frames representing the regions where the hands are located; segmenting a first hand image from the binarized image according to the plurality of rectangular frames and analyzing the first hand image with a first preset model; constructing, based on the analysis results of the first preset model, tracking data that accumulates over the tracking time; monitoring the tracking data with a second preset model and assigning a hand label to each hand in the tracking data so as to classify the tracking data based on the hand labels, wherein each hand corresponds to a unique hand label; preprocessing the classified tracking data to obtain refined data for each hand; and scoring the work motions of each hand according to the refined data of that hand and the refined data of a reference hand work.

An embodiment of the present invention provides a device for scoring hand work motions. The device includes a processor and a memory, the memory stores a plurality of computer programs, and when the processor executes the computer programs stored in the memory, the following steps are implemented: acquiring a hand work video and decoding the hand work video to obtain hand work image frames; converting a hand work image frame into an HSV image, and obtaining from the HSV image a binarized image representing the region where the skin is located and a plurality of rectangular frames representing the regions where the hands are located; segmenting a first hand image from the binarized image according to the plurality of rectangular frames and analyzing the first hand image with a first preset model; constructing, based on the analysis results of the first preset model, tracking data that accumulates over the tracking time; monitoring the tracking data with a second preset model and assigning a hand label to each hand in the tracking data so as to classify the tracking data based on the hand labels, wherein each hand corresponds to a unique hand label; preprocessing the classified tracking data to obtain refined data for each hand; and scoring the work motions of each hand according to the refined data of that hand and the refined data of a reference hand work.

An embodiment of the present invention provides a computer-readable storage medium. The computer-readable storage medium stores a plurality of instructions that can be executed by one or more processors to implement the following steps: acquiring a hand work video and decoding the hand work video to obtain hand work image frames; converting a hand work image frame into an HSV image, and obtaining from the HSV image a binarized image representing the region where the skin is located and a plurality of rectangular frames representing the regions where the hands are located; segmenting a first hand image from the binarized image according to the plurality of rectangular frames and analyzing the first hand image with a first preset model; constructing, based on the analysis results of the first preset model, tracking data that accumulates over the tracking time; monitoring the tracking data with a second preset model and assigning a hand label to each hand in the tracking data so as to classify the tracking data based on the hand labels, wherein each hand corresponds to a unique hand label; preprocessing the classified tracking data to obtain refined data for each hand; and scoring the work motions of each hand according to the refined data of that hand and the refined data of a reference hand work.

Compared with the prior art, the above device, method, and computer-readable storage medium for scoring hand work motions can process real-time video of an evaluated person's hand work, accurately locate the hand key point features during the hand work, compare them with standard hand work motions, intelligently analyze the differences between the evaluated person's hand work motions and the standard work motions, and give a corresponding score. This facilitates the evaluation of workers and helps improve the yield and efficiency of the production line.

Please refer to FIG. 1, which is a schematic diagram of a preferred embodiment of the hand work motion scoring device of the present invention.

The hand work motion scoring device 100 can analyze a worker's hand work motions and score them by comparing them with reference hand work motions. The hand work motion scoring device 100 may include a memory 10, a processor 20, and a hand work motion scoring program 30 stored in the memory 10 and runnable on the processor 20. When the processor 20 executes the hand work motion scoring program 30, the steps in the embodiments of the hand work motion scoring method are implemented, for example steps S500 to S512 shown in FIG. 5. Alternatively, when the processor 20 executes the hand work motion scoring program 30, the functions of the modules in FIG. 2, for example modules 101 to 108, are implemented.

The hand work motion scoring program 30 may be divided into one or more modules, and the one or more modules are stored in the memory 10 and executed by the processor 20 to complete the present invention. The one or more modules may be a series of computer program instruction segments capable of accomplishing specific functions, and the instruction segments describe the execution of the hand work motion scoring program 30 in the hand work motion scoring device 100. For example, the hand work motion scoring program 30 may be divided into the acquisition module 101, the first detection module 102, the analysis module 103, the tracking module 104, the second detection module 105, the sorting module 106, the calibration module 107, and the scoring module 108 shown in FIG. 2. For the specific functions of the modules, refer to the description of the modules in FIG. 2 below.

Those skilled in the art can understand that the schematic diagram is only an example of the hand work motion scoring device 100 and does not limit the hand work motion scoring device 100, which may include more or fewer components than shown, combine certain components, or use different components. For example, the hand work motion scoring device 100 may further include an input/display device, a communication module, a bus, and the like.

The processor 20 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor 20 may be any conventional processor. The processor 20 may connect to the various parts of the hand work motion scoring device 100 through various interfaces and buses.

The memory 10 may be used to store the hand work motion scoring program 30 and/or its modules. The processor 20 implements the various functions of the hand work motion scoring device 100 by running or executing the computer programs and/or modules stored in the memory 10 and by calling the data stored in the memory 10. The memory 10 may include a high-speed random access memory and may further include a non-volatile memory, for example a hard disk drive, an internal memory, a plug-in hard disk drive, a smart media card (SMC), a secure digital (SD) card, a flash memory card, at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.

FIG. 2 is a functional module diagram of a preferred embodiment of the hand work motion scoring program of the present invention.

Referring to FIG. 2 and FIG. 3, the hand work motion scoring program 30 may include the acquisition module 101, the first detection module 102, the analysis module 103, the tracking module 104, the second detection module 105, the sorting module 106, the calibration module 107, and the scoring module 108. In one embodiment, the above modules may be programmable software instructions stored in the memory 10 and callable by the processor 20 for execution. It can be understood that in other embodiments the above modules may also be program instructions or firmware embedded in the processor 20.

The acquisition module 101 is used to acquire a hand work video and decode the hand work video to obtain hand work image frames.

In one embodiment, a video recording device (for example, a camera) may be used to record a hand work video of one or more designated workers (the evaluated persons) on a designated production line while they perform hand work. The hand work motion scoring device 100 can communicate with the video recording device so that the acquisition module 101 can acquire the hand work video. When the acquisition module 101 acquires the hand work video, it can decode the hand work video to obtain a plurality of hand work image frames arranged in sequence.
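
A minimal sketch of this acquisition and decoding step, assuming OpenCV is used (the patent does not name a decoding library) and with a hypothetical file name:

import cv2  # OpenCV, assumed here for video decoding

def decode_hand_work_video(video_path):
    """Decode a hand work video into an ordered list of image frames."""
    capture = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = capture.read()  # read frames in sequence until the stream ends
        if not ok:
            break
        frames.append(frame)
    capture.release()
    return frames

frames = decode_hand_work_video("hand_work.mp4")  # hypothetical file name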

In one embodiment, a hand work image frame may include one or more hands, and the acquisition module 101 may pass each hand work image frame in sequence to the first detection module 102 for analysis.

The first detection module 102 is used to convert a hand work image frame into an HSV (Hue, Saturation, Value) image and obtain from the HSV image a binarized image representing the region where the skin is located and a plurality of rectangular frames representing the regions where the hands are located.

In one embodiment, the first detection module 102 may convert the hand work image frame transmitted by the acquisition module 101 into an HSV image, and then obtain from the HSV image a binarized image representing the region where the skin is located and a series of rectangular frames representing the regions where the hands are located. These rectangular frames can be passed to the analysis module 103 for hand key point analysis.

In one embodiment, the first detection module 102 obtains the binarized image representing the region where the skin is located and the plurality of rectangular frames representing the regions where the hands are located from the HSV image according to dynamic upper and lower limits of the H channel, dynamic upper and lower limits of the S channel, and dynamic upper and lower limits of the V channel. This avoids the problem that fixed upper and lower limits of the three HSV channels (namely the H, S, and V channels) cannot adapt to many different hand detection situations, and prevents missed or false hand detections. The first detection module 102 can dynamically update its upper and lower limits of the three HSV channels. Specifically, the first detection module 102 can update the limits according to the feedback result of the analysis module 103: when the feedback result given by the analysis module 103 is a positive result, the first detection module 102 can update the upper and lower limits of the H channel, the S channel, and the V channel according to a preset update rule; when the feedback result given by the analysis module 103 is a negative result, the first detection module 102 does not update the upper and lower limits of the three HSV channels. The preset update rule can be set in advance according to actual usage requirements and predefines the adjustment rule for the upper and lower limits of each HSV channel.
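
A minimal sketch of this skin segmentation step, assuming OpenCV and NumPy; the initial bound values, the minimum contour area, and the update step are illustrative assumptions rather than values taken from the patent:

import cv2
import numpy as np

# Dynamic HSV bounds; the initial (default) values below are illustrative assumptions.
hsv_lower = np.array([0, 30, 60], dtype=np.uint8)
hsv_upper = np.array([20, 150, 255], dtype=np.uint8)

def detect_hand_regions(frame_bgr, lower, upper):
    """Return the skin binarized image and bounding rectangles of candidate hand regions."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    binary = cv2.inRange(hsv, lower, upper)  # binarized image of the skin region
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    rects = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > 500]
    return binary, rects

def update_bounds(lower, upper, positive_feedback, step=2):
    """Adjust the channel bounds only when the analysis module reports a positive result."""
    if positive_feedback:
        lower = np.clip(lower.astype(int) - step, 0, 255).astype(np.uint8)
        upper = np.clip(upper.astype(int) + step, 0, 255).astype(np.uint8)
    return lower, upper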

It can be understood that the first detection module 102 updates the upper and lower limits of the three HSV channels according to the feedback result that the analysis module 103 outputs for the current hand work image frame; the updated limits are then used to detect the next hand work image frame, so as to obtain the binarized image representing the region where the skin is located and the plurality of rectangular frames representing the regions where the hands are located for that next frame.

In one embodiment, when the hand work image frame is the starting frame, the first detection module 102 may use preset upper and lower limits of the three HSV channels to obtain the binarized image representing the region where the skin is located and the plurality of rectangular frames representing the regions where the hands are located from the HSV image. That is, when the hand work image frame is the starting frame, the upper and lower limits of the H channel are first default values, the upper and lower limits of the S channel are second default values, and the upper and lower limits of the V channel are third default values.

In one embodiment, when the hand work video of the evaluated person is captured, the surroundings of the evaluated person may also be captured. The first detection module 102 detects hands by analyzing the difference between skin-color pixel values and environment pixel values in the image. When the environment contains other objects with pixel values close to the skin color, when the skin color of the evaluated person is not within the default upper and lower limits of the three HSV channels, or when the lighting in the environment changes, hands may be misidentified. Therefore, the analysis module 103 is introduced to update the preset upper and lower limits of the three HSV channels.

The analysis module 103 is used to segment a first hand image from the binarized image according to the plurality of rectangular frames and to analyze the first hand image with a first preset model.

In one embodiment, the analysis module 103 may segment the image regions to be analyzed from the binarized image according to the plurality of rectangular frames; the segmented image regions are the first hand images, which are then input into the first preset model for analysis. The first preset model may be a pre-trained hand key point analysis model. For each rectangular frame, the hand key point analysis model can obtain the coordinates of 21 hand key points, a confidence, and a hand feature vector. The confidence characterizes the robustness of the hand key point analysis model. The hand key point coordinates and confidences obtained by the hand key point analysis model can be passed to the tracking module 104, and the hand feature vectors can be passed to the second preset model (the sorting module 106).

In one embodiment, the hand key point analysis model may be trained on a preset hand key point training data set and may consist of two cascaded Hourglass networks, several residual modules, and at least one convolutional layer. During model training, the input image of the model may be a [256, 256, 3] color RGB three-channel hand image, and the output may be a [21, 64, 64] heat map with 21 channels. From the heat map, the model can obtain the positions of the 21 key points in the input image, that is, the coordinates of the 21 hand key points of the input hand image.
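
A minimal sketch of decoding such a [21, 64, 64] heat map into key point coordinates, assuming NumPy; taking the peak value of each channel as that point's confidence is an assumption made here for illustration:

import numpy as np

def heatmaps_to_keypoints(heatmaps, image_size=256):
    """Convert a [21, 64, 64] heat map into 21 (x, y) coordinates plus per-point confidences."""
    num_points, h, w = heatmaps.shape  # expected: 21, 64, 64
    keypoints = np.zeros((num_points, 2), dtype=np.float32)
    confidences = np.zeros(num_points, dtype=np.float32)
    for i in range(num_points):
        idx = np.argmax(heatmaps[i])
        y, x = np.unravel_index(idx, (h, w))
        # scale the heat map grid back to the 256 x 256 input image
        keypoints[i] = (x * image_size / w, y * image_size / h)
        confidences[i] = heatmaps[i, y, x]
    return keypoints, confidences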

In one embodiment, the hand key point analysis model may further calculate a mean confidence from the confidence corresponding to each rectangular frame and determine whether the mean confidence is greater than a preset value, so as to return a positive result or a negative result to the first detection module 102. Specifically, if the mean confidence is greater than the preset value, the hand key point analysis model outputs a positive result and feeds it back to the first detection module 102, which then updates the upper and lower limits of the three HSV channels according to the positive result. If the mean confidence is not greater than the preset value, the hand key point analysis model outputs a negative result and feeds it back to the first detection module 102, which then does not update the upper and lower limits of the three HSV channels this time. The preset value can be set and adjusted according to actual needs and is not limited here.
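
A minimal sketch of this feedback decision; the preset value of 0.6 is an arbitrary placeholder, not a value from the patent:

def confidence_feedback(confidences_per_rect, preset_value=0.6):
    """Return True (positive result) if the mean confidence over all rectangular frames exceeds the preset value."""
    mean_confidence = sum(confidences_per_rect) / max(len(confidences_per_rect), 1)
    return mean_confidence > preset_value

# The first detection module refreshes its HSV bounds only on a positive result, e.g.:
# hsv_lower, hsv_upper = update_bounds(hsv_lower, hsv_upper, confidence_feedback(confs))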

The tracking module 104 is used to construct, based on the analysis results of the first preset model, tracking data that accumulates over the tracking time.

In one embodiment, the first preset model may be the hand key point analysis model, and the analysis results of the first preset model include the hand key point coordinates and confidences. The tracking module 104 can construct and maintain, from the hand key point coordinates and confidences obtained by the hand key point analysis model, tracking data that accumulates over the tracking time. The tracking data is a structure containing hand key point coordinates and confidences.

In one embodiment, while constructing and maintaining the tracking data, for the data of each hand work image frame received and processed by the hand key point analysis model, the tracking module 104 judges, based on the confidence, whether the tracking data generated from the current hand work image frame is reliable. If it is reliable, the tracking module 104 adds the tracking data generated from the current hand work image frame to the tracking data it maintains. If it is not reliable, the tracking module 104 calls the second detection module 105 to detect the current hand work image frame again. The second detection module 105 may include at least one preset hand series detection model, for example a pre-trained YOLO model and a pre-trained SSD model. Each preset hand series detection model detects the current hand work image frame separately and obtains its own series of rectangular frames, which are input into the hand key point analysis model for analysis, yielding their respective hand key point coordinates and confidences. The tracking module 104 comprehensively compares the hand key point coordinates and confidences corresponding to each processing path and selects the set of hand key point coordinates and confidences that best matches the current tracking data to maintain in the tracking data it constructs.

In one embodiment, the tracking module 104 may judge whether the tracking data generated from the current hand work image frame is reliable based on the confidence as follows: calculate the mean confidence and judge whether the mean confidence exceeds a preset threshold; if it does, the tracking data generated from the current hand work image frame is judged reliable, otherwise it is judged unreliable. The preset threshold can be set and adjusted according to actual needs. The rule by which the tracking module 104 selects the set of hand key point coordinates and confidences that best matches the current tracking data may be any one of the following: a. the set with the highest mean confidence; b. since the hand key point coordinates of consecutive hand work image frames should change little, the set whose hand key point coordinates best match the key point coordinates of the preceding and following hand work image frames (for example, the smallest Euclidean distance between the hand key point coordinates); c. a combination of rule a and rule b, in which different weight coefficients are set for rule a and rule b and the selection is made according to their combined result.
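
A minimal sketch of such a selection under rule c, assuming NumPy arrays of shape [21, 2] (or [21, 3]) for the key points; the weight coefficients are illustrative assumptions:

import numpy as np

def select_best_candidate(candidates, previous_keypoints, w_conf=0.5, w_dist=0.5):
    """Pick the (keypoints, confidences) pair that best matches the current tracking data.

    candidates: list of (keypoints [21, 2], confidences [21]) pairs, one per detection model.
    previous_keypoints: keypoints [21, 2] from the previous reliable frame.
    """
    best, best_score = None, -np.inf
    for keypoints, confidences in candidates:
        mean_conf = float(np.mean(confidences))                                         # rule a
        dist = float(np.mean(np.linalg.norm(keypoints - previous_keypoints, axis=1)))   # rule b
        score = w_conf * mean_conf - w_dist * dist                                      # rule c: weighted combination
        if score > best_score:
            best, best_score = (keypoints, confidences), score
    return best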

For example, if the second detection module 105 includes a YOLO model and an SSD model, the tracking module 104 comprehensively compares the hand key point coordinates and confidences corresponding to two processing paths: the first group comes from the series of rectangular frames obtained by detecting the hand work image frame with the YOLO model, and the second group comes from the series of rectangular frames obtained by detecting the hand work image frame with the SSD model.

In one embodiment, when the tracking module 104 judges, based on the confidence, that the tracking data generated from the current hand work image frame is unreliable, the tracking module 104 may also directly call the YOLO model or the SSD model in the second detection module 105 to detect the current hand work image frame again, obtain the corresponding hand key point coordinates and confidences, and maintain them in the tracking data (that is, without comprehensively comparing the hand key point coordinates and confidences corresponding to the YOLO model and the SSD model).

In one embodiment, when the tracking module 104 constructs the tracking data accumulated over the tracking time from the hand key point coordinates and confidences obtained by the hand key point analysis model, it may also judge, based on the confidence corresponding to each rectangular frame, whether the hand key point coordinates obtained by the hand key point analysis model from the first hand image of the current hand work image frame are reliable. If they are judged unreliable, the tracking module 104 calls a plurality of preset hand series detection models in the second detection module 105 to detect the hand work image frame separately and obtain a plurality of rectangular frames representing the regions where the hands are located, so as to segment a plurality of second hand images corresponding to the plurality of preset hand series detection models. The hand key point analysis model then analyzes each second hand image to obtain the hand key point coordinates, confidences, and hand feature vectors corresponding to each preset hand series detection model. The tracking module 104 comprehensively compares each set of hand key point coordinates and confidences obtained by the hand key point analysis model and selects the set that best matches the tracking data to update the tracking data it generated previously.

It can be understood that the first hand image corresponds to one set of hand key point coordinates and confidences, and each second hand image also corresponds to its own set of hand key point coordinates and confidences.

It can be understood that the second detection module 105 can be omitted. When the two hands of the evaluated person cross each other or only part of a hand region appears, the confidence of the hand key points obtained by the analysis module 103 may be low, and the YOLO model and/or SSD model in the second detection module 105 are then needed for auxiliary detection.

The sorting module 106 is used to monitor the tracking data with a second preset model and assign a hand label to each hand in the tracking data, so as to classify the tracking data based on the hand labels.

In one embodiment, each hand corresponds to a unique hand label. The second preset model may be a pre-trained hand ReID model. The sorting module 106 may use the hand ReID model to monitor the tracking data and assign a hand label to each hand in the tracking data according to the hand feature vectors derived from the hand key point analysis model, so that the same hand always receives the same hand ID (hand label). The hand ReID model organizes the tracking data by hand ID so that data with the same hand ID is collected into the same data set, thereby classifying the tracking data based on hand ID; the classified and sorted tracking data may be called "ReID data".
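
A minimal sketch of how hand IDs could be assigned from feature vectors, assuming NumPy and a simple cosine-similarity gallery; the similarity threshold is an illustrative assumption, and this stands in for, rather than reproduces, the patent's trained hand ReID model:

import numpy as np

class HandIdAssigner:
    """Assign a stable hand ID to each hand feature vector by nearest-gallery matching."""

    def __init__(self, similarity_threshold=0.8):
        self.gallery = {}  # hand_id -> representative feature vector
        self.next_id = 0
        self.similarity_threshold = similarity_threshold

    def assign(self, feature):
        feature = feature / (np.linalg.norm(feature) + 1e-8)
        best_id, best_sim = None, -1.0
        for hand_id, ref in self.gallery.items():
            sim = float(np.dot(feature, ref))  # cosine similarity of normalized vectors
            if sim > best_sim:
                best_id, best_sim = hand_id, sim
        if best_id is not None and best_sim >= self.similarity_threshold:
            return best_id  # same hand seen before, even if it left and re-entered the frame
        new_id = self.next_id  # unseen hand: register a new label
        self.gallery[new_id] = feature
        self.next_id += 1
        return new_id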

In one embodiment, a plurality of videos containing the hands of different people may be recorded in advance as training data for the hand ReID model, and the output of the second Hourglass network in the hand key point analysis model is used as the input of the hand ReID model; that is, the image frames of different people's hands are input into the two cascaded Hourglass networks, and the output of the second Hourglass network is used as the input data of the hand ReID model. A Triplet Loss function is used to train the ReID model.
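
A minimal sketch of the Triplet Loss objective used for such ReID training, written with NumPy only for illustration; the margin value is an assumption, and real training would run inside a deep learning framework rather than on raw arrays:

import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet Loss: pull the anchor toward a same-hand sample and push it away from a different-hand sample."""
    d_pos = np.linalg.norm(anchor - positive)  # distance to a feature of the same hand
    d_neg = np.linalg.norm(anchor - negative)  # distance to a feature of a different hand
    return float(max(d_pos - d_neg + margin, 0.0))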

In one embodiment, when a hand moves out of the frame and later returns into the frame while the hand work video of the evaluated person is being captured, the sorting module 106 uses the second preset model to monitor the tracking data and assign a hand label to that hand in the tracking data, so as to classify the tracking data based on that hand label.

The calibration module 107 is used to preprocess the classified tracking data to obtain the refined data of each hand.

In one embodiment, the classified tracking data is the ReID data, and the preprocessing may be a preset data processing procedure. For example, the preprocessing includes: removing abnormal data from the ReID data with a preset outlier rejection algorithm, and using a preset interpolation method to perform regression on the nodes where the removed abnormal data was located. The data processed by the calibration module 107 may be called "refined data".

In one embodiment, the abnormal data may be data that clearly deviates from a preset normal data value interval. Since the tracking data is accumulated over the tracking time, to avoid gaps after the abnormal data is removed, the calibration module 107 may use a preset interpolation method to perform regression on the nodes where the removed abnormal data was located, so that approximate data is filled in at those nodes.
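
A minimal sketch of this outlier rejection and interpolation on a single coordinate track, assuming NumPy; the normal-value interval used for rejection and the choice of linear interpolation are illustrative assumptions:

import numpy as np

def refine_track(values, low, high):
    """Reject values outside the preset normal interval [low, high] and refill them by interpolation."""
    values = np.asarray(values, dtype=float)
    t = np.arange(len(values))
    valid = (values >= low) & (values <= high)  # outlier rejection against the preset interval
    if valid.sum() < 2:
        return values  # not enough reliable points to interpolate
    # linear interpolation over the rejected nodes so the time series keeps no gaps
    return np.interp(t, t[valid], values[valid])

# Example: one key point x-coordinate over time, with an obvious outlier at index 2
refined = refine_track([120, 122, 900, 125, 127], low=0, high=400)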

The scoring module 108 is used to score the work motions of each hand according to the refined data of that hand and the refined data of the reference hand work.

In one embodiment, the scoring module 108 can score the evaluated person's hand work based on the refined hand data of the standard work procedure and the refined hand data of the evaluated person's work procedure. The scoring module 108 may use the DTW (dynamic time warping) algorithm to align the time-series refined data and then score according to the Euclidean distances between the hand key point coordinates of all aligned hand work image frames. The DTW algorithm compares the similarity between the hand work motions of the standard work procedure and those of the evaluated person's work procedure by calculating the Euclidean distance between them: the lower the Euclidean distance, the higher the similarity and the higher the score; the higher the Euclidean distance, the lower the similarity and the lower the score.

Specifically, the scoring module 108 uses the DTW algorithm to align the refined data of each hand with the refined data of the reference hand work, then calculates, for each hand, the Euclidean distance between the hand key point coordinates in the refined data of that hand and the hand key point coordinates in the refined data of the reference hand work, and finally scores the work motions of each hand according to the Euclidean distance computed for that hand. This way of scoring hand work motions facilitates the evaluation of workers' hand work motions, helps ensure the yield and efficiency of the production line, can be applied to the training of new workers' hand work motions, and can be applied to other scenarios according to users' actual needs.
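
A minimal sketch of a DTW alignment cost over two key point sequences, assuming NumPy; this is a textbook DTW recursion written for illustration, not the patent's specific implementation:

import numpy as np

def frame_distance(a, b):
    """Mean Euclidean distance between two frames of [21, 3] (or [21, 2]) key points."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.mean(np.linalg.norm(a - b, axis=1)))

def dtw_distance(seq_a, seq_b):
    """Align two key point sequences with dynamic time warping and return the mean aligned distance."""
    n, m = len(seq_a), len(seq_b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m] / (n + m)  # normalize so sequences of different lengths stay comparable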

In one embodiment, the scoring module 108 uses the DTW algorithm to align the time-series data, and the time difference between the duration of the evaluated person's work procedure and the duration of the reference work procedure is also taken into account in the score. The score can be computed from the Euclidean distances of all key points of all aligned image frames: the lower the distance, the higher the score, and the higher the distance, the lower the score. However, the change in the Euclidean distance difference follows a logarithmic trend, and the larger the difference between the two work procedures, the larger the disturbance in the distance difference computed from the Euclidean distances of the key points. Therefore, dozens of groups of test data from evaluated persons and the hand data of the reference work procedure can be used to compute Euclidean distance differences and obtain an upper bound and a lower bound of the Euclidean distance difference; through a logarithmic transformation, the distance difference is then mapped to a score between 0 and 100, so that the closer the distance difference is to the upper bound, the closer the score is to zero, and the closer it is to the lower bound, the closer the score is to the full score (100 points).
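
A minimal sketch of the final logarithmic mapping from a distance difference to a 0 to 100 score; the exact form of the transformation and the example bounds are assumptions, since the patent states only that a logarithmic transformation between an empirically obtained upper bound and lower bound is used:

import math

def distance_to_score(distance, lower_bound, upper_bound):
    """Map a DTW Euclidean distance difference to a 0-100 score, logarithmically between the bounds."""
    distance = min(max(distance, lower_bound), upper_bound)  # clamp into the calibrated range
    # logarithmic position of the distance between the lower and upper bounds, in [0, 1]
    position = math.log(distance / lower_bound) / math.log(upper_bound / lower_bound)
    return 100.0 * (1.0 - position)  # lower bound -> 100 points, upper bound -> 0 points

score = distance_to_score(12.0, lower_bound=5.0, upper_bound=50.0)  # bounds here are illustrative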

As shown in FIG. 4, in one embodiment, the scoring module 108 may be subdivided into a preprocessing unit, a three-dimensional spatial coordinate alignment unit, a dynamic time warping unit, and a logarithmic transformation unit. The preprocessing unit may perform length normalization on the refined data of the evaluated person's hands and the refined data of the reference hand work. Specifically, for the left-hand and right-hand key point data appearing in each frame of the reference hand work, the length of each key point segment needs to be kept consistent across frames: the lengths of the segments between the key points appearing in all frames of the entire reference work procedure (for example, 20 segments in total) can be averaged, and the key point segment lengths of each frame are then normalized to the averaged lengths, so that the length of every key point segment is the same in every frame of the reference hand work. The same processing is applied to the evaluated person's hand work so that the length of every key point segment is the same in every frame of the evaluated person's hand work, which resolves the problem of different viewing angles. The lengths of the left-hand and right-hand key point segments appearing in each frame of the evaluated person's hand work are then adjusted to match those of the reference hand work; after this adjustment, the coordinates of the 21 key points of each of the left and right hands change, and the processing must be applied to every frame of the evaluated person's hand work, which overcomes the key point coordinate deviation caused by the different finger lengths of different people. The three-dimensional spatial coordinate alignment unit aligns the zeroth key point data of both the reference hand work and the evaluated person's hand work (the starting key point data, for example defining the coordinates of the key point at the center of the palm as the zeroth key point) to the origin of the world coordinate system, (x, y, z) = (0, 0, 0); this filters out the influence of hand displacement so that the scoring targets the hand work gestures alone. The dynamic time warping unit uses the DTW algorithm to align the time-series data (the refined data of the evaluated person's hands and the refined data of the reference hand work) and computes the Euclidean distance differences of all key points of all aligned image frames. The logarithmic transformation unit uses a logarithmic transformation to map the Euclidean distance difference to a score between 0 and 100, thereby scoring the evaluated person's hand work.
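
A minimal sketch of the coordinate alignment performed by the three-dimensional spatial coordinate alignment unit, assuming NumPy; treating index 0 as the palm-center key point follows the example given above and is otherwise only an illustrative convention:

import numpy as np

PALM_CENTER = 0  # the zeroth key point is taken as the palm center, as an illustrative convention

def align_to_origin(keypoints):
    """Translate a [21, 3] frame of key points so the palm-center key point sits at (0, 0, 0)."""
    keypoints = np.asarray(keypoints, dtype=float)
    return keypoints - keypoints[PALM_CENTER]

def align_sequence(frames):
    """Apply the origin alignment to every frame, filtering out whole-hand displacement."""
    return [align_to_origin(f) for f in frames]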

FIG. 5 is a flowchart of a method for scoring hand work motions in an embodiment of the present invention. According to different requirements, the order of the steps in the flowchart can be changed, and some steps can be omitted.

Step S500: acquire a hand work video and decode the hand work video to obtain hand work image frames.

In one embodiment, a video recording device (for example, a camera) may be used to record a hand work video of one or more designated workers (the evaluated persons) on a designated production line while they perform hand work. The hand work video can be acquired by communicating with the video recording device. When the hand work video is acquired, it can be decoded to obtain a plurality of hand work image frames arranged in sequence.

In one embodiment, a hand work image frame may include one or more hands.

Step S502: convert the hand work image frame into an HSV image, and obtain from the HSV image a binarized image representing the region where the skin is located and a plurality of rectangular frames representing the regions where the hands are located.

In one embodiment, the hand work image frame may first be converted into an HSV image, and a binarized image representing the region where the skin is located and a series of rectangular frames representing the regions where the hands are located are then obtained from the HSV image.

In one embodiment, the binarized image representing the region where the skin is located and the plurality of rectangular frames representing the regions where the hands are located may be obtained from the HSV image according to dynamic upper and lower limits of the H channel, dynamic upper and lower limits of the S channel, and dynamic upper and lower limits of the V channel. This avoids the problem that fixed upper and lower limits of the three HSV channels (namely the H, S, and V channels) cannot adapt to many different hand detection situations, and prevents missed or false hand detections. The upper and lower limits of the three HSV channels can be dynamically updated. Specifically, they can be updated according to the feedback result of the first preset model described below: when the feedback result given by the first preset model is a positive result, the upper and lower limits of the H channel, the S channel, and the V channel are updated according to a preset update rule; when the feedback result given by the first preset model is a negative result, the upper and lower limits of the three HSV channels are not updated. The preset update rule can be set in advance according to actual usage requirements and predefines the adjustment rule for the upper and lower limits of each HSV channel.

It can be understood that the upper and lower limits of the three HSV channels are updated according to the feedback result that the first preset model outputs for the current hand work image frame, and the updated limits are used to detect the next hand work image frame, so as to obtain the binarized image representing the region where the skin is located and the plurality of rectangular frames representing the regions where the hands are located for that next frame.

In one embodiment, when the hand work image frame is the starting frame, preset upper and lower limits of the three HSV channels may be used to obtain the binarized image representing the region where the skin is located and the plurality of rectangular frames representing the regions where the hands are located from the HSV image. That is, when the hand work image frame is the starting frame, the upper and lower limits of the H channel are first default values, the upper and lower limits of the S channel are second default values, and the upper and lower limits of the V channel are third default values.

In one embodiment, when the hand work video of the evaluated person is captured, the surroundings of the evaluated person may also be captured. Hands are detected by analyzing the difference between skin-color pixel values and environment pixel values in the image. When the environment contains other objects with pixel values close to the skin color, when the skin color of the evaluated person is not within the default upper and lower limits of the three HSV channels, or when the lighting in the environment changes, hands may be misidentified. Therefore, the feedback result that the first preset model (the hand key point analysis model) outputs for the current hand work image frame is used to update the default upper and lower limits of the three HSV channels.

Step S504: segment a first hand image from the binarized image according to the plurality of rectangular frames, and analyze the first hand image with a first preset model.

In one embodiment, the image regions to be analyzed may be segmented from the binarized image according to the plurality of rectangular frames; the segmented image regions are the first hand images, which are then input into the first preset model for analysis. The first preset model may be a pre-trained hand key point analysis model. For each rectangular frame, the hand key point analysis model can obtain the coordinates of 21 hand key points, a confidence, and a hand feature vector. The confidence characterizes the robustness of the hand key point analysis model.

In one embodiment, the hand key point analysis model may be trained on a preset hand key point training data set and may consist of two cascaded Hourglass networks, several residual modules, and at least one convolutional layer. During model training, the input image of the model may be a [256, 256, 3] color RGB three-channel hand image, and the output may be a [21, 64, 64] heat map with 21 channels. From the heat map, the model can obtain the positions of the 21 key points in the input image, that is, the coordinates of the 21 hand key points of the input hand image.

In one embodiment, the hand key point analysis model may also compute a mean confidence from the confidence values corresponding to the rectangular boxes and determine whether the mean confidence is greater than a preset value, returning a positive or negative result accordingly. Specifically, if the mean confidence is greater than the preset value, the hand key point analysis model outputs a positive result, and the upper and lower limits of the three HSV channels are updated based on this positive result. If the mean confidence is not greater than the preset value, the model outputs a negative result and the HSV channel limits are not updated this time. The preset value can be set and adjusted according to actual needs and is not limited here.
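A minimal sketch of the confidence gate and the HSV limit update follows. The update strategy shown here, re-estimating the limits from HSV pixels inside the detected hand boxes with a small margin, is an assumption for illustration; the text only states that the limits are updated on a positive result, and the preset value is a placeholder.

```python
import numpy as np

def maybe_update_hsv_limits(confidences, hsv_image, boxes, preset_value=0.5, margin=10):
    """Return updated (h_lim, s_lim, v_lim) on a positive result, otherwise None."""
    if np.mean(confidences) <= preset_value:
        return None                                   # negative result: keep current limits
    # Positive result: collect HSV pixels inside the detected hand boxes.
    pixels = [hsv_image[y:y + h, x:x + w].reshape(-1, 3) for x, y, w, h in boxes]
    pixels = np.concatenate(pixels, axis=0).astype(int)
    channel_max = np.array([179, 255, 255])           # OpenCV hue range is 0-179
    lower = np.clip(pixels.min(axis=0) - margin, 0, channel_max)
    upper = np.clip(pixels.max(axis=0) + margin, 0, channel_max)
    return (lower[0], upper[0]), (lower[1], upper[1]), (lower[2], upper[2])
```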

Step S506: tracking data that accumulates over the tracking time is constructed based on the analysis result of the first preset model.

In one embodiment, the first preset model may be the hand key point analysis model, and its analysis result includes hand key point coordinates and confidence values. Tracking data that accumulates over the tracking time is constructed and maintained from the hand key point coordinates and confidence values obtained by the hand key point analysis model. The tracking data is a structure containing hand key point coordinates and confidence values.

In one embodiment, while the tracking data is being constructed and maintained, for each hand work image frame received and processed by the hand key point analysis model, the confidence values are also used to determine whether the tracking data generated from the current frame is reliable. If it is reliable, the tracking data generated from the current frame is added to the maintained tracking data. If it is not reliable, a plurality of preset hand detection models are invoked to detect the current hand work image frame again; these preset hand detection models may include a pre-trained YOLO model and a pre-trained SSD model. Each preset hand detection model detects the current frame separately and produces its own series of rectangular boxes, which are fed into the hand key point analysis model for analysis to obtain the corresponding hand key point coordinates and confidence values. By comprehensively comparing the hand key point coordinates and confidence values of each processing path, the set that best matches the current tracking data is selected and merged into the constructed tracking data.

In one embodiment, whether the tracking data generated from the current hand work image frame is reliable may be determined from the confidence values as follows: compute the mean confidence and check whether it exceeds a preset threshold; if it does, the tracking data generated from the current frame is judged reliable, otherwise it is judged unreliable. The preset threshold can be set and adjusted according to actual needs. The rule for selecting the set of hand key point coordinates and confidence values that best matches the current tracking data may be any of the following: a. the set with the highest mean confidence; b. since the hand key point coordinates should change little between consecutive hand work image frames, the set whose key point coordinates best match those of the preceding and following frames (for example, the set with the smallest Euclidean distance between key point coordinates); c. a combination of rules a and b, in which different weight coefficients are assigned to the two rules and the selection is made from their combined result, as in the sketch below.
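The following sketch shows one way rule c could be implemented. The weight values, the use of the previous frame's key points as the matching target, and the lack of scale normalization between confidence and pixel distance are all illustrative assumptions.

```python
import numpy as np

def select_best_candidate(candidates, prev_coords, w_conf=0.5, w_dist=0.5):
    """candidates: list of (coords, confidences) pairs, coords shaped (21, 2).
    prev_coords: key point coordinates from the previous frame, shaped (21, 2).
    Returns the candidate with the best weighted score (rule a combined with rule b)."""
    scores = []
    for coords, conf in candidates:
        mean_conf = float(np.mean(conf))                                     # rule a: higher is better
        dist = float(np.linalg.norm(coords - prev_coords, axis=1).mean())    # rule b: lower is better
        # In practice the two terms should be normalized to comparable scales.
        scores.append(w_conf * mean_conf - w_dist * dist)
    return candidates[int(np.argmax(scores))]
```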

For example, in the case where the preset hand detection models include a YOLO model and an SSD model, two sets of hand key point coordinates and confidence values, one per processing path, are compared. The first set comes from the series of rectangular boxes obtained by detecting the hand work image frame with the YOLO model; the second set comes from the series of rectangular boxes obtained by detecting the same frame with the SSD model.

In one embodiment, when the tracking data generated from the current hand work image frame is judged unreliable based on the confidence values, the YOLO model or the SSD model may also be invoked directly to re-detect the current frame; the resulting hand key point coordinates and confidence values are then merged into the tracking data (i.e., without comprehensively comparing the results of the YOLO model and the SSD model).

In one embodiment, when the tracking data that accumulates over the tracking time is built from the hand key point coordinates and confidence values obtained by the hand key point analysis model, the confidence value corresponding to each rectangular box may also be used to judge whether the hand key point coordinates obtained by analyzing the first hand image of the current hand work image frame are reliable. If they are judged unreliable, a plurality of preset hand detection models are invoked to detect the hand work image frame, each yielding a plurality of rectangular boxes representing the region where the hand is located, from which a plurality of second hand images corresponding to the preset hand detection models are segmented. The hand key point analysis model then analyzes each second hand image to obtain the hand key point coordinates, confidence values, and hand feature vectors corresponding to each preset hand detection model. By comprehensively comparing the sets of hand key point coordinates and confidence values produced by the hand key point analysis model, the set that best matches the tracking data is selected to update the previously generated tracking data.

It can be understood that the first hand image corresponds to one set of hand key point coordinates and confidence values, and each second hand image likewise corresponds to its own set of hand key point coordinates and confidence values.

It can be understood that when the evaluated person's two hands cross or only part of a hand appears in the frame, the confidence of the hand key points obtained by the hand key point analysis model may be low; in such cases the YOLO model and the SSD model are used as auxiliary detectors.

Step S508: the tracking data is monitored using a second preset model, and a hand label is assigned to each hand in the tracking data, so that the tracking data can be classified based on the hand labels.

In one embodiment, each hand corresponds to a unique hand label. The second preset model may be a pre-trained hand ReID model. The hand ReID model monitors the tracking data and assigns a hand label to each hand in the tracking data according to the hand feature vectors produced by the hand key point analysis model, so that the same hand always carries the same hand ID (hand label). The hand ReID model organizes the tracking data by hand ID, so that data with the same hand ID is collected into the same data set; in this way the tracking data is classified by hand ID, and the classified tracking data may be called "ReID data".

In one embodiment, multiple videos containing the hands of different people may be recorded in advance as training data for the hand ReID model, and the output of the second Hourglass network in the hand key point analysis model is used as the input of the hand ReID model. That is, image frames of different people's hands are fed into the two cascaded Hourglass networks, and the output of the second Hourglass network serves as the input data of the hand ReID model. A Triplet Loss function is used to train the ReID model.
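A bare-bones illustration of the triplet loss is given below, written in plain NumPy with Euclidean distances and an assumed margin value; an actual training loop would compute this on batches of embeddings produced by the network and backpropagate through it.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """anchor, positive: embedding vectors of the same hand; negative: a different hand.
    The loss pushes d(anchor, positive) + margin below d(anchor, negative)."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)
```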

In one embodiment, when a hand moves out of the frame during the recording of the evaluated person's hand work video and later re-enters the frame, the second preset model monitors the tracking data and assigns a hand label to that hand in the tracking data, so that the tracking data can be classified based on the hand label.

Step S510: the classified tracking data is preprocessed to obtain the refined data of each hand.

In one embodiment, the classified tracking data is the ReID data, and the preprocessing may be a preset data processing procedure. For example, the preprocessing includes: removing abnormal data from the ReID data with a preset outlier removal algorithm, and performing regression on the nodes where the removed abnormal data was located using a preset interpolation method. The data obtained after removal and regression may be called "refined data".

In one embodiment, the abnormal data may be data that clearly deviates from a preset interval of normal values. Because the tracking data accumulates over the tracking time, to avoid gaps left after abnormal data is removed, a preset interpolation method is used to perform regression on the nodes where the removed abnormal data was located, so that approximate values fill in those nodes.
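The outlier removal and interpolation step could be sketched as follows. The z-score criterion and linear interpolation are stand-ins for the unspecified "preset" algorithms, applied here to a single coordinate series of one key point over time.

```python
import numpy as np

def refine_series(values, z_thresh=3.0):
    """values: 1-D time series (e.g., one coordinate of one key point over the tracking time).
    Removes points far from the mean and refills them by linear interpolation."""
    values = np.asarray(values, dtype=float)
    z = np.abs(values - values.mean()) / (values.std() + 1e-8)
    bad = z > z_thresh                                   # outliers to remove
    good_idx = np.flatnonzero(~bad)
    if good_idx.size == 0:                               # nothing usable, leave the series as-is
        return values
    refined = values.copy()
    refined[bad] = np.interp(np.flatnonzero(bad), good_idx, values[good_idx])
    return refined
```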

Step S512: the work motion of each hand is scored according to the refined data of each hand and the refined data of the reference hand work.

In one embodiment, scoring the work motion of each hand may mean scoring the work motions of both hands of each evaluated person. The evaluated person's hand work can be scored by comparing the refined hand data of the standard work procedure with the refined hand data of the evaluated person's work procedure. A DTW (dynamic time warping) algorithm may be used to align the time-series refined data, and the score is computed from the Euclidean distances between the hand key point coordinates of all aligned hand work image frames. The DTW algorithm compares the similarity between the hand motions of the standard work procedure and those of the evaluated person by computing the Euclidean distance between them: the lower the distance, the higher the similarity and the higher the score; the higher the distance, the lower the similarity and the lower the score.

Specifically, the DTW algorithm may be used to align the refined data of each hand with the refined data of the reference hand work; then, for each hand, the Euclidean distance between the hand key point coordinates in its refined data and those in the refined data of the reference hand work is computed; finally, the work motion of each hand is scored according to its Euclidean distance result. A sketch of such an alignment is given below.
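The sketch below is a compact, textbook DTW over per-frame key point vectors, comparing frames by the Euclidean distance between their flattened 21-point coordinate arrays. It is a generic implementation under those assumptions, not necessarily the exact variant used.

```python
import numpy as np

def dtw_align(seq_a, seq_b):
    """seq_a, seq_b: arrays of shape (T, 21, 2) of per-frame key point coordinates.
    Returns the accumulated DTW distance and the frame alignment path."""
    a = seq_a.reshape(len(seq_a), -1)
    b = seq_b.reshape(len(seq_b), -1)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])      # frame-to-frame Euclidean distance
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Backtrack to recover which frame pairs were aligned.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return cost[n, m], path[::-1]
```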

In one embodiment, the DTW algorithm aligns the time-series data and takes into account the difference between the duration of the evaluated person's work procedure and that of the reference work procedure. The score can be computed from the Euclidean distances of all key points in all aligned image frames: the lower the distance, the higher the score; the higher the distance, the lower the score. However, the variation of the Euclidean distance difference tends to be logarithmic, and the larger the difference between the two work procedures, the larger the fluctuation of the computed distance difference. Therefore, the Euclidean distance differences between dozens of sets of test data and the hand data of the reference work procedure can also be computed to obtain an upper bound and a lower bound of the distance difference; through a logarithmic transformation, the distance difference is then mapped to a score from 0 to 100, so that distances approaching the upper bound score close to zero and distances approaching the lower bound score close to the full score (100 points). A sketch of this mapping is shown below.
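The mapping from a distance difference to a 0–100 score might look like the following sketch; it assumes the upper and lower bounds have already been estimated from a number of test recordings, as described above, and that both bounds are positive.

```python
import numpy as np

def distance_to_score(dist, lower_bound, upper_bound):
    """Map a distance difference to a 0-100 score with a logarithmic transform:
    distances near lower_bound score close to 100, distances near upper_bound close to 0."""
    dist = np.clip(dist, lower_bound, upper_bound)
    # Logarithmic position of dist between the two bounds, in [0, 1].
    ratio = (np.log(dist) - np.log(lower_bound)) / (np.log(upper_bound) - np.log(lower_bound))
    return float(100.0 * (1.0 - ratio))
```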

In one embodiment, the work motions of both hands of each evaluated person may be scored by the following steps.
a. For the left- and right-hand key point data appearing in each frame of the reference hand work, the length of each key point segment must be consistent across frames: the lengths of the segments between key points appearing in all frames of the entire reference work procedure (for example, 20 segments in total) are averaged, and the segment lengths of each frame are normalized to these average lengths, so that each key point segment has the same length in every frame of the reference hand work. The evaluated person's hand work is processed in the same way, ensuring that each key point segment has the same length in every frame of the evaluated person's hand work; this addresses the problem of differing viewing angles. The segment lengths of the left- and right-hand key points appearing in each frame of the evaluated person's hand work are then adjusted to match the key point segment lengths of the reference hand work. After this adjustment the coordinates of the 21 key points of each hand change accordingly, and the processing is applied to every frame of the evaluated person's hand work, overcoming the key point coordinate deviation caused by differences in finger length between people.
b. The zeroth key point datum of both the reference hand work and the evaluated person's hand work (the starting key point, for example the key point coordinate of the palm center defined as the zeroth key point) is aligned to the origin of the world coordinate system, (x, y, z) = (0, 0, 0); this filters out the influence of hand displacement, so that only the hand work gesture itself is scored.
c. The DTW algorithm is used to align the time-series data (the refined data of the evaluated person's hands and the refined data of the reference hand work), and the Euclidean distance differences of all key points in all aligned image frames are computed.
d. A logarithmic transformation maps the Euclidean distance difference to a score from 0 to 100, which is the score of the evaluated person's hand work.
A sketch of steps a and b appears after this list.
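Steps a and b could be sketched as follows for a single frame. The skeleton edge list (20 segments over 21 points) and the choice of key point 0 as the palm center / wrist are illustrative assumptions, and the rescaling keeps each segment's original direction while forcing the target length.

```python
import numpy as np

# Illustrative parent-child pairs defining the 20 segments of a 21-point hand skeleton.
HAND_EDGES = [(0, i) for i in (1, 5, 9, 13, 17)] + \
             [(i, i + 1) for base in (1, 5, 9, 13, 17) for i in (base, base + 1, base + 2)]

def normalize_segments(frame_kp, target_lengths):
    """frame_kp: (21, 2) key points of one frame; target_lengths: one length per edge in HAND_EDGES.
    Rebuilds the skeleton from point 0 outward, keeping directions and forcing segment lengths."""
    orig = frame_kp.astype(float)
    kp = orig.copy()
    for (parent, child), length in zip(HAND_EDGES, target_lengths):
        direction = orig[child] - orig[parent]
        direction /= (np.linalg.norm(direction) + 1e-8)
        kp[child] = kp[parent] + direction * length      # original direction, normalized length
    return kp

def align_to_origin(frame_kp):
    """Translate the key points so that key point 0 (assumed palm center) sits at the origin."""
    return frame_kp - frame_kp[0]
```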

The hand work motion scoring device, method, and computer-readable storage medium described above can process real-time video of the evaluated person's hand work motions, accurately locate the hand key point features during the hand work, compare them with the standard hand work motions, intelligently analyze the differences between the evaluated person's hand work motions and the standard motions, and give a corresponding score. This facilitates the evaluation and assessment of workers and helps improve the yield and efficiency of the production line.

To sum up, the present invention complies with the requirements for an invention patent, and a patent application is filed in accordance with the law. However, the above descriptions are only preferred embodiments of the present invention, and the scope of the present invention is not limited to these embodiments; equivalent modifications or changes made by those skilled in the art in accordance with the spirit of the present invention shall all be covered by the scope of the following claims.

10: memory
20: processor
30: hand work motion scoring program
101: acquisition module
102: first detection module
103: analysis module
104: tracking module
105: second detection module
106: sorting module
107: correction module
108: scoring module
100: hand work motion scoring device

FIG. 1 is a functional module diagram of a hand work motion scoring device according to an embodiment of the present invention.

FIG. 2 is a functional module diagram of a hand work motion scoring program according to an embodiment of the present invention.

FIG. 3 is a schematic diagram of the interaction between the functional modules of a hand work motion scoring program according to an embodiment of the present invention.

FIG. 4 is a module diagram of a scoring module according to an embodiment of the present invention.

FIG. 5 is a flowchart of a hand work motion scoring method according to an embodiment of the present invention.

Claims (11)

1. A hand work motion scoring method, the method comprising: acquiring a hand work video and decoding the hand work video to obtain hand work image frames; converting a hand work image frame into an HSV image, and obtaining from the HSV image a binarized image representing the region where the skin is located and a plurality of rectangular boxes representing the region where the hand is located; segmenting a first hand image from the binarized image according to the plurality of rectangular boxes and analyzing the first hand image using a first preset model; constructing, based on the analysis result of the first preset model, tracking data that accumulates over a tracking time; monitoring the tracking data using a second preset model and assigning a hand label to each hand in the tracking data, so as to classify the tracking data based on the hand labels, wherein each hand corresponds to a unique hand label; preprocessing the classified tracking data to obtain refined data of each hand; and scoring the work motion of each hand according to the refined data of each hand and refined data of a reference hand work.

2. The hand work motion scoring method of claim 1, wherein the step of obtaining from the HSV image a binarized image representing the region where the skin is located and a plurality of rectangular boxes representing the region where the hand is located comprises: obtaining the binarized image representing the region where the skin is located and the plurality of rectangular boxes representing the region where the hand is located from the HSV image according to upper and lower limits of the H channel, upper and lower limits of the S channel, and upper and lower limits of the V channel; wherein, if the hand work image frame is the initial frame, the upper and lower limits of the H channel are first default upper and lower limits, the upper and lower limits of the S channel are second default upper and lower limits, and the upper and lower limits of the V channel are third default upper and lower limits.

3. The hand work motion scoring method of claim 2, wherein the first preset model is a pre-trained hand key point analysis model, and the step of analyzing the first hand image using the first preset model comprises: analyzing the first hand image with the hand key point analysis model to obtain hand key point coordinates, a confidence value, and a hand feature vector corresponding to each of the rectangular boxes.
4. The hand work motion scoring method of claim 3, further comprising: computing a mean confidence from the confidence values corresponding to the rectangular boxes; determining whether the mean confidence is greater than a preset value; if the mean confidence is greater than the preset value, updating the upper and lower limits of the H channel, the upper and lower limits of the S channel, and the upper and lower limits of the V channel based on a preset update rule; and if the mean confidence is not greater than the preset value, not updating the upper and lower limits of the H channel, the upper and lower limits of the S channel, and the upper and lower limits of the V channel.

5. The hand work motion scoring method of claim 3, wherein the step of constructing, based on the analysis result of the first preset model, tracking data that accumulates over a tracking time comprises: constructing the tracking data that accumulates over the tracking time based on the hand key point coordinates and confidence values obtained by the hand key point analysis model.

6. The hand work motion scoring method of claim 5, further comprising: determining, based on the confidence value corresponding to each of the rectangular boxes, whether the hand key point coordinates obtained by the hand key point analysis model from analyzing the first hand image are reliable; if the hand key point coordinates obtained from the analysis are determined to be unreliable, invoking a plurality of preset hand detection models to respectively detect the hand work image frame and obtain a plurality of rectangular boxes representing the region where the hand is located, and segmenting a plurality of second hand images corresponding to the plurality of preset hand detection models, wherein the plurality of preset hand detection models comprise at least a YOLO model and an SSD model; analyzing each of the second hand images with the hand key point analysis model to obtain hand key point coordinates, confidence values, and hand feature vectors corresponding to each of the preset hand detection models; and comparing the sets of hand key point coordinates and confidence values obtained by the hand key point analysis model, so as to select the set of hand key point coordinates and confidence values that best matches the tracking data to update the tracking data; wherein each of the second hand images corresponds to its own set of hand key point coordinates and confidence values.
7. The hand work motion scoring method of claim 3, wherein the second preset model is a pre-trained hand ReID model, and the step of monitoring the tracking data using the second preset model and assigning a hand label to each hand in the tracking data comprises: monitoring the tracking data with the hand ReID model and assigning the hand label to each hand in the tracking data according to the hand feature vectors.

8. The hand work motion scoring method of claim 1, wherein the preprocessing comprises: removing abnormal data from the tracking data with a preset outlier removal algorithm, and performing regression on the nodes where the removed abnormal data was located using a preset interpolation method.

9. The hand work motion scoring method of claim 1, wherein the step of scoring the work motion of each hand according to the refined data of each hand and the refined data of the reference hand work comprises: aligning the refined data of each hand with the refined data of the reference hand work; computing, for each hand, the Euclidean distance between the hand key point coordinates in its refined data and the hand key point coordinates in the refined data of the reference hand work; and scoring the work motion of each hand according to its Euclidean distance result.

10. A hand work motion scoring device, the device comprising a processor and a memory, the memory storing a plurality of computer programs, wherein the processor, when executing the computer programs stored in the memory, implements the steps of the hand work motion scoring method of any one of claims 1 to 9.

11. A computer-readable storage medium storing a plurality of instructions executable by one or more processors to implement the steps of the hand work motion scoring method of any one of claims 1 to 9.
TW109122169A 2020-06-24 2020-06-30 Device and method for scoring hand work motion and storage medium TWI776176B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010591925.3A CN111860196B (en) 2020-06-24 2020-06-24 Hand operation action scoring device, method and computer readable storage medium
CN202010591925.3 2020-06-24

Publications (2)

Publication Number Publication Date
TW202201275A true TW202201275A (en) 2022-01-01
TWI776176B TWI776176B (en) 2022-09-01

Family

ID=72988191

Family Applications (1)

Application Number Title Priority Date Filing Date
TW109122169A TWI776176B (en) 2020-06-24 2020-06-30 Device and method for scoring hand work motion and storage medium

Country Status (2)

Country Link
CN (1) CN111860196B (en)
TW (1) TWI776176B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI797014B (en) * 2022-05-16 2023-03-21 國立虎尾科技大學 Table tennis pose classifying method and table tennis interaction system

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112446313A (en) * 2020-11-20 2021-03-05 山东大学 Volleyball action recognition method based on improved dynamic time warping algorithm
CN114282795B (en) * 2021-12-21 2022-09-16 北京永信至诚科技股份有限公司 Network target range personnel skill evaluation method, device, equipment and readable storage medium
CN117160029B (en) * 2023-08-31 2024-07-12 江西格如灵科技股份有限公司 VR handle detection method and system

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201113819A (en) * 2009-10-13 2011-04-16 Tatung Co Embedded device capable real-time recognizing the unspecific gesture and its recognizing method
WO2012139241A1 (en) * 2011-04-11 2012-10-18 Intel Corporation Hand gesture recognition system
CN108108707A (en) * 2017-12-29 2018-06-01 北京奇虎科技有限公司 Gesture processing method and processing device based on video data, computing device
CN108563995B (en) * 2018-03-15 2019-04-26 西安理工大学 Human computer cooperation system gesture identification control method based on deep learning
CN108764120B (en) * 2018-05-24 2021-11-09 杭州师范大学 Human body standard action evaluation method
CN110738135B (en) * 2019-09-25 2023-06-09 艾普工华科技(武汉)有限公司 Method and system for judging and guiding worker operation step standard visual recognition
CN110956099B (en) * 2019-11-14 2022-06-17 哈尔滨工程大学 Dynamic gesture instruction identification method
CN111291749B (en) * 2020-01-20 2024-04-23 深圳市优必选科技股份有限公司 Gesture recognition method and device and robot

Also Published As

Publication number Publication date
TWI776176B (en) 2022-09-01
CN111860196A (en) 2020-10-30
CN111860196B (en) 2023-06-20

Similar Documents

Publication Publication Date Title
TWI776176B (en) Device and method for scoring hand work motion and storage medium
US11120254B2 (en) Methods and apparatuses for determining hand three-dimensional data
WO2021047232A1 (en) Interaction behavior recognition method, apparatus, computer device, and storage medium
CN110210302B (en) Multi-target tracking method, device, computer equipment and storage medium
US11837017B2 (en) System and method for face recognition based on dynamic updating of facial features
JP4216668B2 (en) Face detection / tracking system and method for detecting and tracking multiple faces in real time by combining video visual information
CN111062239A (en) Human body target detection method and device, computer equipment and storage medium
WO2023010758A1 (en) Action detection method and apparatus, and terminal device and storage medium
CN110163096B (en) Person identification method, person identification device, electronic equipment and computer readable medium
KR102649930B1 (en) Systems and methods for finding and classifying patterns in images with a vision system
JP6071002B2 (en) Reliability acquisition device, reliability acquisition method, and reliability acquisition program
US20130243251A1 (en) Image processing device and image processing method
US20120170835A1 (en) Determining the Uniqueness of a Model for Machine Vision
CN113177468A (en) Human behavior detection method and device, electronic equipment and storage medium
US11887331B2 (en) Information processing apparatus, control method, and non-transitory storage medium
US20220027606A1 (en) Human behavior recognition method, device, and storage medium
US8542905B2 (en) Determining the uniqueness of a model for machine vision
JP2010231254A (en) Image analyzing device, method of analyzing image, and program
CN111382637A (en) Pedestrian detection tracking method, device, terminal equipment and medium
CN113780145A (en) Sperm morphology detection method, sperm morphology detection device, computer equipment and storage medium
CN110717385A (en) Dynamic gesture recognition method
JP2013015891A (en) Image processing apparatus, image processing method, and program
KR101313879B1 (en) Detecting and Tracing System of Human Using Gradient Histogram and Method of The Same
US10013602B2 (en) Feature vector extraction device based on cell positioning
CN111275693B (en) Counting method and counting device for objects in image and readable storage medium

Legal Events

Date Code Title Description
GD4A Issue of patent certificate for granted invention patent