TWI736138B - System and method for learning traffic safety - Google Patents

System and method for learning traffic safety

Info

Publication number
TWI736138B
TWI736138B (application number TW109105043A)
Authority
TW
Taiwan
Prior art keywords
image
traffic
action
gradient
angle
Prior art date
Application number
TW109105043A
Other languages
Chinese (zh)
Other versions
TW202133125A (en)
Inventor
張茹茵
張萬烽
Original Assignee
國立屏東大學
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 國立屏東大學 filed Critical 國立屏東大學
Priority to TW109105043A priority Critical patent/TWI736138B/en
Application granted granted Critical
Publication of TWI736138B publication Critical patent/TWI736138B/en
Publication of TW202133125A publication Critical patent/TW202133125A/en

Landscapes

  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

A system for learning traffic safety includes a display device, a sensor, and a computation module. The sensor captures an image corresponding to a user. The computation module is communicatively connected to the display device and the sensor, and displays a traffic situation through the display device. The computation module also determines an action of the user according to the image, and determines whether the action complies with a traffic rule of the traffic situation to calculate a score.

Description

Traffic safety learning system and method

The present disclosure relates to the use of somatosensory interaction technology to let users learn and practice traffic-safety behaviors and skills in a safe setting.

Pedestrian traffic-safety ability and knowledge are among the major factors in accidental deaths of school children. For students with disabilities, mobility and traffic skills, such as crossing a road or walking along a street, are essential to independent living, and they further affect related life skills such as shopping and using community resources. In Taiwan's twelve-year national education curriculum, mobility and traffic skills are also an important learning area for students with special needs.

In current practice, these skills are usually taught through verbal description or by staging simulated situations in the classroom with photos and videos. This requires substantial manpower and offers students with disabilities very limited opportunities for active, interactive practice. Taking students onto actual roads for drills raises safety concerns for both students and teachers, and still cannot provide the large amount of practice these students need in a timely manner. Moreover, no interactive teaching system has yet been designed specifically for mobility and traffic skills.

An embodiment of the present invention provides a traffic safety learning system that includes a display device, a sensor, and a computation module. The sensor captures at least one image corresponding to a user. The computation module is communicatively connected to the display device and the sensor; it displays a traffic situation through the display device, determines an action of the user according to the image(s), and determines whether the action complies with a traffic rule of the traffic situation to calculate a score.

In some embodiments, the computation module extracts only the upper-limb image or the lower-limb image from the captured image(s), and the action is correspondingly an upper-limb action or a lower-limb action.

In some embodiments, the traffic situation includes an intersection, a one-way turn, a roundabout, a straight road, a railroad crossing, or a T-junction, and the traffic situation includes at least one obstacle.

In some embodiments, the image(s) include a depth image and a luminance image, and the depth image includes multiple depth values. The computation module performs the following steps: for every two neighboring depth values, compute a first gradient in the X direction and a second gradient in the Y direction, and compute an angle from the first and second gradients; accumulate the angle into one of several angle bins spanning 0 to 360 degrees; and take the bin counts as the feature vector of the depth image.

In some embodiments, the computation module further defines multiple preset actions and performs the following steps: detect a pedestrian in the luminance image; use a first classifier to compute, from the luminance image, first confidence values that the pedestrian corresponds to each preset action; use a second classifier to compute, from the depth image, second confidence values that the pedestrian corresponds to each preset action; and merge the first and second confidence values to determine which preset action the user's action belongs to.

From another perspective, an embodiment of the present invention provides a traffic safety learning method for a computation module. The method includes: capturing at least one image corresponding to a user through a sensor; displaying a traffic situation through a display device; determining an action of the user according to the image(s); and determining whether the action complies with a traffic rule of the traffic situation to calculate a score.

100: traffic safety learning system

101: teaching trainer

102: user

110: computation module

120: display device

130: sensor

140: cloud database

141: parameters

142: video recordings

143: history records

201~204: images

301~304: steps

[FIG. 1] is a schematic diagram of a traffic safety learning system according to an embodiment.

[FIG. 2] is a schematic diagram of feature extraction from a luminance image and a depth image according to an embodiment.

[FIG. 3] is a flowchart of a traffic safety learning method according to an embodiment.

The terms "first", "second", and so on used herein do not denote any particular order or sequence; they merely distinguish elements or operations described with the same technical terms.

The traffic safety learning system of the present disclosure combines motion-sensing interactive technology so that typical students and students with disabilities can practice traffic skills for a wide range of traffic situations in the classroom or another safe environment. The system's functions are designed around a task-oriented concept, and the materials built into it are traffic situations and facilities common in daily life. Each traffic situation is defined as a block, and blocks can be combined so that a classroom can assemble map routes according to students' abilities and needs.

FIG. 1 is a schematic diagram of a traffic safety learning system according to an embodiment. Referring to FIG. 1, the traffic safety learning system 100 includes a computation module 110, a display device 120, and a sensor 130. The computation module 110 may be a personal computer, a laptop, a server, or any computer with computing capability. The display device 120 may be a projector, a liquid-crystal display, an organic light-emitting-diode display, a virtual-reality display, an augmented-reality display, or the like. The sensor 130 may include a visible-light sensor, an infrared sensor, a depth sensor, and so on. For example, the depth sensor may consist of an infrared emitter and an infrared sensor, or of two or more visible-light sensors, from which the depth of the scene can be computed. The computation module 110 is communicatively connected to the display device 120 and the sensor 130, for example through any wired or wireless means that lets these devices exchange data.

First, the teaching trainer 101 can log in to the computation module 110 to build a map. The map includes one or more traffic situations, which may include intersections, one-way turns, roundabouts, straight roads, railroad crossings, or T-junctions, arranged in any order. The teaching trainer 101 can also place obstacles in a traffic situation, set whether each obstacle is dynamic or static, and set the obstacle density. The trainer can likewise define task goals (for example, crossing the road) and set the number and arrangement of tasks. In some embodiments the presentation of the map can also be configured, for example fixed orientation or dynamic orientation. In some embodiments action parameters can be set as well, divided into an upper-limb mode and a lower-limb mode, meaning that traffic skills are practiced using only upper-limb or only lower-limb movements; a configuration sketch is given below.
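As a rough illustration only, the block-based map described above might be represented as follows. All class and field names are hypothetical assumptions for this sketch; the patent does not specify a data structure.

    # Hypothetical sketch of a trainer-built map; names are illustrative only.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Obstacle:
        dynamic: bool      # moving (e.g., a car) or static (e.g., a cone)
        density: float     # how densely obstacles of this kind appear

    @dataclass
    class SituationBlock:
        kind: str          # "intersection", "roundabout", "T-junction", ...
        obstacles: List[Obstacle] = field(default_factory=list)

    @dataclass
    class TrainingMap:
        blocks: List[SituationBlock]   # blocks combined into a route
        tasks: List[str]               # e.g., ["cross the road"]
        orientation: str = "fixed"     # "fixed" or "dynamic"
        limb_mode: str = "lower"       # "upper" or "lower"

    # Example: a straight road, then an intersection with one moving obstacle.
    demo_map = TrainingMap(
        blocks=[
            SituationBlock(kind="straight"),
            SituationBlock(kind="intersection",
                           obstacles=[Obstacle(dynamic=True, density=0.3)]),
        ],
        tasks=["cross the road"],
    )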

After the map is built, the computation module 110 displays the created traffic situation through the display device 120 while capturing images corresponding to the user 102 through the sensor 130. The user 102 may be a typical student or a student with a disability; the invention is not limited in this respect. In some embodiments the sensor 130 includes a depth sensor and a visible-light sensor, so the captured images include a depth image and a luminance image. The gray level of a pixel in the depth image represents scene depth, while the gray level of a pixel in the luminance image represents brightness; the luminance image may be a color image. The computation module 110 detects the user 102 in the captured images and determines the user's action, for example moving forward, moving backward, sidestepping, turning left, turning right, or stopping. The computation module 110 also determines whether the action of the user 102 complies with the traffic rules of the traffic situation to calculate a score. For example, if the user 102 is judged to be moving forward while an obstacle is directly ahead, the user 102 would collide with the obstacle and points are deducted; conversely, if the user 102 turns left or right to dodge the obstacle, points are added. In other words, in some embodiments points are added or deducted according to whether the user would hit an obstacle. In some embodiments the score can also be computed according to real traffic rules, for example deducting points for running a red light or for not using a crosswalk, and so on.
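The rule checks above might be sketched as a simple scoring function. The specific rules and point values here are assumptions for demonstration, not the patent's actual scoring scheme.

    # Illustrative scoring sketch; rules and point values are assumed.
    def update_score(score: int, action: str, obstacle_ahead: bool,
                     light: str, on_crosswalk: bool) -> int:
        if action == "forward" and obstacle_ahead:
            score -= 10                  # would collide with the obstacle
        elif action in ("turn_left", "turn_right") and obstacle_ahead:
            score += 5                   # dodged the obstacle
        if light == "red" and action != "stop":
            score -= 10                  # ran a red light
        if action == "forward" and not on_crosswalk:
            score -= 5                   # crossed outside the crosswalk
        return score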

In some embodiments, considering the diversity of limb movement among people with disabilities, for example users with cerebral palsy or physical impairments who have considerable difficulty moving their lower limbs, the system can select either the upper limbs or the lower limbs according to the student's abilities and needs. Specifically, the computation module 110 extracts only the upper-limb image or the lower-limb image of the pedestrian and determines the user's upper-limb or lower-limb action from it, so that a user with limited lower-limb (or upper-limb) mobility can complete the exercises with upper-limb (or lower-limb) movements alone; a cropping sketch follows.
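One simple way to realize this extraction, sketched below under the assumption that the upper half of the pedestrian's bounding box covers the upper limbs and the lower half the lower limbs; the patent does not specify the cropping rule, and a real system might use skeletal joints instead.

    # Rough limb-region cropping sketch; the half-box split is an assumption.
    import numpy as np

    def crop_limb_region(image: np.ndarray, box: tuple, mode: str) -> np.ndarray:
        x, y, w, h = box                 # pedestrian bounding box
        person = image[y:y + h, x:x + w]
        if mode == "upper":
            return person[: h // 2]      # top half: upper limbs
        return person[h // 2 :]          # bottom half: lower limbs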

After the user finishes an exercise, the computation module 110 can collect the relevant parameters 141 (which may include the map and the traffic situations), the video recordings 142, and the history records 143 (which may include the route taken and the scores), and store them in the cloud database 140 for later review and discussion by students and teachers.

The pedestrian detection and pedestrian-action recognition described above can be accomplished with any image-processing or machine-learning algorithm. For example, the captured luminance image and depth image can be fed into a machine-learning model such as a decision tree, a random forest, a multilayer neural network, a convolutional neural network, or a support vector machine; the invention is not limited in this respect. The model's output is the pedestrian's location, whether an object is a pedestrian, or the user's action. In this embodiment the information in the luminance image and the depth image is combined to obtain better detection and recognition results, and in particular a new feature-extraction method is proposed for the depth image.

FIG. 2 is a schematic diagram of feature extraction from a luminance image and a depth image according to an embodiment. Referring to FIG. 2, image 201 is a luminance image of a pedestrian, image 202 shows the gradient of luminance image 201, image 203 is a depth image, and image 204 shows the gradient of the depth image. As images 202 and 204 show, the luminance image and the depth image discriminate differently for different body parts. In general, where depths are similar (for example, at the feet) image 204 is less able to distinguish pedestrians from non-pedestrians, but where the depth difference is large (for example, at the head) image 204 provides better discrimination, and image 202 behaves the other way around. In other words, images 202 and 204 complement each other and together improve detection and recognition.

In this embodiment, the pedestrian in luminance image 201 is detected first, using any suitable algorithm, for example a convolutional neural network. The pedestrian's location is represented by a bounding box, which can also be applied to the depth image, and the pixels inside the bounding box are used to extract feature vectors. Notably, different feature vectors are extracted from the luminance image and the depth image. The feature vector of luminance image 201 can use any known method, such as a histogram of oriented gradients (HOG), while the depth image can use the method described below.
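For the luminance side, a standard HOG descriptor suffices; a minimal sketch using scikit-image follows. The parameter values are common defaults chosen for illustration, not values specified by the patent.

    # Sketch of luminance-image features via a standard HOG descriptor.
    import numpy as np
    from skimage.feature import hog

    def luminance_features(gray_crop: np.ndarray) -> np.ndarray:
        # gray_crop: grayscale pixels inside the pedestrian's bounding box
        return hog(gray_crop, orientations=9, pixels_per_cell=(8, 8),
                   cells_per_block=(2, 2), feature_vector=True)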

First, the gray level of each pixel in depth image 204 represents a depth. For every two neighboring depths, a first gradient Δx in the X direction and a second gradient Δy in the Y direction are computed according to equations (1) and (2):

Δx = D(x+1, y) − D(x−1, y) (1)

Δy = D(x, y+1) − D(x, y−1) (2)

Here D(x, y) denotes the depth at coordinates (x, y); D(x+1, y) and D(x−1, y) are the two neighboring depths referred to above, and D(x, y+1) and D(x, y−1) are likewise two neighboring depths. Next, an angle θ is computed from the first gradient and the second gradient according to equation (3):

θ = tan⁻¹(Δy / Δx) (3)

where the quadrant of (Δx, Δy) is taken into account, so that θ covers the full circle.

Notably, the angle θ lies in the range −180 to 180 degrees; since 0 to −180 degrees is equivalent to 180 to 360 degrees, θ can equally be said to lie in the range 0 to 360 degrees. This range is used because the pedestrian's body is always closer to the sensor than the background, so the vector (Δx, Δy) points from the inside of the body outward; computed this way, the left and right sides of the body yield different angles θ, providing a richer feature. The angle is then accumulated into one of several angle bins between 0 and 360 degrees. For example, with ten bins in total, the first bin covers 0 to 36 degrees, the second 36 to 72 degrees, and so on; an angle θ of 24 degrees increments the first bin by 1, an angle of 50 degrees increments the second bin by 1, and so on. The invention does not limit the number of bins. Moreover, every pair of neighboring depths yields an angle θ, so the pixels inside the bounding box yield many angles, each accumulated into its corresponding bin; that is, each bin's count records how many angles θ in the depth image fall in the corresponding range. The bin counts form a vector (of length 10 in this example) that serves as the feature vector of the depth image; a sketch of this computation appears below.
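A minimal sketch of the described computation: central-difference gradients per equations (1) and (2), a full-circle angle per equation (3), and a 0 to 360 degree histogram whose bin counts form the feature vector. The bin count of 10 is the example above; treating all bounding-box pixels uniformly is an assumption of this sketch.

    # Depth-image feature sketch: gradients, angles, angle-bin histogram.
    import numpy as np

    def depth_feature(depth: np.ndarray, n_bins: int = 10) -> np.ndarray:
        d = depth.astype(np.float64)
        # Equations (1) and (2): gradients from neighboring depths.
        dx = d[1:-1, 2:] - d[1:-1, :-2]    # D(x+1,y) - D(x-1,y)
        dy = d[2:, 1:-1] - d[:-2, 1:-1]    # D(x,y+1) - D(x,y-1)
        # Equation (3): four-quadrant angle, mapped from [-180,180] to [0,360).
        theta = np.degrees(np.arctan2(dy, dx)) % 360.0
        # Accumulate every angle into one of n_bins angle bins.
        hist, _ = np.histogram(theta, bins=n_bins, range=(0.0, 360.0))
        return hist.astype(np.float64)     # the depth image's feature vector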

In this embodiment, several preset actions are defined, for example moving forward, moving backward, sidestepping, turning left, turning right, and stopping. The next step is to determine which preset action the pedestrian's action belongs to. A first classifier computes, from the luminance image's feature vector, first confidence values that the pedestrian corresponds to each preset action, and a second classifier likewise computes, from the depth image's feature vector, second confidence values that the pedestrian corresponds to each preset action. The first and second classifiers may be any machine-learning models. For clarity, xk denotes the k-th pedestrian below; although there is normally only one pedestrian, the system also applies to multiple pedestrians, so a more general mathematical notation is used, where k is a positive integer. Further, assuming multiple classifiers and multiple preset actions, pi,j(xk) denotes the confidence value, computed by the i-th classifier, that the k-th pedestrian performs the j-th preset action; in other words, the first confidence values are p1,1(xk), p1,2(xk), ..., and the second confidence values are p2,1(xk), p2,2(xk), ..., where i, j, and k are positive integers.

The first and second confidence values are merged to determine which action the user's action belongs to. Specifically, the confidence values of the different classifiers for the k-th pedestrian and the j-th preset action are combined into one feature vector, written pjk = (p1j(xk), p2j(xk), ..., pmj(xk)), where m is the number of classifiers, 2 in this embodiment. Put differently, the feature vector pjk contains the first and second confidence values of the k-th pedestrian for the j-th preset action. The feature vector pjk is then input to a support vector machine to compute a confidence value for the k-th pedestrian and the j-th preset action, written qj(xk). Specifically, qj(xk) is computed as in equation (4):

qj(xk) = Σs ys αs K(pjk, pjs) + b (4)

Equation (4) is the conjugate (dual) form of a support vector machine, where pjs is the s-th support vector corresponding to the j-th preset action after training, ys is the label of the s-th support vector (whether it is marked as the j-th preset action), αs is the Lagrange multiplier of the s-th support vector, b is a constant, and K(·,·) is the kernel function; those with ordinary skill in the art will understand the dual form of support vector machines, so it is not elaborated here. In some embodiments the kernel function K(·,·) is a radial basis function (RBF), but the invention is not limited thereto. Then, according to equation (5), the preset action of the k-th pedestrian can be determined; simply put, it is the preset action with the largest confidence value qj(xk), and that largest confidence value is written w(xk).

w(xk) = arg maxj qj(xk) (5)

Through the above computation, the information in the luminance image and the depth image is jointly taken into account, which improves the accuracy of action recognition; a fusion sketch is given below.
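A minimal sketch of the fusion step in equations (4) and (5): for each preset action j, the per-classifier confidences are stacked into pjk and scored by a trained RBF-kernel SVM. The trained quantities (support vectors, labels, Lagrange multipliers, bias, and the RBF gamma) are assumed inputs here, obtained from training outside this sketch.

    # Confidence fusion per equations (4) and (5); trained SVM terms assumed.
    import numpy as np

    def rbf(u: np.ndarray, v: np.ndarray, gamma: float) -> float:
        return np.exp(-gamma * np.sum((u - v) ** 2))   # K(u, v)

    def q(p_jk: np.ndarray, sv: np.ndarray, y: np.ndarray,
          alpha: np.ndarray, b: float, gamma: float) -> float:
        # Equation (4): q_j(x_k) = sum_s y_s * alpha_s * K(p_jk, p_js) + b
        return sum(y[s] * alpha[s] * rbf(p_jk, sv[s], gamma)
                   for s in range(len(sv))) + b

    def fuse(conf: np.ndarray, svms: list) -> int:
        # conf[i, j]: confidence of classifier i (luminance or depth) for action j.
        # svms[j]: the (sv, y, alpha, b, gamma) tuple of action j's trained SVM.
        scores = [q(conf[:, j], *svms[j]) for j in range(conf.shape[1])]
        return int(np.argmax(scores))                  # equation (5): w(x_k)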

FIG. 3 is a flowchart of a traffic safety learning method according to an embodiment. Referring to FIG. 3, in step 301 at least one image corresponding to the user is captured through the sensor. In step 302 the traffic situation is displayed through the display device. In step 303 the user's action is determined from the image(s). In step 304 it is determined whether the action complies with a traffic rule of the traffic situation to calculate a score. The steps of FIG. 3 have been described in detail above and are not repeated here. Note that each step in FIG. 3 may be implemented as program code or as circuitry; the invention is not limited in this respect. The method of FIG. 3 may be used together with the embodiments above or on its own; in other words, other steps may also be inserted between the steps of FIG. 3.

Although the present invention has been disclosed through the embodiments above, they are not intended to limit the invention. Anyone with ordinary skill in the relevant art may make minor changes and refinements without departing from the spirit and scope of the invention, so the scope of protection of the invention is defined by the appended claims.


Claims (8)

1. A traffic safety learning system, comprising: a display device; a sensor configured to capture at least one image corresponding to a user; and a computation module communicatively connected to the display device and the sensor, configured to display a traffic situation through the display device, determine an action of the user according to the at least one image, and determine whether the action complies with a traffic rule of the traffic situation to calculate a score, wherein the at least one image comprises a depth image and a luminance image, the depth image comprises a plurality of depths, and the computation module is configured to perform steps of: for every two neighboring depths, computing a first gradient of the two depths in an X direction and a second gradient in a Y direction, and computing an angle according to the first gradient and the second gradient; accumulating the angle into one of a plurality of angle bins between 0 and 360 degrees; and taking the values of the angle bins as a feature vector of the depth image.

2. The traffic safety learning system of claim 1, wherein the computation module extracts only an upper-limb image or a lower-limb image from the at least one image, and the action is an upper-limb action or a lower-limb action.

3. The traffic safety learning system of claim 1, wherein the traffic situation comprises an intersection, a one-way turn, a roundabout, a straight road, a railroad crossing, or a T-junction, and the traffic situation comprises at least one obstacle.

4. The traffic safety learning system of claim 1, wherein the computation module further defines a plurality of preset actions and performs steps of: detecting a pedestrian in the luminance image; computing, by a first classifier according to the luminance image, a plurality of first confidence values of the pedestrian corresponding to the preset actions; computing, by a second classifier according to the depth image, a plurality of second confidence values of the pedestrian corresponding to the preset actions; and merging the first confidence values and the second confidence values to determine that the action of the user belongs to one of the preset actions.

5. The traffic safety learning system of claim 4, wherein the computation module determines the action according to equations (1) and (2): w(xk) = arg maxj qj(xk) (1), and qj(xk) = Σs ys αs K(pjk, pjs) + b (2), wherein xk denotes the pedestrian, pjk comprises the first confidence value and the second confidence value of the pedestrian corresponding to the j-th preset action among the preset actions, pjs denotes the s-th support vector corresponding to the j-th preset action after support-vector-machine training, ys is the label corresponding to the s-th support vector, αs is the Lagrange multiplier corresponding to the s-th support vector, b is a constant, K(·,·) is a kernel function, and w(xk) is the confidence value corresponding to the action of the pedestrian.

6. A traffic safety learning method for a computation module, the traffic safety learning method comprising: capturing at least one image corresponding to a user through a sensor, wherein the at least one image comprises a depth image and a luminance image, and the depth image comprises a plurality of depths; for every two neighboring depths, computing a first gradient of the two depths in an X direction and a second gradient in a Y direction, and computing an angle according to the first gradient and the second gradient; accumulating the angle into one of a plurality of angle bins between 0 and 360 degrees; taking the values of the angle bins as a feature vector of the depth image; displaying a traffic situation through a display device; determining an action of the user according to the at least one image; and determining whether the action complies with a traffic rule of the traffic situation to calculate a score.

7. The traffic safety learning method of claim 6, further comprising: extracting only an upper-limb image or a lower-limb image from the at least one image, wherein the action is an upper-limb action or a lower-limb action.

8. The traffic safety learning method of claim 6, wherein the traffic situation comprises an intersection, a one-way turn, a roundabout, a straight road, a railroad crossing, or a T-junction, and the traffic situation comprises at least one obstacle.
TW109105043A 2020-02-17 2020-02-17 System and method for learning traffic safety TWI736138B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW109105043A TWI736138B (en) 2020-02-17 2020-02-17 System and method for learning traffic safety

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW109105043A TWI736138B (en) 2020-02-17 2020-02-17 System and method for learning traffic safety

Publications (2)

Publication Number Publication Date
TWI736138B true TWI736138B (en) 2021-08-11
TW202133125A TW202133125A (en) 2021-09-01

Family

ID=78283157

Family Applications (1)

Application Number Title Priority Date Filing Date
TW109105043A TWI736138B (en) 2020-02-17 2020-02-17 System and method for learning traffic safety

Country Status (1)

Country Link
TW (1) TWI736138B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI595383B (en) * 2009-10-07 2017-08-11 微軟技術授權有限責任公司 Human tracking system
US9035803B2 (en) * 2010-08-03 2015-05-19 Lc Technologies, Inc. Systems and methods for aiding traffic controllers and/or pilots
TWI441666B (en) * 2011-04-22 2014-06-21 Tien Lung Sun Virtual reality-based movement training and evaluation system
TWI506478B (en) * 2011-05-12 2015-11-01 Apple Inc Presence sensing
JP6545255B2 (en) * 2014-08-02 2019-07-17 アップル インコーポレイテッドApple Inc. Context-specific user interface
US9904855B2 (en) * 2014-11-13 2018-02-27 Nec Corporation Atomic scenes for scalable traffic scene recognition in monocular videos
CN206697076U (en) * 2016-12-28 2017-12-01 新支点数字科技(宜昌)有限公司 Virtual reality police service training set
CN207895727U (en) * 2017-08-25 2018-09-21 北京卓华信息技术股份有限公司 Make exercising system
CN110490978A (en) * 2019-07-01 2019-11-22 浙江工业大学 Outdoor scene based on mixed reality technology is ridden training method

Also Published As

Publication number Publication date
TW202133125A (en) 2021-09-01

Similar Documents

Publication Publication Date Title
Jalal et al. Students’ behavior mining in e-learning environment using cognitive processes with information technologies
Garg et al. Yoga pose classification: a CNN and MediaPipe inspired deep learning approach for real-world application
Ruddle et al. The effect of landmark and body-based sensory information on route knowledge
Borji Vanishing point detection with convolutional neural networks
Kamnardsiri et al. The Effectiveness of the Game‑Based Learning System for the Improvement of American Sign Language using Kinect
Chen et al. Using real-time acceleration data for exercise movement training with a decision tree approach
Restat et al. Geographical slant facilitates navigation and orientation in virtual environments
Wallin Morphologies for a pedagogical life
Dash et al. A two-stage algorithm for engagement detection in online learning
Chen College Cross‐Country Skiing Teaching and Sports Training Based on VR
Fasihuddin et al. Smart tutoring system for Arabic sign language using Leap Motion controller
Wang et al. Exploring the potential of immersive virtual environments for learning american sign language
TWI736138B (en) System and method for learning traffic safety
Xipeng et al. Research on badminton teaching technology based on human pose estimation algorithm
Shaotran et al. GLADAS: Gesture learning for advanced driver assistance systems
Hernández Correa et al. An application of machine learning and image processing to automatically detect teachers’ gestures
Liu et al. Using eye-tracking and support vector machine to measure learning attention in elearning
Karunathilaka et al. Road sign identification application using image processing and augmented reality
Gurieva et al. Augmented reality for personalized learning technique: Climbing gym case study
Korakakis et al. A short survey on modern virtual environments that utilize AI and synthetic data
Agada et al. An affective sensitive tutoring system for improving student’s engagement in CS
Murugan Revolutionizing the Way Bharatanatyam Hastas are Taught and Learned with Computer Vision.
Kumar Reddy Boyalla Real-time Exercise Posture Correction Using Human Pose Detection Technique
Anuradha et al. Real Time Virtual Yoga Tutor
Chen et al. Analysis of moving human body detection and recovery aided training in the background of multimedia technology